├── README.md
├── preprocess
│   ├── get_data_mean.py
│   └── get_test_img_list.py
├── resnext
│   ├── merge.py
│   ├── predict.py
│   ├── predict.sh
│   ├── run.sh
│   ├── symbol_se_resnext_w_d.py
│   └── train_se_resnext_w_d.py
├── simple_model
│   ├── merge.py
│   ├── merge.sh
│   ├── pre.sh
│   ├── predict_multi.py
│   ├── train.sh
│   └── train_ini.py
└── 答辩.pdf

/README.md:
--------------------------------------------------------------------------------
# IQIYI Multi-Modal Video Person Identification
## Team: 炸天

---
## Environment
Face recognition on the official feature vectors: Python 3.6 + Keras

Scene recognition: Python 2.6 + MXNet

GPU: NVIDIA P100

## Model Summary
### Face recognition based on feature vectors
With Dropout, Batch Normalization, negative-sample handling and **data augmentation on the feature vectors**, a single model scores 0.8259. Ensembling several models raises the score to 0.8381.
### Person identification based on scene
Two frames are taken from each video for training. To balance training speed and accuracy we chose [resnext](https://github.com/bruinxiong/SENet.mxnet) as the model. It classifies the test videos whose faces are of poor quality or undetectable, and its results are fused with the feature-vector results, raising the score to 0.8505 (final result).

## Pipeline
### Preprocessing
#### preprocess/get_data_mean.py
Data augmentation on the feature vectors. The provided vectors come from an ArcFace model, so the vectors of one person carry a geometric meaning: the set of a person's feature vectors is closed, and the mean of any N vectors from the set stays in the set.

We augment the feature vectors of each video by randomly picking 2, 3, 4 or 5 of them, computing their mean, and adding the new vector to the original data.

With this method we roughly doubled the number of test-set feature vectors.

On the training data the method helps little: the augmented vectors still lie inside the set, so the set neither grows nor shrinks and the model gains little from them. We therefore only augment the test data.
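In code, the idea reduces to the sketch below (a simplified illustration, not the exact script: preprocess/get_data_mean.py also carries the detection and quality scores along, and additionally uses a window size of 6):

```python
import random
import numpy as np

def augment(feat_arrs, window_lens=(2, 3, 4, 5)):
    """Append means of randomly grouped feature vectors to the original list."""
    new_feats = list(feat_arrs)
    for d in window_lens:
        random.shuffle(feat_arrs)
        # Average each window of d shuffled vectors, moving in steps of 2.
        for i in range(0, len(feat_arrs) - d, 2):
            new_feats.append(np.mean(feat_arrs[i:i + d], axis=0))
    return new_feats
```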
#### preprocess/get_test_img_list.py
Collects the test videos whose face quality is low (below 40) or in which no face is detected, and records the groups separately. These videos are the input of the scene-based identification model.

### Face recognition based on feature vectors
#### Training: simple_model/train_ini.py
Run script: simple_model/train.sh

Both the training set and the validation set are used for training; the noise samples of the validation set are assigned to class 4935. Each training run samples 80% of all data.

Each run trains 4 models, on data with face quality in 0-200, 20-200, 40-200 and 0-80 respectively. Training is repeated 18 times, varying the random seed so each run sees different data; this yields 72 models in total, saved in data/simple_model/save_model/.

#### Prediction: simple_model/predict.py
Run script: simple_model/pre.sh

Each pass predicts with the 4 models of one training run (a machine-memory limit) and averages their outputs, giving 18 results after 18 passes. These are merged 6 at a time into 3 results, saved in data/simple_model/.
The 3 results are then fused with an earlier best result obtained by ad-hoc tuning (score 0.8282, stored in data/simple_model/tmp_result). The fused result is our team's best feature-vector-only result, scoring 0.8381.

Prediction and fusion are staged only because of the machine limits; with enough memory they can be done in one pass.
The fusion code and script are merge.py and merge.sh.

### Person identification based on scene
#### Data preparation
Training set: take the first and the last frame of each video, generate the .rec file with the official MXNet tooling, and resize/crop to 224*224; details omitted.

Test set: the videos with face quality 0-20 (including no face), 20-30 and 30-40 are stored in 3 separate folders.

The pre-trained model is stored in data/resnext_model/model.

#### Training: resnext/train_se_resnext_w_d.py
Run script: resnext/run.sh

Fine-tune the ImageNet pre-trained model for 16 epochs. The trained model is saved in data/resnext_model/save_model.

#### Prediction
Run script: resnext/predict.sh

All frames of each video are predicted and the per-frame predictions are averaged as the video's prediction.
#### Fusion: resnext/merge.py
Fuse the resnext predictions with the feature-vector predictions.

For face quality 0-20 the feature-vector result and the resnext result are weighted 0.5/0.5; for 20-30 the weights are 0.6/0.4; for 30-40 they are 0.7/0.3.

For the videos in which no face is detected, only the resnext result is used, weighted 0.8 (the resnext result alone is not strong: about 46% accuracy on the validation set).
The final submission is saved in submit_result/best_merge.pickle.
--------------------------------------------------------------------------------
/preprocess/get_data_mean.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
import os
import numpy as np
import pickle
import time
import random


def get_data(pickle_path='', save_path='', only_new=False):
    with open(pickle_path, 'rb') as fin:
        feats_dict = pickle.load(fin, encoding='iso-8859-1')
    with open('/home/jzhengas/Jason/data/simple_model/dict.pickle', 'rb') as fin:
        data_dict = pickle.load(fin, encoding='iso-8859-1')

    window_len = [2, 3, 4, 5, 6]
    # The label dict only matters for training data; for the test set it is
    # discarded and every video falls back to the placeholder class 4935.
    data_dict = None
    x = []
    y = []
    det_score_list = []
    face_score_list = []
    name_list = []
    for video_name in feats_dict:
        feats = feats_dict[video_name]
        if data_dict is not None:
            if video_name in data_dict:
                label = data_dict[video_name]
            else:
                label = 4935
        else:
            label = 4935
        if len(feats) == 0:
            continue

        label = int(label) - 1
        tmp_x = []
        tmp_det = []
        tmp_score = []

        tmp_y = []
        tmp_name = []
        for feat in feats:
            [frame_num, bbox, det_score, qua_score, feat_arr] = feat
            tmp_x.append(feat_arr)
            tmp_det.append(det_score)
            tmp_score.append(qua_score)

        new_feats = [] if only_new else feats

        # Averaging augmentation: for each window size d, shuffle the features
        # and append the mean of every d consecutive vectors (stride 2).
        for d in window_len:
            s_list = list(zip(tmp_x, tmp_det, tmp_score))
            random.shuffle(s_list)
            tmp_x, tmp_det, tmp_score = zip(*s_list)
            for i in range(0, len(tmp_x) - d, 2):
                mean_x = np.mean(tmp_x[i:i + d], axis=0)
                mean_det = np.mean(tmp_det[i:i + d], axis=0)
                mean_score = np.mean(tmp_score[i:i + d], axis=0)
                new_feats.append([0, 0, mean_det, mean_score, mean_x])

        feats_dict[video_name] = new_feats

    with open(save_path, 'wb+') as f:
        pickle.dump(feats_dict, f)


if __name__ == '__main__':
    save_path = '../data/test_mean.pickle'
    pickle_path = '/home/jzhengas/Jason/data/feats_test.pickle'
    get_data(pickle_path=pickle_path, save_path=save_path, only_new=False)
--------------------------------------------------------------------------------
/preprocess/get_test_img_list.py:
--------------------------------------------------------------------------------
# -*- coding: utf-8 -*-
import os
import numpy as np
import pickle
import time


output_file1 = '../data/test_part1.lst'
output_file2 = '../data/test_part2.lst'
output_file3 = '../data/test_part3.lst'
pickle_path = '/home/jzhengas/Jason/data/feats_test.pickle'


with open(pickle_path, 'rb') as fin:
    feats_dict = pickle.load(fin, encoding='iso-8859-1')

out_fin1 = open(output_file1, 'w')
out_fin2 = open(output_file2, 'w')
out_fin3 = open(output_file3, 'w')

for video_name in feats_dict:
    feats = feats_dict[video_name]

    video_name = video_name.split('.')[0].split('_')[-1]

    # Videos without any detected face go to part1 together with the lowest band.
    if len(feats) == 0:
        out_fin1.write(video_name + '\n')
        continue

    score_sum = 0
    for feat in feats:
        [frame_num, bbox, det_score, qua_score, feat_arr] = feat
        score_sum += qua_score
    mean_score = 1.0 * score_sum / len(feats)
    # Quality bands: [0, 20) -> part1, [20, 30) -> part2, [30, 40) -> part3.
    if mean_score < 20:
        out_fin1.write(video_name + '\n')
    elif mean_score < 30:
        out_fin2.write(video_name + '\n')
    elif mean_score < 40:
        out_fin3.write(video_name + '\n')

out_fin1.close()
out_fin2.close()
out_fin3.close()
--------------------------------------------------------------------------------
/resnext/merge.py:
--------------------------------------------------------------------------------
import pickle
import random
import numpy as np
import argparse


def read_from_pickle(pickle_path):
    with open(pickle_path, 'rb') as fin:
        result = pickle.load(fin, encoding='iso-8859-1')
    return result

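# Fusion weights (see the README): face-quality 0-20 videos mix the feature-vector
# result and the resnext result 0.5/0.5, 20-30 videos 0.6/0.4, 30-40 videos 0.7/0.3;
# videos with no detected face use the resnext result alone, scaled by 0.8.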
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Merge face and scene predictions')

    parser.add_argument('--pickle-file', default='../submit_result/best_face.pickle')
    parser.add_argument('--img-file', default='result/tmp_resultv1.pickle')
    parser.add_argument('--img-file2', default='result/tmp_resultv2.pickle')
    parser.add_argument('--img-file3', default='result/tmp_resultv3.pickle')
    parser.add_argument('--target', default='../submit_result/best_merge.pickle', help='path prefix of the merged result.')
    parser.add_argument('--txt', type=int, default=1)
    args = parser.parse_args()

    pre_result = read_from_pickle(args.pickle_file)
    img_result = read_from_pickle(args.img_file)
    img_result2 = read_from_pickle(args.img_file2)
    img_result3 = read_from_pickle(args.img_file3)
    ratio_1, ratio_2, only2 = 0.5, 0.5, 0.8

    merge_result = []
    result_dict = {}
    for i in range(0, len(pre_result)):
        result_dict[pre_result[i][0]] = pre_result[i][1]

    # Pad the 4934-way scene predictions with a zero for the noise class 4935.
    for video_name in img_result:
        img_result[video_name] = np.concatenate([img_result[video_name], [0]])
        if video_name in result_dict:
            result_dict[video_name] = 0.5 * result_dict[video_name] + 0.5 * img_result[video_name]
        else:
            result_dict[video_name] = only2 * img_result[video_name]

    for video_name in img_result2:
        img_result2[video_name] = np.concatenate([img_result2[video_name], [0]])
        if video_name in result_dict:
            result_dict[video_name] = 0.6 * result_dict[video_name] + 0.4 * img_result2[video_name]
        else:
            result_dict[video_name] = only2 * img_result2[video_name]

    for video_name in img_result3:
        img_result3[video_name] = np.concatenate([img_result3[video_name], [0]])
        if video_name in result_dict:
            result_dict[video_name] = 0.7 * result_dict[video_name] + 0.3 * img_result3[video_name]
        else:
            result_dict[video_name] = only2 * img_result3[video_name]

    merge_result = []
    for video_name in result_dict:
        merge_result.append([video_name, result_dict[video_name]])

    if args.txt:
        classify_result = []
        for i in range(0, 4934):
            classify_result.append([])
        print('Start sorting...')
        # For each of the 4934 identities keep the 100 highest-scoring videos.
        for i in range(0, 4934):
            merge_result.sort(key=lambda x: x[1][i], reverse=True)
            classify_result[i] = merge_result[0:100]

        result_path = args.target + '.txt'
        with open(result_path, 'w+') as fin:
            for i in range(0, 4934):
                output_str = str(i + 1)
                data = classify_result[i]
                for d in data:
                    output_str += (' ' + d[0])
                output_str += '\n'
                fin.write(output_str)

    with open(args.target + '.pickle', 'wb+') as f:
        pickle.dump(merge_result, f)
--------------------------------------------------------------------------------
/resnext/predict.py:
--------------------------------------------------------------------------------
import sys
import mxnet as mx
import numpy as np
import argparse
import random
import os
from PIL import Image
from PIL import ImageFilter
from PIL import ImageEnhance
from collections import namedtuple
from mxnet import io, nd
import pickle

prefix = 'save_model/se-resnext-imagenet-50-0'
epoch = 16
ctx = mx.gpu()

batch_size = 64

model = mx.module.Module.load(prefix=prefix, epoch=epoch, context=ctx)
model.bind(for_training=False, data_shapes=[('data', (batch_size, 3, 224, 224))])
Batch = namedtuple('Batch', ['data'])


def transform(image):
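    """Resize the short side to 224, center-crop to 224x224 and convert HWC to CHW."""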
    image = mx.image.resize_short(image, 224)  # minimum 224x224 images
    image, crop_info = mx.image.center_crop(image, (224, 224))

    image = image.transpose((2, 0, 1))  # transpose from (224, 224, 3) to (3, 224, 224)
    image = nd.array(image, ctx)
    return image


def predict(batch_file):
    batch_data = nd.empty((batch_size, 3, 224, 224), ctx)
    for i in range(0, batch_size):
        fn = batch_file[i]
        img = mx.image.imread(fn)
        img = transform(img)
        batch_data[i][:] = img
    model.forward(Batch([batch_data]), is_train=False)
    outputs = model.get_outputs()[0].asnumpy()
    return outputs


def predict_directory(video_list, file_list):
    all_result = []
    file_num = len(file_list)
    print('Image num: ', file_num)
    begin, end = 0, 0
    for i in range(0, file_num, batch_size):
        begin = i
        end = i + batch_size
        batch_file = []
        if end < file_num:
            batch_file = file_list[begin: end]
        else:
            # Pad the last batch by wrapping around to the start of the list.
            extra_num = end - file_num
            batch_file = file_list[begin: file_num]
            batch_file = batch_file + file_list[0: extra_num]

        predict_result = predict(batch_file)
        all_result.append(predict_result)
    all_result = np.concatenate(all_result)
    print('finish predict')
    result_dict = {}
    for i in range(0, file_num):
        video = video_list[i]
        if video in result_dict:
            result_dict[video].append(all_result[i])
        else:
            result_dict[video] = [all_result[i]]
    # Average the per-frame predictions of each video.
    for video in result_dict:
        result_dict[video] = np.mean(result_dict[video], axis=0)
    return result_dict


def get_img_list(directory, video_name_prefix):
    path_iter = os.walk(directory)
    file_list = []
    count = 0
    for root, dirs, files in path_iter:
        count += 1
        if count == 1:
            video_list = dirs
        else:
            for f in files:
                file_list.append([root.split('/')[-1], os.path.join(root, f)])
    random.shuffle(file_list)
    video_list, file_list = zip(*file_list)
    video_list = list(video_list)
    for i in range(0, len(video_list)):
        video_list[i] = video_name_prefix + video_list[i] + '.mp4'
    return video_list, file_list


if __name__ == "__main__":
    video_list, file_list = get_img_list('/home/jzhengas/Jason/img_data/test_data_part1', 'IQIYI_VID_TEST_')
    result_dict = predict_directory(video_list, file_list)
    with open('result/tmp_resultv1.pickle', 'wb+') as f:
        pickle.dump(result_dict, f)

    video_list, file_list = get_img_list('/home/jzhengas/Jason/img_data/test_data_part2', 'IQIYI_VID_TEST_')
    result_dict = predict_directory(video_list, file_list)
    with open('result/tmp_resultv2.pickle', 'wb+') as f:
        pickle.dump(result_dict, f)

    video_list, file_list = get_img_list('/home/jzhengas/Jason/img_data/test_data_part3', 'IQIYI_VID_TEST_')
    result_dict = predict_directory(video_list, file_list)
    with open('result/tmp_resultv3.pickle', 'wb+') as f:
        pickle.dump(result_dict, f)
--------------------------------------------------------------------------------
/resnext/predict.sh:
--------------------------------------------------------------------------------
export MXNET_CUDNN_AUTOTUNE_DEFAULT=0
python2 predict.py

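# Fuse the per-quality-band scene predictions with the best face result (weights per the README).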
python2 merge.py --pickle-file ../submit_result/best_face.pickle --img-file result/tmp_resultv1.pickle --img-file2 result/tmp_resultv2.pickle --img-file3 result/tmp_resultv3.pickle --target ../submit_result/best_merge.pickle
--------------------------------------------------------------------------------
/resnext/run.sh:
--------------------------------------------------------------------------------
#!/usr/bin/env sh
export MXNET_CUDNN_AUTOTUNE_DEFAULT=0
python2 -u train_se_resnext_w_d.py --lr 0.001 --data-train /home/jzhengas/Jason/img_data/data/all_data/new_all --data-val /home/jzhengas/Jason/img_data/data/low_val/low_val_all.rec --data-type imagenet --num-classes=4934 --num-examples=1000000 --depth 50 --batch-size 64 --model-load-epoch=2 --drop-out 0.0 --gpus=0 --num-epoch=100 --retrain --freeze 0 --finetune 0 --model-name save_model
--------------------------------------------------------------------------------
/resnext/symbol_se_resnext_w_d.py:
--------------------------------------------------------------------------------
'''
Adapted from https://github.com/tornadomeet/ResNet/blob/master/symbol_resnet.py
Original author Wei Wu

Implemented the following papers:
Saining Xie, Ross Girshick, Piotr Dollar, Zhuowen Tu, Kaiming He. "Aggregated Residual Transformations for Deep Neural Networks" CVPR 2017 https://arxiv.org/pdf/1611.05431v2.pdf
Jie Hu, Li Shen, Gang Sun. "Squeeze-and-Excitation Networks" https://arxiv.org/pdf/1709.01507v1.pdf

This modified version is based on ResNet v1 and adds a dropout layer after the last pooling layer.
Modified by Lin Xiong Feb-11, 2017
Updated by Lin Xiong Jul-21, 2017
Added Squeeze-and-Excitation block by Lin Xiong Sep-13, 2017
'''
import mxnet as mx

def residual_unit(data, num_filter, ratio, stride, dim_match, name, num_group, bottle_neck=True, bn_mom=0.9, workspace=256, memonger=False):
    """Return a ResNeXt unit symbol for building ResNeXt
    Parameters
    ----------
    data : str
        Input data
    num_filter : int
        Number of output channels
    ratio : float
        Squeeze ratio of the SE block
    stride : tuple
        Stride used in convolution
    dim_match : Boolean
        True means channel number between input and output is the same, otherwise means differ
    name : str
        Base name of the operators
    bottle_neck : Boolean
        Whether or not to adopt the bottleneck trick as done in ResNet
    num_group : int
        Number of convolution groups
    bn_mom : float
        Momentum of batch normalization
    workspace : int
        Workspace used in convolution operator
    """
    if bottle_neck:
        # the same as https://github.com/facebook/fb.resnet.torch#notes, a bit different from the original paper

        conv1 = mx.sym.Convolution(data=data, num_filter=int(num_filter*0.5), kernel=(1,1), stride=(1,1), pad=(0,0),
                                   no_bias=True, workspace=workspace, name=name + '_conv1')
        bn1 = mx.sym.BatchNorm(data=conv1, fix_gamma=False, eps=2e-5, momentum=bn_mom, name=name + '_bn1')
        act1 = mx.sym.Activation(data=bn1, act_type='relu', name=name + '_relu1')
        conv2 = mx.sym.Convolution(data=act1, num_filter=int(num_filter*0.5), num_group=num_group, kernel=(3,3), stride=stride, pad=(1,1),
                                   no_bias=True, workspace=workspace, name=name + '_conv2')
        bn2 = mx.sym.BatchNorm(data=conv2, fix_gamma=False, eps=2e-5, momentum=bn_mom, name=name + '_bn2')
        act2 = mx.sym.Activation(data=bn2, act_type='relu', name=name + '_relu2')
        conv3 = mx.sym.Convolution(data=act2, num_filter=num_filter, kernel=(1,1), stride=(1,1), pad=(0,0), no_bias=True,
                                   workspace=workspace, name=name + '_conv3')
        bn3 = mx.sym.BatchNorm(data=conv3, fix_gamma=False, eps=2e-5, momentum=bn_mom, name=name + '_bn3')

        # Squeeze-and-Excitation block: global average pooling, two fully connected
        # layers, then a sigmoid gate that rescales the channels of bn3.
        squeeze = mx.sym.Pooling(data=bn3, global_pool=True, kernel=(7, 7), pool_type='avg', name=name + '_squeeze')
        squeeze = mx.symbol.Flatten(data=squeeze, name=name + '_flatten')
        excitation = mx.symbol.FullyConnected(data=squeeze, num_hidden=int(num_filter*ratio), name=name + '_excitation1')
        excitation = mx.sym.Activation(data=excitation, act_type='relu', name=name + '_excitation1_relu')
        excitation = mx.symbol.FullyConnected(data=excitation, num_hidden=num_filter, name=name + '_excitation2')
        excitation = mx.sym.Activation(data=excitation, act_type='sigmoid', name=name + '_excitation2_sigmoid')
        bn3 = mx.symbol.broadcast_mul(bn3, mx.symbol.reshape(data=excitation, shape=(-1, num_filter, 1, 1)))

        if dim_match:
            shortcut = data
        else:
            shortcut_conv = mx.sym.Convolution(data=data, num_filter=num_filter, kernel=(1,1), stride=stride, no_bias=True,
                                               workspace=workspace, name=name+'_sc')
            shortcut = mx.sym.BatchNorm(data=shortcut_conv, fix_gamma=False, eps=2e-5, momentum=bn_mom, name=name + '_sc_bn')

        if memonger:
            shortcut._set_attr(mirror_stage='True')
        eltwise = bn3 + shortcut
        return mx.sym.Activation(data=eltwise, act_type='relu', name=name + '_relu')
    else:
        conv1 = mx.sym.Convolution(data=data, num_filter=num_filter, kernel=(3,3), stride=stride, pad=(1,1),
                                   no_bias=True, workspace=workspace, name=name + '_conv1')
        bn1 = mx.sym.BatchNorm(data=conv1, fix_gamma=False, momentum=bn_mom, eps=2e-5, name=name + '_bn1')
        act1 = mx.sym.Activation(data=bn1, act_type='relu', name=name + '_relu1')
        conv2 = mx.sym.Convolution(data=act1, num_filter=num_filter, kernel=(3,3), stride=(1,1), pad=(1,1),
                                   no_bias=True, workspace=workspace, name=name + '_conv2')
        bn2 = mx.sym.BatchNorm(data=conv2, fix_gamma=False, momentum=bn_mom, eps=2e-5, name=name + '_bn2')

        squeeze = mx.sym.Pooling(data=bn2, global_pool=True, kernel=(7, 7), pool_type='avg', name=name + '_squeeze')
        squeeze = mx.symbol.Flatten(data=squeeze, name=name + '_flatten')
        excitation = mx.symbol.FullyConnected(data=squeeze, num_hidden=int(num_filter*ratio), name=name + '_excitation1')
        excitation = mx.sym.Activation(data=excitation, act_type='relu', name=name + '_excitation1_relu')
        excitation = mx.symbol.FullyConnected(data=excitation, num_hidden=num_filter, name=name + '_excitation2')
        excitation = mx.sym.Activation(data=excitation, act_type='sigmoid', name=name + '_excitation2_sigmoid')
        bn2 = mx.symbol.broadcast_mul(bn2, mx.symbol.reshape(data=excitation, shape=(-1, num_filter, 1, 1)))

        if dim_match:
            shortcut = data
        else:
            shortcut_conv = mx.sym.Convolution(data=data, num_filter=num_filter, kernel=(1,1), stride=stride, no_bias=True,
                                               workspace=workspace, name=name+'_sc')
            shortcut = mx.sym.BatchNorm(data=shortcut_conv, fix_gamma=False, eps=2e-5, momentum=bn_mom, name=name + '_sc_bn')

        if memonger:
            shortcut._set_attr(mirror_stage='True')
        eltwise = bn2 + shortcut
        return mx.sym.Activation(data=eltwise, act_type='relu', name=name + '_relu')

def resnext(units, num_stage, filter_list, ratio_list, num_class, num_group, data_type, drop_out, bottle_neck=True, bn_mom=0.9, workspace=256, memonger=False):
    """Return a ResNeXt symbol
    Parameters
    ----------
    units : list
        Number of units in each stage
    num_stage : int
        Number of stages
    filter_list : list
        Channel size of each stage
    ratio_list : list
        Squeeze ratios of the SE blocks
    num_class : int
        Output size of symbol
    num_group : int
        Number of convolution groups
    drop_out : float
        Probability of an element to be zeroed. Default = 0.0
    data_type : str
        Dataset type; only cifar10, imagenet, vggface and msface are supported
    workspace : int
        Workspace used in convolution operator
    """
    num_unit = len(units)
    assert(num_unit == num_stage)
    data = mx.sym.Variable(name='data')
    data = mx.sym.BatchNorm(data=data, fix_gamma=True, eps=2e-5, momentum=bn_mom, name='bn_data')
    if data_type == 'cifar10':
        body = mx.sym.Convolution(data=data, num_filter=filter_list[0], kernel=(3, 3), stride=(1,1), pad=(1, 1),
                                  no_bias=True, name="conv0", workspace=workspace)
    elif data_type == 'imagenet':
        body = mx.sym.Convolution(data=data, num_filter=filter_list[0], kernel=(7, 7), stride=(2,2), pad=(3, 3),
                                  no_bias=True, name="conv0", workspace=workspace)
        body = mx.sym.BatchNorm(data=body, fix_gamma=False, eps=2e-5, momentum=bn_mom, name='bn0')
        body = mx.sym.Activation(data=body, act_type='relu', name='relu0')
        body = mx.symbol.Pooling(data=body, kernel=(3, 3), stride=(2,2), pad=(1,1), pool_type='max')
    elif data_type == 'vggface':
        body = mx.sym.Convolution(data=data, num_filter=filter_list[0], kernel=(7, 7), stride=(2,2), pad=(3, 3),
                                  no_bias=True, name="conv0", workspace=workspace)
        body = mx.sym.BatchNorm(data=body, fix_gamma=False, eps=2e-5, momentum=bn_mom, name='bn0')
        body = mx.sym.Activation(data=body, act_type='relu', name='relu0')
        body = mx.symbol.Pooling(data=body, kernel=(3, 3), stride=(2,2), pad=(1,1), pool_type='max')
    elif data_type == 'msface':
        body = mx.sym.Convolution(data=data, num_filter=filter_list[0], kernel=(7, 7), stride=(2,2), pad=(3, 3),
                                  no_bias=True, name="conv0", workspace=workspace)
        body = mx.sym.BatchNorm(data=body, fix_gamma=False, eps=2e-5, momentum=bn_mom, name='bn0')
        body = mx.sym.Activation(data=body, act_type='relu', name='relu0')
        body = mx.symbol.Pooling(data=body, kernel=(3, 3), stride=(2,2), pad=(1,1), pool_type='max')
    else:
        raise ValueError("do not support {} yet".format(data_type))
    for i in range(num_stage):
        body = residual_unit(body, filter_list[i+1], ratio_list[2], (1 if i==0 else 2, 1 if i==0 else 2), False,
                             name='stage%d_unit%d' % (i + 1, 1), num_group=num_group, bottle_neck=bottle_neck,
                             bn_mom=bn_mom, workspace=workspace, memonger=memonger)
        for j in range(units[i]-1):
            body = residual_unit(body, filter_list[i+1], ratio_list[2], (1,1), True, name='stage%d_unit%d' % (i + 1, j + 2),
                                 num_group=num_group, bottle_neck=bottle_neck, bn_mom=bn_mom, workspace=workspace, memonger=memonger)
    pool1 = mx.symbol.Pooling(data=body, global_pool=True, kernel=(7, 7), pool_type='avg', name='pool1')
    flat = mx.symbol.Flatten(data=pool1)
    drop1 = mx.symbol.Dropout(data=flat, p=drop_out, name='dp1')
    fc1 = mx.symbol.FullyConnected(data=drop1, num_hidden=num_class, name='fc1')
    return mx.symbol.SoftmaxOutput(data=fc1, name='softmax')
--------------------------------------------------------------------------------
/resnext/train_se_resnext_w_d.py:
--------------------------------------------------------------------------------
"""
Add a dropout layer after the last pooling layer.
Updated by Lin Xiong Jul-21, 2017
"""
import argparse, logging, os
import mxnet as mx
from symbol_se_resnext_w_d import resnext
import mxnet.optimizer as optimizer

logger = logging.getLogger()
logger.setLevel(logging.INFO)

formatter = logging.Formatter('%(asctime)s - %(message)s')
console = logging.StreamHandler()
console.setFormatter(formatter)
logger.addHandler(console)

# load and fine-tune a pretrained checkpoint
def get_fine_tune_model(model_name, epoch):
    # load model
    symbol, arg_params, aux_params = mx.model.load_checkpoint(model_name, epoch)
    # replace the last fully connected layer with a fresh one sized for our classes
    all_layers = symbol.get_internals()
    net = all_layers['flatten0_output']
    net = mx.symbol.FullyConnected(data=net, num_hidden=args.num_classes, name='newfc1')
    net = mx.symbol.SoftmaxOutput(data=net, name='softmax')
    # drop the weights of the replaced fc layer
    new_args = dict({k: arg_params[k] for k in arg_params if 'fc1' not in k})
    return (net, new_args, aux_params)

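# Training entry point: per the README and run.sh, an ImageNet-pretrained
# SE-ResNeXt-50 is fine-tuned on two frames per video (first and last) with
# 4934 classes; the README reports using the epoch-16 checkpoint for prediction.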
def main():
    ratio_list = [0.25, 0.125, 0.0625, 0.03125]  # 1/4, 1/8, 1/16, 1/32
    if args.depth == 18:
        units = [2, 2, 2, 2]
    elif args.depth == 34:
        units = [3, 4, 6, 3]
    elif args.depth == 50:
        units = [3, 4, 6, 3]
    elif args.depth == 101:
        units = [3, 4, 23, 3]
    elif args.depth == 152:
        units = [3, 8, 36, 3]
    elif args.depth == 200:
        units = [3, 24, 36, 3]
    elif args.depth == 269:
        units = [3, 30, 48, 8]
    else:
        raise ValueError("no experiments done on depth {}, you can do it yourself".format(args.depth))
    symbol = resnext(units=units, num_stage=4, filter_list=[64, 256, 512, 1024, 2048] if args.depth >= 50
                     else [64, 64, 128, 256, 512], ratio_list=ratio_list, num_class=args.num_classes,
                     num_group=args.num_group, data_type="imagenet", drop_out=args.drop_out,
                     bottle_neck=True if args.depth >= 50 else False, bn_mom=args.bn_mom,
                     workspace=args.workspace, memonger=args.memonger)

    kv = mx.kvstore.create(args.kv_store)
    devs = mx.cpu() if args.gpus is None else [mx.gpu(int(i)) for i in args.gpus.split(',')]
    epoch_size = max(int(args.num_examples / args.batch_size / kv.num_workers), 1)
    begin_epoch = args.model_load_epoch if args.model_load_epoch else 0
    if not os.path.exists("./" + args.model_name):
        os.mkdir("./" + args.model_name)
    model_prefix = args.model_name + "/se-resnext-{}-{}-{}".format(args.data_type, args.depth, kv.rank)
    checkpoint = mx.callback.do_checkpoint(model_prefix)
    arg_params = None
    aux_params = None
    load_model_prefix = 'model/se-resnext-imagenet-50-0'
    if args.retrain:
        if args.finetune:
            (symbol, arg_params, aux_params) = get_fine_tune_model(load_model_prefix, args.model_load_epoch)
        else:
            symbol, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, args.model_load_epoch)
    if args.memonger:
        import memonger
        symbol = memonger.search_plan(symbol, data=(args.batch_size, 3, 32, 32) if args.data_type == "cifar10"
                                      else (args.batch_size, 3, 224, 224))
    train = mx.io.ImageRecordIter(
        path_imgrec  = args.data_train + '.rec',
        path_imgidx  = args.data_train + '.idx',
        label_width  = 1,
        data_name    = 'data',
        label_name   = 'softmax_label',
        data_shape   = (3, 224, 224),
        batch_size   = args.batch_size,
        pad          = 0,
        fill_value   = 0,  # only used when pad is valid
        rand_crop    = False,
        shuffle      = True)
    train.reset()
    if args.data_val == 'None':
        val = None
    else:
        val = mx.io.ImageRecordIter(
            path_imgrec  = args.data_val,
            label_width  = 1,
            data_name    = 'data',
            label_name   = 'softmax_label',
            batch_size   = args.batch_size,
            data_shape   = (3, 224, 224),
            rand_crop    = False,
            rand_mirror  = False,
            num_parts    = kv.num_workers,
            part_index   = kv.rank)

    fix_param = None
    if args.freeze:
        # Freeze everything except the fully connected layers.
        fix_param = [k for k in arg_params if 'fc' not in k]

    model = mx.mod.Module(symbol=symbol, context=devs, fixed_param_names=fix_param)
    model.bind(data_shapes=train.provide_data, label_shapes=train.provide_label)

    opt = optimizer.SGD(learning_rate=args.lr, momentum=0.9, wd=0.0005,
                        rescale_grad=1.0 / args.batch_size / (len(args.gpus.split(','))))

    # training
    model.fit(train, val,
              num_epoch=args.num_epoch,
              arg_params=arg_params,
              aux_params=aux_params,
              allow_missing=True,
              kvstore='device',
              optimizer=opt,
              initializer=mx.init.Xavier(rnd_type='gaussian', factor_type="in", magnitude=2),
              batch_end_callback=mx.callback.Speedometer(args.batch_size, args.frequent),
              epoch_end_callback=checkpoint,
              eval_metric=['acc', 'ce'])


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="command for training resnet-v2")
    parser.add_argument('--gpus', type=str, default='0', help='the gpus will be used, e.g. "0,1,2,3"')
    parser.add_argument('--data-type', type=str, default='imagenet', help='the dataset type')
    parser.add_argument('--data-train', type=str, default='data/train.rec', help='the train dataset')
    parser.add_argument('--data-val', type=str, default='None', help='the val dataset')
    parser.add_argument('--lr', type=float, default=0.001, help='initial learning rate')
    parser.add_argument('--mom', type=float, default=0.9, help='momentum for sgd')
    parser.add_argument('--bn-mom', type=float, default=0.9, help='momentum for batch normalization')
    parser.add_argument('--wd', type=float, default=0.0001, help='weight decay for sgd')
    parser.add_argument('--batch-size', type=int, default=256, help='the batch size')
    parser.add_argument('--num-epoch', type=int, default=100, help='the epoch num')
    parser.add_argument('--num-group', type=int, default=32, help='the number of convolution groups')
    parser.add_argument('--drop-out', type=float, default=0.0, help='the probability of an element to be zeroed')
    parser.add_argument('--workspace', type=int, default=512, help='memory space size(MB) used in convolution, if xpu '
                        'memory is oom, then you can try a smaller value, such as --workspace 256')
    parser.add_argument('--depth', type=int, default=50, help='the depth of resnet')
    parser.add_argument('--num-classes', type=int, default=1000, help='the class number of your task')
    parser.add_argument('--aug-level', type=int, default=2, choices=[1, 2, 3],
                        help='level 1: use only random crop and random mirror\n'
                             'level 2: add scale/aspect/hsv augmentation based on level 1\n'
                             'level 3: add rotation/shear augmentation based on level 2')
    parser.add_argument('--num-examples', type=int, default=1281167, help='the number of training examples')
    parser.add_argument('--kv-store', type=str, default='device', help='the kvstore type')
    parser.add_argument('--model-name', type=str, default='save_model', help='the name of classifier')
    parser.add_argument('--model-load-epoch', type=int, default=0,
                        help='load the model on an epoch using the model-load-prefix')

    parser.add_argument('--frequent', type=int, default=50, help='frequency of logging')
    parser.add_argument('--freeze', type=int, default=0)
    parser.add_argument('--finetune', type=int, default=1)
    parser.add_argument('--memonger', action='store_true', default=False,
                        help='true means using memonger to save memory, https://github.com/dmlc/mxnet-memonger')
    parser.add_argument('--retrain', action='store_true', default=False, help='true means continue training')
    args = parser.parse_args()
    hdlr = logging.FileHandler('./log/log-se-resnext-{}-{}.log'.format(args.data_type, args.depth))
    hdlr.setFormatter(formatter)
    logger.addHandler(hdlr)
    logging.info(args)
    main()
--------------------------------------------------------------------------------
/simple_model/merge.py:
--------------------------------------------------------------------------------
import pickle
import random
import numpy as np
import argparse


def read_from_pickle(pickle_path):
    with open(pickle_path, 'rb') as fin:
        result = pickle.load(fin, encoding='iso-8859-1')
    return result


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Merge prediction pickles')

    parser.add_argument('--pickle-file', default='../save_model/tmp_model/model_all_mix.hdf5', help='comma-separated list of prediction pickles to average.')
    parser.add_argument('--target', default='merge/result.txt', help='path prefix of the merged result.')
    parser.add_argument('--txt', type=int, default=1)
    args = parser.parse_args()

    pickle_file = args.pickle_file.split(',')

    pickles = [read_from_pickle(f) for f in pickle_file]

    for pickle_data in pickles:
        print(len(pickle_data))
        pickle_data.sort(key=lambda x: x[0])

    # Average the per-video score vectors of all input pickles.
    merge_result = []
    for i in range(0, len(pickles[0])):
        video_name = pickles[0][i][0]
        result = pickles[0][i][1]
        for j in range(1, len(pickles)):
            result += pickles[j][i][1]
        result = result / len(pickles)
        merge_result.append((video_name, result))

    print(len(merge_result))

    result_path = args.target
    if args.txt:
        classify_result = []
        for i in range(0, 4934):
            classify_result.append([])
        print('Start sorting...')
        for i in range(0, 4934):
            merge_result.sort(key=lambda x: x[1][i], reverse=True)
            classify_result[i] = merge_result[0:100]

        with open(result_path, 'w+') as fin:
            for i in range(0, 4934):
                output_str = str(i + 1)
                data = classify_result[i]
                for d in data:
                    output_str += (' ' + d[0])
                output_str += '\n'
                fin.write(output_str)

    with open(result_path + '.pickle', 'wb+') as f:
        pickle.dump(merge_result, f)
--------------------------------------------------------------------------------
/simple_model/merge.sh:
--------------------------------------------------------------------------------
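# Average the four prediction pickles into the final feature-vector result (best_face).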
python3.6 merge.py --pickle-file ../data/simple_model/tmp_result/v1.pickle,../data/simple_model/tmp_result/v2.pickle,../data/simple_model/tmp_result/v3.pickle,../data/simple_model/tmp_result/mean.pickle --target ../submit_result/best_face.pickle
--------------------------------------------------------------------------------
/simple_model/pre.sh:
--------------------------------------------------------------------------------
python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model1" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result1.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model2" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result2.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model3" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result3.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model4" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result4.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model5" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result5.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model6" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result6.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model7" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result7.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model8" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result8.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model9" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result9.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model10" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result10.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model11" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result11.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model12" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result12.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model13" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result13.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model14" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result14.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model15" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result15.pickle
python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model16" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result16.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model17" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result17.pickle

python3.6 predict_multi.py --ratio 2 \
    --model-prefix "save_model/model18" \
    --pickle ../data/test_mean.pickle \
    --save-path /home/jzhengas/Jason/result/drop_file/tmp_pickle2/result18.pickle
--------------------------------------------------------------------------------
/simple_model/predict_multi.py:
--------------------------------------------------------------------------------
import pickle
import random
import numpy as np
from keras.models import *
from keras.layers import *
from keras import optimizers
from keras.utils.np_utils import to_categorical
from keras import regularizers
from keras.callbacks import LearningRateScheduler
import keras
import argparse
import tensorflow as tf
import keras.backend.tensorflow_backend as KTF
import os

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)
KTF.set_session(session)

random.seed(2018)


def get_data(pickle_path='', threshold=20, allow_empty=False, ratio=0):
    with open(pickle_path, 'rb') as fin:
        feats_dict = pickle.load(fin, encoding='iso-8859-1')
    x = []
    name_list = []
    video_count = 0
    for video_name in feats_dict:
        feats = feats_dict[video_name]
        if len(feats) == 0:
            continue
        video_count += 1
        feats.sort(key=lambda x: x[3], reverse=True)

        # Keep only the top 1/ratio of the features by quality score.
        if ratio > 0:
            end_pos = int(len(feats) / ratio) if int(len(feats) / ratio) >= 1 else 1
            feats = feats[0: end_pos]
        select_feats = [f for f in feats if f[3] > threshold]

        if (len(select_feats) == 0) and not allow_empty:
            select_feats = [feats[0]]

        for feat in select_feats:
            [frame_num, bbox, det_score, qua_score, feat_arr] = feat
            x.append(feat_arr)
            name_list.append(video_name)
    print(len(x), len(name_list), video_count)
    d = list(zip(x, name_list))
    random.shuffle(d)
    x, name_list = zip(*d)
    return np.array(x), name_list


def read_from_pickle(pickle_path):
    with open(pickle_path, 'rb') as fin:
        feats_dict = pickle.load(fin, encoding='iso-8859-1')
    x, y, det_score, face_score, video_name = zip(*feats_dict)
    x = np.array(x)
    video_name = np.array(video_name)
    return x, video_name, face_score


def get_result_dict(name_data, result):
    result_dict = {}
    for i in range(0, len(name_data)):
        tmp_name = name_data[i]
        if tmp_name not in result_dict:
            result_dict[tmp_name] = [result[i]]
        else:
            result_dict[tmp_name].append(result[i])
    all_result = []
    for name in result_dict:
        all_result.append([name, np.mean(result_dict[name], axis=0)])
    print('Result count:', len(all_result))
    return all_result

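# Each prediction pass loads the 4 models of one training run (face-quality
# bands 0-200, 20-200, 40-200 and 0-80), averages each model's outputs per
# video, then averages across the 4 models (see the README).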
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Predict with an ensemble of face models')
    # general
    parser.add_argument('--threshold', type=int, default=0, help='minimum face-quality score of the features to use')
    parser.add_argument('--model-prefix', default='../save_model/tmp_model/model_all_mix.hdf5_0_200')
    parser.add_argument('--save-path', default='/home/jzhengas/Jason/result/result/')
    parser.add_argument('--pickle', default='/home/jzhengas/Jason/data/simple_model/test_mean.pickle')
    parser.add_argument('--allow-empty', type=int, default=0)
    parser.add_argument('--ratio', type=int, default=2)
    args = parser.parse_args()
    threshold = args.threshold
    x, name_data = get_data(args.pickle, threshold=threshold, allow_empty=args.allow_empty, ratio=args.ratio)

    prefix = args.model_prefix
    model_list = [prefix + '_0_200.hdf5', prefix + '_20_200.hdf5', prefix + '_40_200.hdf5',
                  prefix + '_0_80.hdf5']
    result_list = []
    for model_name in model_list:
        model = load_model(model_name)
        result = model.predict(x, batch_size=256)
        print('Finish Predict...')

        all_result = get_result_dict(name_data, result)
        result = None
        result_list.append(all_result)

    merge_result = []
    for i in range(0, len(result_list[0])):
        video_name = result_list[0][i][0]
        result = result_list[0][i][1]
        for j in range(1, len(result_list)):
            result += result_list[j][i][1]
        result = result / len(result_list)
        merge_result.append((video_name, result))

    pickle_path = args.save_path
    with open(pickle_path, 'wb+') as f:
        pickle.dump(merge_result, f)
--------------------------------------------------------------------------------
/simple_model/train.sh:
--------------------------------------------------------------------------------
# part 1
python3.6 train_ini.py --seed 1 --save-model save_model/model1_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 1 --save-model save_model/model1_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 1 --save-model save_model/model1_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 1 --save-model save_model/model1_0_80.hdf5 --drop 0.8 --threshold 0,80

python3.6 train_ini.py --seed 2 --save-model save_model/model2_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 2 --save-model save_model/model2_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 2 --save-model save_model/model2_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 2 --save-model save_model/model2_0_80.hdf5 --drop 0.8 --threshold 0,80

python3.6 train_ini.py --seed 3 --save-model save_model/model3_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 3 --save-model save_model/model3_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 3 --save-model save_model/model3_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 3 --save-model save_model/model3_0_80.hdf5 --drop 0.8 --threshold 0,80

python3.6 train_ini.py --seed 4 --save-model save_model/model4_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 4 --save-model save_model/model4_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 4 --save-model save_model/model4_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 4 --save-model save_model/model4_0_80.hdf5 --drop 0.8 --threshold 0,80

python3.6 train_ini.py --seed 5 --save-model save_model/model5_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 5 --save-model save_model/model5_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 5 --save-model save_model/model5_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 5 --save-model save_model/model5_0_80.hdf5 --drop 0.8 --threshold 0,80

python3.6 train_ini.py --seed 6 --save-model save_model/model6_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 6 --save-model save_model/model6_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 6 --save-model save_model/model6_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 6 --save-model save_model/model6_0_80.hdf5 --drop 0.8 --threshold 0,80

# part 2
python3.6 train_ini.py --seed 7 --save-model save_model/model7_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 7 --save-model save_model/model7_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 7 --save-model save_model/model7_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 7 --save-model save_model/model7_0_80.hdf5 --drop 0.8 --threshold 0,80

python3.6 train_ini.py --seed 8 --save-model save_model/model8_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 8 --save-model save_model/model8_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 8 --save-model save_model/model8_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 8 --save-model save_model/model8_0_80.hdf5 --drop 0.8 --threshold 0,80

python3.6 train_ini.py --seed 9 --save-model save_model/model9_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 9 --save-model save_model/model9_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 9 --save-model save_model/model9_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 9 --save-model save_model/model9_0_80.hdf5 --drop 0.8 --threshold 0,80

python3.6 train_ini.py --seed 10 --save-model save_model/model10_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 10 --save-model save_model/model10_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 10 --save-model save_model/model10_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 10 --save-model save_model/model10_0_80.hdf5 --drop 0.8 --threshold 0,80

python3.6 train_ini.py --seed 11 --save-model save_model/model11_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 11 --save-model save_model/model11_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 11 --save-model save_model/model11_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 11 --save-model save_model/model11_0_80.hdf5 --drop 0.8 --threshold 0,80

python3.6 train_ini.py --seed 12 --save-model save_model/model12_0_200.hdf5 --drop 0.8 --threshold 0,200
python3.6 train_ini.py --seed 12 --save-model save_model/model12_20_200.hdf5 --drop 0.8 --threshold 20,200
python3.6 train_ini.py --seed 12 --save-model save_model/model12_40_200.hdf5 --drop 0.8 --threshold 40,200
python3.6 train_ini.py --seed 12 --save-model save_model/model12_0_80.hdf5 --drop 0.8 --threshold 0,80

# part 3
python3.6 train_ini.py --seed 13 --save-model save_model/model13_0_200.hdf5 --drop 0.8 --threshold 0,200 --allow-empty 1
python3.6 train_ini.py --seed 13 --save-model save_model/model13_20_200.hdf5 --drop 0.8 --threshold 20,200 --allow-empty 1
python3.6 train_ini.py --seed 13 --save-model save_model/model13_40_200.hdf5 --drop 0.8 --threshold 40,200 --allow-empty 1
python3.6 train_ini.py --seed 13 --save-model save_model/model13_0_80.hdf5 --drop 0.8 --threshold 0,80 --allow-empty 1

python3.6 train_ini.py --seed 14 --save-model save_model/model14_0_200.hdf5 --drop 0.8 --threshold 0,200 --allow-empty 1
python3.6 train_ini.py --seed 14 --save-model save_model/model14_20_200.hdf5 --drop 0.8 --threshold 20,200 --allow-empty 1
python3.6 train_ini.py --seed 14 --save-model save_model/model14_40_200.hdf5 --drop 0.8 --threshold 40,200 --allow-empty 1
python3.6 train_ini.py --seed 14 --save-model save_model/model14_0_80.hdf5 --drop 0.8 --threshold 0,80 --allow-empty 1

python3.6 train_ini.py --seed 15 --save-model save_model/model15_0_200.hdf5 --drop 0.8 --threshold 0,200 --allow-empty 1
python3.6 train_ini.py --seed 15 --save-model save_model/model15_20_200.hdf5 --drop 0.8 --threshold 20,200 --allow-empty 1
python3.6 train_ini.py --seed 15 --save-model save_model/model15_40_200.hdf5 --drop 0.8 --threshold 40,200 --allow-empty 1
python3.6 train_ini.py --seed 15 --save-model save_model/model15_0_80.hdf5 --drop 0.8 --threshold 0,80 --allow-empty 1

python3.6 train_ini.py --seed 16 --save-model save_model/model16_0_200.hdf5 --drop 0.8 --threshold 0,200 --allow-empty 1
python3.6 train_ini.py --seed 16 --save-model save_model/model16_20_200.hdf5 --drop 0.8 --threshold 20,200 --allow-empty 1
python3.6 train_ini.py --seed 16 --save-model save_model/model16_40_200.hdf5 --drop 0.8 --threshold 40,200 --allow-empty 1
python3.6 train_ini.py --seed 16 --save-model save_model/model16_0_80.hdf5 --drop 0.8 --threshold 0,80 --allow-empty 1

python3.6 train_ini.py --seed 17 --save-model save_model/model17_0_200.hdf5 --drop 0.8 --threshold 0,200 --allow-empty 1
python3.6 train_ini.py --seed 17 --save-model save_model/model17_20_200.hdf5 --drop 0.8 --threshold 20,200 --allow-empty 1
python3.6 train_ini.py --seed 17 --save-model save_model/model17_40_200.hdf5 --drop 0.8 --threshold 40,200 --allow-empty 1
python3.6 train_ini.py --seed 17 --save-model save_model/model17_0_80.hdf5 --drop 0.8 --threshold 0,80 --allow-empty 1

python3.6 train_ini.py --seed 18 --save-model save_model/model18_0_200.hdf5 --drop 0.8 --threshold 0,200 --allow-empty 1
python3.6 train_ini.py --seed 18 --save-model save_model/model18_20_200.hdf5 --drop 0.8 --threshold 20,200 --allow-empty 1
python3.6 train_ini.py --seed 18 --save-model save_model/model18_40_200.hdf5 --drop 0.8 --threshold 40,200 --allow-empty 1
python3.6 train_ini.py --seed 18 --save-model save_model/model18_0_80.hdf5 --drop 0.8 --threshold 0,80 --allow-empty 1
--------------------------------------------------------------------------------
/simple_model/train_ini.py:
--------------------------------------------------------------------------------
import pickle
import random
import numpy as np
from keras.models import *
from keras.layers import *
from keras import optimizers
from keras.utils.np_utils import to_categorical
from keras import regularizers
from keras.callbacks import LearningRateScheduler
import keras
import argparse

import tensorflow as tf
import keras.backend.tensorflow_backend as KTF

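# Let TensorFlow grow GPU memory on demand instead of reserving it all upfront.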
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)
KTF.set_session(session)


def load_from_raw_pickle(pickle_path='data/feats_val.pickle', allow_empty=False, threshold=None, drop=0.8):
    with open(pickle_path, 'rb') as fin:
        feats_dict = pickle.load(fin, encoding='iso-8859-1')
    with open('/home/jzhengas/Jason/data/simple_model/dict.pickle', 'rb') as fin:
        data_dict = pickle.load(fin, encoding='iso-8859-1')
    x = []
    y = []
    val_x = []
    val_y = []
    for video_name in feats_dict:
        # With drop=0.8 roughly 20% of the videos are held out for validation.
        drop_cond = random.random() > drop
        feats = feats_dict[video_name]
        if data_dict is not None:
            if video_name in data_dict:
                label = data_dict[video_name]
            else:
                # Videos without an annotation (noise) get the extra class 4935.
                label = 4935
        else:
            label = 4935
        if len(feats) == 0:
            continue
        label = label - 1
        feats.sort(key=lambda x: x[3], reverse=True)

        select_count = 0
        for feat in feats:
            [frame_num, bbox, det_score, qua_score, feat_arr] = feat
            if threshold is None or (qua_score > threshold[0] and qua_score < threshold[1]):
                if drop_cond:
                    val_x.append(feat_arr)
                    val_y.append(label)
                else:
                    x.append(feat_arr)
                    y.append(label)
                select_count += 1

        if select_count == 0 and not allow_empty:
            [frame_num, bbox, det_score, qua_score, feat_arr] = feats[int(len(feats) / 2)]
            if drop_cond:
                val_x.append(feat_arr)
                val_y.append(label)
            else:
                x.append(feat_arr)
                y.append(label)

    return np.array(x), np.array(y), np.array(val_x), np.array(val_y)


def lr_scheduler(epoch, lr_base=0.001):
    # Epochs after the first run at 0.8 * the base learning rate.
    if epoch >= 1:
        lr_base = lr_base * 0.8

    print('lr: %f' % lr_base)
    return lr_base


def read_from_pickle(pickle_path):
    with open(pickle_path, 'rb') as fin:
        feats_dict = pickle.load(fin, encoding='iso-8859-1')
    random.shuffle(feats_dict)
    x, y, det_score, face_score, video_name = zip(*feats_dict)
    x = np.array(x)
    y = np.array(y)
    video_name = np.array(video_name)
    return x, y

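# Per the README: the official train and val sets are both used for training
# (unannotated/noise videos become class 4935), each run samples roughly 80%
# of the videos, and train.sh repeats this for 18 seeds x 4 quality bands,
# giving 72 models in total.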
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Train face network')
    # general
    parser.add_argument('--threshold', type=str, default='0,200', help='face-quality range "low,high" of the features used for training')
    parser.add_argument('--save-model', default='../save_model/tmp_model/drop_model.hdf5', help='path to save the model.')
    parser.add_argument('--file-list', default='/home/jzhengas/Jason/data/simple_model/train.pickle,/home/jzhengas/Jason/data/simple_model/val.pickle', help='training data')
    parser.add_argument('--epoch', type=int, default=6, help='number of training epochs')
    parser.add_argument('--seed', type=int, default=2018)
    parser.add_argument('--allow-empty', type=int, default=0)
    parser.add_argument('--drop', type=float, default=1.0)
    args = parser.parse_args()
    random.seed(args.seed)

    threshold = [int(d) for d in args.threshold.split(',')]

    train_x1, train_y1, val_x1, val_y1 = load_from_raw_pickle('/home/jzhengas/Jason/data/simple_model/pickle/feats_train.pickle', threshold=threshold, drop=args.drop, allow_empty=args.allow_empty)
    train_x2, train_y2, val_x2, val_y2 = load_from_raw_pickle('/home/jzhengas/Jason/data/simple_model/pickle/feats_val.pickle', threshold=threshold, drop=args.drop, allow_empty=args.allow_empty)
    train_x = np.concatenate([train_x1, train_x2])
    train_y = np.concatenate([train_y1, train_y2])
    val_x = np.concatenate([val_x1, val_x2])
    val_y = np.concatenate([val_y1, val_y2])

    val_data = (val_x, val_y)

    print('Finish loading data, train len {}'.format(len(train_x)))

    # A small MLP on top of the 512-d ArcFace features: Dense -> BN -> Dropout -> softmax.
    input_tensor = Input(shape=(512,))
    x = Dense(1024, activation='relu')(input_tensor)
    x = BatchNormalization()(x)
    x = Dropout(0.5)(x)
    x = Dense(4935, activation='softmax')(x)
    model = Model(input_tensor, x)

    opt = optimizers.adam(lr=0.001)
    scheduler = LearningRateScheduler(lr_scheduler)

    model.compile(optimizer=opt, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    model.fit(x=train_x, y=train_y, epochs=args.epoch, batch_size=256, validation_data=val_data, callbacks=[scheduler], shuffle=True)
    model.save(args.save_model)
--------------------------------------------------------------------------------
/答辩.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Jasonbaby/IQIYI_VID_FACE/a2e75205118907a38d068b36569c546248ddfa4b/答辩.pdf
--------------------------------------------------------------------------------