├── .gitignore
├── README.md
├── detect_landmarks_in_image.py
├── examples
│   ├── cropped.jpg
│   ├── origin.jpg
│   └── rendered.png
├── load_data.py
├── models
│   ├── __pycache__
│   │   └── resnet_50.cpython-37.pyc
│   └── resnet_50.py
├── preprocess
│   ├── __pycache__
│   │   └── mtcnn.cpython-37.pyc
│   ├── mtcnn.py
│   └── utils
│       ├── __pycache__
│       │   └── detect_face.cpython-37.pyc
│       └── detect_face.py
├── recon_demo.py
├── reconstruction_mesh.py
├── requirements.txt
└── utils.py
/.gitignore:
--------------------------------------------------------------------------------
1 | BFM/*
2 | dataset/*
3 | facebank/*
4 | *.pt
5 | output/*
6 | __pycache__/*
7 | !examples/*
8 | params/*
9 | .vscode/*
10 | *.pyc
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Accurate 3D Face Reconstruction with Weakly-Supervised Learning: From Single Image to Image Set
2 | 
3 | PyTorch version of the repo [Deep3DFaceReconstruction](https://github.com/microsoft/Deep3DFaceReconstruction).
4 | 
5 | This repo contains only the **reconstruction** part, so you can use the [Deep3DFaceReconstruction-pytorch](https://github.com/changhongjian/Deep3DFaceReconstruction-pytorch) repo to train the network. The pretrained model also comes from that [repo](https://github.com/changhongjian/Deep3DFaceReconstruction-pytorch/tree/master/network).
6 | 
7 | ## Features
8 | 
9 | ### MTCNN
10 | 
11 | I use MTCNN to crop the raw images and detect 5 landmarks. Most of the MTCNN code comes from [facenet-pytorch](https://github.com/timesler/facenet-pytorch).
12 | 
13 | ### PyTorch3D
14 | 
15 | In this repo, I use [PyTorch3D 0.3.0](https://github.com/facebookresearch/pytorch3d) to render the reconstructed images.
16 | 
17 | ### Estimating Intrinsic Parameters
18 | 
19 | In the original repo ([Deep3DFaceReconstruction-pytorch](https://github.com/changhongjian/Deep3DFaceReconstruction-pytorch)), the rendered image does not line up with the input image because of `preprocess`. So, I add `estimate_intrinsic` to recover the intrinsic camera parameters.
20 | 
21 | ## Examples
22 | 
23 | Here are some examples:
24 | 
25 | |Origin Images|Cropped Images|Rendered Images|
26 | |-------------|--------------|---------------|
27 | |![Putin](examples/origin.jpg)|![Putin](examples/cropped.jpg)|![Putin](examples/rendered.png)|
28 | 
29 | 
30 | ## File Architecture
31 | 
32 | ```
33 | ├─BFM          same as Deep3DFaceReconstruction
34 | ├─dataset      stores the cropped images
35 | │ └─Vladimir_Putin
36 | ├─examples     shows examples
37 | ├─facebank     stores the raw/original images
38 | │ └─Vladimir_Putin
39 | ├─models       stores the pretrained models
40 | ├─output       stores the output files (.mat, .png)
41 | │ └─Vladimir_Putin
42 | └─preprocess   crops images and detects landmarks
43 |   ├─data       stores the MTCNN model weights
44 |   └─utils
45 | ```
46 | 
47 | This repo can also generate the UV map. To do so, you need to download the UV coordinates first:
48 |   Download the UV coordinates from the 3DMMasSTN repo: https://github.com/anilbas/3DMMasSTN/blob/master/util/BFM_UV.mat
49 |   Copy BFM_UV.mat into the BFM folder.
50 | 
51 | The pretrained models can be downloaded from [Google Drive](https://drive.google.com/file/d/1JjLl8-7Qurwlq5q61hSJEbCKFrhPh0t2/view?usp=sharing).
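
For reference, here is a minimal sketch of the intended workflow: run `detect_landmarks_in_image.py` to crop the images in `facebank/` into `dataset/` and write the 5-point landmark `.txt` files, then reconstruct a cropped image roughly as in `recon_demo.py`. The snippet below is condensed from `recon_demo.py`; the example image path (`dataset/Vladimir_Putin/0.jpg`) and the forward-slash weight path are illustrative assumptions, not files shipped with the repo.

```python
# Minimal reconstruction sketch (condensed from recon_demo.py); paths are assumptions.
import torch
import numpy as np
from models.resnet_50 import resnet50_use
from load_data import BFM, load_img, Preprocess
from reconstruction_mesh import reconstruction, transform_face_shape, estimate_intrinsic, render_img

device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
bfm = BFM('BFM/BFM_model_front.mat', device)   # Basel Face Model (front-cropped)
lm3D = bfm.load_lm3d()                         # standard 5 landmarks used for alignment

model = resnet50_use().to(device)
model.load_state_dict(torch.load('models/params.pt', map_location=device))
model.eval()

# cropped image plus its 5-landmark .txt produced by detect_landmarks_in_image.py
img, lm = load_img('dataset/Vladimir_Putin/0.jpg', 'dataset/Vladimir_Putin/0.txt')
input_img, _, trans_params = Preprocess(img, lm, lm3D)       # align/crop to 224x224 (BGR)
x = torch.from_numpy(input_img.astype(np.float32)).permute(0, 3, 1, 2).to(device)

with torch.no_grad():
    coef = torch.cat(model(x), dim=1)                        # 257-dim coefficient vector

face_shape, face_texture, face_color, lm2d, z_buf, angles, trans, gamma = reconstruction(coef, bfm)
fx, px, fy, py = estimate_intrinsic(lm2d, trans_params, z_buf, face_shape, bfm, angles, trans)

shape_t = transform_face_shape(face_shape, angles, trans)
shape_t[:, :, 2] = 10.0 - shape_t[:, :, 2]                   # flip depth for the renderer
rendered = render_img(shape_t, face_color / 255.0, bfm, 300, fx, fy, px, py)
```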
52 | -------------------------------------------------------------------------------- /detect_landmarks_in_image.py: -------------------------------------------------------------------------------- 1 | from preprocess.mtcnn import MTCNN 2 | from torch.utils.data import DataLoader 3 | from torchvision import datasets, transforms 4 | import torch 5 | import os 6 | 7 | 8 | def collate_pil(x): 9 | out_x, out_y = [], [] 10 | for xx, yy in x: 11 | out_x.append(xx) 12 | out_y.append(yy) 13 | return out_x, out_y 14 | 15 | 16 | batch_size = 1 17 | workers = 0 if os.name == 'nt' else 8 18 | dataset_dir = r'facebank' 19 | cropped_dataset = r'dataset' 20 | device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu') 21 | 22 | mtcnn = MTCNN( 23 | image_size=(300, 300), margin=20, min_face_size=20, 24 | thresholds=[0.6, 0.7, 0.7], factor=0.709, post_process=True, 25 | device=device 26 | ) 27 | 28 | dataset = datasets.ImageFolder( 29 | dataset_dir, transform=transforms.Resize((512, 512))) 30 | dataset.samples = [ 31 | (p, p.replace(dataset_dir, cropped_dataset)) 32 | for p, _ in dataset.samples 33 | ] 34 | loader = DataLoader( 35 | dataset, 36 | num_workers=workers, 37 | batch_size=batch_size, 38 | collate_fn=collate_pil 39 | ) 40 | 41 | for i, (x, y) in enumerate(loader): 42 | x = mtcnn(x, save_path=y, save_landmarks=True) 43 | -------------------------------------------------------------------------------- /examples/cropped.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csyuhao/Deep3DFaceReconstruction-Pytorch/7c506b55adee55bb269f73354dd16d5327a7fb04/examples/cropped.jpg -------------------------------------------------------------------------------- /examples/origin.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csyuhao/Deep3DFaceReconstruction-Pytorch/7c506b55adee55bb269f73354dd16d5327a7fb04/examples/origin.jpg -------------------------------------------------------------------------------- /examples/rendered.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csyuhao/Deep3DFaceReconstruction-Pytorch/7c506b55adee55bb269f73354dd16d5327a7fb04/examples/rendered.png -------------------------------------------------------------------------------- /load_data.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from scipy.io import loadmat, savemat 3 | from array import array 4 | import numpy as np 5 | from PIL import Image 6 | 7 | 8 | class BFM(object): 9 | # BFM 3D face model 10 | def __init__(self, model_path='BFM/BFM_model_front.mat', device='cpu'): 11 | model = loadmat(model_path) 12 | # mean face shape. [3*N,1] 13 | self.meanshape = torch.from_numpy(model['meanshape']) 14 | # identity basis. [3*N,80] 15 | self.idBase = torch.from_numpy(model['idBase']) 16 | self.exBase = torch.from_numpy(model['exBase'].astype( 17 | np.float32)) # expression basis. [3*N,64] 18 | # mean face texture. [3*N,1] (0-255) 19 | self.meantex = torch.from_numpy(model['meantex']) 20 | # texture basis. [3*N,80] 21 | self.texBase = torch.from_numpy(model['texBase']) 22 | # triangle indices for each vertex that lies in. starts from 1. [N,8] 23 | self.point_buf = model['point_buf'].astype(np.int32) 24 | # vertex indices in each triangle. starts from 1. [F,3] 25 | self.tri = model['tri'].astype(np.int32) 26 | # vertex indices of 68 facial landmarks. 
starts from 1. [68,1] 27 | self.keypoints = model['keypoints'].astype(np.int32)[0] 28 | self.to_device(device) 29 | 30 | def to_device(self, device): 31 | self.meanshape = self.meanshape.to(device) 32 | self.idBase = self.idBase.to(device) 33 | self.exBase = self.exBase.to(device) 34 | self.meantex = self.meantex.to(device) 35 | self.texBase = self.texBase.to(device) 36 | 37 | def load_lm3d(self, fsimilarity_Lm3D_all_mat='BFM/similarity_Lm3D_all.mat'): 38 | # load landmarks for standard face, which is used for image preprocessing 39 | Lm3D = loadmat(fsimilarity_Lm3D_all_mat) 40 | Lm3D = Lm3D['lm'] 41 | 42 | # calculate 5 facial landmarks using 68 landmarks 43 | lm_idx = np.array([31, 37, 40, 43, 46, 49, 55]) - 1 44 | Lm3D = np.stack([Lm3D[lm_idx[0], :], np.mean(Lm3D[lm_idx[[1, 2]], :], 0), np.mean( 45 | Lm3D[lm_idx[[3, 4]], :], 0), Lm3D[lm_idx[5], :], Lm3D[lm_idx[6], :]], axis=0) 46 | Lm3D = Lm3D[[1, 2, 0, 3, 4], :] 47 | self.Lm3D = Lm3D 48 | return Lm3D 49 | 50 | 51 | def load_expbasis(): 52 | # load expression basis 53 | n_vertex = 53215 54 | exp_bin = open(r'BFM\Exp_Pca.bin', 'rb') 55 | exp_dim = array('i') 56 | exp_dim.fromfile(exp_bin, 1) 57 | expMU = array('f') 58 | expPC = array('f') 59 | expMU.fromfile(exp_bin, 3*n_vertex) 60 | expPC.fromfile(exp_bin, 3*exp_dim[0]*n_vertex) 61 | 62 | expPC = np.array(expPC) 63 | expPC = np.reshape(expPC, [exp_dim[0], -1]) 64 | expPC = np.transpose(expPC) 65 | 66 | expEV = np.loadtxt(r'BFM\std_exp.txt') 67 | 68 | return expPC, expEV 69 | 70 | 71 | def transfer_BFM09(): 72 | # tranfer original BFM2009 to target face model 73 | original_BFM = loadmat(r'BFM\01_MorphableModel.mat') 74 | shapePC = original_BFM['shapePC'] # shape basis 75 | shapeEV = original_BFM['shapeEV'] # corresponding eigen values 76 | shapeMU = original_BFM['shapeMU'] # mean face 77 | texPC = original_BFM['texPC'] # texture basis 78 | texEV = original_BFM['texEV'] # corresponding eigen values 79 | texMU = original_BFM['texMU'] # mean texture 80 | 81 | expPC, expEV = load_expbasis() 82 | 83 | idBase = shapePC * np.reshape(shapeEV, [-1, 199]) 84 | idBase = idBase / 1e5 # unify the scale to decimeter 85 | idBase = idBase[:, :80] # use only first 80 basis 86 | 87 | exBase = expPC * np.reshape(expEV, [-1, 79]) 88 | exBase = exBase / 1e5 # unify the scale to decimeter 89 | exBase = exBase[:, :64] # use only first 64 basis 90 | 91 | texBase = texPC*np.reshape(texEV, [-1, 199]) 92 | texBase = texBase[:, :80] # use only first 80 basis 93 | 94 | # our face model is cropped align face landmarks which contains only 35709 vertex. 95 | # original BFM09 contains 53490 vertex, and expression basis provided by JuYong contains 53215 vertex. 96 | # thus we select corresponding vertex to get our face model. 
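# Index mapping used below (inferred from the two index files):
#   - BFM_front_idx.mat ('idx') selects the front-face vertices out of the 53215-vertex expression mesh
#   - BFM_exp_idx.mat ('trimIndex') maps each of those 53215 vertices back to the full 53490-vertex BFM09 mesh
# Composing them (index_shape[index_exp]) indexes the BFM09 shape/texture bases for the cropped model,
# while index_exp alone indexes the 53215-vertex expression basis.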
97 | index_exp = loadmat('BFM/BFM_front_idx.mat') 98 | index_exp = index_exp['idx'].astype( 99 | np.int32) - 1 # starts from 0 (to 53215) 100 | 101 | index_shape = loadmat('BFM/BFM_exp_idx.mat') 102 | index_shape = index_shape['trimIndex'].astype( 103 | np.int32) - 1 # starts from 0 (to 53490) 104 | index_shape = index_shape[index_exp] 105 | 106 | idBase = np.reshape(idBase, [-1, 3, 80]) 107 | idBase = idBase[index_shape, :, :] 108 | idBase = np.reshape(idBase, [-1, 80]) 109 | 110 | texBase = np.reshape(texBase, [-1, 3, 80]) 111 | texBase = texBase[index_shape, :, :] 112 | texBase = np.reshape(texBase, [-1, 80]) 113 | 114 | exBase = np.reshape(exBase, [-1, 3, 64]) 115 | exBase = exBase[index_exp, :, :] 116 | exBase = np.reshape(exBase, [-1, 64]) 117 | 118 | meanshape = np.reshape(shapeMU, [-1, 3]) / 1e5 119 | meanshape = meanshape[index_shape, :] 120 | meanshape = np.reshape(meanshape, [1, -1]) 121 | 122 | meantex = np.reshape(texMU, [-1, 3]) 123 | meantex = meantex[index_shape, :] 124 | meantex = np.reshape(meantex, [1, -1]) 125 | 126 | # other info contains triangles, region used for computing photometric loss, 127 | # region used for skin texture regularization, and 68 landmarks index etc. 128 | other_info = loadmat('BFM/facemodel_info.mat') 129 | frontmask2_idx = other_info['frontmask2_idx'] 130 | skinmask = other_info['skinmask'] 131 | keypoints = other_info['keypoints'] 132 | point_buf = other_info['point_buf'] 133 | tri = other_info['tri'] 134 | tri_mask2 = other_info['tri_mask2'] 135 | 136 | # save our face model 137 | savemat('BFM/BFM_model_front.mat', {'meanshape': meanshape, 'meantex': meantex, 'idBase': idBase, 'exBase': exBase, 'texBase': texBase, 138 | 'tri': tri, 'point_buf': point_buf, 'tri_mask2': tri_mask2, 'keypoints': keypoints, 'frontmask2_idx': frontmask2_idx, 'skinmask': skinmask}) 139 | 140 | 141 | # calculating least sqaures problem 142 | def POS(xp, x): 143 | npts = xp.shape[1] 144 | 145 | A = np.zeros([2*npts, 8]) 146 | 147 | A[0:2*npts-1:2, 0:3] = x.transpose() 148 | A[0:2*npts-1:2, 3] = 1 149 | 150 | A[1:2*npts:2, 4:7] = x.transpose() 151 | A[1:2*npts:2, 7] = 1 152 | 153 | b = np.reshape(xp.transpose(), [2*npts, 1]) 154 | 155 | k, _, _, _ = np.linalg.lstsq(A, b, rcond=None) 156 | 157 | R1 = k[0:3] 158 | R2 = k[4:7] 159 | sTx = k[3] 160 | sTy = k[7] 161 | s = (np.linalg.norm(R1) + np.linalg.norm(R2))/2 162 | t = np.stack([sTx, sTy], axis=0) 163 | 164 | return t, s 165 | 166 | 167 | def process_img(img, lm, t, s, target_size=224.): 168 | w0, h0 = img.size 169 | w = (w0/s*102).astype(np.int32) 170 | h = (h0/s*102).astype(np.int32) 171 | img = img.resize((w, h), resample=Image.BICUBIC) 172 | 173 | left = (w/2 - target_size/2 + float((t[0] - w0/2)*102/s)).astype(np.int32) 174 | right = left + target_size 175 | up = (h/2 - target_size/2 + float((h0/2 - t[1])*102/s)).astype(np.int32) 176 | below = up + target_size 177 | 178 | img = img.crop((left, up, right, below)) 179 | img = np.array(img) 180 | img = img[:, :, ::-1] # RGBtoBGR 181 | img = np.expand_dims(img, 0) 182 | lm = np.stack([lm[:, 0] - t[0] + w0/2, lm[:, 1] - 183 | t[1] + h0/2], axis=1)/s*102 184 | lm = lm - \ 185 | np.reshape( 186 | np.array([(w/2 - target_size/2), (h/2-target_size/2)]), [1, 2]) 187 | 188 | return img, lm 189 | 190 | 191 | def Preprocess(img, lm, lm3D): 192 | # resize and crop input images before sending to the R-Net 193 | w0, h0 = img.size 194 | 195 | # change from image plane coordinates to 3D sapce coordinates(X-Y plane) 196 | lm = np.stack([lm[:, 0], h0 - 1 - lm[:, 1]], axis=1) 197 | 
198 | # calculate translation and scale factors using 5 facial landmarks and standard landmarks 199 | # lm3D -> lm 200 | t, s = POS(lm.transpose(), lm3D.transpose()) 201 | 202 | # processing the image 203 | img_new, lm_new = process_img(img, lm, t, s) 204 | 205 | lm_new = np.stack([lm_new[:, 0], 223 - lm_new[:, 1]], axis=1) 206 | trans_params = np.array([w0, h0, 102.0/s, t[0, 0], t[1, 0]]) 207 | 208 | return img_new, lm_new, trans_params 209 | 210 | 211 | def load_img(img_path, lm_path): 212 | # load input images and corresponding 5 landmarks 213 | image = Image.open(img_path) 214 | lm = np.loadtxt(lm_path) 215 | return image, lm 216 | 217 | 218 | def save_obj(path, v, f, c): 219 | # save 3D face to obj file 220 | with open(path, 'w') as file: 221 | for i in range(len(v)): 222 | file.write('v %f %f %f %f %f %f\n' % 223 | (v[i, 0], v[i, 1], v[i, 2], c[i, 0], c[i, 1], c[i, 2])) 224 | 225 | file.write('\n') 226 | 227 | for i in range(len(f)): 228 | file.write('f %d %d %d\n' % (f[i, 0], f[i, 1], f[i, 2])) 229 | 230 | file.close() 231 | 232 | 233 | def transfer_UV(): 234 | uv_model = loadmat('BFM/BFM_UV.mat') 235 | 236 | index_exp = loadmat('BFM/BFM_front_idx.mat') 237 | index_exp = index_exp['idx'].astype( 238 | np.int32) - 1 # starts from 0 (to 53215) 239 | 240 | uv_pos = uv_model['UV'] 241 | uv_pos = uv_pos[index_exp, :] 242 | uv_pos = np.reshape(uv_pos, (-1, 2)) 243 | 244 | return uv_pos 245 | -------------------------------------------------------------------------------- /models/__pycache__/resnet_50.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csyuhao/Deep3DFaceReconstruction-Pytorch/7c506b55adee55bb269f73354dd16d5327a7fb04/models/__pycache__/resnet_50.cpython-37.pyc -------------------------------------------------------------------------------- /models/resnet_50.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import math 4 | import pickle 5 | 6 | 7 | def load_state_dict(model, fname): 8 | """ 9 | Set parameters converted from Caffe models authors of VGGFace2 provide. 10 | See https://www.robots.ox.ac.uk/~vgg/data/vgg_face2/. 11 | 12 | Arguments: 13 | model: model 14 | fname: file name of parameters converted from a Caffe model, assuming the file format is Pickle. 
15 | """ 16 | with open(fname, 'rb') as f: 17 | weights = pickle.load(f, encoding='latin1') 18 | 19 | own_state = model.state_dict() 20 | for name, param in weights.items(): 21 | if name in own_state: 22 | try: 23 | own_state[name].copy_(torch.from_numpy(param)) 24 | except Exception: 25 | raise RuntimeError('While copying the parameter named {}, whose dimensions in the model are {} and whose ' 26 | 'dimensions in the checkpoint are {}.'.format(name, own_state[name].size(), param.size())) 27 | else: 28 | # raise KeyError('unexpected key "{}" in state_dict'.format(name)) 29 | print('unexpected key "{}" in state_dict'.format(name)) 30 | 31 | 32 | def conv3x3(in_planes, out_planes, stride=1): 33 | """3x3 convolution with padding""" 34 | return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, 35 | padding=1, bias=False) 36 | 37 | 38 | def conv1x1(in_planes, out_planes, bias=True): 39 | """3x3 convolution with padding""" 40 | return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=1, bias=bias) 41 | 42 | 43 | class Bottleneck(nn.Module): 44 | expansion = 4 45 | 46 | def __init__(self, inplanes, planes, stride=1, downsample=None): 47 | super(Bottleneck, self).__init__() 48 | self.conv1 = nn.Conv2d( 49 | inplanes, planes, kernel_size=1, stride=stride, bias=False) 50 | self.bn1 = nn.BatchNorm2d(planes) 51 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, 52 | stride=1, padding=1, bias=False) 53 | self.bn2 = nn.BatchNorm2d(planes) 54 | self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False) 55 | self.bn3 = nn.BatchNorm2d(planes * 4) 56 | self.relu = nn.ReLU(inplace=True) 57 | self.downsample = downsample 58 | self.stride = stride 59 | 60 | def forward(self, x): 61 | residual = x 62 | 63 | out = self.conv1(x) 64 | out = self.bn1(out) 65 | out = self.relu(out) 66 | 67 | out = self.conv2(out) 68 | out = self.bn2(out) 69 | out = self.relu(out) 70 | 71 | out = self.conv3(out) 72 | out = self.bn3(out) 73 | 74 | if self.downsample is not None: 75 | residual = self.downsample(x) 76 | 77 | out += residual 78 | out = self.relu(out) 79 | 80 | return out 81 | 82 | 83 | class ResNet(nn.Module): 84 | 85 | def __init__(self, block, layers, num_classes=-1, include_top=True): 86 | self.inplanes = 64 87 | super(ResNet, self).__init__() 88 | self.include_top = include_top 89 | 90 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, 91 | stride=2, padding=3, bias=False) 92 | self.bn1 = nn.BatchNorm2d(64) 93 | self.relu = nn.ReLU(inplace=True) 94 | self.maxpool = nn.MaxPool2d( 95 | kernel_size=3, stride=2, padding=0, ceil_mode=True) 96 | 97 | self.layer1 = self._make_layer(block, 64, layers[0]) 98 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2) 99 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2) 100 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2) 101 | self.avgpool = nn.AvgPool2d(7, stride=1) 102 | 103 | # self.fc = nn.Linear(512 * block.expansion, num_classes) 104 | 105 | # CHJ_ADD task use 106 | self.fc_dims = { 107 | "id": 80, 108 | "ex": 64, 109 | "tex": 80, 110 | "angles": 3, 111 | "gamma": 27, 112 | "XY": 2, 113 | "Z": 1} 114 | 115 | # self.fc_dims_arr=[0] * (1+len(self.fc_dims)) 116 | # for i, (k, v) in enumerate(self.fc_dims.items()): 117 | # self.fc_dims_arr[i+1] = v + self.fc_dims_arr[i] 118 | 119 | _outdim = 512 * block.expansion 120 | ''' 121 | self.fcid = nn.Linear(_outdim, 80) 122 | self.fcex = nn.Linear(_outdim, 64) 123 | self.fctex = nn.Linear(_outdim, 80) 124 | self.fcangles = nn.Linear(_outdim, 3) 125 | self.fcgamma 
= nn.Linear(_outdim, 27) 126 | self.fcXY = nn.Linear(_outdim, 2) 127 | self.fcZ = nn.Linear(_outdim, 1) 128 | ''' 129 | self.fcid = conv1x1(_outdim, 80) 130 | self.fcex = conv1x1(_outdim, 64) 131 | self.fctex = conv1x1(_outdim, 80) 132 | self.fcangles = conv1x1(_outdim, 3) 133 | self.fcgamma = conv1x1(_outdim, 27) 134 | self.fcXY = conv1x1(_outdim, 2) 135 | self.fcZ = conv1x1(_outdim, 1) 136 | 137 | self.arr_fc = [self.fcid, self.fcex, self.fctex, 138 | self.fcangles, self.fcgamma, self.fcXY, self.fcZ] 139 | 140 | for m in self.modules(): 141 | if isinstance(m, nn.Conv2d): 142 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 143 | m.weight.data.normal_(0, math.sqrt(2. / n)) 144 | elif isinstance(m, nn.BatchNorm2d): 145 | m.weight.data.fill_(1) 146 | m.bias.data.zero_() 147 | 148 | def _make_layer(self, block, planes, blocks, stride=1): 149 | downsample = None 150 | if stride != 1 or self.inplanes != planes * block.expansion: 151 | downsample = nn.Sequential( 152 | nn.Conv2d(self.inplanes, planes * block.expansion, 153 | kernel_size=1, stride=stride, bias=False), 154 | nn.BatchNorm2d(planes * block.expansion), 155 | ) 156 | 157 | layers = [] 158 | layers.append(block(self.inplanes, planes, stride, downsample)) 159 | self.inplanes = planes * block.expansion 160 | for i in range(1, blocks): 161 | layers.append(block(self.inplanes, planes)) 162 | 163 | return nn.Sequential(*layers) 164 | 165 | def forward(self, x): 166 | x = self.conv1(x) 167 | x = self.bn1(x) 168 | x = self.relu(x) 169 | x = self.maxpool(x) 170 | x = self.layer1(x) 171 | x = self.layer2(x) 172 | x = self.layer3(x) 173 | x = self.layer4(x) 174 | x = self.avgpool(x) 175 | 176 | # 这里不需要view 177 | n_b = x.size(0) 178 | outs = [] 179 | for fc in self.arr_fc: 180 | outs.append(fc(x).view(n_b, -1)) 181 | 182 | return outs 183 | 184 | 185 | def resnet50_use(): 186 | """Constructs a ResNet-50 model. 187 | """ 188 | model = ResNet(Bottleneck, [3, 4, 6, 3]) 189 | return model 190 | -------------------------------------------------------------------------------- /preprocess/__pycache__/mtcnn.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csyuhao/Deep3DFaceReconstruction-Pytorch/7c506b55adee55bb269f73354dd16d5327a7fb04/preprocess/__pycache__/mtcnn.cpython-37.pyc -------------------------------------------------------------------------------- /preprocess/mtcnn.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | import numpy as np 4 | import os 5 | 6 | from .utils.detect_face import detect_face, extract_face, save_landmark 7 | 8 | 9 | class PNet(nn.Module): 10 | """MTCNN PNet. 
11 | 12 | Keyword Arguments: 13 | pretrained {bool} -- Whether or not to load saved pretrained weights (default: {True}) 14 | """ 15 | 16 | def __init__(self, pretrained=True): 17 | super().__init__() 18 | 19 | self.conv1 = nn.Conv2d(3, 10, kernel_size=3) 20 | self.prelu1 = nn.PReLU(10) 21 | self.pool1 = nn.MaxPool2d(2, 2, ceil_mode=True) 22 | self.conv2 = nn.Conv2d(10, 16, kernel_size=3) 23 | self.prelu2 = nn.PReLU(16) 24 | self.conv3 = nn.Conv2d(16, 32, kernel_size=3) 25 | self.prelu3 = nn.PReLU(32) 26 | self.conv4_1 = nn.Conv2d(32, 2, kernel_size=1) 27 | self.softmax4_1 = nn.Softmax(dim=1) 28 | self.conv4_2 = nn.Conv2d(32, 4, kernel_size=1) 29 | 30 | self.training = False 31 | 32 | if pretrained: 33 | state_dict_path = os.path.join( 34 | os.path.dirname(__file__), 'data/pnet.pt') 35 | state_dict = torch.load(state_dict_path) 36 | self.load_state_dict(state_dict) 37 | 38 | def forward(self, x): 39 | x = self.conv1(x) 40 | x = self.prelu1(x) 41 | x = self.pool1(x) 42 | x = self.conv2(x) 43 | x = self.prelu2(x) 44 | x = self.conv3(x) 45 | x = self.prelu3(x) 46 | a = self.conv4_1(x) 47 | a = self.softmax4_1(a) 48 | b = self.conv4_2(x) 49 | return b, a 50 | 51 | 52 | class RNet(nn.Module): 53 | """MTCNN RNet. 54 | 55 | Keyword Arguments: 56 | pretrained {bool} -- Whether or not to load saved pretrained weights (default: {True}) 57 | """ 58 | 59 | def __init__(self, pretrained=True): 60 | super().__init__() 61 | 62 | self.conv1 = nn.Conv2d(3, 28, kernel_size=3) 63 | self.prelu1 = nn.PReLU(28) 64 | self.pool1 = nn.MaxPool2d(3, 2, ceil_mode=True) 65 | self.conv2 = nn.Conv2d(28, 48, kernel_size=3) 66 | self.prelu2 = nn.PReLU(48) 67 | self.pool2 = nn.MaxPool2d(3, 2, ceil_mode=True) 68 | self.conv3 = nn.Conv2d(48, 64, kernel_size=2) 69 | self.prelu3 = nn.PReLU(64) 70 | self.dense4 = nn.Linear(576, 128) 71 | self.prelu4 = nn.PReLU(128) 72 | self.dense5_1 = nn.Linear(128, 2) 73 | self.softmax5_1 = nn.Softmax(dim=1) 74 | self.dense5_2 = nn.Linear(128, 4) 75 | 76 | self.training = False 77 | 78 | if pretrained: 79 | state_dict_path = os.path.join( 80 | os.path.dirname(__file__), 'data/rnet.pt') 81 | state_dict = torch.load(state_dict_path) 82 | self.load_state_dict(state_dict) 83 | 84 | def forward(self, x): 85 | x = self.conv1(x) 86 | x = self.prelu1(x) 87 | x = self.pool1(x) 88 | x = self.conv2(x) 89 | x = self.prelu2(x) 90 | x = self.pool2(x) 91 | x = self.conv3(x) 92 | x = self.prelu3(x) 93 | x = x.permute(0, 3, 2, 1).contiguous() 94 | x = self.dense4(x.view(x.shape[0], -1)) 95 | x = self.prelu4(x) 96 | a = self.dense5_1(x) 97 | a = self.softmax5_1(a) 98 | b = self.dense5_2(x) 99 | return b, a 100 | 101 | 102 | class ONet(nn.Module): 103 | """MTCNN ONet. 
104 | 105 | Keyword Arguments: 106 | pretrained {bool} -- Whether or not to load saved pretrained weights (default: {True}) 107 | """ 108 | 109 | def __init__(self, pretrained=True): 110 | super().__init__() 111 | 112 | self.conv1 = nn.Conv2d(3, 32, kernel_size=3) 113 | self.prelu1 = nn.PReLU(32) 114 | self.pool1 = nn.MaxPool2d(3, 2, ceil_mode=True) 115 | self.conv2 = nn.Conv2d(32, 64, kernel_size=3) 116 | self.prelu2 = nn.PReLU(64) 117 | self.pool2 = nn.MaxPool2d(3, 2, ceil_mode=True) 118 | self.conv3 = nn.Conv2d(64, 64, kernel_size=3) 119 | self.prelu3 = nn.PReLU(64) 120 | self.pool3 = nn.MaxPool2d(2, 2, ceil_mode=True) 121 | self.conv4 = nn.Conv2d(64, 128, kernel_size=2) 122 | self.prelu4 = nn.PReLU(128) 123 | self.dense5 = nn.Linear(1152, 256) 124 | self.prelu5 = nn.PReLU(256) 125 | self.dense6_1 = nn.Linear(256, 2) 126 | self.softmax6_1 = nn.Softmax(dim=1) 127 | self.dense6_2 = nn.Linear(256, 4) 128 | self.dense6_3 = nn.Linear(256, 10) 129 | 130 | self.training = False 131 | 132 | if pretrained: 133 | state_dict_path = os.path.join( 134 | os.path.dirname(__file__), 'data/onet.pt') 135 | state_dict = torch.load(state_dict_path) 136 | self.load_state_dict(state_dict) 137 | 138 | def forward(self, x): 139 | x = self.conv1(x) 140 | x = self.prelu1(x) 141 | x = self.pool1(x) 142 | x = self.conv2(x) 143 | x = self.prelu2(x) 144 | x = self.pool2(x) 145 | x = self.conv3(x) 146 | x = self.prelu3(x) 147 | x = self.pool3(x) 148 | x = self.conv4(x) 149 | x = self.prelu4(x) 150 | x = x.permute(0, 3, 2, 1).contiguous() 151 | x = self.dense5(x.view(x.shape[0], -1)) 152 | x = self.prelu5(x) 153 | a = self.dense6_1(x) 154 | a = self.softmax6_1(a) 155 | b = self.dense6_2(x) 156 | c = self.dense6_3(x) 157 | return b, c, a 158 | 159 | 160 | class MTCNN(nn.Module): 161 | """MTCNN face detection module. 162 | 163 | This class loads pretrained P-, R-, and O-nets and returns images cropped to include the face 164 | only, given raw input images of one of the following types: 165 | - PIL image or list of PIL images 166 | - numpy.ndarray (uint8) representing either a single image (3D) or a batch of images (4D). 167 | Cropped faces can optionally be saved to file 168 | also. 169 | 170 | Keyword Arguments: 171 | image_size {int} -- Output image size in pixels. The image will be square. (default: {160}) 172 | margin {int} -- Margin to add to bounding box, in terms of pixels in the final image. 173 | Note that the application of the margin differs slightly from the davidsandberg/facenet 174 | repo, which applies the margin to the original image before resizing, making the margin 175 | dependent on the original image size (this is a bug in davidsandberg/facenet). 176 | (default: {0}) 177 | min_face_size {int} -- Minimum face size to search for. (default: {20}) 178 | thresholds {list} -- MTCNN face detection thresholds (default: {[0.6, 0.7, 0.7]}) 179 | factor {float} -- Factor used to create a scaling pyramid of face sizes. (default: {0.709}) 180 | post_process {bool} -- Whether or not to post process images tensors before returning. 181 | (default: {True}) 182 | select_largest {bool} -- If True, if multiple faces are detected, the largest is returned. 183 | If False, the face with the highest detection probability is returned. 184 | (default: {True}) 185 | keep_all {bool} -- If True, all detected faces are returned, in the order dictated by the 186 | select_largest parameter. If a save_path is specified, the first face is saved to that 187 | path and the remaining faces are saved to 1, 2 etc. 
188 | device {torch.device} -- The device on which to run neural net passes. Image tensors and 189 | models are copied to this device before running forward passes. (default: {None}) 190 | """ 191 | 192 | def __init__( 193 | self, image_size=160, margin=0, min_face_size=20, 194 | thresholds=[0.6, 0.7, 0.7], factor=0.709, post_process=True, 195 | select_largest=True, keep_all=False, device=None 196 | ): 197 | super().__init__() 198 | 199 | self.image_size = image_size 200 | self.margin = margin 201 | self.min_face_size = min_face_size 202 | self.thresholds = thresholds 203 | self.factor = factor 204 | self.post_process = post_process 205 | self.select_largest = select_largest 206 | self.keep_all = keep_all 207 | 208 | self.pnet = PNet() 209 | self.rnet = RNet() 210 | self.onet = ONet() 211 | 212 | self.device = torch.device('cpu') 213 | if device is not None: 214 | self.device = device 215 | self.to(device) 216 | 217 | def forward(self, img, save_path=None, return_prob=False, save_landmarks=False): 218 | """Run MTCNN face detection on a PIL image or numpy array. This method performs both 219 | detection and extraction of faces, returning tensors representing detected faces rather 220 | than the bounding boxes. To access bounding boxes, see the MTCNN.detect() method below. 221 | 222 | Arguments: 223 | img {PIL.Image, np.ndarray, or list} -- A PIL image, np.ndarray, or list. 224 | 225 | Keyword Arguments: 226 | save_path {str} -- An optional save path for the cropped image. Note that when 227 | self.post_process=True, although the returned tensor is post processed, the saved 228 | face image is not, so it is a true representation of the face in the input image. 229 | If `img` is a list of images, `save_path` should be a list of equal length. 230 | (default: {None}) 231 | return_prob {bool} -- Whether or not to return the detection probability. 232 | (default: {False}) 233 | 234 | Returns: 235 | Union[torch.Tensor, tuple(torch.tensor, float)] -- If detected, cropped image of a face 236 | with dimensions 3 x image_size x image_size. Optionally, the probability that a 237 | face was detected. If self.keep_all is True, n detected faces are returned in an 238 | n x 3 x image_size x image_size tensor with an optional list of detection 239 | probabilities. If `img` is a list of images, the item(s) returned have an extra 240 | dimension (batch) as the first dimension. 
241 | 242 | Example: 243 | >>> from facenet_pytorch import MTCNN 244 | >>> mtcnn = MTCNN() 245 | >>> face_tensor, prob = mtcnn(img, save_path='face.png', return_prob=True) 246 | """ 247 | 248 | # Detect faces 249 | with torch.no_grad(): 250 | res = self.detect(img, save_landmarks) 251 | if save_landmarks: 252 | batch_boxes, batch_probs, batch_landmarks = res[0], res[1], res[2] 253 | else: 254 | batch_boxes, batch_probs = res[0], res[1] 255 | 256 | # Determine if a batch or single image was passed 257 | batch_mode = True 258 | if not isinstance(img, (list, tuple)) and not (isinstance(img, np.ndarray) and len(img.shape) == 4): 259 | img = [img] 260 | batch_boxes = [batch_boxes] 261 | batch_probs = [batch_probs] 262 | batch_mode = False 263 | 264 | # Parse save path(s) 265 | if save_path is not None: 266 | if isinstance(save_path, str): 267 | save_path = [save_path] 268 | else: 269 | save_path = [None for _ in range(len(img))] 270 | 271 | # Process all bounding boxes and probabilities 272 | faces, probs = [], [] 273 | for idx, (im, box_im, prob_im, path_im) in enumerate(zip(img, batch_boxes, batch_probs, save_path)): 274 | if box_im is None: 275 | faces.append(None) 276 | probs.append([None] if self.keep_all else None) 277 | continue 278 | 279 | if not self.keep_all: 280 | box_im = box_im[[0]] 281 | 282 | land_im = batch_landmarks[idx] 283 | 284 | faces_im = [] 285 | for i, box in enumerate(box_im): 286 | face_path = path_im 287 | save_name, ext = os.path.splitext(path_im) 288 | landmark_path = save_name + '.txt' 289 | 290 | land = land_im[i] 291 | 292 | if path_im is not None and i > 0: 293 | save_name, ext = os.path.splitext(path_im) 294 | face_path = save_name + '_' + str(i + 1) + ext 295 | landmark_path = save_name + '_' + str(i + 1) + '.txt' 296 | 297 | face = extract_face(im, box, self.image_size, 298 | self.margin, face_path) 299 | if save_landmarks: 300 | save_landmark(im, box, self.image_size, self.margin, land, landmark_path) 301 | if self.post_process: 302 | face = fixed_image_standardization(face) 303 | faces_im.append(face) 304 | 305 | if self.keep_all: 306 | faces_im = torch.stack(faces_im) 307 | else: 308 | faces_im = faces_im[0] 309 | prob_im = prob_im[0] 310 | 311 | faces.append(faces_im) 312 | probs.append(prob_im) 313 | 314 | if not batch_mode: 315 | faces = faces[0] 316 | probs = probs[0] 317 | 318 | if return_prob: 319 | return faces, probs 320 | else: 321 | return faces 322 | 323 | def detect(self, img, landmarks=False): 324 | """Detect all faces in PIL image and return bounding boxes and optional facial landmarks. 325 | 326 | This method is used by the forward method and is also useful for face detection tasks 327 | that require lower-level handling of bounding boxes and facial landmarks (e.g., face 328 | tracking). The functionality of the forward function can be emulated by using this method 329 | followed by the extract_face() function. 330 | 331 | Arguments: 332 | img {PIL.Image, np.ndarray, or list} -- A PIL image or a list of PIL images. 333 | 334 | Keyword Arguments: 335 | landmarks {bool} -- Whether to return facial landmarks in addition to bounding boxes. 336 | (default: {False}) 337 | 338 | Returns: 339 | tuple(numpy.ndarray, list) -- For N detected faces, a tuple containing an 340 | Nx4 array of bounding boxes and a length N list of detection probabilities. 341 | Returned boxes will be sorted in descending order by detection probability if 342 | self.select_largest=False, otherwise the largest face will be returned first. 
343 | If `img` is a list of images, the items returned have an extra dimension 344 | (batch) as the first dimension. Optionally, a third item, the facial landmarks, 345 | are returned if `landmarks=True`. 346 | 347 | Example: 348 | >>> from PIL import Image, ImageDraw 349 | >>> from facenet_pytorch import MTCNN, extract_face 350 | >>> mtcnn = MTCNN(keep_all=True) 351 | >>> boxes, probs, points = mtcnn.detect(img, landmarks=True) 352 | >>> # Draw boxes and save faces 353 | >>> img_draw = img.copy() 354 | >>> draw = ImageDraw.Draw(img_draw) 355 | >>> for i, (box, point) in enumerate(zip(boxes, points)): 356 | ... draw.rectangle(box.tolist(), width=5) 357 | ... for p in point: 358 | ... draw.rectangle((p - 10).tolist() + (p + 10).tolist(), width=10) 359 | ... extract_face(img, box, save_path='detected_face_{}.png'.format(i)) 360 | >>> img_draw.save('annotated_faces.png') 361 | """ 362 | 363 | with torch.no_grad(): 364 | batch_boxes, batch_points = detect_face( 365 | img, self.min_face_size, 366 | self.pnet, self.rnet, self.onet, 367 | self.thresholds, self.factor, 368 | self.device 369 | ) 370 | 371 | boxes, probs, points = [], [], [] 372 | for box, point in zip(batch_boxes, batch_points): 373 | box = np.array(box) 374 | point = np.array(point) 375 | if len(box) == 0: 376 | boxes.append(None) 377 | probs.append([None]) 378 | points.append(None) 379 | elif self.select_largest: 380 | box_order = np.argsort( 381 | (box[:, 2] - box[:, 0]) * (box[:, 3] - box[:, 1]))[::-1] 382 | box = box[box_order] 383 | point = point[box_order] 384 | boxes.append(box[:, :4]) 385 | probs.append(box[:, 4]) 386 | points.append(point) 387 | else: 388 | boxes.append(box[:, :4]) 389 | probs.append(box[:, 4]) 390 | points.append(point) 391 | boxes = np.array(boxes) 392 | probs = np.array(probs) 393 | points = np.array(points) 394 | 395 | if not isinstance(img, (list, tuple)) and not (isinstance(img, np.ndarray) and len(img.shape) == 4): 396 | boxes = boxes[0] 397 | probs = probs[0] 398 | points = points[0] 399 | 400 | if landmarks: 401 | return boxes, probs, points 402 | 403 | return boxes, probs 404 | 405 | 406 | def fixed_image_standardization(image_tensor): 407 | processed_tensor = (image_tensor - 127.5) / 128.0 408 | return processed_tensor 409 | 410 | 411 | def prewhiten(x): 412 | mean = x.mean() 413 | std = x.std() 414 | std_adj = std.clamp(min=1.0/(float(x.numel())**0.5)) 415 | y = (x - mean) / std_adj 416 | return y 417 | -------------------------------------------------------------------------------- /preprocess/utils/__pycache__/detect_face.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csyuhao/Deep3DFaceReconstruction-Pytorch/7c506b55adee55bb269f73354dd16d5327a7fb04/preprocess/utils/__pycache__/detect_face.cpython-37.pyc -------------------------------------------------------------------------------- /preprocess/utils/detect_face.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.nn.functional import interpolate 3 | from torchvision.transforms import functional as F 4 | from torchvision.ops.boxes import batched_nms 5 | import cv2 6 | from PIL import Image, ImageDraw 7 | import numpy as np 8 | import os 9 | 10 | 11 | def detect_face(imgs, minsize, pnet, rnet, onet, threshold, factor, device): 12 | if isinstance(imgs, (np.ndarray, torch.Tensor)): 13 | imgs = torch.as_tensor(imgs, device=device) 14 | if len(imgs.shape) == 3: 15 | imgs = imgs.unsqueeze(0) 16 | else: 
17 | if not isinstance(imgs, (list, tuple)): 18 | imgs = [imgs] 19 | if any(img.size != imgs[0].size for img in imgs): 20 | raise Exception("MTCNN batch processing only compatible with equal-dimension images.") 21 | imgs = np.stack([np.uint8(img) for img in imgs]) 22 | 23 | imgs = torch.as_tensor(imgs, device=device) 24 | 25 | imgs = imgs.permute(0, 3, 1, 2).float() 26 | 27 | batch_size = len(imgs) 28 | h, w = imgs.shape[2:4] 29 | m = 12.0 / minsize 30 | minl = min(h, w) 31 | minl = minl * m 32 | 33 | # Create scale pyramid 34 | scale_i = m 35 | scales = [] 36 | while minl >= 12: 37 | scales.append(scale_i) 38 | scale_i = scale_i * factor 39 | minl = minl * factor 40 | 41 | # First stage 42 | boxes = [] 43 | image_inds = [] 44 | all_inds = [] 45 | all_i = 0 46 | for scale in scales: 47 | im_data = imresample(imgs, (int(h * scale + 1), int(w * scale + 1))) 48 | im_data = (im_data - 127.5) * 0.0078125 49 | reg, probs = pnet(im_data) 50 | 51 | boxes_scale, image_inds_scale = generateBoundingBox(reg, probs[:, 1], scale, threshold[0]) 52 | boxes.append(boxes_scale) 53 | image_inds.append(image_inds_scale) 54 | all_inds.append(all_i + image_inds_scale) 55 | all_i += batch_size 56 | 57 | boxes = torch.cat(boxes, dim=0) 58 | image_inds = torch.cat(image_inds, dim=0).cpu() 59 | all_inds = torch.cat(all_inds, dim=0) 60 | 61 | # NMS within each scale + image 62 | pick = batched_nms(boxes[:, :4], boxes[:, 4], all_inds, 0.5) 63 | boxes, image_inds = boxes[pick], image_inds[pick] 64 | 65 | # NMS within each image 66 | pick = batched_nms(boxes[:, :4], boxes[:, 4], image_inds, 0.7) 67 | boxes, image_inds = boxes[pick], image_inds[pick] 68 | 69 | regw = boxes[:, 2] - boxes[:, 0] 70 | regh = boxes[:, 3] - boxes[:, 1] 71 | qq1 = boxes[:, 0] + boxes[:, 5] * regw 72 | qq2 = boxes[:, 1] + boxes[:, 6] * regh 73 | qq3 = boxes[:, 2] + boxes[:, 7] * regw 74 | qq4 = boxes[:, 3] + boxes[:, 8] * regh 75 | boxes = torch.stack([qq1, qq2, qq3, qq4, boxes[:, 4]]).permute(1, 0) 76 | boxes = rerec(boxes) 77 | y, ey, x, ex = pad(boxes, w, h) 78 | 79 | # Second stage 80 | if len(boxes) > 0: 81 | im_data = [] 82 | for k in range(len(y)): 83 | if ey[k] > (y[k] - 1) and ex[k] > (x[k] - 1): 84 | img_k = imgs[image_inds[k], :, (y[k] - 1):ey[k], (x[k] - 1):ex[k]].unsqueeze(0) 85 | im_data.append(imresample(img_k, (24, 24))) 86 | im_data = torch.cat(im_data, dim=0) 87 | im_data = (im_data - 127.5) * 0.0078125 88 | out = rnet(im_data) 89 | 90 | out0 = out[0].permute(1, 0) 91 | out1 = out[1].permute(1, 0) 92 | score = out1[1, :] 93 | ipass = score > threshold[1] 94 | boxes = torch.cat((boxes[ipass, :4], score[ipass].unsqueeze(1)), dim=1) 95 | image_inds = image_inds[ipass] 96 | mv = out0[:, ipass].permute(1, 0) 97 | 98 | # NMS within each image 99 | pick = batched_nms(boxes[:, :4], boxes[:, 4], image_inds, 0.7) 100 | boxes, image_inds, mv = boxes[pick], image_inds[pick], mv[pick] 101 | boxes = bbreg(boxes, mv) 102 | boxes = rerec(boxes) 103 | 104 | # Third stage 105 | points = torch.zeros(0, 5, 2, device=device) 106 | if len(boxes) > 0: 107 | y, ey, x, ex = pad(boxes, w, h) 108 | im_data = [] 109 | for k in range(len(y)): 110 | if ey[k] > (y[k] - 1) and ex[k] > (x[k] - 1): 111 | img_k = imgs[image_inds[k], :, (y[k] - 1):ey[k], (x[k] - 1):ex[k]].unsqueeze(0) 112 | im_data.append(imresample(img_k, (48, 48))) 113 | im_data = torch.cat(im_data, dim=0) 114 | im_data = (im_data - 127.5) * 0.0078125 115 | out = onet(im_data) 116 | 117 | out0 = out[0].permute(1, 0) 118 | out1 = out[1].permute(1, 0) 119 | out2 = out[2].permute(1, 0) 120 | 
score = out2[1, :] 121 | points = out1 122 | ipass = score > threshold[2] 123 | points = points[:, ipass] 124 | boxes = torch.cat((boxes[ipass, :4], score[ipass].unsqueeze(1)), dim=1) 125 | image_inds = image_inds[ipass] 126 | mv = out0[:, ipass].permute(1, 0) 127 | 128 | w_i = boxes[:, 2] - boxes[:, 0] + 1 129 | h_i = boxes[:, 3] - boxes[:, 1] + 1 130 | points_x = w_i.repeat(5, 1) * points[:5, :] + boxes[:, 0].repeat(5, 1) - 1 131 | points_y = h_i.repeat(5, 1) * points[5:10, :] + boxes[:, 1].repeat(5, 1) - 1 132 | points = torch.stack((points_x, points_y)).permute(2, 1, 0) 133 | boxes = bbreg(boxes, mv) 134 | 135 | # NMS within each image using "Min" strategy 136 | # pick = batched_nms(boxes[:, :4], boxes[:, 4], image_inds, 0.7) 137 | pick = batched_nms_numpy(boxes[:, :4], boxes[:, 4], image_inds, 0.7, 'Min') 138 | boxes, image_inds, points = boxes[pick], image_inds[pick], points[pick] 139 | 140 | boxes = boxes.cpu().numpy() 141 | points = points.cpu().numpy() 142 | 143 | batch_boxes = [] 144 | batch_points = [] 145 | for b_i in range(batch_size): 146 | b_i_inds = np.where(image_inds == b_i) 147 | batch_boxes.append(boxes[b_i_inds].copy()) 148 | batch_points.append(points[b_i_inds].copy()) 149 | 150 | batch_boxes, batch_points = np.array(batch_boxes), np.array(batch_points) 151 | 152 | return batch_boxes, batch_points 153 | 154 | 155 | def bbreg(boundingbox, reg): 156 | if reg.shape[1] == 1: 157 | reg = torch.reshape(reg, (reg.shape[2], reg.shape[3])) 158 | 159 | w = boundingbox[:, 2] - boundingbox[:, 0] + 1 160 | h = boundingbox[:, 3] - boundingbox[:, 1] + 1 161 | b1 = boundingbox[:, 0] + reg[:, 0] * w 162 | b2 = boundingbox[:, 1] + reg[:, 1] * h 163 | b3 = boundingbox[:, 2] + reg[:, 2] * w 164 | b4 = boundingbox[:, 3] + reg[:, 3] * h 165 | boundingbox[:, :4] = torch.stack([b1, b2, b3, b4]).permute(1, 0) 166 | 167 | return boundingbox 168 | 169 | 170 | def generateBoundingBox(reg, probs, scale, thresh): 171 | stride = 2 172 | cellsize = 12 173 | 174 | reg = reg.permute(1, 0, 2, 3) 175 | 176 | mask = probs >= thresh 177 | mask_inds = mask.nonzero() 178 | image_inds = mask_inds[:, 0] 179 | score = probs[mask] 180 | reg = reg[:, mask].permute(1, 0) 181 | bb = mask_inds[:, 1:].float().flip(1) 182 | q1 = ((stride * bb + 1) / scale).floor() 183 | q2 = ((stride * bb + cellsize - 1 + 1) / scale).floor() 184 | boundingbox = torch.cat([q1, q2, score.unsqueeze(1), reg], dim=1) 185 | return boundingbox, image_inds 186 | 187 | 188 | def nms_numpy(boxes, scores, threshold, method): 189 | if boxes.size == 0: 190 | return np.empty((0, 3)) 191 | 192 | x1 = boxes[:, 0].copy() 193 | y1 = boxes[:, 1].copy() 194 | x2 = boxes[:, 2].copy() 195 | y2 = boxes[:, 3].copy() 196 | s = scores 197 | area = (x2 - x1 + 1) * (y2 - y1 + 1) 198 | 199 | I = np.argsort(s) 200 | pick = np.zeros_like(s, dtype=np.int16) 201 | counter = 0 202 | while I.size > 0: 203 | i = I[-1] 204 | pick[counter] = i 205 | counter += 1 206 | idx = I[0:-1] 207 | 208 | xx1 = np.maximum(x1[i], x1[idx]).copy() 209 | yy1 = np.maximum(y1[i], y1[idx]).copy() 210 | xx2 = np.minimum(x2[i], x2[idx]).copy() 211 | yy2 = np.minimum(y2[i], y2[idx]).copy() 212 | 213 | w = np.maximum(0.0, xx2 - xx1 + 1).copy() 214 | h = np.maximum(0.0, yy2 - yy1 + 1).copy() 215 | 216 | inter = w * h 217 | if method is "Min": 218 | o = inter / np.minimum(area[i], area[idx]) 219 | else: 220 | o = inter / (area[i] + area[idx] - inter) 221 | I = I[np.where(o <= threshold)] 222 | 223 | pick = pick[:counter].copy() 224 | return pick 225 | 226 | 227 | def batched_nms_numpy(boxes, 
scores, idxs, threshold, method): 228 | device = boxes.device 229 | if boxes.numel() == 0: 230 | return torch.empty((0,), dtype=torch.int64, device=device) 231 | # strategy: in order to perform NMS independently per class. 232 | # we add an offset to all the boxes. The offset is dependent 233 | # only on the class idx, and is large enough so that boxes 234 | # from different classes do not overlap 235 | max_coordinate = boxes.max() 236 | offsets = idxs.to(boxes) * (max_coordinate + 1) 237 | boxes_for_nms = boxes + offsets[:, None] 238 | boxes_for_nms = boxes_for_nms.cpu().numpy() 239 | scores = scores.cpu().numpy() 240 | keep = nms_numpy(boxes_for_nms, scores, threshold, method) 241 | return torch.as_tensor(keep, dtype=torch.long, device=device) 242 | 243 | 244 | def pad(boxes, w, h): 245 | boxes = boxes.trunc().int().cpu().numpy() 246 | x = boxes[:, 0] 247 | y = boxes[:, 1] 248 | ex = boxes[:, 2] 249 | ey = boxes[:, 3] 250 | 251 | x[x < 1] = 1 252 | y[y < 1] = 1 253 | ex[ex > w] = w 254 | ey[ey > h] = h 255 | 256 | return y, ey, x, ex 257 | 258 | 259 | def rerec(bboxA): 260 | h = bboxA[:, 3] - bboxA[:, 1] 261 | w = bboxA[:, 2] - bboxA[:, 0] 262 | 263 | l = torch.max(w, h) 264 | bboxA[:, 0] = bboxA[:, 0] + w * 0.5 - l * 0.5 265 | bboxA[:, 1] = bboxA[:, 1] + h * 0.5 - l * 0.5 266 | bboxA[:, 2:4] = bboxA[:, :2] + l.repeat(2, 1).permute(1, 0) 267 | 268 | return bboxA 269 | 270 | 271 | def imresample(img, sz): 272 | im_data = interpolate(img, size=sz, mode="area") 273 | return im_data 274 | 275 | 276 | def crop_resize(img, box, image_size): 277 | if isinstance(image_size, tuple): 278 | if isinstance(img, np.ndarray): 279 | out = cv2.resize( 280 | img[box[1]:box[3], box[0]:box[2]], 281 | (image_size[1], image_size[0]), 282 | interpolation=cv2.INTER_AREA 283 | ).copy() 284 | else: 285 | out = img.crop(box).copy().resize((image_size[1], image_size[0]), Image.BILINEAR) 286 | else: 287 | if isinstance(img, np.ndarray): 288 | out = cv2.resize( 289 | img[box[1]:box[3], box[0]:box[2]], 290 | (image_size, image_size), 291 | interpolation=cv2.INTER_AREA 292 | ).copy() 293 | else: 294 | out = img.crop(box).copy().resize((image_size, image_size), Image.BILINEAR) 295 | return out 296 | 297 | 298 | def save_img(img, path): 299 | if isinstance(img, np.ndarray): 300 | cv2.imwrite(path, cv2.cvtColor(img, cv2.COLOR_RGB2BGR)) 301 | else: 302 | img.save(path) 303 | 304 | 305 | def get_size(img): 306 | if isinstance(img, np.ndarray): 307 | return img.shape[1::-1] 308 | else: 309 | return img.size 310 | 311 | 312 | def extract_face(img, box, image_size=160, margin=0, save_path=None): 313 | """Extract face + margin from PIL Image given bounding box. 314 | 315 | Arguments: 316 | img {PIL.Image} -- A PIL Image. 317 | box {numpy.ndarray} -- Four-element bounding box. 318 | image_size {int} -- Output image size in pixels. The image will be square. 319 | margin {int} -- Margin to add to bounding box, in terms of pixels in the final image. 320 | Note that the application of the margin differs slightly from the davidsandberg/facenet 321 | repo, which applies the margin to the original image before resizing, making the margin 322 | dependent on the original image size. 323 | save_path {str} -- Save path for extracted face image. (default: {None}) 324 | 325 | Returns: 326 | torch.tensor -- tensor representing the extracted face. 
327 | """ 328 | if isinstance(image_size, tuple): 329 | margin = [ 330 | margin * (box[2] - box[0]) / (image_size[1] - margin), 331 | margin * (box[3] - box[1]) / (image_size[0] - margin), 332 | ] 333 | else: 334 | margin = [ 335 | margin * (box[2] - box[0]) / (image_size - margin), 336 | margin * (box[3] - box[1]) / (image_size - margin), 337 | ] 338 | raw_image_size = get_size(img) 339 | box = [ 340 | int(max(box[0] - margin[0] / 2, 0)), 341 | int(max(box[1] - margin[1] / 2, 0)), 342 | int(min(box[2] + margin[0] / 2, raw_image_size[0])), 343 | int(min(box[3] + margin[1] / 2, raw_image_size[1])), 344 | ] 345 | 346 | # img_draw = img.copy() 347 | # draw = ImageDraw.Draw(img_draw) 348 | # draw.rectangle(box, outline=(255, 0, 0), width=6) 349 | # img_draw.show() 350 | 351 | face = crop_resize(img, box, image_size) 352 | 353 | if save_path is not None: 354 | os.makedirs(os.path.dirname(save_path) + "/", exist_ok=True) 355 | save_img(face, save_path) 356 | 357 | face = F.to_tensor(np.float32(face)) 358 | 359 | return face 360 | 361 | 362 | def save_landmark(img, box, image_size, margin, landmark, save_path): 363 | """Save Landmark to path 364 | 365 | Arguments: 366 | img {PIL.Image} -- A PIL Image. 367 | box {numpy.ndarray} -- Four-element bounding box. 368 | image_size {int} -- Output image size in pixels. The image will be square. 369 | margin {int} -- Margin to add to bounding box, in terms of pixels in the final image. 370 | Note that the application of the margin differs slightly from the davidsandberg/facenet 371 | repo, which applies the margin to the original image before resizing, making the margin 372 | dependent on the original image size. 373 | save_path {str} -- Save path for extracted face image. (default: {None}) 374 | 375 | Returns: 376 | None 377 | """ 378 | if isinstance(image_size, tuple): 379 | margin = [ 380 | margin * (box[2] - box[0]) / (image_size[1] - margin), 381 | margin * (box[3] - box[1]) / (image_size[0] - margin), 382 | ] 383 | else: 384 | margin = [ 385 | margin * (box[2] - box[0]) / (image_size - margin), 386 | margin * (box[3] - box[1]) / (image_size - margin), 387 | ] 388 | raw_image_size = get_size(img) 389 | box = [ 390 | int(max(box[0] - margin[0] / 2, 0)), 391 | int(max(box[1] - margin[1] / 2, 0)), 392 | int(min(box[2] + margin[0] / 2, raw_image_size[0])), 393 | int(min(box[3] + margin[1] / 2, raw_image_size[1])), 394 | ] 395 | 396 | landmark[:, 0] = (landmark[:, 0] - box[0]) / (box[2] - box[0]) 397 | landmark[:, 1] = (landmark[:, 1] - box[1]) / (box[3] - box[1]) 398 | 399 | if isinstance(image_size, tuple): 400 | landmark[:, 0] *= image_size[1] 401 | landmark[:, 1] *= image_size[0] 402 | else: 403 | landmark *= image_size 404 | 405 | with open(save_path, 'w+') as f: 406 | for (x, y) in landmark: 407 | f.write('{}\t{}\n'.format(x, y)) 408 | -------------------------------------------------------------------------------- /recon_demo.py: -------------------------------------------------------------------------------- 1 | import os 2 | import glob 3 | import torch 4 | import numpy as np 5 | from models.resnet_50 import resnet50_use 6 | from load_data import transfer_BFM09, BFM, load_img, Preprocess, save_obj 7 | from reconstruction_mesh import reconstruction, render_img, transform_face_shape, estimate_intrinsic 8 | 9 | 10 | def recon(): 11 | # input and output folder 12 | image_path = r'dataset' 13 | save_path = 'output' 14 | if not os.path.exists(save_path): 15 | os.makedirs(save_path) 16 | img_list = glob.glob(image_path + '/**/' + '*.png', recursive=True) 
17 | img_list += glob.glob(image_path + '/**/' + '*.jpg', recursive=True) 18 | 19 | # read BFM face model 20 | # transfer original BFM model to our model 21 | if not os.path.isfile('BFM/BFM_model_front.mat'): 22 | transfer_BFM09() 23 | 24 | device = 'cuda:0' if torch.cuda.is_available() else 'cpu:0' 25 | bfm = BFM(r'BFM/BFM_model_front.mat', device) 26 | 27 | # read standard landmarks for preprocessing images 28 | lm3D = bfm.load_lm3d() 29 | 30 | model = resnet50_use().to(device) 31 | model.load_state_dict(torch.load(r'models\params.pt')) 32 | model.eval() 33 | 34 | for param in model.parameters(): 35 | param.requires_grad = False 36 | 37 | for file in img_list: 38 | # load images and corresponding 5 facial landmarks 39 | img, lm = load_img(file, file.replace('jpg', 'txt')) 40 | 41 | # preprocess input image 42 | input_img_org, lm_new, transform_params = Preprocess(img, lm, lm3D) 43 | 44 | input_img = input_img_org.astype(np.float32) 45 | input_img = torch.from_numpy(input_img).permute(0, 3, 1, 2) 46 | # the input_img is BGR 47 | input_img = input_img.to(device) 48 | 49 | arr_coef = model(input_img) 50 | 51 | coef = torch.cat(arr_coef, 1) 52 | 53 | # reconstruct 3D face with output coefficients and face model 54 | face_shape, face_texture, face_color, landmarks_2d, z_buffer, angles, translation, gamma = reconstruction(coef, bfm) 55 | 56 | fx, px, fy, py = estimate_intrinsic(landmarks_2d, transform_params, z_buffer, face_shape, bfm, angles, translation) 57 | 58 | face_shape_t = transform_face_shape(face_shape, angles, translation) 59 | face_color = face_color / 255.0 60 | face_shape_t[:, :, 2] = 10.0 - face_shape_t[:, :, 2] 61 | 62 | images = render_img(face_shape_t, face_color, bfm, 300, fx, fy, px, py) 63 | images = images.detach().cpu().numpy() 64 | images = np.squeeze(images) 65 | 66 | path_str = file.replace(image_path, save_path) 67 | path = os.path.split(path_str)[0] 68 | if os.path.exists(path) is False: 69 | os.makedirs(path) 70 | 71 | from PIL import Image 72 | images = np.uint8(images[:, :, :3] * 255.0) 73 | # init_img = np.array(img) 74 | # init_img[images != 0] = 0 75 | # images += init_img 76 | img = Image.fromarray(images) 77 | img.save(file.replace(image_path, save_path).replace('jpg', 'png')) 78 | 79 | face_shape = face_shape.detach().cpu().numpy() 80 | face_color = face_color.detach().cpu().numpy() 81 | 82 | face_shape = np.squeeze(face_shape) 83 | face_color = np.squeeze(face_color) 84 | save_obj(file.replace(image_path, save_path).replace('.jpg', '_mesh.obj'), face_shape, bfm.tri, np.clip(face_color, 0, 1.0)) # 3D reconstruction face (in canonical view) 85 | 86 | from load_data import transfer_UV 87 | from utils import process_uv 88 | # loading UV coordinates 89 | uv_pos = transfer_UV() 90 | tex_coords = process_uv(uv_pos.copy()) 91 | tex_coords = torch.tensor(tex_coords, dtype=torch.float32).unsqueeze(0).to(device) 92 | 93 | face_texture = face_texture / 255.0 94 | images = render_img(tex_coords, face_texture, bfm, 600, 600.0 - 1.0, 600.0 - 1.0, 0.0, 0.0) 95 | images = images.detach().cpu().numpy() 96 | images = np.squeeze(images) 97 | 98 | # from PIL import Image 99 | images = np.uint8(images[:, :, :3] * 255.0) 100 | img = Image.fromarray(images) 101 | img.save(file.replace(image_path, save_path).replace('.jpg', '_texture.png')) 102 | 103 | 104 | if __name__ == '__main__': 105 | recon() 106 | -------------------------------------------------------------------------------- /reconstruction_mesh.py: 
-------------------------------------------------------------------------------- 1 | import torch 2 | import math 3 | import numpy as np 4 | from utils import LeastSquares 5 | 6 | 7 | def split_coeff(coeff): 8 | # input: coeff with shape [1,257] 9 | id_coeff = coeff[:, :80] # identity(shape) coeff of dim 80 10 | ex_coeff = coeff[:, 80:144] # expression coeff of dim 64 11 | tex_coeff = coeff[:, 144:224] # texture(albedo) coeff of dim 80 12 | angles = coeff[:, 224:227] # ruler angles(x,y,z) for rotation of dim 3 13 | # lighting coeff for 3 channel SH function of dim 27 14 | gamma = coeff[:, 227:254] 15 | translation = coeff[:, 254:] # translation coeff of dim 3 16 | 17 | return id_coeff, ex_coeff, tex_coeff, angles, gamma, translation 18 | 19 | 20 | class _need_const: 21 | a0 = np.pi 22 | a1 = 2 * np.pi / np.sqrt(3.0) 23 | a2 = 2 * np.pi / np.sqrt(8.0) 24 | c0 = 1 / np.sqrt(4 * np.pi) 25 | c1 = np.sqrt(3.0) / np.sqrt(4 * np.pi) 26 | c2 = 3 * np.sqrt(5.0) / np.sqrt(12 * np.pi) 27 | d0 = 0.5 / np.sqrt(3.0) 28 | 29 | illu_consts = [a0, a1, a2, c0, c1, c2, d0] 30 | 31 | origin_size = 300 32 | target_size = 224 33 | camera_pos = 10.0 34 | 35 | 36 | def shape_formation(id_coeff, ex_coeff, facemodel): 37 | # compute face shape with identity and expression coeff, based on BFM model 38 | # input: id_coeff with shape [1,80] 39 | # ex_coeff with shape [1,64] 40 | # output: face_shape with shape [1,N,3], N is number of vertices 41 | 42 | ''' 43 | S = mean_shape + \alpha * B_id + \beta * B_exp 44 | ''' 45 | n_b = id_coeff.size(0) 46 | face_shape = torch.einsum('ij,aj->ai', facemodel.idBase, id_coeff) + \ 47 | torch.einsum('ij,aj->ai', facemodel.exBase, ex_coeff) + \ 48 | facemodel.meanshape 49 | 50 | face_shape = face_shape.view(n_b, -1, 3) 51 | # re-center face shape 52 | face_shape = face_shape - \ 53 | facemodel.meanshape.view(1, -1, 3).mean(dim=1, keepdim=True) 54 | 55 | return face_shape 56 | 57 | 58 | def texture_formation(tex_coeff, facemodel): 59 | # compute vertex texture(albedo) with tex_coeff 60 | # input: tex_coeff with shape [1,N,3] 61 | # output: face_texture with shape [1,N,3], RGB order, range from 0-255 62 | 63 | ''' 64 | T = mean_texture + \gamma * B_texture 65 | ''' 66 | 67 | n_b = tex_coeff.size(0) 68 | face_texture = torch.einsum( 69 | 'ij,aj->ai', facemodel.texBase, tex_coeff) + facemodel.meantex 70 | 71 | face_texture = face_texture.view(n_b, -1, 3) 72 | return face_texture 73 | 74 | 75 | def compute_norm(face_shape, facemodel): 76 | # compute vertex normal using one-ring neighborhood (8 points) 77 | # input: face_shape with shape [1,N,3] 78 | # output: v_norm with shape [1,N,3] 79 | # https://fredriksalomonsson.files.wordpress.com/2010/10/mesh-data-structuresv2.pdf 80 | 81 | # vertex index for each triangle face, with shape [F,3], F is number of faces 82 | face_id = facemodel.tri - 1 83 | # adjacent face index for each vertex, with shape [N,8], N is number of vertex 84 | point_id = facemodel.point_buf - 1 85 | shape = face_shape 86 | v1 = shape[:, face_id[:, 0], :] 87 | v2 = shape[:, face_id[:, 1], :] 88 | v3 = shape[:, face_id[:, 2], :] 89 | e1 = v1 - v2 90 | e2 = v2 - v3 91 | face_norm = e1.cross(e2) # compute normal for each face 92 | 93 | # normalized face_norm first 94 | face_norm = torch.nn.functional.normalize(face_norm, p=2, dim=2) 95 | empty = torch.zeros((face_norm.size(0), 1, 3), 96 | dtype=face_norm.dtype, device=face_norm.device) 97 | 98 | # concat face_normal with a zero vector at the end 99 | face_norm = torch.cat((face_norm, empty), 1) 100 | 101 | # compute vertex 
normal using one-ring neighborhood 102 | v_norm = face_norm[:, point_id, :].sum(dim=2) 103 | v_norm = torch.nn.functional.normalize(v_norm, p=2, dim=2) # normalize normal vectors 104 | return v_norm 105 | 106 | 107 | def compute_rotation_matrix(angles): 108 | # compute rotation matrix based on 3 ruler angles 109 | # input: angles with shape [1,3] 110 | # output: rotation matrix with shape [1,3,3] 111 | n_b = angles.size(0) 112 | 113 | # https://www.cnblogs.com/larry-xia/p/11926121.html 114 | device = angles.device 115 | # compute rotation matrix for X-axis, Y-axis, Z-axis respectively 116 | rotation_X = torch.cat( 117 | [ 118 | torch.ones([n_b, 1]).to(device), 119 | torch.zeros([n_b, 3]).to(device), 120 | torch.reshape(torch.cos(angles[:, 0]), [n_b, 1]), 121 | - torch.reshape(torch.sin(angles[:, 0]), [n_b, 1]), 122 | torch.zeros([n_b, 1]).to(device), 123 | torch.reshape(torch.sin(angles[:, 0]), [n_b, 1]), 124 | torch.reshape(torch.cos(angles[:, 0]), [n_b, 1]) 125 | ], 126 | axis=1 127 | ) 128 | rotation_Y = torch.cat( 129 | [ 130 | torch.reshape(torch.cos(angles[:, 1]), [n_b, 1]), 131 | torch.zeros([n_b, 1]).to(device), 132 | torch.reshape(torch.sin(angles[:, 1]), [n_b, 1]), 133 | torch.zeros([n_b, 1]).to(device), 134 | torch.ones([n_b, 1]).to(device), 135 | torch.zeros([n_b, 1]).to(device), 136 | - torch.reshape(torch.sin(angles[:, 1]), [n_b, 1]), 137 | torch.zeros([n_b, 1]).to(device), 138 | torch.reshape(torch.cos(angles[:, 1]), [n_b, 1]), 139 | ], 140 | axis=1 141 | ) 142 | rotation_Z = torch.cat( 143 | [ 144 | torch.reshape(torch.cos(angles[:, 2]), [n_b, 1]), 145 | - torch.reshape(torch.sin(angles[:, 2]), [n_b, 1]), 146 | torch.zeros([n_b, 1]).to(device), 147 | torch.reshape(torch.sin(angles[:, 2]), [n_b, 1]), 148 | torch.reshape(torch.cos(angles[:, 2]), [n_b, 1]), 149 | torch.zeros([n_b, 3]).to(device), 150 | torch.ones([n_b, 1]).to(device), 151 | ], 152 | axis=1 153 | ) 154 | 155 | rotation_X = rotation_X.reshape([n_b, 3, 3]) 156 | rotation_Y = rotation_Y.reshape([n_b, 3, 3]) 157 | rotation_Z = rotation_Z.reshape([n_b, 3, 3]) 158 | 159 | # R = Rz*Ry*Rx 160 | rotation = rotation_Z.bmm(rotation_Y).bmm(rotation_X) 161 | 162 | # because our face shape is N*3, so compute the transpose of R, so that rotation shapes can be calculated as face_shape*R 163 | rotation = rotation.permute(0, 2, 1) 164 | 165 | return rotation 166 | 167 | 168 | def projection_layer(face_shape, fx=1015.0, fy=1015.0, px=112.0, py=112.0): 169 | # we choose the focal length and camera position empirically 170 | # project 3D face onto image plane 171 | # input: face_shape with shape [1,N,3] 172 | # rotation with shape [1,3,3] 173 | # translation with shape [1,3] 174 | # output: face_projection with shape [1,N,2] 175 | # z_buffer with shape [1,N,1] 176 | 177 | cam_pos = 10 178 | p_matrix = np.concatenate([[fx], [0.0], [px], [0.0], [fy], [py], [0.0], [0.0], [1.0]], 179 | axis=0).astype(np.float32) # projection matrix 180 | p_matrix = np.reshape(p_matrix, [1, 3, 3]) 181 | p_matrix = torch.from_numpy(p_matrix) 182 | gpu_p_matrix = None 183 | 184 | n_b, nV, _ = face_shape.size() 185 | if face_shape.is_cuda: 186 | gpu_p_matrix = p_matrix.cuda() 187 | p_matrix = gpu_p_matrix.expand(n_b, 3, 3) 188 | else: 189 | p_matrix = p_matrix.expand(n_b, 3, 3) 190 | 191 | face_shape[:, :, 2] = cam_pos - face_shape[:, :, 2] 192 | aug_projection = face_shape.bmm(p_matrix.permute(0, 2, 1)) 193 | face_projection = aug_projection[:, :, 0:2] / aug_projection[:, :, 2:] 194 | 195 | z_buffer = cam_pos - aug_projection[:, :, 2:] 196 | 197 | return 
face_projection, z_buffer 198 | 199 | 200 | def illumination_layer(face_texture, norm, gamma): 201 | # CHJ: It's different from what I knew. 202 | # compute vertex color using face_texture and SH function lighting approximation 203 | # input: face_texture with shape [1,N,3] 204 | # norm with shape [1,N,3] 205 | # gamma with shape [1,27] 206 | # output: face_color with shape [1,N,3], RGB order, range from 0-255 207 | # lighting with shape [1,N,3], color under uniform texture 208 | 209 | n_b, num_vertex, _ = face_texture.size() 210 | n_v_full = n_b * num_vertex 211 | gamma = gamma.view(-1, 3, 9).clone() 212 | gamma[:, :, 0] += 0.8 213 | 214 | gamma = gamma.permute(0, 2, 1) 215 | 216 | a0, a1, a2, c0, c1, c2, d0 = _need_const.illu_consts 217 | 218 | Y0 = torch.ones(n_v_full).float() * a0*c0 219 | if gamma.is_cuda: 220 | Y0 = Y0.cuda() 221 | norm = norm.view(-1, 3) 222 | nx, ny, nz = norm[:, 0], norm[:, 1], norm[:, 2] 223 | arrH = [] 224 | 225 | arrH.append(Y0) 226 | arrH.append(-a1*c1*ny) 227 | arrH.append(a1*c1*nz) 228 | arrH.append(-a1*c1*nx) 229 | arrH.append(a2*c2*nx*ny) 230 | arrH.append(-a2*c2*ny*nz) 231 | arrH.append(a2*c2*d0*(3*nz.pow(2)-1)) 232 | arrH.append(-a2*c2*nx*nz) 233 | arrH.append(a2*c2*0.5*(nx.pow(2)-ny.pow(2))) 234 | 235 | H = torch.stack(arrH, 1) 236 | Y = H.view(n_b, num_vertex, 9) 237 | 238 | # Y shape:[batch,N,9]. 239 | 240 | # shape:[batch,N,3] 241 | lighting = Y.bmm(gamma) 242 | 243 | face_color = face_texture * lighting 244 | 245 | return face_color, lighting 246 | 247 | 248 | def rigid_transform(face_shape, rotation, translation): 249 | n_b = face_shape.shape[0] 250 | face_shape_r = face_shape.bmm(rotation) # R has been transposed 251 | face_shape_t = face_shape_r + translation.view(n_b, 1, 3) 252 | return face_shape_t 253 | 254 | 255 | def compute_landmarks(face_shape, facemodel): 256 | # compute 3D landmark postitions with pre-computed 3D face shape 257 | keypoints_idx = facemodel.keypoints - 1 258 | face_landmarks = face_shape[:, keypoints_idx, :] 259 | return face_landmarks 260 | 261 | 262 | def compute_3d_landmarks(face_shape, facemodel, angles, translation): 263 | rotation = compute_rotation_matrix(angles) 264 | face_shape_t = rigid_transform(face_shape, rotation, translation) 265 | landmarks_3d = compute_landmarks(face_shape_t, facemodel) 266 | return landmarks_3d 267 | 268 | 269 | def transform_face_shape(face_shape, angles, translation): 270 | rotation = compute_rotation_matrix(angles) 271 | face_shape_t = rigid_transform(face_shape, rotation, translation) 272 | return face_shape_t 273 | 274 | 275 | def render_img(face_shape, face_color, facemodel, image_size=224, fx=1015.0, fy=1015.0, px=112.0, py=112.0, device='cuda:0'): 276 | ''' 277 | ref: https://github.com/facebookresearch/pytorch3d/issues/184 278 | The rendering function (just for test) 279 | Input: 280 | face_shape: Tensor[1, 35709, 3] 281 | face_color: Tensor[1, 35709, 3] in [0, 1] 282 | facemodel: contains `tri` (triangles[70789, 3], index start from 1) 283 | ''' 284 | from pytorch3d.structures import Meshes 285 | from pytorch3d.renderer.mesh.textures import TexturesVertex 286 | from pytorch3d.renderer import ( 287 | PerspectiveCameras, 288 | PointLights, 289 | RasterizationSettings, 290 | MeshRenderer, 291 | MeshRasterizer, 292 | SoftPhongShader, 293 | BlendParams 294 | ) 295 | 296 | face_color = TexturesVertex(verts_features=face_color.to(device)) 297 | face_buf = torch.from_numpy(facemodel.tri - 1) # index start from 1 298 | face_idx = face_buf.unsqueeze(0) 299 | 300 | mesh = 
Meshes(face_shape.to(device), face_idx.to(device), face_color) 301 | 302 | R = torch.eye(3).view(1, 3, 3).to(device) 303 | R[0, 0, 0] *= -1.0 304 | T = torch.zeros([1, 3]).to(device) 305 | 306 | half_size = (image_size - 1.0) / 2 307 | focal_length = torch.tensor([fx / half_size, fy / half_size], dtype=torch.float32).reshape(1, 2).to(device) 308 | principal_point = torch.tensor([(half_size - px) / half_size, (py - half_size) / half_size], dtype=torch.float32).reshape(1, 2).to(device) 309 | 310 | cameras = PerspectiveCameras( 311 | device=device, 312 | R=R, 313 | T=T, 314 | focal_length=focal_length, 315 | principal_point=principal_point 316 | ) 317 | 318 | raster_settings = RasterizationSettings( 319 | image_size=image_size, 320 | blur_radius=0.0, 321 | faces_per_pixel=1 322 | ) 323 | 324 | lights = PointLights( 325 | device=device, 326 | ambient_color=((1.0, 1.0, 1.0),), 327 | diffuse_color=((0.0, 0.0, 0.0),), 328 | specular_color=((0.0, 0.0, 0.0),), 329 | location=((0.0, 0.0, 1e5),) 330 | ) 331 | 332 | blend_params = BlendParams(background_color=(0.0, 0.0, 0.0)) 333 | 334 | renderer = MeshRenderer( 335 | rasterizer=MeshRasterizer( 336 | cameras=cameras, 337 | raster_settings=raster_settings 338 | ), 339 | shader=SoftPhongShader( 340 | device=device, 341 | cameras=cameras, 342 | lights=lights, 343 | blend_params=blend_params 344 | ) 345 | ) 346 | images = renderer(mesh) 347 | images = torch.clamp(images, 0.0, 1.0) 348 | return images 349 | 350 | 351 | def estimate_intrinsic(landmarks_2d, transform_params, z_buffer, face_shape, facemodel, angles, translation): 352 | # estimate intrinsic parameters 353 | 354 | def re_convert(landmarks_2d, trans_params, origin_size=_need_const.origin_size, target_size=_need_const.target_size): 355 | # convert landmarks to un_cropped images 356 | w = (origin_size * trans_params[2]).astype(np.int32) 357 | h = (origin_size * trans_params[2]).astype(np.int32) 358 | landmarks_2d[:, :, 1] = target_size - 1 - landmarks_2d[:, :, 1] 359 | 360 | landmarks_2d[:, :, 0] = landmarks_2d[:, :, 0] + w / 2 - target_size / 2 361 | landmarks_2d[:, :, 1] = landmarks_2d[:, :, 1] + h / 2 - target_size / 2 362 | 363 | landmarks_2d = landmarks_2d / trans_params[2] 364 | 365 | landmarks_2d[:, :, 0] = landmarks_2d[:, :, 0] + trans_params[3] - origin_size / 2 366 | landmarks_2d[:, :, 1] = landmarks_2d[:, :, 1] + trans_params[4] - origin_size / 2 367 | 368 | landmarks_2d[:, :, 1] = origin_size - 1 - landmarks_2d[:, :, 1] 369 | return landmarks_2d 370 | 371 | def POS(xp, x): 372 | # calculating least sqaures problem 373 | # ref https://github.com/pytorch/pytorch/issues/27036 374 | ls = LeastSquares() 375 | npts = xp.shape[1] 376 | 377 | A = torch.zeros([2*npts, 4]).to(x.device) 378 | A[0:2*npts-1:2, 0:2] = x[0, :, [0, 2]] 379 | A[1:2*npts:2, 2:4] = x[0, :, [1, 2]] 380 | 381 | b = torch.reshape(xp[0], [2*npts, 1]) 382 | 383 | k = ls.lstq(A, b, 0.010) 384 | 385 | fx = k[0, 0] 386 | px = k[1, 0] 387 | fy = k[2, 0] 388 | py = k[3, 0] 389 | return fx, px, fy, py 390 | 391 | # convert landmarks to un_cropped images 392 | landmarks_2d = re_convert(landmarks_2d, transform_params) 393 | landmarks_2d[:, :, 1] = _need_const.origin_size - 1.0 - landmarks_2d[:, :, 1] 394 | landmarks_2d[:, :, :2] = landmarks_2d[:, :, :2] * (_need_const.camera_pos - z_buffer[:, :, :]) 395 | 396 | # compute 3d landmarks 397 | landmarks_3d = compute_3d_landmarks(face_shape, facemodel, angles, translation) 398 | 399 | # compute fx, fy, px, py 400 | landmarks_3d_ = landmarks_3d.clone() 401 | landmarks_3d_[:, :, 2] = 
_need_const.camera_pos - landmarks_3d_[:, :, 2] 402 | fx, px, fy, py = POS(landmarks_2d, landmarks_3d_) 403 | return fx, px, fy, py 404 | 405 | 406 | def reconstruction(coeff, facemodel): 407 | # The image size is 224 * 224 408 | # face reconstruction with coeff and BFM model 409 | id_coeff, ex_coeff, tex_coeff, angles, gamma, translation = split_coeff(coeff) 410 | 411 | # compute face shape 412 | face_shape = shape_formation(id_coeff, ex_coeff, facemodel) 413 | # compute vertex texture(albedo) 414 | face_texture = texture_formation(tex_coeff, facemodel) 415 | 416 | # vertex normal 417 | face_norm = compute_norm(face_shape, facemodel) 418 | # rotation matrix 419 | rotation = compute_rotation_matrix(angles) 420 | face_norm_r = face_norm.bmm(rotation) 421 | # print(face_norm_r[:, :3, :]) 422 | 423 | # do rigid transformation for face shape using predicted rotation and translation 424 | face_shape_t = rigid_transform(face_shape, rotation, translation) 425 | 426 | # compute 2d landmark projection 427 | face_landmark_t = compute_landmarks(face_shape_t, facemodel) 428 | 429 | # compute 68 landmark on image plane (with image sized 224*224) 430 | landmarks_2d, z_buffer = projection_layer(face_landmark_t) 431 | landmarks_2d[:, :, 1] = _need_const.target_size - 1.0 - landmarks_2d[:, :, 1] 432 | 433 | # compute vertex color using SH function lighting approximation 434 | face_color, lighting = illumination_layer(face_texture, face_norm_r, gamma) 435 | 436 | return face_shape, face_texture, face_color, landmarks_2d, z_buffer, angles, translation, gamma 437 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | # This file may be used to create an environment using: 2 | # $ conda create --name --file 3 | # platform: win-64 4 | blas=1.0=mkl 5 | ca-certificates=2020.11.8=h5b45459_0 6 | certifi=2020.11.8=py38haa244fe_0 7 | chardet=3.0.4=pypi_0 8 | cudatoolkit=10.1.243=h74a9793_0 9 | docopt=0.6.2=pypi_0 10 | freetype=2.10.4=hd328e21_0 11 | future=0.18.2=pypi_0 12 | fvcore=0.1.2.post20201111=pypi_0 13 | icc_rt=2020.2=intel_254 14 | idna=2.10=pypi_0 15 | intel-openmp=2020.2=254 16 | jpeg=9b=hb83a4c4_2 17 | libpng=1.6.37=h2a8f88b_0 18 | libtiff=4.1.0=h56a325e_1 19 | lz4-c=1.9.2=hf4a77e7_3 20 | mkl=2020.2=256 21 | mkl-service=2.3.0=py38hb782905_0 22 | mkl_fft=1.2.0=py38h45dec08_0 23 | mkl_random=1.1.1=py38h47e9c7a_0 24 | msys2-conda-epoch=20160418=1 25 | ninja=1.7.2=0 26 | numpy=1.19.2=py38hadc3359_0 27 | numpy-base=1.19.2=py38ha3acd2a_0 28 | olefile=0.46=py_0 29 | openssl=1.1.1h=he774522_0 30 | pillow>=8.1.1 31 | pip=20.2.4=py38haa95532_0 32 | pipreqs=0.4.10=pypi_0 33 | portalocker=2.0.0=pypi_0 34 | python=3.8.5=h5fd99cc_1 35 | python_abi=3.8=1_cp38 36 | pytorch=1.6.0=py3.8_cuda101_cudnn7_0 37 | pytorch3d=0.3.0=pypi_0 38 | pywin32=300=pypi_0 39 | PyYAML>=5.4 40 | requests=2.25.0=pypi_0 41 | scipy=1.5.2=py38h14eb087_0 42 | setuptools=50.3.1=py38haa95532_1 43 | six=1.15.0=py38haa95532_0 44 | sqlite=3.33.0=h2a8f88b_0 45 | tabulate=0.8.7=pyh9f0ad1d_0 46 | termcolor=1.1.0=pypi_0 47 | tk=8.6.10=he774522_0 48 | torchvision=0.7.0=py38_cu101 49 | tqdm=4.52.0=pyhd3deb0d_0 50 | urllib3>=1.26.4 51 | vc=14.1=h0510ff6_4 52 | vs2015_runtime=14.16.27012=hf0eaf9b_3 53 | wheel=0.35.1=pyhd3eb1b0_0 54 | wincertstore=0.2=py38_0 55 | xz=5.2.5=h62dcd97_0 56 | yacs=0.1.8=pypi_0 57 | yaml=0.2.5=he774522_0 58 | yarg=0.1.9=pypi_0 59 | zlib=1.2.11=h62dcd97_4 60 | zstd=1.4.5=h04227a9_0 61 | 
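# Example usage of this file (the environment name "deep3d" is only a placeholder;
# the pypi_0 entries above, e.g. pytorch3d and fvcore, are pip packages that a plain
# conda solve may not find, so they may need a separate pip install afterwards):
# $ conda create --name deep3d --file requirements.txt
# $ conda activate deep3d
# $ pip install "git+https://github.com/facebookresearch/pytorch3d.git@v0.3.0"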
-------------------------------------------------------------------------------- /utils.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import numpy as np 3 | 4 | 5 | class LeastSquares: 6 | # https://github.com/pytorch/pytorch/issues/27036 7 | def __init__(self): 8 | pass 9 | 10 | def lstq(self, A, Y, lamb=0.0): 11 | """ 12 | Differentiable least squares 13 | :param A: m x n 14 | :param Y: m x 1 15 | """ 16 | # Assuming A to be full column rank 17 | cols = A.shape[1] 18 | if cols == torch.matrix_rank(A): 19 | q, r = torch.qr(A) 20 | x = torch.inverse(r) @ q.T @ Y 21 | else: 22 | A_dash = A.permute(1, 0) @ A + lamb * torch.eye(cols, dtype=A.dtype, device=A.device) # ridge term keeps A_dash full rank (and on A's device) 23 | Y_dash = A.permute(1, 0) @ Y 24 | x = self.lstq(A_dash, Y_dash) 25 | return x 26 | 27 | 28 | def process_uv(uv_coords): 29 | uv_coords[:, 0] = uv_coords[:, 0] 30 | uv_coords[:, 1] = uv_coords[:, 1] 31 | # uv_coords[:, 1] = uv_h - uv_coords[:, 1] - 1 32 | uv_coords = np.hstack((uv_coords, np.ones((uv_coords.shape[0], 1)))) # add z 33 | return uv_coords 34 | --------------------------------------------------------------------------------
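As a sanity check for the solver above, the following standalone snippet (not part of the repo; the variable names are made up for illustration) fits a line y = 2x + 1 with `LeastSquares.lstq` and should recover coefficients close to `[2.0, 1.0]`:

```python
# Illustrative self-check for utils.LeastSquares; assumes utils.py is on the path.
import torch
from utils import LeastSquares

torch.manual_seed(0)
x = torch.linspace(0.0, 1.0, steps=50).unsqueeze(1)  # inputs, shape [50, 1]
A = torch.cat([x, torch.ones_like(x)], dim=1)        # design matrix [slope, intercept], shape [50, 2]
y = 2.0 * x + 1.0 + 0.01 * torch.randn_like(x)       # noisy samples of y = 2x + 1

k = LeastSquares().lstq(A, y, lamb=0.01)             # QR-based differentiable solve, shape [2, 1]
print(k.squeeze())                                   # approximately tensor([2.0, 1.0])
```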