├── .gitignore
├── LICENSE
├── README.md
├── paper.pdf
├── requirements.txt
├── src
│   ├── __init__.py
│   ├── config
│   │   ├── test_face2facerho.ini
│   │   └── train_face2facerho.ini
│   ├── dataset
│   │   ├── __init__.py
│   │   ├── base_data_loader.py
│   │   ├── base_dataset.py
│   │   └── voxceleb_dataset.py
│   ├── external
│   │   ├── LICENSE
│   │   ├── README.md
│   │   ├── data
│   │   │   ├── landmark_embedding.json
│   │   │   └── pose_transform_config.json
│   │   └── decalib
│   │       ├── __init__.py
│   │       ├── datasets
│   │       │   ├── aflw2000.py
│   │       │   ├── build_datasets.py
│   │       │   ├── datasets.py
│   │       │   ├── detectors.py
│   │       │   ├── ethnicity.py
│   │       │   ├── now.py
│   │       │   ├── train_datasets.py
│   │       │   ├── vggface.py
│   │       │   └── vox.py
│   │       ├── deca.py
│   │       ├── models
│   │       │   ├── FLAME.py
│   │       │   ├── decoders.py
│   │       │   ├── encoders.py
│   │       │   ├── frnet.py
│   │       │   ├── lbs.py
│   │       │   └── resnet.py
│   │       ├── trainer.py
│   │       └── utils
│   │           ├── config.py
│   │           ├── rotation_converter.py
│   │           └── util.py
│   ├── fitting.py
│   ├── models
│   │   ├── VGG19_LOSS.py
│   │   ├── __init__.py
│   │   ├── base_model.py
│   │   ├── discriminator.py
│   │   ├── face2face_rho_model.py
│   │   ├── image_pyramid.py
│   │   ├── motion_network.py
│   │   ├── networks.py
│   │   └── rendering_network.py
│   ├── options
│   │   ├── __init__.py
│   │   └── parse_config.py
│   ├── reenact.py
│   ├── train.py
│   └── util
│       ├── html.py
│       ├── landmark_image_generation.py
│       ├── util.py
│       └── visualizer.py
├── test_case
│   ├── driving
│   │   ├── FLAME
│   │   │   ├── headpose.txt
│   │   │   └── landmark.txt
│   │   ├── driving.jpg
│   │   └── original
│   │       ├── headpose.txt
│   │       └── landmark.txt
│   └── source
│       ├── FLAME
│       │   ├── headpose.txt
│       │   └── landmark.txt
│       ├── original
│       │   ├── headpose.txt
│       │   └── landmark.txt
│       └── source.jpg
└── trainingset
    └── VoxCeleb
        ├── id10001#7w0IBEWc9Qw#000993#001143
        │   ├── headpose
        │   │   ├── 150.txt
        │   │   └── 54.txt
        │   ├── img
        │   │   ├── 150.jpg
        │   │   └── 54.jpg
        │   ├── landmark
        │   │   ├── 150.txt
        │   │   └── 54.txt
        │   └── mask
        │       ├── 150.png
        │       └── 54.png
        ├── id10009#AtavJVP4bCk#012568#012652
        │   ├── headpose
        │   │   ├── 82.txt
        │   │   └── 89.txt
        │   ├── img
        │   │   ├── 82.jpg
        │   │   └── 89.jpg
        │   ├── landmark
        │   │   ├── 82.txt
        │   │   └── 89.txt
        │   └── mask
        │       ├── 82.png
        │       └── 89.png
        └── list.txt
/.gitignore:
--------------------------------------------------------------------------------
1 | .idea/
2 | src/external/data/generic_model.pkl
3 | src/external/data/deca_model.tar
4 | src/checkpoints/
5 | test_case/result
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | BSD 3-Clause License
2 |
3 | Copyright (c) 2021, NetEase Games AI Lab.
4 | All rights reserved.
5 |
6 | Redistribution and use in source and binary forms, with or without
7 | modification, are permitted provided that the following conditions are met:
8 |
9 | * Redistributions of source code must retain the above copyright notice, this
10 | list of conditions and the following disclaimer.
11 |
12 | * Redistributions in binary form must reproduce the above copyright notice,
13 | this list of conditions and the following disclaimer in the documentation
14 | and/or other materials provided with the distribution.
15 |
16 | * Neither the name of the copyright holder nor the names of its
17 | contributors may be used to endorse or promote products derived from
18 | this software without specific prior written permission.
19 |
20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Face2Faceρ: Official PyTorch Implementation
2 | [Paper (ECCV 2022)](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136730055.pdf)
3 |
4 | ### Environment
5 | - CUDA 10.2 or above
6 | - Python 3.8.5
7 | - ``pip install -r requirements.txt``
8 | - For visdom, some dependencies may need to be manually
9 | downloaded ([visdom issue](https://github.com/fossasia/visdom/issues/111))
10 |
11 | ### Training data
12 | Our framework relies on a large video dataset containing many identities,
13 | such as [VoxCeleb](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/). For each video frame, the following data is required:
14 | - image: the cropped face image (please refer to the pre-processing steps of
15 | [Siarohin et al.](https://github.com/AliaksandrSiarohin/video-preprocessing))
16 | - landmark: the 2D facial landmark coordinates obtained by projecting the 3D keypoints on the
17 |   fitted 3DMM mesh into image space (please refer to Section 3.3
18 |   of our paper)
19 | - headpose: 3DMM head pose coefficients
20 | - face mask (optional): the face area mask (can be generated by any face parsing method, such as
21 | [BiSeNet](https://github.com/zllrunning/face-parsing.PyTorch)).
22 |
23 |
24 | The pre-processed data should be organized as follows (an
25 | example dataset containing two video sequences is provided in ```./trainingset/VoxCeleb```):
26 |
27 | ```
28 | - <dataset root (e.g. ./trainingset)>
29 |   - <dataset name (e.g. VoxCeleb)>
30 |     - list.txt                              ---list of all videos
31 |     - id10001#7w0IBEWc9Qw#000993#001143     ---video folder 1 (should be named as <person_id>#<video_id>#<start_frame>#<end_frame>)
32 |       - img                                 ---video frames
33 |         - 1.jpg
34 |         - 2.jpg
35 |         - ...
36 |       - landmark                            ---landmark coordinates for each frame
37 |         - 1.txt
38 |         - 2.txt
39 |         - ...
40 |       - headpose                            ---head pose coefficients for each frame
41 |         - 1.txt
42 |         - 2.txt
43 |         - ...
44 |       - mask                                ---face mask for each frame
45 |         - 1.png
46 |         - 2.png
47 |         - ...
48 |     - id10009#AtavJVP4bCk#012568#012652     ---video folder 2
49 |       ...
50 |
51 | ```
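
Before launching training, it can save time to verify that every frame image has its matching landmark, head pose and (if ```emphasize_face_area``` is enabled) mask file. The snippet below is only an illustrative check against the layout above, not part of the repository; the ```dataroot``` path and the ```use_mask``` flag are assumptions you may need to adapt.

```python
import os
from glob import glob

def check_video_folder(video_dir, use_mask=True):
    """Report landmark/headpose/mask files that are missing for frames in img/."""
    frame_ids = [os.path.splitext(os.path.basename(p))[0]
                 for p in glob(os.path.join(video_dir, 'img', '*.jpg'))]
    missing = []
    for fid in frame_ids:
        expected = [os.path.join(video_dir, 'landmark', fid + '.txt'),
                    os.path.join(video_dir, 'headpose', fid + '.txt')]
        if use_mask:
            expected.append(os.path.join(video_dir, 'mask', fid + '.png'))
        missing += [p for p in expected if not os.path.isfile(p)]
    return missing

dataroot = './trainingset/VoxCeleb'
with open(os.path.join(dataroot, 'list.txt')) as f:
    videos = [line.strip() for line in f if line.strip()]
for video in videos:
    problems = check_video_folder(os.path.join(dataroot, video))
    if problems:
        print(video, 'is missing:', problems)
```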
52 | ### Training
53 | - Set training data
54 | - Set ```dataroot``` in ```./src/config/train_face2facerho.ini``` to the folder of your pre-processed dataset (e.g. ```dataroot=./trainingset/VoxCeleb```)
55 | - Set up visdom
56 | - Set the visdom ```<port>``` as ```display_port``` in ```./src/config/train_face2facerho.ini```, and run:
57 | ```bash
58 | nohup python -m visdom.server -port <display_port> &
59 | ```
60 | - Start training (tested with Tesla V100)
61 | ```bash
62 | python src/train.py --config ./src/config/train_face2facerho.ini
63 | ```
64 |
65 | ### Testing
66 | - Prepare models
67 | - Download our pre-trained models from [Google Drive](https://drive.google.com/drive/folders/1asCyEjKxpKSV8g674WwmmCgxAZP9pf9x) and put them in ```./src/checkpoints/voxceleb_face2facerho```
68 | - Download [FLAME model](https://flame.is.tue.mpg.de/download.php),
69 | choose **FLAME 2020** and unzip it, copy 'generic_model.pkl' into ```./src/external/data```
70 | - Download [DECA trained model](https://drive.google.com/file/d/1rp8kdyLPvErw2dTmqtjISRVvQLj6Yzje/view?usp=sharing),
71 | and put it in ```./src/external/data``` (**no unzip required**)
72 | - Fit the 3DMM coefficients of the source and driving face images
73 |   - Since the code of the 3DMM fitting algorithm used in our paper was taken from our company’s in-house facial performance
74 |     capture system, we can only release it after getting our company’s official permission. Alternatively, we provide an
75 |     open-source solution based on [DECA](https://github.com/YadiraF/DECA).
76 |
77 |
78 |     Note that the resulting quality may deteriorate slightly when using the DECA-based fitting, since our original 3DMM
79 |     fitting algorithm is more accurate and stable than DECA, and the pre-configured 72 keypoints on the FLAME mesh
80 |     template also differ slightly from our original configuration because the mesh templates are different.
81 |
82 | - Run (tested with Nvidia GeForce RTX 2080Ti):
83 | ```bash
84 | python src/fitting.py --device <"cpu" or "cuda"> \
85 |     --src_img <source actor image> \
86 |     --drv_img <driving actor image> \
87 |     --output_src_headpose <output source head pose file (.txt)> \
88 |     --output_src_landmark <output source landmark file (.txt)> \
89 |     --output_drv_headpose <output driving head pose file (.txt)> \
90 |     --output_drv_landmark <output driving landmark file (.txt)>
91 | ```
92 | - Input
93 | - ```device```: set device, "cpu" or "cuda"
94 | - ```src_img```: input source actor image
95 | - ```drv_img```: input driving actor image
96 | - ```output_src_headpose```: output head pose coefficients of source image (.txt)
97 | - ```output_src_landmark```: output facial landmarks of source image (.txt)
98 | - ```output_drv_headpose```: output head pose coefficients of driving image (.txt)
99 | - ```output_drv_landmark```: output driving facial landmarks (.txt, reconstructed using the shape coefficients
100 |   of the source actor and the expression and head pose coefficients of the driving actor).
101 | - Example
102 | ```bash
103 | python src/fitting.py --device cuda \
104 | --src_img ./test_case/source/source.jpg --drv_img ./test_case/driving/driving.jpg \
105 | --output_src_headpose ./test_case/source/FLAME/headpose.txt --output_src_landmark ./test_case/source/FLAME/landmark.txt \
106 | --output_drv_headpose ./test_case/driving/FLAME/headpose.txt --output_drv_landmark ./test_case/driving/FLAME/landmark.txt
107 | ```
108 |
109 | - Get the final reenacted result (tested with Nvidia GeForce RTX 2080Ti):
110 | ```bash
111 | python src/reenact.py --config ./src/config/test_face2facerho.ini \
112 |     --src_img <source actor image> \
113 |     --src_headpose <source head pose file (.txt)> \
114 |     --src_landmark <source landmark file (.txt)> \
115 |     --drv_headpose <driving head pose file (.txt)> \
116 |     --drv_landmark <driving landmark file (.txt)> \
117 |     --output_dir <output directory>
118 | ```
119 | - Input
120 | - ```src_img```: input source actor image
121 | - ```src_headpose```: input head pose coefficients of source image (.txt)
122 | - ```src_landmark```: input facial landmarks of source image (.txt)
123 | - ```drv_headpose```: input head pose coefficients of driving image (.txt)
124 | - ```drv_landmark```: input driving facial landmarks (reconstructed using the shape coefficients of the
125 |   source actor and the expression and head pose coefficients of the driving actor).
126 | - ```output_dir```: output image (named "result.png") will be saved in this folder.
127 | - Example
128 | - Run using the 3DMM fitting results from our original 3DMM fitting algorithm (results are pre-saved in
129 |   ```./test_case/source/original``` and ```./test_case/driving/original```)
130 | ```bash
131 | python src/reenact.py --config ./src/config/test_face2facerho.ini \
132 | --src_img ./test_case/source/source.jpg \
133 | --src_headpose ./test_case/source/original/headpose.txt --src_landmark ./test_case/source/original/landmark.txt \
134 | --drv_headpose ./test_case/driving/original/headpose.txt --drv_landmark ./test_case/driving/original/landmark.txt \
135 | --output_dir ./test_case/result
136 | ```
137 | - Run using the 3DMM fitting results from DECA
138 | ```bash
139 | python src/reenact.py --config ./src/config/test_face2facerho.ini \
140 | --src_img ./test_case/source/source.jpg \
141 | --src_headpose ./test_case/source/FLAME/headpose.txt --src_landmark ./test_case/source/FLAME/landmark.txt \
142 | --drv_headpose ./test_case/driving/FLAME/headpose.txt --drv_landmark ./test_case/driving/FLAME/landmark.txt \
143 | --output_dir ./test_case/result
144 | ```
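
The fitting and reenactment steps above can also be chained from one small driver script. The sketch below is not part of the repository; it simply invokes ```src/fitting.py``` and ```src/reenact.py``` via ```subprocess``` with the same test-case arguments as the DECA example above.

```python
import subprocess

def run(cmd):
    print(' '.join(cmd))
    subprocess.run(cmd, check=True)  # stop if either stage fails

# Stage 1: DECA-based 3DMM fitting of the source and driving images.
run(['python', 'src/fitting.py', '--device', 'cuda',
     '--src_img', './test_case/source/source.jpg',
     '--drv_img', './test_case/driving/driving.jpg',
     '--output_src_headpose', './test_case/source/FLAME/headpose.txt',
     '--output_src_landmark', './test_case/source/FLAME/landmark.txt',
     '--output_drv_headpose', './test_case/driving/FLAME/headpose.txt',
     '--output_drv_landmark', './test_case/driving/FLAME/landmark.txt'])

# Stage 2: reenact the source actor with the driving head pose and expression.
run(['python', 'src/reenact.py', '--config', './src/config/test_face2facerho.ini',
     '--src_img', './test_case/source/source.jpg',
     '--src_headpose', './test_case/source/FLAME/headpose.txt',
     '--src_landmark', './test_case/source/FLAME/landmark.txt',
     '--drv_headpose', './test_case/driving/FLAME/headpose.txt',
     '--drv_landmark', './test_case/driving/FLAME/landmark.txt',
     '--output_dir', './test_case/result'])
```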
145 |
--------------------------------------------------------------------------------
/paper.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/paper.pdf
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy==1.18.5
2 | opencv-python==4.4.0.46
3 | torch==1.6.0
4 | torchvision==0.7.0
5 | Pillow==8.0.1
6 | visdom==0.1.8.9
7 | dominate==2.6.0
8 | yacs==0.1.8
9 | scikit-image==0.17.2
10 | face-alignment
11 | chumpy==0.70
--------------------------------------------------------------------------------
/src/__init__.py:
--------------------------------------------------------------------------------
1 | from .util import *
2 | from .dataset import *
3 | from .models import *
4 | from .options import *
5 | from .config import *
--------------------------------------------------------------------------------
/src/config/test_face2facerho.ini:
--------------------------------------------------------------------------------
1 | [ROOT]
2 | ;basic config
3 | name = voxceleb_face2facerho
4 | gpu_ids = 0
5 | checkpoints_dir = ./src/checkpoints
6 | model = face2face_rho
7 | output_size = 512
8 | isTrain = False
9 | phase = test
10 | load_iter = -1
11 | epoch = 105
12 |
13 | ;rendering module config
14 | headpose_dims = 6
15 | mobilev2_encoder_channels = 16,8,12,28,64,72,140,280
16 | mobilev2_decoder_channels = 16,8,14,24,64,96,140,280
17 | mobilev2_encoder_layers = 1,2,2,2,2,2,1
18 | mobilev2_decoder_layers = 1,2,2,2,2,2,1
19 | mobilev2_encoder_expansion_factor = 1,6,6,6,6,6,6
20 | mobilev2_decoder_expansion_factor = 1,6,6,6,6,6,6
21 | headpose_embedding_ngf = 8
22 |
23 | ;motion module config
24 | mn_ngf = 16
25 | n_local_enhancers = 2
26 | mn_n_downsampling = 2
27 | mn_n_blocks_local = 3
28 |
29 | ;discriminator
30 | disc_block_expansion = 32
31 | disc_num_blocks = 4
32 | disc_max_features = 512
33 |
34 | ;training parameters
35 | init_type = none
36 | init_gain = 0.02
37 | emphasize_face_area = True
38 | loss_scales = 1,0.5,0.25,0.125
39 | warp_loss_weight = 500.0
40 | reconstruction_loss_weight = 15.0
41 | feature_matching_loss_weight = 1
42 | face_area_weight_scale = 4
43 | init_field_epochs = 5
44 | lr = 0.0002
45 | beta1 = 0.9
46 | lr_policy = lambda
47 | epoch_count = 0
48 | niter = 90
49 | niter_decay = 15
50 | continue_train = False
51 |
52 | ;dataset parameters
53 | dataset_mode = voxceleb_test
54 | dataroot = ./dataset/VoxCeleb
55 | num_repeats = 60
56 | batch_size = 1
57 | serial_batches = True
58 | num_threads = 0
59 |
60 | ;vis_config
61 | display_freq = 200
62 | update_html_freq = 20
63 | display_id = 1
64 | display_server = http://localhost
65 | display_env = voxceleb_face2facerho
66 | display_port = 6005
67 | print_freq = 200
68 | save_latest_freq = 10000
69 | save_epoch_freq = 1
70 | no_html = True
71 | display_winsize = 256
72 | display_ncols = 3
73 | verbose = False
--------------------------------------------------------------------------------
/src/config/train_face2facerho.ini:
--------------------------------------------------------------------------------
1 | [ROOT]
2 | ;basic config
3 | name = voxceleb_face2facerho
4 | gpu_ids = 0
5 | checkpoints_dir = ./src/checkpoints
6 | model = face2face_rho
7 | output_size = 512
8 | isTrain = True
9 | phase = train
10 | load_iter = -1
11 | epoch = 0
12 |
13 | ;rendering module config
14 | headpose_dims = 6
15 | mobilev2_encoder_channels = 16,8,12,28,64,72,140,280
16 | mobilev2_decoder_channels = 16,8,14,24,64,96,140,280
17 | mobilev2_encoder_layers = 1,2,2,2,2,2,1
18 | mobilev2_decoder_layers = 1,2,2,2,2,2,1
19 | mobilev2_encoder_expansion_factor = 1,6,6,6,6,6,6
20 | mobilev2_decoder_expansion_factor = 1,6,6,6,6,6,6
21 | headpose_embedding_ngf = 8
22 |
23 | ;motion module config
24 | mn_ngf = 16
25 | n_local_enhancers = 2
26 | mn_n_downsampling = 2
27 | mn_n_blocks_local = 3
28 |
29 | ;discriminator
30 | disc_block_expansion = 32
31 | disc_num_blocks = 4
32 | disc_max_features = 512
33 |
34 | ;training parameters
35 | init_type = none
36 | init_gain = 0.02
37 | emphasize_face_area = True
38 | loss_scales = 1,0.5,0.25,0.125
39 | warp_loss_weight = 500.0
40 | reconstruction_loss_weight = 15.0
41 | feature_matching_loss_weight = 1
42 | face_area_weight_scale = 4
43 | init_field_epochs = 5
44 | lr = 0.0002
45 | beta1 = 0.9
46 | lr_policy = lambda
47 | epoch_count = 0
48 | niter = 90
49 | niter_decay = 15
50 | continue_train = False
51 |
52 | ;dataset parameters
53 | dataset_mode = voxceleb
54 | dataroot = ./trainingset/VoxCeleb
55 | num_repeats = 60
56 | batch_size = 6
57 | serial_batches = False
58 | num_threads = 8
59 |
60 | ;vis_config
61 | display_freq = 200
62 | update_html_freq = 20
63 | display_id = 1
64 | display_server = http://localhost
65 | display_env = voxceleb_face2facerho
66 | display_port = 6005
67 | print_freq = 200
68 | save_latest_freq = 10000
69 | save_epoch_freq = 1
70 | no_html = True
71 | display_winsize = 256
72 | display_ncols = 3
73 | verbose = False
--------------------------------------------------------------------------------
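Both configuration files above are flat ini files with a single ```[ROOT]``` section; the repository reads them through ```src/options/parse_config.py``` (not shown in this excerpt). As an illustration only, a stand-in parser that turns such a file into an attribute-style options object could look like the sketch below; the type-coercion rules here are guesses, not the repository's actual behaviour.

```python
import configparser
from types import SimpleNamespace

def load_ini(path):
    parser = configparser.ConfigParser()
    parser.read(path)
    opt = {}
    for key, raw in parser['ROOT'].items():
        if raw in ('True', 'False'):
            opt[key] = (raw == 'True')
        elif ',' in raw:                          # e.g. mobilev2_encoder_channels = 16,8,12,...
            opt[key] = [float(v) if '.' in v else int(v) for v in raw.split(',')]
        else:
            for cast in (int, float):
                try:
                    opt[key] = cast(raw)
                    break
                except ValueError:
                    opt[key] = raw                # plain strings such as checkpoints_dir
    return SimpleNamespace(**opt)

opt = load_ini('./src/config/train_face2facerho.ini')
print(opt.dataroot, opt.batch_size, opt.mobilev2_encoder_channels)
```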
/src/dataset/__init__.py:
--------------------------------------------------------------------------------
1 | import importlib
2 | import torch.utils.data
3 | from dataset.base_data_loader import BaseDataLoader
4 | from dataset.base_dataset import BaseDataset
5 |
6 |
7 | def find_dataset_using_name(dataset_name):
8 | # Given the option --dataset_mode [datasetname],
9 | # the file "dataset/datasetname_dataset.py"
10 | # will be imported.
11 | dataset_filename = "dataset." + dataset_name + "_dataset"
12 | datasetlib = importlib.import_module(dataset_filename)
13 |
14 | # In the file, the class called DatasetNameDataset() will
15 | # be instantiated. It has to be a subclass of BaseDataset,
16 | # and it is case-insensitive.
17 | dataset = None
18 | target_dataset_name = dataset_name.replace('_', '') + 'dataset'
19 | for name, cls in datasetlib.__dict__.items():
20 | if name.lower() == target_dataset_name.lower() \
21 | and issubclass(cls, BaseDataset):
22 | dataset = cls
23 |
24 | if dataset is None:
25 | print("In %s.py, there should be a subclass of BaseDataset with class name that matches %s in lowercase." % (dataset_filename, target_dataset_name))
26 | exit(0)
27 |
28 | return dataset
29 |
30 |
31 | def get_option_setter(dataset_name):
32 | dataset_class = find_dataset_using_name(dataset_name)
33 | return dataset_class.modify_commandline_options
34 |
35 |
36 | def create_dataset(opt):
37 | dataset = find_dataset_using_name(opt.dataset_mode)
38 | instance = dataset()
39 | instance.initialize(opt)
40 | print("dataset [%s] was created" % (instance.name()))
41 | return instance
42 |
43 |
44 | def CreateDataLoader(opt):
45 | data_loader = CustomDatasetDataLoader()
46 | data_loader.initialize(opt)
47 | return data_loader
48 |
49 |
50 | # Wrapper class of Dataset class that performs
51 | # multi-threaded data loading
52 | class CustomDatasetDataLoader(BaseDataLoader):
53 | def name(self):
54 | return 'CustomDatasetDataLoader'
55 |
56 | def initialize(self, opt):
57 | BaseDataLoader.initialize(self, opt)
58 | self.dataset = create_dataset(opt)
59 | if opt.serial_batches:
60 | self.dataloader = torch.utils.data.DataLoader(
61 | self.dataset,
62 | batch_size=opt.batch_size,
63 | shuffle=False,
64 | num_workers=int(opt.num_threads))
65 | else:
66 | #weights = make_weights_for_balanced_classes(dataset_train.imgs, len(dataset_train.classes))
67 | # weights = self.dataset.getSampleWeights()
68 | # weights = torch.DoubleTensor(weights)
69 | # sampler = torch.utils.data.sampler.WeightedRandomSampler(weights, len(weights))
70 |
71 | self.dataloader = torch.utils.data.DataLoader(
72 | self.dataset,
73 | batch_size=opt.batch_size,
74 | shuffle=True,
75 | # sampler=sampler,
76 | # pin_memory=True,
77 | num_workers=int(opt.num_threads))
78 |
79 |
80 | def load_data(self):
81 | return self
82 |
83 | def __len__(self):
84 | return min(len(self.dataset), self.opt.max_dataset_size)
85 |
86 | def __iter__(self):
87 | for i, data in enumerate(self.dataloader):
88 | if i * self.opt.batch_size >= self.opt.max_dataset_size:
89 | break
90 | yield data
91 |
--------------------------------------------------------------------------------
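The lookup in ```find_dataset_using_name``` above ties the ```dataset_mode``` option to a module and class by naming convention: mode ```voxceleb``` is resolved to ```dataset/voxceleb_dataset.py``` and, case-insensitively and ignoring underscores, to the class ```VoxCelebDataset```. A minimal illustration, assuming ```./src``` is on ```sys.path``` as it is when the training/testing scripts are run:

```python
from dataset import find_dataset_using_name

dataset_cls = find_dataset_using_name('voxceleb')   # imports dataset.voxceleb_dataset
print(dataset_cls.__name__)                         # -> VoxCelebDataset
```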
/src/dataset/base_data_loader.py:
--------------------------------------------------------------------------------
1 | class BaseDataLoader():
2 |     def __init__(self):
3 |         pass
4 |
5 |     def initialize(self, opt):
6 |         self.opt = opt
7 |
8 |     def load_data(self):
9 |         return None
10 |
--------------------------------------------------------------------------------
/src/dataset/base_dataset.py:
--------------------------------------------------------------------------------
1 | import torch.utils.data as data
2 | from PIL import Image
3 | import torchvision.transforms as transforms
4 | import torch
5 |
6 |
7 | class BaseDataset(data.Dataset):
8 | def __init__(self):
9 | super(BaseDataset, self).__init__()
10 |
11 | def name(self):
12 | return 'BaseDataset'
13 |
14 | @staticmethod
15 | def modify_commandline_options(parser, is_train):
16 | return parser
17 |
18 | def initialize(self, opt):
19 | pass
20 |
21 | def getSampleWeights(self):
22 | return torch.ones((len(self)))
23 |
24 | def __len__(self):
25 | return 0
26 |
27 |
28 | def get_transform(opt):
29 | transform_list = []
30 | if opt.resize_or_crop == 'resize_and_crop':
31 | osize = [opt.loadSize, opt.loadSize]
32 | transform_list.append(transforms.Resize(osize, Image.BICUBIC))
33 | transform_list.append(transforms.RandomCrop(opt.fineSize))
34 | elif opt.resize_or_crop == 'crop':
35 | transform_list.append(transforms.RandomCrop(opt.fineSize))
36 | elif opt.resize_or_crop == 'scale_width':
37 | transform_list.append(transforms.Lambda(
38 | lambda img: __scale_width(img, opt.fineSize)))
39 | elif opt.resize_or_crop == 'scale_width_and_crop':
40 | transform_list.append(transforms.Lambda(
41 | lambda img: __scale_width(img, opt.loadSize)))
42 | transform_list.append(transforms.RandomCrop(opt.fineSize))
43 | elif opt.resize_or_crop == 'none':
44 | transform_list.append(transforms.Lambda(
45 | lambda img: __adjust(img)))
46 | else:
47 | raise ValueError('--resize_or_crop %s is not a valid option.' % opt.resize_or_crop)
48 |
49 | if opt.isTrain and not opt.no_flip:
50 | transform_list.append(transforms.RandomHorizontalFlip())
51 |
52 | transform_list += [transforms.ToTensor(),
53 | transforms.Normalize((0.5, 0.5, 0.5),
54 | (0.5, 0.5, 0.5))]
55 | return transforms.Compose(transform_list)
56 |
57 |
58 | # just modify the width and height to be multiple of 4
59 | def __adjust(img):
60 | ow, oh = img.size
61 |
62 | # the size needs to be a multiple of this number,
63 | # because going through generator network may change img size
64 | # and eventually cause size mismatch error
65 | mult = 4
66 | if ow % mult == 0 and oh % mult == 0:
67 | return img
68 | w = (ow - 1) // mult
69 | w = (w + 1) * mult
70 | h = (oh - 1) // mult
71 | h = (h + 1) * mult
72 |
73 | if ow != w or oh != h:
74 | __print_size_warning(ow, oh, w, h)
75 |
76 | return img.resize((w, h), Image.BICUBIC)
77 |
78 |
79 | def __scale_width(img, target_width):
80 | ow, oh = img.size
81 |
82 | # the size needs to be a multiple of this number,
83 | # because going through generator network may change img size
84 | # and eventually cause size mismatch error
85 | mult = 4
86 | assert target_width % mult == 0, "the target width needs to be multiple of %d." % mult
87 | if (ow == target_width and oh % mult == 0):
88 | return img
89 | w = target_width
90 | target_height = int(target_width * oh / ow)
91 | m = (target_height - 1) // mult
92 | h = (m + 1) * mult
93 |
94 | if target_height != h:
95 | __print_size_warning(target_width, target_height, w, h)
96 |
97 | return img.resize((w, h), Image.BICUBIC)
98 |
99 |
100 | def __print_size_warning(ow, oh, w, h):
101 | if not hasattr(__print_size_warning, 'has_printed'):
102 | print("The image size needs to be a multiple of 4. "
103 | "The loaded image size was (%d, %d), so it was adjusted to "
104 | "(%d, %d). This adjustment will be done to all images "
105 | "whose sizes are not multiples of 4" % (ow, oh, w, h))
106 | __print_size_warning.has_printed = True
107 |
--------------------------------------------------------------------------------
/src/dataset/voxceleb_dataset.py:
--------------------------------------------------------------------------------
1 | import os.path
2 | import torch
3 | import numpy as np
4 | from dataset.base_dataset import BaseDataset
5 | from util.util import (
6 | make_ids,
7 | read_target,
8 | load_coeffs,
9 | load_landmarks
10 | )
11 |
12 | from util.landmark_image_generation import LandmarkImageGeneration
13 |
14 |
15 | class VoxCelebDataset(BaseDataset):
16 | def initialize(self, opt):
17 | self.opt = opt
18 | self.dataroot = opt.dataroot
19 | video_list_file = self.dataroot + "/list.txt"
20 | self.video_path = self.dataroot
21 | self.video_names = []
22 | with open(video_list_file, 'r') as f:
23 | lines = f.readlines()
24 | for line in lines:
25 | self.video_names.append(self.video_path + "/" + line)
26 | person_ids = set()
27 | for video in self.video_names:
28 | person_ids.add(os.path.basename(video).split('#')[0])
29 | self.person_ids = list(person_ids)
30 | self.person_ids.sort()
31 |
32 | self.landmark_img_generator = LandmarkImageGeneration(self.opt)
33 |
34 | self.total_person_id = len(self.person_ids)
35 | print('\tnum videos: {}, repeat {} times, total: {}'.format(self.total_person_id, opt.num_repeats,
36 | self.total_person_id * opt.num_repeats))
37 |
38 | def __getitem__(self, index):
39 | idx = index % self.total_person_id
40 | name = self.person_ids[idx]
41 | video_name = np.random.choice(self.choose_video_from_person_id(name))
42 | frame_ids = make_ids(video_name + "/img")
43 |
44 | frame_idx = np.sort(np.random.choice(frame_ids, replace=False, size=2))
45 |
46 | img_dir = video_name + "/img"
47 | headpose_dir = video_name + "/headpose"
48 | landmark_dir = video_name + "/landmark"
49 | mask_dir = video_name + "/mask"
50 |
51 | src_idx = frame_idx[0]
52 | drv_idx = frame_idx[1]
53 |
54 | src_img = read_target(img_dir + "/" + str(src_idx) + ".jpg", self.opt.output_size)
55 | drv_img = read_target(img_dir + "/" + str(drv_idx) + ".jpg", self.opt.output_size)
56 |
57 | src_headpose = torch.from_numpy(np.array(load_coeffs(headpose_dir + "/" + str(src_idx) + ".txt"))).float()
58 | drv_headpose = torch.from_numpy(np.array(load_coeffs(headpose_dir + "/" + str(drv_idx) + ".txt"))).float()
59 |
60 | src_landmark = torch.from_numpy(np.array(load_landmarks(landmark_dir + "/" + str(src_idx) + ".txt"))).float()
61 | drv_landmark = torch.from_numpy(np.array(load_landmarks(landmark_dir + "/" + str(drv_idx) + ".txt"))).float()
62 |
63 | src_landmark_img = self.draw_landmark_img(src_landmark)
64 | drv_landmark_img = self.draw_landmark_img(drv_landmark)
65 |
66 | input_data = {
67 | 'src_img': src_img,
68 | 'drv_img': drv_img,
69 | 'src_headpose': src_headpose,
70 | 'drv_headpose': drv_headpose,
71 | 'src_landmark_img': src_landmark_img,
72 | 'drv_landmark_img': drv_landmark_img,
73 | }
74 | if self.opt.emphasize_face_area:
75 | drv_face_mask = read_target(mask_dir + "/" + str(drv_idx) + ".png", self.opt.output_size)
76 | input_data['drv_face_mask'] = drv_face_mask.squeeze(0)
77 | return input_data
78 |
79 | def choose_video_from_person_id(self, name):
80 | names = []
81 | for video_name in self.video_names:
82 | if name in video_name:
83 | names.append(video_name.strip())
84 | return names
85 |
86 | def draw_landmark_img(self, landmarks):
87 | landmark_imgs = self.landmark_img_generator.generate_landmark_img(landmarks)
88 | return landmark_imgs
89 |
90 | def __len__(self):
91 | return self.total_person_id * self.opt.num_repeats
92 |
93 | def name(self):
94 | return 'VoxCelebDataset'
95 |
96 |
97 |
--------------------------------------------------------------------------------
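```VoxCelebDataset``` groups videos by identity: the person id is the part of each video folder name before the first ```#```, every ```__getitem__``` call picks one video of the sampled identity and two random frames from it (source and driving), and one epoch visits each identity ```num_repeats``` times. Below is a self-contained illustration of just the grouping logic; the folder names are the two example videos shipped with the repository.

```python
import os

video_names = [
    'id10001#7w0IBEWc9Qw#000993#001143',
    'id10009#AtavJVP4bCk#012568#012652',
]
# Same grouping as VoxCelebDataset.initialize: person id = folder name up to the first '#'.
person_ids = sorted({os.path.basename(v).split('#')[0] for v in video_names})
print(person_ids)                      # ['id10001', 'id10009']

num_repeats = 60                       # value from train_face2facerho.ini
print(len(person_ids) * num_repeats)   # dataset length reported by __len__: 120
```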
/src/external/LICENSE:
--------------------------------------------------------------------------------
1 | License
2 |
3 | Software Copyright License for non-commercial scientific research purposes
4 | Please read carefully the following terms and conditions and any accompanying documentation before you download
5 | and/or use the DECA model, data and software, (the "Model & Software"), including 3D meshes, software, and scripts.
6 | By downloading and/or using the Model & Software (including downloading, cloning, installing, and any other use
7 | of this github repository), you acknowledge that you have read these terms and conditions, understand them, and
8 | agree to be bound by them. If you do not agree with these terms and conditions, you must not download and/or use
9 | the Model & Software. Any infringement of the terms of this agreement will automatically terminate your rights
10 | under this License
11 |
12 | Ownership / Licensees
13 | The Model & Software and the associated materials has been developed at the
14 | Max Planck Institute for Intelligent Systems (hereinafter "MPI").
15 |
16 | Any copyright or patent right is owned by and proprietary material of the
17 | Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (hereinafter “MPG”; MPI and MPG hereinafter
18 | collectively “Max-Planck”) hereinafter the “Licensor”.
19 |
20 | License Grant
21 | Licensor grants you (Licensee) personally a single-user, non-exclusive, non-transferable, free of charge right:
22 |
23 | • To install the Model & Software on computers owned, leased or otherwise controlled by you and/or your organization.
24 | • To use the Model & Software for the sole purpose of performing peaceful non-commercial scientific research,
25 | non-commercial education, or non-commercial artistic projects.
26 |
27 | Any other use, in particular any use for commercial, pornographic, military, or surveillance purposes is prohibited.
28 | This includes, without limitation, incorporation in a commercial product, use in a commercial service,
29 | or production of other artefacts for commercial purposes.
30 |
31 | The Model & Software may not be used to create fake, libelous, misleading, or defamatory content of any kind, excluding
32 | analyses in peer-reviewed scientific research.
33 |
34 | The Model & Software may not be reproduced, modified and/or made available in any form to any third party
35 | without Max-Planck’s prior written permission.
36 |
37 | The Model & Software may not be used for pornographic purposes or to generate pornographic material whether
38 | commercial or not. This license also prohibits the use of the Model & Software to train methods/algorithms/neural
39 | networks/etc. for commercial use of any kind. By downloading the Model & Software, you agree not to reverse engineer it.
40 |
41 | No Distribution
42 | The Model & Software and the license herein granted shall not be copied, shared, distributed, re-sold, offered
43 | for re-sale, transferred or sub-licensed in whole or in part except that you may make one copy for archive
44 | purposes only.
45 |
46 | Disclaimer of Representations and Warranties
47 | You expressly acknowledge and agree that the Model & Software results from basic research, is provided “AS IS”,
48 | may contain errors, and that any use of the Model & Software is at your sole risk.
49 | LICENSOR MAKES NO REPRESENTATIONS
50 | OR WARRANTIES OF ANY KIND CONCERNING THE MODEL & SOFTWARE, NEITHER EXPRESS NOR IMPLIED, AND THE ABSENCE OF ANY
51 | LEGAL OR ACTUAL DEFECTS, WHETHER DISCOVERABLE OR NOT. Specifically, and not to limit the foregoing, licensor
52 | makes no representations or warranties (i) regarding the merchantability or fitness for a particular purpose of
53 | the Model & Software, (ii) that the use of the Model & Software will not infringe any patents, copyrights or other
54 | intellectual property rights of a third party, and (iii) that the use of the Model & Software will not cause any
55 | damage of any kind to you or a third party.
56 |
57 | Limitation of Liability
58 | Because this Model & Software License Agreement qualifies as a donation, according to Section 521 of the German
59 | Civil Code (Bürgerliches Gesetzbuch – BGB) Licensor as a donor is liable for intent and gross negligence only.
60 | If the Licensor fraudulently conceals a legal or material defect, they are obliged to compensate the Licensee
61 | for the resulting damage.
62 |
63 | Licensor shall be liable for loss of data only up to the amount of typical recovery costs which would have
64 | arisen had proper and regular data backup measures been taken. For the avoidance of doubt Licensor shall be
65 | liable in accordance with the German Product Liability Act in the event of product liability. The foregoing
66 | applies also to Licensor’s legal representatives or assistants in performance. Any further liability shall
67 | be excluded. Patent claims generated through the usage of the Model & Software cannot be directed towards the copyright holders.
68 | The Model & Software is provided in the state of development the licensor defines. If modified or extended by
69 | Licensee, the Licensor makes no claims about the fitness of the Model & Software and is not responsible
70 | for any problems such modifications cause.
71 |
72 | No Maintenance Services
73 | You understand and agree that Licensor is under no obligation to provide either maintenance services,
74 | update services, notices of latent defects, or corrections of defects with regard to the Model & Software.
75 | Licensor nevertheless reserves the right to update, modify, or discontinue the Model & Software at any time.
76 |
77 | Defects of the Model & Software must be notified in writing to the Licensor with a comprehensible description
78 | of the error symptoms. The notification of the defect should enable the reproduction of the error.
79 | The Licensee is encouraged to communicate any use, results, modification or publication.
80 |
81 | Publications using the Model & Software
82 | You acknowledge that the Model & Software is a valuable scientific resource and agree to appropriately reference
83 | the following paper in any publication making use of the Model & Software.
84 |
85 | Commercial licensing opportunities
86 | For commercial uses of the Model & Software, please send email to ps-license@tue.mpg.de
87 |
88 | This Agreement shall be governed by the laws of the Federal Republic of Germany except for the UN Sales Convention.
89 |
--------------------------------------------------------------------------------
/src/external/README.md:
--------------------------------------------------------------------------------
1 | # DECA: Detailed Expression Capture and Animation (SIGGRAPH2021)
2 |
3 | Please refer to [README](https://github.com/YadiraF/DECA/blob/master/README.md) for more details about DECA. If you want
4 | to use this code, you should follow the original [LICENSE](https://github.com/YadiraF/DECA/blob/master/LICENSE) of DECA:
5 |
6 | >This code and model are available for non-commercial scientific research purposes as defined in the [LICENSE](https://github.com/YadiraF/DECA/blob/master/LICENSE) file.
7 | By downloading and using the code and model you agree to the terms in the [LICENSE](https://github.com/YadiraF/DECA/blob/master/LICENSE).
--------------------------------------------------------------------------------
/src/external/data/landmark_embedding.json:
--------------------------------------------------------------------------------
1 | {
2 | "lmk_faces_idx": [
3 | 7726,
4 | 2666,
5 | 3406,
6 | 8460,
7 | 8371,
8 | 8456,
9 | 8455,
10 | 8384,
11 | 8382,
12 | 2622,
13 | 7155,
14 | 3680,
15 | 8909,
16 | 8198,
17 | 8258,
18 | 3195,
19 | 3112,
20 | 8829,
21 | 7930,
22 | 225,
23 | 3779,
24 | 8809,
25 | 3839,
26 | 3845,
27 | 5107,
28 | 6428,
29 | 1248,
30 | 8626,
31 | 1180,
32 | 3742,
33 | 8800,
34 | 2238,
35 | 7341,
36 | 8803,
37 | 5920,
38 | 6229,
39 | 6765,
40 | 6773,
41 | 2462,
42 | 3730,
43 | 8730,
44 | 6650,
45 | 8548,
46 | 1649,
47 | 8820,
48 | 3799,
49 | 6076,
50 | 5374,
51 | 759,
52 | 8593,
53 | 3691,
54 | 197,
55 | 7423,
56 | 2354,
57 | 7390,
58 | 7418,
59 | 2331,
60 | 917,
61 | 5997,
62 | 885,
63 | 917,
64 | 8219,
65 | 8768,
66 | 8635,
67 | 6055,
68 | 6026,
69 | 6069,
70 | 7498,
71 | 7446,
72 | 2452,
73 | 8668,
74 | 8799
75 | ],
76 | "lmk_bary_coords": [
77 | [
78 | 0.5,
79 | 0.5,
80 | 0.0
81 | ],
82 | [
83 | 1.0,
84 | 0.0,
85 | 0.0
86 | ],
87 | [
88 | 0.3333333,
89 | 0.3333333,
90 | 0.3333333
91 | ],
92 | [
93 | 0.5,
94 | 0.5,
95 | 0.0
96 | ],
97 | [
98 | 0.3333333,
99 | 0.3333333,
100 | 0.3333333
101 | ],
102 | [
103 | 0.3333333,
104 | 0.3333333,
105 | 0.3333333
106 | ],
107 | [
108 | 0.0,
109 | 0.5,
110 | 0.5
111 | ],
112 | [
113 | 0.5,
114 | 0.5,
115 | 0.0
116 | ],
117 | [
118 | 1.0,
119 | 0.0,
120 | 0.0
121 | ],
122 | [
123 | 0.5,
124 | 0.0,
125 | 0.5
126 | ],
127 | [
128 | 1.0,
129 | 0.0,
130 | 0.0
131 | ],
132 | [
133 | 0.3333333,
134 | 0.3333333,
135 | 0.3333333
136 | ],
137 | [
138 | 0.0,
139 | 0.5,
140 | 0.5
141 | ],
142 | [
143 | 0.3333333,
144 | 0.3333333,
145 | 0.3333333
146 | ],
147 | [
148 | 0.3333333,
149 | 0.3333333,
150 | 0.3333333
151 | ],
152 | [
153 | 0.0,
154 | 0.5,
155 | 0.5
156 | ],
157 | [
158 | 0.5,
159 | 0.0,
160 | 0.5
161 | ],
162 | [
163 | 0.3333333,
164 | 0.3333333,
165 | 0.3333333
166 | ],
167 | [
168 | 0.5,
169 | 0.0,
170 | 0.5
171 | ],
172 | [
173 | 0.5,
174 | 0.0,
175 | 0.5
176 | ],
177 | [
178 | 0.5,
179 | 0.0,
180 | 0.5
181 | ],
182 | [
183 | 0.5,
184 | 0.5,
185 | 0.0
186 | ],
187 | [
188 | 0.5,
189 | 0.0,
190 | 0.5
191 | ],
192 | [
193 | 0.0,
194 | 0.5,
195 | 0.5
196 | ],
197 | [
198 | 0.5,
199 | 0.5,
200 | 0.0
201 | ],
202 | [
203 | 0.5,
204 | 0.0,
205 | 0.5
206 | ],
207 | [
208 | 0.3333333,
209 | 0.3333333,
210 | 0.3333333
211 | ],
212 | [
213 | 4.999473057765158e-15,
214 | 0.9740002155303955,
215 | 0.025999775156378746
216 | ],
217 | [
218 | 0.8086484670639038,
219 | 0.01935010962188244,
220 | 0.1720014214515686
221 | ],
222 | [
223 | 0.003992011304944754,
224 | 0.4596897065639496,
225 | 0.536318302154541
226 | ],
227 | [
228 | 0.8673513531684875,
229 | 0.12561924755573273,
230 | 0.007029408123344183
231 | ],
232 | [
233 | 0.6809834837913513,
234 | 0.23452331125736237,
235 | 0.08449321240186691
236 | ],
237 | [
238 | 0.6094620823860168,
239 | 0.16802914440631866,
240 | 0.2225087583065033
241 | ],
242 | [
243 | 0.004817526787519455,
244 | 0.6991134881973267,
245 | 0.2960689663887024
246 | ],
247 | [
248 | 0.2225087583065033,
249 | 0.16802914440631866,
250 | 0.6094620823860168
251 | ],
252 | [
253 | 0.6809834837913513,
254 | 0.08449321240186691,
255 | 0.23452331125736237
256 | ],
257 | [
258 | 0.3333333,
259 | 0.3333333,
260 | 0.3333333
261 | ],
262 | [
263 | 0.3333333,
264 | 0.3333333,
265 | 0.3333333
266 | ],
267 | [
268 | 0.5,
269 | 0.5,
270 | 0
271 | ],
272 | [
273 | 0.3333333,
274 | 0.3333333,
275 | 0.3333333
276 | ],
277 | [
278 | 0.3333333,
279 | 0.3333333,
280 | 0.3333333
281 | ],
282 | [
283 | 0.15,
284 | 0.15,
285 | 0.7
286 | ],
287 | [
288 | 0.3333333,
289 | 0.3333333,
290 | 0.3333333
291 | ],
292 | [
293 | 0.3333333,
294 | 0.3333333,
295 | 0.3333333
296 | ],
297 | [
298 | 0.3333333,
299 | 0.3333333,
300 | 0.3333333
301 | ],
302 | [
303 | 0.3333333,
304 | 0.3333333,
305 | 0.3333333
306 | ],
307 | [
308 | 0.5,
309 | 0.0,
310 | 0.5
311 | ],
312 | [
313 | 0.3333333,
314 | 0.3333333,
315 | 0.3333333
316 | ],
317 | [
318 | 0.3333333,
319 | 0.3333333,
320 | 0.3333333
321 | ],
322 | [
323 | 0.3333333,
324 | 0.3333333,
325 | 0.3333333
326 | ],
327 | [
328 | 0.3333333,
329 | 0.3333333,
330 | 0.3333333
331 | ],
332 | [
333 | 0.15,
334 | 0.7,
335 | 0.15
336 | ],
337 | [
338 | 0.4711792767047882,
339 | 0.5057284235954285,
340 | 0.023092320188879967
341 | ],
342 | [
343 | 0.1317896991968155,
344 | 0.08670816570520401,
345 | 0.7815021276473999
346 | ],
347 | [
348 | 0.136175274848938,
349 | 0.701887845993042,
350 | 0.16193686425685883
351 | ],
352 | [
353 | 0.07477450370788574,
354 | 0.33583617210388184,
355 | 0.5893893241882324
356 | ],
357 | [
358 | 0.5,
359 | 0.0,
360 | 0.5
361 | ],
362 | [
363 | 0.4711792767047882,
364 | 0.023092320188879967,
365 | 0.5057284235954285
366 | ],
367 | [
368 | 0.13178616762161255,
369 | 0.7815194725990295,
370 | 0.08669432997703552
371 | ],
372 | [
373 | 0.136175274848938,
374 | 0.16193686425685883,
375 | 0.701887845993042
376 | ],
377 | [
378 | 0.8093733191490173,
379 | -9.597695766394229e-14,
380 | 0.19062671065330505
381 | ],
382 | [
383 | 0.5,
384 | 0.5,
385 | 0.0
386 | ],
387 | [
388 | 0.00028422221657820046,
389 | 0.8192539215087891,
390 | 0.18046186864376068
391 | ],
392 | [
393 | 0.9783512353897095,
394 | 0.02164875715970993,
395 | -1.3638637584277862e-15
396 | ],
397 | [
398 | 0.5,
399 | 0.5,
400 | 0.0
401 | ],
402 | [
403 | 0.09441351890563965,
404 | 0.9030233025550842,
405 | 0.0025631925091147423
406 | ],
407 | [
408 | 0.047000590711832047,
409 | 0.9423790574073792,
410 | 0.010620360262691975
411 | ],
412 | [
413 | 0.0,
414 | 0.5,
415 | 0.5
416 | ],
417 | [
418 | 0.0025631925091147423,
419 | 0.9030233025550842,
420 | 0.09441351890563965
421 | ],
422 | [
423 | 0.047000590711832047,
424 | 0.010620360262691975,
425 | 0.9423790574073792
426 | ],
427 | [
428 | 0.02153543196618557,
429 | 0.9784510731697083,
430 | 1.3496901374310255e-05
431 | ],
432 | [
433 | 0.0001076170738087967,
434 | 0.03357921168208122,
435 | 0.9663131833076477
436 | ]
437 | ]
438 | }
--------------------------------------------------------------------------------
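```lmk_faces_idx``` and ```lmk_bary_coords``` above define 72 facial keypoints as barycentric coordinates on triangles of the FLAME template mesh; the 2D landmarks used elsewhere in this repository are obtained by projecting these 3D points into the image (cf. Section 3.3 of the paper). Assuming the usual FLAME-style embedding convention (each landmark is a barycentric combination of the three vertices of the listed face), the 3D keypoints can be recovered from a fitted mesh roughly as in the sketch below; this is illustrative, not the repository's own code.

```python
import json
import numpy as np

def landmarks_from_mesh(vertices, faces, embedding_path):
    """vertices: (V, 3) fitted FLAME vertices; faces: (F, 3) triangle vertex indices."""
    with open(embedding_path) as f:
        emb = json.load(f)
    lmk_faces = np.asarray(emb['lmk_faces_idx'], dtype=np.int64)      # (72,)
    lmk_bary = np.asarray(emb['lmk_bary_coords'], dtype=np.float64)   # (72, 3)
    tri_verts = vertices[faces[lmk_faces]]                            # (72, 3, 3)
    # Barycentric interpolation: weighted sum of the three vertices of each landmark triangle.
    return np.einsum('lv,lvc->lc', lmk_bary, tri_verts)               # (72, 3)
```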
/src/external/data/pose_transform_config.json:
--------------------------------------------------------------------------------
1 | {
2 | "scale_transform": 4122.399989645386,
3 | "tx_transform": 0.2582781138863169,
4 | "ty_transform": -0.26074984122168304
5 | }
--------------------------------------------------------------------------------
/src/external/decalib/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/src/external/decalib/__init__.py
--------------------------------------------------------------------------------
/src/external/decalib/datasets/aflw2000.py:
--------------------------------------------------------------------------------
1 | import os, sys
2 | import torch
3 | import torchvision.transforms as transforms
4 | import numpy as np
5 | import cv2
6 | import scipy
7 | from skimage.io import imread, imsave
8 | from skimage.transform import estimate_transform, warp, resize, rescale
9 | from glob import glob
10 | from torch.utils.data import Dataset, DataLoader, ConcatDataset
11 | import scipy.io
12 |
13 | class AFLW2000(Dataset):
14 | def __init__(self, testpath='/ps/scratch/yfeng/Data/AFLW2000/GT', crop_size=224):
15 | '''
16 | data class for loading AFLW2000 dataset
17 | make sure each image has corresponding mat file, which provides cropping infromation
18 | '''
19 | if os.path.isdir(testpath):
20 | self.imagepath_list = glob(testpath + '/*.jpg') + glob(testpath + '/*.png')
21 | elif isinstance(testpath, list):
22 | self.imagepath_list = testpath
23 | elif os.path.isfile(testpath) and (testpath[-3:] in ['jpg', 'png']):
24 | self.imagepath_list = [testpath]
25 | else:
26 | print('please check the input path')
27 | exit()
28 | print('total {} images'.format(len(self.imagepath_list)))
29 | self.imagepath_list = sorted(self.imagepath_list)
30 | self.crop_size = crop_size
31 | self.scale = 1.6
32 | self.resolution_inp = crop_size
33 |
34 | def __len__(self):
35 | return len(self.imagepath_list)
36 |
37 | def __getitem__(self, index):
38 | imagepath = self.imagepath_list[index]
39 | imagename = imagepath.split('/')[-1].split('.')[0]
40 | image = imread(imagepath)[:,:,:3]
41 | kpt = scipy.io.loadmat(imagepath.replace('jpg', 'mat'))['pt3d_68'].T
42 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]);
43 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1])
44 |
45 | h, w, _ = image.shape
46 | old_size = (right - left + bottom - top)/2
47 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ])#+ old_size*0.1])
48 | size = int(old_size*self.scale)
49 |
50 | # crop image
51 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]])
52 | DST_PTS = np.array([[0,0], [0,self.resolution_inp - 1], [self.resolution_inp - 1, 0]])
53 | tform = estimate_transform('similarity', src_pts, DST_PTS)
54 |
55 | image = image/255.
56 | dst_image = warp(image, tform.inverse, output_shape=(self.resolution_inp, self.resolution_inp))
57 | dst_image = dst_image.transpose(2,0,1)
58 | return {'image': torch.tensor(dst_image).float(),
59 | 'imagename': imagename,
60 | # 'tform': tform,
61 | # 'original_image': torch.tensor(image.transpose(2,0,1)).float(),
62 | }
--------------------------------------------------------------------------------
/src/external/decalib/datasets/build_datasets.py:
--------------------------------------------------------------------------------
1 | import os, sys
2 | import torch
3 | from torch.utils.data import Dataset, ConcatDataset
4 | import torchvision.transforms as transforms
5 | import numpy as np
6 | import cv2
7 | import scipy
8 | from skimage.io import imread, imsave
9 | from skimage.transform import estimate_transform, warp, resize, rescale
10 | from glob import glob
11 |
12 | from .vggface import VGGFace2Dataset
13 | from .ethnicity import EthnicityDataset
14 | from .aflw2000 import AFLW2000
15 | from .now import NoWDataset
16 | from .vox import VoxelDataset
17 |
18 | def build_train(config, is_train=True):
19 | data_list = []
20 | if 'vox2' in config.training_data:
21 | data_list.append(VoxelDataset(dataname='vox2', K=config.K, image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale, isSingle=config.isSingle))
22 | if 'vggface2' in config.training_data:
23 | data_list.append(VGGFace2Dataset(K=config.K, image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale, isSingle=config.isSingle))
24 | if 'vggface2hq' in config.training_data:
25 | data_list.append(VGGFace2HQDataset(K=config.K, image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale, isSingle=config.isSingle))
26 | if 'ethnicity' in config.training_data:
27 | data_list.append(EthnicityDataset(K=config.K, image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale, isSingle=config.isSingle))
28 | if 'coco' in config.training_data:
29 | data_list.append(COCODataset(image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale))
30 | if 'celebahq' in config.training_data:
31 | data_list.append(CelebAHQDataset(image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale))
32 | dataset = ConcatDataset(data_list)
33 |
34 | return dataset
35 |
36 | def build_val(config, is_train=True):
37 | data_list = []
38 | if 'vggface2' in config.eval_data:
39 | data_list.append(VGGFace2Dataset(isEval=True, K=config.K, image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale, isSingle=config.isSingle))
40 | if 'now' in config.eval_data:
41 | data_list.append(NoWDataset())
42 | if 'aflw2000' in config.eval_data:
43 | data_list.append(AFLW2000())
44 | dataset = ConcatDataset(data_list)
45 |
46 | return dataset
47 |
--------------------------------------------------------------------------------
/src/external/decalib/datasets/datasets.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | #
3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is
4 | # holder of all proprietary rights on this computer program.
5 | # Using this computer program means that you agree to the terms
6 | # in the LICENSE file included with this software distribution.
7 | # Any use not explicitly granted by the LICENSE is prohibited.
8 | #
9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung
10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute
11 | # for Intelligent Systems. All rights reserved.
12 | #
13 | # For comments or questions, please email us at deca@tue.mpg.de
14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de
15 |
16 | import os, sys
17 | import torch
18 | from torch.utils.data import Dataset, DataLoader
19 | import torchvision.transforms as transforms
20 | import numpy as np
21 | import cv2
22 | import scipy
23 | from skimage.io import imread, imsave
24 | from skimage.transform import estimate_transform, warp, resize, rescale
25 | from glob import glob
26 | import scipy.io
27 |
28 | from . import detectors
29 |
30 | def video2sequence(video_path, sample_step=10):
31 | videofolder = os.path.splitext(video_path)[0]
32 | os.makedirs(videofolder, exist_ok=True)
33 | video_name = os.path.splitext(os.path.split(video_path)[-1])[0]
34 | vidcap = cv2.VideoCapture(video_path)
35 | success,image = vidcap.read()
36 | count = 0
37 | imagepath_list = []
38 | while success:
39 | # if count%sample_step == 0:
40 | imagepath = os.path.join(videofolder, f'{video_name}_frame{count:04d}.jpg')
41 | cv2.imwrite(imagepath, image) # save frame as JPEG file
42 | success,image = vidcap.read()
43 | count += 1
44 | imagepath_list.append(imagepath)
45 | print('video frames are stored in {}'.format(videofolder))
46 | return imagepath_list
47 |
48 | class TestData(Dataset):
49 | def __init__(self, testpath, iscrop=True, crop_size=224, scale=1.25, face_detector='fan', sample_step=10):
50 | '''
51 | testpath: folder, imagepath_list, image path, video path
52 | '''
53 | if isinstance(testpath, list):
54 | self.imagepath_list = testpath
55 | elif os.path.isdir(testpath):
56 | self.imagepath_list = glob(testpath + '/*.jpg') + glob(testpath + '/*.png') + glob(testpath + '/*.bmp')
57 | elif os.path.isfile(testpath) and (testpath[-3:] in ['jpg', 'png', 'bmp']):
58 | self.imagepath_list = [testpath]
59 | elif os.path.isfile(testpath) and (testpath[-3:] in ['mp4', 'csv', 'vid', 'ebm']):
60 | self.imagepath_list = video2sequence(testpath, sample_step)
61 | else:
62 | print(f'please check the test path: {testpath}')
63 | exit()
64 | # print('total {} images'.format(len(self.imagepath_list)))
65 | self.imagepath_list = sorted(self.imagepath_list)
66 | self.crop_size = crop_size
67 | self.scale = scale
68 | self.iscrop = iscrop
69 | self.resolution_inp = crop_size
70 | if face_detector == 'fan':
71 | self.face_detector = detectors.FAN()
72 | # elif face_detector == 'mtcnn':
73 | # self.face_detector = detectors.MTCNN()
74 | else:
75 | print(f'please check the detector: {face_detector}')
76 | exit()
77 |
78 | def __len__(self):
79 | return len(self.imagepath_list)
80 |
81 | def bbox2point(self, left, right, top, bottom, type='bbox'):
82 | ''' bbox from detector and landmarks are different
83 | '''
84 | if type=='kpt68':
85 | old_size = (right - left + bottom - top)/2*1.1
86 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ])
87 | elif type=='bbox':
88 | old_size = (right - left + bottom - top)/2
89 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 + old_size*0.12])
90 | else:
91 | raise NotImplementedError
92 | return old_size, center
93 |
94 | def __getitem__(self, index):
95 | imagepath = self.imagepath_list[index]
96 | imagename = os.path.splitext(os.path.split(imagepath)[-1])[0]
97 | image = np.array(imread(imagepath))
98 | if len(image.shape) == 2:
99 | image = image[:,:,None].repeat(1,1,3)
100 | if len(image.shape) == 3 and image.shape[2] > 3:
101 | image = image[:,:,:3]
102 |
103 | h, w, _ = image.shape
104 | if h!=w:
105 | print('only support square image!')
106 | exit(-1)
107 | if self.iscrop:
108 | # provide kpt as txt file, or mat file (for AFLW2000)
109 | kpt_matpath = os.path.splitext(imagepath)[0]+'.mat'
110 | kpt_txtpath = os.path.splitext(imagepath)[0]+'.txt'
111 | if os.path.exists(kpt_matpath):
112 | kpt = scipy.io.loadmat(kpt_matpath)['pt3d_68'].T
113 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]);
114 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1])
115 | old_size, center = self.bbox2point(left, right, top, bottom, type='kpt68')
116 | elif os.path.exists(kpt_txtpath):
117 | kpt = np.loadtxt(kpt_txtpath)
118 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]);
119 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1])
120 | old_size, center = self.bbox2point(left, right, top, bottom, type='kpt68')
121 | else:
122 | bbox, bbox_type = self.face_detector.run(image)
123 | if len(bbox) < 4:
124 | print('no face detected! run original image')
125 | left = 0; right = h-1; top=0; bottom=w-1
126 | else:
127 | left = bbox[0]; right=bbox[2]
128 | top = bbox[1]; bottom=bbox[3]
129 | old_size, center = self.bbox2point(left, right, top, bottom, type=bbox_type)
130 | size = int(old_size*self.scale)
131 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]])
132 | else:
133 | src_pts = np.array([[0, 0], [0, h-1], [w-1, 0]])
134 |
135 | DST_PTS = np.array([[0,0], [0,self.resolution_inp - 1], [self.resolution_inp - 1, 0]])
136 | tform = estimate_transform('similarity', src_pts, DST_PTS)
137 |
138 | image = image/255.
139 |
140 | dst_image = warp(image, tform.inverse, output_shape=(self.resolution_inp, self.resolution_inp))
141 | # cv2.imwrite("../crop_img.png", dst_image * 255)
142 | dst_image = dst_image.transpose(2,0,1)
143 | return {'image': torch.tensor(dst_image).float(),
144 | 'imagename': imagename,
145 | 'tform': torch.tensor(tform.params).float(),
146 | 'original_image': torch.tensor(image.transpose(2,0,1)).float(),
147 | }
--------------------------------------------------------------------------------
/src/external/decalib/datasets/detectors.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | #
3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is
4 | # holder of all proprietary rights on this computer program.
5 | # Using this computer program means that you agree to the terms
6 | # in the LICENSE file included with this software distribution.
7 | # Any use not explicitly granted by the LICENSE is prohibited.
8 | #
9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung
10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute
11 | # for Intelligent Systems. All rights reserved.
12 | #
13 | # For comments or questions, please email us at deca@tue.mpg.de
14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de
15 |
16 | import numpy as np
17 | import torch
18 |
19 | class FAN(object):
20 | def __init__(self):
21 | import face_alignment
22 | self.model = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, flip_input=False)
23 |
24 | def run(self, image):
25 | '''
26 | image: 0-255, uint8, rgb, [h, w, 3]
27 | return: detected box list
28 | '''
29 | out = self.model.get_landmarks(image)
30 | if out is None:
31 | return [0], 'kpt68'
32 | else:
33 | kpt = out[0].squeeze()
34 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]);
35 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1])
36 | bbox = [left,top, right, bottom]
37 | return bbox, 'kpt68'
38 |
39 | class MTCNN(object):
40 | def __init__(self, device = 'cpu'):
41 | '''
42 | https://github.com/timesler/facenet-pytorch/blob/master/examples/infer.ipynb
43 | '''
44 | from facenet_pytorch import MTCNN as mtcnn
45 | self.device = device
46 | self.model = mtcnn(keep_all=True)
47 | def run(self, input):
48 | '''
49 | image: 0-255, uint8, rgb, [h, w, 3]
50 | return: detected box
51 | '''
52 | out = self.model.detect(input[None,...])
53 | if out[0][0] is None:
54 | return [0]
55 | else:
56 | bbox = out[0][0].squeeze()
57 | return bbox, 'bbox'
58 |
59 |
60 |
61 |
--------------------------------------------------------------------------------
/src/external/decalib/datasets/ethnicity.py:
--------------------------------------------------------------------------------
1 | import os, sys
2 | import torch
3 | import torchvision.transforms as transforms
4 | import numpy as np
5 | import cv2
6 | import scipy
7 | from skimage.io import imread, imsave
8 | from skimage.transform import estimate_transform, warp, resize, rescale
9 | from glob import glob
10 | from torch.utils.data import Dataset, DataLoader, ConcatDataset
11 |
12 | class EthnicityDataset(Dataset):
13 | def __init__(self, K, image_size, scale, trans_scale = 0, isTemporal=False, isEval=False, isSingle=False):
14 | '''
15 | K must be less than 6
16 | '''
17 | self.K = K
18 | self.image_size = image_size
19 | self.imagefolder = '/ps/scratch/face2d3d/train'
20 | self.kptfolder = '/ps/scratch/face2d3d/train_annotated_torch7/'
21 | self.segfolder = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_seg/test_crop_size_400_batch/'
22 | # hq:
23 | # datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_bbx_size_bigger_than_400_train_list_max_normal_100_ring_5_1_serial.npy'
24 | datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_and_race_per_7000_african_asian_2d_train_list_max_normal_100_ring_5_1_serial.npy'
25 | self.data_lines = np.load(datafile).astype('str')
26 |
27 | self.isTemporal = isTemporal
28 | self.scale = scale #[scale_min, scale_max]
29 | self.trans_scale = trans_scale #[dx, dy]
30 | self.isSingle = isSingle
31 | if isSingle:
32 | self.K = 1
33 |
34 | def __len__(self):
35 | return len(self.data_lines)
36 |
37 | def __getitem__(self, idx):
38 | images_list = []; kpt_list = []; mask_list = []
39 | for i in range(self.K):
40 | name = self.data_lines[idx, i]
41 | if name[0]=='n':
42 | self.imagefolder = '/ps/scratch/face2d3d/train/'
43 | self.kptfolder = '/ps/scratch/face2d3d/train_annotated_torch7/'
44 | self.segfolder = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_seg/test_crop_size_400_batch/'
45 | elif name[0]=='A':
46 | self.imagefolder = '/ps/scratch/face2d3d/race_per_7000/'
47 | self.kptfolder = '/ps/scratch/face2d3d/race_per_7000_annotated_torch7_new/'
48 | self.segfolder = '/ps/scratch/face2d3d/texture_in_the_wild_code/race7000_seg/test_crop_size_400_batch/'
49 |
50 | image_path = os.path.join(self.imagefolder, name + '.jpg')
51 | seg_path = os.path.join(self.segfolder, name + '.npy')
52 | kpt_path = os.path.join(self.kptfolder, name + '.npy')
53 |
54 | image = imread(image_path)/255.
55 | kpt = np.load(kpt_path)[:,:2]
56 | mask = self.load_mask(seg_path, image.shape[0], image.shape[1])
57 |
58 | ### crop information
59 | tform = self.crop(image, kpt)
60 | ## crop
61 | cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size))
62 | cropped_mask = warp(mask, tform.inverse, output_shape=(self.image_size, self.image_size))
63 | cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params)
64 |
65 | # normalized kpt
66 | cropped_kpt[:,:2] = cropped_kpt[:,:2]/self.image_size * 2 - 1
67 |
68 | images_list.append(cropped_image.transpose(2,0,1))
69 | kpt_list.append(cropped_kpt)
70 | mask_list.append(cropped_mask)
71 |
72 | ###
73 |         images_array = torch.from_numpy(np.array(images_list)).type(dtype = torch.float32) #K,3,224,224
74 |         kpt_array = torch.from_numpy(np.array(kpt_list)).type(dtype = torch.float32) #K,n_kpt,3
75 |         mask_array = torch.from_numpy(np.array(mask_list)).type(dtype = torch.float32) #K,224,224
76 |
77 | if self.isSingle:
78 | images_array = images_array.squeeze()
79 | kpt_array = kpt_array.squeeze()
80 | mask_array = mask_array.squeeze()
81 |
82 | data_dict = {
83 | 'image': images_array,
84 | 'landmark': kpt_array,
85 | 'mask': mask_array
86 | }
87 |
88 | return data_dict
89 |
90 | def crop(self, image, kpt):
91 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]);
92 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1])
93 |
94 | h, w, _ = image.shape
95 | old_size = (right - left + bottom - top)/2
96 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ])#+ old_size*0.1])
97 | # translate center
98 | trans_scale = (np.random.rand(2)*2 -1) * self.trans_scale
99 | center = center + trans_scale*old_size # 0.5
100 |
101 | scale = np.random.rand() * (self.scale[1] - self.scale[0]) + self.scale[0]
102 | size = int(old_size*scale)
103 |
104 | # crop image
105 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]])
106 | DST_PTS = np.array([[0,0], [0,self.image_size - 1], [self.image_size - 1, 0]])
107 | tform = estimate_transform('similarity', src_pts, DST_PTS)
108 |
109 | # cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size))
110 | # # change kpt accordingly
111 | # cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params)
112 | return tform
113 |
114 | def load_mask(self, maskpath, h, w):
115 | # print(maskpath)
116 | if os.path.isfile(maskpath):
117 | vis_parsing_anno = np.load(maskpath)
118 | # atts = ['skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r',
119 | # 'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']
120 | mask = np.zeros_like(vis_parsing_anno)
121 | # for i in range(1, 16):
122 | mask[vis_parsing_anno>0.5] = 1.
123 | else:
124 | mask = np.ones((h, w))
125 | return mask
126 |
127 |
--------------------------------------------------------------------------------
/src/external/decalib/datasets/now.py:
--------------------------------------------------------------------------------
1 | import os, sys
2 | import torch
3 | import torchvision.transforms as transforms
4 | import numpy as np
5 | import cv2
6 | import scipy
7 | from skimage.io import imread, imsave
8 | from skimage.transform import estimate_transform, warp, resize, rescale
9 | from glob import glob
10 | from torch.utils.data import Dataset, DataLoader, ConcatDataset
11 |
12 | class NoWDataset(Dataset):
13 | def __init__(self, ring_elements=6, crop_size=224, scale=1.6):
14 | folder = '/ps/scratch/yfeng/other-github/now_evaluation/data/NoW_Dataset'
15 | self.data_path = os.path.join(folder, 'imagepathsvalidation.txt')
16 | with open(self.data_path) as f:
17 | self.data_lines = f.readlines()
18 |
19 | self.imagefolder = os.path.join(folder, 'final_release_version', 'iphone_pictures')
20 | self.bbxfolder = os.path.join(folder, 'final_release_version', 'detected_face')
21 |
22 | # self.data_path = '/ps/scratch/face2d3d/ringnetpp/eccv/test_data/evaluation/NoW_Dataset/final_release_version/test_image_paths_ring_6_elements.npy'
23 | # self.imagepath = '/ps/scratch/face2d3d/ringnetpp/eccv/test_data/evaluation/NoW_Dataset/final_release_version/iphone_pictures/'
24 | # self.bbxpath = '/ps/scratch/face2d3d/ringnetpp/eccv/test_data/evaluation/NoW_Dataset/final_release_version/detected_face/'
25 | self.crop_size = crop_size
26 | self.scale = scale
27 |
28 | def __len__(self):
29 | return len(self.data_lines)
30 |
31 | def __getitem__(self, index):
32 | imagepath = os.path.join(self.imagefolder, self.data_lines[index].strip()) #+ '.jpg'
33 | bbx_path = os.path.join(self.bbxfolder, self.data_lines[index].strip().replace('.jpg', '.npy'))
34 | bbx_data = np.load(bbx_path, allow_pickle=True, encoding='latin1').item()
35 | # box = np.array([[bbx_data['left'], bbx_data['top']], [bbx_data['right'], bbx_data['bottom']]]).astype('float32')
36 | left = bbx_data['left']; right = bbx_data['right']
37 | top = bbx_data['top']; bottom = bbx_data['bottom']
38 |
39 | imagename = imagepath.split('/')[-1].split('.')[0]
40 | image = imread(imagepath)[:,:,:3]
41 |
42 | h, w, _ = image.shape
43 | old_size = (right - left + bottom - top)/2
44 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ])
45 | size = int(old_size*self.scale)
46 |
47 | # crop image
48 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]])
49 | DST_PTS = np.array([[0,0], [0,self.crop_size - 1], [self.crop_size - 1, 0]])
50 | tform = estimate_transform('similarity', src_pts, DST_PTS)
51 |
52 | image = image/255.
53 | dst_image = warp(image, tform.inverse, output_shape=(self.crop_size, self.crop_size))
54 | dst_image = dst_image.transpose(2,0,1)
55 | return {'image': torch.tensor(dst_image).float(),
56 | 'imagename': self.data_lines[index].strip().replace('.jpg', ''),
57 | # 'tform': tform,
58 | # 'original_image': torch.tensor(image.transpose(2,0,1)).float(),
59 | }
--------------------------------------------------------------------------------
/src/external/decalib/datasets/vggface.py:
--------------------------------------------------------------------------------
1 | import os, sys
2 | import torch
3 | import torchvision.transforms as transforms
4 | import numpy as np
5 | import cv2
6 | import scipy
7 | from skimage.io import imread, imsave
8 | from skimage.transform import estimate_transform, warp, resize, rescale
9 | from glob import glob
10 | from torch.utils.data import Dataset, DataLoader, ConcatDataset
11 |
12 | class VGGFace2Dataset(Dataset):
13 | def __init__(self, K, image_size, scale, trans_scale = 0, isTemporal=False, isEval=False, isSingle=False):
14 | '''
15 | K must be less than 6
16 | '''
17 | self.K = K
18 | self.image_size = image_size
19 | self.imagefolder = '/ps/scratch/face2d3d/train'
20 | self.kptfolder = '/ps/scratch/face2d3d/train_annotated_torch7'
21 | self.segfolder = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_seg/test_crop_size_400_batch'
22 | # hq:
23 | # datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_bbx_size_bigger_than_400_train_list_max_normal_100_ring_5_1_serial.npy'
24 | datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_train_list_max_normal_100_ring_5_1_serial.npy'
25 | if isEval:
26 | datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_val_list_max_normal_100_ring_5_1_serial.npy'
27 | self.data_lines = np.load(datafile).astype('str')
28 |
29 | self.isTemporal = isTemporal
30 | self.scale = scale #[scale_min, scale_max]
31 | self.trans_scale = trans_scale #[dx, dy]
32 | self.isSingle = isSingle
33 | if isSingle:
34 | self.K = 1
35 |
36 | def __len__(self):
37 | return len(self.data_lines)
38 |
39 | def __getitem__(self, idx):
40 | images_list = []; kpt_list = []; mask_list = []
41 |
42 | random_ind = np.random.permutation(5)[:self.K]
43 | for i in random_ind:
44 | name = self.data_lines[idx, i]
45 | image_path = os.path.join(self.imagefolder, name + '.jpg')
46 | seg_path = os.path.join(self.segfolder, name + '.npy')
47 | kpt_path = os.path.join(self.kptfolder, name + '.npy')
48 |
49 | image = imread(image_path)/255.
50 | kpt = np.load(kpt_path)[:,:2]
51 | mask = self.load_mask(seg_path, image.shape[0], image.shape[1])
52 |
53 | ### crop information
54 | tform = self.crop(image, kpt)
55 | ## crop
56 | cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size))
57 | cropped_mask = warp(mask, tform.inverse, output_shape=(self.image_size, self.image_size))
58 | cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params)
59 |
60 | # normalized kpt
61 | cropped_kpt[:,:2] = cropped_kpt[:,:2]/self.image_size * 2 - 1
62 |
63 | images_list.append(cropped_image.transpose(2,0,1))
64 | kpt_list.append(cropped_kpt)
65 | mask_list.append(cropped_mask)
66 |
67 | ###
68 |         images_array = torch.from_numpy(np.array(images_list)).type(dtype = torch.float32) #K,3,224,224
69 |         kpt_array = torch.from_numpy(np.array(kpt_list)).type(dtype = torch.float32) #K,n_kpt,3
70 |         mask_array = torch.from_numpy(np.array(mask_list)).type(dtype = torch.float32) #K,224,224
71 |
72 | if self.isSingle:
73 | images_array = images_array.squeeze()
74 | kpt_array = kpt_array.squeeze()
75 | mask_array = mask_array.squeeze()
76 |
77 | data_dict = {
78 | 'image': images_array,
79 | 'landmark': kpt_array,
80 | 'mask': mask_array
81 | }
82 |
83 | return data_dict
84 |
85 | def crop(self, image, kpt):
86 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]);
87 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1])
88 |
89 | h, w, _ = image.shape
90 | old_size = (right - left + bottom - top)/2
91 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ])#+ old_size*0.1])
92 | # translate center
93 | trans_scale = (np.random.rand(2)*2 -1) * self.trans_scale
94 | center = center + trans_scale*old_size # 0.5
95 |
96 | scale = np.random.rand() * (self.scale[1] - self.scale[0]) + self.scale[0]
97 | size = int(old_size*scale)
98 |
99 | # crop image
100 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]])
101 | DST_PTS = np.array([[0,0], [0,self.image_size - 1], [self.image_size - 1, 0]])
102 | tform = estimate_transform('similarity', src_pts, DST_PTS)
103 |
104 | # cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size))
105 | # # change kpt accordingly
106 | # cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params)
107 | return tform
108 |
109 | def load_mask(self, maskpath, h, w):
110 | # print(maskpath)
111 | if os.path.isfile(maskpath):
112 | vis_parsing_anno = np.load(maskpath)
113 | # atts = ['skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r',
114 | # 'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']
115 | mask = np.zeros_like(vis_parsing_anno)
116 | # for i in range(1, 16):
117 | mask[vis_parsing_anno>0.5] = 1.
118 | else:
119 | mask = np.ones((h, w))
120 | return mask
121 |
122 |
123 |
124 | class VGGFace2HQDataset(Dataset):
125 | def __init__(self, K, image_size, scale, trans_scale = 0, isTemporal=False, isEval=False, isSingle=False):
126 | '''
127 | K must be less than 6
128 | '''
129 | self.K = K
130 | self.image_size = image_size
131 | self.imagefolder = '/ps/scratch/face2d3d/train'
132 | self.kptfolder = '/ps/scratch/face2d3d/train_annotated_torch7'
133 | self.segfolder = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_seg/test_crop_size_400_batch'
134 | # hq:
135 | # datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_bbx_size_bigger_than_400_train_list_max_normal_100_ring_5_1_serial.npy'
136 | datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_bbx_size_bigger_than_400_train_list_max_normal_100_ring_5_1_serial.npy'
137 | self.data_lines = np.load(datafile).astype('str')
138 |
139 | self.isTemporal = isTemporal
140 | self.scale = scale #[scale_min, scale_max]
141 | self.trans_scale = trans_scale #[dx, dy]
142 | self.isSingle = isSingle
143 | if isSingle:
144 | self.K = 1
145 |
146 | def __len__(self):
147 | return len(self.data_lines)
148 |
149 | def __getitem__(self, idx):
150 | images_list = []; kpt_list = []; mask_list = []
151 |
152 | for i in range(self.K):
153 | name = self.data_lines[idx, i]
154 | image_path = os.path.join(self.imagefolder, name + '.jpg')
155 | seg_path = os.path.join(self.segfolder, name + '.npy')
156 | kpt_path = os.path.join(self.kptfolder, name + '.npy')
157 |
158 | image = imread(image_path)/255.
159 | kpt = np.load(kpt_path)[:,:2]
160 | mask = self.load_mask(seg_path, image.shape[0], image.shape[1])
161 |
162 | ### crop information
163 | tform = self.crop(image, kpt)
164 | ## crop
165 | cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size))
166 | cropped_mask = warp(mask, tform.inverse, output_shape=(self.image_size, self.image_size))
167 | cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params)
168 |
169 | # normalized kpt
170 | cropped_kpt[:,:2] = cropped_kpt[:,:2]/self.image_size * 2 - 1
171 |
172 | images_list.append(cropped_image.transpose(2,0,1))
173 | kpt_list.append(cropped_kpt)
174 | mask_list.append(cropped_mask)
175 |
176 | ###
177 |         images_array = torch.from_numpy(np.array(images_list)).type(dtype = torch.float32) #K,3,224,224
178 |         kpt_array = torch.from_numpy(np.array(kpt_list)).type(dtype = torch.float32) #K,n_kpt,3
179 |         mask_array = torch.from_numpy(np.array(mask_list)).type(dtype = torch.float32) #K,224,224
180 |
181 | if self.isSingle:
182 | images_array = images_array.squeeze()
183 | kpt_array = kpt_array.squeeze()
184 | mask_array = mask_array.squeeze()
185 |
186 | data_dict = {
187 | 'image': images_array,
188 | 'landmark': kpt_array,
189 | 'mask': mask_array
190 | }
191 |
192 | return data_dict
193 |
194 | def crop(self, image, kpt):
195 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]);
196 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1])
197 |
198 | h, w, _ = image.shape
199 | old_size = (right - left + bottom - top)/2
200 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ])#+ old_size*0.1])
201 | # translate center
202 | trans_scale = (np.random.rand(2)*2 -1) * self.trans_scale
203 | center = center + trans_scale*old_size # 0.5
204 |
205 | scale = np.random.rand() * (self.scale[1] - self.scale[0]) + self.scale[0]
206 | size = int(old_size*scale)
207 |
208 | # crop image
209 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]])
210 | DST_PTS = np.array([[0,0], [0,self.image_size - 1], [self.image_size - 1, 0]])
211 | tform = estimate_transform('similarity', src_pts, DST_PTS)
212 |
213 | # cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size))
214 | # # change kpt accordingly
215 | # cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params)
216 | return tform
217 |
218 | def load_mask(self, maskpath, h, w):
219 | # print(maskpath)
220 | if os.path.isfile(maskpath):
221 | vis_parsing_anno = np.load(maskpath)
222 | # atts = ['skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r',
223 | # 'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']
224 | mask = np.zeros_like(vis_parsing_anno)
225 | # for i in range(1, 16):
226 | mask[vis_parsing_anno>0.5] = 1.
227 | else:
228 | mask = np.ones((h, w))
229 | return mask
--------------------------------------------------------------------------------
/src/external/decalib/datasets/vox.py:
--------------------------------------------------------------------------------
1 | import os, sys
2 | import torch
3 | import torchvision.transforms as transforms
4 | import numpy as np
5 | import cv2
6 | import scipy
7 | from skimage.io import imread, imsave
8 | from skimage.transform import estimate_transform, warp, resize, rescale
9 | from glob import glob
10 | from torch.utils.data import Dataset, DataLoader, ConcatDataset
11 |
12 | class VoxelDataset(Dataset):
13 | def __init__(self, K, image_size, scale, trans_scale = 0, dataname='vox2', n_train=100000, isTemporal=False, isEval=False, isSingle=False):
14 | self.K = K
15 | self.image_size = image_size
16 | if dataname == 'vox1':
17 | self.kpt_suffix = '.txt'
18 | self.imagefolder = '/ps/project/face2d3d/VoxCeleb/vox1/dev/images_cropped'
19 | self.kptfolder = '/ps/scratch/yfeng/Data/VoxCeleb/vox1/landmark_2d'
20 |
21 | self.face_dict = {}
22 | for person_id in sorted(os.listdir(self.kptfolder)):
23 | for video_id in os.listdir(os.path.join(self.kptfolder, person_id)):
24 | for face_id in os.listdir(os.path.join(self.kptfolder, person_id, video_id)):
25 | if 'txt' in face_id:
26 | continue
27 | key = person_id + '/' + video_id + '/' + face_id
28 | # if key not in self.face_dict.keys():
29 | # self.face_dict[key] = []
30 | name_list = os.listdir(os.path.join(self.kptfolder, person_id, video_id, face_id))
31 |                         name_list = [name.split('.')[0] for name in name_list]
32 |                         if len(name_list)
161 |             mask[vis_parsing_anno>0.5] = 1.
162 | else:
163 | mask = np.ones((h, w))
164 | return mask
165 |
--------------------------------------------------------------------------------
/src/external/decalib/deca.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | #
3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is
4 | # holder of all proprietary rights on this computer program.
5 | # Using this computer program means that you agree to the terms
6 | # in the LICENSE file included with this software distribution.
7 | # Any use not explicitly granted by the LICENSE is prohibited.
8 | #
9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung
10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute
11 | # for Intelligent Systems. All rights reserved.
12 | #
13 | # For comments or questions, please email us at deca@tue.mpg.de
14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de
15 |
16 | import os
17 | import torch
18 | import torch.nn as nn
19 | from .models.encoders import ResnetEncoder
20 | from .utils import util
21 | from .utils.rotation_converter import batch_euler2axis
22 | from .utils.config import cfg
23 | torch.backends.cudnn.benchmark = True
24 |
25 | class DECA(nn.Module):
26 | def __init__(self, config=None, device='cuda'):
27 | super(DECA, self).__init__()
28 | if config is None:
29 | self.cfg = cfg
30 | else:
31 | self.cfg = config
32 | self.device = device
33 | self.image_size = self.cfg.dataset.image_size
34 | self.uv_size = self.cfg.model.uv_size
35 |
36 | self._create_model(self.cfg.model)
37 |
38 | def _create_model(self, model_cfg):
39 | # set up parameters
40 | self.n_param = model_cfg.n_shape+model_cfg.n_tex+model_cfg.n_exp+model_cfg.n_pose+model_cfg.n_cam+model_cfg.n_light
41 | self.n_detail = model_cfg.n_detail
42 | self.n_cond = model_cfg.n_exp + 3 # exp + jaw pose
43 | self.num_list = [model_cfg.n_shape, model_cfg.n_tex, model_cfg.n_exp, model_cfg.n_pose, model_cfg.n_cam, model_cfg.n_light]
44 | self.param_dict = {i:model_cfg.get('n_' + i) for i in model_cfg.param_list}
45 |
46 | # encoders
47 | self.E_flame = ResnetEncoder(outsize=self.n_param).to(self.device)
48 | self.E_detail = ResnetEncoder(outsize=self.n_detail).to(self.device)
49 | # resume model
50 | model_path = self.cfg.pretrained_modelpath
51 | if os.path.exists(model_path):
52 | print(f'trained model found. load {model_path}')
53 | checkpoint = torch.load(model_path)
54 | self.checkpoint = checkpoint
55 | util.copy_state_dict(self.E_flame.state_dict(), checkpoint['E_flame'])
56 | util.copy_state_dict(self.E_detail.state_dict(), checkpoint['E_detail'])
57 | else:
58 | print(f'please check model path: {model_path}')
59 | # exit()
60 | # eval mode
61 | self.E_flame.eval()
62 | self.E_detail.eval()
63 |
64 | def decompose_code(self, code, num_dict):
65 | ''' Convert a flattened parameter vector to a dictionary of parameters
66 | code_dict.keys() = ['shape', 'tex', 'exp', 'pose', 'cam', 'light']
67 | '''
68 | code_dict = {}
69 | start = 0
70 | for key in num_dict:
71 | end = start+int(num_dict[key])
72 | code_dict[key] = code[:, start:end]
73 | start = end
74 | if key == 'light':
75 | code_dict[key] = code_dict[key].reshape(code_dict[key].shape[0], 9, 3)
76 | return code_dict
77 |
78 | # @torch.no_grad()
79 | def encode(self, images, use_detail=False):
80 | if use_detail:
81 | # use_detail is for training detail model, need to set coarse model as eval mode
82 | with torch.no_grad():
83 | parameters = self.E_flame(images)
84 | else:
85 | parameters = self.E_flame(images)
86 | codedict = self.decompose_code(parameters, self.param_dict)
87 | codedict['images'] = images
88 | if use_detail:
89 | detailcode = self.E_detail(images)
90 | codedict['detail'] = detailcode
91 | if self.cfg.model.jaw_type == 'euler':
92 | posecode = codedict['pose']
93 |                 euler_jaw_pose = posecode[:,3:].clone() # x for yaw (open mouth), y for pitch (left and right), z for roll
94 | posecode[:,3:] = batch_euler2axis(euler_jaw_pose)
95 | codedict['pose'] = posecode
96 | codedict['euler_jaw_pose'] = euler_jaw_pose
97 | return codedict
98 |
99 | def ensemble_3DMM_params(self, codedict, image_size, original_image_size):
100 | i = 0
101 | cam = codedict['cam']
102 | tform = codedict['tform']
103 | scale, tx, ty, sz = util.calculate_scale_tx_ty(cam, tform, image_size, original_image_size)
104 | crop_scale, crop_tx, crop_ty, crop_sz = util.calculate_crop_scale_tx_ty(cam)
105 | scale = float(scale[i].cpu())
106 | tx = float(tx[i].cpu())
107 | ty = float(ty[i].cpu())
108 | sz = float(sz[i].cpu())
109 |
110 | crop_scale = float(crop_scale[i].cpu())
111 | crop_tx = float(crop_tx[i].cpu())
112 | crop_ty = float(crop_ty[i].cpu())
113 | crop_sz = float(crop_sz[i].cpu())
114 |
115 | shape_params = codedict['shape'][i].cpu().numpy()
116 | expression_params = codedict['exp'][i].cpu().numpy()
117 | pose_params = codedict['pose'][i].cpu().numpy()
118 |
119 | face_model_paras = dict()
120 | face_model_paras['shape'] = shape_params.tolist()
121 | face_model_paras['exp'] = expression_params.tolist()
122 | face_model_paras['pose'] = pose_params.tolist()
123 | face_model_paras['cam'] = cam[i].cpu().numpy().tolist()
124 |
125 | face_model_paras['scale'] = scale
126 | face_model_paras['tx'] = tx
127 | face_model_paras['ty'] = ty
128 | face_model_paras['sz'] = sz
129 |
130 | face_model_paras['crop_scale'] = crop_scale
131 | face_model_paras['crop_tx'] = crop_tx
132 | face_model_paras['crop_ty'] = crop_ty
133 | face_model_paras['crop_sz'] = crop_sz
134 | return face_model_paras
135 |
--------------------------------------------------------------------------------
/src/external/decalib/models/FLAME.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | #
3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is
4 | # holder of all proprietary rights on this computer program.
5 | # Using this computer program means that you agree to the terms
6 | # in the LICENSE file included with this software distribution.
7 | # Any use not explicitly granted by the LICENSE is prohibited.
8 | #
9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung
10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute
11 | # for Intelligent Systems. All rights reserved.
12 | #
13 | # For comments or questions, please email us at deca@tue.mpg.de
14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de
15 |
16 | import torch
17 | import torch.nn as nn
18 | import numpy as np
19 | import pickle
20 | import torch.nn.functional as F
21 | import json
22 |
23 | from .lbs import lbs, batch_rodrigues, vertices2landmarks, rot_mat_to_euler
24 |
25 | def to_tensor(array, dtype=torch.float32):
26 | if 'torch.tensor' not in str(type(array)):
27 | return torch.tensor(array, dtype=dtype)
28 | def to_np(array, dtype=np.float32):
29 | if 'scipy.sparse' in str(type(array)):
30 | array = array.todense()
31 | return np.array(array, dtype=dtype)
32 |
33 | class Struct(object):
34 | def __init__(self, **kwargs):
35 | for key, val in kwargs.items():
36 | setattr(self, key, val)
37 |
38 | class FLAME(nn.Module):
39 | """
40 | borrowed from https://github.com/soubhiksanyal/FLAME_PyTorch/blob/master/FLAME.py
41 | Given flame parameters this class generates a differentiable FLAME function
42 |     which outputs a mesh and 2D/3D facial landmarks
43 | """
44 | def __init__(self, config):
45 | super(FLAME, self).__init__()
46 | print("creating the FLAME Decoder")
47 | with open(config.flame_model_path, 'rb') as f:
48 | ss = pickle.load(f, encoding='latin1')
49 | flame_model = Struct(**ss)
50 |
51 | self.dtype = torch.float32
52 | self.register_buffer('faces_tensor', to_tensor(to_np(flame_model.f, dtype=np.int64), dtype=torch.long))
53 | # The vertices of the template model
54 | self.register_buffer('v_template', to_tensor(to_np(flame_model.v_template), dtype=self.dtype))
55 | # The shape components and expression
56 | shapedirs = to_tensor(to_np(flame_model.shapedirs), dtype=self.dtype)
57 | shapedirs = torch.cat([shapedirs[:,:,:config.n_shape], shapedirs[:,:,300:300+config.n_exp]], 2)
58 | self.register_buffer('shapedirs', shapedirs)
59 | # The pose components
60 | num_pose_basis = flame_model.posedirs.shape[-1]
61 | posedirs = np.reshape(flame_model.posedirs, [-1, num_pose_basis]).T
62 | self.register_buffer('posedirs', to_tensor(to_np(posedirs), dtype=self.dtype))
63 | #
64 | self.register_buffer('J_regressor', to_tensor(to_np(flame_model.J_regressor), dtype=self.dtype))
65 | parents = to_tensor(to_np(flame_model.kintree_table[0])).long(); parents[0] = -1
66 | self.register_buffer('parents', parents)
67 | self.register_buffer('lbs_weights', to_tensor(to_np(flame_model.weights), dtype=self.dtype))
68 |
69 | # Fixing Eyeball and neck rotation
70 | default_eyball_pose = torch.zeros([1, 6], dtype=self.dtype, requires_grad=False)
71 | self.register_parameter('eye_pose', nn.Parameter(default_eyball_pose,
72 | requires_grad=False))
73 | default_neck_pose = torch.zeros([1, 3], dtype=self.dtype, requires_grad=False)
74 | self.register_parameter('neck_pose', nn.Parameter(default_neck_pose,
75 | requires_grad=False))
76 |
77 | with open(config.flame_lmk_embedding_path, 'r') as f:
78 | lmk_embeddings = json.load(f)
79 |
80 | self.lmk_faces_idx = torch.tensor(lmk_embeddings['lmk_faces_idx']).long().unsqueeze(0)
81 | self.lmk_bary_coords = torch.tensor(lmk_embeddings['lmk_bary_coords']).to(self.dtype).unsqueeze(0)
82 |
83 | def forward(self, shape_params=None, expression_params=None, pose_params=None, eye_pose_params=None):
84 | """
85 | Input:
86 | shape_params: N X number of shape parameters
87 | expression_params: N X number of expression parameters
88 | pose_params: N X number of pose parameters (6)
89 |             return:
90 | vertices: N X V X 3
91 | landmarks: N X number of landmarks X 3
92 | """
93 | batch_size = shape_params.shape[0]
94 | if pose_params is None:
95 | pose_params = self.eye_pose.expand(batch_size, -1)
96 | if eye_pose_params is None:
97 | eye_pose_params = self.eye_pose.expand(batch_size, -1)
98 | betas = torch.cat([shape_params, expression_params], dim=1)
99 | full_pose = torch.cat([pose_params[:, :3], self.neck_pose.expand(batch_size, -1), pose_params[:, 3:], eye_pose_params], dim=1)
100 | template_vertices = self.v_template.unsqueeze(0).expand(batch_size, -1, -1)
101 |
102 | vertices, _ = lbs(betas, full_pose, template_vertices,
103 | self.shapedirs, self.posedirs,
104 | self.J_regressor, self.parents,
105 | self.lbs_weights, dtype=self.dtype)
106 | bz = vertices.shape[0]
107 | landmarks3d = vertices2landmarks(vertices, self.faces_tensor,
108 | self.lmk_faces_idx.repeat(bz, 1),
109 | self.lmk_bary_coords.repeat(bz, 1, 1))
110 | return vertices, landmarks3d
111 |
112 | class FLAMETex(nn.Module):
113 | """
114 | FLAME texture:
115 | https://github.com/TimoBolkart/TF_FLAME/blob/ade0ab152300ec5f0e8555d6765411555c5ed43d/sample_texture.py#L64
116 | FLAME texture converted from BFM:
117 | https://github.com/TimoBolkart/BFM_to_FLAME
118 | """
119 | def __init__(self, config):
120 | super(FLAMETex, self).__init__()
121 | if config.tex_type == 'BFM':
122 | mu_key = 'MU'
123 | pc_key = 'PC'
124 | n_pc = 199
125 | tex_path = config.tex_path
126 | tex_space = np.load(tex_path)
127 | texture_mean = tex_space[mu_key].reshape(1, -1)
128 | texture_basis = tex_space[pc_key].reshape(-1, n_pc)
129 |
130 | elif config.tex_type == 'FLAME':
131 | mu_key = 'mean'
132 | pc_key = 'tex_dir'
133 | n_pc = 200
134 | tex_path = config.flame_tex_path
135 | tex_space = np.load(tex_path)
136 | texture_mean = tex_space[mu_key].reshape(1, -1)/255.
137 | texture_basis = tex_space[pc_key].reshape(-1, n_pc)/255.
138 | else:
139 |             print('texture type ', config.tex_type, 'does not exist!')
140 | raise NotImplementedError
141 |
142 | n_tex = config.n_tex
143 | num_components = texture_basis.shape[1]
144 | texture_mean = torch.from_numpy(texture_mean).float()[None,...]
145 | texture_basis = torch.from_numpy(texture_basis[:,:n_tex]).float()[None,...]
146 | self.register_buffer('texture_mean', texture_mean)
147 | self.register_buffer('texture_basis', texture_basis)
148 |
149 | def forward(self, texcode):
150 | '''
151 | texcode: [batchsize, n_tex]
152 | texture: [bz, 3, 256, 256], range: 0-1
153 | '''
154 | texture = self.texture_mean + (self.texture_basis*texcode[:,None,:]).sum(-1)
155 | texture = texture.reshape(texcode.shape[0], 512, 512, 3).permute(0,3,1,2)
156 | texture = F.interpolate(texture, [256, 256])
157 | texture = texture[:,[2,1,0], :,:]
158 | return texture
--------------------------------------------------------------------------------
/src/external/decalib/models/decoders.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | #
3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is
4 | # holder of all proprietary rights on this computer program.
5 | # Using this computer program means that you agree to the terms
6 | # in the LICENSE file included with this software distribution.
7 | # Any use not explicitly granted by the LICENSE is prohibited.
8 | #
9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung
10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute
11 | # for Intelligent Systems. All rights reserved.
12 | #
13 | # For comments or questions, please email us at deca@tue.mpg.de
14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de
15 |
16 | import torch
17 | import torch.nn as nn
18 |
19 | class Generator(nn.Module):
20 | def __init__(self, latent_dim=100, out_channels=1, out_scale=0.01, sample_mode = 'bilinear'):
21 | super(Generator, self).__init__()
22 | self.out_scale = out_scale
23 |
24 | self.init_size = 32 // 4 # Initial size before upsampling
25 | self.l1 = nn.Sequential(nn.Linear(latent_dim, 128 * self.init_size ** 2))
26 | self.conv_blocks = nn.Sequential(
27 | nn.BatchNorm2d(128),
28 | nn.Upsample(scale_factor=2, mode=sample_mode), #16
29 | nn.Conv2d(128, 128, 3, stride=1, padding=1),
30 | nn.BatchNorm2d(128, 0.8),
31 | nn.LeakyReLU(0.2, inplace=True),
32 | nn.Upsample(scale_factor=2, mode=sample_mode), #32
33 | nn.Conv2d(128, 64, 3, stride=1, padding=1),
34 | nn.BatchNorm2d(64, 0.8),
35 | nn.LeakyReLU(0.2, inplace=True),
36 | nn.Upsample(scale_factor=2, mode=sample_mode), #64
37 | nn.Conv2d(64, 64, 3, stride=1, padding=1),
38 | nn.BatchNorm2d(64, 0.8),
39 | nn.LeakyReLU(0.2, inplace=True),
40 | nn.Upsample(scale_factor=2, mode=sample_mode), #128
41 | nn.Conv2d(64, 32, 3, stride=1, padding=1),
42 | nn.BatchNorm2d(32, 0.8),
43 | nn.LeakyReLU(0.2, inplace=True),
44 | nn.Upsample(scale_factor=2, mode=sample_mode), #256
45 | nn.Conv2d(32, 16, 3, stride=1, padding=1),
46 | nn.BatchNorm2d(16, 0.8),
47 | nn.LeakyReLU(0.2, inplace=True),
48 | nn.Conv2d(16, out_channels, 3, stride=1, padding=1),
49 | nn.Tanh(),
50 | )
51 |
52 | def forward(self, noise):
53 | out = self.l1(noise)
54 | out = out.view(out.shape[0], 128, self.init_size, self.init_size)
55 | img = self.conv_blocks(out)
56 | return img*self.out_scale
--------------------------------------------------------------------------------
/src/external/decalib/models/encoders.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | #
3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is
4 | # holder of all proprietary rights on this computer program.
5 | # Using this computer program means that you agree to the terms
6 | # in the LICENSE file included with this software distribution.
7 | # Any use not explicitly granted by the LICENSE is prohibited.
8 | #
9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung
10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute
11 | # for Intelligent Systems. All rights reserved.
12 | #
13 | # For comments or questions, please email us at deca@tue.mpg.de
14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de
15 |
16 | import numpy as np
17 | import torch.nn as nn
18 | import torch
19 | import torch.nn.functional as F
20 | from . import resnet
21 |
22 | class ResnetEncoder(nn.Module):
23 | def __init__(self, outsize, last_op=None):
24 | super(ResnetEncoder, self).__init__()
25 | feature_size = 2048
26 | self.encoder = resnet.load_ResNet50Model() #out: 2048
27 | ### regressor
28 | self.layers = nn.Sequential(
29 | nn.Linear(feature_size, 1024),
30 | nn.ReLU(),
31 | nn.Linear(1024, outsize)
32 | )
33 | self.last_op = last_op
34 |
35 | def forward(self, inputs):
36 | features = self.encoder(inputs)
37 | parameters = self.layers(features)
38 | if self.last_op:
39 | parameters = self.last_op(parameters)
40 | return parameters
41 |
--------------------------------------------------------------------------------
/src/external/decalib/models/frnet.py:
--------------------------------------------------------------------------------
1 | import torch.nn as nn
2 | import numpy as np
3 | import torch
4 | # from pro_gan_pytorch.PRO_GAN import ProGAN, Generator, Discriminator
5 | import torch.nn.functional as F
6 | import cv2
7 | from torch.autograd import Variable
8 | import math
9 |
10 | def conv3x3(in_planes, out_planes, stride=1):
11 | """3x3 convolution with padding"""
12 | return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
13 | padding=1, bias=False)
14 |
15 | class BasicBlock(nn.Module):
16 | expansion = 1
17 |
18 | def __init__(self, inplanes, planes, stride=1, downsample=None):
19 | super(BasicBlock, self).__init__()
20 | self.conv1 = conv3x3(inplanes, planes, stride)
21 | self.bn1 = nn.BatchNorm2d(planes)
22 | self.relu = nn.ReLU(inplace=True)
23 | self.conv2 = conv3x3(planes, planes)
24 | self.bn2 = nn.BatchNorm2d(planes)
25 | self.downsample = downsample
26 | self.stride = stride
27 |
28 | def forward(self, x):
29 | residual = x
30 |
31 | out = self.conv1(x)
32 | out = self.bn1(out)
33 | out = self.relu(out)
34 |
35 | out = self.conv2(out)
36 | out = self.bn2(out)
37 |
38 | if self.downsample is not None:
39 | residual = self.downsample(x)
40 |
41 | out += residual
42 | out = self.relu(out)
43 |
44 | return out
45 |
46 |
47 | class Bottleneck(nn.Module):
48 | expansion = 4
49 |
50 | def __init__(self, inplanes, planes, stride=1, downsample=None):
51 | super(Bottleneck, self).__init__()
52 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, stride=stride, bias=False)
53 | self.bn1 = nn.BatchNorm2d(planes)
54 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)
55 | self.bn2 = nn.BatchNorm2d(planes)
56 | self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
57 | self.bn3 = nn.BatchNorm2d(planes * 4)
58 | self.relu = nn.ReLU(inplace=True)
59 | self.downsample = downsample
60 | self.stride = stride
61 |
62 | def forward(self, x):
63 | residual = x
64 |
65 | out = self.conv1(x)
66 | out = self.bn1(out)
67 | out = self.relu(out)
68 |
69 | out = self.conv2(out)
70 | out = self.bn2(out)
71 | out = self.relu(out)
72 |
73 | out = self.conv3(out)
74 | out = self.bn3(out)
75 |
76 | if self.downsample is not None:
77 | residual = self.downsample(x)
78 |
79 | out += residual
80 | out = self.relu(out)
81 |
82 | return out
83 |
84 |
85 | class ResNet(nn.Module):
86 |
87 | def __init__(self, block, layers, num_classes=1000, include_top=True):
88 | self.inplanes = 64
89 | super(ResNet, self).__init__()
90 | self.include_top = include_top
91 |
92 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
93 | self.bn1 = nn.BatchNorm2d(64)
94 | self.relu = nn.ReLU(inplace=True)
95 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=0, ceil_mode=True)
96 |
97 | self.layer1 = self._make_layer(block, 64, layers[0])
98 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
99 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
100 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
101 | self.avgpool = nn.AvgPool2d(7, stride=1)
102 | self.fc = nn.Linear(512 * block.expansion, num_classes)
103 |
104 | for m in self.modules():
105 | if isinstance(m, nn.Conv2d):
106 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
107 | m.weight.data.normal_(0, math.sqrt(2. / n))
108 | elif isinstance(m, nn.BatchNorm2d):
109 | m.weight.data.fill_(1)
110 | m.bias.data.zero_()
111 |
112 | def _make_layer(self, block, planes, blocks, stride=1):
113 | downsample = None
114 | if stride != 1 or self.inplanes != planes * block.expansion:
115 | downsample = nn.Sequential(
116 | nn.Conv2d(self.inplanes, planes * block.expansion,
117 | kernel_size=1, stride=stride, bias=False),
118 | nn.BatchNorm2d(planes * block.expansion),
119 | )
120 |
121 | layers = []
122 | layers.append(block(self.inplanes, planes, stride, downsample))
123 | self.inplanes = planes * block.expansion
124 | for i in range(1, blocks):
125 | layers.append(block(self.inplanes, planes))
126 |
127 | return nn.Sequential(*layers)
128 |
129 | def forward(self, x):
130 | x = self.conv1(x)
131 | x = self.bn1(x)
132 | x = self.relu(x)
133 | x = self.maxpool(x)
134 |
135 | x = self.layer1(x)
136 | x = self.layer2(x)
137 | x = self.layer3(x)
138 | x = self.layer4(x)
139 |
140 | x = self.avgpool(x)
141 |
142 | if not self.include_top:
143 | return x
144 |
145 | x = x.view(x.size(0), -1)
146 | x = self.fc(x)
147 | return x
148 |
149 | def resnet50(**kwargs):
150 | """Constructs a ResNet-50 model.
151 | """
152 | model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs)
153 | return model
154 |
155 | import pickle
156 | def load_state_dict(model, fname):
157 | """
158 |     Set parameters converted from the Caffe models provided by the VGGFace2 authors.
159 | See https://www.robots.ox.ac.uk/~vgg/data/vgg_face2/.
160 | Arguments:
161 | model: model
162 | fname: file name of parameters converted from a Caffe model, assuming the file format is Pickle.
163 | """
164 | with open(fname, 'rb') as f:
165 | weights = pickle.load(f, encoding='latin1')
166 |
167 | own_state = model.state_dict()
168 | for name, param in weights.items():
169 | if name in own_state:
170 | try:
171 | own_state[name].copy_(torch.from_numpy(param))
172 | except Exception:
173 | raise RuntimeError('While copying the parameter named {}, whose dimensions in the model are {} and whose '\
174 | 'dimensions in the checkpoint are {}.'.format(name, own_state[name].size(), param.size()))
175 | else:
176 | raise KeyError('unexpected key "{}" in state_dict'.format(name))
177 |
178 |
--------------------------------------------------------------------------------
/src/external/decalib/models/resnet.py:
--------------------------------------------------------------------------------
1 | """
2 | Author: Soubhik Sanyal
3 | Copyright (c) 2019, Soubhik Sanyal
4 | All rights reserved.
5 | Loads different resnet models
6 | """
7 | '''
8 | file: Resnet.py
9 | date: 2018_05_02
10 | author: zhangxiong(1025679612@qq.com)
11 | mark: copied from pytorch source code
12 | '''
13 |
14 | import torch.nn as nn
15 | import torch.nn.functional as F
16 | import torch
17 | from torch.nn.parameter import Parameter
18 | import torch.optim as optim
19 | import numpy as np
20 | import math
21 | import torchvision
22 |
23 | class ResNet(nn.Module):
24 | def __init__(self, block, layers, num_classes=1000):
25 | self.inplanes = 64
26 | super(ResNet, self).__init__()
27 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
28 | bias=False)
29 | self.bn1 = nn.BatchNorm2d(64)
30 | self.relu = nn.ReLU(inplace=True)
31 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
32 | self.layer1 = self._make_layer(block, 64, layers[0])
33 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
34 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
35 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
36 | self.avgpool = nn.AvgPool2d(7, stride=1)
37 | # self.fc = nn.Linear(512 * block.expansion, num_classes)
38 |
39 | for m in self.modules():
40 | if isinstance(m, nn.Conv2d):
41 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
42 | m.weight.data.normal_(0, math.sqrt(2. / n))
43 | elif isinstance(m, nn.BatchNorm2d):
44 | m.weight.data.fill_(1)
45 | m.bias.data.zero_()
46 |
47 | def _make_layer(self, block, planes, blocks, stride=1):
48 | downsample = None
49 | if stride != 1 or self.inplanes != planes * block.expansion:
50 | downsample = nn.Sequential(
51 | nn.Conv2d(self.inplanes, planes * block.expansion,
52 | kernel_size=1, stride=stride, bias=False),
53 | nn.BatchNorm2d(planes * block.expansion),
54 | )
55 |
56 | layers = []
57 | layers.append(block(self.inplanes, planes, stride, downsample))
58 | self.inplanes = planes * block.expansion
59 | for i in range(1, blocks):
60 | layers.append(block(self.inplanes, planes))
61 |
62 | return nn.Sequential(*layers)
63 |
64 | def forward(self, x):
65 | x = self.conv1(x)
66 | x = self.bn1(x)
67 | x = self.relu(x)
68 | x = self.maxpool(x)
69 |
70 | x = self.layer1(x)
71 | x = self.layer2(x)
72 | x = self.layer3(x)
73 | x1 = self.layer4(x)
74 |
75 | x2 = self.avgpool(x1)
76 | x2 = x2.view(x2.size(0), -1)
77 | # x = self.fc(x)
78 | ## x2: [bz, 2048] for shape
79 | ## x1: [bz, 2048, 7, 7] for texture
80 | return x2
81 |
82 | class Bottleneck(nn.Module):
83 | expansion = 4
84 |
85 | def __init__(self, inplanes, planes, stride=1, downsample=None):
86 | super(Bottleneck, self).__init__()
87 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
88 | self.bn1 = nn.BatchNorm2d(planes)
89 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
90 | padding=1, bias=False)
91 | self.bn2 = nn.BatchNorm2d(planes)
92 | self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
93 | self.bn3 = nn.BatchNorm2d(planes * 4)
94 | self.relu = nn.ReLU(inplace=True)
95 | self.downsample = downsample
96 | self.stride = stride
97 |
98 | def forward(self, x):
99 | residual = x
100 |
101 | out = self.conv1(x)
102 | out = self.bn1(out)
103 | out = self.relu(out)
104 |
105 | out = self.conv2(out)
106 | out = self.bn2(out)
107 | out = self.relu(out)
108 |
109 | out = self.conv3(out)
110 | out = self.bn3(out)
111 |
112 | if self.downsample is not None:
113 | residual = self.downsample(x)
114 |
115 | out += residual
116 | out = self.relu(out)
117 |
118 | return out
119 |
120 | def conv3x3(in_planes, out_planes, stride=1):
121 | """3x3 convolution with padding"""
122 | return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
123 | padding=1, bias=False)
124 |
125 | class BasicBlock(nn.Module):
126 | expansion = 1
127 |
128 | def __init__(self, inplanes, planes, stride=1, downsample=None):
129 | super(BasicBlock, self).__init__()
130 | self.conv1 = conv3x3(inplanes, planes, stride)
131 | self.bn1 = nn.BatchNorm2d(planes)
132 | self.relu = nn.ReLU(inplace=True)
133 | self.conv2 = conv3x3(planes, planes)
134 | self.bn2 = nn.BatchNorm2d(planes)
135 | self.downsample = downsample
136 | self.stride = stride
137 |
138 | def forward(self, x):
139 | residual = x
140 |
141 | out = self.conv1(x)
142 | out = self.bn1(out)
143 | out = self.relu(out)
144 |
145 | out = self.conv2(out)
146 | out = self.bn2(out)
147 |
148 | if self.downsample is not None:
149 | residual = self.downsample(x)
150 |
151 | out += residual
152 | out = self.relu(out)
153 |
154 | return out
155 |
156 | def copy_parameter_from_resnet(model, resnet_dict):
157 | cur_state_dict = model.state_dict()
158 | # import ipdb; ipdb.set_trace()
159 | for name, param in list(resnet_dict.items())[0:None]:
160 | if name not in cur_state_dict:
161 | # print(name, ' not available in reconstructed resnet')
162 | continue
163 | if isinstance(param, Parameter):
164 | param = param.data
165 | try:
166 | cur_state_dict[name].copy_(param)
167 | except:
168 | # print(name, ' is inconsistent!')
169 | continue
170 | # print('copy resnet state dict finished!')
171 | # import ipdb; ipdb.set_trace()
172 |
173 | def load_ResNet50Model():
174 | model = ResNet(Bottleneck, [3, 4, 6, 3])
175 | copy_parameter_from_resnet(model, torchvision.models.resnet50(pretrained = False).state_dict())
176 | return model
177 |
178 | def load_ResNet101Model():
179 | model = ResNet(Bottleneck, [3, 4, 23, 3])
180 | copy_parameter_from_resnet(model, torchvision.models.resnet101(pretrained = True).state_dict())
181 | return model
182 |
183 | def load_ResNet152Model():
184 | model = ResNet(Bottleneck, [3, 8, 36, 3])
185 | copy_parameter_from_resnet(model, torchvision.models.resnet152(pretrained = True).state_dict())
186 | return model
187 |
188 | # model.load_state_dict(checkpoint['model_state_dict'])
189 |
190 |
191 | ######## Unet
192 |
193 | class DoubleConv(nn.Module):
194 | """(convolution => [BN] => ReLU) * 2"""
195 |
196 | def __init__(self, in_channels, out_channels):
197 | super().__init__()
198 | self.double_conv = nn.Sequential(
199 | nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
200 | nn.BatchNorm2d(out_channels),
201 | nn.ReLU(inplace=True),
202 | nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
203 | nn.BatchNorm2d(out_channels),
204 | nn.ReLU(inplace=True)
205 | )
206 |
207 | def forward(self, x):
208 | return self.double_conv(x)
209 |
210 |
211 | class Down(nn.Module):
212 | """Downscaling with maxpool then double conv"""
213 |
214 | def __init__(self, in_channels, out_channels):
215 | super().__init__()
216 | self.maxpool_conv = nn.Sequential(
217 | nn.MaxPool2d(2),
218 | DoubleConv(in_channels, out_channels)
219 | )
220 |
221 | def forward(self, x):
222 | return self.maxpool_conv(x)
223 |
224 |
225 | class Up(nn.Module):
226 | """Upscaling then double conv"""
227 |
228 | def __init__(self, in_channels, out_channels, bilinear=True):
229 | super().__init__()
230 |
231 | # if bilinear, use the normal convolutions to reduce the number of channels
232 | if bilinear:
233 | self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
234 | else:
235 | self.up = nn.ConvTranspose2d(in_channels // 2, in_channels // 2, kernel_size=2, stride=2)
236 |
237 | self.conv = DoubleConv(in_channels, out_channels)
238 |
239 | def forward(self, x1, x2):
240 | x1 = self.up(x1)
241 | # input is CHW
242 | diffY = x2.size()[2] - x1.size()[2]
243 | diffX = x2.size()[3] - x1.size()[3]
244 |
245 | x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2,
246 | diffY // 2, diffY - diffY // 2])
247 | # if you have padding issues, see
248 | # https://github.com/HaiyongJiang/U-Net-Pytorch-Unstructured-Buggy/commit/0e854509c2cea854e247a9c615f175f76fbb2e3a
249 | # https://github.com/xiaopeng-liao/Pytorch-UNet/commit/8ebac70e633bac59fc22bb5195e513d5832fb3bd
250 | x = torch.cat([x2, x1], dim=1)
251 | return self.conv(x)
252 |
253 |
254 | class OutConv(nn.Module):
255 | def __init__(self, in_channels, out_channels):
256 | super(OutConv, self).__init__()
257 | self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)
258 |
259 | def forward(self, x):
260 | return self.conv(x)
--------------------------------------------------------------------------------
/src/external/decalib/utils/config.py:
--------------------------------------------------------------------------------
1 | '''
2 | Default config for DECA
3 | '''
4 | from yacs.config import CfgNode as CN
5 | import argparse
6 | import yaml
7 | import os
8 |
9 | cfg = CN()
10 |
11 | abs_deca_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..'))
12 | cfg.deca_dir = abs_deca_dir
13 | cfg.device = 'cuda'
14 | cfg.device_id = '0'
15 |
16 | cfg.pretrained_modelpath = os.path.join(cfg.deca_dir, 'data', 'deca_model.tar')
17 | cfg.output_dir = ''
18 | cfg.rasterizer_type = 'pytorch3d'
19 | # ---------------------------------------------------------------------------- #
20 | # Options for Face model
21 | # ---------------------------------------------------------------------------- #
22 | cfg.model = CN()
23 | cfg.model.topology_path = os.path.join(cfg.deca_dir, 'data', 'head_template.obj')
24 | # texture data original from http://files.is.tue.mpg.de/tbolkart/FLAME/FLAME_texture_data.zip
25 | cfg.model.dense_template_path = os.path.join(cfg.deca_dir, 'data', 'texture_data_256.npy')
26 | cfg.model.fixed_displacement_path = os.path.join(cfg.deca_dir, 'data', 'fixed_displacement_256.npy')
27 | cfg.model.flame_model_path = os.path.join(cfg.deca_dir, 'data', 'generic_model.pkl')
28 | cfg.model.flame_lmk_embedding_path = os.path.join(cfg.deca_dir, 'data', 'landmark_embedding.json')
29 | cfg.model.face_mask_path = os.path.join(cfg.deca_dir, 'data', 'uv_face_mask.png')
30 | cfg.model.face_eye_mask_path = os.path.join(cfg.deca_dir, 'data', 'uv_face_eye_mask.png')
31 | cfg.model.mean_tex_path = os.path.join(cfg.deca_dir, 'data', 'mean_texture.jpg')
32 | cfg.model.tex_path = os.path.join(cfg.deca_dir, 'data', 'FLAME_albedo_from_BFM.npz')
33 | cfg.model.tex_type = 'BFM' # BFM, FLAME, albedoMM
34 | cfg.model.uv_size = 256
35 | cfg.model.param_list = ['shape', 'tex', 'exp', 'pose', 'cam', 'light']
36 | cfg.model.n_shape = 100
37 | cfg.model.n_tex = 50
38 | cfg.model.n_exp = 50
39 | cfg.model.n_cam = 3
40 | cfg.model.n_pose = 6
41 | cfg.model.n_light = 27
42 | cfg.model.use_tex = True
43 | cfg.model.jaw_type = 'aa' # default use axis angle, another option: euler. Note that: aa is not stable in the beginning
44 | # face recognition model
45 | cfg.model.fr_model_path = os.path.join(cfg.deca_dir, 'data', 'resnet50_ft_weight.pkl')
46 |
47 | ## details
48 | cfg.model.n_detail = 128
49 | cfg.model.max_z = 0.01
50 |
51 | # ---------------------------------------------------------------------------- #
52 | # Options for Dataset
53 | # ---------------------------------------------------------------------------- #
54 | cfg.dataset = CN()
55 | cfg.dataset.training_data = ['vggface2', 'ethnicity']
56 | # cfg.dataset.training_data = ['ethnicity']
57 | cfg.dataset.eval_data = ['aflw2000']
58 | cfg.dataset.test_data = ['']
59 | cfg.dataset.batch_size = 2
60 | cfg.dataset.K = 4
61 | cfg.dataset.isSingle = False
62 | # cfg.dataset.num_workers = 2
63 | cfg.dataset.num_workers = 0
64 | cfg.dataset.image_size = 224
65 | cfg.dataset.scale_min = 1.4
66 | cfg.dataset.scale_max = 1.8
67 | cfg.dataset.trans_scale = 0.
68 |
69 | # ---------------------------------------------------------------------------- #
70 | # Options for training
71 | # ---------------------------------------------------------------------------- #
72 | cfg.train = CN()
73 | cfg.train.train_detail = False
74 | cfg.train.max_epochs = 500
75 | cfg.train.max_steps = 1000000
76 | cfg.train.lr = 1e-4
77 | cfg.train.log_dir = 'logs'
78 | cfg.train.log_steps = 10
79 | cfg.train.vis_dir = 'train_images'
80 | cfg.train.vis_steps = 200
81 | cfg.train.write_summary = True
82 | cfg.train.checkpoint_steps = 500
83 | cfg.train.val_steps = 500
84 | cfg.train.val_vis_dir = 'val_images'
85 | cfg.train.eval_steps = 5000
86 | cfg.train.resume = True
87 |
88 | # ---------------------------------------------------------------------------- #
89 | # Options for Losses
90 | # ---------------------------------------------------------------------------- #
91 | cfg.loss = CN()
92 | cfg.loss.lmk = 1.0
93 | cfg.loss.useWlmk = True
94 | cfg.loss.eyed = 1.0
95 | cfg.loss.lipd = 0.5
96 | cfg.loss.photo = 2.0
97 | cfg.loss.useSeg = True
98 | cfg.loss.id = 0.2
99 | cfg.loss.id_shape_only = True
100 | cfg.loss.reg_shape = 1e-04
101 | cfg.loss.reg_exp = 1e-04
102 | cfg.loss.reg_tex = 1e-04
103 | cfg.loss.reg_light = 1.
104 | cfg.loss.reg_jaw_pose = 0. #1.
105 | cfg.loss.use_gender_prior = False
106 | cfg.loss.shape_consistency = True
107 | # loss for detail
108 | cfg.loss.detail_consistency = True
109 | cfg.loss.useConstraint = True
110 | cfg.loss.mrf = 5e-2
111 | cfg.loss.photo_D = 2.
112 | cfg.loss.reg_sym = 0.005
113 | cfg.loss.reg_z = 0.005
114 | cfg.loss.reg_diff = 0.005
115 |
116 |
117 | def get_cfg_defaults():
118 | """Get a yacs CfgNode object with default values for my_project."""
119 | # Return a clone so that the defaults will not be altered
120 | # This is for the "local variable" use pattern
121 | return cfg.clone()
122 |
123 | def update_cfg(cfg, cfg_file):
124 | cfg.merge_from_file(cfg_file)
125 | return cfg.clone()
126 |
127 | def parse_args():
128 | parser = argparse.ArgumentParser()
129 | parser.add_argument('--cfg', type=str, help='cfg file path')
130 | parser.add_argument('--mode', type=str, default = 'train', help='deca mode')
131 |
132 | args = parser.parse_args()
133 | print(args, end='\n\n')
134 |
135 | cfg = get_cfg_defaults()
136 | cfg.cfg_file = None
137 | cfg.mode = args.mode
138 | # import ipdb; ipdb.set_trace()
139 | if args.cfg is not None:
140 | cfg_file = args.cfg
141 | cfg = update_cfg(cfg, args.cfg)
142 | cfg.cfg_file = cfg_file
143 |
144 | return cfg
145 |
--------------------------------------------------------------------------------
/src/fitting.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import torch
3 | import json
4 | import os
5 | import copy
6 | import numpy as np
7 | from external.decalib.utils.config import cfg as deca_cfg
8 | from external.decalib.deca import DECA
9 | from external.decalib.datasets import datasets
10 | from external.decalib.models.FLAME import FLAME
11 | from util.util import (
12 | save_coeffs,
13 | save_landmarks
14 | )
15 |
16 |
17 | def parse_args():
18 | """Configurations."""
19 | parser = argparse.ArgumentParser(description='test process of Face2FaceRHO')
20 | parser.add_argument('--device', default='cuda', type=str, help='set device, cpu for using cpu')
21 | parser.add_argument('--src_img', type=str, required=True, help='input source image (.jpg, .jpeg, .png)')
22 | parser.add_argument('--drv_img', type=str, required=True, help='input driving image (.jpg, .jpeg, .png)')
23 |
24 | parser.add_argument('--output_src_headpose', type=str,
25 | default=os.path.join(os.path.dirname(__file__), '..', 'test_case', 'source', 'FLAME', 'headpose.txt'),
26 | help='output head pose coefficients of source image (.txt)')
27 | parser.add_argument('--output_src_landmark', type=str,
28 | default=os.path.join(os.path.dirname(__file__), '..', 'test_case', 'source', 'FLAME', 'landmark.txt'),
29 | help='output facial landmarks of source image (.txt)')
30 |
31 | parser.add_argument('--output_drv_headpose', type=str,
32 | default=os.path.join(os.path.dirname(__file__), '..', 'test_case', 'driving', 'FLAME', 'headpose.txt'),
33 | help='output head pose coefficients of driving image (.txt)')
34 | parser.add_argument('--output_drv_landmark', type=str,
35 | default=os.path.join(os.path.dirname(__file__), '..', 'test_case', 'driving', 'FLAME', 'landmark.txt'),
36 | help='output driving facial landmarks (.txt, reconstructed by using shape coefficients '
37 | 'of the source actor and expression and head pose coefficients of the driving actor)')
38 |
39 | return _check_args(parser.parse_args())
40 |
41 |
42 | def _check_args(args):
43 | if args is None:
44 | raise RuntimeError('Invalid arguments!')
45 | return args
46 |
47 |
48 | class FLAMEFitting:
49 | def __init__(self):
50 | self.deca = DECA(config=deca_cfg, device=args.device)
51 |
52 | def fitting(self, img_name):
53 | testdata = datasets.TestData(img_name, iscrop=True, face_detector='fan', sample_step=10)
54 | input_data = testdata[0]
55 | images = input_data['image'].to(args.device)[None, ...]
56 | with torch.no_grad():
57 | codedict = self.deca.encode(images)
58 | codedict['tform'] = input_data['tform'][None, ...]
59 | original_image = input_data['original_image'][None, ...]
60 | _, _, h, w = original_image.shape
61 | params = self.deca.ensemble_3DMM_params(codedict, image_size=deca_cfg.dataset.image_size, original_image_size=h)
62 | return params
63 |
64 |
65 | class PoseLandmarkExtractor:
66 | def __init__(self):
67 | self.flame = FLAME(deca_cfg.model)
68 |
69 | with open(os.path.join(deca_cfg.deca_dir, 'data', 'pose_transform_config.json'), 'r') as f:
70 | pose_transform = json.load(f)
71 |
72 | self.scale_transform = pose_transform['scale_transform']
73 | self.tx_transform = pose_transform['tx_transform']
74 | self.ty_transform = pose_transform['ty_transform']
75 | self.tx_scale = 0.256 # 512 / 2000
76 | self.ty_scale = - self.tx_scale
77 |
78 | @staticmethod
79 | def transform_points(points, scale, tx, ty):
80 | trans_matrix = torch.zeros((1, 4, 4), dtype=torch.float32)
81 | trans_matrix[:, 0, 0] = scale
82 | trans_matrix[:, 1, 1] = -scale
83 | trans_matrix[:, 2, 2] = 1
84 | trans_matrix[:, 0, 3] = tx
85 | trans_matrix[:, 1, 3] = ty
86 | trans_matrix[:, 3, 3] = 1
87 |
88 | batch_size, n_points, _ = points.shape
89 | points_homo = torch.cat([points, torch.ones([batch_size, n_points, 1], dtype=points.dtype)], dim=2)
90 | points_homo = points_homo.transpose(1, 2)
91 | trans_points = torch.bmm(trans_matrix, points_homo).transpose(1, 2)
92 | trans_points = trans_points[:, :, 0:3]
93 | return trans_points
94 |
95 | def get_project_points(self, shape_params, expression_params, pose, scale, tx, ty):
96 | shape_params = torch.tensor(shape_params).unsqueeze(0)
97 | expression_params = torch.tensor(expression_params).unsqueeze(0)
98 | pose = torch.tensor(pose).unsqueeze(0)
99 | verts, landmarks3d = self.flame(
100 | shape_params=shape_params, expression_params=expression_params, pose_params=pose)
101 | trans_landmarks3d = self.transform_points(landmarks3d, scale, tx, ty)
102 | trans_landmarks3d = trans_landmarks3d.squeeze(0).cpu().numpy()
103 | return trans_landmarks3d[:, 0:2].tolist()
104 |
105 | def calculate_nose_tip_tx_ty(self, shape_params, expression_params, pose, scale, tx, ty):
106 | front_pose = copy.deepcopy(pose)
107 | front_pose[0] = front_pose[1] = front_pose[2] = 0
108 | front_landmarks3d = self.get_project_points(shape_params, expression_params, front_pose, scale, tx, ty)
109 | original_landmark3d = self.get_project_points(shape_params, expression_params, pose, scale, tx, ty)
110 | nose_tx = original_landmark3d[30][0] - front_landmarks3d[30][0]
111 | nose_ty = original_landmark3d[30][1] - front_landmarks3d[30][1]
112 | return nose_tx, nose_ty
113 |
114 | def get_pose(self, shape_params, expression_params, pose, scale, tx, ty):
115 | nose_tx, nose_ty = self.calculate_nose_tip_tx_ty(
116 | shape_params, expression_params, pose, scale, tx, ty)
117 | transformed_axis_angle = [
118 | float(pose[0]),
119 | float(pose[1]),
120 | float(pose[2])
121 | ]
122 | transformed_tx = self.tx_transform + self.tx_scale * (tx + nose_tx)
123 | transformed_ty = self.ty_transform + self.ty_scale * (ty + nose_ty)
124 | transformed_scale = scale / self.scale_transform
125 | return transformed_axis_angle + [transformed_tx, transformed_ty, transformed_scale]
126 |
127 |
128 | if __name__ == '__main__':
129 | args = parse_args()
130 |
131 | # 3DMM fitting by DECA: Detailed Expression Capture and Animation using FLAME model
132 | face_fitting = FLAMEFitting()
133 | src_params = face_fitting.fitting(args.src_img)
134 | drv_params = face_fitting.fitting(args.drv_img)
135 |
136 | # calculate head pose and facial landmarks for the source and driving face images
137 | pose_lml_extractor = PoseLandmarkExtractor()
138 | src_headpose = pose_lml_extractor.get_pose(
139 | src_params['shape'], src_params['exp'], src_params['pose'],
140 | src_params['scale'], src_params['tx'], src_params['ty'])
141 |
142 | src_lmks = pose_lml_extractor.get_project_points(
143 | src_params['shape'], src_params['exp'], src_params['pose'],
144 | src_params['scale'], src_params['tx'], src_params['ty'])
145 |
146 | # Note that the driving head pose and facial landmarks are calculated using the shape parameters of the source image
147 | # in order to eliminate the interference of the driving actor's identity.
148 | drv_headpose = pose_lml_extractor.get_pose(
149 | src_params['shape'], drv_params['exp'], drv_params['pose'],
150 | drv_params['scale'], drv_params['tx'], drv_params['ty'])
151 |
152 | drv_lmks = pose_lml_extractor.get_project_points(
153 | src_params['shape'], drv_params['exp'], drv_params['pose'],
154 | drv_params['scale'], drv_params['tx'], drv_params['ty'])
155 |
156 | # save
157 | os.makedirs(os.path.split(args.output_src_headpose)[0], exist_ok=True)
158 | save_coeffs(args.output_src_headpose, src_headpose)
159 | os.makedirs(os.path.split(args.output_src_landmark)[0], exist_ok=True)
160 | save_landmarks(args.output_src_landmark, src_lmks)
161 |
162 | os.makedirs(os.path.split(args.output_drv_headpose)[0], exist_ok=True)
163 | save_coeffs(args.output_drv_headpose, drv_headpose)
164 | os.makedirs(os.path.split(args.output_drv_landmark)[0], exist_ok=True)
165 | save_landmarks(args.output_drv_landmark, drv_lmks)
166 |
--------------------------------------------------------------------------------
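Illustrative invocation of the fitting stage (the image paths are placeholders; the output paths are simply the argparse defaults above). Each headpose.txt written by save_coeffs() holds six values, one per line: the three axis-angle rotation components followed by the transformed tx, ty and scale returned by get_pose().

    # run from src/ with the DECA/FLAME model files in place:
    #   python fitting.py --src_img path/to/source.jpg --drv_img path/to/driving.jpg
    from util.util import load_coeffs

    rx, ry, rz, tx, ty, scale = load_coeffs('../test_case/source/FLAME/headpose.txt')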
/src/models/VGG19_LOSS.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torchvision
3 | from torch.nn import functional as F
4 | import numpy as np
5 |
6 |
7 | # VGG architecture, used for the perceptual loss using a pretrained VGG network
8 | class VGG19(torch.nn.Module):
9 | def __init__(self, requires_grad=False):
10 | super().__init__()
11 | vgg_pretrained_features = torchvision.models.vgg19(
12 | pretrained=True
13 | ).features
14 | self.slice1 = torch.nn.Sequential()
15 | self.slice2 = torch.nn.Sequential()
16 | self.slice3 = torch.nn.Sequential()
17 | self.slice4 = torch.nn.Sequential()
18 | self.slice5 = torch.nn.Sequential()
19 | for x in range(2):
20 | self.slice1.add_module(str(x), vgg_pretrained_features[x])
21 | for x in range(2, 7):
22 | self.slice2.add_module(str(x), vgg_pretrained_features[x])
23 | for x in range(7, 12):
24 | self.slice3.add_module(str(x), vgg_pretrained_features[x])
25 | for x in range(12, 21):
26 | self.slice4.add_module(str(x), vgg_pretrained_features[x])
27 | for x in range(21, 30):
28 | self.slice5.add_module(str(x), vgg_pretrained_features[x])
29 |
30 | self.mean = torch.nn.Parameter(data=torch.Tensor(np.array([0.485, 0.456, 0.406]).reshape((1, 3, 1, 1))),
31 | requires_grad=False)
32 | self.std = torch.nn.Parameter(data=torch.Tensor(np.array([0.229, 0.224, 0.225]).reshape((1, 3, 1, 1))),
33 | requires_grad=False)
34 |
35 | if not requires_grad:
36 | for param in self.parameters():
37 | param.requires_grad = False
38 |
39 | def forward(self, X):
40 | # Normalize the image so that it is in the appropriate range
41 | X = (X + 1) / 2
42 | X = (X - self.mean) / self.std
43 | h_relu1 = self.slice1(X)
44 | h_relu2 = self.slice2(h_relu1)
45 | h_relu3 = self.slice3(h_relu2)
46 | h_relu4 = self.slice4(h_relu3)
47 | h_relu5 = self.slice5(h_relu4)
48 | out = [h_relu1, h_relu2, h_relu3, h_relu4, h_relu5]
49 | return out
50 |
51 |
52 | class VGG19LOSS(torch.nn.Module):
53 | def __init__(self):
54 | super(VGG19LOSS, self).__init__()
55 | self.model = VGG19()
56 |
57 | def forward(self, fake, target, weight_mask=None, loss_weights=[1.0, 1.0, 1.0, 1.0, 1.0]):
58 | vgg_fake = self.model(fake)
59 | vgg_target = self.model(target)
60 |
61 | value_total = 0
62 | for i, weight in enumerate(loss_weights):
63 | value = torch.abs(vgg_fake[i] - vgg_target[i].detach())
64 | if weight_mask is not None:
65 | bs, c, H1, W1 = value.shape
66 | _, _, H2, W2 = weight_mask.shape
67 | if H1 != H2 or W1 != W2:
68 | cur_weight_mask = F.interpolate(weight_mask, size=(H1, W1))
69 | value = value * cur_weight_mask
70 | else:
71 | value = value * weight_mask
72 | value = torch.mean(value, dim=[x for x in range(1, len(value.size()))])
73 | value_total += loss_weights[i] * value
74 | return value_total
75 |
--------------------------------------------------------------------------------
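A minimal usage sketch (batch size, resolution and the weight mask are illustrative; constructing VGG19LOSS downloads the pretrained torchvision VGG19 weights on first use). Inputs are expected in [-1, 1], since forward() remaps them to [0, 1] before ImageNet normalization, and the result is one loss value per batch element.

    import torch
    from models.VGG19_LOSS import VGG19LOSS  # run from src/

    criterion = VGG19LOSS()
    fake = torch.rand(2, 3, 256, 256) * 2 - 1       # generated images in [-1, 1]
    real = torch.rand(2, 3, 256, 256) * 2 - 1       # target images in [-1, 1]
    mask = torch.ones(2, 1, 256, 256)               # optional per-pixel weighting
    loss = criterion(fake, real, weight_mask=mask)  # tensor of shape (2,)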
/src/models/__init__.py:
--------------------------------------------------------------------------------
1 | import importlib
2 | from models.base_model import BaseModel
3 |
4 |
5 | def find_model_using_name(model_name):
6 | # Given the option --model [modelname],
7 | # the file "models/modelname_model.py"
8 | # will be imported.
9 | model_filename = "models." + model_name + "_model"
10 | modellib = importlib.import_module(model_filename)
11 |
12 | # In the file, the class called ModelNameModel() will
13 | # be instantiated. It has to be a subclass of BaseModel,
14 | # and it is case-insensitive.
15 | model = None
16 | target_model_name = model_name.replace('_', '') + 'model'
17 | for name, cls in modellib.__dict__.items():
18 | if name.lower() == target_model_name.lower() \
19 | and issubclass(cls, BaseModel):
20 | model = cls
21 |
22 | if model is None:
23 | print("In %s.py, there should be a subclass of BaseModel with class name that matches %s in lowercase." % (model_filename, target_model_name))
24 | exit(0)
25 |
26 | return model
27 |
28 |
29 | def get_option_setter(model_name):
30 | model_class = find_model_using_name(model_name)
31 | return model_class.modify_commandline_options
32 |
33 |
34 | def create_model(opt):
35 | model = find_model_using_name(opt.model)
36 | instance = model()
37 | instance.initialize(opt)
38 | print("model [%s] was created" % (instance.name()))
39 | return instance
40 |
--------------------------------------------------------------------------------
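An illustrative example of the lookup convention implemented above (the model name itself is hypothetical here): a value such as opt.model = 'face2face_rho' makes find_model_using_name() import models/face2face_rho_model.py and return the BaseModel subclass whose lowercased class name equals 'face2facerhomodel'.

    from models import find_model_using_name  # run from src/

    model_cls = find_model_using_name('face2face_rho')
    print(model_cls.__name__)  # e.g. Face2FaceRHOModel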
/src/models/base_model.py:
--------------------------------------------------------------------------------
1 | import os
2 | import torch
3 | import torch.nn as nn
4 | from collections import OrderedDict
5 | from . import networks
6 | import numpy as np
7 | from PIL import Image
8 |
9 | def save_tensor_image(input_image, image_path):
10 | if isinstance(input_image, torch.Tensor):
11 | image_tensor = input_image.data
12 | image_numpy = image_tensor[0].cpu().float().numpy()
13 | if image_numpy.shape[0] == 1:
14 | image_numpy = np.tile(image_numpy, (3, 1, 1))
15 | image_numpy = (np.transpose(image_numpy, (1, 2, 0)) + 1) / 2.0 * 255.0
16 | else:
17 | image_numpy = input_image
18 | image_numpy = image_numpy.astype(np.uint8)
19 | image_pil = Image.fromarray(image_numpy)
20 | image_pil.save(image_path)
21 |
22 | class BaseModel():
23 | @staticmethod
24 | def modify_commandline_options(parser, is_train):
25 | return parser
26 |
27 | def name(self):
28 | return 'BaseModel'
29 |
30 | def initialize(self, opt):
31 | self.opt = opt
32 | self.gpu_ids = opt.gpu_ids
33 | self.isTrain = opt.isTrain
34 | self.device = torch.device('cuda:{}'.format(self.gpu_ids[0])) if self.gpu_ids else torch.device('cpu')
35 | self.load_dir = os.path.join(opt.checkpoints_dir, opt.name)
36 | self.save_dir = os.path.join(opt.checkpoints_dir, opt.name)
37 | if not os.path.exists(self.save_dir):
38 | os.makedirs(self.save_dir)
39 | self.loss_names = []
40 | self.model_names = []
41 | self.visual_names = []
42 | self.image_paths = []
43 |
44 | def set_input(self, input):
45 | pass
46 |
47 | def forward(self):
48 | pass
49 |
50 | # load and print networks; create schedulers
51 | def setup(self, opt, parser=None):
52 | if self.isTrain:
53 | self.schedulers = [networks.get_scheduler(optimizer, opt) for optimizer in self.optimizers]
54 | if not self.isTrain or opt.continue_train:
55 | load_suffix = 'iter_%d' % opt.load_iter if opt.load_iter > 0 else opt.epoch
56 | self.load_networks(load_suffix)
57 | self.print_networks(opt.verbose)
58 |
59 |
60 |
61 | # load specific modules
62 | def loadModules(self, opt, model_name, module_names):
63 | for name in module_names:
64 | if isinstance(name, str):
65 | load_dir = os.path.join(opt.checkpoints_dir, model_name)
66 | load_filename = 'latest_%s.pth' % (name)
67 | load_path = os.path.join(load_dir, load_filename)
68 | net = getattr(self, name)
69 | if isinstance(net, torch.Tensor):
70 | print('loading the tensor from %s' % load_path)
71 | net_loaded = torch.load(load_path, map_location=str(self.device))
72 | net.copy_(net_loaded)
73 | else:
74 | # if isinstance(net, torch.nn.DataParallel):
75 | # net = net.module
76 | print('loading the module from %s' % load_path)
77 | # if you are using PyTorch newer than 0.4 (e.g., built from
78 | # GitHub source), you can remove str() on self.device
79 | state_dict = torch.load(load_path, map_location=str(self.device))
80 | if hasattr(state_dict, '_metadata'):
81 | del state_dict._metadata
82 |
83 | # patch InstanceNorm checkpoints prior to 0.4
84 | for key in list(state_dict.keys()): # need to copy keys here because we mutate in loop
85 | self.__patch_instance_norm_state_dict(state_dict, net, key.split('.'))
86 | net.load_state_dict(state_dict)
87 |
88 |
89 |
90 |
91 | # make models eval mode during test time
92 | def eval(self):
93 | for name in self.model_names:
94 | if isinstance(name, str):
95 | net = getattr(self, name)
96 | net.eval()
97 |
98 | # used in test time, wrapping `forward` in no_grad() so we don't save
99 | # intermediate steps for backprop
100 | def test(self):
101 | with torch.no_grad():
102 | self.forward()
103 |
104 | # get image paths
105 | def get_image_paths(self):
106 | return self.image_paths
107 |
108 | def optimize_parameters(self):
109 | pass
110 |
111 | # update learning rate (called once every epoch)
112 | def update_learning_rate(self):
113 | for scheduler in self.schedulers:
114 | scheduler.step()
115 | lr = self.optimizers[0].param_groups[0]['lr']
116 | print('learning rate = %.7f' % lr)
117 |
118 | # return visualization images. train.py will display these images and save them to an HTML file
119 | def get_current_visuals(self):
120 | visual_ret = OrderedDict()
121 | for name in self.visual_names:
122 | if isinstance(name, str):
123 | visual_ret[name] = getattr(self, name)
124 | return visual_ret
125 |
126 | # return training losses/errors. train.py will print out these errors as debugging information
127 | def get_current_losses(self):
128 | errors_ret = OrderedDict()
129 | for name in self.loss_names:
130 | if isinstance(name, str):
131 | # float(...) works for both scalar tensor and float number
132 | errors_ret[name] = float(getattr(self, 'loss_' + name))
133 | return errors_ret
134 |
135 | # save models to the disk
136 | def save_networks(self, epoch):
137 | for name in self.model_names:
138 | if isinstance(name, str):
139 | save_filename = '%s_%s.pth' % (epoch, name)
140 | save_path = os.path.join(self.save_dir, save_filename)
141 | net = getattr(self, name)
142 |
143 | if isinstance(net, torch.Tensor):
144 | #torch.save(net.state_dict(), save_path)
145 | torch.save(net, save_path)
146 | for i in range(0, list(net.size())[0]):
147 | save_tensor_image(net[i:i+1,0:3,:,:], save_path+str(i)+'.png')
148 | else:
149 | if len(self.gpu_ids) > 0 and torch.cuda.is_available():
150 | #torch.save(net.module.cpu().state_dict(), save_path) # << original
151 | torch.save(net.cpu().state_dict(), save_path)
152 | net.cuda(self.gpu_ids[0])
153 | else:
154 | torch.save(net.cpu().state_dict(), save_path)
155 |
156 | def __patch_instance_norm_state_dict(self, state_dict, module, keys, i=0):
157 | key = keys[i]
158 | if i + 1 == len(keys): # at the end, pointing to a parameter/buffer
159 | if module.__class__.__name__.startswith('InstanceNorm') and \
160 | (key == 'running_mean' or key == 'running_var'):
161 | if getattr(module, key) is None:
162 | state_dict.pop('.'.join(keys))
163 | if module.__class__.__name__.startswith('InstanceNorm') and \
164 | (key == 'num_batches_tracked'):
165 | state_dict.pop('.'.join(keys))
166 | else:
167 | self.__patch_instance_norm_state_dict(state_dict, getattr(module, key), keys, i + 1)
168 |
169 | # load models from the disk
170 | def load_networks(self, epoch):
171 | for name in self.model_names:
172 | if isinstance(name, str):
173 | load_filename = '%s_%s.pth' % (epoch, name)
174 | load_path = os.path.join(self.load_dir, load_filename)
175 | net = getattr(self, name)
176 | if isinstance(net, torch.Tensor):
177 | print('loading the tensor from %s' % load_path)
178 | net_loaded = torch.load(load_path, map_location=str(self.device))
179 | net.copy_(net_loaded)
180 | else:
181 | # if isinstance(net, torch.nn.DataParallel):
182 | # net = net.module
183 | print('loading the module from %s' % load_path)
184 | # if you are using PyTorch newer than 0.4 (e.g., built from
185 | # GitHub source), you can remove str() on self.device
186 | state_dict = torch.load(load_path, map_location=str(self.device))
187 | if hasattr(state_dict, '_metadata'):
188 | del state_dict._metadata
189 |
190 | # patch InstanceNorm checkpoints prior to 0.4
191 | for key in list(state_dict.keys()): # need to copy keys here because we mutate in loop
192 | self.__patch_instance_norm_state_dict(state_dict, net, key.split('.'))
193 | net.load_state_dict(state_dict)
194 |
195 | # print network information
196 | def print_networks(self, verbose):
197 | print('---------- Networks initialized -------------')
198 | for name in self.model_names:
199 | if isinstance(name, str):
200 | net = getattr(self, name)
201 | if isinstance(net, torch.Tensor):
202 | num_params = net.numel()
203 | print('[Tensor %s] Total number of parameters : %.3f M' % (name, num_params / 1e6))
204 | else:
205 | num_params = 0
206 | for param in net.parameters():
207 | num_params += param.numel()
208 | if verbose:
209 | print(net)
210 | print('[Network %s] Total number of parameters : %.3f M' % (name, num_params / 1e6))
211 | print('-----------------------------------------------')
212 |
213 | # set requires_grad=False to avoid unnecessary gradient computation
214 | def set_requires_grad(self, nets, requires_grad=False):
215 | if not isinstance(nets, list):
216 | nets = [nets]
217 | for net in nets:
218 | if net is not None:
219 | for param in net.parameters():
220 | param.requires_grad = requires_grad
221 |
--------------------------------------------------------------------------------
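A small sketch of the checkpoint naming scheme implied by save_networks()/load_networks() (the network names are hypothetical): with model_names = ['G', 'D'], save_networks('latest') writes latest_G.pth and latest_D.pth under <checkpoints_dir>/<name>/, and load_networks('latest') restores them, patching legacy InstanceNorm buffers along the way.

    epoch, name = 'latest', 'G'                  # hypothetical values
    save_filename = '%s_%s.pth' % (epoch, name)  # -> 'latest_G.pth'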
/src/models/discriminator.py:
--------------------------------------------------------------------------------
1 | from torch import nn
2 | import torch.nn.functional as F
3 | import torch
4 |
5 |
6 | class DownBlock2d(nn.Module):
7 | """
8 | Simple block for processing video (encoder).
9 | """
10 |
11 | def __init__(self, in_features, out_features, norm=False, kernel_size=4, pool=False, sn=False):
12 | super(DownBlock2d, self).__init__()
13 | self.conv = nn.Conv2d(in_channels=in_features, out_channels=out_features, kernel_size=kernel_size)
14 |
15 | if sn:
16 | self.conv = nn.utils.spectral_norm(self.conv)
17 |
18 | if norm:
19 | self.norm = nn.InstanceNorm2d(out_features, affine=True)
20 | else:
21 | self.norm = None
22 | self.pool = pool
23 |
24 | def forward(self, x):
25 | out = x
26 | out = self.conv(out)
27 | if self.norm:
28 | out = self.norm(out)
29 | out = F.leaky_relu(out, 0.2)
30 | if self.pool:
31 | out = F.avg_pool2d(out, (2, 2))
32 | return out
33 |
34 |
35 | class Discriminator(nn.Module):
36 | """
37 | Discriminator similar to Pix2Pix
38 | """
39 |
40 | def __init__(self, num_channels=3, block_expansion=64, num_blocks=4, max_features=512, sn=False, use_kp=False):
41 | super(Discriminator, self).__init__()
42 |
43 | down_blocks = []
44 | for i in range(num_blocks):
45 | down_blocks.append(
46 | DownBlock2d(num_channels + 3 * use_kp if i == 0 else min(max_features, block_expansion * (2 ** i)),
47 | min(max_features, block_expansion * (2 ** (i + 1))),
48 | norm=(i != 0), kernel_size=4, pool=(i != num_blocks - 1), sn=sn))
49 |
50 | self.down_blocks = nn.ModuleList(down_blocks)
51 | self.conv = nn.Conv2d(self.down_blocks[-1].conv.out_channels, out_channels=1, kernel_size=1)
52 | if sn:
53 | self.conv = nn.utils.spectral_norm(self.conv)
54 | self.use_kp = use_kp
55 |
56 | def forward(self, x, kp=None):
57 | feature_maps = []
58 | out = x
59 | if self.use_kp:
60 | bs, _, h1, w1 = kp.shape
61 | bs, C, h2, w2 = out.shape
62 | if h1 != h2 or w1 != w2:
63 | kp = F.interpolate(kp, size=(h2, w2), mode='bilinear')
64 | out = torch.cat([out, kp], dim=1)
65 |
66 | for down_block in self.down_blocks:
67 | feature_maps.append(down_block(out))
68 | out = feature_maps[-1]
69 | prediction_map = self.conv(out)
70 |
71 | return feature_maps, prediction_map
72 |
73 |
74 | class MultiScaleDiscriminator(nn.Module):
75 | """
76 | Multi-scale discriminator (one sub-discriminator per image-pyramid scale)
77 | """
78 |
79 | def __init__(self, scales=(), **kwargs):
80 | super(MultiScaleDiscriminator, self).__init__()
81 | self.scales = scales
82 | discs = {}
83 | for scale in scales:
84 | discs[str(scale).replace('.', '-')] = Discriminator(**kwargs)
85 | self.discs = nn.ModuleDict(discs)
86 |
87 | def forward(self, x, kp=None):
88 | out_dict = {}
89 | for scale, disc in self.discs.items():
90 | scale = str(scale).replace('-', '.')
91 | key = 'prediction_' + scale
92 | feature_maps, prediction_map = disc(x[key], kp)
93 | out_dict['feature_maps_' + scale] = feature_maps
94 | out_dict['prediction_map_' + scale] = prediction_map
95 | return out_dict
--------------------------------------------------------------------------------
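A minimal usage sketch (batch size, resolution and the constructor arguments are illustrative; in training they come from the .ini options): MultiScaleDiscriminator expects a dict keyed by 'prediction_<scale>', which is exactly the layout produced by ImagePyramide in the next file, and parse_config.py fixes disc_scales to [1].

    import torch
    from models.discriminator import MultiScaleDiscriminator  # run from src/

    disc = MultiScaleDiscriminator(scales=[1], block_expansion=64,
                                   num_blocks=4, max_features=512)
    out = disc({'prediction_1': torch.rand(2, 3, 256, 256)})
    feature_maps = out['feature_maps_1']  # intermediate activations per block
    logits = out['prediction_map_1']      # patch-wise real/fake prediction map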
/src/models/image_pyramid.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from torch import nn
3 | import torch.nn.functional as F
4 |
5 |
6 | class AntiAliasInterpolation2d(nn.Module):
7 | """
8 | Band-limited downsampling, for better preservation of the input signal.
9 | """
10 | def __init__(self, channels, scale):
11 | super(AntiAliasInterpolation2d, self).__init__()
12 | sigma = (1 / scale - 1) / 2
13 | kernel_size = 2 * round(sigma * 4) + 1
14 | self.ka = kernel_size // 2
15 | self.kb = self.ka - 1 if kernel_size % 2 == 0 else self.ka
16 |
17 | kernel_size = [kernel_size, kernel_size]
18 | sigma = [sigma, sigma]
19 | # The gaussian kernel is the product of the
20 | # gaussian function of each dimension.
21 | kernel = 1
22 | meshgrids = torch.meshgrid(
23 | [
24 | torch.arange(size, dtype=torch.float32)
25 | for size in kernel_size
26 | ]
27 | )
28 | for size, std, mgrid in zip(kernel_size, sigma, meshgrids):
29 | mean = (size - 1) / 2
30 | kernel *= torch.exp(-(mgrid - mean) ** 2 / (2 * std ** 2))
31 |
32 | # Make sure sum of values in gaussian kernel equals 1.
33 | kernel = kernel / torch.sum(kernel)
34 | # Reshape to depthwise convolutional weight
35 | kernel = kernel.view(1, 1, *kernel.size())
36 | kernel = kernel.repeat(channels, *[1] * (kernel.dim() - 1))
37 |
38 | self.register_buffer('weight', kernel)
39 | self.groups = channels
40 | self.scale = scale
41 |
42 | def forward(self, input):
43 | if self.scale == 1.0:
44 | return input
45 |
46 | out = F.pad(input, (self.ka, self.kb, self.ka, self.kb))
47 | out = F.conv2d(out, weight=self.weight, groups=self.groups)
48 | out = F.interpolate(out, scale_factor=(self.scale, self.scale))
49 |
50 | return out
51 |
52 |
53 | class ImagePyramide(torch.nn.Module):
54 | def __init__(self, scales, num_channels):
55 | super(ImagePyramide, self).__init__()
56 | downs = {}
57 | for scale in scales:
58 | downs[str(scale).replace('.', '-')] = AntiAliasInterpolation2d(num_channels, scale)
59 | self.downs = nn.ModuleDict(downs)
60 |
61 | def forward(self, x):
62 | out_dict = {}
63 | for scale, down_module in self.downs.items():
64 | out_dict['prediction_' + str(scale).replace('-', '.')] = down_module(x)
65 | return out_dict
--------------------------------------------------------------------------------
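A minimal sketch of the pyramid interface (the scales are illustrative): each scale yields a key 'prediction_<scale>' holding the anti-aliased, downsampled image, matching the keys consumed by MultiScaleDiscriminator above. For scale 0.5 the code above gives sigma = (1/0.5 - 1)/2 = 0.5 and a kernel size of 2*round(0.5*4)+1 = 5.

    import torch
    from models.image_pyramid import ImagePyramide  # run from src/

    pyramid = ImagePyramide(scales=[1, 0.5, 0.25], num_channels=3)
    out = pyramid(torch.rand(2, 3, 256, 256))
    # out['prediction_1']    -> 256x256 (returned unchanged)
    # out['prediction_0.5']  -> 128x128
    # out['prediction_0.25'] -> 64x64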
/src/models/motion_network.py:
--------------------------------------------------------------------------------
1 | import torch.nn.functional as F
2 | from torch import nn
3 |
4 |
5 | class DownBlock(nn.Module):
6 | def __init__(self, in_features, out_features, kernel_size=3, padding=1, groups=1, use_relu=True):
7 | super(DownBlock, self).__init__()
8 | self.conv = nn.Conv2d(in_channels=in_features, out_channels=out_features, kernel_size=kernel_size,
9 | padding=padding, groups=groups, stride=2)
10 | self.norm = nn.BatchNorm2d(out_features, affine=True)
11 | self.use_relu = use_relu
12 |
13 | def forward(self, x):
14 | out = self.conv(x)
15 | out = self.norm(out)
16 | if self.use_relu:
17 | out = F.relu(out)
18 | return out
19 |
20 |
21 | class UpBlock(nn.Module):
22 | def __init__(self, in_features, out_features, kernel_size=3, padding=1, groups=1, use_relu=True,
23 | sample_mode='nearest'):
24 | super(UpBlock, self).__init__()
25 | self.conv = nn.Conv2d(in_channels=in_features, out_channels=out_features, kernel_size=kernel_size,
26 | padding=padding, groups=groups)
27 | self.norm = nn.BatchNorm2d(out_features, affine=True)
28 | self.use_relu = use_relu
29 | self.sample_mode = sample_mode
30 |
31 | def forward(self, x):
32 | out = F.interpolate(x, scale_factor=2, mode=self.sample_mode)
33 | out = self.conv(out)
34 | out = self.norm(out)
35 | if self.use_relu:
36 | out = F.relu(out)
37 | return out
38 |
39 |
40 | class ResBlock(nn.Module):
41 | """
42 | Res block, preserve spatial resolution.
43 | """
44 |
45 | def __init__(self, in_features, kernel_size, padding):
46 | super(ResBlock, self).__init__()
47 | self.conv1 = nn.Conv2d(in_channels=in_features, out_channels=in_features, kernel_size=kernel_size,
48 | padding=padding)
49 | self.conv2 = nn.Conv2d(in_channels=in_features, out_channels=in_features, kernel_size=kernel_size,
50 | padding=padding)
51 | self.norm = nn.BatchNorm2d(in_features, affine=True)
52 |
53 | def forward(self, x):
54 | out = self.conv1(x)
55 | out = self.norm(out)
56 | out = F.relu(out)
57 | out = self.conv2(out)
58 | out += x
59 | return out
60 |
61 |
62 | class MotionNet(nn.Module):
63 | def __init__(self, opt):
64 | super(MotionNet, self).__init__()
65 |
66 | ngf = opt.mn_ngf
67 | n_local_enhancers = opt.n_local_enhancers
68 | n_downsampling = opt.mn_n_downsampling
69 | n_blocks_local = opt.mn_n_blocks_local
70 |
71 | in_features = [9, 9, 9]
72 |
73 | # F1
74 | f1_model_ngf = ngf * (2 ** n_local_enhancers)
75 | f1_model = [
76 | nn.Conv2d(in_channels=in_features[0], out_channels=f1_model_ngf, kernel_size=3, stride=1, padding=1),
77 | nn.BatchNorm2d(f1_model_ngf),
78 | nn.ReLU(True)]
79 |
80 | for i in range(n_downsampling):
81 | mult = 2 ** i
82 | f1_model += [
83 | DownBlock(f1_model_ngf * mult, f1_model_ngf * mult * 2, kernel_size=4, padding=1, use_relu=True)]
84 |
85 | for i in range(n_downsampling):
86 | mult = 2 ** (n_downsampling - i)
87 | f1_model += [
88 | UpBlock(f1_model_ngf * mult, int(f1_model_ngf * mult / 2), kernel_size=3, padding=1)
89 | ]
90 |
91 | self.f1_model = nn.Sequential(*f1_model)
92 | self.f1_motion = nn.Conv2d(f1_model_ngf, 2, kernel_size=(3, 3), padding=(1, 1))
93 |
94 | #f2 and f3
95 | for n in range(1, n_local_enhancers + 1):
96 | ### first downsampling block
97 | ngf_global = ngf * (2 ** (n_local_enhancers - n))
98 | model_first_downsample = [DownBlock(in_features[n], ngf_global * 2, kernel_size=4, padding=1, use_relu=True)]
99 | ### other downsampling blocks, residual blocks and upsampling blocks
100 | # other downsampling blocks
101 | model_other = []
102 | model_other += [
103 | DownBlock(ngf_global * 2, ngf_global * 4, kernel_size=4, padding=1, use_relu=True),
104 | DownBlock(ngf_global * 4, ngf_global * 8, kernel_size=4, padding=1, use_relu=True),
105 | ]
106 | # residual blocks
107 | for i in range(n_blocks_local):
108 | model_other += [ResBlock(ngf_global * 8, 3, 1)]
109 | # upsampling blocks
110 | model_other += [
111 | UpBlock(ngf_global * 8, ngf_global * 4, kernel_size=3, padding=1),
112 | UpBlock(ngf_global * 4, ngf_global * 2, kernel_size=3, padding=1),
113 | UpBlock(ngf_global * 2, ngf_global, kernel_size=3, padding=1)
114 | ]
115 | model_motion = nn.Conv2d(ngf_global, out_channels=2, kernel_size=3, padding=1, groups=1)
116 |
117 | setattr(self, 'model' + str(n) + '_1', nn.Sequential(*model_first_downsample))
118 | setattr(self, 'model' + str(n) + '_2', nn.Sequential(*model_other))
119 | setattr(self, 'model' + str(n) + '_3', model_motion)
120 |
121 | def forward(self, input1, input2, input3):
122 | ### output at small scale (f1)
123 | output_prev = self.f1_model(input1)
124 | low_motion = self.f1_motion(output_prev)
125 |
126 | ### output at middle scale (f2)
127 | output_prev = self.model1_2(self.model1_1(input2) + output_prev)
128 | middle_motion = self.model1_3(output_prev)
129 | middle_motion = middle_motion + nn.Upsample(scale_factor=2, mode='nearest')(low_motion)
130 |
131 | ### output at large scale (f3)
132 | output_prev = self.model2_2(self.model2_1(input3) + output_prev)
133 | high_motion = self.model2_3(output_prev)
134 | high_motion = high_motion + nn.Upsample(scale_factor=2, mode='nearest')(middle_motion)
135 |
136 | low_motion = low_motion.permute(0, 2, 3, 1)
137 | middle_motion = middle_motion.permute(0, 2, 3, 1)
138 | high_motion = high_motion.permute(0, 2, 3, 1)
139 | return [low_motion, middle_motion, high_motion]
140 |
--------------------------------------------------------------------------------
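A shape-only sketch of the coarse-to-fine interface (the option values and input resolutions are assumptions, not taken from the shipped configs): the three 9-channel inputs double in resolution from level to level, and the returned motion fields are permuted to (B, H, W, 2), the layout expected by torch.nn.functional.grid_sample.

    import torch
    from models.motion_network import MotionNet  # run from src/

    class Opt:  # hypothetical option values
        mn_ngf = 32
        n_local_enhancers = 2
        mn_n_downsampling = 2
        mn_n_blocks_local = 3

    net = MotionNet(Opt())
    low, mid, high = net(torch.rand(1, 9, 16, 16),
                         torch.rand(1, 9, 32, 32),
                         torch.rand(1, 9, 64, 64))
    # low: (1, 16, 16, 2), mid: (1, 32, 32, 2), high: (1, 64, 64, 2)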
/src/options/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/src/options/__init__.py
--------------------------------------------------------------------------------
/src/options/parse_config.py:
--------------------------------------------------------------------------------
1 | import configparser
2 | from util import util
3 | import os
4 | import torch
5 | from abc import ABC
6 |
7 |
8 | class Options():
9 | def __init__(self):
10 | pass
11 |
12 |
13 | def _get_value_from_ini(conf, section, option, type=str, default=None):
14 | if conf.has_option(section, option):
15 | if type == bool:
16 | return conf.get(section, option) == str(True)
17 | else:
18 | return type(conf.get(section, option))
19 | else:
20 | return default
21 |
22 |
23 | def str2ids(input_str):
24 | str_ids = input_str.split(',')
25 | ids = []
26 | for str_id in str_ids:
27 | id = int(str_id)
28 | if id >= 0:
29 | ids.append(id)
30 | return ids
31 |
32 |
33 | def str2floats(input_str):
34 | str_ids = input_str.split(',')
35 | ids = []
36 | for str_id in str_ids:
37 | id = float(str_id)
38 | if id >= 0:
39 | ids.append(id)
40 | return ids
41 |
42 |
43 | def str2ints(input_str):
44 | str_ids = input_str.split(',')
45 | ids = []
46 | for str_id in str_ids:
47 | id = int(str_id)
48 | if id >= 0:
49 | ids.append(id)
50 | return ids
51 |
52 |
53 | class ConfigParse(ABC):
54 | def __init__(self):
55 | self.conf = configparser.ConfigParser()
56 | self.opt = 0
57 |
58 | # overridden by concrete config parsers (e.g. Face2FaceRHOConfigParse below)
59 | def get_opt_from_ini(self, file_name):
60 | return self.opt
61 |
62 | def setup_environment(self):
63 | # print options
64 | message = ''
65 | message += '----------------- Options ---------------\n'
66 | for k, v in vars(self.opt).items():
67 | message += '{:>25}: {:<30}\n'.format(str(k), str(v))
68 | message += '----------------- End -------------------'
69 | print(message)
70 |
71 | # save to the disk
72 | expr_dir = os.path.join(self.opt.checkpoints_dir, self.opt.name)
73 | util.mkdirs(expr_dir)
74 | file_name = os.path.join(expr_dir, '{}_opt.txt'.format(self.opt.phase))
75 | with open(file_name, 'wt') as opt_file:
76 | opt_file.write(message)
77 | opt_file.write('\n')
78 |
79 | # set gpu ids
80 | if len(self.opt.gpu_ids) > 0:
81 | torch.cuda.set_device(self.opt.gpu_ids[0])
82 |
83 | def setup_test_environment(self):
84 | # print options
85 | message = ''
86 | message += '----------------- Options ---------------\n'
87 | for k, v in vars(self.opt).items():
88 | message += '{:>25}: {:<30}\n'.format(str(k), str(v))
89 | message += '----------------- End -------------------'
90 | print(message)
91 |
92 | # set gpu ids
93 | if len(self.opt.gpu_ids) > 0:
94 | torch.cuda.set_device(self.opt.gpu_ids[0])
95 |
96 |
97 | class Face2FaceRHOConfigParse(ConfigParse):
98 | def __init__(self):
99 | ConfigParse.__init__(self)
100 |
101 | def get_opt_from_ini(self, file_name):
102 | self.conf.read(file_name, encoding="utf-8")
103 | opt = Options()
104 | # basic config
105 | opt.name = self.conf.get("ROOT", "name")
106 | opt.gpu_ids = self.conf.get("ROOT", "gpu_ids")
107 | opt.gpu_ids = str2ids(opt.gpu_ids)
108 | opt.checkpoints_dir = self.conf.get("ROOT", "checkpoints_dir")
109 | opt.model = self.conf.get("ROOT", "model")
110 | opt.output_size = int(self.conf.get("ROOT", "output_size"))
111 | opt.isTrain = self.conf.get("ROOT", "isTrain") == 'True'
112 | opt.phase = self.conf.get("ROOT", "phase")
113 | opt.load_iter = int(self.conf.get("ROOT", "load_iter"))
114 | opt.epoch = int(self.conf.get("ROOT", "epoch"))
115 |
116 | # rendering module config
117 | opt.headpose_dims = int(self.conf.get("ROOT", "headpose_dims"))
118 | opt.mobilev2_encoder_channels = self.conf.get("ROOT", "mobilev2_encoder_channels")
119 | opt.mobilev2_encoder_channels = str2ints(opt.mobilev2_encoder_channels)
120 | opt.mobilev2_decoder_channels = self.conf.get("ROOT", "mobilev2_decoder_channels")
121 | opt.mobilev2_decoder_channels = str2ints(opt.mobilev2_decoder_channels)
122 | opt.mobilev2_encoder_layers = self.conf.get("ROOT", "mobilev2_encoder_layers")
123 | opt.mobilev2_encoder_layers = str2ints(opt.mobilev2_encoder_layers)
124 | opt.mobilev2_decoder_layers = self.conf.get("ROOT", "mobilev2_decoder_layers")
125 | opt.mobilev2_decoder_layers = str2ints(opt.mobilev2_decoder_layers)
126 | opt.mobilev2_encoder_expansion_factor = self.conf.get("ROOT", "mobilev2_encoder_expansion_factor")
127 | opt.mobilev2_encoder_expansion_factor = str2ints(opt.mobilev2_encoder_expansion_factor)
128 | opt.mobilev2_decoder_expansion_factor = self.conf.get("ROOT", "mobilev2_decoder_expansion_factor")
129 | opt.mobilev2_decoder_expansion_factor = str2ints(opt.mobilev2_decoder_expansion_factor)
130 | opt.headpose_embedding_ngf = int(self.conf.get("ROOT", "headpose_embedding_ngf"))
131 |
132 | # motion module config
133 | opt.mn_ngf = int(self.conf.get("ROOT", "mn_ngf"))
134 | opt.n_local_enhancers = int(self.conf.get("ROOT", "n_local_enhancers"))
135 | opt.mn_n_downsampling = int(self.conf.get("ROOT", "mn_n_downsampling"))
136 | opt.mn_n_blocks_local = int(self.conf.get("ROOT", "mn_n_blocks_local"))
137 |
138 | # discriminator
139 | opt.disc_scales = [1]
140 | opt.disc_block_expansion = int(self.conf.get("ROOT", "disc_block_expansion"))
141 | opt.disc_num_blocks = int(self.conf.get("ROOT", "disc_num_blocks"))
142 | opt.disc_max_features = int(self.conf.get("ROOT", "disc_max_features"))
143 |
144 | # training parameters
145 | opt.init_type = self.conf.get("ROOT", "init_type")
146 | opt.init_gain = float(self.conf.get("ROOT", "init_gain"))
147 | opt.emphasize_face_area = self.conf.get("ROOT", "emphasize_face_area") == 'True'
148 | opt.loss_scales = self.conf.get("ROOT", "loss_scales")
149 | opt.loss_scales = str2floats(opt.loss_scales)
150 | opt.warp_loss_weight = float(self.conf.get("ROOT", "warp_loss_weight"))
151 | opt.reconstruction_loss_weight = float(self.conf.get("ROOT", "reconstruction_loss_weight"))
152 | opt.feature_matching_loss_weight = float(self.conf.get("ROOT", "feature_matching_loss_weight"))
153 | opt.face_area_weight_scale = float(self.conf.get("ROOT", "face_area_weight_scale"))
154 | opt.init_field_epochs = int(self.conf.get("ROOT", "init_field_epochs"))
155 | opt.lr = float(self.conf.get("ROOT", "lr"))
156 | opt.beta1 = float(self.conf.get("ROOT", "beta1"))
157 | opt.lr_policy = self.conf.get("ROOT", "lr_policy")
158 | opt.epoch_count = int(self.conf.get("ROOT", "epoch_count"))
159 | opt.niter = int(self.conf.get("ROOT", "niter"))
160 | opt.niter_decay = int(self.conf.get("ROOT", "niter_decay"))
161 | opt.continue_train = self.conf.get("ROOT", "continue_train") == 'True'
162 |
163 | # dataset parameters
164 | opt.dataset_mode = self.conf.get("ROOT", "dataset_mode")
165 | opt.dataroot = self.conf.get("ROOT", "dataroot")
166 | opt.num_repeats = int(self.conf.get("ROOT", "num_repeats"))
167 | opt.batch_size = int(self.conf.get("ROOT", "batch_size"))
168 | opt.serial_batches = self.conf.get("ROOT", "serial_batches") == 'True'
169 | opt.num_threads = int(self.conf.get("ROOT", "num_threads"))
170 | opt.max_dataset_size = float("inf")
171 |
172 | # vis_config
173 | opt.display_freq = int(self.conf.get("ROOT", "display_freq"))
174 | opt.update_html_freq = int(self.conf.get("ROOT", "update_html_freq"))
175 | opt.display_id = int(self.conf.get("ROOT", "display_id"))
176 | opt.display_server = self.conf.get("ROOT", "display_server")
177 | opt.display_env = self.conf.get("ROOT", "display_env")
178 | opt.display_port = int(self.conf.get("ROOT", "display_port"))
179 | opt.print_freq = int(self.conf.get("ROOT", "print_freq"))
180 | opt.save_latest_freq = int(self.conf.get("ROOT", "save_latest_freq"))
181 | opt.save_epoch_freq = int(self.conf.get("ROOT", "save_epoch_freq"))
182 | opt.no_html = self.conf.get("ROOT", "no_html") == str(True)
183 | opt.display_winsize = int(self.conf.get("ROOT", "display_winsize"))
184 | opt.display_ncols = int(self.conf.get("ROOT", "display_ncols"))
185 | opt.verbose = self.conf.get("ROOT", "verbose") == 'True'
186 | self.opt = opt
187 | return self.opt
188 |
189 |
--------------------------------------------------------------------------------
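A sketch of the .ini layout this parser expects (the keys shown are a small subset of those read above and the values are illustrative; the real config files ship with the repository): every option lives in a single [ROOT] section and is flattened into an Options object.

    # hypothetical fragment of a config file:
    #   [ROOT]
    #   name = face2facerho
    #   gpu_ids = 0
    #   checkpoints_dir = ./checkpoints
    #   model = face2face_rho
    #   output_size = 512
    #   isTrain = False
    from options.parse_config import Face2FaceRHOConfigParse  # run from src/

    opt = Face2FaceRHOConfigParse().get_opt_from_ini('path/to/config.ini')
    print(opt.name, opt.gpu_ids, opt.output_size)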
/src/reenact.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | from options.parse_config import Face2FaceRHOConfigParse
3 | from models import create_model
4 | import os
5 | import torch
6 | import numpy as np
7 | from util.landmark_image_generation import LandmarkImageGeneration
8 | from util.util import (
9 | read_target,
10 | load_coeffs,
11 | load_landmarks
12 | )
13 | import cv2
14 | from util.util import tensor2im
15 |
16 |
17 | def parse_args():
18 | """Configurations."""
19 | parser = argparse.ArgumentParser(description='test process of Face2FaceRHO')
20 | parser.add_argument('--config', type=str, required=True,
21 | help='.ini config file name')
22 | parser.add_argument('--src_img', type=str, required=True,
23 | help='input source actor image (.jpg, .jpeg, .png)')
24 | parser.add_argument('--src_headpose', type=str, required=True,
25 | help='input head pose coefficients of source image (.txt)')
26 | parser.add_argument('--src_landmark', type=str, required=True,
27 | help='input facial landmarks of source image (.txt)')
28 | parser.add_argument('--drv_headpose', type=str, required=True,
29 | help='input head pose coefficients of driving image (.txt)')
30 | parser.add_argument('--drv_landmark', type=str, required=True,
31 | help='input driving facial landmarks (.txt, reconstructed by using shape coefficients '
32 | 'of the source actor and expression and head pose coefficients of the driving actor)')
33 |
34 | parser.add_argument('--output_dir', type=str,
35 | default=os.path.join(os.path.dirname(__file__), '..', 'test_case', 'result'),
36 | help='output directory')
37 |
38 | return _check_args(parser.parse_args())
39 |
40 |
41 | def _check_args(args):
42 | if args is None:
43 | raise RuntimeError('Invalid arguments!')
44 | return args
45 |
46 |
47 | def load_data(opt, headpose_file, landmark_file, img_file=None, load_img=True):
48 | face = dict()
49 | if load_img:
50 | face['img'] = read_target(img_file, opt.output_size)
51 | face['headpose'] = torch.from_numpy(np.array(load_coeffs(headpose_file))).float()
52 | face['landmarks'] = torch.from_numpy(np.array(load_landmarks(landmark_file))).float()
53 | return face
54 |
55 |
56 | if __name__ == '__main__':
57 | args = parse_args()
58 | config_parse = Face2FaceRHOConfigParse()
59 | opt = config_parse.get_opt_from_ini(args.config)
60 | config_parse.setup_environment()
61 |
62 | model = create_model(opt)
63 | model.setup(opt)
64 | model.eval()
65 |
66 | if not os.path.exists(args.output_dir):
67 | os.makedirs(args.output_dir)
68 |
69 | src_face = load_data(opt, args.src_headpose, args.src_landmark, args.src_img)
70 | drv_face = load_data(opt, args.drv_headpose, args.drv_landmark, load_img=False)
71 |
72 | landmark_img_generator = LandmarkImageGeneration(opt)
73 |
74 | # off-line stage
75 | src_face['landmark_img'] = landmark_img_generator.generate_landmark_img(src_face['landmarks'])
76 | src_face['landmark_img'] = [value.unsqueeze(0) for value in src_face['landmark_img']]
77 | model.set_source_face(src_face['img'].unsqueeze(0), src_face['headpose'].unsqueeze(0))
78 |
79 | # on-line stage
80 | drv_face['landmark_img'] = landmark_img_generator.generate_landmark_img(drv_face['landmarks'])
81 | drv_face['landmark_img'] = [value.unsqueeze(0) for value in drv_face['landmark_img']]
82 | model.reenactment(src_face['landmark_img'], drv_face['headpose'].unsqueeze(0), drv_face['landmark_img'])
83 |
84 | visual_results = model.get_current_visuals()
85 | output_file_name = args.output_dir + "/result.png"
86 | im = tensor2im(visual_results['fake'])
87 | im = cv2.cvtColor(im, cv2.COLOR_RGB2BGR)
88 | cv2.imwrite(output_file_name, im)
89 |
--------------------------------------------------------------------------------
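An illustrative end-to-end invocation (the config and image paths are placeholders; the FLAME headpose/landmark files are the ones written by fitting.py to its default output locations). The reenacted frame is saved as <output_dir>/result.png.

    python reenact.py --config path/to/test_config.ini \
        --src_img path/to/source.jpg \
        --src_headpose ../test_case/source/FLAME/headpose.txt \
        --src_landmark ../test_case/source/FLAME/landmark.txt \
        --drv_headpose ../test_case/driving/FLAME/headpose.txt \
        --drv_landmark ../test_case/driving/FLAME/landmark.txt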
/src/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import time
3 | from options.parse_config import Face2FaceRHOConfigParse
4 | from dataset import CreateDataLoader
5 | from models import create_model
6 | from util.visualizer import Visualizer
7 | import os
8 |
9 |
10 | def parse_args():
11 | """Configurations."""
12 | parser = argparse.ArgumentParser(description='training process of Face2FaceRHO')
13 | parser.add_argument('--config', type=str, required=True, help='.ini config file name')
14 | return _check_args(parser.parse_args())
15 |
16 |
17 | def _check_args(args):
18 | if args is None:
19 | raise RuntimeError('Invalid arguments!')
20 | return args
21 |
22 |
23 | if __name__ == '__main__':
24 | print(os.getcwd())
25 | args = parse_args()
26 | config_parse = Face2FaceRHOConfigParse()
27 | opt = config_parse.get_opt_from_ini(args.config) # get training options
28 | config_parse.setup_environment()
29 |
30 | data_loader = CreateDataLoader(opt)
31 | dataset = data_loader.load_data()
32 | dataset_size = len(data_loader)
33 | print('#training images = %d' % dataset_size)
34 |
35 | model = create_model(opt)
36 | model.setup(opt)
37 |
38 | visualizer = Visualizer(opt)
39 | total_steps = 0
40 |
41 | display_inter = -1
42 | print_inter = -1
43 | save_latest_inter = -1
44 |
45 | for epoch in range(opt.epoch_count, opt.niter + opt.niter_decay + 1):
46 | epoch_start_time = time.time()
47 | iter_data_time = time.time()
48 | epoch_iter = 0 # iterator within an epoch
49 |
50 | for i, data in enumerate(dataset):
51 | iter_start_time = time.time()
52 | if total_steps % opt.print_freq == 0:
53 | t_data = iter_start_time - iter_data_time
54 | visualizer.reset()
55 | total_steps += opt.batch_size
56 | epoch_iter += opt.batch_size
57 |
58 | model.set_input(data)
59 | model.optimize_parameters(epoch)
60 | print("total steps: {}".format(total_steps))
61 |
62 | if total_steps // opt.display_freq > display_inter:
63 | display_inter = total_steps // opt.display_freq
64 | save_result = total_steps % opt.update_html_freq == 0
65 | visualizer.display_current_results(model.get_current_visuals(), epoch, False)
66 |
67 | if total_steps // opt.print_freq > print_inter:
68 | print_inter = total_steps // opt.print_freq
69 | losses = model.get_current_losses()
70 | t = (time.time() - iter_start_time) / opt.batch_size
71 | visualizer.print_current_losses(epoch, epoch_iter, losses, t, t_data)
72 | if opt.display_id > 0:
73 | visualizer.plot_current_losses(epoch, float(epoch_iter) / dataset_size, opt, losses)
74 |
75 | if total_steps // opt.save_latest_freq > save_latest_inter:
76 | save_latest_inter = total_steps // opt.save_latest_freq
77 | print('saving the latest model (epoch %d, total_steps %d)' % (epoch, total_steps))
78 | # model.save_networks('latest')
79 | save_suffix = 'iter_%d' % total_steps
80 | model.save_networks(save_suffix)
81 |
82 | iter_data_time = time.time()
83 | if epoch % opt.save_epoch_freq == 0:
84 | print('saving the model at the end of epoch %d, iters %d' % (epoch, total_steps))
85 | model.save_networks('latest')
86 | model.save_networks(epoch)
87 |
88 | print('End of epoch %d / %d \t Time Taken: %d sec' %
89 | (epoch, opt.niter + opt.niter_decay, time.time() - epoch_start_time))
90 | model.update_learning_rate()
--------------------------------------------------------------------------------
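An illustrative launch of the training loop (the config path is a placeholder): all hyper-parameters, dataset paths and visdom display settings come from the .ini, and checkpoints are written under <checkpoints_dir>/<name>/ as iter_<total_steps>_*.pth snapshots plus latest_*.pth and <epoch>_*.pth at epoch boundaries.

    python train.py --config path/to/train_config.ini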
/src/util/html.py:
--------------------------------------------------------------------------------
1 | import dominate
2 | from dominate.tags import meta, h3, table, tr, td, p, a, img, br
3 | import os
4 |
5 |
6 | class HTML:
7 | def __init__(self, web_dir, title, refresh=0):
8 | self.title = title
9 | self.web_dir = web_dir
10 | self.img_dir = os.path.join(self.web_dir, 'images')
11 | if not os.path.exists(self.web_dir):
12 | os.makedirs(self.web_dir)
13 | if not os.path.exists(self.img_dir):
14 | os.makedirs(self.img_dir)
15 | # print(self.img_dir)
16 |
17 | self.doc = dominate.document(title=title)
18 | if refresh > 0:
19 | with self.doc.head:
20 | meta(http_equiv="refresh", content=str(refresh))
21 |
22 | def get_image_dir(self):
23 | return self.img_dir
24 |
25 | def add_header(self, str):
26 | with self.doc:
27 | h3(str)
28 |
29 | def add_table(self, border=1):
30 | self.t = table(border=border, style="table-layout: fixed;")
31 | self.doc.add(self.t)
32 |
33 | def add_images(self, ims, txts, links, width=400):
34 | self.add_table()
35 | with self.t:
36 | with tr():
37 | for im, txt, link in zip(ims, txts, links):
38 | with td(style="word-wrap: break-word;", halign="center", valign="top"):
39 | with p():
40 | with a(href=os.path.join('images', link)):
41 | img(style="width:%dpx" % width, src=os.path.join('images', im))
42 | br()
43 | p(txt)
44 |
45 | def save(self):
46 | html_file = '%s/index.html' % self.web_dir
47 | f = open(html_file, 'wt')
48 | f.write(self.doc.render())
49 | f.close()
50 |
51 |
52 | if __name__ == '__main__':
53 | html = HTML('web/', 'test_html')
54 | html.add_header('hello world')
55 |
56 | ims = []
57 | txts = []
58 | links = []
59 | for n in range(4):
60 | ims.append('image_%d.png' % n)
61 | txts.append('text_%d' % n)
62 | links.append('image_%d.png' % n)
63 | html.add_images(ims, txts, links)
64 | html.save()
65 |
--------------------------------------------------------------------------------
/src/util/landmark_image_generation.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import torchvision.transforms as transforms
3 | import numpy as np
4 |
5 |
6 | class LandmarkImageGeneration:
7 | def __init__(self, opt):
8 | self.output_size = opt.output_size
9 | self.landmark_img_sizes = [
10 | int(opt.output_size / 16),
11 | int(opt.output_size / 8),
12 | int(opt.output_size / 4),
13 | ]
14 |
15 | self.facial_part_color = {
16 | # B G R
17 | 'left_eyebrow': [0, 0, 255], # red
18 | 'right_eyebrow': [0, 255, 0], # green
19 | 'left_eye': [255, 0, 0], # blue
20 | 'right_eye': [255, 255, 0], # cyan
21 | 'nose': [255, 0, 255], # purple
22 | 'mouth': [0, 255, 255], # yellow
23 | 'face_contour': [125, 125, 125], # gray
24 | }
25 |
26 | self.facial_part = [
27 | {
28 | 'left_eyebrow': [],
29 | 'right_eyebrow': [],
30 | 'left_eye': [42], # 1
31 | 'right_eye': [50], # 1
32 | 'nose': [30], # 1
33 | 'mouth': [52, 57], # 2
34 | 'face_contour': [0, 8, 9], # 3
35 | },
36 | {
37 | 'left_eyebrow': [17, 21], # 2
38 | 'right_eyebrow': [26, 22], # 2
39 | 'left_eye': [36, 38, 40, 42, 36], # 4
40 | 'right_eye': [48, 46, 44, 50, 48], # 4
41 | 'nose': [[27, 33, 31], [33, 35]], # 4
42 | 'mouth': [[52, 62, 57, 71, 52], [63], [70]], # 6
43 | 'face_contour': [0, 4, 8, 13, 9], # 5
44 | },
45 | {
46 | 'left_eyebrow': [17, 18, 19, 20, 21], # 5
47 | 'right_eyebrow': [26, 25, 24, 23, 22], # 5
48 | 'left_eye': [36, 37, 38, 39, 40, 41, 42, 43, 36], # 8
49 | 'right_eye': [48, 47, 46, 45, 44, 51, 50, 48, 48], # 8
50 | 'nose': [[27, 28, 29, 30, 33], [31, 32, 33], [35, 34, 33]], # 9
51 | 'mouth': [[52, 54, 53, 62, 58, 59, 57, 65, 66, 71, 69, 68, 52], [55, 56, 63, 61, 60, 64, 70, 67, 55]], # 20
52 | 'face_contour': [0, 1, 2, 3, 4, 5, 6, 7, 8, 16, 15, 14, 13, 12, 11, 10, 9], # 17
53 | }
54 | ]
55 |
56 | def generate_landmark_img(self, landmarks):
57 | landmark_imgs = []
58 | for i in range(len(self.landmark_img_sizes)):
59 | cur_landmarks = landmarks.clone()
60 | image_size = self.landmark_img_sizes[i]
61 |
62 | cur_landmarks[:, 0:1] = (cur_landmarks[:, 0:1] + 1) / 2 * (image_size - 1)
63 | cur_landmarks[:, 1:2] = (cur_landmarks[:, 1:2] + 1) / 2 * (image_size - 1)
64 |
65 | cur_facial_part = self.facial_part[i]
66 | line_width = 1
67 |
68 | landmark_img = np.zeros((image_size, image_size, 3), dtype=np.uint8)
69 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['left_eyebrow'], self.facial_part_color['left_eyebrow'], line_width)
70 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['right_eyebrow'], self.facial_part_color['right_eyebrow'], line_width)
71 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['left_eye'], self.facial_part_color['left_eye'], line_width)
72 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['right_eye'], self.facial_part_color['right_eye'], line_width)
73 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['nose'], self.facial_part_color['nose'], line_width)
74 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['mouth'], self.facial_part_color['mouth'], line_width)
75 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['face_contour'], self.facial_part_color['face_contour'], line_width)
76 | landmark_img = 2.0 * transforms.ToTensor()(landmark_img.astype(np.float32)) / 255.0 - 1.0
77 |
78 | landmark_imgs.append(landmark_img)
79 | return landmark_imgs
80 |
81 | @staticmethod
82 | def draw_line(landmark_map, projected_landmarks, line_ids, line_color, line_width):
83 | if len(line_ids) == 1: # only single point
84 | center_x = int(projected_landmarks[line_ids[0], 0])
85 | center_y = int(projected_landmarks[line_ids[0], 1])
86 | cv2.circle(landmark_map, (center_x, center_y), line_width, line_color, -1, cv2.LINE_4)
87 | elif len(line_ids) > 1:
88 | if isinstance(line_ids[0], list):
89 | for i in range(len(line_ids)):
90 | if len(line_ids[i]) == 1:
91 | center_x = int(projected_landmarks[line_ids[i][0], 0])
92 | center_y = int(projected_landmarks[line_ids[i][0], 1])
93 | cv2.circle(landmark_map, (center_x, center_y), line_width, line_color, -1, cv2.LINE_4)
94 | else:
95 | for j in range(len(line_ids[i]) - 1):
96 | pt1_x = int(projected_landmarks[line_ids[i][j], 0])
97 | pt1_y = int(projected_landmarks[line_ids[i][j], 1])
98 | pt2_x = int(projected_landmarks[line_ids[i][j + 1], 0])
99 | pt2_y = int(projected_landmarks[line_ids[i][j + 1], 1])
100 | pt1 = (int(pt1_x), int(pt1_y))
101 | pt2 = (int(pt2_x), int(pt2_y))
102 | cv2.line(landmark_map, pt1, pt2, line_color, line_width, cv2.LINE_4)
103 | else:
104 | for i in range(len(line_ids) - 1):
105 | pt1_x = int(projected_landmarks[line_ids[i], 0])
106 | pt1_y = int(projected_landmarks[line_ids[i], 1])
107 | pt2_x = int(projected_landmarks[line_ids[i+1], 0])
108 | pt2_y = int(projected_landmarks[line_ids[i + 1], 1])
109 | pt1 = (int(pt1_x), int(pt1_y))
110 | pt2 = (int(pt2_x), int(pt2_y))
111 | cv2.line(landmark_map, pt1, pt2, line_color, line_width, cv2.LINE_4)
112 |
113 |
114 |
115 |
--------------------------------------------------------------------------------
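A minimal usage sketch (the landmark values are random and purely illustrative): the generator expects landmarks with x/y in [-1, 1] and returns three BGR landmark sketches at output_size/16, /8 and /4, each as a float tensor in [-1, 1]. The drawing tables above index landmark ids up to 71, so 72 points suffice for this sketch; the landmark files produced by fitting.py may contain more.

    import torch
    from util.landmark_image_generation import LandmarkImageGeneration  # run from src/

    class Opt:  # hypothetical option holder
        output_size = 512

    gen = LandmarkImageGeneration(Opt())
    landmarks = torch.rand(72, 2) * 2 - 1        # (num_landmarks, 2) in [-1, 1]
    imgs = gen.generate_landmark_img(landmarks)  # tensors of size 32, 64 and 128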
/src/util/util.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from PIL import Image
3 | import os
4 | import torchvision.transforms as transforms
5 | import torch
6 | import cv2
7 |
8 |
9 | def make_ids(path):
10 | ids = []
11 | frames = os.listdir(path)
12 | for frame in frames:
13 | (filename, extension) = os.path.splitext(frame)
14 | ids.append(int(filename))
15 | ids = sorted(ids)
16 | return ids
17 |
18 |
19 | def read_target(file_name, output_size):
20 | pil_target = Image.open(file_name)
21 | if pil_target.size[0] != output_size:
22 | pil_target = transforms.Resize((output_size, output_size), interpolation=Image.BILINEAR)(pil_target)
23 | img_numpy = np.asarray(pil_target)
24 | TARGET = 2.0 * transforms.ToTensor()(img_numpy.astype(np.float32)) / 255.0 - 1.0
25 | return TARGET
26 |
27 |
28 | def load_coeffs(input_file_name):
29 | file = open(input_file_name, "r")
30 | coeffs = [float(line) for line in file]
31 | file.close()
32 | return coeffs
33 |
34 |
35 | def load_landmarks(file_name):
36 | landmarks = []
37 | file = open(file_name, 'r')
38 | for line in file:
39 | s1 = line.split(' ')
40 | landmarks.append([float(s1[0]), float(s1[1])])
41 | file.close()
42 | return landmarks
43 |
44 |
45 | def mkdir(path):
46 | if not os.path.exists(path):
47 | os.makedirs(path)
48 |
49 |
50 | def mkdirs(paths):
51 | if isinstance(paths, list) and not isinstance(paths, str):
52 | for path in paths:
53 | mkdir(path)
54 | else:
55 | mkdir(paths)
56 |
57 |
58 | def tensor2im(input_image, imtype=np.uint8, bs=0):
59 | if isinstance(input_image, torch.Tensor):
60 | input_image = torch.clamp(input_image, -1.0, 1.0)
61 | image_tensor = input_image.data
62 | else:
63 | return input_image
64 | image_numpy = image_tensor[bs].cpu().float().numpy()
65 | if image_numpy.shape[0] == 1:
66 | image_numpy = np.tile(image_numpy, (3, 1, 1))
67 | image_numpy = (np.transpose(image_numpy, (1, 2, 0)) + 1) / 2.0 * 255.0
68 | return image_numpy.astype(imtype)
69 |
70 |
71 | def show_mask(mask, bs=0):
72 | image_tensor = mask.data
73 | image_tensor = image_tensor[bs:bs+1, ...].cpu()
74 | mask_image = torch.ones(image_tensor.shape, dtype=torch.float32)
75 | mask_image = torch.where(image_tensor, torch.ones_like(mask_image), torch.zeros_like(mask_image))
76 | mask_image = mask_image.cpu().squeeze(0).numpy()
77 | mask_image = mask_image * 255
78 | mask_image = mask_image.astype(np.uint8)
79 | mask_image = cv2.cvtColor(mask_image, cv2.COLOR_GRAY2RGB)
80 | return mask_image
81 |
82 |
83 | def save_image(image_numpy, image_path, aspect_ratio=1.0):
84 | """Save a numpy image to the disk
85 |
86 | Parameters:
87 | image_numpy (numpy array) -- input numpy array
88 | image_path (str) -- the path of the image
89 | """
90 |
91 | image_pil = Image.fromarray(image_numpy)
92 | h, w, _ = image_numpy.shape
93 |
94 | if aspect_ratio > 1.0:
95 | image_pil = image_pil.resize((h, int(w * aspect_ratio)), Image.BICUBIC)
96 | if aspect_ratio < 1.0:
97 | image_pil = image_pil.resize((int(h / aspect_ratio), w), Image.BICUBIC)
98 | image_pil.save(image_path)
99 |
100 |
101 | def save_coeffs(output_file_name, coeffs):
102 | with open(output_file_name, "w") as f:
103 | for coeff in coeffs:
104 | f.write(str(coeff) + "\n")
105 |
106 |
107 | def save_landmarks(output_file_name, landmarks):
108 | with open(output_file_name, "w") as f:
109 | for landmark in landmarks:
110 | f.write(str(landmark[0]) + " " + str(landmark[1]) + "\n")
111 |
--------------------------------------------------------------------------------
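Note: a minimal round-trip sketch using the helpers above. It assumes the repository root is on PYTHONPATH so that `src.util.util` is importable as shown, and the output path is a placeholder. read_target returns a CHW tensor scaled to [-1, 1], and tensor2im indexes a batch dimension, hence the unsqueeze(0).

from src.util import util

img = util.read_target('test_case/source/source.jpg', 256)   # CHW tensor in [-1, 1]
restored = util.tensor2im(img.unsqueeze(0))                   # HWC uint8 image in [0, 255]
util.mkdir('test_case/result')                                # placeholder output directory
util.save_image(restored, 'test_case/result/source_roundtrip.png')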
/src/util/visualizer.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import os
3 | import sys
4 | import time
5 | from . import util
6 | from . import html
7 | from PIL import Image
8 | import visdom
9 |
10 | if sys.version_info[0] == 2:
11 | VisdomExceptionBase = Exception
12 | else:
13 | VisdomExceptionBase = ConnectionError
14 |
15 |
16 | class Visualizer():
17 | def __init__(self, opt):
18 | self.display_id = opt.display_id
19 | self.use_html = opt.isTrain and not opt.no_html
20 | self.win_size = opt.display_winsize
21 | self.name = opt.name
22 | self.opt = opt
23 | self.saved = False
24 | if self.display_id > 0:
25 | self.ncols = opt.display_ncols
26 | self.vis = visdom.Visdom(server=opt.display_server, port=opt.display_port, env=opt.display_env, raise_exceptions=True)
27 | if self.use_html:
28 | self.web_dir = os.path.join(opt.checkpoints_dir, opt.name, 'web')
29 | self.img_dir = os.path.join(self.web_dir, 'images')
30 | print('create web directory %s...' % self.web_dir)
31 | util.mkdirs([self.web_dir, self.img_dir])
32 | self.log_name = os.path.join(opt.checkpoints_dir, opt.name, 'loss_log.txt')
33 | with open(self.log_name, "a") as log_file:
34 | now = time.strftime("%c")
35 | log_file.write('================ Training Loss (%s) ================\n' % now)
36 |
37 | def reset(self):
38 | self.saved = False
39 |
40 | def throw_visdom_connection_error(self):
41 | print('\n\nCould not connect to Visdom server (https://github.com/facebookresearch/visdom) for displaying training progress.\nYou can suppress connection to Visdom using the option --display_id -1. To install visdom, run \n$ pip install visdom\n, and start the server by \n$ python -m visdom.server.\n\n')
42 | exit(1)
43 |
44 | # |visuals|: dictionary of images to display or save
45 | def display_current_results(self, visuals, epoch, save_result, aspect_ratio=1.0, width=256):
46 | if self.display_id > 0: # show images in the browser
47 | ncols = self.ncols
48 | if ncols > 0:
49 | ncols = min(ncols, len(visuals))
50 | h, w = next(iter(visuals.values())).shape[2:4]
51 | # h, w = 256,256
52 | height = int(width * h / float(w))
53 | h = height
54 | w = width
55 | table_css = """""" % (w, h)
59 | title = self.name
60 | label_html = ''
61 | label_html_row = ''
62 | images = []
63 | idx = 0
64 | for label, image in visuals.items():
65 | if label == 'drv_face_mask':
66 | image_numpy = util.show_mask(image)
67 | else:
68 | image_numpy = util.tensor2im(image, np.uint8)
69 | image_numpy = np.array(Image.fromarray(image_numpy).resize((h, w)))
70 | image_numpy = image_numpy.transpose([2, 0, 1])
71 | label_html_row += '<td>%s</td>' % label
72 | images.append(image_numpy)
73 | idx += 1
74 | if idx % ncols == 0:
75 | label_html += '<tr>%s</tr>' % label_html_row
76 | label_html_row = ''
77 | white_image = np.ones_like(image_numpy) * 255
78 | while idx % ncols != 0:
79 | images.append(white_image)
80 | label_html_row += '<td></td>'
81 | idx += 1
82 | if label_html_row != '':
83 | label_html += '<tr>%s</tr>' % label_html_row
84 | # pane col = image row
85 | try:
86 | self.vis.images(images, nrow=ncols, win=self.display_id + 1, padding=2, opts=dict(title=title + ' images'))
87 | label_html = '<table>%s</table>' % label_html
88 | self.vis.text(table_css + label_html, win=self.display_id + 2,
89 | opts=dict(title=title + ' labels'))
90 | except VisdomExceptionBase:
91 | self.throw_visdom_connection_error()
92 |
93 | else:
94 | idx = 1
95 | for label, image in visuals.items():
96 | image_numpy = util.tensor2im(image)
97 | self.vis.image(image_numpy.transpose([2, 0, 1]), opts=dict(title=label),
98 | win=self.display_id + idx)
99 | idx += 1
100 |
101 | if self.use_html and (save_result or not self.saved): # save images to a html file
102 | self.saved = True
103 | for label, image in visuals.items():
104 | image_numpy = util.tensor2im(image)
105 | img_path = os.path.join(self.img_dir, 'epoch%.3d_%s.png' % (epoch, label))
106 | util.save_image(image_numpy, img_path)
107 | # update website
108 | webpage = html.HTML(self.web_dir, 'Experiment name = %s' % self.name, refresh=1)
109 | for n in range(epoch, 0, -1):
110 | webpage.add_header('epoch [%d]' % n)
111 | ims, txts, links = [], [], []
112 |
113 | for label, image_numpy in visuals.items():
114 | image_numpy = util.tensor2im(image_numpy)
115 | img_path = 'epoch%.3d_%s.png' % (n, label)
116 | ims.append(img_path)
117 | txts.append(label)
118 | links.append(img_path)
119 | webpage.add_images(ims, txts, links, width=self.win_size)
120 | webpage.save()
121 |
122 | # losses: dictionary of error labels and values
123 | def plot_current_losses(self, epoch, counter_ratio, opt, losses):
124 | if not hasattr(self, 'plot_data'):
125 | self.plot_data = {'X': [], 'Y': [], 'legend': list(losses.keys())}
126 | self.plot_data['X'].append(epoch + counter_ratio)
127 | self.plot_data['Y'].append([losses[k] for k in self.plot_data['legend']])
128 | try:
129 | self.vis.line(
130 | X=np.stack([np.array(self.plot_data['X'])] * len(self.plot_data['legend']), 1),
131 | Y=np.array(self.plot_data['Y']),
132 | opts={
133 | 'title': self.name + ' loss over time',
134 | 'legend': self.plot_data['legend'],
135 | 'xlabel': 'epoch',
136 | 'ylabel': 'loss'},
137 | win=self.display_id)
138 | except VisdomExceptionBase:
139 | self.throw_visdom_connection_error()
140 |
141 | # losses: same format as |losses| of plot_current_losses
142 | def print_current_losses(self, epoch, i, losses, t, t_data):
143 | message = '(epoch: %d, iters: %d, time: %.3f, data: %.3f) ' % (epoch, i, t, t_data)
144 | for k, v in losses.items():
145 | message += '%s: %.3f ' % (k, v)
146 |
147 | print(message)
148 | with open(self.log_name, "a") as log_file:
149 | log_file.write('%s\n' % message)
150 |
151 |
152 | # losses: dictionary of error labels and values
153 | def plot_current_validation_error(self, epoch, counter_ratio, losses):
154 | if not hasattr(self, 'plot_validation_data'):
155 | self.plot_validation_data = {'X': [], 'Y': [], 'legend': list(losses.keys())}
156 | self.plot_validation_data['X'].append(epoch + counter_ratio)
157 | self.plot_validation_data['Y'].append([losses[k] for k in self.plot_validation_data['legend']])
158 | try:
159 | self.vis.line(
160 | X=np.stack([np.array(self.plot_validation_data['X'])] * len(self.plot_validation_data['legend']), 1),
161 | Y=np.array(self.plot_validation_data['Y']),
162 | opts={
163 | 'title': self.name + ' validation error over time',
164 | 'legend': self.plot_validation_data['legend'],
165 | 'xlabel': 'epoch',
166 | 'ylabel': 'error'},
167 | win=self.display_id+1)
168 | except VisdomExceptionBase:
169 | self.throw_visdom_connection_error()
--------------------------------------------------------------------------------
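Note: a hedged construction example for Visualizer. The real option object comes from src/options/parse_config.py; the import path and attribute values below are placeholders chosen only to satisfy the fields __init__ reads. With display_id <= 0 the Visdom panels are skipped, so only the HTML snapshot and the loss log under checkpoints_dir are exercised.

from types import SimpleNamespace
from src.util.visualizer import Visualizer

opt = SimpleNamespace(
    display_id=-1,                     # <= 0 disables the Visdom panels
    isTrain=True,
    no_html=False,                     # with isTrain, enables the HTML/image snapshot
    display_winsize=256,
    display_ncols=4,
    display_server='http://localhost',
    display_port=8097,
    display_env='main',
    checkpoints_dir='./checkpoints',
    name='face2facerho_debug',
)

vis = Visualizer(opt)
vis.print_current_losses(epoch=1, i=100, losses={'G': 0.5, 'D': 0.3}, t=0.12, t_data=0.01)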
/test_case/driving/FLAME/headpose.txt:
--------------------------------------------------------------------------------
1 | 0.13625283539295197
2 | 0.21206384897232056
3 | -0.007699888199567795
4 | 0.28585067896505223
5 | -0.2442243018234897
6 | 0.0011781832587462737
7 |
--------------------------------------------------------------------------------
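Note: each headpose.txt stores six floats, one per line, which load_coeffs (src/util/util.py) reads back as a flat list. Splitting them into a rotation-like triple and a translation-like triple, as sketched below, is an assumption about the layout rather than something the file itself states; the import path assumes the repository root is on PYTHONPATH.

from src.util.util import load_coeffs

headpose = load_coeffs('test_case/driving/FLAME/headpose.txt')  # six floats, one per line
rotation_like, translation_like = headpose[:3], headpose[3:]    # assumed split, see note above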
/test_case/driving/FLAME/landmark.txt:
--------------------------------------------------------------------------------
1 | -0.31516164541244507 -0.179478719830513
2 | -0.30770164728164673 -0.07279782742261887
3 | -0.29041358828544617 0.030962064862251282
4 | -0.273349791765213 0.109856516122818
5 | -0.2382015734910965 0.1876646876335144
6 | -0.16550007462501526 0.24567601084709167
7 | -0.09306877106428146 0.281278133392334
8 | -0.007662743330001831 0.32588326930999756
9 | 0.07144586741924286 0.3382234573364258
10 | 0.3539423644542694 -0.18897154927253723
11 | 0.34861069917678833 -0.08055130392313004
12 | 0.329071581363678 0.02442900836467743
13 | 0.31596314907073975 0.10410526394844055
14 | 0.28633609414100647 0.18222516775131226
15 | 0.25385111570358276 0.24201679229736328
16 | 0.20868098735809326 0.2786073088645935
17 | 0.14341698586940765 0.32407960295677185
18 | -0.20078876614570618 -0.2900853157043457
19 | -0.1561872661113739 -0.300489604473114
20 | -0.10244493931531906 -0.3071072995662689
21 | -0.051480911672115326 -0.3058977425098419
22 | 0.014424465596675873 -0.2898973226547241
23 | 0.1638312041759491 -0.2904817461967468
24 | 0.22384917736053467 -0.3062269687652588
25 | 0.26421189308166504 -0.3074803352355957
26 | 0.29964524507522583 -0.30190491676330566
27 | 0.3224305212497711 -0.29381853342056274
28 | 0.089569091796875 -0.20857898890972137
29 | 0.09757955372333527 -0.14829565584659576
30 | 0.10460248589515686 -0.10214441269636154
31 | 0.1119510680437088 -0.05317731201648712
32 | 0.029375575482845306 -0.0040546804666519165
33 | 0.060426533222198486 0.0071538835763931274
34 | 0.0912569910287857 0.015764638781547546
35 | 0.11840483546257019 0.006637006998062134
36 | 0.141998291015625 -0.004986941814422607
37 | -0.14466696977615356 -0.18221096694469452
38 | -0.12468508630990982 -0.18699705600738525
39 | -0.09439322352409363 -0.18846870958805084
40 | -0.05997729301452637 -0.18889926373958588
41 | -0.017359264194965363 -0.1823589950799942
42 | -0.05062717944383621 -0.17332319915294647
43 | -0.10127006471157074 -0.1677996665239334
44 | -0.13213858008384705 -0.17395025491714478
45 | 0.15765415132045746 -0.1818913072347641
46 | 0.20564237236976624 -0.1873338520526886
47 | 0.23834621906280518 -0.1874956637620926
48 | 0.2605167031288147 -0.18757116794586182
49 | 0.2730824053287506 -0.1842278093099594
50 | 0.26508164405822754 -0.17632071673870087
51 | 0.24249637126922607 -0.16950492560863495
52 | 0.19319242238998413 -0.17441074550151825
53 | -0.04412370175123215 0.11576831340789795
54 | 0.057739436626434326 0.07976815104484558
55 | -0.0005059316754341125 0.08937084674835205
56 | -0.04149330407381058 0.11526310443878174
57 | 0.007847115397453308 0.10679468512535095
58 | 0.19560709595680237 0.11232137680053711
59 | 0.12210072576999664 0.07869571447372437
60 | 0.1672673374414444 0.08678025007247925
61 | 0.19456690549850464 0.11193358898162842
62 | 0.150030717253685 0.1048298180103302
63 | 0.09031131863594055 0.08315038681030273
64 | 0.08297209441661835 0.10099273920059204
65 | 0.14698274433612823 0.12132978439331055
66 | 0.17087598145008087 0.143466979265213
67 | 0.1295694261789322 0.15547624230384827
68 | 0.02747763693332672 0.12202847003936768
69 | 0.0011781379580497742 0.14499148726463318
70 | 0.05990856885910034 0.15536382794380188
71 | 0.09299695491790771 0.1228334903717041
72 | 0.09577301144599915 0.15782442688941956
73 |
--------------------------------------------------------------------------------
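Note: each landmark.txt stores 72 two-dimensional landmarks as space-separated "x y" pairs, which load_landmarks returns as a list of [x, y] floats. The values sit roughly in [-1, 1]; mapping them onto a size x size pixel grid as below assumes that normalization convention, and the resolution is a placeholder.

import numpy as np
from src.util.util import load_landmarks

landmarks = np.array(load_landmarks('test_case/driving/FLAME/landmark.txt'))  # shape (72, 2)
size = 256                                                        # placeholder resolution
pixels = ((landmarks + 1.0) * 0.5 * (size - 1)).astype(np.int32)  # assumed [-1, 1] convention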
/test_case/driving/driving.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/test_case/driving/driving.jpg
--------------------------------------------------------------------------------
/test_case/driving/original/headpose.txt:
--------------------------------------------------------------------------------
1 | -0.0358294
2 | 0.193555
3 | 0.0186145
4 | 0.279976
5 | -0.240904
6 | 0.00114472
7 |
--------------------------------------------------------------------------------
/test_case/driving/original/landmark.txt:
--------------------------------------------------------------------------------
1 | -0.32812726497650146 -0.17575065791606903
2 | -0.3193890452384949 -0.09410519152879715
3 | -0.30486586689949036 -0.00901447981595993
4 | -0.2908157706260681 0.06765569001436234
5 | -0.2622595429420471 0.15148618817329407
6 | -0.19879738986492157 0.2215268909931183
7 | -0.12630710005760193 0.2841624915599823
8 | -0.04811964929103851 0.33685097098350525
9 | 0.05803518369793892 0.35090646147727966
10 | 0.3353678286075592 -0.18283289670944214
11 | 0.3339826464653015 -0.10109979659318924
12 | 0.3243960738182068 -0.01574944518506527
13 | 0.318934828042984 0.06125636026263237
14 | 0.30464500188827515 0.14568272233009338
15 | 0.26899731159210205 0.2171289026737213
16 | 0.21716704964637756 0.2810239791870117
17 | 0.15250642597675323 0.3350328505039215
18 | -0.18008267879486084 -0.2903229892253876
19 | -0.14966793358325958 -0.31077027320861816
20 | -0.09212638437747955 -0.32110685110092163
21 | -0.05061572417616844 -0.3146970570087433
22 | 0.005072138272225857 -0.2991345524787903
23 | 0.12899674475193024 -0.3003069758415222
24 | 0.180558979511261 -0.31509026885032654
25 | 0.21502850949764252 -0.320447713136673
26 | 0.2612355649471283 -0.3097497820854187
27 | 0.2836766242980957 -0.2907673716545105
28 | 0.06661558896303177 -0.22068162262439728
29 | 0.0714191347360611 -0.17015986144542694
30 | 0.07822294533252716 -0.12035435438156128
31 | 0.08429558575153351 -0.0610278882086277
32 | 0.004675115458667278 -0.005552200134843588
33 | 0.03847338631749153 -0.0015566380461677909
34 | 0.07607473433017731 0.0036368665751069784
35 | 0.10979848355054855 -0.002425919519737363
36 | 0.1380128711462021 -0.006990542635321617
37 | -0.14075849950313568 -0.18631498515605927
38 | -0.11834703385829926 -0.1869308054447174
39 | -0.08572912961244583 -0.18787769973278046
40 | -0.05379549786448479 -0.1896868199110031
41 | -0.029561372473835945 -0.18963144719600677
42 | -0.05296739563345909 -0.1862848401069641
43 | -0.07900605350732803 -0.18137046694755554
44 | -0.10806751996278763 -0.18178780376911163
45 | 0.14968648552894592 -0.19157154858112335
46 | 0.16921043395996094 -0.1928856521844864
47 | 0.19890174269676208 -0.19118916988372803
48 | 0.22966940701007843 -0.1910347193479538
49 | 0.25184503197669983 -0.19051343202590942
50 | 0.22667716443538666 -0.18536078929901123
51 | 0.20031540095806122 -0.1843135505914688
52 | 0.1745179444551468 -0.18870475888252258
53 | -0.05384637042880058 0.11597751826047897
54 | 0.05857943370938301 0.06649796664714813
55 | -0.0063765887171030045 0.08337722718715668
56 | -0.04636808857321739 0.11252179741859436
57 | 0.01774972304701805 0.1061658039689064
58 | 0.19657620787620544 0.11519702523946762
59 | 0.11550715565681458 0.06598487496376038
60 | 0.1680016964673996 0.08212072402238846
61 | 0.18980461359024048 0.11198490113019943
62 | 0.14404307305812836 0.10582582652568817
63 | 0.0883045643568039 0.07205545902252197
64 | 0.08565974235534668 0.1076158881187439
65 | 0.1478145271539688 0.11648828536272049
66 | 0.16730821132659912 0.1569463312625885
67 | 0.1291319578886032 0.17184729874134064
68 | 0.015439774841070175 0.11658688634634018
69 | -0.009288492612540722 0.15742817521095276
70 | 0.039102375507354736 0.1718333214521408
71 | 0.0865754559636116 0.1208595335483551
72 | 0.08545981347560883 0.17389750480651855
73 |
--------------------------------------------------------------------------------
/test_case/source/FLAME/headpose.txt:
--------------------------------------------------------------------------------
1 | 0.1681414097547531
2 | 0.2536787986755371
3 | 0.06973648071289062
4 | 0.28077412562032567
5 | -0.25723848868262056
6 | 0.0012728127173397697
7 |
--------------------------------------------------------------------------------
/test_case/source/FLAME/landmark.txt:
--------------------------------------------------------------------------------
1 | -0.3915994167327881 -0.1150280013680458
2 | -0.37632814049720764 8.036196231842041e-05
3 | -0.3505066931247711 0.11054354906082153
4 | -0.3252720236778259 0.19552060961723328
5 | -0.28039902448654175 0.27713948488235474
6 | -0.19613677263259888 0.34234046936035156
7 | -0.11438309401273727 0.38312917947769165
8 | -0.01678033173084259 0.43073731660842896
9 | 0.07022051513195038 0.43742769956588745
10 | 0.3226321339607239 -0.1867319792509079
11 | 0.3261435925960541 -0.06925588846206665
12 | 0.31598329544067383 0.04439428448677063
13 | 0.3083377480506897 0.13193431496620178
14 | 0.28226298093795776 0.21942484378814697
15 | 0.2552831470966339 0.29522982239723206
16 | 0.21195143461227417 0.34722304344177246
17 | 0.1464998871088028 0.4113011360168457
18 | -0.2687588632106781 -0.22776859998703003
19 | -0.21894890069961548 -0.2390923947095871
20 | -0.15942734479904175 -0.24691247940063477
21 | -0.1039041206240654 -0.2479831427335739
22 | -0.03141304850578308 -0.2352439910173416
23 | 0.12426505982875824 -0.24792760610580444
24 | 0.18584638833999634 -0.27190640568733215
25 | 0.22782588005065918 -0.2801172733306885
26 | 0.265785276889801 -0.28296464681625366
27 | 0.2891693413257599 -0.2797398269176483
28 | 0.05311575531959534 -0.16408880054950714
29 | 0.06664858758449554 -0.10286129266023636
30 | 0.0790686160326004 -0.05173478275537491
31 | 0.09205496311187744 0.002275601029396057
32 | 0.0029867887496948242 0.05675312876701355
33 | 0.03786562383174896 0.06785309314727783
34 | 0.07199119031429291 0.07482191920280457
35 | 0.1002703458070755 0.061978861689567566
36 | 0.12310999631881714 0.04548470675945282
37 | -0.20481428503990173 -0.13162489235401154
38 | -0.1838582158088684 -0.14527401328086853
39 | -0.15007027983665466 -0.15455220639705658
40 | -0.10895757377147675 -0.15663555264472961
41 | -0.06435224413871765 -0.13814763724803925
42 | -0.09962136298418045 -0.12134777009487152
43 | -0.15342330932617188 -0.11249685287475586
44 | -0.19158394634723663 -0.1175440102815628
45 | 0.1246703565120697 -0.15406309068202972
46 | 0.1749478280544281 -0.178548201918602
47 | 0.21306371688842773 -0.1833753138780594
48 | 0.23675093054771423 -0.18115878105163574
49 | 0.24895545840263367 -0.1723051518201828
50 | 0.24317428469657898 -0.15752866864204407
51 | 0.21697154641151428 -0.14639832079410553
52 | 0.16456133127212524 -0.1455719769001007
53 | -0.07869639247655869 0.1957665979862213
54 | 0.038986608386039734 0.1469036042690277
55 | -0.026080477982759476 0.16074302792549133
56 | -0.07600720226764679 0.19541475176811218
57 | -0.01719493418931961 0.17700600624084473
58 | 0.18652254343032837 0.16827836632728577
59 | 0.10776787996292114 0.14032906293869019
60 | 0.1554509550333023 0.1435796022415161
61 | 0.18522456288337708 0.16786694526672363
62 | 0.1377050131559372 0.16322994232177734
63 | 0.07434439659118652 0.14701855182647705
64 | 0.06612437963485718 0.16539010405540466
65 | 0.1325272023677826 0.19573479890823364
66 | 0.1618194282054901 0.21479544043540955
67 | 0.1177782267332077 0.2364625632762909
68 | 0.0017322301864624023 0.2091311514377594
69 | -0.02514980360865593 0.23496738076210022
70 | 0.041609689593315125 0.24414023756980896
71 | 0.07389257848262787 0.2068658173084259
72 | 0.08118276298046112 0.24355071783065796
73 |
--------------------------------------------------------------------------------
/test_case/source/original/headpose.txt:
--------------------------------------------------------------------------------
1 | 0.094624
2 | 0.224248
3 | 0.0906196
4 | 0.274085
5 | -0.258861
6 | 0.00126695
7 |
--------------------------------------------------------------------------------
/test_case/source/original/landmark.txt:
--------------------------------------------------------------------------------
1 | -0.41398540139198303 -0.13855670392513275
2 | -0.39786043763160706 -0.04808853939175606
3 | -0.37566015124320984 0.04454704001545906
4 | -0.35401850938796997 0.12935957312583923
5 | -0.31501394510269165 0.22222258150577545
6 | -0.23676329851150513 0.3022463917732239
7 | -0.15003231167793274 0.37161487340927124
8 | -0.058873869478702545 0.426936537027359
9 | 0.05960548296570778 0.43604427576065063
10 | 0.30743855237960815 -0.21318182349205017
11 | 0.3125551640987396 -0.12159918248653412
12 | 0.3085859715938568 -0.026257745921611786
13 | 0.3093137741088867 0.06087388098239899
14 | 0.30156058073043823 0.1588819921016693
15 | 0.27255749702453613 0.25048595666885376
16 | 0.22408780455589294 0.3336263597011566
17 | 0.1595725566148758 0.4046099781990051
18 | -0.2424519956111908 -0.23273399472236633
19 | -0.20851489901542664 -0.2534434199333191
20 | -0.14444002509117126 -0.26778602600097656
21 | -0.09849899262189865 -0.2647925019264221
22 | -0.037037111818790436 -0.2545986771583557
23 | 0.0974816381931305 -0.2677493095397949
24 | 0.1518627256155014 -0.29011762142181396
25 | 0.18786311149597168 -0.3016672432422638
26 | 0.23710954189300537 -0.29916146397590637
27 | 0.26130804419517517 -0.28458142280578613
28 | 0.03486708551645279 -0.17625953257083893
29 | 0.045056603848934174 -0.11815688014030457
30 | 0.05814138054847717 -0.059055134654045105
31 | 0.07038316875696182 0.00965171679854393
32 | -0.018880583345890045 0.06478478014469147
33 | 0.01923198252916336 0.06748099625110626
34 | 0.06152241677045822 0.07035651803016663
35 | 0.09818808734416962 0.05957295000553131
36 | 0.1281459629535675 0.050056032836437225
37 | -0.20204558968544006 -0.13417282700538635
38 | -0.17514348030090332 -0.15104661881923676
39 | -0.13936007022857666 -0.1594904065132141
40 | -0.1013808473944664 -0.1585645079612732
41 | -0.06935463100671768 -0.13894785940647125
42 | -0.10236678272485733 -0.1264077126979828
43 | -0.13617637753486633 -0.12006562948226929
44 | -0.16964977979660034 -0.12204238772392273
45 | 0.12423384189605713 -0.1589944213628769
46 | 0.14868421852588654 -0.1803821623325348
47 | 0.18442374467849731 -0.19259406626224518
48 | 0.21803560853004456 -0.19426123797893524
49 | 0.2399219572544098 -0.17988036572933197
50 | 0.2194332480430603 -0.16232435405254364
51 | 0.1926254779100418 -0.15410242974758148
52 | 0.15961241722106934 -0.1535251885652542
53 | -0.08823604136705399 0.1933051198720932
54 | 0.037760909646749496 0.13411930203437805
55 | -0.03670395910739899 0.15546481311321259
56 | -0.08000732213258743 0.18889179825782776
57 | -0.00799543410539627 0.17504075169563293
58 | 0.19321969151496887 0.16880981624126434
59 | 0.10224615037441254 0.12806382775306702
60 | 0.16135382652282715 0.13724902272224426
61 | 0.18520762026309967 0.1661205142736435
62 | 0.13593405485153198 0.16213418543338776
63 | 0.07198242098093033 0.13796880841255188
64 | 0.0702238604426384 0.17165715992450714
65 | 0.1410117745399475 0.1864575743675232
66 | 0.16403953731060028 0.22564265131950378
67 | 0.12338931858539581 0.25030457973480225
68 | -0.007767893373966217 0.1993907243013382
69 | -0.03333708643913269 0.24307218194007874
70 | 0.023107167333364487 0.2589821517467499
71 | 0.0735296830534935 0.20158612728118896
72 | 0.07525938749313354 0.25801753997802734
73 |
--------------------------------------------------------------------------------
/test_case/source/source.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/test_case/source/source.jpg
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/headpose/150.txt:
--------------------------------------------------------------------------------
1 | 0.156604
2 | -0.230057
3 | 0.29011
4 | 0.220425
5 | -0.312521
6 | 0.00194483
7 |
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/headpose/54.txt:
--------------------------------------------------------------------------------
1 | 0.131516
2 | -0.205836
3 | 0.193343
4 | 0.228106
5 | -0.293041
6 | 0.00190257
7 |
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/img/150.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/img/150.jpg
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/img/54.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/img/54.jpg
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/landmark/150.txt:
--------------------------------------------------------------------------------
1 | -0.568425714969635 0.02589094638824463
2 | -0.5238426923751831 0.15984371304512024
3 | -0.4607754945755005 0.2932114899158478
4 | -0.4068068861961365 0.42605629563331604
5 | -0.3416854739189148 0.5617364645004272
6 | -0.254992812871933 0.6702439785003662
7 | -0.15150493383407593 0.7649673223495483
8 | -0.028679929673671722 0.8329890370368958
9 | 0.13000307977199554 0.8273522853851318
10 | 0.46755319833755493 -0.2684306800365448
11 | 0.4962214231491089 -0.12998950481414795
12 | 0.5162476301193237 0.015611246228218079
13 | 0.5354956984519958 0.1588059961795807
14 | 0.532108724117279 0.31500229239463806
15 | 0.46318209171295166 0.4704422950744629
16 | 0.3857147693634033 0.6180211305618286
17 | 0.29314345121383667 0.7472214698791504
18 | -0.5722818970680237 -0.09347942471504211
19 | -0.5491307377815247 -0.13007354736328125
20 | -0.47550225257873535 -0.1633845567703247
21 | -0.41378623247146606 -0.1721431016921997
22 | -0.32607191801071167 -0.1741737723350525
23 | -0.1536448895931244 -0.22927811741828918
24 | -0.07168544083833694 -0.2739851772785187
25 | -0.0026913434267044067 -0.3017808794975281
26 | 0.10618767142295837 -0.3195626735687256
27 | 0.17112918198108673 -0.3071504831314087
28 | -0.1985154151916504 -0.07722553610801697
29 | -0.18322625756263733 0.01261092722415924
30 | -0.16710540652275085 0.11665801703929901
31 | -0.14298115670681 0.23547233641147614
32 | -0.19555987417697906 0.31517547369003296
33 | -0.14759938418865204 0.31499534845352173
34 | -0.09060841053724289 0.3149871230125427
35 | -0.03476393222808838 0.2837763726711273
36 | 0.018790408968925476 0.2559306025505066
37 | -0.48085927963256836 -0.0003206878900527954
38 | -0.4648837447166443 -0.031194746494293213
39 | -0.422918438911438 -0.050164878368377686
40 | -0.36573654413223267 -0.056621164083480835
41 | -0.31113848090171814 -0.033762574195861816
42 | -0.35314321517944336 -0.005850285291671753
43 | -0.39823031425476074 0.013135358691215515
44 | -0.4398861527442932 0.0161551833152771
45 | -0.03688753396272659 -0.11144664883613586
46 | -0.01981893926858902 -0.14834806323051453
47 | 0.02133864164352417 -0.17553764581680298
48 | 0.07874740660190582 -0.18968263268470764
49 | 0.14956389367580414 -0.1794232726097107
50 | 0.10819225013256073 -0.13965561985969543
51 | 0.06364291906356812 -0.11864683032035828
52 | 0.0136423259973526 -0.1098666787147522
53 | -0.18786075711250305 0.5219935774803162
54 | -0.10953058302402496 0.4380062222480774
55 | -0.17474597692489624 0.4775727689266205
56 | -0.17920799553394318 0.514659583568573
57 | -0.12799328565597534 0.4940014183521271
58 | 0.1604418009519577 0.4297633767127991
59 | -0.02031145989894867 0.41321608424186707
60 | 0.08209000527858734 0.4081108570098877
61 | 0.14822788536548615 0.4284103810787201
62 | 0.05698837339878082 0.4452410340309143
63 | -0.06325575709342957 0.4370177984237671
64 | -0.040912143886089325 0.4773541986942291
65 | 0.06541876494884491 0.4649183750152588
66 | 0.1210654228925705 0.5131053328514099
67 | 0.06025557219982147 0.5552003383636475
68 | -0.11841686069965363 0.5124952793121338
69 | -0.12994122505187988 0.5768686532974243
70 | -0.07111692428588867 0.5869581699371338
71 | -0.03216373175382614 0.49953222274780273
72 | -0.006496444344520569 0.5757865309715271
73 |
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/landmark/54.txt:
--------------------------------------------------------------------------------
1 | -0.5328267216682434 -0.0723910927772522
2 | -0.5018036365509033 0.06208936125040054
3 | -0.45313066244125366 0.1979004144668579
4 | -0.4132305383682251 0.3302344083786011
5 | -0.3626813292503357 0.4664897918701172
6 | -0.2858491539955139 0.5769708156585693
7 | -0.1896677017211914 0.6799073815345764
8 | -0.0726291611790657 0.7585027813911438
9 | 0.08300289511680603 0.7647232413291931
10 | 0.5111780166625977 -0.2648939788341522
11 | 0.5261664390563965 -0.12748831510543823
12 | 0.531586229801178 0.016316726803779602
13 | 0.5370694994926453 0.1567668914794922
14 | 0.5192488431930542 0.30695825815200806
15 | 0.44011157751083374 0.4487931430339813
16 | 0.3516508936882019 0.5847730040550232
17 | 0.24884922802448273 0.7027948498725891
18 | -0.508565366268158 -0.19433322548866272
19 | -0.4805501699447632 -0.22859832644462585
20 | -0.4023595452308655 -0.2544301450252533
21 | -0.33930259943008423 -0.25589197874069214
22 | -0.25241878628730774 -0.2473609745502472
23 | -0.08189016580581665 -0.28887394070625305
24 | 0.00226447731256485 -0.3264678120613098
25 | 0.07184536755084991 -0.34871211647987366
26 | 0.17850489914417267 -0.3567420244216919
27 | 0.23930494487285614 -0.33791759610176086
28 | -0.14098984003067017 -0.14685529470443726
29 | -0.1338806003332138 -0.06055460870265961
30 | -0.1265304684638977 0.04188540577888489
31 | -0.1136949434876442 0.16041991114616394
32 | -0.18069304525852203 0.23102693259716034
33 | -0.13256478309631348 0.23521006107330322
34 | -0.07499934732913971 0.24042908847332
35 | -0.015736736357212067 0.21636074781417847
36 | 0.0397486686706543 0.19484290480613708
37 | -0.43493765592575073 -0.10025641322135925
38 | -0.41847842931747437 -0.13559266924858093
39 | -0.3767554759979248 -0.15263423323631287
40 | -0.31769585609436035 -0.15196728706359863
41 | -0.2587515413761139 -0.11556455492973328
42 | -0.31028175354003906 -0.08304101228713989
43 | -0.3602960705757141 -0.07066814601421356
44 | -0.40067577362060547 -0.0752880722284317
45 | 0.017254754900932312 -0.16644325852394104
46 | 0.03816370666027069 -0.2095049023628235
47 | 0.0817098468542099 -0.23707088828086853
48 | 0.13946031033992767 -0.24436751008033752
49 | 0.20904017984867096 -0.21885281801223755
50 | 0.17498789727687836 -0.1798402965068817
51 | 0.1305464804172516 -0.16024965047836304
52 | 0.07684198021888733 -0.15424272418022156
53 | -0.2012825310230255 0.4252326190471649
54 | -0.11096867173910141 0.34387582540512085
55 | -0.18156373500823975 0.3784526586532593
56 | -0.19107991456985474 0.41860517859458923
57 | -0.13880407810211182 0.39680421352386475
58 | 0.1645442694425583 0.360065758228302
59 | -0.018234863877296448 0.32719436287879944
60 | 0.08877328038215637 0.33039039373397827
61 | 0.15258203446865082 0.3573301434516907
62 | 0.05606567859649658 0.3614235520362854
63 | -0.06422337889671326 0.34662073850631714
64 | -0.04901714622974396 0.3844396471977234
65 | 0.06316450238227844 0.39063721895217896
66 | 0.11141422390937805 0.4393317699432373
67 | 0.044770583510398865 0.4749511480331421
68 | -0.12722159922122955 0.4254927337169647
69 | -0.14531119167804718 0.4844479560852051
70 | -0.0882822573184967 0.4978744089603424
71 | -0.039423272013664246 0.417609840631485
72 | -0.023823358118534088 0.4912562966346741
73 |
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/mask/150.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/mask/150.png
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/mask/54.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/mask/54.png
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/headpose/82.txt:
--------------------------------------------------------------------------------
1 | 0.0632977
2 | -0.149218
3 | 0.0846878
4 | 0.245892
5 | -0.229037
6 | 0.00108858
7 |
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/headpose/89.txt:
--------------------------------------------------------------------------------
1 | -0.0103226
2 | 0.0244929
3 | 0.0211892
4 | 0.289309
5 | -0.216485
6 | 0.00109599
7 |
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/img/82.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/img/82.jpg
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/img/89.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/img/89.jpg
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/landmark/82.txt:
--------------------------------------------------------------------------------
1 | -0.303331583738327 -0.233467698097229
2 | -0.2927420735359192 -0.15596948564052582
3 | -0.27413156628608704 -0.07600411772727966
4 | -0.2593475580215454 0.0007681772112846375
5 | -0.23581907153129578 0.08137907832860947
6 | -0.19255167245864868 0.1481340229511261
7 | -0.13532105088233948 0.2038843035697937
8 | -0.06805747002363205 0.24654272198677063
9 | 0.019405052065849304 0.2546466886997223
10 | 0.3265819847583771 -0.2844737470149994
11 | 0.32591888308525085 -0.20606783032417297
12 | 0.3201008439064026 -0.12412900477647781
13 | 0.3144366443157196 -0.04574782773852348
14 | 0.29549098014831543 0.03825075179338455
15 | 0.2432161569595337 0.11261963099241257
16 | 0.18111205101013184 0.17802658677101135
17 | 0.1129712238907814 0.2316981852054596
18 | -0.2598797678947449 -0.31481245160102844
19 | -0.24004164338111877 -0.3338385820388794
20 | -0.19487902522087097 -0.3469902276992798
21 | -0.15954212844371796 -0.3452047109603882
22 | -0.10769272595643997 -0.3362177312374115
23 | 0.00960732251405716 -0.3468177020549774
24 | 0.061776503920555115 -0.3639221787452698
25 | 0.10131721943616867 -0.3716222643852234
26 | 0.15790408849716187 -0.36651837825775146
27 | 0.18842813372612 -0.3514263927936554
28 | -0.04102310910820961 -0.2670649290084839
29 | -0.039631668478250504 -0.2162145972251892
30 | -0.04007062315940857 -0.16453418135643005
31 | -0.03915104269981384 -0.10569751262664795
32 | -0.08614125847816467 -0.05491805821657181
33 | -0.05807076767086983 -0.052395183593034744
34 | -0.02641167864203453 -0.048068370670080185
35 | 0.006891876459121704 -0.05765453726053238
36 | 0.03809083253145218 -0.06505139172077179
37 | -0.2151382863521576 -0.23495230078697205
38 | -0.20378583669662476 -0.24206718802452087
39 | -0.1784401834011078 -0.24663186073303223
40 | -0.14726132154464722 -0.24776050448417664
41 | -0.11646721512079239 -0.2383313775062561
42 | -0.13856452703475952 -0.23524385690689087
43 | -0.16416260600090027 -0.22999557852745056
44 | -0.18897226452827454 -0.22930997610092163
45 | 0.04985576868057251 -0.251802921295166
46 | 0.06622543185949326 -0.26347416639328003
47 | 0.09400001913309097 -0.2688547968864441
48 | 0.12553390860557556 -0.27008354663848877
49 | 0.15456527471542358 -0.2649274170398712
50 | 0.1234506145119667 -0.2546200752258301
51 | 0.0968899354338646 -0.2511388063430786
52 | 0.07148075103759766 -0.2522624433040619
53 | -0.11389538645744324 0.06512745469808578
54 | -0.04827549308538437 0.018069162964820862
55 | -0.09267351776361465 0.03604593127965927
56 | -0.10865690559148788 0.06198366731405258
57 | -0.06910214573144913 0.05240192264318466
58 | 0.09590766578912735 0.046984054148197174
59 | 0.0015379749238491058 0.01383092999458313
60 | 0.0566081777215004 0.023342572152614594
61 | 0.08959364145994186 0.044777028262615204
62 | 0.038133010268211365 0.043131373822689056
63 | -0.02340156026184559 0.02199704200029373
64 | -0.018049370497465134 0.050268061459064484
65 | 0.040446922183036804 0.05117339640855789
66 | 0.06491007655858994 0.08550301939249039
67 | 0.02552839368581772 0.10056950896978378
68 | -0.07054953277111053 0.060900814831256866
69 | -0.0844009518623352 0.09845524281263351
70 | -0.048431023955345154 0.10711487382650375
71 | -0.017524361610412598 0.059963397681713104
72 | -0.011984145268797874 0.10500424355268478
73 |
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/landmark/89.txt:
--------------------------------------------------------------------------------
1 | -0.20474007725715637 -0.268522173166275
2 | -0.19697806239128113 -0.19082148373126984
3 | -0.18260139226913452 -0.10959465801715851
4 | -0.17002439498901367 -0.03243817389011383
5 | -0.14566102623939514 0.04790972173213959
6 | -0.09365737438201904 0.1127336323261261
7 | -0.030290022492408752 0.16876181960105896
8 | 0.04052555561065674 0.2137988805770874
9 | 0.13412050902843475 0.22540783882141113
10 | 0.43853750824928284 -0.28223657608032227
11 | 0.4348083436489105 -0.20429421961307526
12 | 0.42424193024635315 -0.12253983318805695
13 | 0.4158458411693573 -0.044976845383644104
14 | 0.39665839076042175 0.03624339401721954
15 | 0.3511892557144165 0.10304197669029236
16 | 0.29287177324295044 0.16170603036880493
17 | 0.22555701434612274 0.2097836136817932
18 | -0.10753002762794495 -0.36760014295578003
19 | -0.08167213201522827 -0.3875850737094879
20 | -0.02926197648048401 -0.4006659686565399
21 | 0.009437300264835358 -0.39810019731521606
22 | 0.06310754269361496 -0.3868800401687622
23 | 0.18353502452373505 -0.38926300406455994
24 | 0.2362411916255951 -0.402813196182251
25 | 0.2739555239677429 -0.40704137086868286
26 | 0.3252486288547516 -0.3962045907974243
27 | 0.35067373514175415 -0.37733206152915955
28 | 0.12474481761455536 -0.31153035163879395
29 | 0.12640230357646942 -0.261530339717865
30 | 0.1282726377248764 -0.2118726670742035
31 | 0.13027189671993256 -0.15463194251060486
32 | 0.06638786196708679 -0.1010010838508606
33 | 0.09698450565338135 -0.0977640151977539
34 | 0.1304248720407486 -0.09223908185958862
35 | 0.16320598125457764 -0.0992237776517868
36 | 0.1930646449327469 -0.10383424162864685
37 | -0.06575420498847961 -0.2854270339012146
38 | -0.049208953976631165 -0.2951361835002899
39 | -0.020956620573997498 -0.2996317446231842
40 | 0.010934144258499146 -0.29836714267730713
41 | 0.03947927802801132 -0.2840616703033447
42 | 0.01575426757335663 -0.28100720047950745
43 | -0.010776922106742859 -0.27754607796669006
44 | -0.037083253264427185 -0.2784150242805481
45 | 0.2092171311378479 -0.2876892685890198
46 | 0.22892925143241882 -0.300739586353302
47 | 0.25859159231185913 -0.3056843876838684
48 | 0.28926992416381836 -0.3042370676994324
49 | 0.3130781650543213 -0.29351168870925903
50 | 0.28573623299598694 -0.2853110134601593
51 | 0.2598593533039093 -0.2833213210105896
52 | 0.23314571380615234 -0.2856521010398865
53 | 0.02525217831134796 0.02099992334842682
54 | 0.1070152223110199 -0.028147980570793152
55 | 0.05638381093740463 -0.010722845792770386
56 | 0.031153663992881775 0.01803739368915558
57 | 0.07802911847829819 0.007497057318687439
58 | 0.23863503336906433 0.015418171882629395
59 | 0.1577325016260147 -0.029432155191898346
60 | 0.20821356773376465 -0.014503806829452515
61 | 0.23263554275035858 0.012729361653327942
62 | 0.1870465874671936 0.0046945661306381226
63 | 0.13260778784751892 -0.02300858497619629
64 | 0.13275161385536194 0.0071703046560287476
65 | 0.18915385007858276 0.01493680477142334
66 | 0.20920619368553162 0.0515366792678833
67 | 0.17145150899887085 0.06345342099666595
68 | 0.07630160450935364 0.01794058084487915
69 | 0.05715292692184448 0.055464744567871094
70 | 0.09599503874778748 0.06549146771430969
71 | 0.13304740190505981 0.019217073917388916
72 | 0.13377049565315247 0.06540191173553467
73 |
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/mask/82.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/mask/82.png
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/mask/89.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/mask/89.png
--------------------------------------------------------------------------------
/trainingset/VoxCeleb/list.txt:
--------------------------------------------------------------------------------
1 | id10001#7w0IBEWc9Qw#000993#001143
2 | id10009#AtavJVP4bCk#012568#012652
3 |
--------------------------------------------------------------------------------
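Note: list.txt enumerates the training clips, and each clip directory above holds parallel headpose/, img/, landmark/ and mask/ folders keyed by frame id. Below is a hedged walk over that layout using the util helpers; the actual loading logic lives in src/dataset/voxceleb_dataset.py and may differ, and the import path assumes the repository root is on PYTHONPATH.

import os
from src.util.util import make_ids, load_coeffs, load_landmarks

root = 'trainingset/VoxCeleb'
with open(os.path.join(root, 'list.txt')) as f:
    clips = [line.strip() for line in f if line.strip()]

for clip in clips:
    clip_dir = os.path.join(root, clip)
    for frame_id in make_ids(os.path.join(clip_dir, 'img')):          # e.g. [54, 150]
        img_path = os.path.join(clip_dir, 'img', '%d.jpg' % frame_id)
        pose = load_coeffs(os.path.join(clip_dir, 'headpose', '%d.txt' % frame_id))
        lmks = load_landmarks(os.path.join(clip_dir, 'landmark', '%d.txt' % frame_id))
        mask_path = os.path.join(clip_dir, 'mask', '%d.png' % frame_id)
        print(clip, frame_id, img_path, mask_path, len(pose), len(lmks))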