├── .gitignore ├── LICENSE ├── README.md ├── paper.pdf ├── requirements.txt ├── src ├── __init__.py ├── config │ ├── test_face2facerho.ini │ └── train_face2facerho.ini ├── dataset │ ├── __init__.py │ ├── base_data_loader.py │ ├── base_dataset.py │ └── voxceleb_dataset.py ├── external │ ├── LICENSE │ ├── README.md │ ├── data │ │ ├── landmark_embedding.json │ │ └── pose_transform_config.json │ └── decalib │ │ ├── __init__.py │ │ ├── datasets │ │ ├── aflw2000.py │ │ ├── build_datasets.py │ │ ├── datasets.py │ │ ├── detectors.py │ │ ├── ethnicity.py │ │ ├── now.py │ │ ├── train_datasets.py │ │ ├── vggface.py │ │ └── vox.py │ │ ├── deca.py │ │ ├── models │ │ ├── FLAME.py │ │ ├── decoders.py │ │ ├── encoders.py │ │ ├── frnet.py │ │ ├── lbs.py │ │ └── resnet.py │ │ ├── trainer.py │ │ └── utils │ │ ├── config.py │ │ ├── rotation_converter.py │ │ └── util.py ├── fitting.py ├── models │ ├── VGG19_LOSS.py │ ├── __init__.py │ ├── base_model.py │ ├── discriminator.py │ ├── face2face_rho_model.py │ ├── image_pyramid.py │ ├── motion_network.py │ ├── networks.py │ └── rendering_network.py ├── options │ ├── __init__.py │ └── parse_config.py ├── reenact.py ├── train.py └── util │ ├── html.py │ ├── landmark_image_generation.py │ ├── util.py │ └── visualizer.py ├── test_case ├── driving │ ├── FLAME │ │ ├── headpose.txt │ │ └── landmark.txt │ ├── driving.jpg │ └── original │ │ ├── headpose.txt │ │ └── landmark.txt └── source │ ├── FLAME │ ├── headpose.txt │ └── landmark.txt │ ├── original │ ├── headpose.txt │ └── landmark.txt │ └── source.jpg └── trainingset └── VoxCeleb ├── id10001#7w0IBEWc9Qw#000993#001143 ├── headpose │ ├── 150.txt │ └── 54.txt ├── img │ ├── 150.jpg │ └── 54.jpg ├── landmark │ ├── 150.txt │ └── 54.txt └── mask │ ├── 150.png │ └── 54.png ├── id10009#AtavJVP4bCk#012568#012652 ├── headpose │ ├── 82.txt │ └── 89.txt ├── img │ ├── 82.jpg │ └── 89.jpg ├── landmark │ ├── 82.txt │ └── 89.txt └── mask │ ├── 82.png │ └── 89.png └── list.txt /.gitignore: -------------------------------------------------------------------------------- 1 | .idea/ 2 | src/external/data/generic_model.pkl 3 | src/external/data/deca_model.tar 4 | src/checkpoints/ 5 | test_case/result -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2021, NetEase Games AI Lab. 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | * Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | * Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | * Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Face2Faceρ: Official PyTorch Implementation
2 | [Paper (ECCV 2022)](https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136730055.pdf)
3 | 
4 | ### Environment
5 | - CUDA 10.2 or above
6 | - Python 3.8.5
7 | - ``pip install -r requirements.txt``
8 | - For visdom, some dependencies may need to be downloaded manually
9 |   ([visdom issue](https://github.com/fossasia/visdom/issues/111))
10 | 
11 | ### Training data
12 | Our framework relies on a large video dataset containing many identities,
13 | such as [VoxCeleb](https://www.robots.ox.ac.uk/~vgg/data/voxceleb/). For each video frame, the following data is required:
14 | - image: the cropped face image (please refer to the pre-processing steps of
15 |   [Siarohin et al.](https://github.com/AliaksandrSiarohin/video-preprocessing))
16 | - landmark: the 2D facial landmark coordinates obtained by projecting the 3D keypoints of the
17 |   fitted 3DMM mesh into image space (please refer to Section 3.3
18 |   of our paper)
19 | - headpose: 3DMM head pose coefficients
20 | - face mask (optional): the face area mask (can be generated by any face parsing method, such as
21 |   [BiSeNet](https://github.com/zllrunning/face-parsing.PyTorch)).
22 | 
23 | 
24 | The pre-processed data should be organized as follows (an
25 | example dataset containing two video sequences is provided in ```./trainingset/VoxCeleb```):
26 | 
27 | ```
28 | - trainingset
29 |     - <dataset name>                         ---e.g. VoxCeleb
30 |         - list.txt                           ---list of all videos
31 |         - id10001#7w0IBEWc9Qw#000993#001143  ---video folder 1 (named <person id>#<video id>#<start frame>#<end frame>)
32 |             - img                            ---video frames
33 |                 - 1.jpg
34 |                 - 2.jpg
35 |                 - ...
36 |             - landmark                       ---landmark coordinates for each frame
37 |                 - 1.txt
38 |                 - 2.txt
39 |                 - ...
40 |             - headpose                       ---head pose coefficients for each frame
41 |                 - 1.txt
42 |                 - 2.txt
43 |                 - ...
44 |             - mask                           ---face mask for each frame
45 |                 - 1.png
46 |                 - 2.png
47 |                 - ...
48 |         - id10009#AtavJVP4bCk#012568#012652  ---video folder 2
49 |         ...
50 | 
51 | ```
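If you build your own dataset, ```list.txt``` only needs to list the video folder names, one per line. A minimal sketch of a helper that generates it (this script is not part of the repo; it assumes the layout shown above):

```python
import os

def write_video_list(dataset_root):
    # Write <dataset_root>/list.txt with one video-folder name per line,
    # matching the layout expected by VoxCelebDataset.
    videos = sorted(
        d for d in os.listdir(dataset_root)
        if os.path.isdir(os.path.join(dataset_root, d)) and '#' in d
    )
    with open(os.path.join(dataset_root, 'list.txt'), 'w') as f:
        f.write('\n'.join(videos) + '\n')
    return videos

if __name__ == '__main__':
    # e.g. the example dataset shipped with this repo
    write_video_list('./trainingset/VoxCeleb')
```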
52 | ### Training
53 | - Set training data
54 |     - Set ```dataroot``` in ```./src/config/train_face2facerho.ini``` to ```./trainingset/<dataset name>``` (e.g. ```dataroot=./trainingset/VoxCeleb``` for the example dataset)
55 | - Set up visdom
56 |     - Set the visdom port ```<port>``` as ```display_port``` in ```./src/config/train_face2facerho.ini```, and run:
57 | ```bash
58 | nohup python -m visdom.server -port <port> &
59 | ```
60 | - Start training (tested with a Tesla V100)
61 | ```bash
62 | python src/train.py --config ./src/config/train_face2facerho.ini
63 | ```
64 | 
65 | ### Testing
66 | - Prepare models
67 |     - Download our pre-trained models from [Google Drive](https://drive.google.com/drive/folders/1asCyEjKxpKSV8g674WwmmCgxAZP9pf9x) and put them in ```./src/checkpoints/voxceleb_face2facerho```
68 |     - Download the [FLAME model](https://flame.is.tue.mpg.de/download.php),
69 |       choose **FLAME 2020**, unzip it, and copy 'generic_model.pkl' into ```./src/external/data```
70 |     - Download the [DECA trained model](https://drive.google.com/file/d/1rp8kdyLPvErw2dTmqtjISRVvQLj6Yzje/view?usp=sharing),
71 |       and put it in ```./src/external/data``` (**no unzip required**)
72 | - Fit the 3DMM coefficients of the source and driving face images
73 |     - The 3DMM fitting algorithm used in our paper comes from our company's in-house facial performance
74 |       capture system, so we can only release it after obtaining our company's official permission. As an alternative, we provide an
75 |       open-source solution based on [DECA](https://github.com/YadiraF/DECA). Note that the overall quality of our framework may be slightly poorer with DECA: our original fitting algorithm is more accurate and stable, and the 72 keypoints pre-configured on the FLAME mesh template are not exactly the same as our original configuration because the mesh templates differ.
76 | 
77 |     - Run (tested with an Nvidia GeForce RTX 2080Ti):
78 | ```bash
79 | python src/fitting.py --device <"cpu" or "cuda"> \
80 |     --src_img <source image> \
81 |     --drv_img <driving image> \
82 |     --output_src_headpose <output source head pose (.txt)> \
83 |     --output_src_landmark <output source landmarks (.txt)> \
84 |     --output_drv_headpose <output driving head pose (.txt)> \
85 |     --output_drv_landmark <output driving landmarks (.txt)>
86 | ```
87 |     - Input
88 |         - ```device```: set device, "cpu" or "cuda"
89 |         - ```src_img```: input source actor image
90 |         - ```drv_img```: input driving actor image
91 |         - ```output_src_headpose```: output head pose coefficients of the source image (.txt)
92 |         - ```output_src_landmark```: output facial landmarks of the source image (.txt)
93 |         - ```output_drv_headpose```: output head pose coefficients of the driving image (.txt)
94 |         - ```output_drv_landmark```: output driving facial landmarks (.txt, reconstructed using the shape coefficients
95 |           of the source actor and the expression and head pose coefficients of the driving actor)
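    - Batch fitting (optional): to reenact a whole driving video, the same command can simply be run once per driving frame. A minimal sketch using only the flags documented above (the ```driving_video``` folder and output layout are hypothetical, not part of this repo):
```python
import subprocess
from pathlib import Path

SRC = './test_case/source/source.jpg'            # one source actor image
DRV_DIR = Path('./test_case/driving_video')      # hypothetical folder of driving frames (*.jpg)
OUT_DIR = Path('./test_case/driving_video_fit')  # where the per-frame fitting results go

for drv in sorted(DRV_DIR.glob('*.jpg')):
    out = OUT_DIR / drv.stem
    out.mkdir(parents=True, exist_ok=True)
    subprocess.run([
        'python', 'src/fitting.py', '--device', 'cuda',
        '--src_img', SRC, '--drv_img', str(drv),
        '--output_src_headpose', str(OUT_DIR / 'src_headpose.txt'),
        '--output_src_landmark', str(OUT_DIR / 'src_landmark.txt'),
        '--output_drv_headpose', str(out / 'headpose.txt'),
        '--output_drv_landmark', str(out / 'landmark.txt'),
    ], check=True)
```
      The source fitting is recomputed on every iteration here for simplicity; the per-frame ```headpose.txt```/```landmark.txt``` files can then be passed to ```src/reenact.py``` (see below) one frame at a time.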
96 |     - Example
97 | ```bash
98 | python src/fitting.py --device cuda \
99 |     --src_img ./test_case/source/source.jpg --drv_img ./test_case/driving/driving.jpg \
100 |     --output_src_headpose ./test_case/source/FLAME/headpose.txt --output_src_landmark ./test_case/source/FLAME/landmark.txt \
101 |     --output_drv_headpose ./test_case/driving/FLAME/headpose.txt --output_drv_landmark ./test_case/driving/FLAME/landmark.txt
102 | ```
103 | 
104 | - Get the final reenacted result (tested with an Nvidia GeForce RTX 2080Ti):
105 | ```bash
106 | python src/reenact.py --config ./src/config/test_face2facerho.ini \
107 |     --src_img <source image> \
108 |     --src_headpose <source head pose (.txt)> \
109 |     --src_landmark <source landmarks (.txt)> \
110 |     --drv_headpose <driving head pose (.txt)> \
111 |     --drv_landmark <driving landmarks (.txt)> \
112 |     --output_dir <output folder>
113 | ```
114 |     - Input
115 |         - ```src_img```: input source actor image
116 |         - ```src_headpose```: input head pose coefficients of the source image (.txt)
117 |         - ```src_landmark```: input facial landmarks of the source image (.txt)
118 |         - ```drv_headpose```: input head pose coefficients of the driving image (.txt)
119 |         - ```drv_landmark```: input driving facial landmarks (reconstructed using the shape coefficients of the
120 |           source actor and the expression and head pose coefficients of the driving actor)
121 |         - ```output_dir```: the output image (named "result.png") will be saved in this folder
122 |     - Example
123 |         - Run with the fitting results of our original 3DMM fitting algorithm (pre-saved in
124 |           ```./test_case/source/original``` and ```./test_case/driving/original```)
125 | ```bash
126 | python src/reenact.py --config ./src/config/test_face2facerho.ini \
127 |     --src_img ./test_case/source/source.jpg \
128 |     --src_headpose ./test_case/source/original/headpose.txt --src_landmark ./test_case/source/original/landmark.txt \
129 |     --drv_headpose ./test_case/driving/original/headpose.txt --drv_landmark ./test_case/driving/original/landmark.txt \
130 |     --output_dir ./test_case/result
131 | ```
132 |         - Run with the fitting results of DECA
133 | ```bash
134 | python src/reenact.py --config ./src/config/test_face2facerho.ini \
135 |     --src_img ./test_case/source/source.jpg \
136 |     --src_headpose ./test_case/source/FLAME/headpose.txt --src_landmark ./test_case/source/FLAME/landmark.txt \
137 |     --drv_headpose ./test_case/driving/FLAME/headpose.txt --drv_landmark ./test_case/driving/FLAME/landmark.txt \
138 |     --output_dir ./test_case/result
139 | ```
140 | 
--------------------------------------------------------------------------------
/paper.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/paper.pdf
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy==1.18.5
2 | opencv-python==4.4.0.46
3 | torch==1.6.0
4 | torchvision==0.7.0
5 | Pillow==8.0.1
6 | visdom==0.1.8.9
7 | dominate==2.6.0
8 | yacs==0.1.8
9 | scikit-image==0.17.2
10 | face-alignment
11 | chumpy==0.70
--------------------------------------------------------------------------------
/src/__init__.py:
--------------------------------------------------------------------------------
1 | from .util import *
2 | from .dataset import *
3 | from .models import *
4 | from .options import *
5 | from .config import *
--------------------------------------------------------------------------------
/src/config/test_face2facerho.ini:
-------------------------------------------------------------------------------- 1 | [ROOT] 2 | ;basic config 3 | name = voxceleb_face2facerho 4 | gpu_ids = 0 5 | checkpoints_dir = ./src/checkpoints 6 | model = face2face_rho 7 | output_size = 512 8 | isTrain = False 9 | phase = test 10 | load_iter = -1 11 | epoch = 105 12 | 13 | ;rendering module config 14 | headpose_dims = 6 15 | mobilev2_encoder_channels = 16,8,12,28,64,72,140,280 16 | mobilev2_decoder_channels = 16,8,14,24,64,96,140,280 17 | mobilev2_encoder_layers = 1,2,2,2,2,2,1 18 | mobilev2_decoder_layers = 1,2,2,2,2,2,1 19 | mobilev2_encoder_expansion_factor = 1,6,6,6,6,6,6 20 | mobilev2_decoder_expansion_factor = 1,6,6,6,6,6,6 21 | headpose_embedding_ngf = 8 22 | 23 | ;motion module config 24 | mn_ngf = 16 25 | n_local_enhancers = 2 26 | mn_n_downsampling = 2 27 | mn_n_blocks_local = 3 28 | 29 | ;discriminator 30 | disc_block_expansion = 32 31 | disc_num_blocks = 4 32 | disc_max_features = 512 33 | 34 | ;training parameters 35 | init_type = none 36 | init_gain = 0.02 37 | emphasize_face_area = True 38 | loss_scales = 1,0.5,0.25,0.125 39 | warp_loss_weight = 500.0 40 | reconstruction_loss_weight = 15.0 41 | feature_matching_loss_weight = 1 42 | face_area_weight_scale = 4 43 | init_field_epochs = 5 44 | lr = 0.0002 45 | beta1 = 0.9 46 | lr_policy = lambda 47 | epoch_count = 0 48 | niter = 90 49 | niter_decay = 15 50 | continue_train = False 51 | 52 | ;dataset parameters 53 | dataset_mode = voxceleb_test 54 | dataroot = ./dataset/VoxCeleb 55 | num_repeats = 60 56 | batch_size = 1 57 | serial_batches = True 58 | num_threads = 0 59 | 60 | ;vis_config 61 | display_freq = 200 62 | update_html_freq = 20 63 | display_id = 1 64 | display_server = http://localhost 65 | display_env = voxceleb_face2facerho 66 | display_port = 6005 67 | print_freq = 200 68 | save_latest_freq = 10000 69 | save_epoch_freq = 1 70 | no_html = True 71 | display_winsize = 256 72 | display_ncols = 3 73 | verbose = False -------------------------------------------------------------------------------- /src/config/train_face2facerho.ini: -------------------------------------------------------------------------------- 1 | [ROOT] 2 | ;basic config 3 | name = voxceleb_face2facerho 4 | gpu_ids = 0 5 | checkpoints_dir = ./src/checkpoints 6 | model = face2face_rho 7 | output_size = 512 8 | isTrain = True 9 | phase = train 10 | load_iter = -1 11 | epoch = 0 12 | 13 | ;rendering module config 14 | headpose_dims = 6 15 | mobilev2_encoder_channels = 16,8,12,28,64,72,140,280 16 | mobilev2_decoder_channels = 16,8,14,24,64,96,140,280 17 | mobilev2_encoder_layers = 1,2,2,2,2,2,1 18 | mobilev2_decoder_layers = 1,2,2,2,2,2,1 19 | mobilev2_encoder_expansion_factor = 1,6,6,6,6,6,6 20 | mobilev2_decoder_expansion_factor = 1,6,6,6,6,6,6 21 | headpose_embedding_ngf = 8 22 | 23 | ;motion module config 24 | mn_ngf = 16 25 | n_local_enhancers = 2 26 | mn_n_downsampling = 2 27 | mn_n_blocks_local = 3 28 | 29 | ;discriminator 30 | disc_block_expansion = 32 31 | disc_num_blocks = 4 32 | disc_max_features = 512 33 | 34 | ;training parameters 35 | init_type = none 36 | init_gain = 0.02 37 | emphasize_face_area = True 38 | loss_scales = 1,0.5,0.25,0.125 39 | warp_loss_weight = 500.0 40 | reconstruction_loss_weight = 15.0 41 | feature_matching_loss_weight = 1 42 | face_area_weight_scale = 4 43 | init_field_epochs = 5 44 | lr = 0.0002 45 | beta1 = 0.9 46 | lr_policy = lambda 47 | epoch_count = 0 48 | niter = 90 49 | niter_decay = 15 50 | continue_train = False 51 | 52 | ;dataset parameters 53 | 
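; dataset_mode selects the loader in src/dataset/: 'voxceleb' resolves to VoxCelebDataset
; in voxceleb_dataset.py (see find_dataset_using_name in src/dataset/__init__.py);
; num_repeats makes each identity be sampled that many times per epoch
; (dataset length = number of identities * num_repeats)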
dataset_mode = voxceleb 54 | dataroot = ./trainingset/VoxCeleb 55 | num_repeats = 60 56 | batch_size = 6 57 | serial_batches = False 58 | num_threads = 8 59 | 60 | ;vis_config 61 | display_freq = 200 62 | update_html_freq = 20 63 | display_id = 1 64 | display_server = http://localhost 65 | display_env = voxceleb_face2facerho 66 | display_port = 6005 67 | print_freq = 200 68 | save_latest_freq = 10000 69 | save_epoch_freq = 1 70 | no_html = True 71 | display_winsize = 256 72 | display_ncols = 3 73 | verbose = False -------------------------------------------------------------------------------- /src/dataset/__init__.py: -------------------------------------------------------------------------------- 1 | import importlib 2 | import torch.utils.data 3 | from dataset.base_data_loader import BaseDataLoader 4 | from dataset.base_dataset import BaseDataset 5 | 6 | 7 | def find_dataset_using_name(dataset_name): 8 | # Given the option --dataset_mode [datasetname], 9 | # the file "data/datasetname_dataset.py" 10 | # will be imported. 11 | dataset_filename = "dataset." + dataset_name + "_dataset" 12 | datasetlib = importlib.import_module(dataset_filename) 13 | 14 | # In the file, the class called DatasetNameDataset() will 15 | # be instantiated. It has to be a subclass of BaseDataset, 16 | # and it is case-insensitive. 17 | dataset = None 18 | target_dataset_name = dataset_name.replace('_', '') + 'dataset' 19 | for name, cls in datasetlib.__dict__.items(): 20 | if name.lower() == target_dataset_name.lower() \ 21 | and issubclass(cls, BaseDataset): 22 | dataset = cls 23 | 24 | if dataset is None: 25 | print("In %s.py, there should be a subclass of BaseDataset with class name that matches %s in lowercase." % (dataset_filename, target_dataset_name)) 26 | exit(0) 27 | 28 | return dataset 29 | 30 | 31 | def get_option_setter(dataset_name): 32 | dataset_class = find_dataset_using_name(dataset_name) 33 | return dataset_class.modify_commandline_options 34 | 35 | 36 | def create_dataset(opt): 37 | dataset = find_dataset_using_name(opt.dataset_mode) 38 | instance = dataset() 39 | instance.initialize(opt) 40 | print("dataset [%s] was created" % (instance.name())) 41 | return instance 42 | 43 | 44 | def CreateDataLoader(opt): 45 | data_loader = CustomDatasetDataLoader() 46 | data_loader.initialize(opt) 47 | return data_loader 48 | 49 | 50 | # Wrapper class of Dataset class that performs 51 | # multi-threaded data loading 52 | class CustomDatasetDataLoader(BaseDataLoader): 53 | def name(self): 54 | return 'CustomDatasetDataLoader' 55 | 56 | def initialize(self, opt): 57 | BaseDataLoader.initialize(self, opt) 58 | self.dataset = create_dataset(opt) 59 | if opt.serial_batches: 60 | self.dataloader = torch.utils.data.DataLoader( 61 | self.dataset, 62 | batch_size=opt.batch_size, 63 | shuffle=False, 64 | num_workers=int(opt.num_threads)) 65 | else: 66 | #weights = make_weights_for_balanced_classes(dataset_train.imgs, len(dataset_train.classes)) 67 | # weights = self.dataset.getSampleWeights() 68 | # weights = torch.DoubleTensor(weights) 69 | # sampler = torch.utils.data.sampler.WeightedRandomSampler(weights, len(weights)) 70 | 71 | self.dataloader = torch.utils.data.DataLoader( 72 | self.dataset, 73 | batch_size=opt.batch_size, 74 | shuffle=True, 75 | # sampler=sampler, 76 | # pin_memory=True, 77 | num_workers=int(opt.num_threads)) 78 | 79 | 80 | def load_data(self): 81 | return self 82 | 83 | def __len__(self): 84 | return min(len(self.dataset), self.opt.max_dataset_size) 85 | 86 | def __iter__(self): 87 | for i, 
data in enumerate(self.dataloader): 88 | if i * self.opt.batch_size >= self.opt.max_dataset_size: 89 | break 90 | yield data 91 | -------------------------------------------------------------------------------- /src/dataset/base_data_loader.py: -------------------------------------------------------------------------------- 1 | class BaseDataLoader(): 2 | def __init__(self): 3 | pass 4 | 5 | def initialize(self, opt): 6 | self.opt = opt 7 | pass 8 | 9 | def load_data(): 10 | return None 11 | -------------------------------------------------------------------------------- /src/dataset/base_dataset.py: -------------------------------------------------------------------------------- 1 | import torch.utils.data as data 2 | from PIL import Image 3 | import torchvision.transforms as transforms 4 | import torch 5 | 6 | 7 | class BaseDataset(data.Dataset): 8 | def __init__(self): 9 | super(BaseDataset, self).__init__() 10 | 11 | def name(self): 12 | return 'BaseDataset' 13 | 14 | @staticmethod 15 | def modify_commandline_options(parser, is_train): 16 | return parser 17 | 18 | def initialize(self, opt): 19 | pass 20 | 21 | def getSampleWeights(self): 22 | return torch.ones((len(self))) 23 | 24 | def __len__(self): 25 | return 0 26 | 27 | 28 | def get_transform(opt): 29 | transform_list = [] 30 | if opt.resize_or_crop == 'resize_and_crop': 31 | osize = [opt.loadSize, opt.loadSize] 32 | transform_list.append(transforms.Resize(osize, Image.BICUBIC)) 33 | transform_list.append(transforms.RandomCrop(opt.fineSize)) 34 | elif opt.resize_or_crop == 'crop': 35 | transform_list.append(transforms.RandomCrop(opt.fineSize)) 36 | elif opt.resize_or_crop == 'scale_width': 37 | transform_list.append(transforms.Lambda( 38 | lambda img: __scale_width(img, opt.fineSize))) 39 | elif opt.resize_or_crop == 'scale_width_and_crop': 40 | transform_list.append(transforms.Lambda( 41 | lambda img: __scale_width(img, opt.loadSize))) 42 | transform_list.append(transforms.RandomCrop(opt.fineSize)) 43 | elif opt.resize_or_crop == 'none': 44 | transform_list.append(transforms.Lambda( 45 | lambda img: __adjust(img))) 46 | else: 47 | raise ValueError('--resize_or_crop %s is not a valid option.' % opt.resize_or_crop) 48 | 49 | if opt.isTrain and not opt.no_flip: 50 | transform_list.append(transforms.RandomHorizontalFlip()) 51 | 52 | transform_list += [transforms.ToTensor(), 53 | transforms.Normalize((0.5, 0.5, 0.5), 54 | (0.5, 0.5, 0.5))] 55 | return transforms.Compose(transform_list) 56 | 57 | 58 | # just modify the width and height to be multiple of 4 59 | def __adjust(img): 60 | ow, oh = img.size 61 | 62 | # the size needs to be a multiple of this number, 63 | # because going through generator network may change img size 64 | # and eventually cause size mismatch error 65 | mult = 4 66 | if ow % mult == 0 and oh % mult == 0: 67 | return img 68 | w = (ow - 1) // mult 69 | w = (w + 1) * mult 70 | h = (oh - 1) // mult 71 | h = (h + 1) * mult 72 | 73 | if ow != w or oh != h: 74 | __print_size_warning(ow, oh, w, h) 75 | 76 | return img.resize((w, h), Image.BICUBIC) 77 | 78 | 79 | def __scale_width(img, target_width): 80 | ow, oh = img.size 81 | 82 | # the size needs to be a multiple of this number, 83 | # because going through generator network may change img size 84 | # and eventually cause size mismatch error 85 | mult = 4 86 | assert target_width % mult == 0, "the target width needs to be multiple of %d." 
% mult 87 | if (ow == target_width and oh % mult == 0): 88 | return img 89 | w = target_width 90 | target_height = int(target_width * oh / ow) 91 | m = (target_height - 1) // mult 92 | h = (m + 1) * mult 93 | 94 | if target_height != h: 95 | __print_size_warning(target_width, target_height, w, h) 96 | 97 | return img.resize((w, h), Image.BICUBIC) 98 | 99 | 100 | def __print_size_warning(ow, oh, w, h): 101 | if not hasattr(__print_size_warning, 'has_printed'): 102 | print("The image size needs to be a multiple of 4. " 103 | "The loaded image size was (%d, %d), so it was adjusted to " 104 | "(%d, %d). This adjustment will be done to all images " 105 | "whose sizes are not multiples of 4" % (ow, oh, w, h)) 106 | __print_size_warning.has_printed = True 107 | -------------------------------------------------------------------------------- /src/dataset/voxceleb_dataset.py: -------------------------------------------------------------------------------- 1 | import os.path 2 | import torch 3 | import numpy as np 4 | from dataset.base_dataset import BaseDataset 5 | from util.util import ( 6 | make_ids, 7 | read_target, 8 | load_coeffs, 9 | load_landmarks 10 | ) 11 | 12 | from util.landmark_image_generation import LandmarkImageGeneration 13 | 14 | 15 | class VoxCelebDataset(BaseDataset): 16 | def initialize(self, opt): 17 | self.opt = opt 18 | self.dataroot = opt.dataroot 19 | video_list_file = self.dataroot + "/list.txt" 20 | self.video_path = self.dataroot 21 | self.video_names = [] 22 | with open(video_list_file, 'r') as f: 23 | lines = f.readlines() 24 | for line in lines: 25 | self.video_names.append(self.video_path + "/" + line) 26 | person_ids = set() 27 | for video in self.video_names: 28 | person_ids.add(os.path.basename(video).split('#')[0]) 29 | self.person_ids = list(person_ids) 30 | self.person_ids.sort() 31 | 32 | self.landmark_img_generator = LandmarkImageGeneration(self.opt) 33 | 34 | self.total_person_id = len(self.person_ids) 35 | print('\tnum videos: {}, repeat {} times, total: {}'.format(self.total_person_id, opt.num_repeats, 36 | self.total_person_id * opt.num_repeats)) 37 | 38 | def __getitem__(self, index): 39 | idx = index % self.total_person_id 40 | name = self.person_ids[idx] 41 | video_name = np.random.choice(self.choose_video_from_person_id(name)) 42 | frame_ids = make_ids(video_name + "/img") 43 | 44 | frame_idx = np.sort(np.random.choice(frame_ids, replace=False, size=2)) 45 | 46 | img_dir = video_name + "/img" 47 | headpose_dir = video_name + "/headpose" 48 | landmark_dir = video_name + "/landmark" 49 | mask_dir = video_name + "/mask" 50 | 51 | src_idx = frame_idx[0] 52 | drv_idx = frame_idx[1] 53 | 54 | src_img = read_target(img_dir + "/" + str(src_idx) + ".jpg", self.opt.output_size) 55 | drv_img = read_target(img_dir + "/" + str(drv_idx) + ".jpg", self.opt.output_size) 56 | 57 | src_headpose = torch.from_numpy(np.array(load_coeffs(headpose_dir + "/" + str(src_idx) + ".txt"))).float() 58 | drv_headpose = torch.from_numpy(np.array(load_coeffs(headpose_dir + "/" + str(drv_idx) + ".txt"))).float() 59 | 60 | src_landmark = torch.from_numpy(np.array(load_landmarks(landmark_dir + "/" + str(src_idx) + ".txt"))).float() 61 | drv_landmark = torch.from_numpy(np.array(load_landmarks(landmark_dir + "/" + str(drv_idx) + ".txt"))).float() 62 | 63 | src_landmark_img = self.draw_landmark_img(src_landmark) 64 | drv_landmark_img = self.draw_landmark_img(drv_landmark) 65 | 66 | input_data = { 67 | 'src_img': src_img, 68 | 'drv_img': drv_img, 69 | 'src_headpose': src_headpose, 70 | 
'drv_headpose': drv_headpose, 71 | 'src_landmark_img': src_landmark_img, 72 | 'drv_landmark_img': drv_landmark_img, 73 | } 74 | if self.opt.emphasize_face_area: 75 | drv_face_mask = read_target(mask_dir + "/" + str(drv_idx) + ".png", self.opt.output_size) 76 | input_data['drv_face_mask'] = drv_face_mask.squeeze(0) 77 | return input_data 78 | 79 | def choose_video_from_person_id(self, name): 80 | names = [] 81 | for video_name in self.video_names: 82 | if name in video_name: 83 | names.append(video_name.strip()) 84 | return names 85 | 86 | def draw_landmark_img(self, landmarks): 87 | landmark_imgs = self.landmark_img_generator.generate_landmark_img(landmarks) 88 | return landmark_imgs 89 | 90 | def __len__(self): 91 | return self.total_person_id * self.opt.num_repeats 92 | 93 | def name(self): 94 | return 'VoxCelebDataset' 95 | 96 | 97 | -------------------------------------------------------------------------------- /src/external/LICENSE: -------------------------------------------------------------------------------- 1 | License 2 | 3 | Software Copyright License for non-commercial scientific research purposes 4 | Please read carefully the following terms and conditions and any accompanying documentation before you download 5 | and/or use the DECA model, data and software, (the "Model & Software"), including 3D meshes, software, and scripts. 6 | By downloading and/or using the Model & Software (including downloading, cloning, installing, and any other use 7 | of this github repository), you acknowledge that you have read these terms and conditions, understand them, and 8 | agree to be bound by them. If you do not agree with these terms and conditions, you must not download and/or use 9 | the Model & Software. Any infringement of the terms of this agreement will automatically terminate your rights 10 | under this License 11 | 12 | Ownership / Licensees 13 | The Model & Software and the associated materials has been developed at the 14 | Max Planck Institute for Intelligent Systems (hereinafter "MPI"). 15 | 16 | Any copyright or patent right is owned by and proprietary material of the 17 | Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (hereinafter “MPG”; MPI and MPG hereinafter 18 | collectively “Max-Planck”) hereinafter the “Licensor”. 19 | 20 | License Grant 21 | Licensor grants you (Licensee) personally a single-user, non-exclusive, non-transferable, free of charge right: 22 | 23 | • To install the Model & Software on computers owned, leased or otherwise controlled by you and/or your organization. 24 | • To use the Model & Software for the sole purpose of performing peaceful non-commercial scientific research, 25 | non-commercial education, or non-commercial artistic projects. 26 | 27 | Any other use, in particular any use for commercial, pornographic, military, or surveillance purposes is prohibited. 28 | This includes, without limitation, incorporation in a commercial product, use in a commercial service, 29 | or production of other artefacts for commercial purposes. 30 | 31 | The Model & Software may not be used to create fake, libelous, misleading, or defamatory content of any kind, excluding 32 | analyses in peer-reviewed scientific research. 33 | 34 | The Model & Software may not be reproduced, modified and/or made available in any form to any third party 35 | without Max-Planck’s prior written permission. 36 | 37 | The Model & Software may not be used for pornographic purposes or to generate pornographic material whether 38 | commercial or not. 
This license also prohibits the use of the Model & Software to train methods/algorithms/neural 39 | networks/etc. for commercial use of any kind. By downloading the Model & Software, you agree not to reverse engineer it. 40 | 41 | No Distribution 42 | The Model & Software and the license herein granted shall not be copied, shared, distributed, re-sold, offered 43 | for re-sale, transferred or sub-licensed in whole or in part except that you may make one copy for archive 44 | purposes only. 45 | 46 | Disclaimer of Representations and Warranties 47 | You expressly acknowledge and agree that the Model & Software results from basic research, is provided “AS IS”, 48 | may contain errors, and that any use of the Model & Software is at your sole risk. 49 | LICENSOR MAKES NO REPRESENTATIONS 50 | OR WARRANTIES OF ANY KIND CONCERNING THE MODEL & SOFTWARE, NEITHER EXPRESS NOR IMPLIED, AND THE ABSENCE OF ANY 51 | LEGAL OR ACTUAL DEFECTS, WHETHER DISCOVERABLE OR NOT. Specifically, and not to limit the foregoing, licensor 52 | makes no representations or warranties (i) regarding the merchantability or fitness for a particular purpose of 53 | the Model & Software, (ii) that the use of the Model & Software will not infringe any patents, copyrights or other 54 | intellectual property rights of a third party, and (iii) that the use of the Model & Software will not cause any 55 | damage of any kind to you or a third party. 56 | 57 | Limitation of Liability 58 | Because this Model & Software License Agreement qualifies as a donation, according to Section 521 of the German 59 | Civil Code (Bürgerliches Gesetzbuch – BGB) Licensor as a donor is liable for intent and gross negligence only. 60 | If the Licensor fraudulently conceals a legal or material defect, they are obliged to compensate the Licensee 61 | for the resulting damage. 62 | 63 | Licensor shall be liable for loss of data only up to the amount of typical recovery costs which would have 64 | arisen had proper and regular data backup measures been taken. For the avoidance of doubt Licensor shall be 65 | liable in accordance with the German Product Liability Act in the event of product liability. The foregoing 66 | applies also to Licensor’s legal representatives or assistants in performance. Any further liability shall 67 | be excluded. Patent claims generated through the usage of the Model & Software cannot be directed towards the copyright holders. 68 | The Model & Software is provided in the state of development the licensor defines. If modified or extended by 69 | Licensee, the Licensor makes no claims about the fitness of the Model & Software and is not responsible 70 | for any problems such modifications cause. 71 | 72 | No Maintenance Services 73 | You understand and agree that Licensor is under no obligation to provide either maintenance services, 74 | update services, notices of latent defects, or corrections of defects with regard to the Model & Software. 75 | Licensor nevertheless reserves the right to update, modify, or discontinue the Model & Software at any time. 76 | 77 | Defects of the Model & Software must be notified in writing to the Licensor with a comprehensible description 78 | of the error symptoms. The notification of the defect should enable the reproduction of the error. 79 | The Licensee is encouraged to communicate any use, results, modification or publication. 
80 | 81 | Publications using the Model & Software 82 | You acknowledge that the Model & Software is a valuable scientific resource and agree to appropriately reference 83 | the following paper in any publication making use of the Model & Software. 84 | 85 | Commercial licensing opportunities 86 | For commercial uses of the Model & Software, please send email to ps-license@tue.mpg.de 87 | 88 | This Agreement shall be governed by the laws of the Federal Republic of Germany except for the UN Sales Convention. 89 | -------------------------------------------------------------------------------- /src/external/README.md: -------------------------------------------------------------------------------- 1 | # DECA: Detailed Expression Capture and Animation (SIGGRAPH2021) 2 | 3 | Please refer to [README](https://github.com/YadiraF/DECA/blob/master/README.md) for more details about DECA. If you want 4 | to use this code, you should follow the original [LICENSE](https://github.com/YadiraF/DECA/blob/master/LICENSE) of DECA: 5 | 6 | >This code and model are available for non-commercial scientific research purposes as defined in the [LICENSE](https://github.com/YadiraF/DECA/blob/master/LICENSE) file. 7 | By downloading and using the code and model you agree to the terms in the [LICENSE](https://github.com/YadiraF/DECA/blob/master/LICENSE). -------------------------------------------------------------------------------- /src/external/data/landmark_embedding.json: -------------------------------------------------------------------------------- 1 | { 2 | "lmk_faces_idx": [ 3 | 7726, 4 | 2666, 5 | 3406, 6 | 8460, 7 | 8371, 8 | 8456, 9 | 8455, 10 | 8384, 11 | 8382, 12 | 2622, 13 | 7155, 14 | 3680, 15 | 8909, 16 | 8198, 17 | 8258, 18 | 3195, 19 | 3112, 20 | 8829, 21 | 7930, 22 | 225, 23 | 3779, 24 | 8809, 25 | 3839, 26 | 3845, 27 | 5107, 28 | 6428, 29 | 1248, 30 | 8626, 31 | 1180, 32 | 3742, 33 | 8800, 34 | 2238, 35 | 7341, 36 | 8803, 37 | 5920, 38 | 6229, 39 | 6765, 40 | 6773, 41 | 2462, 42 | 3730, 43 | 8730, 44 | 6650, 45 | 8548, 46 | 1649, 47 | 8820, 48 | 3799, 49 | 6076, 50 | 5374, 51 | 759, 52 | 8593, 53 | 3691, 54 | 197, 55 | 7423, 56 | 2354, 57 | 7390, 58 | 7418, 59 | 2331, 60 | 917, 61 | 5997, 62 | 885, 63 | 917, 64 | 8219, 65 | 8768, 66 | 8635, 67 | 6055, 68 | 6026, 69 | 6069, 70 | 7498, 71 | 7446, 72 | 2452, 73 | 8668, 74 | 8799 75 | ], 76 | "lmk_bary_coords": [ 77 | [ 78 | 0.5, 79 | 0.5, 80 | 0.0 81 | ], 82 | [ 83 | 1.0, 84 | 0.0, 85 | 0.0 86 | ], 87 | [ 88 | 0.3333333, 89 | 0.3333333, 90 | 0.3333333 91 | ], 92 | [ 93 | 0.5, 94 | 0.5, 95 | 0.0 96 | ], 97 | [ 98 | 0.3333333, 99 | 0.3333333, 100 | 0.3333333 101 | ], 102 | [ 103 | 0.3333333, 104 | 0.3333333, 105 | 0.3333333 106 | ], 107 | [ 108 | 0.0, 109 | 0.5, 110 | 0.5 111 | ], 112 | [ 113 | 0.5, 114 | 0.5, 115 | 0.0 116 | ], 117 | [ 118 | 1.0, 119 | 0.0, 120 | 0.0 121 | ], 122 | [ 123 | 0.5, 124 | 0.0, 125 | 0.5 126 | ], 127 | [ 128 | 1.0, 129 | 0.0, 130 | 0.0 131 | ], 132 | [ 133 | 0.3333333, 134 | 0.3333333, 135 | 0.3333333 136 | ], 137 | [ 138 | 0.0, 139 | 0.5, 140 | 0.5 141 | ], 142 | [ 143 | 0.3333333, 144 | 0.3333333, 145 | 0.3333333 146 | ], 147 | [ 148 | 0.3333333, 149 | 0.3333333, 150 | 0.3333333 151 | ], 152 | [ 153 | 0.0, 154 | 0.5, 155 | 0.5 156 | ], 157 | [ 158 | 0.5, 159 | 0.0, 160 | 0.5 161 | ], 162 | [ 163 | 0.3333333, 164 | 0.3333333, 165 | 0.3333333 166 | ], 167 | [ 168 | 0.5, 169 | 0.0, 170 | 0.5 171 | ], 172 | [ 173 | 0.5, 174 | 0.0, 175 | 0.5 176 | ], 177 | [ 178 | 0.5, 179 | 0.0, 180 | 0.5 181 | ], 182 | [ 183 | 0.5, 184 | 
0.5, 185 | 0.0 186 | ], 187 | [ 188 | 0.5, 189 | 0.0, 190 | 0.5 191 | ], 192 | [ 193 | 0.0, 194 | 0.5, 195 | 0.5 196 | ], 197 | [ 198 | 0.5, 199 | 0.5, 200 | 0.0 201 | ], 202 | [ 203 | 0.5, 204 | 0.0, 205 | 0.5 206 | ], 207 | [ 208 | 0.3333333, 209 | 0.3333333, 210 | 0.3333333 211 | ], 212 | [ 213 | 4.999473057765158e-15, 214 | 0.9740002155303955, 215 | 0.025999775156378746 216 | ], 217 | [ 218 | 0.8086484670639038, 219 | 0.01935010962188244, 220 | 0.1720014214515686 221 | ], 222 | [ 223 | 0.003992011304944754, 224 | 0.4596897065639496, 225 | 0.536318302154541 226 | ], 227 | [ 228 | 0.8673513531684875, 229 | 0.12561924755573273, 230 | 0.007029408123344183 231 | ], 232 | [ 233 | 0.6809834837913513, 234 | 0.23452331125736237, 235 | 0.08449321240186691 236 | ], 237 | [ 238 | 0.6094620823860168, 239 | 0.16802914440631866, 240 | 0.2225087583065033 241 | ], 242 | [ 243 | 0.004817526787519455, 244 | 0.6991134881973267, 245 | 0.2960689663887024 246 | ], 247 | [ 248 | 0.2225087583065033, 249 | 0.16802914440631866, 250 | 0.6094620823860168 251 | ], 252 | [ 253 | 0.6809834837913513, 254 | 0.08449321240186691, 255 | 0.23452331125736237 256 | ], 257 | [ 258 | 0.3333333, 259 | 0.3333333, 260 | 0.3333333 261 | ], 262 | [ 263 | 0.3333333, 264 | 0.3333333, 265 | 0.3333333 266 | ], 267 | [ 268 | 0.5, 269 | 0.5, 270 | 0 271 | ], 272 | [ 273 | 0.3333333, 274 | 0.3333333, 275 | 0.3333333 276 | ], 277 | [ 278 | 0.3333333, 279 | 0.3333333, 280 | 0.3333333 281 | ], 282 | [ 283 | 0.15, 284 | 0.15, 285 | 0.7 286 | ], 287 | [ 288 | 0.3333333, 289 | 0.3333333, 290 | 0.3333333 291 | ], 292 | [ 293 | 0.3333333, 294 | 0.3333333, 295 | 0.3333333 296 | ], 297 | [ 298 | 0.3333333, 299 | 0.3333333, 300 | 0.3333333 301 | ], 302 | [ 303 | 0.3333333, 304 | 0.3333333, 305 | 0.3333333 306 | ], 307 | [ 308 | 0.5, 309 | 0.0, 310 | 0.5 311 | ], 312 | [ 313 | 0.3333333, 314 | 0.3333333, 315 | 0.3333333 316 | ], 317 | [ 318 | 0.3333333, 319 | 0.3333333, 320 | 0.3333333 321 | ], 322 | [ 323 | 0.3333333, 324 | 0.3333333, 325 | 0.3333333 326 | ], 327 | [ 328 | 0.3333333, 329 | 0.3333333, 330 | 0.3333333 331 | ], 332 | [ 333 | 0.15, 334 | 0.7, 335 | 0.15 336 | ], 337 | [ 338 | 0.4711792767047882, 339 | 0.5057284235954285, 340 | 0.023092320188879967 341 | ], 342 | [ 343 | 0.1317896991968155, 344 | 0.08670816570520401, 345 | 0.7815021276473999 346 | ], 347 | [ 348 | 0.136175274848938, 349 | 0.701887845993042, 350 | 0.16193686425685883 351 | ], 352 | [ 353 | 0.07477450370788574, 354 | 0.33583617210388184, 355 | 0.5893893241882324 356 | ], 357 | [ 358 | 0.5, 359 | 0.0, 360 | 0.5 361 | ], 362 | [ 363 | 0.4711792767047882, 364 | 0.023092320188879967, 365 | 0.5057284235954285 366 | ], 367 | [ 368 | 0.13178616762161255, 369 | 0.7815194725990295, 370 | 0.08669432997703552 371 | ], 372 | [ 373 | 0.136175274848938, 374 | 0.16193686425685883, 375 | 0.701887845993042 376 | ], 377 | [ 378 | 0.8093733191490173, 379 | -9.597695766394229e-14, 380 | 0.19062671065330505 381 | ], 382 | [ 383 | 0.5, 384 | 0.5, 385 | 0.0 386 | ], 387 | [ 388 | 0.00028422221657820046, 389 | 0.8192539215087891, 390 | 0.18046186864376068 391 | ], 392 | [ 393 | 0.9783512353897095, 394 | 0.02164875715970993, 395 | -1.3638637584277862e-15 396 | ], 397 | [ 398 | 0.5, 399 | 0.5, 400 | 0.0 401 | ], 402 | [ 403 | 0.09441351890563965, 404 | 0.9030233025550842, 405 | 0.0025631925091147423 406 | ], 407 | [ 408 | 0.047000590711832047, 409 | 0.9423790574073792, 410 | 0.010620360262691975 411 | ], 412 | [ 413 | 0.0, 414 | 0.5, 415 | 0.5 416 | ], 417 | [ 418 | 0.0025631925091147423, 419 | 
0.9030233025550842, 420 | 0.09441351890563965 421 | ], 422 | [ 423 | 0.047000590711832047, 424 | 0.010620360262691975, 425 | 0.9423790574073792 426 | ], 427 | [ 428 | 0.02153543196618557, 429 | 0.9784510731697083, 430 | 1.3496901374310255e-05 431 | ], 432 | [ 433 | 0.0001076170738087967, 434 | 0.03357921168208122, 435 | 0.9663131833076477 436 | ] 437 | ] 438 | } -------------------------------------------------------------------------------- /src/external/data/pose_transform_config.json: -------------------------------------------------------------------------------- 1 | { 2 | "scale_transform": 4122.399989645386, 3 | "tx_transform": 0.2582781138863169, 4 | "ty_transform": -0.26074984122168304 5 | } -------------------------------------------------------------------------------- /src/external/decalib/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/src/external/decalib/__init__.py -------------------------------------------------------------------------------- /src/external/decalib/datasets/aflw2000.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import torch 3 | import torchvision.transforms as transforms 4 | import numpy as np 5 | import cv2 6 | import scipy 7 | from skimage.io import imread, imsave 8 | from skimage.transform import estimate_transform, warp, resize, rescale 9 | from glob import glob 10 | from torch.utils.data import Dataset, DataLoader, ConcatDataset 11 | import scipy.io 12 | 13 | class AFLW2000(Dataset): 14 | def __init__(self, testpath='/ps/scratch/yfeng/Data/AFLW2000/GT', crop_size=224): 15 | ''' 16 | data class for loading AFLW2000 dataset 17 | make sure each image has corresponding mat file, which provides cropping infromation 18 | ''' 19 | if os.path.isdir(testpath): 20 | self.imagepath_list = glob(testpath + '/*.jpg') + glob(testpath + '/*.png') 21 | elif isinstance(testpath, list): 22 | self.imagepath_list = testpath 23 | elif os.path.isfile(testpath) and (testpath[-3:] in ['jpg', 'png']): 24 | self.imagepath_list = [testpath] 25 | else: 26 | print('please check the input path') 27 | exit() 28 | print('total {} images'.format(len(self.imagepath_list))) 29 | self.imagepath_list = sorted(self.imagepath_list) 30 | self.crop_size = crop_size 31 | self.scale = 1.6 32 | self.resolution_inp = crop_size 33 | 34 | def __len__(self): 35 | return len(self.imagepath_list) 36 | 37 | def __getitem__(self, index): 38 | imagepath = self.imagepath_list[index] 39 | imagename = imagepath.split('/')[-1].split('.')[0] 40 | image = imread(imagepath)[:,:,:3] 41 | kpt = scipy.io.loadmat(imagepath.replace('jpg', 'mat'))['pt3d_68'].T 42 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]); 43 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1]) 44 | 45 | h, w, _ = image.shape 46 | old_size = (right - left + bottom - top)/2 47 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ])#+ old_size*0.1]) 48 | size = int(old_size*self.scale) 49 | 50 | # crop image 51 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]]) 52 | DST_PTS = np.array([[0,0], [0,self.resolution_inp - 1], [self.resolution_inp - 1, 0]]) 53 | tform = estimate_transform('similarity', src_pts, DST_PTS) 54 | 55 | image = image/255. 
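        # warp() below resamples the face region with the inverse of the similarity
        # transform estimated above, giving a (resolution_inp x resolution_inp) crop in [0, 1]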
56 | dst_image = warp(image, tform.inverse, output_shape=(self.resolution_inp, self.resolution_inp)) 57 | dst_image = dst_image.transpose(2,0,1) 58 | return {'image': torch.tensor(dst_image).float(), 59 | 'imagename': imagename, 60 | # 'tform': tform, 61 | # 'original_image': torch.tensor(image.transpose(2,0,1)).float(), 62 | } -------------------------------------------------------------------------------- /src/external/decalib/datasets/build_datasets.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import torch 3 | from torch.utils.data import Dataset, ConcatDataset 4 | import torchvision.transforms as transforms 5 | import numpy as np 6 | import cv2 7 | import scipy 8 | from skimage.io import imread, imsave 9 | from skimage.transform import estimate_transform, warp, resize, rescale 10 | from glob import glob 11 | 12 | from .vggface import VGGFace2Dataset 13 | from .ethnicity import EthnicityDataset 14 | from .aflw2000 import AFLW2000 15 | from .now import NoWDataset 16 | from .vox import VoxelDataset 17 | 18 | def build_train(config, is_train=True): 19 | data_list = [] 20 | if 'vox2' in config.training_data: 21 | data_list.append(VoxelDataset(dataname='vox2', K=config.K, image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale, isSingle=config.isSingle)) 22 | if 'vggface2' in config.training_data: 23 | data_list.append(VGGFace2Dataset(K=config.K, image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale, isSingle=config.isSingle)) 24 | if 'vggface2hq' in config.training_data: 25 | data_list.append(VGGFace2HQDataset(K=config.K, image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale, isSingle=config.isSingle)) 26 | if 'ethnicity' in config.training_data: 27 | data_list.append(EthnicityDataset(K=config.K, image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale, isSingle=config.isSingle)) 28 | if 'coco' in config.training_data: 29 | data_list.append(COCODataset(image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale)) 30 | if 'celebahq' in config.training_data: 31 | data_list.append(CelebAHQDataset(image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale)) 32 | dataset = ConcatDataset(data_list) 33 | 34 | return dataset 35 | 36 | def build_val(config, is_train=True): 37 | data_list = [] 38 | if 'vggface2' in config.eval_data: 39 | data_list.append(VGGFace2Dataset(isEval=True, K=config.K, image_size=config.image_size, scale=[config.scale_min, config.scale_max], trans_scale=config.trans_scale, isSingle=config.isSingle)) 40 | if 'now' in config.eval_data: 41 | data_list.append(NoWDataset()) 42 | if 'aflw2000' in config.eval_data: 43 | data_list.append(AFLW2000()) 44 | dataset = ConcatDataset(data_list) 45 | 46 | return dataset 47 | -------------------------------------------------------------------------------- /src/external/decalib/datasets/datasets.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is 4 | # holder of all proprietary rights on this computer program. 5 | # Using this computer program means that you agree to the terms 6 | # in the LICENSE file included with this software distribution. 
7 | # Any use not explicitly granted by the LICENSE is prohibited. 8 | # 9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung 10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute 11 | # for Intelligent Systems. All rights reserved. 12 | # 13 | # For comments or questions, please email us at deca@tue.mpg.de 14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de 15 | 16 | import os, sys 17 | import torch 18 | from torch.utils.data import Dataset, DataLoader 19 | import torchvision.transforms as transforms 20 | import numpy as np 21 | import cv2 22 | import scipy 23 | from skimage.io import imread, imsave 24 | from skimage.transform import estimate_transform, warp, resize, rescale 25 | from glob import glob 26 | import scipy.io 27 | 28 | from . import detectors 29 | 30 | def video2sequence(video_path, sample_step=10): 31 | videofolder = os.path.splitext(video_path)[0] 32 | os.makedirs(videofolder, exist_ok=True) 33 | video_name = os.path.splitext(os.path.split(video_path)[-1])[0] 34 | vidcap = cv2.VideoCapture(video_path) 35 | success,image = vidcap.read() 36 | count = 0 37 | imagepath_list = [] 38 | while success: 39 | # if count%sample_step == 0: 40 | imagepath = os.path.join(videofolder, f'{video_name}_frame{count:04d}.jpg') 41 | cv2.imwrite(imagepath, image) # save frame as JPEG file 42 | success,image = vidcap.read() 43 | count += 1 44 | imagepath_list.append(imagepath) 45 | print('video frames are stored in {}'.format(videofolder)) 46 | return imagepath_list 47 | 48 | class TestData(Dataset): 49 | def __init__(self, testpath, iscrop=True, crop_size=224, scale=1.25, face_detector='fan', sample_step=10): 50 | ''' 51 | testpath: folder, imagepath_list, image path, video path 52 | ''' 53 | if isinstance(testpath, list): 54 | self.imagepath_list = testpath 55 | elif os.path.isdir(testpath): 56 | self.imagepath_list = glob(testpath + '/*.jpg') + glob(testpath + '/*.png') + glob(testpath + '/*.bmp') 57 | elif os.path.isfile(testpath) and (testpath[-3:] in ['jpg', 'png', 'bmp']): 58 | self.imagepath_list = [testpath] 59 | elif os.path.isfile(testpath) and (testpath[-3:] in ['mp4', 'csv', 'vid', 'ebm']): 60 | self.imagepath_list = video2sequence(testpath, sample_step) 61 | else: 62 | print(f'please check the test path: {testpath}') 63 | exit() 64 | # print('total {} images'.format(len(self.imagepath_list))) 65 | self.imagepath_list = sorted(self.imagepath_list) 66 | self.crop_size = crop_size 67 | self.scale = scale 68 | self.iscrop = iscrop 69 | self.resolution_inp = crop_size 70 | if face_detector == 'fan': 71 | self.face_detector = detectors.FAN() 72 | # elif face_detector == 'mtcnn': 73 | # self.face_detector = detectors.MTCNN() 74 | else: 75 | print(f'please check the detector: {face_detector}') 76 | exit() 77 | 78 | def __len__(self): 79 | return len(self.imagepath_list) 80 | 81 | def bbox2point(self, left, right, top, bottom, type='bbox'): 82 | ''' bbox from detector and landmarks are different 83 | ''' 84 | if type=='kpt68': 85 | old_size = (right - left + bottom - top)/2*1.1 86 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ]) 87 | elif type=='bbox': 88 | old_size = (right - left + bottom - top)/2 89 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 + old_size*0.12]) 90 | else: 91 | raise NotImplementedError 92 | return old_size, center 93 | 94 | def __getitem__(self, index): 95 | imagepath = self.imagepath_list[index] 96 | imagename = 
os.path.splitext(os.path.split(imagepath)[-1])[0] 97 | image = np.array(imread(imagepath)) 98 | if len(image.shape) == 2: 99 | image = image[:,:,None].repeat(1,1,3) 100 | if len(image.shape) == 3 and image.shape[2] > 3: 101 | image = image[:,:,:3] 102 | 103 | h, w, _ = image.shape 104 | if h!=w: 105 | print('only support square image!') 106 | exit(-1) 107 | if self.iscrop: 108 | # provide kpt as txt file, or mat file (for AFLW2000) 109 | kpt_matpath = os.path.splitext(imagepath)[0]+'.mat' 110 | kpt_txtpath = os.path.splitext(imagepath)[0]+'.txt' 111 | if os.path.exists(kpt_matpath): 112 | kpt = scipy.io.loadmat(kpt_matpath)['pt3d_68'].T 113 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]); 114 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1]) 115 | old_size, center = self.bbox2point(left, right, top, bottom, type='kpt68') 116 | elif os.path.exists(kpt_txtpath): 117 | kpt = np.loadtxt(kpt_txtpath) 118 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]); 119 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1]) 120 | old_size, center = self.bbox2point(left, right, top, bottom, type='kpt68') 121 | else: 122 | bbox, bbox_type = self.face_detector.run(image) 123 | if len(bbox) < 4: 124 | print('no face detected! run original image') 125 | left = 0; right = h-1; top=0; bottom=w-1 126 | else: 127 | left = bbox[0]; right=bbox[2] 128 | top = bbox[1]; bottom=bbox[3] 129 | old_size, center = self.bbox2point(left, right, top, bottom, type=bbox_type) 130 | size = int(old_size*self.scale) 131 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]]) 132 | else: 133 | src_pts = np.array([[0, 0], [0, h-1], [w-1, 0]]) 134 | 135 | DST_PTS = np.array([[0,0], [0,self.resolution_inp - 1], [self.resolution_inp - 1, 0]]) 136 | tform = estimate_transform('similarity', src_pts, DST_PTS) 137 | 138 | image = image/255. 139 | 140 | dst_image = warp(image, tform.inverse, output_shape=(self.resolution_inp, self.resolution_inp)) 141 | # cv2.imwrite("../crop_img.png", dst_image * 255) 142 | dst_image = dst_image.transpose(2,0,1) 143 | return {'image': torch.tensor(dst_image).float(), 144 | 'imagename': imagename, 145 | 'tform': torch.tensor(tform.params).float(), 146 | 'original_image': torch.tensor(image.transpose(2,0,1)).float(), 147 | } -------------------------------------------------------------------------------- /src/external/decalib/datasets/detectors.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is 4 | # holder of all proprietary rights on this computer program. 5 | # Using this computer program means that you agree to the terms 6 | # in the LICENSE file included with this software distribution. 7 | # Any use not explicitly granted by the LICENSE is prohibited. 8 | # 9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung 10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute 11 | # for Intelligent Systems. All rights reserved. 
12 | # 13 | # For comments or questions, please email us at deca@tue.mpg.de 14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de 15 | 16 | import numpy as np 17 | import torch 18 | 19 | class FAN(object): 20 | def __init__(self): 21 | import face_alignment 22 | self.model = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, flip_input=False) 23 | 24 | def run(self, image): 25 | ''' 26 | image: 0-255, uint8, rgb, [h, w, 3] 27 | return: detected box list 28 | ''' 29 | out = self.model.get_landmarks(image) 30 | if out is None: 31 | return [0], 'kpt68' 32 | else: 33 | kpt = out[0].squeeze() 34 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]); 35 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1]) 36 | bbox = [left,top, right, bottom] 37 | return bbox, 'kpt68' 38 | 39 | class MTCNN(object): 40 | def __init__(self, device = 'cpu'): 41 | ''' 42 | https://github.com/timesler/facenet-pytorch/blob/master/examples/infer.ipynb 43 | ''' 44 | from facenet_pytorch import MTCNN as mtcnn 45 | self.device = device 46 | self.model = mtcnn(keep_all=True) 47 | def run(self, input): 48 | ''' 49 | image: 0-255, uint8, rgb, [h, w, 3] 50 | return: detected box 51 | ''' 52 | out = self.model.detect(input[None,...]) 53 | if out[0][0] is None: 54 | return [0] 55 | else: 56 | bbox = out[0][0].squeeze() 57 | return bbox, 'bbox' 58 | 59 | 60 | 61 | -------------------------------------------------------------------------------- /src/external/decalib/datasets/ethnicity.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import torch 3 | import torchvision.transforms as transforms 4 | import numpy as np 5 | import cv2 6 | import scipy 7 | from skimage.io import imread, imsave 8 | from skimage.transform import estimate_transform, warp, resize, rescale 9 | from glob import glob 10 | from torch.utils.data import Dataset, DataLoader, ConcatDataset 11 | 12 | class EthnicityDataset(Dataset): 13 | def __init__(self, K, image_size, scale, trans_scale = 0, isTemporal=False, isEval=False, isSingle=False): 14 | ''' 15 | K must be less than 6 16 | ''' 17 | self.K = K 18 | self.image_size = image_size 19 | self.imagefolder = '/ps/scratch/face2d3d/train' 20 | self.kptfolder = '/ps/scratch/face2d3d/train_annotated_torch7/' 21 | self.segfolder = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_seg/test_crop_size_400_batch/' 22 | # hq: 23 | # datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_bbx_size_bigger_than_400_train_list_max_normal_100_ring_5_1_serial.npy' 24 | datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_and_race_per_7000_african_asian_2d_train_list_max_normal_100_ring_5_1_serial.npy' 25 | self.data_lines = np.load(datafile).astype('str') 26 | 27 | self.isTemporal = isTemporal 28 | self.scale = scale #[scale_min, scale_max] 29 | self.trans_scale = trans_scale #[dx, dy] 30 | self.isSingle = isSingle 31 | if isSingle: 32 | self.K = 1 33 | 34 | def __len__(self): 35 | return len(self.data_lines) 36 | 37 | def __getitem__(self, idx): 38 | images_list = []; kpt_list = []; mask_list = [] 39 | for i in range(self.K): 40 | name = self.data_lines[idx, i] 41 | if name[0]=='n': 42 | self.imagefolder = '/ps/scratch/face2d3d/train/' 43 | self.kptfolder = '/ps/scratch/face2d3d/train_annotated_torch7/' 44 | self.segfolder = 
'/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_seg/test_crop_size_400_batch/' 45 | elif name[0]=='A': 46 | self.imagefolder = '/ps/scratch/face2d3d/race_per_7000/' 47 | self.kptfolder = '/ps/scratch/face2d3d/race_per_7000_annotated_torch7_new/' 48 | self.segfolder = '/ps/scratch/face2d3d/texture_in_the_wild_code/race7000_seg/test_crop_size_400_batch/' 49 | 50 | image_path = os.path.join(self.imagefolder, name + '.jpg') 51 | seg_path = os.path.join(self.segfolder, name + '.npy') 52 | kpt_path = os.path.join(self.kptfolder, name + '.npy') 53 | 54 | image = imread(image_path)/255. 55 | kpt = np.load(kpt_path)[:,:2] 56 | mask = self.load_mask(seg_path, image.shape[0], image.shape[1]) 57 | 58 | ### crop information 59 | tform = self.crop(image, kpt) 60 | ## crop 61 | cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size)) 62 | cropped_mask = warp(mask, tform.inverse, output_shape=(self.image_size, self.image_size)) 63 | cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params) 64 | 65 | # normalized kpt 66 | cropped_kpt[:,:2] = cropped_kpt[:,:2]/self.image_size * 2 - 1 67 | 68 | images_list.append(cropped_image.transpose(2,0,1)) 69 | kpt_list.append(cropped_kpt) 70 | mask_list.append(cropped_mask) 71 | 72 | ### 73 | images_array = torch.from_numpy(np.array(images_list)).type(dtype = torch.float32) #K,224,224,3 74 | kpt_array = torch.from_numpy(np.array(kpt_list)).type(dtype = torch.float32) #K,224,224,3 75 | mask_array = torch.from_numpy(np.array(mask_list)).type(dtype = torch.float32) #K,224,224,3 76 | 77 | if self.isSingle: 78 | images_array = images_array.squeeze() 79 | kpt_array = kpt_array.squeeze() 80 | mask_array = mask_array.squeeze() 81 | 82 | data_dict = { 83 | 'image': images_array, 84 | 'landmark': kpt_array, 85 | 'mask': mask_array 86 | } 87 | 88 | return data_dict 89 | 90 | def crop(self, image, kpt): 91 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]); 92 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1]) 93 | 94 | h, w, _ = image.shape 95 | old_size = (right - left + bottom - top)/2 96 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ])#+ old_size*0.1]) 97 | # translate center 98 | trans_scale = (np.random.rand(2)*2 -1) * self.trans_scale 99 | center = center + trans_scale*old_size # 0.5 100 | 101 | scale = np.random.rand() * (self.scale[1] - self.scale[0]) + self.scale[0] 102 | size = int(old_size*scale) 103 | 104 | # crop image 105 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]]) 106 | DST_PTS = np.array([[0,0], [0,self.image_size - 1], [self.image_size - 1, 0]]) 107 | tform = estimate_transform('similarity', src_pts, DST_PTS) 108 | 109 | # cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size)) 110 | # # change kpt accordingly 111 | # cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params) 112 | return tform 113 | 114 | def load_mask(self, maskpath, h, w): 115 | # print(maskpath) 116 | if os.path.isfile(maskpath): 117 | vis_parsing_anno = np.load(maskpath) 118 | # atts = ['skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r', 119 | # 'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat'] 120 | mask = np.zeros_like(vis_parsing_anno) 121 | # for i in range(1, 16): 122 | mask[vis_parsing_anno>0.5] = 1. 
123 | else: 124 | mask = np.ones((h, w)) 125 | return mask 126 | 127 | -------------------------------------------------------------------------------- /src/external/decalib/datasets/now.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import torch 3 | import torchvision.transforms as transforms 4 | import numpy as np 5 | import cv2 6 | import scipy 7 | from skimage.io import imread, imsave 8 | from skimage.transform import estimate_transform, warp, resize, rescale 9 | from glob import glob 10 | from torch.utils.data import Dataset, DataLoader, ConcatDataset 11 | 12 | class NoWDataset(Dataset): 13 | def __init__(self, ring_elements=6, crop_size=224, scale=1.6): 14 | folder = '/ps/scratch/yfeng/other-github/now_evaluation/data/NoW_Dataset' 15 | self.data_path = os.path.join(folder, 'imagepathsvalidation.txt') 16 | with open(self.data_path) as f: 17 | self.data_lines = f.readlines() 18 | 19 | self.imagefolder = os.path.join(folder, 'final_release_version', 'iphone_pictures') 20 | self.bbxfolder = os.path.join(folder, 'final_release_version', 'detected_face') 21 | 22 | # self.data_path = '/ps/scratch/face2d3d/ringnetpp/eccv/test_data/evaluation/NoW_Dataset/final_release_version/test_image_paths_ring_6_elements.npy' 23 | # self.imagepath = '/ps/scratch/face2d3d/ringnetpp/eccv/test_data/evaluation/NoW_Dataset/final_release_version/iphone_pictures/' 24 | # self.bbxpath = '/ps/scratch/face2d3d/ringnetpp/eccv/test_data/evaluation/NoW_Dataset/final_release_version/detected_face/' 25 | self.crop_size = crop_size 26 | self.scale = scale 27 | 28 | def __len__(self): 29 | return len(self.data_lines) 30 | 31 | def __getitem__(self, index): 32 | imagepath = os.path.join(self.imagefolder, self.data_lines[index].strip()) #+ '.jpg' 33 | bbx_path = os.path.join(self.bbxfolder, self.data_lines[index].strip().replace('.jpg', '.npy')) 34 | bbx_data = np.load(bbx_path, allow_pickle=True, encoding='latin1').item() 35 | # box = np.array([[bbx_data['left'], bbx_data['top']], [bbx_data['right'], bbx_data['bottom']]]).astype('float32') 36 | left = bbx_data['left']; right = bbx_data['right'] 37 | top = bbx_data['top']; bottom = bbx_data['bottom'] 38 | 39 | imagename = imagepath.split('/')[-1].split('.')[0] 40 | image = imread(imagepath)[:,:,:3] 41 | 42 | h, w, _ = image.shape 43 | old_size = (right - left + bottom - top)/2 44 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ]) 45 | size = int(old_size*self.scale) 46 | 47 | # crop image 48 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]]) 49 | DST_PTS = np.array([[0,0], [0,self.crop_size - 1], [self.crop_size - 1, 0]]) 50 | tform = estimate_transform('similarity', src_pts, DST_PTS) 51 | 52 | image = image/255. 
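# The same three-point similarity crop is used by every dataset loader in this
# folder: a face box is expanded to a square of side old_size*scale, and three
# of its corners are matched to three corners of the target square. A minimal
# standalone sketch of that recipe (the function and argument names are
# illustrative only, not part of this file):
#
#     import numpy as np
#     from skimage.transform import estimate_transform, warp
#
#     def square_crop(image, left, right, top, bottom, crop_size=224, scale=1.6):
#         old_size = (right - left + bottom - top) / 2
#         center = np.array([right - (right - left) / 2.0,
#                            bottom - (bottom - top) / 2.0])
#         size = int(old_size * scale)
#         src_pts = np.array([[center[0] - size / 2, center[1] - size / 2],
#                             [center[0] - size / 2, center[1] + size / 2],
#                             [center[0] + size / 2, center[1] - size / 2]])
#         dst_pts = np.array([[0, 0], [0, crop_size - 1], [crop_size - 1, 0]])
#         tform = estimate_transform('similarity', src_pts, dst_pts)
#         # warp() maps output pixels back into the source image, hence tform.inverse;
#         # the callers above also rescale the image to [0, 1] before warping.
#         return warp(image, tform.inverse, output_shape=(crop_size, crop_size)), tform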
53 | dst_image = warp(image, tform.inverse, output_shape=(self.crop_size, self.crop_size)) 54 | dst_image = dst_image.transpose(2,0,1) 55 | return {'image': torch.tensor(dst_image).float(), 56 | 'imagename': self.data_lines[index].strip().replace('.jpg', ''), 57 | # 'tform': tform, 58 | # 'original_image': torch.tensor(image.transpose(2,0,1)).float(), 59 | } -------------------------------------------------------------------------------- /src/external/decalib/datasets/vggface.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import torch 3 | import torchvision.transforms as transforms 4 | import numpy as np 5 | import cv2 6 | import scipy 7 | from skimage.io import imread, imsave 8 | from skimage.transform import estimate_transform, warp, resize, rescale 9 | from glob import glob 10 | from torch.utils.data import Dataset, DataLoader, ConcatDataset 11 | 12 | class VGGFace2Dataset(Dataset): 13 | def __init__(self, K, image_size, scale, trans_scale = 0, isTemporal=False, isEval=False, isSingle=False): 14 | ''' 15 | K must be less than 6 16 | ''' 17 | self.K = K 18 | self.image_size = image_size 19 | self.imagefolder = '/ps/scratch/face2d3d/train' 20 | self.kptfolder = '/ps/scratch/face2d3d/train_annotated_torch7' 21 | self.segfolder = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_seg/test_crop_size_400_batch' 22 | # hq: 23 | # datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_bbx_size_bigger_than_400_train_list_max_normal_100_ring_5_1_serial.npy' 24 | datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_train_list_max_normal_100_ring_5_1_serial.npy' 25 | if isEval: 26 | datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_val_list_max_normal_100_ring_5_1_serial.npy' 27 | self.data_lines = np.load(datafile).astype('str') 28 | 29 | self.isTemporal = isTemporal 30 | self.scale = scale #[scale_min, scale_max] 31 | self.trans_scale = trans_scale #[dx, dy] 32 | self.isSingle = isSingle 33 | if isSingle: 34 | self.K = 1 35 | 36 | def __len__(self): 37 | return len(self.data_lines) 38 | 39 | def __getitem__(self, idx): 40 | images_list = []; kpt_list = []; mask_list = [] 41 | 42 | random_ind = np.random.permutation(5)[:self.K] 43 | for i in random_ind: 44 | name = self.data_lines[idx, i] 45 | image_path = os.path.join(self.imagefolder, name + '.jpg') 46 | seg_path = os.path.join(self.segfolder, name + '.npy') 47 | kpt_path = os.path.join(self.kptfolder, name + '.npy') 48 | 49 | image = imread(image_path)/255. 
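# Note on the block below: after cropping, the keypoints are mapped into the
# crop with tform.params and then normalized to [-1, 1] via
# kpt / self.image_size * 2 - 1. For example, with image_size=224 a keypoint
# that lands at pixel (112, 112) of the crop becomes (0.0, 0.0), and the crop
# corners map to -1 and (223/224)*2 - 1 ≈ 0.99.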
50 | kpt = np.load(kpt_path)[:,:2] 51 | mask = self.load_mask(seg_path, image.shape[0], image.shape[1]) 52 | 53 | ### crop information 54 | tform = self.crop(image, kpt) 55 | ## crop 56 | cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size)) 57 | cropped_mask = warp(mask, tform.inverse, output_shape=(self.image_size, self.image_size)) 58 | cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params) 59 | 60 | # normalized kpt 61 | cropped_kpt[:,:2] = cropped_kpt[:,:2]/self.image_size * 2 - 1 62 | 63 | images_list.append(cropped_image.transpose(2,0,1)) 64 | kpt_list.append(cropped_kpt) 65 | mask_list.append(cropped_mask) 66 | 67 | ### 68 | images_array = torch.from_numpy(np.array(images_list)).type(dtype = torch.float32) #K,224,224,3 69 | kpt_array = torch.from_numpy(np.array(kpt_list)).type(dtype = torch.float32) #K,224,224,3 70 | mask_array = torch.from_numpy(np.array(mask_list)).type(dtype = torch.float32) #K,224,224,3 71 | 72 | if self.isSingle: 73 | images_array = images_array.squeeze() 74 | kpt_array = kpt_array.squeeze() 75 | mask_array = mask_array.squeeze() 76 | 77 | data_dict = { 78 | 'image': images_array, 79 | 'landmark': kpt_array, 80 | 'mask': mask_array 81 | } 82 | 83 | return data_dict 84 | 85 | def crop(self, image, kpt): 86 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]); 87 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1]) 88 | 89 | h, w, _ = image.shape 90 | old_size = (right - left + bottom - top)/2 91 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ])#+ old_size*0.1]) 92 | # translate center 93 | trans_scale = (np.random.rand(2)*2 -1) * self.trans_scale 94 | center = center + trans_scale*old_size # 0.5 95 | 96 | scale = np.random.rand() * (self.scale[1] - self.scale[0]) + self.scale[0] 97 | size = int(old_size*scale) 98 | 99 | # crop image 100 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]]) 101 | DST_PTS = np.array([[0,0], [0,self.image_size - 1], [self.image_size - 1, 0]]) 102 | tform = estimate_transform('similarity', src_pts, DST_PTS) 103 | 104 | # cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size)) 105 | # # change kpt accordingly 106 | # cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params) 107 | return tform 108 | 109 | def load_mask(self, maskpath, h, w): 110 | # print(maskpath) 111 | if os.path.isfile(maskpath): 112 | vis_parsing_anno = np.load(maskpath) 113 | # atts = ['skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r', 114 | # 'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat'] 115 | mask = np.zeros_like(vis_parsing_anno) 116 | # for i in range(1, 16): 117 | mask[vis_parsing_anno>0.5] = 1. 
118 | else: 119 | mask = np.ones((h, w)) 120 | return mask 121 | 122 | 123 | 124 | class VGGFace2HQDataset(Dataset): 125 | def __init__(self, K, image_size, scale, trans_scale = 0, isTemporal=False, isEval=False, isSingle=False): 126 | ''' 127 | K must be less than 6 128 | ''' 129 | self.K = K 130 | self.image_size = image_size 131 | self.imagefolder = '/ps/scratch/face2d3d/train' 132 | self.kptfolder = '/ps/scratch/face2d3d/train_annotated_torch7' 133 | self.segfolder = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_seg/test_crop_size_400_batch' 134 | # hq: 135 | # datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_bbx_size_bigger_than_400_train_list_max_normal_100_ring_5_1_serial.npy' 136 | datafile = '/ps/scratch/face2d3d/texture_in_the_wild_code/VGGFace2_cleaning_codes/ringnetpp_training_lists/second_cleaning/vggface2_bbx_size_bigger_than_400_train_list_max_normal_100_ring_5_1_serial.npy' 137 | self.data_lines = np.load(datafile).astype('str') 138 | 139 | self.isTemporal = isTemporal 140 | self.scale = scale #[scale_min, scale_max] 141 | self.trans_scale = trans_scale #[dx, dy] 142 | self.isSingle = isSingle 143 | if isSingle: 144 | self.K = 1 145 | 146 | def __len__(self): 147 | return len(self.data_lines) 148 | 149 | def __getitem__(self, idx): 150 | images_list = []; kpt_list = []; mask_list = [] 151 | 152 | for i in range(self.K): 153 | name = self.data_lines[idx, i] 154 | image_path = os.path.join(self.imagefolder, name + '.jpg') 155 | seg_path = os.path.join(self.segfolder, name + '.npy') 156 | kpt_path = os.path.join(self.kptfolder, name + '.npy') 157 | 158 | image = imread(image_path)/255. 159 | kpt = np.load(kpt_path)[:,:2] 160 | mask = self.load_mask(seg_path, image.shape[0], image.shape[1]) 161 | 162 | ### crop information 163 | tform = self.crop(image, kpt) 164 | ## crop 165 | cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size)) 166 | cropped_mask = warp(mask, tform.inverse, output_shape=(self.image_size, self.image_size)) 167 | cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params) 168 | 169 | # normalized kpt 170 | cropped_kpt[:,:2] = cropped_kpt[:,:2]/self.image_size * 2 - 1 171 | 172 | images_list.append(cropped_image.transpose(2,0,1)) 173 | kpt_list.append(cropped_kpt) 174 | mask_list.append(cropped_mask) 175 | 176 | ### 177 | images_array = torch.from_numpy(np.array(images_list)).type(dtype = torch.float32) #K,224,224,3 178 | kpt_array = torch.from_numpy(np.array(kpt_list)).type(dtype = torch.float32) #K,224,224,3 179 | mask_array = torch.from_numpy(np.array(mask_list)).type(dtype = torch.float32) #K,224,224,3 180 | 181 | if self.isSingle: 182 | images_array = images_array.squeeze() 183 | kpt_array = kpt_array.squeeze() 184 | mask_array = mask_array.squeeze() 185 | 186 | data_dict = { 187 | 'image': images_array, 188 | 'landmark': kpt_array, 189 | 'mask': mask_array 190 | } 191 | 192 | return data_dict 193 | 194 | def crop(self, image, kpt): 195 | left = np.min(kpt[:,0]); right = np.max(kpt[:,0]); 196 | top = np.min(kpt[:,1]); bottom = np.max(kpt[:,1]) 197 | 198 | h, w, _ = image.shape 199 | old_size = (right - left + bottom - top)/2 200 | center = np.array([right - (right - left) / 2.0, bottom - (bottom - top) / 2.0 ])#+ old_size*0.1]) 201 | # translate center 202 | trans_scale = (np.random.rand(2)*2 -1) * self.trans_scale 203 | center = center + trans_scale*old_size # 0.5 204 | 205 
| scale = np.random.rand() * (self.scale[1] - self.scale[0]) + self.scale[0] 206 | size = int(old_size*scale) 207 | 208 | # crop image 209 | src_pts = np.array([[center[0]-size/2, center[1]-size/2], [center[0] - size/2, center[1]+size/2], [center[0]+size/2, center[1]-size/2]]) 210 | DST_PTS = np.array([[0,0], [0,self.image_size - 1], [self.image_size - 1, 0]]) 211 | tform = estimate_transform('similarity', src_pts, DST_PTS) 212 | 213 | # cropped_image = warp(image, tform.inverse, output_shape=(self.image_size, self.image_size)) 214 | # # change kpt accordingly 215 | # cropped_kpt = np.dot(tform.params, np.hstack([kpt, np.ones([kpt.shape[0],1])]).T).T # np.linalg.inv(tform.params) 216 | return tform 217 | 218 | def load_mask(self, maskpath, h, w): 219 | # print(maskpath) 220 | if os.path.isfile(maskpath): 221 | vis_parsing_anno = np.load(maskpath) 222 | # atts = ['skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r', 223 | # 'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat'] 224 | mask = np.zeros_like(vis_parsing_anno) 225 | # for i in range(1, 16): 226 | mask[vis_parsing_anno>0.5] = 1. 227 | else: 228 | mask = np.ones((h, w)) 229 | return mask -------------------------------------------------------------------------------- /src/external/decalib/datasets/vox.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import torch 3 | import torchvision.transforms as transforms 4 | import numpy as np 5 | import cv2 6 | import scipy 7 | from skimage.io import imread, imsave 8 | from skimage.transform import estimate_transform, warp, resize, rescale 9 | from glob import glob 10 | from torch.utils.data import Dataset, DataLoader, ConcatDataset 11 | 12 | class VoxelDataset(Dataset): 13 | def __init__(self, K, image_size, scale, trans_scale = 0, dataname='vox2', n_train=100000, isTemporal=False, isEval=False, isSingle=False): 14 | self.K = K 15 | self.image_size = image_size 16 | if dataname == 'vox1': 17 | self.kpt_suffix = '.txt' 18 | self.imagefolder = '/ps/project/face2d3d/VoxCeleb/vox1/dev/images_cropped' 19 | self.kptfolder = '/ps/scratch/yfeng/Data/VoxCeleb/vox1/landmark_2d' 20 | 21 | self.face_dict = {} 22 | for person_id in sorted(os.listdir(self.kptfolder)): 23 | for video_id in os.listdir(os.path.join(self.kptfolder, person_id)): 24 | for face_id in os.listdir(os.path.join(self.kptfolder, person_id, video_id)): 25 | if 'txt' in face_id: 26 | continue 27 | key = person_id + '/' + video_id + '/' + face_id 28 | # if key not in self.face_dict.keys(): 29 | # self.face_dict[key] = [] 30 | name_list = os.listdir(os.path.join(self.kptfolder, person_id, video_id, face_id)) 31 | name_list = [name.split['.'][0] for name in name_list] 32 | if len(name_list)0.5] = 1. 162 | else: 163 | mask = np.ones((h, w)) 164 | return mask 165 | -------------------------------------------------------------------------------- /src/external/decalib/deca.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is 4 | # holder of all proprietary rights on this computer program. 5 | # Using this computer program means that you agree to the terms 6 | # in the LICENSE file included with this software distribution. 7 | # Any use not explicitly granted by the LICENSE is prohibited. 8 | # 9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung 10 | # der Wissenschaften e.V. (MPG). 
acting on behalf of its Max Planck Institute 11 | # for Intelligent Systems. All rights reserved. 12 | # 13 | # For comments or questions, please email us at deca@tue.mpg.de 14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de 15 | 16 | import os 17 | import torch 18 | import torch.nn as nn 19 | from .models.encoders import ResnetEncoder 20 | from .utils import util 21 | from .utils.rotation_converter import batch_euler2axis 22 | from .utils.config import cfg 23 | torch.backends.cudnn.benchmark = True 24 | 25 | class DECA(nn.Module): 26 | def __init__(self, config=None, device='cuda'): 27 | super(DECA, self).__init__() 28 | if config is None: 29 | self.cfg = cfg 30 | else: 31 | self.cfg = config 32 | self.device = device 33 | self.image_size = self.cfg.dataset.image_size 34 | self.uv_size = self.cfg.model.uv_size 35 | 36 | self._create_model(self.cfg.model) 37 | 38 | def _create_model(self, model_cfg): 39 | # set up parameters 40 | self.n_param = model_cfg.n_shape+model_cfg.n_tex+model_cfg.n_exp+model_cfg.n_pose+model_cfg.n_cam+model_cfg.n_light 41 | self.n_detail = model_cfg.n_detail 42 | self.n_cond = model_cfg.n_exp + 3 # exp + jaw pose 43 | self.num_list = [model_cfg.n_shape, model_cfg.n_tex, model_cfg.n_exp, model_cfg.n_pose, model_cfg.n_cam, model_cfg.n_light] 44 | self.param_dict = {i:model_cfg.get('n_' + i) for i in model_cfg.param_list} 45 | 46 | # encoders 47 | self.E_flame = ResnetEncoder(outsize=self.n_param).to(self.device) 48 | self.E_detail = ResnetEncoder(outsize=self.n_detail).to(self.device) 49 | # resume model 50 | model_path = self.cfg.pretrained_modelpath 51 | if os.path.exists(model_path): 52 | print(f'trained model found. load {model_path}') 53 | checkpoint = torch.load(model_path) 54 | self.checkpoint = checkpoint 55 | util.copy_state_dict(self.E_flame.state_dict(), checkpoint['E_flame']) 56 | util.copy_state_dict(self.E_detail.state_dict(), checkpoint['E_detail']) 57 | else: 58 | print(f'please check model path: {model_path}') 59 | # exit() 60 | # eval mode 61 | self.E_flame.eval() 62 | self.E_detail.eval() 63 | 64 | def decompose_code(self, code, num_dict): 65 | ''' Convert a flattened parameter vector to a dictionary of parameters 66 | code_dict.keys() = ['shape', 'tex', 'exp', 'pose', 'cam', 'light'] 67 | ''' 68 | code_dict = {} 69 | start = 0 70 | for key in num_dict: 71 | end = start+int(num_dict[key]) 72 | code_dict[key] = code[:, start:end] 73 | start = end 74 | if key == 'light': 75 | code_dict[key] = code_dict[key].reshape(code_dict[key].shape[0], 9, 3) 76 | return code_dict 77 | 78 | # @torch.no_grad() 79 | def encode(self, images, use_detail=False): 80 | if use_detail: 81 | # use_detail is for training detail model, need to set coarse model as eval mode 82 | with torch.no_grad(): 83 | parameters = self.E_flame(images) 84 | else: 85 | parameters = self.E_flame(images) 86 | codedict = self.decompose_code(parameters, self.param_dict) 87 | codedict['images'] = images 88 | if use_detail: 89 | detailcode = self.E_detail(images) 90 | codedict['detail'] = detailcode 91 | if self.cfg.model.jaw_type == 'euler': 92 | posecode = codedict['pose'] 93 | euler_jaw_pose = posecode[:,3:].clone() # x for yaw (open mouth), y for pitch (left ang right), z for roll 94 | posecode[:,3:] = batch_euler2axis(euler_jaw_pose) 95 | codedict['pose'] = posecode 96 | codedict['euler_jaw_pose'] = euler_jaw_pose 97 | return codedict 98 | 99 | def ensemble_3DMM_params(self, codedict, image_size, original_image_size): 100 | i = 0 101 | cam = codedict['cam'] 
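# ensemble_3DMM_params flattens the regressed codes for the first batch element
# (i = 0) into a plain dict of floats/lists so they can be written to disk by
# fitting.py. For reference, with the defaults in utils/config.py the encoder
# output decomposed by decompose_code above is a 236-D vector sliced in
# param_list order:
#     code[:,   0:100] -> 'shape'
#     code[:, 100:150] -> 'tex'
#     code[:, 150:200] -> 'exp'
#     code[:, 200:206] -> 'pose'
#     code[:, 206:209] -> 'cam'   (weak-perspective scale, tx, ty)
#     code[:, 209:236] -> 'light' (then reshaped to (B, 9, 3))
# codedict['tform'] is the crop transform estimated in datasets.TestData and is
# combined with 'cam' and the image sizes by util.calculate_scale_tx_ty below.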
102 | tform = codedict['tform'] 103 | scale, tx, ty, sz = util.calculate_scale_tx_ty(cam, tform, image_size, original_image_size) 104 | crop_scale, crop_tx, crop_ty, crop_sz = util.calculate_crop_scale_tx_ty(cam) 105 | scale = float(scale[i].cpu()) 106 | tx = float(tx[i].cpu()) 107 | ty = float(ty[i].cpu()) 108 | sz = float(sz[i].cpu()) 109 | 110 | crop_scale = float(crop_scale[i].cpu()) 111 | crop_tx = float(crop_tx[i].cpu()) 112 | crop_ty = float(crop_ty[i].cpu()) 113 | crop_sz = float(crop_sz[i].cpu()) 114 | 115 | shape_params = codedict['shape'][i].cpu().numpy() 116 | expression_params = codedict['exp'][i].cpu().numpy() 117 | pose_params = codedict['pose'][i].cpu().numpy() 118 | 119 | face_model_paras = dict() 120 | face_model_paras['shape'] = shape_params.tolist() 121 | face_model_paras['exp'] = expression_params.tolist() 122 | face_model_paras['pose'] = pose_params.tolist() 123 | face_model_paras['cam'] = cam[i].cpu().numpy().tolist() 124 | 125 | face_model_paras['scale'] = scale 126 | face_model_paras['tx'] = tx 127 | face_model_paras['ty'] = ty 128 | face_model_paras['sz'] = sz 129 | 130 | face_model_paras['crop_scale'] = crop_scale 131 | face_model_paras['crop_tx'] = crop_tx 132 | face_model_paras['crop_ty'] = crop_ty 133 | face_model_paras['crop_sz'] = crop_sz 134 | return face_model_paras 135 | -------------------------------------------------------------------------------- /src/external/decalib/models/FLAME.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is 4 | # holder of all proprietary rights on this computer program. 5 | # Using this computer program means that you agree to the terms 6 | # in the LICENSE file included with this software distribution. 7 | # Any use not explicitly granted by the LICENSE is prohibited. 8 | # 9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung 10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute 11 | # for Intelligent Systems. All rights reserved. 
12 | # 13 | # For comments or questions, please email us at deca@tue.mpg.de 14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de 15 | 16 | import torch 17 | import torch.nn as nn 18 | import numpy as np 19 | import pickle 20 | import torch.nn.functional as F 21 | import json 22 | 23 | from .lbs import lbs, batch_rodrigues, vertices2landmarks, rot_mat_to_euler 24 | 25 | def to_tensor(array, dtype=torch.float32): 26 | if 'torch.tensor' not in str(type(array)): 27 | return torch.tensor(array, dtype=dtype) 28 | def to_np(array, dtype=np.float32): 29 | if 'scipy.sparse' in str(type(array)): 30 | array = array.todense() 31 | return np.array(array, dtype=dtype) 32 | 33 | class Struct(object): 34 | def __init__(self, **kwargs): 35 | for key, val in kwargs.items(): 36 | setattr(self, key, val) 37 | 38 | class FLAME(nn.Module): 39 | """ 40 | borrowed from https://github.com/soubhiksanyal/FLAME_PyTorch/blob/master/FLAME.py 41 | Given flame parameters this class generates a differentiable FLAME function 42 | which outputs the a mesh and 2D/3D facial landmarks 43 | """ 44 | def __init__(self, config): 45 | super(FLAME, self).__init__() 46 | print("creating the FLAME Decoder") 47 | with open(config.flame_model_path, 'rb') as f: 48 | ss = pickle.load(f, encoding='latin1') 49 | flame_model = Struct(**ss) 50 | 51 | self.dtype = torch.float32 52 | self.register_buffer('faces_tensor', to_tensor(to_np(flame_model.f, dtype=np.int64), dtype=torch.long)) 53 | # The vertices of the template model 54 | self.register_buffer('v_template', to_tensor(to_np(flame_model.v_template), dtype=self.dtype)) 55 | # The shape components and expression 56 | shapedirs = to_tensor(to_np(flame_model.shapedirs), dtype=self.dtype) 57 | shapedirs = torch.cat([shapedirs[:,:,:config.n_shape], shapedirs[:,:,300:300+config.n_exp]], 2) 58 | self.register_buffer('shapedirs', shapedirs) 59 | # The pose components 60 | num_pose_basis = flame_model.posedirs.shape[-1] 61 | posedirs = np.reshape(flame_model.posedirs, [-1, num_pose_basis]).T 62 | self.register_buffer('posedirs', to_tensor(to_np(posedirs), dtype=self.dtype)) 63 | # 64 | self.register_buffer('J_regressor', to_tensor(to_np(flame_model.J_regressor), dtype=self.dtype)) 65 | parents = to_tensor(to_np(flame_model.kintree_table[0])).long(); parents[0] = -1 66 | self.register_buffer('parents', parents) 67 | self.register_buffer('lbs_weights', to_tensor(to_np(flame_model.weights), dtype=self.dtype)) 68 | 69 | # Fixing Eyeball and neck rotation 70 | default_eyball_pose = torch.zeros([1, 6], dtype=self.dtype, requires_grad=False) 71 | self.register_parameter('eye_pose', nn.Parameter(default_eyball_pose, 72 | requires_grad=False)) 73 | default_neck_pose = torch.zeros([1, 3], dtype=self.dtype, requires_grad=False) 74 | self.register_parameter('neck_pose', nn.Parameter(default_neck_pose, 75 | requires_grad=False)) 76 | 77 | with open(config.flame_lmk_embedding_path, 'r') as f: 78 | lmk_embeddings = json.load(f) 79 | 80 | self.lmk_faces_idx = torch.tensor(lmk_embeddings['lmk_faces_idx']).long().unsqueeze(0) 81 | self.lmk_bary_coords = torch.tensor(lmk_embeddings['lmk_bary_coords']).to(self.dtype).unsqueeze(0) 82 | 83 | def forward(self, shape_params=None, expression_params=None, pose_params=None, eye_pose_params=None): 84 | """ 85 | Input: 86 | shape_params: N X number of shape parameters 87 | expression_params: N X number of expression parameters 88 | pose_params: N X number of pose parameters (6) 89 | return:d 90 | vertices: N X V X 3 91 | landmarks: N X number 
of landmarks X 3 92 | """ 93 | batch_size = shape_params.shape[0] 94 | if pose_params is None: 95 | pose_params = self.eye_pose.expand(batch_size, -1) 96 | if eye_pose_params is None: 97 | eye_pose_params = self.eye_pose.expand(batch_size, -1) 98 | betas = torch.cat([shape_params, expression_params], dim=1) 99 | full_pose = torch.cat([pose_params[:, :3], self.neck_pose.expand(batch_size, -1), pose_params[:, 3:], eye_pose_params], dim=1) 100 | template_vertices = self.v_template.unsqueeze(0).expand(batch_size, -1, -1) 101 | 102 | vertices, _ = lbs(betas, full_pose, template_vertices, 103 | self.shapedirs, self.posedirs, 104 | self.J_regressor, self.parents, 105 | self.lbs_weights, dtype=self.dtype) 106 | bz = vertices.shape[0] 107 | landmarks3d = vertices2landmarks(vertices, self.faces_tensor, 108 | self.lmk_faces_idx.repeat(bz, 1), 109 | self.lmk_bary_coords.repeat(bz, 1, 1)) 110 | return vertices, landmarks3d 111 | 112 | class FLAMETex(nn.Module): 113 | """ 114 | FLAME texture: 115 | https://github.com/TimoBolkart/TF_FLAME/blob/ade0ab152300ec5f0e8555d6765411555c5ed43d/sample_texture.py#L64 116 | FLAME texture converted from BFM: 117 | https://github.com/TimoBolkart/BFM_to_FLAME 118 | """ 119 | def __init__(self, config): 120 | super(FLAMETex, self).__init__() 121 | if config.tex_type == 'BFM': 122 | mu_key = 'MU' 123 | pc_key = 'PC' 124 | n_pc = 199 125 | tex_path = config.tex_path 126 | tex_space = np.load(tex_path) 127 | texture_mean = tex_space[mu_key].reshape(1, -1) 128 | texture_basis = tex_space[pc_key].reshape(-1, n_pc) 129 | 130 | elif config.tex_type == 'FLAME': 131 | mu_key = 'mean' 132 | pc_key = 'tex_dir' 133 | n_pc = 200 134 | tex_path = config.flame_tex_path 135 | tex_space = np.load(tex_path) 136 | texture_mean = tex_space[mu_key].reshape(1, -1)/255. 137 | texture_basis = tex_space[pc_key].reshape(-1, n_pc)/255. 138 | else: 139 | print('texture type ', config.tex_type, 'not exist!') 140 | raise NotImplementedError 141 | 142 | n_tex = config.n_tex 143 | num_components = texture_basis.shape[1] 144 | texture_mean = torch.from_numpy(texture_mean).float()[None,...] 145 | texture_basis = torch.from_numpy(texture_basis[:,:n_tex]).float()[None,...] 146 | self.register_buffer('texture_mean', texture_mean) 147 | self.register_buffer('texture_basis', texture_basis) 148 | 149 | def forward(self, texcode): 150 | ''' 151 | texcode: [batchsize, n_tex] 152 | texture: [bz, 3, 256, 256], range: 0-1 153 | ''' 154 | texture = self.texture_mean + (self.texture_basis*texcode[:,None,:]).sum(-1) 155 | texture = texture.reshape(texcode.shape[0], 512, 512, 3).permute(0,3,1,2) 156 | texture = F.interpolate(texture, [256, 256]) 157 | texture = texture[:,[2,1,0], :,:] 158 | return texture -------------------------------------------------------------------------------- /src/external/decalib/models/decoders.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is 4 | # holder of all proprietary rights on this computer program. 5 | # Using this computer program means that you agree to the terms 6 | # in the LICENSE file included with this software distribution. 7 | # Any use not explicitly granted by the LICENSE is prohibited. 8 | # 9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung 10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute 11 | # for Intelligent Systems. All rights reserved. 
12 | # 13 | # For comments or questions, please email us at deca@tue.mpg.de 14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de 15 | 16 | import torch 17 | import torch.nn as nn 18 | 19 | class Generator(nn.Module): 20 | def __init__(self, latent_dim=100, out_channels=1, out_scale=0.01, sample_mode = 'bilinear'): 21 | super(Generator, self).__init__() 22 | self.out_scale = out_scale 23 | 24 | self.init_size = 32 // 4 # Initial size before upsampling 25 | self.l1 = nn.Sequential(nn.Linear(latent_dim, 128 * self.init_size ** 2)) 26 | self.conv_blocks = nn.Sequential( 27 | nn.BatchNorm2d(128), 28 | nn.Upsample(scale_factor=2, mode=sample_mode), #16 29 | nn.Conv2d(128, 128, 3, stride=1, padding=1), 30 | nn.BatchNorm2d(128, 0.8), 31 | nn.LeakyReLU(0.2, inplace=True), 32 | nn.Upsample(scale_factor=2, mode=sample_mode), #32 33 | nn.Conv2d(128, 64, 3, stride=1, padding=1), 34 | nn.BatchNorm2d(64, 0.8), 35 | nn.LeakyReLU(0.2, inplace=True), 36 | nn.Upsample(scale_factor=2, mode=sample_mode), #64 37 | nn.Conv2d(64, 64, 3, stride=1, padding=1), 38 | nn.BatchNorm2d(64, 0.8), 39 | nn.LeakyReLU(0.2, inplace=True), 40 | nn.Upsample(scale_factor=2, mode=sample_mode), #128 41 | nn.Conv2d(64, 32, 3, stride=1, padding=1), 42 | nn.BatchNorm2d(32, 0.8), 43 | nn.LeakyReLU(0.2, inplace=True), 44 | nn.Upsample(scale_factor=2, mode=sample_mode), #256 45 | nn.Conv2d(32, 16, 3, stride=1, padding=1), 46 | nn.BatchNorm2d(16, 0.8), 47 | nn.LeakyReLU(0.2, inplace=True), 48 | nn.Conv2d(16, out_channels, 3, stride=1, padding=1), 49 | nn.Tanh(), 50 | ) 51 | 52 | def forward(self, noise): 53 | out = self.l1(noise) 54 | out = out.view(out.shape[0], 128, self.init_size, self.init_size) 55 | img = self.conv_blocks(out) 56 | return img*self.out_scale -------------------------------------------------------------------------------- /src/external/decalib/models/encoders.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. (MPG) is 4 | # holder of all proprietary rights on this computer program. 5 | # Using this computer program means that you agree to the terms 6 | # in the LICENSE file included with this software distribution. 7 | # Any use not explicitly granted by the LICENSE is prohibited. 8 | # 9 | # Copyright©2019 Max-Planck-Gesellschaft zur Förderung 10 | # der Wissenschaften e.V. (MPG). acting on behalf of its Max Planck Institute 11 | # for Intelligent Systems. All rights reserved. 12 | # 13 | # For comments or questions, please email us at deca@tue.mpg.de 14 | # For commercial licensing contact, please contact ps-license@tuebingen.mpg.de 15 | 16 | import numpy as np 17 | import torch.nn as nn 18 | import torch 19 | import torch.nn.functional as F 20 | from . 
import resnet 21 | 22 | class ResnetEncoder(nn.Module): 23 | def __init__(self, outsize, last_op=None): 24 | super(ResnetEncoder, self).__init__() 25 | feature_size = 2048 26 | self.encoder = resnet.load_ResNet50Model() #out: 2048 27 | ### regressor 28 | self.layers = nn.Sequential( 29 | nn.Linear(feature_size, 1024), 30 | nn.ReLU(), 31 | nn.Linear(1024, outsize) 32 | ) 33 | self.last_op = last_op 34 | 35 | def forward(self, inputs): 36 | features = self.encoder(inputs) 37 | parameters = self.layers(features) 38 | if self.last_op: 39 | parameters = self.last_op(parameters) 40 | return parameters 41 | -------------------------------------------------------------------------------- /src/external/decalib/models/frnet.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import numpy as np 3 | import torch 4 | # from pro_gan_pytorch.PRO_GAN import ProGAN, Generator, Discriminator 5 | import torch.nn.functional as F 6 | import cv2 7 | from torch.autograd import Variable 8 | import math 9 | 10 | def conv3x3(in_planes, out_planes, stride=1): 11 | """3x3 convolution with padding""" 12 | return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, 13 | padding=1, bias=False) 14 | 15 | class BasicBlock(nn.Module): 16 | expansion = 1 17 | 18 | def __init__(self, inplanes, planes, stride=1, downsample=None): 19 | super(BasicBlock, self).__init__() 20 | self.conv1 = conv3x3(inplanes, planes, stride) 21 | self.bn1 = nn.BatchNorm2d(planes) 22 | self.relu = nn.ReLU(inplace=True) 23 | self.conv2 = conv3x3(planes, planes) 24 | self.bn2 = nn.BatchNorm2d(planes) 25 | self.downsample = downsample 26 | self.stride = stride 27 | 28 | def forward(self, x): 29 | residual = x 30 | 31 | out = self.conv1(x) 32 | out = self.bn1(out) 33 | out = self.relu(out) 34 | 35 | out = self.conv2(out) 36 | out = self.bn2(out) 37 | 38 | if self.downsample is not None: 39 | residual = self.downsample(x) 40 | 41 | out += residual 42 | out = self.relu(out) 43 | 44 | return out 45 | 46 | 47 | class Bottleneck(nn.Module): 48 | expansion = 4 49 | 50 | def __init__(self, inplanes, planes, stride=1, downsample=None): 51 | super(Bottleneck, self).__init__() 52 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, stride=stride, bias=False) 53 | self.bn1 = nn.BatchNorm2d(planes) 54 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False) 55 | self.bn2 = nn.BatchNorm2d(planes) 56 | self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False) 57 | self.bn3 = nn.BatchNorm2d(planes * 4) 58 | self.relu = nn.ReLU(inplace=True) 59 | self.downsample = downsample 60 | self.stride = stride 61 | 62 | def forward(self, x): 63 | residual = x 64 | 65 | out = self.conv1(x) 66 | out = self.bn1(out) 67 | out = self.relu(out) 68 | 69 | out = self.conv2(out) 70 | out = self.bn2(out) 71 | out = self.relu(out) 72 | 73 | out = self.conv3(out) 74 | out = self.bn3(out) 75 | 76 | if self.downsample is not None: 77 | residual = self.downsample(x) 78 | 79 | out += residual 80 | out = self.relu(out) 81 | 82 | return out 83 | 84 | 85 | class ResNet(nn.Module): 86 | 87 | def __init__(self, block, layers, num_classes=1000, include_top=True): 88 | self.inplanes = 64 89 | super(ResNet, self).__init__() 90 | self.include_top = include_top 91 | 92 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False) 93 | self.bn1 = nn.BatchNorm2d(64) 94 | self.relu = nn.ReLU(inplace=True) 95 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, 
padding=0, ceil_mode=True) 96 | 97 | self.layer1 = self._make_layer(block, 64, layers[0]) 98 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2) 99 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2) 100 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2) 101 | self.avgpool = nn.AvgPool2d(7, stride=1) 102 | self.fc = nn.Linear(512 * block.expansion, num_classes) 103 | 104 | for m in self.modules(): 105 | if isinstance(m, nn.Conv2d): 106 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 107 | m.weight.data.normal_(0, math.sqrt(2. / n)) 108 | elif isinstance(m, nn.BatchNorm2d): 109 | m.weight.data.fill_(1) 110 | m.bias.data.zero_() 111 | 112 | def _make_layer(self, block, planes, blocks, stride=1): 113 | downsample = None 114 | if stride != 1 or self.inplanes != planes * block.expansion: 115 | downsample = nn.Sequential( 116 | nn.Conv2d(self.inplanes, planes * block.expansion, 117 | kernel_size=1, stride=stride, bias=False), 118 | nn.BatchNorm2d(planes * block.expansion), 119 | ) 120 | 121 | layers = [] 122 | layers.append(block(self.inplanes, planes, stride, downsample)) 123 | self.inplanes = planes * block.expansion 124 | for i in range(1, blocks): 125 | layers.append(block(self.inplanes, planes)) 126 | 127 | return nn.Sequential(*layers) 128 | 129 | def forward(self, x): 130 | x = self.conv1(x) 131 | x = self.bn1(x) 132 | x = self.relu(x) 133 | x = self.maxpool(x) 134 | 135 | x = self.layer1(x) 136 | x = self.layer2(x) 137 | x = self.layer3(x) 138 | x = self.layer4(x) 139 | 140 | x = self.avgpool(x) 141 | 142 | if not self.include_top: 143 | return x 144 | 145 | x = x.view(x.size(0), -1) 146 | x = self.fc(x) 147 | return x 148 | 149 | def resnet50(**kwargs): 150 | """Constructs a ResNet-50 model. 151 | """ 152 | model = ResNet(Bottleneck, [3, 4, 6, 3], **kwargs) 153 | return model 154 | 155 | import pickle 156 | def load_state_dict(model, fname): 157 | """ 158 | Set parameters converted from Caffe models authors of VGGFace2 provide. 159 | See https://www.robots.ox.ac.uk/~vgg/data/vgg_face2/. 160 | Arguments: 161 | model: model 162 | fname: file name of parameters converted from a Caffe model, assuming the file format is Pickle. 163 | """ 164 | with open(fname, 'rb') as f: 165 | weights = pickle.load(f, encoding='latin1') 166 | 167 | own_state = model.state_dict() 168 | for name, param in weights.items(): 169 | if name in own_state: 170 | try: 171 | own_state[name].copy_(torch.from_numpy(param)) 172 | except Exception: 173 | raise RuntimeError('While copying the parameter named {}, whose dimensions in the model are {} and whose '\ 174 | 'dimensions in the checkpoint are {}.'.format(name, own_state[name].size(), param.size())) 175 | else: 176 | raise KeyError('unexpected key "{}" in state_dict'.format(name)) 177 | 178 | -------------------------------------------------------------------------------- /src/external/decalib/models/resnet.py: -------------------------------------------------------------------------------- 1 | """ 2 | Author: Soubhik Sanyal 3 | Copyright (c) 2019, Soubhik Sanyal 4 | All rights reserved. 
5 | Loads different resnet models 6 | """ 7 | ''' 8 | file: Resnet.py 9 | date: 2018_05_02 10 | author: zhangxiong(1025679612@qq.com) 11 | mark: copied from pytorch source code 12 | ''' 13 | 14 | import torch.nn as nn 15 | import torch.nn.functional as F 16 | import torch 17 | from torch.nn.parameter import Parameter 18 | import torch.optim as optim 19 | import numpy as np 20 | import math 21 | import torchvision 22 | 23 | class ResNet(nn.Module): 24 | def __init__(self, block, layers, num_classes=1000): 25 | self.inplanes = 64 26 | super(ResNet, self).__init__() 27 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, 28 | bias=False) 29 | self.bn1 = nn.BatchNorm2d(64) 30 | self.relu = nn.ReLU(inplace=True) 31 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) 32 | self.layer1 = self._make_layer(block, 64, layers[0]) 33 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2) 34 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2) 35 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2) 36 | self.avgpool = nn.AvgPool2d(7, stride=1) 37 | # self.fc = nn.Linear(512 * block.expansion, num_classes) 38 | 39 | for m in self.modules(): 40 | if isinstance(m, nn.Conv2d): 41 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 42 | m.weight.data.normal_(0, math.sqrt(2. / n)) 43 | elif isinstance(m, nn.BatchNorm2d): 44 | m.weight.data.fill_(1) 45 | m.bias.data.zero_() 46 | 47 | def _make_layer(self, block, planes, blocks, stride=1): 48 | downsample = None 49 | if stride != 1 or self.inplanes != planes * block.expansion: 50 | downsample = nn.Sequential( 51 | nn.Conv2d(self.inplanes, planes * block.expansion, 52 | kernel_size=1, stride=stride, bias=False), 53 | nn.BatchNorm2d(planes * block.expansion), 54 | ) 55 | 56 | layers = [] 57 | layers.append(block(self.inplanes, planes, stride, downsample)) 58 | self.inplanes = planes * block.expansion 59 | for i in range(1, blocks): 60 | layers.append(block(self.inplanes, planes)) 61 | 62 | return nn.Sequential(*layers) 63 | 64 | def forward(self, x): 65 | x = self.conv1(x) 66 | x = self.bn1(x) 67 | x = self.relu(x) 68 | x = self.maxpool(x) 69 | 70 | x = self.layer1(x) 71 | x = self.layer2(x) 72 | x = self.layer3(x) 73 | x1 = self.layer4(x) 74 | 75 | x2 = self.avgpool(x1) 76 | x2 = x2.view(x2.size(0), -1) 77 | # x = self.fc(x) 78 | ## x2: [bz, 2048] for shape 79 | ## x1: [bz, 2048, 7, 7] for texture 80 | return x2 81 | 82 | class Bottleneck(nn.Module): 83 | expansion = 4 84 | 85 | def __init__(self, inplanes, planes, stride=1, downsample=None): 86 | super(Bottleneck, self).__init__() 87 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False) 88 | self.bn1 = nn.BatchNorm2d(planes) 89 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, 90 | padding=1, bias=False) 91 | self.bn2 = nn.BatchNorm2d(planes) 92 | self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False) 93 | self.bn3 = nn.BatchNorm2d(planes * 4) 94 | self.relu = nn.ReLU(inplace=True) 95 | self.downsample = downsample 96 | self.stride = stride 97 | 98 | def forward(self, x): 99 | residual = x 100 | 101 | out = self.conv1(x) 102 | out = self.bn1(out) 103 | out = self.relu(out) 104 | 105 | out = self.conv2(out) 106 | out = self.bn2(out) 107 | out = self.relu(out) 108 | 109 | out = self.conv3(out) 110 | out = self.bn3(out) 111 | 112 | if self.downsample is not None: 113 | residual = self.downsample(x) 114 | 115 | out += residual 116 | out = self.relu(out) 117 | 118 | return out 119 | 120 | def 
conv3x3(in_planes, out_planes, stride=1): 121 | """3x3 convolution with padding""" 122 | return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, 123 | padding=1, bias=False) 124 | 125 | class BasicBlock(nn.Module): 126 | expansion = 1 127 | 128 | def __init__(self, inplanes, planes, stride=1, downsample=None): 129 | super(BasicBlock, self).__init__() 130 | self.conv1 = conv3x3(inplanes, planes, stride) 131 | self.bn1 = nn.BatchNorm2d(planes) 132 | self.relu = nn.ReLU(inplace=True) 133 | self.conv2 = conv3x3(planes, planes) 134 | self.bn2 = nn.BatchNorm2d(planes) 135 | self.downsample = downsample 136 | self.stride = stride 137 | 138 | def forward(self, x): 139 | residual = x 140 | 141 | out = self.conv1(x) 142 | out = self.bn1(out) 143 | out = self.relu(out) 144 | 145 | out = self.conv2(out) 146 | out = self.bn2(out) 147 | 148 | if self.downsample is not None: 149 | residual = self.downsample(x) 150 | 151 | out += residual 152 | out = self.relu(out) 153 | 154 | return out 155 | 156 | def copy_parameter_from_resnet(model, resnet_dict): 157 | cur_state_dict = model.state_dict() 158 | # import ipdb; ipdb.set_trace() 159 | for name, param in list(resnet_dict.items())[0:None]: 160 | if name not in cur_state_dict: 161 | # print(name, ' not available in reconstructed resnet') 162 | continue 163 | if isinstance(param, Parameter): 164 | param = param.data 165 | try: 166 | cur_state_dict[name].copy_(param) 167 | except: 168 | # print(name, ' is inconsistent!') 169 | continue 170 | # print('copy resnet state dict finished!') 171 | # import ipdb; ipdb.set_trace() 172 | 173 | def load_ResNet50Model(): 174 | model = ResNet(Bottleneck, [3, 4, 6, 3]) 175 | copy_parameter_from_resnet(model, torchvision.models.resnet50(pretrained = False).state_dict()) 176 | return model 177 | 178 | def load_ResNet101Model(): 179 | model = ResNet(Bottleneck, [3, 4, 23, 3]) 180 | copy_parameter_from_resnet(model, torchvision.models.resnet101(pretrained = True).state_dict()) 181 | return model 182 | 183 | def load_ResNet152Model(): 184 | model = ResNet(Bottleneck, [3, 8, 36, 3]) 185 | copy_parameter_from_resnet(model, torchvision.models.resnet152(pretrained = True).state_dict()) 186 | return model 187 | 188 | # model.load_state_dict(checkpoint['model_state_dict']) 189 | 190 | 191 | ######## Unet 192 | 193 | class DoubleConv(nn.Module): 194 | """(convolution => [BN] => ReLU) * 2""" 195 | 196 | def __init__(self, in_channels, out_channels): 197 | super().__init__() 198 | self.double_conv = nn.Sequential( 199 | nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1), 200 | nn.BatchNorm2d(out_channels), 201 | nn.ReLU(inplace=True), 202 | nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1), 203 | nn.BatchNorm2d(out_channels), 204 | nn.ReLU(inplace=True) 205 | ) 206 | 207 | def forward(self, x): 208 | return self.double_conv(x) 209 | 210 | 211 | class Down(nn.Module): 212 | """Downscaling with maxpool then double conv""" 213 | 214 | def __init__(self, in_channels, out_channels): 215 | super().__init__() 216 | self.maxpool_conv = nn.Sequential( 217 | nn.MaxPool2d(2), 218 | DoubleConv(in_channels, out_channels) 219 | ) 220 | 221 | def forward(self, x): 222 | return self.maxpool_conv(x) 223 | 224 | 225 | class Up(nn.Module): 226 | """Upscaling then double conv""" 227 | 228 | def __init__(self, in_channels, out_channels, bilinear=True): 229 | super().__init__() 230 | 231 | # if bilinear, use the normal convolutions to reduce the number of channels 232 | if bilinear: 233 | self.up = 
nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True) 234 | else: 235 | self.up = nn.ConvTranspose2d(in_channels // 2, in_channels // 2, kernel_size=2, stride=2) 236 | 237 | self.conv = DoubleConv(in_channels, out_channels) 238 | 239 | def forward(self, x1, x2): 240 | x1 = self.up(x1) 241 | # input is CHW 242 | diffY = x2.size()[2] - x1.size()[2] 243 | diffX = x2.size()[3] - x1.size()[3] 244 | 245 | x1 = F.pad(x1, [diffX // 2, diffX - diffX // 2, 246 | diffY // 2, diffY - diffY // 2]) 247 | # if you have padding issues, see 248 | # https://github.com/HaiyongJiang/U-Net-Pytorch-Unstructured-Buggy/commit/0e854509c2cea854e247a9c615f175f76fbb2e3a 249 | # https://github.com/xiaopeng-liao/Pytorch-UNet/commit/8ebac70e633bac59fc22bb5195e513d5832fb3bd 250 | x = torch.cat([x2, x1], dim=1) 251 | return self.conv(x) 252 | 253 | 254 | class OutConv(nn.Module): 255 | def __init__(self, in_channels, out_channels): 256 | super(OutConv, self).__init__() 257 | self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1) 258 | 259 | def forward(self, x): 260 | return self.conv(x) -------------------------------------------------------------------------------- /src/external/decalib/utils/config.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Default config for DECA 3 | ''' 4 | from yacs.config import CfgNode as CN 5 | import argparse 6 | import yaml 7 | import os 8 | 9 | cfg = CN() 10 | 11 | abs_deca_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..', '..')) 12 | cfg.deca_dir = abs_deca_dir 13 | cfg.device = 'cuda' 14 | cfg.device_id = '0' 15 | 16 | cfg.pretrained_modelpath = os.path.join(cfg.deca_dir, 'data', 'deca_model.tar') 17 | cfg.output_dir = '' 18 | cfg.rasterizer_type = 'pytorch3d' 19 | # ---------------------------------------------------------------------------- # 20 | # Options for Face model 21 | # ---------------------------------------------------------------------------- # 22 | cfg.model = CN() 23 | cfg.model.topology_path = os.path.join(cfg.deca_dir, 'data', 'head_template.obj') 24 | # texture data original from http://files.is.tue.mpg.de/tbolkart/FLAME/FLAME_texture_data.zip 25 | cfg.model.dense_template_path = os.path.join(cfg.deca_dir, 'data', 'texture_data_256.npy') 26 | cfg.model.fixed_displacement_path = os.path.join(cfg.deca_dir, 'data', 'fixed_displacement_256.npy') 27 | cfg.model.flame_model_path = os.path.join(cfg.deca_dir, 'data', 'generic_model.pkl') 28 | cfg.model.flame_lmk_embedding_path = os.path.join(cfg.deca_dir, 'data', 'landmark_embedding.json') 29 | cfg.model.face_mask_path = os.path.join(cfg.deca_dir, 'data', 'uv_face_mask.png') 30 | cfg.model.face_eye_mask_path = os.path.join(cfg.deca_dir, 'data', 'uv_face_eye_mask.png') 31 | cfg.model.mean_tex_path = os.path.join(cfg.deca_dir, 'data', 'mean_texture.jpg') 32 | cfg.model.tex_path = os.path.join(cfg.deca_dir, 'data', 'FLAME_albedo_from_BFM.npz') 33 | cfg.model.tex_type = 'BFM' # BFM, FLAME, albedoMM 34 | cfg.model.uv_size = 256 35 | cfg.model.param_list = ['shape', 'tex', 'exp', 'pose', 'cam', 'light'] 36 | cfg.model.n_shape = 100 37 | cfg.model.n_tex = 50 38 | cfg.model.n_exp = 50 39 | cfg.model.n_cam = 3 40 | cfg.model.n_pose = 6 41 | cfg.model.n_light = 27 42 | cfg.model.use_tex = True 43 | cfg.model.jaw_type = 'aa' # default use axis angle, another option: euler. 
Note that: aa is not stable in the beginning 44 | # face recognition model 45 | cfg.model.fr_model_path = os.path.join(cfg.deca_dir, 'data', 'resnet50_ft_weight.pkl') 46 | 47 | ## details 48 | cfg.model.n_detail = 128 49 | cfg.model.max_z = 0.01 50 | 51 | # ---------------------------------------------------------------------------- # 52 | # Options for Dataset 53 | # ---------------------------------------------------------------------------- # 54 | cfg.dataset = CN() 55 | cfg.dataset.training_data = ['vggface2', 'ethnicity'] 56 | # cfg.dataset.training_data = ['ethnicity'] 57 | cfg.dataset.eval_data = ['aflw2000'] 58 | cfg.dataset.test_data = [''] 59 | cfg.dataset.batch_size = 2 60 | cfg.dataset.K = 4 61 | cfg.dataset.isSingle = False 62 | # cfg.dataset.num_workers = 2 63 | cfg.dataset.num_workers = 0 64 | cfg.dataset.image_size = 224 65 | cfg.dataset.scale_min = 1.4 66 | cfg.dataset.scale_max = 1.8 67 | cfg.dataset.trans_scale = 0. 68 | 69 | # ---------------------------------------------------------------------------- # 70 | # Options for training 71 | # ---------------------------------------------------------------------------- # 72 | cfg.train = CN() 73 | cfg.train.train_detail = False 74 | cfg.train.max_epochs = 500 75 | cfg.train.max_steps = 1000000 76 | cfg.train.lr = 1e-4 77 | cfg.train.log_dir = 'logs' 78 | cfg.train.log_steps = 10 79 | cfg.train.vis_dir = 'train_images' 80 | cfg.train.vis_steps = 200 81 | cfg.train.write_summary = True 82 | cfg.train.checkpoint_steps = 500 83 | cfg.train.val_steps = 500 84 | cfg.train.val_vis_dir = 'val_images' 85 | cfg.train.eval_steps = 5000 86 | cfg.train.resume = True 87 | 88 | # ---------------------------------------------------------------------------- # 89 | # Options for Losses 90 | # ---------------------------------------------------------------------------- # 91 | cfg.loss = CN() 92 | cfg.loss.lmk = 1.0 93 | cfg.loss.useWlmk = True 94 | cfg.loss.eyed = 1.0 95 | cfg.loss.lipd = 0.5 96 | cfg.loss.photo = 2.0 97 | cfg.loss.useSeg = True 98 | cfg.loss.id = 0.2 99 | cfg.loss.id_shape_only = True 100 | cfg.loss.reg_shape = 1e-04 101 | cfg.loss.reg_exp = 1e-04 102 | cfg.loss.reg_tex = 1e-04 103 | cfg.loss.reg_light = 1. 104 | cfg.loss.reg_jaw_pose = 0. #1. 105 | cfg.loss.use_gender_prior = False 106 | cfg.loss.shape_consistency = True 107 | # loss for detail 108 | cfg.loss.detail_consistency = True 109 | cfg.loss.useConstraint = True 110 | cfg.loss.mrf = 5e-2 111 | cfg.loss.photo_D = 2. 
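# The assignments in this module only set defaults. A run-specific YAML file can
# override any subset of them via parse_args()/update_cfg() defined below
# (yacs' merge_from_file). A minimal sketch -- the file name and values here are
# only an example:
#
#     # my_deca.yml
#     dataset:
#       batch_size: 4
#     loss:
#       photo: 1.0
#
#     from external.decalib.utils.config import get_cfg_defaults, update_cfg
#     cfg = get_cfg_defaults()
#     cfg = update_cfg(cfg, 'my_deca.yml')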
112 | cfg.loss.reg_sym = 0.005 113 | cfg.loss.reg_z = 0.005 114 | cfg.loss.reg_diff = 0.005 115 | 116 | 117 | def get_cfg_defaults(): 118 | """Get a yacs CfgNode object with default values for my_project.""" 119 | # Return a clone so that the defaults will not be altered 120 | # This is for the "local variable" use pattern 121 | return cfg.clone() 122 | 123 | def update_cfg(cfg, cfg_file): 124 | cfg.merge_from_file(cfg_file) 125 | return cfg.clone() 126 | 127 | def parse_args(): 128 | parser = argparse.ArgumentParser() 129 | parser.add_argument('--cfg', type=str, help='cfg file path') 130 | parser.add_argument('--mode', type=str, default = 'train', help='deca mode') 131 | 132 | args = parser.parse_args() 133 | print(args, end='\n\n') 134 | 135 | cfg = get_cfg_defaults() 136 | cfg.cfg_file = None 137 | cfg.mode = args.mode 138 | # import ipdb; ipdb.set_trace() 139 | if args.cfg is not None: 140 | cfg_file = args.cfg 141 | cfg = update_cfg(cfg, args.cfg) 142 | cfg.cfg_file = cfg_file 143 | 144 | return cfg 145 | -------------------------------------------------------------------------------- /src/fitting.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import torch 3 | import json 4 | import os 5 | import copy 6 | import numpy as np 7 | from external.decalib.utils.config import cfg as deca_cfg 8 | from external.decalib.deca import DECA 9 | from external.decalib.datasets import datasets 10 | from external.decalib.models.FLAME import FLAME 11 | from util.util import ( 12 | save_coeffs, 13 | save_landmarks 14 | ) 15 | 16 | 17 | def parse_args(): 18 | """Configurations.""" 19 | parser = argparse.ArgumentParser(description='test process of Face2FaceRHO') 20 | parser.add_argument('--device', default='cuda', type=str, help='set device, cpu for using cpu') 21 | parser.add_argument('--src_img', type=str, required=True, help='input source image (.jpg, .jpg, .jpeg, .png)') 22 | parser.add_argument('--drv_img', type=str, required=True, help='input driving image (.jpg, .jpg, .jpeg, .png)') 23 | 24 | parser.add_argument('--output_src_headpose', type=str, 25 | default=os.path.join(os.path.dirname(__file__), '..', 'test_case', 'source', 'FLAME', 'headpose.txt'), 26 | help='output head pose coefficients of source image (.txt)') 27 | parser.add_argument('--output_src_landmark', type=str, 28 | default=os.path.join(os.path.dirname(__file__), '..', 'test_case', 'source', 'FLAME', 'landmark.txt'), 29 | help='output facial landmarks of source image (.txt)') 30 | 31 | parser.add_argument('--output_drv_headpose', type=str, 32 | default=os.path.join(os.path.dirname(__file__), '..', 'test_case', 'driving', 'FLAME', 'headpose.txt'), 33 | help=' output head pose coefficients of driving image (.txt)') 34 | parser.add_argument('--output_drv_landmark', type=str, 35 | default=os.path.join(os.path.dirname(__file__), '..', 'test_case', 'driving', 'FLAME', 'landmark.txt'), 36 | help='output driving facial landmarks (.txt, reconstructed by using shape coefficients ' 37 | 'of the source actor and expression and head pose coefficients of the driving actor)') 38 | 39 | return _check_args(parser.parse_args()) 40 | 41 | 42 | def _check_args(args): 43 | if args is None: 44 | raise RuntimeError('Invalid arguments!') 45 | return args 46 | 47 | 48 | class FLAMEFitting: 49 | def __init__(self): 50 | self.deca = DECA(config=deca_cfg, device=args.device) 51 | 52 | def fitting(self, img_name): 53 | testdata = datasets.TestData(img_name, iscrop=True,face_detector='fan', 
sample_step=10) 54 | input_data = testdata[0] 55 | images = input_data['image'].to(args.device)[None, ...] 56 | with torch.no_grad(): 57 | codedict = self.deca.encode(images) 58 | codedict['tform'] = input_data['tform'][None, ...] 59 | original_image = input_data['original_image'][None, ...] 60 | _, _, h, w = original_image.shape 61 | params = self.deca.ensemble_3DMM_params(codedict, image_size=deca_cfg.dataset.image_size, original_image_size=h) 62 | return params 63 | 64 | 65 | class PoseLandmarkExtractor: 66 | def __init__(self): 67 | self.flame = FLAME(deca_cfg.model) 68 | 69 | with open(os.path.join(deca_cfg.deca_dir, 'data', 'pose_transform_config.json'), 'r') as f: 70 | pose_transform = json.load(f) 71 | 72 | self.scale_transform = pose_transform['scale_transform'] 73 | self.tx_transform = pose_transform['tx_transform'] 74 | self.ty_transform = pose_transform['ty_transform'] 75 | self.tx_scale = 0.256 # 512 / 2000 76 | self.ty_scale = - self.tx_scale 77 | 78 | @staticmethod 79 | def transform_points(points, scale, tx, ty): 80 | trans_matrix = torch.zeros((1, 4, 4), dtype=torch.float32) 81 | trans_matrix[:, 0, 0] = scale 82 | trans_matrix[:, 1, 1] = -scale 83 | trans_matrix[:, 2, 2] = 1 84 | trans_matrix[:, 0, 3] = tx 85 | trans_matrix[:, 1, 3] = ty 86 | trans_matrix[:, 3, 3] = 1 87 | 88 | batch_size, n_points, _ = points.shape 89 | points_homo = torch.cat([points, torch.ones([batch_size, n_points, 1], dtype=points.dtype)], dim=2) 90 | points_homo = points_homo.transpose(1, 2) 91 | trans_points = torch.bmm(trans_matrix, points_homo).transpose(1, 2) 92 | trans_points = trans_points[:, :, 0:3] 93 | return trans_points 94 | 95 | def get_project_points(self, shape_params, expression_params, pose, scale, tx, ty): 96 | shape_params = torch.tensor(shape_params).unsqueeze(0) 97 | expression_params = torch.tensor(expression_params).unsqueeze(0) 98 | pose = torch.tensor(pose).unsqueeze(0) 99 | verts, landmarks3d = self.flame( 100 | shape_params=shape_params, expression_params=expression_params, pose_params=pose) 101 | trans_landmarks3d = self.transform_points(landmarks3d, scale, tx, ty) 102 | trans_landmarks3d = trans_landmarks3d.squeeze(0).cpu().numpy() 103 | return trans_landmarks3d[:, 0:2].tolist() 104 | 105 | def calculate_nose_tip_tx_ty(self, shape_params, expression_params, pose, scale, tx, ty): 106 | front_pose = copy.deepcopy(pose) 107 | front_pose[0] = front_pose[1] = front_pose[2] = 0 108 | front_landmarks3d = self.get_project_points(shape_params, expression_params, front_pose, scale, tx, ty) 109 | original_landmark3d = self.get_project_points(shape_params, expression_params, pose, scale, tx, ty) 110 | nose_tx = original_landmark3d[30][0] - front_landmarks3d[30][0] 111 | nose_ty = original_landmark3d[30][1] - front_landmarks3d[30][1] 112 | return nose_tx, nose_ty 113 | 114 | def get_pose(self, shape_params, expression_params, pose, scale, tx, ty): 115 | nose_tx, nose_ty = self.calculate_nose_tip_tx_ty( 116 | shape_params, expression_params, pose, scale, tx, ty) 117 | transformed_axis_angle = [ 118 | float(pose[0]), 119 | float(pose[1]), 120 | float(pose[2]) 121 | ] 122 | transformed_tx = self.tx_transform + self.tx_scale * (tx + nose_tx) 123 | transformed_ty = self.ty_transform + self.ty_scale * (ty + nose_ty) 124 | transformed_scale = scale / self.scale_transform 125 | return transformed_axis_angle + [transformed_tx, transformed_ty, transformed_scale] 126 | 127 | 128 | if __name__ == '__main__': 129 | args = parse_args() 130 | 131 | # 3DMM fitting by DECA: Detailed Expression Capture 
and Animation using FLAME model 132 | face_fitting = FLAMEFitting() 133 | src_params = face_fitting.fitting(args.src_img) 134 | drv_params = face_fitting.fitting(args.drv_img) 135 | 136 | # calculate head pose and facial landmarks for the source and driving face images 137 | pose_lml_extractor = PoseLandmarkExtractor() 138 | src_headpose = pose_lml_extractor.get_pose( 139 | src_params['shape'], src_params['exp'], src_params['pose'], 140 | src_params['scale'], src_params['tx'], src_params['ty']) 141 | 142 | src_lmks = pose_lml_extractor.get_project_points( 143 | src_params['shape'], src_params['exp'], src_params['pose'], 144 | src_params['scale'], src_params['tx'], src_params['ty']) 145 | 146 | # Note that the driving head pose and facial landmarks are calculated using the shape parameters of the source image 147 | # in order to eliminate the interference of the driving actor's identity. 148 | drv_headpose = pose_lml_extractor.get_pose( 149 | src_params['shape'], drv_params['exp'], drv_params['pose'], 150 | drv_params['scale'], drv_params['tx'], drv_params['ty']) 151 | 152 | drv_lmks = pose_lml_extractor.get_project_points( 153 | src_params['shape'], drv_params['exp'], drv_params['pose'], 154 | drv_params['scale'], drv_params['tx'], drv_params['ty']) 155 | 156 | # save 157 | os.makedirs(os.path.split(args.output_src_headpose)[0], exist_ok=True) 158 | save_coeffs(args.output_src_headpose, src_headpose) 159 | os.makedirs(os.path.split(args.output_src_landmark)[0], exist_ok=True) 160 | save_landmarks(args.output_src_landmark, src_lmks) 161 | 162 | os.makedirs(os.path.split(args.output_drv_headpose)[0], exist_ok=True) 163 | save_coeffs(args.output_drv_headpose, drv_headpose) 164 | os.makedirs(os.path.split(args.output_drv_landmark)[0], exist_ok=True) 165 | save_landmarks(args.output_drv_landmark, drv_lmks) 166 | -------------------------------------------------------------------------------- /src/models/VGG19_LOSS.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torchvision 3 | from torch.nn import functional as F 4 | import numpy as np 5 | 6 | 7 | # VGG architecture, used for the perceptual loss using a pretrained VGG network 8 | class VGG19(torch.nn.Module): 9 | def __init__(self, requires_grad=False): 10 | super().__init__() 11 | vgg_pretrained_features = torchvision.models.vgg19( 12 | pretrained=True 13 | ).features 14 | self.slice1 = torch.nn.Sequential() 15 | self.slice2 = torch.nn.Sequential() 16 | self.slice3 = torch.nn.Sequential() 17 | self.slice4 = torch.nn.Sequential() 18 | self.slice5 = torch.nn.Sequential() 19 | for x in range(2): 20 | self.slice1.add_module(str(x), vgg_pretrained_features[x]) 21 | for x in range(2, 7): 22 | self.slice2.add_module(str(x), vgg_pretrained_features[x]) 23 | for x in range(7, 12): 24 | self.slice3.add_module(str(x), vgg_pretrained_features[x]) 25 | for x in range(12, 21): 26 | self.slice4.add_module(str(x), vgg_pretrained_features[x]) 27 | for x in range(21, 30): 28 | self.slice5.add_module(str(x), vgg_pretrained_features[x]) 29 | 30 | self.mean = torch.nn.Parameter(data=torch.Tensor(np.array([0.485, 0.456, 0.406]).reshape((1, 3, 1, 1))), 31 | requires_grad=False) 32 | self.std = torch.nn.Parameter(data=torch.Tensor(np.array([0.229, 0.224, 0.225]).reshape((1, 3, 1, 1))), 33 | requires_grad=False) 34 | 35 | if not requires_grad: 36 | for param in self.parameters(): 37 | param.requires_grad = False 38 | 39 | def forward(self, X): 40 | # Normalize the image so that it is in the appropriate 
range 41 | X = (X + 1) / 2 42 | X = (X - self.mean) / self.std 43 | h_relu1 = self.slice1(X) 44 | h_relu2 = self.slice2(h_relu1) 45 | h_relu3 = self.slice3(h_relu2) 46 | h_relu4 = self.slice4(h_relu3) 47 | h_relu5 = self.slice5(h_relu4) 48 | out = [h_relu1, h_relu2, h_relu3, h_relu4, h_relu5] 49 | return out 50 | 51 | 52 | class VGG19LOSS(torch.nn.Module): 53 | def __init__(self): 54 | super(VGG19LOSS, self).__init__() 55 | self.model = VGG19() 56 | 57 | def forward(self, fake, target, weight_mask=None, loss_weights=[1.0, 1.0, 1.0, 1.0, 1.0]): 58 | vgg_fake = self.model(fake) 59 | vgg_target = self.model(target) 60 | 61 | value_total = 0 62 | for i, weight in enumerate(loss_weights): 63 | value = torch.abs(vgg_fake[i] - vgg_target[i].detach()) 64 | if weight_mask is not None: 65 | bs, c, H1, W1 = value.shape 66 | _, _, H2, W2 = weight_mask.shape 67 | if H1 != H2 or W1 != W2: 68 | cur_weight_mask = F.interpolate(weight_mask, size=(H1, W1)) 69 | value = value * cur_weight_mask 70 | else: 71 | value = value * weight_mask 72 | value = torch.mean(value, dim=[x for x in range(1, len(value.size()))]) 73 | value_total += loss_weights[i] * value 74 | return value_total 75 | -------------------------------------------------------------------------------- /src/models/__init__.py: -------------------------------------------------------------------------------- 1 | import importlib 2 | from models.base_model import BaseModel 3 | 4 | 5 | def find_model_using_name(model_name): 6 | # Given the option --model [modelname], 7 | # the file "models/modelname_model.py" 8 | # will be imported. 9 | model_filename = "models." + model_name + "_model" 10 | modellib = importlib.import_module(model_filename) 11 | 12 | # In the file, the class called ModelNameModel() will 13 | # be instantiated. It has to be a subclass of BaseModel, 14 | # and it is case-insensitive. 15 | model = None 16 | target_model_name = model_name.replace('_', '') + 'model' 17 | for name, cls in modellib.__dict__.items(): 18 | if name.lower() == target_model_name.lower() \ 19 | and issubclass(cls, BaseModel): 20 | model = cls 21 | 22 | if model is None: 23 | print("In %s.py, there should be a subclass of BaseModel with class name that matches %s in lowercase." % (model_filename, target_model_name)) 24 | exit(0) 25 | 26 | return model 27 | 28 | 29 | def get_option_setter(model_name): 30 | model_class = find_model_using_name(model_name) 31 | return model_class.modify_commandline_options 32 | 33 | 34 | def create_model(opt): 35 | model = find_model_using_name(opt.model) 36 | instance = model() 37 | instance.initialize(opt) 38 | print("model [%s] was created" % (instance.name())) 39 | return instance 40 | -------------------------------------------------------------------------------- /src/models/base_model.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | import torch.nn as nn 4 | from collections import OrderedDict 5 | from . 
import networks 6 | import numpy as np 7 | from PIL import Image 8 | 9 | def save_tensor_image(input_image, image_path): 10 | if isinstance(input_image, torch.Tensor): 11 | image_tensor = input_image.data 12 | image_numpy = image_tensor[0].cpu().float().numpy() 13 | if image_numpy.shape[0] == 1: 14 | image_numpy = np.tile(image_numpy, (3, 1, 1)) 15 | image_numpy = (np.transpose(image_numpy, (1, 2, 0)) + 1) / 2.0 * 255.0 16 | else: 17 | image_numpy = input_image 18 | image_numpy = image_numpy.astype(np.uint8) 19 | image_pil = Image.fromarray(image_numpy) 20 | image_pil.save(image_path) 21 | 22 | class BaseModel(): 23 | @staticmethod 24 | def modify_commandline_options(parser, is_train): 25 | return parser 26 | 27 | def name(self): 28 | return 'BaseModel' 29 | 30 | def initialize(self, opt): 31 | self.opt = opt 32 | self.gpu_ids = opt.gpu_ids 33 | self.isTrain = opt.isTrain 34 | self.device = torch.device('cuda:{}'.format(self.gpu_ids[0])) if self.gpu_ids else torch.device('cpu') 35 | self.load_dir = os.path.join(opt.checkpoints_dir, opt.name) 36 | self.save_dir = os.path.join(opt.checkpoints_dir, opt.name) 37 | if not os.path.exists(self.save_dir): 38 | os.makedirs(self.save_dir) 39 | self.loss_names = [] 40 | self.model_names = [] 41 | self.visual_names = [] 42 | self.image_paths = [] 43 | 44 | def set_input(self, input): 45 | pass 46 | 47 | def forward(self): 48 | pass 49 | 50 | # load and print networks; create schedulers 51 | def setup(self, opt, parser=None): 52 | if self.isTrain: 53 | self.schedulers = [networks.get_scheduler(optimizer, opt) for optimizer in self.optimizers] 54 | if not self.isTrain or opt.continue_train: 55 | load_suffix = 'iter_%d' % opt.load_iter if opt.load_iter > 0 else opt.epoch 56 | self.load_networks(load_suffix) 57 | self.print_networks(opt.verbose) 58 | 59 | 60 | 61 | # load specific moudles 62 | def loadModules(self, opt, model_name, module_names): 63 | for name in module_names: 64 | if isinstance(name, str): 65 | load_dir = os.path.join(opt.checkpoints_dir, model_name) 66 | load_filename = 'latest_%s.pth' % (name) 67 | load_path = os.path.join(load_dir, load_filename) 68 | net = getattr(self, name) 69 | if isinstance(net, torch.Tensor): 70 | print('loading the tensor from %s' % load_path) 71 | net_loaded = torch.load(load_path, map_location=str(self.device)) 72 | net.copy_(net_loaded) 73 | else: 74 | # if isinstance(net, torch.nn.DataParallel): 75 | # net = net.module 76 | print('loading the module from %s' % load_path) 77 | # if you are using PyTorch newer than 0.4 (e.g., built from 78 | # GitHub source), you can remove str() on self.device 79 | state_dict = torch.load(load_path, map_location=str(self.device)) 80 | if hasattr(state_dict, '_metadata'): 81 | del state_dict._metadata 82 | 83 | # patch InstanceNorm checkpoints prior to 0.4 84 | for key in list(state_dict.keys()): # need to copy keys here because we mutate in loop 85 | self.__patch_instance_norm_state_dict(state_dict, net, key.split('.')) 86 | net.load_state_dict(state_dict) 87 | 88 | 89 | 90 | 91 | # make models eval mode during test time 92 | def eval(self): 93 | for name in self.model_names: 94 | if isinstance(name, str): 95 | net = getattr(self, name) 96 | net.eval() 97 | 98 | # used in test time, wrapping `forward` in no_grad() so we don't save 99 | # intermediate steps for backprop 100 | def test(self): 101 | with torch.no_grad(): 102 | self.forward() 103 | 104 | # get image paths 105 | def get_image_paths(self): 106 | return self.image_paths 107 | 108 | def optimize_parameters(self): 
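# Stub: concrete models (e.g. the Face2FaceRHO model defined in face2face_rho_model.py) override
# this with their forward/backward passes and optimizer steps. Note that train.py calls it as
# model.optimize_parameters(epoch), so overriding implementations take the current epoch as an argument.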
109 | pass 110 | 111 | # update learning rate (called once every epoch) 112 | def update_learning_rate(self): 113 | for scheduler in self.schedulers: 114 | scheduler.step() 115 | lr = self.optimizers[0].param_groups[0]['lr'] 116 | print('learning rate = %.7f' % lr) 117 | 118 | # return visualization images. train.py will display these images, and save the images to a html 119 | def get_current_visuals(self): 120 | visual_ret = OrderedDict() 121 | for name in self.visual_names: 122 | if isinstance(name, str): 123 | visual_ret[name] = getattr(self, name) 124 | return visual_ret 125 | 126 | # return traning losses/errors. train.py will print out these errors as debugging information 127 | def get_current_losses(self): 128 | errors_ret = OrderedDict() 129 | for name in self.loss_names: 130 | if isinstance(name, str): 131 | # float(...) works for both scalar tensor and float number 132 | errors_ret[name] = float(getattr(self, 'loss_' + name)) 133 | return errors_ret 134 | 135 | # save models to the disk 136 | def save_networks(self, epoch): 137 | for name in self.model_names: 138 | if isinstance(name, str): 139 | save_filename = '%s_%s.pth' % (epoch, name) 140 | save_path = os.path.join(self.save_dir, save_filename) 141 | net = getattr(self, name) 142 | 143 | if isinstance(net, torch.Tensor): 144 | #torch.save(net.state_dict(), save_path) 145 | torch.save(net, save_path) 146 | for i in range(0, list(net.size())[0]): 147 | save_tensor_image(net[i:i+1,0:3,:,:], save_path+str(i)+'.png') 148 | else: 149 | if len(self.gpu_ids) > 0 and torch.cuda.is_available(): 150 | #torch.save(net.module.cpu().state_dict(), save_path) # << original 151 | torch.save(net.cpu().state_dict(), save_path) 152 | net.cuda(self.gpu_ids[0]) 153 | else: 154 | torch.save(net.cpu().state_dict(), save_path) 155 | 156 | def __patch_instance_norm_state_dict(self, state_dict, module, keys, i=0): 157 | key = keys[i] 158 | if i + 1 == len(keys): # at the end, pointing to a parameter/buffer 159 | if module.__class__.__name__.startswith('InstanceNorm') and \ 160 | (key == 'running_mean' or key == 'running_var'): 161 | if getattr(module, key) is None: 162 | state_dict.pop('.'.join(keys)) 163 | if module.__class__.__name__.startswith('InstanceNorm') and \ 164 | (key == 'num_batches_tracked'): 165 | state_dict.pop('.'.join(keys)) 166 | else: 167 | self.__patch_instance_norm_state_dict(state_dict, getattr(module, key), keys, i + 1) 168 | 169 | # load models from the disk 170 | def load_networks(self, epoch): 171 | for name in self.model_names: 172 | if isinstance(name, str): 173 | load_filename = '%s_%s.pth' % (epoch, name) 174 | load_path = os.path.join(self.load_dir, load_filename) 175 | net = getattr(self, name) 176 | if isinstance(net, torch.Tensor): 177 | print('loading the tensor from %s' % load_path) 178 | net_loaded = torch.load(load_path, map_location=str(self.device)) 179 | net.copy_(net_loaded) 180 | else: 181 | # if isinstance(net, torch.nn.DataParallel): 182 | # net = net.module 183 | print('loading the module from %s' % load_path) 184 | # if you are using PyTorch newer than 0.4 (e.g., built from 185 | # GitHub source), you can remove str() on self.device 186 | state_dict = torch.load(load_path, map_location=str(self.device)) 187 | if hasattr(state_dict, '_metadata'): 188 | del state_dict._metadata 189 | 190 | # patch InstanceNorm checkpoints prior to 0.4 191 | for key in list(state_dict.keys()): # need to copy keys here because we mutate in loop 192 | self.__patch_instance_norm_state_dict(state_dict, net, key.split('.')) 193 
| net.load_state_dict(state_dict) 194 | 195 | # print network information 196 | def print_networks(self, verbose): 197 | print('---------- Networks initialized -------------') 198 | for name in self.model_names: 199 | if isinstance(name, str): 200 | net = getattr(self, name) 201 | if isinstance(net, torch.Tensor): 202 | num_params = net.numel() 203 | print('[Tensor %s] Total number of parameters : %.3f M' % (name, num_params / 1e6)) 204 | else: 205 | num_params = 0 206 | for param in net.parameters(): 207 | num_params += param.numel() 208 | if verbose: 209 | print(net) 210 | print('[Network %s] Total number of parameters : %.3f M' % (name, num_params / 1e6)) 211 | print('-----------------------------------------------') 212 | 213 | # set requies_grad=False to avoid computation 214 | def set_requires_grad(self, nets, requires_grad=False): 215 | if not isinstance(nets, list): 216 | nets = [nets] 217 | for net in nets: 218 | if net is not None: 219 | for param in net.parameters(): 220 | param.requires_grad = requires_grad 221 | -------------------------------------------------------------------------------- /src/models/discriminator.py: -------------------------------------------------------------------------------- 1 | from torch import nn 2 | import torch.nn.functional as F 3 | import torch 4 | 5 | 6 | class DownBlock2d(nn.Module): 7 | """ 8 | Simple block for processing video (encoder). 9 | """ 10 | 11 | def __init__(self, in_features, out_features, norm=False, kernel_size=4, pool=False, sn=False): 12 | super(DownBlock2d, self).__init__() 13 | self.conv = nn.Conv2d(in_channels=in_features, out_channels=out_features, kernel_size=kernel_size) 14 | 15 | if sn: 16 | self.conv = nn.utils.spectral_norm(self.conv) 17 | 18 | if norm: 19 | self.norm = nn.InstanceNorm2d(out_features, affine=True) 20 | else: 21 | self.norm = None 22 | self.pool = pool 23 | 24 | def forward(self, x): 25 | out = x 26 | out = self.conv(out) 27 | if self.norm: 28 | out = self.norm(out) 29 | out = F.leaky_relu(out, 0.2) 30 | if self.pool: 31 | out = F.avg_pool2d(out, (2, 2)) 32 | return out 33 | 34 | 35 | class Discriminator(nn.Module): 36 | """ 37 | Discriminator similar to Pix2Pix 38 | """ 39 | 40 | def __init__(self, num_channels=3, block_expansion=64, num_blocks=4, max_features=512, sn=False, use_kp=False): 41 | super(Discriminator, self).__init__() 42 | 43 | down_blocks = [] 44 | for i in range(num_blocks): 45 | down_blocks.append( 46 | DownBlock2d(num_channels + 3 * use_kp if i == 0 else min(max_features, block_expansion * (2 ** i)), 47 | min(max_features, block_expansion * (2 ** (i + 1))), 48 | norm=(i != 0), kernel_size=4, pool=(i != num_blocks - 1), sn=sn)) 49 | 50 | self.down_blocks = nn.ModuleList(down_blocks) 51 | self.conv = nn.Conv2d(self.down_blocks[-1].conv.out_channels, out_channels=1, kernel_size=1) 52 | if sn: 53 | self.conv = nn.utils.spectral_norm(self.conv) 54 | self.use_kp = use_kp 55 | 56 | def forward(self, x, kp=None): 57 | feature_maps = [] 58 | out = x 59 | if self.use_kp: 60 | bs, _, h1, w1 = kp.shape 61 | bs, C, h2, w2 = out.shape 62 | if h1 != h2 or w1 != w2: 63 | kp = F.interpolate(kp, size=(h2, w2), mode='bilinear') 64 | out = torch.cat([out, kp], dim=1) 65 | 66 | for down_block in self.down_blocks: 67 | feature_maps.append(down_block(out)) 68 | out = feature_maps[-1] 69 | prediction_map = self.conv(out) 70 | 71 | return feature_maps, prediction_map 72 | 73 | 74 | class MultiScaleDiscriminator(nn.Module): 75 | """ 76 | Multi-scale (scale) discriminator 77 | """ 78 | 79 | def __init__(self, 
scales=(), **kwargs): 80 | super(MultiScaleDiscriminator, self).__init__() 81 | self.scales = scales 82 | discs = {} 83 | for scale in scales: 84 | discs[str(scale).replace('.', '-')] = Discriminator(**kwargs) 85 | self.discs = nn.ModuleDict(discs) 86 | 87 | def forward(self, x, kp=None): 88 | out_dict = {} 89 | for scale, disc in self.discs.items(): 90 | scale = str(scale).replace('-', '.') 91 | key = 'prediction_' + scale 92 | feature_maps, prediction_map = disc(x[key], kp) 93 | out_dict['feature_maps_' + scale] = feature_maps 94 | out_dict['prediction_map_' + scale] = prediction_map 95 | return out_dict -------------------------------------------------------------------------------- /src/models/image_pyramid.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | import torch.nn.functional as F 4 | 5 | 6 | class AntiAliasInterpolation2d(nn.Module): 7 | """ 8 | Band-limited downsampling, for better preservation of the input signal. 9 | """ 10 | def __init__(self, channels, scale): 11 | super(AntiAliasInterpolation2d, self).__init__() 12 | sigma = (1 / scale - 1) / 2 13 | kernel_size = 2 * round(sigma * 4) + 1 14 | self.ka = kernel_size // 2 15 | self.kb = self.ka - 1 if kernel_size % 2 == 0 else self.ka 16 | 17 | kernel_size = [kernel_size, kernel_size] 18 | sigma = [sigma, sigma] 19 | # The gaussian kernel is the product of the 20 | # gaussian function of each dimension. 21 | kernel = 1 22 | meshgrids = torch.meshgrid( 23 | [ 24 | torch.arange(size, dtype=torch.float32) 25 | for size in kernel_size 26 | ] 27 | ) 28 | for size, std, mgrid in zip(kernel_size, sigma, meshgrids): 29 | mean = (size - 1) / 2 30 | kernel *= torch.exp(-(mgrid - mean) ** 2 / (2 * std ** 2)) 31 | 32 | # Make sure sum of values in gaussian kernel equals 1. 
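# Worked example (illustrative only): for scale = 0.25, the formulas above give
# sigma = (1 / 0.25 - 1) / 2 = 1.5 and kernel_size = 2 * round(1.5 * 4) + 1 = 13,
# i.e. a 13x13 Gaussian blur is applied before each 4x downsampling step.
# The division by the kernel sum below keeps the blur brightness-preserving.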
33 | kernel = kernel / torch.sum(kernel) 34 | # Reshape to depthwise convolutional weight 35 | kernel = kernel.view(1, 1, *kernel.size()) 36 | kernel = kernel.repeat(channels, *[1] * (kernel.dim() - 1)) 37 | 38 | self.register_buffer('weight', kernel) 39 | self.groups = channels 40 | self.scale = scale 41 | 42 | def forward(self, input): 43 | if self.scale == 1.0: 44 | return input 45 | 46 | out = F.pad(input, (self.ka, self.kb, self.ka, self.kb)) 47 | out = F.conv2d(out, weight=self.weight, groups=self.groups) 48 | out = F.interpolate(out, scale_factor=(self.scale, self.scale)) 49 | 50 | return out 51 | 52 | 53 | class ImagePyramide(torch.nn.Module): 54 | def __init__(self, scales, num_channels): 55 | super(ImagePyramide, self).__init__() 56 | downs = {} 57 | for scale in scales: 58 | downs[str(scale).replace('.', '-')] = AntiAliasInterpolation2d(num_channels, scale) 59 | self.downs = nn.ModuleDict(downs) 60 | 61 | def forward(self, x): 62 | out_dict = {} 63 | for scale, down_module in self.downs.items(): 64 | out_dict['prediction_' + str(scale).replace('-', '.')] = down_module(x) 65 | return out_dict -------------------------------------------------------------------------------- /src/models/motion_network.py: -------------------------------------------------------------------------------- 1 | import torch.nn.functional as F 2 | from torch import nn 3 | 4 | 5 | class DownBlock(nn.Module): 6 | def __init__(self, in_features, out_features, kernel_size=3, padding=1, groups=1, use_relu=True): 7 | super(DownBlock, self).__init__() 8 | self.conv = nn.Conv2d(in_channels=in_features, out_channels=out_features, kernel_size=kernel_size, 9 | padding=padding, groups=groups, stride=2) 10 | self.norm = nn.BatchNorm2d(out_features, affine=True) 11 | self.use_relu = use_relu 12 | 13 | def forward(self, x): 14 | out = self.conv(x) 15 | out = self.norm(out) 16 | if self.use_relu: 17 | out = F.relu(out) 18 | return out 19 | 20 | 21 | class UpBlock(nn.Module): 22 | def __init__(self, in_features, out_features, kernel_size=3, padding=1, groups=1, use_relu=True, 23 | sample_mode='nearest'): 24 | super(UpBlock, self).__init__() 25 | self.conv = nn.Conv2d(in_channels=in_features, out_channels=out_features, kernel_size=kernel_size, 26 | padding=padding, groups=groups) 27 | self.norm = nn.BatchNorm2d(out_features, affine=True) 28 | self.use_relu = use_relu 29 | self.sample_mode = sample_mode 30 | 31 | def forward(self, x): 32 | out = F.interpolate(x, scale_factor=2, mode=self.sample_mode) 33 | out = self.conv(out) 34 | out = self.norm(out) 35 | if self.use_relu: 36 | out = F.relu(out) 37 | return out 38 | 39 | 40 | class ResBlock(nn.Module): 41 | """ 42 | Res block, preserve spatial resolution. 
43 | """ 44 | 45 | def __init__(self, in_features, kernel_size, padding): 46 | super(ResBlock, self).__init__() 47 | self.conv1 = nn.Conv2d(in_channels=in_features, out_channels=in_features, kernel_size=kernel_size, 48 | padding=padding) 49 | self.conv2 = nn.Conv2d(in_channels=in_features, out_channels=in_features, kernel_size=kernel_size, 50 | padding=padding) 51 | self.norm = nn.BatchNorm2d(in_features, affine=True) 52 | 53 | def forward(self, x): 54 | out = self.conv1(x) 55 | out = self.norm(out) 56 | out = F.relu(out) 57 | out = self.conv2(out) 58 | out += x 59 | return out 60 | 61 | 62 | class MotionNet(nn.Module): 63 | def __init__(self, opt): 64 | super(MotionNet, self).__init__() 65 | 66 | ngf = opt.mn_ngf 67 | n_local_enhancers = opt.n_local_enhancers 68 | n_downsampling = opt.mn_n_downsampling 69 | n_blocks_local = opt.mn_n_blocks_local 70 | 71 | in_features = [9, 9, 9] 72 | 73 | # F1 74 | f1_model_ngf = ngf * (2 ** n_local_enhancers) 75 | f1_model = [ 76 | nn.Conv2d(in_channels=in_features[0], out_channels=f1_model_ngf, kernel_size=3, stride=1, padding=1), 77 | nn.BatchNorm2d(f1_model_ngf), 78 | nn.ReLU(True)] 79 | 80 | for i in range(n_downsampling): 81 | mult = 2 ** i 82 | f1_model += [ 83 | DownBlock(f1_model_ngf * mult, f1_model_ngf * mult * 2, kernel_size=4, padding=1, use_relu=True)] 84 | 85 | for i in range(n_downsampling): 86 | mult = 2 ** (n_downsampling - i) 87 | f1_model += [ 88 | UpBlock(f1_model_ngf * mult, int(f1_model_ngf * mult / 2), kernel_size=3, padding=1) 89 | ] 90 | 91 | self.f1_model = nn.Sequential(*f1_model) 92 | self.f1_motion = nn.Conv2d(f1_model_ngf, 2, kernel_size=(3, 3), padding=(1, 1)) 93 | 94 | #f2 and f3 95 | for n in range(1, n_local_enhancers + 1): 96 | ### first downsampling block 97 | ngf_global = ngf * (2 ** (n_local_enhancers - n)) 98 | model_first_downsample = [DownBlock(in_features[n], ngf_global * 2, kernel_size=4, padding=1, use_relu=True)] 99 | ### other downsampling blocks, residual blocks and upsampling blocks 100 | # other downsampling blocks 101 | model_other = [] 102 | model_other += [ 103 | DownBlock(ngf_global * 2, ngf_global * 4, kernel_size=4, padding=1, use_relu=True), 104 | DownBlock(ngf_global * 4, ngf_global * 8, kernel_size=4, padding=1, use_relu=True), 105 | ] 106 | # residual blocks 107 | for i in range(n_blocks_local): 108 | model_other += [ResBlock(ngf_global * 8, 3, 1)] 109 | # upsampling blocks 110 | model_other += [ 111 | UpBlock(ngf_global * 8, ngf_global * 4, kernel_size=3, padding=1), 112 | UpBlock(ngf_global * 4, ngf_global * 2, kernel_size=3, padding=1), 113 | UpBlock(ngf_global * 2, ngf_global, kernel_size=3, padding=1) 114 | ] 115 | model_motion = nn.Conv2d(ngf_global, out_channels=2, kernel_size=3, padding=1, groups=1) 116 | 117 | setattr(self, 'model' + str(n) + '_1', nn.Sequential(*model_first_downsample)) 118 | setattr(self, 'model' + str(n) + '_2', nn.Sequential(*model_other)) 119 | setattr(self, 'model' + str(n) + '_3', model_motion) 120 | 121 | def forward(self, input1, input2, input3): 122 | ### output at small scale(f1) 123 | output_prev = self.f1_model(input1) 124 | low_motion = self.f1_motion(output_prev) 125 | 126 | ### output at middle scale(f2) 127 | output_prev = self.model1_2(self.model1_1(input2) + output_prev) 128 | middle_motion = self.model1_3(output_prev) 129 | middle_motion = middle_motion + nn.Upsample(scale_factor=2, mode='nearest')(low_motion) 130 | 131 | ### output at large scale(f3) 132 | output_prev = self.model2_2(self.model2_1(input3) + output_prev) 133 | high_motion = 
self.model2_3(output_prev) 134 | high_motion = high_motion + nn.Upsample(scale_factor=2, mode='nearest')(middle_motion) 135 | 136 | low_motion = low_motion.permute(0, 2, 3, 1) 137 | middle_motion = middle_motion.permute(0, 2, 3, 1) 138 | high_motion = high_motion.permute(0, 2, 3, 1) 139 | return [low_motion, middle_motion, high_motion] 140 | -------------------------------------------------------------------------------- /src/options/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/src/options/__init__.py -------------------------------------------------------------------------------- /src/options/parse_config.py: -------------------------------------------------------------------------------- 1 | import configparser 2 | from util import util 3 | import os 4 | import torch 5 | from abc import ABC 6 | 7 | 8 | class Options(): 9 | def __init__(self): 10 | pass 11 | 12 | 13 | def _get_value_from_ini(conf, section, option, type=str, default=None): 14 | if conf.has_option(section, option): 15 | if type == bool: 16 | return conf.get(section, option) == str(True) 17 | else: 18 | return type(conf.get(section, option)) 19 | else: 20 | return default 21 | 22 | 23 | def str2ids(input_str): 24 | str_ids = input_str.split(',') 25 | ids = [] 26 | for str_id in str_ids: 27 | id = int(str_id) 28 | if id >= 0: 29 | ids.append(id) 30 | return ids 31 | 32 | 33 | def str2floats(input_str): 34 | str_ids = input_str.split(',') 35 | ids = [] 36 | for str_id in str_ids: 37 | id = float(str_id) 38 | if id >= 0: 39 | ids.append(id) 40 | return ids 41 | 42 | 43 | def str2ints(input_str): 44 | str_ids = input_str.split(',') 45 | ids = [] 46 | for str_id in str_ids: 47 | id = int(str_id) 48 | if id >= 0: 49 | ids.append(id) 50 | return ids 51 | 52 | 53 | class ConfigParse(ABC): 54 | def __init__(self): 55 | self.conf = configparser.ConfigParser() 56 | self.opt = 0 57 | 58 | @staticmethod 59 | def get_opt_from_ini(self, file_name): 60 | return self.opt 61 | 62 | def setup_environment(self): 63 | # print options 64 | message = '' 65 | message += '----------------- Options ---------------\n' 66 | for k, v in vars(self.opt).items(): 67 | message += '{:>25}: {:<30}\n'.format(str(k), str(v)) 68 | message += '----------------- End -------------------' 69 | print(message) 70 | 71 | # save to the disk 72 | expr_dir = os.path.join(self.opt.checkpoints_dir, self.opt.name) 73 | util.mkdirs(expr_dir) 74 | file_name = os.path.join(expr_dir, '{}_opt.txt'.format(self.opt.phase)) 75 | with open(file_name, 'wt') as opt_file: 76 | opt_file.write(message) 77 | opt_file.write('\n') 78 | 79 | # set gpu ids 80 | if len(self.opt.gpu_ids) > 0: 81 | torch.cuda.set_device(self.opt.gpu_ids[0]) 82 | 83 | def setup_test_environment(self): 84 | # print options 85 | message = '' 86 | message += '----------------- Options ---------------\n' 87 | for k, v in vars(self.opt).items(): 88 | message += '{:>25}: {:<30}\n'.format(str(k), str(v)) 89 | message += '----------------- End -------------------' 90 | print(message) 91 | 92 | # set gpu ids 93 | if len(self.opt.gpu_ids) > 0: 94 | torch.cuda.set_device(self.opt.gpu_ids[0]) 95 | 96 | 97 | class Face2FaceRHOConfigParse(ConfigParse): 98 | def __init__(self): 99 | ConfigParse.__init__(self) 100 | 101 | def get_opt_from_ini(self, file_name): 102 | self.conf.read(file_name, encoding="utf-8") 103 | opt = Options() 104 | # basic config 105 | opt.name = 
self.conf.get("ROOT", "name") 106 | opt.gpu_ids = self.conf.get("ROOT", "gpu_ids") 107 | opt.gpu_ids = str2ids(opt.gpu_ids) 108 | opt.checkpoints_dir = self.conf.get("ROOT", "checkpoints_dir") 109 | opt.model = self.conf.get("ROOT", "model") 110 | opt.output_size = int(self.conf.get("ROOT", "output_size")) 111 | opt.isTrain = self.conf.get("ROOT", "isTrain") == 'True' 112 | opt.phase = self.conf.get("ROOT", "phase") 113 | opt.load_iter = int(self.conf.get("ROOT", "load_iter")) 114 | opt.epoch = int(self.conf.get("ROOT", "epoch")) 115 | 116 | # rendering module config 117 | opt.headpose_dims = int(self.conf.get("ROOT", "headpose_dims")) 118 | opt.mobilev2_encoder_channels = self.conf.get("ROOT", "mobilev2_encoder_channels") 119 | opt.mobilev2_encoder_channels = str2ints(opt.mobilev2_encoder_channels) 120 | opt.mobilev2_decoder_channels = self.conf.get("ROOT", "mobilev2_decoder_channels") 121 | opt.mobilev2_decoder_channels = str2ints(opt.mobilev2_decoder_channels) 122 | opt.mobilev2_encoder_layers = self.conf.get("ROOT", "mobilev2_encoder_layers") 123 | opt.mobilev2_encoder_layers = str2ints(opt.mobilev2_encoder_layers) 124 | opt.mobilev2_decoder_layers = self.conf.get("ROOT", "mobilev2_decoder_layers") 125 | opt.mobilev2_decoder_layers = str2ints(opt.mobilev2_decoder_layers) 126 | opt.mobilev2_encoder_expansion_factor = self.conf.get("ROOT", "mobilev2_encoder_expansion_factor") 127 | opt.mobilev2_encoder_expansion_factor = str2ints(opt.mobilev2_encoder_expansion_factor) 128 | opt.mobilev2_decoder_expansion_factor = self.conf.get("ROOT", "mobilev2_decoder_expansion_factor") 129 | opt.mobilev2_decoder_expansion_factor = str2ints(opt.mobilev2_decoder_expansion_factor) 130 | opt.headpose_embedding_ngf = int(self.conf.get("ROOT", "headpose_embedding_ngf")) 131 | 132 | # motion module config 133 | opt.mn_ngf = int(self.conf.get("ROOT", "mn_ngf")) 134 | opt.n_local_enhancers = int(self.conf.get("ROOT", "n_local_enhancers")) 135 | opt.mn_n_downsampling = int(self.conf.get("ROOT", "mn_n_downsampling")) 136 | opt.mn_n_blocks_local = int(self.conf.get("ROOT", "mn_n_blocks_local")) 137 | 138 | # discriminator 139 | opt.disc_scales = [1] 140 | opt.disc_block_expansion = int(self.conf.get("ROOT", "disc_block_expansion")) 141 | opt.disc_num_blocks = int(self.conf.get("ROOT", "disc_num_blocks")) 142 | opt.disc_max_features = int(self.conf.get("ROOT", "disc_max_features")) 143 | 144 | # training parameters 145 | opt.init_type = self.conf.get("ROOT", "init_type") 146 | opt.init_gain = float(self.conf.get("ROOT", "init_gain")) 147 | opt.emphasize_face_area = self.conf.get("ROOT", "emphasize_face_area") == 'True' 148 | opt.loss_scales = self.conf.get("ROOT", "loss_scales") 149 | opt.loss_scales = str2floats(opt.loss_scales) 150 | opt.warp_loss_weight = float(self.conf.get("ROOT", "warp_loss_weight")) 151 | opt.reconstruction_loss_weight = float(self.conf.get("ROOT", "reconstruction_loss_weight")) 152 | opt.feature_matching_loss_weight = float(self.conf.get("ROOT", "feature_matching_loss_weight")) 153 | opt.face_area_weight_scale = float(self.conf.get("ROOT", "face_area_weight_scale")) 154 | opt.init_field_epochs = int(self.conf.get("ROOT", "init_field_epochs")) 155 | opt.lr = float(self.conf.get("ROOT", "lr")) 156 | opt.beta1 = float(self.conf.get("ROOT", "beta1")) 157 | opt.lr_policy = self.conf.get("ROOT", "lr_policy") 158 | opt.epoch_count = int(self.conf.get("ROOT", "epoch_count")) 159 | opt.niter = int(self.conf.get("ROOT", "niter")) 160 | opt.niter_decay = int(self.conf.get("ROOT", "niter_decay")) 161 | 
opt.continue_train = self.conf.get("ROOT", "continue_train") == 'True' 162 | 163 | # dataset parameters 164 | opt.dataset_mode = self.conf.get("ROOT", "dataset_mode") 165 | opt.dataroot = self.conf.get("ROOT", "dataroot") 166 | opt.num_repeats = int(self.conf.get("ROOT", "num_repeats")) 167 | opt.batch_size = int(self.conf.get("ROOT", "batch_size")) 168 | opt.serial_batches = self.conf.get("ROOT", "serial_batches") == 'True' 169 | opt.num_threads = int(self.conf.get("ROOT", "num_threads")) 170 | opt.max_dataset_size = float("inf") 171 | 172 | # vis_config 173 | opt.display_freq = int(self.conf.get("ROOT", "display_freq")) 174 | opt.update_html_freq = int(self.conf.get("ROOT", "update_html_freq")) 175 | opt.display_id = int(self.conf.get("ROOT", "display_id")) 176 | opt.display_server = self.conf.get("ROOT", "display_server") 177 | opt.display_env = self.conf.get("ROOT", "display_env") 178 | opt.display_port = int(self.conf.get("ROOT", "display_port")) 179 | opt.print_freq = int(self.conf.get("ROOT", "print_freq")) 180 | opt.save_latest_freq = int(self.conf.get("ROOT", "save_latest_freq")) 181 | opt.save_epoch_freq = int(self.conf.get("ROOT", "save_epoch_freq")) 182 | opt.no_html = self.conf.get("ROOT", "no_html") == str(True) 183 | opt.display_winsize = int(self.conf.get("ROOT", "display_winsize")) 184 | opt.display_ncols = int(self.conf.get("ROOT", "display_ncols")) 185 | opt.verbose = self.conf.get("ROOT", "verbose") == 'True' 186 | self.opt = opt 187 | return self.opt 188 | 189 | -------------------------------------------------------------------------------- /src/reenact.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | from options.parse_config import Face2FaceRHOConfigParse 3 | from models import create_model 4 | import os 5 | import torch 6 | import numpy as np 7 | from util.landmark_image_generation import LandmarkImageGeneration 8 | from util.util import ( 9 | read_target, 10 | load_coeffs, 11 | load_landmarks 12 | ) 13 | import cv2 14 | from util.util import tensor2im 15 | 16 | 17 | def parse_args(): 18 | """Configurations.""" 19 | parser = argparse.ArgumentParser(description='test process of Face2FaceRHO') 20 | parser.add_argument('--config', type=str, required=True, 21 | help='.ini config file name') 22 | parser.add_argument('--src_img', type=str, required=True, 23 | help='input source actor image (.jpg, .jpg, .jpeg, .png)') 24 | parser.add_argument('--src_headpose', type=str, required=True, 25 | help='input head pose coefficients of source image (.txt)') 26 | parser.add_argument('--src_landmark', type=str, required=True, 27 | help='input facial landmarks of source image (.txt)') 28 | parser.add_argument('--drv_headpose', type=str, required=True, 29 | help='input head pose coefficients of driving image (.txt)') 30 | parser.add_argument('--drv_landmark', type=str, required=True, 31 | help='input driving facial landmarks (.txt, reconstructed by using shape coefficients ' 32 | 'of the source actor and expression and head pose coefficients of the driving actor)') 33 | 34 | parser.add_argument('--output_dir', type=str, 35 | default=os.path.join(os.path.dirname(__file__), '..', 'test_case', 'result'), 36 | help='output directory') 37 | 38 | return _check_args(parser.parse_args()) 39 | 40 | 41 | def _check_args(args): 42 | if args is None: 43 | raise RuntimeError('Invalid arguments!') 44 | return args 45 | 46 | 47 | def load_data(opt, headpose_file, landmark_file, img_file=None, load_img=True): 48 | face = dict() 49 | if 
load_img: 50 | face['img'] = read_target(img_file, opt.output_size) 51 | face['headpose'] = torch.from_numpy(np.array(load_coeffs(headpose_file))).float() 52 | face['landmarks'] = torch.from_numpy(np.array(load_landmarks(landmark_file))).float() 53 | return face 54 | 55 | 56 | if __name__ == '__main__': 57 | args = parse_args() 58 | config_parse = Face2FaceRHOConfigParse() 59 | opt = config_parse.get_opt_from_ini(args.config) 60 | config_parse.setup_environment() 61 | 62 | model = create_model(opt) 63 | model.setup(opt) 64 | model.eval() 65 | 66 | if not os.path.exists(args.output_dir): 67 | os.makedirs(args.output_dir) 68 | 69 | src_face = load_data(opt, args.src_headpose, args.src_landmark, args.src_img) 70 | drv_face = load_data(opt, args.drv_headpose, args.drv_landmark, load_img=False) 71 | 72 | landmark_img_generator = LandmarkImageGeneration(opt) 73 | 74 | # off-line stage 75 | src_face['landmark_img'] = landmark_img_generator.generate_landmark_img(src_face['landmarks']) 76 | src_face['landmark_img'] = [value.unsqueeze(0) for value in src_face['landmark_img']] 77 | model.set_source_face(src_face['img'].unsqueeze(0), src_face['headpose'].unsqueeze(0)) 78 | 79 | # on-line stage 80 | drv_face['landmark_img'] = landmark_img_generator.generate_landmark_img(drv_face['landmarks']) 81 | drv_face['landmark_img'] = [value.unsqueeze(0) for value in drv_face['landmark_img']] 82 | model.reenactment(src_face['landmark_img'], drv_face['headpose'].unsqueeze(0), drv_face['landmark_img']) 83 | 84 | visual_results = model.get_current_visuals() 85 | output_file_name = args.output_dir + "/result.png" 86 | im = tensor2im(visual_results['fake']) 87 | im = cv2.cvtColor(im, cv2.COLOR_RGB2BGR) 88 | cv2.imwrite(output_file_name, im) 89 | -------------------------------------------------------------------------------- /src/train.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import time 3 | from options.parse_config import Face2FaceRHOConfigParse 4 | from dataset import CreateDataLoader 5 | from models import create_model 6 | from util.visualizer import Visualizer 7 | import os 8 | 9 | 10 | def parse_args(): 11 | """Configurations.""" 12 | parser = argparse.ArgumentParser(description='training process of Face2FaceRHO') 13 | parser.add_argument('--config', type=str, required=True, help='.ini config file name') 14 | return _check_args(parser.parse_args()) 15 | 16 | 17 | def _check_args(args): 18 | if args is None: 19 | raise RuntimeError('Invalid arguments!') 20 | return args 21 | 22 | 23 | if __name__ == '__main__': 24 | print(os.getcwd()) 25 | args = parse_args() 26 | config_parse = Face2FaceRHOConfigParse() 27 | opt = config_parse.get_opt_from_ini(args.config) # get training options 28 | config_parse.setup_environment() 29 | 30 | data_loader = CreateDataLoader(opt) 31 | dataset = data_loader.load_data() 32 | dataset_size = len(data_loader) 33 | print('#training images = %d' % dataset_size) 34 | 35 | model = create_model(opt) 36 | model.setup(opt) 37 | 38 | visualizer = Visualizer(opt) 39 | total_steps = 0 40 | 41 | display_inter = -1 42 | print_inter = -1 43 | save_latest_inter = -1 44 | 45 | for epoch in range(opt.epoch_count, opt.niter + opt.niter_decay + 1): 46 | epoch_start_time = time.time() 47 | iter_data_time = time.time() 48 | epoch_iter = 0 # iterator within an epoch 49 | 50 | for i, data in enumerate(dataset): 51 | iter_start_time = time.time() 52 | if total_steps % opt.print_freq == 0: 53 | t_data = iter_start_time - iter_data_time 54 | 
visualizer.reset() 55 | total_steps += opt.batch_size 56 | epoch_iter += opt.batch_size 57 | 58 | model.set_input(data) 59 | model.optimize_parameters(epoch) 60 | print("total steps: {}".format(total_steps)) 61 | 62 | if total_steps // opt.display_freq > display_inter: 63 | display_inter = total_steps // opt.display_freq 64 | save_result = total_steps % opt.update_html_freq == 0 65 | visualizer.display_current_results(model.get_current_visuals(), epoch, False) 66 | 67 | if total_steps // opt.print_freq > print_inter: 68 | print_inter = total_steps // opt.print_freq 69 | losses = model.get_current_losses() 70 | t = (time.time() - iter_start_time) / opt.batch_size 71 | visualizer.print_current_losses(epoch, epoch_iter, losses, t, t_data) 72 | if opt.display_id > 0: 73 | visualizer.plot_current_losses(epoch, float(epoch_iter) / dataset_size, opt, losses) 74 | 75 | if total_steps // opt.save_latest_freq > save_latest_inter: 76 | save_latest_inter = total_steps // opt.save_latest_freq 77 | print('saving the latest model (epoch %d, total_steps %d)' % (epoch, total_steps)) 78 | # model.save_networks('latest') 79 | save_suffix = 'iter_%d' % total_steps 80 | model.save_networks(save_suffix) 81 | 82 | iter_data_time = time.time() 83 | if epoch % opt.save_epoch_freq == 0: 84 | print('saving the model at the end of epoch %d, iters %d' % (epoch, total_steps)) 85 | model.save_networks('latest') 86 | model.save_networks(epoch) 87 | 88 | print('End of epoch %d / %d \t Time Taken: %d sec' % 89 | (epoch, opt.niter + opt.niter_decay, time.time() - epoch_start_time)) 90 | model.update_learning_rate() -------------------------------------------------------------------------------- /src/util/html.py: -------------------------------------------------------------------------------- 1 | import dominate 2 | from dominate.tags import meta, h3, table, tr, td, p, a, img, br 3 | import os 4 | 5 | 6 | class HTML: 7 | def __init__(self, web_dir, title, reflesh=0): 8 | self.title = title 9 | self.web_dir = web_dir 10 | self.img_dir = os.path.join(self.web_dir, 'images') 11 | if not os.path.exists(self.web_dir): 12 | os.makedirs(self.web_dir) 13 | if not os.path.exists(self.img_dir): 14 | os.makedirs(self.img_dir) 15 | # print(self.img_dir) 16 | 17 | self.doc = dominate.document(title=title) 18 | if reflesh > 0: 19 | with self.doc.head: 20 | meta(http_equiv="reflesh", content=str(reflesh)) 21 | 22 | def get_image_dir(self): 23 | return self.img_dir 24 | 25 | def add_header(self, str): 26 | with self.doc: 27 | h3(str) 28 | 29 | def add_table(self, border=1): 30 | self.t = table(border=border, style="table-layout: fixed;") 31 | self.doc.add(self.t) 32 | 33 | def add_images(self, ims, txts, links, width=400): 34 | self.add_table() 35 | with self.t: 36 | with tr(): 37 | for im, txt, link in zip(ims, txts, links): 38 | with td(style="word-wrap: break-word;", halign="center", valign="top"): 39 | with p(): 40 | with a(href=os.path.join('images', link)): 41 | img(style="width:%dpx" % width, src=os.path.join('images', im)) 42 | br() 43 | p(txt) 44 | 45 | def save(self): 46 | html_file = '%s/index.html' % self.web_dir 47 | f = open(html_file, 'wt') 48 | f.write(self.doc.render()) 49 | f.close() 50 | 51 | 52 | if __name__ == '__main__': 53 | html = HTML('web/', 'test_html') 54 | html.add_header('hello world') 55 | 56 | ims = [] 57 | txts = [] 58 | links = [] 59 | for n in range(4): 60 | ims.append('image_%d.png' % n) 61 | txts.append('text_%d' % n) 62 | links.append('image_%d.png' % n) 63 | html.add_images(ims, txts, links) 64 | 
html.save() 65 | -------------------------------------------------------------------------------- /src/util/landmark_image_generation.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import torchvision.transforms as transforms 3 | import numpy as np 4 | 5 | 6 | class LandmarkImageGeneration: 7 | def __init__(self, opt): 8 | self.output_size = opt.output_size 9 | self.landmark_img_sizes = [ 10 | int(opt.output_size / 16), 11 | int(opt.output_size / 8), 12 | int(opt.output_size / 4), 13 | ] 14 | 15 | self.facial_part_color = { 16 | # B G R 17 | 'left_eyebrow': [0, 0, 255], # red 18 | 'right_eyebrow': [0, 255, 0], # green 19 | 'left_eye': [255, 0, 0], # blue 20 | 'right_eye': [255, 255, 0], # cyan 21 | 'nose': [255, 0, 255], # purple 22 | 'mouth': [0, 255, 255], # yellow 23 | 'face_contour': [125, 125, 125], # gray 24 | } 25 | 26 | self.facial_part = [ 27 | { 28 | 'left_eyebrow': [], 29 | 'right_eyebrow': [], 30 | 'left_eye': [42], # 1 31 | 'right_eye': [50], # 1 32 | 'nose': [30], # 1 33 | 'mouth': [52, 57], # 2 34 | 'face_contour': [0, 8, 9], # 3 35 | }, 36 | { 37 | 'left_eyebrow': [17, 21], # 2 38 | 'right_eyebrow': [26, 22], # 2 39 | 'left_eye': [36, 38, 40, 42, 36], # 4 40 | 'right_eye': [48, 46, 44, 50, 48], # 4 41 | 'nose': [[27, 33, 31], [33, 35]], # 4 42 | 'mouth': [[52, 62, 57, 71, 52], [63], [70]], # 6 43 | 'face_contour': [0, 4, 8, 13, 9], # 5 44 | }, 45 | { 46 | 'left_eyebrow': [17, 18, 19, 20, 21], # 5 47 | 'right_eyebrow': [26, 25, 24, 23, 22], # 5 48 | 'left_eye': [36, 37, 38, 39, 40, 41, 42, 43, 36], # 8 49 | 'right_eye': [48, 47, 46, 45, 44, 51, 50, 48, 48], # 8 50 | 'nose': [[27, 28, 29, 30, 33], [31, 32, 33], [35, 34, 33]], # 9 51 | 'mouth': [[52, 54, 53, 62, 58, 59, 57, 65, 66, 71, 69, 68, 52], [55, 56, 63, 61, 60, 64, 70, 67, 55]], # 20 52 | 'face_contour': [0, 1, 2, 3, 4, 5, 6, 7, 8, 16, 15, 14, 13, 12, 11, 10, 9], # 17 53 | } 54 | ] 55 | 56 | def generate_landmark_img(self, landmarks): 57 | landmark_imgs = [] 58 | for i in range(len(self.landmark_img_sizes)): 59 | cur_landmarks = landmarks.clone() 60 | image_size = self.landmark_img_sizes[i] 61 | 62 | cur_landmarks[:, 0:1] = (cur_landmarks[:, 0:1] + 1) / 2 * (image_size - 1) 63 | cur_landmarks[:, 1:2] = (cur_landmarks[:, 1:2] + 1) / 2 * (image_size - 1) 64 | 65 | cur_facial_part = self.facial_part[i] 66 | line_width = 1 67 | 68 | landmark_img = np.zeros((image_size, image_size, 3), dtype=np.uint8) 69 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['left_eyebrow'], self.facial_part_color['left_eyebrow'], line_width) 70 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['right_eyebrow'], self.facial_part_color['right_eyebrow'], line_width) 71 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['left_eye'], self.facial_part_color['left_eye'], line_width) 72 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['right_eye'], self.facial_part_color['right_eye'], line_width) 73 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['nose'], self.facial_part_color['nose'], line_width) 74 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['mouth'], self.facial_part_color['mouth'], line_width) 75 | self.draw_line(landmark_img, cur_landmarks, cur_facial_part['face_contour'], self.facial_part_color['face_contour'], line_width) 76 | landmark_img = 2.0 * transforms.ToTensor()(landmark_img.astype(np.float32)) / 255.0 - 1.0 77 | 78 | landmark_imgs.append(landmark_img) 79 | return landmark_imgs 80 | 81 | @staticmethod 82 
| def draw_line(landmark_map, projected_landmarks, line_ids, line_color, line_width): 83 | if len(line_ids) == 1: # only single point 84 | center_x = int(projected_landmarks[line_ids[0], 0]) 85 | center_y = int(projected_landmarks[line_ids[0], 1]) 86 | cv2.circle(landmark_map, (center_x, center_y), line_width, line_color, -1, cv2.LINE_4) 87 | elif len(line_ids) > 1: 88 | if isinstance(line_ids[0], list): 89 | for i in range(len(line_ids)): 90 | if len(line_ids[i]) == 1: 91 | center_x = int(projected_landmarks[line_ids[i][0], 0]) 92 | center_y = int(projected_landmarks[line_ids[i][0], 1]) 93 | cv2.circle(landmark_map, (center_x, center_y), line_width, line_color, -1, cv2.LINE_4) 94 | else: 95 | for j in range(len(line_ids[i]) - 1): 96 | pt1_x = int(projected_landmarks[line_ids[i][j], 0]) 97 | pt1_y = int(projected_landmarks[line_ids[i][j], 1]) 98 | pt2_x = int(projected_landmarks[line_ids[i][j + 1], 0]) 99 | pt2_y = int(projected_landmarks[line_ids[i][j + 1], 1]) 100 | pt1 = (int(pt1_x), int(pt1_y)) 101 | pt2 = (int(pt2_x), int(pt2_y)) 102 | cv2.line(landmark_map, pt1, pt2, line_color, line_width, cv2.LINE_4) 103 | else: 104 | for i in range(len(line_ids) - 1): 105 | pt1_x = int(projected_landmarks[line_ids[i], 0]) 106 | pt1_y = int(projected_landmarks[line_ids[i], 1]) 107 | pt2_x = int(projected_landmarks[line_ids[i+1], 0]) 108 | pt2_y = int(projected_landmarks[line_ids[i + 1], 1]) 109 | pt1 = (int(pt1_x), int(pt1_y)) 110 | pt2 = (int(pt2_x), int(pt2_y)) 111 | cv2.line(landmark_map, pt1, pt2, line_color, line_width, cv2.LINE_4) 112 | 113 | 114 | 115 | -------------------------------------------------------------------------------- /src/util/util.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from PIL import Image 3 | import os 4 | import torchvision.transforms as transforms 5 | import torch 6 | import cv2 7 | 8 | 9 | def make_ids(path): 10 | ids = [] 11 | frames = os.listdir(path) 12 | for frame in frames: 13 | (filename, extension) = os.path.splitext(frame) 14 | ids.append(int(filename)) 15 | ids = sorted(ids) 16 | return ids 17 | 18 | 19 | def read_target(file_name, output_size): 20 | pil_target = Image.open(file_name) 21 | if pil_target.size[0] != output_size: 22 | pil_target = transforms.Resize((output_size, output_size), interpolation=Image.BILINEAR)(pil_target) 23 | img_numpy = np.asarray(pil_target) 24 | TARGET = 2.0 * transforms.ToTensor()(img_numpy.astype(np.float32)) / 255.0 - 1.0 25 | return TARGET 26 | 27 | 28 | def load_coeffs(input_file_name): 29 | file = open(input_file_name, "r") 30 | coeffs = [float(line) for line in file] 31 | file.close() 32 | return coeffs 33 | 34 | 35 | def load_landmarks(file_name): 36 | landmarks = [] 37 | file = open(file_name, 'r') 38 | for line in file: 39 | s1 = line.split(' ') 40 | landmarks.append([float(s1[0]), float(s1[1])]) 41 | file.close() 42 | return landmarks 43 | 44 | 45 | def mkdir(path): 46 | if not os.path.exists(path): 47 | os.makedirs(path) 48 | 49 | 50 | def mkdirs(paths): 51 | if isinstance(paths, list) and not isinstance(paths, str): 52 | for path in paths: 53 | mkdir(path) 54 | else: 55 | mkdir(paths) 56 | 57 | 58 | def tensor2im(input_image, imtype=np.uint8, bs=0): 59 | if isinstance(input_image, torch.Tensor): 60 | input_image = torch.clamp(input_image, -1.0, 1.0) 61 | image_tensor = input_image.data 62 | else: 63 | return input_image 64 | image_numpy = image_tensor[bs].cpu().float().numpy() 65 | if image_numpy.shape[0] == 1: 66 | image_numpy = 
np.tile(image_numpy, (3, 1, 1)) 67 | image_numpy = (np.transpose(image_numpy, (1, 2, 0)) + 1) / 2.0 * 255.0 68 | return image_numpy.astype(imtype) 69 | 70 | 71 | def show_mask(mask, bs=0): 72 | image_tensor = mask.data 73 | image_tensor = image_tensor[bs:bs+1, ...].cpu() 74 | mask_image = torch.ones(image_tensor.shape, dtype=torch.float32) 75 | mask_image = torch.where(image_tensor, torch.ones_like(mask_image), torch.zeros_like(mask_image)) 76 | mask_image = mask_image.cpu().squeeze(0).numpy() 77 | mask_image = mask_image * 255 78 | mask_image = mask_image.astype(np.uint8) 79 | mask_image = cv2.cvtColor(mask_image, cv2.COLOR_GRAY2RGB) 80 | return mask_image 81 | 82 | 83 | def save_image(image_numpy, image_path, aspect_ratio=1.0): 84 | """Save a numpy image to the disk 85 | 86 | Parameters: 87 | image_numpy (numpy array) -- input numpy array 88 | image_path (str) -- the path of the image 89 | """ 90 | 91 | image_pil = Image.fromarray(image_numpy) 92 | h, w, _ = image_numpy.shape 93 | 94 | if aspect_ratio > 1.0: 95 | image_pil = image_pil.resize((h, int(w * aspect_ratio)), Image.BICUBIC) 96 | if aspect_ratio < 1.0: 97 | image_pil = image_pil.resize((int(h / aspect_ratio), w), Image.BICUBIC) 98 | image_pil.save(image_path) 99 | 100 | 101 | def save_coeffs(output_file_name, coeffs): 102 | with open(output_file_name, "w") as f: 103 | for coeff in coeffs: 104 | f.write(str(coeff) + "\n") 105 | 106 | 107 | def save_landmarks(output_file_name, landmarks): 108 | with open(output_file_name, "w") as f: 109 | for landmark in landmarks: 110 | f.write(str(landmark[0]) + " " + str(landmark[1]) + "\n") 111 | -------------------------------------------------------------------------------- /src/util/visualizer.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | import sys 4 | import time 5 | from . import util 6 | from . import html 7 | from PIL import Image 8 | import visdom 9 | 10 | if sys.version_info[0] == 2: 11 | VisdomExceptionBase = Exception 12 | else: 13 | VisdomExceptionBase = ConnectionError 14 | 15 | 16 | class Visualizer(): 17 | def __init__(self, opt): 18 | self.display_id = opt.display_id 19 | self.use_html = opt.isTrain and not opt.no_html 20 | self.win_size = opt.display_winsize 21 | self.name = opt.name 22 | self.opt = opt 23 | self.saved = False 24 | if self.display_id > 0: 25 | self.ncols = opt.display_ncols 26 | self.vis = visdom.Visdom(server=opt.display_server, port=opt.display_port, env=opt.display_env, raise_exceptions=True) 27 | if self.use_html: 28 | self.web_dir = os.path.join(opt.checkpoints_dir, opt.name, 'web') 29 | self.img_dir = os.path.join(self.web_dir, 'images') 30 | print('create web directory %s...' % self.web_dir) 31 | util.mkdirs([self.web_dir, self.img_dir]) 32 | self.log_name = os.path.join(opt.checkpoints_dir, opt.name, 'loss_log.txt') 33 | with open(self.log_name, "a") as log_file: 34 | now = time.strftime("%c") 35 | log_file.write('================ Training Loss (%s) ================\n' % now) 36 | 37 | def reset(self): 38 | self.saved = False 39 | 40 | def throw_visdom_connection_error(self): 41 | print('\n\nCould not connect to Visdom server (https://github.com/facebookresearch/visdom) for displaying training progress.\nYou can suppress connection to Visdom using the option --display_id -1. 
To install visdom, run \n$ pip install visdom\n, and start the server by \n$ python -m visdom.server.\n\n') 42 | exit(1) 43 | 44 | # |visuals|: dictionary of images to display or save 45 | def display_current_results(self, visuals, epoch, save_result, aspect_ratio=1.0, width=256): 46 | if self.display_id > 0: # show images in the browser 47 | ncols = self.ncols 48 | if ncols > 0: 49 | ncols = min(ncols, len(visuals)) 50 | h, w = next(iter(visuals.values())).shape[2:4] 51 | # h, w = 256,256 52 | height = int(width * h / float(w)) 53 | h = height 54 | w = width 55 | table_css = """<style> 56 | table {border-collapse: separate; border-spacing: 4px; white-space: nowrap; text-align: center} 57 | table td {width: %dpx; height: %dpx; padding: 4px; outline: 4px solid black} 58 | </style>""" % (w, h) 59 | title = self.name 60 | label_html = '' 61 | label_html_row = '' 62 | images = [] 63 | idx = 0 64 | for label, image in visuals.items(): 65 | if label == 'drv_face_mask': 66 | image_numpy = util.show_mask(image) 67 | else: 68 | image_numpy = util.tensor2im(image, np.uint8) 69 | image_numpy = np.array(Image.fromarray(image_numpy).resize((h, w))) 70 | image_numpy = image_numpy.transpose([2, 0, 1]) 71 | label_html_row += '<td>%s</td>' % label 72 | images.append(image_numpy) 73 | idx += 1 74 | if idx % ncols == 0: 75 | label_html += '<tr>%s</tr>' % label_html_row 76 | label_html_row = '' 77 | white_image = np.ones_like(image_numpy) * 255 78 | while idx % ncols != 0: 79 | images.append(white_image) 80 | label_html_row += '<td></td>' 81 | idx += 1 82 | if label_html_row != '': 83 | label_html += '<tr>%s</tr>' % label_html_row 84 | # pane col = image row 85 | try: 86 | self.vis.images(images, nrow=ncols, win=self.display_id + 1, padding=2, opts=dict(title=title + ' images')) 87 | label_html = '<table>%s</table>' % label_html
88 | self.vis.text(table_css + label_html, win=self.display_id + 2, 89 | opts=dict(title=title + ' labels')) 90 | except VisdomExceptionBase: 91 | self.throw_visdom_connection_error() 92 | 93 | else: 94 | idx = 1 95 | for label, image in visuals.items(): 96 | image_numpy = util.tensor2im(image) 97 | self.vis.image(image_numpy.transpose([2, 0, 1]), opts=dict(title=label), 98 | win=self.display_id + idx) 99 | idx += 1 100 | 101 | if self.use_html and (save_result or not self.saved): # save images to a html file 102 | self.saved = True 103 | for label, image in visuals.items(): 104 | image_numpy = util.tensor2im(image) 105 | img_path = os.path.join(self.img_dir, 'epoch%.3d_%s.png' % (epoch, label)) 106 | util.save_image(image_numpy, img_path) 107 | # update website 108 | webpage = html.HTML(self.web_dir, 'Experiment name = %s' % self.name, refresh=1) 109 | for n in range(epoch, 0, -1): 110 | webpage.add_header('epoch [%d]' % n) 111 | ims, txts, links = [], [], [] 112 | 113 | for label, image in visuals.items(): 114 | image_numpy = util.tensor2im(image) 115 | img_path = 'epoch%.3d_%s.png' % (n, label) 116 | ims.append(img_path) 117 | txts.append(label) 118 | links.append(img_path) 119 | webpage.add_images(ims, txts, links, width=self.win_size) 120 | webpage.save() 121 | 122 | # losses: dictionary of error labels and values 123 | def plot_current_losses(self, epoch, counter_ratio, opt, losses): 124 | if not hasattr(self, 'plot_data'): 125 | self.plot_data = {'X': [], 'Y': [], 'legend': list(losses.keys())} 126 | self.plot_data['X'].append(epoch + counter_ratio) 127 | self.plot_data['Y'].append([losses[k] for k in self.plot_data['legend']]) 128 | try: 129 | self.vis.line( 130 | X=np.stack([np.array(self.plot_data['X'])] * len(self.plot_data['legend']), 1), 131 | Y=np.array(self.plot_data['Y']), 132 | opts={ 133 | 'title': self.name + ' loss over time', 134 | 'legend': self.plot_data['legend'], 135 | 'xlabel': 'epoch', 136 | 'ylabel': 'loss'}, 137 | win=self.display_id) 138 | except VisdomExceptionBase: 139 | self.throw_visdom_connection_error() 140 | 141 | # losses: same format as |losses| of plot_current_losses 142 | def print_current_losses(self, epoch, i, losses, t, t_data): 143 | message = '(epoch: %d, iters: %d, time: %.3f, data: %.3f) ' % (epoch, i, t, t_data) 144 | for k, v in losses.items(): 145 | message += '%s: %.3f ' % (k, v) 146 | 147 | print(message) 148 | with open(self.log_name, "a") as log_file: 149 | log_file.write('%s\n' % message) 150 | 151 | 152 | # losses: dictionary of error labels and values 153 | def plot_current_validation_error(self, epoch, counter_ratio, losses): 154 | if not hasattr(self, 'plot_validation_data'): 155 | self.plot_validation_data = {'X': [], 'Y': [], 'legend': list(losses.keys())} 156 | self.plot_validation_data['X'].append(epoch + counter_ratio) 157 | self.plot_validation_data['Y'].append([losses[k] for k in self.plot_validation_data['legend']]) 158 | try: 159 | self.vis.line( 160 | X=np.stack([np.array(self.plot_validation_data['X'])] * len(self.plot_validation_data['legend']), 1), 161 | Y=np.array(self.plot_validation_data['Y']), 162 | opts={ 163 | 'title': self.name + ' validation error over time', 164 | 'legend': self.plot_validation_data['legend'], 165 | 'xlabel': 'epoch', 166 | 'ylabel': 'error'}, 167 | win=self.display_id+1) 168 | except VisdomExceptionBase: 169 | self.throw_visdom_connection_error() -------------------------------------------------------------------------------- /test_case/driving/FLAME/headpose.txt:
-------------------------------------------------------------------------------- 1 | 0.13625283539295197 2 | 0.21206384897232056 3 | -0.007699888199567795 4 | 0.28585067896505223 5 | -0.2442243018234897 6 | 0.0011781832587462737 7 | -------------------------------------------------------------------------------- /test_case/driving/FLAME/landmark.txt: -------------------------------------------------------------------------------- 1 | -0.31516164541244507 -0.179478719830513 2 | -0.30770164728164673 -0.07279782742261887 3 | -0.29041358828544617 0.030962064862251282 4 | -0.273349791765213 0.109856516122818 5 | -0.2382015734910965 0.1876646876335144 6 | -0.16550007462501526 0.24567601084709167 7 | -0.09306877106428146 0.281278133392334 8 | -0.007662743330001831 0.32588326930999756 9 | 0.07144586741924286 0.3382234573364258 10 | 0.3539423644542694 -0.18897154927253723 11 | 0.34861069917678833 -0.08055130392313004 12 | 0.329071581363678 0.02442900836467743 13 | 0.31596314907073975 0.10410526394844055 14 | 0.28633609414100647 0.18222516775131226 15 | 0.25385111570358276 0.24201679229736328 16 | 0.20868098735809326 0.2786073088645935 17 | 0.14341698586940765 0.32407960295677185 18 | -0.20078876614570618 -0.2900853157043457 19 | -0.1561872661113739 -0.300489604473114 20 | -0.10244493931531906 -0.3071072995662689 21 | -0.051480911672115326 -0.3058977425098419 22 | 0.014424465596675873 -0.2898973226547241 23 | 0.1638312041759491 -0.2904817461967468 24 | 0.22384917736053467 -0.3062269687652588 25 | 0.26421189308166504 -0.3074803352355957 26 | 0.29964524507522583 -0.30190491676330566 27 | 0.3224305212497711 -0.29381853342056274 28 | 0.089569091796875 -0.20857898890972137 29 | 0.09757955372333527 -0.14829565584659576 30 | 0.10460248589515686 -0.10214441269636154 31 | 0.1119510680437088 -0.05317731201648712 32 | 0.029375575482845306 -0.0040546804666519165 33 | 0.060426533222198486 0.0071538835763931274 34 | 0.0912569910287857 0.015764638781547546 35 | 0.11840483546257019 0.006637006998062134 36 | 0.141998291015625 -0.004986941814422607 37 | -0.14466696977615356 -0.18221096694469452 38 | -0.12468508630990982 -0.18699705600738525 39 | -0.09439322352409363 -0.18846870958805084 40 | -0.05997729301452637 -0.18889926373958588 41 | -0.017359264194965363 -0.1823589950799942 42 | -0.05062717944383621 -0.17332319915294647 43 | -0.10127006471157074 -0.1677996665239334 44 | -0.13213858008384705 -0.17395025491714478 45 | 0.15765415132045746 -0.1818913072347641 46 | 0.20564237236976624 -0.1873338520526886 47 | 0.23834621906280518 -0.1874956637620926 48 | 0.2605167031288147 -0.18757116794586182 49 | 0.2730824053287506 -0.1842278093099594 50 | 0.26508164405822754 -0.17632071673870087 51 | 0.24249637126922607 -0.16950492560863495 52 | 0.19319242238998413 -0.17441074550151825 53 | -0.04412370175123215 0.11576831340789795 54 | 0.057739436626434326 0.07976815104484558 55 | -0.0005059316754341125 0.08937084674835205 56 | -0.04149330407381058 0.11526310443878174 57 | 0.007847115397453308 0.10679468512535095 58 | 0.19560709595680237 0.11232137680053711 59 | 0.12210072576999664 0.07869571447372437 60 | 0.1672673374414444 0.08678025007247925 61 | 0.19456690549850464 0.11193358898162842 62 | 0.150030717253685 0.1048298180103302 63 | 0.09031131863594055 0.08315038681030273 64 | 0.08297209441661835 0.10099273920059204 65 | 0.14698274433612823 0.12132978439331055 66 | 0.17087598145008087 0.143466979265213 67 | 0.1295694261789322 0.15547624230384827 68 | 0.02747763693332672 0.12202847003936768 69 | 0.0011781379580497742 
0.14499148726463318 70 | 0.05990856885910034 0.15536382794380188 71 | 0.09299695491790771 0.1228334903717041 72 | 0.09577301144599915 0.15782442688941956 73 | -------------------------------------------------------------------------------- /test_case/driving/driving.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/test_case/driving/driving.jpg -------------------------------------------------------------------------------- /test_case/driving/original/headpose.txt: -------------------------------------------------------------------------------- 1 | -0.0358294 2 | 0.193555 3 | 0.0186145 4 | 0.279976 5 | -0.240904 6 | 0.00114472 7 | -------------------------------------------------------------------------------- /test_case/driving/original/landmark.txt: -------------------------------------------------------------------------------- 1 | -0.32812726497650146 -0.17575065791606903 2 | -0.3193890452384949 -0.09410519152879715 3 | -0.30486586689949036 -0.00901447981595993 4 | -0.2908157706260681 0.06765569001436234 5 | -0.2622595429420471 0.15148618817329407 6 | -0.19879738986492157 0.2215268909931183 7 | -0.12630710005760193 0.2841624915599823 8 | -0.04811964929103851 0.33685097098350525 9 | 0.05803518369793892 0.35090646147727966 10 | 0.3353678286075592 -0.18283289670944214 11 | 0.3339826464653015 -0.10109979659318924 12 | 0.3243960738182068 -0.01574944518506527 13 | 0.318934828042984 0.06125636026263237 14 | 0.30464500188827515 0.14568272233009338 15 | 0.26899731159210205 0.2171289026737213 16 | 0.21716704964637756 0.2810239791870117 17 | 0.15250642597675323 0.3350328505039215 18 | -0.18008267879486084 -0.2903229892253876 19 | -0.14966793358325958 -0.31077027320861816 20 | -0.09212638437747955 -0.32110685110092163 21 | -0.05061572417616844 -0.3146970570087433 22 | 0.005072138272225857 -0.2991345524787903 23 | 0.12899674475193024 -0.3003069758415222 24 | 0.180558979511261 -0.31509026885032654 25 | 0.21502850949764252 -0.320447713136673 26 | 0.2612355649471283 -0.3097497820854187 27 | 0.2836766242980957 -0.2907673716545105 28 | 0.06661558896303177 -0.22068162262439728 29 | 0.0714191347360611 -0.17015986144542694 30 | 0.07822294533252716 -0.12035435438156128 31 | 0.08429558575153351 -0.0610278882086277 32 | 0.004675115458667278 -0.005552200134843588 33 | 0.03847338631749153 -0.0015566380461677909 34 | 0.07607473433017731 0.0036368665751069784 35 | 0.10979848355054855 -0.002425919519737363 36 | 0.1380128711462021 -0.006990542635321617 37 | -0.14075849950313568 -0.18631498515605927 38 | -0.11834703385829926 -0.1869308054447174 39 | -0.08572912961244583 -0.18787769973278046 40 | -0.05379549786448479 -0.1896868199110031 41 | -0.029561372473835945 -0.18963144719600677 42 | -0.05296739563345909 -0.1862848401069641 43 | -0.07900605350732803 -0.18137046694755554 44 | -0.10806751996278763 -0.18178780376911163 45 | 0.14968648552894592 -0.19157154858112335 46 | 0.16921043395996094 -0.1928856521844864 47 | 0.19890174269676208 -0.19118916988372803 48 | 0.22966940701007843 -0.1910347193479538 49 | 0.25184503197669983 -0.19051343202590942 50 | 0.22667716443538666 -0.18536078929901123 51 | 0.20031540095806122 -0.1843135505914688 52 | 0.1745179444551468 -0.18870475888252258 53 | -0.05384637042880058 0.11597751826047897 54 | 0.05857943370938301 0.06649796664714813 55 | -0.0063765887171030045 0.08337722718715668 56 | -0.04636808857321739 0.11252179741859436 57 | 
0.01774972304701805 0.1061658039689064 58 | 0.19657620787620544 0.11519702523946762 59 | 0.11550715565681458 0.06598487496376038 60 | 0.1680016964673996 0.08212072402238846 61 | 0.18980461359024048 0.11198490113019943 62 | 0.14404307305812836 0.10582582652568817 63 | 0.0883045643568039 0.07205545902252197 64 | 0.08565974235534668 0.1076158881187439 65 | 0.1478145271539688 0.11648828536272049 66 | 0.16730821132659912 0.1569463312625885 67 | 0.1291319578886032 0.17184729874134064 68 | 0.015439774841070175 0.11658688634634018 69 | -0.009288492612540722 0.15742817521095276 70 | 0.039102375507354736 0.1718333214521408 71 | 0.0865754559636116 0.1208595335483551 72 | 0.08545981347560883 0.17389750480651855 73 | -------------------------------------------------------------------------------- /test_case/source/FLAME/headpose.txt: -------------------------------------------------------------------------------- 1 | 0.1681414097547531 2 | 0.2536787986755371 3 | 0.06973648071289062 4 | 0.28077412562032567 5 | -0.25723848868262056 6 | 0.0012728127173397697 7 | -------------------------------------------------------------------------------- /test_case/source/FLAME/landmark.txt: -------------------------------------------------------------------------------- 1 | -0.3915994167327881 -0.1150280013680458 2 | -0.37632814049720764 8.036196231842041e-05 3 | -0.3505066931247711 0.11054354906082153 4 | -0.3252720236778259 0.19552060961723328 5 | -0.28039902448654175 0.27713948488235474 6 | -0.19613677263259888 0.34234046936035156 7 | -0.11438309401273727 0.38312917947769165 8 | -0.01678033173084259 0.43073731660842896 9 | 0.07022051513195038 0.43742769956588745 10 | 0.3226321339607239 -0.1867319792509079 11 | 0.3261435925960541 -0.06925588846206665 12 | 0.31598329544067383 0.04439428448677063 13 | 0.3083377480506897 0.13193431496620178 14 | 0.28226298093795776 0.21942484378814697 15 | 0.2552831470966339 0.29522982239723206 16 | 0.21195143461227417 0.34722304344177246 17 | 0.1464998871088028 0.4113011360168457 18 | -0.2687588632106781 -0.22776859998703003 19 | -0.21894890069961548 -0.2390923947095871 20 | -0.15942734479904175 -0.24691247940063477 21 | -0.1039041206240654 -0.2479831427335739 22 | -0.03141304850578308 -0.2352439910173416 23 | 0.12426505982875824 -0.24792760610580444 24 | 0.18584638833999634 -0.27190640568733215 25 | 0.22782588005065918 -0.2801172733306885 26 | 0.265785276889801 -0.28296464681625366 27 | 0.2891693413257599 -0.2797398269176483 28 | 0.05311575531959534 -0.16408880054950714 29 | 0.06664858758449554 -0.10286129266023636 30 | 0.0790686160326004 -0.05173478275537491 31 | 0.09205496311187744 0.002275601029396057 32 | 0.0029867887496948242 0.05675312876701355 33 | 0.03786562383174896 0.06785309314727783 34 | 0.07199119031429291 0.07482191920280457 35 | 0.1002703458070755 0.061978861689567566 36 | 0.12310999631881714 0.04548470675945282 37 | -0.20481428503990173 -0.13162489235401154 38 | -0.1838582158088684 -0.14527401328086853 39 | -0.15007027983665466 -0.15455220639705658 40 | -0.10895757377147675 -0.15663555264472961 41 | -0.06435224413871765 -0.13814763724803925 42 | -0.09962136298418045 -0.12134777009487152 43 | -0.15342330932617188 -0.11249685287475586 44 | -0.19158394634723663 -0.1175440102815628 45 | 0.1246703565120697 -0.15406309068202972 46 | 0.1749478280544281 -0.178548201918602 47 | 0.21306371688842773 -0.1833753138780594 48 | 0.23675093054771423 -0.18115878105163574 49 | 0.24895545840263367 -0.1723051518201828 50 | 0.24317428469657898 -0.15752866864204407 51 | 0.21697154641151428 
-0.14639832079410553 52 | 0.16456133127212524 -0.1455719769001007 53 | -0.07869639247655869 0.1957665979862213 54 | 0.038986608386039734 0.1469036042690277 55 | -0.026080477982759476 0.16074302792549133 56 | -0.07600720226764679 0.19541475176811218 57 | -0.01719493418931961 0.17700600624084473 58 | 0.18652254343032837 0.16827836632728577 59 | 0.10776787996292114 0.14032906293869019 60 | 0.1554509550333023 0.1435796022415161 61 | 0.18522456288337708 0.16786694526672363 62 | 0.1377050131559372 0.16322994232177734 63 | 0.07434439659118652 0.14701855182647705 64 | 0.06612437963485718 0.16539010405540466 65 | 0.1325272023677826 0.19573479890823364 66 | 0.1618194282054901 0.21479544043540955 67 | 0.1177782267332077 0.2364625632762909 68 | 0.0017322301864624023 0.2091311514377594 69 | -0.02514980360865593 0.23496738076210022 70 | 0.041609689593315125 0.24414023756980896 71 | 0.07389257848262787 0.2068658173084259 72 | 0.08118276298046112 0.24355071783065796 73 | -------------------------------------------------------------------------------- /test_case/source/original/headpose.txt: -------------------------------------------------------------------------------- 1 | 0.094624 2 | 0.224248 3 | 0.0906196 4 | 0.274085 5 | -0.258861 6 | 0.00126695 7 | -------------------------------------------------------------------------------- /test_case/source/original/landmark.txt: -------------------------------------------------------------------------------- 1 | -0.41398540139198303 -0.13855670392513275 2 | -0.39786043763160706 -0.04808853939175606 3 | -0.37566015124320984 0.04454704001545906 4 | -0.35401850938796997 0.12935957312583923 5 | -0.31501394510269165 0.22222258150577545 6 | -0.23676329851150513 0.3022463917732239 7 | -0.15003231167793274 0.37161487340927124 8 | -0.058873869478702545 0.426936537027359 9 | 0.05960548296570778 0.43604427576065063 10 | 0.30743855237960815 -0.21318182349205017 11 | 0.3125551640987396 -0.12159918248653412 12 | 0.3085859715938568 -0.026257745921611786 13 | 0.3093137741088867 0.06087388098239899 14 | 0.30156058073043823 0.1588819921016693 15 | 0.27255749702453613 0.25048595666885376 16 | 0.22408780455589294 0.3336263597011566 17 | 0.1595725566148758 0.4046099781990051 18 | -0.2424519956111908 -0.23273399472236633 19 | -0.20851489901542664 -0.2534434199333191 20 | -0.14444002509117126 -0.26778602600097656 21 | -0.09849899262189865 -0.2647925019264221 22 | -0.037037111818790436 -0.2545986771583557 23 | 0.0974816381931305 -0.2677493095397949 24 | 0.1518627256155014 -0.29011762142181396 25 | 0.18786311149597168 -0.3016672432422638 26 | 0.23710954189300537 -0.29916146397590637 27 | 0.26130804419517517 -0.28458142280578613 28 | 0.03486708551645279 -0.17625953257083893 29 | 0.045056603848934174 -0.11815688014030457 30 | 0.05814138054847717 -0.059055134654045105 31 | 0.07038316875696182 0.00965171679854393 32 | -0.018880583345890045 0.06478478014469147 33 | 0.01923198252916336 0.06748099625110626 34 | 0.06152241677045822 0.07035651803016663 35 | 0.09818808734416962 0.05957295000553131 36 | 0.1281459629535675 0.050056032836437225 37 | -0.20204558968544006 -0.13417282700538635 38 | -0.17514348030090332 -0.15104661881923676 39 | -0.13936007022857666 -0.1594904065132141 40 | -0.1013808473944664 -0.1585645079612732 41 | -0.06935463100671768 -0.13894785940647125 42 | -0.10236678272485733 -0.1264077126979828 43 | -0.13617637753486633 -0.12006562948226929 44 | -0.16964977979660034 -0.12204238772392273 45 | 0.12423384189605713 -0.1589944213628769 46 | 0.14868421852588654 -0.1803821623325348 
47 | 0.18442374467849731 -0.19259406626224518 48 | 0.21803560853004456 -0.19426123797893524 49 | 0.2399219572544098 -0.17988036572933197 50 | 0.2194332480430603 -0.16232435405254364 51 | 0.1926254779100418 -0.15410242974758148 52 | 0.15961241722106934 -0.1535251885652542 53 | -0.08823604136705399 0.1933051198720932 54 | 0.037760909646749496 0.13411930203437805 55 | -0.03670395910739899 0.15546481311321259 56 | -0.08000732213258743 0.18889179825782776 57 | -0.00799543410539627 0.17504075169563293 58 | 0.19321969151496887 0.16880981624126434 59 | 0.10224615037441254 0.12806382775306702 60 | 0.16135382652282715 0.13724902272224426 61 | 0.18520762026309967 0.1661205142736435 62 | 0.13593405485153198 0.16213418543338776 63 | 0.07198242098093033 0.13796880841255188 64 | 0.0702238604426384 0.17165715992450714 65 | 0.1410117745399475 0.1864575743675232 66 | 0.16403953731060028 0.22564265131950378 67 | 0.12338931858539581 0.25030457973480225 68 | -0.007767893373966217 0.1993907243013382 69 | -0.03333708643913269 0.24307218194007874 70 | 0.023107167333364487 0.2589821517467499 71 | 0.0735296830534935 0.20158612728118896 72 | 0.07525938749313354 0.25801753997802734 73 | -------------------------------------------------------------------------------- /test_case/source/source.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/test_case/source/source.jpg -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/headpose/150.txt: -------------------------------------------------------------------------------- 1 | 0.156604 2 | -0.230057 3 | 0.29011 4 | 0.220425 5 | -0.312521 6 | 0.00194483 7 | -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/headpose/54.txt: -------------------------------------------------------------------------------- 1 | 0.131516 2 | -0.205836 3 | 0.193343 4 | 0.228106 5 | -0.293041 6 | 0.00190257 7 | -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/img/150.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/img/150.jpg -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/img/54.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/img/54.jpg -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/landmark/150.txt: -------------------------------------------------------------------------------- 1 | -0.568425714969635 0.02589094638824463 2 | -0.5238426923751831 0.15984371304512024 3 | -0.4607754945755005 0.2932114899158478 4 | -0.4068068861961365 0.42605629563331604 5 | -0.3416854739189148 0.5617364645004272 6 | -0.254992812871933 0.6702439785003662 7 | -0.15150493383407593 0.7649673223495483 8 | -0.028679929673671722 0.8329890370368958 9 | 0.13000307977199554 
0.8273522853851318 10 | 0.46755319833755493 -0.2684306800365448 11 | 0.4962214231491089 -0.12998950481414795 12 | 0.5162476301193237 0.015611246228218079 13 | 0.5354956984519958 0.1588059961795807 14 | 0.532108724117279 0.31500229239463806 15 | 0.46318209171295166 0.4704422950744629 16 | 0.3857147693634033 0.6180211305618286 17 | 0.29314345121383667 0.7472214698791504 18 | -0.5722818970680237 -0.09347942471504211 19 | -0.5491307377815247 -0.13007354736328125 20 | -0.47550225257873535 -0.1633845567703247 21 | -0.41378623247146606 -0.1721431016921997 22 | -0.32607191801071167 -0.1741737723350525 23 | -0.1536448895931244 -0.22927811741828918 24 | -0.07168544083833694 -0.2739851772785187 25 | -0.0026913434267044067 -0.3017808794975281 26 | 0.10618767142295837 -0.3195626735687256 27 | 0.17112918198108673 -0.3071504831314087 28 | -0.1985154151916504 -0.07722553610801697 29 | -0.18322625756263733 0.01261092722415924 30 | -0.16710540652275085 0.11665801703929901 31 | -0.14298115670681 0.23547233641147614 32 | -0.19555987417697906 0.31517547369003296 33 | -0.14759938418865204 0.31499534845352173 34 | -0.09060841053724289 0.3149871230125427 35 | -0.03476393222808838 0.2837763726711273 36 | 0.018790408968925476 0.2559306025505066 37 | -0.48085927963256836 -0.0003206878900527954 38 | -0.4648837447166443 -0.031194746494293213 39 | -0.422918438911438 -0.050164878368377686 40 | -0.36573654413223267 -0.056621164083480835 41 | -0.31113848090171814 -0.033762574195861816 42 | -0.35314321517944336 -0.005850285291671753 43 | -0.39823031425476074 0.013135358691215515 44 | -0.4398861527442932 0.0161551833152771 45 | -0.03688753396272659 -0.11144664883613586 46 | -0.01981893926858902 -0.14834806323051453 47 | 0.02133864164352417 -0.17553764581680298 48 | 0.07874740660190582 -0.18968263268470764 49 | 0.14956389367580414 -0.1794232726097107 50 | 0.10819225013256073 -0.13965561985969543 51 | 0.06364291906356812 -0.11864683032035828 52 | 0.0136423259973526 -0.1098666787147522 53 | -0.18786075711250305 0.5219935774803162 54 | -0.10953058302402496 0.4380062222480774 55 | -0.17474597692489624 0.4775727689266205 56 | -0.17920799553394318 0.514659583568573 57 | -0.12799328565597534 0.4940014183521271 58 | 0.1604418009519577 0.4297633767127991 59 | -0.02031145989894867 0.41321608424186707 60 | 0.08209000527858734 0.4081108570098877 61 | 0.14822788536548615 0.4284103810787201 62 | 0.05698837339878082 0.4452410340309143 63 | -0.06325575709342957 0.4370177984237671 64 | -0.040912143886089325 0.4773541986942291 65 | 0.06541876494884491 0.4649183750152588 66 | 0.1210654228925705 0.5131053328514099 67 | 0.06025557219982147 0.5552003383636475 68 | -0.11841686069965363 0.5124952793121338 69 | -0.12994122505187988 0.5768686532974243 70 | -0.07111692428588867 0.5869581699371338 71 | -0.03216373175382614 0.49953222274780273 72 | -0.006496444344520569 0.5757865309715271 73 | -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/landmark/54.txt: -------------------------------------------------------------------------------- 1 | -0.5328267216682434 -0.0723910927772522 2 | -0.5018036365509033 0.06208936125040054 3 | -0.45313066244125366 0.1979004144668579 4 | -0.4132305383682251 0.3302344083786011 5 | -0.3626813292503357 0.4664897918701172 6 | -0.2858491539955139 0.5769708156585693 7 | -0.1896677017211914 0.6799073815345764 8 | -0.0726291611790657 0.7585027813911438 9 | 0.08300289511680603 0.7647232413291931 10 | 0.5111780166625977 -0.2648939788341522 11 
| 0.5261664390563965 -0.12748831510543823 12 | 0.531586229801178 0.016316726803779602 13 | 0.5370694994926453 0.1567668914794922 14 | 0.5192488431930542 0.30695825815200806 15 | 0.44011157751083374 0.4487931430339813 16 | 0.3516508936882019 0.5847730040550232 17 | 0.24884922802448273 0.7027948498725891 18 | -0.508565366268158 -0.19433322548866272 19 | -0.4805501699447632 -0.22859832644462585 20 | -0.4023595452308655 -0.2544301450252533 21 | -0.33930259943008423 -0.25589197874069214 22 | -0.25241878628730774 -0.2473609745502472 23 | -0.08189016580581665 -0.28887394070625305 24 | 0.00226447731256485 -0.3264678120613098 25 | 0.07184536755084991 -0.34871211647987366 26 | 0.17850489914417267 -0.3567420244216919 27 | 0.23930494487285614 -0.33791759610176086 28 | -0.14098984003067017 -0.14685529470443726 29 | -0.1338806003332138 -0.06055460870265961 30 | -0.1265304684638977 0.04188540577888489 31 | -0.1136949434876442 0.16041991114616394 32 | -0.18069304525852203 0.23102693259716034 33 | -0.13256478309631348 0.23521006107330322 34 | -0.07499934732913971 0.24042908847332 35 | -0.015736736357212067 0.21636074781417847 36 | 0.0397486686706543 0.19484290480613708 37 | -0.43493765592575073 -0.10025641322135925 38 | -0.41847842931747437 -0.13559266924858093 39 | -0.3767554759979248 -0.15263423323631287 40 | -0.31769585609436035 -0.15196728706359863 41 | -0.2587515413761139 -0.11556455492973328 42 | -0.31028175354003906 -0.08304101228713989 43 | -0.3602960705757141 -0.07066814601421356 44 | -0.40067577362060547 -0.0752880722284317 45 | 0.017254754900932312 -0.16644325852394104 46 | 0.03816370666027069 -0.2095049023628235 47 | 0.0817098468542099 -0.23707088828086853 48 | 0.13946031033992767 -0.24436751008033752 49 | 0.20904017984867096 -0.21885281801223755 50 | 0.17498789727687836 -0.1798402965068817 51 | 0.1305464804172516 -0.16024965047836304 52 | 0.07684198021888733 -0.15424272418022156 53 | -0.2012825310230255 0.4252326190471649 54 | -0.11096867173910141 0.34387582540512085 55 | -0.18156373500823975 0.3784526586532593 56 | -0.19107991456985474 0.41860517859458923 57 | -0.13880407810211182 0.39680421352386475 58 | 0.1645442694425583 0.360065758228302 59 | -0.018234863877296448 0.32719436287879944 60 | 0.08877328038215637 0.33039039373397827 61 | 0.15258203446865082 0.3573301434516907 62 | 0.05606567859649658 0.3614235520362854 63 | -0.06422337889671326 0.34662073850631714 64 | -0.04901714622974396 0.3844396471977234 65 | 0.06316450238227844 0.39063721895217896 66 | 0.11141422390937805 0.4393317699432373 67 | 0.044770583510398865 0.4749511480331421 68 | -0.12722159922122955 0.4254927337169647 69 | -0.14531119167804718 0.4844479560852051 70 | -0.0882822573184967 0.4978744089603424 71 | -0.039423272013664246 0.417609840631485 72 | -0.023823358118534088 0.4912562966346741 73 | -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/mask/150.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/mask/150.png -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/mask/54.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143/mask/54.png -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/headpose/82.txt: -------------------------------------------------------------------------------- 1 | 0.0632977 2 | -0.149218 3 | 0.0846878 4 | 0.245892 5 | -0.229037 6 | 0.00108858 7 | -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/headpose/89.txt: -------------------------------------------------------------------------------- 1 | -0.0103226 2 | 0.0244929 3 | 0.0211892 4 | 0.289309 5 | -0.216485 6 | 0.00109599 7 | -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/img/82.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/img/82.jpg -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/img/89.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/img/89.jpg -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/landmark/82.txt: -------------------------------------------------------------------------------- 1 | -0.303331583738327 -0.233467698097229 2 | -0.2927420735359192 -0.15596948564052582 3 | -0.27413156628608704 -0.07600411772727966 4 | -0.2593475580215454 0.0007681772112846375 5 | -0.23581907153129578 0.08137907832860947 6 | -0.19255167245864868 0.1481340229511261 7 | -0.13532105088233948 0.2038843035697937 8 | -0.06805747002363205 0.24654272198677063 9 | 0.019405052065849304 0.2546466886997223 10 | 0.3265819847583771 -0.2844737470149994 11 | 0.32591888308525085 -0.20606783032417297 12 | 0.3201008439064026 -0.12412900477647781 13 | 0.3144366443157196 -0.04574782773852348 14 | 0.29549098014831543 0.03825075179338455 15 | 0.2432161569595337 0.11261963099241257 16 | 0.18111205101013184 0.17802658677101135 17 | 0.1129712238907814 0.2316981852054596 18 | -0.2598797678947449 -0.31481245160102844 19 | -0.24004164338111877 -0.3338385820388794 20 | -0.19487902522087097 -0.3469902276992798 21 | -0.15954212844371796 -0.3452047109603882 22 | -0.10769272595643997 -0.3362177312374115 23 | 0.00960732251405716 -0.3468177020549774 24 | 0.061776503920555115 -0.3639221787452698 25 | 0.10131721943616867 -0.3716222643852234 26 | 0.15790408849716187 -0.36651837825775146 27 | 0.18842813372612 -0.3514263927936554 28 | -0.04102310910820961 -0.2670649290084839 29 | -0.039631668478250504 -0.2162145972251892 30 | -0.04007062315940857 -0.16453418135643005 31 | -0.03915104269981384 -0.10569751262664795 32 | -0.08614125847816467 -0.05491805821657181 33 | -0.05807076767086983 -0.052395183593034744 34 | -0.02641167864203453 -0.048068370670080185 35 | 0.006891876459121704 -0.05765453726053238 36 | 0.03809083253145218 -0.06505139172077179 37 | -0.2151382863521576 -0.23495230078697205 38 | 
-0.20378583669662476 -0.24206718802452087 39 | -0.1784401834011078 -0.24663186073303223 40 | -0.14726132154464722 -0.24776050448417664 41 | -0.11646721512079239 -0.2383313775062561 42 | -0.13856452703475952 -0.23524385690689087 43 | -0.16416260600090027 -0.22999557852745056 44 | -0.18897226452827454 -0.22930997610092163 45 | 0.04985576868057251 -0.251802921295166 46 | 0.06622543185949326 -0.26347416639328003 47 | 0.09400001913309097 -0.2688547968864441 48 | 0.12553390860557556 -0.27008354663848877 49 | 0.15456527471542358 -0.2649274170398712 50 | 0.1234506145119667 -0.2546200752258301 51 | 0.0968899354338646 -0.2511388063430786 52 | 0.07148075103759766 -0.2522624433040619 53 | -0.11389538645744324 0.06512745469808578 54 | -0.04827549308538437 0.018069162964820862 55 | -0.09267351776361465 0.03604593127965927 56 | -0.10865690559148788 0.06198366731405258 57 | -0.06910214573144913 0.05240192264318466 58 | 0.09590766578912735 0.046984054148197174 59 | 0.0015379749238491058 0.01383092999458313 60 | 0.0566081777215004 0.023342572152614594 61 | 0.08959364145994186 0.044777028262615204 62 | 0.038133010268211365 0.043131373822689056 63 | -0.02340156026184559 0.02199704200029373 64 | -0.018049370497465134 0.050268061459064484 65 | 0.040446922183036804 0.05117339640855789 66 | 0.06491007655858994 0.08550301939249039 67 | 0.02552839368581772 0.10056950896978378 68 | -0.07054953277111053 0.060900814831256866 69 | -0.0844009518623352 0.09845524281263351 70 | -0.048431023955345154 0.10711487382650375 71 | -0.017524361610412598 0.059963397681713104 72 | -0.011984145268797874 0.10500424355268478 73 | -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/landmark/89.txt: -------------------------------------------------------------------------------- 1 | -0.20474007725715637 -0.268522173166275 2 | -0.19697806239128113 -0.19082148373126984 3 | -0.18260139226913452 -0.10959465801715851 4 | -0.17002439498901367 -0.03243817389011383 5 | -0.14566102623939514 0.04790972173213959 6 | -0.09365737438201904 0.1127336323261261 7 | -0.030290022492408752 0.16876181960105896 8 | 0.04052555561065674 0.2137988805770874 9 | 0.13412050902843475 0.22540783882141113 10 | 0.43853750824928284 -0.28223657608032227 11 | 0.4348083436489105 -0.20429421961307526 12 | 0.42424193024635315 -0.12253983318805695 13 | 0.4158458411693573 -0.044976845383644104 14 | 0.39665839076042175 0.03624339401721954 15 | 0.3511892557144165 0.10304197669029236 16 | 0.29287177324295044 0.16170603036880493 17 | 0.22555701434612274 0.2097836136817932 18 | -0.10753002762794495 -0.36760014295578003 19 | -0.08167213201522827 -0.3875850737094879 20 | -0.02926197648048401 -0.4006659686565399 21 | 0.009437300264835358 -0.39810019731521606 22 | 0.06310754269361496 -0.3868800401687622 23 | 0.18353502452373505 -0.38926300406455994 24 | 0.2362411916255951 -0.402813196182251 25 | 0.2739555239677429 -0.40704137086868286 26 | 0.3252486288547516 -0.3962045907974243 27 | 0.35067373514175415 -0.37733206152915955 28 | 0.12474481761455536 -0.31153035163879395 29 | 0.12640230357646942 -0.261530339717865 30 | 0.1282726377248764 -0.2118726670742035 31 | 0.13027189671993256 -0.15463194251060486 32 | 0.06638786196708679 -0.1010010838508606 33 | 0.09698450565338135 -0.0977640151977539 34 | 0.1304248720407486 -0.09223908185958862 35 | 0.16320598125457764 -0.0992237776517868 36 | 0.1930646449327469 -0.10383424162864685 37 | -0.06575420498847961 -0.2854270339012146 38 | -0.049208953976631165 
-0.2951361835002899 39 | -0.020956620573997498 -0.2996317446231842 40 | 0.010934144258499146 -0.29836714267730713 41 | 0.03947927802801132 -0.2840616703033447 42 | 0.01575426757335663 -0.28100720047950745 43 | -0.010776922106742859 -0.27754607796669006 44 | -0.037083253264427185 -0.2784150242805481 45 | 0.2092171311378479 -0.2876892685890198 46 | 0.22892925143241882 -0.300739586353302 47 | 0.25859159231185913 -0.3056843876838684 48 | 0.28926992416381836 -0.3042370676994324 49 | 0.3130781650543213 -0.29351168870925903 50 | 0.28573623299598694 -0.2853110134601593 51 | 0.2598593533039093 -0.2833213210105896 52 | 0.23314571380615234 -0.2856521010398865 53 | 0.02525217831134796 0.02099992334842682 54 | 0.1070152223110199 -0.028147980570793152 55 | 0.05638381093740463 -0.010722845792770386 56 | 0.031153663992881775 0.01803739368915558 57 | 0.07802911847829819 0.007497057318687439 58 | 0.23863503336906433 0.015418171882629395 59 | 0.1577325016260147 -0.029432155191898346 60 | 0.20821356773376465 -0.014503806829452515 61 | 0.23263554275035858 0.012729361653327942 62 | 0.1870465874671936 0.0046945661306381226 63 | 0.13260778784751892 -0.02300858497619629 64 | 0.13275161385536194 0.0071703046560287476 65 | 0.18915385007858276 0.01493680477142334 66 | 0.20920619368553162 0.0515366792678833 67 | 0.17145150899887085 0.06345342099666595 68 | 0.07630160450935364 0.01794058084487915 69 | 0.05715292692184448 0.055464744567871094 70 | 0.09599503874778748 0.06549146771430969 71 | 0.13304740190505981 0.019217073917388916 72 | 0.13377049565315247 0.06540191173553467 73 | -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/mask/82.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/mask/82.png -------------------------------------------------------------------------------- /trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/mask/89.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/NetEase-GameAI/Face2FaceRHO/d14e6a24cd74df5e0b720dbab02cabcc96eca757/trainingset/VoxCeleb/id10009#AtavJVP4bCk#012568#012652/mask/89.png -------------------------------------------------------------------------------- /trainingset/VoxCeleb/list.txt: -------------------------------------------------------------------------------- 1 | id10001#7w0IBEWc9Qw#000993#001143 2 | id10009#AtavJVP4bCk#012568#012652 3 | --------------------------------------------------------------------------------
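To make the per-frame file layout above concrete, here is a minimal, hypothetical sketch of reading one pre-processed VoxCeleb frame back with the helpers defined in src/util/util.py (make_ids, read_target, load_coeffs, load_landmarks). The folder name is taken from the example data above; the 256-pixel frame size and the assumption that the repository root is on PYTHONPATH with src/util importable as a package are illustrative, not fixed by the repository.

```
# Minimal usage sketch (not part of the repository): load one pre-processed
# VoxCeleb frame with the helpers from src/util/util.py shown above.
# Assumptions: repo root on PYTHONPATH, src/util importable as a package,
# output size 256; the folder name matches the example training data.
from src.util.util import make_ids, read_target, load_coeffs, load_landmarks

video_dir = 'trainingset/VoxCeleb/id10001#7w0IBEWc9Qw#000993#001143'
frame_ids = make_ids(video_dir + '/img')                  # sorted integer ids, e.g. [54, 150]
fid = frame_ids[0]

img = read_target('%s/img/%d.jpg' % (video_dir, fid), 256)           # 3xHxW tensor scaled to [-1, 1]
headpose = load_coeffs('%s/headpose/%d.txt' % (video_dir, fid))      # pose coefficients, one float per line
landmarks = load_landmarks('%s/landmark/%d.txt' % (video_dir, fid))  # list of [x, y] landmark coordinates

print(img.shape, len(headpose), len(landmarks))
```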