├── .gitattributes ├── H36M_results_grid_updated_A.pdf ├── H36M_results_grid_updated_A.png ├── High_Level_Architecture.jpg ├── README.md ├── _config.yml ├── assets ├── Skeleton cameras.png └── Teaser.pdf ├── base ├── __init__.py ├── base_data_loader.py ├── base_model.py └── base_trainer.py ├── config_zoo └── default.json ├── data ├── AlphaPoseDataset.py ├── BVHDataset.py ├── CameraPoseVisualizer.py ├── __init__.py ├── angles_utils.py ├── cam_utils.py ├── cameras.h5 ├── data_loaders.py ├── data_utils.py ├── multi_view_h36_dataset.py ├── panutils.py └── prepare_cmu.py ├── environment.yml ├── evaluate_multiview.py ├── index.md ├── model ├── __init__.py ├── loss.py ├── metric.py ├── model.py ├── model_zoo.py ├── model_zoo_multi_view.py ├── tensor_pool.py └── transformer_model.py ├── requirements.txt ├── train.py ├── trainer ├── __init__.py └── multi_view_trainer.py ├── utils ├── Animation.py ├── AnimationPositions.py ├── AnimationStructure.py ├── BVH.py ├── InverseKinematics.py ├── Quaternions.py ├── __init__.py ├── angles_utils.py ├── h36m_utils.py ├── learnable_utils.py ├── logger.py ├── motion_utils.py ├── myAnimation.py ├── myBVH.py ├── quaternion.py ├── util.py └── visualization.py └── videos ├── .DS_Store ├── Human36M_S9_Posing_1.mov ├── Human36M_S9_Posing_1.mp4 ├── Human36M_S9_Sitting.mov ├── Human36M_S9_Sitting.mp4 ├── KTH_football.mov ├── KTH_football.mp4 ├── MotioNet_Comparison.mov ├── MotioNet_Comparison.mp4 ├── README.md ├── clip.mov └── clip.mp4 /.gitattributes: -------------------------------------------------------------------------------- 1 | *.mp4 filter=lfs diff=lfs merge=lfs -text 2 | -------------------------------------------------------------------------------- /H36M_results_grid_updated_A.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BrianG13/FLEX/79d6d0a91941b76075d504043fc80646ac1f1878/H36M_results_grid_updated_A.pdf -------------------------------------------------------------------------------- /H36M_results_grid_updated_A.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BrianG13/FLEX/79d6d0a91941b76075d504043fc80646ac1f1878/H36M_results_grid_updated_A.png -------------------------------------------------------------------------------- /High_Level_Architecture.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BrianG13/FLEX/79d6d0a91941b76075d504043fc80646ac1f1878/High_Level_Architecture.jpg -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # FLEX: Extrinsic Parameter-free Multi-view 3D Human Motion Reconstruction 2 | [![Youtube video](https://img.youtube.com/vi/2Vgs4nBHLa8/0.jpg)](https://www.youtube.com/watch?v=2Vgs4nBHLa8) 3 | 4 | ![alt text](https://github.com/BrianG13/FLEX/blob/main/High_Level_Architecture.jpg) 5 | 6 | 7 | 8 | This repository is the official implementation for the [paper](https://arxiv.org/abs/2105.01937) 9 | 10 | A short clip describing our work: [clip.mp4](https://drive.google.com/file/d/1HSILhK9NX2lGeNQ6mFpNQlYbbK2F8R1X/view?usp=sharing) 11 | 12 | - Video files showing our results on the Human3.6M dataset: [Human36M_S9_Posing_1.mp4](https://drive.google.com/file/d/19yuAHNPcNB574Num5LUcBDDDShDJdjtu/view?usp=sharing) & 
[Human36M_S9_Sitting.mp4](https://drive.google.com/file/d/1F0nZA257StpxzKVNNI84Y89SwCEAmUVK/view?usp=sharing) 13 | - Video file showing our results on the KTH multi-view Football II dataset: [KTH_football.mp4](https://drive.google.com/file/d/12o6MYtX53kZ7_pxy-ga26Bn0JwZyGJ4I/view?usp=sharing) 14 | - Video files comparing MotioNet (single-view) \& Iskakov et al. results versus ours: [MotioNet_comparison.mp4](https://drive.google.com/file/d/1BNmIJ_eb5LyP2WuIsG5C1wo0CDZWzoCk/view?usp=sharing) & [Iskakov_comparison.mp4](https://drive.google.com/file/d/1oyrvhq5245__lxcgKfPC8eiKdwZvORo4/view?usp=sharing) 15 | - Synthetic videos from our Blender studio are available to download [here](https://drive.google.com/drive/folders/1yvBYLr8GgRSlsCK25bUsQVbAEqZQYMG6?usp=sharing) 16 | 17 | ## Requirements 18 | 19 | - Linux 20 | - Python 3 21 | - NVIDIA GPU + CUDA CuDNN 22 | 23 | Run the following command to install the required packages: 24 | ```setup 25 | conda env create -f environment.yml -n 26 | ``` 27 | 28 | ## Data 29 | - Download the data .zip file from [here](https://drive.google.com/file/d/1hJoyuptbXe4-WcO7sWNUHkNO4iaZJzDh/view?usp=sharing) and unzip it inside the `FLEX/data` folder. 30 | - Download the pre-trained model checkpoint from [here](https://drive.google.com/file/d/1rJMh6SzzsjU4pAMq9bg4ssnUgyx1bF_Q/view?usp=sharing) and place it under the `FLEX/checkpoint` folder. 31 | 32 | ## Evaluation 33 | After you have downloaded the data & pre-trained checkpoint, you can evaluate our model by running: 34 | ``` 35 | python evaluate_multiview.py --resume=./4_views_mha64_gt.pth --device= 36 | ``` 37 | Notes: 38 | - If you are not on a GPU-supported machine, simply delete the `--device` flag and the evaluation will run on the CPU. 39 | - To save BVH files under the `FLEX/output` folder, add the `--save_bvh_files` argument. 40 | 41 | 42 | ## Training 43 | 44 | To train the model(s) in the paper, run one of the following commands: 45 | 46 | Using GT data: 47 | ```train 48 | python train.py --batch_size=32 --channel=1024 --n_views=4 --kernel_width=5 --padding=2 --kernel_size_stage_1=5,3,1 --kernel_size_stage_2=5,3,1 --data=gt --n_joints=20 --dilation=1,1,1 --stride=1,1,1 --kernel_size=5,3,1 --transformer_mode=mha --transformer_n_heads=64 --device= 49 | ``` 50 | 51 | Using Iskakov et al. 2D detected poses: 52 | ```train 53 | python train.py --batch_size=32 --channel=1024 --n_views=4 --kernel_width=5 --padding=2 --kernel_size_stage_1=5,3,1 --kernel_size_stage_2=5,3,1 --data=learnable --n_joints=20 --dilation=1,1,1 --stride=1,1,1 --kernel_size=5,3,1 --transformer_mode=mha --transformer_n_heads=64 --device= 54 | ``` 55 | 56 | ## Results 57 | The evaluation script prints its results to the terminal. 58 | Here is an example of our pre-trained model's output, using ground-truth 2D input: 59 | ``` 60 | +--------------+------------+---------------------+ 61 | | Action | MPJPE (mm) | Acc.
Error (mm/s^2) | 62 | +--------------+------------+---------------------+ 63 | | Directions | 18.04 | 0.54 | 64 | | Discussion | 22.03 | 0.73 | 65 | | Eating | 20.52 | 0.55 | 66 | | Greeting | 20.60 | 1.38 | 67 | | Phoning | 22.82 | 0.94 | 68 | | Photo | 31.77 | 0.68 | 69 | | Posing | 19.68 | 0.70 | 70 | | Purchases | 21.88 | 1.02 | 71 | | Sitting | 26.98 | 0.49 | 72 | | SittingDown | 28.65 | 0.81 | 73 | | Smoking | 24.05 | 0.93 | 74 | | Waiting | 21.06 | 0.58 | 75 | | WalkDog | 25.93 | 1.72 | 76 | | WalkTogether | 19.23 | 0.87 | 77 | | Walking | 18.92 | 1.09 | 78 | | | | | 79 | | Average | 22.89 | 0.87 | 80 | +--------------+------------+---------------------+ 81 | ``` 82 | ## Citation 83 | 84 | ``` 85 | @inproceedings{gordon2022flex, 86 | title={FLEX: Extrinsic Parameters-free Multi-view 3D Human Motion Reconstruction}, 87 | author={Gordon, Brian and Raab, Sigal and Azov, Guy and Giryes, Raja and Cohen-Or, Daniel}, 88 | booktitle={European Conference on Computer Vision (ECCV)}, 89 | pages={176--196}, 90 | year={2022}, 91 | organization={Springer} 92 | } 93 | ``` 94 | -------------------------------------------------------------------------------- /_config.yml: -------------------------------------------------------------------------------- 1 | theme: jekyll-theme-cayman -------------------------------------------------------------------------------- /assets/Skeleton cameras.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BrianG13/FLEX/79d6d0a91941b76075d504043fc80646ac1f1878/assets/Skeleton cameras.png -------------------------------------------------------------------------------- /assets/Teaser.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BrianG13/FLEX/79d6d0a91941b76075d504043fc80646ac1f1878/assets/Teaser.pdf -------------------------------------------------------------------------------- /base/__init__.py: -------------------------------------------------------------------------------- 1 | from .base_data_loader import * 2 | from .base_model import * 3 | from .base_trainer import * 4 | -------------------------------------------------------------------------------- /base/base_data_loader.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from torch.utils.data import DataLoader 3 | from torch.utils.data.dataloader import default_collate 4 | from torch.utils.data.sampler import SubsetRandomSampler 5 | 6 | 7 | class BaseDataLoader(DataLoader): 8 | """ 9 | Base class for all data loaders 10 | """ 11 | 12 | def __init__(self, dataset, batch_size, shuffle, num_workers, pin_memory, collate_fn=default_collate, 13 | drop_last=False): 14 | self.shuffle = shuffle 15 | 16 | self.batch_idx = 0 17 | self.n_samples = len(dataset) 18 | self.init_kwargs = { 19 | 'dataset': dataset, 20 | 'batch_size': batch_size, 21 | 'shuffle': self.shuffle, 22 | 'collate_fn': collate_fn, 23 | 'num_workers': num_workers, 24 | 'pin_memory': pin_memory, 25 | 'drop_last': drop_last 26 | } 27 | super(BaseDataLoader, self).__init__(**self.init_kwargs) 28 | 29 | def _split_sampler(self, split): 30 | if split == 0.0: 31 | return None, None 32 | 33 | idx_full = np.arange(self.n_samples) 34 | 35 | np.random.seed(0) 36 | np.random.shuffle(idx_full) 37 | 38 | len_valid = int(self.n_samples * split) 39 | 40 | valid_idx = idx_full[0:len_valid] 41 | train_idx = np.delete(idx_full, np.arange(0, len_valid)) 42 | 43 | 
train_sampler = SubsetRandomSampler(train_idx) 44 | valid_sampler = SubsetRandomSampler(valid_idx) 45 | 46 | # turn off shuffle option which is mutually exclusive with sampler 47 | self.shuffle = False 48 | self.n_samples = len(train_idx) 49 | 50 | return train_sampler, valid_sampler 51 | -------------------------------------------------------------------------------- /base/base_model.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import torch 3 | import torch.nn as nn 4 | import numpy as np 5 | 6 | logging.basicConfig(level = logging.INFO,format = '') 7 | 8 | class base_model(nn.Module): 9 | """ 10 | Base class for all models 11 | """ 12 | def __init__(self): 13 | super(base_model, self).__init__() 14 | 15 | self.logger = logging.getLogger(self.__class__.__name__) 16 | self.Tensor = torch.cuda.FloatTensor # if self.gpu_ids else torch.Tensor 17 | 18 | def forward(self, *input): 19 | """ 20 | Forward pass logic 21 | 22 | :return: Model output 23 | """ 24 | raise NotImplementedError 25 | 26 | def summary(self): 27 | """ 28 | Model summary 29 | """ 30 | model_parameters = filter(lambda p: p.requires_grad, self.parameters()) 31 | params = sum([np.prod(p.size()) for p in model_parameters]) 32 | self.logger.info('Trainable parameters: {}'.format(params)) 33 | self.logger.info(self) -------------------------------------------------------------------------------- /base/base_trainer.py: -------------------------------------------------------------------------------- 1 | import os 2 | import math 3 | import json 4 | import datetime 5 | import torch 6 | import numpy as np 7 | from utils.util import mkdir_dir 8 | from utils.visualization import WriterTensorboardX 9 | from utils.logger import Logger 10 | 11 | class base_trainer: 12 | def __init__(self, model, resume, config, logger_path): 13 | self.config = config 14 | self.checkpoint_dir = config.trainer.checkpoint_dir 15 | mkdir_dir(self.checkpoint_dir) 16 | self.train_logger = Logger(logger_path) 17 | print(f'config.device: {config.device}') 18 | self.device, device_ids = self._prepare_device(config.device) 19 | print(f'device_ids: {device_ids}') 20 | self.model = model.to(self.device) 21 | # if len(device_ids) > 1: 22 | # self.model = torch.nn.DataParallel(model, device_ids=device_ids) 23 | 24 | self.epochs = config.trainer.epochs 25 | self.save_freq = config.trainer.save_freq 26 | self.verbosity = config.trainer.verbosity 27 | 28 | self.monitor = config.trainer.monitor 29 | self.monitor_mode = config.trainer.monitor_mode 30 | assert self.monitor_mode in ['min', 'max', 'off'] 31 | self.monitor_best = math.inf if self.monitor_mode == 'min' else -math.inf 32 | self.start_epoch = 1 33 | 34 | self.writer = WriterTensorboardX(config.trainer.checkpoint_dir, self.train_logger, config.visualization.tensorboardX) 35 | if resume: 36 | self._resume_checkpoint(resume) 37 | self.model = model.to(self.device) 38 | 39 | # def _prepare_device(self, gpu_id): 40 | # """ 41 | # setup GPU device if available, move model into configured device 42 | # """ 43 | # n_gpu = torch.cuda.device_count() 44 | # print(f'n_gpu {n_gpu}') 45 | # if n_gpu == 0: 46 | # self.train_logger.warning("Warning: There\'s no GPU available on this machine, training will be performed on CPU.") 47 | # # if gpu_id > n_gpu: 48 | # # msg = "Warning: The number of GPU\'s configured to use is {}, but only {} are available on this machine.".format(gpu_id, n_gpu) 49 | # # self.train_logger.warning(msg) 50 | # # gpu_id = n_gpu 51 | # # if n_gpu 
> 1: 52 | # # device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 53 | # # list_ids = list([0,1]) 54 | 55 | # # else: 56 | # # device = torch.device('cuda:0' if gpu_id is not None else 'cpu') 57 | # # list_ids = list([0]) 58 | # device = torch.device('cuda:0' if gpu_id is not None else 'cpu') 59 | # list_ids = list([0]) 60 | # return device, list_ids 61 | def _prepare_device(self, gpu_id): 62 | """ 63 | setup GPU device if available, move model into configured device 64 | """ 65 | # gpu_id = int(gpu_id) 66 | n_gpu = torch.cuda.device_count() 67 | print(f'n_gpu: {n_gpu}') 68 | if n_gpu == 0: 69 | self.train_logger.warning("Warning: There\'s no GPU available on this machine, training will be performed on CPU.") 70 | # if gpu_id > n_gpu: 71 | msg = "Warning: The number of GPU\'s configured to use is {}, but only {} are available on this machine.".format(gpu_id, n_gpu) 72 | self.train_logger.warning(msg) 73 | gpu_id = n_gpu 74 | device = torch.device('cuda:0' if gpu_id is not None else 'cpu') 75 | list_ids = list([n_gpu]) 76 | return device, list_ids 77 | 78 | def _prepare_data(self, data, _from='numpy'): 79 | return torch.from_numpy(np.array(data)).float().to(self.device)if _from == 'numpy' else data.float().to(self.device) 80 | 81 | def train(self): 82 | """ 83 | Full training logic 84 | """ 85 | for epoch in range(self.start_epoch, self.epochs + 1): 86 | result = self._train_epoch(epoch) 87 | log = {'epoch': epoch} 88 | for key, value in result.items(): 89 | if key == 'metrics': 90 | log.update({key: value}) 91 | elif key == 'val_metrics': 92 | log.update({'val_' + key: value}) 93 | else: 94 | log[key] = value 95 | 96 | # print logged informations to the screen 97 | if self.train_logger is not None: 98 | if self.verbosity >= 1: 99 | for key, value in log.items(): 100 | self.train_logger.info(' {:15s}: {}'.format(str(key), value)) 101 | 102 | # evaluate model performance according to configured metric, save best checkpoint as model_best 103 | best = False 104 | if self.monitor_mode != 'off': 105 | try: 106 | if (self.monitor_mode == 'min' and log[self.monitor] < self.monitor_best) or (self.monitor_mode == 'max' and log[self.monitor] > self.monitor_best): 107 | self.monitor_best = log[self.monitor] 108 | best = True 109 | except KeyError: 110 | if epoch == 1: 111 | msg = "Warning: Can\'t recognize metric named '{}' ".format(self.monitor)\ 112 | + "for performance monitoring. model_best checkpoint won\'t be updated." 
113 | self.train_logger.warning(msg) 114 | if best and epoch>5: 115 | self._save_checkpoint('best', epoch) 116 | if epoch % self.save_freq == 0: 117 | self._save_checkpoint('epoch%s' % epoch, epoch) 118 | 119 | def _train_epoch(self, epoch): 120 | """ 121 | Training logic for an epoch 122 | 123 | :param epoch: Current epoch number 124 | """ 125 | raise NotImplementedError 126 | 127 | def _save_checkpoint(self, save_name, epoch): 128 | arch = type(self.model).__name__ 129 | state = { 130 | 'arch': arch, 131 | 'epoch': epoch, 132 | 'state_dict': self.model.state_dict(), 133 | 'monitor_best': self.monitor_best, 134 | 'config': self.config 135 | } 136 | save_path = self.checkpoint_dir 137 | filename = os.path.join(save_path, f'{save_name}.pth') 138 | torch.save(state, filename) 139 | self.train_logger.info("Saving checkpoint: {} ...".format(filename)) 140 | if hasattr(self, 'losses_dict'): 141 | with open(os.path.join(save_path, 'losses_%s.json' % save_name), 'w') as fp: 142 | json.dump(self.losses_dict, fp) 143 | if hasattr(self, 'val_log_dict'): 144 | with open(os.path.join(save_path, 'val_log_%s.json' % save_name), 'w') as fp: 145 | json.dump(self.val_log_dict, fp) 146 | 147 | def _resume_checkpoint(self, resume_path): 148 | """ 149 | Resume from saved checkpoints 150 | 151 | :param resume_path: Checkpoint path to be resumed 152 | """ 153 | self.train_logger.info("Loading checkpoint: {} ...".format(resume_path)) 154 | checkpoint = torch.load(resume_path) 155 | self.start_epoch = checkpoint['epoch'] + 1 156 | self.monitor_best = checkpoint['monitor_best'] 157 | 158 | # load architecture params from checkpoint. 159 | # if checkpoint.config.arch != self.config.arch: 160 | if checkpoint['arch'] != self.config.arch: 161 | 162 | self.train_logger.warning('Warning: Architecture configuration given in config file is different from that of checkpoint. 
' + \ 163 | 'This may yield an exception while state_dict is being loaded.') 164 | self.model.load_state_dict(checkpoint['state_dict']) 165 | 166 | if 'logger' in checkpoint: 167 | self.train_logger = checkpoint['logger'] 168 | 169 | self.train_logger.info("Checkpoint '{}' (epoch {}) loaded".format(resume_path, self.start_epoch)) 170 | -------------------------------------------------------------------------------- /config_zoo/default.json: -------------------------------------------------------------------------------- 1 | { 2 | "use_GT_bones": false, 3 | "use_GT_rot": false, 4 | "arch": { 5 | "type": "fk_multi_view_model", 6 | "kernel_size": [5,3,1], 7 | "stride": [1,1,1], 8 | "dilation": [1,1,1], 9 | "kernel_size_stage_1": [5,3,1], 10 | "kernel_size_stage_2": [5,3,1], 11 | "stride_stage_1": [1,1,1], 12 | "dilation_stage_1": [1,1,1], 13 | "channel": 512, 14 | "stage": 2, 15 | "n_type": 1, 16 | "rotation_type": "q", 17 | "translation": false, 18 | "confidence": false, 19 | "contact": false, 20 | "predict_joints_confidence": false, 21 | "predict_rotations_confidence": false, 22 | "branch_Q_double_input": true, 23 | "btw_stages_fusion": false, 24 | "late_fusion": false, 25 | "n_views": 4, 26 | "branch_S_multi_view": true 27 | }, 28 | "trainer": { 29 | "epochs": 200, 30 | "save_dir": null, 31 | "save_freq": 10, 32 | "verbosity": 2, 33 | "monitor": "val_metric", 34 | "monitor_mode": "min", 35 | "siamese": true, 36 | "dump_processed_data": false, 37 | "upload_processed_data": false, 38 | "use_openpose_2d": false, 39 | "use_cmu_data": false, 40 | "merge_double_view_poses": false, 41 | "new_merge_component": true, 42 | "use_loss_angles": false, 43 | "multi_view_mode": false, 44 | "optimizer": "adam" 45 | }, 46 | "visualization": { 47 | "tensorboardX": true 48 | } 49 | } 50 | 51 | -------------------------------------------------------------------------------- /data/AlphaPoseDataset.py: -------------------------------------------------------------------------------- 1 | import json 2 | # -*- coding:utf-8 -*- 3 | import numpy as np 4 | import os 5 | from torch.utils.data import Dataset 6 | from utils import h36m_utils, util 7 | from pathlib import Path 8 | 9 | 10 | class AlphaPoseDataset(Dataset): 11 | def __init__(self, config, is_train=False, num_of_views=3): 12 | print('AlphaPoseDataset') 13 | 14 | poses_3d, poses_2d = [], [] 15 | 16 | self.frame_numbers = [] 17 | self.video_and_frame = [] 18 | self.video_name = [] 19 | self.config = config 20 | self.is_train = is_train 21 | 22 | self.n_camera_views = 4 23 | 24 | for ACTION_NAME in ['Fight_Scene']: 25 | # for cameras_idx in itertools.permutations([10,11,12,13],4): 26 | current_dir = os.path.abspath(os.getcwd()) 27 | path = Path(__file__).parent.absolute() 28 | 29 | with open(os.path.join(path, f"alphapose/{ACTION_NAME}_alphapose.json"), "r") as read_file: 30 | action_json = json.load(read_file) 31 | 32 | for subject in action_json: 33 | subject_data = action_json[subject] 34 | for cameras_idx in [[18, 19, 20, 21], [18, 19, 21, 22], [19, 20, 22, 23]]: 35 | # for cameras_idx in [[12, 11, 10, 13]]: 36 | set_views_poses_2d = [] 37 | min_length = float('inf') 38 | for cam_idx in cameras_idx: 39 | subject_cam_data = subject_data[str(cam_idx)] 40 | set_2d = self.subject_cam_data_to_pose2d(subject_cam_data) 41 | set_2d = set_2d.reshape((-1, 20 * 2)) 42 | if set_2d.shape[0] < min_length: 43 | min_length = set_2d.shape[0] 44 | set_2d = self.center_and_scale(set_2d, 800, 800) 45 | set_views_poses_2d.append(set_2d) 46 | 47 | 
self.frame_numbers.append(min_length) 48 | string_idxs = [str(n) for n in cameras_idx] 49 | self.video_name.append(f"{ACTION_NAME}_S{subject}_" + "_".join(string_idxs)) 50 | for i in range(len(set_views_poses_2d)): 51 | set_views_poses_2d[i] = set_views_poses_2d[i][:min_length, :] 52 | poses_2d.append(np.stack(set_views_poses_2d, axis=0)) 53 | self.poses_2d = np.concatenate(poses_2d, axis=1) 54 | self.set_sequences() 55 | 56 | self.poses_2d = self.poses_2d.reshape((-1, self.poses_2d.shape[-1])) 57 | 58 | print('Normalizing Poses 2D ..') 59 | self.poses_2d, self.poses_2d_mean, self.poses_2d_std = util.normalize_data(self.poses_2d) 60 | self.poses_2d = self.poses_2d.reshape((self.n_camera_views, -1, self.poses_2d.shape[-1])) 61 | self.translate_to_lists() 62 | 63 | def subject_cam_data_to_pose2d(self, subject_cam_data): 64 | n_frames = len(subject_cam_data) 65 | all_keypoints = [] 66 | for frame_data in subject_cam_data: 67 | all_keypoints.append(frame_data['keypoints']) 68 | all_keypoints = np.asarray(all_keypoints).reshape((n_frames, 20, 3)) 69 | 70 | all_keypoints = all_keypoints[:, :, :2] # Remove confidence 71 | all_keypoints[:, 8, :] = 0.5 * (all_keypoints[:, 8, :2] + all_keypoints[:, 9, :2]) 72 | return all_keypoints 73 | 74 | def translate_to_lists(self): 75 | # self.poses_3d_list = [] 76 | self.poses_2d_list = [] 77 | 78 | for i in range(self.n_camera_views): 79 | # self.poses_3d_list.append(self.poses_3d[i]) 80 | self.poses_2d_list.append(self.poses_2d[i]) 81 | 82 | def __getitem__(self, index): 83 | items_index = self.sequence_index[index] 84 | # Start of experiment" 85 | views_data = [] 86 | for i in range(self.n_camera_views): 87 | # poses_3d = self.poses_3d_list[i][items_index] 88 | poses_2d = self.poses_2d_list[i][items_index] 89 | 90 | # views_data.append([poses_2d, poses_3d]) 91 | views_data.append([poses_2d]) 92 | 93 | return views_data, self.video_name[index] 94 | 95 | def __len__(self): 96 | return self.sequence_index.shape[0] 97 | 98 | def get_video_and_frame_details(self, index): 99 | items_index = self.sequence_index[index] 100 | return self.video_and_frame[items_index] 101 | 102 | def center_and_scale(self, set_2d, res_height=640, res_width=480): 103 | set_2d_root = set_2d - np.tile(set_2d[:, :2], [1, int(set_2d.shape[-1] / 2)]) 104 | set_2d_root[:, list(range(0, set_2d.shape[-1], 2))] /= res_height 105 | set_2d_root[:, list(range(1, set_2d.shape[-1], 2))] /= res_width 106 | return set_2d_root 107 | 108 | def set_sequences(self): 109 | def slice_set(offset, frame_number, frame_numbers): 110 | sequence_index = [] 111 | start_index = 0 112 | for frames in frame_numbers: 113 | if frames > train_frames: 114 | if self.is_train: 115 | clips_number = int((frames - train_frames) // offset) 116 | for i in range(clips_number): 117 | start = int(i * offset + start_index) 118 | end = int(i * offset + train_frames + start_index) 119 | sequence_index.append(list(range(start, end))) 120 | sequence_index.append(list(range(start_index + frames - train_frames, start_index + frames))) 121 | else: 122 | sequence_index.append(list(range(start_index, start_index + frames))) 123 | start_index += frames 124 | return sequence_index 125 | 126 | offset = 10 127 | train_frames = 0 128 | self.sequence_index = np.array(slice_set(offset, train_frames, self.frame_numbers)) 129 | 130 | def get_parameters(self): 131 | bones_mean = np.asarray( 132 | [[0.516709056080739, 1.8597128599473656, 1.8486176161353194, 1.0154634542775278, 0.9944090324875341, 133 | 0.46335894047448684, 0.45708159715181057, 
0.7002218481026042, 1.155838079375657, 134 | 0.9885875159669523]]) 135 | bones_std = np.asarray([0.03250722274696495, 0.0029220969699794838, 0.0006156061152456668, 0.005663153869630782, 136 | 0.008404610836805063, 0.014311367462960448, 0.004236793633603145, 0.026840327186035892, 137 | 0.01512937262086334, 0.007775925244763229]) 138 | 139 | return self.poses_2d_mean, self.poses_2d_std, bones_mean, bones_std 140 | 141 | 142 | # 143 | if __name__ == '__main__': 144 | dataset = AlphaPoseDataset(config={'run_local': True}, num_of_views=3) 145 | -------------------------------------------------------------------------------- /data/CameraPoseVisualizer.py: -------------------------------------------------------------------------------- 1 | 2 | from mpl_toolkits.mplot3d import Axes3D # <--- This is important for 3d plotting 3 | import matplotlib.pyplot as plt 4 | import matplotlib.gridspec as gridspec 5 | import matplotlib as mpl 6 | from matplotlib.patches import Patch 7 | from mpl_toolkits.mplot3d.art3d import Poly3DCollection 8 | # import cv2 9 | import numpy as np 10 | from cam_utils import from_R_T_to_extrinsic, quatWAvgMarkley 11 | 12 | class CameraPoseVisualizer: 13 | def __init__(self, axis=None): 14 | self.fig = plt.figure(figsize=(18, 7)) 15 | if axis is None: 16 | self.ax = self.fig.gca(projection='3d') 17 | else: 18 | self.ax = axis 19 | 20 | self.ax.set_aspect("auto") 21 | # self.ax.set_xlim([-5, 5]) 22 | # self.ax.set_ylim([-5, 5]) 23 | # self.ax.set_zlim([-5, 5]) 24 | self.ax.set_xlim([-5000, 5000]) 25 | self.ax.set_ylim([-5000, 5000]) 26 | self.ax.set_zlim([-5000, 5000]) 27 | self.ax.set_xlabel('x') 28 | self.ax.set_ylabel('y') 29 | self.ax.set_zlabel('z') 30 | print('initialize camera pose visualizer') 31 | 32 | def extrinsic2pyramid(self, extrinsic, color='r', focal_len_scaled=2, aspect_ratio=0.3,name=None): 33 | vertex_std = np.array([[0, 0, 0, 1], 34 | [focal_len_scaled * aspect_ratio, -focal_len_scaled * aspect_ratio, focal_len_scaled, 1], 35 | [focal_len_scaled * aspect_ratio, focal_len_scaled * aspect_ratio, focal_len_scaled, 1], 36 | [-focal_len_scaled * aspect_ratio, focal_len_scaled * aspect_ratio, focal_len_scaled, 1], 37 | [-focal_len_scaled * aspect_ratio, -focal_len_scaled * aspect_ratio, focal_len_scaled, 1]]) 38 | vertex_transformed = vertex_std @ extrinsic.T 39 | meshes = [[vertex_transformed[0, :-1], vertex_transformed[1][:-1], vertex_transformed[2, :-1]], 40 | [vertex_transformed[0, :-1], vertex_transformed[2, :-1], vertex_transformed[3, :-1]], 41 | [vertex_transformed[0, :-1], vertex_transformed[3, :-1], vertex_transformed[4, :-1]], 42 | [vertex_transformed[0, :-1], vertex_transformed[4, :-1], vertex_transformed[1, :-1]], 43 | [vertex_transformed[1, :-1], vertex_transformed[2, :-1], vertex_transformed[3, :-1], vertex_transformed[4, :-1]]] 44 | # for i in range(len(meshes)): 45 | # for j in range(len(meshes[i])): 46 | # meshes[i][j] = meshes[i][j][[0,2,1]] 47 | self.ax.add_collection3d( 48 | Poly3DCollection(meshes, facecolors=color, linewidths=0.3, edgecolors=color, alpha=0.35)) 49 | if name: 50 | T = extrinsic[:3, 3] 51 | self.ax.text(T[0], T[1], T[2], name) 52 | 53 | def customize_legend(self, list_label): 54 | list_handle = [] 55 | for idx, label in enumerate(list_label): 56 | color = plt.cm.rainbow(idx / len(list_label)) 57 | patch = Patch(color=color, label=label) 58 | list_handle.append(patch) 59 | plt.legend(loc='right', bbox_to_anchor=(1.8, 0.5), handles=list_handle) 60 | 61 | def colorbar(self, max_frame_length): 62 | cmap = mpl.cm.rainbow 63 | norm = 
mpl.colors.Normalize(vmin=0, vmax=max_frame_length) 64 | self.fig.colorbar(mpl.cm.ScalarMappable(norm=norm, cmap=cmap), orientation='vertical', label='Frame Number') 65 | 66 | def show(self): 67 | plt.title('Extrinsic Parameters') 68 | plt.show() 69 | 70 | # def from_R_T_to_extrinsic(R_mat,T_mat): 71 | # temp = np.concatenate([R_mat.T, T_mat], axis=1) 72 | # temp = np.concatenate([temp, np.array([[0, 0, 0, 1]])], axis=0) 73 | # return temp 74 | 75 | 76 | if __name__ == '__main__': 77 | # import matplotlib 78 | # matplotlib.rcParams["backend"] = "TkAgg" 79 | gs1 = gridspec.GridSpec(1, 1) 80 | ax = plt.subplot(gs1[0, 0], projection='3d') 81 | visualizer = CameraPoseVisualizer(ax) 82 | R_list = [ 83 | np.asarray([[-0.90334862, 0.42691198, 0.04132109], 84 | [0.04153061, 0.18295114, -0.98224441], 85 | [-0.42689165, -0.88559305, -0.18299858]]), 86 | np.asarray([[0.93157205, 0.36348288, -0.00732918], 87 | [0.06810069, -0.19426748, -0.97858185], 88 | [-0.35712157, 0.91112038, -0.20572759]]), 89 | np.asarray([[-0.92693442, -0.37323035, -0.03862235], 90 | [-0.04725991, 0.21824049, -0.97475001], 91 | [0.37223525, -0.90170405, -0.21993346]]), 92 | np.asarray([[0.91546071, -0.39734607, 0.0636223], 93 | [-0.04940628, -0.26789168, -0.96218141], 94 | [0.39936288, 0.87769594, -0.2648757]]) 95 | ] 96 | T_list = [ 97 | np.asarray([[2044.45852504], 98 | [4935.11727985], 99 | [1481.22752753]]), 100 | np.asarray([[1990.95966215], 101 | [-5123.81055156], 102 | [1568.80481574]]), 103 | np.asarray([[-1670.99215489], 104 | [5211.98574196], 105 | [1528.38799772]]), 106 | np.asarray([[-1696.04347097], 107 | [-3827.09988629], 108 | [1591.41272728]]) 109 | ] 110 | for cam_idx in range(len(R_list)): 111 | cam_extrinsics = from_R_T_to_extrinsic(R_list[cam_idx], T_list[cam_idx]) 112 | visualizer.extrinsic2pyramid(cam_extrinsics, 'r', 1000, name=str(cam_idx)) 113 | 114 | # from scipy.spatial.transform import Rotation as R 115 | # r = R.from_matrix(R_list[2].T) 116 | # r2 = R.from_matrix(R_list[3].T) 117 | # for alpha in range(1, 10, 1): 118 | # alpha /= 10 119 | # # r_avg = 0.5*(r.as_rotvec()+r2.as_rotvec()) 120 | # # r_avg = R.from_rotvec(r_avg).as_matrix() 121 | # t_avg = alpha * T_list[2] + (1 - alpha) * T_list[3] 122 | # # cam_extrinsics = from_R_T_to_extrinsic(r_avg.T, t_avg) 123 | # 124 | # all_q = np.stack([r.as_quat(), r2.as_quat()]) 125 | # avg_q = quatWAvgMarkley(all_q, [alpha, 1 - alpha]) 126 | # avg_r = R.from_quat(avg_q).as_matrix() 127 | # cam_extrinsics = from_R_T_to_extrinsic(avg_r.T, t_avg) 128 | # visualizer.extrinsic2pyramid(cam_extrinsics, 'c', 1000, name=str(alpha)) 129 | 130 | # ax.view_init(azim=-20, elev=20) 131 | plt.show() 132 | -------------------------------------------------------------------------------- /data/__init__.py: -------------------------------------------------------------------------------- 1 | from .data_loaders import * -------------------------------------------------------------------------------- /data/angles_utils.py: -------------------------------------------------------------------------------- 1 | 2 | """Functions that help with data processing for human3.6m""" 3 | 4 | from __future__ import absolute_import 5 | from __future__ import division 6 | from __future__ import print_function 7 | 8 | import numpy as np 9 | from six.moves import xrange # pylint: disable=redefined-builtin 10 | import copy 11 | 12 | def rotmat2euler( R ): 13 | """ 14 | Converts a rotation matrix to Euler angles 15 | Matlab port to python for evaluation purposes 16 | 
https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/RotMat2Euler.m#L1 17 | 18 | Args 19 | R: a 3x3 rotation matrix 20 | Returns 21 | eul: a 3x1 Euler angle representation of R 22 | """ 23 | if R[0,2] == 1 or R[0,2] == -1: 24 | # special case 25 | E3 = 0 # set arbitrarily 26 | dlta = np.arctan2( R[0,1], R[0,2] ); 27 | 28 | if R[0,2] == -1: 29 | E2 = np.pi/2; 30 | E1 = E3 + dlta; 31 | else: 32 | E2 = -np.pi/2; 33 | E1 = -E3 + dlta; 34 | 35 | else: 36 | E2 = -np.arcsin( R[0,2] ) 37 | E1 = np.arctan2( R[1,2]/np.cos(E2), R[2,2]/np.cos(E2) ) 38 | E3 = np.arctan2( R[0,1]/np.cos(E2), R[0,0]/np.cos(E2) ) 39 | 40 | eul = np.array([E1, E2, E3]); 41 | return eul 42 | 43 | 44 | def quat2expmap(q): 45 | """ 46 | Converts a quaternion to an exponential map 47 | Matlab port to python for evaluation purposes 48 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/quat2expmap.m#L1 49 | 50 | Args 51 | q: 1x4 quaternion 52 | Returns 53 | r: 1x3 exponential map 54 | Raises 55 | ValueError if the l2 norm of the quaternion is not close to 1 56 | """ 57 | if (np.abs(np.linalg.norm(q)-1)>1e-3): 58 | raise(ValueError, "quat2expmap: input quaternion is not norm 1") 59 | 60 | sinhalftheta = np.linalg.norm(q[1:]) 61 | coshalftheta = q[0] 62 | 63 | r0 = np.divide( q[1:], (np.linalg.norm(q[1:]) + np.finfo(np.float32).eps)); 64 | theta = 2 * np.arctan2( sinhalftheta, coshalftheta ) 65 | theta = np.mod( theta + 2*np.pi, 2*np.pi ) 66 | 67 | if theta > np.pi: 68 | theta = 2 * np.pi - theta 69 | r0 = -r0 70 | 71 | r = r0 * theta 72 | return r 73 | 74 | def rotmat2quat(R): 75 | """ 76 | Converts a rotation matrix to a quaternion 77 | Matlab port to python for evaluation purposes 78 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/rotmat2quat.m#L4 79 | 80 | Args 81 | R: 3x3 rotation matrix 82 | Returns 83 | q: 1x4 quaternion 84 | """ 85 | rotdiff = R - R.T; 86 | 87 | r = np.zeros(3) 88 | r[0] = -rotdiff[1,2] 89 | r[1] = rotdiff[0,2] 90 | r[2] = -rotdiff[0,1] 91 | sintheta = np.linalg.norm(r) / 2; 92 | r0 = np.divide(r, np.linalg.norm(r) + np.finfo(np.float32).eps ); 93 | 94 | costheta = (np.trace(R)-1) / 2; 95 | 96 | theta = np.arctan2( sintheta, costheta ); 97 | 98 | q = np.zeros(4) 99 | q[0] = np.cos(theta/2) 100 | q[1:] = r0*np.sin(theta/2) 101 | return q 102 | 103 | def rotmat2expmap(R): 104 | return quat2expmap( rotmat2quat(R) ); 105 | 106 | def expmap2rotmat(r): 107 | """ 108 | Converts an exponential map angle to a rotation matrix 109 | Matlab port to python for evaluation purposes 110 | I believe this is also called Rodrigues' formula 111 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/expmap2rotmat.m 112 | 113 | Args 114 | r: 1x3 exponential map 115 | Returns 116 | R: 3x3 rotation matrix 117 | """ 118 | theta = np.linalg.norm( r ) 119 | r0 = np.divide( r, theta + np.finfo(np.float32).eps ) 120 | r0x = np.array([0, -r0[2], r0[1], 0, 0, -r0[0], 0, 0, 0]).reshape(3,3) 121 | r0x = r0x - r0x.T 122 | R = np.eye(3,3) + np.sin(theta)*r0x + (1-np.cos(theta))*(r0x).dot(r0x); 123 | return R 124 | 125 | 126 | def unNormalizeData(normalizedData, data_mean, data_std, dimensions_to_ignore, actions, one_hot ): 127 | """Borrowed from SRNN code. Reads a csv file and returns a float32 matrix. 
128 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/generateMotionData.py#L12 129 | 130 | Args 131 | normalizedData: nxd matrix with normalized data 132 | data_mean: vector of mean used to normalize the data 133 | data_std: vector of standard deviation used to normalize the data 134 | dimensions_to_ignore: vector with dimensions not used by the model 135 | actions: list of strings with the encoded actions 136 | one_hot: whether the data comes with one-hot encoding 137 | Returns 138 | origData: data originally used to 139 | """ 140 | T = normalizedData.shape[0] 141 | D = data_mean.shape[0] 142 | 143 | origData = np.zeros((T, D), dtype=np.float32) 144 | dimensions_to_use = [] 145 | for i in range(D): 146 | if i in dimensions_to_ignore: 147 | continue 148 | dimensions_to_use.append(i) 149 | dimensions_to_use = np.array(dimensions_to_use) 150 | 151 | if one_hot: 152 | origData[:, dimensions_to_use] = normalizedData[:, :-len(actions)] 153 | else: 154 | origData[:, dimensions_to_use] = normalizedData 155 | 156 | # potentially ineficient, but only done once per experiment 157 | stdMat = data_std.reshape((1, D)) 158 | stdMat = np.repeat(stdMat, T, axis=0) 159 | meanMat = data_mean.reshape((1, D)) 160 | meanMat = np.repeat(meanMat, T, axis=0) 161 | origData = np.multiply(origData, stdMat) + meanMat 162 | return origData 163 | 164 | 165 | def revert_output_format(poses, data_mean, data_std, dim_to_ignore, actions, one_hot): 166 | """ 167 | Converts the output of the neural network to a format that is more easy to 168 | manipulate for, e.g. conversion to other format or visualization 169 | 170 | Args 171 | poses: The output from the TF model. A list with (seq_length) entries, 172 | each with a (batch_size, dim) output 173 | Returns 174 | poses_out: A tensor of size (batch_size, seq_length, dim) output. Each 175 | batch is an n-by-d sequence of poses. 176 | """ 177 | seq_len = len(poses) 178 | if seq_len == 0: 179 | return [] 180 | 181 | batch_size, dim = poses[0].shape 182 | 183 | poses_out = np.concatenate(poses) 184 | poses_out = np.reshape(poses_out, (seq_len, batch_size, dim)) 185 | poses_out = np.transpose(poses_out, [1, 0, 2]) 186 | 187 | poses_out_list = [] 188 | for i in xrange(poses_out.shape[0]): 189 | poses_out_list.append( 190 | unNormalizeData(poses_out[i, :, :], data_mean, data_std, dim_to_ignore, actions, one_hot)) 191 | 192 | return poses_out_list 193 | 194 | 195 | def readCSVasFloat(filename): 196 | """ 197 | Borrowed from SRNN code. Reads a csv and returns a float matrix. 198 | https://github.com/asheshjain399/NeuralModels/blob/master/neuralmodels/utils.py#L34 199 | 200 | Args 201 | filename: string. Path to the csv file 202 | Returns 203 | returnArray: the read data in a float32 matrix 204 | """ 205 | returnArray = [] 206 | lines = open(filename).readlines() 207 | for line in lines: 208 | line = line.strip().split(',') 209 | if len(line) > 0: 210 | returnArray.append(np.array([np.float32(x) for x in line])) 211 | 212 | returnArray = np.array(returnArray) 213 | return returnArray 214 | 215 | 216 | def load_data(path_to_dataset, subjects, actions, one_hot): 217 | """ 218 | Borrowed from SRNN code. This is how the SRNN code reads the provided .txt files 219 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/processdata.py#L270 220 | 221 | Args 222 | path_to_dataset: string. directory where the data resides 223 | subjects: list of numbers. The subjects to load 224 | actions: list of string. 
The actions to load 225 | one_hot: Whether to add a one-hot encoding to the data 226 | Returns 227 | trainData: dictionary with k:v 228 | k=(subject, action, subaction, 'even'), v=(nxd) un-normalized data 229 | completeData: nxd matrix with all the data. Used to normlization stats 230 | """ 231 | nactions = len( actions ) 232 | 233 | trainData = {} 234 | completeData = [] 235 | for subj in subjects: 236 | for action_idx in np.arange(len(actions)): 237 | 238 | action = actions[ action_idx ] 239 | 240 | for subact in [1, 2]: # subactions 241 | 242 | print("Reading subject {0}, action {1}, subaction {2}".format(subj, action, subact)) 243 | 244 | filename = '{0}/S{1}/{2}_{3}.txt'.format( path_to_dataset, subj, action, subact) 245 | action_sequence = readCSVasFloat(filename) 246 | 247 | n, d = action_sequence.shape 248 | even_list = range(0, n, 2) 249 | 250 | if one_hot: 251 | # Add a one-hot encoding at the end of the representation 252 | the_sequence = np.zeros( (len(even_list), d + nactions), dtype=float ) 253 | the_sequence[ :, 0:d ] = action_sequence[even_list, :] 254 | the_sequence[ :, d+action_idx ] = 1 255 | trainData[(subj, action, subact, 'even')] = the_sequence 256 | else: 257 | trainData[(subj, action, subact, 'even')] = action_sequence[even_list, :] 258 | 259 | 260 | if len(completeData) == 0: 261 | completeData = copy.deepcopy(action_sequence) 262 | else: 263 | completeData = np.append(completeData, action_sequence, axis=0) 264 | 265 | return trainData, completeData 266 | 267 | 268 | def normalize_data( data, data_mean, data_std, dim_to_use, actions, one_hot ): 269 | """ 270 | Normalize input data by removing unused dimensions, subtracting the mean and 271 | dividing by the standard deviation 272 | 273 | Args 274 | data: nx99 matrix with data to normalize 275 | data_mean: vector of mean used to normalize the data 276 | data_std: vector of standard deviation used to normalize the data 277 | dim_to_use: vector with dimensions used by the model 278 | actions: list of strings with the encoded actions 279 | one_hot: whether the data comes with one-hot encoding 280 | Returns 281 | data_out: the passed data matrix, but normalized 282 | """ 283 | data_out = {} 284 | nactions = len(actions) 285 | 286 | if not one_hot: 287 | # No one-hot encoding... no need to do anything special 288 | for key in data.keys(): 289 | data_out[ key ] = np.divide( (data[key] - data_mean), data_std ) 290 | data_out[ key ] = data_out[ key ][ :, dim_to_use ] 291 | 292 | else: 293 | # TODO hard-coding 99 dimensions for un-normalized human poses 294 | for key in data.keys(): 295 | data_out[ key ] = np.divide( (data[key][:, 0:99] - data_mean), data_std ) 296 | data_out[ key ] = data_out[ key ][ :, dim_to_use ] 297 | data_out[ key ] = np.hstack( (data_out[key], data[key][:,-nactions:]) ) 298 | 299 | return data_out 300 | 301 | 302 | def normalization_stats(completeData): 303 | """" 304 | Also borrowed for SRNN code. Computes mean, stdev and dimensions to ignore. 
305 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/processdata.py#L33 306 | 307 | Args 308 | completeData: nx99 matrix with data to normalize 309 | Returns 310 | data_mean: vector of mean used to normalize the data 311 | data_std: vector of standard deviation used to normalize the data 312 | dimensions_to_ignore: vector with dimensions not used by the model 313 | dimensions_to_use: vector with dimensions used by the model 314 | """ 315 | data_mean = np.mean(completeData, axis=0) 316 | data_std = np.std(completeData, axis=0) 317 | 318 | dimensions_to_ignore = [] 319 | dimensions_to_use = [] 320 | 321 | dimensions_to_ignore.extend( list(np.where(data_std < 1e-4)[0]) ) 322 | dimensions_to_use.extend( list(np.where(data_std >= 1e-4)[0]) ) 323 | 324 | data_std[dimensions_to_ignore] = 1.0 325 | 326 | return data_mean, data_std, dimensions_to_ignore, dimensions_to_use 327 | -------------------------------------------------------------------------------- /data/cam_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from utils import h36m_utils, util 3 | import matplotlib.pyplot as plt 4 | import matplotlib.gridspec as gridspec 5 | from scipy.spatial.transform import Rotation as R 6 | 7 | 8 | def plot2DPose(poses_2d, img_path=None, save_path=None, show=False): 9 | from matplotlib import image 10 | lcolor = "#3498db" 11 | rcolor = "#e74c3c" 12 | vals = poses_2d.reshape((-1, 2)) 13 | if vals.shape[0] == 14: 14 | I = np.array([6, 5, 4, 3, 2, 12, 11, 10, 9, 8, 4, 3, 13]) - 1 15 | J = np.array([5, 4, 3, 2, 1, 11, 10, 9, 8, 7, 10, 9, 14]) - 1 16 | else: 17 | I = np.array([0, 1, 2, 0, 4, 5, 0, 8, 9, 10, 9, 13, 14, 9, 17, 18]) 18 | J = np.array([1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 13, 14, 15, 17, 18, 19]) 19 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 20 | 21 | fig = plt.figure(figsize=(48, 48)) 22 | gs1 = gridspec.GridSpec(1, 1) 23 | # plt = plt.subplot(gs1[0, 0]) 24 | # Make connection matrix 25 | 26 | ax = plt.gca() 27 | 28 | for i in np.arange(len(I)): 29 | x, y = [np.array([vals[I[i], j], vals[J[i], j]]) for j in range(2)] 30 | if np.mean(x) != 0 and np.mean(y) != 0: 31 | ax.plot(x, y, lw=10, c=lcolor if LR[i] else rcolor) 32 | 33 | # Get rid of the ticks 34 | ax.set_xticks([]) 35 | ax.set_yticks([]) 36 | 37 | RADIUS = 400 # space around the subject 38 | xroot, yroot = vals[0, 0], vals[0, 1] 39 | ax.set_xlim([-RADIUS + xroot, RADIUS + xroot]) 40 | ax.set_ylim([-RADIUS + yroot, RADIUS + yroot]) 41 | if True: 42 | ax.set_xlabel("x") 43 | ax.set_ylabel("z") 44 | # plt.aspect('equal') 45 | # for i in range(vals.shape[0]): 46 | # plt.text(vals[i, 0], vals[i, 1], f'{i}', color='black') 47 | 48 | for i in range(vals.shape[0]): 49 | ax.add_patch(plt.Circle((vals[i, 0], vals[i, 1]), 5, color='w', edgecolor='k')) 50 | 51 | if img_path is not None: 52 | img = image.imread(img_path) 53 | plt.imshow(img) 54 | 55 | ax.set_ylim(ax.get_ylim()[::-1]) 56 | 57 | if save_path: 58 | fig.savefig(save_path) 59 | 60 | if show: 61 | plt.show() 62 | 63 | def from_R_T_to_extrinsic(R_mat,T_mat): 64 | temp = np.concatenate([R_mat.T, T_mat], axis=1) 65 | temp = np.concatenate([temp, np.array([[0, 0, 0, 1]])], axis=0) 66 | return temp 67 | 68 | def quatWAvgMarkley(Q, weights): 69 | ''' 70 | Averaging Quaternions. 71 | 72 | Arguments: 73 | Q(ndarray): an Mx4 ndarray of quaternions. 74 | weights(list): an M elements list, a weight for each quaternion. 
75 | ''' 76 | 77 | # Form the symmetric accumulator matrix 78 | A = np.zeros((4, 4)) 79 | M = Q.shape[0] 80 | wSum = 0 81 | 82 | for i in range(M): 83 | q = Q[i, :] 84 | w_i = weights[i] 85 | A += w_i * (np.outer(q, q)) # rank 1 update 86 | wSum += w_i 87 | 88 | # scale 89 | A /= wSum 90 | 91 | # Get the eigenvector corresponding to largest eigen value 92 | return np.linalg.eigh(A)[1][:, -1] 93 | 94 | 95 | def create_cameras_between(cam_a, cam_b, base_name): 96 | cameras,camera_names = [], [] 97 | R_a, T_a, f, c, k, p, res_w, res_h = cam_a 98 | R_b, T_b = cam_b[0], cam_b[1] 99 | r = R.from_matrix(R_a.T) 100 | r2 = R.from_matrix(R_b.T) 101 | for alpha in range(0, 10+1, 1): 102 | alpha /= 10 103 | t_weighted = alpha * T_a + (1 - alpha) * T_b 104 | 105 | q_weighted = quatWAvgMarkley(np.stack([r.as_quat(), r2.as_quat()]), 106 | [alpha, 1 - alpha]) 107 | r_weighted = R.from_quat(q_weighted).as_matrix().T 108 | cameras.append([r_weighted, t_weighted, f, c, k, p, res_w, res_h]) 109 | camera_names.append(f"{base_name}_{alpha}") 110 | return cameras, camera_names 111 | 112 | def create_2d_poses_for_synthetic_camera(set_3d_world, cameras): 113 | import os 114 | R, T, f, c, k, p, res_w, res_h = cameras[0] 115 | R_b, T_b, f, c, k, p, res_w, res_h = cameras[1] 116 | cameras, cameras_names = create_cameras_between(cameras[0], cameras[1]) 117 | poses_2d = [] 118 | i = 0.1 119 | for cam in cameras: 120 | r_mat, t_mat = cam[0], cam[1] 121 | set_2d = h36m_utils.project_2d(set_3d_world.reshape((-1, 3)), r_mat, t_mat, f, c, k, p, from_world=True)[ 122 | 0].reshape((set_3d_world.shape[0], int(set_3d_world.shape[-1] / 3 * 2))) 123 | poses_2d.append(set_2d) 124 | BASE_DATA_PATH = os.path.dirname(os.path.realpath(__file__)) 125 | plot2DPose(set_2d[0],save_path=f"{BASE_DATA_PATH}/test_2d_{i:.1f}.png") 126 | i+=0.1 127 | -------------------------------------------------------------------------------- /data/cameras.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BrianG13/FLEX/79d6d0a91941b76075d504043fc80646ac1f1878/data/cameras.h5 -------------------------------------------------------------------------------- /data/data_loaders.py: -------------------------------------------------------------------------------- 1 | from base.base_data_loader import BaseDataLoader 2 | from data import multi_view_h36_dataset 3 | 4 | NUMBER_WORK = 8 5 | 6 | 7 | class h36m_loader(BaseDataLoader): 8 | def __init__(self, config, is_training=False, eval_mode=False): 9 | self.dataset = multi_view_h36_dataset.multi_view_h36_dataset(config, is_train=is_training, 10 | num_of_views=config.arch.n_views, 11 | eval_mode=eval_mode) 12 | 13 | batch_size = config.trainer.batch_size if is_training else 1 14 | super(h36m_loader, self).__init__(self.dataset, batch_size=batch_size, shuffle=is_training, pin_memory=True, 15 | num_workers=NUMBER_WORK, drop_last=True) 16 | -------------------------------------------------------------------------------- /data/data_utils.py: -------------------------------------------------------------------------------- 1 | 2 | """Functions that help with data processing for human3.6m""" 3 | 4 | from __future__ import absolute_import 5 | from __future__ import division 6 | from __future__ import print_function 7 | 8 | import numpy as np 9 | from six.moves import xrange # pylint: disable=redefined-builtin 10 | import copy 11 | 12 | def rotmat2euler( R ): 13 | """ 14 | Converts a rotation matrix to Euler angles 15 | Matlab port to python for evaluation 
purposes 16 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/RotMat2Euler.m#L1 17 | 18 | Args 19 | R: a 3x3 rotation matrix 20 | Returns 21 | eul: a 3x1 Euler angle representation of R 22 | """ 23 | if R[0,2] == 1 or R[0,2] == -1: 24 | # special case 25 | E3 = 0 # set arbitrarily 26 | dlta = np.arctan2( R[0,1], R[0,2] ); 27 | 28 | if R[0,2] == -1: 29 | E2 = np.pi/2; 30 | E1 = E3 + dlta; 31 | else: 32 | E2 = -np.pi/2; 33 | E1 = -E3 + dlta; 34 | 35 | else: 36 | E2 = -np.arcsin( R[0,2] ) 37 | E1 = np.arctan2( R[1,2]/np.cos(E2), R[2,2]/np.cos(E2) ) 38 | E3 = np.arctan2( R[0,1]/np.cos(E2), R[0,0]/np.cos(E2) ) 39 | 40 | eul = np.array([E1, E2, E3]); 41 | return eul 42 | 43 | 44 | def quat2expmap(q): 45 | """ 46 | Converts a quaternion to an exponential map 47 | Matlab port to python for evaluation purposes 48 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/quat2expmap.m#L1 49 | 50 | Args 51 | q: 1x4 quaternion 52 | Returns 53 | r: 1x3 exponential map 54 | Raises 55 | ValueError if the l2 norm of the quaternion is not close to 1 56 | """ 57 | if (np.abs(np.linalg.norm(q)-1)>1e-3): 58 | raise(ValueError, "quat2expmap: input quaternion is not norm 1") 59 | 60 | sinhalftheta = np.linalg.norm(q[1:]) 61 | coshalftheta = q[0] 62 | 63 | r0 = np.divide( q[1:], (np.linalg.norm(q[1:]) + np.finfo(np.float32).eps)); 64 | theta = 2 * np.arctan2( sinhalftheta, coshalftheta ) 65 | theta = np.mod( theta + 2*np.pi, 2*np.pi ) 66 | 67 | if theta > np.pi: 68 | theta = 2 * np.pi - theta 69 | r0 = -r0 70 | 71 | r = r0 * theta 72 | return r 73 | 74 | def rotmat2quat(R): 75 | """ 76 | Converts a rotation matrix to a quaternion 77 | Matlab port to python for evaluation purposes 78 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/rotmat2quat.m#L4 79 | 80 | Args 81 | R: 3x3 rotation matrix 82 | Returns 83 | q: 1x4 quaternion 84 | """ 85 | rotdiff = R - R.T; 86 | 87 | r = np.zeros(3) 88 | r[0] = -rotdiff[1,2] 89 | r[1] = rotdiff[0,2] 90 | r[2] = -rotdiff[0,1] 91 | sintheta = np.linalg.norm(r) / 2; 92 | r0 = np.divide(r, np.linalg.norm(r) + np.finfo(np.float32).eps ); 93 | 94 | costheta = (np.trace(R)-1) / 2; 95 | 96 | theta = np.arctan2( sintheta, costheta ); 97 | 98 | q = np.zeros(4) 99 | q[0] = np.cos(theta/2) 100 | q[1:] = r0*np.sin(theta/2) 101 | return q 102 | 103 | def rotmat2expmap(R): 104 | return quat2expmap( rotmat2quat(R) ); 105 | 106 | def expmap2rotmat(r): 107 | """ 108 | Converts an exponential map angle to a rotation matrix 109 | Matlab port to python for evaluation purposes 110 | I believe this is also called Rodrigues' formula 111 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/expmap2rotmat.m 112 | 113 | Args 114 | r: 1x3 exponential map 115 | Returns 116 | R: 3x3 rotation matrix 117 | """ 118 | theta = np.linalg.norm( r ) 119 | r0 = np.divide( r, theta + np.finfo(np.float32).eps ) 120 | r0x = np.array([0, -r0[2], r0[1], 0, 0, -r0[0], 0, 0, 0]).reshape(3,3) 121 | r0x = r0x - r0x.T 122 | R = np.eye(3,3) + np.sin(theta)*r0x + (1-np.cos(theta))*(r0x).dot(r0x); 123 | return R 124 | 125 | 126 | def unNormalizeData(normalizedData, data_mean, data_std, dimensions_to_ignore, actions, one_hot ): 127 | """Borrowed from SRNN code. Reads a csv file and returns a float32 matrix. 
128 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/generateMotionData.py#L12 129 | 130 | Args 131 | normalizedData: nxd matrix with normalized data 132 | data_mean: vector of mean used to normalize the data 133 | data_std: vector of standard deviation used to normalize the data 134 | dimensions_to_ignore: vector with dimensions not used by the model 135 | actions: list of strings with the encoded actions 136 | one_hot: whether the data comes with one-hot encoding 137 | Returns 138 | origData: data originally used to 139 | """ 140 | T = normalizedData.shape[0] 141 | D = data_mean.shape[0] 142 | 143 | origData = np.zeros((T, D), dtype=np.float32) 144 | dimensions_to_use = [] 145 | for i in range(D): 146 | if i in dimensions_to_ignore: 147 | continue 148 | dimensions_to_use.append(i) 149 | dimensions_to_use = np.array(dimensions_to_use) 150 | 151 | if one_hot: 152 | origData[:, dimensions_to_use] = normalizedData[:, :-len(actions)] 153 | else: 154 | origData[:, dimensions_to_use] = normalizedData 155 | 156 | # potentially ineficient, but only done once per experiment 157 | stdMat = data_std.reshape((1, D)) 158 | stdMat = np.repeat(stdMat, T, axis=0) 159 | meanMat = data_mean.reshape((1, D)) 160 | meanMat = np.repeat(meanMat, T, axis=0) 161 | origData = np.multiply(origData, stdMat) + meanMat 162 | return origData 163 | 164 | 165 | def revert_output_format(poses, data_mean, data_std, dim_to_ignore, actions, one_hot): 166 | """ 167 | Converts the output of the neural network to a format that is more easy to 168 | manipulate for, e.g. conversion to other format or visualization 169 | 170 | Args 171 | poses: The output from the TF model. A list with (seq_length) entries, 172 | each with a (batch_size, dim) output 173 | Returns 174 | poses_out: A tensor of size (batch_size, seq_length, dim) output. Each 175 | batch is an n-by-d sequence of poses. 176 | """ 177 | seq_len = len(poses) 178 | if seq_len == 0: 179 | return [] 180 | 181 | batch_size, dim = poses[0].shape 182 | 183 | poses_out = np.concatenate(poses) 184 | poses_out = np.reshape(poses_out, (seq_len, batch_size, dim)) 185 | poses_out = np.transpose(poses_out, [1, 0, 2]) 186 | 187 | poses_out_list = [] 188 | for i in xrange(poses_out.shape[0]): 189 | poses_out_list.append( 190 | unNormalizeData(poses_out[i, :, :], data_mean, data_std, dim_to_ignore, actions, one_hot)) 191 | 192 | return poses_out_list 193 | 194 | 195 | def readCSVasFloat(filename): 196 | """ 197 | Borrowed from SRNN code. Reads a csv and returns a float matrix. 198 | https://github.com/asheshjain399/NeuralModels/blob/master/neuralmodels/utils.py#L34 199 | 200 | Args 201 | filename: string. Path to the csv file 202 | Returns 203 | returnArray: the read data in a float32 matrix 204 | """ 205 | returnArray = [] 206 | lines = open(filename).readlines() 207 | for line in lines: 208 | line = line.strip().split(',') 209 | if len(line) > 0: 210 | returnArray.append(np.array([np.float32(x) for x in line])) 211 | 212 | returnArray = np.array(returnArray) 213 | return returnArray 214 | 215 | 216 | def load_data(path_to_dataset, subjects, actions, one_hot): 217 | """ 218 | Borrowed from SRNN code. This is how the SRNN code reads the provided .txt files 219 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/processdata.py#L270 220 | 221 | Args 222 | path_to_dataset: string. directory where the data resides 223 | subjects: list of numbers. The subjects to load 224 | actions: list of string. 
The actions to load 225 | one_hot: Whether to add a one-hot encoding to the data 226 | Returns 227 | trainData: dictionary with k:v 228 | k=(subject, action, subaction, 'even'), v=(nxd) un-normalized data 229 | completeData: nxd matrix with all the data. Used to normlization stats 230 | """ 231 | nactions = len( actions ) 232 | 233 | trainData = {} 234 | completeData = [] 235 | for subj in subjects: 236 | for action_idx in np.arange(len(actions)): 237 | 238 | action = actions[ action_idx ] 239 | 240 | for subact in [1, 2]: # subactions 241 | 242 | print("Reading subject {0}, action {1}, subaction {2}".format(subj, action, subact)) 243 | 244 | filename = '{0}/S{1}/{2}_{3}.txt'.format( path_to_dataset, subj, action, subact) 245 | action_sequence = readCSVasFloat(filename) 246 | 247 | n, d = action_sequence.shape 248 | even_list = range(0, n, 2) 249 | 250 | if one_hot: 251 | # Add a one-hot encoding at the end of the representation 252 | the_sequence = np.zeros( (len(even_list), d + nactions), dtype=float ) 253 | # the_sequence[ :, 0:d ] = action_sequence[even_list, :] 254 | the_sequence[:, :] = action_sequence[:, :] 255 | 256 | the_sequence[ :, d+action_idx ] = 1 257 | trainData[(subj, action, subact, 'even')] = the_sequence 258 | else: 259 | # trainData[(subj, action, subact, 'even')] = action_sequence[even_list, :] 260 | trainData[(subj, action, subact, 'even')] = action_sequence[:, :] 261 | 262 | 263 | 264 | if len(completeData) == 0: 265 | completeData = copy.deepcopy(action_sequence) 266 | else: 267 | completeData = np.append(completeData, action_sequence, axis=0) 268 | 269 | return trainData, completeData 270 | 271 | 272 | def normalize_data( data, data_mean, data_std, dim_to_use, actions, one_hot ): 273 | """ 274 | Normalize input data by removing unused dimensions, subtracting the mean and 275 | dividing by the standard deviation 276 | 277 | Args 278 | data: nx99 matrix with data to normalize 279 | data_mean: vector of mean used to normalize the data 280 | data_std: vector of standard deviation used to normalize the data 281 | dim_to_use: vector with dimensions used by the model 282 | actions: list of strings with the encoded actions 283 | one_hot: whether the data comes with one-hot encoding 284 | Returns 285 | data_out: the passed data matrix, but normalized 286 | """ 287 | data_out = {} 288 | nactions = len(actions) 289 | 290 | if not one_hot: 291 | # No one-hot encoding... no need to do anything special 292 | for key in data.keys(): 293 | data_out[ key ] = np.divide( (data[key] - data_mean), data_std ) 294 | data_out[ key ] = data_out[ key ][ :, dim_to_use ] 295 | 296 | else: 297 | # TODO hard-coding 99 dimensions for un-normalized human poses 298 | for key in data.keys(): 299 | data_out[ key ] = np.divide( (data[key][:, 0:99] - data_mean), data_std ) 300 | data_out[ key ] = data_out[ key ][ :, dim_to_use ] 301 | data_out[ key ] = np.hstack( (data_out[key], data[key][:,-nactions:]) ) 302 | 303 | return data_out 304 | 305 | 306 | def normalization_stats(completeData): 307 | """" 308 | Also borrowed for SRNN code. Computes mean, stdev and dimensions to ignore. 
309 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/processdata.py#L33 310 | 311 | Args 312 | completeData: nx99 matrix with data to normalize 313 | Returns 314 | data_mean: vector of mean used to normalize the data 315 | data_std: vector of standard deviation used to normalize the data 316 | dimensions_to_ignore: vector with dimensions not used by the model 317 | dimensions_to_use: vector with dimensions used by the model 318 | """ 319 | data_mean = np.mean(completeData, axis=0) 320 | data_std = np.std(completeData, axis=0) 321 | 322 | dimensions_to_ignore = [] 323 | dimensions_to_use = [] 324 | 325 | dimensions_to_ignore.extend( list(np.where(data_std < 1e-4)[0]) ) 326 | dimensions_to_use.extend( list(np.where(data_std >= 1e-4)[0]) ) 327 | 328 | data_std[dimensions_to_ignore] = 1.0 329 | 330 | return data_mean, data_std, dimensions_to_ignore, dimensions_to_use 331 | -------------------------------------------------------------------------------- /data/panutils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | def projectPoints(X, K, R, t, Kd): 4 | """ Projects points X (3xN) using camera intrinsics K (3x3), 5 | extrinsics (R,t) and distortion parameters Kd=[k1,k2,p1,p2,k3]. 6 | 7 | Roughly, x = K*(R*X + t) + distortion 8 | 9 | See http://docs.opencv.org/2.4/doc/tutorials/calib3d/camera_calibration/camera_calibration.html 10 | or cv2.projectPoints 11 | """ 12 | 13 | x = np.asarray(R*X + t) 14 | 15 | x[0:2,:] = x[0:2,:]/x[2,:] 16 | 17 | r = x[0,:]*x[0,:] + x[1,:]*x[1,:] 18 | 19 | x[0,:] = x[0,:]*(1 + Kd[0]*r + Kd[1]*r*r + Kd[4]*r*r*r) + 2*Kd[2]*x[0,:]*x[1,:] + Kd[3]*(r + 2*x[0,:]*x[0,:]) 20 | x[1,:] = x[1,:]*(1 + Kd[0]*r + Kd[1]*r*r + Kd[4]*r*r*r) + 2*Kd[3]*x[0,:]*x[1,:] + Kd[2]*(r + 2*x[1,:]*x[1,:]) 21 | 22 | x[0,:] = K[0,0]*x[0,:] + K[0,1]*x[1,:] + K[0,2] 23 | x[1,:] = K[1,0]*x[0,:] + K[1,1]*x[1,:] + K[1,2] 24 | 25 | return x 26 | 27 | 28 | def get_uniform_camera_order(): 29 | """ Returns uniformly sampled camera order as a list of tuples [(panel,node), (panel,node), ...].""" 30 | panel_order =[1,19,14,6,16,9,5,10,18,15,3,8,4,20,11,13,7,2,17,12,9,5,6,3,15,2,12,14,16,10,4,13,20,8,17,19,18,9,4,6,1,20,1,11,7,7,14,15,3,2,16,13,3,15,17,9,20,19,8,11,5,8,18,10,12,19,5,6,16,12,4,6,20,13,4,10,15,12,17,17,16,1,5,3,2,18,13,16,8,19,13,11,10,7,3,2,18,10,1,17,10,15,14,4,7,9,11,7,20,14,1,12,1,6,11,18,7,8,9,3,15,19,4,16,18,1,11,8,4,10,20,13,6,16,7,6,16,17,12,5,17,4,8,20,12,17,14,2,19,14,18,15,11,11,9,9,2,13,5,15,20,18,8,3,19,11,9,2,13,14,5,9,17,9,7,6,12,16,18,17,13,15,17,20,4,2,2,12,4,1,16,4,11,1,16,12,18,9,7,20,1,10,10,19,5,8,14,8,4,2,9,20,14,17,11,3,12,3,13,6,5,16,3,5,10,19,1,11,13,17,18,2,5,14,19,15,8,8,9,3,6,16,15,18,20,4,13,2,11,20,7,13,15,18,10,20,7,5,2,15,6,13,4,17,7,3,19,19,3,10,2,12,10,7,7,12,11,19,8,9,6,10,6,15,10,11,3,16,1,5,14,6,5,13,20,14,4,18,10,14,14,1,19,8,14,19,3,6,6,3,13,17,8,20,15,18,2,2,16,5,19,15,9,12,19,17,8,9,3,7,1,12,7,13,1,14,5,12,11,2,16,1,18,4,18,10,16,11,7,5,1,16,9,4,15,1,7,10,14,3,2,17,13,19,20,15,10,4,8,16,14,5,6,20,12,5,18,7,1,8,11,5,13,1,16,14,18,12,15,2,12,3,8,12,17,8,20,9,2,6,9,6,12,3,20,15,20,13,3,14,1,4,8,6,10,7,17,13,18,19,10,20,12,19,2,15,10,8,19,11,19,11,2,4,6,2,11,8,7,18,14,4,12,14,7,9,7,11,18,16,16,17,16,15,4,15,9,17,13,3,6,17,17,20,19,11,5,3,1,18,4,10,5,9,13,1,5,9,6,14] 31 | node_order = 
[1,14,3,15,12,12,8,6,13,12,12,17,7,17,21,17,4,6,12,18,2,18,5,4,2,17,12,10,18,8,18,5,10,10,17,1,18,7,12,9,13,5,6,18,16,9,16,8,8,10,21,22,16,16,21,16,14,6,14,11,11,20,4,22,4,22,20,19,15,15,15,12,2,2,3,3,20,22,5,9,3,16,23,22,20,8,8,9,2,16,14,16,16,14,1,13,16,12,10,15,18,6,13,10,7,10,4,1,7,21,8,6,4,7,9,10,11,8,4,6,10,4,5,6,21,21,6,6,19,20,20,20,14,19,22,22,23,19,9,15,23,23,23,23,19,2,8,2,8,19,19,23,23,19,19,23,24,24,2,14,12,2,12,14,12,2,14,15,11,6,6,21,4,5,5,4,2,10,5,10,7,3,7,9,8,9,3,7,9,9,7,2,5,5,5,5,7,8,8,4,7,11,9,7,5,3,5,7,6,8,9,8,7,8,8,3,8,7,6,11,7,2,9,9,2,11,12,7,4,6,6,7,4,4,9,18,1,5,6,5,10,11,5,9,6,11,12,1,10,11,6,9,7,11,5,1,2,12,11,11,3,3,21,11,10,2,3,10,11,19,5,11,13,12,20,13,3,5,9,11,8,4,6,4,7,12,10,8,11,19,14,23,10,1,3,12,4,3,10,9,2,3,20,4,11,2,20,20,2,23,10,3,22,22,1,12,12,21,4,22,23,22,18,10,18,22,11,3,18,13,18,3,3,13,2,1,3,20,20,4,20,14,14,20,20,14,14,22,18,21,20,22,20,22,9,22,21,21,22,21,22,20,21,21,21,21,23,17,21,13,20,13,13,15,17,1,23,23,23,18,13,16,15,19,17,17,22,21,17,14,1,13,13,14,14,16,19,17,18,1,13,18,24,19,16,13,18,18,15,23,17,14,19,17,1,19,13,19,1,15,17,13,23,13,19,24,15,15,19,15,17,1,16,24,21,23,14,24,15,24,24,1,16,15,24,1,17,17,15,24,1,16,16,19,13,15,22,24,23,17,16,18,1,24,24,24,17,24,24,17,16,24,14,15,16,15,24,24,24,18] 32 | 33 | return zip(panel_order, node_order) -------------------------------------------------------------------------------- /data/prepare_cmu.py: -------------------------------------------------------------------------------- 1 | import sys 2 | sys.path.append('./') 3 | 4 | import numpy as np 5 | import utils.BVH as BVH 6 | 7 | from utils.Quaternions import Quaternions 8 | from utils import util 9 | 10 | rotations_bvh = [] 11 | bvh_files = util.make_dataset(['/mnt/dataset/cmubvh'], phase='bvh', data_split=1, sort_index=0) 12 | for file in bvh_files: 13 | original_anim, _, frametime = BVH.load(file, rotate=True) 14 | sampling = 3 15 | to_keep = [0, 7, 8, 2, 3, 12, 13, 15, 18, 19, 25, 26] 16 | real_rotations = original_anim.rotations.qs[1:, to_keep, :] 17 | rotations_bvh.append(real_rotations[np.arange(0, real_rotations.shape[0] // sampling) * sampling].astype('float32')) 18 | np.savez_compressed('./data/data_cmu.npz', rotations=rotations_bvh) -------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | name: FLEXClone 2 | channels: 3 | - pytorch 4 | - intel 5 | - conda-forge 6 | - defaults 7 | dependencies: 8 | - _libgcc_mutex=0.1=conda_forge 9 | - _openmp_mutex=4.5=1_gnu 10 | - backports=1.0=py_2 11 | - backports.functools_lru_cache=1.6.4=pyhd8ed1ab_0 12 | - blas=1.1=openblas 13 | - bleach=1.5.0=py36_0 14 | - ca-certificates=2021.10.8=ha878542_0 15 | - certifi=2021.5.30=py36h5fab9bb_0 16 | - cffi=1.14.1=py36h0ff685e_0 17 | - click=7.1.2=pyh9f0ad1d_0 18 | - conda-verify=3.1.1=py36h5fab9bb_1003 19 | - cudatoolkit=10.0.130=0 20 | - cycler=0.10.0=py_2 21 | - decorator=4.4.2=py_0 22 | - einops=0.3.2=pyhd8ed1ab_0 23 | - freetype=2.10.2=he06d7ca_0 24 | - future=0.18.2=py36h5fab9bb_3 25 | - h5py=2.10.0=py36h7918eee_0 26 | - hdf5=1.10.4=nompi_h3c11f04_1106 27 | - html5lib=0.9999999=py36_0 28 | - importlib-metadata=1.7.0=py36h9f0ad1d_0 29 | - intel-openmp=2019.4=243 30 | - intelpython=2021.3.0=7 31 | - jinja2=3.0.3=pyhd8ed1ab_0 32 | - jpeg=9d=h516909a_0 33 | - kiwisolver=1.2.0=py36hdb11119_0 34 | - lcms2=2.11=hbd6801e_0 35 | - ld_impl_linux-64=2.34=hc38a660_9 36 | - libblas=3.8.0=17_openblas 37 | - libcblas=3.8.0=17_openblas 38 | - 
libffi=3.2.1=he1b5a44_1007 39 | - libgcc-ng=9.3.0=h24d8f2e_16 40 | - libgfortran=3.0.0=1 41 | - libgfortran-ng=7.5.0=hdf63c60_16 42 | - libgomp=9.3.0=h24d8f2e_16 43 | - liblapack=3.8.0=17_openblas 44 | - libopenblas=0.3.10=h5a2b251_0 45 | - libpng=1.6.37=hed695b0_2 46 | - libprotobuf=3.13.0=h8b12597_0 47 | - libstdcxx-ng=9.3.0=hdf63c60_16 48 | - libtiff=4.2.0=h85742a9_0 49 | - libwebp-base=1.1.0=h516909a_3 50 | - lz4-c=1.9.2=he1b5a44_3 51 | - markdown=3.2.2=py_0 52 | - matplotlib=3.3.1=1 53 | - matplotlib-base=3.3.1=py36h817c723_0 54 | - mkl=2018.0.3=1 55 | - mkl-service=2.1.0=py36_2 56 | - ncurses=6.2=he1b5a44_1 57 | - ninja=1.10.1=hc9558a2_1 58 | - numpy=1.19.5=py36h2aa4a07_1 59 | - olefile=0.46=py_0 60 | - openblas=0.2.19=2 61 | - openssl=1.1.1k=h7f98852_0 62 | - pip=20.2.2=py_0 63 | - prettytable=2.4.0=pyhd8ed1ab_0 64 | - protobuf=3.13.0=py36h831f99a_0 65 | - pycparser=2.20=pyh9f0ad1d_2 66 | - pyparsing=2.4.7=pyh9f0ad1d_0 67 | - python=3.6.11=h4d41432_2_cpython 68 | - python_abi=3.6=1_cp36m 69 | - pytorch=1.4.0=py3.6_cuda10.0.130_cudnn7.6.3_0 70 | - pyyaml=5.4.1=py36h8f6f2f9_0 71 | - readline=8.0=he28a2e2_2 72 | - scipy=1.4.1=py36h2d22cac_3 73 | - setuptools=49.6.0=py36h9f0ad1d_0 74 | - six=1.15.0=pyh9f0ad1d_0 75 | - sqlite=3.33.0=h4cf870e_0 76 | - tensorboardx=1.4=py_0 77 | - tk=8.6.10=hed695b0_0 78 | - torchvision=0.5.0=py36_cu100 79 | - tornado=6.0.4=py36h8c4c3a4_1 80 | - wcwidth=0.2.5=pyh9f0ad1d_2 81 | - webencodings=0.5.1=py_1 82 | - werkzeug=1.0.1=pyh9f0ad1d_0 83 | - wheel=0.35.1=pyh9f0ad1d_0 84 | - xz=5.2.5=h516909a_1 85 | - yaml=0.2.5=h516909a_0 86 | - zipp=3.1.0=py_0 87 | - zlib=1.2.11=h516909a_1009 88 | - zstd=1.4.5=h6597ccf_2 89 | - pip: 90 | - markupsafe==1.1.1 91 | - pillow==7.1.0 92 | - python-dateutil==2.6.0 93 | - python-graphviz==0.16 94 | prefix: /home/briangordon/miniconda3/envs/FLEXClone 95 | -------------------------------------------------------------------------------- /evaluate_multiview.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | import argparse 4 | import torch 5 | import copy 6 | import numpy as np 7 | import matplotlib.pyplot as plt 8 | import model.model as models 9 | from data.data_loaders import h36m_loader 10 | from utils import util, h36m_utils, Animation, BVH 11 | from model import metric 12 | import matplotlib.gridspec as gridspec 13 | from utils.visualization import show2Dpose, show3Dpose, fig2img 14 | from utils.quaternion import expmap_to_quaternion, qfix, qeuler 15 | from prettytable import PrettyTable 16 | 17 | N_JOINTS = 17 18 | 19 | 20 | def set_3d_to_world_coords(set_3d_array, R_array, T_array): 21 | set_3d_reshaped = set_3d_array.reshape((set_3d_array.shape[0], set_3d_array.shape[1], -1, 3)) 22 | # Camera Formula: R.T.dot(X.T) + T (X is the 3d set) 23 | if torch.cuda.is_available(): 24 | R_T = torch.transpose(R_array, 2, 3).cuda() 25 | pose_3d_T = torch.transpose(set_3d_reshaped, 2, 3).double().cuda() 26 | X_cam = torch.matmul(R_T, pose_3d_T).cuda() # + T_array.cuda() 27 | X_cam = torch.transpose(X_cam, 2, 3).cuda() 28 | else: 29 | R_T = torch.transpose(R_array, 2, 3) 30 | pose_3d_T = torch.transpose(set_3d_reshaped, 2, 3).double() 31 | X_cam = torch.matmul(R_T, pose_3d_T) # + T_array 32 | X_cam = torch.transpose(X_cam, 2, 3) 33 | 34 | return X_cam 35 | 36 | 37 | def visualize_2d_and_3d(gt_pose_2d, gt_pose_3d, fake_pose_2d, fake_pose_3d, save_path): 38 | fig = plt.figure(figsize=(60, 12)) 39 | 40 | gs1 = gridspec.GridSpec(2, 2) 41 | # gs1.update(wspace=-0.00, 
hspace=0.05) # set the spacing between axes. 42 | # plt.axis('off') 43 | 44 | ax0 = plt.subplot(gs1[0, 0]) 45 | show2Dpose(copy.deepcopy(gt_pose_2d), ax0, radius=100) 46 | ax1 = plt.subplot(gs1[0, 1]) 47 | show2Dpose(copy.deepcopy(fake_pose_2d), ax1, radius=100) 48 | ax2 = plt.subplot(gs1[1, 0], projection='3d') 49 | show3Dpose(copy.deepcopy(gt_pose_3d), ax2, radius=600) 50 | ax3 = plt.subplot(gs1[1, 1], projection='3d') 51 | show3Dpose(copy.deepcopy(fake_pose_3d), ax3, radius=600) 52 | 53 | if save_path is None: 54 | fig_img = fig2img(fig) 55 | plt.close() 56 | return fig_img 57 | else: 58 | plt.show() 59 | fig.savefig(save_path) 60 | 61 | 62 | def save_bvh(config, test_data_loader, video_name, pre_proj, poses_2d, pre_rotations_full, pre_bones, test_parameters, 63 | name_list, output_folder): 64 | translation = np.zeros((poses_2d.shape[1], 3)) 65 | rotations = pre_rotations_full[0].cpu().numpy() 66 | length = (pre_bones * test_parameters[3].unsqueeze(0) + test_parameters[2].repeat(pre_bones.shape[0], 1, 1))[ 67 | 0].cpu().numpy() 68 | BVH.save('%s/%s.bvh' % (output_folder, video_name), 69 | Animation.load_from_network(translation, rotations, length, third_dimension=1), names=name_list) 70 | 71 | 72 | def main(config, args, output_folder): 73 | name_list = ['Hips', 'RightUpLeg', 'RightLeg', 'RightFoot', 'LeftUpLeg', 'LeftLeg', 'LeftFoot', 'Spine', 'Spine1', 74 | 'Neck', 'Head', 'LeftArm', 'LeftForeArm', 'LeftHand', 'RightArm', 'RightForeArm', 'RightHand'] 75 | name_list_20 = ['Hips', 'RightUpLeg', 'RightLeg', 'RightFoot', 'LeftUpLeg', 'LeftLeg', 'LeftFoot', 'Spine', 76 | 'Spine1', 'Neck', 'Head', 'Site', 'LeftShoulder', 'LeftArm', 'LeftForeArm', 'LeftHand', 77 | 'RightShoulder', 'RightArm', 'RightForeArm', 'RightHand'] 78 | if config.arch.n_joints == 20: 79 | name_list = name_list_20 80 | 81 | resume = args.resume 82 | print(f'Loading checkpoint from: {resume}') 83 | checkpoint = torch.load(resume) 84 | config_checkpoint = checkpoint['config'] 85 | print(config_checkpoint) 86 | 87 | model = getattr(models, config.arch.type)(config_checkpoint) 88 | 89 | state_dict = checkpoint['state_dict'] 90 | model.load_state_dict(state_dict) 91 | 92 | device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu') 93 | model = model.to(device) 94 | model.eval() 95 | print(model.summary()) 96 | 97 | def _prepare_data(device, data, _from='numpy'): 98 | return torch.from_numpy(np.array(data)).float().to(device) if _from == 'numpy' else data.float().to(device) 99 | 100 | test_data_loader = h36m_loader(config, is_training=False, eval_mode=True) 101 | test_parameters = [torch.from_numpy(np.array(item)).float().to(device) for item in 102 | test_data_loader.dataset.get_parameters()] 103 | 104 | mpjpe_error_list = {} 105 | acc_error_list = {} 106 | 107 | mpjpe_errors, acc_errors = [], [] 108 | print(f'Evaluating...') 109 | for video_idx, datas in enumerate(test_data_loader): 110 | data, video_name = datas[0], datas[1][0] 111 | data = [[item.float().to(device) for item in view_data] for view_data in data] 112 | poses_2d_views = torch.stack([view_data[0] for view_data in data], dim=1) 113 | poses_3d_views = torch.stack([view_data[1] for view_data in data], dim=1) 114 | bones_views = torch.stack([view_data[2] for view_data in data], dim=1) 115 | contacts_views = torch.stack([view_data[3] for view_data in data], dim=1) 116 | alphas_views = torch.stack([view_data[4] for view_data in data], dim=1) 117 | proj_facters_views = torch.stack([view_data[5] for view_data in data], dim=1) 118 | root_offsets_views = 
torch.stack([view_data[6] for view_data in data], dim=1) 119 | angles_3d_views = torch.stack([view_data[7] for view_data in data], dim=1) 120 | poses_2d_views_pixels = torch.stack([torch.unsqueeze(view_data[8], 0) for view_data in data], dim=1) 121 | 122 | with torch.no_grad(): 123 | network_output = model.forward_fk(poses_2d_views, test_parameters, bones_views, angles_3d_views) 124 | fake_bones_views, fake_rotations_views, fake_rotations_full_views, fake_pose_3d_views, fake_c_views, fake_proj_views = network_output[ 125 | :6] 126 | 127 | action_name = video_name.split('_')[1].split(' ')[0] # TODO - REMOVE 128 | 129 | for view_index in range(config.arch.n_views): 130 | poses_2d_pixels = poses_2d_views_pixels[:, view_index, :, :][0] 131 | poses_3d = poses_3d_views[:, view_index, :, :] 132 | alphas = alphas_views[:, view_index, :] 133 | 134 | pre_pose_3d = fake_pose_3d_views[view_index] 135 | pre_proj = fake_proj_views[view_index] 136 | pre_rotations_full = fake_rotations_full_views[view_index] 137 | pre_bones = fake_bones_views[view_index] 138 | 139 | 140 | mpjpe_error = metric.mean_points_error(poses_3d, pre_pose_3d) * torch.mean(alphas[0]).data.cpu().numpy() 141 | accel_error = metric.compute_error_accel(poses_3d, pre_pose_3d, alphas) 142 | 143 | mpjpe_errors.append(mpjpe_error) 144 | acc_errors.append(accel_error) 145 | 146 | if mpjpe_error and action_name in mpjpe_error_list.keys(): 147 | mpjpe_error_list[action_name].append(mpjpe_error) 148 | acc_error_list[action_name].append(accel_error) 149 | else: 150 | mpjpe_error_list[action_name] = [mpjpe_error] 151 | acc_error_list[action_name] = [accel_error] 152 | 153 | if args.save_bvh_files: 154 | save_bvh(config, test_data_loader, video_name + f"_view_{view_index}", pre_proj, poses_2d_pixels, 155 | pre_rotations_full, pre_bones, test_parameters, name_list, output_folder) 156 | 157 | error_file = '%s/errors.txt' % output_folder 158 | 159 | with open(error_file, 'w') as f: 160 | t = PrettyTable(['Action', 'MPJPE (mm)', 'Acc. Error (mm/s^2)']) 161 | f.writelines("=Action= \t =MPJPE (mm)= \t =Acc. Error(mm/s^2)==") 162 | for key in mpjpe_error_list.keys(): 163 | mean_pos_error = np.mean(np.array(mpjpe_error_list[key])) 164 | mean_acc_error = np.mean(np.array(acc_error_list[key])) 165 | t.add_row([key, f"{mean_pos_error:.2f}", f"{mean_acc_error:.2f}"]) 166 | f.writelines(f'{key} : \t {mean_pos_error:.2f} \t {mean_acc_error:.2f}') 167 | 168 | avg_mpjpe = np.mean(np.array(mpjpe_errors)) 169 | avg_acc_error = np.mean(np.array(acc_errors)) 170 | t.add_row(["", "", ""]) 171 | t.add_row(["Average", f"{avg_mpjpe:.2f}", f"{avg_acc_error:.2f}"]) 172 | f.writelines(f'Total avg. MPJPE: {avg_mpjpe:.2f} \nTotal acc. 
error: {avg_acc_error:.2f}') 173 | f.close() 174 | print(t) 175 | 176 | 177 | if __name__ == '__main__': 178 | parser = argparse.ArgumentParser(description='### MotioNet evaluation') 179 | 180 | parser.add_argument('-r', '--resume', default='./checkpoints/h36m_gt.pth', type=str, 181 | help='path to checkpoint (default: ./checkpoints/h36m_gt.pth)') 182 | parser.add_argument('-d', '--device', default="7", type=str, help='indices of GPUs to enable (default: all)') 183 | parser.add_argument('-i', '--input', default='h36m', type=str, help='h36m or demo or [input_folder_path]') 184 | parser.add_argument('-o', '--output', default='./output', type=str, help='Output folder') 185 | parser.add_argument('--save_bvh_files', action='store_true', default=False, 186 | help='If set, save the resulting bvh files under the output folder') 187 | 188 | args = parser.parse_args() 189 | 190 | if args.device: 191 | os.environ["CUDA_VISIBLE_DEVICES"] = args.device 192 | if args.resume: 193 | os.environ["CUDA_VISIBLE_DEVICES"] = args.device 194 | torch.cuda.current_device() 195 | config = torch.load(args.resume)['config'] 196 | output_folder = util.mkdir_dir('%s/%s' % (args.output, config.trainer.checkpoint_dir.split('/')[-1])) 197 | print(config) 198 | 199 | main(config, args, output_folder) 200 | -------------------------------------------------------------------------------- /index.md: -------------------------------------------------------------------------------- 1 | 2 | 
3 | 4 | Accepted to ECCV22 ! 5 | 6 | Poster is available [here](https://drive.google.com/file/d/1CbGD6I9-wfhACe9jhk5dYQZYWvjkaVYg/view?usp=sharing) 7 | 8 | More video demos are available [here](https://github.com/BrianG13/FLEX/tree/main/videos) 9 | 10 | ### Data & Code are available at our github repository: https://github.com/BrianG13/FLEX 11 | 12 | 13 | 54 | -------------------------------------------------------------------------------- /model/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BrianG13/FLEX/79d6d0a91941b76075d504043fc80646ac1f1878/model/__init__.py -------------------------------------------------------------------------------- /model/loss.py: -------------------------------------------------------------------------------- 1 | import torch.nn.functional as F 2 | import torch.nn as nn 3 | import torch 4 | 5 | def nll_loss(output, target): 6 | return F.nll_loss(output, target) 7 | 8 | 9 | def l2_loss(output, target): 10 | criterion = nn.MSELoss(size_average=True).cuda() 11 | return criterion(output, target) 12 | 13 | 14 | def distance_loss(fake, gt): 15 | return torch.mean(torch.norm(fake - gt, dim=len(gt.shape)-1)) -------------------------------------------------------------------------------- /model/metric.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import numpy as np 3 | from utils.Quaternions import Quaternions 4 | from utils.quaternion import qfix, qeuler 5 | import utils.quaternion as quaternion 6 | from model import model 7 | 8 | def mean_angle_error(fake_rotations_full, quaternion_angles): 9 | with torch.no_grad(): 10 | fake_angles = fake_rotations_full.reshape((-1, 4)) 11 | norm = torch.norm(fake_angles, dim=-1) 12 | norm[norm == 0] = 1 13 | fake_angles = fake_angles / norm.reshape((-1, 1)) 14 | fake_angles = np.degrees(qeuler(fake_angles, order='zxy').cpu().numpy()) 15 | quaternion_angles = quaternion_angles.reshape((-1, 4)) 16 | gt_angles = np.degrees(qeuler(quaternion_angles, order='zxy').cpu().numpy()) 17 | diff = fake_angles - gt_angles 18 | sign = np.sign(diff) 19 | sign[sign == 0] = 1 20 | error = np.mean(np.abs(np.mod(diff, sign * 180))) 21 | return error 22 | 23 | 24 | def mean_angle_error_pavllo(fake_rotations_full, quaternion_angles, n_joints=20): 25 | with torch.no_grad(): 26 | predicted_quat = fake_rotations_full.reshape((-1, 4)) 27 | norm = torch.norm(predicted_quat, dim=-1) 28 | norm[norm == 0] = 1 29 | predicted_quat = predicted_quat / norm.reshape((-1, 1)) 30 | predicted_quat = predicted_quat.view(-1, n_joints, 4) 31 | 32 | predicted_euler = qeuler(predicted_quat, order='zxy', epsilon=1e-6) 33 | 34 | expected_quat = quaternion_angles.view(-1, n_joints, 4) 35 | expected_euler = qeuler(expected_quat, order='zxy', epsilon=1e-6) 36 | 37 | # L1 loss on angle distance with 2pi wrap-around 38 | angle_distance = torch.remainder(predicted_euler - expected_euler + np.pi, 2 * np.pi) - np.pi 39 | return torch.mean(torch.abs(angle_distance)) 40 | 41 | 42 | def mean_points_error(output, target): 43 | with torch.no_grad(): 44 | if not isinstance(output, np.ndarray): 45 | output = output.data.cpu().numpy() 46 | if not isinstance(target, np.ndarray): 47 | target = target.data.cpu().numpy() 48 | output_reshape = output.reshape((-1, 3)) 49 | target_reshape = target.reshape((-1, 3)) 50 | error = np.mean(np.sqrt(np.sum(np.square((output_reshape - target_reshape)), axis=1))) 51 | return error 52 | 53 | 54 | def 
compute_error_accel(joints_gt, joints_pred,alphas=None, vis=None): 55 | """ 56 | Computes acceleration error: 57 | 1/(n-2) \sum_{i=1}^{n-1} X_{i-1} - 2X_i + X_{i+1} 58 | Note that for each frame that is not visible, three entries in the 59 | acceleration error should be zero'd out. 60 | Args: 61 | joints_gt (Nx14x3). 62 | joints_pred (Nx14x3). 63 | vis (N). 64 | Returns: 65 | error_accel (N-2). 66 | """ 67 | # (N-2)x14x3 68 | joints_gt = np.squeeze(joints_gt.detach().cpu().numpy()) 69 | joints_pred = np.squeeze(joints_pred.detach().cpu().numpy()) 70 | if alphas is not None: 71 | alphas = np.squeeze(alphas.detach().cpu().numpy()) 72 | joints_gt = joints_gt * np.expand_dims(alphas, axis=-1) 73 | joints_pred = joints_pred * np.expand_dims(alphas, axis=-1) 74 | 75 | joints_gt = joints_gt.reshape((joints_gt.shape[0], -1, 3)) 76 | joints_pred = joints_pred.reshape((joints_pred.shape[0], -1, 3)) 77 | 78 | accel_gt = joints_gt[:-2] - 2 * joints_gt[1:-1] + joints_gt[2:] 79 | accel_pred = joints_pred[:-2] - 2 * joints_pred[1:-1] + joints_pred[2:] 80 | 81 | normed = np.linalg.norm(accel_pred - accel_gt, axis=2) 82 | 83 | if vis is None: 84 | new_vis = np.ones(len(normed), dtype=bool) 85 | else: 86 | invis = np.logical_not(vis) 87 | invis1 = np.roll(invis, -1) 88 | invis2 = np.roll(invis, -2) 89 | new_invis = np.logical_or(invis, np.logical_or(invis1, invis2))[:-2] 90 | new_vis = np.logical_not(new_invis) 91 | 92 | return np.mean(np.mean(normed[new_vis], axis=1)) 93 | 94 | 95 | def mean_points_error_per_joint(output, target, n_joints): 96 | joint_errors = [] 97 | with torch.no_grad(): 98 | if not isinstance(output, np.ndarray): 99 | output = output.data.cpu().numpy() 100 | target = target.data.cpu().numpy() 101 | output_reshape = output.reshape((-1, n_joints, 3)) 102 | target_reshape = target.reshape((-1, n_joints, 3)) 103 | for i in range(n_joints): 104 | output_joints = output_reshape[:, i, :] 105 | target_joints = target_reshape[:, i, :] 106 | error = np.mean(np.sqrt(np.sum(np.square((output_joints - target_joints)), axis=1))) 107 | joint_errors.append(error) 108 | return joint_errors 109 | 110 | 111 | def mean_points_error_index(output, target): 112 | with torch.no_grad(): 113 | if not isinstance(output, np.ndarray): 114 | output = output.data.cpu().numpy() 115 | target = target.data.cpu().numpy() 116 | output_reshape = output.reshape((-1, 3)) 117 | target_reshape = target.reshape((-1, 3)) 118 | error_vector = np.sqrt(np.sum(np.square((output_reshape - target_reshape)), axis=1)) 119 | return np.argmax(error_vector) 120 | 121 | 122 | def mean_points_error_sequence(output, target): 123 | with torch.no_grad(): 124 | if not isinstance(output, np.ndarray): 125 | output = output.data.cpu().numpy() 126 | target = target.data.cpu().numpy() 127 | output_reshape = output.reshape((-1, 3)) 128 | target_reshape = target.reshape((-1, 3)) 129 | # if len(output.shape) == 3: 130 | # # output_reshape = output.reshape((output.shape[0] * int(output.shape[1]/2)*output.shape[2], 2)) 131 | # output_reshape = output.reshape((-1, 3)) 132 | # # target_reshape = target.reshape((output.shape[0] * int(output.shape[1]/2)*output.shape[2], 2)) 133 | # target_reshape = target.reshape((-1, 3)) 134 | # elif len(output.shape) == 2: 135 | # # output_reshape = output.reshape((output.shape[0] * int(output.shape[1] / 2), 2)) 136 | # # target_reshape = target.reshape((output.shape[0] * int(output.shape[1] / 2), 2)) 137 | error = np.mean(np.sqrt(np.sum(np.square((output_reshape - target_reshape)), axis=1))) 138 | return error 139 | 
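A minimal usage sketch of the two metrics defined above, `mean_points_error` (MPJPE) and `compute_error_accel` (acceleration error), mirroring how `evaluate_multiview.py` calls them; the tensor shapes and values below are illustrative placeholders and not part of the repository:

```python
# Illustrative only: dummy tensors standing in for real ground truth and predictions.
import torch
from model import metric

T, J = 50, 20                                    # frames and joints (assumed values)
gt_pose_3d = torch.randn(1, T, J * 3)            # ground-truth 3D joint positions
pred_pose_3d = gt_pose_3d + 0.01 * torch.randn(1, T, J * 3)
alphas = torch.ones(1, T)                        # per-frame scale factors, as in the evaluation loop

# MPJPE, scaled back by the mean alpha as done in the evaluation script
mpjpe = metric.mean_points_error(gt_pose_3d, pred_pose_3d) * torch.mean(alphas[0]).item()
# Acceleration error: mean norm of the difference between GT and predicted second-order differences
accel = metric.compute_error_accel(gt_pose_3d, pred_pose_3d, alphas)
print(f'MPJPE: {mpjpe:.2f}, Acc. error: {accel:.2f}')
```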
-------------------------------------------------------------------------------- /model/model_zoo.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | from base.base_model import base_model 5 | from utils.Quaternions import Quaternions 6 | 7 | 8 | def weight_init(m): 9 | if isinstance(m, nn.Linear): 10 | nn.init.kaiming_normal(m.weight) 11 | 12 | 13 | class fk_layer(base_model): 14 | def __init__(self, rotation_type): 15 | super(fk_layer, self).__init__() 16 | self.rotation_type = rotation_type 17 | self.cuda_available = torch.cuda.is_available() 18 | 19 | def normalize_vector(self, v): 20 | batch = v.shape[0] 21 | v_mag = torch.sqrt(v.pow(2).sum(1)) # batch 22 | v_mag = torch.max(v_mag, torch.autograd.Variable( 23 | torch.FloatTensor([1e-8]).cuda())) if self.cuda_available else torch.max(v_mag, torch.autograd.Variable( 24 | torch.FloatTensor([1e-8]))) 25 | v_mag = v_mag.view(batch, 1).expand(batch, v.shape[1]) 26 | v = v / v_mag 27 | return v 28 | 29 | def cross_product(self, u, v): 30 | batch = u.shape[0] 31 | i = u[:, 1] * v[:, 2] - u[:, 2] * v[:, 1] 32 | j = u[:, 2] * v[:, 0] - u[:, 0] * v[:, 2] 33 | k = u[:, 0] * v[:, 1] - u[:, 1] * v[:, 0] 34 | 35 | out = torch.cat((i.view(batch, 1), j.view(batch, 1), k.view(batch, 1)), 1) # batch*3 36 | 37 | return out 38 | 39 | def transforms_multiply(self, t0s, t1s): 40 | return torch.matmul(t0s, t1s) 41 | 42 | def transforms_blank(self, rotations): 43 | diagonal = torch.diag(torch.ones(4))[None, None, :, :].cuda() if self.cuda_available else torch.diag( 44 | torch.ones(4))[None, None, :, :] 45 | ts = diagonal.repeat(int(rotations.shape[0]), int(rotations.shape[1]), 1, 1) 46 | return ts 47 | 48 | def transforms_rotations(self, rotations): 49 | if self.rotation_type == 'q': 50 | q_length = torch.sqrt(torch.sum(torch.pow(rotations, 2), dim=-1)) 51 | qw = rotations[..., 0] / q_length 52 | qx = rotations[..., 1] / q_length 53 | qy = rotations[..., 2] / q_length 54 | qz = rotations[..., 3] / q_length 55 | qw[qw != qw] = 0 56 | qx[qx != qx] = 0 57 | qy[qy != qy] = 0 58 | qz[qz != qz] = 0 59 | """Unit quaternion based rotation matrix computation""" 60 | x2 = qx + qx 61 | y2 = qy + qy 62 | z2 = qz + qz 63 | xx = qx * x2 64 | yy = qy * y2 65 | wx = qw * x2 66 | xy = qx * y2 67 | yz = qy * z2 68 | wy = qw * y2 69 | xz = qx * z2 70 | zz = qz * z2 71 | wz = qw * z2 72 | 73 | dim0 = torch.stack([1.0 - (yy + zz), xy - wz, xz + wy], dim=-1) 74 | dim1 = torch.stack([xy + wz, 1.0 - (xx + zz), yz - wx], dim=-1) 75 | dim2 = torch.stack([xz - wy, yz + wx, 1.0 - (xx + yy)], dim=-1) 76 | m = torch.stack([dim0, dim1, dim2], dim=-2) 77 | elif self.rotation_type == '6d': 78 | rotations_reshape = rotations.view((-1, 6)) 79 | x_raw = rotations_reshape[:, 0:3] # batch*3 80 | y_raw = rotations_reshape[:, 3:6] # batch*3 81 | 82 | x = self.normalize_vector(x_raw) # batch*3 83 | z = self.cross_product(x, y_raw) # batch*3 84 | z = self.normalize_vector(z) # batch*3 85 | y = self.cross_product(z, x) # batch*3 86 | 87 | x = x.view(-1, 3, 1) 88 | y = y.view(-1, 3, 1) 89 | z = z.view(-1, 3, 1) 90 | m = torch.cat((x, y, z), 2).reshape((rotations.shape[0], rotations.shape[1], 3, 3)) # batch*3*3 91 | elif self.rotation_type == 'eular': 92 | rotations[:, 8, :] = 8 93 | rotations[:, 15, :2] = 0 94 | rotations[:, 16, 1:] = 0 95 | rotations[:, 11, :2] = 0 96 | rotations[:, 12, 1:] = 0 97 | rotations_reshape = rotations.view((-1, 3)) 98 | batch = rotations_reshape.shape[0] 99 | 
c1 = torch.cos(rotations_reshape[:, 0]).view(batch, 1) # batch*1 100 | s1 = torch.sin(rotations_reshape[:, 0]).view(batch, 1) # batch*1 101 | c2 = torch.cos(rotations_reshape[:, 2]).view(batch, 1) # batch*1 102 | s2 = torch.sin(rotations_reshape[:, 2]).view(batch, 1) # batch*1 103 | c3 = torch.cos(rotations_reshape[:, 1]).view(batch, 1) # batch*1 104 | s3 = torch.sin(rotations_reshape[:, 1]).view(batch, 1) # batch*1 105 | 106 | row1 = torch.cat((c2 * c3, -s2, c2 * s3), 1).view(-1, 1, 3) # batch*1*3 107 | row2 = torch.cat((c1 * s2 * c3 + s1 * s3, c1 * c2, c1 * s2 * s3 - s1 * c3), 1).view(-1, 1, 3) # batch*1*3 108 | row3 = torch.cat((s1 * s2 * c3 - c1 * s3, s1 * c2, s1 * s2 * s3 + c1 * c3), 1).view(-1, 1, 3) # batch*1*3 109 | 110 | m = torch.cat((row1, row2, row3), 1).reshape((rotations.shape[0], rotations.shape[1], 3, 3)) # batch*3*3 111 | return m 112 | 113 | def transforms_local(self, positions, rotations): 114 | cuda_available = torch.cuda.is_available() 115 | transforms = self.transforms_rotations(rotations) 116 | if positions.is_cuda and (transforms.is_cuda == False): 117 | transforms = transforms.cuda() 118 | transforms = torch.cat([transforms, positions[:, :, :, None]], dim=-1) 119 | zeros = torch.zeros( 120 | [int(transforms.shape[0]), int(transforms.shape[1]), 1, 3]).cuda() if cuda_available else torch.zeros( 121 | [int(transforms.shape[0]), int(transforms.shape[1]), 1, 3]) 122 | ones = torch.ones( 123 | [int(transforms.shape[0]), int(transforms.shape[1]), 1, 1]).cuda() if cuda_available else torch.ones( 124 | [int(transforms.shape[0]), int(transforms.shape[1]), 1, 1]) 125 | zerosones = torch.cat([zeros, ones], dim=-1) 126 | transforms = transforms.double() 127 | zerosones = zerosones.double() 128 | 129 | transforms = torch.cat([transforms, zerosones], dim=-2) 130 | return transforms 131 | 132 | def transforms_global(self, parents, positions, rotations): 133 | locals = self.transforms_local(positions, rotations) 134 | globals = self.transforms_blank(rotations) 135 | locals = locals.double() 136 | globals = globals.double() 137 | 138 | globals = torch.cat([locals[:, 0:1], globals[:, 1:]], dim=1) 139 | globals = list(torch.chunk(globals, int(globals.shape[1]), dim=1)) 140 | for i in range(1, positions.shape[1]): 141 | globals[i] = self.transforms_multiply(globals[parents[i]][:, 0], 142 | locals[:, i])[:, None, :, :] 143 | return torch.cat(globals, dim=1) 144 | 145 | def forward(self, parents, positions, rotations): 146 | positions = self.transforms_global(parents, positions, 147 | rotations)[:, :, :, 3] 148 | return positions[:, :, :3] / positions[:, :, 3, None] 149 | 150 | def convert_6d_to_quaternions(self, rotations): 151 | rotations_reshape = rotations.view((-1, 6)) 152 | x_raw = rotations_reshape[:, 0:3] # batch*3 153 | y_raw = rotations_reshape[:, 3:6] # batch*3 154 | 155 | x = self.normalize_vector(x_raw) # batch*3 156 | z = self.cross_product(x, y_raw) # batch*3 157 | z = self.normalize_vector(z) # batch*3 158 | y = self.cross_product(z, x) # batch*3 159 | 160 | x = x.view(-1, 3, 1) 161 | y = y.view(-1, 3, 1) 162 | z = z.view(-1, 3, 1) 163 | matrices = torch.cat((x, y, z), 2).cpu().numpy() 164 | 165 | q = Quaternions.from_transforms(matrices) 166 | 167 | return q.qs 168 | 169 | def convert_eular_to_quaternions(self, rotations): 170 | rotations_reshape = rotations.view((-1, 3)) 171 | q = Quaternions.from_euler(rotations_reshape) 172 | return q.qs 173 | 174 | # batch = matrices.shape[0] 175 | # 176 | # w = torch.sqrt(1.0 + matrices[:, 0, 0] + matrices[:, 1, 1] + matrices[:, 2, 
2]) / 2.0 177 | # w = torch.max(w, torch.autograd.Variable(torch.zeros(batch).cuda()) + 1e-8) # batch 178 | # w4 = 4.0 * w 179 | # x = (matrices[:, 2, 1] - matrices[:, 1, 2]) / w4 180 | # y = (matrices[:, 0, 2] - matrices[:, 2, 0]) / w4 181 | # z = (matrices[:, 1, 0] - matrices[:, 0, 1]) / w4 182 | # 183 | # quats = torch.cat((w.view(batch, 1), x.view(batch, 1), y.view(batch, 1), z.view(batch, 1)), 1) 184 | # 185 | # return quats 186 | 187 | 188 | class pooling_shrink_net(base_model): 189 | def __init__(self, in_features, out_features, kernel_size_set, stride_set, dilation_set, channel, stage_number): 190 | super(pooling_shrink_net, self).__init__() 191 | print('Branch S - Single-View Configuration') 192 | self.drop = nn.Dropout(p=0.25) 193 | self.relu = nn.LeakyReLU(inplace=True) 194 | self.expand_conv = nn.Conv1d(in_features, channel, kernel_size=3, stride=2, bias=True) 195 | self.expand_bn = nn.BatchNorm1d(channel, momentum=0.1) 196 | self.shrink = nn.Conv1d(channel, out_features, 1) 197 | self.stage_number = stage_number 198 | self.out_features = out_features 199 | layers = [] 200 | 201 | for stage_index in range(0, stage_number): # 202 | for conv_index in range(len(kernel_size_set)): 203 | layers.append( 204 | nn.Sequential( 205 | nn.Conv1d(channel, channel, kernel_size_set[conv_index], stride_set[conv_index], dilation=1, 206 | bias=True), 207 | nn.BatchNorm1d(channel, momentum=0.1) 208 | ) 209 | ) 210 | 211 | self.stage_layers = nn.ModuleList(layers) 212 | 213 | def forward(self, x): 214 | x = torch.transpose(x, 1, 2) 215 | x = self.drop(self.relu(self.expand_bn(self.expand_conv(x)))) 216 | for layer in self.stage_layers: 217 | x = self.drop(self.relu(layer(x))) 218 | x = F.adaptive_max_pool1d(x, 1) 219 | x = self.shrink(x) 220 | return torch.transpose(x, 1, 2) 221 | 222 | 223 | class rotation_D(base_model): 224 | def __init__(self, in_features, out_features, channel, joint_numbers): 225 | super(rotation_D, self).__init__() 226 | self.local_fc_layers = nn.ModuleList() 227 | self.joint_numbers = joint_numbers 228 | self.shrink_frame_number = 24 229 | self.relu = nn.ReLU(inplace=True) 230 | self.conv1 = nn.Conv1d(self.joint_numbers, 500, kernel_size=4, stride=1, bias=False) 231 | self.conv2 = nn.Conv1d(500, self.joint_numbers, kernel_size=1, stride=1, bias=False) 232 | 233 | for i in range(joint_numbers): 234 | self.local_fc_layers.append( 235 | nn.Linear(in_features=self.shrink_frame_number, out_features=1) 236 | ) 237 | 238 | # Get input B*T*J*4 239 | def forward(self, x): 240 | x = x.reshape((x.shape[0], -1, self.joint_numbers)) 241 | x = torch.transpose(x, 1, 2) 242 | 243 | x = self.relu(self.conv2(self.relu(self.conv1(x)))) 244 | x = F.adaptive_avg_pool1d(x, self.shrink_frame_number) 245 | layer_output = [] 246 | for i in range(self.joint_numbers): 247 | layer_output.append(torch.sigmoid(self.local_fc_layers[i](x[:, i, :].clone()))) 248 | return torch.cat(layer_output, -1) 249 | -------------------------------------------------------------------------------- /model/tensor_pool.py: -------------------------------------------------------------------------------- 1 | import random 2 | import torch 3 | from torch.autograd import Variable 4 | class TensorPool(): 5 | def __init__(self, pool_size): 6 | self.pool_size = pool_size 7 | if self.pool_size > 0: 8 | self.num_imgs = 0 9 | self.images = [] 10 | 11 | def query(self, tensors): 12 | if self.pool_size == 0: 13 | return tensors 14 | return_tensors = [] 15 | for tensor in tensors.data: 16 | tensor = torch.unsqueeze(tensor, 0) 17 | if 
self.num_imgs < self.pool_size: 18 | self.num_imgs = self.num_imgs + 1 19 | self.images.append(tensor) 20 | return_tensors.append(tensor) 21 | else: 22 | p = random.uniform(0, 1) 23 | if p > 0.5: 24 | random_id = random.randint(0, self.pool_size-1) 25 | tmp = self.images[random_id].clone() 26 | self.images[random_id] = tensor 27 | return_tensors.append(tmp) 28 | else: 29 | return_tensors.append(tensor) 30 | return_images = Variable(torch.cat(return_tensors, 0)) 31 | return return_images 32 | -------------------------------------------------------------------------------- /model/transformer_model.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from torch.nn.init import xavier_uniform_ 4 | from base.base_model import base_model 5 | 6 | N_HEADS = 64 7 | N_LAYERS = 2 8 | 9 | """ 10 | Some of the code taken from: https://github.com/lucidrains/vit-pytorch/blob/f196d1ec5b52edf554031c4a9c977d3a4e40ec9d/vit_pytorch/vit.py#L79 11 | """ 12 | 13 | 14 | class TransformerModel(base_model): 15 | def __init__(self, config, input_dim, output_dim, n_heads=N_HEADS, n_layers=N_LAYERS): 16 | super(TransformerModel, self).__init__() 17 | self.config = config 18 | if config.arch.transformer_mode == "encoder": 19 | print(f'Creating transformer encoder with {n_heads} Heads & {n_layers} Layers') 20 | encoder_layers = nn.TransformerEncoderLayer(d_model=input_dim, nhead=n_heads, dim_feedforward=output_dim) 21 | encoder_norm = nn.LayerNorm(output_dim) 22 | self.transformer_model = nn.TransformerEncoder(encoder_layers, num_layers=n_layers, norm=encoder_norm) 23 | elif config.arch.transformer_mode == "mha": 24 | print(f'Creating MHA layer with {n_heads} Heads') 25 | self.transformer_model = nn.MultiheadAttention(embed_dim=config.arch.channel, num_heads=n_heads) 26 | else: 27 | raise Exception('NO VALID CONFIGURATION FOR TRANSFORMER MODEL!') 28 | self.dummy_token = torch.nn.Parameter(torch.zeros((1, 1, config.arch.channel))) 29 | self.register_parameter(param=self.dummy_token, name='dummy_token') 30 | xavier_uniform_(self.dummy_token) 31 | 32 | def forward(self, x): 33 | # x.shape = [B,C,F,V] 34 | self.transformer_model = self.transformer_model 35 | bsz = x.shape[0] 36 | channels = x.shape[1] 37 | frames = x.shape[2] 38 | views = x.shape[3] 39 | x = x.transpose(1, 3) # x.shape = [B,V,F,C] 40 | x = x.transpose(1, 2) # x.shape = [B,F,V,C] 41 | x = x.contiguous().view(x.shape[0] * x.shape[1], x.shape[2], x.shape[3]) # x.shape = [B*F,V,C] 42 | x = x.transpose(0, 1) # x.shape = [V,B*F,C] 43 | 44 | temp = self.dummy_token.expand(-1, x.shape[1], -1) # Expand token to B*F dimension 45 | x = torch.cat((temp, x), dim=0) # x.shape = [B*F,V+1,C] 46 | 47 | if self.config.arch.transformer_mode == "mha": 48 | x, attn_output_weights = self.transformer_model(x, x, x) 49 | else: 50 | x = self.transformer_model(x) 51 | 52 | x = x.transpose(0, 1) # [B*F, 1 ,C] 53 | x = x.contiguous().view(bsz, frames, views + 1, channels) # [B,F,V+1,C] 54 | x = x.transpose(1, 2) # [B,F,V,C] 55 | x = x.transpose(1, 3) # [B,C,F,V] 56 | x = x[:, :, :, 0:1] # [B,C,F,1] 57 | 58 | return x 59 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | # This file may be used to create an environment using: 2 | # $ conda create --name --file 3 | # platform: linux-64 4 | _libgcc_mutex=0.1 5 | _openmp_mutex=4.5 6 | absl-py=0.11.0 7 | alabaster=0.7.12 8 | 
astunparse=1.6.3 9 | attrs=20.3.0 10 | babel=2.9.0 11 | blas=1.1 12 | bleach=1.5.0 13 | ca-certificates=2021.10.8 14 | cachetools=4.2.1 15 | cdflib=0.3.20 16 | certifi=2016.9.26 17 | cffi=1.14.1 18 | chardet=4.0.0 19 | clearml=0.17.4 20 | clearml-agent=0.17.1 21 | cloudpickle=0.4.0 22 | cudatoolkit=10.0.130 23 | cycler=0.10.0 24 | decorator=4.4.2 25 | docutils=0.16 26 | easydict=1.9 27 | einops=0.3.2 28 | exifread=2.1.2 29 | ffnet=0.8.4 30 | flatbuffers=1.12 31 | fpdf2=2.1.0 32 | freetype=2.10.2 33 | funcsigs=1.0.2 34 | furl=2.1.0 35 | future=0.18.2 36 | gast=0.3.3 37 | google-auth=1.27.0 38 | google-auth-oauthlib=0.4.2 39 | google-pasta=0.2.0 40 | gpxpy=1.1.2 41 | grpcio=1.32.0 42 | h5py=2.10.0 43 | hdf5=1.10.4 44 | html5lib=0.9999999 45 | humanfriendly=9.1 46 | idna=2.10 47 | imageio=2.9.0 48 | imagesize=1.2.0 49 | importlib-metadata=1.7.0 50 | intel-openmp=2019.4 51 | intelpython=2021.3.0 52 | jinja2=2.11.3 53 | joblib=0.14.1 54 | jpeg=9d 55 | json-tricks=3.15.5 56 | jsonschema=3.2.0 57 | keras-preprocessing=1.1.2 58 | kiwisolver=1.2.0 59 | lcms2=2.11 60 | ld_impl_linux-64=2.34 61 | libblas=3.8.0 62 | libcblas=3.8.0 63 | libffi=3.2.1 64 | libgcc-ng=9.3.0 65 | libgfortran=3.0.0 66 | libgfortran-ng=7.5.0 67 | libgomp=9.3.0 68 | liblapack=3.8.0 69 | libopenblas=0.3.10 70 | libpng=1.6.37 71 | libprotobuf=3.13.0 72 | libstdcxx-ng=9.3.0 73 | libtiff=4.1.0 74 | libwebp-base=1.1.0 75 | loky=2.9.0 76 | lz4-c=1.9.2 77 | markdown=3.2.2 78 | markupsafe=1.1.1 79 | matplotlib=3.3.1 80 | matplotlib-base=3.3.1 81 | mkl=2018.0.3 82 | mkl-service=2.1.0 83 | ncurses=6.2 84 | networkx=1.11 85 | ninja=1.10.1 86 | numpy=1.19.5 87 | oauthlib=3.1.0 88 | olefile=0.46 89 | openblas=0.2.19 90 | opencv-python=4.0.0.21 91 | openssl=1.1.1l 92 | opt-einsum=3.3.0 93 | orderedmultidict=1.0.1 94 | packaging=20.9 95 | pandas=1.1.5 96 | pathlib2=2.3.5 97 | pillow=7.1.0 98 | pip=20.2.2 99 | prettytable=2.0.0 100 | protobuf=3.13.0 101 | psutil=5.8.0 102 | py=1.10.0 103 | pyasn1=0.4.8 104 | pyasn1-modules=0.2.8 105 | pycparser=2.20 106 | pyfeatures=0.1.1 107 | pygments=2.8.1 108 | pyhocon=0.3.57 109 | pyjwt=1.7.1 110 | pyparsing=2.4.7 111 | pyproj=1.9.5.1 112 | pyrsistent=0.17.3 113 | pytest=3.0.7 114 | python=3.6.11 115 | python-dateutil=2.6.0 116 | python-graphviz=0.16 117 | python_abi=3.6 118 | pytorch=1.4.0 119 | pytorch-model-summary=0.1.2 120 | pytz=2021.1 121 | pywavelets=1.1.1 122 | pyyaml=5.1 123 | readline=8.0 124 | repoze-lru=0.7 125 | requests=2.25.1 126 | requests-file=1.5.1 127 | requests-oauthlib=1.3.0 128 | rsa=4.7.2 129 | scikit-image=0.17.2 130 | scikit-learn=0.24.2 131 | scipy=1.4.1 132 | seaborn=0.11.1 133 | setuptools=49.6.0 134 | six=1.15.0 135 | sklearn=0.0 136 | snowballstemmer=2.1.0 137 | spacepy=0.2.2 138 | sphinx=3.4.3 139 | sphinxcontrib-applehelp=1.0.2 140 | sphinxcontrib-devhelp=1.0.2 141 | sphinxcontrib-htmlhelp=1.0.3 142 | sphinxcontrib-jsmath=1.0.1 143 | sphinxcontrib-qthelp=1.0.3 144 | sphinxcontrib-serializinghtml=1.1.4 145 | sqlite=3.33.0 146 | tensorboard=2.4.1 147 | tensorboard-plugin-wit=1.8.0 148 | tensorboardx=1.4 149 | tensorflow=2.4.1 150 | tensorflow-estimator=2.4.0 151 | tensorflow-tensorboard=0.4.0 152 | termcolor=1.1.0 153 | threadpoolctl=3.0.0 154 | tifffile=2020.9.3 155 | tk=8.6.10 156 | torchvision=0.5.0 157 | torchviz=0.0.1 158 | tornado=6.0.4 159 | tqdm=4.54.1 160 | trains=0.16.4 161 | typing=3.7.4.3 162 | typing-extensions=3.7.4.3 163 | urllib3=1.26.2 164 | virtualenv=16.7.10 165 | wcwidth=0.2.5 166 | webencodings=0.5.1 167 | werkzeug=1.0.1 168 | wheel=0.35.1 169 | 
wrapt=1.12.1 170 | xmltodict=0.10.2 171 | xz=5.2.5 172 | zipp=3.1.0 173 | zlib=1.2.11 174 | zstd=1.4.5 175 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import os 2 | import copy 3 | import json 4 | import argparse 5 | import torch 6 | import model.model as models 7 | from data.data_loaders import h36m_loader 8 | from trainer.multi_view_trainer import fk_trainer_multi_view 9 | from types import SimpleNamespace as Namespace 10 | 11 | base_path = os.path.dirname(os.path.realpath(__file__)) 12 | 13 | 14 | def get_clearml_logger(config): 15 | import clearml 16 | import trains 17 | if config.arch.transformer_on: 18 | project_name = f'MotioNet-MultiView - {config.arch.n_views} Views - Transformer: {config.arch.transformer_mode}' 19 | task_name = f'[{config.trainer.data}][{config.arch.transformer_n_heads} Heads][{config.arch.transformer_n_layers} Layers](3D pos.loss)' 20 | else: 21 | project_name = f'MotioNet-MultiView - {config.arch.n_views} Views' 22 | task_name = f'[{config.trainer.data}] (3D pos.loss) Early fusion' 23 | 24 | if config.arch.translation: 25 | task_name += " [Translation]" 26 | 27 | if config.trainer.optimizer == "adaw": 28 | task_name += " [AdaW]" 29 | 30 | if config.trainer.train_only_on_cameras is not None: 31 | task_name += f" Training only on camera tuples: {config.trainer.train_only_on_cameras}" 32 | 33 | if getattr(config, 'momo', False): 34 | task = trains.Task.init(project_name=project_name, task_name=task_name) 35 | clearml_logger = task.get_logger() 36 | else: 37 | task = clearml.Task.init(project_name=project_name, task_name=task_name) 38 | clearml_logger = task.get_logger() 39 | 40 | return clearml_logger 41 | 42 | def count_parameters(model): 43 | from prettytable import PrettyTable 44 | table = PrettyTable(["Modules", "Parameters"]) 45 | total_params = 0 46 | for name, parameter in model.named_parameters(): 47 | if not parameter.requires_grad: 48 | continue 49 | param = parameter.numel() 50 | table.add_row([name, param]) 51 | total_params+=param 52 | print(table) 53 | print(f"Total Trainable Params: {total_params}") 54 | return total_params 55 | 56 | 57 | def get_instance(module, name, config, *args): 58 | return getattr(module, config[name]['type'])(*args, **config[name]['args']) 59 | 60 | 61 | def config_parse(args): 62 | config = copy.deepcopy(json.load(open(args.config), object_hook=lambda d: Namespace(**d))) 63 | 64 | config.device = str(args.device) 65 | config.arch.transformer_on = args.transformer_on if args.transformer_on is not None else config.arch.transformer_on 66 | config.arch.transformer_n_heads = args.transformer_n_heads 67 | config.arch.transformer_n_layers = args.transformer_n_layers 68 | config.arch.transformer_mode = args.transformer_mode 69 | 70 | config.momo = bool(args.momo) 71 | config.arch.n_joints = int(args.n_joints) 72 | config.trainer.starting_pos_totem = args.starting_pos_totem 73 | config.trainer.arms_and_legs_weight = args.arms_and_legs_weight 74 | config.trainer.use_3d_world_pose_as_labels = args.use_3d_world_pose_as_labels 75 | config.arch.branch_S_multi_view = args.branch_S_multi_view 76 | 77 | config.arch.kernel_size = list(map(int, args.kernel_size.replace(' ', '').strip().split(','))) if args.kernel_size is not None else config.arch.kernel_size 78 | config.arch.stride = list(map(int, args.stride.replace(' ', '').strip().split(','))) if args.stride is not None else config.arch.stride 79 | 
config.arch.dilation = list(map(int, args.dilation.replace(' ', '').strip().split(','))) if args.dilation is not None else config.arch.dilation 80 | 81 | config.arch.kernel_size_stage_1 = list(map(int, args.kernel_size_stage_1.replace(' ', '').strip().split(','))) if args.kernel_size_stage_1 is not None else config.arch.kernel_size_stage_1 82 | config.arch.stride_stage_1 = list(map(int, args.stride_stage_1.replace(' ', '').strip().split(','))) if args.stride_stage_1 is not None else config.arch.stride_stage_1 83 | config.arch.dilation_stage_1 = list(map(int, args.dilation_stage_1.replace(' ', '').strip().split(','))) if args.dilation_stage_1 is not None else config.arch.dilation_stage_1 84 | config.arch.kernel_size_stage_2 = list(map(int, args.kernel_size_stage_2.replace(' ', '').strip().split(','))) if args.kernel_size_stage_2 is not None else config.arch.kernel_size_stage_2 85 | 86 | config.arch.n_views = int(args.n_views) if args.n_views is not None else config.arch.n_views 87 | config.arch.kernel_width = int(args.kernel_width) if args.kernel_width is not None else None 88 | config.arch.padding = int(args.padding) if args.padding is not None else None 89 | 90 | config.arch.channel = int(args.channel) if args.channel is not None else config.arch.channel 91 | config.arch.stage = int(args.stage) if args.stage is not None else config.arch.stage 92 | config.arch.n_type = int(args.n_type) if args.n_type is not None else config.arch.n_type 93 | config.arch.rotation_type = args.rotation_type if args.rotation_type is not None else config.arch.rotation_type 94 | config.arch.translation = True if args.translation == 1 else config.arch.translation 95 | config.arch.confidence = True if args.confidence == 1 else config.arch.confidence 96 | config.arch.contact = True if args.contact == 1 else config.arch.contact 97 | if args.train_only_on_cameras is not None: 98 | camera_tuples = list(args.train_only_on_cameras.replace(' ', '').strip().split(':')) 99 | config.trainer.train_only_on_cameras = [] 100 | for cam_idxs in camera_tuples: 101 | config.trainer.train_only_on_cameras.append(list(map(int, cam_idxs.split(',')))) 102 | else: 103 | config.trainer.train_only_on_cameras = None 104 | config.trainer.data = args.data 105 | config.trainer.lr = args.lr 106 | config.trainer.batch_size = args.batch_size 107 | config.trainer.train_frames = args.train_frames 108 | config.trainer.use_loss_foot = True if args.loss_terms[0] == '1' else False 109 | config.trainer.use_loss_3d = True if args.loss_terms[1] == '1' else False 110 | config.trainer.use_loss_2d = True if args.loss_terms[2] == '1' else False 111 | config.trainer.use_loss_D = True if args.loss_terms[3] == '1' else False 112 | config.trainer.data_aug_flip = True if args.augmentation_terms[0] == '1' else False 113 | config.trainer.data_aug_depth = True if args.augmentation_terms[1] == '1' else False 114 | config.trainer.save_dir = args.save_dir if args.save_dir is not None else base_path 115 | 116 | config.trainer.checkpoint_dir = '%s/%s_%s_k%s_s%s_d%s_c%s_%s_%s_%s%s%s_%s_%s_loss%s_aug%s' % (config.trainer.save_dir, args.name, args.data, 117 | str(config.arch.kernel_size).strip('[]').replace(' ', ''), 118 | str(config.arch.stride).strip('[]').replace(' ', ''), 119 | str(config.arch.dilation).strip('[]').replace(' ', ''), 120 | config.arch.channel, config.arch.stage, config.arch.rotation_type, 121 | 't' if config.arch.translation else '', 122 | 'c' if config.arch.confidence else '', 123 | 'c' if config.arch.contact else '', 124 | args.lr, args.batch_size, 
args.loss_terms, args.augmentation_terms) 125 | return config 126 | 127 | 128 | def train(config, resume): 129 | print("Loading dataset..") 130 | train_data_loader = h36m_loader(config, is_training=True) 131 | test_data_loader = h36m_loader(config, is_training=False) 132 | 133 | model = getattr(models, config.arch.type)(config) 134 | # print(model.summary()) 135 | count_parameters(model) 136 | trainer = fk_trainer_multi_view(model, resume=resume, config=config, data_loader=train_data_loader, 137 | test_data_loader=test_data_loader, clearml_logger=clearml_logger, 138 | num_of_views=config.arch.n_views) 139 | trainer.train() 140 | 141 | 142 | if __name__ == '__main__': 143 | 144 | parser = argparse.ArgumentParser(description='### MotioNet training') 145 | 146 | parser.add_argument('--clearml_logger', action='store_true', help='Turn on ClearML Framework') 147 | 148 | # Runtime parameters 149 | parser.add_argument('--transformer_on', action='store_true', help='Activate Attention Heads on Fusion layer') 150 | parser.add_argument('--transformer_n_heads', default=2, type=int, help='Number of heads to use') 151 | parser.add_argument('--transformer_n_layers', default=2, type=int, help='Number of transformer layers to use') 152 | parser.add_argument('--transformer_mode', default="encoder", type=str, help='Fusion transformer mode: encoder or mha') 153 | 154 | parser.add_argument('-m', '--momo', action='store_true', help='If we are running on momo server') 155 | parser.add_argument('--n_joints', default=20, type=int, help='Number of joints to use') 156 | parser.add_argument('--starting_pos_totem', action='store_true',default=True, help='Use the totem starting position') 157 | parser.add_argument('--arms_and_legs_weight', type=float, default=1, help='Weight factor for arms and legs in the loss') 158 | parser.add_argument('-c', '--config', default=f'{base_path}/config_zoo/default.json', type=str, 159 | help='config file path (default: config_zoo/default.json)') 160 | parser.add_argument('-r', '--resume', default=None, type=str, help='path to latest checkpoint (default: None)') 161 | parser.add_argument('-d', '--device', default='2', type=str, help='indices of GPUs to enable (default: all)') 162 | parser.add_argument('-n', '--name', default='debug_model', type=str, help='The name of this training') 163 | parser.add_argument('--data', default='gt', type=str, 164 | help='The training data, gt - projected 2d pose; cpn; detectron') 165 | parser.add_argument('--use_3d_world_pose_as_labels', action='store_true', help='Use 3D world-coordinate poses as labels') 166 | parser.add_argument('--branch_S_multi_view', action='store_true', help='Use a multi-view configuration for branch S') 167 | 168 | 169 | # Network definition 170 | parser.add_argument('--kernel_size', default=None, type=str, help='The kernel_size set of the convolution') 171 | parser.add_argument('--stride', default=None, type=str, help='The stride set of the convolution') 172 | parser.add_argument('--dilation', default=None, type=str, help='The dilation set of the convolution') 173 | parser.add_argument('--channel', default=None, type=int, help='The channel number of the network') 174 | parser.add_argument('--stage', default=None, type=int, help='The stage of the network') 175 | parser.add_argument('--n_type', default=None, type=int, help='The network architecture of the rotation branch: 0 - deconv, 1 - avgpool') 176 | parser.add_argument('--rotation_type', default=None, type=str, help='The type of rotations, including 6d, 5d, q, eular') 177 | parser.add_argument('--translation', default=None, type=int, help='If we want to use 
translation in the network, 0 - No, 1 - Yes') 178 | parser.add_argument('--confidence', default=None, type=int, help='If we want to use confidence map in the network, 0 - No, 1 - Yes') 179 | parser.add_argument('--contact', default=None, type=int, help='If we want to use foot contact in the network, 0 - No, 1 - Yes') 180 | 181 | parser.add_argument('--kernel_size_stage_1', default=None, type=str, help='The kernel_size set of the convolution') 182 | parser.add_argument('--stride_stage_1', default=None, type=str, help='The stride set of the convolution') 183 | parser.add_argument('--dilation_stage_1', default=None, type=str, help='The dilation set of the convolution') 184 | parser.add_argument('--kernel_size_stage_2', default=None, type=str, help='The kernel_size set of the convolution') 185 | parser.add_argument('--n_views', default=4, type=int, help='Number of views to use in Multi-View') 186 | parser.add_argument('--kernel_width', default=None, type=int, help='The kernel width of the convolution') 187 | parser.add_argument('--padding', default=None, type=int, help='The padding of the convolution') 188 | 189 | 190 | 191 | # Training parameters 192 | parser.add_argument('--train_only_on_cameras', default=None, type=str, help='Limit training to specific camera index tuples, colon-separated (e.g. 0,1,2,3:0,2,3,1)') 193 | 194 | parser.add_argument('--lr', default=0.001, type=float, help='The learning rate in the training') 195 | parser.add_argument('--batch_size', default=128, type=int, help='The batch size') 196 | parser.add_argument('--train_frames', default=0, type=int, help='The number of frames in a training clip, 0 means a random number') 197 | parser.add_argument('--loss_terms', default='0100', type=str, help='The training losses to use [foot_contact, 3d_pose, 2d_pose, adversarial], 0 - No, 1 - Yes, e.g.: 1111') 198 | parser.add_argument('--augmentation_terms', default='00', type=str, help='The data augmentations to use in training [pose_flip, projection_depth], 0 - No, 1 - Yes, e.g.: 11') 199 | parser.add_argument('--save_dir', default=None, type=str, help='Base directory to save network') 200 | 201 | args = parser.parse_args() 202 | if args.resume: 203 | print('Loading Config file from checkpoint...') 204 | config = torch.load(args.resume)['config'] 205 | config.trainer.checkpoint_dir = config.trainer.checkpoint_dir + "/continue" 206 | config.device = str(args.device) 207 | elif args.config: 208 | config = config_parse(args) 209 | else: 210 | raise AssertionError("A configuration file needs to be specified. 
Add '-c config.json', for example.") 211 | 212 | if args.clearml_logger: 213 | clearml_logger = get_clearml_logger(config) 214 | else: 215 | clearml_logger = None 216 | 217 | print(f'args.device: {str(args.device)}') 218 | config.device = str(args.device) 219 | os.environ["CUDA_VISIBLE_DEVICES"] = str(args.device) 220 | print(config) 221 | print(f'Checkpoints will be saved at: {config.trainer.checkpoint_dir}') 222 | train(config, args.resume) 223 | 224 | -------------------------------------------------------------------------------- /trainer/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BrianG13/FLEX/79d6d0a91941b76075d504043fc80646ac1f1878/trainer/__init__.py -------------------------------------------------------------------------------- /utils/AnimationPositions.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from scipy.spatial import distance 3 | 4 | from utils.Quaternions import Quaternions 5 | import utils.Animation 6 | import utils.AnimationStructure 7 | 8 | def constrain(positions, constraints): 9 | """ 10 | Constrain animation positions given 11 | a number of VerletParticles constrains 12 | 13 | Parameters 14 | ---------- 15 | 16 | positions : (F, J, 3) ndarray 17 | array of joint positions for 18 | F frames and J joints 19 | 20 | constraints : [(int, int, float, float, float)] 21 | A list of constraints in the format: 22 | (Joint1, Joint2, Masses1, Masses2, Lengths) 23 | 24 | Returns 25 | ------- 26 | 27 | positions : (F, J, 3) ndarray 28 | joint positions for F 29 | frames and J joints constrained 30 | using the supplied constraints 31 | """ 32 | 33 | from VerletParticles import VerletParticles 34 | 35 | particles = VerletParticles(positions, gravity=0.0, timestep=0.0) 36 | for i, j, w0, w1, l in constraints: 37 | particles.add_length_constraint(i, j, w0, w1, l) 38 | 39 | return particles.constrain() 40 | 41 | def extremities(positions, count, **kwargs): 42 | """ 43 | List of most extreme frame indices 44 | 45 | Parameters 46 | ---------- 47 | 48 | positions : (F, J, 3) ndarray 49 | array of joint positions for 50 | F frames and J joints 51 | 52 | count : int 53 | Number of indices to return, 54 | does not include first and last 55 | frame which are always included 56 | 57 | static : bool 58 | Find extremities where root 59 | translation has been removed 60 | 61 | Returns 62 | ------- 63 | 64 | indices : (C) ndarray 65 | Returns C frame indices of the 66 | most extreme frames including 67 | the first and last frames. 68 | 69 | Therefore if `count` it specified 70 | as `4` will return and array of 71 | `6` indices. 
72 | """ 73 | 74 | if kwargs.pop('static', False): 75 | positions = positions - positions[:,0][:,np.newaxis,:] 76 | 77 | positions = positions.reshape((len(positions), -1)) 78 | 79 | distance_matrix = distance.squareform(distance.pdist(positions)) 80 | 81 | keys = [0] 82 | for _ in range(count-1): 83 | keys.append(int(np.argmax(np.min(distance_matrix[keys], axis=0)))) 84 | return np.array(keys) 85 | 86 | 87 | def load_to_maya(positions, names=None, parents=None, color=None, radius=0.1, thickness=5.0): 88 | 89 | import pymel.core as pm 90 | import maya.mel as mel 91 | 92 | if names is None: 93 | names = ['joint_%i' % i for i in xrange(positions.shape[1])] 94 | 95 | if color is None: 96 | color = (0.5, 0.5, 0.5) 97 | 98 | mpoints = [] 99 | frames = range(1, len(positions)+1) 100 | for i, name in enumerate(names): 101 | 102 | #try: 103 | # point = pm.PyNode(name) 104 | #except pm.MayaNodeError: 105 | # point = pm.sphere(p=(0,0,0), n=name, radius=radius)[0] 106 | point = pm.sphere(p=(0,0,0), n=name, radius=radius)[0] 107 | 108 | jpositions = positions[:,i] 109 | 110 | for j,attr,attr_name in zip(xrange(3), 111 | [point.tx, point.ty, point.tz], 112 | ["_translateX", "_translateY", "_translateZ"]): 113 | conn = attr.listConnections() 114 | if len(conn) == 0: 115 | curve = pm.nodetypes.AnimCurveTU(n=name + attr_name) 116 | pm.connectAttr(curve.output, attr) 117 | else: 118 | curve = conn[0] 119 | curve.addKeys(frames, jpositions[:,j]) 120 | 121 | mpoints.append(point) 122 | 123 | if parents != None: 124 | 125 | for i, p in enumerate(parents): 126 | if p == -1: continue 127 | pointname = names[i] 128 | parntname = names[p] 129 | conn = pm.PyNode(pointname).t.listConnections() 130 | if len(conn) != 0: continue 131 | 132 | curve = pm.curve(p=[[0,0,0],[0,1,0]], d=1, n=names[i]+'_curve') 133 | pm.connectAttr(pointname+'.t', names[i]+'_curve.cv[0]') 134 | pm.connectAttr(parntname+'.t', names[i]+'_curve.cv[1]') 135 | pm.select(curve) 136 | pm.runtime.AttachBrushToCurves() 137 | stroke = pm.selected()[0] 138 | brush = pm.listConnections(stroke.getChildren()[0]+'.brush')[0] 139 | pm.setAttr(brush+'.color1', color) 140 | pm.setAttr(brush+'.globalScale', thickness) 141 | pm.setAttr(brush+'.endCaps', 1) 142 | pm.setAttr(brush+'.tubeSections', 20) 143 | mel.eval('doPaintEffectsToPoly(1,0,0,1,100000);') 144 | mpoints += [stroke, curve] 145 | 146 | return pm.group(mpoints, n='AnimationPositions'), mpoints 147 | 148 | 149 | def load_from_maya(root, start, end): 150 | 151 | import pymel.core as pm 152 | 153 | def rig_joints_list(s, js): 154 | for c in s.getChildren(): 155 | if 'Geo' in c.name(): continue 156 | if isinstance(c, pm.nodetypes.Joint): js = rig_joints_list(c, js); continue 157 | if isinstance(c, pm.nodetypes.Transform): js = rig_joints_list(c, js); continue 158 | return [s] + js 159 | 160 | joints = rig_joints_list(root, []) 161 | 162 | names = map(lambda j: j.name(), joints) 163 | positions = np.empty((end - start, len(names), 3)) 164 | 165 | original_time = pm.currentTime(q=True) 166 | pm.currentTime(start) 167 | 168 | for i in range(start, end): 169 | 170 | pm.currentTime(i) 171 | for j in joints: positions[i-start, names.index(j.name())] = j.getTranslation(space='world') 172 | 173 | pm.currentTime(original_time) 174 | 175 | return positions, names 176 | 177 | 178 | def loop(positions, forward='z'): 179 | 180 | fid = 'xyz'.index(forward) 181 | 182 | data = positions.copy() 183 | trajectory = data[:,0:1,fid].copy() 184 | 185 | data[:,:,fid] -= trajectory 186 | diff = data[0] - data[-1] 187 | data += 
np.linspace( 188 | 0, 1, len(data))[:,np.newaxis,np.newaxis] * diff[np.newaxis] 189 | data[:,:,fid] += trajectory 190 | 191 | return data 192 | 193 | 194 | def extend(positions, length, forward='z'): 195 | 196 | fid = 'xyz'.index(forward) 197 | 198 | data = positions.copy() 199 | 200 | while len(data) < length: 201 | 202 | next = positions[1:].copy() 203 | next[:,:,fid] += data[-1,0,fid] 204 | data = np.concatenate([data, next], axis=0) 205 | 206 | return data[:length] 207 | 208 | 209 | def redirect(positions, joint0, joint1, forward='z'): 210 | 211 | forwarddir = { 212 | 'x': np.array([[[1,0,0]]]), 213 | 'y': np.array([[[0,1,0]]]), 214 | 'z': np.array([[[0,0,1]]]), 215 | }[forward] 216 | 217 | direction = (positions[:,joint0] - positions[:,joint1]).mean(axis=0)[np.newaxis,np.newaxis] 218 | direction = direction / np.sqrt(np.sum(direction**2)) 219 | 220 | rotation = Quaternions.between(direction, forwarddir).constrained_y() 221 | 222 | return rotation * positions 223 | 224 | -------------------------------------------------------------------------------- /utils/AnimationStructure.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import scipy.sparse as sparse 3 | import utils.Animation as Animation 4 | 5 | 6 | """ Maya Functions """ 7 | 8 | def load_from_maya(root): 9 | """ 10 | Load joint parents and names from maya 11 | 12 | Parameters 13 | ---------- 14 | 15 | root : PyNode 16 | Root Maya Node 17 | 18 | Returns 19 | ------- 20 | 21 | (names, parents) : ([str], (J) ndarray) 22 | List of joint names and array 23 | of indices representing the parent 24 | joint for each joint J. 25 | 26 | Joint index -1 is used to represent 27 | that there is no parent joint 28 | """ 29 | 30 | import pymel.core as pm 31 | 32 | names = [] 33 | parents = [] 34 | 35 | def unload_joint(j, parents, par): 36 | 37 | id = len(names) 38 | names.append(j) 39 | parents.append(par) 40 | 41 | children = [c for c in j.getChildren() if 42 | isinstance(c, pm.nt.Transform) and 43 | not isinstance(c, pm.nt.Constraint) and 44 | not any(pm.listRelatives(c, s=True)) and 45 | (any(pm.listRelatives(c, ad=True, ap=False, type='joint')) or isinstance(c, pm.nt.Joint))] 46 | 47 | map(lambda c: unload_joint(c, parents, id), children) 48 | 49 | unload_joint(root, parents, -1) 50 | 51 | return (names, parents) 52 | 53 | 54 | """ Family Functions """ 55 | 56 | def joints(parents): 57 | """ 58 | Parameters 59 | ---------- 60 | 61 | parents : (J) ndarray 62 | parents array 63 | 64 | Returns 65 | ------- 66 | 67 | joints : (J) ndarray 68 | Array of joint indices 69 | """ 70 | return np.arange(len(parents), dtype=int) 71 | 72 | def joints_list(parents): 73 | """ 74 | Parameters 75 | ---------- 76 | 77 | parents : (J) ndarray 78 | parents array 79 | 80 | Returns 81 | ------- 82 | 83 | joints : [ndarray] 84 | List of arrays of joint idices for 85 | each joint 86 | """ 87 | return list(joints(parents)[:,np.newaxis]) 88 | 89 | def parents_list(parents): 90 | """ 91 | Parameters 92 | ---------- 93 | 94 | parents : (J) ndarray 95 | parents array 96 | 97 | Returns 98 | ------- 99 | 100 | parents : [ndarray] 101 | List of arrays of joint idices for 102 | the parents of each joint 103 | """ 104 | return list(parents[:,np.newaxis]) 105 | 106 | 107 | def children_list(parents): 108 | """ 109 | Parameters 110 | ---------- 111 | 112 | parents : (J) ndarray 113 | parents array 114 | 115 | Returns 116 | ------- 117 | 118 | children : [ndarray] 119 | List of arrays of joint indices for 120 | the 
children of each joint 121 | """ 122 | 123 | def joint_children(i): 124 | return [j for j, p in enumerate(parents) if p == i] 125 | 126 | return list(map(lambda j: np.array(joint_children(j)), joints(parents))) 127 | 128 | 129 | def descendants_list(parents): 130 | """ 131 | Parameters 132 | ---------- 133 | 134 | parents : (J) ndarray 135 | parents array 136 | 137 | Returns 138 | ------- 139 | 140 | descendants : [ndarray] 141 | List of arrays of joint idices for 142 | the descendants of each joint 143 | """ 144 | 145 | children = children_list(parents) 146 | 147 | def joint_descendants(i): 148 | return sum([joint_descendants(j) for j in children[i]], list(children[i])) 149 | 150 | return list(map(lambda j: np.array(joint_descendants(j)), joints(parents))) 151 | 152 | 153 | def ancestors_list(parents): 154 | """ 155 | Parameters 156 | ---------- 157 | 158 | parents : (J) ndarray 159 | parents array 160 | 161 | Returns 162 | ------- 163 | 164 | ancestors : [ndarray] 165 | List of arrays of joint idices for 166 | the ancestors of each joint 167 | """ 168 | 169 | decendants = descendants_list(parents) 170 | 171 | def joint_ancestors(i): 172 | return [j for j in joints(parents) if i in decendants[j]] 173 | 174 | return list(map(lambda j: np.array(joint_ancestors(j)), joints(parents))) 175 | 176 | 177 | """ Mask Functions """ 178 | 179 | def mask(parents, filter): 180 | """ 181 | Constructs a Mask for a give filter 182 | 183 | A mask is a (J, J) ndarray truth table for a given 184 | condition over J joints. For example there 185 | may be a mask specifying if a joint N is a 186 | child of another joint M. 187 | 188 | This could be constructed into a mask using 189 | `m = mask(parents, children_list)` and the condition 190 | of childhood tested using `m[N, M]`. 191 | 192 | Parameters 193 | ---------- 194 | 195 | parents : (J) ndarray 196 | parents array 197 | 198 | filter : (J) ndarray -> [ndarray] 199 | function that outputs a list of arrays 200 | of joint indices for some condition 201 | 202 | Returns 203 | ------- 204 | 205 | mask : (N, N) ndarray 206 | boolean truth table of given condition 207 | """ 208 | m = np.zeros((len(parents), len(parents))).astype(bool) 209 | jnts = joints(parents) 210 | fltr = filter(parents) 211 | for i,f in enumerate(fltr): m[i,:] = np.any(jnts[:,np.newaxis] == f[np.newaxis,:], axis=1) 212 | return m 213 | 214 | def joints_mask(parents): return np.eye(len(parents)).astype(bool) 215 | def children_mask(parents): return mask(parents, children_list) 216 | def parents_mask(parents): return mask(parents, parents_list) 217 | def descendants_mask(parents): return mask(parents, descendants_list) 218 | def ancestors_mask(parents): return mask(parents, ancestors_list) 219 | 220 | """ Search Functions """ 221 | 222 | def joint_chain_ascend(parents, start, end): 223 | chain = [] 224 | while start != end: 225 | chain.append(start) 226 | start = parents[start] 227 | chain.append(end) 228 | return np.array(chain, dtype=int) 229 | 230 | 231 | """ Constraints """ 232 | 233 | def constraints(anim, **kwargs): 234 | """ 235 | Constraint list for Animation 236 | 237 | This constraint list can be used in the 238 | VerletParticle solver to constrain 239 | a animation global joint positions. 
240 | 241 | Parameters 242 | ---------- 243 | 244 | anim : Animation 245 | Input animation 246 | 247 | masses : (F, J) ndarray 248 | Optional list of masses 249 | for joints J across frames F 250 | defaults to weighting by 251 | vertical height 252 | 253 | Returns 254 | ------- 255 | 256 | constraints : [(int, int, (F, J) ndarray, (F, J) ndarray, (F, J) ndarray)] 257 | A list of constraints in the format: 258 | (Joint1, Joint2, Masses1, Masses2, Lengths) 259 | 260 | """ 261 | 262 | masses = kwargs.pop('masses', None) 263 | 264 | children = children_list(anim.parents) 265 | constraints = [] 266 | 267 | points_offsets = Animation.offsets_global(anim) 268 | points = Animation.positions_global(anim) 269 | 270 | if masses is None: 271 | masses = 1.0 / (0.1 + np.absolute(points_offsets[:,1])) 272 | masses = masses[np.newaxis].repeat(len(anim), axis=0) 273 | 274 | for j in range(anim.shape[1]): 275 | 276 | """ Add constraints between all joints and their children """ 277 | for c0 in children[j]: 278 | 279 | dists = np.sum((points[:, c0] - points[:, j])**2.0, axis=1)**0.5 280 | constraints.append((c0, j, masses[:,c0], masses[:,j], dists)) 281 | 282 | """ Add constraints between all children of joint """ 283 | for c1 in children[j]: 284 | if c0 == c1: continue 285 | 286 | dists = np.sum((points[:, c0] - points[:, c1])**2.0, axis=1)**0.5 287 | constraints.append((c0, c1, masses[:,c0], masses[:,c1], dists)) 288 | 289 | return constraints 290 | 291 | """ Graph Functions """ 292 | 293 | def graph(anim): 294 | """ 295 | Generates a weighted adjacency matrix 296 | using local joint distances along 297 | the skeletal structure. 298 | 299 | Joints which are not connected 300 | are assigned the weight `0`. 301 | 302 | Joints which actually have zero distance 303 | between them, but are still connected, are 304 | perturbed by some minimal amount. 305 | 306 | The output of this routine can be used 307 | with the `scipy.sparse.csgraph` 308 | routines for graph analysis. 309 | 310 | Parameters 311 | ---------- 312 | 313 | anim : Animation 314 | input animation 315 | 316 | Returns 317 | ------- 318 | 319 | graph : (N, N) ndarray 320 | weight adjacency matrix using 321 | local distances along the 322 | skeletal structure from joint 323 | N to joint M. If joints are not 324 | directly connected are assigned 325 | the weight `0`. 
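    Example (an illustrative sketch of how the returned matrix is meant to be
    consumed with `scipy.sparse.csgraph`; the tiny adjacency matrix below is
    hand-made for the sake of the example, not produced by this routine):

        >>> import numpy as np
        >>> from scipy.sparse.csgraph import shortest_path
        >>> adj = np.array([[0.0, 1.0, 0.0],
        ...                 [1.0, 0.0, 0.5],
        ...                 [0.0, 0.5, 0.0]])                # chain: joint 0 - joint 1 - joint 2
        >>> float(shortest_path(adj, directed=False)[0, 2])  # distance along the skeleton
        1.5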
326 | """ 327 | 328 | graph = np.zeros(anim.shape[1], anim.shape[1]) 329 | lengths = np.sum(anim.offsets**2.0, axis=1)**0.5 + 0.001 330 | 331 | for i,p in enumerate(anim.parents): 332 | if p == -1: continue 333 | graph[i,p] = lengths[p] 334 | graph[p,i] = lengths[p] 335 | 336 | return graph 337 | 338 | 339 | def distances(anim): 340 | """ 341 | Generates a distance matrix for 342 | pairwise joint distances along 343 | the skeletal structure 344 | 345 | Parameters 346 | ---------- 347 | 348 | anim : Animation 349 | input animation 350 | 351 | Returns 352 | ------- 353 | 354 | distances : (N, N) ndarray 355 | array of pairwise distances 356 | along skeletal structure 357 | from some joint N to some 358 | joint M 359 | """ 360 | 361 | distances = np.zeros((anim.shape[1], anim.shape[1])) 362 | generated = distances.copy().astype(bool) 363 | 364 | joint_lengths = np.sum(anim.offsets**2.0, axis=1)**0.5 365 | joint_children = children_list(anim) 366 | joint_parents = parents_list(anim) 367 | 368 | def find_distance(distances, generated, prev, i, j): 369 | 370 | """ If root, identity, or already generated, return """ 371 | if j == -1: return (0.0, True) 372 | if j == i: return (0.0, True) 373 | if generated[i,j]: return (distances[i,j], True) 374 | 375 | """ Find best distances along parents and children """ 376 | par_dists = [(joint_lengths[j], find_distance(distances, generated, j, i, p)) for p in joint_parents[j] if p != prev] 377 | out_dists = [(joint_lengths[c], find_distance(distances, generated, j, i, c)) for c in joint_children[j] if c != prev] 378 | 379 | """ Check valid distance and not dead end """ 380 | par_dists = [a + d for (a, (d, f)) in par_dists if f] 381 | out_dists = [a + d for (a, (d, f)) in out_dists if f] 382 | 383 | """ All dead ends """ 384 | if (out_dists + par_dists) == []: return (0.0, False) 385 | 386 | """ Get minimum path """ 387 | dist = min(out_dists + par_dists) 388 | distances[i,j] = dist; distances[j,i] = dist 389 | generated[i,j] = True; generated[j,i] = True 390 | 391 | for i in xrange(anim.shape[1]): 392 | for j in xrange(anim.shape[1]): 393 | find_distance(distances, generated, -1, i, j) 394 | 395 | return distances 396 | 397 | def edges(parents): 398 | """ 399 | Animation structure edges 400 | 401 | Parameters 402 | ---------- 403 | 404 | parents : (J) ndarray 405 | parents array 406 | 407 | Returns 408 | ------- 409 | 410 | edges : (M, 2) ndarray 411 | array of pairs where each 412 | pair contains two indices of a joints 413 | which corrisponds to an edge in the 414 | joint structure going from parent to child. 
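    Example (illustrative; a tiny three-joint chain):

        >>> import numpy as np
        >>> parents = np.array([-1, 0, 1])   # root, child of joint 0, child of joint 1
        >>> edges(parents)
        array([[0, 1],
               [1, 2]])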
415 | """ 416 | 417 | return np.array(list(zip(parents, joints(parents)))[1:]) 418 | 419 | 420 | def incidence(parents): 421 | """ 422 | Incidence Matrix 423 | 424 | Parameters 425 | ---------- 426 | 427 | parents : (J) ndarray 428 | parents array 429 | 430 | Returns 431 | ------- 432 | 433 | incidence : (N, M) ndarray 434 | 435 | Matrix of N joint positions by 436 | M edges which each entry is either 437 | 1 or -1 and multiplication by the 438 | joint positions returns the an 439 | array of vectors along each edge 440 | of the structure 441 | """ 442 | 443 | es = edges(parents) 444 | 445 | inc = np.zeros((len(parents)-1, len(parents))).astype(np.int) 446 | for i, e in enumerate(es): 447 | inc[i,e[0]] = 1 448 | inc[i,e[1]] = -1 449 | 450 | return inc.T 451 | -------------------------------------------------------------------------------- /utils/BVH.py: -------------------------------------------------------------------------------- 1 | import re 2 | import numpy as np 3 | 4 | from utils.Animation import Animation 5 | from utils.Quaternions import Quaternions 6 | 7 | channelmap = { 8 | 'Xrotation': 'x', 9 | 'Yrotation': 'y', 10 | 'Zrotation': 'z' 11 | } 12 | 13 | channelmap_inv = { 14 | 'x': 'Xrotation', 15 | 'y': 'Yrotation', 16 | 'z': 'Zrotation', 17 | } 18 | 19 | ordermap = { 20 | 'x': 0, 21 | 'y': 1, 22 | 'z': 2, 23 | } 24 | 25 | 26 | def load(filename, start=None, end=None, order=None, world=False): 27 | """ 28 | Reads a BVH file and constructs an animation 29 | 30 | Parameters 31 | ---------- 32 | filename: str 33 | File to be opened 34 | 35 | start : int 36 | Optional Starting Frame 37 | 38 | end : int 39 | Optional Ending Frame 40 | 41 | order : str 42 | Optional Specifier for joint order. 43 | Given as string E.G 'xyz', 'zxy' 44 | 45 | world : bool 46 | If set to true euler angles are applied 47 | together in world space rather than local 48 | space 49 | 50 | Returns 51 | ------- 52 | 53 | (animation, joint_names, frametime) 54 | Tuple of loaded animation and joint names 55 | """ 56 | 57 | f = open(filename, "r") 58 | 59 | i = 0 60 | active = -1 61 | end_site = False 62 | 63 | names = [] 64 | orients = Quaternions.id(0) 65 | offsets = np.array([]).reshape((0, 3)) 66 | parents = np.array([], dtype=int) 67 | 68 | for line in f: 69 | 70 | if "HIERARCHY" in line: continue 71 | if "MOTION" in line: continue 72 | 73 | rmatch = re.match(r"ROOT (\w+)", line) 74 | if rmatch: 75 | names.append(rmatch.group(1)) 76 | offsets = np.append(offsets, np.array([[0, 0, 0]]), axis=0) 77 | orients.qs = np.append(orients.qs, np.array([[1, 0, 0, 0]]), axis=0) 78 | parents = np.append(parents, active) 79 | active = (len(parents) - 1) 80 | continue 81 | 82 | if "{" in line: continue 83 | 84 | if "}" in line: 85 | if end_site: 86 | end_site = False 87 | else: 88 | active = parents[active] 89 | continue 90 | 91 | offmatch = re.match(r"\s*OFFSET\s+([\-\d\.e]+)\s+([\-\d\.e]+)\s+([\-\d\.e]+)", line) 92 | if offmatch: 93 | if not end_site: 94 | offsets[active] = np.array([list(map(float, offmatch.groups()))]) 95 | continue 96 | 97 | chanmatch = re.match(r"\s*CHANNELS\s+(\d+)", line) 98 | if chanmatch: 99 | channels = int(chanmatch.group(1)) 100 | if order is None: 101 | channelis = 0 if channels == 3 else 3 102 | channelie = 3 if channels == 3 else 6 103 | parts = line.split()[2 + channelis:2 + channelie] 104 | if any([p not in channelmap for p in parts]): 105 | continue 106 | order = "".join([channelmap[p] for p in parts]) 107 | continue 108 | 109 | jmatch = re.match("\s*JOINT\s+(\w+)", line) 110 | if jmatch: 
111 | names.append(jmatch.group(1)) 112 | offsets = np.append(offsets, np.array([[0, 0, 0]]), axis=0) 113 | orients.qs = np.append(orients.qs, np.array([[1, 0, 0, 0]]), axis=0) 114 | parents = np.append(parents, active) 115 | active = (len(parents) - 1) 116 | continue 117 | 118 | if "End Site" in line: 119 | end_site = True 120 | continue 121 | 122 | fmatch = re.match("\s*Frames:\s+(\d+)", line) 123 | if fmatch: 124 | if start and end: 125 | fnum = (end - start) - 1 126 | else: 127 | fnum = int(fmatch.group(1)) 128 | jnum = len(parents) 129 | positions = offsets[np.newaxis].repeat(fnum, axis=0) 130 | rotations = np.zeros((fnum, len(orients), 3)) 131 | continue 132 | 133 | fmatch = re.match("\s*Frame Time:\s+([\d\.]+)", line) 134 | if fmatch: 135 | frametime = float(fmatch.group(1)) 136 | continue 137 | 138 | if (start and end) and (i < start or i >= end - 1): 139 | i += 1 140 | continue 141 | 142 | dmatch = line.strip().split(' ') 143 | if dmatch: 144 | data_block = np.array(list(map(float, dmatch))) 145 | N = len(parents) 146 | fi = i - start if start else i 147 | if channels == 3: 148 | positions[fi, 0:1] = data_block[0:3] 149 | rotations[fi, :] = data_block[3:].reshape(N, 3) 150 | elif channels == 6: 151 | data_block = data_block.reshape(N, 6) 152 | positions[fi, :] = data_block[:, 0:3] 153 | rotations[fi, :] = data_block[:, 3:6] 154 | elif channels == 9: 155 | positions[fi, 0] = data_block[0:3] 156 | data_block = data_block[3:].reshape(N - 1, 9) 157 | rotations[fi, 1:] = data_block[:, 3:6] 158 | positions[fi, 1:] += data_block[:, 0:3] * data_block[:, 6:9] 159 | else: 160 | raise Exception("Too many channels! %i" % channels) 161 | 162 | i += 1 163 | 164 | f.close() 165 | 166 | # quat_rot = Quaternions.from_euler(np.radians(rotations), order=order, world=world) # works for world =True. for world=False do not reverse order and reverse rotations 167 | reorder = [order.index('x'), order.index('y'), 168 | order.index('z')] # even if seq is not 'xyz', angles should be ordered by xyz 169 | quat_rot = Quaternions.from_euler(np.radians(rotations[:, :, reorder]), order=order, 170 | world=world) # works for world =True. for world=False do not reverse order and reverse rotati 171 | # quat_rot = Quaternions.id(rotations.shape[:2]) 172 | # quat_rot.qs = R.from_euler(seq=order[::-1], angles=rotations.reshape(-1,3)[:,reorder], degrees=True).as_quat().reshape(rotations.shape[:2]+(4,)) 173 | 174 | if False: 175 | print('sanity check:') 176 | print('rotations from file (reordered as xyz): {}'.format(rotations[:, :, reorder][1, 0])) 177 | print('rotations from file (reordered as xyz): {}'.format(np.degrees(quat_rot.euler(order='xyz'))[1, 0])) 178 | 179 | return (Animation(quat_rot, positions, orients, offsets, parents), names, frametime) 180 | 181 | 182 | def save(filename, anim, names=None, frametime=1.0 / 24.0, order='zyx', positions=False, orients=True, 183 | save_global=True): 184 | """ 185 | Saves an Animation to file as BVH 186 | 187 | Parameters 188 | ---------- 189 | filename: str 190 | File to be saved to 191 | 192 | anim : Animation 193 | Animation to save 194 | 195 | names : [str] 196 | List of joint names 197 | 198 | order : str 199 | Optional Specifier for joint order. 200 | Given as string E.G 'xyz', 'zxy' 201 | 202 | frametime : float 203 | Optional Animation Frame time 204 | 205 | positions : bool 206 | Optional specfier to save bone 207 | positions for each frame 208 | 209 | orients : bool 210 | Multiply joint orients to the rotations 211 | before saving. 
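    Example (an illustrative round-trip sketch; 'input.bvh' and 'output.bvh'
    are placeholder paths, not files shipped with the repository):

        >>> from utils import BVH
        >>> anim, names, frametime = BVH.load('input.bvh')
        >>> BVH.save('output.bvh', anim, names=names, frametime=frametime)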
212 | 213 | """ 214 | 215 | if names is None: 216 | names = ["joint_" + str(i) for i in range(len(anim.parents))] 217 | 218 | with open(filename, 'w') as f: 219 | 220 | t = "" 221 | f.write("%sHIERARCHY\n" % t) 222 | f.write("%sROOT %s\n" % (t, names[0])) 223 | f.write("%s{\n" % t) 224 | t += '\t' 225 | 226 | f.write("%sOFFSET %f %f %f\n" % (t, anim.offsets[0, 0], anim.offsets[0, 1], anim.offsets[0, 2])) 227 | f.write("%sCHANNELS 6 Xposition Yposition Zposition %s %s %s \n" % 228 | (t, channelmap_inv[order[0]], channelmap_inv[order[1]], channelmap_inv[order[2]])) 229 | 230 | for i in range(anim.shape[1]): 231 | if anim.parents[i] == 0: 232 | t = save_joint(f, anim, names, t, i, order=order, positions=positions) 233 | 234 | t = t[:-1] 235 | f.write("%s}\n" % t) 236 | 237 | f.write("MOTION\n") 238 | f.write("Frames: %i\n" % anim.shape[0]); 239 | f.write("Frame Time: %f\n" % frametime); 240 | 241 | # if orients: 242 | # rots = np.degrees((-anim.orients[np.newaxis] * anim.rotations).euler(order=order[::-1])) 243 | # else: 244 | # rots = np.degrees(anim.rotations.euler(order=order[::-1])) 245 | rots = np.degrees(anim.rotations.euler(order=order[::-1])) 246 | # rots[:,:,:] = rots[:,:,[1,0,2]] 247 | poss = anim.positions 248 | 249 | for i in range(anim.shape[0]): 250 | for j in range(anim.shape[1]): 251 | if i == 0: 252 | if j == 0: 253 | f.write("%f %f %f %f %f %f " % (0, 0, 0, 0, 0, 0)) 254 | else: 255 | f.write("%f %f %f " % (0, 0, 0)) 256 | else: 257 | if positions or j == 0: 258 | if save_global: 259 | f.write("%f %f %f %f %f %f " % ( 260 | # 0, 0, 0, 261 | poss[i, j, 0], poss[i, j, 1], poss[i, j, 2], 262 | rots[i, j, ordermap[order[0]]], rots[i, j, ordermap[order[1]]], 263 | rots[i, j, ordermap[order[2]]])) 264 | else: 265 | f.write("%f %f %f %f %f %f " % ( 266 | 0, 0, 0, rots[i, j, ordermap[order[0]]], rots[i, j, ordermap[order[1]]], 267 | rots[i, j, ordermap[order[2]]])) 268 | else: 269 | f.write("%f %f %f " % ( 270 | rots[i, j, ordermap[order[0]]], rots[i, j, ordermap[order[1]]], 271 | rots[i, j, ordermap[order[2]]])) 272 | f.write("\n") 273 | 274 | 275 | def save_joint(f, anim, names, t, i, order='zyx', positions=False): 276 | f.write("%sJOINT %s\n" % (t, names[i])) 277 | f.write("%s{\n" % t) 278 | t += '\t' 279 | 280 | f.write("%sOFFSET %f %f %f\n" % (t, anim.offsets[i, 0], anim.offsets[i, 1], anim.offsets[i, 2])) 281 | 282 | if positions: 283 | f.write("%sCHANNELS 6 Xposition Yposition Zposition %s %s %s \n" % (t, 284 | channelmap_inv[order[0]], 285 | channelmap_inv[order[1]], 286 | channelmap_inv[order[2]])) 287 | else: 288 | f.write("%sCHANNELS 3 %s %s %s\n" % (t, 289 | channelmap_inv[order[0]], channelmap_inv[order[1]], 290 | channelmap_inv[order[2]])) 291 | 292 | end_site = True 293 | 294 | for j in range(anim.shape[1]): 295 | if anim.parents[j] == i: 296 | t = save_joint(f, anim, names, t, j, order=order, positions=positions) 297 | end_site = False 298 | 299 | if end_site: 300 | f.write("%sEnd Site\n" % t) 301 | f.write("%s{\n" % t) 302 | t += '\t' 303 | f.write("%sOFFSET %f %f %f\n" % (t, 0.0, 0.0, 0.0)) 304 | t = t[:-1] 305 | f.write("%s}\n" % t) 306 | 307 | t = t[:-1] 308 | f.write("%s}\n" % t) 309 | 310 | return t 311 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- 1 | from .util import * 2 | from .visualization import * 3 | from .logger import * 4 | -------------------------------------------------------------------------------- 
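The family helpers in `utils/AnimationStructure.py` above operate on the same plain `parents` array that `utils/BVH.py` builds when loading a file (`-1` marks the root), so they can be exercised on their own without any animation data. A minimal illustrative sketch (not part of the repository; the five-joint skeleton below is made up):

```python
import numpy as np
import utils.AnimationStructure as AS

# A made-up five-joint skeleton: joint 0 is the root, joints 1 and 2 hang off
# the root, and joints 3 and 4 continue the chain below joint 2.
parents = np.array([-1, 0, 0, 2, 3])

print(AS.children_list(parents))             # per-joint children, e.g. joint 0 -> [1, 2]
print(AS.descendants_list(parents))          # every joint below each joint
print(AS.edges(parents))                     # (parent, child) pairs, root excluded
print(AS.joint_chain_ascend(parents, 4, 0))  # path from joint 4 up to the root: [4 3 2 0]
```

The same `parents` convention is used throughout the animation utilities, so the output of one helper can be fed directly into the others.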
/utils/angles_utils.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BrianG13/FLEX/79d6d0a91941b76075d504043fc80646ac1f1878/utils/angles_utils.py -------------------------------------------------------------------------------- /utils/h36m_utils.py: -------------------------------------------------------------------------------- 1 | """Utilities to deal with the cameras of human3.6m""" 2 | 3 | from __future__ import division 4 | 5 | import h5py 6 | import json 7 | import numpy as np 8 | 9 | # Human3.6m IDs for training and testing 10 | DEBUG_SUBJECTS = [1] 11 | TRAIN_SUBJECTS = [1, 5, 6, 7, 8] 12 | TEST_SUBJECTS = [9, 11] 13 | ALL_SUBJECTS = [1, 5, 6, 7, 8, 9, 11] 14 | 15 | # Joints in H3.6M -- data has 32 joints, but only 17 that move; these are the indices. 16 | H36M_NAMES = [''] * 32 17 | H36M_NAMES[0] = 'Hip' 18 | H36M_NAMES[1] = 'RHip' 19 | H36M_NAMES[2] = 'RKnee' 20 | H36M_NAMES[3] = 'RFoot' 21 | H36M_NAMES[6] = 'LHip' 22 | H36M_NAMES[7] = 'LKnee' 23 | H36M_NAMES[8] = 'LFoot' 24 | H36M_NAMES[12] = 'Spine' 25 | H36M_NAMES[13] = 'Thorax' 26 | H36M_NAMES[14] = 'Neck/Nose' 27 | H36M_NAMES[15] = 'Head' 28 | H36M_NAMES[17] = 'LShoulder' 29 | H36M_NAMES[18] = 'LElbow' 30 | H36M_NAMES[19] = 'LWrist' 31 | H36M_NAMES[25] = 'RShoulder' 32 | H36M_NAMES[26] = 'RElbow' 33 | H36M_NAMES[27] = 'RWrist' 34 | 35 | 36 | def convert_openpose(openpose_json): 37 | mapping = [8, 9, 10, 11, 12, 13, 14, [1, 8], 1, 0, [15, 16], 5, 6, 7, 2, 3, 4] 38 | h36m_locations = np.zeros((len(mapping), 2), dtype=np.float32) 39 | h36m_confidences = np.zeros(len(mapping), dtype=np.float32) 40 | if len(openpose_json['people']) > 0: 41 | openpose_pose = np.array(openpose_json['people'][0]['pose_keypoints_2d']).reshape(-1, 3) 42 | for map_index, map_item in enumerate(mapping): 43 | if isinstance(map_item, int): 44 | h36m_locations[map_index] = openpose_pose[map_item][:2] 45 | h36m_confidences[map_index] = openpose_pose[map_item][2] 46 | else: 47 | h36m_locations[map_index] = np.mean(openpose_pose[map_item], axis=0)[:2] 48 | h36m_confidences[map_index] = np.min(openpose_pose[map_item], axis=0)[2] 49 | return h36m_locations.reshape(1, -1), h36m_confidences.reshape(1, -1) 50 | 51 | 52 | def load_camera_params(hf, path): 53 | """Load h36m camera parameters 54 | 55 | Args 56 | hf: hdf5 open file with h36m cameras data 57 | path: path or key inside hf to the camera we are interested in 58 | Returns 59 | R: 3x3 Camera rotation matrix 60 | T: 3x1 Camera translation parameters 61 | f: (scalar) Camera focal length 62 | c: 2x1 Camera center 63 | k: 3x1 Camera radial distortion coefficients 64 | p: 2x1 Camera tangential distortion coefficients 65 | name: String with camera id 66 | """ 67 | 68 | R = hf[path.format('R')][:] 69 | R = R.T 70 | 71 | T = hf[path.format('T')][:] 72 | f = hf[path.format('f')][:] 73 | c = hf[path.format('c')][:] 74 | k = hf[path.format('k')][:] 75 | p = hf[path.format('p')][:] 76 | 77 | name = hf[path.format('Name')][:] 78 | name = "".join([chr(item) for item in name]) 79 | 80 | return R, T, f, c, k, p, name 81 | 82 | 83 | def load_cameras(bpath): 84 | """Loads the cameras of h36m 85 | 86 | Args 87 | bpath: path to hdf5 file with h36m camera data 88 | subjects: List of ints representing the subject IDs for which cameras are requested 89 | Returns 90 | rcams: dictionary of 4 tuples per subject ID containing its camera parameters for the 4 h36m cams 91 | """ 92 | rcams = {} 93 | 94 | with h5py.File(bpath, 'r') as hf: 95 | for s in ALL_SUBJECTS: 96 
| for c_idx in range(4): # There are 4 cameras in human3.6m 97 | R, T, f, c, k, p, name = load_camera_params(hf, 'subject%d/camera%d/{0}' % (s, c_idx + 1)) 98 | if name == '54138969': 99 | res_w, res_h = 1000, 1002 100 | elif name == '55011271': 101 | res_w, res_h = 1000, 1000 102 | elif name == '58860488': 103 | res_w, res_h = 1000, 1000 104 | elif name == '60457274': 105 | res_w, res_h = 1000, 1002 106 | rcams[(s, c_idx)] = (R, T, f, c, k, p, res_w, res_h) 107 | return rcams 108 | 109 | 110 | def define_actions(action): 111 | actions = ["Directions", "Discussion", "Eating", "Greeting", 112 | "Phoning", "Posing", "Purchases", "Sitting", 113 | "SittingDown", "Smoking", "TakingPhoto", "Waiting", 114 | "Walking", "WalkingDog", "WalkingTogether"] 115 | 116 | if action == "All" or action == "all" or action == 'ALL': 117 | return actions 118 | 119 | if not action in actions: 120 | raise (ValueError, "Unrecognized action: %s" % action) 121 | 122 | return [action] 123 | 124 | 125 | def project_2d(P, R, T, f, c, k, p, augment_depth=0, from_world=False): 126 | """ 127 | Project points from 3d to 2d using camera parameters 128 | including radial and tangential distortion 129 | 130 | Args 131 | P: Nx3 points in world coordinates 132 | R: 3x3 Camera rotation matrix 133 | T: 3x1 Camera translation parameters 134 | f: (scalar) Camera focal length 135 | c: 2x1 Cama center 136 | k: 3x1 Cameraer radial distortion coefficients 137 | p: 2x1 Camera tangential distortion coefficients 138 | Returns 139 | Proj: Nx2 points in pixel space 140 | D: 1xN depth of each point in camera space 141 | radial: 1xN radial distortion per point 142 | tan: 1xN tangential distortion per point 143 | r2: 1xN squared radius of the projected points before distortion 144 | """ 145 | 146 | # P is a matrix of 3-dimensional points 147 | assert len(P.shape) == 2 148 | assert P.shape[1] == 3 149 | 150 | N = P.shape[0] 151 | if from_world: 152 | X = R.dot(P.T - T) # rotate and translate 153 | else: 154 | X = P.T 155 | XX = X[:2, :] / (X[2, :] + augment_depth) 156 | r2 = XX[0, :] ** 2 + XX[1, :] ** 2 157 | 158 | radial = 1 + np.einsum('ij,ij->j', np.tile(k, (1, N)), np.array([r2, r2 ** 2, r2 ** 3])) 159 | tan = p[0] * XX[1, :] + p[1] * XX[0, :] 160 | 161 | XXX = XX * np.tile(radial + tan, (2, 1)) + np.outer(np.array([p[1], p[0]]).reshape(-1), r2) 162 | 163 | Proj = (f * XXX) + c 164 | Proj = Proj.T 165 | 166 | D = X[2,] 167 | 168 | return Proj, D, radial, tan, r2, XXX.T 169 | 170 | 171 | def postprocess_3d(poses): 172 | """ 173 | Center 3d points around root 174 | 175 | Args 176 | poses_set: dictionary with 3d data 177 | Returns 178 | poses_set: dictionary with 3d data centred around root (center hip) joint 179 | root_positions: dictionary with the original 3d position of each pose 180 | """ 181 | root_positions = poses[:, :3] 182 | poses = poses - np.tile(poses[:, :3], [1, len(H36M_NAMES)]) 183 | return poses, root_positions 184 | 185 | 186 | def world_to_camera_frame(P, R, T): 187 | """ 188 | Convert points from world to camera coordinates 189 | 190 | Args 191 | P: Nx3 3d points in world coordinates 192 | R: 3x3 Camera rotation matrix 193 | T: 3x1 Camera translation parameters 194 | Returns 195 | X_cam: Nx3 3d points in camera coordinates 196 | """ 197 | 198 | assert len(P.shape) == 2 199 | assert P.shape[1] == 3 200 | 201 | X_cam = R.dot(P.T - T) # rotate and translate 202 | 203 | return X_cam.T 204 | 205 | 206 | def camera_to_world_frame(P, R, T): 207 | """ Inverse of world_to_camera_frame 208 | 209 | Args 210 | P: Nx3 points in camera 
coordinates 211 | R: 3x3 Camera rotation matrix 212 | T: 3x1 Camera translation parameters 213 | Returns 214 | X_cam: Nx3 points in world coordinates 215 | """ 216 | 217 | assert len(P.shape) == 2 218 | assert P.shape[1] == 3 219 | 220 | X_cam = R.T.dot(P.T) # rotate 221 | if T is not None: 222 | X_cam += T # and translate 223 | return X_cam.T 224 | 225 | 226 | def cam2world_centered(data_3d_camframe, R, T): 227 | data_3d_worldframe = camera_to_world_frame(data_3d_camframe.reshape((-1, 3)), R, T) 228 | data_3d_worldframe = data_3d_worldframe.reshape((-1, data_3d_camframe.shape[-1])) 229 | # subtract root translation 230 | return data_3d_worldframe - np.tile(data_3d_worldframe[:, :3], (1, int(data_3d_camframe.shape[-1]/3))) 231 | 232 | 233 | def dimension_reducer(dimension, predict_number): 234 | if not dimension in [1, 2, 3]: 235 | raise (ValueError, 'dim must be 2 or 3') 236 | if dimension == 2: 237 | if predict_number == 15: 238 | dimensions_to_use = np.where(np.array([x != '' and x != 'Spine' and x != 'Neck/Nose' for x in H36M_NAMES]))[0] 239 | else: 240 | dimensions_to_use = np.where(np.array([x != '' for x in H36M_NAMES]))[0] 241 | dimensions_to_use = np.sort(np.hstack((dimensions_to_use * 2, dimensions_to_use * 2 + 1))) 242 | else: 243 | if predict_number == 15: 244 | dimensions_to_use = np.where(np.array([x != '' and x != 'Spine' and x != 'Neck/Nose' for x in H36M_NAMES]))[0] 245 | else: 246 | dimensions_to_use = np.where(np.array([x != '' for x in H36M_NAMES]))[0] 247 | dimensions_to_use = np.sort(np.hstack((dimensions_to_use * 3, 248 | dimensions_to_use * 3 + 1, 249 | dimensions_to_use * 3 + 2))) 250 | return dimensions_to_use 251 | 252 | 253 | def transform_world_to_camera(poses_set, cams, ncams=4): 254 | """ 255 | Project 3d poses from world coordinate to camera coordinate system 256 | Args 257 | poses_set: dictionary with 3d poses 258 | cams: dictionary with cameras 259 | ncams: number of cameras per subject 260 | Return: 261 | t3d_camera: dictionary with 3d poses in camera coordinate 262 | """ 263 | t3d_camera = {} 264 | for t3dk in sorted(poses_set.keys()): 265 | 266 | subj, action, seqname = t3dk 267 | t3d_world = poses_set[t3dk] 268 | 269 | for c in range(ncams): 270 | R, T, f, c, k, p, name = cams[(subj, c + 1)] 271 | camera_coord = world_to_camera_frame(np.reshape(t3d_world, [-1, 3]), R, T) 272 | camera_coord = np.reshape(camera_coord, [-1, len(H36M_NAMES) * 3]) 273 | 274 | sname = seqname[:-3] + "." + name + ".h5" # e.g.: Waiting 1.58860488.h5 275 | t3d_camera[(subj, action, sname)] = camera_coord 276 | 277 | return t3d_camera 278 | 279 | 280 | def project_to_cameras(poses_set, cams, ncams=4): 281 | """ 282 | Project 3d poses using camera parameters 283 | 284 | Args 285 | poses_set: dictionary with 3d poses 286 | cams: dictionary with camera parameters 287 | ncams: number of cameras per subject 288 | Returns 289 | t2d: dictionary with 2d poses 290 | """ 291 | t2d = {} 292 | 293 | for t3dk in sorted(poses_set.keys()): 294 | subj, a, seqname = t3dk 295 | t3d = poses_set[t3dk] 296 | 297 | for cam in range(ncams): 298 | R, T, f, c, k, p, name = cams[(subj, cam + 1)] 299 | pts2d, _, _, _, _ = project_2d(np.reshape(t3d, [-1, 3]), R, T, f, c, k, p, from_world=True) 300 | 301 | pts2d = np.reshape(pts2d, [-1, len(H36M_NAMES) * 2]) 302 | sname = seqname[:-3] + "." 
+ name + ".h5" # e.g.: Waiting 1.58860488.h5 303 | t2d[(subj, a, sname)] = pts2d 304 | 305 | return t2d 306 | -------------------------------------------------------------------------------- /utils/learnable_utils.py: -------------------------------------------------------------------------------- 1 | ACTION_NAME_MAPPING = { 2 | 'S1': {'Directions-1': 'Directions 1', 'Directions-2': 'Directions', 'Discussion-1': 'Discussion 1', 3 | 'Discussion-2': 'Discussion', 'Eating-1': 'Eating 2', 'Eating-2': 'Eating', 'Greeting-1': 'Greeting 1', 4 | 'Greeting-2': 'Greeting', 'Phoning-1': 'Phoning 1', 'Phoning-2': 'Phoning', 'Posing-1': 'Posing 1', 5 | 'Posing-2': 'Posing', 'Purchases-1': 'Purchases 1', 'Purchases-2': 'Purchases', 'Sitting-1': 'Sitting 1', 6 | 'Sitting-2': 'Sitting 2', 'SittingDown-1': 'SittingDown 2', 'SittingDown-2': 'SittingDown', 7 | 'Smoking-1': 'Smoking 1', 'Smoking-2': 'Smoking', 'TakingPhoto-1': 'Photo 1', 'TakingPhoto-2': 'Photo', 8 | 'Waiting-1': 'Waiting 1', 'Waiting-2': 'Waiting', 'Walking-1': 'Walking 1', 'Walking-2': 'Walking', 9 | 'WalkingDog-1': 'WalkDog 1', 'WalkingDog-2': 'WalkDog', 'WalkingTogether-1': 'WalkTogether 1', 10 | 'WalkingTogether-2': 'WalkTogether'}, 11 | 'S5': {'Directions-1': 'Directions 1', 'Directions-2': 'Directions 2', 'Discussion-1': 'Discussion 2', 12 | 'Discussion-2': 'Discussion 3', 'Eating-1': 'Eating 1', 'Eating-2': 'Eating', 'Greeting-1': 'Greeting 1', 13 | 'Greeting-2': 'Greeting 2', 'Phoning-1': 'Phoning 1', 'Phoning-2': 'Phoning', 'Posing-1': 'Posing 1', 14 | 'Posing-2': 'Posing', 'Purchases-1': 'Purchases 1', 'Purchases-2': 'Purchases', 'Sitting-1': 'Sitting 1', 15 | 'Sitting-2': 'Sitting', 'SittingDown-1': 'SittingDown', 'SittingDown-2': 'SittingDown 1', 16 | 'Smoking-1': 'Smoking 1', 'Smoking-2': 'Smoking', 'TakingPhoto-1': 'Photo', 'TakingPhoto-2': 'Photo 2', 17 | 'Waiting-1': 'Waiting 1', 'Waiting-2': 'Waiting 2', 'Walking-1': 'Walking 1', 'Walking-2': 'Walking', 18 | 'WalkingDog-1': 'WalkDog 1', 'WalkingDog-2': 'WalkDog', 'WalkingTogether-1': 'WalkTogether 1', 19 | 'WalkingTogether-2': 'WalkTogether'}, 20 | 'S6': {'Directions-1': 'Directions 1', 'Directions-2': 'Directions', 'Discussion-1': 'Discussion 1', 21 | 'Discussion-2': 'Discussion', 'Eating-1': 'Eating 1', 'Eating-2': 'Eating 2', 'Greeting-1': 'Greeting 1', 22 | 'Greeting-2': 'Greeting', 'Phoning-1': 'Phoning 1', 'Phoning-2': 'Phoning', 'Posing-1': 'Posing 2', 23 | 'Posing-2': 'Posing', 'Purchases-1': 'Purchases 1', 'Purchases-2': 'Purchases', 'Sitting-1': 'Sitting 1', 24 | 'Sitting-2': 'Sitting 2', 'SittingDown-1': 'SittingDown 1', 'SittingDown-2': 'SittingDown', 25 | 'Smoking-1': 'Smoking 1', 'Smoking-2': 'Smoking', 'TakingPhoto-1': 'Photo', 'TakingPhoto-2': 'Photo 1', 26 | 'Waiting-1': 'Waiting 3', 'Waiting-2': 'Waiting', 'Walking-1': 'Walking 1', 'Walking-2': 'Walking', 27 | 'WalkingDog-1': 'WalkDog 1', 'WalkingDog-2': 'WalkDog', 'WalkingTogether-1': 'WalkTogether 1', 28 | 'WalkingTogether-2': 'WalkTogether'}, 29 | 'S7': {'Directions-1': 'Directions 1', 'Directions-2': 'Directions', 'Discussion-1': 'Discussion 1', 30 | 'Discussion-2': 'Discussion', 'Eating-1': 'Eating 1', 'Eating-2': 'Eating', 'Greeting-1': 'Greeting 1', 31 | 'Greeting-2': 'Greeting', 'Phoning-1': 'Phoning 2', 'Phoning-2': 'Phoning', 'Posing-1': 'Posing 1', 32 | 'Posing-2': 'Posing', 'Purchases-1': 'Purchases 1', 'Purchases-2': 'Purchases', 'Sitting-1': 'Sitting 1', 33 | 'Sitting-2': 'Sitting', 'SittingDown-1': 'SittingDown', 'SittingDown-2': 'SittingDown 1', 34 | 'Smoking-1': 'Smoking 1', 'Smoking-2': 
'Smoking', 'TakingPhoto-1': 'Photo', 'TakingPhoto-2': 'Photo 1', 35 | 'Waiting-1': 'Waiting 1', 'Waiting-2': 'Waiting 2', 'Walking-1': 'Walking 1', 'Walking-2': 'Walking 2', 36 | 'WalkingDog-1': 'WalkDog 1', 'WalkingDog-2': 'WalkDog', 'WalkingTogether-1': 'WalkTogether 1', 37 | 'WalkingTogether-2': 'WalkTogether'}, 38 | 'S8': {'Directions-1': 'Directions 1', 'Directions-2': 'Directions', 'Discussion-1': 'Discussion 1', 39 | 'Discussion-2': 'Discussion', 'Eating-1': 'Eating 1', 'Eating-2': 'Eating', 'Greeting-1': 'Greeting 1', 40 | 'Greeting-2': 'Greeting', 'Phoning-1': 'Phoning 1', 'Phoning-2': 'Phoning', 'Posing-1': 'Posing 1', 41 | 'Posing-2': 'Posing', 'Purchases-1': 'Purchases 1', 'Purchases-2': 'Purchases', 'Sitting-1': 'Sitting 1', 42 | 'Sitting-2': 'Sitting', 'SittingDown-1': 'SittingDown', 'SittingDown-2': 'SittingDown 1', 43 | 'Smoking-1': 'Smoking 1', 'Smoking-2': 'Smoking', 'TakingPhoto-1': 'Photo 1', 'TakingPhoto-2': 'Photo', 44 | 'Waiting-1': 'Waiting 1', 'Waiting-2': 'Waiting', 'Walking-1': 'Walking 1', 'Walking-2': 'Walking', 45 | 'WalkingDog-1': 'WalkDog 1', 'WalkingDog-2': 'WalkDog', 'WalkingTogether-1': 'WalkTogether 1', 46 | 'WalkingTogether-2': 'WalkTogether 2'}, 47 | 'S9': {'Directions-1': 'Directions 1', 'Directions-2': 'Directions', 'Discussion-1': 'Discussion 1', 48 | 'Discussion-2': 'Discussion 2', 'Eating-1': 'Eating 1', 'Eating-2': 'Eating', 'Greeting-1': 'Greeting 1', 49 | 'Greeting-2': 'Greeting', 'Phoning-1': 'Phoning 1', 'Phoning-2': 'Phoning', 'Posing-1': 'Posing 1', 50 | 'Posing-2': 'Posing', 'Purchases-1': 'Purchases 1', 'Purchases-2': 'Purchases', 'Sitting-1': 'Sitting 1', 51 | 'Sitting-2': 'Sitting', 'SittingDown-1': 'SittingDown', 'SittingDown-2': 'SittingDown 1', 52 | 'Smoking-1': 'Smoking 1', 'Smoking-2': 'Smoking', 'TakingPhoto-1': 'Photo 1', 'TakingPhoto-2': 'Photo', 53 | 'Waiting-1': 'Waiting 1', 'Waiting-2': 'Waiting', 'Walking-1': 'Walking 1', 'Walking-2': 'Walking', 54 | 'WalkingDog-1': 'WalkDog 1', 'WalkingDog-2': 'WalkDog', 'WalkingTogether-1': 'WalkTogether 1', 55 | 'WalkingTogether-2': 'WalkTogether'}, 56 | 'S11': {'Directions-1': 'Directions 1', 'Discussion-1': 'Discussion 1', 'Discussion-2': 'Discussion 2', 57 | 'Eating-1': 'Eating 1', 'Eating-2': 'Eating', 'Greeting-1': 'Greeting 2', 'Greeting-2': 'Greeting', 58 | 'Phoning-1': 'Phoning 3', 'Phoning-2': 'Phoning 2', 'Posing-1': 'Posing 1', 'Posing-2': 'Posing', 59 | 'Purchases-1': 'Purchases 1', 'Purchases-2': 'Purchases', 'Sitting-1': 'Sitting 1', 'Sitting-2': 'Sitting', 60 | 'SittingDown-1': 'SittingDown', 'SittingDown-2': 'SittingDown 1', 'Smoking-1': 'Smoking 2', 61 | 'Smoking-2': 'Smoking', 'TakingPhoto-1': 'Photo 1', 'TakingPhoto-2': 'Photo', 'Waiting-1': 'Waiting 1', 62 | 'Waiting-2': 'Waiting', 'Walking-1': 'Walking 1', 'Walking-2': 'Walking', 'WalkingDog-1': 'WalkDog 1', 63 | 'WalkingDog-2': 'WalkDog', 'WalkingTogether-1': 'WalkTogether 1', 'WalkingTogether-2': 'WalkTogether'} 64 | } 65 | -------------------------------------------------------------------------------- /utils/logger.py: -------------------------------------------------------------------------------- 1 | import json 2 | import logging 3 | 4 | class Logger: 5 | """ 6 | Training process logger 7 | 8 | Note: 9 | Used by BaseTrainer to save training history. 
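        Example (illustrative; 'train.log' is a placeholder path):

            >>> logger = Logger('train.log')
            >>> logger.info('epoch 1 finished')
            >>> logger.warning('validation loss increased')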
10 | """ 11 | def __init__(self, filename): 12 | self.logger = logging.getLogger(self.__class__.__name__) 13 | formatter = logging.Formatter('%(asctime)s %(message)s', datefmt='%d %b %Y %H:%M:%S') 14 | 15 | file_handler = logging.FileHandler(filename, mode='w') 16 | file_handler.setLevel(logging.INFO) 17 | file_handler.setFormatter(formatter) 18 | 19 | stream_handler = logging.StreamHandler() 20 | stream_handler.setLevel(logging.INFO) 21 | stream_handler.setFormatter(formatter) 22 | 23 | self.logger.addHandler(file_handler) 24 | # self.logger.addHandler(stream_handler) 25 | logging.entries = {} 26 | 27 | def add_entry(self, entry): 28 | self.entries[len(self.entries) + 1] = entry 29 | 30 | def info(self, strs): 31 | return self.logger.info(strs) 32 | 33 | def warning(self, strs): 34 | return self.logger.warning(strs) 35 | 36 | def __str__(self): 37 | return json.dumps(self.entries, sort_keys=True, indent=4) 38 | -------------------------------------------------------------------------------- /utils/motion_utils.py: -------------------------------------------------------------------------------- 1 | """Functions that help with data processing for human3.6m""" 2 | 3 | from __future__ import absolute_import 4 | from __future__ import division 5 | from __future__ import print_function 6 | 7 | import numpy as np 8 | from six.moves import xrange # pylint: disable=redefined-builtin 9 | import copy 10 | 11 | def rotmat2quat(R): 12 | """ 13 | Converts a rotation matrix to a quaternion 14 | Matlab port to python for evaluation purposes 15 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/rotmat2quat.m#L4 16 | Args 17 | R: 3x3 rotation matrix 18 | Returns 19 | q: 1x4 quaternion 20 | """ 21 | rotdiff = R - R.T; 22 | 23 | r = np.zeros(3) 24 | r[0] = -rotdiff[1,2] 25 | r[1] = rotdiff[0,2] 26 | r[2] = -rotdiff[0,1] 27 | sintheta = np.linalg.norm(r) / 2; 28 | r0 = np.divide(r, np.linalg.norm(r) + np.finfo(np.float32).eps ); 29 | 30 | costheta = (np.trace(R)-1) / 2; 31 | 32 | theta = np.arctan2( sintheta, costheta ); 33 | 34 | q = np.zeros(4) 35 | q[0] = np.cos(theta/2) 36 | q[1:] = r0*np.sin(theta/2) 37 | return q 38 | 39 | def rotmat2expmap(R): 40 | return quat2expmap( rotmat2quat(R) ); 41 | 42 | def expmap2rotmat(r): 43 | """ 44 | Converts an exponential map angle to a rotation matrix 45 | Matlab port to python for evaluation purposes 46 | I believe this is also called Rodrigues' formula 47 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/expmap2rotmat.m 48 | Args 49 | r: 1x3 exponential map 50 | Returns 51 | R: 3x3 rotation matrix 52 | """ 53 | theta = np.linalg.norm( r ) 54 | r0 = np.divide( r, theta + np.finfo(np.float32).eps ) 55 | r0x = np.array([0, -r0[2], r0[1], 0, 0, -r0[0], 0, 0, 0]).reshape(3,3) 56 | r0x = r0x - r0x.T 57 | R = np.eye(3,3) + np.sin(theta)*r0x + (1-np.cos(theta))*(r0x).dot(r0x); 58 | return R 59 | 60 | def rotmat2quat(R): 61 | """ 62 | Converts a rotation matrix to a quaternion 63 | Matlab port to python for evaluation purposes 64 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/rotmat2quat.m#L4 65 | Args 66 | R: 3x3 rotation matrix 67 | Returns 68 | q: 1x4 quaternion 69 | """ 70 | rotdiff = R - R.T; 71 | 72 | r = np.zeros(3) 73 | r[0] = -rotdiff[1,2] 74 | r[1] = rotdiff[0,2] 75 | r[2] = -rotdiff[0,1] 76 | sintheta = np.linalg.norm(r) / 2; 77 | r0 = np.divide(r, np.linalg.norm(r) + np.finfo(np.float32).eps ); 78 | 79 | costheta = 
(np.trace(R)-1) / 2; 80 | 81 | theta = np.arctan2( sintheta, costheta ); 82 | 83 | q = np.zeros(4) 84 | q[0] = np.cos(theta/2) 85 | q[1:] = r0*np.sin(theta/2) 86 | return q 87 | 88 | def rotmat2expmap(R): 89 | return quat2expmap( rotmat2quat(R) ); 90 | 91 | def expmap2rotmat(r): 92 | """ 93 | Converts an exponential map angle to a rotation matrix 94 | Matlab port to python for evaluation purposes 95 | I believe this is also called Rodrigues' formula 96 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/expmap2rotmat.m 97 | Args 98 | r: 1x3 exponential map 99 | Returns 100 | R: 3x3 rotation matrix 101 | """ 102 | theta = np.linalg.norm( r ) 103 | r0 = np.divide( r, theta + np.finfo(np.float32).eps ) 104 | r0x = np.array([0, -r0[2], r0[1], 0, 0, -r0[0], 0, 0, 0]).reshape(3,3) 105 | r0x = r0x - r0x.T 106 | R = np.eye(3,3) + np.sin(theta)*r0x + (1-np.cos(theta))*(r0x).dot(r0x); 107 | return R -------------------------------------------------------------------------------- /utils/myBVH.py: -------------------------------------------------------------------------------- 1 | import re 2 | import numpy as np 3 | 4 | # from utils.Animation import Animation 5 | from utils.myAnimation import Animation 6 | # from Quaternions import Quaternions 7 | from scipy.spatial.transform import Rotation as R 8 | 9 | 10 | channelmap = { 11 | 'Xrotation' : 'x', 12 | 'Yrotation' : 'y', 13 | 'Zrotation' : 'z' 14 | } 15 | 16 | channelmap_inv = { 17 | 'x': 'Xrotation', 18 | 'y': 'Yrotation', 19 | 'z': 'Zrotation', 20 | } 21 | 22 | ordermap = { 23 | 'x' : 0, 24 | 'y' : 1, 25 | 'z' : 2, 26 | } 27 | 28 | def concat_rotations(rot1, rot2): 29 | q1 = rot1.as_quat() 30 | q2 = rot2.as_quat() 31 | rot = R.from_quat(np.append(q1, q2, axis=0)) 32 | return rot 33 | 34 | def load(filename, start=None, end=None, order=None, world=False): 35 | """ 36 | Reads a BVH file and constructs an animation 37 | 38 | Parameters 39 | ---------- 40 | filename: str 41 | File to be opened 42 | 43 | start : int 44 | Optional Starting Frame 45 | 46 | end : int 47 | Optional Ending Frame 48 | 49 | order : str 50 | Optional Specifier for joint order. 
51 | Given as string E.G 'xyz', 'zxy' 52 | 53 | world : bool 54 | If set to true euler angles are applied 55 | together in world space rather than local 56 | space 57 | 58 | Returns 59 | ------- 60 | 61 | (animation, joint_names, frametime) 62 | Tuple of loaded animation and joint names 63 | """ 64 | 65 | f = open(filename, "r") 66 | 67 | i = 0 68 | active = -1 69 | end_site = False 70 | 71 | names = [] 72 | orients = R.identity(0) 73 | offsets = np.array([]).reshape((0,3)) 74 | parents = np.array([], dtype=int) 75 | 76 | for line in f: 77 | 78 | if "HIERARCHY" in line: continue 79 | if "MOTION" in line: continue 80 | 81 | rmatch = re.match(r"ROOT (\w+)", line) 82 | if rmatch: 83 | names.append(rmatch.group(1)) 84 | offsets = np.append(offsets, np.array([[0,0,0]]), axis=0) 85 | # orients.qs = np.append(orients.qs, np.array([[1,0,0,0]]), axis=0) 86 | orients = concat_rotations(orients, R.identity(1)) 87 | parents = np.append(parents, active) 88 | active = (len(parents)-1) 89 | continue 90 | 91 | if "{" in line: continue 92 | 93 | if "}" in line: 94 | if end_site: end_site = False 95 | else: active = parents[active] 96 | continue 97 | 98 | offmatch = re.match(r"\s*OFFSET\s+([\-\d\.e]+)\s+([\-\d\.e]+)\s+([\-\d\.e]+)", line) 99 | if offmatch: 100 | if not end_site: 101 | offsets[active] = np.array([list(map(float, offmatch.groups()))]) 102 | continue 103 | 104 | chanmatch = re.match(r"\s*CHANNELS\s+(\d+)", line) 105 | if chanmatch: 106 | channels = int(chanmatch.group(1)) 107 | if order is None: 108 | channelis = 0 if channels == 3 else 3 109 | channelie = 3 if channels == 3 else 6 110 | parts = line.split()[2+channelis:2+channelie] 111 | if any([p not in channelmap for p in parts]): 112 | continue 113 | order = "".join([channelmap[p] for p in parts]) 114 | continue 115 | 116 | jmatch = re.match("\s*JOINT\s+(\w+)", line) 117 | if jmatch: 118 | names.append(jmatch.group(1)) 119 | offsets = np.append(offsets, np.array([[0,0,0]]), axis=0) 120 | orients = concat_rotations(orients, R.identity(1)) 121 | parents = np.append(parents, active) 122 | active = (len(parents)-1) 123 | continue 124 | 125 | if "End Site" in line: 126 | end_site = True 127 | continue 128 | 129 | fmatch = re.match("\s*Frames:\s+(\d+)", line) 130 | if fmatch: 131 | if start and end: 132 | fnum = (end - start)-1 133 | else: 134 | fnum = int(fmatch.group(1)) 135 | jnum = len(parents) 136 | positions = offsets[np.newaxis].repeat(fnum, axis=0) 137 | rotations = np.zeros((fnum, len(orients), 3)) 138 | # rotations = R.identity(fnum * len(orients)) 139 | continue 140 | 141 | fmatch = re.match("\s*Frame Time:\s+([\d\.]+)", line) 142 | if fmatch: 143 | frametime = float(fmatch.group(1)) 144 | continue 145 | 146 | if (start and end) and (i < start or i >= end-1): 147 | i += 1 148 | continue 149 | 150 | dmatch = line.strip().split(' ') 151 | if dmatch: 152 | data_block = np.array(list(map(float, dmatch))) 153 | N = len(parents) 154 | fi = i - start if start else i 155 | if channels == 3: 156 | positions[fi,0:1] = data_block[0:3] 157 | rotations[fi, : ] = data_block[3: ].reshape(N,3) 158 | elif channels == 6: 159 | data_block = data_block.reshape(N,6) 160 | positions[fi,:] = data_block[:,0:3] 161 | rotations[fi,:] = data_block[:,3:6] 162 | elif channels == 9: 163 | positions[fi,0] = data_block[0:3] 164 | data_block = data_block[3:].reshape(N-1,9) 165 | rotations[fi,1:] = data_block[:,3:6] 166 | positions[fi,1:] += data_block[:,0:3] * data_block[:,6:9] 167 | else: 168 | raise Exception("Too many channels! 
%i" % channels) 169 | 170 | i += 1 171 | 172 | f.close() 173 | 174 | # rotations = Quaternions.from_euler(np.radians(rotations), order=order, world=world) 175 | reorder = [order.index('x'), order.index('y'), order.index('z')] # even if seq is not 'xyz', angles should be ordered by xyz 176 | # quat_rot = Quaternions.from_euler(np.radians(rotations[:,:,reorder]), order=order[::-1], world=world) 177 | # quat_rot = Quaternions.id(rotations.shape[:2]) 178 | # quat_rot.qs = R.from_euler(seq=order[::-1], angles=rotations.reshape(-1,3)[:,reorder], degrees=True).as_quat().reshape(rotations.shape[:2]+(4,)) 179 | rotations_linear = rotations.reshape(-1,3) 180 | for i in np.arange(rotations_linear.shape[0]): 181 | if np.isnan(rotations_linear[i]).any(): 182 | rotations_linear[i] = [0,0,0] 183 | rot_rot = R.from_euler(seq=order[::-1], angles=rotations_linear[:,reorder], degrees=True) 184 | 185 | return (Animation(rot_rot, positions, orients, offsets, parents), names, frametime) 186 | 187 | 188 | 189 | def save(filename, anim, names=None, frametime=1.0/24.0, order='zyx', positions=False, orients=True): 190 | """ 191 | Saves an Animation to file as BVH 192 | 193 | Parameters 194 | ---------- 195 | filename: str 196 | File to be saved to 197 | 198 | anim : Animation 199 | Animation to save 200 | 201 | names : [str] 202 | List of joint names 203 | 204 | order : str 205 | Optional Specifier for joint order. 206 | Given as string E.G 'xyz', 'zxy' 207 | 208 | frametime : float 209 | Optional Animation Frame time 210 | 211 | positions : bool 212 | Optional specfier to save bone 213 | positions for each frame 214 | 215 | orients : bool 216 | Multiply joint orients to the rotations 217 | before saving. 218 | 219 | """ 220 | 221 | if names is None: 222 | names = ["joint_" + str(i) for i in range(len(anim.parents))] 223 | 224 | with open(filename, 'w') as f: 225 | 226 | t = "" 227 | f.write("%sHIERARCHY\n" % t) 228 | f.write("%sROOT %s\n" % (t, names[0])) 229 | f.write("%s{\n" % t) 230 | t += '\t' 231 | 232 | f.write("%sOFFSET %f %f %f\n" % (t, anim.offsets[0,0], anim.offsets[0,1], anim.offsets[0,2]) ) 233 | f.write("%sCHANNELS 6 Xposition Yposition Zposition %s %s %s \n" % 234 | (t, channelmap_inv[order[0]], channelmap_inv[order[1]], channelmap_inv[order[2]])) 235 | 236 | for i in range(anim.shape[1]): 237 | if anim.parents[i] == 0: 238 | t = save_joint(f, anim, names, t, i, order=order, positions=positions) 239 | 240 | t = t[:-1] 241 | f.write("%s}\n" % t) 242 | 243 | f.write("MOTION\n") 244 | f.write("Frames: %i\n" % anim.shape[0]); 245 | f.write("Frame Time: %f\n" % frametime); 246 | 247 | #if orients: 248 | # rots = np.degrees((-anim.orients[np.newaxis] * anim.rotations).euler(order=order[::-1])) 249 | #else: 250 | # rots = np.degrees(anim.rotations.euler(order=order[::-1])) 251 | rots = anim.rotations.as_euler(seq=order[::-1], degrees=True).reshape((-1, anim.positions.shape[1], 3)) 252 | poss = anim.positions 253 | 254 | for i in range(anim.shape[0]): 255 | for j in range(anim.shape[1]): 256 | 257 | if positions or j == 0: 258 | 259 | f.write("%f %f %f %f %f %f " % ( 260 | poss[i,j,0], poss[i,j,1], poss[i,j,2], 261 | rots[i,j,ordermap[order[0]]], rots[i,j,ordermap[order[1]]], rots[i,j,ordermap[order[2]]])) 262 | 263 | else: 264 | 265 | f.write("%f %f %f " % ( 266 | rots[i,j,ordermap[order[0]]], rots[i,j,ordermap[order[1]]], rots[i,j,ordermap[order[2]]])) 267 | 268 | f.write("\n") 269 | 270 | 271 | def save_joint(f, anim, names, t, i, order='zyx', positions=False): 272 | 273 | f.write("%sJOINT %s\n" % (t, 
names[i])) 274 | f.write("%s{\n" % t) 275 | t += '\t' 276 | 277 | f.write("%sOFFSET %f %f %f\n" % (t, anim.offsets[i,0], anim.offsets[i,1], anim.offsets[i,2])) 278 | 279 | if positions: 280 | f.write("%sCHANNELS 6 Xposition Yposition Zposition %s %s %s \n" % (t, 281 | channelmap_inv[order[0]], channelmap_inv[order[1]], channelmap_inv[order[2]])) 282 | else: 283 | f.write("%sCHANNELS 3 %s %s %s\n" % (t, 284 | channelmap_inv[order[0]], channelmap_inv[order[1]], channelmap_inv[order[2]])) 285 | 286 | end_site = True 287 | 288 | for j in range(anim.shape[1]): 289 | if anim.parents[j] == i: 290 | t = save_joint(f, anim, names, t, j, order=order, positions=positions) 291 | end_site = False 292 | 293 | if end_site: 294 | f.write("%sEnd Site\n" % t) 295 | f.write("%s{\n" % t) 296 | t += '\t' 297 | f.write("%sOFFSET %f %f %f\n" % (t, 0.0, 0.0, 0.0)) 298 | t = t[:-1] 299 | f.write("%s}\n" % t) 300 | 301 | t = t[:-1] 302 | f.write("%s}\n" % t) 303 | 304 | return t -------------------------------------------------------------------------------- /utils/quaternion.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2018-present, Facebook, Inc. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | # 7 | 8 | import torch 9 | import numpy as np 10 | 11 | # PyTorch-backed implementations 12 | 13 | def qmul(q, r): 14 | """ 15 | Multiply quaternion(s) q with quaternion(s) r. 16 | Expects two equally-sized tensors of shape (*, 4), where * denotes any number of dimensions. 17 | Returns q*r as a tensor of shape (*, 4). 18 | """ 19 | assert q.shape[-1] == 4 20 | assert r.shape[-1] == 4 21 | 22 | original_shape = q.shape 23 | 24 | # Compute outer product 25 | r =r.double() 26 | q = q.double() 27 | terms = torch.bmm(r.view(-1, 4, 1), q.view(-1, 1, 4)) 28 | 29 | w = terms[:, 0, 0] - terms[:, 1, 1] - terms[:, 2, 2] - terms[:, 3, 3] 30 | x = terms[:, 0, 1] + terms[:, 1, 0] - terms[:, 2, 3] + terms[:, 3, 2] 31 | y = terms[:, 0, 2] + terms[:, 1, 3] + terms[:, 2, 0] - terms[:, 3, 1] 32 | z = terms[:, 0, 3] - terms[:, 1, 2] + terms[:, 2, 1] + terms[:, 3, 0] 33 | return torch.stack((w, x, y, z), dim=1).view(original_shape) 34 | 35 | def qrot(q, v): 36 | """ 37 | Rotate vector(s) v about the rotation described by quaternion(s) q. 38 | Expects a tensor of shape (*, 4) for q and a tensor of shape (*, 3) for v, 39 | where * denotes any number of dimensions. 40 | Returns a tensor of shape (*, 3). 41 | """ 42 | assert q.shape[-1] == 4 43 | assert v.shape[-1] == 3 44 | assert q.shape[:-1] == v.shape[:-1] 45 | 46 | original_shape = list(v.shape) 47 | q = q.view(-1, 4) 48 | v = v.view(-1, 3) 49 | 50 | qvec = q[:, 1:] 51 | uv = torch.cross(qvec, v, dim=1) 52 | uuv = torch.cross(qvec, uv, dim=1) 53 | return (v + 2 * (q[:, :1] * uv + uuv)).view(original_shape) 54 | 55 | def qeuler(q, order, epsilon=0): 56 | """ 57 | Convert quaternion(s) q to Euler angles. 58 | Expects a tensor of shape (*, 4), where * denotes any number of dimensions. 59 | Returns a tensor of shape (*, 3). 
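    Example (illustrative; the identity quaternion, given in (w, x, y, z) order,
    maps to zero Euler angles):

        >>> import torch
        >>> q = torch.tensor([[1.0, 0.0, 0.0, 0.0]])
        >>> qeuler(q, 'xyz')
        tensor([[0., 0., 0.]])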
60 | """ 61 | assert q.shape[-1] == 4 62 | 63 | original_shape = list(q.shape) 64 | original_shape[-1] = 3 65 | q = q.view(-1, 4) 66 | 67 | q0 = q[:, 0] 68 | q1 = q[:, 1] 69 | q2 = q[:, 2] 70 | q3 = q[:, 3] 71 | 72 | if order == 'xyz': 73 | x = torch.atan2(2 * (q0 * q1 - q2 * q3), 1 - 2*(q1 * q1 + q2 * q2)) 74 | y = torch.asin(torch.clamp(2 * (q1 * q3 + q0 * q2), -1+epsilon, 1-epsilon)) 75 | z = torch.atan2(2 * (q0 * q3 - q1 * q2), 1 - 2*(q2 * q2 + q3 * q3)) 76 | elif order == 'yzx': 77 | x = torch.atan2(2 * (q0 * q1 - q2 * q3), 1 - 2*(q1 * q1 + q3 * q3)) 78 | y = torch.atan2(2 * (q0 * q2 - q1 * q3), 1 - 2*(q2 * q2 + q3 * q3)) 79 | z = torch.asin(torch.clamp(2 * (q1 * q2 + q0 * q3), -1+epsilon, 1-epsilon)) 80 | elif order == 'zxy': 81 | x = torch.asin(torch.clamp(2 * (q0 * q1 + q2 * q3), -1+epsilon, 1-epsilon)) 82 | y = torch.atan2(2 * (q0 * q2 - q1 * q3), 1 - 2*(q1 * q1 + q2 * q2)) 83 | z = torch.atan2(2 * (q0 * q3 - q1 * q2), 1 - 2*(q1 * q1 + q3 * q3)) 84 | elif order == 'xzy': 85 | x = torch.atan2(2 * (q0 * q1 + q2 * q3), 1 - 2*(q1 * q1 + q3 * q3)) 86 | y = torch.atan2(2 * (q0 * q2 + q1 * q3), 1 - 2*(q2 * q2 + q3 * q3)) 87 | z = torch.asin(torch.clamp(2 * (q0 * q3 - q1 * q2), -1+epsilon, 1-epsilon)) 88 | elif order == 'yxz': 89 | x = torch.asin(torch.clamp(2 * (q0 * q1 - q2 * q3), -1+epsilon, 1-epsilon)) 90 | y = torch.atan2(2 * (q1 * q3 + q0 * q2), 1 - 2*(q1 * q1 + q2 * q2)) 91 | z = torch.atan2(2 * (q1 * q2 + q0 * q3), 1 - 2*(q1 * q1 + q3 * q3)) 92 | elif order == 'zyx': 93 | x = torch.atan2(2 * (q0 * q1 + q2 * q3), 1 - 2*(q1 * q1 + q2 * q2)) 94 | y = torch.asin(torch.clamp(2 * (q0 * q2 - q1 * q3), -1+epsilon, 1-epsilon)) 95 | z = torch.atan2(2 * (q0 * q3 + q1 * q2), 1 - 2*(q2 * q2 + q3 * q3)) 96 | else: 97 | raise 98 | 99 | return torch.stack((x, y, z), dim=1).view(original_shape) 100 | 101 | # Numpy-backed implementations 102 | 103 | def qmul_np(q, r): 104 | q = torch.from_numpy(q).contiguous() 105 | r = torch.from_numpy(r).contiguous() 106 | return qmul(q, r).numpy() 107 | 108 | def qrot_np(q, v): 109 | q = torch.from_numpy(q).contiguous() 110 | v = torch.from_numpy(v).contiguous() 111 | return qrot(q, v).numpy() 112 | 113 | def qeuler_np(q, order, epsilon=0, use_gpu=False): 114 | if use_gpu: 115 | q = torch.from_numpy(q).cuda() 116 | return qeuler(q, order, epsilon).cpu().numpy() 117 | else: 118 | q = torch.from_numpy(q).contiguous() 119 | return qeuler(q, order, epsilon).numpy() 120 | 121 | def qfix(q): 122 | """ 123 | Enforce quaternion continuity across the time dimension by selecting 124 | the representation (q or -q) with minimal distance (or, equivalently, maximal dot product) 125 | between two consecutive frames. 126 | 127 | Expects a tensor of shape (L, J, 4), where L is the sequence length and J is the number of joints. 128 | Returns a tensor of the same shape. 129 | """ 130 | assert len(q.shape) == 3 131 | assert q.shape[-1] == 4 132 | 133 | result = q.copy() 134 | dot_products = np.sum(q[1:]*q[:-1], axis=2) 135 | mask = dot_products < 0 136 | mask = (np.cumsum(mask, axis=0)%2).astype(bool) 137 | result[1:][mask] *= -1 138 | return result 139 | 140 | def expmap_to_quaternion(e): 141 | """ 142 | Convert axis-angle rotations (aka exponential maps) to quaternions. 143 | Stable formula from "Practical Parameterization of Rotations Using the Exponential Map". 144 | Expects a tensor of shape (*, 3), where * denotes any number of dimensions. 145 | Returns a tensor of shape (*, 4). 
146 | """ 147 | assert e.shape[-1] == 3 148 | 149 | original_shape = list(e.shape) 150 | original_shape[-1] = 4 151 | e = e.reshape(-1, 3) 152 | 153 | theta = np.linalg.norm(e, axis=1).reshape(-1, 1) 154 | w = np.cos(0.5*theta).reshape(-1, 1) 155 | xyz = 0.5*np.sinc(0.5*theta/np.pi)*e 156 | return np.concatenate((w, xyz), axis=1).reshape(original_shape) 157 | 158 | def euler_to_quaternion(e, order): 159 | """ 160 | Convert Euler angles to quaternions. 161 | """ 162 | assert e.shape[-1] == 3 163 | 164 | original_shape = list(e.shape) 165 | original_shape[-1] = 4 166 | 167 | e = e.reshape(-1, 3) 168 | 169 | x = e[:, 0] 170 | y = e[:, 1] 171 | z = e[:, 2] 172 | 173 | rx = np.stack((np.cos(x/2), np.sin(x/2), np.zeros_like(x), np.zeros_like(x)), axis=1) 174 | ry = np.stack((np.cos(y/2), np.zeros_like(y), np.sin(y/2), np.zeros_like(y)), axis=1) 175 | rz = np.stack((np.cos(z/2), np.zeros_like(z), np.zeros_like(z), np.sin(z/2)), axis=1) 176 | 177 | result = None 178 | for coord in order: 179 | if coord == 'x': 180 | r = rx 181 | elif coord == 'y': 182 | r = ry 183 | elif coord == 'z': 184 | r = rz 185 | else: 186 | raise 187 | if result is None: 188 | result = r 189 | else: 190 | result = qmul_np(result, r) 191 | 192 | # Reverse antipodal representation to have a non-negative "w" 193 | if order in ['xyz', 'yzx', 'zxy']: 194 | result *= -1 195 | 196 | return result.reshape(original_shape) 197 | 198 | 199 | ################# 200 | def q_geometric_distance(q1,q2): 201 | predicted_quat = q1.view(-1, 4) 202 | expected_quat = q2.view(-1, 4) 203 | flipper = torch.Tensor([[1, -1, -1, -1]]).cuda() if torch.cuda.is_available() else torch.Tensor([[1, -1, -1, -1]]) 204 | quat_mul = qmul(predicted_quat, expected_quat * flipper) 205 | quat_log = qlog(quat_mul) 206 | return quat_log 207 | 208 | def qlog(q): 209 | """ 210 | Calculate quaternion log 211 | @param q: 212 | @return: 213 | """ 214 | q = q.reshape((-1, 4)) 215 | norm = torch.norm(q, dim=-1).reshape((-1, 1)) 216 | # norm[norm == 0] = 1 217 | q_normalized = qabs(q / norm) 218 | q_normalized = q_normalized[torch.abs(q_normalized).sum(dim=-1) != 0] 219 | imgs = q_normalized[:,-3:] 220 | reals = q_normalized[:,0] 221 | lens = torch.sqrt(torch.sum(imgs**2, dim=-1)) 222 | lens = torch.atan2(lens, reals) / (lens + 1e-10) 223 | return imgs * lens.reshape((-1, 1)) 224 | 225 | def qabs(q): 226 | """ Unify Quaternions To Single Pole """ 227 | qabs = q.clone() 228 | t = torch.Tensor([1, 0, 0, 0]).cuda() if q.is_cuda else torch.Tensor([1,0,0,0]) 229 | top = torch.sum(qabs * t, dim=-1) 230 | bot = torch.sum((-qabs) * t, dim=-1) 231 | qabs[top < bot] = -qabs[top < bot] 232 | return qabs 233 | 234 | def quat2expmap(q): 235 | """ 236 | Converts a quaternion to an exponential map 237 | Matlab port to python for evaluation purposes 238 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/quat2expmap.m#L1 239 | 240 | Args 241 | q: 1x4 quaternion 242 | Returns 243 | r: 1x3 exponential map 244 | Raises 245 | ValueError if the l2 norm of the quaternion is not close to 1 246 | """ 247 | # if (np.abs(np.linalg.norm(q)-1)>1e-3): 248 | # raise(ValueError, "quat2expmap: input quaternion is not norm 1") 249 | 250 | sinhalftheta = np.linalg.norm(q[1:]) 251 | coshalftheta = q[0] 252 | 253 | r0 = np.divide( q[1:], (np.linalg.norm(q[1:]) + np.finfo(np.float32).eps)); 254 | theta = 2 * np.arctan2( sinhalftheta, coshalftheta ) 255 | theta = np.mod( theta + 2*np.pi, 2*np.pi ) 256 | 257 | if theta > np.pi: 258 | theta = 2 * np.pi - theta 259 | 
r0 = -r0 260 | 261 | r = r0 * theta 262 | return r 263 | 264 | def expmap2rotmat(r): 265 | """ 266 | Converts an exponential map angle to a rotation matrix 267 | Matlab port to python for evaluation purposes 268 | I believe this is also called Rodrigues' formula 269 | https://github.com/asheshjain399/RNNexp/blob/srnn/structural_rnn/CRFProblems/H3.6m/mhmublv/Motion/expmap2rotmat.m 270 | 271 | Args 272 | r: 1x3 exponential map 273 | Returns 274 | R: 3x3 rotation matrix 275 | """ 276 | theta = np.linalg.norm( r ) 277 | r0 = np.divide( r, theta + np.finfo(np.float32).eps ) 278 | r0x = np.array([0, -r0[2], r0[1], 0, 0, -r0[0], 0, 0, 0]).reshape(3,3) 279 | r0x = r0x - r0x.T 280 | R = np.eye(3,3) + np.sin(theta)*r0x + (1-np.cos(theta))*(r0x).dot(r0x); 281 | return R 282 | 283 | def _quat2rotmat(q): 284 | return expmap2rotmat(quat2expmap(q)) 285 | 286 | def quat2rotmat(q): 287 | if isinstance(q, torch.Tensor): 288 | q = q.cpu().numpy() 289 | res = np.apply_along_axis(_quat2rotmat, axis=1, arr=q) 290 | return res 291 | # x = np.zeros((q.shape[0], 3, 3)) 292 | # for i in range(q.shape[0]): 293 | # expmap = quat2expmap(q[i]) 294 | # x[i, :, :] = expmap2rotmat(expmap) 295 | # return x 296 | -------------------------------------------------------------------------------- /utils/util.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import torch 4 | from utils.Quaternions import Quaternions 5 | from utils.quaternion import qfix 6 | 7 | 8 | ROTATION_NUMBERS = {'q': 4, '6d': 6, 'euler': 3} 9 | 10 | 11 | def euler_to_quaternions(angles, order, n_joints=20): 12 | cuda_available = torch.cuda.is_available() 13 | if isinstance(angles,torch.Tensor): 14 | angles = angles.cpu().numpy() 15 | quaternion_angles = Quaternions.from_euler(np.radians(angles), order=order, world=False) 16 | quaternion_angles = torch.from_numpy(quaternion_angles.qs.reshape((quaternion_angles.shape[0], -1, n_joints * 4))).to(torch.device('cuda:0')) if cuda_available else torch.from_numpy(quaternion_angles.qs.reshape((quaternion_angles.shape[0], -1, n_joints * 4))) 17 | original_shape = quaternion_angles.shape 18 | quaternion_angles = quaternion_angles.reshape((-1, n_joints, 4)) 19 | quaternion_angles = qfix(quaternion_angles.cpu().numpy()) 20 | quaternion_angles = quaternion_angles.reshape((-1, original_shape[0], n_joints, 4)) 21 | END_EFFECTORS_INDEXES = [3, 6, 11, 15, 19] 22 | quaternion_angles[:, :, END_EFFECTORS_INDEXES, :] = 0 23 | quaternion_angles = quaternion_angles.reshape(original_shape) 24 | quaternion_angles = torch.from_numpy(quaternion_angles).to(torch.device('cuda:0')) if cuda_available else torch.from_numpy(quaternion_angles) 25 | return quaternion_angles 26 | 27 | 28 | def createDict(*args): 29 | return dict(((k, eval(k)) for k in args)) 30 | 31 | def mkdir_dir(path): 32 | if not os.path.exists(path): 33 | os.makedirs(path) 34 | return path 35 | 36 | def make_dataset(dir_list, phase, data_split=3, sort=False, sort_index=1): 37 | images = [] 38 | for dataroot in dir_list: 39 | _images = [] 40 | image_filter = [] 41 | 42 | assert os.path.isdir(dataroot), '%s is not a valid directory' % dataroot 43 | for root, _, fnames in sorted(os.walk(dataroot)): 44 | for fname in fnames: 45 | if phase in fname: 46 | path = os.path.join(root, fname) 47 | _images.append(path) 48 | if sort: 49 | _images.sort(key=lambda x: int(x.split('/')[-1].split('.')[0].split('_')[sort_index])) 50 | if data_split is not None: 51 | for i in range(int(len(_images)/data_split - 1)): 
52 | image_filter.append(_images[data_split*i]) 53 | images += image_filter 54 | return images 55 | else: 56 | return _images 57 | 58 | def mkdir(folder): 59 | if os.path.exists(folder): 60 | return 1 61 | else: 62 | os.makedirs(folder) 63 | 64 | 65 | def normalize_data(orig_data): 66 | data_mean = np.mean(orig_data, axis=0) 67 | data_std = np.std(orig_data, axis=0) 68 | normalized_data = np.divide((orig_data - data_mean), data_std) 69 | normalized_data[normalized_data != normalized_data] = 0 70 | return normalized_data, data_mean, data_std 71 | 72 | 73 | def umnormalize_data(normalized_data, data_mean, data_std): 74 | T = normalized_data.shape[0] # Batch size 75 | D = data_mean.shape[0] # Dimensionality 76 | 77 | stdMat = data_std.reshape((1, D)) 78 | stdMat = np.repeat(stdMat, T, axis=0) 79 | meanMat = data_mean.reshape((1, D)) 80 | meanMat = np.repeat(meanMat, T, axis=0) 81 | orig_data = np.multiply(normalized_data, stdMat) + meanMat 82 | return orig_data -------------------------------------------------------------------------------- /utils/visualization.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import importlib 4 | import warnings 5 | 6 | import matplotlib 7 | matplotlib.use('Agg') 8 | import matplotlib.pyplot as plt 9 | import matplotlib.gridspec as gridspec 10 | from mpl_toolkits.mplot3d import Axes3D 11 | from PIL import Image 12 | 13 | 14 | def fig2img(fig): 15 | """ 16 | @brief Convert a Matplotlib figure to a 4D numpy array with RGBA channels and return it 17 | @param fig a matplotlib figure 18 | @return a numpy 3D array of RGBA values 19 | """ 20 | # draw the renderer 21 | fig.canvas.draw() 22 | 23 | # Get the RGBA buffer from the figure 24 | w, h = fig.canvas.get_width_height() 25 | buf = np.fromstring(fig.canvas.tostring_rgb(), dtype=np.uint8) 26 | buf.shape = (w, h, 3) 27 | w, h, d = buf.shape 28 | return Image.frombytes("RGB", (w, h), buf.tostring()) 29 | 30 | def visual_result_grid(img_path, pose_2d, pose_3d_pre, pose_3d_gt, save_path=None): 31 | fig = plt.figure() 32 | 33 | gs1 = gridspec.GridSpec(2, 2) 34 | # gs1.update(wspace=-0.00, hspace=0.05) # set the spacing between axes. 35 | # plt.axis('off') 36 | # Show real image 37 | ax1 = plt.subplot(gs1[0, 0]) 38 | img = Image.open(img_path) 39 | ax1.imshow(img) 40 | 41 | # Show 2d pose 42 | ax2 = plt.subplot(gs1[0, 1]) 43 | show2Dpose(pose_2d, ax2) 44 | ax2.set_title('2d input') 45 | ax2.invert_yaxis() 46 | 47 | # Plot 3d predict 48 | ax3 = plt.subplot(gs1[1, 0], projection='3d') 49 | ax3.set_title('3d predict') 50 | show3Dpose(pose_3d_pre, ax3) 51 | # ax3.view_init(0, -90) 52 | 53 | # Plot 3d gt 54 | ax4 = plt.subplot(gs1[1, 1], projection='3d') 55 | ax4.set_title('3d gt') 56 | show3Dpose(pose_3d_gt, ax4) 57 | 58 | if save_path is None: 59 | fig_img = fig2img(fig) 60 | plt.close() 61 | return fig_img 62 | else: 63 | fig.savefig(save_path) 64 | plt.close() 65 | 66 | def visual_result_row(img_path, pose_2d, pose_3d_pre, pose_3d_gt, save_path=None): 67 | fig = plt.figure() 68 | 69 | gs1 = gridspec.GridSpec(1, 4) 70 | # gs1.update(wspace=-0.00, hspace=0.05) # set the spacing between axes. 
71 | # plt.axis('off') 72 | # Show real image 73 | ax1 = plt.subplot(gs1[0]) 74 | img = Image.open(img_path) 75 | ax1.imshow(img) 76 | 77 | # Show 2d pose 78 | ax2 = plt.subplot(gs1[1]) 79 | show2Dpose(pose_2d, ax2) 80 | ax2.set_title('2d input') 81 | # ax2.invert_yaxis() 82 | 83 | # Plot 3d predict 84 | ax3 = plt.subplot(gs1[2], projection='3d') 85 | ax3.set_title('3d predict') 86 | show3Dpose(pose_3d_pre, ax3, radius=1) 87 | 88 | # Plot 3d gt 89 | ax4 = plt.subplot(gs1[3], projection='3d') 90 | ax4.set_title('3d gt') 91 | show3Dpose(pose_3d_gt, ax4, radius=1) 92 | 93 | if save_path is None: 94 | fig_img = fig2img(fig) 95 | plt.close() 96 | return fig_img 97 | else: 98 | fig.savefig(save_path) 99 | plt.close() 100 | 101 | def visual_sequence_result_without_image(pose_2d, pose_3d_predict, pose_3d_gt, save_path=None): 102 | fig = plt.figure(figsize=(60, 12)) 103 | 104 | gs1 = gridspec.GridSpec(3, 20) 105 | # gs1.update(wspace=-0.00, hspace=0.05) # set the spacing between axes. 106 | # plt.axis('off') 107 | 108 | for i in range(20): 109 | 110 | ax0 = plt.subplot(gs1[0, i]) 111 | show2Dpose(pose_2d[i, :], ax0, radius=500) 112 | ax0.invert_yaxis() 113 | 114 | ax2 = plt.subplot(gs1[1, i], projection='3d') 115 | show3Dpose(pose_3d_predict[i, :], ax2, radius=np.max(pose_3d_gt)) 116 | 117 | ax3 = plt.subplot(gs1[2, i], projection='3d') 118 | show3Dpose(pose_3d_gt[i, :], ax3, radius=np.max(pose_3d_gt)) 119 | 120 | if save_path is None: 121 | fig_img = fig2img(fig) 122 | plt.close() 123 | return fig_img 124 | else: 125 | fig.savefig(save_path) 126 | 127 | def visual_sequence_result(file_path, pose_2d, pose_3d_predict, pose_3d_gt, save_path=None): 128 | fig = plt.figure(figsize=(60, 12)) 129 | 130 | gs1 = gridspec.GridSpec(4, 20) 131 | # gs1.update(wspace=-0.00, hspace=0.05) # set the spacing between axes. 132 | # plt.axis('off') 133 | 134 | for i in range(20): 135 | # Show real image 136 | ax1 = plt.subplot(gs1[0, i]) 137 | img = Image.open(file_path[i]) 138 | ax1.imshow(img) 139 | 140 | ax2 = plt.subplot(gs1[1, i]) 141 | show2Dpose(pose_2d[i, :], ax2) 142 | ax2.invert_yaxis() 143 | 144 | ax2 = plt.subplot(gs1[2, i], projection='3d') 145 | show3Dpose(pose_3d_predict[i, :], ax2) 146 | 147 | ax3 = plt.subplot(gs1[3, i], projection='3d') 148 | show3Dpose(pose_3d_gt[i, :], ax3) 149 | 150 | if save_path is None: 151 | fig_img = fig2img(fig) 152 | plt.close() 153 | return fig_img 154 | else: 155 | fig.savefig(save_path) 156 | 157 | def show3Dpose(channels, ax, radius=600, lcolor="#3498db", rcolor="#e74c3c", add_labels=True,save_path=None): # blue, orange 158 | """ 159 | Visualize a 3d skeleton 160 | 161 | Args 162 | channels: flattened 3d pose vector (48, 51, 45 or 60 values). The pose to plot. 163 | ax: matplotlib 3d axis to draw on 164 | lcolor: color for left part of the body 165 | rcolor: color for right part of the body 166 | add_labels: whether to add coordinate labels 167 | Returns 168 | Nothing. Draws on ax.
169 | """ 170 | 171 | if channels.size == 48: 172 | vals = np.reshape(channels, (-1, 3)) 173 | I = np.array([0, 1, 2, 0, 4, 5, 0, 7, 8, 7, 10, 11, 7, 13, 14]) 174 | J = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]) 175 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 176 | elif channels.size == 51: 177 | vals = np.reshape(channels, (-1, 3)) 178 | I = np.array([0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]) 179 | J = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]) 180 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 181 | elif channels.size == 45: 182 | vals = np.reshape(channels, (-1, 3)) 183 | I = np.array([0, 1, 2, 0, 4, 5, 0, 7, 7, 9, 10, 7, 12, 13]) 184 | J = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]) 185 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 186 | elif channels.size == 60: 187 | vals = np.reshape(channels, (-1, 3)) 188 | I = np.array([0, 1, 2, 0, 4, 5, 0, 8, 9, 10, 9, 13, 14, 9, 17, 18]) 189 | J = np.array([1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 13, 14, 15, 17, 18, 19]) 190 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 191 | 192 | vals[:, [1, 2]] = vals[:, [2, 1]] 193 | # Make connection matrix 194 | for i in np.arange(len(I)): 195 | x, y, z = [np.array([vals[I[i], j], vals[J[i], j]]) for j in range(3)] 196 | ax.plot(x, y, z, lw=2, c=lcolor if LR[i] else rcolor) 197 | 198 | RADIUS = radius # space around the subject 199 | xroot, yroot, zroot = vals[0, 0], vals[0, 1], vals[0, 2] 200 | ax.set_xlim3d([-RADIUS + xroot, RADIUS + xroot]) 201 | ax.set_zlim3d([-RADIUS + zroot, RADIUS + zroot]) 202 | ax.set_ylim3d([RADIUS + yroot, -RADIUS + yroot]) 203 | 204 | if add_labels: 205 | ax.set_xlabel("x") 206 | ax.set_ylabel("z") 207 | ax.set_zlabel("y") 208 | ax.set_zlim(ax.get_zlim()[::-1]) 209 | 210 | # Get rid of the ticks and tick labels 211 | # ax.set_xticks([]) 212 | # ax.set_yticks([]) 213 | # ax.set_zticks([]) 214 | 215 | # ax.get_xaxis().set_ticklabels([]) 216 | # ax.get_yaxis().set_ticklabels([]) 217 | ax.set_zticklabels([]) 218 | 219 | # Get rid of the panes (actually, make them white) 220 | # white = (1.0, 1.0, 1.0, 0.0) 221 | # ax.w_xaxis.set_pane_color(white) 222 | # ax.w_yaxis.set_pane_color(white) 223 | # Keep z pane 224 | 225 | # Get rid of the lines in 3d 226 | # ax.w_xaxis.line.set_color(white) 227 | # ax.w_yaxis.line.set_color(white) 228 | # ax.w_zaxis.line.set_color(white) 229 | 230 | 231 | def show3Dpose_fixed(channels, ax, lcolor="#3498db", rcolor="#e74c3c", add_labels=True, label_min=None, label_max=None): # blue, orange 232 | if channels.size == 48: 233 | vals = np.reshape(channels, (-1, 3)) 234 | I = np.array([0, 1, 2, 0, 4, 5, 0, 7, 8, 7, 10, 11, 7, 13, 14]) 235 | J = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]) 236 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 237 | elif channels.size == 51: 238 | vals = np.reshape(channels, (-1, 3)) 239 | I = np.array([0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]) 240 | J = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]) 241 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 242 | else: 243 | vals = np.reshape(channels, (-1, 3)) 244 | I = np.array([1, 2, 3, 1, 7, 8, 1, 13, 14, 15, 14, 18, 19, 14, 26, 27]) - 1 # start points 245 | J = np.array([2, 3, 4, 7, 8, 9, 13, 14, 15, 16, 18, 19, 20, 26, 27, 28]) - 1 # end points 246 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 1, 1, 1], dtype=bool) 247 | 248 | vals[:, [1, 2]] = vals[:, [2, 1]] 249 | 250 | # Make connection matrix 251 | for i in np.arange(len(I)): 252 | x, y, z = [np.array([vals[I[i], j], vals[J[i], j]]) for j in range(3)] 253 | ax.plot(x, z, y, lw=2, c=lcolor if LR[i] else rcolor) 254 | 255 | if add_labels: 256 | ax.set_xlabel("x") 257 | ax.set_ylabel("z") 258 | ax.set_zlabel("y") 259 | 260 | if label_min is not None and label_max is not None: 261 | ax.set_xlim3d([label_min, label_max]) 262 | ax.get_xaxis().set_ticklabels(list(range(int(label_min), int(label_max), int((label_max-label_min)/3)))) 263 | ax.set_ylim3d([label_min, label_max]) 264 | ax.get_yaxis().set_ticklabels(list(range(int(label_min), int(label_max), int((label_max-label_min)/3)))) 265 | ax.set_zlim3d([label_min, label_max]) 266 | ax.set_zticklabels(list(range(int(label_min), int(label_max), int((label_max-label_min)/3)))) 267 | 268 | ax.set_aspect('equal') 269 | 270 | white = (1.0, 1.0, 1.0, 0.0) 271 | ax.w_xaxis.set_pane_color(white) 272 | ax.w_yaxis.set_pane_color(white) 273 | 274 | ax.w_xaxis.line.set_color(white) 275 | ax.w_yaxis.line.set_color(white) 276 | ax.w_zaxis.line.set_color(white) 277 | 278 | 279 | def show2Dpose(channels, ax, radius=600, lcolor="#3498db", rcolor="#e74c3c", add_labels=False): 280 | if channels.size == 32: 281 | vals = np.reshape(channels, (-1, 2)) 282 | I = np.array([0, 1, 2, 0, 4, 5, 0, 7, 8, 7, 10, 11, 7, 13, 14]) 283 | J = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]) 284 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 285 | elif channels.size == 34: 286 | vals = np.reshape(channels, (-1, 2)) 287 | I = np.array([0, 1, 2, 0, 4, 5, 0, 7, 8, 9, 8, 11, 12, 8, 14, 15]) 288 | J = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]) 289 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 290 | elif channels.size == 30: 291 | vals = np.reshape(channels, (-1, 2)) 292 | I = np.array([0, 1, 2, 0, 4, 5, 0, 7, 7, 9, 10, 7, 12, 13]) 293 | J = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]) 294 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 295 | elif channels.size == 40: 296 | vals = np.reshape(channels, (-1, 2)) 297 | I = np.array([0, 1, 2, 0, 4, 5, 0, 8, 9, 10, 9, 13, 14, 9, 17, 18]) 298 | J = np.array([1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 13, 14, 15, 17, 18, 19]) 299 | LR = np.array([1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1], dtype=bool) 300 | 301 | # Make connection matrix 302 | for i in np.arange(len(I)): 303 | x, y = [np.array([vals[I[i], j], vals[J[i], j]]) for j in range(2)] 304 | if np.mean(x) != 0 and np.mean(y) != 0: 305 | ax.plot(x, y, lw=7, c=lcolor if LR[i] else rcolor) 306 | 307 | # Get rid of the ticks 308 | # ax.set_xticks([]) 309 | # ax.set_yticks([]) 310 | # ax.set_xticks([0, 200, 400, 600, 800, 1000]) 311 | # ax.set_yticks([0, 200, 400, 600, 800, 1000]) 312 | 313 | # Get rid of tick labels 314 | # ax.get_xaxis().set_ticklabels([]) 315 | # ax.get_yaxis().set_ticklabels([]) 316 | 317 | RADIUS = radius # space around the subject 318 | xroot, yroot = vals[0, 0], vals[0, 1] 319 | ax.set_xlim([-RADIUS + xroot, RADIUS + xroot]) 320 | ax.set_ylim([-RADIUS + yroot, RADIUS + yroot]) 321 | if add_labels: 322 | ax.set_xlabel("x") 323 | ax.set_ylabel("z") 324 | ax.set_ylim(ax.get_ylim()[::-1]) 325 | 326 | ax.set_aspect('equal') 327 | # for i in range(20): 328 | # ax.text(vals[i, 0],vals[i, 1], f'{i}', color='black') 329 | 330 | class WriterTensorboardX(): 331 | def
__init__(self, writer_dir, logger, enable): 332 | self.writer = None 333 | if enable: 334 | log_path = writer_dir 335 | try: 336 | self.writer = importlib.import_module('tensorboardX').SummaryWriter(log_path) 337 | except ModuleNotFoundError: 338 | message = """TensorboardX visualization is configured to be used, but it is currently not installed on this machine. Please install the package with the 'pip install tensorboardx' command, or turn off the option in the 'config.json' file.""" 339 | warnings.warn(message, UserWarning) 340 | logger.warning(message) 341 | self.step = 0 342 | self.mode = '' 343 | 344 | self.tensorboard_writer_ftns = ['add_scalar', 'add_scalars', 'add_image', 'add_audio', 'add_text', 'add_histogram', 'add_pr_curve', 'add_embedding'] 345 | 346 | def set_step(self, step, mode='train'): 347 | self.mode = mode 348 | self.step = step 349 | 350 | def set_scalars(self, scalars): 351 | for key, value in scalars.items(): 352 | self.add_scalar(key, value) 353 | 354 | def __getattr__(self, name): 355 | """ 356 | If visualization is configured to be used: 357 | return add_data() methods of tensorboard with additional information (step, tag) added. 358 | Otherwise: 359 | return blank function handle that does nothing 360 | """ 361 | if name in self.tensorboard_writer_ftns: 362 | add_data = getattr(self.writer, name, None) 363 | 364 | def wrapper(tag, data, *args, **kwargs): 365 | if add_data is not None: 366 | add_data('{}/{}'.format(self.mode, tag), data, self.step, *args, **kwargs) 367 | 368 | return wrapper 369 | else: 370 | # default action for returning methods defined in this class, set_step() for instance. 371 | try: 372 | attr = object.__getattribute__(self, name) 373 | except AttributeError: 374 | raise AttributeError("type object 'WriterTensorboardX' has no attribute '{}'".format(name)) 375 | return attr 376 | -------------------------------------------------------------------------------- /videos/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/BrianG13/FLEX/79d6d0a91941b76075d504043fc80646ac1f1878/videos/.DS_Store -------------------------------------------------------------------------------- /videos/Human36M_S9_Posing_1.mov: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:fac6a3e0e60d5beb19ffa0679d5dcfa4939fd85159e6cf3b7d10dc981d5e99f4 3 | size 32677789 4 | -------------------------------------------------------------------------------- /videos/Human36M_S9_Posing_1.mp4: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:492ee2be8b51cb3625a5283c5d78b1edeebfd33c8b9c857b6800d39992b1920f 3 | size 33765884 4 | -------------------------------------------------------------------------------- /videos/Human36M_S9_Sitting.mov: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:0966724c65a6373793d4af4db8dff6fa09b2053dc8bd98280d25e7080b2915f1 3 | size 32799581 4 | -------------------------------------------------------------------------------- /videos/Human36M_S9_Sitting.mp4: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:a36bed103bee5e0ccab78b436e95bcdf8c63bedcaee67f79bc1f681f2160b448 3 | size 33897570 4 |
-------------------------------------------------------------------------------- /videos/KTH_football.mov: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:382578b15160bd3cc32a5a97e5f543bb6ea592b2d92a4f1b55997a8f7fbb992a 3 | size 18304178 4 | -------------------------------------------------------------------------------- /videos/KTH_football.mp4: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:30e6bcef1e0408b42d65a9be138fd565d7bea18cec577cba662017568a3935ec 3 | size 18987123 4 | -------------------------------------------------------------------------------- /videos/MotioNet_Comparison.mov: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:4646987a99880acb8ea9914f91fa1154cbd8cf0aa13833dc340c3e383f46c53e 3 | size 18693248 4 | -------------------------------------------------------------------------------- /videos/MotioNet_Comparison.mp4: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:aff0394c518b4a6c8ba93fff0628aa7e7485890a2a996cec50f58916188d12b0 3 | size 19345516 4 | -------------------------------------------------------------------------------- /videos/README.md: -------------------------------------------------------------------------------- 1 | Videos description: 2 | - A clip describing our work: clip.mov 3 | - Video files showing our results on the Human3.6M dataset: Human36M*.mov 4 | - Video files showing our results on the KTH multi-view Football II dataset: KTH\_football.mov 5 | - Video files comparing MotioNet (single-view) results versus ours: MotioNet\_comparison.mov. Notice that while MotioNet is occasionally wrong, our model is accurate and smooth. 6 | -------------------------------------------------------------------------------- /videos/clip.mov: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:1cffd3b1f6b9b3f37fe2ba107cb1c299b56a54ace116c3ce465f7d75da50d9da 3 | size 314809263 4 | -------------------------------------------------------------------------------- /videos/clip.mp4: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:95ed6de9f1f3b89204095d36788ddde97293965f801efa9d4c151673d49cdab8 3 | size 314808376 4 | --------------------------------------------------------------------------------
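Usage note: the rotation helpers in `utils/quaternion.py` are convention-sensitive — every Euler conversion takes an explicit `order` string, and `qfix` resolves the q / -q ambiguity across consecutive frames. The snippet below is a minimal, illustrative sketch only (it is not part of the FLEX training or evaluation pipeline); it assumes the repository root is on `PYTHONPATH` and that NumPy and PyTorch are installed, and the synthetic angles and the `'xyz'` order are arbitrary choices made purely for demonstration.

```python
# Illustrative sketch only -- synthetic data, arbitrary 'xyz' order.
import numpy as np

from utils.quaternion import euler_to_quaternion, qeuler_np, qfix

# A short synthetic sequence: L frames, J joints, Euler angles in radians.
L, J = 8, 20
rng = np.random.default_rng(0)
euler = rng.uniform(-np.pi, np.pi, size=(L, J, 3))

# Euler -> quaternion; the order string must match how the angles are defined.
quats = euler_to_quaternion(euler, order='xyz')   # shape (L, J, 4)

# Enforce temporal continuity across frames (q and -q encode the same rotation).
quats = qfix(quats)

# Quaternion -> Euler with the same order string.
euler_back = qeuler_np(quats, order='xyz')        # shape (L, J, 3)

print(euler.shape, quats.shape, euler_back.shape)
```

Because Euler angles wrap around and q and -q describe the same rotation, `euler_back` is expected to describe the same rotations as `euler`, not to match it element-wise.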