├── .gitignore ├── LICENSE ├── README.md ├── body_models └── README.md ├── configs ├── fit_amass_joints.cfg ├── fit_amass_keypts.cfg ├── fit_imapper.cfg ├── fit_prox.cfg ├── fit_proxd.cfg ├── fit_rgb_demo_no_split.cfg ├── fit_rgb_demo_use_split.cfg ├── intrinsics_default.json ├── test_humor.cfg ├── test_humor_qual.cfg ├── test_humor_qual_sampling.cfg ├── test_humor_qual_sampling_debug.cfg ├── test_humor_recon.cfg ├── test_humor_recon_debug.cfg ├── test_humor_sampling.cfg ├── test_humor_sampling_debug.cfg ├── train_humor.cfg └── train_humor_qual.cfg ├── data ├── README.md ├── get_i3db.sh ├── get_prox_extra.sh └── rgb_demo │ └── hiphop_clip1.mp4 ├── get_ckpt.sh ├── humor.png ├── humor ├── __init__.py ├── body_model │ ├── __init__.py │ ├── body_model.py │ └── utils.py ├── datasets │ ├── __init__.py │ ├── amass_discrete_dataset.py │ ├── amass_fit_dataset.py │ ├── amass_utils.py │ ├── imapper_dataset.py │ ├── prox_dataset.py │ └── rgb_dataset.py ├── fitting │ ├── config.py │ ├── eval_fitting_2d.py │ ├── eval_fitting_3d.py │ ├── eval_utils.py │ ├── fitting_loss.py │ ├── fitting_utils.py │ ├── motion_optimizer.py │ ├── run_fitting.py │ └── viz_fitting_rgb.py ├── losses │ └── humor_loss.py ├── models │ └── humor_model.py ├── scripts │ ├── cleanup_amass_data.py │ └── process_amass_data.py ├── test │ └── test_humor.py ├── train │ ├── train_humor.py │ └── train_state_prior.py ├── utils │ ├── __init__.py │ ├── chamfer_distance │ │ ├── LICENSE │ │ ├── __init__.py │ │ ├── chamfer_distance.cpp │ │ ├── chamfer_distance.cu │ │ └── chamfer_distance.py │ ├── config.py │ ├── logging.py │ ├── stats.py │ ├── torch.py │ ├── transforms.py │ └── video.py └── viz │ ├── __init__.py │ ├── mesh_viewer.py │ └── utils.py └── requirements.txt /.gitignore: -------------------------------------------------------------------------------- 1 | *__pycache__* 2 | *.pyc 3 | body_models 4 | data 5 | checkpoints 6 | out 7 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | 2 | MIT License 3 | 4 | Copyright (c) 2021 Davis Rempe 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # HuMoR: 3D Human Motion Model for Robust Pose Estimation (ICCV 2021) 2 | This is the official implementation for the ICCV 2021 paper. For more information, see the [project webpage](https://geometry.stanford.edu/projects/humor/). 3 | 4 | ![HuMoR Teaser](humor.png) 5 | 6 | ## Environment Setup 7 | > Note: This code was developed on Ubuntu 16.04/18.04 with Python 3.7, CUDA 10.1 and PyTorch 1.6.0. Later versions should work, but have not been tested. 8 | 9 | Create and activate a virtual environment to work in, e.g. using Conda: 10 | ``` 11 | conda create -n humor_env python=3.7 12 | conda activate humor_env 13 | ``` 14 | 15 | Install CUDA and PyTorch 1.6. For CUDA 10.1, this would look like: 16 | ``` 17 | conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch 18 | ``` 19 | 20 | Install the remaining requirements with pip: 21 | ``` 22 | pip install -r requirements.txt 23 | ``` 24 | 25 | You must also have _ffmpeg_ installed on your system to save visualizations. 26 | 27 | ## Downloads & External Dependencies 28 | This codebase relies on various external downloads in order to run for certain modes of operation. Here we briefly overview each and what they are used for. Detailed setup instructions are linked in other READMEs. 29 | 30 | ### Body Model and Pose Prior 31 | Detailed instructions to install SMPL+H and VPoser are in [this documentation](./body_models/). 32 | 33 | * [SMPL+H](https://mano.is.tue.mpg.de/) is used for the pose/shape body model. Downloading this model is necessary for **all uses** of this codebase. 34 | * [VPoser](https://github.com/nghorbani/human_body_prior) is used as a pose prior only during the initialization phase of fitting, so it's only needed if you are using the test-time optimization functionality of this codebase. 35 | 36 | ### Datasets 37 | Detailed instructions to install, configure, and process each dataset are in [this documentation](./data/). 38 | 39 | * [AMASS](https://amass.is.tue.mpg.de/) motion capture data is used to train and evaluate (_e.g._ randomly sample) the HuMoR motion model and for fitting to 3D data like noisy joints and partial keypoints. 40 | * [i3DB](https://github.com/amonszpart/iMapper) contains RGB videos with heavy occlusions and is only used in the paper to evaluate test-time fitting to 2D joints. 41 | * [PROX](https://prox.is.tue.mpg.de/) contains RGB-D videos and is only used in the paper to evaluate test-time fitting to 2D joints and 3D point clouds. 42 | 43 | ### Pretrained Models 44 | Pretrained model checkpoints are available for HuMoR, HuMoR-Qual, and the initial state Gaussian mixture. To download (~215 MB), from the repo root run `bash get_ckpt.sh`. 45 | 46 | ### OpenPose 47 | [OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) is used to detect 2D joints for fitting to arbitrary RGB videos. If you will be running test-time optimization on the demo video or your own videos, you must install OpenPose (unless you pass in pre-computed OpenPose results using `--op-keypts`). To clone and build, please follow the [OpenPose README](https://github.com/CMU-Perceptual-Computing-Lab/openpose) in their repo. 
48 | 49 | Optimization in [run_fitting.py](./humor/fitting/run_fitting.py) assumes OpenPose is installed at `./external/openpose` by default - if you install elsewhere, please pass in the location using the `--openpose` flag. 50 | 51 | ## Fitting to RGB Videos (Test-Time Optimization) 52 | To run motion/shape estimation on an arbitrary RGB video, you must have SMPL+H, VPoser, OpenPose, and a pretrained HuMoR model as detailed above. We have included a demo video in this repo along with a few example configurations to get started. 53 | 54 | > Note: if running on your own video, make sure the camera is not moving and the person is not interacting with uneven terrain in the scene (we assume a single ground plane). Also, only one person will be reconstructed. 55 | 56 | To run the optimization on the demo video use: 57 | ``` 58 | python humor/fitting/run_fitting.py @./configs/fit_rgb_demo_no_split.cfg 59 | ``` 60 | 61 | This configuration optimizes over the entire video (~3 sec) at once (i.e. over all frames). **If your video is longer than 2-3 sec**, it is recommended to instead use the settings in `./configs/fit_rgb_demo_use_split.cfg` which adds the `--rgb-seq-len`, `--rgb-overlap-len`, and `--rgb-overlap-consist-weight` arguments. Using this configuration, the input video is split into multiple overlapping sub-sequences and optimized in a batched fashion (with consistency losses between sub-sequences). This increases efficiency, and lessens the need to tune parameters based on video length. Note the larger the batch size, the better the results will be. 62 | 63 | If known, it's **highly recommended to pass in camera intrinsics** using the `--rgb-intrinsics` flag. See `./configs/intrinsics_default.json` for an example of what this looks like. If intrinsics are _not_ given, [default focal lengths](./humor/fitting/fitting_utils.py#L19) are used. 64 | 65 | Finally, this demo does _not_ use [PlaneRCNN](https://github.com/NVlabs/planercnn) to initialize the ground as described in the paper. Instead, it roughly initializes the ground at `y = 0.5` (with camera up-axis `-y`). We found this to be sufficient and often better than using PlaneRCNN. If you want to use PlaneRCNN instead, set up a separate environment, follow their install instructions, then use the following command to run their method where `example_image_dir` contains a single frame from your video and the camera parameters: `python evaluate.py --methods=f --suffix=warping_refine --dataset=inference --customDataFolder=example_image_dir`. The results directory can be passed into our optimization using the `--rgb-planercnn-res` flag. 66 | 67 | > Note: if you want to use your own OpenPose detections rather than having the fitting script run OpenPose, pass in the directory containing the json files using `--op-keypts`. The [expected format](https://github.com/CMU-Perceptual-Computing-Lab/openpose/blob/master/doc/02_output.md#json-output-format) of these json files is that written using the `--write_json` flag when running OpenPose with the `BODY_25` skeleton and unnormalized 2D keypoints (see [here](https://github.com/davrempe/humor/blob/main/humor/utils/video.py#L70) for the exact OpenPose command we use). We also assume these OpenPose detections are at 30 fps since this is the rate of the HuMoR model. Note that only the `pose_keypoints_2d` (body joints) of the first detected person in each json file is used for fitting. 
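For reference, below is a minimal sketch of reading one of these OpenPose json files and pulling out the body joints that the fitting uses (the filename is illustrative; the `people` / `pose_keypoints_2d` fields follow the OpenPose output format linked above):
```
import json
import numpy as np

# one frame's OpenPose output written with --write_json (illustrative filename)
with open('frame_000000000000_keypoints.json', 'r') as f:
    dets = json.load(f)

# only the body joints of the first detected person are used for fitting
body25 = np.array(dets['people'][0]['pose_keypoints_2d']).reshape(25, 3)  # (x, y, confidence) per BODY_25 joint
print(body25)
```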
68 | 69 | ### Visualizing RGB Results 70 | The optimization is performed in 3 stages, with stages 1 & 2 being initialization using a pose prior and smoothing (i.e. the _VPoser-t_ baseline) and stage 3 being the full optimization with the HuMoR motion prior. So for the demo, the final output for the full sequence will be saved in `./out/rgb_demo_no_split/results_out/final_results/stage3_results.npz`. To visualize results from the fitting, use something like: 71 | ``` 72 | python humor/fitting/viz_fitting_rgb.py --results ./out/rgb_demo_no_split/results_out --out ./out/rgb_demo_no_split/viz_out --viz-prior-frame 73 | ``` 74 | 75 | By default, this will visualize the final full video result along with each sub-sequence separately (if applicable). Please use `--help` to see the many additional visualization options. This code is also useful to see how to load in and use the results for other tasks, if desired. 76 | 77 | ## Fitting on Specific Datasets 78 | Next, we detail how to run and evaluate the test-time optimization on the various datasets presented in the paper. In all these examples, the default batch size is quite small to accommodate smaller GPUs, but it should be increased depending on your system. 79 | 80 | ### AMASS 3D Data 81 | There are multiple settings possible for fitting to 3D data (e.g., noisy joints, partial keypoints, etc.), which can be specified using configuration flags. For example, to fit to partial upper-body 3D keypoints sampled from AMASS data, run: 82 | ``` 83 | python humor/fitting/run_fitting.py @./configs/fit_amass_keypts.cfg 84 | ``` 85 | 86 | Optimization results can be visualized using 87 | ``` 88 | python humor/fitting/eval_fitting_3d.py --results ./out/amass_verts_upper_fitting/results_out --out ./out/amass_verts_upper_fitting/eval_out --qual --viz-stages --viz-observation 89 | ``` 90 | and evaluation metrics computed with 91 | ``` 92 | python humor/fitting/eval_fitting_3d.py --results ./out/amass_verts_upper_fitting/results_out --out ./out/amass_verts_upper_fitting/eval_out --quant --quant-stages 93 | ``` 94 | 95 | The most relevant quantitative results will be written to `eval_out/eval_quant/compare_mean.csv`. 96 | 97 | ### i3DB RGB Data 98 | The i3DB dataset contains RGB videos with many occlusions along with annotated 3D joints for evaluation. To run test-time optimization on the full dataset, use: 99 | ``` 100 | python humor/fitting/run_fitting.py @./configs/fit_imapper.cfg 101 | ``` 102 | 103 | Results can be visualized using the same script as in the demo: 104 | ``` 105 | python humor/fitting/viz_fitting_rgb.py --results ./out/imapper_fitting/results_out --out ./out/imapper_fitting/viz_out --viz-prior-frame 106 | ``` 107 | 108 | Quantitative evaluation (comparing to results after each optimization stage) can be run with: 109 | ``` 110 | python humor/fitting/eval_fitting_2d.py --results ./out/imapper_fitting/results_out --dataset iMapper --imapper-floors ./data/iMapper/i3DB/floors --out ./out/imapper_fitting/eval_out --quant --quant-stages 111 | ``` 112 | 113 | The final quantitative results will be written to `eval_out/eval_quant/compare_mean.csv`. 114 | 115 | ### PROX RGB/RGB-D Data 116 | PROX contains RGB-D data, so it affords fitting to either just 2D joints or 2D joints + a 3D point cloud. The commands for running each of these are quite similar, just using different configuration files. 
For running on the full RGB-D data, use: 117 | ``` 118 | python humor/fitting/run_fitting.py @./configs/fit_proxd.cfg 119 | ``` 120 | 121 | For visualization, add the `--flip-img` flag to align with the original PROX videos: 122 | ``` 123 | python humor/fitting/viz_fitting_rgb.py --results ./out/proxd_fitting/results_out --out ./out/proxd_fitting/viz_out --viz-prior-frame --flip-img 124 | ``` 125 | 126 | Quantitative evaluation (of plausibility metrics) for full RGB-D data uses 127 | ``` 128 | python humor/fitting/eval_fitting_2d.py --results ./out/proxd_fitting/results_out --dataset PROXD --prox-floors ./data/prox/qualitative/floors --out ./out/proxd_fitting/eval_out --quant --quant-stages 129 | ``` 130 | 131 | and for RGB-only data it is slightly different: 132 | ``` 133 | python humor/fitting/eval_fitting_2d.py --results ./out/prox_fitting/results_out --dataset PROX --prox-floors ./data/prox/qualitative/floors --out ./out/prox_fitting/eval_out --quant --quant-stages 134 | ``` 135 | 136 | ## Training & Testing Motion Model 137 | There are two versions of our model: HuMoR and HuMoR-Qual. HuMoR is the main model presented in the paper and is best suited for test-time optimization. HuMoR-Qual is a slight variation on HuMoR that gives more stable and qualitatively superior results for random motion generation (see the paper for details). 138 | 139 | Below we describe how to train and test HuMoR, but the exact same commands are used for HuMoR-Qual with a different configuration file at each step (see [all provided configs](./configs)). 140 | 141 | ### Training HuMoR 142 | To train HuMoR from scratch, make sure you have the processed version of the AMASS dataset at `./data/amass_processed` and run: 143 | ``` 144 | python humor/train/train_humor.py @./configs/train_humor.cfg 145 | ``` 146 | The default batch size is meant for a 16 GB GPU. 147 | 148 | ### Testing HuMoR 149 | After training HuMoR or downloading the pretrained checkpoints, we can evaluate the model in multiple ways. 150 | 151 | To compute single-step losses (the exact same as during training) over the entire test set, run: 152 | ``` 153 | python humor/test/test_humor.py @./configs/test_humor.cfg 154 | ``` 155 | 156 | To randomly sample a motion sequence and save a video visualization, run: 157 | ``` 158 | python humor/test/test_humor.py @./configs/test_humor_sampling.cfg 159 | ``` 160 | 161 | If you'd rather visualize the sampling results in an interactive viewer, use: 162 | ``` 163 | python humor/test/test_humor.py @./configs/test_humor_sampling_debug.cfg 164 | ``` 165 | 166 | Try adding `--viz-pred-joints`, `--viz-smpl-joints`, or `--viz-contacts` to the end of the command to visualize more outputs, or increasing the value of `--eval-num-samples` to sample the model multiple times from the same initial state. `--help` can always be used to see all flags and their descriptions. 167 | 168 | To reconstruct random sequences from AMASS (i.e. encode then decode them), use: 169 | ``` 170 | python humor/test/test_humor.py @./configs/test_humor_recon.cfg 171 | ``` 172 | 173 | ### Training Initial State GMM 174 | Test-time optimization also uses a Gaussian mixture model (GMM) prior over the initial state of the sequence. 
The pretrained model can be downloaded above, but if you wish to train from scratch, run: 175 | ``` 176 | python humor/train/train_state_prior.py --data ./data/amass_processed --out ./out/init_state_prior_gmm --gmm-comps 12 177 | ``` 178 | 179 | ## Citation 180 | If you found this code or paper useful, please consider citing: 181 | ``` 182 | @inproceedings{rempe2021humor, 183 | author={Rempe, Davis and Birdal, Tolga and Hertzmann, Aaron and Yang, Jimei and Sridhar, Srinath and Guibas, Leonidas J.}, 184 | title={HuMoR: 3D Human Motion Model for Robust Pose Estimation}, 185 | booktitle={International Conference on Computer Vision (ICCV)}, 186 | year={2021} 187 | } 188 | ``` 189 | 190 | ## Questions? 191 | If you run into any problems or have questions, please create an issue or contact Davis (first author) via email. 192 | -------------------------------------------------------------------------------- /body_models/README.md: -------------------------------------------------------------------------------- 1 | Both [SMPL+H](https://mano.is.tue.mpg.de/) and [VPoser](https://github.com/nghorbani/human_body_prior) should be installed to this directory. Detailed instructions are below. After installation, this directory should contain a `smplh` directory and a `vposer_v1_0` directory. 2 | 3 | ## SMPL+H 4 | To install the body model: 5 | * Create an account on the [project page](https://mano.is.tue.mpg.de/) 6 | * Go to the `Downloads` page and download the "Extended SMPL+H model (used in AMASS)". Place the downloaded `smplh.tar.xz` in this directory. 7 | * Extract downloaded model to new directory `mkdir smplh && tar -xf smplh.tar.xz -C smplh`. The model will be read in from here automatically when running this codebase. 8 | 9 | Note if you decide to install the body model somewhere else, please update `SMPLH_PATH` in [this file](../humor/body_model/utils.py). 10 | 11 | ## VPoser 12 | To install the pose prior: 13 | * Create an account on the [project page](https://smpl-x.is.tue.mpg.de/index.html) 14 | * Go to the `Download` page and under "VPoser: Variational Human Pose Prior" click on "Download VPoser v1.0 - CVPR'19" (note it's important to download v1.0 and **not** v2.0 which is not supported and will not work) 15 | * Copy the zip file to this directory, and unzip with `unzip vposer_v1_0.zip` 16 | 17 | If you're left with a directory called `vposer_v1_0` in the current directory, then it's been successfully installed. The `--vposer` argument in `run_fitting.py` by default points to this directory. 
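As a quick sanity check that SMPL+H is installed where this codebase expects, you can load it through the `BodyModel` wrapper in `humor/body_model/body_model.py` and run a forward pass with all-zero parameters. This is only a sketch; it assumes you run it from the repo root with the environment from the main README set up:
```
import torch
from humor.body_model.body_model import BodyModel

# neutral SMPL+H model installed as described above
bm = BodyModel(bm_path='./body_models/smplh/neutral/model.npz',
               num_betas=16, batch_size=1, model_type='smplh')

# all-zero pose/shape/translation gives the rest pose at the origin
out = bm(root_orient=torch.zeros(1, 3),  # axis-angle root orientation
         pose_body=torch.zeros(1, 63),   # 21 body joints x 3
         betas=torch.zeros(1, 16),
         trans=torch.zeros(1, 3))
print(out.v.shape, out.Jtr.shape)        # mesh vertices and joint positions
```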
-------------------------------------------------------------------------------- /configs/fit_amass_joints.cfg: -------------------------------------------------------------------------------- 1 | 2 | --data-path ./data/amass_processed 3 | --data-type AMASS 4 | --data-fps 30 5 | 6 | --smpl ./body_models/smplh/neutral/model.npz 7 | --init-motion-prior ./checkpoints/init_state_prior_gmm 8 | --humor ./checkpoints/humor/best_model.pth 9 | --out ./out/amass_joints_noisy_fitting 10 | 11 | --amass-split-by dataset 12 | --shuffle 13 | --amass-batch-size 2 14 | --amass-seq-len 60 15 | --amass-use-joints 16 | --amass-noise-std 0.04 17 | 18 | --joint3d-weight 1.0 1.0 1.0 19 | --vert3d-weight 0.0 0.0 0.0 20 | --point3d-weight 0.0 0.0 0.0 21 | --pose-prior-weight 2e-4 2e-4 0.0 22 | --shape-prior-weight 1.67e-4 1.67e-4 1.67e-4 23 | 24 | --motion-prior-weight 0.0 0.0 1e-3 25 | 26 | --init-motion-prior-weight 0.0 0.0 1e-3 27 | 28 | --joint3d-smooth-weight 10.0 10.0 0.0 29 | 30 | --joint-consistency-weight 0.0 0.0 1.0 31 | --bone-length-weight 0.0 0.0 10.0 32 | 33 | --contact-vel-weight 0.0 0.0 1.0 34 | --contact-height-weight 0.0 0.0 1.0 35 | 36 | --lr 1.0 37 | --num-iters 30 70 70 38 | 39 | --stage3-tune-init-num-frames 15 40 | --stage3-tune-init-freeze-start 30 41 | --stage3-tune-init-freeze-end 55 42 | 43 | --gt-body-type smplh 44 | 45 | --save-results 46 | --save-stages-results -------------------------------------------------------------------------------- /configs/fit_amass_keypts.cfg: -------------------------------------------------------------------------------- 1 | 2 | --data-path ./data/amass_processed 3 | --data-type AMASS 4 | --data-fps 30 5 | 6 | --smpl ./body_models/smplh/neutral/model.npz 7 | --init-motion-prior ./checkpoints/init_state_prior_gmm 8 | --humor ./checkpoints/humor/best_model.pth 9 | --out ./out/amass_verts_upper_fitting 10 | 11 | --amass-split-by dataset 12 | --shuffle 13 | --amass-batch-size 2 14 | --amass-seq-len 60 15 | --amass-use-verts 16 | --amass-noise-std 0.0 17 | --amass-make-partial 18 | --amass-partial-height 0.9 19 | 20 | --joint3d-weight 0.0 0.0 0.0 21 | --vert3d-weight 1.0 1.0 1.0 22 | --point3d-weight 0.0 0.0 0.0 23 | --pose-prior-weight 2e-4 2e-4 0.0 24 | --shape-prior-weight 1.67e-4 1.67e-4 1.67e-4 25 | 26 | --motion-prior-weight 0.0 0.0 5e-4 27 | 28 | --init-motion-prior-weight 0.0 0.0 5e-4 29 | 30 | --joint3d-smooth-weight 0.1 0.1 0.0 31 | 32 | --joint-consistency-weight 0.0 0.0 1.0 33 | --bone-length-weight 0.0 0.0 10.0 34 | 35 | --contact-vel-weight 0.0 0.0 1.0 36 | --contact-height-weight 0.0 0.0 1.0 37 | 38 | --lr 1.0 39 | --num-iters 30 70 70 40 | 41 | --stage3-tune-init-num-frames 15 42 | --stage3-tune-init-freeze-start 30 43 | --stage3-tune-init-freeze-end 55 44 | 45 | --gt-body-type smplh 46 | 47 | --save-results 48 | --save-stages-results -------------------------------------------------------------------------------- /configs/fit_imapper.cfg: -------------------------------------------------------------------------------- 1 | 2 | --data-path ./data/iMapper/i3DB 3 | --data-type iMapper-RGB 4 | --data-fps 30 5 | --mask-joints2d 6 | 7 | --smpl ./body_models/smplh/neutral/model.npz 8 | --init-motion-prior ./checkpoints/init_state_prior_gmm 9 | --humor ./checkpoints/humor/best_model.pth 10 | --out ./out/imapper_fitting 11 | 12 | --batch-size 2 13 | --imapper-seq-len 60 14 | 15 | --robust-loss bisquare 16 | --robust-tuning-const 4.6851 17 | --joint2d-sigma 100 18 | 19 | --joint2d-weight 0.001 0.001 0.001 20 | --pose-prior-weight 0.04 0.04 0.0 21 | 
--shape-prior-weight 0.05 0.05 0.05 22 | 23 | --joint3d-smooth-weight 100.0 100.0 0.0 24 | 25 | --motion-prior-weight 0.0 0.0 0.075 26 | 27 | --init-motion-prior-weight 0.0 0.0 0.075 28 | 29 | --joint-consistency-weight 0.0 0.0 100.0 30 | --bone-length-weight 0.0 0.0 2000.0 31 | 32 | --contact-vel-weight 0.0 0.0 0.0 33 | --contact-height-weight 0.0 0.0 10.0 34 | 35 | --floor-reg-weight 0.0 0.0 0.167 36 | 37 | --lr 1.0 38 | --num-iters 30 80 70 39 | 40 | --stage3-tune-init-num-frames 15 41 | --stage3-tune-init-freeze-start 30 42 | --stage3-tune-init-freeze-end 55 43 | 44 | --save-results 45 | --save-stages-results -------------------------------------------------------------------------------- /configs/fit_prox.cfg: -------------------------------------------------------------------------------- 1 | 2 | --data-path ./data/prox 3 | --data-type PROX-RGB 4 | --data-fps 30 5 | 6 | --smpl ./body_models/smplh/neutral/model.npz 7 | --init-motion-prior ./checkpoints/init_state_prior_gmm 8 | --humor ./checkpoints/humor/best_model.pth 9 | --out ./out/prox_fitting 10 | 11 | --prox-batch-size 2 12 | --prox-seq-len 60 13 | 14 | --robust-loss bisquare 15 | --robust-tuning-const 4.6851 16 | --joint2d-sigma 100 17 | 18 | --joint2d-weight 0.001 0.001 0.001 19 | --pose-prior-weight 0.04 0.04 0.0 20 | --shape-prior-weight 0.05 0.05 0.05 21 | 22 | --joint3d-smooth-weight 100.0 100.0 0.0 23 | 24 | --motion-prior-weight 0.0 0.0 0.05 25 | 26 | --init-motion-prior-weight 0.0 0.0 0.05 27 | 28 | --joint-consistency-weight 0.0 0.0 100.0 29 | --bone-length-weight 0.0 0.0 2000.0 30 | 31 | --contact-vel-weight 0.0 0.0 100.0 32 | --contact-height-weight 0.0 0.0 10.0 33 | 34 | --floor-reg-weight 0.0 0.0 0.167 35 | 36 | --lr 1.0 37 | --num-iters 30 80 70 38 | 39 | --stage3-tune-init-num-frames 15 40 | --stage3-tune-init-freeze-start 30 41 | --stage3-tune-init-freeze-end 55 42 | 43 | --save-results 44 | --save-stages-results -------------------------------------------------------------------------------- /configs/fit_proxd.cfg: -------------------------------------------------------------------------------- 1 | 2 | --data-path ./data/prox 3 | --data-type PROX-RGBD 4 | --data-fps 30 5 | 6 | --smpl ./body_models/smplh/neutral/model.npz 7 | --init-motion-prior ./checkpoints/init_state_prior_gmm 8 | --humor ./checkpoints/humor/best_model.pth 9 | --out ./out/proxd_fitting 10 | 11 | --prox-batch-size 2 12 | --prox-seq-len 60 13 | 14 | --robust-loss bisquare 15 | --robust-tuning-const 4.6851 16 | --joint2d-sigma 100 17 | 18 | --point3d-weight 1.0 1.0 1.0 19 | --joint2d-weight 0.001 0.001 0.001 20 | --pose-prior-weight 0.1 0.1 0.0 21 | --shape-prior-weight 0.034 0.034 0.034 22 | 23 | --joint3d-smooth-weight 100.0 100.0 0.0 24 | 25 | --motion-prior-weight 0.0 0.0 0.075 26 | --motion-optim-shape 27 | 28 | --init-motion-prior-weight 0.0 0.0 0.075 29 | 30 | --joint-consistency-weight 0.0 0.0 100.0 31 | --bone-length-weight 0.0 0.0 2000.0 32 | 33 | --contact-vel-weight 0.0 0.0 100.0 34 | --contact-height-weight 0.0 0.0 10.0 35 | 36 | --floor-reg-weight 0.0 0.0 1.0 37 | 38 | --lr 1.0 39 | --num-iters 30 70 70 40 | 41 | --stage3-tune-init-num-frames 15 42 | --stage3-tune-init-freeze-start 30 43 | --stage3-tune-init-freeze-end 55 44 | 45 | --save-results 46 | --save-stages-results -------------------------------------------------------------------------------- /configs/fit_rgb_demo_no_split.cfg: -------------------------------------------------------------------------------- 1 | 2 | --data-path ./data/rgb_demo/hiphop_clip1.mp4 3 | 
--data-type RGB 4 | --mask-joints2d 5 | 6 | --openpose ./external/openpose 7 | --smpl ./body_models/smplh/male/model.npz 8 | --init-motion-prior ./checkpoints/init_state_prior_gmm 9 | --humor ./checkpoints/humor/best_model.pth 10 | --out ./out/rgb_demo_no_split 11 | 12 | --batch-size 1 13 | 14 | --robust-loss bisquare 15 | --robust-tuning-const 4.6851 16 | --joint2d-sigma 100 17 | 18 | --joint2d-weight 0.001 0.001 0.001 19 | --pose-prior-weight 0.04 0.04 0.0 20 | --shape-prior-weight 0.05 0.05 0.05 21 | 22 | --joint3d-smooth-weight 100.0 100.0 0.0 23 | 24 | --motion-prior-weight 0.0 0.0 0.075 25 | 26 | --init-motion-prior-weight 0.0 0.0 0.075 27 | 28 | --joint-consistency-weight 0.0 0.0 100.0 29 | --bone-length-weight 0.0 0.0 2000.0 30 | 31 | --contact-vel-weight 0.0 0.0 100.0 32 | --contact-height-weight 0.0 0.0 10.0 33 | 34 | --floor-reg-weight 0.0 0.0 0.167 35 | 36 | --lr 1.0 37 | --num-iters 30 80 70 38 | 39 | --stage3-tune-init-num-frames 15 40 | --stage3-tune-init-freeze-start 30 41 | --stage3-tune-init-freeze-end 55 42 | 43 | --save-results 44 | --save-stages-results -------------------------------------------------------------------------------- /configs/fit_rgb_demo_use_split.cfg: -------------------------------------------------------------------------------- 1 | 2 | --data-path ./data/rgb_demo/hiphop_clip1.mp4 3 | --data-type RGB 4 | --mask-joints2d 5 | 6 | --openpose ./external/openpose 7 | --smpl ./body_models/smplh/male/model.npz 8 | --init-motion-prior ./checkpoints/init_state_prior_gmm 9 | --humor ./checkpoints/humor/best_model.pth 10 | --out ./out/rgb_demo_use_split 11 | 12 | --batch-size 2 13 | 14 | --rgb-seq-len 60 15 | --rgb-overlap-len 10 16 | --rgb-overlap-consist-weight 200.0 200.0 200.0 17 | 18 | --robust-loss bisquare 19 | --robust-tuning-const 4.6851 20 | --joint2d-sigma 100 21 | 22 | --joint2d-weight 0.001 0.001 0.001 23 | --pose-prior-weight 0.04 0.04 0.0 24 | --shape-prior-weight 0.05 0.05 0.05 25 | 26 | --joint3d-smooth-weight 100.0 100.0 0.0 27 | 28 | --motion-prior-weight 0.0 0.0 0.075 29 | 30 | --init-motion-prior-weight 0.0 0.0 0.075 31 | 32 | --joint-consistency-weight 0.0 0.0 100.0 33 | --bone-length-weight 0.0 0.0 2000.0 34 | 35 | --contact-vel-weight 0.0 0.0 100.0 36 | --contact-height-weight 0.0 0.0 10.0 37 | 38 | --floor-reg-weight 0.0 0.0 0.167 39 | 40 | --lr 1.0 41 | --num-iters 30 80 70 42 | 43 | --stage3-tune-init-num-frames 15 44 | --stage3-tune-init-freeze-start 30 45 | --stage3-tune-init-freeze-end 55 46 | 47 | --save-results 48 | --save-stages-results -------------------------------------------------------------------------------- /configs/intrinsics_default.json: -------------------------------------------------------------------------------- 1 | [[1060.531764702488, 0.0, 951.2999547224418], [0.0, 1060.3856705041237, 536.7703598373467], [0.0, 0.0, 1.0]] 2 | -------------------------------------------------------------------------------- /configs/test_humor.cfg: -------------------------------------------------------------------------------- 1 | --dataset AmassDiscreteDataset 2 | --data-paths ./data/amass_processed 3 | --split-by dataset 4 | --sample-num-frames 10 5 | --data-steps-in 1 6 | --data-steps-out 1 7 | --data-rot-rep mat 8 | --data-return-config smpl+joints+contacts 9 | 10 | --model HumorModel 11 | --model-data-config smpl+joints+contacts 12 | --in-rot-rep mat 13 | --out-rot-rep aa 14 | --latent-size 48 15 | --model-steps-in 1 16 | 17 | --loss HumorLoss 18 | 19 | --out ./out/humor_test 20 | --ckpt ./checkpoints/humor/best_model.pth 
21 | --gpu 0 22 | --batch-size 48 23 | 24 | --eval-test -------------------------------------------------------------------------------- /configs/test_humor_qual.cfg: -------------------------------------------------------------------------------- 1 | --dataset AmassDiscreteDataset 2 | --data-paths ./data/amass_processed 3 | --split-by dataset 4 | --sample-num-frames 10 5 | --data-steps-in 1 6 | --data-steps-out 1 7 | --data-rot-rep mat 8 | --data-return-config smpl+joints+contacts 9 | 10 | --model HumorModel 11 | --model-data-config smpl+joints+contacts 12 | --in-rot-rep mat 13 | --out-rot-rep aa 14 | --latent-size 48 15 | --model-steps-in 1 16 | 17 | --model-use-smpl-joint-inputs 18 | 19 | --loss HumorLoss 20 | 21 | --out ./out/humor_qual_test 22 | --ckpt ./checkpoints/humor_qual/best_model.pth 23 | --gpu 0 24 | --batch-size 48 25 | 26 | --eval-test -------------------------------------------------------------------------------- /configs/test_humor_qual_sampling.cfg: -------------------------------------------------------------------------------- 1 | --dataset AmassDiscreteDataset 2 | --data-paths ./data/amass_processed 3 | --split-by dataset 4 | --sample-num-frames 10 5 | --data-steps-in 1 6 | --data-steps-out 1 7 | --data-rot-rep mat 8 | --data-return-config smpl+joints+contacts 9 | 10 | --model HumorModel 11 | --model-data-config smpl+joints+contacts 12 | --in-rot-rep mat 13 | --out-rot-rep aa 14 | --latent-size 48 15 | --model-steps-in 1 16 | 17 | --model-use-smpl-joint-inputs 18 | 19 | --loss HumorLoss 20 | 21 | --out ./out/humor_qual_test_sampling 22 | --ckpt ./checkpoints/humor_qual/best_model.pth 23 | --gpu 0 24 | --batch-size 1 25 | 26 | --eval-sampling 27 | --eval-sampling-len 10.0 28 | --eval-num-samples 1 29 | --shuffle-test -------------------------------------------------------------------------------- /configs/test_humor_qual_sampling_debug.cfg: -------------------------------------------------------------------------------- 1 | --dataset AmassDiscreteDataset 2 | --data-paths ./data/amass_processed 3 | --split-by dataset 4 | --sample-num-frames 10 5 | --data-steps-in 1 6 | --data-steps-out 1 7 | --data-rot-rep mat 8 | --data-return-config smpl+joints+contacts 9 | 10 | --model HumorModel 11 | --model-data-config smpl+joints+contacts 12 | --in-rot-rep mat 13 | --out-rot-rep aa 14 | --latent-size 48 15 | --model-steps-in 1 16 | 17 | --model-use-smpl-joint-inputs 18 | 19 | --loss HumorLoss 20 | 21 | --out ./out/humor_qual_test_sampling_debug 22 | --ckpt ./checkpoints/humor_qual/best_model.pth 23 | --gpu 0 24 | --batch-size 1 25 | 26 | --eval-sampling-debug 27 | --eval-sampling-len 5.0 28 | --eval-num-samples 1 29 | --shuffle-test -------------------------------------------------------------------------------- /configs/test_humor_recon.cfg: -------------------------------------------------------------------------------- 1 | --dataset AmassDiscreteDataset 2 | --data-paths ./data/amass_processed 3 | --split-by dataset 4 | --sample-num-frames 60 5 | --data-steps-in 1 6 | --data-steps-out 1 7 | --data-rot-rep mat 8 | --data-return-config smpl+joints+contacts 9 | 10 | --model HumorModel 11 | --model-data-config smpl+joints+contacts 12 | --in-rot-rep mat 13 | --out-rot-rep aa 14 | --latent-size 48 15 | --model-steps-in 1 16 | 17 | --loss HumorLoss 18 | 19 | --out ./out/humor_test_recon 20 | --ckpt ./checkpoints/humor/best_model.pth 21 | --gpu 0 22 | --batch-size 1 23 | 24 | --eval-recon 25 | --shuffle-test 
-------------------------------------------------------------------------------- /configs/test_humor_recon_debug.cfg: -------------------------------------------------------------------------------- 1 | --dataset AmassDiscreteDataset 2 | --data-paths ./data/amass_processed 3 | --split-by dataset 4 | --sample-num-frames 60 5 | --data-steps-in 1 6 | --data-steps-out 1 7 | --data-rot-rep mat 8 | --data-return-config smpl+joints+contacts 9 | 10 | --model HumorModel 11 | --model-data-config smpl+joints+contacts 12 | --in-rot-rep mat 13 | --out-rot-rep aa 14 | --latent-size 48 15 | --model-steps-in 1 16 | 17 | --loss HumorLoss 18 | 19 | --out ./out/humor_test_recon_debug 20 | --ckpt ./checkpoints/humor/best_model.pth 21 | --gpu 0 22 | --batch-size 1 23 | 24 | --eval-recon-debug 25 | --shuffle-test -------------------------------------------------------------------------------- /configs/test_humor_sampling.cfg: -------------------------------------------------------------------------------- 1 | --dataset AmassDiscreteDataset 2 | --data-paths ./data/amass_processed 3 | --split-by dataset 4 | --sample-num-frames 10 5 | --data-steps-in 1 6 | --data-steps-out 1 7 | --data-rot-rep mat 8 | --data-return-config smpl+joints+contacts 9 | 10 | --model HumorModel 11 | --model-data-config smpl+joints+contacts 12 | --in-rot-rep mat 13 | --out-rot-rep aa 14 | --latent-size 48 15 | --model-steps-in 1 16 | 17 | --loss HumorLoss 18 | 19 | --out ./out/humor_test_sampling 20 | --ckpt ./checkpoints/humor/best_model.pth 21 | --gpu 0 22 | --batch-size 1 23 | 24 | --eval-sampling 25 | --eval-sampling-len 10.0 26 | --eval-num-samples 1 27 | --shuffle-test -------------------------------------------------------------------------------- /configs/test_humor_sampling_debug.cfg: -------------------------------------------------------------------------------- 1 | --dataset AmassDiscreteDataset 2 | --data-paths ./data/amass_processed 3 | --split-by dataset 4 | --sample-num-frames 10 5 | --data-steps-in 1 6 | --data-steps-out 1 7 | --data-rot-rep mat 8 | --data-return-config smpl+joints+contacts 9 | 10 | --model HumorModel 11 | --model-data-config smpl+joints+contacts 12 | --in-rot-rep mat 13 | --out-rot-rep aa 14 | --latent-size 48 15 | --model-steps-in 1 16 | 17 | --loss HumorLoss 18 | 19 | --out ./out/humor_test_sampling_debug 20 | --ckpt ./checkpoints/humor/best_model.pth 21 | --gpu 0 22 | --batch-size 1 23 | 24 | --eval-sampling-debug 25 | --eval-sampling-len 5.0 26 | --eval-num-samples 1 27 | --shuffle-test -------------------------------------------------------------------------------- /configs/train_humor.cfg: -------------------------------------------------------------------------------- 1 | --dataset AmassDiscreteDataset 2 | --data-paths ./data/amass_processed 3 | --split-by dataset 4 | --sample-num-frames 10 5 | --data-steps-in 1 6 | --data-steps-out 1 7 | --data-rot-rep mat 8 | --data-return-config smpl+joints+contacts 9 | 10 | --model HumorModel 11 | --model-data-config smpl+joints+contacts 12 | --in-rot-rep mat 13 | --out-rot-rep aa 14 | --latent-size 48 15 | --model-steps-in 1 16 | 17 | --loss HumorLoss 18 | --kl-loss 0.0004 19 | --kl-loss-anneal-start 0 20 | --kl-loss-anneal-end 50 21 | 22 | --contacts-loss 0.01 23 | --contacts-vel-loss 0.01 24 | 25 | --regr-trans-loss 1.0 26 | --regr-trans-vel-loss 1.0 27 | --regr-root-orient-loss 1.0 28 | --regr-root-orient-vel-loss 1.0 29 | --regr-pose-loss 1.0 30 | --regr-pose-vel-loss 1.0 31 | --regr-joint-loss 1.0 32 | --regr-joint-vel-loss 1.0 33 | 
--regr-joint-orient-vel-loss 1.0 34 | --regr-vert-loss 1.0 35 | --regr-vert-vel-loss 1.0 36 | 37 | --smpl-joint-loss 1.0 38 | --smpl-mesh-loss 1.0 39 | --smpl-joint-consistency-loss 1.0 40 | 41 | --out ./out/humor_train 42 | --gpu 0 43 | --batch-size 200 44 | --epochs 200 45 | --lr 1e-4 46 | --sched-milestones 50 80 140 47 | --sched-decay 0.5 0.2 0.4 48 | 49 | --sched-samp-start 10 50 | --sched-samp-end 20 51 | 52 | --val-every 2 53 | --save-every 25 54 | --print-every 10 -------------------------------------------------------------------------------- /configs/train_humor_qual.cfg: -------------------------------------------------------------------------------- 1 | --dataset AmassDiscreteDataset 2 | --data-paths ./data/amass_processed 3 | --split-by dataset 4 | --sample-num-frames 10 5 | --data-steps-in 1 6 | --data-steps-out 1 7 | --data-rot-rep mat 8 | --data-return-config smpl+joints+contacts 9 | 10 | --model HumorModel 11 | --model-data-config smpl+joints+contacts 12 | --in-rot-rep mat 13 | --out-rot-rep aa 14 | --latent-size 48 15 | --model-steps-in 1 16 | 17 | --model-use-smpl-joint-inputs 18 | 19 | --loss HumorLoss 20 | --kl-loss 0.0004 21 | --kl-loss-anneal-start 0 22 | --kl-loss-anneal-end 50 23 | 24 | --contacts-loss 0.01 25 | 26 | --regr-trans-loss 1.0 27 | --regr-trans-vel-loss 1.0 28 | --regr-root-orient-loss 1.0 29 | --regr-root-orient-vel-loss 1.0 30 | --regr-pose-loss 1.0 31 | --regr-pose-vel-loss 1.0 32 | --regr-joint-loss 1.0 33 | --regr-joint-vel-loss 1.0 34 | --regr-joint-orient-vel-loss 1.0 35 | --regr-vert-loss 1.0 36 | --regr-vert-vel-loss 1.0 37 | 38 | --smpl-joint-loss 1.0 39 | --smpl-mesh-loss 1.0 40 | --smpl-joint-consistency-loss 1.0 41 | 42 | --out ./out/humor_qual_train 43 | --gpu 0 44 | --batch-size 200 45 | --epochs 200 46 | --lr 1e-4 47 | --sched-milestones 50 80 140 48 | --sched-decay 0.5 0.2 0.4 49 | 50 | --sched-samp-start 10 51 | --sched-samp-end 20 52 | 53 | --val-every 2 54 | --save-every 25 55 | --print-every 10 -------------------------------------------------------------------------------- /data/README.md: -------------------------------------------------------------------------------- 1 | By default, all datasets described in this README are expected to be placed in this directory. If you decide to install them elsewhere, you will need to modify various input arguments within the config files. 2 | 3 | ## AMASS 4 | First you must obtain the raw AMASS dataset: 5 | * Create a directory to place raw data in: `mkdir amass_raw` 6 | * Create an account on the [project page](https://amass.is.tue.mpg.de/) 7 | * Go to the `Downloads` page and download the SMPL+H Body Data for all datasets. Extract each dataset to its own directory within `amass_raw` (_e.g._ all CMU data should be at `data/amass_raw/CMU`). 8 | 9 | Next, run the data processing which samples motions to 30 Hz, removes terrain interaction sequences, detects contacts, etc.., and saves the data into the format used by our codebase. From the root of this repo, run: 10 | ``` 11 | python humor/scripts/process_amass_data.py --amass-root ./data/amass_raw --out ./data/amass_processed --smplh-root ./body_models/smplh 12 | ``` 13 | By default this processes every sub-dataset in AMASS. If you only want to process a subset, e.g., CMU and HumanEva, pass in the flag `--datasets CMU HumanEva`. 
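Once processing finishes (and before moving on to the cleanup step below), you can optionally sanity-check that readable sequence files were produced. This is just a rough sketch; the exact directory layout under `amass_processed` and the arrays stored per sequence are not spelled out here, so simply inspect whatever gets printed:
```
import glob
import numpy as np

# assumes processed sequences are saved as .npz files somewhere under this root
seq_files = sorted(glob.glob('./data/amass_processed/**/*.npz', recursive=True))
print('found %d processed sequence files' % len(seq_files))
if len(seq_files) > 0:
    seq = np.load(seq_files[0])
    print(seq_files[0], seq.files)  # list the arrays stored for this sequence
```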
14 | 15 | A second script does some small extra cleanup to remove sequences we found to be problematic (e.g., walking/running on a treadmill and ice skating, which negatively affect learning the motion model): 16 | ``` 17 | python humor/scripts/cleanup_amass_data.py --data ./data/amass_processed --backup ./data/cleanup_bk 18 | ``` 19 | The `--backup` flag indicates a directory where the sequences that are removed will be saved in case you need them again later. 20 | 21 | > Note: not all of the above processed data is actually used in training/testing HuMoR. To see the exact dataset splits used in the paper, see [this script](../humor/datasets/amass_utils.py). 22 | 23 | ## i3DB 24 | We have prepared a pre-processed version of the i3DB dataset, originally released with [iMapper](https://github.com/amonszpart/iMapper), that can be downloaded directly. To download, from this directory run: 25 | ``` 26 | bash get_i3db.sh 27 | ``` 28 | 29 | ## PROX 30 | We have prepared ground plane and 2D joint data that complement PROX in order to easily run our method on the dataset. The first step is to download the PROX dataset: 31 | * First create the structure. From this directory run `mkdir prox && mkdir prox/qualitative` 32 | * Create an account on the [project page](https://prox.is.tue.mpg.de/) 33 | * Go to the `Download` page and download all files under "Qualitative PROX dataset" (note: `videos.zip` and `PROXD_videos.zip` are not required). Unzip these files into the `prox/qualitative` directory created above. 34 | 35 | You should now have the full PROX qualitative dataset with directory structure (if you downloaded all the optional files): 36 | ``` 37 | prox/qualitative 38 | ├── body_segments 39 | ├── calibration 40 | ├── cam2world 41 | ├── PROXD 42 | ├── PROXD_videos 43 | ├── recordings 44 | ├── scenes 45 | └── sdf 46 | ``` 47 | 48 | To download the [OpenPose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) 2D joint detections, ground truth floor planes, and [PlaneRCNN](https://github.com/NVlabs/planercnn) detections used in our paper, from this directory run `bash get_prox_extra.sh`. This will add `floors`, `keypoints`, and `planes` directories to the structure above. 
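After the PROX downloads above, here is a quick, rough check that the expected layout is in place (directory names are taken from the structure shown; optional folders like `PROXD` and `PROXD_videos` are skipped):
```
import os

prox_root = './data/prox/qualitative'
expected = ['body_segments', 'calibration', 'cam2world', 'recordings', 'scenes', 'sdf',
            'floors', 'keypoints', 'planes']  # the last three come from get_prox_extra.sh
for d in expected:
    status = 'ok' if os.path.isdir(os.path.join(prox_root, d)) else 'MISSING'
    print('%-15s %s' % (d, status))
```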
-------------------------------------------------------------------------------- /data/get_i3db.sh: -------------------------------------------------------------------------------- 1 | # downloads and unzips i3DB data (~4.5 GB) 2 | wget http://download.cs.stanford.edu/orion/humor/iMapper.zip 3 | unzip iMapper.zip -------------------------------------------------------------------------------- /data/get_prox_extra.sh: -------------------------------------------------------------------------------- 1 | # downloads and unzips additional PROX data (~87 MB) 2 | wget http://download.cs.stanford.edu/orion/humor/prox.zip 3 | unzip prox.zip -------------------------------------------------------------------------------- /data/rgb_demo/hiphop_clip1.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/davrempe/humor/fc6ef84f0baa153be15427402e0147ed1a63a11a/data/rgb_demo/hiphop_clip1.mp4 -------------------------------------------------------------------------------- /get_ckpt.sh: -------------------------------------------------------------------------------- 1 | # downloads and unzips pre-trained HuMoR models (215 MB) 2 | wget http://download.cs.stanford.edu/orion/humor/checkpoints.zip 3 | unzip checkpoints.zip -------------------------------------------------------------------------------- /humor.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/davrempe/humor/fc6ef84f0baa153be15427402e0147ed1a63a11a/humor.png -------------------------------------------------------------------------------- /humor/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/davrempe/humor/fc6ef84f0baa153be15427402e0147ed1a63a11a/humor/__init__.py -------------------------------------------------------------------------------- /humor/body_model/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/davrempe/humor/fc6ef84f0baa153be15427402e0147ed1a63a11a/humor/body_model/__init__.py -------------------------------------------------------------------------------- /humor/body_model/body_model.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | import torch 4 | import torch.nn as nn 5 | import pickle 6 | 7 | from smplx import SMPL, SMPLH, SMPLX 8 | from smplx.vertex_ids import vertex_ids 9 | from smplx.utils import Struct 10 | 11 | class BodyModel(nn.Module): 12 | ''' 13 | Wrapper around SMPLX body model class. 14 | ''' 15 | 16 | def __init__(self, 17 | bm_path, 18 | num_betas=10, 19 | batch_size=1, 20 | num_expressions=10, 21 | use_vtx_selector=False, 22 | model_type='smplh'): 23 | super(BodyModel, self).__init__() 24 | ''' 25 | Creates the body model object at the given path. 
26 | 27 | :param bm_path: path to the body model pkl file 28 | :param num_expressions: only for smplx 29 | :param model_type: one of [smpl, smplh, smplx] 30 | :param use_vtx_selector: if true, returns additional vertices as joints that correspond to OpenPose joints 31 | ''' 32 | self.use_vtx_selector = use_vtx_selector 33 | cur_vertex_ids = None 34 | if self.use_vtx_selector: 35 | cur_vertex_ids = vertex_ids[model_type] 36 | data_struct = None 37 | if '.npz' in bm_path: 38 | # smplx does not support .npz by default, so have to load in manually 39 | smpl_dict = np.load(bm_path, encoding='latin1') 40 | data_struct = Struct(**smpl_dict) 41 | # print(smpl_dict.files) 42 | if model_type == 'smplh': 43 | data_struct.hands_componentsl = np.zeros((0)) 44 | data_struct.hands_componentsr = np.zeros((0)) 45 | data_struct.hands_meanl = np.zeros((15 * 3)) 46 | data_struct.hands_meanr = np.zeros((15 * 3)) 47 | V, D, B = data_struct.shapedirs.shape 48 | data_struct.shapedirs = np.concatenate([data_struct.shapedirs, np.zeros((V, D, SMPL.SHAPE_SPACE_DIM-B))], axis=-1) # super hacky way to let smplh use 16-size beta 49 | kwargs = { 50 | 'model_type' : model_type, 51 | 'data_struct' : data_struct, 52 | 'num_betas': num_betas, 53 | 'batch_size' : batch_size, 54 | 'num_expression_coeffs' : num_expressions, 55 | 'vertex_ids' : cur_vertex_ids, 56 | 'use_pca' : False, 57 | 'flat_hand_mean' : True 58 | } 59 | assert(model_type in ['smpl', 'smplh', 'smplx']) 60 | if model_type == 'smpl': 61 | self.bm = SMPL(bm_path, **kwargs) 62 | self.num_joints = SMPL.NUM_JOINTS 63 | elif model_type == 'smplh': 64 | self.bm = SMPLH(bm_path, **kwargs) 65 | self.num_joints = SMPLH.NUM_JOINTS 66 | elif model_type == 'smplx': 67 | self.bm = SMPLX(bm_path, **kwargs) 68 | self.num_joints = SMPLX.NUM_JOINTS 69 | 70 | self.model_type = model_type 71 | 72 | def forward(self, root_orient=None, pose_body=None, pose_hand=None, pose_jaw=None, pose_eye=None, betas=None, 73 | trans=None, dmpls=None, expression=None, return_dict=False, **kwargs): 74 | ''' 75 | Note dmpls are not supported. 
76 | ''' 77 | assert(dmpls is None) 78 | out_obj = self.bm( 79 | betas=betas, 80 | global_orient=root_orient, 81 | body_pose=pose_body, 82 | left_hand_pose=None if pose_hand is None else pose_hand[:,:(SMPLH.NUM_HAND_JOINTS*3)], 83 | right_hand_pose=None if pose_hand is None else pose_hand[:,(SMPLH.NUM_HAND_JOINTS*3):], 84 | transl=trans, 85 | expression=expression, 86 | jaw_pose=pose_jaw, 87 | leye_pose=None if pose_eye is None else pose_eye[:,:3], 88 | reye_pose=None if pose_eye is None else pose_eye[:,3:], 89 | return_full_pose=True, 90 | **kwargs 91 | ) 92 | 93 | out = { 94 | 'v' : out_obj.vertices, 95 | 'f' : self.bm.faces_tensor, 96 | 'betas' : out_obj.betas, 97 | 'Jtr' : out_obj.joints, 98 | 'pose_body' : out_obj.body_pose, 99 | 'full_pose' : out_obj.full_pose 100 | } 101 | if self.model_type in ['smplh', 'smplx']: 102 | out['pose_hand'] = torch.cat([out_obj.left_hand_pose, out_obj.right_hand_pose], dim=-1) 103 | if self.model_type == 'smplx': 104 | out['pose_jaw'] = out_obj.jaw_pose 105 | out['pose_eye'] = pose_eye 106 | 107 | 108 | if not self.use_vtx_selector: 109 | # don't need extra joints 110 | out['Jtr'] = out['Jtr'][:,:self.num_joints+1] # add one for the root 111 | 112 | if not return_dict: 113 | out = Struct(**out) 114 | 115 | return out 116 | 117 | -------------------------------------------------------------------------------- /humor/body_model/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import torch 4 | 5 | SMPL_JOINTS = {'hips' : 0, 'leftUpLeg' : 1, 'rightUpLeg' : 2, 'spine' : 3, 'leftLeg' : 4, 'rightLeg' : 5, 6 | 'spine1' : 6, 'leftFoot' : 7, 'rightFoot' : 8, 'spine2' : 9, 'leftToeBase' : 10, 'rightToeBase' : 11, 7 | 'neck' : 12, 'leftShoulder' : 13, 'rightShoulder' : 14, 'head' : 15, 'leftArm' : 16, 'rightArm' : 17, 8 | 'leftForeArm' : 18, 'rightForeArm' : 19, 'leftHand' : 20, 'rightHand' : 21} 9 | SMPL_PARENTS = [-1, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 12, 12, 13, 14, 16, 17, 18, 19] 10 | 11 | SMPLH_PATH = './body_models/smplh' 12 | SMPLX_PATH = './body_models/smplx' 13 | SMPL_PATH = './body_models/smpl' 14 | VPOSER_PATH = './body_models/vposer_v1_0' 15 | 16 | # chosen virtual mocap markers that are "keypoints" to work with 17 | KEYPT_VERTS = [4404, 920, 3076, 3169, 823, 4310, 1010, 1085, 4495, 4569, 6615, 3217, 3313, 6713, 18 | 6785, 3383, 6607, 3207, 1241, 1508, 4797, 4122, 1618, 1569, 5135, 5040, 5691, 5636, 19 | 5404, 2230, 2173, 2108, 134, 3645, 6543, 3123, 3024, 4194, 1306, 182, 3694, 4294, 744] 20 | 21 | 22 | # 23 | # From https://github.com/vchoutas/smplify-x/blob/master/smplifyx/utils.py 24 | # Please see license for usage restrictions. 25 | # 26 | def smpl_to_openpose(model_type='smplx', use_hands=True, use_face=True, 27 | use_face_contour=False, openpose_format='coco25'): 28 | ''' Returns the indices of the permutation that maps SMPL to OpenPose 29 | 30 | Parameters 31 | ---------- 32 | model_type: str, optional 33 | The type of SMPL-like model that is used. The default mapping 34 | returned is for the SMPLX model 35 | use_hands: bool, optional 36 | Flag for adding to the returned permutation the mapping for the 37 | hand keypoints. Defaults to True 38 | use_face: bool, optional 39 | Flag for adding to the returned permutation the mapping for the 40 | face keypoints. Defaults to True 41 | use_face_contour: bool, optional 42 | Flag for appending the facial contour keypoints. Defaults to False 43 | openpose_format: bool, optional 44 | The output format of OpenPose. 
For now only COCO-25 and COCO-19 is 45 | supported. Defaults to 'coco25' 46 | 47 | ''' 48 | if openpose_format.lower() == 'coco25': 49 | if model_type == 'smpl': 50 | return np.array([24, 12, 17, 19, 21, 16, 18, 20, 0, 2, 5, 8, 1, 4, 51 | 7, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34], 52 | dtype=np.int32) 53 | elif model_type == 'smplh': 54 | body_mapping = np.array([52, 12, 17, 19, 21, 16, 18, 20, 0, 2, 5, 55 | 8, 1, 4, 7, 53, 54, 55, 56, 57, 58, 59, 56 | 60, 61, 62], dtype=np.int32) 57 | mapping = [body_mapping] 58 | if use_hands: 59 | lhand_mapping = np.array([20, 34, 35, 36, 63, 22, 23, 24, 64, 60 | 25, 26, 27, 65, 31, 32, 33, 66, 28, 61 | 29, 30, 67], dtype=np.int32) 62 | rhand_mapping = np.array([21, 49, 50, 51, 68, 37, 38, 39, 69, 63 | 40, 41, 42, 70, 46, 47, 48, 71, 43, 64 | 44, 45, 72], dtype=np.int32) 65 | mapping += [lhand_mapping, rhand_mapping] 66 | return np.concatenate(mapping) 67 | # SMPLX 68 | elif model_type == 'smplx': 69 | body_mapping = np.array([55, 12, 17, 19, 21, 16, 18, 20, 0, 2, 5, 70 | 8, 1, 4, 7, 56, 57, 58, 59, 60, 61, 62, 71 | 63, 64, 65], dtype=np.int32) 72 | mapping = [body_mapping] 73 | if use_hands: 74 | lhand_mapping = np.array([20, 37, 38, 39, 66, 25, 26, 27, 75 | 67, 28, 29, 30, 68, 34, 35, 36, 69, 76 | 31, 32, 33, 70], dtype=np.int32) 77 | rhand_mapping = np.array([21, 52, 53, 54, 71, 40, 41, 42, 72, 78 | 43, 44, 45, 73, 49, 50, 51, 74, 46, 79 | 47, 48, 75], dtype=np.int32) 80 | 81 | mapping += [lhand_mapping, rhand_mapping] 82 | if use_face: 83 | # end_idx = 127 + 17 * use_face_contour 84 | face_mapping = np.arange(76, 127 + 17 * use_face_contour, 85 | dtype=np.int32) 86 | mapping += [face_mapping] 87 | 88 | return np.concatenate(mapping) 89 | else: 90 | raise ValueError('Unknown model type: {}'.format(model_type)) 91 | elif openpose_format == 'coco19': 92 | if model_type == 'smpl': 93 | return np.array([24, 12, 17, 19, 21, 16, 18, 20, 0, 2, 5, 8, 94 | 1, 4, 7, 25, 26, 27, 28], 95 | dtype=np.int32) 96 | elif model_type == 'smplh': 97 | body_mapping = np.array([52, 12, 17, 19, 21, 16, 18, 20, 0, 2, 5, 98 | 8, 1, 4, 7, 53, 54, 55, 56], 99 | dtype=np.int32) 100 | mapping = [body_mapping] 101 | if use_hands: 102 | lhand_mapping = np.array([20, 34, 35, 36, 57, 22, 23, 24, 58, 103 | 25, 26, 27, 59, 31, 32, 33, 60, 28, 104 | 29, 30, 61], dtype=np.int32) 105 | rhand_mapping = np.array([21, 49, 50, 51, 62, 37, 38, 39, 63, 106 | 40, 41, 42, 64, 46, 47, 48, 65, 43, 107 | 44, 45, 66], dtype=np.int32) 108 | mapping += [lhand_mapping, rhand_mapping] 109 | return np.concatenate(mapping) 110 | # SMPLX 111 | elif model_type == 'smplx': 112 | body_mapping = np.array([55, 12, 17, 19, 21, 16, 18, 20, 0, 2, 5, 113 | 8, 1, 4, 7, 56, 57, 58, 59], 114 | dtype=np.int32) 115 | mapping = [body_mapping] 116 | if use_hands: 117 | lhand_mapping = np.array([20, 37, 38, 39, 60, 25, 26, 27, 118 | 61, 28, 29, 30, 62, 34, 35, 36, 63, 119 | 31, 32, 33, 64], dtype=np.int32) 120 | rhand_mapping = np.array([21, 52, 53, 54, 65, 40, 41, 42, 66, 121 | 43, 44, 45, 67, 49, 50, 51, 68, 46, 122 | 47, 48, 69], dtype=np.int32) 123 | 124 | mapping += [lhand_mapping, rhand_mapping] 125 | if use_face: 126 | face_mapping = np.arange(70, 70 + 51 + 127 | 17 * use_face_contour, 128 | dtype=np.int32) 129 | mapping += [face_mapping] 130 | 131 | return np.concatenate(mapping) 132 | else: 133 | raise ValueError('Unknown model type: {}'.format(model_type)) 134 | else: 135 | raise ValueError('Unknown joint format: {}'.format(openpose_format)) 136 | 
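# --- Editorial usage sketch (not part of the original file) ---
# smpl_to_openpose() above returns the permutation that reorders model joints into
# the OpenPose ordering. Below is a minimal example for the SMPL+H body-only /
# BODY_25 case; the joint tensor is a random placeholder, and in practice it would
# come from a BodyModel (body_model/body_model.py) created with use_vtx_selector=True
# so the extra OpenPose keypoint vertices are appended after the 52 SMPL+H joints.
if __name__ == '__main__':
    op_map = smpl_to_openpose(model_type='smplh', use_hands=False, use_face=False,
                              openpose_format='coco25')  # 25 indices in BODY_25 order
    model_joints = torch.randn(1, 73, 3)  # (T, 52 joints + 21 extra verts, 3), placeholder
    joints_body25 = model_joints[:, torch.from_numpy(op_map).long(), :]
    print(op_map.shape, joints_body25.shape)  # (25,) and (1, 25, 3)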
-------------------------------------------------------------------------------- /humor/datasets/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/davrempe/humor/fc6ef84f0baa153be15427402e0147ed1a63a11a/humor/datasets/__init__.py -------------------------------------------------------------------------------- /humor/datasets/amass_fit_dataset.py: -------------------------------------------------------------------------------- 1 | import sys, os 2 | cur_file_path = os.path.dirname(os.path.realpath(__file__)) 3 | sys.path.append(os.path.join(cur_file_path, '..')) 4 | 5 | import numpy as np 6 | from torch.utils.data import Dataset 7 | import torch 8 | 9 | from datasets.amass_utils import CONTACT_INDS 10 | from utils.transforms import rotation_matrix_to_angle_axis 11 | from body_model.body_model import BodyModel 12 | from datasets.amass_discrete_dataset import AmassDiscreteDataset 13 | from body_model.utils import SMPLH_PATH, SMPL_JOINTS 14 | 15 | class AMASSFitDataset(Dataset): 16 | ''' 17 | Wrapper around AmassDiscreteDataset to return observed and GT data as expected and add desired noise. 18 | ''' 19 | 20 | def __init__(self, data_path, 21 | seq_len=60, 22 | return_joints=True, 23 | return_verts=True, 24 | return_points=True, 25 | noise_std=0.0, 26 | make_partial=False, 27 | partial_height=0.75, 28 | drop_middle=False, 29 | num_samp_pts=512, 30 | root_only=False, 31 | split_by='dataset', 32 | custom_split=None): 33 | 34 | self.seq_len = seq_len # global seq returns + 1 35 | self.return_joints = return_joints 36 | self.return_verts = return_verts 37 | self.return_points = return_points 38 | self.num_samp_pts = num_samp_pts 39 | self.noise_std = noise_std 40 | self.make_partial = make_partial 41 | self.partial_height = partial_height 42 | self.drop_middle = drop_middle 43 | self.root_only = root_only 44 | split_str = 'test' 45 | if split_by == 'dataset' and custom_split is not None: 46 | split_str = 'custom' 47 | self.amass_dataset = AmassDiscreteDataset(split=split_str, 48 | data_paths=[data_path], 49 | split_by=split_by, 50 | sample_num_frames=seq_len - 1, # global seq returns + 1 51 | step_frames_in=1, 52 | step_frames_out=1, 53 | data_rot_rep='aa', 54 | data_return_config='all', 55 | only_global=True, 56 | custom_split=custom_split) 57 | 58 | if self.return_points: 59 | # need to have SMPL model to sample on surface 60 | self.device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu') 61 | male_bm_path = os.path.join(SMPLH_PATH, 'male/model.npz') 62 | female_bm_path = os.path.join(SMPLH_PATH, 'female/model.npz') 63 | self.male_bm = BodyModel(bm_path=male_bm_path, num_betas=16, batch_size=self.seq_len).to(self.device) 64 | self.female_bm = BodyModel(bm_path=female_bm_path, num_betas=16, batch_size=self.seq_len).to(self.device) 65 | 66 | def __len__(self): 67 | return self.amass_dataset.__len__() 68 | 69 | def __getitem__(self, idx): 70 | global_data, meta = self.amass_dataset.__getitem__(idx) 71 | 72 | # create the ground truth data dictionary 73 | gt_dict = dict() 74 | for k, v in global_data.items(): 75 | gt_key = '_'.join(k.split('_')[1:]) 76 | T = v.size(0) 77 | if gt_key == 'root_orient' or gt_key == 'pose_body': 78 | # convert mat to aa rots 79 | v = rotation_matrix_to_angle_axis(v.reshape((-1, 3, 3))).reshape((T, -1)) 80 | gt_dict[gt_key] = v 81 | gt_dict['betas'] = meta['betas'] 82 | gt_dict['gender'] = meta['gender'] 83 | gt_dict['name'] = meta['path'].replace('/', '_')[:-4] 84 
| 85 | # create clean observations 86 | observed_dict = dict() 87 | if self.return_joints: 88 | # 3d joint positions 89 | observed_dict['joints3d'] = gt_dict['joints'].clone().detach() 90 | if self.root_only: 91 | for k, v in SMPL_JOINTS.items(): 92 | if k not in ['leftArm', 'rightArm', 'head', 'neck', 'hips']: 93 | observed_dict['joints3d'][:,v,:] = float('inf') # everything but root is not observed 94 | if self.return_verts: 95 | # 3d vertex positions (of specific chosen keypoint vertices) 96 | observed_dict['verts3d'] = gt_dict['verts'].clone().detach() 97 | if self.return_points: 98 | import trimesh 99 | # forward pass of SMPL 100 | cur_bm = self.male_bm if meta['gender'] == 'male' else self.female_bm 101 | body_gt = cur_bm(pose_body=gt_dict['pose_body'].to(self.device), 102 | pose_hand=None, 103 | betas=gt_dict['betas'].to(self.device), 104 | root_orient=gt_dict['root_orient'].to(self.device), 105 | trans=gt_dict['trans'].to(self.device)) 106 | # sample points on the surface 107 | nv = body_gt.v.size(1) 108 | gt_dict['points'] = body_gt.v 109 | points_list = [] 110 | for t in range(self.seq_len): 111 | verts = body_gt.v[t].cpu().detach().numpy() 112 | faces = body_gt.f.cpu().detach().numpy() 113 | body_mesh = trimesh.Trimesh(vertices=verts, 114 | faces=faces, 115 | process=False) 116 | pts_t = trimesh.sample.sample_surface(body_mesh, self.num_samp_pts)[0] 117 | points_list.append(pts_t) 118 | points = torch.Tensor(np.stack(points_list, axis=0)) 119 | observed_dict['points3d'] = points 120 | 121 | # add gaussian noise 122 | if self.noise_std > 0.0: 123 | for k in observed_dict.keys(): 124 | observed_dict[k] += torch.randn_like(observed_dict[k])*self.noise_std 125 | 126 | if self.make_partial: 127 | # if z below a certain threshold make occluded 128 | for k, v in observed_dict.items(): 129 | if k == 'joints3d' and self.root_only: 130 | continue 131 | occluded_mask = observed_dict[k][:,:,2:3] < self.partial_height #0.95 #0.75 132 | occluded_mask = occluded_mask.expand_as(observed_dict[k]) 133 | observed_dict[k][occluded_mask] = float('inf') 134 | 135 | if k == 'points3d': 136 | vis_mask = torch.logical_not(occluded_mask) 137 | cur_points_seq = observed_dict[k] 138 | for t in range(self.seq_len): 139 | vis_pts = cur_points_seq[t][vis_mask[t]] 140 | vis_pts = vis_pts.reshape((-1, 3)) 141 | vis_pts = resize_points(vis_pts, self.num_samp_pts) 142 | observed_dict[k][t] = vis_pts 143 | 144 | if self.drop_middle: 145 | for k, v in observed_dict.items(): 146 | sidx = self.seq_len // 3 147 | eidx = sidx + (self.seq_len // 3) 148 | observed_dict[k][sidx:eidx] = float('inf') 149 | 150 | if 'contacts' in gt_dict: 151 | gt_contacts = gt_dict['contacts'] 152 | full_contacts = torch.zeros((gt_contacts.size(0), len(SMPL_JOINTS))).to(gt_contacts) 153 | full_contacts[:,CONTACT_INDS] = gt_contacts 154 | gt_dict['contacts'] = full_contacts 155 | 156 | return observed_dict, gt_dict -------------------------------------------------------------------------------- /humor/datasets/amass_utils.py: -------------------------------------------------------------------------------- 1 | 2 | from body_model.utils import SMPL_JOINTS 3 | 4 | 5 | TRAIN_DATASETS = ['CMU', 'MPI_Limits', 'TotalCapture', 'Eyes_Japan_Dataset', 'KIT', 'BioMotionLab_NTroje', 'BMLmovi', 6 | 'EKUT', 'ACCAD'] 7 | TEST_DATASETS = ['Transitions_mocap', 'HumanEva'] 8 | VAL_DATASETS = ['MPI_HDM05', 'SFU', 'MPI_mosh'] 9 | 10 | 11 | SPLITS = ['train', 'val', 'test', 'custom'] 12 | SPLIT_BY = [ 13 | 'single', # the data path is a single .npz file. 
Don't split: train and test are same 14 | 'sequence', # the data paths are directories of subjects. Collate and split by sequence. 15 | 'subject', # the data paths are directories of datasets. Collate and split by subject. 16 | 'dataset' # a single data path to the amass data root is given. The predefined datasets will be used for each split. 17 | ] 18 | 19 | ROT_REPS = ['mat', 'aa', '6d'] 20 | 21 | # these correspond to [root, left knee, right knee, left heel, right heel, left toe, right toe, left hand, right hand] 22 | CONTACT_ORDERING = ['hips', 'leftLeg', 'rightLeg', 'leftFoot', 'rightFoot', 'leftToeBase', 'rightToeBase', 'leftHand', 'rightHand'] 23 | CONTACT_INDS = [SMPL_JOINTS[jname] for jname in CONTACT_ORDERING] 24 | 25 | NUM_BODY_JOINTS = len(SMPL_JOINTS)-1 26 | NUM_KEYPT_VERTS = 43 27 | 28 | DATA_NAMES = ['trans', 'trans_vel', 'root_orient', 'root_orient_vel', 'pose_body', 'pose_body_vel', 'joints', 'joints_vel', 'joints_orient_vel', 'verts', 'verts_vel', 'contacts'] 29 | 30 | SMPL_JOINTS_RETURN_CONFIG = { 31 | 'trans' : True, 32 | 'trans_vel' : True, 33 | 'root_orient' : True, 34 | 'root_orient_vel' : True, 35 | 'pose_body' : True, 36 | 'pose_body_vel' : False, 37 | 'joints' : True, 38 | 'joints_vel' : True, 39 | 'joints_orient_vel' : False, 40 | 'verts' : False, 41 | 'verts_vel' : False, 42 | 'contacts' : False 43 | } 44 | 45 | SMPL_JOINTS_CONTACTS_RETURN_CONFIG = { 46 | 'trans' : True, 47 | 'trans_vel' : True, 48 | 'root_orient' : True, 49 | 'root_orient_vel' : True, 50 | 'pose_body' : True, 51 | 'pose_body_vel' : False, 52 | 'joints' : True, 53 | 'joints_vel' : True, 54 | 'joints_orient_vel' : False, 55 | 'verts' : False, 56 | 'verts_vel' : False, 57 | 'contacts' : True 58 | } 59 | 60 | ALL_RETURN_CONFIG = { 61 | 'trans' : True, 62 | 'trans_vel' : True, 63 | 'root_orient' : True, 64 | 'root_orient_vel' : True, 65 | 'pose_body' : True, 66 | 'pose_body_vel' : False, 67 | 'joints' : True, 68 | 'joints_vel' : True, 69 | 'joints_orient_vel' : False, 70 | 'verts' : True, 71 | 'verts_vel' : False, 72 | 'contacts' : True 73 | } 74 | 75 | RETURN_CONFIGS = { 76 | 'smpl+joints+contacts' : SMPL_JOINTS_CONTACTS_RETURN_CONFIG, 77 | 'smpl+joints' : SMPL_JOINTS_RETURN_CONFIG, 78 | 'all' : ALL_RETURN_CONFIG 79 | } 80 | 81 | def data_name_list(return_config): 82 | ''' 83 | returns the list of data values in the given configuration 84 | ''' 85 | cur_ret_cfg = RETURN_CONFIGS[return_config] 86 | data_names = [k for k in DATA_NAMES if cur_ret_cfg[k]] 87 | return data_names 88 | 89 | def data_dim(dname, rot_rep_size=9): 90 | ''' 91 | returns the dimension of the data with the given name. If the data is a rotation, returns the size with the given representation. 92 | ''' 93 | if dname in ['trans', 'trans_vel', 'root_orient_vel']: 94 | return 3 95 | elif dname in ['root_orient']: 96 | return rot_rep_size 97 | elif dname in ['pose_body']: 98 | return NUM_BODY_JOINTS*rot_rep_size 99 | elif dname in ['pose_body_vel']: 100 | return NUM_BODY_JOINTS*3 101 | elif dname in ['joints', 'joints_vel']: 102 | return len(SMPL_JOINTS)*3 103 | elif dname in ['joints_orient_vel']: 104 | return 1 105 | elif dname in ['verts', 'verts_vel']: 106 | return NUM_KEYPT_VERTS*3 107 | elif dname in ['contacts']: 108 | return len(CONTACT_ORDERING) 109 | else: 110 | print('The given data name %s is not valid!' 
% (dname)) 111 | exit() -------------------------------------------------------------------------------- /humor/datasets/rgb_dataset.py: -------------------------------------------------------------------------------- 1 | import sys, os 2 | cur_file_path = os.path.dirname(os.path.realpath(__file__)) 3 | sys.path.append(os.path.join(cur_file_path, '..')) 4 | 5 | import os.path as osp 6 | import glob, time, copy, pickle, json, math 7 | 8 | from torch.utils.data import Dataset, DataLoader 9 | 10 | from fitting.fitting_utils import read_keypoints, load_planercnn_res 11 | 12 | import numpy as np 13 | import torch 14 | import cv2 15 | 16 | DEFAULT_GROUND = [0.0, -1.0, 0.0, -0.5] 17 | 18 | class RGBVideoDataset(Dataset): 19 | 20 | def __init__(self, joints2d_path, 21 | cam_mat, 22 | seq_len=None, 23 | overlap_len=None, 24 | img_path=None, 25 | load_img=False, 26 | masks_path=None, 27 | mask_joints=False, 28 | planercnn_path=None, 29 | video_name='rgb_video' 30 | ): 31 | ''' 32 | Creates a dataset based on a single RGB video. 33 | 34 | - joints2d_path : path to saved OpenPose keypoints for the video 35 | - cam_mat : 3x3 camera intrinsics 36 | - seq_len : If not none, the maximum number of frames in a subsequence, will split the video into subsequences based on this. If none, the dataset contains a single sequence of the whole video. 37 | - overlap_len : the minimum number of frames to overlap each subsequence if splitting the video. 38 | - img_path : path to directory of video frames 39 | - load_img : if True, will load and return the video frames as part of the data. 40 | - masks_path : path to person segmentation masks 41 | - mask_joints: if True, masks the returned 2D joints using the person segmentation masks (i.e. drops any occluded joints) 42 | - planercnn_path : path to planercnn results on a single frame of the video. If given, uses this ground plane in the returned 43 | data rather than the default. 44 | ''' 45 | super(RGBVideoDataset, self).__init__() 46 | 47 | self.joints2d_path = joints2d_path 48 | self.cam_mat = cam_mat 49 | self.seq_len = seq_len 50 | self.overlap_len = overlap_len 51 | self.img_path = img_path 52 | self.load_img = load_img 53 | self.masks_path = masks_path 54 | self.mask_joints = mask_joints 55 | self.planercnn_path = planercnn_path 56 | self.video_name = video_name 57 | 58 | # load data paths 59 | self.data_dict, self.seq_intervals = self.load_data() 60 | self.data_len = len(self.data_dict['joints2d']) 61 | print('RGB dataset contains %d sub-sequences...' % (self.data_len)) 62 | 63 | def load_data(self): 64 | ''' 65 | Loads in the full dataset, except for image frames which are loaded on the fly if desired. 66 | ''' 67 | 68 | # get length of sequence based on files in self.joints2d_path and split into even sequences. 69 | keyp_paths = sorted(glob.glob(osp.join(self.joints2d_path, '*_keypoints.json'))) 70 | frame_names = ['_'.join(f.split('/')[-1].split('_')[:-1]) for f in keyp_paths] 71 | num_frames = len(keyp_paths) 72 | print('Found video with %d frames...' 
% (num_frames)) 73 | 74 | seq_intervals = [] 75 | if self.seq_len is not None and self.overlap_len is not None: 76 | num_seqs = math.ceil((num_frames - self.overlap_len) / (self.seq_len - self.overlap_len)) 77 | r = self.seq_len*num_seqs - self.overlap_len*(num_seqs-1) - num_frames # number of extra frames we cover 78 | extra_o = r // (num_seqs - 1) # we increase the overlap to avoid these as much as possible 79 | self.overlap_len = self.overlap_len + extra_o 80 | 81 | new_cov = self.seq_len*num_seqs - self.overlap_len*(num_seqs-1) # now compute how many frames are still left to account for 82 | r = new_cov - num_frames 83 | 84 | # create intervals 85 | cur_s = 0 86 | cur_e = cur_s + self.seq_len 87 | for int_idx in range(num_seqs): 88 | seq_intervals.append((cur_s, cur_e)) 89 | cur_overlap = self.overlap_len 90 | if int_idx < r: 91 | cur_overlap += 1 # update to account for final remainder 92 | cur_s += (self.seq_len - cur_overlap) 93 | cur_e = cur_s + self.seq_len 94 | 95 | print('Splitting into subsequences of length %d frames overlapping by %d...' % (self.seq_len, self.overlap_len)) 96 | else: 97 | print('Not splitting the video...') 98 | num_seqs = 1 99 | self.seq_len = num_frames 100 | seq_intervals = [(0, self.seq_len)] 101 | 102 | # 103 | # first load in entire video then split 104 | # 105 | 106 | # intrinsics 107 | cam_mat = self.cam_mat 108 | 109 | # path to image frames 110 | img_paths = None 111 | if self.img_path is not None: 112 | img_paths = [osp.join(self.img_path, img_fn) 113 | for img_fn in os.listdir(self.img_path) 114 | if img_fn.endswith('.png') or 115 | img_fn.endswith('.jpg') and 116 | not img_fn.startswith('.')] 117 | img_paths = sorted(img_paths) 118 | 119 | # path to masks 120 | mask_paths = None 121 | if self.masks_path is not None: 122 | mask_paths = [os.path.join(self.masks_path, f + '.png') for f in frame_names] 123 | 124 | # floor plane 125 | floor_plane = None 126 | if self.planercnn_path is not None: 127 | floor_plane = load_planercnn_res(self.planercnn_path) 128 | else: 129 | floor_plane = np.array(DEFAULT_GROUND) 130 | 131 | # get data for each subsequence 132 | data_out = { 133 | 'img_paths' : [], 134 | 'mask_paths' : [], 135 | 'cam_matx' : [], 136 | 'joints2d' : [], 137 | 'floor_plane' : [], 138 | 'names' : [] 139 | } 140 | for seq_idx in range(num_seqs): 141 | sidx, eidx = seq_intervals[seq_idx] 142 | 143 | data_out['cam_matx'].append(cam_mat) 144 | 145 | keyp_frames = [read_keypoints(f) for f in keyp_paths[sidx:eidx]] 146 | joint2d_data = np.stack(keyp_frames, axis=0) # T x J x 3 (x,y,conf) 147 | data_out['joints2d'].append(joint2d_data) 148 | 149 | data_out['floor_plane'].append(floor_plane) 150 | 151 | data_out['names'].append(self.video_name + '_' + '%04d' % (seq_idx)) 152 | 153 | if img_paths is not None: 154 | data_out['img_paths'].append(img_paths[sidx:eidx]) 155 | if mask_paths is not None: 156 | data_out['mask_paths'].append(mask_paths[sidx:eidx]) 157 | 158 | return data_out, seq_intervals 159 | 160 | def __len__(self): 161 | return self.data_len 162 | 163 | def __getitem__(self, idx): 164 | obs_data = dict() 165 | gt_data = dict() 166 | 167 | # 168 | # 2D keypoints 169 | # 170 | joint2d_data = self.data_dict['joints2d'][idx] 171 | obs_data['joints2d'] = torch.Tensor(joint2d_data) 172 | 173 | # person mask 174 | if self.mask_joints: 175 | cur_mask_paths = self.data_dict['mask_paths'][idx] 176 | obs_data['mask_paths'] = cur_mask_paths 177 | 178 | for t, mask_file in enumerate(cur_mask_paths): 179 | # load in mask 180 | vis_mask = 
cv2.imread(mask_file, cv2.IMREAD_GRAYSCALE) 181 | imh, imw = vis_mask.shape 182 | # mask out invisible joints (give confidence 0) 183 | uvs = np.round(joint2d_data[t, :, :2]).astype(int) 184 | uvs[:,0][uvs[:,0] >= imw] = (imw-1) 185 | uvs[:,1][uvs[:,1] >= imh] = (imh-1) 186 | occluded_mask_idx = vis_mask[uvs[:, 1], uvs[:, 0]] != 0 187 | joint2d_data[t, :, :][occluded_mask_idx] = 0.0 188 | 189 | # images 190 | if len(self.data_dict['img_paths']) > 0: 191 | cur_img_paths = self.data_dict['img_paths'][idx] 192 | obs_data['img_paths'] = cur_img_paths 193 | if self.load_img: 194 | # load images 195 | img_list = [] 196 | for img_path in cur_img_paths: 197 | # print(img_path) 198 | img = cv2.imread(img_path).astype(np.float32)[:, :, ::-1] / 255.0 199 | img_list.append(img) 200 | img_out = torch.Tensor(np.stack(img_list, axis=0)) 201 | # print(img_out.size()) 202 | obs_data['RGB'] = img_out 203 | 204 | # import matplotlib.pyplot as plt 205 | # for t in range(self.seq_len): 206 | # fig = plt.figure() 207 | # plt.imshow(img_list[t]) 208 | # plt.scatter(joint2d_data[t, :, 0], joint2d_data[t, :, 1]) 209 | # ax = plt.gca() 210 | # plt.show() 211 | # plt.close(fig) 212 | 213 | # cv2.namedWindow('frame', cv2.WINDOW_NORMAL) 214 | # for img in img_list: 215 | # while True: 216 | # cv2.imshow('frame', img) 217 | # key = cv2.waitKey(30) 218 | # if key == 27: 219 | # break 220 | 221 | # floor plane 222 | obs_data['floor_plane'] = self.data_dict['floor_plane'][idx] 223 | # intrinsics 224 | gt_data['cam_matx'] = torch.Tensor(self.data_dict['cam_matx'][idx]) 225 | # meta-data 226 | gt_data['name'] = self.data_dict['names'][idx] 227 | 228 | # the frames used in this subsequence 229 | obs_data['seq_interval'] = torch.Tensor(list(self.seq_intervals[idx])).to(torch.int) 230 | 231 | return obs_data, gt_data -------------------------------------------------------------------------------- /humor/fitting/config.py: -------------------------------------------------------------------------------- 1 | from utils.config import SplitLineParser 2 | 3 | from fitting.fitting_utils import NSTAGES 4 | 5 | def parse_args(argv): 6 | parser = SplitLineParser(fromfile_prefix_chars='@', allow_abbrev=False) 7 | 8 | # Observed data options 9 | parser.add_argument('--data-path', type=str, required=True, help='Path to the data to fit.') 10 | parser.add_argument('--data-type', type=str, required=True, choices=['AMASS', 'PROX-RGB', 'PROX-RGBD', 'iMapper-RGB', 'RGB'], help='The type of data we are fitting to.') 11 | parser.add_argument('--data-fps', type=int, default=30, help='Sampling rate of the data.') 12 | parser.add_argument('--batch-size', type=int, default=1, help='Number of sequences to batch together for fitting to data.') 13 | parser.add_argument('--shuffle', dest='shuffle', action='store_true', help="Shuffles data.") 14 | parser.set_defaults(shuffle=False) 15 | parser.add_argument('--op-keypts', type=str, default=None, help='(optional) path to a directory of custom detected OpenPose keypoints to use for RGB fitting rather than running OpenPose before optimization.') 16 | 17 | # AMASS-specific options 18 | parser.add_argument('--amass-split-by', type=str, default='dataset', choices=['single', 'sequence', 'subject', 'dataset'], help='How to split the dataset into train/test/val.') 19 | parser.add_argument('--amass-custom-split', type=str, nargs='+', default=None, help='Instead of using test set, use this custom list of datasets.') 20 | parser.add_argument('--amass-batch-size', type=int, default=-1, help='Number of sequences to batch 
together for fitting to AMASS data.') 21 | parser.add_argument('--amass-seq-len', type=int, default=60, help='Number of frames in AMASS sequences to fit.') 22 | parser.add_argument('--amass-use-joints', dest='amass_use_joints', action='store_true', help="Use 3D joint observations for fitting.") 23 | parser.set_defaults(amass_use_joints=False) 24 | parser.add_argument('--amass-root-joint-only', dest='amass_root_joint_only', action='store_true', help="Use 3D root joint observation for fitting.") 25 | parser.set_defaults(amass_root_joint_only=False) 26 | parser.add_argument('--amass-use-verts', dest='amass_use_verts', action='store_true', help="Use subset of 3D mesh vertices observations for fitting.") 27 | parser.set_defaults(amass_use_verts=False) 28 | parser.add_argument('--amass-use-points', dest='amass_use_points', action='store_true', help="Use sampled 3D points on mesh surface for fitting.") 29 | parser.set_defaults(amass_use_points=False) 30 | parser.add_argument('--amass-noise-std', type=float, default=0.0, help='Artificial gaussian noise standard deviation to add to observations.') 31 | parser.add_argument('--amass-make-partial', dest='amass_make_partial', action='store_true', help="Make the observations randomly partial.") 32 | parser.set_defaults(amass_make_partial=False) 33 | parser.add_argument('--amass-partial-height', type=float, default=0.9, help='Points/joints/verts under this z value will be dropped to make partial.') 34 | parser.add_argument('--amass-drop-middle', dest='amass_drop_middle', action='store_true', help="Drops the middle third frames from the sequence completely.") 35 | parser.set_defaults(amass_drop_middle=False) 36 | 37 | # PROX-specific options 38 | parser.add_argument('--prox-batch-size', type=int, default=-1, help='Number of sequences to batch together for fitting to PROX data.') 39 | parser.add_argument('--prox-seq-len', type=int, default=60, help='Number of frames in PROX sequences to fit.') 40 | parser.add_argument('--prox-recording', type=str, default=None, help='Fit to a specific PROX recording') 41 | parser.add_argument('--prox-recording-subseq-idx', type=int, default=-1, help='Fit to a specific PROX recording subsequence') 42 | 43 | # iMapper-specific options 44 | parser.add_argument('--imapper-seq-len', type=int, default=60, help='Number of frames in iMapper sequences to fit. ') 45 | parser.add_argument('--imapper-scene', type=str, default=None, help='Fit to a specific iMapper scene') 46 | parser.add_argument('--imapper-scene-subseq-idx', type=int, default=-1, help='Fit to a specific subsequence') 47 | 48 | # RGB-specific options 49 | parser.add_argument('--rgb-seq-len', type=int, default=None, help='If none, fits the whole video at once. If given, is the max number of frames to use when splitting the video into subseqeunces for fitting.') 50 | parser.add_argument('--rgb-overlap-len', type=int, default=None, help='If None, fitst the whole video at once. If given, is the minimum number of frames to overlap subsequences extracted from the given video. These overlapped frames are used in a consistency energy.') 51 | parser.add_argument('--rgb-intrinsics', type=str, default=None, help='Path to the camera intrinsics file to use for re-projection energy. If not given uses defaults.') 52 | parser.add_argument('--rgb-planercnn-res', type=str, default=None, help='Path to results of PlaneRCNN detection. 
If given uses this to initialize the floor plane otherwise uses defaults.') 53 | parser.add_argument('--rgb-overlap-consist-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='Enforces consistency between overlapping subsequences within a batch in terms of the ground plane, shape params, and joint positions.') 54 | 55 | # PROX + iMapper + RGB options 56 | parser.add_argument('--mask-joints2d', dest='mask_joints2d', action='store_true', help="If true, masks the 2d joints based on the person segmentation occlusion mask.") 57 | parser.set_defaults(mask_joints2d=False) 58 | 59 | # Loss weights 60 | parser.add_argument('--joint3d-weight', type=float, nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='L2 loss on 3D joints') 61 | parser.add_argument('--joint3d-rollout-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='L2 loss on 3D joints from motion prior rollout.') 62 | parser.add_argument('--joint3d-smooth-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='L2 loss on 3D joints differences') 63 | parser.add_argument('--vert3d-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='L2 loss on 3D verts') 64 | parser.add_argument('--point3d-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='Chamfer loss on 3D points') 65 | parser.add_argument('--joint2d-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='L2 loss on 2D reprojection') 66 | parser.add_argument('--pose-prior-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='likelihood under pose prior') 67 | parser.add_argument('--shape-prior-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='likelihood under shape prior') 68 | parser.add_argument('--motion-prior-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='likelihood under motion prior') 69 | parser.add_argument('--init-motion-prior-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='likelihood under init state prior') 70 | parser.add_argument('--joint-consistency-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='L2 difference between SMPL and motion prior joints') 71 | parser.add_argument('--bone-length-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='L2 difference between bone lengths of motion prior joints at consecutive frames.') 72 | parser.add_argument('--contact-vel-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='Predicted contacting joints have 0 velocity when in contact') 73 | parser.add_argument('--contact-height-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='Predicted contacting joints are at the height of the floor') 74 | parser.add_argument('--floor-reg-weight', type=float,nargs=NSTAGES, default=[0.0, 0.0, 0.0], help='L2 Regularization that pushes floor to stay close to the initialization.') 75 | # loss options 76 | parser.add_argument('--robust-loss', type=str, default='bisquare', choices=['none', 'bisquare'], help='Which robust loss weighting to use for points3d losses (if any).') 77 | parser.add_argument('--robust-tuning-const', type=float, default=4.6851, help='Tuning constant to use in the robust loss.') 78 | parser.add_argument('--joint2d-sigma', type=float, default=100.0, help='scaling for robust geman-mclure function on joint2d.') 79 | 80 | # stage 3 options 81 | parser.add_argument('--stage3-no-tune-init-state', dest='stage3_tune_init_state', action='store_false', help="If given, will not use initial state tuning at the beginning of stage 3, instead optimizing full seq 
at once.") 82 | parser.set_defaults(stage3_tune_init_state=True) 83 | parser.add_argument('--stage3-tune-init-num-frames', type=int, default=15, help="When tuning initial state at the beginning of stage 3, uses this many initial frames.") 84 | parser.add_argument('--stage3-tune-init-freeze-start', type=int, default=30, help='Iteration to tune initial state until, at which point it is frozen and full latent sequence is optimized') 85 | parser.add_argument('--stage3-tune-init-freeze-end', type=int, default=55, help='Iteration to freeze initial state until, at which point full sequence and initial state are refined together.') 86 | parser.add_argument('--stage3-full-contact', dest='stage3_contact_refine_only', action='store_false', help="If given, uses contact losses for the entire stage 3 rather than just in the final refinement portion.") 87 | parser.set_defaults(stage3_contact_refine_only=True) 88 | 89 | # smpl model path 90 | parser.add_argument('--smpl', type=str, default='./body_models/smplh/neutral/model.npz', help='Path to SMPL model to use for optimization. Currently only SMPL+H is supported.') 91 | parser.add_argument('--gt-body-type', type=str, default='smplh', choices=['smplh'], help='Which body model to load in for GT data') 92 | parser.add_argument('--vposer', type=str, default='./body_models/vposer_v1_0', help='Path to VPoser checkpoint.') 93 | parser.add_argument('--openpose', type=str, default='./external/openpose', help='Path to OpenPose installation.') 94 | 95 | # motion prior weights and model information 96 | parser.add_argument('--humor', type=str, help='Path to HuMoR weights to use as the motion prior.') 97 | parser.add_argument('--humor-out-rot-rep', type=str, default='aa', choices=['aa', '6d', '9d'], help='Rotation representation to output from the model.') 98 | parser.add_argument('--humor-in-rot-rep', type=str, default='mat', choices=['aa', '6d', 'mat'], help='Rotation representation to input to the model for the relative full sequence input.') 99 | parser.add_argument('--humor-latent-size', type=int, default=48, help='Size of the latent feature.') 100 | parser.add_argument('--humor-model-data-config', type=str, default='smpl+joints+contacts', choices=['smpl+joints', 'smpl+joints+contacts'], help='which state configuration to use for the model') 101 | parser.add_argument('--humor-steps-in', type=int, default=1, help='Number of input timesteps the prior expects.') 102 | 103 | # init motion state prior information 104 | parser.add_argument('--init-motion-prior', type=str, default='./checkpoints/init_state_prior_gmm', help='Path to parameters of a GMM to use as the prior for initial motion state.') 105 | 106 | # optimization options 107 | parser.add_argument('--lr', type=float, default=1.0, help='step size during optimization') 108 | parser.add_argument('--num-iters', type=int, nargs=NSTAGES, default=[30, 80, 70], help='The number of optimization iterations at each stage (3 stages total)') 109 | parser.add_argument('--lbfgs-max-iter', type=int, default=20, help='The number of max optim iterations per LBFGS step.') 110 | 111 | # options to save/visualize results 112 | parser.add_argument('--out', type=str, default=None, help='Output path to save fitting results/visualizations to.') 113 | 114 | parser.add_argument('--save-results', dest='save_results', action='store_true', help="Saves final optimized and GT smpl results and observations") 115 | parser.set_defaults(save_results=False) 116 | parser.add_argument('--save-stages-results', dest='save_stages_results', 
action='store_true', help="Saves intermediate optimized results") 117 | parser.set_defaults(save_stages_results=False) 118 | 119 | known_args, unknown_args = parser.parse_known_args(argv) 120 | 121 | return known_args -------------------------------------------------------------------------------- /humor/fitting/eval_utils.py: -------------------------------------------------------------------------------- 1 | import sys, os 2 | cur_file_path = os.path.dirname(os.path.realpath(__file__)) 3 | sys.path.append(os.path.join(cur_file_path, '..')) 4 | 5 | import os 6 | import numpy as np 7 | import torch 8 | 9 | from datasets.amass_utils import CONTACT_INDS 10 | from body_model.utils import SMPL_JOINTS 11 | from fitting.fitting_utils import perspective_projection, bdot, compute_plane_intersection 12 | from utils.transforms import rotation_matrix_to_angle_axis, batch_rodrigues 13 | 14 | SMPL_SIZES = { 15 | 'trans' : 3, 16 | 'betas' : 10, 17 | 'pose_body' : 63, 18 | 'root_orient' : 3 19 | } 20 | 21 | GRND_PEN_THRESH_LIST = [0.0, 0.03, 0.06, 0.09, 0.12, 0.15] 22 | IMW, IMH = 1920, 1080 # all data 23 | DATA_FPS = 30.0 24 | DATA_h = 1.0 / DATA_FPS 25 | 26 | # Baseline MVAE does not converge on these sequences, don't use for quantitative evaluation 27 | AMASS_EVAL_BLACKLIST = [ 28 | 'HumanEva_S1_Box_1_poses_548_frames_30_fps', 29 | 'HumanEva_S1_Box_3_poses_330_frames_30_fps', 30 | 'HumanEva_S1_Gestures_1_poses_594_frames_30_fps' 31 | ] 32 | 33 | # PROX-D baseline catastrophically fails on these, don't use for quantitative evaluation 34 | RGBD_EVAL_BLACKLIST = [ 35 | 'MPH1Library_00145_01_0020', 36 | 'MPH1Library_00145_01_0021', 37 | 'MPH1Library_00145_01_0022', 38 | 'MPH1Library_00145_01_0023', 39 | 'MPH1Library_00145_01_0024', 40 | 'MPH1Library_00145_01_0025', 41 | 'MPH1Library_00145_01_0026', 42 | 'MPH1Library_00145_01_0027', 43 | 'MPH1Library_00145_01_0028', 44 | 'N0Sofa_03403_01_0000', 45 | 'N0Sofa_03403_01_0001', 46 | 'N0Sofa_03403_01_0002', 47 | 'N0Sofa_03403_01_0003', 48 | 'N0Sofa_03403_01_0004', 49 | 'N0Sofa_03403_01_0005', 50 | 'N0Sofa_03403_01_0006', 51 | 'N0Sofa_03403_01_0007', 52 | 'N0Sofa_03403_01_0008', 53 | 'N0Sofa_03403_01_0009', 54 | 'N0Sofa_03403_01_0010', 55 | 'N0Sofa_03403_01_0011', 56 | 'N0Sofa_03403_01_0012', 57 | 'N0Sofa_03403_01_0013', 58 | 'N0Sofa_03403_01_0014' 59 | ] 60 | 61 | # VIBE baseline catastrophically fails on these, don't use for quantitative evaluation 62 | RGB_EVAL_BLACKLIST = [ 63 | 'MPH1Library_00145_01_0031', 64 | 'N0Sofa_03403_01_0013' 65 | ] 66 | 67 | 68 | def get_grnd_pen_key(thresh): 69 | return 'ground_pen@%0.2f' % (thresh) 70 | 71 | def quant_eval_3d(eval_dict, pred_data, gt_data, obs_data): 72 | # get positional errors for each modality 73 | for modality in ['joints3d', 'verts3d', 'mesh3d']: 74 | eval_pred = pred_data[modality] 75 | eval_gt = gt_data[modality] 76 | 77 | # all positional errors 78 | pos_err_all = torch.norm(eval_pred - eval_gt, dim=-1).detach().cpu().numpy() 79 | eval_dict[modality + '_all'].append(pos_err_all) 80 | 81 | # ee and legs 82 | if modality == 'joints3d': 83 | joints3d_ee = compute_subset_smpl_joint_err(eval_pred, eval_gt, subset='ee').detach().cpu().numpy() 84 | joints3d_legs = compute_subset_smpl_joint_err(eval_pred, eval_gt, subset='legs').detach().cpu().numpy() 85 | eval_dict['joints3d_ee'].append(joints3d_ee) 86 | eval_dict['joints3d_legs'].append(joints3d_legs) 87 | 88 | # split by occluded/visible if this was the observed modality 89 | if modality in obs_data: 90 | # visible data (based on observed) 91 | eval_obs = 
obs_data[modality] 92 | invis_mask = torch.isinf(eval_obs) 93 | vis_mask = torch.logical_not(invis_mask) # T x N x 3 94 | num_invis_pts = torch.sum(invis_mask[:,:,0]) 95 | num_vis_pts = torch.sum(vis_mask[:,:,0]) 96 | 97 | if num_vis_pts > 0: 98 | pred_vis = eval_pred[vis_mask].reshape((num_vis_pts, 3)) 99 | gt_vis = eval_gt[vis_mask].reshape((num_vis_pts, 3)) 100 | vis_err = torch.norm(pred_vis - gt_vis, dim=-1).detach().cpu().numpy() 101 | eval_dict[modality + '_vis'].append(vis_err) 102 | else: 103 | eval_dict[modality + '_vis'].append(np.zeros((0))) 104 | 105 | # invisible data 106 | if num_invis_pts > 0: 107 | pred_invis = eval_pred[invis_mask].reshape((num_invis_pts, 3)) 108 | gt_invis = eval_gt[invis_mask].reshape((num_invis_pts, 3)) 109 | invis_err = torch.norm(pred_invis - gt_invis, dim=-1).detach().cpu().numpy() 110 | eval_dict[modality + '_occ'].append(invis_err) 111 | else: 112 | eval_dict[modality + '_occ'].append(np.zeros((0))) 113 | 114 | # per-joint acceleration mag 115 | pred_joint_accel, pred_joint_accel_mag = compute_joint_accel(pred_data['joints3d']) 116 | eval_dict['accel_mag'].append(pred_joint_accel_mag.detach().cpu().numpy()) 117 | 118 | # toe-floor penetration 119 | floor_plane = torch.zeros((4)).to(pred_data['joints3d']) 120 | floor_plane[2] = 1.0 121 | num_pen_list, num_tot, pen_dist = compute_toe_floor_pen(pred_data['joints3d'], floor_plane, thresh_list=GRND_PEN_THRESH_LIST) 122 | eval_dict['ground_pen_dist'].append(pen_dist.detach().cpu().numpy()) 123 | for thresh_idx, pen_thresh in enumerate(GRND_PEN_THRESH_LIST): 124 | cur_pen_key = get_grnd_pen_key(pen_thresh) 125 | eval_dict[cur_pen_key].append(num_pen_list[thresh_idx].detach().cpu().item()) 126 | eval_dict[cur_pen_key + '_cnt'].append(num_tot) 127 | 128 | # contact classification (output number correct and total frame cnt) 129 | pred_contacts = pred_data['contacts'][:,CONTACT_INDS] # only compare joints for which the prior is trained 130 | gt_contacts = gt_data['contacts'][:,CONTACT_INDS] 131 | num_correct = np.sum((pred_contacts - gt_contacts) == 0) 132 | total_cnt = pred_contacts.shape[0]*pred_contacts.shape[1] 133 | eval_dict['contact_acc'].append(num_correct) 134 | eval_dict['contact_acc_cnt'].append(total_cnt) 135 | 136 | 137 | def quant_eval_2d(eval_dict, pred_joints_smpl, floor_plane, 138 | pred_joints_comp=None, 139 | gt_joints_comp=None, 140 | vis_mask=None, 141 | cam_intrins=None): 142 | ''' 143 | eval_dict dictionary accumulator to add results to. 144 | 145 | Always must have pred_joints_smpl and floor_plane to compute plausibility metrics. 146 | 147 | Optionally comparison pred and gt joints along with a joint occlusion mask can be passed in 148 | to compute joint position errors. Errors will also be split by visible/occluded. 
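    vis_mask is a per-frame mask image sequence: GT joints are projected into the image
    using cam_intrins = (fx, fy, cx, cy), and a joint whose projected pixel has mask
    value 1 is counted as occluded, otherwise visible.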
149 | 150 | all_joints T x J x 3 151 | floor_plane (4) 152 | ''' 153 | 154 | do_comparison = pred_joints_comp is not None and \ 155 | gt_joints_comp is not None 156 | 157 | if do_comparison: 158 | # 159 | # pointwise distance errors (MPJPE) 160 | # split by visiblity if observations are available 161 | # 162 | T, J, _ = gt_joints_comp.size() 163 | entires_per_frame = J*3 164 | 165 | # mask out missing frames 166 | invalid_mask = torch.isinf(gt_joints_comp) 167 | num_inval_entries = torch.sum(invalid_mask, dim=[1, 2]) 168 | valid_mask = (num_inval_entries < entires_per_frame) 169 | num_val_frames = torch.sum(valid_mask) 170 | valid_frame_mask = valid_mask.reshape((T, 1, 1)).expand_as(gt_joints_comp) 171 | 172 | eval_pred_joints = pred_joints_comp[valid_frame_mask].reshape((num_val_frames, J, 3)) 173 | eval_gt_joints = gt_joints_comp[valid_frame_mask].reshape((num_val_frames, J, 3)) 174 | 175 | # all joint errors 176 | joints3d_all = torch.norm(eval_pred_joints - eval_gt_joints, dim=-1).detach().cpu().numpy() 177 | eval_dict['joints3d_all'].append(joints3d_all) 178 | 179 | # end-effector and related errors 180 | joints3d_ee = compute_subset_comp_joint_err(eval_pred_joints, eval_gt_joints, subset='ee').detach().cpu().numpy() 181 | joints3d_legs = compute_subset_comp_joint_err(eval_pred_joints, eval_gt_joints, subset='legs').detach().cpu().numpy() 182 | eval_dict['joints3d_ee'].append(joints3d_ee) 183 | eval_dict['joints3d_legs'].append(joints3d_legs) 184 | 185 | # aligned at root all joint errors 186 | pred_root = eval_pred_joints[:, COMP_ROOT_IDX:(COMP_ROOT_IDX+1), :] 187 | eval_pred_joints_align = eval_pred_joints - pred_root 188 | gt_root = eval_gt_joints[:, COMP_ROOT_IDX:(COMP_ROOT_IDX+1), :] 189 | eval_gt_joints_align = eval_gt_joints - gt_root 190 | joints3d_align_all = torch.norm(eval_pred_joints_align - eval_gt_joints_align, dim=-1).detach().cpu().numpy() 191 | eval_dict['joints3d_align_all'].append(joints3d_align_all) 192 | 193 | # end-effector and related errors 194 | joints3d_align_ee = compute_subset_comp_joint_err(eval_pred_joints_align, eval_gt_joints_align, subset='ee').detach().cpu().numpy() 195 | joints3d_align_legs = compute_subset_comp_joint_err(eval_pred_joints_align, eval_gt_joints_align, subset='legs').detach().cpu().numpy() 196 | eval_dict['joints3d_align_ee'].append(joints3d_align_ee) 197 | eval_dict['joints3d_align_legs'].append(joints3d_align_legs) 198 | 199 | if vis_mask is not None and cam_intrins is not None: 200 | # split into visible and occluded 201 | valid_vis_masks = vis_mask[valid_mask.cpu().numpy()] 202 | 203 | Tv = num_val_frames 204 | cam_t = torch.zeros((Tv, 3)).to(eval_gt_joints) 205 | cam_R = torch.eye(3).reshape((1, 3, 3)).expand((Tv, 3, 3)).to(eval_gt_joints) 206 | # project points to 2D 207 | cam_f = torch.zeros((Tv, 2)).to(eval_gt_joints) 208 | cam_f[:,0] = float(cam_intrins[0]) 209 | cam_f[:,1] = float(cam_intrins[1]) 210 | cam_cent = torch.zeros((Tv, 2)).to(eval_gt_joints) 211 | cam_cent[:,0] = float(cam_intrins[2]) 212 | cam_cent[:,1] = float(cam_intrins[3]) 213 | gt_joints2d = perspective_projection(eval_gt_joints, 214 | cam_R, 215 | cam_t, 216 | cam_f, 217 | cam_cent) 218 | 219 | uvs = np.round(gt_joints2d.cpu().numpy()).astype(int) 220 | uvs[:,:,0] = np.clip(uvs[:,:,0], 0, IMW-1) 221 | uvs[:,:,1] = np.clip(uvs[:,:,1], 0, IMH-1) 222 | occlusion_mask = np.zeros((Tv, J), dtype=np.bool) 223 | for t in range(Tv): 224 | occlusion_mask[t] = valid_vis_masks[t][uvs[t, :, 1], uvs[t, :, 0]] == 1 225 | 226 | occlusion_mask = 
torch.Tensor(occlusion_mask).to(torch.bool).to(eval_gt_joints.device) 227 | vis_mask = torch.logical_not(occlusion_mask) 228 | num_occl_pts = torch.sum(occlusion_mask) 229 | num_vis_pts = torch.sum(vis_mask) 230 | 231 | occlusion_mask = occlusion_mask.unsqueeze(2).expand_as(eval_gt_joints) 232 | vis_mask = vis_mask.unsqueeze(2).expand_as(eval_gt_joints) 233 | 234 | # visible absolute data 235 | if num_vis_pts > 0: 236 | eval_pred_vis = eval_pred_joints[vis_mask].reshape((num_vis_pts, 3)) 237 | eval_gt_vis = eval_gt_joints[vis_mask].reshape((num_vis_pts, 3)) 238 | joints3d_vis = torch.norm(eval_pred_vis - eval_gt_vis, dim=-1).detach().cpu().numpy() 239 | eval_dict['joints3d_vis'].append(joints3d_vis) 240 | else: 241 | eval_dict['joints3d_vis'].append(np.zeros((0))) 242 | 243 | # occluded absolute data 244 | if num_occl_pts > 0: 245 | eval_pred_occl = eval_pred_joints[occlusion_mask].reshape((num_occl_pts, 3)) 246 | eval_gt_occl = eval_gt_joints[occlusion_mask].reshape((num_occl_pts, 3)) 247 | joints3d_occl = torch.norm(eval_pred_occl - eval_gt_occl, dim=-1).detach().cpu().numpy() 248 | eval_dict['joints3d_occ'].append(joints3d_occl) 249 | else: 250 | eval_dict['joints3d_occ'].append(np.zeros((0))) 251 | 252 | # occl/vis aligned 253 | # visible absolute data 254 | if num_vis_pts > 0: 255 | eval_pred_align_vis = eval_pred_joints_align[vis_mask].reshape((num_vis_pts, 3)) 256 | eval_gt_align_vis = eval_gt_joints_align[vis_mask].reshape((num_vis_pts, 3)) 257 | joints3d_align_vis = torch.norm(eval_pred_align_vis - eval_gt_align_vis, dim=-1).detach().cpu().numpy() 258 | eval_dict['joints3d_align_vis'].append(joints3d_align_vis) 259 | else: 260 | eval_dict['joints3d_align_vis'].append(np.zeros((0))) 261 | 262 | # occluded absolute data 263 | if num_occl_pts > 0: 264 | eval_pred_align_occl = eval_pred_joints_align[occlusion_mask].reshape((num_occl_pts, 3)) 265 | eval_gt_align_occl = eval_gt_joints_align[occlusion_mask].reshape((num_occl_pts, 3)) 266 | joints3d_align_occl = torch.norm(eval_pred_align_occl - eval_gt_align_occl, dim=-1).detach().cpu().numpy() 267 | eval_dict['joints3d_align_occ'].append(joints3d_align_occl) 268 | else: 269 | eval_dict['joints3d_align_occ'].append(np.zeros((0))) 270 | 271 | # per-joint acceleration 272 | _, pred_joint_accel_mag = compute_joint_accel(pred_joints_smpl) 273 | eval_dict['accel_mag'].append(pred_joint_accel_mag.detach().cpu().numpy()) 274 | 275 | pred_joints_align = pred_joints_smpl - pred_joints_smpl[:,0:1,:] 276 | _, pred_joint_align_accel_mag = compute_joint_accel(pred_joints_align) 277 | eval_dict['accel_mag_align'].append(pred_joint_align_accel_mag.detach().cpu().numpy()) 278 | 279 | # toe-floor penetration 280 | num_pen_list, num_tot, pen_dist = compute_toe_floor_pen(pred_joints_smpl, floor_plane, thresh_list=GRND_PEN_THRESH_LIST) 281 | eval_dict['ground_pen_dist'].append(pen_dist.detach().cpu().numpy()) 282 | for thresh_idx, pen_thresh in enumerate(GRND_PEN_THRESH_LIST): 283 | cur_pen_key = get_grnd_pen_key(pen_thresh) 284 | eval_dict[cur_pen_key].append(num_pen_list[thresh_idx].detach().cpu().item()) 285 | eval_dict[cur_pen_key + '_cnt'].append(num_tot) 286 | 287 | return 288 | 289 | 290 | def compute_subset_smpl_joint_err(eval_pred_joints, eval_gt_joints, subset='ee'): 291 | ''' 292 | Compute SMPL joint position errors betwen pred_joints and gt_joints for the given subject. 293 | Assumed size of B x 22 x 3 294 | Options: 295 | - ee : end-effectors. 
hands, toebase, and ankle 296 | - legs : knees, angles, and toes 297 | ''' 298 | subset_inds = None 299 | if subset == 'ee': 300 | subset_inds = [SMPL_JOINTS['leftFoot'], SMPL_JOINTS['rightFoot'], 301 | SMPL_JOINTS['leftToeBase'], SMPL_JOINTS['rightToeBase'], 302 | SMPL_JOINTS['leftHand'], SMPL_JOINTS['rightHand']] 303 | elif subset == 'legs': 304 | subset_inds = [SMPL_JOINTS['leftFoot'], SMPL_JOINTS['rightFoot'], 305 | SMPL_JOINTS['leftToeBase'], SMPL_JOINTS['rightToeBase'], 306 | SMPL_JOINTS['leftLeg'], SMPL_JOINTS['rightLeg']] 307 | else: 308 | print('Unrecognized joint subset!') 309 | exit() 310 | 311 | joint_err = torch.norm(eval_pred_joints[:, subset_inds] - eval_gt_joints[:, subset_inds], dim=-1) 312 | return joint_err 313 | 314 | def compute_subset_comp_joint_err(eval_pred_joints, eval_gt_joints, subset='ee'): 315 | ''' 316 | Compute comparison skeleton joint position errors betwen pred_joints and gt_joints for the given subject. 317 | Assumed size of B x J x 3 318 | Options: 319 | - ee : hands and ankels 320 | - legs : knees and ankles 321 | ''' 322 | subset_inds = None 323 | if subset == 'ee': 324 | subset_inds = [COMP_JOINTS['RANK'], COMP_JOINTS['LANK'], 325 | COMP_JOINTS['RWRI'], COMP_JOINTS['LWRI']] 326 | elif subset == 'legs': 327 | subset_inds = [COMP_JOINTS['RANK'], COMP_JOINTS['LANK'], 328 | COMP_JOINTS['RKNE'], COMP_JOINTS['LKNE']] 329 | else: 330 | print('Unrecognized joint subset!') 331 | exit() 332 | 333 | joint_err = torch.norm(eval_pred_joints[:, subset_inds] - eval_gt_joints[:, subset_inds], dim=-1) 334 | return joint_err 335 | 336 | def compute_joint_accel(joint_seq): 337 | ''' Magnitude of joint accelerations for joint_seq : T x J x 3 ''' 338 | joint_accel = joint_seq[:-2] - (2*joint_seq[1:-1]) + joint_seq[2:] 339 | joint_accel = joint_accel / ((DATA_h**2)) 340 | joint_accel_mag = torch.norm(joint_accel, dim=-1) 341 | return joint_accel, joint_accel_mag 342 | 343 | def compute_toe_floor_pen(joint_seq, floor_plane, thresh_list=[0.0]): 344 | ''' 345 | Given SMPL body joints sequence and the floor plane, computes number of times 346 | the toes penetrate the floor and the total number of frames. 
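    A toe sample (leftToeBase or rightToeBase) counts as penetrating at threshold t when its
    signed distance s to the plane, from compute_plane_intersection, satisfies s < -t; the
    returned total covers both toes, i.e. 2x the number of frames.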
347 | 348 | - thresh_list : compute the penetration ratio for each threshold in cm in this list 349 | 350 | Returns: 351 | - list of num_penetrations for each threshold, the number of total frames, and penetration distance at threshold 0.0 352 | ''' 353 | toe_joints = joint_seq[:,[SMPL_JOINTS['leftToeBase'], SMPL_JOINTS['rightToeBase']], :] 354 | toe_joints = toe_joints.reshape((-1, 3)) 355 | floor_normal = floor_plane[:3].reshape((1, 3)) 356 | floor_normal = floor_normal / torch.norm(floor_normal, dim=-1, keepdim=True) 357 | floor_normal = floor_normal.expand_as(toe_joints) 358 | 359 | _, s = compute_plane_intersection(toe_joints, -floor_normal, floor_plane.reshape((1, 4)).expand((toe_joints.size(0), 4))) 360 | 361 | num_pen_list = torch.zeros((len(thresh_list))).to(torch.int).to(joint_seq.device) 362 | for thresh_idx, pen_thresh in enumerate(thresh_list): 363 | num_pen_thresh = torch.sum(s < -pen_thresh) 364 | num_pen_list[thresh_idx] = num_pen_thresh 365 | 366 | num_tot = s.size(0) 367 | 368 | pen_dist = torch.Tensor(np.array((0))) 369 | if torch.sum(s < 0) > 0: 370 | pen_dist = -s[s < 0] # distance of penetration at threshold of 0 371 | 372 | return num_pen_list, num_tot, pen_dist 373 | 374 | # map from imapper gt 3d joints to comparison 12-joint skeleton 375 | IMAP2COMPARE = [0, 1, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15] # no hips, neck, or head 376 | COMP_ROOT_IDX = 4 377 | 378 | IMAP_JOINTS = { 'RANK' : 0, 'RKNE' : 1, 'RHIP' : 2, 'LHIP' : 3, 'LKNE' : 4, 'LANK' : 5, 'PELV' : 6, 379 | 'THRX' : 7, 'NECK' : 8, 'HEAD' : 9, 'RWRI' : 10, 'RELB' : 11, 'RSHO' : 12, 380 | 'LSHO' : 13, 'LELB' : 14, 'LWRI' : 15} 381 | IMAP_ID2NAME = {v : k for k, v in IMAP_JOINTS.items()} 382 | COMP_NAMES = [IMAP_ID2NAME[i] for i in IMAP2COMPARE] 383 | COMP_JOINTS = {jname : idx for idx, jname in enumerate(COMP_NAMES)} 384 | 385 | # map from smpl regressed 3d joints to comparison 12-joint skeleton 386 | SMPL2COMPARE = [ SMPL_JOINTS['rightFoot'], SMPL_JOINTS['rightLeg'], SMPL_JOINTS['leftLeg'], SMPL_JOINTS['leftFoot'], 387 | SMPL_JOINTS['hips'], SMPL_JOINTS['neck'], SMPL_JOINTS['rightHand'], 388 | SMPL_JOINTS['rightForeArm'], SMPL_JOINTS['rightArm'], SMPL_JOINTS['leftArm'], SMPL_JOINTS['leftForeArm'], 389 | SMPL_JOINTS['leftHand'] ] -------------------------------------------------------------------------------- /humor/scripts/cleanup_amass_data.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import os 3 | import argparse 4 | import time 5 | import random 6 | import shutil 7 | 8 | # which sub-datasets to clean up 9 | BML_NTroje = True 10 | MPI_HDM05 = True 11 | 12 | def main(args): 13 | if not os.path.exists(args.data) or not os.path.isdir(args.data): 14 | print('Could not find path or it is not a directory') 15 | return 16 | 17 | if BML_NTroje: 18 | # remove treadmill clips from BML_NTroje 19 | # contains treadmill_ 20 | # contains normal_ 21 | dataset_path = os.path.join(args.data, 'BioMotionLab_NTroje') 22 | if not os.path.exists(dataset_path): 23 | print('Could not find BioMotionLab_NTroje data, not filtering out treadmill sequences...') 24 | else: 25 | # create output directory to backup moved data 26 | if not os.path.exists(args.backup): 27 | os.mkdir(args.backup) 28 | bk_dir = os.path.join(args.backup, 'BioMotionLab_NTroje') 29 | if not os.path.exists(bk_dir): 30 | os.mkdir(bk_dir) 31 | subj_names = sorted([f for f in os.listdir(dataset_path) if f[0] != '.']) 32 | subj_paths = sorted([os.path.join(dataset_path, f) for f in subj_names]) 33 | 
bk_subj_paths = sorted([os.path.join(bk_dir, f) for f in subj_names]) 34 | # print(subj_paths) 35 | for subj_dir, bk_subj_dir in zip(subj_paths, bk_subj_paths): 36 | motion_paths = sorted(glob.glob(subj_dir + '/*.npz')) 37 | # print(motion_paths) 38 | for motion_file in motion_paths: 39 | motion_name = motion_file.split('/')[-1] 40 | motion_type = motion_name.split('_')[1] 41 | # print(motion_type) 42 | if motion_type == 'treadmill' or motion_type == 'normal': 43 | if not os.path.exists(bk_subj_dir): 44 | os.mkdir(bk_subj_dir) 45 | bk_path = os.path.join(bk_subj_dir, motion_name) 46 | # print(bk_path) 47 | shutil.move(motion_file, bk_path) 48 | 49 | 50 | if MPI_HDM05: 51 | # remove ice skating clips from MPI_HDM05 52 | # dg/HDM_dg_07-01* is inline skating 53 | dataset_path = os.path.join(args.data, 'MPI_HDM05') 54 | if not os.path.exists(dataset_path): 55 | print('Could not find MPI_HDM05 data, not filtering out inline skating sequences...') 56 | else: 57 | # create output directory to backup moved data 58 | if not os.path.exists(args.backup): 59 | os.mkdir(args.backup) 60 | bk_dir = os.path.join(args.backup, 'MPI_HDM05') 61 | if not os.path.exists(bk_dir): 62 | os.mkdir(bk_dir) 63 | subj_path = os.path.join(dataset_path, 'dg') 64 | if not os.path.exists(subj_path): 65 | print('Could not find problematic subject in MPI_HDM05: dg') 66 | else: 67 | skating_clips = sorted(glob.glob(subj_path + '/HDM_dg_07-01*')) 68 | # print(skating_clips) 69 | # print(len(skating_clips)) 70 | bk_dir = os.path.join(bk_dir, 'dg') 71 | if not os.path.exists(bk_dir): 72 | os.mkdir(bk_dir) 73 | 74 | for clip in skating_clips: 75 | bk_path = os.path.join(bk_dir, clip.split('/')[-1]) 76 | # print(bk_path) 77 | shutil.move(clip, bk_path) 78 | 79 | 80 | if __name__ == "__main__": 81 | parser = argparse.ArgumentParser() 82 | parser.add_argument('--data', type=str, required=True, help='Root dir of processed AMASS data') 83 | parser.add_argument('--backup', type=str, required=True, help='Root directory to save removed data to.') 84 | 85 | config = parser.parse_known_args() 86 | config = config[0] 87 | 88 | main(config) -------------------------------------------------------------------------------- /humor/train/train_humor.py: -------------------------------------------------------------------------------- 1 | 2 | import sys, os 3 | cur_file_path = os.path.dirname(os.path.realpath(__file__)) 4 | sys.path.append(os.path.join(cur_file_path, '..')) 5 | 6 | import importlib, time 7 | import traceback 8 | import numpy as np 9 | 10 | import torch 11 | import torch.optim as optim 12 | from torch.optim.lr_scheduler import MultiStepLR 13 | from torch.utils.data import DataLoader 14 | 15 | from utils.config import TrainConfig 16 | from utils.logging import Logger, class_name_to_file_name, mkdir, cp_files 17 | from utils.torch import get_device, save_state, load_state 18 | from utils.stats import StatTracker 19 | 20 | NUM_WORKERS = 2 21 | 22 | def parse_args(argv): 23 | # create config and parse args 24 | config = TrainConfig(argv) 25 | known_args, unknown_args = config.parse() 26 | print('Unrecognized args: ' + str(unknown_args)) 27 | return known_args 28 | 29 | def train(args_obj, config_file): 30 | 31 | # set up output 32 | args = args_obj.base 33 | mkdir(args.out) 34 | 35 | # create logging system 36 | train_log_path = os.path.join(args.out, 'train.log') 37 | Logger.init(train_log_path) 38 | 39 | # save arguments used 40 | Logger.log('Base args: ' + str(args)) 41 | Logger.log('Model args: ' + str(args_obj.model)) 42 | 
Logger.log('Dataset args: ' + str(args_obj.dataset)) 43 | Logger.log('Loss args: ' + str(args_obj.loss)) 44 | 45 | # save training script/model/dataset used 46 | train_scripts_path = os.path.join(args.out, 'train_scripts') 47 | mkdir(train_scripts_path) 48 | pkg_root = os.path.join(cur_file_path, '..') 49 | dataset_file = class_name_to_file_name(args.dataset) 50 | dataset_file_path = os.path.join(pkg_root, 'datasets/' + dataset_file + '.py') 51 | model_file = class_name_to_file_name(args.model) 52 | loss_file = class_name_to_file_name(args.loss) 53 | model_file_path = os.path.join(pkg_root, 'models/' + model_file + '.py') 54 | train_file_path = os.path.join(pkg_root, 'train/train_humor.py') 55 | cp_files(train_scripts_path, [train_file_path, model_file_path, dataset_file_path, config_file]) 56 | 57 | # load model class and instantiate 58 | model_class = importlib.import_module('models.' + model_file) 59 | Model = getattr(model_class, args.model) 60 | model = Model(**args_obj.model_dict, 61 | model_smpl_batch_size=args.batch_size) # assumes model is HumorModel 62 | 63 | # load loss class and instantiate 64 | loss_class = importlib.import_module('losses.' + loss_file) 65 | Loss = getattr(loss_class, args.loss) 66 | loss_func = Loss(**args_obj.loss_dict, 67 | smpl_batch_size=args.batch_size*args_obj.dataset.sample_num_frames) # assumes loss is HumorLoss 68 | 69 | device = get_device(args.gpu) 70 | model.to(device) 71 | loss_func.to(device) 72 | 73 | print(model) 74 | 75 | # count params in model 76 | model_parameters = filter(lambda p: p.requires_grad, model.parameters()) 77 | params = sum([np.prod(p.size()) for p in model_parameters]) 78 | Logger.log('Num model params: ' + str(params)) 79 | 80 | # freeze params in loss 81 | for param in loss_func.parameters(): 82 | param.requires_grad = False 83 | 84 | # optimizer 85 | betas = (args.beta1, args.beta2) 86 | if args.use_adam: 87 | optimizer = optim.Adam(model.parameters(), 88 | lr=args.lr, 89 | betas=betas, 90 | eps=args.eps, 91 | weight_decay=args.decay) 92 | else: 93 | optimizer = optim.Adamax(model.parameters(), 94 | lr=args.lr, 95 | betas=betas, 96 | eps=args.eps, 97 | weight_decay=args.decay) 98 | 99 | # load in pretrained weights/optimizer state if given 100 | start_epoch = 0 101 | min_val_loss = min_train_loss = float('inf') 102 | if args.ckpt is not None: 103 | load_optim = optimizer if args.load_optim else None 104 | start_epoch, min_val_loss, min_train_loss = load_state(args.ckpt, model, optimizer=load_optim, map_location=device, ignore_keys=model.ignore_keys) 105 | start_epoch += 1 106 | Logger.log('Resuming from saved checkpoint at epoch idx %d with min val loss %.6f...' % (start_epoch, min_val_loss)) 107 | if not args.load_optim: 108 | Logger.log('Not loading optimizer state as desired...') 109 | Logger.log('WARNING: Also resetting min_val_loss and epoch count!') 110 | min_val_loss = float('inf') 111 | start_epoch = 0 112 | 113 | # initialize LR scheduler 114 | scheduler = MultiStepLR(optimizer, milestones=args.sched_milestones, gamma=args.sched_decay) 115 | 116 | # intialize schedule sampling if desired 117 | use_sched_samp = False 118 | if args.sched_samp_start is not None and args.sched_samp_end is not None: 119 | if args.sched_samp_start >= 0 and args.sched_samp_end >= args.sched_samp_start: 120 | Logger.log('Using scheduled sampling starting at epoch %d and ending at epoch %d!' 
% (args.sched_samp_start, args.sched_samp_end)) 121 | use_sched_samp = True 122 | else: 123 | Logger.log('Could not use scheduled sampling with given start and end!') 124 | 125 | # load dataset class and instantiate training and validation set 126 | Dataset = getattr(importlib.import_module('datasets.' + dataset_file), args.dataset) 127 | train_dataset = Dataset(split='train', **args_obj.dataset_dict) 128 | val_dataset = Dataset(split='val', **args_obj.dataset_dict) 129 | # create loaders 130 | train_loader = DataLoader(train_dataset, 131 | batch_size=args.batch_size, 132 | shuffle=True, 133 | num_workers=NUM_WORKERS, 134 | pin_memory=True, 135 | worker_init_fn=lambda _: np.random.seed()) # get around pytorch RNG seed bug 136 | val_loader = DataLoader(val_dataset, 137 | batch_size=args.batch_size, 138 | shuffle=False, 139 | num_workers=NUM_WORKERS, 140 | pin_memory=True, 141 | worker_init_fn=lambda _: np.random.seed()) 142 | 143 | # stats tracker 144 | tensorboard_path = os.path.join(args.out, 'train_tensorboard') 145 | mkdir(tensorboard_path) 146 | stat_tracker = StatTracker(tensorboard_path) 147 | 148 | # checkpoints saving 149 | ckpts_path = os.path.join(args.out, 'checkpoints') 150 | mkdir(ckpts_path) 151 | 152 | if use_sched_samp: 153 | train_dataset.return_global = True 154 | val_dataset.return_global = True 155 | 156 | # main training loop 157 | train_start_t = time.time() 158 | for epoch in range(start_epoch, args.epochs): 159 | 160 | model.train() 161 | 162 | # train 163 | stat_tracker.reset() 164 | batch_start_t = None 165 | reset_loss_track = train_dataset.pre_batch(epoch=epoch) 166 | # see which phase we're in 167 | sched_samp_gt_p = 1.0 # supervised 168 | if use_sched_samp: 169 | if epoch >= args.sched_samp_start and epoch < args.sched_samp_end: 170 | frac = (epoch - args.sched_samp_start) / (args.sched_samp_end - args.sched_samp_start) 171 | sched_samp_gt_p = 1.0*(1.0 - frac) 172 | elif epoch >= args.sched_samp_end: 173 | # autoregressive 174 | sched_samp_gt_p = 0.0 175 | Logger.log('Scheduled sampling current use_gt_p = %f' % (sched_samp_gt_p)) 176 | 177 | if epoch == args.sched_samp_end: 178 | # the loss will naturally go up when using own rollouts 179 | reset_loss_track = True 180 | 181 | if args_obj.loss_dict['kl_loss_cycle_len'] > 0: 182 | # if we're cycling, only want to save results when using full ELBO 183 | if (epoch % args_obj.loss_dict['kl_loss_cycle_len']) > (args_obj.loss_dict['kl_loss_cycle_len'] // 2): 184 | # have reached second half of a cycle 185 | reset_loss_track = True 186 | 187 | if reset_loss_track: 188 | Logger.log('Resetting min_val_loss and min_train_loss') 189 | min_val_loss = min_train_loss = float('inf') 190 | 191 | for i, data in enumerate(train_loader): 192 | batch_start_t = time.time() 193 | 194 | try: 195 | # zero the gradients 196 | optimizer.zero_grad() 197 | # run model 198 | loss, stats_dict = model_class.step(model, loss_func, data, train_dataset, device, epoch, mode='train', use_gt_p=sched_samp_gt_p) 199 | if torch.isnan(loss).item(): 200 | Logger.log('WARNING: NaN loss. Skipping to next data...') 201 | torch.cuda.empty_cache() 202 | continue 203 | # backprop and step 204 | loss.backward() 205 | # check gradients 206 | parameters = [p for p in model.parameters() if p.grad is not None] 207 | total_norm = torch.norm(torch.stack([torch.norm(p.grad.detach(), 2.0).to(device) for p in parameters]), 2.0) 208 | if torch.isnan(total_norm): 209 | Logger.log('WARNING: NaN gradients. 
Skipping to next data...') 210 | torch.cuda.empty_cache() 211 | continue 212 | optimizer.step() 213 | except (RuntimeError, AssertionError) as e: 214 | if epoch > 0: 215 | # to catch bad dynamics, but keep training 216 | Logger.log('WARNING: caught an exception during forward or backward pass. Skipping to next data...') 217 | Logger.log(e) 218 | traceback.print_exc() 219 | reset_loss_track = train_dataset.pre_batch(epoch=epoch) 220 | if reset_loss_track: 221 | Logger.log('Resetting min_val_loss and min_train_loss') 222 | min_val_loss = min_train_loss = float('inf') 223 | continue 224 | else: 225 | raise e 226 | 227 | # collect stats 228 | batch_elapsed_t = time.time() - batch_start_t 229 | total_elapsed_t = time.time() - train_start_t 230 | stats_dict['loss'] = loss 231 | for param_group in optimizer.param_groups: 232 | stats_dict['lr'] = torch.Tensor([param_group['lr']])[0] 233 | stats_dict['time_per_batch'] = torch.Tensor([batch_elapsed_t])[0] 234 | 235 | last_batch = (i==(len(train_loader)-1)) 236 | stat_tracker.update(stats_dict, tag='train', save_tf=last_batch) 237 | if i % args.print_every == 0: 238 | stat_tracker.print(i, len(train_loader), 239 | epoch, args.epochs, 240 | total_elapsed_time=total_elapsed_t, 241 | tag='train') 242 | 243 | reset_loss_track = train_dataset.pre_batch(epoch=epoch) 244 | if reset_loss_track: 245 | Logger.log('Resetting min_val_loss and min_train_loss') 246 | min_val_loss = min_train_loss = float('inf') 247 | 248 | # save if desired 249 | if epoch % args.save_every == 0: 250 | Logger.log('Saving checkpoint...') 251 | save_file = os.path.join(ckpts_path, 'epoch_%08d_model.pth' % (epoch)) 252 | save_state(save_file, model, optimizer, cur_epoch=epoch, min_val_loss=min_val_loss, min_train_loss=min_train_loss, ignore_keys=model.ignore_keys) 253 | 254 | # check if it's the best train model so far 255 | mean_train_loss = stat_tracker.meter_dict['train/loss'].avg 256 | if mean_train_loss < min_train_loss: 257 | min_train_loss = mean_train_loss 258 | Logger.log('Best train loss so far! Saving checkpoint...') 259 | save_file = os.path.join(ckpts_path, 'best_train_model.pth') 260 | save_state(save_file, model, optimizer, cur_epoch=epoch, min_val_loss=min_val_loss, min_train_loss=min_train_loss, ignore_keys=model.ignore_keys) 261 | 262 | # validate 263 | if epoch % args.val_every == 0: 264 | with torch.no_grad(): 265 | # run on validation data 266 | model.eval() 267 | 268 | stat_tracker.reset() 269 | for i, data in enumerate(val_loader): 270 | # print(i) 271 | batch_start_t = time.time() 272 | # run model 273 | loss, stats_dict = model_class.step(model, loss_func, data, val_dataset, device, epoch, mode='test', use_gt_p=sched_samp_gt_p) 274 | 275 | if torch.isnan(loss): 276 | Logger.log('WARNING: NaN loss on VALIDATION. 
Skipping to next data...') 277 | continue 278 | 279 | # collect stats 280 | batch_elapsed_t = time.time() - batch_start_t 281 | total_elapsed_t = time.time() - train_start_t 282 | stats_dict['loss'] = loss 283 | stats_dict['time_per_batch'] = torch.Tensor([batch_elapsed_t])[0] 284 | 285 | stat_tracker.update(stats_dict, tag='val', save_tf=(i==(len(val_loader)-1)), increment_step=False) 286 | 287 | if i % args.print_every == 0: 288 | stat_tracker.print(i, len(val_loader), 289 | epoch, args.epochs, 290 | total_elapsed_time=total_elapsed_t, 291 | tag='val') 292 | 293 | # check if it's the best model so far 294 | mean_val_loss = stat_tracker.meter_dict['val/loss'].avg 295 | if mean_val_loss < min_val_loss: 296 | min_val_loss = mean_val_loss 297 | Logger.log('Best val loss so far! Saving checkpoint...') 298 | save_file = os.path.join(ckpts_path, 'best_model.pth') 299 | save_state(save_file, model, optimizer, cur_epoch=epoch, min_val_loss=min_val_loss, min_train_loss=min_train_loss, ignore_keys=model.ignore_keys) 300 | 301 | scheduler.step() 302 | 303 | torch.cuda.empty_cache() 304 | 305 | Logger.log('Finished!') 306 | 307 | def main(args, config_file): 308 | train(args, config_file) 309 | 310 | if __name__=='__main__': 311 | args = parse_args(sys.argv[1:]) 312 | config_file = sys.argv[1:][0][1:] 313 | main(args, config_file) -------------------------------------------------------------------------------- /humor/train/train_state_prior.py: -------------------------------------------------------------------------------- 1 | import sys, os 2 | cur_file_path = os.path.dirname(os.path.realpath(__file__)) 3 | sys.path.append(os.path.join(cur_file_path, '..')) 4 | 5 | import argparse, time 6 | import numpy as np 7 | import torch 8 | from torch.utils.data import DataLoader 9 | from torch.distributions import MixtureSameFamily, MultivariateNormal, Independent, Categorical, Normal 10 | 11 | from sklearn.mixture import GaussianMixture 12 | from sklearn.decomposition import PCA 13 | 14 | from datasets.amass_discrete_dataset import AmassDiscreteDataset 15 | 16 | def parse_args(argv): 17 | parser = argparse.ArgumentParser(allow_abbrev=False) 18 | 19 | parser.add_argument('--data', type=str, required=True, help='Path to the full AMASS dataset') 20 | parser.add_argument('--out', type=str, required=True, help='Path to save outputs to') 21 | parser.add_argument('--train-states', type=str, default=None, help='npy file with pre-loaded train states') 22 | parser.add_argument('--gmm-comps', type=int, default=12, help='Number of GMM components to use') 23 | 24 | parser.add_argument('--viz-only', dest='viz_only', action='store_true', help="If given, only visualizes results in the given output directory, does not refit.") 25 | parser.set_defaults(viz_only=False) 26 | parser.add_argument('--test-only', dest='test_only', action='store_true', help="If given, only runs fitting in the given output directory on test data, does not refit.") 27 | parser.set_defaults(test_only=False) 28 | 29 | known_args, unknown_args = parser.parse_known_args(argv) 30 | 31 | return known_args 32 | 33 | def main(args): 34 | print(args) 35 | 36 | if not os.path.exists(args.out): 37 | os.makedirs(args.out) 38 | 39 | all_states_out_path = os.path.join(args.out, 'train_states.npy') 40 | gmm_out_path = os.path.join(args.out, 'prior_gmm.npz') 41 | if args.viz_only: 42 | print('Visualizing results...') 43 | viz_gmm_fit_results(gmm_out_path) 44 | exit() 45 | if args.test_only: 46 | print('Evaluating on test set...') 47 | test_results(args.data, 
gmm_out_path) 48 | exit() 49 | 50 | all_states = None 51 | if args.train_states is not None: 52 | start_t = time.time() 53 | print('Loading processed train states...') 54 | all_states = np.load(args.train_states) 55 | print('Loaded in %f s' % (time.time() - start_t)) 56 | else: 57 | amass_dataset = AmassDiscreteDataset(split='train', 58 | data_paths=[args.data], 59 | split_by='dataset', 60 | sample_num_frames=1, 61 | step_frames_in=1, 62 | step_frames_out=0, 63 | data_rot_rep='aa', 64 | data_return_config='smpl+joints', 65 | deterministic_train=True, 66 | return_global=False, 67 | only_global=False) 68 | 69 | batch_size = 1000 70 | loader = DataLoader(amass_dataset, 71 | batch_size=batch_size, 72 | shuffle=False, 73 | num_workers=8, 74 | pin_memory=False, 75 | drop_last=False, 76 | worker_init_fn=lambda _: np.random.seed()) # get around numpy RNG seed bug 77 | 78 | all_states = [] 79 | for i, data in enumerate(loader): 80 | print('Batch %d/%d...' % (i, len(loader))) 81 | start_t = time.time() 82 | batch_in, _, meta = data 83 | B = batch_in['joints'].size(0) 84 | joints = batch_in['joints'][:,0,0].reshape((B, -1)) 85 | joints_vel = batch_in['joints_vel'][:,0,0].reshape((B, -1)) 86 | trans_vel = batch_in['trans_vel'][:,0,0] 87 | root_orient_vel = batch_in['root_orient_vel'][:,0,0] 88 | 89 | cur_state = torch.cat([joints, joints_vel, trans_vel, root_orient_vel], dim=-1) 90 | all_states.append(cur_state) 91 | 92 | all_states = torch.cat(all_states, dim=0).numpy() 93 | np.save(all_states_out_path, all_states) 94 | 95 | print(all_states.shape) 96 | 97 | print('Fitting GMM with %d components...' % (args.gmm_comps)) 98 | start_t = time.time() 99 | gmm = GaussianMixture(n_components=args.gmm_comps, 100 | covariance_type='full', 101 | tol=0.001, 102 | reg_covar=1e-06, 103 | max_iter=200, 104 | n_init=1, 105 | init_params='kmeans', 106 | weights_init=None, 107 | means_init=None, 108 | precisions_init=None, 109 | random_state=0, 110 | warm_start=False, 111 | verbose=1, 112 | verbose_interval=5) 113 | gmm.fit(all_states) 114 | # print(gmm.weights_) 115 | # print(gmm.means_) 116 | # print(gmm.covariances_) 117 | print(gmm.converged_) 118 | print(gmm.weights_.shape) 119 | print(gmm.means_.shape) 120 | print(gmm.covariances_.shape) 121 | 122 | # save distirbution information 123 | np.savez(gmm_out_path, weights=gmm.weights_, means=gmm.means_, covariances=gmm.covariances_) 124 | 125 | print('GMM time: %f s' % (time.time() - start_t)) 126 | 127 | print('Running evaluation on test set...') 128 | test_results(args.data, gmm_out_path) 129 | # print('Visualizing sampled results...') 130 | # viz_gmm_fit_results(gmm_out_path, debug_gmm_obj=gmm, debug_data=all_states) 131 | 132 | def load_gmm_results(gmm_path): 133 | gmm_res = np.load(gmm_path) 134 | gmm_weights = gmm_res['weights'] 135 | gmm_means = gmm_res['means'] 136 | gmm_covs = gmm_res['covariances'] 137 | return gmm_weights, gmm_means, gmm_covs 138 | 139 | def build_pytorch_gmm(gmm_weights, gmm_means, gmm_covs): 140 | mix = Categorical(torch.from_numpy(gmm_weights)) 141 | comp = MultivariateNormal(torch.from_numpy(gmm_means), covariance_matrix=torch.from_numpy(gmm_covs)) 142 | gmm_distrib = MixtureSameFamily(mix, comp) 143 | return gmm_distrib 144 | 145 | def viz_gmm_fit_results(gmm_path, debug_gmm_obj=None, debug_data=None): 146 | 147 | # load in GMM result 148 | gmm_weights, gmm_means, gmm_covs = load_gmm_results(gmm_path) 149 | 150 | # build pytorch distrib 151 | gmm_distrib = build_pytorch_gmm(gmm_weights, gmm_means, gmm_covs) 152 | 
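The pair of helpers just above (`load_gmm_results` and `build_pytorch_gmm`) is all that is needed to reuse the fitted initial-state prior outside this script. Below is a minimal standalone sketch of that pattern, not part of the repo: the file name `prior_gmm.npz` matches the output path used in `main()`, while the batch size of 16 and the 138-dimensional state layout (66 joint positions + 66 joint velocities + 3 translation velocity + 3 root orientation velocity) are illustrative assumptions taken from the state construction above.

```
import numpy as np
import torch
from torch.distributions import Categorical, MultivariateNormal, MixtureSameFamily

# Load the saved GMM parameters: weights (K,), means (K, 138), covariances (K, 138, 138).
res = np.load('prior_gmm.npz')
mix = Categorical(torch.from_numpy(res['weights']))
comp = MultivariateNormal(torch.from_numpy(res['means']),
                          covariance_matrix=torch.from_numpy(res['covariances']))
gmm_distrib = MixtureSameFamily(mix, comp)

# Draw a few initial states and score them under the prior.
sample_states = gmm_distrib.sample(torch.Size([16]))   # (16, 138)
log_prob = gmm_distrib.log_prob(sample_states)         # (16,)
print(sample_states.size(), log_prob.mean().item())
```

This is the same sampling and log-likelihood evaluation that `viz_gmm_fit_results` and `test_results` perform with the distribution below.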
print([gmm_distrib.batch_shape, gmm_distrib.event_shape]) 153 | 154 | if debug_gmm_obj is not None and debug_data is not None: 155 | print('pytorch logprob...') 156 | torch_logprob = gmm_distrib.log_prob(torch.from_numpy(debug_data)) 157 | print(torch_logprob.size()) 158 | print(torch_logprob[:20]) 159 | 160 | print('sklearn logprob...') 161 | sk_logprob = debug_gmm_obj.score_samples(debug_data) 162 | print(sk_logprob.shape) 163 | print(sk_logprob[:20]) 164 | 165 | # sample randomly 166 | num_samps = 100 167 | sample_states = gmm_distrib.sample(torch.Size([num_samps])) 168 | 169 | num_samps = gmm_means.shape[0] 170 | sample_states = torch.from_numpy(gmm_means) 171 | print(gmm_weights) 172 | 173 | torch_logprob = gmm_distrib.log_prob(sample_states) 174 | 175 | print(sample_states.size()) 176 | print(torch_logprob) 177 | print(torch_logprob.mean()) 178 | 179 | # visualize joints and velocities 180 | from viz.utils import viz_results, viz_smpl_seq 181 | 182 | # visualize results 183 | viz_joints = sample_states[:,:66].reshape((num_samps, 22, 3)) 184 | viz_joints_vel = sample_states[:,66:132].reshape((num_samps, 22, 3)) 185 | viz_trans_vel = sample_states[:,132:135] 186 | viz_root_orient_vel = sample_states[:,135:] 187 | print(viz_joints.shape) 188 | print(viz_joints_vel.shape) 189 | print(viz_trans_vel.shape) 190 | print(viz_root_orient_vel.shape) 191 | print('Showing joint velocities...') 192 | viz_smpl_seq(None, imw=1080, imh=1080, fps=10, contacts=None, 193 | render_body=False, render_joints=True, render_skeleton=True, render_ground=True, 194 | joints_seq=viz_joints, 195 | joints_vel=viz_joints_vel) 196 | print('Showing root velocity...') 197 | viz_smpl_seq(None, imw=1080, imh=1080, fps=10, contacts=None, 198 | render_body=False, render_joints=True, render_skeleton=True, render_ground=True, 199 | joints_seq=viz_joints, 200 | joints_vel=viz_trans_vel.reshape((-1, 1, 3)).repeat((1, 22, 1))) 201 | print('Showing root orient velocity...') 202 | viz_smpl_seq(None, imw=1080, imh=1080, fps=10, contacts=None, 203 | render_body=False, render_joints=True, render_skeleton=True, render_ground=True, 204 | joints_seq=viz_joints, 205 | joints_vel=viz_root_orient_vel.reshape((-1, 1, 3)).repeat((1, 22, 1))) 206 | 207 | def test_results(data_path, gmm_path): 208 | # 209 | # Evaluate likelihood of test data 210 | # 211 | 212 | # load in GMM result 213 | gmm_weights, gmm_means, gmm_covs = load_gmm_results(gmm_path) 214 | 215 | # build pytorch distrib 216 | gmm_distrib = build_pytorch_gmm(gmm_weights, gmm_means, gmm_covs) 217 | 218 | # load in all test data 219 | test_dataset = AmassDiscreteDataset(split='test', 220 | data_paths=[data_path], 221 | split_by='dataset', 222 | sample_num_frames=1, 223 | step_frames_in=1, 224 | step_frames_out=0, 225 | data_rot_rep='aa', 226 | data_return_config='smpl+joints', 227 | deterministic_train=True, 228 | return_global=False, 229 | only_global=False) 230 | 231 | batch_size = 1000 232 | test_loader = DataLoader(test_dataset, 233 | batch_size=batch_size, 234 | shuffle=False, 235 | num_workers=8, 236 | pin_memory=False, 237 | drop_last=False, 238 | worker_init_fn=lambda _: np.random.seed()) # get around numpy RNG seed bug 239 | 240 | test_states = [] 241 | for i, data in enumerate(test_loader): 242 | print('Batch %d/%d...' 
% (i, len(test_loader))) 243 | start_t = time.time() 244 | batch_in, _, meta = data 245 | # print(meta['path']) 246 | 247 | B = batch_in['joints'].size(0) 248 | joints = batch_in['joints'][:,0,0].reshape((B, -1)) 249 | joints_vel = batch_in['joints_vel'][:,0,0].reshape((B, -1)) 250 | trans_vel = batch_in['trans_vel'][:,0,0] 251 | root_orient_vel = batch_in['root_orient_vel'][:,0,0] 252 | 253 | cur_state = torch.cat([joints, joints_vel, trans_vel, root_orient_vel], dim=-1) 254 | test_states.append(cur_state) 255 | 256 | test_states = torch.cat(test_states, dim=0) 257 | print(test_states.size()) 258 | 259 | # eval likelihood 260 | test_logprob = gmm_distrib.log_prob(test_states) 261 | mean_logprob = test_logprob.mean() 262 | 263 | print('Mean test logprob: %f' % (mean_logprob.item())) 264 | 265 | if __name__=='__main__': 266 | args = parse_args(sys.argv[1:]) 267 | main(args) -------------------------------------------------------------------------------- /humor/utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/davrempe/humor/fc6ef84f0baa153be15427402e0147ed1a63a11a/humor/utils/__init__.py -------------------------------------------------------------------------------- /humor/utils/chamfer_distance/LICENSE: -------------------------------------------------------------------------------- 1 | 2 | 3 | MIT License 4 | 5 | Copyright (c) [year] [fullname] 6 | 7 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: 8 | 9 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 10 | 11 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
12 | -------------------------------------------------------------------------------- /humor/utils/chamfer_distance/__init__.py: -------------------------------------------------------------------------------- 1 | from .chamfer_distance import ChamferDistance 2 | -------------------------------------------------------------------------------- /humor/utils/chamfer_distance/chamfer_distance.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | // CUDA forward declarations 4 | int ChamferDistanceKernelLauncher( 5 | const int b, const int n, 6 | const float* xyz, 7 | const int m, 8 | const float* xyz2, 9 | float* result, 10 | int* result_i, 11 | float* result2, 12 | int* result2_i); 13 | 14 | int ChamferDistanceGradKernelLauncher( 15 | const int b, const int n, 16 | const float* xyz1, 17 | const int m, 18 | const float* xyz2, 19 | const float* grad_dist1, 20 | const int* idx1, 21 | const float* grad_dist2, 22 | const int* idx2, 23 | float* grad_xyz1, 24 | float* grad_xyz2); 25 | 26 | 27 | void chamfer_distance_forward_cuda( 28 | const at::Tensor xyz1, 29 | const at::Tensor xyz2, 30 | const at::Tensor dist1, 31 | const at::Tensor dist2, 32 | const at::Tensor idx1, 33 | const at::Tensor idx2) 34 | { 35 | ChamferDistanceKernelLauncher(xyz1.size(0), xyz1.size(1), xyz1.data(), 36 | xyz2.size(1), xyz2.data(), 37 | dist1.data(), idx1.data(), 38 | dist2.data(), idx2.data()); 39 | } 40 | 41 | void chamfer_distance_backward_cuda( 42 | const at::Tensor xyz1, 43 | const at::Tensor xyz2, 44 | at::Tensor gradxyz1, 45 | at::Tensor gradxyz2, 46 | at::Tensor graddist1, 47 | at::Tensor graddist2, 48 | at::Tensor idx1, 49 | at::Tensor idx2) 50 | { 51 | ChamferDistanceGradKernelLauncher(xyz1.size(0), xyz1.size(1), xyz1.data(), 52 | xyz2.size(1), xyz2.data(), 53 | graddist1.data(), idx1.data(), 54 | graddist2.data(), idx2.data(), 55 | gradxyz1.data(), gradxyz2.data()); 56 | } 57 | 58 | 59 | void nnsearch( 60 | const int b, const int n, const int m, 61 | const float* xyz1, 62 | const float* xyz2, 63 | float* dist, 64 | int* idx) 65 | { 66 | for (int i = 0; i < b; i++) { 67 | for (int j = 0; j < n; j++) { 68 | const float x1 = xyz1[(i*n+j)*3+0]; 69 | const float y1 = xyz1[(i*n+j)*3+1]; 70 | const float z1 = xyz1[(i*n+j)*3+2]; 71 | double best = 0; 72 | int besti = 0; 73 | for (int k = 0; k < m; k++) { 74 | const float x2 = xyz2[(i*m+k)*3+0] - x1; 75 | const float y2 = xyz2[(i*m+k)*3+1] - y1; 76 | const float z2 = xyz2[(i*m+k)*3+2] - z1; 77 | const double d=x2*x2+y2*y2+z2*z2; 78 | if (k==0 || d < best){ 79 | best = d; 80 | besti = k; 81 | } 82 | } 83 | dist[i*n+j] = best; 84 | idx[i*n+j] = besti; 85 | } 86 | } 87 | } 88 | 89 | 90 | void chamfer_distance_forward( 91 | const at::Tensor xyz1, 92 | const at::Tensor xyz2, 93 | const at::Tensor dist1, 94 | const at::Tensor dist2, 95 | const at::Tensor idx1, 96 | const at::Tensor idx2) 97 | { 98 | const int batchsize = xyz1.size(0); 99 | const int n = xyz1.size(1); 100 | const int m = xyz2.size(1); 101 | 102 | const float* xyz1_data = xyz1.data(); 103 | const float* xyz2_data = xyz2.data(); 104 | float* dist1_data = dist1.data(); 105 | float* dist2_data = dist2.data(); 106 | int* idx1_data = idx1.data(); 107 | int* idx2_data = idx2.data(); 108 | 109 | nnsearch(batchsize, n, m, xyz1_data, xyz2_data, dist1_data, idx1_data); 110 | nnsearch(batchsize, m, n, xyz2_data, xyz1_data, dist2_data, idx2_data); 111 | } 112 | 113 | 114 | void chamfer_distance_backward( 115 | const at::Tensor xyz1, 116 | const at::Tensor xyz2, 117 | 
at::Tensor gradxyz1, 118 | at::Tensor gradxyz2, 119 | at::Tensor graddist1, 120 | at::Tensor graddist2, 121 | at::Tensor idx1, 122 | at::Tensor idx2) 123 | { 124 | const int b = xyz1.size(0); 125 | const int n = xyz1.size(1); 126 | const int m = xyz2.size(1); 127 | 128 | const float* xyz1_data = xyz1.data(); 129 | const float* xyz2_data = xyz2.data(); 130 | float* gradxyz1_data = gradxyz1.data(); 131 | float* gradxyz2_data = gradxyz2.data(); 132 | float* graddist1_data = graddist1.data(); 133 | float* graddist2_data = graddist2.data(); 134 | const int* idx1_data = idx1.data(); 135 | const int* idx2_data = idx2.data(); 136 | 137 | for (int i = 0; i < b*n*3; i++) 138 | gradxyz1_data[i] = 0; 139 | for (int i = 0; i < b*m*3; i++) 140 | gradxyz2_data[i] = 0; 141 | for (int i = 0;i < b; i++) { 142 | for (int j = 0; j < n; j++) { 143 | const float x1 = xyz1_data[(i*n+j)*3+0]; 144 | const float y1 = xyz1_data[(i*n+j)*3+1]; 145 | const float z1 = xyz1_data[(i*n+j)*3+2]; 146 | const int j2 = idx1_data[i*n+j]; 147 | 148 | const float x2 = xyz2_data[(i*m+j2)*3+0]; 149 | const float y2 = xyz2_data[(i*m+j2)*3+1]; 150 | const float z2 = xyz2_data[(i*m+j2)*3+2]; 151 | const float g = graddist1_data[i*n+j]*2; 152 | 153 | gradxyz1_data[(i*n+j)*3+0] += g*(x1-x2); 154 | gradxyz1_data[(i*n+j)*3+1] += g*(y1-y2); 155 | gradxyz1_data[(i*n+j)*3+2] += g*(z1-z2); 156 | gradxyz2_data[(i*m+j2)*3+0] -= (g*(x1-x2)); 157 | gradxyz2_data[(i*m+j2)*3+1] -= (g*(y1-y2)); 158 | gradxyz2_data[(i*m+j2)*3+2] -= (g*(z1-z2)); 159 | } 160 | for (int j = 0; j < m; j++) { 161 | const float x1 = xyz2_data[(i*m+j)*3+0]; 162 | const float y1 = xyz2_data[(i*m+j)*3+1]; 163 | const float z1 = xyz2_data[(i*m+j)*3+2]; 164 | const int j2 = idx2_data[i*m+j]; 165 | const float x2 = xyz1_data[(i*n+j2)*3+0]; 166 | const float y2 = xyz1_data[(i*n+j2)*3+1]; 167 | const float z2 = xyz1_data[(i*n+j2)*3+2]; 168 | const float g = graddist2_data[i*m+j]*2; 169 | gradxyz2_data[(i*m+j)*3+0] += g*(x1-x2); 170 | gradxyz2_data[(i*m+j)*3+1] += g*(y1-y2); 171 | gradxyz2_data[(i*m+j)*3+2] += g*(z1-z2); 172 | gradxyz1_data[(i*n+j2)*3+0] -= (g*(x1-x2)); 173 | gradxyz1_data[(i*n+j2)*3+1] -= (g*(y1-y2)); 174 | gradxyz1_data[(i*n+j2)*3+2] -= (g*(z1-z2)); 175 | } 176 | } 177 | } 178 | 179 | 180 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) { 181 | m.def("forward", &chamfer_distance_forward, "ChamferDistance forward"); 182 | m.def("forward_cuda", &chamfer_distance_forward_cuda, "ChamferDistance forward (CUDA)"); 183 | m.def("backward", &chamfer_distance_backward, "ChamferDistance backward"); 184 | m.def("backward_cuda", &chamfer_distance_backward_cuda, "ChamferDistance backward (CUDA)"); 185 | } 186 | -------------------------------------------------------------------------------- /humor/utils/chamfer_distance/chamfer_distance.cu: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | #include 4 | #include 5 | 6 | __global__ 7 | void ChamferDistanceKernel( 8 | int b, 9 | int n, 10 | const float* xyz, 11 | int m, 12 | const float* xyz2, 13 | float* result, 14 | int* result_i) 15 | { 16 | const int batch=512; 17 | __shared__ float buf[batch*3]; 18 | for (int i=blockIdx.x;ibest){ 130 | result[(i*n+j)]=best; 131 | result_i[(i*n+j)]=best_i; 132 | } 133 | } 134 | __syncthreads(); 135 | } 136 | } 137 | } 138 | 139 | void ChamferDistanceKernelLauncher( 140 | const int b, const int n, 141 | const float* xyz, 142 | const int m, 143 | const float* xyz2, 144 | float* result, 145 | int* result_i, 146 | float* result2, 147 | int* result2_i) 
148 | { 149 | ChamferDistanceKernel<<>>(b, n, xyz, m, xyz2, result, result_i); 150 | ChamferDistanceKernel<<>>(b, m, xyz2, n, xyz, result2, result2_i); 151 | 152 | cudaError_t err = cudaGetLastError(); 153 | if (err != cudaSuccess) 154 | printf("error in chamfer distance updateOutput: %s\n", cudaGetErrorString(err)); 155 | } 156 | 157 | 158 | __global__ 159 | void ChamferDistanceGradKernel( 160 | int b, int n, 161 | const float* xyz1, 162 | int m, 163 | const float* xyz2, 164 | const float* grad_dist1, 165 | const int* idx1, 166 | float* grad_xyz1, 167 | float* grad_xyz2) 168 | { 169 | for (int i = blockIdx.x; i>>(b, n, xyz1, m, xyz2, grad_dist1, idx1, grad_xyz1, grad_xyz2); 204 | ChamferDistanceGradKernel<<>>(b, m, xyz2, n, xyz1, grad_dist2, idx2, grad_xyz2, grad_xyz1); 205 | 206 | cudaError_t err = cudaGetLastError(); 207 | if (err != cudaSuccess) 208 | printf("error in chamfer distance get grad: %s\n", cudaGetErrorString(err)); 209 | } 210 | -------------------------------------------------------------------------------- /humor/utils/chamfer_distance/chamfer_distance.py: -------------------------------------------------------------------------------- 1 | # 2 | # Taken from https://github.com/chrdiller/pyTorchChamferDistance 3 | # 4 | 5 | import torch 6 | from torch.utils.cpp_extension import load 7 | import os 8 | FileDirPath = os.path.dirname(os.path.realpath(__file__)) 9 | #print('[ INFO ]: Chamfer directory:', FileDirPath) 10 | cd = load(name='cd', sources=[os.path.join(FileDirPath, 'chamfer_distance.cpp'), os.path.join(FileDirPath, 'chamfer_distance.cu')]) 11 | 12 | class ChamferDistanceFunction(torch.autograd.Function): 13 | @staticmethod 14 | def forward(ctx, xyz1, xyz2): 15 | batchsize, n, _ = xyz1.size() 16 | _, m, _ = xyz2.size() 17 | xyz1 = xyz1.contiguous() 18 | xyz2 = xyz2.contiguous() 19 | dist1 = torch.zeros(batchsize, n) 20 | dist2 = torch.zeros(batchsize, m) 21 | 22 | idx1 = torch.zeros(batchsize, n, dtype=torch.int) 23 | idx2 = torch.zeros(batchsize, m, dtype=torch.int) 24 | 25 | if not xyz1.is_cuda: 26 | cd.forward(xyz1, xyz2, dist1, dist2, idx1, idx2) 27 | else: 28 | dist1 = dist1.cuda() 29 | dist2 = dist2.cuda() 30 | idx1 = idx1.cuda() 31 | idx2 = idx2.cuda() 32 | cd.forward_cuda(xyz1, xyz2, dist1, dist2, idx1, idx2) 33 | 34 | ctx.save_for_backward(xyz1, xyz2, idx1, idx2) 35 | 36 | return dist1, dist2 37 | 38 | @staticmethod 39 | def backward(ctx, graddist1, graddist2): 40 | xyz1, xyz2, idx1, idx2 = ctx.saved_tensors 41 | 42 | graddist1 = graddist1.contiguous() 43 | graddist2 = graddist2.contiguous() 44 | 45 | gradxyz1 = torch.zeros(xyz1.size()) 46 | gradxyz2 = torch.zeros(xyz2.size()) 47 | 48 | if not graddist1.is_cuda: 49 | cd.backward(xyz1, xyz2, gradxyz1, gradxyz2, graddist1, graddist2, idx1, idx2) 50 | else: 51 | gradxyz1 = gradxyz1.cuda() 52 | gradxyz2 = gradxyz2.cuda() 53 | cd.backward_cuda(xyz1, xyz2, gradxyz1, gradxyz2, graddist1, graddist2, idx1, idx2) 54 | 55 | return gradxyz1, gradxyz2 56 | 57 | 58 | class ChamferDistance(torch.nn.Module): 59 | def forward(self, xyz1, xyz2): 60 | return ChamferDistanceFunction.apply(xyz1, xyz2) 61 | -------------------------------------------------------------------------------- /humor/utils/config.py: -------------------------------------------------------------------------------- 1 | import sys, os 2 | file_path = os.path.dirname(os.path.realpath(__file__)) 3 | sys.path.append(os.path.join(file_path, '.')) 4 | 5 | import argparse, importlib 6 | 7 | class SplitLineParser(argparse.ArgumentParser): 8 | def 
convert_arg_line_to_args(self, arg_line): 9 | return arg_line.split() 10 | 11 | class Args(): 12 | ''' 13 | Container class to hold parsed arguments for the base/train/test configuration along with 14 | model and dataset-specific. 15 | ''' 16 | def __init__(self, base, model=None, dataset=None, loss=None): 17 | self.base = base 18 | self.model = model 19 | self.dataset = dataset 20 | self.loss = loss 21 | 22 | # dictionary versions of args that can be used to pass as constructor arguments 23 | self.model_dict = vars(self.model) if self.model is not None else None 24 | self.dataset_dict = vars(self.dataset) if self.dataset is not None else None 25 | self.loss_dict = vars(self.loss) if self.loss is not None else None 26 | 27 | class BaseConfig(): 28 | ''' 29 | Base configuration, arguments apply to both training and evaluation scripts. 30 | This configuration will automatically load the sub-configuration of the specified model and dataset if available. 31 | ''' 32 | def __init__(self, argv): 33 | self.argv = argv 34 | self.parser = SplitLineParser(fromfile_prefix_chars='@', allow_abbrev=False) 35 | 36 | self.parser.add_argument('--dataset', type=str, required=True, choices=['AmassDiscreteDataset'], help='The name of the dataset type.') 37 | self.parser.add_argument('--model', type=str, required=True, help='The name of the model to use.') 38 | self.parser.add_argument('--loss', type=str, default=None, help='The name of the loss to use.') 39 | self.parser.add_argument('--out', type=str, default='./output', help='The directory to save outputs to (logs, results, weights, etc..).') 40 | self.parser.add_argument('--ckpt', type=str, default=None, help='Path to model weights to start training/testing from.') 41 | 42 | self.parser.add_argument('--gpu', type=int, default=0, help='The GPU index to use.') 43 | 44 | self.parser.add_argument('--batch-size', type=int, default=8, help='Batch size for training.') 45 | self.parser.add_argument('--print-every', type=int, default=1, help='Number of batches between printing stats.') 46 | 47 | 48 | def parse(self): 49 | base_args, unknown_args = self.parser.parse_known_args(self.argv) 50 | 51 | # load any model-specific configuration 52 | model_args = None 53 | try: 54 | ModelConfig = getattr(importlib.import_module('config'), base_args.model + 'Config') 55 | model_config = ModelConfig(self.argv) 56 | model_args, model_unknown_args = model_config.parse() 57 | except AttributeError: 58 | print('No model-specific configuration for %s...' % (base_args.model)) 59 | model_unknown_args = unknown_args 60 | 61 | # load any dataset-specific configuration 62 | dataset_args = None 63 | try: 64 | DataConfig = getattr(importlib.import_module('config'), base_args.dataset + 'Config') 65 | data_config = DataConfig(self.argv) 66 | dataset_args, data_unknown_args = data_config.parse() 67 | except AttributeError: 68 | print('No data-specific configuration for %s...' % (base_args.dataset)) 69 | data_unknown_args = unknown_args 70 | 71 | # load any dataset-specific configuration 72 | use_loss_config = base_args.loss is not None 73 | loss_args = None 74 | loss_unknown_args = [] 75 | if use_loss_config: 76 | loss_args = None 77 | try: 78 | LossConfig = getattr(importlib.import_module('config'), base_args.loss + 'Config') 79 | loss_config = LossConfig(self.argv) 80 | loss_args, loss_unknown_args = loss_config.parse() 81 | except AttributeError: 82 | print('No data-specific configuration for %s...' 
% (base_args.loss)) 83 | loss_unknown_args = unknown_args 84 | 85 | # make sure unknown args are unknown to both if returning 86 | unknown_args = set([arg for arg in unknown_args if arg[:2] == '--']) 87 | model_unknown_args = set([arg for arg in model_unknown_args if arg[:2] == '--']) 88 | data_unknown_args = set([arg for arg in data_unknown_args if arg[:2] == '--']) 89 | loss_unknown_args = set([arg for arg in loss_unknown_args if arg[:2] == '--']) 90 | final_unknown_args = list(unknown_args.intersection(model_unknown_args, data_unknown_args, loss_unknown_args)) 91 | 92 | final_args = Args(base_args, model=model_args, dataset=dataset_args, loss=loss_args) 93 | 94 | return final_args, final_unknown_args 95 | 96 | # default args: model name (load additional args based on specific model name), dataset name 97 | 98 | class BaseSubConfig(): 99 | ''' 100 | Base sub-configuration, each model/dataset-specific sub-configuration should derive from this. 101 | ''' 102 | def __init__(self, argv): 103 | self.argv = argv 104 | self.parser = SplitLineParser(fromfile_prefix_chars='@', allow_abbrev=False) 105 | 106 | def parse(self, namespace=None): 107 | self.args = self.parser.parse_known_args(self.argv, namespace=namespace) 108 | return self.args 109 | 110 | # 111 | # NOTE: Edit/Add these configs for changes in training and testing scripts 112 | # 113 | 114 | class TrainConfig(BaseConfig): 115 | def __init__(self, argv): 116 | super(TrainConfig, self).__init__(argv) 117 | 118 | self.parser.add_argument('--epochs', type=int, default=1, help='Number of epochs for training.') 119 | self.parser.add_argument('--val-every', type=int, default=1, help='Number of epochs between validations.') 120 | self.parser.add_argument('--save-every', type=int, default=1, help='Number of epochs between saving model checkpoints.') 121 | 122 | self.parser.add_argument('--lr', type=float, default=1e-3, help='Starting learning rate.') 123 | self.parser.add_argument('--beta1', type=float, default=0.9, help='Beta1 for ADAM') 124 | self.parser.add_argument('--beta2', type=float, default=0.999, help='Beta2 for ADAM') 125 | self.parser.add_argument('--eps', type=float, default=1e-8, help='Epsilon rate for ADAM') 126 | self.parser.add_argument('--sched-milestones', type=int, nargs='+', default=[1], help='List of epochs to decay learning rate.') 127 | self.parser.add_argument('--sched-decay', type=float, default=1.0, help='The decay rate of the LR scheduler, by default there is no decay.') 128 | 129 | self.parser.add_argument('--decay', type=float, default=0.0, help='Weight decay on params.') 130 | 131 | self.parser.add_argument('--no-load-optim', dest='load_optim', action='store_false', help="If given, will not load the state of the optimizer to continue training from a chekcpoint.") 132 | self.parser.set_defaults(load_optim=True) 133 | 134 | self.parser.add_argument('--adam', dest='use_adam', action='store_true', help="If given, uses Adam optimizer rather than Adamax.") 135 | self.parser.set_defaults(use_adam=False) 136 | 137 | self.parser.add_argument('--sched-samp-start', type=int, default=None, help='The epoch at which to start scheduled sampling after the supervised phase of training.') 138 | self.parser.add_argument('--sched-samp-end', type=int, default=None, help='The epoch at which to end scheduled sampling which moves on to the autoregressive phase of training.') 139 | 140 | class TestConfig(BaseConfig): 141 | def __init__(self, argv): 142 | super(TestConfig, self).__init__(argv) 143 | # NOTE: add test-specific options here 
(e.g. which test to run) 144 | self.parser.add_argument('--shuffle-test', dest='shuffle_test', action='store_true', help="Shuffles test data.") 145 | self.parser.set_defaults(shuffle_test=False) 146 | self.parser.add_argument('--test-on-train', dest='test_on_train', action='store_true', help="Runs evaluation on TRAINING data.") 147 | self.parser.set_defaults(test_on_train=False) 148 | self.parser.add_argument('--test-on-val', dest='test_on_val', action='store_true', help="Runs evaluation on VALIADTION data.") 149 | self.parser.set_defaults(test_on_val=False) 150 | 151 | self.parser.add_argument('--eval-sampling', dest='eval_sampling', action='store_true', help="Visualizing random sample rollouts") 152 | self.parser.set_defaults(eval_sampling=False) 153 | self.parser.add_argument('--eval-sampling-len', type=float, default=10.0, help='Number of seconds to sample for (default 10 s)') 154 | self.parser.add_argument('--eval-sampling-debug', dest='eval_sampling_debug', action='store_true', help="Visualizes random samples in interactive visualization.") 155 | self.parser.set_defaults(eval_sampling_debug=False) 156 | self.parser.add_argument('--eval-test', dest='eval_full_test', action='store_true', help="Evaluate on the full test set with same metrics as during training.") 157 | self.parser.set_defaults(eval_full_test=False) 158 | self.parser.add_argument('--eval-num-samples', type=int, default=1, help='Number of times to sample the model for the same initial state for eval_sampling evalutations.') 159 | self.parser.add_argument('--eval-recon', dest='eval_recon', action='store_true', help="Visualizes reconstructions of random AMASS sequences") 160 | self.parser.add_argument('--eval-recon-debug', dest='eval_recon_debug', action='store_true', help="Interactively visualizes reconstructions of random AMASS sequences") 161 | 162 | self.parser.add_argument('--viz-contacts', dest='viz_contacts', action='store_true', help="For visualization, body mesh is translucent and contacts are shown on SMPL joint skeleton.") 163 | self.parser.set_defaults(viz_contacts=False) 164 | self.parser.add_argument('--viz-pred-joints', dest='viz_pred_joints', action='store_true', help="For visualization, HuMoR output joints are visualized.") 165 | self.parser.set_defaults(viz_pred_joints=False) 166 | self.parser.add_argument('--viz-smpl-joints', dest='viz_smpl_joints', action='store_true', help="For visualization, SMPL joints are visualized (determined from HuMoR output joint angles).") 167 | self.parser.set_defaults(viz_smpl_joints=False) 168 | 169 | # 170 | # Edit/add configs here for changes to model-specific arguments. 171 | # NOTE: must be named ModelNameConfig to be properly loaded. Also should not clash names with any Base/Train/Test configuration flags. 172 | # 173 | 174 | class HumorModelConfig(BaseSubConfig): 175 | ''' 176 | Configuration for arguments specific to models.HumorModel model class. 
177 | ''' 178 | def __init__(self, argv): 179 | super(HumorModelConfig, self).__init__(argv) 180 | # arguments specific to this model 181 | self.parser.add_argument('--out-rot-rep', type=str, default='aa', choices=['aa', '6d', '9d'], help='Rotation representation to output from the model.') 182 | self.parser.add_argument('--in-rot-rep', type=str, default='mat', choices=['aa', '6d', 'mat'], help='Rotation representation to input to the model for the relative full sequence input.') 183 | self.parser.add_argument('--latent-size', type=int, default=48, help='Size of the latent feature.') 184 | 185 | self.parser.add_argument('--model-steps-in', dest='steps_in', type=int, default=1, help='At each step of the sequence, the number of input frames.') 186 | 187 | self.parser.add_argument('--no-conditional-prior', dest='conditional_prior', action='store_false', help="Conditions the prior on the past input sequence.") 188 | self.parser.set_defaults(conditional_prior=True) 189 | self.parser.add_argument('--no-output-delta', dest='output_delta', action='store_false', help="Each step predicts the residual rather than the next step.") 190 | self.parser.set_defaults(output_delta=True) 191 | 192 | self.parser.add_argument('--posterior-arch', type=str, default='mlp', choices=['mlp'], help='') 193 | self.parser.add_argument('--decoder-arch', type=str, default='mlp', choices=['mlp'], help='') 194 | self.parser.add_argument('--prior-arch', type=str, default='mlp', choices=['mlp'], help='') 195 | 196 | self.parser.add_argument('--model-data-config', type=str, default='smpl+joints+contacts', choices=['smpl+joints', 'smpl+joints+contacts'], help='which state configuration to use for the model') 197 | 198 | self.parser.add_argument('--no-detach-sched-samp', dest='detach_sched_samp', action='store_false', help="Allows gradients to backprop through multiple output steps when using schedules sampling.") 199 | self.parser.set_defaults(detach_sched_samp=True) 200 | 201 | self.parser.add_argument('--model-use-smpl-joint-inputs', dest='model_use_smpl_joint_inputs', action='store_true', help="uses smpl joints rather than regressed joints to input at next step (during rollout and sched samp).") 202 | self.parser.set_defaults(model_use_smpl_joint_inputs=False) 203 | 204 | 205 | # Edit/add configs here for changes to dataset-specific arguments. 206 | # NOTE: must be named DatasetNameConfig to be properly loaded. Also should not clash names with any Base/Train/Test configuration flags. 207 | # 208 | 209 | class AmassDiscreteDatasetConfig(BaseSubConfig): 210 | ''' 211 | Configuration for arguments specific to models.AmassDiscreteDataset dataset class. 212 | ''' 213 | def __init__(self, argv): 214 | super(AmassDiscreteDatasetConfig, self).__init__(argv) 215 | # arguments specific to this dataset 216 | self.parser.add_argument('--data-paths', type=str, nargs='+', required=True, help='Paths to dataset roots.') 217 | self.parser.add_argument('--split-by', type=str, default='dataset', choices=['single', 'sequence', 'subject', 'dataset'], help='How to split the dataset into train/test/val.') 218 | self.parser.add_argument('--splits-path', type=str, default=None, help='Path to data splits to use.') 219 | self.parser.add_argument('--sample-num-frames', type=int, default=10, help=' the number of frames returned for each sequence, i.e. the number of input/output pairs.') 220 | self.parser.add_argument('--data-rot-rep', type=str, default='mat', choices=['aa', 'mat', '6d'], help='the rotation representation for the INPUT data. 
[aa, mat, 6d] Output data is always given as a rotation matrix.') 221 | 222 | self.parser.add_argument('--data-steps-in', dest='step_frames_in', type=int, default=1, help='At each step of the sequence, the number of input frames.') 223 | self.parser.add_argument('--data-steps-out', dest='step_frames_out', type=int, default=1, help='At each step of the sequence, the number of output frames.') 224 | self.parser.add_argument('--data-out-step-size', dest='frames_out_step_size', type=int, default=1, help='Spacing between the output frames.') 225 | 226 | self.parser.add_argument('--data-return-config', type=str, default='smpl+joints+contacts', choices=['smpl+joints', 'smpl+joints+contacts', 'all'], help='which values to return from the data loader') 227 | self.parser.add_argument('--data-noise-std', type=float, default=0.0, help='Standard deviation for gaussian noise to add to input motion.') 228 | 229 | class HumorLossConfig(BaseSubConfig): 230 | ''' 231 | Configuration for arguments specific to losses.HumorLoss dataset class. 232 | ''' 233 | def __init__(self, argv): 234 | super(HumorLossConfig, self).__init__(argv) 235 | 236 | self.parser.add_argument('--kl-loss', type=float, default=0.0004, help='Loss weight') 237 | self.parser.add_argument('--kl-loss-anneal-start', type=int, default=0, help='The epoch that the kl loss will start linearly increasing from 0.0') 238 | self.parser.add_argument('--kl-loss-anneal-end', type=int, default=50, help='The epoch that the kl loss will reach its full weight') 239 | self.parser.add_argument('--kl-loss-cycle-len', type=int, default=-1, help='If > 0, KL annealing will be done cyclicly and it will last this many epochs per cycle. If given will ignore kl-loss-anneal-start/end.') 240 | 241 | self.parser.add_argument('--regr-trans-loss', type=float, default=1.0, help='Loss weight') 242 | self.parser.add_argument('--regr-trans-vel-loss', type=float, default=1.0, help='Loss weight') 243 | self.parser.add_argument('--regr-root-orient-loss', type=float, default=1.0, help='Loss weight') 244 | self.parser.add_argument('--regr-root-orient-vel-loss', type=float, default=1.0, help='Loss weight') 245 | self.parser.add_argument('--regr-pose-loss', type=float, default=1.0, help='Loss weight') 246 | self.parser.add_argument('--regr-pose-vel-loss', type=float, default=1.0, help='Loss weight') 247 | self.parser.add_argument('--regr-joint-loss', type=float, default=1.0, help='Loss weight') 248 | self.parser.add_argument('--regr-joint-vel-loss', type=float, default=1.0, help='Loss weight') 249 | self.parser.add_argument('--regr-joint-orient-vel-loss', type=float, default=1.0, help='Loss weight') 250 | self.parser.add_argument('--regr-vert-loss', type=float, default=1.0, help='Loss weight') 251 | self.parser.add_argument('--regr-vert-vel-loss', type=float, default=1.0, help='Loss weight') 252 | self.parser.add_argument('--contacts-loss', type=float, default=0.01, help='Loss weight') 253 | self.parser.add_argument('--contacts-vel-loss', type=float, default=0.01, help='Loss weight') 254 | 255 | self.parser.add_argument('--smpl-joint-loss', type=float, default=1.0, help='Loss weight') 256 | self.parser.add_argument('--smpl-mesh-loss', type=float, default=1.0, help='Loss weight') 257 | self.parser.add_argument('--smpl-joint-consistency-loss', type=float, default=1.0, help='Loss weight') 258 | self.parser.add_argument('--smpl-vert-consistency-loss', type=float, default=0.0, help='Loss weight') -------------------------------------------------------------------------------- 
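Since the training and testing pipelines are driven entirely by these argparse-based configs, it may help to see the `@config-file` mechanism in isolation. The sketch below is not part of the repo; the flag names are real ones from `BaseConfig`/`TrainConfig` above, but the values and the `example.cfg` file are purely illustrative. `SplitLineParser.convert_arg_line_to_args` splits each line of the file into tokens, which is why the files in `configs/` hold one `--flag value` pair per line and are passed as `@configs/train_humor.cfg` (and why `train_humor.py` recovers the config path with `sys.argv[1:][0][1:]`).

```
import argparse

class SplitLineParser(argparse.ArgumentParser):
    # Same behavior as the parser defined in humor/utils/config.py above.
    def convert_arg_line_to_args(self, arg_line):
        return arg_line.split()

parser = SplitLineParser(fromfile_prefix_chars='@', allow_abbrev=False)
parser.add_argument('--model', type=str, required=True)
parser.add_argument('--dataset', type=str, required=True)
parser.add_argument('--lr', type=float, default=1e-3)

# Write a tiny illustrative config file: one "--flag value" pair per line.
with open('example.cfg', 'w') as f:
    f.write('--model HumorModel\n--dataset AmassDiscreteDataset\n--lr 1e-4\n')

args, unknown = parser.parse_known_args(['@example.cfg'])
print(args.model, args.dataset, args.lr)   # HumorModel AmassDiscreteDataset 0.0001
```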
/humor/utils/logging.py: -------------------------------------------------------------------------------- 1 | 2 | import os, re, datetime, shutil 3 | 4 | class Logger(object): 5 | ''' 6 | "Static" class to handle logging. 7 | ''' 8 | log_file = None 9 | 10 | @staticmethod 11 | def init(log_path): 12 | Logger.log_file = log_path 13 | 14 | @staticmethod 15 | def log(write_str): 16 | print(write_str) 17 | if not Logger.log_file: 18 | print('Logger must be initialized before logging!') 19 | return 20 | time_str = datetime.datetime.now().strftime("%Y-%m-%d_%H:%M:%S") 21 | with open(Logger.log_file, 'a') as f: 22 | f.write(time_str + ' ') 23 | f.write(str(write_str) + '\n') 24 | 25 | def class_name_to_file_name(class_name): 26 | ''' 27 | Converts class name to the containing file name. 28 | Assumes class name is in CamelCase and file name is the same 29 | name but in snake_case. 30 | Can be used for models and datasets. 31 | ''' 32 | toks = re.findall('[A-Z][^A-Z]*', class_name) 33 | toks = [tok.lower() for tok in toks] 34 | file_name = '_'.join(toks) 35 | return file_name 36 | 37 | def mkdir(dir_name): 38 | if not os.path.exists(dir_name): 39 | os.makedirs(dir_name) 40 | 41 | def cp_files(dir_out, file_list): 42 | ''' copies a list of files ''' 43 | if not os.path.exists(dir_out): 44 | print('Cannot copy to nonexistent directory ' + dir_out) 45 | return 46 | for f in file_list: 47 | shutil.copy(f, dir_out) -------------------------------------------------------------------------------- /humor/utils/stats.py: -------------------------------------------------------------------------------- 1 | 2 | import sys, os 3 | cur_file_path = os.path.dirname(os.path.realpath(__file__)) 4 | sys.path.append(os.path.join(cur_file_path, '..')) 5 | 6 | import numpy as np 7 | import torch 8 | from torch.utils.tensorboard import SummaryWriter 9 | 10 | from datetime import datetime, timedelta 11 | 12 | from utils.logging import Logger 13 | 14 | class AverageMeter(object): 15 | """Computes and stores the average and current scalar value""" 16 | def __init__(self): 17 | self.reset() 18 | 19 | def reset(self): 20 | self.val = 0 21 | self.avg = 0 22 | self.sum = 0 23 | self.count = 0 24 | 25 | def update(self, val, n=1): 26 | self.val = val 27 | self.sum += val * n 28 | self.count += n 29 | self.avg = self.sum / self.count 30 | 31 | class VectorMeter(object): 32 | """ 33 | Stores all values that are given as vectors 34 | so can compute things like median/std. 35 | """ 36 | def __init__(self): 37 | self.reset() 38 | 39 | def reset(self): 40 | self.val = [] 41 | 42 | def update(self, val): 43 | self.val += val.tolist() 44 | 45 | def mean(self): 46 | return np.mean(np.array(self.val)) 47 | 48 | def std(self): 49 | return np.std(np.array(self.val)) 50 | 51 | def median(self): 52 | return np.median(np.array(self.val)) 53 | 54 | def dhms(td): 55 | d, h, m = td.days, td.seconds//3600, (td.seconds//60)%60 56 | s = td.seconds - ( (h*3600) + (m*60) ) # td.seconds are the seconds remaining after days have been removed 57 | return d, h, m, s 58 | 59 | def getTimeDur(seconds): 60 | Duration = timedelta(seconds=seconds) 61 | OutStr = '' 62 | d, h, m, s = dhms(Duration) 63 | if d > 0: 64 | OutStr = OutStr + str(d)+ ' d ' 65 | if h > 0: 66 | OutStr = OutStr + str(h) + ' h ' 67 | if m > 0: 68 | OutStr = OutStr + str(m) + ' m ' 69 | OutStr = OutStr + str(s) + ' s' 70 | 71 | return OutStr 72 | 73 | class StatTracker(object): 74 | ''' 75 | Keeps track of stats of desired stats throughout training/testing. 
76 | 77 | This includes a running mean and tensorboard visualization output. 78 | ''' 79 | 80 | def __init__(self, out_dir): 81 | self.writer = SummaryWriter(out_dir) 82 | self.step = 0 83 | self.meter_dict = dict() 84 | self.vector_meter_dict = dict() 85 | 86 | def reset(self): 87 | # keep global track for tensorboard but only per-epoch for meter_dict 88 | self.meter_dict = dict() 89 | 90 | def update(self, stats_dict, tag='train', save_tf=True, n=1, increment_step=True): 91 | all_tag = 'run' 92 | # find stats of each type 93 | scalar_dict = dict() 94 | vector_dict = dict() 95 | image_dict = dict() 96 | pcl_dict = dict() 97 | for k in stats_dict.keys(): 98 | if torch.is_tensor(stats_dict[k]): 99 | num_dims = len(stats_dict[k].size()) 100 | else: 101 | stats_dict[k] = torch.Tensor([stats_dict[k]])[0] 102 | num_dims = 0 103 | # print('%s : %d' % (k, num_dims)) 104 | if num_dims == 0: 105 | # scalar 106 | scalar_dict[tag + '/' + k] = stats_dict[k].cpu().item() 107 | elif num_dims == 1: 108 | # vector 109 | vector_dict[tag + '/' + k] = stats_dict[k].cpu().data.numpy() 110 | scalar_dict[tag + '/' + k] = vector_dict[tag + '/' + k].mean() 111 | elif num_dims == 2: 112 | # point cloud 113 | pcl_dict[tag + '/' + k] = stats_dict[k].cpu().unsqueeze(0) 114 | elif num_dims == 3: 115 | # image 116 | image_dict[tag + '/' + k] = stats_dict[k].cpu() 117 | 118 | # update average meter dicts 119 | for k, v in scalar_dict.items(): 120 | if not k in self.meter_dict: 121 | self.meter_dict[k] = AverageMeter() 122 | self.meter_dict[k].update(v, n=n) 123 | # update scalar dict for tf save 124 | scalar_dict[k] = self.meter_dict[k].avg 125 | 126 | # update vector meter dicts 127 | for k, v in vector_dict.items(): 128 | if not k in self.vector_meter_dict: 129 | self.vector_meter_dict[k] = VectorMeter() 130 | self.vector_meter_dict[k].update(v) 131 | 132 | # write to tensorboard 133 | if save_tf: 134 | self.writer.add_scalars(all_tag, scalar_dict, self.step) 135 | for k, v in image_dict.items(): 136 | self.writer.add_image(all_tag + '/' + k, v, self.step) 137 | for k, v in pcl_dict.items(): 138 | colors = 255*((v / torch.max(v)) + 0.5) 139 | points_config = { 140 | 'cls': 'PointsMaterial', 141 | 'size': 0.05 142 | } 143 | self.writer.add_mesh(all_tag + '/' + k, v, colors.to(torch.int), global_step=self.step, config_dict={'material' : points_config}) 144 | 145 | if increment_step: 146 | self.step += 1 147 | 148 | 149 | def print(self, cur_batch_idx, num_batches, cur_epoch_idx, num_epochs, total_elapsed_time=None, tag='train'): 150 | # print the progress bar with estimated time 151 | done = int(50 * (cur_batch_idx+1) / num_batches) 152 | progress_str = '[{}>{}] {} epoch - {}/{} | batch - {}/{}'.format('=' * done, '-' * (50 - done), tag, 153 | cur_epoch_idx+1, num_epochs, 154 | cur_batch_idx+1, num_batches) 155 | Logger.log(progress_str) 156 | 157 | # timing stats if available 158 | time_per_batch_str = tag + '/' + 'time_per_batch' 159 | if time_per_batch_str in self.meter_dict and total_elapsed_time is not None: 160 | mean_per_batch = self.meter_dict[time_per_batch_str].avg 161 | elapsed = total_elapsed_time 162 | elapsed_str = getTimeDur(elapsed) 163 | cur_frac = (num_batches*cur_epoch_idx + cur_batch_idx) / (num_batches*num_epochs) 164 | ETA = (elapsed / (cur_frac + 1e-6)) - elapsed 165 | ETA_str = getTimeDur(ETA) 166 | time_str = '%.3f s per batch | %s elapsed | %s ETA' % (mean_per_batch, elapsed_str, ETA_str) 167 | Logger.log(time_str) 168 | 169 | # recorded stats 170 | for k, v in self.meter_dict.items(): 171 | 
stat_str = '%s : %.5f' % (k, v.avg) 172 | # see if there's an associated vector value 173 | if k in self.vector_meter_dict: 174 | vec_meter = self.vector_meter_dict[k] 175 | stat_str += ' mean, %.5f std, %.5f med' % (vec_meter.std(), vec_meter.median()) 176 | Logger.log(stat_str) 177 | 178 | 179 | -------------------------------------------------------------------------------- /humor/utils/torch.py: -------------------------------------------------------------------------------- 1 | 2 | import sys, os, time 3 | import torch 4 | import torch.nn as nn 5 | import numpy as np 6 | 7 | def get_device(gpu_idx=0): 8 | ''' 9 | Returns the pytorch device for the given gpu index. 10 | ''' 11 | gpu_device_str = 'cuda:%d' % (gpu_idx) 12 | device_str = gpu_device_str if torch.cuda.is_available() else 'cpu' 13 | if device_str == gpu_device_str: 14 | print('Using detected GPU...') 15 | device_str = 'cuda:0' 16 | else: 17 | print('No detected GPU...using CPU.') 18 | device = torch.device(device_str) 19 | return device 20 | 21 | def torch_to_numpy(tensor_list): 22 | return [x.to('cpu').data.numpy() for x in tensor_list] 23 | 24 | def torch_to_scalar(tensor_list): 25 | return [x.to('cpu').item() for x in tensor_list] 26 | 27 | copy2cpu = lambda tensor: tensor.detach().cpu().numpy() 28 | 29 | def save_state(file_out, model, optimizer, cur_epoch=0, min_val_loss=float('Inf'), min_train_loss=float('Inf'), ignore_keys=None): 30 | model_state_dict = model.state_dict() 31 | if ignore_keys is not None: 32 | model_state_dict = {k: v for k, v in model_state_dict.items() if k.split('.')[0] not in ignore_keys} 33 | 34 | full_checkpoint_dict = { 35 | 'model' : model_state_dict, 36 | 'optim' : optimizer.state_dict(), 37 | 'epoch' : cur_epoch, 38 | 'min_val_loss' : min_val_loss, 39 | 'min_train_loss' : min_train_loss 40 | } 41 | torch.save(full_checkpoint_dict, file_out) 42 | 43 | def load_state(load_path, model, optimizer=None, is_parallel=False, map_location=None, ignore_keys=None): 44 | if not os.path.exists(load_path): 45 | print('Could not find checkpoint at path ' + load_path) 46 | 47 | full_checkpoint_dict = torch.load(load_path, map_location=map_location) 48 | model_state_dict = full_checkpoint_dict['model'] 49 | optim_state_dict = full_checkpoint_dict['optim'] 50 | 51 | # load model weights 52 | for k, v in model_state_dict.items(): 53 | if k.split('.')[0] == 'module' and not is_parallel: 54 | # then it was trained with Data parallel 55 | print('Loading weights trained with DataParallel...') 56 | model_state_dict = {'.'.join(k.split('.')[1:]) : v for k, v in model_state_dict.items() if k.split('.')[0] == 'module'} 57 | break 58 | 59 | if ignore_keys is not None: 60 | model_state_dict = {k: v for k, v in model_state_dict.items() if k.split('.')[0] not in ignore_keys} 61 | 62 | # overwrite entries in the existing state dict 63 | missing_keys, unexpected_keys = model.load_state_dict(model_state_dict, strict=False) 64 | if ignore_keys is not None: 65 | missing_keys = [k for k in missing_keys if k.split('.')[0] not in ignore_keys] 66 | unexpected_keys = [k for k in unexpected_keys if k.split('.')[0] not in ignore_keys] 67 | if len(missing_keys) > 0: 68 | print('WARNING: The following keys could not be found in the given state dict - ignoring...') 69 | print(missing_keys) 70 | if len(unexpected_keys) > 0: 71 | print('WARNING: The following keys were found in the given state dict but not in the current model - ignoring...') 72 | print(unexpected_keys) 73 | 74 | # load optimizer weights 75 | if optimizer is not None: 76 | 
optimizer.load_state_dict(optim_state_dict) 77 | 78 | min_train_loss = float('Inf') 79 | if 'min_train_loss' in full_checkpoint_dict.keys(): 80 | min_train_loss = full_checkpoint_dict['min_train_loss'] 81 | 82 | return full_checkpoint_dict['epoch'], full_checkpoint_dict['min_val_loss'], min_train_loss 83 | -------------------------------------------------------------------------------- /humor/utils/transforms.py: -------------------------------------------------------------------------------- 1 | import copy 2 | 3 | import torch 4 | import numpy as np 5 | from torch.nn import functional as F 6 | 7 | from body_model.utils import SMPL_JOINTS 8 | 9 | # 10 | # For computing local body frame 11 | # 12 | 13 | GLOB_DEVICE = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu') 14 | XY_AXIS_GLOB = torch.Tensor([[1.0, 1.0, 0.0]]).to(device=GLOB_DEVICE) 15 | X_AXIS_GLOB = torch.Tensor([[1.0, 0.0, 0.0]]).to(device=GLOB_DEVICE) 16 | 17 | def compute_aligned_from_right(body_right): 18 | xy_axis = XY_AXIS_GLOB 19 | x_axis = X_AXIS_GLOB 20 | 21 | body_right_x_proj = body_right[:,0:1] / (torch.norm(body_right[:,:2], dim=1, keepdim=True) + 1e-6) 22 | body_right_x_proj = torch.clamp(body_right_x_proj, min=-1.0, max=1.0) # avoid acos error 23 | 24 | world2aligned_angle = torch.acos(body_right_x_proj) # project to world x axis, and compute angle 25 | body_right = body_right * xy_axis 26 | world2aligned_axis = torch.cross(body_right, x_axis.expand_as(body_right)) 27 | 28 | world2aligned_aa = (world2aligned_axis / (torch.norm(world2aligned_axis, dim=1, keepdim=True) + 1e-6)) * world2aligned_angle 29 | world2aligned_mat = batch_rodrigues(world2aligned_aa) 30 | 31 | return world2aligned_mat, world2aligned_aa 32 | 33 | def compute_world2aligned_mat(rot_pos): 34 | ''' 35 | batch of world rotation matrices: B x 3 x 3 36 | returns rot mats that align the inputs to the forward direction: B x 3 x 3 37 | Torch version 38 | ''' 39 | body_right = -rot_pos[:,:,0] #.clone() # in body coordinates body x-axis is left 40 | 41 | world2aligned_mat, world2aligned_aa = compute_aligned_from_right(body_right) 42 | return world2aligned_mat 43 | 44 | 45 | def compute_world2aligned_joints_mat(joints): 46 | ''' 47 | Compute world to canonical frame (rotation around up axis) 48 | from the given batch of joints (B x J x 3) 49 | ''' 50 | left_idx = SMPL_JOINTS['leftUpLeg'] 51 | right_idx = SMPL_JOINTS['rightUpLeg'] 52 | 53 | body_right = joints[:, right_idx] - joints[:, left_idx] 54 | body_right = body_right / torch.norm(body_right, dim=1, keepdim=True) 55 | 56 | world2aligned_mat, world2aligned_aa = compute_aligned_from_right(body_right) 57 | 58 | return world2aligned_mat 59 | 60 | def convert_to_rotmat(pred_rot, rep='aa'): 61 | ''' 62 | Converts rotation rep to rotation matrix based on the given type. 63 | pred_rot : B x T x N 64 | ''' 65 | B, T, _ = pred_rot.size() 66 | pred_rot_mat = None 67 | if rep == 'aa': 68 | pred_rot_mat = batch_rodrigues(pred_rot.reshape(-1, 3)) 69 | elif rep == '6d': 70 | pred_rot_mat = rot6d_to_rotmat(pred_rot.reshape(-1, 6)) 71 | elif rep == '9d': 72 | pred_rot_mat = rot9d_to_rotmat(pred_rot.reshape(-1, 9)) 73 | return pred_rot_mat.reshape((B, T, -1)) 74 | 75 | # 76 | # Many of these functions taken from https://github.com/mkocabas/VIBE/blob/a859e45a907379aa2fba65a7b620b4a2d65dcf1b/lib/utils/geometry.py 77 | # Please see their license for usage restrictions. 
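Most of what follows operates on batches of rotations, and `convert_to_rotmat` above is the entry point the model code uses to map a predicted rotation representation back to matrices. A small usage sketch, not part of the repo, assuming `humor/` is on the Python path (as the repo's own scripts arrange) so `utils.transforms` is importable; the batch, sequence, and joint sizes are illustrative:

```
import torch
from utils.transforms import convert_to_rotmat

B, T, J = 2, 5, 21                        # batch, time steps, body joints (illustrative)
aa_pose = torch.randn(B, T, J * 3) * 0.2  # axis-angle rotation per joint
rot_mats = convert_to_rotmat(aa_pose, rep='aa')
print(rot_mats.size())                    # torch.Size([2, 5, 189]) -- 21 flattened 3x3 matrices per frame
```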
78 | # 79 | 80 | def matrot2axisangle(matrots): 81 | ''' 82 | :param matrots: N*num_joints*9 83 | :return: N*num_joints*3 84 | ''' 85 | import cv2 86 | batch_size = matrots.shape[0] 87 | matrots = matrots.reshape([batch_size,-1,9]) 88 | out_axisangle = [] 89 | for mIdx in range(matrots.shape[0]): 90 | cur_axisangle = [] 91 | for jIdx in range(matrots.shape[1]): 92 | a = cv2.Rodrigues(matrots[mIdx, jIdx:jIdx + 1, :].reshape(3, 3))[0].reshape((1, 3)) 93 | cur_axisangle.append(a) 94 | 95 | out_axisangle.append(np.array(cur_axisangle).reshape([1,-1,3])) 96 | return np.vstack(out_axisangle) 97 | 98 | def axisangle2matrots(axisangle): 99 | ''' 100 | :param axisangle: N*num_joints*3 101 | :return: N*num_joints*9 102 | ''' 103 | import cv2 104 | batch_size = axisangle.shape[0] 105 | axisangle = axisangle.reshape([batch_size,-1,3]) 106 | out_matrot = [] 107 | for mIdx in range(axisangle.shape[0]): 108 | cur_axisangle = [] 109 | for jIdx in range(axisangle.shape[1]): 110 | a = cv2.Rodrigues(axisangle[mIdx, jIdx:jIdx + 1, :].reshape(1, 3))[0] 111 | cur_axisangle.append(a) 112 | 113 | out_matrot.append(np.array(cur_axisangle).reshape([1,-1,9])) 114 | return np.vstack(out_matrot) 115 | 116 | def make_rot_homog(rotation_matrix): 117 | if rotation_matrix.shape[1:] == (3,3): 118 | rot_mat = rotation_matrix.reshape(-1, 3, 3) 119 | hom = torch.tensor([0, 0, 1], dtype=torch.float32, 120 | device=rotation_matrix.device).reshape(1, 3, 1).expand(rot_mat.shape[0], -1, -1) 121 | rotation_matrix = torch.cat([rot_mat, hom], dim=-1) 122 | return rotation_matrix 123 | 124 | def skew(v): 125 | ''' 126 | Returns skew symmetric (B x 3 x 3) mat from vector v: B x 3 127 | ''' 128 | B, D = v.size() 129 | assert(D == 3) 130 | skew_mat = torch.zeros((B, 3, 3)).to(v) 131 | skew_mat[:,0,1] = v[:,2] 132 | skew_mat[:,1,0] = -v[:,2] 133 | skew_mat[:,0,2] = v[:,1] 134 | skew_mat[:,2,0] = -v[:,1] 135 | skew_mat[:,1,2] = v[:,0] 136 | skew_mat[:,2,1] = -v[:,0] 137 | return skew_mat 138 | 139 | def batch_rodrigues(rot_vecs, epsilon=1e-8, dtype=torch.float32): 140 | ''' Calculates the rotation matrices for a batch of rotation vectors 141 | Parameters 142 | ---------- 143 | rot_vecs: torch.tensor Nx3 144 | array of N axis-angle vectors 145 | Returns 146 | ------- 147 | R: torch.tensor Nx3x3 148 | The rotation matrices for the given axis-angle parameters 149 | ''' 150 | 151 | batch_size = rot_vecs.shape[0] 152 | device = rot_vecs.device 153 | 154 | angle = torch.norm(rot_vecs + 1e-8, dim=1, keepdim=True) 155 | rot_dir = rot_vecs / angle 156 | 157 | cos = torch.unsqueeze(torch.cos(angle), dim=1) 158 | sin = torch.unsqueeze(torch.sin(angle), dim=1) 159 | 160 | # Bx1 arrays 161 | rx, ry, rz = torch.split(rot_dir, 1, dim=1) 162 | K = torch.zeros((batch_size, 3, 3), dtype=dtype, device=device) 163 | 164 | zeros = torch.zeros((batch_size, 1), dtype=dtype, device=device) 165 | K = torch.cat([zeros, -rz, ry, rz, zeros, -rx, -ry, rx, zeros], dim=1) \ 166 | .view((batch_size, 3, 3)) 167 | 168 | ident = torch.eye(3, dtype=dtype, device=device).unsqueeze(dim=0) 169 | rot_mat = ident + sin * K + (1 - cos) * torch.bmm(K, K) 170 | return rot_mat 171 | 172 | def quat2mat(quat): 173 | """ 174 | This function is borrowed from https://github.com/MandyMo/pytorch_HMR/blob/master/src/util.py#L50 175 | Convert quaternion coefficients to rotation matrix. 
176 | Args: 177 | quat: size = [batch_size, 4] 4 <===>(w, x, y, z) 178 | Returns: 179 | Rotation matrix corresponding to the quaternion -- size = [batch_size, 3, 3] 180 | """ 181 | norm_quat = quat 182 | norm_quat = norm_quat / norm_quat.norm(p=2, dim=1, keepdim=True) 183 | w, x, y, z = norm_quat[:, 0], norm_quat[:, 1], norm_quat[:, 184 | 2], norm_quat[:, 185 | 3] 186 | 187 | batch_size = quat.size(0) 188 | 189 | w2, x2, y2, z2 = w.pow(2), x.pow(2), y.pow(2), z.pow(2) 190 | wx, wy, wz = w * x, w * y, w * z 191 | xy, xz, yz = x * y, x * z, y * z 192 | 193 | rotMat = torch.stack([ 194 | w2 + x2 - y2 - z2, 2 * xy - 2 * wz, 2 * wy + 2 * xz, 2 * wz + 2 * xy, 195 | w2 - x2 + y2 - z2, 2 * yz - 2 * wx, 2 * xz - 2 * wy, 2 * wx + 2 * yz, 196 | w2 - x2 - y2 + z2 197 | ], 198 | dim=1).view(batch_size, 3, 3) 199 | return rotMat 200 | 201 | def rot6d_to_rotmat(x): 202 | """Convert 6D rotation representation to 3x3 rotation matrix. 203 | Based on Zhou et al., "On the Continuity of Rotation Representations in Neural Networks", CVPR 2019 204 | Input: 205 | (B,6) Batch of 6-D rotation representations 206 | Output: 207 | (B,3,3) Batch of corresponding rotation matrices 208 | """ 209 | x = x.view(-1,3,2) 210 | a1 = x[:, :, 0] 211 | a2 = x[:, :, 1] 212 | b1 = F.normalize(a1) 213 | b2 = F.normalize(a2 - torch.einsum('bi,bi->b', b1, a2).unsqueeze(-1) * b1) 214 | 215 | # inp = a2 - torch.einsum('bi,bi->b', b1, a2).unsqueeze(-1) * b1 216 | # denom = inp.pow(2).sum(dim=1).sqrt().unsqueeze(-1) + 1e-8 217 | # b2 = inp / denom 218 | 219 | b3 = torch.cross(b1, b2) 220 | return torch.stack((b1, b2, b3), dim=-1) 221 | 222 | def rot9d_to_rotmat(x): 223 | ''' 224 | Converts 9D rotation output to valid 3x3 rotation amtrix. 225 | Based on Levinson et al., An Analysis of SVD for Deep Rotation Estimation. 226 | 227 | Input: 228 | (B, 9) 229 | Output: 230 | (B, 9) 231 | ''' 232 | B = x.size()[0] 233 | x = x.reshape((B, 3, 3)) 234 | u, s, v = torch.svd(x) 235 | 236 | v_T = v.transpose(-2, -1) 237 | s_p = torch.eye(3).to(x).reshape((1, 3, 3)).expand_as(x).clone() 238 | s_p[:, 2, 2] = torch.det(torch.matmul(u, v_T)) 239 | x_out = torch.matmul(torch.matmul(u, s_p), v_T) 240 | 241 | return x_out.reshape((B, 9)) 242 | 243 | def rotation_matrix_to_angle_axis(rotation_matrix): 244 | """ 245 | This function is borrowed from https://github.com/kornia/kornia 246 | Convert 3x4 rotation matrix to Rodrigues vector 247 | Args: 248 | rotation_matrix (Tensor): rotation matrix. 249 | Returns: 250 | Tensor: Rodrigues vector transformation. 
251 | Shape: 252 | - Input: :math:`(N, 3, 4)` 253 | - Output: :math:`(N, 3)` 254 | Example: 255 | >>> input = torch.rand(2, 3, 4) # Nx4x4 256 | >>> output = tgm.rotation_matrix_to_angle_axis(input) # Nx3 257 | """ 258 | if rotation_matrix.shape[1:] == (3,3): 259 | rot_mat = rotation_matrix.reshape(-1, 3, 3) 260 | hom = torch.tensor([0, 0, 1], dtype=torch.float32, 261 | device=rotation_matrix.device).reshape(1, 3, 1).expand(rot_mat.shape[0], -1, -1) 262 | rotation_matrix = torch.cat([rot_mat, hom], dim=-1) 263 | 264 | quaternion = rotation_matrix_to_quaternion(rotation_matrix) 265 | aa = quaternion_to_angle_axis(quaternion) 266 | aa[torch.isnan(aa)] = 0.0 267 | return aa 268 | 269 | def rotation_matrix_to_quaternion(rotation_matrix, eps=1e-6): 270 | """ 271 | This function is borrowed from https://github.com/kornia/kornia 272 | Convert 3x4 rotation matrix to 4d quaternion vector 273 | This algorithm is based on algorithm described in 274 | https://github.com/KieranWynn/pyquaternion/blob/master/pyquaternion/quaternion.py#L201 275 | Args: 276 | rotation_matrix (Tensor): the rotation matrix to convert. 277 | Return: 278 | Tensor: the rotation in quaternion 279 | Shape: 280 | - Input: :math:`(N, 3, 4)` 281 | - Output: :math:`(N, 4)` 282 | Example: 283 | >>> input = torch.rand(4, 3, 4) # Nx3x4 284 | >>> output = tgm.rotation_matrix_to_quaternion(input) # Nx4 285 | """ 286 | if not torch.is_tensor(rotation_matrix): 287 | raise TypeError("Input type is not a torch.Tensor. Got {}".format( 288 | type(rotation_matrix))) 289 | 290 | if len(rotation_matrix.shape) > 3: 291 | raise ValueError( 292 | "Input size must be a three dimensional tensor. Got {}".format( 293 | rotation_matrix.shape)) 294 | if not rotation_matrix.shape[-2:] == (3, 4): 295 | raise ValueError( 296 | "Input size must be a N x 3 x 4 tensor. 
Got {}".format( 297 | rotation_matrix.shape)) 298 | 299 | rmat_t = torch.transpose(rotation_matrix, 1, 2) 300 | 301 | mask_d2 = rmat_t[:, 2, 2] < eps 302 | 303 | mask_d0_d1 = rmat_t[:, 0, 0] > rmat_t[:, 1, 1] 304 | mask_d0_nd1 = rmat_t[:, 0, 0] < -rmat_t[:, 1, 1] 305 | 306 | t0 = 1 + rmat_t[:, 0, 0] - rmat_t[:, 1, 1] - rmat_t[:, 2, 2] 307 | q0 = torch.stack([rmat_t[:, 1, 2] - rmat_t[:, 2, 1], 308 | t0, rmat_t[:, 0, 1] + rmat_t[:, 1, 0], 309 | rmat_t[:, 2, 0] + rmat_t[:, 0, 2]], -1) 310 | t0_rep = t0.repeat(4, 1).t() 311 | 312 | t1 = 1 - rmat_t[:, 0, 0] + rmat_t[:, 1, 1] - rmat_t[:, 2, 2] 313 | q1 = torch.stack([rmat_t[:, 2, 0] - rmat_t[:, 0, 2], 314 | rmat_t[:, 0, 1] + rmat_t[:, 1, 0], 315 | t1, rmat_t[:, 1, 2] + rmat_t[:, 2, 1]], -1) 316 | t1_rep = t1.repeat(4, 1).t() 317 | 318 | t2 = 1 - rmat_t[:, 0, 0] - rmat_t[:, 1, 1] + rmat_t[:, 2, 2] 319 | q2 = torch.stack([rmat_t[:, 0, 1] - rmat_t[:, 1, 0], 320 | rmat_t[:, 2, 0] + rmat_t[:, 0, 2], 321 | rmat_t[:, 1, 2] + rmat_t[:, 2, 1], t2], -1) 322 | t2_rep = t2.repeat(4, 1).t() 323 | 324 | t3 = 1 + rmat_t[:, 0, 0] + rmat_t[:, 1, 1] + rmat_t[:, 2, 2] 325 | q3 = torch.stack([t3, rmat_t[:, 1, 2] - rmat_t[:, 2, 1], 326 | rmat_t[:, 2, 0] - rmat_t[:, 0, 2], 327 | rmat_t[:, 0, 1] - rmat_t[:, 1, 0]], -1) 328 | t3_rep = t3.repeat(4, 1).t() 329 | 330 | mask_c0 = mask_d2 * mask_d0_d1 331 | mask_c1 = mask_d2 * ~mask_d0_d1 332 | mask_c2 = ~mask_d2 * mask_d0_nd1 333 | mask_c3 = ~mask_d2 * ~mask_d0_nd1 334 | mask_c0 = mask_c0.view(-1, 1).type_as(q0) 335 | mask_c1 = mask_c1.view(-1, 1).type_as(q1) 336 | mask_c2 = mask_c2.view(-1, 1).type_as(q2) 337 | mask_c3 = mask_c3.view(-1, 1).type_as(q3) 338 | 339 | q = q0 * mask_c0 + q1 * mask_c1 + q2 * mask_c2 + q3 * mask_c3 340 | q /= torch.sqrt(t0_rep * mask_c0 + t1_rep * mask_c1 + # noqa 341 | t2_rep * mask_c2 + t3_rep * mask_c3) # noqa 342 | q *= 0.5 343 | return q 344 | 345 | def quaternion_to_angle_axis(quaternion: torch.Tensor) -> torch.Tensor: 346 | """ 347 | This function is borrowed from https://github.com/kornia/kornia 348 | Convert quaternion vector to angle axis of rotation. 349 | Adapted from ceres C++ library: ceres-solver/include/ceres/rotation.h 350 | Args: 351 | quaternion (torch.Tensor): tensor with quaternions. 352 | Return: 353 | torch.Tensor: tensor with angle axis of rotation. 354 | Shape: 355 | - Input: :math:`(*, 4)` where `*` means, any number of dimensions 356 | - Output: :math:`(*, 3)` 357 | Example: 358 | >>> quaternion = torch.rand(2, 4) # Nx4 359 | >>> angle_axis = tgm.quaternion_to_angle_axis(quaternion) # Nx3 360 | """ 361 | if not torch.is_tensor(quaternion): 362 | raise TypeError("Input type is not a torch.Tensor. Got {}".format( 363 | type(quaternion))) 364 | 365 | if not quaternion.shape[-1] == 4: 366 | raise ValueError("Input must be a tensor of shape Nx4 or 4. 
Got {}" 367 | .format(quaternion.shape)) 368 | # unpack input and compute conversion 369 | q1: torch.Tensor = quaternion[..., 1] 370 | q2: torch.Tensor = quaternion[..., 2] 371 | q3: torch.Tensor = quaternion[..., 3] 372 | sin_squared_theta: torch.Tensor = q1 * q1 + q2 * q2 + q3 * q3 373 | 374 | sin_theta: torch.Tensor = torch.sqrt(sin_squared_theta) 375 | cos_theta: torch.Tensor = quaternion[..., 0] 376 | two_theta: torch.Tensor = 2.0 * torch.where( 377 | cos_theta < 0.0, 378 | torch.atan2(-sin_theta, -cos_theta), 379 | torch.atan2(sin_theta, cos_theta)) 380 | 381 | k_pos: torch.Tensor = two_theta / sin_theta 382 | k_neg: torch.Tensor = 2.0 * torch.ones_like(sin_theta) 383 | k: torch.Tensor = torch.where(sin_squared_theta > 0.0, k_pos, k_neg) 384 | 385 | angle_axis: torch.Tensor = torch.zeros_like(quaternion)[..., :3] 386 | angle_axis[..., 0] += q1 * k 387 | angle_axis[..., 1] += q2 * k 388 | angle_axis[..., 2] += q3 * k 389 | return angle_axis -------------------------------------------------------------------------------- /humor/utils/video.py: -------------------------------------------------------------------------------- 1 | import os, sys, shutil, argparse, subprocess, time, json, glob 2 | from multiprocessing import Pool 3 | 4 | import os.path as osp 5 | import torch, torchvision 6 | from PIL import Image 7 | from torchvision import transforms 8 | import numpy as np 9 | 10 | import cv2 11 | 12 | 13 | def video_to_images(vid_file, img_folder=None, return_info=False, fps=30): 14 | ''' 15 | From https://github.com/mkocabas/VIBE/blob/master/lib/utils/demo_utils.py 16 | 17 | fps will sample the video to this rate. 18 | ''' 19 | if img_folder is None: 20 | img_folder = osp.join('/tmp', osp.basename(vid_file).replace('.', '_')) 21 | 22 | os.makedirs(img_folder, exist_ok=True) 23 | 24 | command = ['ffmpeg', 25 | '-i', vid_file, 26 | '-r', str(fps), 27 | '-f', 'image2', 28 | '-v', 'error', 29 | f'{img_folder}/%06d.png'] 30 | print(f'Running \"{" ".join(command)}\"') 31 | subprocess.call(command) 32 | 33 | print(f'Images saved to \"{img_folder}\"') 34 | 35 | img_shape = cv2.imread(osp.join(img_folder, '000001.png')).shape 36 | 37 | if return_info: 38 | return img_folder, len(os.listdir(img_folder)), img_shape 39 | else: 40 | return img_folder 41 | 42 | def make_absolute(rel_paths): 43 | ''' Makes a list of relative paths absolute ''' 44 | return [os.path.join(os.getcwd(), rel_path) for rel_path in rel_paths] 45 | 46 | SKELETON = 'BODY_25' 47 | 48 | def run_openpose(openpose_path, img_dir, out_dir, video_out=None, img_out=None): 49 | ''' 50 | Runs OpenPose for 2D joint detection on the images in img_dir. 
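    - openpose_path : root of a built OpenPose clone (the command is run from this
      directory via ./build/examples/openpose/openpose.bin)
    - img_dir : directory of input frames
    - out_dir : directory where per-frame JSON keypoint files are written (--write_json)
    - video_out : optional path for a rendered OpenPose video (--write_video)
    - img_out : optional directory for rendered OpenPose images (--write_images)
    If neither video_out nor img_out is given, rendering is disabled (--render_pose 0).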
51 | ''' 52 | # make all paths absolute to call OP 53 | openpose_path = make_absolute([openpose_path])[0] 54 | img_dir = make_absolute([img_dir])[0] 55 | out_dir = make_absolute([out_dir])[0] 56 | if video_out is not None: 57 | video_out = make_absolute([video_out])[0] 58 | if img_out is not None: 59 | img_out = make_absolute([img_out])[0] 60 | 61 | if not os.path.exists(out_dir): 62 | os.makedirs(out_dir) 63 | 64 | # run open pose 65 | # must change to openpose dir path to run properly 66 | og_cwd = os.getcwd() 67 | os.chdir(openpose_path) 68 | 69 | # then run openpose 70 | run_cmds = ['./build/examples/openpose/openpose.bin', \ 71 | '--image_dir', img_dir, '--write_json', out_dir, \ 72 | '--display', '0', '--model_pose', SKELETON, '--number_people_max', '1', \ 73 | '--num_gpu', '1'] 74 | if video_out is not None: 75 | run_cmds += ['--write_video', video_out, '--write_video_fps', '30'] 76 | if img_out is not None: 77 | run_cmds += ['--write_images', img_out] 78 | if not (video_out is not None or img_out is not None): 79 | run_cmds += ['--render_pose', '0'] 80 | print(run_cmds) 81 | subprocess.run(run_cmds) 82 | 83 | os.chdir(og_cwd) # change back to resume 84 | 85 | 86 | def run_deeplab_v3(img_dir, img_shape, out_dir, batch_size=16, img_extn='png'): 87 | ''' 88 | Runs DeepLabv3 to get a person segmentation mask on each img in img_dir. 89 | 90 | - img_shape : (H x W) 91 | ''' 92 | print('Running DeepLabv3 to compute person mask...') 93 | H, W = img_shape 94 | device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu') 95 | model = torch.hub.load('pytorch/vision:v0.6.0', 'deeplabv3_resnet101', pretrained=True).to(device) 96 | model.eval() 97 | preprocess = transforms.Compose([ 98 | transforms.ToTensor(), 99 | transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]), 100 | ]) 101 | 102 | img_path = img_dir 103 | all_img_paths = sorted(glob.glob(os.path.join(img_path + '/*.' 
+ img_extn))) 104 | img_names = ['.'.join(f.split('/')[-1].split('.')[:-1]) for f in all_img_paths] 105 | out_path = out_dir 106 | if not os.path.exists(out_path): 107 | os.makedirs(out_path) 108 | all_mask_paths = [os.path.join(out_path, f + '.png') for f in img_names] 109 | # print(all_mask_paths) 110 | 111 | num_imgs = len(img_names) 112 | num_batches = (num_imgs / batch_size) + 1 113 | sidx = 0 114 | eidx = min(num_imgs, batch_size) 115 | cnt = 1 116 | while sidx < num_imgs: 117 | # print(sidx) 118 | # print(eidx) 119 | # batch 120 | print('Batch %d / %d' % (cnt, num_batches)) 121 | img_path_batch = all_img_paths[sidx:eidx] 122 | mask_path_batch = all_mask_paths[sidx:eidx] 123 | B = len(img_path_batch) 124 | img_batch = torch.zeros((B, 3, H, W)) 125 | for bidx, cur_img_path in enumerate(img_path_batch): 126 | input_image = Image.open(cur_img_path) 127 | input_tensor = preprocess(input_image) 128 | img_batch[bidx] = input_tensor 129 | img_batch = img_batch.to(device) 130 | # print(img_batch.size()) 131 | 132 | # eval and save 133 | with torch.no_grad(): 134 | output = model(img_batch)['out'] 135 | seg = torch.logical_not(output.argmax(1) == 15).to(torch.float) # the max probability is the person class 136 | seg = seg.cpu().numpy() 137 | for bidx in range(B): 138 | person_mask = (seg[bidx]*255.0).astype(np.uint8) 139 | out_img = Image.fromarray(person_mask) 140 | out_img.save(mask_path_batch[bidx]) 141 | 142 | 143 | # # create a color pallette, selecting a color for each class 144 | # palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2 ** 21 - 1]) 145 | # colors = torch.as_tensor([i for i in range(21)])[:, None] * palette 146 | # colors = (colors % 255).numpy().astype("uint8") 147 | # # plot the semantic segmentation predictions of 21 classes in each color 148 | # r = Image.fromarray(seg[0].byte().cpu().numpy()).resize(input_image.size) 149 | # r.putpalette(colors) 150 | # import matplotlib.pyplot as plt 151 | # plt.imshow(r) 152 | # plt.show() 153 | 154 | sidx = eidx 155 | eidx = min(num_imgs, sidx + batch_size) 156 | cnt += 1 -------------------------------------------------------------------------------- /humor/viz/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/davrempe/humor/fc6ef84f0baa153be15427402e0147ed1a63a11a/humor/viz/__init__.py -------------------------------------------------------------------------------- /humor/viz/utils.py: -------------------------------------------------------------------------------- 1 | 2 | import time, os, shutil 3 | import subprocess 4 | import glob 5 | 6 | import numpy as np 7 | import trimesh 8 | import torch 9 | from PIL import Image, ImageDraw, ImageFont 10 | 11 | from utils.torch import copy2cpu as c2c 12 | 13 | smpl_connections = [[11, 8], [8, 5], [5, 2], [2, 0], [10, 7], [7, 4], [4, 1], [1, 0], 14 | [0, 3], [3, 6], [6, 9], [9, 12], [12, 15], [12, 13], [13, 16], [16, 18], 15 | [18, 20], [12, 14], [14, 17], [17, 19], [19, 21]] 16 | 17 | imapper_connections = [[0, 1], [1, 2], [5, 4], [4, 3], [2, 6], [3, 6], [6, 7], [7, 8], [8, 9], 18 | [7, 12], [12, 11], [11, 10], [7, 13], [13, 14], [14, 15]] 19 | 20 | comp_connections = [[0, 1], [2, 3], [1, 4], [2, 4], [4, 5], [5, 8], [8, 7], [7, 6], [5, 9], [9, 10], [10, 11]] 21 | 22 | colors = { 23 | 'pink': [.7, .7, .9], 24 | 'purple': [.9, .7, .7], 25 | 'cyan': [.7, .75, .5], 26 | 'red': [1.0, 0.0, 0.0], 27 | 28 | 'green': [.0, 1., .0], 29 | 'yellow': [1., 1., 0], 30 | 'brown': [.5, .7, .7], 31 | 'blue': [.0, .0, 1.], 32 | 33 
| 'offwhite': [.8, .9, .9], 34 | 'white': [1., 1., 1.], 35 | 'orange': [.5, .65, .9], 36 | 37 | 'grey': [.7, .7, .7], 38 | 'black': np.zeros(3), 39 | 'white': np.ones(3), 40 | 41 | 'yellowg': [0.83, 1, 0], 42 | } 43 | 44 | def create_video(img_path, out_path, fps): 45 | ''' 46 | Creates a video from the frame format in the given directory and saves to out_path. 47 | ''' 48 | command = ['ffmpeg', '-y', '-r', str(fps), '-i', img_path, \ 49 | '-vcodec', 'libx264', '-crf', '25', '-pix_fmt', 'yuv420p', out_path] 50 | subprocess.run(command) 51 | 52 | def create_gif(img_path, out_path, fps): 53 | ''' 54 | Creates a gif (and video) from the frame format in the given directory and saves to out_path. 55 | ''' 56 | vid_path = out_path[:-3] + 'mp4' 57 | create_video(img_path, vid_path, fps) 58 | subprocess.run(['ffmpeg', '-y', '-i', vid_path, \ 59 | '-pix_fmt', 'rgb8', out_path]) 60 | 61 | def create_comparison_images(img1_dir, img2_dir, out_dir, text1=None, text2=None): 62 | ''' 63 | Given two direcdtories containing (png) frames of a video, combines them into one large frame and 64 | saves new side-by-side images to the given directory. 65 | ''' 66 | img1_frames = sorted(glob.glob(os.path.join(img1_dir, '*.png'))) 67 | img2_frames = sorted(glob.glob(os.path.join(img2_dir, '*.png'))) 68 | 69 | if not os.path.exists(out_dir): 70 | os.mkdir(out_dir) 71 | 72 | for img1_path, img2_path in zip(img1_frames, img2_frames): 73 | img1 = Image.open(img1_path) 74 | img2 = Image.open(img2_path) 75 | 76 | if text1 is not None: 77 | d = ImageDraw.Draw(img1) 78 | font = ImageFont.truetype('Pillow/Tests/fonts/FreeMono.ttf', 12) 79 | d.text((10, 10), text1, fill=(0,0,0), font=font) 80 | if text2 is not None: 81 | d = ImageDraw.Draw(img2) 82 | font = ImageFont.truetype('Pillow/Tests/fonts/FreeMono.ttf', 12) 83 | d.text((10, 10), text2, fill=(0,0,0), font=font) 84 | 85 | dst = Image.new('RGB', (img1.width + img2.width, img1.height)) 86 | dst.paste(img1, (0, 0)) 87 | dst.paste(img2, (img1.width, 0)) 88 | 89 | dst.save(os.path.join(out_dir, img1_path.split('/')[-1])) 90 | 91 | def create_multi_comparison_images(img_dirs, out_dir, texts=None, extn='.png'): 92 | ''' 93 | Given list of direcdtories containing (png) frames of a video, combines them into one large frame and 94 | saves new side-by-side images to the given directory. 95 | ''' 96 | img_frame_list = [] 97 | for img_dir in img_dirs: 98 | img_frame_list.append(sorted(glob.glob(os.path.join(img_dir, '*.' 
+ extn)))) 99 | 100 | use_text = texts is not None and len(texts) == len(img_dirs) 101 | 102 | if not os.path.exists(out_dir): 103 | os.mkdir(out_dir) 104 | 105 | for img_path_tuple in zip(*img_frame_list): 106 | img_list = [] 107 | width_list = [] 108 | for im_idx, cur_img_path in enumerate(img_path_tuple): 109 | cur_img = Image.open(cur_img_path) 110 | if use_text: 111 | d = ImageDraw.Draw(cur_img) 112 | font = ImageFont.truetype('Pillow/Tests/fonts/FreeMono.ttf', 12) 113 | d.text((10, 10), texts[im_idx], fill=(0,0,0), font=font) 114 | img_list.append(cur_img) 115 | width_list.append(cur_img.width) 116 | 117 | dst = Image.new('RGB', (sum(width_list), img_list[0].height)) 118 | for im_idx, cur_img in enumerate(img_list): 119 | if im_idx == 0: 120 | dst.paste(cur_img, (0, 0)) 121 | else: 122 | dst.paste(cur_img, (sum(width_list[:im_idx]), 0)) 123 | 124 | dst.save(os.path.join(out_dir, img_path_tuple[0].split('/')[-1])) 125 | 126 | def viz_smpl_seq(body, imw=1080, imh=1080, fps=30, contacts=None, 127 | render_body=True, render_joints=False, render_skeleton=False, render_ground=True, ground_plane=None, 128 | use_offscreen=False, out_path=None, wireframe=False, RGBA=False, 129 | joints_seq=None, joints_vel=None, follow_camera=False, vtx_list=None, points_seq=None, points_vel=None, 130 | static_meshes=None, camera_intrinsics=None, img_seq=None, point_rad=0.015, 131 | skel_connections=smpl_connections, img_extn='png', ground_alpha=1.0, body_alpha=None, mask_seq=None, 132 | cam_offset=[0.0, 4.0, 1.25], ground_color0=[0.8, 0.9, 0.9], ground_color1=[0.6, 0.7, 0.7], 133 | skel_color=[0.0, 0.0, 1.0], 134 | joint_rad=0.015, 135 | point_color=[0.0, 1.0, 0.0], 136 | joint_color=[0.0, 1.0, 0.0], 137 | contact_color=[1.0, 0.0, 0.0], 138 | render_bodies_static=None, 139 | render_points_static=None, 140 | cam_rot=None): 141 | ''' 142 | Visualizes the body model output of a smpl sequence. 
143 | - body : body model output from SMPL forward pass (where the sequence is the batch) 144 | - joints_seq : list of torch/numy tensors/arrays 145 | - points_seq : list of torch/numpy tensors 146 | - camera_intrinsics : (fx, fy, cx, cy) 147 | - ground_plane : [a, b, c, d] 148 | - render_bodies_static is an integer, if given renders all bodies at once but only every x steps 149 | ''' 150 | 151 | if contacts is not None and torch.is_tensor(contacts): 152 | contacts = c2c(contacts) 153 | 154 | if render_body or vtx_list is not None: 155 | start_t = time.time() 156 | nv = body.v.size(1) 157 | vertex_colors = np.tile(colors['grey'], (nv, 1)) 158 | if body_alpha is not None: 159 | vtx_alpha = np.ones((vertex_colors.shape[0], 1))*body_alpha 160 | vertex_colors = np.concatenate([vertex_colors, vtx_alpha], axis=1) 161 | faces = c2c(body.f) 162 | body_mesh_seq = [trimesh.Trimesh(vertices=c2c(body.v[i]), faces=faces, vertex_colors=vertex_colors, process=False) for i in range(body.v.size(0))] 163 | 164 | if render_joints and joints_seq is None: 165 | start_t = time.time() 166 | # only body joints 167 | joints_seq = [c2c(body.Jtr[i, :22]) for i in range(body.Jtr.size(0))] 168 | elif render_joints and torch.is_tensor(joints_seq[0]): 169 | joints_seq = [c2c(joint_frame) for joint_frame in joints_seq] 170 | 171 | if joints_vel is not None and torch.is_tensor(joints_vel[0]): 172 | joints_vel = [c2c(joint_frame) for joint_frame in joints_vel] 173 | if points_vel is not None and torch.is_tensor(points_vel[0]): 174 | points_vel = [c2c(joint_frame) for joint_frame in points_vel] 175 | 176 | mv = MeshViewer(width=imw, height=imh, 177 | use_offscreen=use_offscreen, 178 | follow_camera=follow_camera, 179 | camera_intrinsics=camera_intrinsics, 180 | img_extn=img_extn, 181 | default_cam_offset=cam_offset, 182 | default_cam_rot=cam_rot) 183 | if render_body and render_bodies_static is None: 184 | mv.add_mesh_seq(body_mesh_seq) 185 | elif render_body and render_bodies_static is not None: 186 | mv.add_static_meshes([body_mesh_seq[i] for i in range(len(body_mesh_seq)) if i % render_bodies_static == 0]) 187 | if render_joints and render_skeleton: 188 | mv.add_point_seq(joints_seq, color=joint_color, radius=joint_rad, contact_seq=contacts, 189 | connections=skel_connections, connect_color=skel_color, vel=joints_vel, 190 | contact_color=contact_color, render_static=render_points_static) 191 | elif render_joints: 192 | mv.add_point_seq(joints_seq, color=joint_color, radius=joint_rad, contact_seq=contacts, vel=joints_vel, contact_color=contact_color, 193 | render_static=render_points_static) 194 | 195 | if vtx_list is not None: 196 | mv.add_smpl_vtx_list_seq(body_mesh_seq, vtx_list, color=[0.0, 0.0, 1.0], radius=0.015) 197 | 198 | if points_seq is not None: 199 | if torch.is_tensor(points_seq[0]): 200 | points_seq = [c2c(point_frame) for point_frame in points_seq] 201 | mv.add_point_seq(points_seq, color=point_color, radius=point_rad, vel=points_vel, render_static=render_points_static) 202 | 203 | if static_meshes is not None: 204 | mv.set_static_meshes(static_meshes) 205 | 206 | if img_seq is not None: 207 | mv.set_img_seq(img_seq) 208 | 209 | if mask_seq is not None: 210 | mv.set_mask_seq(mask_seq) 211 | 212 | if render_ground: 213 | xyz_orig = None 214 | if ground_plane is not None: 215 | if render_body: 216 | xyz_orig = body_mesh_seq[0].vertices[0, :] 217 | elif render_joints: 218 | xyz_orig = joints_seq[0][0, :] 219 | elif points_seq is not None: 220 | xyz_orig = points_seq[0][0, :] 221 | 222 | 
mv.add_ground(ground_plane=ground_plane, xyz_orig=xyz_orig, color0=ground_color0, color1=ground_color1, alpha=ground_alpha) 223 | 224 | mv.set_render_settings(out_path=out_path, wireframe=wireframe, RGBA=RGBA, 225 | single_frame=(render_points_static is not None or render_bodies_static is not None)) # only does anything for offscreen rendering 226 | try: 227 | mv.animate(fps=fps) 228 | except RuntimeError as err: 229 | print('Could not render properly with the error: %s' % (str(err))) 230 | 231 | del mv 232 | 233 | def viz_results(body_pred, body_gt, fps, viz_out_dir=None, base_name=None, contacts=None, pred_joints=None, gt_joints=None, 234 | pred_verts=None, gt_verts=None, render_body=True, cleanup=True, pred_contacts=None, gt_contacts=None, 235 | wireframe=False, RGBA=False, camera_intrinsics=None, imw=1080, imh=1080, img_seq=None, 236 | render_ground=True, point_rad=0.015, ground_plane=None, render_pred_body=None, render_gt_body=None, 237 | skel_connections=smpl_connections, ground_alpha=1.0, body_alpha=None, point_color=[0.0, 1.0, 0.0], 238 | cam_offset=[0.0, 4.0, 1.25]): 239 | use_offscreen = False 240 | pred_out_path = gt_out_path = comparison_out_path = None 241 | if viz_out_dir is not None: 242 | if base_name is None: 243 | print('Must give base name to save visualized output') 244 | return 245 | use_offscreen = True 246 | if not os.path.exists(viz_out_dir): 247 | os.mkdir(viz_out_dir) 248 | 249 | base_out_path = os.path.join(viz_out_dir, base_name) 250 | pred_out_path = base_out_path + '_pred' 251 | gt_out_path = base_out_path + '_gt' 252 | comparison_out_path = base_out_path + '_compare' 253 | 254 | if pred_contacts is None: 255 | pred_contacts = contacts 256 | if gt_contacts is None: 257 | gt_contacts = contacts 258 | 259 | if render_pred_body is not None or render_gt_body is not None: 260 | if render_pred_body is None: 261 | render_pred_body = render_body 262 | if render_gt_body is None: 263 | render_gt_body = render_body 264 | else: 265 | render_pred_body = render_body 266 | render_gt_body = render_body 267 | 268 | # determine whether to have a following camera or not 269 | follow_camera = torch.max(torch.abs(body_pred.Jtr[:, :22, :2])) > 2.0 270 | if follow_camera: 271 | print('Using follow camera...') 272 | print('Visualizing PREDICTED sequence...') 273 | viz_smpl_seq(body_pred, 274 | imw=imw, 275 | imh=imh, 276 | fps=fps, 277 | contacts=pred_contacts, 278 | render_body=render_pred_body, 279 | render_joints=(pred_joints is not None), 280 | render_skeleton=(not render_pred_body), 281 | render_ground=render_ground, 282 | ground_plane=ground_plane, 283 | ground_alpha=ground_alpha, 284 | body_alpha=body_alpha, 285 | joints_seq=pred_joints, 286 | points_seq=pred_verts, 287 | use_offscreen=use_offscreen, 288 | out_path=pred_out_path, 289 | wireframe=wireframe, 290 | RGBA=RGBA, 291 | point_rad=point_rad, 292 | camera_intrinsics=camera_intrinsics, 293 | follow_camera=follow_camera, 294 | img_seq=img_seq, 295 | skel_connections=skel_connections, 296 | point_color=point_color, 297 | cam_offset=cam_offset) 298 | print('Visualizing GROUND TRUTH sequence...') 299 | viz_smpl_seq(body_gt, 300 | imw=imw, 301 | imh=imh, 302 | fps=fps, 303 | contacts=gt_contacts, 304 | render_body=render_gt_body, 305 | render_joints=(gt_joints is not None), 306 | render_skeleton=(not render_gt_body), 307 | render_ground=render_ground, 308 | ground_plane=ground_plane, 309 | ground_alpha=ground_alpha, 310 | body_alpha=body_alpha, 311 | joints_seq=gt_joints, 312 | points_seq=gt_verts, 313 | 
use_offscreen=use_offscreen, 314 | out_path=gt_out_path, 315 | wireframe=wireframe, 316 | RGBA=RGBA, 317 | point_rad=point_rad, 318 | camera_intrinsics=camera_intrinsics, 319 | follow_camera=follow_camera, 320 | img_seq=img_seq, 321 | skel_connections=skel_connections, 322 | point_color=point_color, 323 | cam_offset=cam_offset) 324 | 325 | if use_offscreen: 326 | # create a video of each 327 | # create_video(os.path.join(pred_out_path + '/frame_%08d.png'), pred_out_path + '.mp4', fps) 328 | # create_video(os.path.join(gt_out_path + '/frame_%08d.png'), gt_out_path + '.mp4', fps) 329 | # # then for comparison 330 | # create_comparison_images(gt_out_path, pred_out_path, comparison_out_path, text1='GT', text2='Pred') 331 | # create_video(os.path.join(comparison_out_path + '/frame_%08d.png'), comparison_out_path + '.mp4', fps) 332 | 333 | # or gif 334 | create_gif(os.path.join(pred_out_path + '/frame_%08d.png'), pred_out_path + '.gif', fps) 335 | create_gif(os.path.join(gt_out_path + '/frame_%08d.png'), gt_out_path + '.gif', fps) 336 | create_comparison_images(gt_out_path, pred_out_path, comparison_out_path, text1='GT', text2='Pred') 337 | create_gif(os.path.join(comparison_out_path + '/frame_%08d.png'), comparison_out_path + '.gif', fps) 338 | 339 | # cleanup 340 | if cleanup: 341 | shutil.rmtree(pred_out_path) 342 | shutil.rmtree(gt_out_path) 343 | shutil.rmtree(comparison_out_path) 344 | 345 | # avoid cyclic dependency 346 | from viz.mesh_viewer import MeshViewer 347 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | matplotlib 2 | opencv-python 3 | scikit-learn 4 | trimesh 5 | Pillow 6 | pyrender 7 | pyglet==1.5.15 8 | tensorboard 9 | git+https://github.com/nghorbani/configer 10 | torchgeometry==0.1.2 11 | smplx==0.1.28 --------------------------------------------------------------------------------
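
As a quick illustration of the checkpointing helpers in `humor/utils/torch.py`, here is a minimal sketch (not taken from the repo's training scripts) that assumes the `humor/` directory is on `PYTHONPATH`, matching the `from utils.torch import ...` convention used throughout the codebase:

```
import torch
import torch.nn as nn

# helpers from humor/utils/torch.py (assumes humor/ is on PYTHONPATH)
from utils.torch import get_device, save_state, load_state

device = get_device(0)                      # cuda:0 if available, else cpu
model = nn.Linear(10, 2).to(device)
optim = torch.optim.Adam(model.parameters(), lr=1e-3)

# save a full training checkpoint: model + optimizer state + bookkeeping values
save_state('example_checkpoint.pth', model, optim, cur_epoch=5, min_val_loss=0.12)

# restore it later; returns (epoch, min_val_loss, min_train_loss)
epoch, min_val_loss, min_train_loss = load_state(
    'example_checkpoint.pth', model, optimizer=optim, map_location=device)
print(epoch, min_val_loss, min_train_loss)
```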