├── LICENSE
├── README.md
├── README_zh_Hans.md
├── dataset
│   ├── augmentation.py
│   ├── coco.py
│   ├── imagematte.py
│   ├── spd.py
│   ├── videomatte.py
│   └── youtubevis.py
├── documentation
│   ├── image
│   │   ├── showreel.gif
│   │   └── teaser.gif
│   ├── inference.md
│   ├── inference_zh_Hans.md
│   ├── misc
│   │   ├── aim_test.txt
│   │   ├── d646_test.txt
│   │   ├── dvm_background_test_clips.txt
│   │   ├── dvm_background_train_clips.txt
│   │   ├── imagematte_train.txt
│   │   ├── imagematte_valid.txt
│   │   └── spd_preprocess.py
│   └── training.md
├── evaluation
│   ├── evaluate_hr.py
│   ├── evaluate_lr.py
│   ├── generate_imagematte_with_background_image.py
│   ├── generate_imagematte_with_background_video.py
│   ├── generate_videomatte_with_background_image.py
│   └── generate_videomatte_with_background_video.py
├── hubconf.py
├── inference.py
├── inference_speed_test.py
├── inference_utils.py
├── model
│   ├── __init__.py
│   ├── decoder.py
│   ├── deep_guided_filter.py
│   ├── fast_guided_filter.py
│   ├── lraspp.py
│   ├── mobilenetv3.py
│   ├── model.py
│   └── resnet.py
├── requirements_inference.txt
├── requirements_training.txt
├── train.py
├── train_config.py
└── train_loss.py

/README.md:
--------------------------------------------------------------------------------

# Robust Video Matting (RVM)
English | 中文
Official repository for the paper [Robust High-Resolution Video Matting with Temporal Guidance](https://peterl1n.github.io/RobustVideoMatting/). RVM is specifically designed for robust human video matting. Unlike existing neural models that process frames as independent images, RVM uses a recurrent neural network to process videos with temporal memory. RVM can perform matting in real-time on any video without additional inputs. It achieves **4K 76FPS** and **HD 104FPS** on an Nvidia GTX 1080 Ti GPU. The project was developed at [ByteDance Inc.](https://www.bytedance.com/)
## Download

| Framework | Download | Notes |
| :-------- | :------- | :---- |
| PyTorch | rvm_mobilenetv3.pth<br>rvm_resnet50.pth | Official weights for PyTorch. Doc |
| TorchHub | Nothing to download. | The easiest way to use our model in your PyTorch project. Doc |
| TorchScript | rvm_mobilenetv3_fp32.torchscript<br>rvm_mobilenetv3_fp16.torchscript<br>rvm_resnet50_fp32.torchscript<br>rvm_resnet50_fp16.torchscript | For inference on mobile, consider exporting int8 quantized models yourself. Doc |
| ONNX | rvm_mobilenetv3_fp32.onnx<br>rvm_mobilenetv3_fp16.onnx<br>rvm_resnet50_fp32.onnx<br>rvm_resnet50_fp16.onnx | Tested on ONNX Runtime with CPU and CUDA backends. Provided models use opset 12. Doc, Exporter |
| TensorFlow | rvm_mobilenetv3_tf.zip<br>rvm_resnet50_tf.zip | TensorFlow 2 SavedModel. Doc |
| TensorFlow.js | rvm_mobilenetv3_tfjs_int8.zip | Run the model on the web. Demo, Starter Code |
| CoreML | rvm_mobilenetv3_1280x720_s0.375_fp16.mlmodel<br>rvm_mobilenetv3_1280x720_s0.375_int8.mlmodel<br>rvm_mobilenetv3_1920x1080_s0.25_fp16.mlmodel<br>rvm_mobilenetv3_1920x1080_s0.25_int8.mlmodel | CoreML does not support dynamic resolutions; other resolutions can be exported yourself. Models require iOS 13+. `s` denotes `downsample_ratio`. Doc, Exporter |
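For orientation, here is a minimal sketch of using the PyTorch / TorchHub options above. The repository id `PeterL1n/RobustVideoMatting`, the entry-point names (`mobilenetv3`, `converter`), and the `convert_video` arguments are assumptions inferred from the project page and the `hubconf.py` listed in the tree above; treat it as an illustration and defer to the Doc links for the authoritative API.

```python
import torch

# Both entry points are assumed to be defined in hubconf.py.
# Use "resnet50" instead of "mobilenetv3" for the larger variant.
model = torch.hub.load("PeterL1n/RobustVideoMatting", "mobilenetv3").eval().cuda()
convert_video = torch.hub.load("PeterL1n/RobustVideoMatting", "converter")

convert_video(
    model,
    input_source="input.mp4",              # video file (or an image-sequence folder)
    downsample_ratio=None,                 # None = let the converter pick a ratio automatically
    output_type="video",
    output_composition="composition.mp4",  # matted foreground composited over a solid background
    output_alpha="alpha.mp4",              # raw alpha matte
    seq_chunk=12,                          # frames processed per batch for throughput
)
```

Loading the `rvm_mobilenetv3.pth` weights directly into the model class from the `model/` package works as well; TorchHub simply avoids cloning the repository.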
/README_zh_Hans.md:
--------------------------------------------------------------------------------

English | 中文
Official GitHub repository for the paper [Robust High-Resolution Video Matting with Temporal Guidance](https://peterl1n.github.io/RobustVideoMatting/). RVM is designed specifically for robust human video matting. Unlike existing neural networks that treat every frame as an independent image, RVM uses a recurrent neural network and keeps temporal memory while processing a video stream. RVM can perform real-time, high-definition matting on any video. It achieves **4K 76FPS** and **HD 104FPS** on an Nvidia GTX 1080 Ti. This research project comes from [ByteDance](https://www.bytedance.com/).
## Download

| Framework | Download | Notes |
| :-------- | :------- | :---- |
| PyTorch | rvm_mobilenetv3.pth<br>rvm_resnet50.pth | Official PyTorch model weights. Doc |
| TorchHub | No manual download needed. | The most convenient way to use this model in your PyTorch project. Doc |
| TorchScript | rvm_mobilenetv3_fp32.torchscript<br>rvm_mobilenetv3_fp16.torchscript<br>rvm_resnet50_fp32.torchscript<br>rvm_resnet50_fp16.torchscript | For inference on mobile, consider exporting an int8 quantized model yourself. Doc |
| ONNX | rvm_mobilenetv3_fp32.onnx<br>rvm_mobilenetv3_fp16.onnx<br>rvm_resnet50_fp32.onnx<br>rvm_resnet50_fp16.onnx | Tested on ONNX Runtime with CPU and CUDA backends. The provided models use opset 12. Doc, Exporter |
| TensorFlow | rvm_mobilenetv3_tf.zip<br>rvm_resnet50_tf.zip | TensorFlow 2 SavedModel format. Doc |
| TensorFlow.js | rvm_mobilenetv3_tfjs_int8.zip | Run the model in the browser. Demo, Starter Code |
| CoreML | rvm_mobilenetv3_1280x720_s0.375_fp16.mlmodel<br>rvm_mobilenetv3_1280x720_s0.375_int8.mlmodel<br>rvm_mobilenetv3_1920x1080_s0.25_fp16.mlmodel<br>rvm_mobilenetv3_1920x1080_s0.25_int8.mlmodel | CoreML only supports fixed-resolution export; other resolutions can be exported yourself. Requires iOS 13+. `s` denotes `downsample_ratio`. Doc, Exporter |
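The ONNX rows in both tables point to ONNX Runtime; the sketch below illustrates how such a recurrent model is typically driven there. The input/output names (`src`, `r1i`–`r4i`, `downsample_ratio`; `fgr`, `pha`, `r1o`–`r4o`), the zero-initialized states, and the scalar ratio tensor are assumptions about the exported graph rather than guarantees of this table; inspect the model (e.g. with Netron) or the Doc link before relying on them.

```python
import numpy as np
import onnxruntime as ort

# Stand-in for a real preprocessed clip: float32 frames in [0, 1], shape [1, 3, H, W].
frames = [np.random.rand(1, 3, 720, 1280).astype(np.float32) for _ in range(4)]

sess = ort.InferenceSession("rvm_mobilenetv3_fp32.onnx",
                            providers=["CPUExecutionProvider"])

# Recurrent states start as zero tensors and are fed back in on every frame.
rec = [np.zeros([1, 1, 1, 1], dtype=np.float32)] * 4
downsample_ratio = np.array([0.25], dtype=np.float32)

for src in frames:
    fgr, pha, *rec = sess.run(None, {
        "src": src,
        "r1i": rec[0], "r2i": rec[1], "r3i": rec[2], "r4i": rec[3],
        "downsample_ratio": downsample_ratio,
    })
    # fgr is the predicted foreground and pha the alpha matte:
    # composite with fgr * pha + background * (1 - pha).
```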
/documentation/inference.md:
--------------------------------------------------------------------------------

English | 中文

## Content

* [Concepts](#concepts)
  * [Downsample Ratio](#downsample-ratio)
  * [Recurrent States](#recurrent-states)
* [PyTorch](#pytorch)
* [TorchHub](#torchhub)
* [TorchScript](#torchscript)
* [ONNX](#onnx)
* [TensorFlow](#tensorflow)
* [TensorFlow.js](#tensorflowjs)
* [CoreML](#coreml)
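Before the per-framework sections, here is a minimal PyTorch-style sketch of the two concepts listed above: the model carries four recurrent states from frame to frame, and `downsample_ratio` controls the internal working resolution. The `MattingNetwork` import, the constructor argument, and the exact call signature are assumptions to be checked against the PyTorch section of this document.

```python
import torch
from model import MattingNetwork  # assumed import path for the model/ package in this repo

model = MattingNetwork("mobilenetv3").eval()
model.load_state_dict(torch.load("rvm_mobilenetv3.pth", map_location="cpu"))

rec = [None] * 4         # the four recurrent states; None lets the model initialize them
downsample_ratio = 0.25  # reasonable for HD input; use smaller values for higher resolutions

# Stand-in for a real frame source: random 1080p frames in [0, 1].
frames = [torch.rand(1, 3, 1080, 1920) for _ in range(4)]

with torch.no_grad():
    for src in frames:
        # The model returns the foreground, the alpha matte, and the updated recurrent
        # states; feeding the states back in on the next frame provides temporal memory.
        fgr, pha, *rec = model(src, *rec, downsample_ratio=downsample_ratio)
```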
4 | 5 | ## Content 6 | 7 | * [Concepts](#concepts) 8 | * [Downsample Ratio](#downsample-ratio) 9 | * [Recurrent States](#recurrent-states) 10 | * [PyTorch](#pytorch) 11 | * [TorchHub](#torchhub) 12 | * [TorchScript](#torchscript) 13 | * [ONNX](#onnx) 14 | * [TensorFlow](#tensorflow) 15 | * [TensorFlow.js](#tensorflowjs) 16 | * [CoreML](#coreml) 17 | 18 |English | 中文
4 | 5 | ## 目录 6 | 7 | * [概念](#概念) 8 | * [下采样比](#下采样比) 9 | * [循环记忆](#循环记忆) 10 | * [PyTorch](#pytorch) 11 | * [TorchHub](#torchhub) 12 | * [TorchScript](#torchscript) 13 | * [ONNX](#onnx) 14 | * [TensorFlow](#tensorflow) 15 | * [TensorFlow.js](#tensorflowjs) 16 | * [CoreML](#coreml) 17 | 18 |