├── example.gif
├── .gitattributes
├── requirements.txt
├── LICENSE
├── .gitignore
├── README.md
└── createmask.py
/example.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhiteNoise/deep-bgremove/HEAD/example.gif
--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhiteNoise/deep-bgremove/HEAD/requirements.txt
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2019 WhiteNoise

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Deep BGRemove

![Example](example.gif)

This is a quick tool I whipped up for doing background removal using PyTorch's DeepLabV3 segmentation model. The pre-trained weights are downloaded from the PyTorch Hub, so getting up and running should be fairly easy. MoviePy handles the processing and I/O, so a variety of input formats are accepted. CUDA will be used if it is available on the system (you will want it, as generating the output can be very slow, especially for large videos).

Mask movies can be brought into a tool like After Effects and used as an alpha channel.

## Installation
Install the requirements:
```bash
pip install -r requirements.txt
```

## Usage

```bash
python createmask.py --input "avideo.mp4" --output "output.mp4"
```

Optionally use `--width` to resize the input while keeping the aspect ratio. If you have 4K video, resize it down to 1080p for masking purposes, otherwise it will take quite a while.

## Tips

You'll want a reasonably high-resolution video as input (with a similar output size) - 720p or 1080p is recommended. At too low a resolution, the model tends to miss people or not mask them very accurately. Good quality video is also important - the network has trouble with motion blur. The training data is most likely all in focus, so the network works great on still photos, but people in motion are not masked as well. Higher frame rates may help reduce motion blur - I am still experimenting with this.

## Notes

The video processing is based on MoviePy, which is a powerful video editor in its own right, so adding more powerful editing commands would be fairly easy. I had originally wanted to implement compositing in this script, but MoviePy has some limitations on how masks work and are processed, and synchronizing video streams lends itself more to visual tools anyway. That could change in the future, though.
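In the meantime, if you want to stay in Python rather than round-tripping through After Effects, a minimal compositing sketch using MoviePy directly might look like the following. The file names are hypothetical, and it assumes the mask was generated at the same resolution as the footage (so either skip `--width`, or resize the footage to match):

```python
from moviepy.editor import VideoFileClip, CompositeVideoClip

footage = VideoFileClip("avideo.mp4")          # original clip
mask = VideoFileClip("output.mp4").to_mask()   # grayscale mask video -> MoviePy mask (values 0..1)
background = VideoFileClip("background.mp4")   # whatever you want behind the subject

# Use the mask as the footage's alpha channel and place it over the background.
masked = footage.set_mask(mask)
comp = CompositeVideoClip([background, masked]).set_duration(footage.duration)
comp.write_videofile("composited.mp4")
```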
--------------------------------------------------------------------------------
/createmask.py:
--------------------------------------------------------------------------------
import argparse

import torch
import torch.nn.functional as F
from torchvision import transforms

from moviepy.editor import VideoFileClip

# Load the pre-trained DeepLabV3 (ResNet-101 backbone) segmentation model from the PyTorch Hub.
model = torch.hub.load('pytorch/vision', 'deeplabv3_resnet101', pretrained=True)
people_class = 15  # index of the "person" class in the model's (Pascal VOC) label set

model.eval()
print("Model Loaded")

# 3x3 Gaussian kernel used to feather the edges of the person mask.
blur = torch.FloatTensor([[[[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]]]]) / 16.0

# move the model and blur kernel to the GPU for speed if available
if torch.cuda.is_available():
    model.to('cuda')
    blur = blur.to('cuda')

# ImageNet normalization expected by the torchvision segmentation models.
preprocess = transforms.Compose([
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])


def makeSegMask(img):
    # img is an H x W x 3 RGB frame (uint8) as delivered by MoviePy.
    frame_data = torch.FloatTensor(img) / 255.0

    input_tensor = preprocess(frame_data.permute(2, 0, 1))
    input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model

    # move the input to the GPU for speed if available
    if torch.cuda.is_available():
        input_batch = input_batch.to('cuda')

    with torch.no_grad():
        output = model(input_batch)['out'][0]  # per-class logits, shape (num_classes, H, W)

    segmentation = output.argmax(0)

    # Soft weight from the background logits (class 0): roughly 2 where the model is confident
    # the pixel is NOT background, falling towards 0 where it is. The factor of 2 lets the
    # hardtanh below saturate confident foreground pixels to fully opaque.
    bgOut = output[0:1]
    a = (1.0 - F.relu(torch.tanh(bgOut * 0.30 - 1.0))).pow(0.5) * 2.0

    # Hard 0/1 mask of pixels whose argmax class is "person".
    people = segmentation.eq(torch.ones_like(segmentation).long().fill_(people_class)).float()
    people.unsqueeze_(0).unsqueeze_(0)  # NCHW for conv2d

    # Blur the hard mask a few times to feather its edges.
    for _ in range(3):
        people = F.conv2d(people, blur, stride=1, padding=1)

    # Combine the soft background weight with the feathered person mask, clamp to [0, 1],
    # and replicate to three channels so it can be written out as an ordinary video.
    combined_mask = F.relu(F.hardtanh(a * (people.squeeze().pow(1.5))))
    combined_mask = combined_mask.expand(1, 3, -1, -1)

    res = (combined_mask * 255.0).cpu().squeeze().byte().permute(1, 2, 0).numpy()

    return res


def processMovie(args):
    print("Processing {}... This will take some time.".format(args.input))

    if args.width != 0:
        target = [args.width, None]  # None lets MoviePy keep the aspect ratio
    else:
        target = None

    realityClip = VideoFileClip(args.input, target_resolution=target)

    # Run the segmentation on every frame and write the result as a new video.
    realityMask = realityClip.fl_image(makeSegMask)
    realityMask.write_videofile(args.output)


def main():
    parser = argparse.ArgumentParser(description='BGRemove')
    parser.add_argument('--input', metavar='N', required=True,
                        help='input movie path')
    parser.add_argument('--output', metavar='N', required=True,
                        help='output movie path')
    parser.add_argument('--width', metavar='N', type=int, default=0,
                        help='target width (optional, omit for full width)')

    args = parser.parse_args()

    processMovie(args)


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
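Not part of the repository, but as a quick sanity check the same masking function can be exercised on a single still image before committing to a long video render. A hypothetical snippet (imageio is pulled in as a MoviePy dependency; file names are placeholders):

```python
import imageio

from createmask import makeSegMask  # loads the DeepLabV3 model at import time

img = imageio.imread("still.jpg")         # H x W x 3 RGB, uint8
mask = makeSegMask(img[:, :, :3])         # drop any alpha channel, run the segmentation
imageio.imwrite("still_mask.png", mask)   # white = person, black = background
```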