├── example.gif
├── .gitattributes
├── requirements.txt
├── LICENSE
├── .gitignore
├── README.md
└── createmask.py
/example.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhiteNoise/deep-bgremove/HEAD/example.gif
--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
# Auto detect text files and perform LF normalization
* text=auto
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhiteNoise/deep-bgremove/HEAD/requirements.txt
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2019 WhiteNoise

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Deep BGRemove

![Example](example.gif)

This is a quick tool I whipped up for doing background removal using PyTorch's DeepLabV3 segmentation model. The pre-trained weights are downloaded from the PyTorch Hub, so getting up and running should be fairly easy. MoviePy handles the processing and I/O, so a variety of input formats are accepted. CUDA will be used if it is available on the system (you will want it, as generating the output can be very slow, especially for large videos).

Mask movies can be brought into a tool like After Effects and used as an alpha channel.

## Installation
Install the requirements:
```bash
pip install -r requirements.txt
```

## Usage

```bash
python createmask.py --input "avideo.mp4" --output "output.mp4"
```

Optionally use `--width` to resize the input while keeping the aspect ratio. If you have 4K video, resize it down to 1080p for masking purposes, otherwise it will take quite a while.

## Tips

You'll want a reasonably high-resolution video as input (with a similar output size) - 720p or 1080p is recommended. At too low a resolution, the model tends to miss people or not mask them very accurately. Good quality video is also important - the network has trouble with motion blur. The training data is most likely all in focus, so the network works great on still photos, but people in motion are not masked as well. Higher frame rates may help reduce motion blur - I am still experimenting with this.

## Notes

The video processing is based on MoviePy, which is a powerful video editor in its own right, so adding more powerful editing commands would be fairly easy. I had originally wanted to implement compositing in this script, but MoviePy has some limitations on how masks work and are processed, and synchronizing video streams lends itself more to visual tools anyway. That could change in the future, though.
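In the meantime, if you want to stay in Python rather than round-tripping through After Effects, a minimal compositing sketch using MoviePy directly might look like the following. The file names are hypothetical, and it assumes the mask was generated at the same resolution as the footage (so either skip `--width`, or resize the footage to match):

```python
from moviepy.editor import VideoFileClip, CompositeVideoClip

footage = VideoFileClip("avideo.mp4")          # original clip
mask = VideoFileClip("output.mp4").to_mask()   # grayscale mask video -> MoviePy mask (values 0..1)
background = VideoFileClip("background.mp4")   # whatever you want behind the subject

# Use the mask as the footage's alpha channel and place it over the background.
masked = footage.set_mask(mask)
comp = CompositeVideoClip([background, masked]).set_duration(footage.duration)
comp.write_videofile("composited.mp4")
```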
--------------------------------------------------------------------------------
/createmask.py:
--------------------------------------------------------------------------------
import argparse

import torch
import torch.nn.functional as F
from torchvision import transforms

from moviepy.editor import VideoFileClip

# Load the pre-trained DeepLabV3 (ResNet-101 backbone) segmentation model from the PyTorch Hub.
model = torch.hub.load('pytorch/vision', 'deeplabv3_resnet101', pretrained=True)
people_class = 15  # index of the "person" class in the model's (Pascal VOC) label set

model.eval()
print("Model Loaded")

# 3x3 Gaussian kernel used to feather the edges of the person mask.
blur = torch.FloatTensor([[[[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]]]]) / 16.0

# move the model and blur kernel to the GPU for speed if available
if torch.cuda.is_available():
    model.to('cuda')
    blur = blur.to('cuda')

# ImageNet normalization expected by the torchvision segmentation models.
preprocess = transforms.Compose([
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])


def makeSegMask(img):
    # img is an H x W x 3 RGB frame (uint8) as delivered by MoviePy.
    frame_data = torch.FloatTensor(img) / 255.0

    input_tensor = preprocess(frame_data.permute(2, 0, 1))
    input_batch = input_tensor.unsqueeze(0)  # create a mini-batch as expected by the model

    # move the input to the GPU for speed if available
    if torch.cuda.is_available():
        input_batch = input_batch.to('cuda')

    with torch.no_grad():
        output = model(input_batch)['out'][0]  # per-class logits, shape (num_classes, H, W)

    segmentation = output.argmax(0)

    # Soft weight from the background logits (class 0): roughly 2 where the model is confident
    # the pixel is NOT background, falling towards 0 where it is. The factor of 2 lets the
    # hardtanh below saturate confident foreground pixels to fully opaque.
    bgOut = output[0:1]
    a = (1.0 - F.relu(torch.tanh(bgOut * 0.30 - 1.0))).pow(0.5) * 2.0

    # Hard 0/1 mask of pixels whose argmax class is "person".
    people = segmentation.eq(torch.ones_like(segmentation).long().fill_(people_class)).float()
    people.unsqueeze_(0).unsqueeze_(0)  # NCHW for conv2d

    # Blur the hard mask a few times to feather its edges.
    for _ in range(3):
        people = F.conv2d(people, blur, stride=1, padding=1)

    # Combine the soft background weight with the feathered person mask, clamp to [0, 1],
    # and replicate to three channels so it can be written out as an ordinary video.
    combined_mask = F.relu(F.hardtanh(a * (people.squeeze().pow(1.5))))
    combined_mask = combined_mask.expand(1, 3, -1, -1)

    res = (combined_mask * 255.0).cpu().squeeze().byte().permute(1, 2, 0).numpy()

    return res


def processMovie(args):
    print("Processing {}... This will take some time.".format(args.input))

    if args.width != 0:
        target = [args.width, None]  # None lets MoviePy keep the aspect ratio
    else:
        target = None

    realityClip = VideoFileClip(args.input, target_resolution=target)

    # Run the segmentation on every frame and write the result as a new video.
    realityMask = realityClip.fl_image(makeSegMask)
    realityMask.write_videofile(args.output)


def main():
    parser = argparse.ArgumentParser(description='BGRemove')
    parser.add_argument('--input', metavar='N', required=True,
                        help='input movie path')
    parser.add_argument('--output', metavar='N', required=True,
                        help='output movie path')
    parser.add_argument('--width', metavar='N', type=int, default=0,
                        help='target width (optional, omit for full width)')

    args = parser.parse_args()

    processMovie(args)


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------
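Not part of the repository, but as a quick sanity check the same masking function can be exercised on a single still image before committing to a long video render. A hypothetical snippet (imageio is pulled in as a MoviePy dependency; file names are placeholders):

```python
import imageio

from createmask import makeSegMask  # loads the DeepLabV3 model at import time

img = imageio.imread("still.jpg")         # H x W x 3 RGB, uint8
mask = makeSegMask(img[:, :, :3])         # drop any alpha channel, run the segmentation
imageio.imwrite("still_mask.png", mask)   # white = person, black = background
```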