├── .gitignore
├── .gitmodules
├── README.md
├── example.jpg
├── faceit.py
└── requirements.txt

/.gitignore:
--------------------------------------------------------------------------------
*~
.DS_Store
*.mp4
*.mp3
__pycache__
*.pyc
models
pyyolo
env
data
_*.jpg

--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------
[submodule "lib/faceswap"]
	path = faceswap
	url = https://github.com/deepfakes/faceswap.git

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# FaceIt

![Jimmy Fallon's body with John Oliver's head, oh my.](example.jpg)

A script that makes it easy to swap faces in videos using the [deepfakes/faceswap](https://github.com/deepfakes/faceswap) library, with URLs of YouTube videos providing the training data. The image above shows a face swap from Jimmy Fallon (host of The Tonight Show) to John Oliver (host of Last Week Tonight).

## Overview

I wrote this script to help me explore the capabilities and limitations of the video face swapping technology known as [Deepfakes](https://github.com/deepfakes/faceswap).

**[Read all about it in this detailed blog post.](https://goberoi.com/exploring-deepfakes-20c9947c22d9)**

What does this script do? It makes it trivially easy to acquire and preprocess training data from YouTube. This greatly simplifies the work required to set up a new model, since often all you need to do is find 3-4 videos of each person to get decent results.

## Installation

There is a requirements.txt file in the repo, but to make it all work you'll need the CUDA libraries installed, and ideally Dlib compiled with CUDA support.

## Usage

Set up your model and training data in code, e.g.:
```python
# Create the model with params: model name, person A name, person B name.
faceit = FaceIt('fallon_to_oliver', 'fallon', 'oliver')

# Add any number of videos for person A by specifying the YouTube url of the video.
faceit.add_video('fallon', 'fallon_emmastone.mp4', 'https://www.youtube.com/watch?v=bLBSoC_2IY8')
faceit.add_video('fallon', 'fallon_single.mp4', 'https://www.youtube.com/watch?v=xfFVuXN0FSI')
faceit.add_video('fallon', 'fallon_sesamestreet.mp4', 'https://www.youtube.com/watch?v=SHogg7pJI_M')

# Do the same for person B.
faceit.add_video('oliver', 'oliver_trumpcard.mp4', 'https://www.youtube.com/watch?v=JlxQ3IUWT0I')
faceit.add_video('oliver', 'oliver_taxreform.mp4', 'https://www.youtube.com/watch?v=g23w7WPSaU8')
faceit.add_video('oliver', 'oliver_zazu.mp4', 'https://www.youtube.com/watch?v=Y0IUPwXSQqg')
```
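The script can also pull training faces from directories of still photos via `add_photos` (see `_extract_faces_from_photos` in faceit.py, which resolves the directory name under `./data/videos`). A minimal sketch, with a hypothetical directory name:
```python
# Optional: add a directory of still photos for a person. The directory name
# 'oliver_photos' is a hypothetical example; it is resolved under ./data/videos.
faceit.add_photos('oliver', 'oliver_photos')
```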
Then create the directory `./data/persons` and put in it one image containing the face of person A and another containing the face of person B. Use the same names that you used when setting up the model. These images are used to filter each person's face from any others that appear in the videos you provide. E.g.:
```
./data/persons/fallon.jpg
./data/persons/oliver.jpg
```

Then, preprocess the data. This downloads the videos, breaks them into frames, and extracts the relevant faces, e.g.:
```
python faceit.py preprocess fallon_to_oliver
```

Then train the model, e.g.:
```
python faceit.py train fallon_to_oliver
```

Finally, convert any video that is stored on disk, e.g.:
```
python faceit.py convert fallon_to_oliver fallon_emmastone.mp4 --start-time 40 --duration 55 --side-by-side
```

Note that you can get useful usage information just by running: `python faceit.py -h`
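Conversion can also run over a directory of photos instead of a video: pass the `--photos` flag along with the name of a directory that lives under `./data/videos` (the directory name below is a hypothetical example):
```
python faceit.py convert fallon_to_oliver oliver_photos --photos
```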
## License

*This script is shared under the MIT license, but the library it depends on currently has no license. Beware!*

Copyright 2018 Gaurav Oberoi (goberoi@gmail.com)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

--------------------------------------------------------------------------------
/example.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/goberoi/faceit/2e4f069f90ce81d2093bf77670e0b1e0aa048b1f/example.jpg

--------------------------------------------------------------------------------
/faceit.py:
--------------------------------------------------------------------------------
import os
import argparse
import youtube_dl
import cv2
import time
import tqdm
import numpy
from moviepy.video.io.VideoFileClip import VideoFileClip
from moviepy.video.fx.all import crop
from moviepy.editor import clips_array, TextClip, CompositeVideoClip
import shutil
from pathlib import Path
import sys
sys.path.append('faceswap')

from lib.utils import FullHelpArgumentParser
from scripts.extract import ExtractTrainingData
from scripts.train import TrainingProcessor
from scripts.convert import ConvertImage
from lib.faces_detect import detect_faces
from plugins.PluginLoader import PluginLoader
from lib.FaceFilter import FaceFilter


class FaceIt:
    VIDEO_PATH = 'data/videos'
    PERSON_PATH = 'data/persons'
    PROCESSED_PATH = 'data/processed'
    OUTPUT_PATH = 'data/output'
    MODEL_PATH = 'models'
    MODELS = {}

    @classmethod
    def add_model(cls, model):
        FaceIt.MODELS[model._name] = model

    def __init__(self, name, person_a, person_b):
        def _create_person_data(person):
            return {
                'name': person,
                'videos': [],
                'faces': os.path.join(FaceIt.PERSON_PATH, person + '.jpg'),
                'photos': []
            }

        self._name = name

        self._people = {
            person_a: _create_person_data(person_a),
            person_b: _create_person_data(person_b),
        }
        self._person_a = person_a
        self._person_b = person_b

        self._faceswap = FaceSwapInterface()

        if not os.path.exists(FaceIt.VIDEO_PATH):
            os.makedirs(FaceIt.VIDEO_PATH)

    def add_photos(self, person, photo_dir):
        self._people[person]['photos'].append(photo_dir)

    def add_video(self, person, name, url=None, fps=20):
        self._people[person]['videos'].append({
            'name': name,
            'url': url,
            'fps': fps
        })

    def fetch(self):
        self._process_media(self._fetch_video)

    def extract_frames(self):
        self._process_media(self._extract_frames)

    def extract_faces(self):
        self._process_media(self._extract_faces)
        self._process_media(self._extract_faces_from_photos, 'photos')

    def all_videos(self):
        return self._people[self._person_a]['videos'] + self._people[self._person_b]['videos']

    def _process_media(self, func, media_type='videos'):
        for person in self._people:
            for media in self._people[person][media_type]:
                func(person, media)

    def _video_path(self, video):
        return os.path.join(FaceIt.VIDEO_PATH, video['name'])

    def _video_frames_path(self, video):
        return os.path.join(FaceIt.PROCESSED_PATH, video['name'] + '_frames')

    def _video_faces_path(self, video):
        return os.path.join(FaceIt.PROCESSED_PATH, video['name'] + '_faces')

    def _model_path(self, use_gan=False):
        path = FaceIt.MODEL_PATH
        if use_gan:
            path += "_gan"
        return os.path.join(path, self._name)

    def _model_data_path(self):
        return os.path.join(FaceIt.PROCESSED_PATH, "model_data_" + self._name)

    def _model_person_data_path(self, person):
        return os.path.join(self._model_data_path(), person)

    def _fetch_video(self, person, video):
        options = {
            'format': 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/bestvideo+bestaudio',
            'outtmpl': os.path.join(FaceIt.VIDEO_PATH, video['name']),
            'merge_output_format': 'mp4'
        }
        with youtube_dl.YoutubeDL(options) as ydl:
            ydl.download([video['url']])
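    # Note on _fetch_video above: the format string asks youtube-dl for the
    # best mp4 video stream plus m4a audio and merges them into a single mp4.
    # The merge relies on ffmpeg being available on the PATH (an assumption
    # about the local environment; this script does not check for it).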
+= "_gan" 102 | return os.path.join(path, self._name) 103 | 104 | def _model_data_path(self): 105 | return os.path.join(FaceIt.PROCESSED_PATH, "model_data_" + self._name) 106 | 107 | def _model_person_data_path(self, person): 108 | return os.path.join(self._model_data_path(), person) 109 | 110 | def _fetch_video(self, person, video): 111 | options = { 112 | 'format': 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/bestvideo+bestaudio', 113 | 'outtmpl': os.path.join(FaceIt.VIDEO_PATH, video['name']), 114 | 'merge_output_format' : 'mp4' 115 | } 116 | with youtube_dl.YoutubeDL(options) as ydl: 117 | x = ydl.download([video['url']]) 118 | 119 | def _extract_frames(self, person, video): 120 | video_frames_dir = self._video_frames_path(video) 121 | video_clip = VideoFileClip(self._video_path(video)) 122 | 123 | start_time = time.time() 124 | print('[extract-frames] about to extract_frames for {}, fps {}, length {}s'.format(video_frames_dir, video_clip.fps, video_clip.duration)) 125 | 126 | if os.path.exists(video_frames_dir): 127 | print('[extract-frames] frames already exist, skipping extraction: {}'.format(video_frames_dir)) 128 | return 129 | 130 | os.makedirs(video_frames_dir) 131 | frame_num = 0 132 | for frame in tqdm.tqdm(video_clip.iter_frames(fps=video['fps']), total = video_clip.fps * video_clip.duration): 133 | video_frame_file = os.path.join(video_frames_dir, 'frame_{:03d}.jpg'.format(frame_num)) 134 | frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) # Swap RGB to BGR to work with OpenCV 135 | cv2.imwrite(video_frame_file, frame) 136 | frame_num += 1 137 | 138 | print('[extract] finished extract_frames for {}, total frames {}, time taken {:.0f}s'.format( 139 | video_frames_dir, frame_num-1, time.time() - start_time)) 140 | 141 | def _extract_faces(self, person, video): 142 | video_faces_dir = self._video_faces_path(video) 143 | 144 | start_time = time.time() 145 | print('[extract-faces] about to extract faces for {}'.format(video_faces_dir)) 146 | 147 | if os.path.exists(video_faces_dir): 148 | print('[extract-faces] faces already exist, skipping face extraction: {}'.format(video_faces_dir)) 149 | return 150 | 151 | os.makedirs(video_faces_dir) 152 | self._faceswap.extract(self._video_frames_path(video), video_faces_dir, self._people[person]['faces']) 153 | 154 | def _extract_faces_from_photos(self, person, photo_dir): 155 | photo_faces_dir = self._video_faces_path({ 'name' : photo_dir }) 156 | 157 | start_time = time.time() 158 | print('[extract-faces] about to extract faces for {}'.format(photo_faces_dir)) 159 | 160 | if os.path.exists(photo_faces_dir): 161 | print('[extract-faces] faces already exist, skipping face extraction: {}'.format(photo_faces_dir)) 162 | return 163 | 164 | os.makedirs(photo_faces_dir) 165 | self._faceswap.extract(self._video_path({ 'name' : photo_dir }), photo_faces_dir, self._people[person]['faces']) 166 | 167 | 168 | def preprocess(self): 169 | self.fetch() 170 | self.extract_frames() 171 | self.extract_faces() 172 | 173 | def _symlink_faces_for_model(self, person, video): 174 | if isinstance(video, str): 175 | video = { 'name' : video } 176 | for face_file in os.listdir(self._video_faces_path(video)): 177 | target_file = os.path.join(self._model_person_data_path(person), video['name'] + "_" + face_file) 178 | face_file_path = os.path.join(os.getcwd(), self._video_faces_path(video), face_file) 179 | os.symlink(face_file_path, target_file) 180 | 181 | def train(self, use_gan = False): 182 | # Setup directory structure for model, and create one director for person_a 
    def _symlink_faces_for_model(self, person, video):
        if isinstance(video, str):
            video = {'name': video}
        for face_file in os.listdir(self._video_faces_path(video)):
            target_file = os.path.join(self._model_person_data_path(person), video['name'] + "_" + face_file)
            face_file_path = os.path.join(os.getcwd(), self._video_faces_path(video), face_file)
            os.symlink(face_file_path, target_file)

    def train(self, use_gan=False):
        # Set up the directory structure for the model, and create one directory
        # for person_a faces and another for person_b faces, each containing
        # symlinks to all faces.
        if not os.path.exists(self._model_path(use_gan)):
            os.makedirs(self._model_path(use_gan))

        if os.path.exists(self._model_data_path()):
            shutil.rmtree(self._model_data_path())

        for person in self._people:
            os.makedirs(self._model_person_data_path(person))
        self._process_media(self._symlink_faces_for_model)

        self._faceswap.train(self._model_person_data_path(self._person_a), self._model_person_data_path(self._person_b), self._model_path(use_gan), use_gan)

    def convert(self, video_file, swap_model=False, duration=None, start_time=None, use_gan=False, face_filter=False, photos=True, crop_x=None, width=None, side_by_side=False):
        # Magic incantation so that tensorflow doesn't blow up with an out-of-memory error.
        import tensorflow as tf
        import keras.backend.tensorflow_backend as K
        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        config.gpu_options.visible_device_list = "0"
        K.set_session(tf.Session(config=config))

        # Load the model.
        model_name = "Original"
        converter_name = "Masked"
        if use_gan:
            model_name = "GAN"
            converter_name = "GAN"
        model = PluginLoader.get_model(model_name)(Path(self._model_path(use_gan)))
        if not model.load(swap_model):
            print('Model not found! A valid model must be provided to continue!')
            exit(1)

        # Load the converter.
        converter = PluginLoader.get_converter(converter_name)
        converter = converter(model.converter(False),
                              blur_size=8,
                              seamless_clone=True,
                              mask_type="facehullandrect",
                              erosion_kernel_size=None,
                              smooth_mask=True,
                              avg_color_adjust=True)

        # Load the face filter: only faces matching the reference image of the
        # person being replaced will be converted.
        filter_person = self._person_a
        if swap_model:
            filter_person = self._person_b
        person_filter = FaceFilter(self._people[filter_person]['faces'])

        # Define the per-frame conversion.
        def _convert_frame(frame, convert_colors=True):
            if convert_colors:
                frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)  # moviepy frames are RGB; OpenCV and faceswap expect BGR
            for face in detect_faces(frame, "cnn"):
                if (not face_filter) or person_filter.check(face):
                    frame = converter.patch_image(frame, face)
                    frame = frame.astype(numpy.float32)
            if convert_colors:
                frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # back to RGB for moviepy
            return frame

        def _convert_helper(get_frame, t):
            return _convert_frame(get_frame(t))
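        # Note: moviepy evaluates _convert_helper once per output frame via
        # clip.fl below, so conversion time grows with clip duration and fps.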
        media_path = self._video_path({'name': video_file})
        if not photos:
            # Process a video: start by loading the clip.
            video = VideoFileClip(media_path)

            # If a duration is set, trim the clip.
            if duration:
                video = video.subclip(start_time, start_time + duration)

            # Resize the clip before processing.
            if width:
                video = video.resize(width=width)

            # Crop the clip if desired. Note: any crop_x value currently just
            # crops the frame to its left half.
            if crop_x:
                video = video.fx(crop, x2=video.w / 2)

            # Kick off the per-frame conversion.
            new_video = video.fl(_convert_helper)

            # Stack the original and swapped clips on top of each other.
            if side_by_side:
                def add_caption(caption, clip):
                    text = (TextClip(caption, font='Amiri-regular', color='white', fontsize=80).
                            margin(40).
                            set_duration(clip.duration).
                            on_color(color=(0, 0, 0), col_opacity=0.6))
                    return CompositeVideoClip([clip, text])
                video = add_caption("Original", video)
                new_video = add_caption("Swapped", new_video)
                final_video = clips_array([[video], [new_video]])
            else:
                final_video = new_video

            # Resize the clip after processing.
            # final_video = final_video.resize(width=(480 * 2))

            # Write the video to disk.
            output_path = os.path.join(self.OUTPUT_PATH, video_file)
            final_video.write_videofile(output_path, rewrite_audio=True)

            # Clean up.
            del video
            del new_video
            del final_video
        else:
            # Process a directory of photos.
            for face_file in os.listdir(media_path):
                face_path = os.path.join(media_path, face_file)
                image = cv2.imread(face_path)
                image = _convert_frame(image, convert_colors=False)
                cv2.imwrite(os.path.join(self.OUTPUT_PATH, face_file), image)


class FaceSwapInterface:
    def __init__(self):
        self._parser = FullHelpArgumentParser()
        self._subparser = self._parser.add_subparsers()

    def extract(self, input_dir, output_dir, filter_path):
        ExtractTrainingData(
            self._subparser, "extract", "Extract the faces from pictures.")
        args_str = "extract --input-dir {} --output-dir {} --processes 1 --detector cnn --filter {}"
        args_str = args_str.format(input_dir, output_dir, filter_path)
        self._run_script(args_str)

    def train(self, input_a_dir, input_b_dir, model_dir, gan=False):
        model_type = "Original"
        if gan:
            model_type = "GAN"
        TrainingProcessor(
            self._subparser, "train", "This command trains the model for the two faces A and B.")
        args_str = "train --input-A {} --input-B {} --model-dir {} --trainer {} --batch-size {} --write-image"
        args_str = args_str.format(input_a_dir, input_b_dir, model_dir, model_type, 512)
        self._run_script(args_str)

    def _run_script(self, args_str):
        args = self._parser.parse_args(args_str.split(' '))
        args.func(args)
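# Note: the wrapper methods above drive faceswap's own CLI entry points by
# building argv strings. The training batch size is hard-coded to 512, which
# can exhaust memory on smaller GPUs; lowering that value is a reasonable
# first tweak if training fails (an assumption about typical hardware, not
# something faceswap enforces).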
if __name__ == '__main__':
    faceit = FaceIt('fallon_to_oliver', 'fallon', 'oliver')
    faceit.add_video('oliver', 'oliver_trumpcard.mp4', 'https://www.youtube.com/watch?v=JlxQ3IUWT0I')
    faceit.add_video('oliver', 'oliver_taxreform.mp4', 'https://www.youtube.com/watch?v=g23w7WPSaU8')
    faceit.add_video('oliver', 'oliver_zazu.mp4', 'https://www.youtube.com/watch?v=Y0IUPwXSQqg')
    faceit.add_video('oliver', 'oliver_pastor.mp4', 'https://www.youtube.com/watch?v=mUndxpbufkg')
    faceit.add_video('oliver', 'oliver_cookie.mp4', 'https://www.youtube.com/watch?v=H916EVndP_A')
    faceit.add_video('oliver', 'oliver_lorelai.mp4', 'https://www.youtube.com/watch?v=G1xP2f1_1Jg')
    faceit.add_video('fallon', 'fallon_mom.mp4', 'https://www.youtube.com/watch?v=gjXrm2Q-te4')
    faceit.add_video('fallon', 'fallon_charlottesville.mp4', 'https://www.youtube.com/watch?v=E9TJsw67OmE')
    faceit.add_video('fallon', 'fallon_dakota.mp4', 'https://www.youtube.com/watch?v=tPtMP_NAMz0')
    faceit.add_video('fallon', 'fallon_single.mp4', 'https://www.youtube.com/watch?v=xfFVuXN0FSI')
    faceit.add_video('fallon', 'fallon_sesamestreet.mp4', 'https://www.youtube.com/watch?v=SHogg7pJI_M')
    faceit.add_video('fallon', 'fallon_emmastone.mp4', 'https://www.youtube.com/watch?v=bLBSoC_2IY8')
    faceit.add_video('fallon', 'fallon_xfinity.mp4', 'https://www.youtube.com/watch?v=7JwBBZRLgkM')
    faceit.add_video('fallon', 'fallon_bank.mp4', 'https://www.youtube.com/watch?v=q-0hmYHWVgE')
    FaceIt.add_model(faceit)

    parser = argparse.ArgumentParser()
    parser.add_argument('task', choices=['preprocess', 'train', 'convert'])
    parser.add_argument('model', choices=FaceIt.MODELS.keys())
    parser.add_argument('video', nargs='?')
    parser.add_argument('--duration', type=int, default=None)
    parser.add_argument('--photos', action='store_true', default=False)
    parser.add_argument('--swap-model', action='store_true', default=False)
    parser.add_argument('--face-filter', action='store_true', default=False)
    parser.add_argument('--start-time', type=int, default=0)
    parser.add_argument('--crop-x', type=int, default=None)
    parser.add_argument('--width', type=int, default=None)
    parser.add_argument('--side-by-side', action='store_true', default=False)
    args = parser.parse_args()

    faceit = FaceIt.MODELS[args.model]

    if args.task == 'preprocess':
        faceit.preprocess()
    elif args.task == 'train':
        faceit.train()
    elif args.task == 'convert':
        if not args.video:
            print('Need a video to convert. Some ideas: {}'.format(", ".join([video['name'] for video in faceit.all_videos()])))
        else:
            faceit.convert(args.video, duration=args.duration, swap_model=args.swap_model, face_filter=args.face_filter, start_time=args.start_time, photos=args.photos, crop_x=args.crop_x, width=args.width, side_by_side=args.side_by_side)

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
pathlib==1.0.1
scandir==1.6
h5py==2.7.1
Keras==2.1.2
scikit-image
dlib
face_recognition
youtube-dl
moviepy
numpy==1.14.0
tqdm
pydot
graphviz

# Required but installed using other means.
#
# opencv-python==3.4.0
# tensorflow-gpu==1.5.0
--------------------------------------------------------------------------------