├── .gitignore
├── .gitmodules
├── README.md
├── example.jpg
├── faceit.py
└── requirements.txt

/.gitignore:
--------------------------------------------------------------------------------
*~
.DS_Store
*.mp4
*.mp3
__pycache__
*.pyc
models
pyyolo
env
data
_*.jpg

--------------------------------------------------------------------------------
/.gitmodules:
--------------------------------------------------------------------------------
[submodule "lib/faceswap"]
	path = faceswap
	url = https://github.com/deepfakes/faceswap.git

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# FaceIt

![Jimmy Fallon's body with John Oliver's head, oh my.](example.jpg)

A script that makes it easy to swap faces in videos using the [deepfakes/faceswap](https://github.com/deepfakes/faceswap) library, with URLs of YouTube videos providing the training data. The image above shows a face swap from Jimmy Fallon (host of The Tonight Show) to John Oliver (host of Last Week Tonight).

## Overview

I wrote this script to help me explore the capabilities and limitations of the video face swapping technology known as [Deepfakes](https://github.com/deepfakes/faceswap).

**[Read all about it in this detailed blog post.](https://goberoi.com/exploring-deepfakes-20c9947c22d9)**

What does this script do? It makes it trivially easy to acquire and preprocess training data from YouTube. This greatly simplifies the work required to set up a new model, since often all you need to do is find 3-4 videos of each person to get decent results.

## Installation

There is a requirements.txt file in the repo, but to make it all work you'll need the CUDA libraries installed, and ideally Dlib compiled with CUDA support.

## Usage

Set up your model and training data in code, e.g.:
```python
# Create the model with params: model name, person A name, person B name.
faceit = FaceIt('fallon_to_oliver', 'fallon', 'oliver')

# Add any number of videos for person A by specifying the YouTube url of the video.
faceit.add_video('fallon', 'fallon_emmastone.mp4', 'https://www.youtube.com/watch?v=bLBSoC_2IY8')
faceit.add_video('fallon', 'fallon_single.mp4', 'https://www.youtube.com/watch?v=xfFVuXN0FSI')
faceit.add_video('fallon', 'fallon_sesamestreet.mp4', 'https://www.youtube.com/watch?v=SHogg7pJI_M')

# Do the same for person B.
faceit.add_video('oliver', 'oliver_trumpcard.mp4', 'https://www.youtube.com/watch?v=JlxQ3IUWT0I')
faceit.add_video('oliver', 'oliver_taxreform.mp4', 'https://www.youtube.com/watch?v=g23w7WPSaU8')
faceit.add_video('oliver', 'oliver_zazu.mp4', 'https://www.youtube.com/watch?v=Y0IUPwXSQqg')
```
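The script can also pull training faces from directories of still photos via `add_photos` (see `_extract_faces_from_photos` in faceit.py, which resolves the directory name under `./data/videos`). A minimal sketch, with a hypothetical directory name:
```python
# Optional: add a directory of still photos for a person. The directory name
# 'oliver_photos' is a hypothetical example; it is resolved under ./data/videos.
faceit.add_photos('oliver', 'oliver_photos')
```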
Then create the directory `./data/persons` and put in it one image containing the face of person A and another containing the face of person B. Use the same names that you used when setting up the model. These images are used to filter each person's face from any others that appear in the videos you provide. E.g.:
```
./data/persons/fallon.jpg
./data/persons/oliver.jpg
```

Then, preprocess the data. This downloads the videos, breaks them into frames, and extracts the relevant faces, e.g.:
```
python faceit.py preprocess fallon_to_oliver
```

Then train the model, e.g.:
```
python faceit.py train fallon_to_oliver
```

Finally, convert any video that is stored on disk, e.g.:
```
python faceit.py convert fallon_to_oliver fallon_emmastone.mp4 --start-time 40 --duration 55 --side-by-side
```

Note that you can get useful usage information just by running: `python faceit.py -h`
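Conversion can also run over a directory of photos instead of a video: pass the `--photos` flag along with the name of a directory that lives under `./data/videos` (the directory name below is a hypothetical example):
```
python faceit.py convert fallon_to_oliver oliver_photos --photos
```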
## License

*This script is shared under the MIT license, but the library it depends on currently has no license. Beware!*

Copyright 2018 Gaurav Oberoi (goberoi@gmail.com)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

--------------------------------------------------------------------------------
/example.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/goberoi/faceit/2e4f069f90ce81d2093bf77670e0b1e0aa048b1f/example.jpg

--------------------------------------------------------------------------------
/faceit.py:
--------------------------------------------------------------------------------
import os
import argparse
import youtube_dl
import cv2
import time
import tqdm
import numpy
from moviepy.video.io.VideoFileClip import VideoFileClip
from moviepy.video.fx.all import crop
from moviepy.editor import clips_array, TextClip, CompositeVideoClip
import shutil
from pathlib import Path
import sys
sys.path.append('faceswap')

from lib.utils import FullHelpArgumentParser
from scripts.extract import ExtractTrainingData
from scripts.train import TrainingProcessor
from scripts.convert import ConvertImage
from lib.faces_detect import detect_faces
from plugins.PluginLoader import PluginLoader
from lib.FaceFilter import FaceFilter


class FaceIt:
    VIDEO_PATH = 'data/videos'
    PERSON_PATH = 'data/persons'
    PROCESSED_PATH = 'data/processed'
    OUTPUT_PATH = 'data/output'
    MODEL_PATH = 'models'
    MODELS = {}

    @classmethod
    def add_model(cls, model):
        FaceIt.MODELS[model._name] = model

    def __init__(self, name, person_a, person_b):
        def _create_person_data(person):
            return {
                'name': person,
                'videos': [],
                'faces': os.path.join(FaceIt.PERSON_PATH, person + '.jpg'),
                'photos': []
            }

        self._name = name

        self._people = {
            person_a: _create_person_data(person_a),
            person_b: _create_person_data(person_b),
        }
        self._person_a = person_a
        self._person_b = person_b

        self._faceswap = FaceSwapInterface()

        if not os.path.exists(FaceIt.VIDEO_PATH):
            os.makedirs(FaceIt.VIDEO_PATH)

    def add_photos(self, person, photo_dir):
        self._people[person]['photos'].append(photo_dir)

    def add_video(self, person, name, url=None, fps=20):
        self._people[person]['videos'].append({
            'name': name,
            'url': url,
            'fps': fps
        })

    def fetch(self):
        self._process_media(self._fetch_video)

    def extract_frames(self):
        self._process_media(self._extract_frames)

    def extract_faces(self):
        self._process_media(self._extract_faces)
        self._process_media(self._extract_faces_from_photos, 'photos')

    def all_videos(self):
        return self._people[self._person_a]['videos'] + self._people[self._person_b]['videos']

    def _process_media(self, func, media_type='videos'):
        for person in self._people:
            for media in self._people[person][media_type]:
                func(person, media)

    def _video_path(self, video):
        return os.path.join(FaceIt.VIDEO_PATH, video['name'])

    def _video_frames_path(self, video):
        return os.path.join(FaceIt.PROCESSED_PATH, video['name'] + '_frames')

    def _video_faces_path(self, video):
        return os.path.join(FaceIt.PROCESSED_PATH, video['name'] + '_faces')

    def _model_path(self, use_gan=False):
        path = FaceIt.MODEL_PATH
        if use_gan:
            path += "_gan"
        return os.path.join(path, self._name)

    def _model_data_path(self):
        return os.path.join(FaceIt.PROCESSED_PATH, "model_data_" + self._name)

    def _model_person_data_path(self, person):
        return os.path.join(self._model_data_path(), person)

    def _fetch_video(self, person, video):
        options = {
            'format': 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/bestvideo+bestaudio',
            'outtmpl': os.path.join(FaceIt.VIDEO_PATH, video['name']),
            'merge_output_format': 'mp4'
        }
        with youtube_dl.YoutubeDL(options) as ydl:
            ydl.download([video['url']])
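    # Note on _fetch_video above: the format string asks youtube-dl for the
    # best mp4 video stream plus m4a audio and merges them into a single mp4.
    # The merge relies on ffmpeg being available on the PATH (an assumption
    # about the local environment; this script does not check for it).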
+= "_gan" 102 | return os.path.join(path, self._name) 103 | 104 | def _model_data_path(self): 105 | return os.path.join(FaceIt.PROCESSED_PATH, "model_data_" + self._name) 106 | 107 | def _model_person_data_path(self, person): 108 | return os.path.join(self._model_data_path(), person) 109 | 110 | def _fetch_video(self, person, video): 111 | options = { 112 | 'format': 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/bestvideo+bestaudio', 113 | 'outtmpl': os.path.join(FaceIt.VIDEO_PATH, video['name']), 114 | 'merge_output_format' : 'mp4' 115 | } 116 | with youtube_dl.YoutubeDL(options) as ydl: 117 | x = ydl.download([video['url']]) 118 | 119 | def _extract_frames(self, person, video): 120 | video_frames_dir = self._video_frames_path(video) 121 | video_clip = VideoFileClip(self._video_path(video)) 122 | 123 | start_time = time.time() 124 | print('[extract-frames] about to extract_frames for {}, fps {}, length {}s'.format(video_frames_dir, video_clip.fps, video_clip.duration)) 125 | 126 | if os.path.exists(video_frames_dir): 127 | print('[extract-frames] frames already exist, skipping extraction: {}'.format(video_frames_dir)) 128 | return 129 | 130 | os.makedirs(video_frames_dir) 131 | frame_num = 0 132 | for frame in tqdm.tqdm(video_clip.iter_frames(fps=video['fps']), total = video_clip.fps * video_clip.duration): 133 | video_frame_file = os.path.join(video_frames_dir, 'frame_{:03d}.jpg'.format(frame_num)) 134 | frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) # Swap RGB to BGR to work with OpenCV 135 | cv2.imwrite(video_frame_file, frame) 136 | frame_num += 1 137 | 138 | print('[extract] finished extract_frames for {}, total frames {}, time taken {:.0f}s'.format( 139 | video_frames_dir, frame_num-1, time.time() - start_time)) 140 | 141 | def _extract_faces(self, person, video): 142 | video_faces_dir = self._video_faces_path(video) 143 | 144 | start_time = time.time() 145 | print('[extract-faces] about to extract faces for {}'.format(video_faces_dir)) 146 | 147 | if os.path.exists(video_faces_dir): 148 | print('[extract-faces] faces already exist, skipping face extraction: {}'.format(video_faces_dir)) 149 | return 150 | 151 | os.makedirs(video_faces_dir) 152 | self._faceswap.extract(self._video_frames_path(video), video_faces_dir, self._people[person]['faces']) 153 | 154 | def _extract_faces_from_photos(self, person, photo_dir): 155 | photo_faces_dir = self._video_faces_path({ 'name' : photo_dir }) 156 | 157 | start_time = time.time() 158 | print('[extract-faces] about to extract faces for {}'.format(photo_faces_dir)) 159 | 160 | if os.path.exists(photo_faces_dir): 161 | print('[extract-faces] faces already exist, skipping face extraction: {}'.format(photo_faces_dir)) 162 | return 163 | 164 | os.makedirs(photo_faces_dir) 165 | self._faceswap.extract(self._video_path({ 'name' : photo_dir }), photo_faces_dir, self._people[person]['faces']) 166 | 167 | 168 | def preprocess(self): 169 | self.fetch() 170 | self.extract_frames() 171 | self.extract_faces() 172 | 173 | def _symlink_faces_for_model(self, person, video): 174 | if isinstance(video, str): 175 | video = { 'name' : video } 176 | for face_file in os.listdir(self._video_faces_path(video)): 177 | target_file = os.path.join(self._model_person_data_path(person), video['name'] + "_" + face_file) 178 | face_file_path = os.path.join(os.getcwd(), self._video_faces_path(video), face_file) 179 | os.symlink(face_file_path, target_file) 180 | 181 | def train(self, use_gan = False): 182 | # Setup directory structure for model, and create one director for person_a 
    def _symlink_faces_for_model(self, person, video):
        if isinstance(video, str):
            video = {'name': video}
        for face_file in os.listdir(self._video_faces_path(video)):
            target_file = os.path.join(self._model_person_data_path(person), video['name'] + "_" + face_file)
            face_file_path = os.path.join(os.getcwd(), self._video_faces_path(video), face_file)
            os.symlink(face_file_path, target_file)

    def train(self, use_gan=False):
        # Set up the directory structure for the model, and create one directory
        # for person_a faces and another for person_b faces, each containing
        # symlinks to all faces.
        if not os.path.exists(self._model_path(use_gan)):
            os.makedirs(self._model_path(use_gan))

        if os.path.exists(self._model_data_path()):
            shutil.rmtree(self._model_data_path())

        for person in self._people:
            os.makedirs(self._model_person_data_path(person))
        self._process_media(self._symlink_faces_for_model)

        self._faceswap.train(self._model_person_data_path(self._person_a), self._model_person_data_path(self._person_b), self._model_path(use_gan), use_gan)

    def convert(self, video_file, swap_model=False, duration=None, start_time=None, use_gan=False, face_filter=False, photos=True, crop_x=None, width=None, side_by_side=False):
        # Magic incantation so that tensorflow doesn't blow up with an out-of-memory error.
        import tensorflow as tf
        import keras.backend.tensorflow_backend as K
        config = tf.ConfigProto()
        config.gpu_options.allow_growth = True
        config.gpu_options.visible_device_list = "0"
        K.set_session(tf.Session(config=config))

        # Load the model.
        model_name = "Original"
        converter_name = "Masked"
        if use_gan:
            model_name = "GAN"
            converter_name = "GAN"
        model = PluginLoader.get_model(model_name)(Path(self._model_path(use_gan)))
        if not model.load(swap_model):
            print('Model not found! A valid model must be provided to continue!')
            exit(1)

        # Load the converter.
        converter = PluginLoader.get_converter(converter_name)
        converter = converter(model.converter(False),
                              blur_size=8,
                              seamless_clone=True,
                              mask_type="facehullandrect",
                              erosion_kernel_size=None,
                              smooth_mask=True,
                              avg_color_adjust=True)

        # Load the face filter: only faces matching the reference image of the
        # person being replaced will be converted.
        filter_person = self._person_a
        if swap_model:
            filter_person = self._person_b
        person_filter = FaceFilter(self._people[filter_person]['faces'])

        # Define the per-frame conversion.
        def _convert_frame(frame, convert_colors=True):
            if convert_colors:
                frame = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)  # moviepy frames are RGB; OpenCV and faceswap expect BGR
            for face in detect_faces(frame, "cnn"):
                if (not face_filter) or person_filter.check(face):
                    frame = converter.patch_image(frame, face)
                    frame = frame.astype(numpy.float32)
            if convert_colors:
                frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)  # back to RGB for moviepy
            return frame

        def _convert_helper(get_frame, t):
            return _convert_frame(get_frame(t))
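        # Note: moviepy evaluates _convert_helper once per output frame via
        # clip.fl below, so conversion time grows with clip duration and fps.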
        media_path = self._video_path({'name': video_file})
        if not photos:
            # Process a video: start by loading the clip.
            video = VideoFileClip(media_path)

            # If a duration is set, trim the clip.
            if duration:
                video = video.subclip(start_time, start_time + duration)

            # Resize the clip before processing.
            if width:
                video = video.resize(width=width)

            # Crop the clip if desired. Note: any crop_x value currently just
            # crops the frame to its left half.
            if crop_x:
                video = video.fx(crop, x2=video.w / 2)

            # Kick off the per-frame conversion.
            new_video = video.fl(_convert_helper)

            # Stack the original and swapped clips on top of each other.
            if side_by_side:
                def add_caption(caption, clip):
                    text = (TextClip(caption, font='Amiri-regular', color='white', fontsize=80).
                            margin(40).
                            set_duration(clip.duration).
                            on_color(color=(0, 0, 0), col_opacity=0.6))
                    return CompositeVideoClip([clip, text])
                video = add_caption("Original", video)
                new_video = add_caption("Swapped", new_video)
                final_video = clips_array([[video], [new_video]])
            else:
                final_video = new_video

            # Resize the clip after processing.
            # final_video = final_video.resize(width=(480 * 2))

            # Write the video to disk.
            output_path = os.path.join(self.OUTPUT_PATH, video_file)
            final_video.write_videofile(output_path, rewrite_audio=True)

            # Clean up.
            del video
            del new_video
            del final_video
        else:
            # Process a directory of photos.
            for face_file in os.listdir(media_path):
                face_path = os.path.join(media_path, face_file)
                image = cv2.imread(face_path)
                image = _convert_frame(image, convert_colors=False)
                cv2.imwrite(os.path.join(self.OUTPUT_PATH, face_file), image)


class FaceSwapInterface:
    def __init__(self):
        self._parser = FullHelpArgumentParser()
        self._subparser = self._parser.add_subparsers()

    def extract(self, input_dir, output_dir, filter_path):
        ExtractTrainingData(
            self._subparser, "extract", "Extract the faces from pictures.")
        args_str = "extract --input-dir {} --output-dir {} --processes 1 --detector cnn --filter {}"
        args_str = args_str.format(input_dir, output_dir, filter_path)
        self._run_script(args_str)

    def train(self, input_a_dir, input_b_dir, model_dir, gan=False):
        model_type = "Original"
        if gan:
            model_type = "GAN"
        TrainingProcessor(
            self._subparser, "train", "This command trains the model for the two faces A and B.")
        args_str = "train --input-A {} --input-B {} --model-dir {} --trainer {} --batch-size {} --write-image"
        args_str = args_str.format(input_a_dir, input_b_dir, model_dir, model_type, 512)
        self._run_script(args_str)

    def _run_script(self, args_str):
        args = self._parser.parse_args(args_str.split(' '))
        args.func(args)
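# Note: the wrapper methods above drive faceswap's own CLI entry points by
# building argv strings. The training batch size is hard-coded to 512, which
# can exhaust memory on smaller GPUs; lowering that value is a reasonable
# first tweak if training fails (an assumption about typical hardware, not
# something faceswap enforces).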
if __name__ == '__main__':
    faceit = FaceIt('fallon_to_oliver', 'fallon', 'oliver')
    faceit.add_video('oliver', 'oliver_trumpcard.mp4', 'https://www.youtube.com/watch?v=JlxQ3IUWT0I')
    faceit.add_video('oliver', 'oliver_taxreform.mp4', 'https://www.youtube.com/watch?v=g23w7WPSaU8')
    faceit.add_video('oliver', 'oliver_zazu.mp4', 'https://www.youtube.com/watch?v=Y0IUPwXSQqg')
    faceit.add_video('oliver', 'oliver_pastor.mp4', 'https://www.youtube.com/watch?v=mUndxpbufkg')
    faceit.add_video('oliver', 'oliver_cookie.mp4', 'https://www.youtube.com/watch?v=H916EVndP_A')
    faceit.add_video('oliver', 'oliver_lorelai.mp4', 'https://www.youtube.com/watch?v=G1xP2f1_1Jg')
    faceit.add_video('fallon', 'fallon_mom.mp4', 'https://www.youtube.com/watch?v=gjXrm2Q-te4')
    faceit.add_video('fallon', 'fallon_charlottesville.mp4', 'https://www.youtube.com/watch?v=E9TJsw67OmE')
    faceit.add_video('fallon', 'fallon_dakota.mp4', 'https://www.youtube.com/watch?v=tPtMP_NAMz0')
    faceit.add_video('fallon', 'fallon_single.mp4', 'https://www.youtube.com/watch?v=xfFVuXN0FSI')
    faceit.add_video('fallon', 'fallon_sesamestreet.mp4', 'https://www.youtube.com/watch?v=SHogg7pJI_M')
    faceit.add_video('fallon', 'fallon_emmastone.mp4', 'https://www.youtube.com/watch?v=bLBSoC_2IY8')
    faceit.add_video('fallon', 'fallon_xfinity.mp4', 'https://www.youtube.com/watch?v=7JwBBZRLgkM')
    faceit.add_video('fallon', 'fallon_bank.mp4', 'https://www.youtube.com/watch?v=q-0hmYHWVgE')
    FaceIt.add_model(faceit)

    parser = argparse.ArgumentParser()
    parser.add_argument('task', choices=['preprocess', 'train', 'convert'])
    parser.add_argument('model', choices=FaceIt.MODELS.keys())
    parser.add_argument('video', nargs='?')
    parser.add_argument('--duration', type=int, default=None)
    parser.add_argument('--photos', action='store_true', default=False)
    parser.add_argument('--swap-model', action='store_true', default=False)
    parser.add_argument('--face-filter', action='store_true', default=False)
    parser.add_argument('--start-time', type=int, default=0)
    parser.add_argument('--crop-x', type=int, default=None)
    parser.add_argument('--width', type=int, default=None)
    parser.add_argument('--side-by-side', action='store_true', default=False)
    args = parser.parse_args()

    faceit = FaceIt.MODELS[args.model]

    if args.task == 'preprocess':
        faceit.preprocess()
    elif args.task == 'train':
        faceit.train()
    elif args.task == 'convert':
        if not args.video:
            print('Need a video to convert. Some ideas: {}'.format(", ".join([video['name'] for video in faceit.all_videos()])))
        else:
            faceit.convert(args.video, duration=args.duration, swap_model=args.swap_model, face_filter=args.face_filter, start_time=args.start_time, photos=args.photos, crop_x=args.crop_x, width=args.width, side_by_side=args.side_by_side)

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
pathlib==1.0.1
scandir==1.6
h5py==2.7.1
Keras==2.1.2
scikit-image
dlib
face_recognition
youtube-dl
moviepy
numpy==1.14.0
tqdm
pydot
graphviz

# Required but installed using other means.
#
# opencv-python==3.4.0
# tensorflow-gpu==1.5.0
--------------------------------------------------------------------------------