├── README.md ├── calculator.py ├── ebsynth_utility.py ├── imgs ├── clipseg.png ├── controlnet_0.png ├── controlnet_1.png ├── controlnet_option_in_ebsynthutil.png ├── controlnet_setting.png ├── sample1.mp4 ├── sample2.mp4 ├── sample3.mp4 ├── sample4.mp4 ├── sample5.mp4 ├── sample6.mp4 ├── sample_anyaheh.mp4 ├── sample_autotag.mp4 └── sample_clipseg.mp4 ├── install.py ├── sample ├── add_token.txt └── blacklist.txt ├── scripts ├── custom_script.py └── ui.py ├── stage1.py ├── stage2.py ├── stage3_5.py ├── stage5.py ├── stage7.py ├── stage8.py └── style.css /README.md: -------------------------------------------------------------------------------- 1 | # ebsynth_utility 2 | 3 | ## Overview 4 | #### AUTOMATIC1111 UI extension for creating videos using img2img and ebsynth. 5 | #### This extension allows you to output edited videos using ebsynth. (AE is not required.) 6 | 7 | 8 | ##### With [Controlnet](https://github.com/Mikubill/sd-webui-controlnet) installed, I have confirmed that all features of this extension work properly! 9 | ##### [Controlnet](https://github.com/Mikubill/sd-webui-controlnet) is a must for video editing, so I recommend installing it. 10 | ##### Multi ControlNet ("canny" + "normal map") is well suited to video editing. 11 | 12 |
13 | 14 | ###### I modified animatediff-cli to create a txt2video tool that allows flexible prompt specification. You can use it if you like. 15 | ###### [animatediff-cli-prompt-travel](https://github.com/s9roll7/animatediff-cli-prompt-travel) 16 |
17 | 18 | 19 |
20 | 21 | 22 | ## Example 23 | - The following samples are raw output of this extension. 24 | #### sample 1 masking with [clipseg](https://github.com/timojl/clipseg) 25 | - first from left : original 26 | - second from left : masking "cat", excluding "finger" 27 | - third from left : masking "cat head" 28 | - right : color corrected with [color-matcher](https://github.com/hahnec/color-matcher) (see stage 3.5) 29 | - Multiple targets can also be specified. (e.g. cat, dog, boy, girl) 30 |
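For reference, prompt-based masking of this kind can be sketched with the Hugging Face port of clipseg. This is an illustrative example, not the extension's actual stage 1 code; the model id, file paths, and threshold are assumptions:

```python
# Minimal clipseg masking sketch (illustrative; the extension's stage 1
# implementation may differ). Model id, paths, and threshold are assumptions.
import torch
from PIL import Image
from transformers import CLIPSegProcessor, CLIPSegForImageSegmentation

processor = CLIPSegProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

frame = Image.open("video_frame/00001.png").convert("RGB")  # hypothetical frame path
inputs = processor(text=["cat head"], images=[frame], return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits              # low-resolution heatmap for the prompt

prob = torch.sigmoid(logits).squeeze().numpy()
mask = ((prob > 0.4) * 255).astype("uint8")      # threshold into a binary mask
Image.fromarray(mask).resize(frame.size).save("video_mask/00001.png")
```

An "exclude" prompt, as in the sample, amounts to generating a second heatmap and subtracting it from the first before thresholding.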
31 | 32 | #### sample 2 blend background 33 | - person : masterpiece, best quality, masterpiece, 1girl, masterpiece, best quality,anime screencap, anime style 34 | - background : cyberpunk, factory, room ,anime screencap, anime style 35 | - It is also possible to blend with your favorite videos. 36 |
37 | 38 | #### sample 3 auto tagging 39 | - left : original 40 | - center : the same prompts applied to all keyframes 41 | - right : auto tagging by deepdanbooru applied to all keyframes 42 | - This function improves the reproduction of detailed changes in facial expressions, hand gestures, etc. 43 | In the sample video, the "closed_eyes" and "hands_on_own_face" tags have been added to better represent eye blinks and hands brought in front of the face. 44 |
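Under the hood, the tagging relies on the webui's built-in DeepDanbooru interrogator. A minimal sketch, modeled on `interrogate_deepdanbooru` in `scripts/custom_script.py` (it therefore only runs inside the webui environment):

```python
# Tag one keyframe with the webui's DeepDanbooru model (mirrors
# interrogate_deepdanbooru in scripts/custom_script.py).
from PIL import Image
from modules import deepbooru   # available only inside the AUTOMATIC1111 webui

deepbooru.model.start()
try:
    prompt = deepbooru.model.tag_multi(Image.open("video_key/00001.png"))
    print(prompt)   # comma-separated tags, with ":score" ranks if enabled in settings
finally:
    deepbooru.model.stop()
```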
45 | 46 | #### sample 4 auto tagging (apply lora dynamically) 47 | - left : auto tagging by deepdanbooru applied to all keyframes 48 | - right : auto tagging by deepdanbooru applied to all keyframes + "anyahehface" lora applied dynamically 49 | - Added a function to dynamically apply TI, hypernet, Lora, and additional prompts according to automatically attached tags. 50 | In the sample video, when the "smile" tag is detected, the lora and its trigger keywords are added with a weight based on the strength of the "smile" tag. 51 | Also, since automatically added tags are sometimes incorrect, unnecessary tags are listed in a blacklist. 52 | [Here](sample/) are the actual configuration files used. Place them in the "Project directory" to use them. 53 |
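The weight formulas in `add_token.txt` (e.g. `"0.2 + score*0.3"`) are evaluated by the small arithmetic parser bundled as `calculator.py`: the tag's confidence is substituted for `score`, and the rounded result becomes the token weight. A short example (the 0.84 confidence value is made up):

```python
# How a formula from sample/add_token.txt is evaluated (mirrors add_token()
# in scripts/custom_script.py).
from extensions.ebsynth_utility.calculator import CalcParser

parser = CalcParser()
score = 0.84                                       # e.g. DeepDanbooru confidence for "smile"
formula = "0.2 + score*0.3".replace("score", str(score))
weight = round(parser.parse(formula), 3)
print(weight)                                      # 0.452 -> emitted as "(token:0.452)"
```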
54 | 55 |
56 | 57 | ## Installation 58 | - Install [ffmpeg](https://ffmpeg.org/) for your operating system 59 | (https://www.geeksforgeeks.org/how-to-install-ffmpeg-on-windows/) 60 | - Install [Ebsynth](https://ebsynth.com/) 61 | - Use the Extensions tab of the webui to [Install from URL] 62 | 63 |
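Since frame extraction and export shell out to ffmpeg, it is worth confirming it is visible to the webui process. A minimal check (not part of the extension):

```python
# Sanity check that ffmpeg is on PATH (not part of the extension).
import shutil
print(shutil.which("ffmpeg") or "ffmpeg NOT found - install it and restart the webui")
```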
64 |
65 | 66 | ## Usage 67 | - Go to the [Ebsynth Utility] tab. 68 | - Create an empty directory somewhere, and fill in the "Project directory" field. 69 | - Place the video you want to edit somewhere, and fill in the "Original Movie Path" field. 70 | Use short videos of a few seconds at first. 71 | - Select stage 1 and Generate. 72 | - Execute the stages in order from 1 to 7. 73 | Progress is not reflected in the webui during processing, so please check the console. 74 | When "completed." appears in the webui, the stage is done. 75 | (The current latest webui seems to raise an error if you do not drop an image onto the main screen of img2img. 76 | Please drop an image there; it does not affect the result.) 77 |
79 |
80 | 81 | ## Note 1 82 | For reference, here's what I did when I edited a 1280x720 30fps 15sec video based on 83 | #### Stage 1 84 | There is nothing to configure. 85 | All frames of the video and mask images for all frames are generated. 86 | 87 | #### Stage 2 88 | In the implementation of this extension, the keyframe interval is chosen to be shorter where there is a lot of motion and longer where there is little motion. 89 | If the animation breaks up, increase the keyframe, if it flickers, decrease the keyframe. 90 | First, generate one time with the default settings and go straight ahead without worrying about the result. 91 | 92 | 93 | #### Stage 3 94 | Select one of the keyframes, throw it to img2img, and run [Interrogate DeepBooru]. 95 | Delete unwanted words such as blur from the displayed prompt. 96 | Fill in the rest of the settings as you would normally do for image generation. 97 | 98 | Here is the settings I used. 99 | - Sampling method : Euler a 100 | - Sampling Steps : 50 101 | - Width : 960 102 | - Height : 512 103 | - CFG Scale : 20 104 | - Denoising strength : 0.2 105 | 106 | Here is the settings for extension. 107 | - Mask Mode(Override img2img Mask mode) : Normal 108 | - Img2Img Repeat Count (Loop Back) : 5 109 | - Add N to seed when repeating : 1 110 | - use Face Crop img2img : True 111 | - Face Detection Method : YuNet 112 | - Max Crop Size : 1024 113 | - Face Denoising Strength : 0.25 114 | - Face Area Magnification : 1.5 (The larger the number, the closer to the model's painting style, but the more likely it is to shift when merged with the body.) 115 | - Enable Face Prompt : False 116 | 117 | Trial and error in this process is the most time-consuming part. 118 | Monitor the destination folder and if you do not like results, interrupt and change the settings. 119 | [Prompt][Denoising strength] and [Face Denoising Strength] settings when using Face Crop img2img will greatly affect the result. 120 | For more information on Face Crop img2img, check [here](https://github.com/s9roll7/face_crop_img2img) 121 | 122 | If you have lots of memory to spare, increasing the width and height values while maintaining the aspect ratio may greatly improve results. 123 | 124 | This extension may help with the adjustment. 125 | https://github.com/s9roll7/img2img_for_all_method 126 | 127 |
128 | 129 | **The information above is from before controlnet existed. 130 | When controlnet is used together (especially multi-controlnet), 131 | even setting "Denoising strength" to a high value works well, and even 1.0 produces meaningful results. 132 | If "Denoising strength" is set to a high value, "Loop Back" can be set to 1.** 133 |
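For reference, the "Loop Back" settings boil down to feeding each keyframe through img2img several times while bumping the seed. A conceptual sketch, not the extension's exact code (`run_img2img` is a hypothetical helper standing in for one img2img pass):

```python
# Conceptual loopback sketch: "Img2Img Repeat Count (Loop Back)" and
# "Add N to seed when repeating". run_img2img is a hypothetical helper.
def loopback(p, run_img2img, repeat_count=5, inc_seed=1):
    result = None
    for _ in range(repeat_count):
        result = run_img2img(p)        # one img2img pass
        p.init_images = [result]       # feed the output back in
        p.seed += inc_seed             # vary the seed between passes
    return result
```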
134 | 135 | #### Stage 4 136 | Scale the img2img results up or down to exactly the same size as the original video. 137 | This process should only need to be done once. 138 | 139 | - Width : 1280 140 | - Height : 720 141 | - Upscaler 1 : R-ESRGAN 4x+ 142 | - Upscaler 2 : R-ESRGAN 4x+ Anime6B 143 | - Upscaler 2 visibility : 0.5 144 | - GFPGAN visibility : 1 145 | - CodeFormer visibility : 0 146 | - CodeFormer weight : 0 147 | 148 | #### Stage 5 149 | There is nothing to configure. 150 | A .ebs file will be generated. 151 | 152 | #### Stage 6 153 | Run the .ebs file. 154 | I usually leave the settings as they are, but you can adjust them in the .ebs file if needed. 155 | 156 | #### Stage 7 157 | Finally, output the video. 158 | In my case, the entire process from stage 1 to 7 took about 30 minutes. 159 | 160 | - Crossfade blend rate : 1.0 161 | - Export type : mp4 162 | 163 |
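Regarding "Crossfade blend rate": where the outputs generated from adjacent keyframes overlap, stage 7 cross-fades between them. Roughly (an illustrative sketch, not the extension's stage 7 code):

```python
# Illustrative crossfade between two overlapping frame sequences.
import numpy as np

def crossfade(frames_a, frames_b, blend_rate=1.0):
    n = max(len(frames_a) - 1, 1)
    out = []
    for i, (a, b) in enumerate(zip(frames_a, frames_b)):
        w = (i / n) * blend_rate                   # ramp from 0 toward blend_rate
        blended = a.astype(np.float32) * (1.0 - w) + b.astype(np.float32) * w
        out.append(blended.astype(np.uint8))
    return out
```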
165 |
166 | 167 | ## Note 2 : How to use multi-controlnet together 168 | #### in webui setting 169 | ![controlnet_setting](imgs/controlnet_setting.png "controlnet_setting") 170 |
171 | #### In controlnet settings in img2img tab(for controlnet 0) 172 | ![controlnet_0](imgs/controlnet_0.png "controlnet_0") 173 |
174 | #### In controlnet settings in img2img tab(for controlnet 1) 175 | ![controlnet_1](imgs/controlnet_1.png "controlnet_1") 176 |
177 | #### In ebsynth_utility settings in img2img tab 178 | **Warning : "Weight" in the controlnet settings is overridden by the following values** 179 | ![controlnet_option_in_ebsynthutil](imgs/controlnet_option_in_ebsynthutil.png "controlnet_option_in_ebsynthutil") 180 | 181 |
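The override happens because the script writes the ControlNet parameters onto the processing object just before generation; the relevant logic in `scripts/custom_script.py`:

```python
# From scripts/custom_script.py: the sliders in the ebsynth_utility panel are
# applied here, which is why the weight in the ControlNet panel is ignored.
def process_images(self, p, input_img, controlnet_weight, input_img_is_preprocessed):
    p.control_net_input_image = input_img
    p.control_net_weight = controlnet_weight
    if input_img_is_preprocessed:
        p.control_net_module = "none"   # skip preprocessing for ready-made inputs
    return process_images(p)
```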
182 |
183 | 184 | ## Note 3 : How to use clipseg 185 | ![clipseg](imgs/clipseg.png "How to use clipseg") 186 | 187 | 188 | -------------------------------------------------------------------------------- /calculator.py: -------------------------------------------------------------------------------- 1 | # https://www.mycompiler.io/view/3TFZagC 2 | 3 | class ParseError(Exception): 4 | def __init__(self, pos, msg, *args): 5 | self.pos = pos 6 | self.msg = msg 7 | self.args = args 8 | 9 | def __str__(self): 10 | return '%s at position %s' % (self.msg % self.args, self.pos) 11 | 12 | class Parser: 13 | def __init__(self): 14 | self.cache = {} 15 | 16 | def parse(self, text): 17 | self.text = text 18 | self.pos = -1 19 | self.len = len(text) - 1 20 | rv = self.start() 21 | self.assert_end() 22 | return rv 23 | 24 | def assert_end(self): 25 | if self.pos < self.len: 26 | raise ParseError( 27 | self.pos + 1, 28 | 'Expected end of string but got %s', 29 | self.text[self.pos + 1] 30 | ) 31 | 32 | def eat_whitespace(self): 33 | while self.pos < self.len and self.text[self.pos + 1] in " \f\v\r\t\n": 34 | self.pos += 1 35 | 36 | def split_char_ranges(self, chars): 37 | try: 38 | return self.cache[chars] 39 | except KeyError: 40 | pass 41 | 42 | rv = [] 43 | index = 0 44 | length = len(chars) 45 | 46 | while index < length: 47 | if index + 2 < length and chars[index + 1] == '-': 48 | if chars[index] >= chars[index + 2]: 49 | raise ValueError('Bad character range') 50 | 51 | rv.append(chars[index:index + 3]) 52 | index += 3 53 | else: 54 | rv.append(chars[index]) 55 | index += 1 56 | 57 | self.cache[chars] = rv 58 | return rv 59 | 60 | def char(self, chars=None): 61 | if self.pos >= self.len: 62 | raise ParseError( 63 | self.pos + 1, 64 | 'Expected %s but got end of string', 65 | 'character' if chars is None else '[%s]' % chars 66 | ) 67 | 68 | next_char = self.text[self.pos + 1] 69 | if chars == None: 70 | self.pos += 1 71 | return next_char 72 | 73 | for char_range in self.split_char_ranges(chars): 74 | if len(char_range) == 1: 75 | if next_char == char_range: 76 | self.pos += 1 77 | return next_char 78 | elif char_range[0] <= next_char <= char_range[2]: 79 | self.pos += 1 80 | return next_char 81 | 82 | raise ParseError( 83 | self.pos + 1, 84 | 'Expected %s but got %s', 85 | 'character' if chars is None else '[%s]' % chars, 86 | next_char 87 | ) 88 | 89 | def keyword(self, *keywords): 90 | self.eat_whitespace() 91 | if self.pos >= self.len: 92 | raise ParseError( 93 | self.pos + 1, 94 | 'Expected %s but got end of string', 95 | ','.join(keywords) 96 | ) 97 | 98 | for keyword in keywords: 99 | low = self.pos + 1 100 | high = low + len(keyword) 101 | 102 | if self.text[low:high] == keyword: 103 | self.pos += len(keyword) 104 | self.eat_whitespace() 105 | return keyword 106 | 107 | raise ParseError( 108 | self.pos + 1, 109 | 'Expected %s but got %s', 110 | ','.join(keywords), 111 | self.text[self.pos + 1], 112 | ) 113 | 114 | def match(self, *rules): 115 | self.eat_whitespace() 116 | last_error_pos = -1 117 | last_exception = None 118 | last_error_rules = [] 119 | 120 | for rule in rules: 121 | initial_pos = self.pos 122 | try: 123 | rv = getattr(self, rule)() 124 | self.eat_whitespace() 125 | return rv 126 | except ParseError as e: 127 | self.pos = initial_pos 128 | 129 | if e.pos > last_error_pos: 130 | last_exception = e 131 | last_error_pos = e.pos 132 | last_error_rules.clear() 133 | last_error_rules.append(rule) 134 | elif e.pos == last_error_pos: 135 | last_error_rules.append(rule) 136 | 137 | if 
len(last_error_rules) == 1: 138 | raise last_exception 139 | else: 140 | raise ParseError( 141 | last_error_pos, 142 | 'Expected %s but got %s', 143 | ','.join(last_error_rules), 144 | self.text[last_error_pos] 145 | ) 146 | 147 | def maybe_char(self, chars=None): 148 | try: 149 | return self.char(chars) 150 | except ParseError: 151 | return None 152 | 153 | def maybe_match(self, *rules): 154 | try: 155 | return self.match(*rules) 156 | except ParseError: 157 | return None 158 | 159 | def maybe_keyword(self, *keywords): 160 | try: 161 | return self.keyword(*keywords) 162 | except ParseError: 163 | return None 164 | 165 | class CalcParser(Parser): 166 | def start(self): 167 | return self.expression() 168 | 169 | def expression(self): 170 | rv = self.match('term') 171 | while True: 172 | op = self.maybe_keyword('+', '-') 173 | if op is None: 174 | break 175 | 176 | term = self.match('term') 177 | if op == '+': 178 | rv += term 179 | else: 180 | rv -= term 181 | 182 | return rv 183 | 184 | def term(self): 185 | rv = self.match('factor') 186 | while True: 187 | op = self.maybe_keyword('*', '/') 188 | if op is None: 189 | break 190 | 191 | term = self.match('factor') 192 | if op == '*': 193 | rv *= term 194 | else: 195 | rv /= term 196 | 197 | return rv 198 | 199 | def factor(self): 200 | if self.maybe_keyword('('): 201 | rv = self.match('expression') 202 | self.keyword(')') 203 | 204 | return rv 205 | 206 | return self.match('number') 207 | 208 | def number(self): 209 | chars = [] 210 | 211 | sign = self.maybe_keyword('+', '-') 212 | if sign is not None: 213 | chars.append(sign) 214 | 215 | chars.append(self.char('0-9')) 216 | 217 | while True: 218 | char = self.maybe_char('0-9') 219 | if char is None: 220 | break 221 | 222 | chars.append(char) 223 | 224 | if self.maybe_char('.'): 225 | chars.append('.') 226 | chars.append(self.char('0-9')) 227 | 228 | while True: 229 | char = self.maybe_char('0-9') 230 | if char is None: 231 | break 232 | 233 | chars.append(char) 234 | 235 | rv = float(''.join(chars)) 236 | return rv 237 | 238 | -------------------------------------------------------------------------------- /ebsynth_utility.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | from modules.ui import plaintext_to_html 4 | 5 | import cv2 6 | import glob 7 | from PIL import Image 8 | 9 | from extensions.ebsynth_utility.stage1 import ebsynth_utility_stage1,ebsynth_utility_stage1_invert 10 | from extensions.ebsynth_utility.stage2 import ebsynth_utility_stage2 11 | from extensions.ebsynth_utility.stage5 import ebsynth_utility_stage5 12 | from extensions.ebsynth_utility.stage7 import ebsynth_utility_stage7 13 | from extensions.ebsynth_utility.stage8 import ebsynth_utility_stage8 14 | from extensions.ebsynth_utility.stage3_5 import ebsynth_utility_stage3_5 15 | 16 | 17 | def x_ceiling(value, step): 18 | return -(-value // step) * step 19 | 20 | def dump_dict(string, d:dict): 21 | for key in d.keys(): 22 | string += ( key + " : " + str(d[key]) + "\n") 23 | return string 24 | 25 | class debug_string: 26 | txt = "" 27 | def print(self, comment): 28 | print(comment) 29 | self.txt += comment + '\n' 30 | def to_string(self): 31 | return self.txt 32 | 33 | def ebsynth_utility_process(stage_index: int, project_dir:str, original_movie_path:str, frame_width:int, frame_height:int, st1_masking_method_index:int, st1_mask_threshold:float, tb_use_fast_mode:bool, tb_use_jit:bool, clipseg_mask_prompt:str, clipseg_exclude_prompt:str, clipseg_mask_threshold:int, 
clipseg_mask_blur_size:int, clipseg_mask_blur_size2:int, key_min_gap:int, key_max_gap:int, key_th:float, key_add_last_frame:bool, color_matcher_method:str, st3_5_use_mask:bool, st3_5_use_mask_ref:bool, st3_5_use_mask_org:bool, color_matcher_ref_type:int, color_matcher_ref_image:Image, blend_rate:float, export_type:str, bg_src:str, bg_type:str, mask_blur_size:int, mask_threshold:float, fg_transparency:float, mask_mode:str): 34 | args = locals() 35 | info = "" 36 | info = dump_dict(info, args) 37 | dbg = debug_string() 38 | 39 | 40 | def process_end(dbg, info): 41 | return plaintext_to_html(dbg.to_string()), plaintext_to_html(info) 42 | 43 | 44 | if not os.path.isdir(project_dir): 45 | dbg.print("{0} project_dir not found".format(project_dir)) 46 | return process_end( dbg, info ) 47 | 48 | if not os.path.isfile(original_movie_path): 49 | dbg.print("{0} original_movie_path not found".format(original_movie_path)) 50 | return process_end( dbg, info ) 51 | 52 | is_invert_mask = False 53 | if mask_mode == "Invert": 54 | is_invert_mask = True 55 | 56 | frame_path = os.path.join(project_dir , "video_frame") 57 | frame_mask_path = os.path.join(project_dir, "video_mask") 58 | 59 | if is_invert_mask: 60 | inv_path = os.path.join(project_dir, "inv") 61 | os.makedirs(inv_path, exist_ok=True) 62 | 63 | org_key_path = os.path.join(inv_path, "video_key") 64 | img2img_key_path = os.path.join(inv_path, "img2img_key") 65 | img2img_upscale_key_path = os.path.join(inv_path, "img2img_upscale_key") 66 | else: 67 | org_key_path = os.path.join(project_dir, "video_key") 68 | img2img_key_path = os.path.join(project_dir, "img2img_key") 69 | img2img_upscale_key_path = os.path.join(project_dir, "img2img_upscale_key") 70 | 71 | if mask_mode == "None": 72 | frame_mask_path = "" 73 | 74 | 75 | project_args = [project_dir, original_movie_path, frame_path, frame_mask_path, org_key_path, img2img_key_path, img2img_upscale_key_path] 76 | 77 | 78 | if stage_index == 0: 79 | ebsynth_utility_stage1(dbg, project_args, frame_width, frame_height, st1_masking_method_index, st1_mask_threshold, tb_use_fast_mode, tb_use_jit, clipseg_mask_prompt, clipseg_exclude_prompt, clipseg_mask_threshold, clipseg_mask_blur_size, clipseg_mask_blur_size2, is_invert_mask) 80 | if is_invert_mask: 81 | inv_mask_path = os.path.join(inv_path, "inv_video_mask") 82 | ebsynth_utility_stage1_invert(dbg, frame_mask_path, inv_mask_path) 83 | 84 | elif stage_index == 1: 85 | ebsynth_utility_stage2(dbg, project_args, key_min_gap, key_max_gap, key_th, key_add_last_frame, is_invert_mask) 86 | elif stage_index == 2: 87 | 88 | sample_image = glob.glob( os.path.join(frame_path , "*.png" ) )[0] 89 | img_height, img_width, _ = cv2.imread(sample_image).shape 90 | if img_width < img_height: 91 | re_w = 512 92 | re_h = int(x_ceiling( (512 / img_width) * img_height , 64)) 93 | else: 94 | re_w = int(x_ceiling( (512 / img_height) * img_width , 64)) 95 | re_h = 512 96 | img_width = re_w 97 | img_height = re_h 98 | 99 | dbg.print("stage 3") 100 | dbg.print("") 101 | dbg.print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!") 102 | dbg.print("1. Go to img2img tab") 103 | dbg.print("2. Select [ebsynth utility] in the script combo box") 104 | dbg.print("3. Fill in the \"Project directory\" field with [" + project_dir + "]" ) 105 | dbg.print("4. Select in the \"Mask Mode(Override img2img Mask mode)\" field with [" + ("Invert" if is_invert_mask else "Normal") + "]" ) 106 | dbg.print("5. I recommend to fill in the \"Width\" field with [" + str(img_width) + "]" ) 107 | dbg.print("6. 
I recommend to fill in the \"Height\" field with [" + str(img_height) + "]" ) 108 | dbg.print("7. I recommend to fill in the \"Denoising strength\" field with lower than 0.35" ) 109 | dbg.print(" (When using controlnet together, you can put in large values (even 1.0 is possible).)") 110 | dbg.print("8. Fill in the remaining configuration fields of img2img. No image and mask settings are required.") 111 | dbg.print("9. Drop any image onto the img2img main screen. This is necessary to avoid errors, but does not affect the results of img2img.") 112 | dbg.print("10. Generate") 113 | dbg.print("(Images are output to [" + img2img_key_path + "])") 114 | dbg.print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!") 115 | return process_end( dbg, "" ) 116 | 117 | elif stage_index == 3: 118 | ebsynth_utility_stage3_5(dbg, project_args, color_matcher_method, st3_5_use_mask, st3_5_use_mask_ref, st3_5_use_mask_org, color_matcher_ref_type, color_matcher_ref_image) 119 | 120 | elif stage_index == 4: 121 | sample_image = glob.glob( os.path.join(frame_path , "*.png" ) )[0] 122 | img_height, img_width, _ = cv2.imread(sample_image).shape 123 | 124 | sample_img2img_key = glob.glob( os.path.join(img2img_key_path , "*.png" ) )[0] 125 | img_height_key, img_width_key, _ = cv2.imread(sample_img2img_key).shape 126 | 127 | if is_invert_mask: 128 | project_dir = inv_path 129 | 130 | dbg.print("stage 4") 131 | dbg.print("") 132 | 133 | if img_height == img_height_key and img_width == img_width_key: 134 | dbg.print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!") 135 | dbg.print("!! The size of frame and img2img_key matched.") 136 | dbg.print("!! You can skip this stage.") 137 | 138 | dbg.print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!") 139 | dbg.print("0. Enable the following item") 140 | dbg.print("Settings ->") 141 | dbg.print(" Saving images/grids ->") 142 | dbg.print(" Use original name for output filename during batch process in extras tab") 143 | dbg.print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!") 144 | dbg.print("1. If \"img2img_upscale_key\" directory already exists in the %s, delete it manually before executing."%(project_dir)) 145 | dbg.print("2. Go to Extras tab") 146 | dbg.print("3. Go to Batch from Directory tab") 147 | dbg.print("4. Fill in the \"Input directory\" field with [" + img2img_key_path + "]" ) 148 | dbg.print("5. Fill in the \"Output directory\" field with [" + img2img_upscale_key_path + "]" ) 149 | dbg.print("6. Go to Scale to tab") 150 | dbg.print("7. Fill in the \"Width\" field with [" + str(img_width) + "]" ) 151 | dbg.print("8. Fill in the \"Height\" field with [" + str(img_height) + "]" ) 152 | dbg.print("9. Fill in the remaining configuration fields of Upscaler.") 153 | dbg.print("10. 
Generate") 154 | dbg.print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!") 155 | return process_end( dbg, "" ) 156 | elif stage_index == 5: 157 | ebsynth_utility_stage5(dbg, project_args, is_invert_mask) 158 | elif stage_index == 6: 159 | 160 | if is_invert_mask: 161 | project_dir = inv_path 162 | 163 | dbg.print("stage 6") 164 | dbg.print("") 165 | dbg.print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!") 166 | dbg.print("Running ebsynth.(on your self)") 167 | dbg.print("Open the generated .ebs under %s and press [Run All] button."%(project_dir)) 168 | dbg.print("If ""out-*"" directory already exists in the %s, delete it manually before executing."%(project_dir)) 169 | dbg.print("If multiple .ebs files are generated, run them all.") 170 | dbg.print("(I recommend associating the .ebs file with EbSynth.exe.)") 171 | dbg.print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!") 172 | return process_end( dbg, "" ) 173 | elif stage_index == 7: 174 | ebsynth_utility_stage7(dbg, project_args, blend_rate, export_type, is_invert_mask) 175 | elif stage_index == 8: 176 | if mask_mode != "Normal": 177 | dbg.print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!") 178 | dbg.print("Please reset [configuration]->[etc]->[Mask Mode] to Normal.") 179 | dbg.print("!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!") 180 | return process_end( dbg, "" ) 181 | ebsynth_utility_stage8(dbg, project_args, bg_src, bg_type, mask_blur_size, mask_threshold, fg_transparency, export_type) 182 | else: 183 | pass 184 | 185 | return process_end( dbg, info ) 186 | -------------------------------------------------------------------------------- /imgs/clipseg.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/clipseg.png -------------------------------------------------------------------------------- /imgs/controlnet_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/controlnet_0.png -------------------------------------------------------------------------------- /imgs/controlnet_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/controlnet_1.png -------------------------------------------------------------------------------- /imgs/controlnet_option_in_ebsynthutil.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/controlnet_option_in_ebsynthutil.png -------------------------------------------------------------------------------- /imgs/controlnet_setting.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/controlnet_setting.png -------------------------------------------------------------------------------- /imgs/sample1.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/sample1.mp4 -------------------------------------------------------------------------------- /imgs/sample2.mp4: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/sample2.mp4 -------------------------------------------------------------------------------- /imgs/sample3.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/sample3.mp4 -------------------------------------------------------------------------------- /imgs/sample4.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/sample4.mp4 -------------------------------------------------------------------------------- /imgs/sample5.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/sample5.mp4 -------------------------------------------------------------------------------- /imgs/sample6.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/sample6.mp4 -------------------------------------------------------------------------------- /imgs/sample_anyaheh.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/sample_anyaheh.mp4 -------------------------------------------------------------------------------- /imgs/sample_autotag.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/sample_autotag.mp4 -------------------------------------------------------------------------------- /imgs/sample_clipseg.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/s9roll7/ebsynth_utility/8ff9fbf221cbadd2ed7ebf30d7954184bab40275/imgs/sample_clipseg.mp4 -------------------------------------------------------------------------------- /install.py: -------------------------------------------------------------------------------- 1 | import launch 2 | import platform 3 | 4 | def update_transparent_background(): 5 | from importlib.metadata import version as meta_version 6 | from packaging import version 7 | v = meta_version("transparent-background") 8 | print("current transparent-background " + v) 9 | if version.parse(v) < version.parse('1.2.3'): 10 | launch.run_pip("install -U transparent-background", "update transparent-background version for Ebsynth Utility") 11 | 12 | # Check if user is running an M1/M2 device and, if so, install pyvirtualcam, which is required for updating the transparent_background package 13 | # Note that we have to directly install from source because the prebuilt PyPl wheel does not support ARM64 machines such as M1/M2 Macs 14 | if platform.system() == "Darwin" and platform.machine() == "arm64": 15 | if not launch.is_installed("pyvirtualcam"): 16 | launch.run_pip("install git+https://github.com/letmaik/pyvirtualcam", "requirements for Ebsynth Utility") 17 | 18 | if not launch.is_installed("transparent_background"): 19 | launch.run_pip("install 
transparent-background", "requirements for Ebsynth Utility") 20 | 21 | update_transparent_background() 22 | 23 | if not launch.is_installed("IPython"): 24 | launch.run_pip("install ipython", "requirements for Ebsynth Utility") 25 | 26 | if not launch.is_installed("seaborn"): 27 | launch.run_pip("install ""seaborn>=0.11.0""", "requirements for Ebsynth Utility") 28 | 29 | if not launch.is_installed("color_matcher"): 30 | launch.run_pip("install color-matcher", "requirements for Ebsynth Utility") 31 | 32 | -------------------------------------------------------------------------------- /sample/add_token.txt: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "target":"smile", 4 | "min_score":0.5, 5 | "token": ["lottalewds_v0", "1.2"], 6 | "type":"lora" 7 | }, 8 | { 9 | "target":"smile", 10 | "min_score":0.5, 11 | "token": ["anyahehface", "score*1.2"], 12 | "type":"normal" 13 | }, 14 | { 15 | "target":"smile", 16 | "min_score":0.5, 17 | "token": ["wicked smug", "score*1.2"], 18 | "type":"normal" 19 | }, 20 | { 21 | "target":"smile", 22 | "min_score":0.5, 23 | "token": ["half closed eyes", "0.2 + score*0.3"], 24 | "type":"normal" 25 | }, 26 | 27 | 28 | 29 | { 30 | "target":"test_token", 31 | "min_score":0.8, 32 | "token": ["lora_name_A", "0.5"], 33 | "type":"lora" 34 | }, 35 | { 36 | "target":"test_token", 37 | "min_score":0.5, 38 | "token": ["bbbb", "score - 0.1"], 39 | "type":"normal" 40 | }, 41 | { 42 | "target":"test_token2", 43 | "min_score":0.8, 44 | "token": ["hypernet_name_A", "score"], 45 | "type":"hypernet" 46 | }, 47 | { 48 | "target":"test_token3", 49 | "min_score":0.0, 50 | "token": ["dddd", "score"], 51 | "type":"normal" 52 | } 53 | ] 54 | 55 | -------------------------------------------------------------------------------- /sample/blacklist.txt: -------------------------------------------------------------------------------- 1 | motion_blur 2 | blurry 3 | realistic 4 | depth_of_field 5 | mountain 6 | tree 7 | water 8 | underwater 9 | tongue 10 | tongue_out 11 | -------------------------------------------------------------------------------- /scripts/custom_script.py: -------------------------------------------------------------------------------- 1 | import modules.scripts as scripts 2 | import gradio as gr 3 | import os 4 | import torch 5 | import random 6 | import time 7 | import pprint 8 | import shutil 9 | 10 | from modules.processing import process_images,Processed 11 | from modules.paths import models_path 12 | from modules.textual_inversion import autocrop 13 | import modules.images 14 | from modules import shared,deepbooru,masking 15 | import cv2 16 | import copy 17 | import numpy as np 18 | from PIL import Image,ImageOps 19 | import glob 20 | import requests 21 | import json 22 | import re 23 | from extensions.ebsynth_utility.calculator import CalcParser,ParseError 24 | 25 | def get_my_dir(): 26 | if os.path.isdir("extensions/ebsynth_utility"): 27 | return "extensions/ebsynth_utility" 28 | return scripts.basedir() 29 | 30 | def x_ceiling(value, step): 31 | return -(-value // step) * step 32 | 33 | def remove_pngs_in_dir(path): 34 | if not os.path.isdir(path): 35 | return 36 | pngs = glob.glob( os.path.join(path, "*.png") ) 37 | for png in pngs: 38 | os.remove(png) 39 | 40 | def resize_img(img, w, h): 41 | if img.shape[0] + img.shape[1] < h + w: 42 | interpolation = interpolation=cv2.INTER_CUBIC 43 | else: 44 | interpolation = interpolation=cv2.INTER_AREA 45 | 46 | return cv2.resize(img, (w, h), interpolation=interpolation) 47 
| 48 | def download_and_cache_models(dirname): 49 | download_url = 'https://github.com/zymk9/yolov5_anime/blob/8b50add22dbd8224904221be3173390f56046794/weights/yolov5s_anime.pt?raw=true' 50 | model_file_name = 'yolov5s_anime.pt' 51 | 52 | if not os.path.exists(dirname): 53 | os.makedirs(dirname) 54 | 55 | cache_file = os.path.join(dirname, model_file_name) 56 | if not os.path.exists(cache_file): 57 | print(f"downloading face detection model from '{download_url}' to '{cache_file}'") 58 | response = requests.get(download_url) 59 | with open(cache_file, "wb") as f: 60 | f.write(response.content) 61 | 62 | if os.path.exists(cache_file): 63 | return cache_file 64 | return None 65 | 66 | class Script(scripts.Script): 67 | anime_face_detector = None 68 | face_detector = None 69 | face_merge_mask_filename = "face_crop_img2img_mask.png" 70 | face_merge_mask_image = None 71 | prompts_dir = "" 72 | calc_parser = None 73 | is_invert_mask = False 74 | controlnet_weight = 0.5 75 | controlnet_weight_for_face = 0.5 76 | add_tag_replace_underscore = False 77 | 78 | 79 | # The title of the script. This is what will be displayed in the dropdown menu. 80 | def title(self): 81 | return "ebsynth utility" 82 | 83 | # Determines when the script should be shown in the dropdown menu via the 84 | # returned value. As an example: 85 | # is_img2img is True if the current tab is img2img, and False if it is txt2img. 86 | # Thus, return is_img2img to only show the script on the img2img tab. 87 | 88 | def show(self, is_img2img): 89 | return is_img2img 90 | 91 | # How the script's is displayed in the UI. See https://gradio.app/docs/#components 92 | # for the different UI components you can use and how to create them. 93 | # Most UI components can return a value, such as a boolean for a checkbox. 94 | # The returned values are passed to the run method as parameters. 95 | 96 | def ui(self, is_img2img): 97 | with gr.Column(variant='panel'): 98 | with gr.Column(): 99 | project_dir = gr.Textbox(label='Project directory', lines=1) 100 | generation_test = gr.Checkbox(False, label="Generation TEST!!(Ignore Project directory and use the image and mask specified in the main UI)") 101 | 102 | with gr.Accordion("Mask option"): 103 | mask_mode = gr.Dropdown(choices=["Normal","Invert","None","Don't Override"], value="Normal" ,label="Mask Mode(Override img2img Mask mode)") 104 | inpaint_area = gr.Dropdown(choices=["Whole picture","Only masked","Don't Override"], type = "index", value="Only masked" ,label="Inpaint Area(Override img2img Inpaint area)") 105 | use_depth = gr.Checkbox(True, label="Use Depth Map If exists in /video_key_depth") 106 | gr.HTML(value="
<p>\ 107 | See \ 108 | [here] for depth map.\ 109 | </p>
") 110 | 111 | with gr.Accordion("ControlNet option"): 112 | controlnet_weight = gr.Slider(minimum=0.0, maximum=2.0, step=0.01, value=0.5, label="Control Net Weight") 113 | controlnet_weight_for_face = gr.Slider(minimum=0.0, maximum=2.0, step=0.01, value=0.5, label="Control Net Weight For Face") 114 | use_preprocess_img = gr.Checkbox(True, label="Use Preprocess image If exists in /controlnet_preprocess") 115 | gr.HTML(value="
<p>\ 116 | Please enable the following settings to use controlnet from this script.<br>\ 117 | \ 118 | Settings->ControlNet->Allow other script to control this extension\ 119 | \ 120 | </p>
") 121 | 122 | with gr.Accordion("Loopback option"): 123 | img2img_repeat_count = gr.Slider(minimum=1, maximum=30, step=1, value=1, label="Img2Img Repeat Count (Loop Back)") 124 | inc_seed = gr.Slider(minimum=0, maximum=9999999, step=1, value=1, label="Add N to seed when repeating ") 125 | 126 | with gr.Accordion("Auto Tagging option"): 127 | auto_tag_mode = gr.Dropdown(choices=["None","DeepDanbooru","CLIP"], value="None" ,label="Auto Tagging") 128 | add_tag_to_head = gr.Checkbox(False, label="Add additional prompts to the head") 129 | add_tag_replace_underscore = gr.Checkbox(False, label="Replace '_' with ' '(Does not affect the function to add tokens using add_token.txt.)") 130 | gr.HTML(value="
<p>\ 131 | The results are stored in timestamp_prompts.txt.<br>\ 132 | If you want to use the same tagging results the next time you run img2img, rename the file to prompts.txt<br>\ 133 | Recommend enabling the following settings.<br>\ 134 | \ 135 | Settings->Interrogate Option->Interrogate: include ranks of model tags matches in results\ 136 | \ 137 | </p>
") 138 | 139 | with gr.Accordion("Face Crop option"): 140 | is_facecrop = gr.Checkbox(False, label="use Face Crop img2img") 141 | 142 | with gr.Row(): 143 | face_detection_method = gr.Dropdown(choices=["YuNet","Yolov5_anime"], value="YuNet" ,label="Face Detection Method") 144 | gr.HTML(value="
<p>\ 145 | If loading of the Yolov5_anime model fails, check\ 146 | [this] solution.\ 147 | </p>
") 148 | face_crop_resolution = gr.Slider(minimum=128, maximum=2048, step=1, value=512, label="Face Crop Resolution") 149 | max_crop_size = gr.Slider(minimum=0, maximum=2048, step=1, value=1024, label="Max Crop Size") 150 | face_denoising_strength = gr.Slider(minimum=0.00, maximum=1.00, step=0.01, value=0.5, label="Face Denoising Strength") 151 | face_area_magnification = gr.Slider(minimum=1.00, maximum=10.00, step=0.01, value=1.5, label="Face Area Magnification ") 152 | disable_facecrop_lpbk_last_time = gr.Checkbox(False, label="Disable at the last loopback time") 153 | 154 | with gr.Column(): 155 | enable_face_prompt = gr.Checkbox(False, label="Enable Face Prompt") 156 | face_prompt = gr.Textbox(label="Face Prompt", show_label=False, lines=2, 157 | placeholder="Prompt for Face", 158 | value = "face close up," 159 | ) 160 | 161 | return [project_dir, generation_test, mask_mode, inpaint_area, use_depth, img2img_repeat_count, inc_seed, auto_tag_mode, add_tag_to_head, add_tag_replace_underscore, is_facecrop, face_detection_method, face_crop_resolution, max_crop_size, face_denoising_strength, face_area_magnification, enable_face_prompt, face_prompt, controlnet_weight, controlnet_weight_for_face, disable_facecrop_lpbk_last_time,use_preprocess_img] 162 | 163 | 164 | def detect_face_from_img(self, img_array): 165 | if not self.face_detector: 166 | dnn_model_path = autocrop.download_and_cache_models(os.path.join(models_path, "opencv")) 167 | self.face_detector = cv2.FaceDetectorYN.create(dnn_model_path, "", (0, 0)) 168 | 169 | self.face_detector.setInputSize((img_array.shape[1], img_array.shape[0])) 170 | _, result = self.face_detector.detect(img_array) 171 | return result 172 | 173 | def detect_anime_face_from_img(self, img_array): 174 | import sys 175 | 176 | if not self.anime_face_detector: 177 | if 'models' in sys.modules: 178 | del sys.modules['models'] 179 | 180 | anime_model_path = download_and_cache_models(os.path.join(models_path, "yolov5_anime")) 181 | 182 | if not os.path.isfile(anime_model_path): 183 | print( "WARNING!! 
" + anime_model_path + " not found.") 184 | print( "use YuNet instead.") 185 | return self.detect_face_from_img(img_array) 186 | 187 | self.anime_face_detector = torch.hub.load('ultralytics/yolov5', 'custom', path=anime_model_path) 188 | 189 | # warmup 190 | test = np.zeros([512,512,3],dtype=np.uint8) 191 | _ = self.anime_face_detector(test) 192 | 193 | result = self.anime_face_detector(img_array) 194 | #models.common.Detections 195 | faces = [] 196 | for x_c, y_c, w, h, _, _ in result.xywh[0].tolist(): 197 | faces.append( [ x_c - w/2 , y_c - h/2, w, h ] ) 198 | 199 | return faces 200 | 201 | def detect_face(self, img, mask, face_detection_method, max_crop_size): 202 | img_array = np.array(img) 203 | 204 | # image without alpha 205 | if img_array.shape[2] == 4: 206 | img_array = img_array[:,:,:3] 207 | 208 | if mask is not None: 209 | if self.is_invert_mask: 210 | mask = ImageOps.invert(mask) 211 | mask_array = np.array(mask)/255 212 | if mask_array.ndim == 2: 213 | mask_array = mask_array[:, :, np.newaxis] 214 | 215 | if mask_array.shape[2] == 4: 216 | mask_array = mask_array[:,:,:3] 217 | 218 | img_array = mask_array * img_array 219 | img_array = img_array.astype(np.uint8) 220 | 221 | if face_detection_method == "YuNet": 222 | faces = self.detect_face_from_img(img_array) 223 | elif face_detection_method == "Yolov5_anime": 224 | faces = self.detect_anime_face_from_img(img_array) 225 | else: 226 | faces = self.detect_face_from_img(img_array) 227 | 228 | if faces is None or len(faces) == 0: 229 | return [] 230 | 231 | face_coords = [] 232 | for face in faces: 233 | x = int(face[0]) 234 | y = int(face[1]) 235 | w = int(face[2]) 236 | h = int(face[3]) 237 | if max(w,h) > max_crop_size: 238 | print("ignore big face") 239 | continue 240 | if w == 0 or h == 0: 241 | print("ignore w,h = 0 face") 242 | continue 243 | 244 | face_coords.append( [ x/img_array.shape[1],y/img_array.shape[0],w/img_array.shape[1],h/img_array.shape[0]] ) 245 | 246 | return face_coords 247 | 248 | def get_mask(self): 249 | def create_mask( output, x_rate, y_rate, k_size ): 250 | img = np.zeros((512, 512, 3)) 251 | img = cv2.ellipse(img, ((256, 256), (int(512 * x_rate), int(512 * y_rate)), 0), (255, 255, 255), thickness=-1) 252 | img = cv2.GaussianBlur(img, (k_size, k_size), 0) 253 | cv2.imwrite(output, img) 254 | 255 | if self.face_merge_mask_image is None: 256 | mask_file_path = os.path.join( get_my_dir() , self.face_merge_mask_filename) 257 | if not os.path.isfile(mask_file_path): 258 | create_mask( mask_file_path, 0.9, 0.9, 91) 259 | 260 | m = cv2.imread( mask_file_path )[:,:,0] 261 | m = m[:, :, np.newaxis] 262 | self.face_merge_mask_image = m / 255 263 | 264 | return self.face_merge_mask_image 265 | 266 | def face_img_crop(self, img, face_coords,face_area_magnification): 267 | img_array = np.array(img) 268 | face_imgs =[] 269 | new_coords = [] 270 | 271 | for face in face_coords: 272 | x = int(face[0] * img_array.shape[1]) 273 | y = int(face[1] * img_array.shape[0]) 274 | w = int(face[2] * img_array.shape[1]) 275 | h = int(face[3] * img_array.shape[0]) 276 | print([x,y,w,h]) 277 | 278 | cx = x + int(w/2) 279 | cy = y + int(h/2) 280 | 281 | x = cx - int(w*face_area_magnification / 2) 282 | x = x if x > 0 else 0 283 | w = cx + int(w*face_area_magnification / 2) - x 284 | w = w if x+w < img.width else img.width - x 285 | 286 | y = cy - int(h*face_area_magnification / 2) 287 | y = y if y > 0 else 0 288 | h = cy + int(h*face_area_magnification / 2) - y 289 | h = h if y+h < img.height else img.height - y 290 | 291 | 
print([x,y,w,h]) 292 | 293 | face_imgs.append( img_array[y: y+h, x: x+w] ) 294 | new_coords.append( [x,y,w,h] ) 295 | 296 | resized = [] 297 | for face_img in face_imgs: 298 | if face_img.shape[1] < face_img.shape[0]: 299 | re_w = self.face_crop_resolution 300 | re_h = int(x_ceiling( (self.face_crop_resolution / face_img.shape[1]) * face_img.shape[0] , 64)) 301 | else: 302 | re_w = int(x_ceiling( (self.face_crop_resolution / face_img.shape[0]) * face_img.shape[1] , 64)) 303 | re_h = self.face_crop_resolution 304 | 305 | face_img = resize_img(face_img, re_w, re_h) 306 | resized.append( Image.fromarray(face_img)) 307 | 308 | return resized, new_coords 309 | 310 | def face_crop_img2img(self, p, face_coords, face_denoising_strength, face_area_magnification, enable_face_prompt, face_prompt, controlnet_input_img, controlnet_input_face_imgs, preprocess_img_exist): 311 | 312 | def merge_face(img, face_img, face_coord, base_img_size, mask): 313 | x_rate = img.width / base_img_size[0] 314 | y_rate = img.height / base_img_size[1] 315 | 316 | img_array = np.array(img) 317 | x = int(face_coord[0] * x_rate) 318 | y = int(face_coord[1] * y_rate) 319 | w = int(face_coord[2] * x_rate) 320 | h = int(face_coord[3] * y_rate) 321 | 322 | face_array = np.array(face_img) 323 | face_array = resize_img(face_array, w, h) 324 | mask = resize_img(mask, w, h) 325 | if mask.ndim == 2: 326 | mask = mask[:, :, np.newaxis] 327 | 328 | bg = img_array[y: y+h, x: x+w] 329 | img_array[y: y+h, x: x+w] = mask * face_array + (1-mask)*bg 330 | 331 | return Image.fromarray(img_array) 332 | 333 | base_img = p.init_images[0] 334 | 335 | base_img_size = (base_img.width, base_img.height) 336 | 337 | if face_coords is None or len(face_coords) == 0: 338 | print("no face detected") 339 | return process_images(p) 340 | 341 | print(face_coords) 342 | face_imgs, new_coords = self.face_img_crop(base_img, face_coords, face_area_magnification) 343 | 344 | if not face_imgs: 345 | return process_images(p) 346 | 347 | face_p = copy.copy(p) 348 | 349 | ### img2img base img 350 | proc = self.process_images(p, controlnet_input_img, self.controlnet_weight, preprocess_img_exist) 351 | print(proc.seed) 352 | 353 | ### img2img for each face 354 | face_img2img_results = [] 355 | 356 | for face, coord, controlnet_input_face in zip(face_imgs, new_coords, controlnet_input_face_imgs): 357 | # cv2.imwrite("scripts/face.png", np.array(face)[:, :, ::-1]) 358 | face_p.init_images = [face] 359 | face_p.width = face.width 360 | face_p.height = face.height 361 | face_p.denoising_strength = face_denoising_strength 362 | 363 | if enable_face_prompt: 364 | face_p.prompt = face_prompt 365 | else: 366 | face_p.prompt = "close-up face ," + face_p.prompt 367 | 368 | if p.image_mask is not None: 369 | x,y,w,h = coord 370 | cropped_face_mask = Image.fromarray(np.array(p.image_mask)[y: y+h, x: x+w]) 371 | face_p.image_mask = modules.images.resize_image(0, cropped_face_mask, face.width, face.height) 372 | 373 | face_proc = self.process_images(face_p, controlnet_input_face, self.controlnet_weight_for_face, preprocess_img_exist) 374 | print(face_proc.seed) 375 | 376 | face_img2img_results.append((face_proc.images[0], coord)) 377 | 378 | ### merge faces 379 | bg = proc.images[0] 380 | mask = self.get_mask() 381 | 382 | for face_img, coord in face_img2img_results: 383 | bg = merge_face(bg, face_img, coord, base_img_size, mask) 384 | 385 | proc.images[0] = bg 386 | 387 | return proc 388 | 389 | def get_depth_map(self, mask, depth_path ,img_basename, is_invert_mask): 390 | 
depth_img_path = os.path.join( depth_path , img_basename ) 391 | 392 | depth = None 393 | 394 | if os.path.isfile( depth_img_path ): 395 | depth = Image.open(depth_img_path) 396 | else: 397 | # try 00001-0000.png 398 | os.path.splitext(img_basename)[0] 399 | depth_img_path = os.path.join( depth_path , os.path.splitext(img_basename)[0] + "-0000.png" ) 400 | if os.path.isfile( depth_img_path ): 401 | depth = Image.open(depth_img_path) 402 | 403 | if depth: 404 | if mask: 405 | mask_array = np.array(mask) 406 | depth_array = np.array(depth) 407 | 408 | if is_invert_mask == False: 409 | depth_array[mask_array[:,:,0] == 0] = 0 410 | else: 411 | depth_array[mask_array[:,:,0] != 0] = 0 412 | 413 | depth = Image.fromarray(depth_array) 414 | 415 | tmp_path = os.path.join( depth_path , "tmp" ) 416 | os.makedirs(tmp_path, exist_ok=True) 417 | tmp_path = os.path.join( tmp_path , img_basename ) 418 | depth_array = depth_array.astype(np.uint16) 419 | cv2.imwrite(tmp_path, depth_array) 420 | 421 | mask = depth 422 | 423 | return depth!=None, mask 424 | 425 | ### auto tagging 426 | debug_count = 0 427 | 428 | def get_masked_image(self, image, mask_image): 429 | 430 | if mask_image == None: 431 | return image.convert("RGB") 432 | 433 | mask = mask_image.convert('L') 434 | if self.is_invert_mask: 435 | mask = ImageOps.invert(mask) 436 | crop_region = masking.get_crop_region(np.array(mask), 0) 437 | # crop_region = masking.expand_crop_region(crop_region, self.width, self.height, mask.width, mask.height) 438 | # x1, y1, x2, y2 = crop_region 439 | image = image.crop(crop_region).convert("RGB") 440 | mask = mask.crop(crop_region) 441 | 442 | base_img = Image.new("RGB", image.size, (255, 190, 200)) 443 | 444 | image = Image.composite( image, base_img, mask ) 445 | 446 | # image.save("scripts/get_masked_image_test_"+ str(self.debug_count) + ".png") 447 | # self.debug_count += 1 448 | 449 | return image 450 | 451 | def interrogate_deepdanbooru(self, imgs, masks): 452 | prompts_dict = {} 453 | cause_err = False 454 | 455 | try: 456 | deepbooru.model.start() 457 | 458 | for img,mask in zip(imgs,masks): 459 | key = os.path.basename(img) 460 | print(key + " interrogate deepdanbooru") 461 | 462 | image = Image.open(img) 463 | mask_image = Image.open(mask) if mask else None 464 | image = self.get_masked_image(image, mask_image) 465 | 466 | prompt = deepbooru.model.tag_multi(image) 467 | 468 | prompts_dict[key] = prompt 469 | except Exception as e: 470 | import traceback 471 | traceback.print_exc() 472 | print(e) 473 | cause_err = True 474 | finally: 475 | deepbooru.model.stop() 476 | if cause_err: 477 | print("Exception occurred during auto-tagging(deepdanbooru)") 478 | return Processed() 479 | 480 | return prompts_dict 481 | 482 | 483 | def interrogate_clip(self, imgs, masks): 484 | from modules import devices, shared, lowvram, paths 485 | import importlib 486 | import models 487 | 488 | caption_list = [] 489 | prompts_dict = {} 490 | cause_err = False 491 | 492 | try: 493 | if shared.cmd_opts.lowvram or shared.cmd_opts.medvram: 494 | lowvram.send_everything_to_cpu() 495 | devices.torch_gc() 496 | 497 | with paths.Prioritize("BLIP"): 498 | importlib.reload(models) 499 | shared.interrogator.load() 500 | 501 | for img,mask in zip(imgs,masks): 502 | key = os.path.basename(img) 503 | print(key + " generate caption") 504 | 505 | image = Image.open(img) 506 | mask_image = Image.open(mask) if mask else None 507 | image = self.get_masked_image(image, mask_image) 508 | 509 | caption = shared.interrogator.generate_caption(image) 
510 | caption_list.append(caption) 511 | 512 | shared.interrogator.send_blip_to_ram() 513 | devices.torch_gc() 514 | 515 | for img,mask,caption in zip(imgs,masks,caption_list): 516 | key = os.path.basename(img) 517 | print(key + " interrogate clip") 518 | 519 | image = Image.open(img) 520 | mask_image = Image.open(mask) if mask else None 521 | image = self.get_masked_image(image, mask_image) 522 | 523 | clip_image = shared.interrogator.clip_preprocess(image).unsqueeze(0).type(shared.interrogator.dtype).to(devices.device_interrogate) 524 | 525 | res = "" 526 | 527 | with torch.no_grad(), devices.autocast(): 528 | image_features = shared.interrogator.clip_model.encode_image(clip_image).type(shared.interrogator.dtype) 529 | image_features /= image_features.norm(dim=-1, keepdim=True) 530 | 531 | for name, topn, items in shared.interrogator.categories(): 532 | matches = shared.interrogator.rank(image_features, items, top_count=topn) 533 | for match, score in matches: 534 | if shared.opts.interrogate_return_ranks: 535 | res += f", ({match}:{score/100:.3f})" 536 | else: 537 | res += ", " + match 538 | 539 | prompts_dict[key] = (caption + res) 540 | 541 | except Exception as e: 542 | import traceback 543 | traceback.print_exc() 544 | print(e) 545 | cause_err = True 546 | finally: 547 | shared.interrogator.unload() 548 | if cause_err: 549 | print("Exception occurred during auto-tagging(blip/clip)") 550 | return Processed() 551 | 552 | return prompts_dict 553 | 554 | 555 | def remove_reserved_token(self, token_list): 556 | reserved_list = ["pink_background","simple_background","pink","pink_theme"] 557 | 558 | result_list = [] 559 | 560 | head_token = token_list[0] 561 | 562 | if head_token[2] == "normal": 563 | head_token_str = head_token[0].replace('pink background', '') 564 | token_list[0] = (head_token_str, head_token[1], head_token[2]) 565 | 566 | for token in token_list: 567 | if token[0] in reserved_list: 568 | continue 569 | result_list.append(token) 570 | 571 | return result_list 572 | 573 | def remove_blacklisted_token(self, token_list): 574 | black_list_path = os.path.join(self.prompts_dir, "blacklist.txt") 575 | if not os.path.isfile(black_list_path): 576 | print(black_list_path + " not found.") 577 | return token_list 578 | 579 | with open(black_list_path) as f: 580 | black_list = [s.strip() for s in f.readlines()] 581 | 582 | result_list = [] 583 | 584 | for token in token_list: 585 | if token[0] in black_list: 586 | continue 587 | result_list.append(token) 588 | 589 | token_list = result_list 590 | 591 | return token_list 592 | 593 | def add_token(self, token_list): 594 | add_list_path = os.path.join(self.prompts_dir, "add_token.txt") 595 | if not os.path.isfile(add_list_path): 596 | print(add_list_path + " not found.") 597 | 598 | if self.add_tag_replace_underscore: 599 | token_list = [ (x[0].replace("_"," "), x[1], x[2]) for x in token_list ] 600 | 601 | return token_list 602 | 603 | if not self.calc_parser: 604 | self.calc_parser = CalcParser() 605 | 606 | with open(add_list_path) as f: 607 | add_list = json.load(f) 608 | ''' 609 | [ 610 | { 611 | "target":"test_token", 612 | "min_score":0.8, 613 | "token": ["lora_name_A", "0.5"], 614 | "type":"lora" 615 | }, 616 | { 617 | "target":"test_token", 618 | "min_score":0.5, 619 | "token": ["bbbb", "score - 0.1"], 620 | "type":"normal" 621 | }, 622 | { 623 | "target":"test_token2", 624 | "min_score":0.8, 625 | "token": ["hypernet_name_A", "score"], 626 | "type":"hypernet" 627 | }, 628 | { 629 | "target":"test_token3", 630 | 
"min_score":0.0, 631 | "token": ["dddd", "score"], 632 | "type":"normal" 633 | } 634 | ] 635 | ''' 636 | result_list = [] 637 | 638 | for token in token_list: 639 | for add_item in add_list: 640 | if token[0] == add_item["target"]: 641 | if token[1] > add_item["min_score"]: 642 | # hit 643 | formula = str(add_item["token"][1]) 644 | formula = formula.replace("score",str(token[1])) 645 | print('Input: %s' % str(add_item["token"][1])) 646 | 647 | try: 648 | score = self.calc_parser.parse(formula) 649 | score = round(score, 3) 650 | except (ParseError, ZeroDivisionError) as e: 651 | print('Input: %s' % str(add_item["token"][1])) 652 | print('Error: %s' % e) 653 | print("ignore this token") 654 | continue 655 | 656 | print("score = " + str(score)) 657 | result_list.append( ( add_item["token"][0], score, add_item["type"] ) ) 658 | 659 | if self.add_tag_replace_underscore: 660 | token_list = [ (x[0].replace("_"," "), x[1], x[2]) for x in token_list ] 661 | 662 | token_list = token_list + result_list 663 | 664 | return token_list 665 | 666 | def create_prompts_dict(self, imgs, masks, auto_tag_mode): 667 | prompts_dict = {} 668 | 669 | if auto_tag_mode == "DeepDanbooru": 670 | raw_dict = self.interrogate_deepdanbooru(imgs, masks) 671 | elif auto_tag_mode == "CLIP": 672 | raw_dict = self.interrogate_clip(imgs, masks) 673 | 674 | repatter = re.compile(r'\((.+)\:([0-9\.]+)\)') 675 | 676 | for key, value_str in raw_dict.items(): 677 | value_list = [x.strip() for x in value_str.split(',')] 678 | 679 | value = [] 680 | for v in value_list: 681 | m = repatter.fullmatch(v) 682 | if m: 683 | value.append((m.group(1), float(m.group(2)), "normal")) 684 | else: 685 | value.append((v, 1, "no_score")) 686 | 687 | # print(value) 688 | value = self.remove_reserved_token(value) 689 | # print(value) 690 | value = self.remove_blacklisted_token(value) 691 | # print(value) 692 | value = self.add_token(value) 693 | # print(value) 694 | 695 | def create_token_str(x): 696 | print(x) 697 | if x[2] == "no_score": 698 | return x[0] 699 | elif x[2] == "lora": 700 | return "" 701 | elif x[2] == "hypernet": 702 | return "" 703 | else: 704 | return "(" + x[0] + ":" + str(x[1]) + ")" 705 | 706 | value_list = [create_token_str(x) for x in value] 707 | value = ",".join(value_list) 708 | 709 | prompts_dict[key] = value 710 | 711 | return prompts_dict 712 | 713 | def load_prompts_dict(self, imgs, default_token): 714 | prompts_path = os.path.join(self.prompts_dir, "prompts.txt") 715 | if not os.path.isfile(prompts_path): 716 | print(prompts_path + " not found.") 717 | return {} 718 | 719 | prompts_dict = {} 720 | 721 | print(prompts_path + " found!!") 722 | print("skip auto tagging.") 723 | 724 | with open(prompts_path) as f: 725 | raw_dict = json.load(f) 726 | prev_value = default_token 727 | for img in imgs: 728 | key = os.path.basename(img) 729 | 730 | if key in raw_dict: 731 | prompts_dict[key] = raw_dict[key] 732 | prev_value = raw_dict[key] 733 | else: 734 | prompts_dict[key] = prev_value 735 | 736 | return prompts_dict 737 | 738 | def process_images(self, p, input_img, controlnet_weight, input_img_is_preprocessed): 739 | p.control_net_input_image = input_img 740 | p.control_net_weight = controlnet_weight 741 | if input_img_is_preprocessed: 742 | p.control_net_module = "none" 743 | return process_images(p) 744 | 745 | # This is where the additional processing is implemented. 
The parameters include 746 | # self, the model object "p" (a StableDiffusionProcessing class, see 747 | # processing.py), and the parameters returned by the ui method. 748 | # Custom functions can be defined here, and additional libraries can be imported 749 | # to be used in processing. The return value should be a Processed object, which is 750 | # what is returned by the process_images method. 751 | def run(self, p, project_dir, generation_test, mask_mode, inpaint_area, use_depth, img2img_repeat_count, inc_seed, auto_tag_mode, add_tag_to_head, add_tag_replace_underscore, is_facecrop, face_detection_method, face_crop_resolution, max_crop_size, face_denoising_strength, face_area_magnification, enable_face_prompt, face_prompt, controlnet_weight, controlnet_weight_for_face, disable_facecrop_lpbk_last_time, use_preprocess_img): 752 | args = locals() 753 | 754 | if generation_test: 755 | print("generation_test") 756 | test_proj_dir = os.path.join( get_my_dir() , "generation_test_proj") 757 | os.makedirs(test_proj_dir, exist_ok=True) 758 | test_video_key_path = os.path.join( test_proj_dir , "video_key") 759 | os.makedirs(test_video_key_path, exist_ok=True) 760 | test_video_mask_path = os.path.join( test_proj_dir , "video_mask") 761 | os.makedirs(test_video_mask_path, exist_ok=True) 762 | 763 | controlnet_input_path = os.path.join(test_proj_dir, "controlnet_input") 764 | if os.path.isdir(controlnet_input_path): 765 | shutil.rmtree(controlnet_input_path) 766 | 767 | remove_pngs_in_dir(test_video_key_path) 768 | remove_pngs_in_dir(test_video_mask_path) 769 | 770 | test_base_img = p.init_images[0] 771 | test_mask = p.image_mask 772 | 773 | if test_base_img: 774 | test_base_img.save( os.path.join( test_video_key_path , "00001.png") ) 775 | if test_mask: 776 | test_mask.save( os.path.join( test_video_mask_path , "00001.png") ) 777 | 778 | project_dir = test_proj_dir 779 | else: 780 | if not os.path.isdir(project_dir): 781 | print("project_dir not found") 782 | return Processed() 783 | 784 | self.controlnet_weight = controlnet_weight 785 | self.controlnet_weight_for_face = controlnet_weight_for_face 786 | 787 | self.add_tag_replace_underscore = add_tag_replace_underscore 788 | self.face_crop_resolution = face_crop_resolution 789 | 790 | if p.seed == -1: 791 | p.seed = int(random.randrange(4294967294)) 792 | 793 | if mask_mode == "Normal": 794 | p.inpainting_mask_invert = 0 795 | elif mask_mode == "Invert": 796 | p.inpainting_mask_invert = 1 797 | 798 | if inpaint_area in (0,1): #"Whole picture","Only masked" 799 | p.inpaint_full_res = inpaint_area 800 | 801 | is_invert_mask = False 802 | if mask_mode == "Invert": 803 | is_invert_mask = True 804 | 805 | inv_path = os.path.join(project_dir, "inv") 806 | if not os.path.isdir(inv_path): 807 | print("project_dir/inv not found") 808 | return Processed() 809 | 810 | org_key_path = os.path.join(inv_path, "video_key") 811 | img2img_key_path = os.path.join(inv_path, "img2img_key") 812 | depth_path = os.path.join(inv_path, "video_key_depth") 813 | 814 | preprocess_path = os.path.join(inv_path, "controlnet_preprocess") 815 | 816 | controlnet_input_path = os.path.join(inv_path, "controlnet_input") 817 | 818 | self.prompts_dir = inv_path 819 | self.is_invert_mask = True 820 | else: 821 | org_key_path = os.path.join(project_dir, "video_key") 822 | img2img_key_path = os.path.join(project_dir, "img2img_key") 823 | depth_path = os.path.join(project_dir, "video_key_depth") 824 | 825 | preprocess_path = os.path.join(project_dir, "controlnet_preprocess") 826 | 827 | 
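# (Directory layout note, inferred from the paths above: the project directory is
#  expected to contain video_key/, img2img_key/, video_key_depth/,
#  controlnet_preprocess/ and controlnet_input/; in Invert Mask Mode the same
#  tree is read from the inv/ subdirectory instead.)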
controlnet_input_path = os.path.join(project_dir, "controlnet_input") 828 | 829 | self.prompts_dir = project_dir 830 | self.is_invert_mask = False 831 | 832 | frame_mask_path = os.path.join(project_dir, "video_mask") 833 | 834 | if not use_depth: 835 | depth_path = None 836 | 837 | if not os.path.isdir(org_key_path): 838 | print(org_key_path + " not found") 839 | print("Generate key frames first." if is_invert_mask == False else \ 840 | "Generate key frames first.(with [Ebsynth Utility] Tab -> [configuration] -> [etc]-> [Mask Mode] = Invert setting)") 841 | return Processed() 842 | 843 | if not os.path.isdir(controlnet_input_path): 844 | print(controlnet_input_path + " not found") 845 | print("copy {0} -> {1}".format(org_key_path,controlnet_input_path)) 846 | 847 | os.makedirs(controlnet_input_path, exist_ok=True) 848 | 849 | imgs = glob.glob( os.path.join(org_key_path ,"*.png") ) 850 | for img in imgs: 851 | img_basename = os.path.basename(img) 852 | shutil.copy( img , os.path.join(controlnet_input_path, img_basename) ) 853 | 854 | remove_pngs_in_dir(img2img_key_path) 855 | os.makedirs(img2img_key_path, exist_ok=True) 856 | 857 | 858 | def get_mask_of_img(img): 859 | img_basename = os.path.basename(img) 860 | 861 | if mask_mode != "None": 862 | mask_path = os.path.join( frame_mask_path , img_basename ) 863 | if os.path.isfile( mask_path ): 864 | return mask_path 865 | return "" 866 | 867 | def get_pair_of_img(img, target_dir): 868 | img_basename = os.path.basename(img) 869 | 870 | pair_path = os.path.join( target_dir , img_basename ) 871 | if os.path.isfile( pair_path ): 872 | return pair_path 873 | print("!!! pair of "+ img + " not in " + target_dir) 874 | return "" 875 | 876 | def get_controlnet_input_img(img): 877 | pair_img = get_pair_of_img(img, controlnet_input_path) 878 | if not pair_img: 879 | pair_img = get_pair_of_img(img, org_key_path) 880 | return pair_img 881 | 882 | imgs = glob.glob( os.path.join(org_key_path ,"*.png") ) 883 | masks = [ get_mask_of_img(i) for i in imgs ] 884 | controlnet_input_imgs = [ get_controlnet_input_img(i) for i in imgs ] 885 | 886 | for mask in masks: 887 | m = cv2.imread(mask) if mask else None 888 | if m is not None: 889 | if m.max() == 0: 890 | print("{0} blank mask found".format(mask)) 891 | if m.ndim == 2: 892 | m[0,0] = 255 893 | else: 894 | m = m[:,:,:3] 895 | m[0,0,0:3] = 255 896 | cv2.imwrite(mask, m) 897 | 898 | ###################### 899 | # face crop 900 | face_coords_dict={} 901 | for img,mask in zip(imgs,masks): 902 | face_detected = False 903 | if is_facecrop: 904 | image = Image.open(img) 905 | mask_image = Image.open(mask) if mask else None 906 | face_coords = self.detect_face(image, mask_image, face_detection_method, max_crop_size) 907 | if face_coords is None or len(face_coords) == 0: 908 | print("no face detected") 909 | else: 910 | print("face detected") 911 | face_detected = True 912 | 913 | key = os.path.basename(img) 914 | face_coords_dict[key] = face_coords if face_detected else [] 915 | 916 | with open( os.path.join( project_dir if is_invert_mask == False else inv_path,"faces.txt" ), "w") as f: 917 | f.write(json.dumps(face_coords_dict,indent=4)) 918 | 919 | ###################### 920 | # prompts 921 | prompts_dict = self.load_prompts_dict(imgs, p.prompt) 922 | 923 | if not prompts_dict: 924 | if auto_tag_mode != "None": 925 | prompts_dict = self.create_prompts_dict(imgs, masks, auto_tag_mode) 926 | 927 | for key, value in prompts_dict.items(): 928 | prompts_dict[key] = (value + "," + p.prompt) if add_tag_to_head else 
(p.prompt + "," + value) 929 | 930 | else: 931 | for img in imgs: 932 | key = os.path.basename(img) 933 | prompts_dict[key] = p.prompt 934 | 935 | with open( os.path.join( project_dir if is_invert_mask == False else inv_path, time.strftime("%Y%m%d-%H%M%S_") + "prompts.txt" ), "w") as f: 936 | f.write(json.dumps(prompts_dict,indent=4)) 937 | 938 | 939 | ###################### 940 | # img2img 941 | for img, mask, controlnet_input_img, face_coords, prompts in zip(imgs, masks, controlnet_input_imgs, face_coords_dict.values(), prompts_dict.values()): 942 | 943 | # Generation cancelled. 944 | if shared.state.interrupted: 945 | print("Generation cancelled.") 946 | break 947 | 948 | image = Image.open(img) 949 | mask_image = Image.open(mask) if mask else None 950 | 951 | img_basename = os.path.basename(img) 952 | 953 | _p = copy.copy(p) 954 | 955 | _p.init_images=[image] 956 | _p.image_mask = mask_image 957 | _p.prompt = prompts 958 | resized_mask = None 959 | 960 | repeat_count = img2img_repeat_count 961 | 962 | if mask_mode != "None" or use_depth: 963 | if use_depth: 964 | depth_found, _p.image_mask = self.get_depth_map( mask_image, depth_path ,img_basename, is_invert_mask ) 965 | mask_image = _p.image_mask 966 | if depth_found: 967 | _p.inpainting_mask_invert = 0 968 | 969 | preprocess_img_exist = False 970 | controlnet_input_base_img = Image.open(controlnet_input_img) if controlnet_input_img else None 971 | 972 | if use_preprocess_img: 973 | preprocess_img = os.path.join(preprocess_path, img_basename) 974 | if os.path.isfile( preprocess_img ): 975 | controlnet_input_base_img = Image.open(preprocess_img) 976 | preprocess_img_exist = True 977 | 978 | if face_coords: 979 | controlnet_input_face_imgs, _ = self.face_img_crop(controlnet_input_base_img, face_coords, face_area_magnification) 980 | 981 | while repeat_count > 0: 982 | 983 | if disable_facecrop_lpbk_last_time: 984 | if img2img_repeat_count > 1: 985 | if repeat_count == 1: 986 | face_coords = None 987 | 988 | if face_coords: 989 | proc = self.face_crop_img2img(_p, face_coords, face_denoising_strength, face_area_magnification, enable_face_prompt, face_prompt, controlnet_input_base_img, controlnet_input_face_imgs, preprocess_img_exist) 990 | else: 991 | proc = self.process_images(_p, controlnet_input_base_img, self.controlnet_weight, preprocess_img_exist) 992 | print(proc.seed) 993 | 994 | repeat_count -= 1 995 | 996 | if repeat_count > 0: 997 | _p.init_images=[proc.images[0]] 998 | 999 | if mask_image is not None and resized_mask is None: 1000 | resized_mask = resize_img(np.array(mask_image) , proc.images[0].width, proc.images[0].height) 1001 | resized_mask = Image.fromarray(resized_mask) 1002 | _p.image_mask = resized_mask 1003 | _p.seed += inc_seed 1004 | 1005 | proc.images[0].save( os.path.join( img2img_key_path , img_basename ) ) 1006 | 1007 | with open( os.path.join( project_dir if is_invert_mask == False else inv_path,"param.txt" ), "w") as f: 1008 | f.write(pprint.pformat(proc.info)) 1009 | with open( os.path.join( project_dir if is_invert_mask == False else inv_path ,"args.txt" ), "w") as f: 1010 | f.write(pprint.pformat(args)) 1011 | 1012 | return proc 1013 | -------------------------------------------------------------------------------- /scripts/ui.py: -------------------------------------------------------------------------------- 1 | 2 | import gradio as gr 3 | 4 | from ebsynth_utility import ebsynth_utility_process 5 | from modules import script_callbacks 6 | from modules.call_queue import wrap_gradio_gpu_call 7 | 8 | def 
on_ui_tabs(): 9 | 10 | with gr.Blocks(analytics_enabled=False) as ebs_interface: 11 | with gr.Row(equal_height=False): 12 | with gr.Column(variant='panel'): 13 | 14 | with gr.Row(): 15 | with gr.Tabs(elem_id="ebs_settings"): 16 | with gr.TabItem('project setting', elem_id='ebs_project_setting'): 17 | project_dir = gr.Textbox(label='Project directory', lines=1) 18 | original_movie_path = gr.Textbox(label='Original Movie Path', lines=1) 19 | 20 | org_video = gr.Video(interactive=True, mirror_webcam=False) 21 | def fn_upload_org_video(video): 22 | return video 23 | org_video.upload(fn_upload_org_video, org_video, original_movie_path) 24 | gr.HTML(value="
<p style='margin-bottom: 0.7em'>\
25 | If you have trouble entering the video path manually, you can also use drag and drop. For large videos, please enter the path manually. \
26 | </p>
") 27 | 28 | with gr.TabItem('configuration', elem_id='ebs_configuration'): 29 | with gr.Tabs(elem_id="ebs_configuration_tab"): 30 | with gr.TabItem(label="stage 1",elem_id='ebs_configuration_tab1'): 31 | with gr.Row(): 32 | frame_width = gr.Number(value=-1, label="Frame Width", precision=0, interactive=True) 33 | frame_height = gr.Number(value=-1, label="Frame Height", precision=0, interactive=True) 34 | gr.HTML(value="
<p style='margin-bottom: 0.7em'>\
35 | -1 means that it is calculated automatically. If both are -1, the size will be the same as the source size. \
36 | </p>
") 37 | 38 | st1_masking_method_index = gr.Radio(label='Masking Method', choices=["transparent-background","clipseg","transparent-background AND clipseg"], value="transparent-background", type="index") 39 | 40 | with gr.Accordion(label="transparent-background options"): 41 | st1_mask_threshold = gr.Slider(minimum=0.0, maximum=1.0, step=0.01, label='Mask Threshold', value=0.0) 42 | 43 | # https://pypi.org/project/transparent-background/ 44 | gr.HTML(value="
<p style='margin-bottom: 0.7em'>\
45 | configuration for \
46 | <a href='https://pypi.org/project/transparent-background/'>[transparent-background]</a>\
47 | </p>
") 48 | tb_use_fast_mode = gr.Checkbox(label="Use Fast Mode(It will be faster, but the quality of the mask will be lower.)", value=False) 49 | tb_use_jit = gr.Checkbox(label="Use Jit", value=False) 50 | 51 | with gr.Accordion(label="clipseg options"): 52 | clipseg_mask_prompt = gr.Textbox(label='Mask Target (e.g., girl, cats)', lines=1) 53 | clipseg_exclude_prompt = gr.Textbox(label='Exclude Target (e.g., finger, book)', lines=1) 54 | clipseg_mask_threshold = gr.Slider(minimum=0.0, maximum=1.0, step=0.01, label='Mask Threshold', value=0.4) 55 | clipseg_mask_blur_size = gr.Slider(minimum=0, maximum=150, step=1, label='Mask Blur Kernel Size(MedianBlur)', value=11) 56 | clipseg_mask_blur_size2 = gr.Slider(minimum=0, maximum=150, step=1, label='Mask Blur Kernel Size(GaussianBlur)', value=11) 57 | 58 | with gr.TabItem(label="stage 2", elem_id='ebs_configuration_tab2'): 59 | key_min_gap = gr.Slider(minimum=0, maximum=500, step=1, label='Minimum keyframe gap', value=10) 60 | key_max_gap = gr.Slider(minimum=0, maximum=1000, step=1, label='Maximum keyframe gap', value=300) 61 | key_th = gr.Slider(minimum=0.0, maximum=100.0, step=0.1, label='Threshold of delta frame edge', value=8.5) 62 | key_add_last_frame = gr.Checkbox(label="Add last frame to keyframes", value=True) 63 | 64 | with gr.TabItem(label="stage 3.5", elem_id='ebs_configuration_tab3_5'): 65 | gr.HTML(value="
<p style='margin-bottom: 0.7em'>\
66 | <a href='https://github.com/hahnec/color-matcher'>[color-matcher]</a>\
67 | </p>
") 68 | 69 | color_matcher_method = gr.Radio(label='Color Transfer Method', choices=['default', 'hm', 'reinhard', 'mvgd', 'mkl', 'hm-mvgd-hm', 'hm-mkl-hm'], value="hm-mkl-hm", type="value") 70 | color_matcher_ref_type = gr.Radio(label='Color Matcher Ref Image Type', choices=['original video frame', 'first frame of img2img result'], value="original video frame", type="index") 71 | gr.HTML(value="
<p style='margin-bottom: 0.7em'>\
72 | If an image is specified below, it will be used with the highest priority.\
73 | </p>
") 74 | color_matcher_ref_image = gr.Image(label="Color Matcher Ref Image", source='upload', mirror_webcam=False, type='pil') 75 | st3_5_use_mask = gr.Checkbox(label="Apply mask to the result", value=True) 76 | st3_5_use_mask_ref = gr.Checkbox(label="Apply mask to the Ref Image", value=False) 77 | st3_5_use_mask_org = gr.Checkbox(label="Apply mask to original image", value=False) 78 | #st3_5_number_of_itr = gr.Slider(minimum=1, maximum=10, step=1, label='Number of iterations', value=1) 79 | 80 | with gr.TabItem(label="stage 7", elem_id='ebs_configuration_tab7'): 81 | blend_rate = gr.Slider(minimum=0.0, maximum=1.0, step=0.01, label='Crossfade blend rate', value=1.0) 82 | export_type = gr.Dropdown(choices=["mp4","webm","gif","rawvideo"], value="mp4" ,label="Export type") 83 | 84 | with gr.TabItem(label="stage 8", elem_id='ebs_configuration_tab8'): 85 | bg_src = gr.Textbox(label='Background source(mp4 or directory containing images)', lines=1) 86 | bg_type = gr.Dropdown(choices=["Fit video length","Loop"], value="Fit video length" ,label="Background type") 87 | mask_blur_size = gr.Slider(minimum=0, maximum=150, step=1, label='Mask Blur Kernel Size', value=5) 88 | mask_threshold = gr.Slider(minimum=0.0, maximum=1.0, step=0.01, label='Mask Threshold', value=0.0) 89 | #is_transparent = gr.Checkbox(label="Is Transparent", value=True, visible = False) 90 | fg_transparency = gr.Slider(minimum=0.0, maximum=1.0, step=0.01, label='Foreground Transparency', value=0.0) 91 | 92 | with gr.TabItem(label="etc", elem_id='ebs_configuration_tab_etc'): 93 | mask_mode = gr.Dropdown(choices=["Normal","Invert","None"], value="Normal" ,label="Mask Mode") 94 | with gr.TabItem('info', elem_id='ebs_info'): 95 | gr.HTML(value="
<p style='margin-bottom: 0.7em'>\
96 | The process of creating a video can be divided into the following stages.<br>\
97 | (Stages 3, 4, and 6 only show a guide and perform no actual processing.)<br><br>\
98 | stage 1<br>\
99 | Extract frames from the original video.<br>\
100 | Generate a mask image.<br><br>\
101 | stage 2<br>\
102 | Select keyframes to be given to ebsynth.<br><br>\
103 | stage 3<br>\
104 | img2img keyframes.<br><br>\
105 | stage 3.5<br>\
106 | (This is optional. Perform color correction on the img2img results to reduce flickering, or simply change the color tone of the generated result.)<br><br>\
107 | stage 4<br>\
108 | Scale the img2img results up or down to the size of the original video.<br><br>\
109 | stage 5<br>\
110 | Rename keyframes.<br>\
111 | Generate the .ebs file (ebsynth project file).<br><br>\
112 | stage 6<br>\
113 | Run ebsynth (by yourself).<br>\
114 | Open the generated .ebs file under the project directory and press the [Run All] button.<br>\
115 | If an \"out-*\" directory already exists in the project directory, delete it manually before executing.<br>\
116 | If multiple .ebs files are generated, run them all.<br><br>\
117 | stage 7<br>\
118 | Concatenate each frame while crossfading.<br>\
119 | Composite audio files extracted from the original video onto the concatenated video.<br><br>\
120 | stage 8<br>\
121 | This is an extra stage.<br>\
122 | You can put any image, images, or video you like in the background.<br>\
123 | You can specify it in this field -> [Ebsynth Utility]->[configuration]->[stage 8]->[Background source]<br>\
124 | If you have already created a background video in Invert Mask Mode ([Ebsynth Utility]->[configuration]->[etc]->[Mask Mode]),<br>\
125 | you can specify \"path_to_project_dir/inv/crossfade_tmp\".<br>\
126 | </p>
") 127 | 128 | with gr.Column(variant='panel'): 129 | with gr.Column(scale=1): 130 | with gr.Row(): 131 | stage_index = gr.Radio(label='Process Stage', choices=["stage 1","stage 2","stage 3","stage 3.5","stage 4","stage 5","stage 6","stage 7","stage 8"], value="stage 1", type="index", elem_id='ebs_stages') 132 | 133 | with gr.Row(): 134 | generate_btn = gr.Button('Generate', elem_id="ebs_generate_btn", variant='primary') 135 | 136 | with gr.Group(): 137 | debug_info = gr.HTML(elem_id="ebs_info_area", value=".") 138 | 139 | with gr.Column(scale=2): 140 | html_info = gr.HTML() 141 | 142 | ebs_args = dict( 143 | fn=wrap_gradio_gpu_call(ebsynth_utility_process), 144 | inputs=[ 145 | stage_index, 146 | 147 | project_dir, 148 | original_movie_path, 149 | 150 | frame_width, 151 | frame_height, 152 | st1_masking_method_index, 153 | st1_mask_threshold, 154 | tb_use_fast_mode, 155 | tb_use_jit, 156 | clipseg_mask_prompt, 157 | clipseg_exclude_prompt, 158 | clipseg_mask_threshold, 159 | clipseg_mask_blur_size, 160 | clipseg_mask_blur_size2, 161 | 162 | key_min_gap, 163 | key_max_gap, 164 | key_th, 165 | key_add_last_frame, 166 | 167 | color_matcher_method, 168 | st3_5_use_mask, 169 | st3_5_use_mask_ref, 170 | st3_5_use_mask_org, 171 | color_matcher_ref_type, 172 | color_matcher_ref_image, 173 | 174 | blend_rate, 175 | export_type, 176 | 177 | bg_src, 178 | bg_type, 179 | mask_blur_size, 180 | mask_threshold, 181 | fg_transparency, 182 | 183 | mask_mode, 184 | 185 | ], 186 | outputs=[ 187 | debug_info, 188 | html_info, 189 | ], 190 | show_progress=False, 191 | ) 192 | generate_btn.click(**ebs_args) 193 | 194 | return (ebs_interface, "Ebsynth Utility", "ebs_interface"), 195 | 196 | script_callbacks.on_ui_tabs(on_ui_tabs) 197 | -------------------------------------------------------------------------------- /stage1.py: -------------------------------------------------------------------------------- 1 | import os 2 | import subprocess 3 | import glob 4 | import cv2 5 | import re 6 | 7 | from transformers import AutoProcessor, CLIPSegForImageSegmentation 8 | from PIL import Image 9 | from transparent_background import Remover 10 | from tqdm.auto import tqdm 11 | import torch 12 | import numpy as np 13 | 14 | 15 | def resize_img(img, w, h): 16 | if img.shape[0] + img.shape[1] < h + w: 17 | interpolation = interpolation=cv2.INTER_CUBIC 18 | else: 19 | interpolation = interpolation=cv2.INTER_AREA 20 | 21 | return cv2.resize(img, (w, h), interpolation=interpolation) 22 | 23 | def resize_all_img(path, frame_width, frame_height): 24 | if not os.path.isdir(path): 25 | return 26 | 27 | pngs = glob.glob( os.path.join(path, "*.png") ) 28 | img = cv2.imread(pngs[0]) 29 | org_h,org_w = img.shape[0],img.shape[1] 30 | 31 | if frame_width == -1 and frame_height == -1: 32 | return 33 | elif frame_width == -1 and frame_height != -1: 34 | frame_width = int(frame_height * org_w / org_h) 35 | elif frame_width != -1 and frame_height == -1: 36 | frame_height = int(frame_width * org_h / org_w) 37 | else: 38 | pass 39 | print("({0},{1}) resize to ({2},{3})".format(org_w, org_h, frame_width, frame_height)) 40 | 41 | for png in pngs: 42 | img = cv2.imread(png) 43 | img = resize_img(img, frame_width, frame_height) 44 | cv2.imwrite(png, img) 45 | 46 | def remove_pngs_in_dir(path): 47 | if not os.path.isdir(path): 48 | return 49 | 50 | pngs = glob.glob( os.path.join(path, "*.png") ) 51 | for png in pngs: 52 | os.remove(png) 53 | 54 | def create_and_mask(mask_dir1, mask_dir2, output_dir): 55 | masks = glob.glob( 
os.path.join(mask_dir1, "*.png") ) 56 | 57 | for mask1 in masks: 58 | base_name = os.path.basename(mask1) 59 | print("combine {0}".format(base_name)) 60 | 61 | mask2 = os.path.join(mask_dir2, base_name) 62 | if not os.path.isfile(mask2): 63 | print("{0} not found!!! -> skip".format(mask2)) 64 | continue 65 | 66 | img_1 = cv2.imread(mask1) 67 | img_2 = cv2.imread(mask2) 68 | img_1 = np.minimum(img_1,img_2) 69 | 70 | out_path = os.path.join(output_dir, base_name) 71 | cv2.imwrite(out_path, img_1) 72 | 73 | 74 | def create_mask_clipseg(input_dir, output_dir, clipseg_mask_prompt, clipseg_exclude_prompt, clipseg_mask_threshold, mask_blur_size, mask_blur_size2): 75 | from modules import devices 76 | 77 | devices.torch_gc() 78 | 79 | device = devices.get_optimal_device_name() 80 | 81 | processor = AutoProcessor.from_pretrained("CIDAS/clipseg-rd64-refined") 82 | model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined") 83 | model.to(device) 84 | 85 | imgs = glob.glob( os.path.join(input_dir, "*.png") ) 86 | texts = [x.strip() for x in clipseg_mask_prompt.split(',')] 87 | exclude_texts = [x.strip() for x in clipseg_exclude_prompt.split(',')] if clipseg_exclude_prompt else None 88 | 89 | if exclude_texts: 90 | all_texts = texts + exclude_texts 91 | else: 92 | all_texts = texts 93 | 94 | 95 | for img_count,img in enumerate(imgs): 96 | image = Image.open(img) 97 | base_name = os.path.basename(img) 98 | 99 | inputs = processor(text=all_texts, images=[image] * len(all_texts), padding="max_length", return_tensors="pt") 100 | inputs = inputs.to(device) 101 | 102 | with torch.no_grad(), devices.autocast(): 103 | outputs = model(**inputs) 104 | 105 | if len(all_texts) == 1: 106 | preds = outputs.logits.unsqueeze(0) 107 | else: 108 | preds = outputs.logits 109 | 110 | mask_img = None 111 | 112 | for i in range(len(all_texts)): 113 | x = torch.sigmoid(preds[i]) 114 | x = x.to('cpu').detach().numpy() 115 | 116 | # x[x < clipseg_mask_threshold] = 0 117 | x = x > clipseg_mask_threshold 118 | 119 | if i < len(texts): 120 | if mask_img is None: 121 | mask_img = x 122 | else: 123 | mask_img = np.maximum(mask_img,x) 124 | else: 125 | mask_img[x > 0] = 0 126 | 127 | mask_img = mask_img*255 128 | mask_img = mask_img.astype(np.uint8) 129 | 130 | if mask_blur_size > 0: 131 | mask_blur_size = mask_blur_size//2 * 2 + 1 132 | mask_img = cv2.medianBlur(mask_img, mask_blur_size) 133 | 134 | if mask_blur_size2 > 0: 135 | mask_blur_size2 = mask_blur_size2//2 * 2 + 1 136 | mask_img = cv2.GaussianBlur(mask_img, (mask_blur_size2, mask_blur_size2), 0) 137 | 138 | mask_img = resize_img(mask_img, image.width, image.height) 139 | 140 | mask_img = cv2.cvtColor(mask_img, cv2.COLOR_GRAY2RGB) 141 | save_path = os.path.join(output_dir, base_name) 142 | cv2.imwrite(save_path, mask_img) 143 | 144 | print("{0} / {1}".format( img_count+1,len(imgs) )) 145 | 146 | devices.torch_gc() 147 | 148 | 149 | def create_mask_transparent_background(input_dir, output_dir, tb_use_fast_mode, tb_use_jit, st1_mask_threshold): 150 | from modules import devices 151 | remover = Remover(fast=tb_use_fast_mode, jit=tb_use_jit, device=devices.get_optimal_device_name()) 152 | 153 | original_imgs = glob.glob( os.path.join(input_dir, "*.png") ) 154 | 155 | pbar_original_imgs = tqdm(original_imgs, bar_format='{desc:<15}{percentage:3.0f}%|{bar:50}{r_bar}') 156 | for m in pbar_original_imgs: 157 | base_name = os.path.basename(m) 158 | pbar_original_imgs.set_description('{}'.format(base_name)) 159 | img = Image.open(m).convert('RGB') 160 | out = 
remover.process(img, type='map') 161 | if isinstance(out,Image.Image): 162 | out = np.array(out) 163 | out[out < int( 255 * st1_mask_threshold )] = 0 164 | cv2.imwrite(os.path.join(output_dir, base_name), out) 165 | 166 | 167 | def ebsynth_utility_stage1(dbg, project_args, frame_width, frame_height, st1_masking_method_index, st1_mask_threshold, tb_use_fast_mode, tb_use_jit, clipseg_mask_prompt, clipseg_exclude_prompt, clipseg_mask_threshold, clipseg_mask_blur_size, clipseg_mask_blur_size2, is_invert_mask): 168 | dbg.print("stage1") 169 | dbg.print("") 170 | 171 | if st1_masking_method_index == 1 and (not clipseg_mask_prompt): 172 | dbg.print("Error: clipseg_mask_prompt is Empty") 173 | return 174 | 175 | project_dir, original_movie_path, frame_path, frame_mask_path, _, _, _ = project_args 176 | 177 | if is_invert_mask: 178 | if os.path.isdir( frame_path ) and os.path.isdir( frame_mask_path ): 179 | dbg.print("Skip as it appears that the frame and normal masks have already been generated.") 180 | return 181 | 182 | # remove_pngs_in_dir(frame_path) 183 | 184 | if frame_mask_path: 185 | remove_pngs_in_dir(frame_mask_path) 186 | 187 | if frame_mask_path: 188 | os.makedirs(frame_mask_path, exist_ok=True) 189 | 190 | if os.path.isdir( frame_path ): 191 | dbg.print("Skip frame extraction") 192 | else: 193 | os.makedirs(frame_path, exist_ok=True) 194 | 195 | png_path = os.path.join(frame_path , "%05d.png") 196 | # ffmpeg.exe -ss 00:00:00 -y -i %1 -qscale 0 -f image2 -c:v png "%05d.png" 197 | subprocess.call("ffmpeg -ss 00:00:00 -y -i " + original_movie_path + " -qscale 0 -f image2 -c:v png " + png_path, shell=True) 198 | 199 | dbg.print("frame extracted") 200 | 201 | frame_width = max(frame_width,-1) 202 | frame_height = max(frame_height,-1) 203 | 204 | if frame_width != -1 or frame_height != -1: 205 | resize_all_img(frame_path, frame_width, frame_height) 206 | 207 | if frame_mask_path: 208 | if st1_masking_method_index == 0: 209 | create_mask_transparent_background(frame_path, frame_mask_path, tb_use_fast_mode, tb_use_jit, st1_mask_threshold) 210 | elif st1_masking_method_index == 1: 211 | create_mask_clipseg(frame_path, frame_mask_path, clipseg_mask_prompt, clipseg_exclude_prompt, clipseg_mask_threshold, clipseg_mask_blur_size, clipseg_mask_blur_size2) 212 | elif st1_masking_method_index == 2: 213 | tb_tmp_path = os.path.join(project_dir , "tb_mask_tmp") 214 | if not os.path.isdir( tb_tmp_path ): 215 | os.makedirs(tb_tmp_path, exist_ok=True) 216 | create_mask_transparent_background(frame_path, tb_tmp_path, tb_use_fast_mode, tb_use_jit, st1_mask_threshold) 217 | create_mask_clipseg(frame_path, frame_mask_path, clipseg_mask_prompt, clipseg_exclude_prompt, clipseg_mask_threshold, clipseg_mask_blur_size, clipseg_mask_blur_size2) 218 | create_and_mask(tb_tmp_path,frame_mask_path,frame_mask_path) 219 | 220 | 221 | dbg.print("mask created") 222 | 223 | dbg.print("") 224 | dbg.print("completed.") 225 | 226 | 227 | def ebsynth_utility_stage1_invert(dbg, frame_mask_path, inv_mask_path): 228 | dbg.print("stage 1 create_invert_mask") 229 | dbg.print("") 230 | 231 | if not os.path.isdir( frame_mask_path ): 232 | dbg.print( frame_mask_path + " not found") 233 | dbg.print("Normal masks must be generated previously.") 234 | dbg.print("Do stage 1 with [Ebsynth Utility] Tab -> [configuration] -> [etc]-> [Mask Mode] = Normal setting first") 235 | return 236 | 237 | os.makedirs(inv_mask_path, exist_ok=True) 238 | 239 | mask_imgs = glob.glob( os.path.join(frame_mask_path, "*.png") ) 240 | 241 | for m in mask_imgs: 
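# Invert each mask with cv2.bitwise_not (255 -> 0, 0 -> 255); these inverted
# masks are what Invert Mask Mode reads from inv_mask_path.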
242 | img = cv2.imread(m) 243 | inv = cv2.bitwise_not(img) 244 | 245 | base_name = os.path.basename(m) 246 | cv2.imwrite(os.path.join(inv_mask_path,base_name), inv) 247 | 248 | dbg.print("") 249 | dbg.print("completed.") 250 | -------------------------------------------------------------------------------- /stage2.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import os 3 | import glob 4 | import shutil 5 | import numpy as np 6 | import math 7 | 8 | #--------------------------------- 9 | # Copied from PySceneDetect 10 | def mean_pixel_distance(left: np.ndarray, right: np.ndarray) -> float: 11 | """Return the mean average distance in pixel values between `left` and `right`. 12 | Both `left and `right` should be 2 dimensional 8-bit images of the same shape. 13 | """ 14 | assert len(left.shape) == 2 and len(right.shape) == 2 15 | assert left.shape == right.shape 16 | num_pixels: float = float(left.shape[0] * left.shape[1]) 17 | return (np.sum(np.abs(left.astype(np.int32) - right.astype(np.int32))) / num_pixels) 18 | 19 | 20 | def estimated_kernel_size(frame_width: int, frame_height: int) -> int: 21 | """Estimate kernel size based on video resolution.""" 22 | size: int = 4 + round(math.sqrt(frame_width * frame_height) / 192) 23 | if size % 2 == 0: 24 | size += 1 25 | return size 26 | 27 | _kernel = None 28 | 29 | def _detect_edges(lum: np.ndarray) -> np.ndarray: 30 | global _kernel 31 | """Detect edges using the luma channel of a frame. 32 | Arguments: 33 | lum: 2D 8-bit image representing the luma channel of a frame. 34 | Returns: 35 | 2D 8-bit image of the same size as the input, where pixels with values of 255 36 | represent edges, and all other pixels are 0. 37 | """ 38 | # Initialize kernel. 39 | if _kernel is None: 40 | kernel_size = estimated_kernel_size(lum.shape[1], lum.shape[0]) 41 | _kernel = np.ones((kernel_size, kernel_size), np.uint8) 42 | 43 | # Estimate levels for thresholding. 44 | sigma: float = 1.0 / 3.0 45 | median = np.median(lum) 46 | low = int(max(0, (1.0 - sigma) * median)) 47 | high = int(min(255, (1.0 + sigma) * median)) 48 | 49 | # Calculate edges using Canny algorithm, and reduce noise by dilating the edges. 50 | # This increases edge overlap leading to improved robustness against noise and slow 51 | # camera movement. Note that very large kernel sizes can negatively affect accuracy. 
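# For example (hypothetical values): with a median luma of 120 and sigma = 1/3,
# low = int(max(0, (2/3) * 120)) = 80 and high = int(min(255, (4/3) * 120)) = 160,
# so the Canny hysteresis thresholds adapt to each frame's overall brightness.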
52 | edges = cv2.Canny(lum, low, high) 53 | return cv2.dilate(edges, _kernel) 54 | 55 | #--------------------------------- 56 | 57 | def detect_edges(img_path, mask_path, is_invert_mask): 58 | im = cv2.imread(img_path) 59 | if mask_path: 60 | mask = cv2.imread(mask_path)[:,:,0] 61 | mask = mask[:, :, np.newaxis] 62 | im = im * ( (mask == 0) if is_invert_mask else (mask > 0) ) 63 | # im = im * (mask/255) 64 | # im = im.astype(np.uint8) 65 | # cv2.imwrite( os.path.join( os.path.dirname(mask_path) , "tmp.png" ) , im) 66 | 67 | hue, sat, lum = cv2.split(cv2.cvtColor( im , cv2.COLOR_BGR2HSV)) 68 | return _detect_edges(lum) 69 | 70 | def get_mask_path_of_img(img_path, mask_dir): 71 | img_basename = os.path.basename(img_path) 72 | mask_path = os.path.join( mask_dir , img_basename ) 73 | return mask_path if os.path.isfile( mask_path ) else None 74 | 75 | def analyze_key_frames(png_dir, mask_dir, th, min_gap, max_gap, add_last_frame, is_invert_mask): 76 | keys = [] 77 | 78 | frames = sorted(glob.glob( os.path.join(png_dir, "[0-9]*.png") )) 79 | 80 | key_frame = frames[0] 81 | keys.append( int(os.path.splitext(os.path.basename(key_frame))[0]) ) 82 | key_edges = detect_edges( key_frame, get_mask_path_of_img( key_frame, mask_dir ), is_invert_mask ) 83 | gap = 0 84 | 85 | for frame in frames: 86 | gap += 1 87 | if gap < min_gap: 88 | continue 89 | 90 | edges = detect_edges( frame, get_mask_path_of_img( frame, mask_dir ), is_invert_mask ) 91 | 92 | delta = mean_pixel_distance( edges, key_edges ) 93 | 94 | _th = th * (max_gap - gap)/max_gap 95 | 96 | if _th < delta: 97 | basename_without_ext = os.path.splitext(os.path.basename(frame))[0] 98 | keys.append( int(basename_without_ext) ) 99 | key_frame = frame 100 | key_edges = edges 101 | gap = 0 102 | 103 | if add_last_frame: 104 | basename_without_ext = os.path.splitext(os.path.basename(frames[-1]))[0] 105 | last_frame = int(basename_without_ext) 106 | if not last_frame in keys: 107 | keys.append( last_frame ) 108 | 109 | return keys 110 | 111 | def remove_pngs_in_dir(path): 112 | if not os.path.isdir(path): 113 | return 114 | 115 | pngs = glob.glob( os.path.join(path, "*.png") ) 116 | for png in pngs: 117 | os.remove(png) 118 | 119 | def ebsynth_utility_stage2(dbg, project_args, key_min_gap, key_max_gap, key_th, key_add_last_frame, is_invert_mask): 120 | dbg.print("stage2") 121 | dbg.print("") 122 | 123 | _, original_movie_path, frame_path, frame_mask_path, org_key_path, _, _ = project_args 124 | 125 | remove_pngs_in_dir(org_key_path) 126 | os.makedirs(org_key_path, exist_ok=True) 127 | 128 | fps = 30 129 | clip = cv2.VideoCapture(original_movie_path) 130 | if clip: 131 | fps = clip.get(cv2.CAP_PROP_FPS) 132 | clip.release() 133 | 134 | if key_min_gap == -1: 135 | key_min_gap = int(10 * fps/30) 136 | else: 137 | key_min_gap = max(1, key_min_gap) 138 | key_min_gap = int(key_min_gap * fps/30) 139 | 140 | if key_max_gap == -1: 141 | key_max_gap = int(300 * fps/30) 142 | else: 143 | key_max_gap = max(10, key_max_gap) 144 | key_max_gap = int(key_max_gap * fps/30) 145 | 146 | key_min_gap,key_max_gap = (key_min_gap,key_max_gap) if key_min_gap < key_max_gap else (key_max_gap,key_min_gap) 147 | 148 | dbg.print("fps: {}".format(fps)) 149 | dbg.print("key_min_gap: {}".format(key_min_gap)) 150 | dbg.print("key_max_gap: {}".format(key_max_gap)) 151 | dbg.print("key_th: {}".format(key_th)) 152 | 153 | keys = analyze_key_frames(frame_path, frame_mask_path, key_th, key_min_gap, key_max_gap, key_add_last_frame, is_invert_mask) 154 | 155 | dbg.print("keys : " + str(keys)) 
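# Note: analyze_key_frames() above relaxes the threshold linearly as the gap since the
# last keyframe grows (_th = th * (max_gap - gap) / max_gap). For example, with th = 8.5
# and max_gap = 300, a frame 150 frames past the last keyframe only needs a mean edge
# distance above 8.5 * 150 / 300 = 4.25 to be selected as a keyframe.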
156 | 157 | for k in keys: 158 | filename = str(k).zfill(5) + ".png" 159 | shutil.copy( os.path.join( frame_path , filename) , os.path.join(org_key_path, filename) ) 160 | 161 | 162 | dbg.print("") 163 | dbg.print("Keyframes are output to [" + org_key_path + "]") 164 | dbg.print("") 165 | dbg.print("[Ebsynth Utility]->[configuration]->[stage 2]->[Threshold of delta frame edge]") 166 | dbg.print("The smaller this value, the narrower the keyframe spacing, and if set to 0, the keyframes will be equally spaced at the value of [Minimum keyframe gap].") 167 | dbg.print("") 168 | dbg.print("If you do not like the selection, you can modify it manually.") 169 | dbg.print("(Delete keyframe, or Add keyframe from ["+frame_path+"])") 170 | 171 | dbg.print("") 172 | dbg.print("completed.") 173 | 174 | -------------------------------------------------------------------------------- /stage3_5.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import os 3 | import glob 4 | import shutil 5 | import numpy as np 6 | from PIL import Image 7 | 8 | from color_matcher import ColorMatcher 9 | from color_matcher.normalizer import Normalizer 10 | 11 | def resize_img(img, w, h): 12 | if img.shape[0] + img.shape[1] < h + w: 13 | interpolation = interpolation=cv2.INTER_CUBIC 14 | else: 15 | interpolation = interpolation=cv2.INTER_AREA 16 | 17 | return cv2.resize(img, (w, h), interpolation=interpolation) 18 | 19 | def get_pair_of_img(img_path, target_dir): 20 | img_basename = os.path.basename(img_path) 21 | target_path = os.path.join( target_dir , img_basename ) 22 | return target_path if os.path.isfile( target_path ) else None 23 | 24 | def remove_pngs_in_dir(path): 25 | if not os.path.isdir(path): 26 | return 27 | 28 | pngs = glob.glob( os.path.join(path, "*.png") ) 29 | for png in pngs: 30 | os.remove(png) 31 | 32 | def get_pair_of_img(img, target_dir): 33 | img_basename = os.path.basename(img) 34 | 35 | pair_path = os.path.join( target_dir , img_basename ) 36 | if os.path.isfile( pair_path ): 37 | return pair_path 38 | print("!!! 
pair of "+ img + " not in " + target_dir) 39 | return "" 40 | 41 | def get_mask_array(mask_path): 42 | if not mask_path: 43 | return None 44 | mask_array = np.asarray(Image.open( mask_path )) 45 | if mask_array.ndim == 2: 46 | mask_array = mask_array[:, :, np.newaxis] 47 | mask_array = mask_array[:,:,:1] 48 | mask_array = mask_array/255 49 | return mask_array 50 | 51 | def color_match(imgs, ref_image, color_matcher_method, dst_path): 52 | cm = ColorMatcher(method=color_matcher_method) 53 | 54 | i = 0 55 | total = len(imgs) 56 | 57 | for fname in imgs: 58 | 59 | img_src = Image.open(fname) 60 | img_src = Normalizer(np.asarray(img_src)).type_norm() 61 | 62 | img_src = cm.transfer(src=img_src, ref=ref_image, method=color_matcher_method) 63 | 64 | img_src = Normalizer(img_src).uint8_norm() 65 | Image.fromarray(img_src).save(os.path.join(dst_path, os.path.basename(fname))) 66 | 67 | i += 1 68 | print("{0}/{1}".format(i, total)) 69 | 70 | imgs = sorted( glob.glob( os.path.join(dst_path, "*.png") ) ) 71 | 72 | 73 | def ebsynth_utility_stage3_5(dbg, project_args, color_matcher_method, st3_5_use_mask, st3_5_use_mask_ref, st3_5_use_mask_org, color_matcher_ref_type, color_matcher_ref_image): 74 | dbg.print("stage3.5") 75 | dbg.print("") 76 | 77 | _, _, frame_path, frame_mask_path, org_key_path, img2img_key_path, _ = project_args 78 | 79 | backup_path = os.path.join( os.path.join( img2img_key_path, "..") , "st3_5_backup_img2img_key") 80 | backup_path = os.path.normpath(backup_path) 81 | 82 | if not os.path.isdir( backup_path ): 83 | dbg.print("{0} not found -> create backup.".format(backup_path)) 84 | os.makedirs(backup_path, exist_ok=True) 85 | 86 | imgs = glob.glob( os.path.join(img2img_key_path, "*.png") ) 87 | 88 | for img in imgs: 89 | img_basename = os.path.basename(img) 90 | pair_path = os.path.join( backup_path , img_basename ) 91 | shutil.copy( img , pair_path) 92 | 93 | else: 94 | dbg.print("{0} found -> Treat the images here as originals.".format(backup_path)) 95 | 96 | org_imgs = sorted( glob.glob( os.path.join(backup_path, "*.png") ) ) 97 | head_of_keyframe = org_imgs[0] 98 | 99 | # open ref img 100 | ref_image = color_matcher_ref_image 101 | if not ref_image: 102 | dbg.print("color_matcher_ref_image not set") 103 | 104 | if color_matcher_ref_type == 0: 105 | #'original video frame' 106 | dbg.print("select -> original video frame") 107 | ref_image = Image.open( get_pair_of_img(head_of_keyframe, frame_path) ) 108 | else: 109 | #'first frame of img2img result' 110 | dbg.print("select -> first frame of img2img result") 111 | ref_image = Image.open( get_pair_of_img(head_of_keyframe, backup_path) ) 112 | 113 | ref_image = np.asarray(ref_image) 114 | 115 | if st3_5_use_mask_ref: 116 | mask = get_pair_of_img(head_of_keyframe, frame_mask_path) 117 | if mask: 118 | mask_array = get_mask_array( mask ) 119 | ref_image = ref_image * mask_array 120 | ref_image = ref_image.astype(np.uint8) 121 | 122 | else: 123 | dbg.print("select -> color_matcher_ref_image") 124 | ref_image = np.asarray(ref_image) 125 | 126 | 127 | if color_matcher_method in ('mvgd', 'hm-mvgd-hm'): 128 | sample_img = Image.open(head_of_keyframe) 129 | ref_image = resize_img( ref_image, sample_img.width, sample_img.height ) 130 | 131 | ref_image = Normalizer(ref_image).type_norm() 132 | 133 | 134 | if st3_5_use_mask_org: 135 | tmp_path = os.path.join( os.path.join( img2img_key_path, "..") , "st3_5_tmp") 136 | tmp_path = os.path.normpath(tmp_path) 137 | dbg.print("create {0} for masked original image".format(tmp_path)) 138 | 139 | 
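# (With st3_5_use_mask_org enabled, masked copies of the backed-up img2img keys are
#  written to st3_5_tmp below and then used as the source images for the color transfer.)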
remove_pngs_in_dir(tmp_path) 140 | os.makedirs(tmp_path, exist_ok=True) 141 | 142 | for org_img in org_imgs: 143 | image_basename = os.path.basename(org_img) 144 | 145 | org_image = np.asarray(Image.open(org_img)) 146 | 147 | mask = get_pair_of_img(org_img, frame_mask_path) 148 | if mask: 149 | mask_array = get_mask_array( mask ) 150 | org_image = org_image * mask_array 151 | org_image = org_image.astype(np.uint8) 152 | 153 | Image.fromarray(org_image).save( os.path.join( tmp_path, image_basename ) ) 154 | 155 | org_imgs = sorted( glob.glob( os.path.join(tmp_path, "*.png") ) ) 156 | 157 | 158 | color_match(org_imgs, ref_image, color_matcher_method, img2img_key_path) 159 | 160 | 161 | if st3_5_use_mask or st3_5_use_mask_org: 162 | imgs = sorted( glob.glob( os.path.join(img2img_key_path, "*.png") ) ) 163 | for img in imgs: 164 | mask = get_pair_of_img(img, frame_mask_path) 165 | if mask: 166 | mask_array = get_mask_array( mask ) 167 | bg = get_pair_of_img(img, frame_path) 168 | bg_image = np.asarray(Image.open( bg )) 169 | fg_image = np.asarray(Image.open( img )) 170 | 171 | final_img = fg_image * mask_array + bg_image * (1-mask_array) 172 | final_img = final_img.astype(np.uint8) 173 | 174 | Image.fromarray(final_img).save(img) 175 | 176 | dbg.print("") 177 | dbg.print("completed.") 178 | 179 | -------------------------------------------------------------------------------- /stage5.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import re 3 | import os 4 | import glob 5 | import time 6 | 7 | from sys import byteorder 8 | import binascii 9 | import numpy as np 10 | 11 | SYNTHS_PER_PROJECT = 15 12 | 13 | def to_float_bytes(f): 14 | if byteorder == 'little': 15 | return np.array([ float(f) ], dtype=' cur_clip+1 else -1 149 | 150 | current_frame = 0 151 | 152 | print(str(start) + " -> " + str(end+1)) 153 | 154 | black_img = np.zeros_like( cv2.imread( os.path.join(out_dirs[cur_clip]['path'], str(start).zfill(number_of_digits) + ".png") ) ) 155 | 156 | for i in range(start, end+1): 157 | 158 | print(str(i) + " / " + str(end)) 159 | 160 | if next_clip == -1: 161 | break 162 | 163 | if i in range( out_dirs[cur_clip]['startframe'], out_dirs[cur_clip]['endframe'] +1): 164 | pass 165 | elif i in range( out_dirs[next_clip]['startframe'], out_dirs[next_clip]['endframe'] +1): 166 | cur_clip = next_clip 167 | next_clip = cur_clip+1 if len(out_dirs) > cur_clip+1 else -1 168 | if next_clip == -1: 169 | break 170 | else: 171 | ### black 172 | # front ... none 173 | # back ... none 174 | cv2.imwrite( os.path.join(tmp_dir, filename) , black_img) 175 | current_frame = i 176 | continue 177 | 178 | filename = str(i).zfill(number_of_digits) + ".png" 179 | 180 | # front ... cur_clip 181 | # back ... next_clip or none 182 | 183 | if i in range( out_dirs[next_clip]['startframe'], out_dirs[next_clip]['endframe'] +1): 184 | # front ... cur_clip 185 | # back ... next_clip 186 | img_f = cv2.imread( os.path.join(out_dirs[cur_clip]['path'] , filename) ) 187 | img_b = cv2.imread( os.path.join(out_dirs[next_clip]['path'] , filename) ) 188 | 189 | back_rate = (i - out_dirs[next_clip]['startframe'])/ max( 1 , (out_dirs[cur_clip]['endframe'] - out_dirs[next_clip]['startframe']) ) 190 | 191 | img = cv2.addWeighted(img_f, 1.0 - back_rate, img_b, back_rate, 0) 192 | 193 | cv2.imwrite( os.path.join(tmp_dir , filename) , img) 194 | else: 195 | # front ... cur_clip 196 | # back ... 
none 197 | filename = str(i).zfill(number_of_digits) + ".png" 198 | shutil.copy( os.path.join(out_dirs[cur_clip]['path'] , filename) , os.path.join(tmp_dir , filename) ) 199 | 200 | current_frame = i 201 | 202 | 203 | start2 = current_frame+1 204 | 205 | print(str(start2) + " -> " + str(end+1)) 206 | 207 | for i in range(start2, end+1): 208 | filename = str(i).zfill(number_of_digits) + ".png" 209 | shutil.copy( os.path.join(out_dirs[cur_clip]['path'] , filename) , os.path.join(tmp_dir , filename) ) 210 | 211 | ### create movie 212 | movie_base_name = time.strftime("%Y%m%d-%H%M%S") 213 | if is_invert_mask: 214 | movie_base_name = "inv_" + movie_base_name 215 | 216 | nosnd_path = os.path.join(project_dir , movie_base_name + get_ext(export_type)) 217 | 218 | start = out_dirs[0]['startframe'] 219 | end = out_dirs[-1]['endframe'] 220 | 221 | create_movie_from_frames( tmp_dir, start, end, number_of_digits, fps, nosnd_path, export_type) 222 | 223 | dbg.print("exported : " + nosnd_path) 224 | 225 | if export_type == "mp4": 226 | 227 | with_snd_path = os.path.join(project_dir , movie_base_name + '_with_snd.mp4') 228 | 229 | if trying_to_add_audio(original_movie_path, nosnd_path, with_snd_path, tmp_dir): 230 | dbg.print("exported : " + with_snd_path) 231 | 232 | dbg.print("") 233 | dbg.print("completed.") 234 | 235 | -------------------------------------------------------------------------------- /stage8.py: -------------------------------------------------------------------------------- 1 | import os 2 | import re 3 | import subprocess 4 | import glob 5 | import shutil 6 | import time 7 | import cv2 8 | import numpy as np 9 | import itertools 10 | from extensions.ebsynth_utility.stage7 import create_movie_from_frames, get_ext, trying_to_add_audio 11 | 12 | def clamp(n, smallest, largest): 13 | return sorted([smallest, n, largest])[1] 14 | 15 | def resize_img(img, w, h): 16 | if img.shape[0] + img.shape[1] < h + w: 17 | interpolation = interpolation=cv2.INTER_CUBIC 18 | else: 19 | interpolation = interpolation=cv2.INTER_AREA 20 | 21 | return cv2.resize(img, (w, h), interpolation=interpolation) 22 | 23 | def merge_bg_src(base_frame_dir, bg_dir, frame_mask_path, tmp_dir, bg_type, mask_blur_size, mask_threshold, fg_transparency): 24 | 25 | base_frames = sorted(glob.glob( os.path.join(base_frame_dir, "[0-9]*.png"), recursive=False) ) 26 | 27 | bg_frames = sorted(glob.glob( os.path.join(bg_dir, "*.png"), recursive=False) ) 28 | 29 | def bg_frame(total_frames): 30 | bg_len = len(bg_frames) 31 | 32 | if bg_type == "Loop": 33 | itr = itertools.cycle(bg_frames) 34 | while True: 35 | yield next(itr) 36 | else: 37 | for i in range(total_frames): 38 | yield bg_frames[ int(bg_len * (i/total_frames))] 39 | 40 | bg_itr = bg_frame(len(base_frames)) 41 | 42 | for base_frame in base_frames: 43 | im = cv2.imread(base_frame) 44 | bg = cv2.imread( next(bg_itr) ) 45 | bg = resize_img(bg, im.shape[1], im.shape[0] ) 46 | 47 | basename = os.path.basename(base_frame) 48 | mask_path = os.path.join(frame_mask_path, basename) 49 | mask = cv2.imread(mask_path)[:,:,0] 50 | 51 | mask[mask < int( 255 * mask_threshold )] = 0 52 | 53 | if mask_blur_size > 0: 54 | mask_blur_size = mask_blur_size//2 * 2 + 1 55 | mask = cv2.GaussianBlur(mask, (mask_blur_size, mask_blur_size), 0) 56 | mask = mask[:, :, np.newaxis] 57 | 58 | fore_rate = (mask/255) * (1 - fg_transparency) 59 | 60 | im = im * fore_rate + bg * (1- fore_rate) 61 | im = im.astype(np.uint8) 62 | cv2.imwrite( os.path.join( tmp_dir , basename ) , im) 63 | 64 | def 
extract_frames(movie_path , output_dir, fps): 65 | png_path = os.path.join(output_dir , "%05d.png") 66 | # ffmpeg.exe -ss 00:00:00 -y -i %1 -qscale 0 -f image2 -c:v png "%05d.png" 67 | subprocess.call("ffmpeg -ss 00:00:00 -y -i " + movie_path + " -vf fps=" + str( round(fps, 2)) + " -qscale 0 -f image2 -c:v png " + png_path, shell=True) 68 | 69 | def ebsynth_utility_stage8(dbg, project_args, bg_src, bg_type, mask_blur_size, mask_threshold, fg_transparency, export_type): 70 | dbg.print("stage8") 71 | dbg.print("") 72 | 73 | if not bg_src: 74 | dbg.print("Fill [configuration] -> [stage 8] -> [Background source]") 75 | return 76 | 77 | project_dir, original_movie_path, _, frame_mask_path, _, _, _ = project_args 78 | 79 | fps = 30 80 | clip = cv2.VideoCapture(original_movie_path) 81 | if clip: 82 | fps = clip.get(cv2.CAP_PROP_FPS) 83 | clip.release() 84 | 85 | dbg.print("bg_src: {}".format(bg_src)) 86 | dbg.print("bg_type: {}".format(bg_type)) 87 | dbg.print("mask_blur_size: {}".format(mask_blur_size)) 88 | dbg.print("export_type: {}".format(export_type)) 89 | dbg.print("fps: {}".format(fps)) 90 | 91 | base_frame_dir = os.path.join( project_dir , "crossfade_tmp") 92 | 93 | if not os.path.isdir(base_frame_dir): 94 | dbg.print(base_frame_dir + " base frame not found") 95 | return 96 | 97 | tmp_dir = os.path.join( project_dir , "bg_merge_tmp") 98 | if os.path.isdir(tmp_dir): 99 | shutil.rmtree(tmp_dir) 100 | os.mkdir(tmp_dir) 101 | 102 | ### create frame imgs 103 | if os.path.isfile(bg_src): 104 | bg_ext = os.path.splitext(os.path.basename(bg_src))[1] 105 | if bg_ext == ".mp4": 106 | bg_tmp_dir = os.path.join( project_dir , "bg_extract_tmp") 107 | if os.path.isdir(bg_tmp_dir): 108 | shutil.rmtree(bg_tmp_dir) 109 | os.mkdir(bg_tmp_dir) 110 | 111 | extract_frames(bg_src, bg_tmp_dir, fps) 112 | 113 | bg_src = bg_tmp_dir 114 | else: 115 | dbg.print(bg_src + " must be mp4 or directory") 116 | return 117 | elif not os.path.isdir(bg_src): 118 | dbg.print(bg_src + " must be mp4 or directory") 119 | return 120 | 121 | merge_bg_src(base_frame_dir, bg_src, frame_mask_path, tmp_dir, bg_type, mask_blur_size, mask_threshold, fg_transparency) 122 | 123 | ### create movie 124 | movie_base_name = time.strftime("%Y%m%d-%H%M%S") 125 | movie_base_name = "merge_" + movie_base_name 126 | 127 | nosnd_path = os.path.join(project_dir , movie_base_name + get_ext(export_type)) 128 | 129 | merged_frames = sorted(glob.glob( os.path.join(tmp_dir, "[0-9]*.png"), recursive=False) ) 130 | start = int(os.path.splitext(os.path.basename(merged_frames[0]))[0]) 131 | end = int(os.path.splitext(os.path.basename(merged_frames[-1]))[0]) 132 | 133 | create_movie_from_frames(tmp_dir,start,end,5,fps,nosnd_path,export_type) 134 | 135 | dbg.print("exported : " + nosnd_path) 136 | 137 | if export_type == "mp4": 138 | 139 | with_snd_path = os.path.join(project_dir , movie_base_name + '_with_snd.mp4') 140 | 141 | if trying_to_add_audio(original_movie_path, nosnd_path, with_snd_path, tmp_dir): 142 | dbg.print("exported : " + with_snd_path) 143 | 144 | dbg.print("") 145 | dbg.print("completed.") 146 | 147 | -------------------------------------------------------------------------------- /style.css: -------------------------------------------------------------------------------- 1 | #ebs_info_area { 2 | border: #0B0F19 2px solid; 3 | border-radius: 5px; 4 | font-size: 15px; 5 | margin: 10px; 6 | padding: 10px; 7 | } 8 | 9 | #ebs_configuration_tab1>div{ 10 | margin: 5px; 11 | padding: 5px; 12 | } 13 | 14 | #ebs_configuration_tab2>div{ 15 | margin: 
5px; 16 | padding: 5px; 17 | } 18 | 19 | #ebs_configuration_tab3_5>div{ 20 | margin: 5px; 21 | padding: 5px; 22 | } 23 | 24 | #ebs_configuration_tab7>div{ 25 | margin: 5px; 26 | padding: 5px; 27 | } 28 | 29 | #ebs_configuration_tab8>div{ 30 | margin: 5px; 31 | padding: 5px; 32 | } 33 | 34 | #ebs_configuration_tab_etc>div{ 35 | margin: 5px; 36 | padding: 5px; 37 | } 38 | 39 | video.svelte-w5wajl { 40 | max-height: 500px; 41 | } 42 | --------------------------------------------------------------------------------