├── LICENSE ├── README.md ├── __init__.py ├── install.py ├── requirements.txt └── scripts └── geeky-remb.py /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 GeekyGhost 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # GeekyRemB: Advanced Background Removal and Image/Video Manipulation Extension for Automatic1111 Web UI 2 | 3 | ## Overview 4 | 5 | **GeekyRemB** is a comprehensive extension for the **Automatic1111 Web UI**, built to bring advanced background removal, image/video manipulation, and blending capabilities into your projects. It offers precise background removal with support for multiple models, chroma keying, foreground adjustments, and advanced effects. Whether working with images or videos, this extension provides everything you need to manipulate visual content efficiently within the **Automatic1111** environment. 6 | 7 | --- 8 | 9 | 10 | https://github.com/user-attachments/assets/d9e089f8-461e-4cfd-a216-5632c30709f6 11 | 12 | 13 | ## Key Features 14 | 15 | - **Multi-model Background Removal**: Supports `u2net`, `isnet-general-use`, and other models. 16 | - **Chroma Key Support**: Remove specific colors (green, blue, or red) from backgrounds. 17 | - **Blending Modes**: 10 powerful blend modes for image compositing. 18 | - **Foreground Adjustments**: Scale, rotate, flip, and position elements precisely. 19 | - **Video and Image Support**: Process images and videos seamlessly. 20 | - **Batch and Multi-threaded Processing**: Handle large files efficiently using threading and GPU support. 21 | - **Customizable Output Formats**: Export in PNG, JPEG, MP4, AVI, and more. 22 | 23 | --- 24 | 25 | ## Installation 26 | 27 | 1. **Clone the Repository:** 28 | ```bash 29 | git clone https://github.com/GeekyGhost/Automatic1111-Geeky-Remb.git 30 | ``` 31 | 32 | 2. **Move to the Extensions Folder:** 33 | Navigate to your **Automatic1111 Web UI** installation directory and move the repository into the `extensions` folder: 34 | ```bash 35 | mv Automatic1111-Geeky-Remb ./extensions/ 36 | ``` 37 | 38 | 3. **Restart the Web UI** or use the **Reload** button within the Web UI to register the extension. 39 | 40 | 4. **Access the Extension**: Open the **GeekyRemB** tab within the Web UI to begin. 
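As a one-step alternative to steps 1–2, you can clone directly into the extensions folder (this assumes your shell is at the root of your Web UI installation):
```bash
git clone https://github.com/GeekyGhost/Automatic1111-Geeky-Remb.git extensions/Automatic1111-Geeky-Remb
```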
41 | 
42 | ---
43 | 
44 | ## Usage Instructions
45 | 
46 | ### Basic Workflow
47 | 
48 | 1. **Select Input Type**: Choose between **Image** and **Video** as your input.
49 | 2. **Upload Foreground Content**: Provide the image or video you want to manipulate.
50 | 3. **Adjust Foreground Settings**: Modify scaling, aspect ratio, position, rotation, and blending modes.
51 | 4. **Apply Background Removal**: Use AI models or chroma key to remove backgrounds.
52 | 5. **Choose Output Format**: Select PNG, JPEG, MP4, or other formats.
53 | 6. **Click ‘Run GeekyRemB’**: Process the input to generate your final output.
54 | 
55 | ---
56 | 
57 | ## Detailed Settings
58 | 
59 | ### Input/Output Configuration
60 | 
61 | - **Input Type**: 
62 |   Choose between **Image** and **Video**.
63 | 
64 | - **Foreground Upload**: 
65 |   Upload your image or video content.
66 | 
67 | - **Output Type**: 
68 |   Define whether the result will be an **Image** or a **Video**.
69 | 
70 | ---
71 | 
72 | ### Foreground Adjustments
73 | 
74 | - **Scale**: 
75 |   Adjust the size of the foreground from 0.1 to 5.0.
76 | 
77 | - **Aspect Ratio**: 
78 |   Use ratios like `16:9` or terms like `portrait` or `square`.
79 | 
80 | - **Rotation & Position**: 
81 |   Rotate the element from -360° to 360° and adjust **X/Y positions** within a -1000 to 1000 pixel range.
82 | 
83 | - **Flip Options**: 
84 |   Flip the foreground horizontally or vertically.
85 | 
86 | ---
87 | 
88 | ### Background Options
89 | 
90 | - **Remove Background**: 
91 |   Use AI models (e.g., `u2net`) for automatic background removal.
92 | 
93 | - **Chroma Key**: 
94 |   Select a chroma key color (green, blue, or red) and set tolerance levels.
95 | 
96 | - **Background Mode**: 
97 |   Options include **transparent**, **solid color**, **image**, or **video** backgrounds.
98 | 
99 | ---
100 | 
101 | ### Advanced Effects
102 | 
103 | - **Blending Modes**: 
104 |   Choose from 10 blend modes: 
105 |   **Normal**, **Multiply**, **Screen**, **Overlay**, **Soft Light**, **Hard Light**, **Difference**, **Exclusion**, **Color Dodge**, **Color Burn**.
106 | 
107 | - **Shadow and Edge Detection**: 
108 |   Add drop shadows and edge outlines with adjustable blur, opacity, thickness, and color.
109 | 
110 | - **Alpha Matting**: 
111 |   Fine-tune mask edges using alpha matting thresholds.
112 | 
113 | ---
114 | 
115 | ### Output Settings
116 | 
117 | - **Custom Dimensions**: 
118 |   Enable to specify width and height manually.
119 | 
120 | - **Output Formats**: 
121 |   Export images as PNG, JPEG, or WEBP, and videos as MP4, AVI, or MOV.
122 | 
123 | - **Video Quality**: 
124 |   Set video quality on a 0–100 scale for optimized exports.
125 | 
126 | ---
127 | 
128 | ## Developer Guide
129 | 
130 | This extension is built with modular, extensible code. Below is an in-depth look at the core classes and methods.
131 | 
132 | ---
133 | 
134 | ### Core Classes and Functions
135 | 
136 | #### 1. **`GeekyRemB` Class**
137 | Manages sessions, background removal, threading, and GPU support.
138 | 
139 | - **`__init__()`**: 
140 |   Prepares a lazily created rembg session, checks for CUDA availability, and sets the threading and batch parameters.
141 | 
142 | #### 2. **`remove_background()` Method**
143 | Handles background removal, blending, chroma keying, and effect application.
144 | **Parameters** (see the usage sketch after this list):
145 | - `model`: AI model to use for removal.
146 | - `alpha_matting`: Enables edge refinement.
147 | - `chroma_key`: Applies chroma key color.
148 | - `blend_mode`: Blend mode for the result.
149 | - `foreground_scale`: Controls the scale of the foreground.
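The full signature is long; for orientation, here is a minimal sketch of calling the method directly. Argument names are taken from the implementation in `scripts/geeky-remb.py`, the values mirror the UI defaults, and the file paths are placeholders. This assumes the Web UI's Python environment, since the script imports Automatic1111 modules:

```python
from PIL import Image

remb = GeekyRemB()
foreground = Image.open("subject.png")  # placeholder input path

# Cut out the subject with u2net and return it over a transparent canvas
result, _ = remb.remove_background(
    foreground, None, "u2net",
    alpha_matting=False,
    alpha_matting_foreground_threshold=240,
    alpha_matting_background_threshold=10,
    post_process_mask=False,
    chroma_key="none",
    chroma_threshold=30,
    color_tolerance=20,
    background_mode="transparent",
    background_color="#000000",
)
result.save("subject_cutout.png")  # RGBA PNG with transparency preserved
```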
150 | 
151 | This function processes both image and video inputs, performing transformations and applying custom background settings.
152 | 
153 | ---
154 | 
155 | #### 3. **`apply_blend_mode()` Method**
156 | Applies one of 10 blending modes for compositing.
157 | 
158 | ```python
159 | def apply_blend_mode(self, target, blend, mode="normal", opacity=1.0):
160 |     # target and blend are first normalized to float arrays in the 0-1 range
161 |     result = self.blend_modes[mode](target, blend, opacity)
162 |     return np.clip(result * 255, 0, 255).astype(np.uint8)
163 | ```
164 | 
165 | Supported blend modes:
166 | - **Normal, Multiply, Screen, Overlay, Soft Light, Hard Light, Difference, Exclusion, Color Dodge, Color Burn**
167 | 
168 | ---
169 | 
170 | #### 4. **`process_video()` Method**
171 | Processes video in batches of frames, using separate reader and processor threads for efficient execution.
172 | 
173 | ```python
174 | def process_video(self, input_path, output_path, background_video_path, *args):
175 |     cap = cv2.VideoCapture(input_path)
176 |     fps = cap.get(cv2.CAP_PROP_FPS)
177 |     # Read, process, and write frames in batches
178 |     ...
179 | ```
180 | 
181 | When CUDA is available, the final **ffmpeg** pass re-encodes the output with the NVENC hardware encoder; otherwise it falls back to `libx264`.
182 | 
183 | ---
184 | 
185 | #### 5. **`parse_aspect_ratio()` Method**
186 | Interprets aspect ratios provided by the user. Accepts `W:H` strings (e.g., `16:9`), bare numbers (e.g., `1.5`), or the named presets `square`, `portrait`, and `landscape`; blank input returns `None`, which keeps the original ratio.
187 | 
188 | ---
189 | 
190 | #### 6. **`calculate_new_dimensions()` Method**
191 | Calculates new dimensions based on scaling and aspect ratio.
192 | 
193 | ```python
194 | def calculate_new_dimensions(self, orig_width, orig_height, scale, aspect_ratio):
195 |     new_width = int(orig_width * scale)
196 |     new_height = int(new_width / aspect_ratio) if aspect_ratio else int(orig_height * scale)
197 |     return new_width, new_height
198 | ```
199 | 
200 | ---
201 | 
202 | ## Performance Optimizations
203 | 
204 | - **GPU Support**: Automatically detects CUDA and, when available, uses hardware-accelerated encoding for the final video pass.
205 | - **Thread Pooling**: Uses `ThreadPoolExecutor` to batch-process multiple frames.
206 | - **Memory Management**: A configurable batch size (default: 4 frames) bounds memory use during frame processing.
207 | 
208 | ---
209 | 
210 | ## Troubleshooting Tips
211 | 
212 | - **Edges are not smooth?** 
213 |   Enable **alpha matting** and adjust thresholds for better results.
214 | 
215 | - **Chroma key not working correctly?** 
216 |   Increase the **color tolerance** to capture more shades of the key color. Note that the red key only matches low hues (0 to 20 + tolerance), so magenta-leaning reds may be missed.
217 | 
218 | - **Output size not as expected?** 
219 |   Use the **custom dimensions** feature to manually set the desired size.
220 | 
221 | ---
222 | 
223 | ## Future Enhancements
224 | 
225 | - **Real-time Preview**: Provide live feedback for adjustments.
226 | - **Animation Support**: Add keyframe-based animation for dynamic videos.
227 | - **New Models**: Incorporate additional models for niche use cases.
228 | 
229 | ---
230 | 
231 | ## Acknowledgments
232 | 
233 | - **rembg Library**: This extension is built on top of the [rembg](https://github.com/danielgatis/rembg) library.
234 | - **Automatic1111 Community**: Thanks to the community for continuous inspiration and support.
235 | 
236 | ---
237 | 
238 | ## Contributing
239 | 
240 | We welcome contributions! 
241 | Feel free to submit pull requests or open issues with ideas, improvements, or bug reports.
242 | 
243 | ---
244 | 
245 | With **GeekyRemB**, unlock new possibilities in creative projects. Whether you need to fine-tune images or apply advanced effects to videos, this extension empowers your workflow with precision and control.
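### Appendix: Verifying the Blend Math

If you are extending the blend system, the formulas behind the Developer Guide can be checked in isolation. Below is a minimal sketch using plain NumPy; it mirrors the `multiply`, `screen`, and `overlay` operations from `BlendMode` in `scripts/geeky-remb.py`, with inputs already normalized to the 0–1 range as `apply_blend_mode()` expects:

```python
import numpy as np

t = np.array([0.25, 0.5, 0.75])  # target (bottom layer)
b = np.array([0.5, 0.5, 0.5])    # blend (top layer, 50% gray)

multiply = t * b                  # darkens: [0.125, 0.25, 0.375]
screen = 1 - (1 - t) * (1 - b)    # lightens: [0.625, 0.75, 0.875]
overlay = np.where(t > 0.5,       # screen in highlights, multiply in shadows
                   1 - 2 * (1 - t) * (1 - b),
                   2 * t * b)

print(multiply, screen, overlay)  # overlay against 50% gray returns t unchanged
```

The last line illustrates a handy invariant: `overlay` against 50% gray is an identity, which makes regressions in the blend code easy to spot.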
246 | 247 | --- 248 | 249 | 250 | 251 | 252 | https://github.com/user-attachments/assets/8b904ff2-ca26-4025-aa3f-5d15f21a03db 253 | 254 | 255 | 256 | 257 | Happy creating! 🎨 258 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- 1 | from .geeky_remb import GeekyRemBExtras 2 | 3 | def __init__(): 4 | return [GeekyRemBExtras()] -------------------------------------------------------------------------------- /install.py: -------------------------------------------------------------------------------- 1 | import launch 2 | 3 | if not launch.is_installed("rembg"): 4 | launch.run_pip("install rembg", "requirement for Geeky RemB") 5 | 6 | if not launch.is_installed("opencv-python"): 7 | launch.run_pip("install opencv-python", "requirement for Geeky RemB") -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | rembg 2 | numpy 3 | opencv-python 4 | Pillow 5 | onnxruntime-gpu 6 | onnxruntime 7 | -------------------------------------------------------------------------------- /scripts/geeky-remb.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | from rembg import remove, new_session 4 | from PIL import Image, ImageOps, ImageFilter, ImageEnhance, ImageColor 5 | import cv2 6 | from tqdm import tqdm 7 | import gradio as gr 8 | from modules import script_callbacks, shared, scripts 9 | from modules.paths_internal import models_path 10 | from modules.processing import StableDiffusionProcessing 11 | from modules.images import save_image 12 | import torch 13 | import tempfile 14 | from concurrent.futures import ThreadPoolExecutor 15 | import queue 16 | from threading import Thread 17 | 18 | class BlendMode: 19 | @staticmethod 20 | def _ensure_same_channels(target, blend): 21 | """Ensure both images have the same number of channels""" 22 | if target.shape[-1] == 4 and blend.shape[-1] == 3: 23 | alpha = np.ones((*blend.shape[:2], 1)) 24 | blend = np.concatenate([blend, alpha], axis=-1) 25 | elif target.shape[-1] == 3 and blend.shape[-1] == 4: 26 | alpha = np.ones((*target.shape[:2], 1)) 27 | target = np.concatenate([target, alpha], axis=-1) 28 | return target, blend 29 | 30 | @staticmethod 31 | def _apply_blend(target, blend, operation, opacity=1.0): 32 | """Apply blend operation with proper alpha handling""" 33 | target, blend = BlendMode._ensure_same_channels(target, blend) 34 | 35 | target_rgb = target[..., :3] 36 | blend_rgb = blend[..., :3] 37 | 38 | target_a = target[..., 3:] if target.shape[-1] == 4 else 1 39 | blend_a = blend[..., 3:] if blend.shape[-1] == 4 else 1 40 | 41 | result_rgb = operation(target_rgb, blend_rgb) 42 | result_a = target_a * blend_a 43 | 44 | result_rgb = result_rgb * opacity + target_rgb * (1 - opacity) 45 | result_a = result_a * opacity + target_a * (1 - opacity) 46 | 47 | return np.concatenate([result_rgb, result_a], axis=-1) if target.shape[-1] == 4 else result_rgb 48 | 49 | @staticmethod 50 | def normal(target, blend, opacity=1.0): 51 | return BlendMode._apply_blend(target, blend, lambda t, b: b, opacity) 52 | 53 | @staticmethod 54 | def multiply(target, blend, opacity=1.0): 55 | return BlendMode._apply_blend(target, blend, lambda t, b: t * b, opacity) 56 | 57 | @staticmethod 58 | def screen(target, blend, opacity=1.0): 59 | return 
BlendMode._apply_blend(target, blend, lambda t, b: 1 - (1 - t) * (1 - b), opacity) 60 | 61 | @staticmethod 62 | def overlay(target, blend, opacity=1.0): 63 | def overlay_op(t, b): 64 | return np.where(t > 0.5, 65 | 1 - 2 * (1 - t) * (1 - b), 66 | 2 * t * b) 67 | return BlendMode._apply_blend(target, blend, overlay_op, opacity) 68 | 69 | @staticmethod 70 | def soft_light(target, blend, opacity=1.0): 71 | def soft_light_op(t, b): 72 | return np.where(b > 0.5, 73 | t + (2 * b - 1) * (t - t * t), 74 | t - (1 - 2 * b) * t * (1 - t)) 75 | return BlendMode._apply_blend(target, blend, soft_light_op, opacity) 76 | 77 | @staticmethod 78 | def hard_light(target, blend, opacity=1.0): 79 | def hard_light_op(t, b): 80 | return np.where(b > 0.5, 81 | 1 - (1 - t) * (2 - 2 * b), 82 | 2 * t * b) 83 | return BlendMode._apply_blend(target, blend, hard_light_op, opacity) 84 | 85 | @staticmethod 86 | def difference(target, blend, opacity=1.0): 87 | return BlendMode._apply_blend(target, blend, lambda t, b: np.abs(t - b), opacity) 88 | 89 | @staticmethod 90 | def exclusion(target, blend, opacity=1.0): 91 | return BlendMode._apply_blend(target, blend, lambda t, b: t + b - 2 * t * b, opacity) 92 | 93 | @staticmethod 94 | def color_dodge(target, blend, opacity=1.0): 95 | def color_dodge_op(t, b): 96 | return np.where(b >= 1, 1, np.minimum(1, t / (1 - b + 1e-6))) 97 | return BlendMode._apply_blend(target, blend, color_dodge_op, opacity) 98 | 99 | @staticmethod 100 | def color_burn(target, blend, opacity=1.0): 101 | def color_burn_op(t, b): 102 | return np.where(b <= 0, 0, np.maximum(0, 1 - (1 - t) / (b + 1e-6))) 103 | return BlendMode._apply_blend(target, blend, color_burn_op, opacity) 104 | 105 | class GeekyRemB: 106 | def __init__(self): 107 | self.session = None 108 | if "U2NET_HOME" not in os.environ: 109 | os.environ["U2NET_HOME"] = os.path.join(models_path, "u2net") 110 | self.processing = False 111 | self.use_gpu = torch.cuda.is_available() 112 | self.frame_cache = {} 113 | self.max_cache_size = 100 114 | self.batch_size = 4 # Adjust based on your memory constraints 115 | self.max_workers = 4 # Adjust based on your CPU cores 116 | self.executor = ThreadPoolExecutor(max_workers=self.max_workers) 117 | self.blend_modes = { 118 | "normal": BlendMode.normal, 119 | "multiply": BlendMode.multiply, 120 | "screen": BlendMode.screen, 121 | "overlay": BlendMode.overlay, 122 | "soft_light": BlendMode.soft_light, 123 | "hard_light": BlendMode.hard_light, 124 | "difference": BlendMode.difference, 125 | "exclusion": BlendMode.exclusion, 126 | "color_dodge": BlendMode.color_dodge, 127 | "color_burn": BlendMode.color_burn 128 | } 129 | 130 | def process_frame_batch(self, frames, background_frames, *args): 131 | """Process multiple frames in parallel""" 132 | futures = [] 133 | for frame, bg_frame in zip(frames, background_frames): 134 | future = self.executor.submit(self.process_frame, frame, bg_frame, *args) 135 | futures.append(future) 136 | return [future.result() for future in futures] 137 | 138 | def process_video(self, input_path, output_path, background_video_path, *args): 139 | try: 140 | cap = cv2.VideoCapture(input_path) 141 | fps = cap.get(cv2.CAP_PROP_FPS) 142 | width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) 143 | height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) 144 | total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) 145 | 146 | bg_cap = None 147 | if background_video_path: 148 | bg_cap = cv2.VideoCapture(background_video_path) 149 | bg_total_frames = int(bg_cap.get(cv2.CAP_PROP_FRAME_COUNT)) 150 | 151 | 
            frame_queue = queue.Queue(maxsize=self.batch_size * 2)
152 |             result_queue = queue.Queue()
153 | 
154 |             # Pick an intermediate codec; the final ffmpeg pass below does the real encode
155 |             if self.use_gpu:
156 |                 fourcc = cv2.VideoWriter_fourcc(*'mp4v')
157 |             else:
158 |                 fourcc = cv2.VideoWriter_fourcc(*'XVID')
159 |             out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
160 | 
161 |             def read_frames():
162 |                 frame_idx = 0
163 |                 while frame_idx < total_frames:
164 |                     frames = []
165 |                     bg_frames = []
166 |                     for _ in range(self.batch_size):
167 |                         if frame_idx >= total_frames:
168 |                             break
169 |                         ret, frame = cap.read()
170 |                         if not ret:
171 |                             break
172 | 
173 |                         bg_frame = None
174 |                         if bg_cap is not None:
175 |                             bg_frame_idx = frame_idx % bg_total_frames
176 |                             bg_cap.set(cv2.CAP_PROP_POS_FRAMES, bg_frame_idx)
177 |                             bg_ret, bg_frame = bg_cap.read()
178 |                             if bg_ret:
179 |                                 bg_frame = cv2.resize(bg_frame, (width, height))
180 | 
181 |                         frames.append(frame)
182 |                         bg_frames.append(bg_frame)
183 |                         frame_idx += 1
184 | 
185 |                     if frames:
186 |                         frame_queue.put((frames, bg_frames))
187 |                 frame_queue.put(None)
188 | 
189 |             def process_frames():
190 |                 while True:
191 |                     batch = frame_queue.get()
192 |                     if batch is None:
193 |                         result_queue.put(None)
194 |                         break
195 |                     frames, bg_frames = batch
196 |                     processed_frames = self.process_frame_batch(frames, bg_frames, *args)
197 |                     result_queue.put(processed_frames)
198 | 
199 |             read_thread = Thread(target=read_frames)
200 |             process_thread = Thread(target=process_frames)
201 |             read_thread.start()
202 |             process_thread.start()
203 | 
204 |             with tqdm(total=total_frames, desc="Processing video") as pbar:
205 |                 while True:
206 |                     processed_batch = result_queue.get()
207 |                     if processed_batch is None:
208 |                         break
209 |                     for processed_frame in processed_batch:
210 |                         out.write(processed_frame)
211 |                         pbar.update(1)
212 | 
213 |             read_thread.join()
214 |             process_thread.join()
215 |             cap.release()
216 |             if bg_cap:
217 |                 bg_cap.release()
218 |             out.release()
219 | 
220 |             # Re-encode the final video; NVENC uses -cq for constant quality (-crf is libx264-only)
221 |             temp_output = output_path + "_temp.mp4"
222 |             os.rename(output_path, temp_output)
223 |             if self.use_gpu:
224 |                 os.system(f'ffmpeg -y -i "{temp_output}" -c:v h264_nvenc -preset p7 -tune hq -cq 23 "{output_path}"')
225 |             else:
226 |                 os.system(f'ffmpeg -y -i "{temp_output}" -c:v libx264 -preset faster -crf 23 "{output_path}"')
227 |             if os.path.exists(temp_output):
228 |                 os.remove(temp_output)
229 | 
230 |         except Exception as e:
231 |             print(f"Error processing video: {str(e)}")
232 |             raise
233 | 
234 |         finally:
235 |             if 'cap' in locals():
236 |                 cap.release()
237 |             if 'bg_cap' in locals() and bg_cap is not None:
238 |                 bg_cap.release()
239 |             if 'out' in locals():
240 |                 out.release()
241 | 
242 |     def apply_blend_mode(self, target, blend, mode="normal", opacity=1.0):
243 |         if mode not in self.blend_modes:
244 |             return blend
245 | 
246 |         target = target.astype(np.float32) / 255
247 |         blend = blend.astype(np.float32) / 255
248 | 
249 |         result = self.blend_modes[mode](target, blend, opacity)
250 | 
251 |         return np.clip(result * 255, 0, 255).astype(np.uint8)
252 | 
253 |     def apply_chroma_key(self, image, color, threshold, color_tolerance=20):
254 |         if isinstance(image, Image.Image):
255 |             image = np.array(image)
256 | 
257 |         hsv = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
258 |         if color == "green":
259 |             lower = np.array([40 - color_tolerance, 40, 40])
260 |             upper = np.array([80 + color_tolerance, 255, 255])
261 |         elif color == "blue":
262 |             lower = np.array([90 - color_tolerance, 40, 40])
263 |             upper = np.array([130 + color_tolerance, 255, 255])
264 |         elif color == 
"red": 265 | lower = np.array([0, 40, 40]) 266 | upper = np.array([20 + color_tolerance, 255, 255]) 267 | else: 268 | return np.zeros(image.shape[:2], dtype=np.uint8) 269 | 270 | mask = cv2.inRange(hsv, lower, upper) 271 | mask = 255 - cv2.threshold(mask, threshold, 255, cv2.THRESH_BINARY)[1] 272 | return mask 273 | 274 | def process_mask(self, mask, invert_mask, feather_amount, mask_blur, mask_expansion): 275 | if invert_mask: 276 | mask = 255 - mask 277 | 278 | if mask_expansion != 0: 279 | kernel = np.ones((abs(mask_expansion), abs(mask_expansion)), np.uint8) 280 | if mask_expansion > 0: 281 | mask = cv2.dilate(mask, kernel, iterations=1) 282 | else: 283 | mask = cv2.erode(mask, kernel, iterations=1) 284 | 285 | if feather_amount > 0: 286 | mask = cv2.GaussianBlur(mask, (0, 0), sigmaX=feather_amount) 287 | 288 | if mask_blur > 0: 289 | mask = cv2.GaussianBlur(mask, (0, 0), sigmaX=mask_blur) 290 | 291 | return mask 292 | 293 | def parse_aspect_ratio(self, aspect_ratio_input): 294 | if not aspect_ratio_input: 295 | return None 296 | 297 | if ':' in aspect_ratio_input: 298 | try: 299 | w, h = map(float, aspect_ratio_input.split(':')) 300 | return w / h 301 | except ValueError: 302 | return None 303 | 304 | try: 305 | return float(aspect_ratio_input) 306 | except ValueError: 307 | pass 308 | 309 | standard_ratios = { 310 | '4:3': 4/3, 311 | '16:9': 16/9, 312 | '21:9': 21/9, 313 | '1:1': 1, 314 | 'square': 1, 315 | 'portrait': 3/4, 316 | 'landscape': 4/3 317 | } 318 | 319 | return standard_ratios.get(aspect_ratio_input.lower()) 320 | 321 | def calculate_new_dimensions(self, orig_width, orig_height, scale, aspect_ratio): 322 | new_width = int(orig_width * scale) 323 | 324 | if aspect_ratio is None: 325 | new_height = int(orig_height * scale) 326 | else: 327 | new_height = int(new_width / aspect_ratio) 328 | 329 | return new_width, new_height 330 | 331 | def remove_background(self, image, background_image, model, alpha_matting, alpha_matting_foreground_threshold, 332 | alpha_matting_background_threshold, post_process_mask, chroma_key, chroma_threshold, 333 | color_tolerance, background_mode, background_color, output_format="RGBA", 334 | invert_mask=False, feather_amount=0, edge_detection=False, 335 | edge_thickness=1, edge_color="#FFFFFF", shadow=False, shadow_blur=5, 336 | shadow_opacity=0.5, color_adjustment=False, brightness=1.0, contrast=1.0, 337 | saturation=1.0, x_position=0, y_position=0, rotation=0, opacity=1.0, 338 | flip_horizontal=False, flip_vertical=False, mask_blur=0, mask_expansion=0, 339 | foreground_scale=1.0, foreground_aspect_ratio=None, remove_bg=True, 340 | use_custom_dimensions=False, custom_width=None, custom_height=None, 341 | output_dimension_source="Foreground", blend_mode="normal"): 342 | if self.session is None or self.session.model_name != model: 343 | self.session = new_session(model) 344 | 345 | if not isinstance(background_color, str) or not background_color.startswith('#'): 346 | background_color = "#000000" 347 | 348 | try: 349 | bg_color = tuple(int(background_color.lstrip('#')[i:i+2], 16) for i in (0, 2, 4)) + (255,) 350 | except ValueError: 351 | bg_color = (0, 0, 0, 255) 352 | 353 | try: 354 | edge_color = tuple(int(edge_color.lstrip('#')[i:i+2], 16) for i in (0, 2, 4)) 355 | except ValueError: 356 | edge_color = (255, 255, 255) 357 | 358 | pil_image = image if isinstance(image, Image.Image) else Image.fromarray(np.clip(255. 
* image[0].cpu().numpy(), 0, 255).astype(np.uint8)) 359 | original_image = np.array(pil_image) 360 | 361 | if chroma_key != "none": 362 | chroma_mask = self.apply_chroma_key(original_image, chroma_key, chroma_threshold, color_tolerance) 363 | input_mask = chroma_mask 364 | else: 365 | input_mask = None 366 | 367 | if remove_bg: 368 | removed_bg = remove( 369 | pil_image, 370 | session=self.session, 371 | alpha_matting=alpha_matting, 372 | alpha_matting_foreground_threshold=alpha_matting_foreground_threshold, 373 | alpha_matting_background_threshold=alpha_matting_background_threshold, 374 | post_process_mask=post_process_mask, 375 | ) 376 | rembg_mask = np.array(removed_bg)[:, :, 3] 377 | else: 378 | removed_bg = pil_image.convert("RGBA") 379 | rembg_mask = np.full(pil_image.size[::-1], 255, dtype=np.uint8) 380 | 381 | if input_mask is not None: 382 | final_mask = cv2.bitwise_and(rembg_mask, input_mask) 383 | else: 384 | final_mask = rembg_mask 385 | 386 | final_mask = self.process_mask(final_mask, invert_mask, feather_amount, mask_blur, mask_expansion) 387 | 388 | orig_width, orig_height = pil_image.size 389 | bg_width, bg_height = background_image.size if background_image else (orig_width, orig_height) 390 | 391 | if use_custom_dimensions and custom_width and custom_height: 392 | output_width, output_height = int(custom_width), int(custom_height) 393 | elif output_dimension_source == "Background" and background_image: 394 | output_width, output_height = bg_width, bg_height 395 | else: 396 | output_width, output_height = orig_width, orig_height 397 | 398 | aspect_ratio = self.parse_aspect_ratio(foreground_aspect_ratio) 399 | new_width, new_height = self.calculate_new_dimensions(orig_width, orig_height, foreground_scale, aspect_ratio) 400 | 401 | fg_image = pil_image.resize((new_width, new_height), Image.LANCZOS) 402 | fg_mask = Image.fromarray(final_mask).resize((new_width, new_height), Image.LANCZOS) 403 | 404 | if background_mode == "transparent": 405 | result = Image.new("RGBA", (output_width, output_height), (0, 0, 0, 0)) 406 | elif background_mode == "color": 407 | result = Image.new("RGBA", (output_width, output_height), bg_color) 408 | else: # background_mode == "image" 409 | if background_image is not None: 410 | result = background_image.resize((output_width, output_height), Image.LANCZOS).convert("RGBA") 411 | else: 412 | result = Image.new("RGBA", (output_width, output_height), (0, 0, 0, 0)) 413 | 414 | if flip_horizontal: 415 | fg_image = fg_image.transpose(Image.FLIP_LEFT_RIGHT) 416 | fg_mask = fg_mask.transpose(Image.FLIP_LEFT_RIGHT) 417 | if flip_vertical: 418 | fg_image = fg_image.transpose(Image.FLIP_TOP_BOTTOM) 419 | fg_mask = fg_mask.transpose(Image.FLIP_TOP_BOTTOM) 420 | 421 | fg_image = fg_image.rotate(rotation, resample=Image.BICUBIC, expand=True) 422 | fg_mask = fg_mask.rotate(rotation, resample=Image.BICUBIC, expand=True) 423 | 424 | paste_x = x_position + (output_width - fg_image.width) // 2 425 | paste_y = y_position + (output_height - fg_image.height) // 2 426 | 427 | # Apply blending mode 428 | if background_mode == "image" and background_image is not None: 429 | bg_array = np.array(result) 430 | fg_array = np.array(fg_image) 431 | 432 | # Ensure foreground array matches background dimensions before blending 433 | if bg_array.shape[:2] != fg_array.shape[:2]: 434 | # Resize foreground image to match background dimensions 435 | fg_image = fg_image.resize((output_width, output_height), Image.LANCZOS) 436 | fg_mask = fg_mask.resize((output_width, output_height), 
Image.LANCZOS) 437 | fg_array = np.array(fg_image) 438 | 439 | blended = self.apply_blend_mode(bg_array, fg_array, blend_mode, opacity) 440 | fg_with_opacity = Image.fromarray(blended) 441 | 442 | # Update paste coordinates since we resized 443 | paste_x = x_position 444 | paste_y = y_position 445 | else: 446 | fg_rgba = fg_image.convert("RGBA") 447 | fg_with_opacity = Image.new("RGBA", fg_rgba.size, (0, 0, 0, 0)) 448 | for x in range(fg_rgba.width): 449 | for y in range(fg_rgba.height): 450 | r, g, b, a = fg_rgba.getpixel((x, y)) 451 | fg_with_opacity.putpixel((x, y), (r, g, b, int(a * opacity))) 452 | 453 | # Ensure mask has same dimensions as image for pasting 454 | fg_mask_with_opacity = fg_mask.point(lambda p: int(p * opacity)) 455 | if fg_mask_with_opacity.size != fg_with_opacity.size: 456 | fg_mask_with_opacity = fg_mask_with_opacity.resize(fg_with_opacity.size, Image.LANCZOS) 457 | 458 | result.paste(fg_with_opacity, (paste_x, paste_y), fg_mask_with_opacity) 459 | 460 | if edge_detection: 461 | edge_mask = cv2.Canny(np.array(fg_mask), 100, 200) 462 | edge_mask = cv2.dilate(edge_mask, np.ones((edge_thickness, edge_thickness), np.uint8), iterations=1) 463 | edge_overlay = Image.new("RGBA", (output_width, output_height), (0, 0, 0, 0)) 464 | edge_overlay.paste(Image.new("RGB", fg_image.size, edge_color), (paste_x, paste_y), Image.fromarray(edge_mask)) 465 | result = Image.alpha_composite(result, edge_overlay) 466 | 467 | if shadow: 468 | shadow_mask = fg_mask.filter(ImageFilter.GaussianBlur(shadow_blur)) 469 | shadow_image = Image.new("RGBA", (output_width, output_height), (0, 0, 0, 0)) 470 | shadow_image.paste((0, 0, 0, int(255 * shadow_opacity)), (paste_x, paste_y), shadow_mask) 471 | result = Image.alpha_composite(result, shadow_image.filter(ImageFilter.GaussianBlur(shadow_blur))) 472 | 473 | if color_adjustment: 474 | enhancer = ImageEnhance.Brightness(result) 475 | result = enhancer.enhance(brightness) 476 | enhancer = ImageEnhance.Contrast(result) 477 | result = enhancer.enhance(contrast) 478 | enhancer = ImageEnhance.Color(result) 479 | result = enhancer.enhance(saturation) 480 | 481 | if output_format == "RGB": 482 | result = result.convert("RGB") 483 | 484 | return result, fg_mask 485 | 486 | def parse_color(self, color): 487 | """Safely parse color string to RGB tuple""" 488 | if isinstance(color, str) and color.startswith('#') and len(color) == 7: 489 | try: 490 | return tuple(int(color.lstrip('#')[i:i+2], 16) for i in (0, 2, 4)) 491 | except ValueError: 492 | pass 493 | return (0, 0, 0) # Default to black if parsing fails 494 | 495 | def process_frame(self, frame, background_frame=None, *args): 496 | """Process a single video frame with proper color handling""" 497 | if isinstance(frame, np.ndarray): 498 | pil_frame = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)) 499 | else: 500 | pil_frame = frame 501 | 502 | args = list(args) 503 | 504 | if len(args) > 9: # Handle background color 505 | bg_color = self.parse_color(args[9]) 506 | args[9] = f"#{bg_color[0]:02x}{bg_color[1]:02x}{bg_color[2]:02x}" 507 | 508 | if len(args) > 14: # Handle edge color 509 | edge_color = self.parse_color(args[14]) 510 | args[14] = f"#{edge_color[0]:02x}{edge_color[1]:02x}{edge_color[2]:02x}" 511 | 512 | if background_frame is not None: 513 | if isinstance(background_frame, np.ndarray): 514 | background_frame = Image.fromarray(cv2.cvtColor(background_frame, cv2.COLOR_BGR2RGB)) 515 | 516 | args = tuple(args) 517 | processed_frame, _ = self.remove_background(pil_frame, background_frame, 
*args) 518 | return cv2.cvtColor(np.array(processed_frame), cv2.COLOR_RGB2BGR) 519 | 520 | def process_video(self, input_path, output_path, background_video_path, *args): 521 | try: 522 | cap = cv2.VideoCapture(input_path) 523 | fps = cap.get(cv2.CAP_PROP_FPS) 524 | width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) 525 | height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) 526 | total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT)) 527 | 528 | bg_cap = None 529 | if background_video_path: 530 | bg_cap = cv2.VideoCapture(background_video_path) 531 | bg_total_frames = int(bg_cap.get(cv2.CAP_PROP_FRAME_COUNT)) 532 | 533 | fourcc = cv2.VideoWriter_fourcc(*'mp4v') 534 | out = cv2.VideoWriter(output_path, fourcc, fps, (width, height)) 535 | 536 | with tqdm(total=total_frames, desc="Processing video") as pbar: 537 | frame_idx = 0 538 | while True: 539 | ret, frame = cap.read() 540 | if not ret: 541 | break 542 | 543 | bg_frame = None 544 | if bg_cap is not None: 545 | bg_frame_idx = frame_idx % bg_total_frames 546 | bg_cap.set(cv2.CAP_PROP_POS_FRAMES, bg_frame_idx) 547 | bg_ret, bg_frame = bg_cap.read() 548 | if bg_ret: 549 | bg_frame = cv2.resize(bg_frame, (width, height)) 550 | 551 | processed_frame = self.process_frame(frame, bg_frame, *args) 552 | out.write(processed_frame) 553 | 554 | frame_idx += 1 555 | pbar.update(1) 556 | 557 | cap.release() 558 | if bg_cap: 559 | bg_cap.release() 560 | out.release() 561 | 562 | # Convert output video to MP4 container 563 | temp_output = output_path + "_temp.mp4" 564 | os.rename(output_path, temp_output) 565 | os.system(f'ffmpeg -i "{temp_output}" -c copy "{output_path}"') 566 | if os.path.exists(temp_output): 567 | os.remove(temp_output) 568 | 569 | except Exception as e: 570 | print(f"Error processing video: {str(e)}") 571 | raise 572 | 573 | def on_ui_tabs(): 574 | with gr.Blocks(analytics_enabled=False) as geeky_remb_tab: 575 | gr.Markdown("# GeekyRemB: Background Removal and Image/Video Manipulation") 576 | 577 | with gr.Row(): 578 | with gr.Column(scale=1): 579 | input_type = gr.Radio(["Image", "Video"], label="Input Type", value="Image") 580 | foreground_input = gr.Image(label="Foreground Image", type="pil", visible=True) 581 | foreground_video = gr.Video(label="Foreground Video", visible=False) 582 | run_button = gr.Button(label="Run GeekyRemB") 583 | 584 | with gr.Group(): 585 | gr.Markdown("### Foreground Adjustments") 586 | with gr.Group(): 587 | blend_mode = gr.Dropdown( 588 | label="Blend Mode", 589 | choices=["normal", "multiply", "screen", "overlay", "soft_light", 590 | "hard_light", "difference", "exclusion", "color_dodge", "color_burn"], 591 | value="normal" 592 | ) 593 | opacity = gr.Slider(label="Opacity", minimum=0.0, maximum=1.0, value=1.0, step=0.01) 594 | 595 | foreground_scale = gr.Slider(label="Scale", minimum=0.1, maximum=5.0, value=1.0, step=0.1) 596 | foreground_aspect_ratio = gr.Textbox( 597 | label="Aspect Ratio", 598 | placeholder="e.g., 16:9, 4:3, 1:1, portrait, landscape, or leave blank for original", 599 | value="" 600 | ) 601 | x_position = gr.Slider(label="X Position", minimum=-1000, maximum=1000, value=0, step=1) 602 | y_position = gr.Slider(label="Y Position", minimum=-1000, maximum=1000, value=0, step=1) 603 | rotation = gr.Slider(label="Rotation", minimum=-360, maximum=360, value=0, step=0.1) 604 | 605 | with gr.Row(): 606 | flip_horizontal = gr.Checkbox(label="Flip Horizontal", value=False) 607 | flip_vertical = gr.Checkbox(label="Flip Vertical", value=False) 608 | 609 | with gr.Column(scale=1): 610 | result_type = 
gr.Radio(["Image", "Video"], label="Output Type", value="Image") 611 | result_image = gr.Image(label="Result Image", type="pil", visible=True) 612 | result_video = gr.Video(label="Result Video", visible=False) 613 | 614 | with gr.Group(): 615 | gr.Markdown("### Background Options") 616 | remove_background = gr.Checkbox(label="Remove Background", value=True) 617 | background_mode = gr.Radio(label="Background Mode", choices=["transparent", "color", "image", "video"], value="transparent") 618 | background_color = gr.ColorPicker(label="Background Color", value="#000000", visible=False) 619 | background_image = gr.Image(label="Background Image", type="pil", visible=False) 620 | background_video = gr.Video(label="Background Video", visible=False) 621 | 622 | with gr.Accordion("Advanced Settings", open=False): 623 | with gr.Row(): 624 | with gr.Column(): 625 | gr.Markdown("### Removal Settings") 626 | model = gr.Dropdown(label="Model", choices=["u2net", "u2netp", "u2net_human_seg", "u2net_cloth_seg", "silueta", "isnet-general-use", "isnet-anime"], value="u2net") 627 | output_format = gr.Radio(label="Output Format", choices=["RGBA", "RGB"], value="RGBA") 628 | alpha_matting = gr.Checkbox(label="Alpha Matting", value=False) 629 | alpha_matting_foreground_threshold = gr.Slider(label="Alpha Matting Foreground Threshold", minimum=0, maximum=255, value=240, step=1) 630 | alpha_matting_background_threshold = gr.Slider(label="Alpha Matting Background Threshold", minimum=0, maximum=255, value=10, step=1) 631 | post_process_mask = gr.Checkbox(label="Post Process Mask", value=False) 632 | 633 | with gr.Column(): 634 | gr.Markdown("### Chroma Key Settings") 635 | chroma_key = gr.Dropdown(label="Chroma Key", choices=["none", "green", "blue", "red"], value="none") 636 | chroma_threshold = gr.Slider(label="Chroma Threshold", minimum=0, maximum=255, value=30, step=1) 637 | color_tolerance = gr.Slider(label="Color Tolerance", minimum=0, maximum=255, value=20, step=1) 638 | 639 | with gr.Column(): 640 | gr.Markdown("### Effects") 641 | invert_mask = gr.Checkbox(label="Invert Mask", value=False) 642 | feather_amount = gr.Slider(label="Feather Amount", minimum=0, maximum=100, value=0, step=1) 643 | edge_detection = gr.Checkbox(label="Edge Detection", value=False) 644 | edge_thickness = gr.Slider(label="Edge Thickness", minimum=1, maximum=10, value=1, step=1) 645 | edge_color = gr.ColorPicker(label="Edge Color", value="#FFFFFF") 646 | shadow = gr.Checkbox(label="Shadow", value=False) 647 | shadow_blur = gr.Slider(label="Shadow Blur", minimum=0, maximum=20, value=5, step=1) 648 | shadow_opacity = gr.Slider(label="Shadow Opacity", minimum=0.0, maximum=1.0, value=0.5, step=0.1) 649 | color_adjustment = gr.Checkbox(label="Color Adjustment", value=False) 650 | brightness = gr.Slider(label="Brightness", minimum=0.0, maximum=2.0, value=1.0, step=0.1) 651 | contrast = gr.Slider(label="Contrast", minimum=0.0, maximum=2.0, value=1.0, step=0.1) 652 | saturation = gr.Slider(label="Saturation", minimum=0.0, maximum=2.0, value=1.0, step=0.1) 653 | mask_blur = gr.Slider(label="Mask Blur", minimum=0, maximum=100, value=0, step=1) 654 | mask_expansion = gr.Slider(label="Mask Expansion", minimum=-100, maximum=100, value=0, step=1) 655 | 656 | with gr.Row(): 657 | gr.Markdown("### Output Settings") 658 | image_format = gr.Dropdown(label="Image Format", choices=["PNG", "JPEG", "WEBP"], value="PNG") 659 | video_format = gr.Dropdown(label="Video Format", choices=["MP4", "AVI", "MOV"], value="MP4") 660 | video_quality = 
gr.Slider(label="Video Quality", minimum=0, maximum=100, value=95, step=1) 661 | use_custom_dimensions = gr.Checkbox(label="Use Custom Dimensions", value=False) 662 | custom_width = gr.Number(label="Custom Width", value=512, visible=False) 663 | custom_height = gr.Number(label="Custom Height", value=512, visible=False) 664 | output_dimension_source = gr.Radio( 665 | label="Output Dimension Source", 666 | choices=["Foreground", "Background"], 667 | value="Foreground", 668 | visible=True 669 | ) 670 | 671 | def update_input_type(choice): 672 | return { 673 | foreground_input: gr.update(visible=choice == "Image"), 674 | foreground_video: gr.update(visible=choice == "Video") 675 | } 676 | 677 | def update_output_type(choice): 678 | return { 679 | result_image: gr.update(visible=choice == "Image"), 680 | result_video: gr.update(visible=choice == "Video") 681 | } 682 | 683 | def update_background_mode(mode): 684 | return { 685 | background_color: gr.update(visible=mode == "color"), 686 | background_image: gr.update(visible=mode == "image"), 687 | background_video: gr.update(visible=mode == "video") 688 | } 689 | 690 | def update_custom_dimensions(use_custom): 691 | return { 692 | custom_width: gr.update(visible=use_custom), 693 | custom_height: gr.update(visible=use_custom), 694 | output_dimension_source: gr.update(visible=not use_custom) 695 | } 696 | 697 | def process_image(image, background_image, *args): 698 | geeky_remb = GeekyRemB() 699 | result, _ = geeky_remb.remove_background(image, background_image, *args) 700 | return result 701 | 702 | def process_video(video_path, background_video_path, *args): 703 | geeky_remb = GeekyRemB() 704 | with tempfile.NamedTemporaryFile(delete=False, suffix=".mp4") as temp_file: 705 | output_path = temp_file.name 706 | geeky_remb.process_video(video_path, output_path, background_video_path, *args) 707 | return output_path 708 | 709 | def run_geeky_remb(input_type, foreground_input, foreground_video, result_type, model, 710 | output_format, alpha_matting, alpha_matting_foreground_threshold, 711 | alpha_matting_background_threshold, post_process_mask, chroma_key, 712 | chroma_threshold, color_tolerance, background_mode, background_color, 713 | background_image, background_video, invert_mask, feather_amount, 714 | edge_detection, edge_thickness, edge_color, shadow, shadow_blur, 715 | shadow_opacity, color_adjustment, brightness, contrast, saturation, 716 | x_position, y_position, rotation, opacity, flip_horizontal, 717 | flip_vertical, mask_blur, mask_expansion, foreground_scale, 718 | foreground_aspect_ratio, remove_background, image_format, 719 | video_format, video_quality, use_custom_dimensions, custom_width, 720 | custom_height, output_dimension_source, blend_mode): 721 | 722 | if not isinstance(background_color, str) or not background_color.startswith('#'): 723 | background_color = "#000000" 724 | if not isinstance(edge_color, str) or not edge_color.startswith('#'): 725 | edge_color = "#FFFFFF" 726 | 727 | args = (model, alpha_matting, alpha_matting_foreground_threshold, 728 | alpha_matting_background_threshold, post_process_mask, chroma_key, 729 | chroma_threshold, color_tolerance, background_mode, background_color, 730 | output_format, invert_mask, feather_amount, edge_detection, 731 | edge_thickness, edge_color, shadow, shadow_blur, shadow_opacity, 732 | color_adjustment, brightness, contrast, saturation, x_position, 733 | y_position, rotation, opacity, flip_horizontal, flip_vertical, 734 | mask_blur, mask_expansion, foreground_scale, 
foreground_aspect_ratio, 735 | remove_background, use_custom_dimensions, custom_width, custom_height, 736 | output_dimension_source, blend_mode) 737 | 738 | if input_type == "Image" and result_type == "Image": 739 | result = process_image(foreground_input, background_image, *args) 740 | if image_format != "PNG": 741 | result = result.convert("RGB") 742 | with tempfile.NamedTemporaryFile(delete=False, suffix=f".{image_format.lower()}") as temp_file: 743 | result.save(temp_file.name, format=image_format, quality=95 if image_format == "JPEG" else None) 744 | return temp_file.name, None 745 | elif input_type == "Video" and result_type == "Video": 746 | output_video = process_video(foreground_video, background_video if background_mode == "video" else None, *args) 747 | if video_format != "MP4": 748 | temp_output = output_video + f"_temp.{video_format.lower()}" 749 | os.system(f'ffmpeg -i "{output_video}" -c:v libx264 -crf {int(20 - (video_quality / 5))} "{temp_output}"') 750 | os.remove(output_video) 751 | output_video = temp_output 752 | return None, output_video 753 | elif input_type == "Image" and result_type == "Video": 754 | with tempfile.NamedTemporaryFile(delete=False, suffix=".mp4") as temp_file: 755 | output_path = temp_file.name 756 | frame = cv2.cvtColor(np.array(foreground_input), cv2.COLOR_RGB2BGR) 757 | height, width = frame.shape[:2] 758 | fourcc = cv2.VideoWriter_fourcc(*'mp4v') 759 | out = cv2.VideoWriter(output_path, fourcc, 24, (width, height)) 760 | for _ in range(24 * 5): # 5 seconds at 24 fps 761 | out.write(frame) 762 | out.release() 763 | return None, process_video(output_path, background_video if background_mode == "video" else None, *args) 764 | elif input_type == "Video" and result_type == "Image": 765 | cap = cv2.VideoCapture(foreground_video) 766 | ret, frame = cap.read() 767 | cap.release() 768 | if ret: 769 | pil_frame = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)) 770 | result = process_image(pil_frame, background_image, *args) 771 | if image_format != "PNG": 772 | result = result.convert("RGB") 773 | with tempfile.NamedTemporaryFile(delete=False, suffix=f".{image_format.lower()}") as temp_file: 774 | result.save(temp_file.name, format=image_format, quality=95 if image_format == "JPEG" else None) 775 | return temp_file.name, None 776 | return None, None 777 | 778 | input_type.change(update_input_type, inputs=[input_type], outputs=[foreground_input, foreground_video]) 779 | result_type.change(update_output_type, inputs=[result_type], outputs=[result_image, result_video]) 780 | background_mode.change(update_background_mode, inputs=[background_mode], 781 | outputs=[background_color, background_image, background_video]) 782 | use_custom_dimensions.change(update_custom_dimensions, inputs=[use_custom_dimensions], 783 | outputs=[custom_width, custom_height, output_dimension_source]) 784 | 785 | run_button.click( 786 | fn=run_geeky_remb, 787 | inputs=[ 788 | input_type, foreground_input, foreground_video, result_type, 789 | model, output_format, alpha_matting, alpha_matting_foreground_threshold, 790 | alpha_matting_background_threshold, post_process_mask, chroma_key, 791 | chroma_threshold, color_tolerance, background_mode, background_color, 792 | background_image, background_video, invert_mask, feather_amount, 793 | edge_detection, edge_thickness, edge_color, shadow, shadow_blur, 794 | shadow_opacity, color_adjustment, brightness, contrast, saturation, 795 | x_position, y_position, rotation, opacity, flip_horizontal, 796 | flip_vertical, mask_blur, 
mask_expansion, foreground_scale,
797 |                 foreground_aspect_ratio, remove_background, image_format, video_format,
798 |                 video_quality, use_custom_dimensions, custom_width, custom_height,
799 |                 output_dimension_source, blend_mode
800 |             ],
801 |             outputs=[result_image, result_video]
802 |         )
803 | 
804 |     return [(geeky_remb_tab, "GeekyRemB", "geeky_remb_tab")]
805 | 
806 | def on_ui_settings():
807 |     section = ("geeky-remb", "GeekyRemB")
808 |     shared.opts.add_option(
809 |         "geekyremb_saving_path",
810 |         shared.OptionInfo(
811 |             "outputs/geekyremb",
812 |             "GeekyRemB saving path",
813 |             gr.Textbox,
814 |             {"placeholder": "outputs/geekyremb"},
815 |             section=section
816 |         ),
817 |     )
818 |     shared.opts.add_option(
819 |         "geekyremb_max_video_length",
820 |         shared.OptionInfo(
821 |             300,
822 |             "Maximum video length in seconds",
823 |             gr.Number,
824 |             {"minimum": 1, "maximum": 3600},
825 |             section=section
826 |         ),
827 |     )
828 |     shared.opts.add_option(
829 |         "geekyremb_max_image_size",
830 |         shared.OptionInfo(
831 |             4096,
832 |             "Maximum image dimension",
833 |             gr.Number,
834 |             {"minimum": 512, "maximum": 8192},
835 |             section=section
836 |         ),
837 |     )
838 | 
839 | script_callbacks.on_ui_tabs(on_ui_tabs)
840 | script_callbacks.on_ui_settings(on_ui_settings)
841 | 
--------------------------------------------------------------------------------