├── requirements.txt ├── __init__.py ├── README.md └── vton_api_node.py /requirements.txt: -------------------------------------------------------------------------------- 1 | torch 2 | torchvision 3 | numpy 4 | Pillow 5 | requests -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- 1 | from .vton_api_node import NODE_CLASS_MAPPINGS, NODE_DISPLAY_NAME_MAPPINGS 2 | 3 | __all__ = ['NODE_CLASS_MAPPINGS', 'NODE_DISPLAY_NAME_MAPPINGS'] 4 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # ComfyUI_sm4ll-Wrapper 2 | A wrapper node for sm4ll-VTON models with both free demo and paid API access options. 3 | 4 | ![image](https://github.com/user-attachments/assets/8c08b4fd-a9a6-42ef-a9b8-ec577ac9f0b7) 5 | 6 | ## Installation 7 | 8 | 1. Clone or download this repository to your ComfyUI custom nodes directory: 9 | ```bash 10 | cd ComfyUI/custom_nodes/ 11 | git clone https://github.com/risunobushi/ComfyUI_sm4ll-Wrapper.git 12 | ``` 13 | 14 | 2. Install the required dependencies: 15 | ```bash 16 | cd ComfyUI_sm4ll-Wrapper 17 | pip install -r requirements.txt 18 | ``` 19 | 20 | 3. Restart ComfyUI 21 | 22 | ## Available Nodes 23 | 24 | This package provides three nodes for different use cases: 25 | 26 | ### 1. **sm4ll Wrapper Sampler - Demo Version** (Free) 27 | Uses the free demo API from HuggingFace Spaces with rate limiting. 28 | 29 | **Configure Inputs**: 30 | - **base_person_image**: The person/model image (required) 31 | - **product_image**: The garment/product to try on (required) 32 | - **model_choice**: Select from "eyewear", "footwear", "full-body", or "top garment" (required) 33 | - **base_person_mask** (optional): Custom mask if you don't trust the automasking feature 34 | 35 | ### 2. **sm4ll Wrapper Sampler - Paid API** (Requires API Key) 36 | Uses the production API with higher reliability, no rate limits, and priority processing. 37 | 38 | **Configure Inputs**: 39 | - **base_person_image**: The person/model image (required) 40 | - **product_image**: The garment/product to try on (required) 41 | - **model_choice**: Select from "eyewear", "footwear", "full-body", or "top garment" (required) 42 | - **api_key**: Your YourMirror API key (required) - format: `ym_29_characters` 43 | - **quality_tier**: Select quality level - "normal" (16 steps) or "high" (40 steps) (optional, defaults to "normal") 44 | - **base_person_mask** (optional): Custom mask for precise control 45 | 46 | ### 3. **sm4ll Wrapper Lookbook Sampler** (Requires API Key) 47 | Create multi-garment lookbook compositions with a single person image and up to 4 garment images. 48 | 49 | **Configure Inputs**: 50 | - **person_image**: The person/model image (required) 51 | - **garment_1**: First garment to include in the composition (required) 52 | - **garment_2**: Second garment (optional) 53 | - **garment_3**: Third garment (optional) 54 | - **garment_4**: Fourth garment (optional) 55 | - **mode**: Gender mode - "Male" or "Female" (required) 56 | - **quality**: Quality level - "Normal" or "High" (required) 57 | - **prompt**: Optional text description to guide the generation (optional) 58 | - **api_key**: Your YourMirror API key (required) - format: `ym_29_characters` 59 | 60 | **Note**: The Lookbook node composes multiple garments into a grid layout and generates a comprehensive try-on visualization showing the person wearing all provided garments simultaneously. 61 | 62 | ## Getting API Keys 63 | 64 | To use the **Paid API** or **Lookbook** nodes, you need a YourMirror API key: 65 | 66 | 1. **Sign up** at [studio.yourmirror.io](https://studio.yourmirror.io) 67 | 2. **Complete the onboarding** confirm your email and verify your account 68 | 3. **Navigate to API Keys** section in your dashboard 69 | 4. **Create a new API key** - you'll see it only once, so copy it immediately 70 | 5. **Use the API key** in the ComfyUI node (format: `ym_` followed by 29 characters) 71 | 72 | ### API Key Features: 73 | - **Format**: `ym_` + 29 random characters (total length: 32) 74 | - **Rate Limit**: 1000 requests per hour per key 75 | - **Usage Tracking**: All API calls are logged for billing and analytics 76 | - **NSFW Filtering**: Automatic content screening for appropriate use 77 | 78 | ## Usage Instructions 79 | 80 | 1. In ComfyUI, look for any of the three nodes in the `sm4ll/VTON` category: 81 | - "sm4ll Wrapper Sampler - Demo Version" (free, rate-limited) 82 | - "sm4ll Wrapper Sampler - Paid API" (requires API key) 83 | - "sm4ll Wrapper Lookbook Sampler" (requires API key, multi-garment) 84 | 85 | 2a. **Input Images**: 86 | - For Demo/Paid API nodes: Set a person image and a product image as inputs (e.g.: using Load Image nodes) 87 | - For Lookbook node: Set a person image and 1-4 garment images as inputs 88 | 89 | 2b. **(Optional) Input Mask**: If you don't want to trust the automasking features, set a Mask input for the person image (Demo/Paid API nodes only) 90 | 91 | 3. **Additional Configuration**: 92 | - For Lookbook node: Select gender mode (Male/Female), quality level, and optionally add a text prompt 93 | 94 | 4. **Output**: All nodes output a processed IMAGE that can be connected to a Save Image node, or other nodes for further processing 95 | 96 | ## How It Works 97 | 98 | ### Demo Version (Free) 99 | 1. **Image Processing**: Input images are automatically resized to 1.62 megapixels using Lanczos interpolation 100 | 2. **Upload**: Images are uploaded to the HuggingFace Spaces demo environment's temp storage for handling 101 | 3. **API Call**: Uses the free demo API with rate limiting and queue management 102 | 4. **Result Retrieval**: Downloads the processed image and converts back to ComfyUI tensor format 103 | 104 | ### Paid API Version 105 | 1. **Image Processing**: Same high-quality image preprocessing as demo version 106 | 2. **Authentication**: Validates your API key format and permissions 107 | 3. **Upload**: Images are uploaded to the production API infrastructure 108 | 4. **Priority Processing**: Dedicated endpoints for faster processing 109 | 110 | ### Lookbook Version (Multi-Garment) 111 | 1. **Image Processing**: All images (person + up to 4 garments) are resized to 1.62 megapixels 112 | 2. **Garment Composition**: Multiple garment images are composed into a grid layout on the backend 113 | 3. **Authentication**: Validates your API key format and permissions 114 | 4. **Upload**: All images are uploaded to the production API infrastructure 115 | 5. **Lookbook Generation**: Uses specialized AI model to create comprehensive try-on visualization 116 | 6. **Mode-Aware Processing**: Optimized for male or female body types based on selection 117 | 118 | ## API Pricing & Limits 119 | 120 | ### Demo Version (Free) 121 | - **Cost**: Free 122 | - **Rate Limits**: Tied to the same limitations as the HuggingFace Space Demo 123 | - **Queue**: May experience delays during peak usage, queue is shared with all Demo users 124 | - **Non-Commercial License**: Demo image outputs are subjected to a non-commercial license. Check the Terms and Conditions on [studio.yourmirror.io](https://studio.yourmirror.io) for more details. 125 | 126 | ### Paid API (Single Garment VTON) 127 | - **Cost (Normal Quality)**: $0.05 per API call (16 sampling steps) 128 | - **Cost (High Quality)**: $0.10 per API call (40 sampling steps) 129 | - **Rate Limits**: 1000 requests per hour per API key 130 | - **Processing**: Separate queue, separate endpoints for each sm4ll model 131 | - **Billing**: Automatic monthly billing based on actual usage 132 | - **Full Commercial License**: Paid API image outputs are subjected to a full commercial license. Check the Terms and Conditions on [studio.yourmirror.io](https://studio.yourmirror.io) for more details. 133 | 134 | ### Lookbook API (Multi-Garment) 135 | - **Cost (Normal Quality)**: $0.15 per API call (3 credits) 136 | - **Cost (High Quality)**: $0.30 per API call (6 credits) 137 | - **Rate Limits**: 1000 requests per hour per API key 138 | - **Processing**: Dedicated lookbook processing pipeline 139 | - **Features**: Supports 1-4 garments, gender-specific optimization, text prompts 140 | - **Billing**: Automatic monthly billing based on actual usage 141 | - **Full Commercial License**: Lookbook API image outputs are subjected to a full commercial license. Check the Terms and Conditions on [studio.yourmirror.io](https://studio.yourmirror.io) for more details. 142 | 143 | ## Supported Models & Use Cases 144 | 145 | ### Demo & Paid API Nodes (Single Garment) 146 | Support specific model types for single-garment try-on: 147 | 148 | - **Eyewear**: Sunglasses, prescription glasses, safety goggles 149 | - **Footwear**: Shoes, boots, sneakers, sandals 150 | - **Full-body**: Dresses, coats, full outfits 151 | - **Top Garment**: Shirts, jackets, tops, blouses 152 | 153 | ### Lookbook Node (Multi-Garment) 154 | Uses specialized lookbook VTON model that can handle any combination of garment types: 155 | 156 | - **Mixed Categories**: Combine different garment types (shoes + dress + accessories) 157 | - **Multiple Items**: Up to 4 different garments in one composition 158 | - **Cross-Category**: No restrictions on garment combinations 159 | - **Gender-Optimized**: Separate processing pipelines for male/female body types 160 | 161 | ## Technical Details 162 | 163 | - **API Integration**: 164 | - Demo: Uses HuggingFace Spaces Gradio API 165 | - Paid API: Uses production api.yourmirror.io with SSE streaming 166 | - Lookbook: Uses production api.yourmirror.io with direct HTTP POST 167 | - **Image Requirements**: 168 | - Supported formats: JPG, PNG, WEBP 169 | - Optimal size: Around 1.62 megapixels (automatically resized) 170 | 171 | ## Troubleshooting 172 | 173 | ### Common Issues 174 | 175 | - **Red Output Image**: Indicates an error occurred during processing. Check the console for detailed error messages. 176 | - **API Key Errors**: Ensure your API key starts with `ym_` and is exactly 32 characters long 177 | - **Upload Failures**: Check your internet connection and image format compatibility 178 | - **Timeout Issues**: 179 | - Demo: May experience delays during peak usage 180 | - Paid API: 5-minute timeout for processing, contact support if persistent 181 | 182 | ### Error Messages 183 | 184 | - `"API key is required"`: You're using the Paid API node without providing an API key 185 | - `"Invalid API key format"`: Your API key should be `ym_` followed by 29 characters 186 | - `"Failed to upload"`: Network connectivity or server availability issue 187 | 188 | ## Requirements 189 | 190 | - torch 191 | - torchvision 192 | - numpy 193 | - Pillow 194 | - requests 195 | 196 | ## Credits 197 | 198 | - Built for the [sm4ll-VTON](https://huggingface.co/spaces/sm4ll-VTON/sm4ll-VTON-Demo) family of models -------------------------------------------------------------------------------- /vton_api_node.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | import numpy as np 4 | from PIL import Image 5 | import requests 6 | import time 7 | import json 8 | import math 9 | import io 10 | import random 11 | import base64 12 | 13 | def tensor_to_pil(tensor: torch.Tensor, batch_index=0): 14 | """Converts a ComfyUI image tensor to a PIL Image (RGB).""" 15 | try: 16 | # Ensure tensor is on CPU and detached 17 | tensor = tensor.detach().cpu() 18 | 19 | print(f"DEBUG: Input tensor shape: {tensor.shape}, dtype: {tensor.dtype}") 20 | print(f"DEBUG: Tensor min: {tensor.min():.3f}, max: {tensor.max():.3f}") 21 | 22 | # Handle batch dimension 23 | if tensor.ndim == 4: # BCHW format 24 | tensor = tensor[batch_index] 25 | print(f"DEBUG: After batch selection: {tensor.shape}") 26 | 27 | # Handle different tensor formats 28 | if tensor.ndim == 3: 29 | height, width, channels = tensor.shape 30 | 31 | # Standard ComfyUI format is HWC (Height, Width, Channels) 32 | if channels <= 4: # RGB/RGBA 33 | image_np = tensor.numpy() 34 | print(f"DEBUG: Using HWC format: {image_np.shape}") 35 | else: 36 | # If channels > 4, assume it's CHW format 37 | if tensor.shape[0] <= 4: # CHW 38 | image_np = tensor.permute(1, 2, 0).numpy() 39 | print(f"DEBUG: Converted CHW to HWC: {image_np.shape}") 40 | else: 41 | # Fallback: assume HWC 42 | image_np = tensor.numpy() 43 | print(f"DEBUG: Fallback HWC: {image_np.shape}") 44 | 45 | elif tensor.ndim == 2: # Grayscale HW 46 | image_np = tensor.numpy() 47 | image_np = np.expand_dims(image_np, axis=2) # Add channel dimension 48 | print(f"DEBUG: Grayscale expanded: {image_np.shape}") 49 | else: 50 | raise ValueError(f"Unsupported tensor shape: {tensor.shape}") 51 | 52 | # Ensure proper range [0, 1] -> [0, 255] 53 | if image_np.max() <= 1.0: 54 | image_np = image_np * 255.0 55 | 56 | # Clip and convert to uint8 57 | image_np = np.clip(image_np, 0, 255).astype(np.uint8) 58 | print(f"DEBUG: Final numpy shape: {image_np.shape}, dtype: {image_np.dtype}") 59 | 60 | # Handle different channel counts 61 | if image_np.shape[2] == 1: # Grayscale 62 | return Image.fromarray(image_np.squeeze(2), 'L').convert('RGB') 63 | elif image_np.shape[2] == 3: # RGB 64 | return Image.fromarray(image_np, 'RGB') 65 | elif image_np.shape[2] == 4: # RGBA 66 | return Image.fromarray(image_np, 'RGBA').convert('RGB') 67 | else: 68 | # Use first 3 channels as RGB 69 | return Image.fromarray(image_np[:, :, :3], 'RGB') 70 | 71 | except Exception as e: 72 | print(f"ERROR in tensor_to_pil: {e}") 73 | print(f"Tensor shape: {tensor.shape}, dtype: {tensor.dtype}") 74 | # Create a fallback image 75 | return Image.new('RGB', (512, 512), color=(255, 0, 0)) 76 | 77 | def pil_to_tensor(pil_image: Image.Image): 78 | """Converts a PIL Image to a ComfyUI image tensor (BHWC, float 0-1).""" 79 | try: 80 | # Ensure image is RGB 81 | if pil_image.mode != 'RGB': 82 | pil_image = pil_image.convert('RGB') 83 | 84 | print(f"DEBUG: PIL image size: {pil_image.size}, mode: {pil_image.mode}") 85 | 86 | # Convert to numpy array (HWC format) 87 | image_np = np.array(pil_image) 88 | print(f"DEBUG: Numpy array shape: {image_np.shape}, dtype: {image_np.dtype}") 89 | 90 | # Normalize to 0-1 91 | image_np = image_np.astype(np.float32) / 255.0 92 | 93 | # ComfyUI expects BHWC format (Batch, Height, Width, Channels) 94 | # Add batch dimension: HWC -> BHWC 95 | tensor = torch.from_numpy(image_np).unsqueeze(0) 96 | 97 | print(f"DEBUG: Output tensor shape: {tensor.shape}, dtype: {tensor.dtype}") 98 | print(f"DEBUG: Tensor min: {tensor.min():.3f}, max: {tensor.max():.3f}") 99 | 100 | return tensor 101 | 102 | except Exception as e: 103 | print(f"ERROR in pil_to_tensor: {e}") 104 | # Create a simple fallback tensor in BHWC format 105 | fallback = torch.zeros(1, 512, 512, 3, dtype=torch.float32) 106 | fallback[:, :, :, 0] = 1.0 # Make it red 107 | print(f"DEBUG: Created fallback tensor with shape: {fallback.shape}") 108 | return fallback 109 | 110 | def mask_to_pil(mask_tensor: torch.Tensor): 111 | """Convert ComfyUI mask tensor to PIL B&W image.""" 112 | try: 113 | # Ensure tensor is on CPU and detached 114 | mask_tensor = mask_tensor.detach().cpu() 115 | 116 | print(f"DEBUG: Input mask tensor shape: {mask_tensor.shape}, dtype: {mask_tensor.dtype}") 117 | print(f"DEBUG: Mask min: {mask_tensor.min():.3f}, max: {mask_tensor.max():.3f}") 118 | 119 | # Handle batch dimension if present 120 | if mask_tensor.ndim == 3: # BMH format (Batch, Mask, Height) - remove batch 121 | mask_tensor = mask_tensor[0] 122 | elif mask_tensor.ndim == 4: # BMHW format - remove batch and take first channel 123 | mask_tensor = mask_tensor[0, 0] 124 | elif mask_tensor.ndim == 2: # HW format - already correct 125 | pass 126 | else: 127 | # Try to squeeze out single dimensions 128 | mask_tensor = mask_tensor.squeeze() 129 | 130 | # Convert to numpy 131 | mask_np = mask_tensor.numpy() 132 | print(f"DEBUG: Mask numpy shape: {mask_np.shape}") 133 | 134 | # Ensure 2D array (Height, Width) 135 | if mask_np.ndim != 2: 136 | raise ValueError(f"Expected 2D mask, got shape: {mask_np.shape}") 137 | 138 | # Convert to 0-255 range 139 | if mask_np.max() <= 1.0: 140 | mask_np = mask_np * 255.0 141 | 142 | # Convert to uint8 143 | mask_np = np.clip(mask_np, 0, 255).astype(np.uint8) 144 | 145 | # Create PIL image in grayscale mode (B&W) 146 | mask_pil = Image.fromarray(mask_np, 'L') 147 | print(f"DEBUG: Created mask PIL image: {mask_pil.size}, mode: {mask_pil.mode}") 148 | 149 | return mask_pil 150 | 151 | except Exception as e: 152 | print(f"ERROR in mask_to_pil: {e}") 153 | print(f"Mask tensor shape: {mask_tensor.shape}, dtype: {mask_tensor.dtype}") 154 | # Create a fallback white mask 155 | return Image.new('L', (512, 512), color=255) 156 | 157 | def resize_to_megapixels(image: Image.Image, target_mpx: float = 1.62): 158 | """Resize image to target megapixels using Lanczos interpolation.""" 159 | current_pixels = image.width * image.height 160 | target_pixels = target_mpx * 1_000_000 161 | 162 | if current_pixels <= target_pixels: 163 | return image # No need to resize if already smaller 164 | 165 | scale_factor = math.sqrt(target_pixels / current_pixels) 166 | new_width = int(image.width * scale_factor) 167 | new_height = int(image.height * scale_factor) 168 | 169 | return image.resize((new_width, new_height), Image.Resampling.LANCZOS) 170 | 171 | def upload_to_gradio_session(image, base_url, session, is_paid_api=False): 172 | """Upload image using a session for state persistence.""" 173 | max_retries = 3 if is_paid_api else 1 174 | 175 | for attempt in range(max_retries): 176 | try: 177 | # Convert image to bytes 178 | img_buffer = io.BytesIO() 179 | image.save(img_buffer, format='PNG') 180 | img_buffer.seek(0) 181 | 182 | # Try the standard Gradio upload endpoint 183 | upload_url = f"{base_url}/gradio_api/upload" 184 | 185 | # Prepare the file for upload 186 | files = { 187 | 'files': ('image.png', img_buffer, 'image/png') 188 | } 189 | 190 | retry_msg = f" (attempt {attempt + 1}/{max_retries})" if is_paid_api and attempt > 0 else "" 191 | print(f"Uploading to Gradio space with session: {upload_url}{retry_msg}") 192 | response = session.post(upload_url, files=files, timeout=30) 193 | 194 | if response.status_code == 200: 195 | # Parse the response to get the file path/URL 196 | try: 197 | result = response.json() 198 | print(f"Session upload response: {result}") # Debug: show full response 199 | 200 | # Different Gradio versions return different formats 201 | if isinstance(result, list) and len(result) > 0: 202 | file_info = result[0] 203 | print(f"Session file info: {file_info}") # Debug: show file info 204 | 205 | if isinstance(file_info, dict): 206 | # Format: [{"name": "filename", "data": "file_path", ...}] 207 | file_path = file_info.get('name') or file_info.get('data') or file_info.get('path') 208 | if file_path: 209 | full_url = f"{base_url}/file={file_path}" 210 | print(f"Session upload successful: {full_url}") 211 | return file_path # Return the internal file path for API calls 212 | elif isinstance(file_info, str): 213 | # Format: ["/tmp/gradio/hash/filename"] 214 | file_path = file_info 215 | full_url = f"{base_url}/file={file_path}" 216 | print(f"Session upload successful: {full_url}") 217 | return file_path # Return the internal file path for API calls 218 | elif isinstance(result, dict): 219 | # Some formats return a dict directly 220 | file_path = result.get('name') or result.get('data') or result.get('path') 221 | if file_path: 222 | full_url = f"{base_url}/file={file_path}" 223 | print(f"Session upload successful: {full_url}") 224 | return file_path # Return the internal file path for API calls 225 | elif isinstance(result, str): 226 | # Try to use the raw response as filename 227 | file_path = result 228 | full_url = f"{base_url}/file={file_path}" 229 | print(f"Session upload successful: {full_url}") 230 | return file_path # Return the internal file path for API calls 231 | 232 | print(f"Unexpected session upload response format: {result}") 233 | 234 | except json.JSONDecodeError: 235 | # Sometimes the response is just a filename string 236 | print(f"Session JSON decode failed, raw response: '{response.text}'") # Debug 237 | file_path = response.text.strip().strip('"') 238 | if file_path: 239 | full_url = f"{base_url}/file={file_path}" 240 | print(f"Session upload successful: {full_url}") 241 | return file_path # Return the internal file path for API calls 242 | else: 243 | print(f"Session upload failed with status {response.status_code}: {response.text}") 244 | if is_paid_api and attempt < max_retries - 1: 245 | print(f"Retrying upload in 2 seconds... ({attempt + 1}/{max_retries})") 246 | time.sleep(2) 247 | continue 248 | 249 | except Exception as e: 250 | print(f"Error in session upload to Gradio space: {e}") 251 | if is_paid_api and attempt < max_retries - 1: 252 | print(f"Retrying upload in 2 seconds... ({attempt + 1}/{max_retries})") 253 | time.sleep(2) 254 | continue 255 | 256 | # If we get here and it's not a retry scenario, break 257 | if not is_paid_api: 258 | break 259 | 260 | return None 261 | 262 | def call_vton_api(base_file_path, product_file_path, model_choice, base_url, session, mask_file_path=None, api_key=None, quality="normal"): 263 | """Call VTON API following the exact Gradio API pattern (like curl -N).""" 264 | is_paid_api = api_key is not None 265 | max_retries = 3 if is_paid_api else 1 266 | 267 | for attempt in range(max_retries): 268 | try: 269 | # Map ComfyUI model choices to API parameters 270 | model_mapping = { 271 | "eyewear": "eyewear", 272 | "footwear": "footwear", 273 | "full-body": "dress", # API expects "dress" for full-body garments 274 | "top garment": "top" # API expects "top" for top garments 275 | } 276 | 277 | api_model_choice = model_mapping.get(model_choice, model_choice) 278 | retry_msg = f" (attempt {attempt + 1}/{max_retries})" if is_paid_api and attempt > 0 else "" 279 | print(f"\n🎨 Calling VTON API with model: {model_choice} → {api_model_choice}{retry_msg}") 280 | 281 | # Gradio API always expects 4 parameters: [base, product, model, mask] 282 | # Use user-provided mask if available, otherwise pass null for backend fallback 283 | if mask_file_path: 284 | mask_parameter = {"path": mask_file_path, "meta": {"_type": "gradio.FileData"}} 285 | print(f" 🎭 Including user-provided mask in API call: {mask_file_path}") 286 | else: 287 | mask_parameter = None 288 | print(f" 🎭 No mask provided - sending null (backend will use base image fallback with default workflow)") 289 | 290 | # Build API data array 291 | api_data_array = [ 292 | {"path": base_file_path, "meta": {"_type": "gradio.FileData"}}, 293 | {"path": product_file_path, "meta": {"_type": "gradio.FileData"}}, 294 | api_model_choice, 295 | mask_parameter 296 | ] 297 | 298 | # Add quality parameter only for paid API (when API key is provided) 299 | if api_key: 300 | api_data_array.append(quality) 301 | print(f" ⚙️ Using quality setting: {quality}") 302 | api_data_array.append(api_key) 303 | print(f" 🔑 Using API key: {api_key[:8]}...{api_key[-4:] if len(api_key) > 12 else '[SHORT]'}") 304 | else: 305 | print(f" 🆓 Demo API - no quality parameter (uses fixed normal quality)") 306 | 307 | api_data = {"data": api_data_array} 308 | 309 | print(f" 📤 API request data: {api_data}") 310 | 311 | # Step 1: POST to get EVENT_ID (exactly like the YAML example) 312 | submit_url = f"{base_url}/gradio_api/call/generate" 313 | print(f" 🚀 Submitting job to: {submit_url}") 314 | 315 | response = session.post( 316 | submit_url, 317 | json=api_data, 318 | headers={"Content-Type": "application/json"}, 319 | timeout=30 320 | ) 321 | 322 | print(f" 📨 Submit response: {response.text}") 323 | 324 | if response.status_code != 200: 325 | print(f" ❌ API submit failed: {response.status_code}") 326 | if is_paid_api and attempt < max_retries - 1: 327 | print(f" 🔄 Retrying API call in 3 seconds... ({attempt + 1}/{max_retries})") 328 | time.sleep(3) 329 | continue 330 | else: 331 | return None 332 | 333 | # Extract EVENT_ID (like awk -F'"' '{ print $4}' in the YAML) 334 | try: 335 | event_data = response.json() 336 | event_id = event_data.get('event_id') 337 | if not event_id: 338 | print(f" ❌ No event_id in response: {event_data}") 339 | if is_paid_api and attempt < max_retries - 1: 340 | print(f" 🔄 Retrying API call in 3 seconds... ({attempt + 1}/{max_retries})") 341 | time.sleep(3) 342 | continue 343 | else: 344 | return None 345 | except: 346 | print(f" ❌ Failed to parse event_id from response") 347 | if is_paid_api and attempt < max_retries - 1: 348 | print(f" 🔄 Retrying API call in 3 seconds... ({attempt + 1}/{max_retries})") 349 | time.sleep(3) 350 | continue 351 | else: 352 | return None 353 | 354 | print(f" ✅ Got EVENT_ID: {event_id}") 355 | 356 | # Step 2: GET with streaming (equivalent to curl -N) 357 | stream_url = f"{base_url}/gradio_api/call/generate/{event_id}" 358 | print(f" 🌊 Starting SSE stream: {stream_url}") 359 | print(f" (equivalent to: curl -N {stream_url})") 360 | 361 | # Make streaming request exactly like curl -N 362 | stream_response = session.get( 363 | stream_url, 364 | headers={ 365 | 'Accept': 'text/event-stream', 366 | 'Cache-Control': 'no-cache', 367 | 'Connection': 'keep-alive' 368 | }, 369 | timeout=300, # 5 minutes for AI processing 370 | stream=True 371 | ) 372 | 373 | if stream_response.status_code != 200: 374 | print(f" ❌ Stream failed: {stream_response.status_code}") 375 | print(f" 📄 Response: {stream_response.text[:200]}") 376 | if is_paid_api and attempt < max_retries - 1: 377 | print(f" 🔄 Retrying API call in 3 seconds... ({attempt + 1}/{max_retries})") 378 | time.sleep(3) 379 | continue 380 | else: 381 | return None 382 | 383 | print(f" ✅ SSE stream connected (status: {stream_response.status_code})") 384 | 385 | # Process streaming response line by line (like curl -N output) 386 | buffer = "" 387 | start_time = time.time() 388 | 389 | for chunk in stream_response.iter_content(chunk_size=1, decode_unicode=True): 390 | if chunk: 391 | buffer += chunk 392 | 393 | # Process complete lines 394 | while '\n' in buffer: 395 | line, buffer = buffer.split('\n', 1) 396 | line = line.strip() 397 | 398 | if line: 399 | elapsed = time.time() - start_time 400 | print(f" 📡 [{elapsed:.1f}s] {line}") 401 | 402 | # Initialize data_content to avoid scoping issues 403 | data_content = None 404 | 405 | # Handle SSE data lines 406 | if line.startswith('data: '): 407 | data_content = line[6:] # Remove 'data: ' prefix 408 | 409 | # Skip empty data 410 | if not data_content or data_content == '{}': 411 | continue 412 | 413 | try: 414 | # Parse JSON data only if we have data_content 415 | if data_content and (data_content.startswith('{') or data_content.startswith('[')): 416 | result_data = json.loads(data_content) 417 | 418 | if isinstance(result_data, dict): 419 | # Check for completion 420 | if result_data.get('msg') == 'process_completed': 421 | output = result_data.get('output', {}) 422 | if output and 'data' in output and output['data']: 423 | result_path = output['data'][0] 424 | print(f" ✅ COMPLETED! Result: {result_path}") 425 | return result_path 426 | 427 | # Check for failure 428 | elif result_data.get('msg') == 'process_failed': 429 | print(f" ❌ FAILED: {result_data}") 430 | return None 431 | 432 | # Status updates 433 | elif result_data.get('msg') in ['process_starts', 'estimation']: 434 | print(f" ⏳ Status: {result_data.get('msg')}") 435 | 436 | # Progress updates 437 | elif 'progress' in result_data: 438 | progress = result_data.get('progress', '') 439 | print(f" 🔄 Progress: {progress}") 440 | 441 | elif isinstance(result_data, list) and result_data: 442 | # Handle Gradio FileData objects in array (THIS IS THE WORKING FORMAT!) 443 | first_item = result_data[0] 444 | 445 | # Check if it's a FileData object with path/url 446 | if isinstance(first_item, dict): 447 | if 'url' in first_item: 448 | result_url = first_item['url'] 449 | print(f" ✅ COMPLETED! Result URL: {result_url}") 450 | return result_url 451 | elif 'path' in first_item: 452 | result_path = first_item['path'] 453 | print(f" ✅ COMPLETED! Result path: {result_path}") 454 | return result_path 455 | 456 | # Handle string paths 457 | elif isinstance(first_item, str) and first_item.startswith('/'): 458 | print(f" ✅ COMPLETED! Result: {first_item}") 459 | return first_item 460 | 461 | # Handle direct string responses (file paths) 462 | elif data_content and (data_content.startswith('/') or data_content.startswith('"/')): 463 | result_path = data_content.strip('"') 464 | print(f" ✅ COMPLETED! Result: {result_path}") 465 | return result_path 466 | 467 | except json.JSONDecodeError: 468 | # Try to extract file path from non-JSON data 469 | if data_content and data_content.startswith('/'): 470 | print(f" ✅ COMPLETED! Result: {data_content}") 471 | return data_content 472 | elif data_content: 473 | print(f" 📝 Raw data: {data_content}") 474 | 475 | # Handle other SSE lines 476 | elif line.startswith('event: '): 477 | event_type = line[7:] 478 | if event_type != 'heartbeat': # Don't log heartbeats 479 | print(f" 🎯 Event: {event_type}") 480 | 481 | # Handle error events 482 | if event_type == 'error': 483 | print(f" ❌ API returned error event - this usually means:") 484 | print(f" - Images are invalid format/size") 485 | print(f" - Server is overloaded") 486 | print(f" - Model choice is invalid") 487 | print(f" - Images are too large/small for the model") 488 | return None 489 | 490 | # Connection heartbeat 491 | elif line.startswith('id: ') or line.startswith('retry: '): 492 | continue # Skip SSE metadata 493 | 494 | # Timeout check 495 | if time.time() - start_time > 300: # 5 minutes 496 | print(f" ⏰ Stream timeout after 5 minutes") 497 | break 498 | 499 | print(f" 🔚 Stream ended without result") 500 | # If this is a paid API and we have retries left, try again 501 | if is_paid_api and attempt < max_retries - 1: 502 | print(f" 🔄 Retrying API call in 3 seconds... ({attempt + 1}/{max_retries})") 503 | time.sleep(3) 504 | continue 505 | else: 506 | return None 507 | 508 | except Exception as e: 509 | print(f" ❌ API error: {e}") 510 | if is_paid_api and attempt < max_retries - 1: 511 | print(f" 🔄 Retrying API call in 3 seconds... ({attempt + 1}/{max_retries})") 512 | time.sleep(3) 513 | continue 514 | else: 515 | return None 516 | 517 | return None 518 | 519 | def download_result_image(result_path_or_url, base_url, session): 520 | """Download the result image from Gradio.""" 521 | if not result_path_or_url: 522 | return None 523 | 524 | print(f"\n📥 Downloading result image: {result_path_or_url}") 525 | 526 | # Check if it's already a complete URL 527 | if result_path_or_url.startswith('http'): 528 | possible_urls = [result_path_or_url] 529 | else: 530 | # Try different URL formats for the result path 531 | possible_urls = [ 532 | f"{base_url}/gradio_api/file={result_path_or_url}", # Most likely format for Gradio 533 | f"{base_url}/file={result_path_or_url}", 534 | f"{base_url}/file/{result_path_or_url}", 535 | f"{base_url}/files/{result_path_or_url}", 536 | f"{base_url}/api/file/{result_path_or_url}", 537 | ] 538 | 539 | for url in possible_urls: 540 | print(f" 🔗 Trying: {url}") 541 | try: 542 | response = session.get(url, timeout=15) 543 | if response.status_code == 200: 544 | content_type = response.headers.get('content-type', '').lower() 545 | if 'image' in content_type or response.content.startswith(b'\x89PNG') or response.content.startswith(b'\xFF\xD8\xFF'): 546 | print(f" ✅ Successfully downloaded image ({len(response.content)} bytes)") 547 | 548 | # Return PIL Image 549 | image = Image.open(io.BytesIO(response.content)) 550 | print(f" 🖼️ Result image size: {image.size}") 551 | return image 552 | else: 553 | print(f" ❌ Not an image: {content_type}") 554 | else: 555 | print(f" ❌ HTTP {response.status_code}") 556 | except Exception as e: 557 | print(f" ❌ Error: {e}") 558 | 559 | print(" ❌ Could not download result image") 560 | return None 561 | 562 | class VTONAPINode: 563 | @classmethod 564 | def INPUT_TYPES(cls): 565 | return { 566 | "required": { 567 | "base_person_image": ("IMAGE",), 568 | "product_image": ("IMAGE",), 569 | "model_choice": (["eyewear", "footwear", "full-body", "top garment"], {"default": "eyewear"}), 570 | 571 | }, 572 | "optional": { 573 | "base_person_mask": ("MASK",), # Optional mask input (MASK type) 574 | } 575 | } 576 | 577 | RETURN_TYPES = ("IMAGE",) 578 | FUNCTION = "process_vton" 579 | CATEGORY = "sm4ll/VTON" 580 | 581 | # Disable caching - always execute even with same inputs 582 | NOT_IDEMPOTENT = True 583 | 584 | def process_vton(self, base_person_image, product_image, model_choice, base_person_mask=None): 585 | try: 586 | # Generate internal cache-buster to force re-execution 587 | cache_buster = time.time() + random.random() 588 | print(f"🎲 Internal cache-buster: {cache_buster:.6f} (ensures fresh execution)") 589 | 590 | # Use the hardcoded Gradio space URL 591 | base_url = "https://sm4ll-vton-sm4ll-vton-demo.hf.space" 592 | 593 | # Debug tensor shapes 594 | print(f"Input base tensor shape: {base_person_image.shape}") 595 | print(f"Input product tensor shape: {product_image.shape}") 596 | 597 | # Convert tensors to PIL images 598 | base_pil = tensor_to_pil(base_person_image) 599 | product_pil = tensor_to_pil(product_image) 600 | 601 | print(f"Converted base image size: {base_pil.size}") 602 | print(f"Converted product image size: {product_pil.size}") 603 | 604 | # Validate minimum size requirements 605 | if base_pil.size[0] < 100 or base_pil.size[1] < 100: 606 | raise Exception(f"Base image too small: {base_pil.size}. Minimum 100x100 required.") 607 | if product_pil.size[0] < 100 or product_pil.size[1] < 100: 608 | raise Exception(f"Product image too small: {product_pil.size}. Minimum 100x100 required.") 609 | 610 | # Resize images to 1.62mpx using Lanczos interpolation 611 | base_resized = resize_to_megapixels(base_pil, 1.62) 612 | product_resized = resize_to_megapixels(product_pil, 1.62) 613 | 614 | print(f"Resized base image size: {base_resized.size}") 615 | print(f"Resized product image size: {product_resized.size}") 616 | 617 | # Ensure images are RGB (sometimes they come as RGBA or other formats) 618 | if base_resized.mode != 'RGB': 619 | print(f"Converting base image from {base_resized.mode} to RGB") 620 | base_resized = base_resized.convert('RGB') 621 | if product_resized.mode != 'RGB': 622 | print(f"Converting product image from {product_resized.mode} to RGB") 623 | product_resized = product_resized.convert('RGB') 624 | 625 | # Validate aspect ratio (VTON models usually expect reasonable aspect ratios) 626 | base_aspect = base_resized.size[0] / base_resized.size[1] 627 | product_aspect = product_resized.size[0] / product_resized.size[1] 628 | print(f"Base image aspect ratio: {base_aspect:.2f}") 629 | print(f"Product image aspect ratio: {product_aspect:.2f}") 630 | 631 | if base_aspect < 0.3 or base_aspect > 3.0: 632 | print(f"⚠️ Warning: Base image has extreme aspect ratio: {base_aspect:.2f}") 633 | if product_aspect < 0.3 or product_aspect > 3.0: 634 | print(f"⚠️ Warning: Product image has extreme aspect ratio: {product_aspect:.2f}") 635 | 636 | # Create a session to maintain cookies/state 637 | session = requests.Session() 638 | 639 | # Upload images directly to the Gradio space 640 | print("Uploading base image to Gradio space...") 641 | base_file_path = upload_to_gradio_session(base_resized, base_url, session) 642 | 643 | if not base_file_path: 644 | raise Exception("Failed to upload base image to Gradio space") 645 | 646 | print("Uploading product image to Gradio space...") 647 | product_file_path = upload_to_gradio_session(product_resized, base_url, session) 648 | 649 | if not product_file_path: 650 | raise Exception("Failed to upload product image to Gradio space") 651 | 652 | # Handle optional mask image 653 | mask_file_path = None 654 | if base_person_mask is not None: 655 | print("Processing and uploading mask image...") 656 | print(f"Input mask tensor shape: {base_person_mask.shape}") 657 | 658 | # Convert MASK tensor to B&W PIL image 659 | mask_pil = mask_to_pil(base_person_mask) 660 | print(f"Mask B&W image size: {mask_pil.size}, mode: {mask_pil.mode}") 661 | 662 | # Validate minimum size requirements for mask 663 | if mask_pil.size[0] < 100 or mask_pil.size[1] < 100: 664 | print(f"⚠️ Warning: Mask image is very small ({mask_pil.size}), this might not work well") 665 | 666 | # Resize mask to same target as other images 667 | mask_resized = resize_to_megapixels(mask_pil, 1.62) 668 | print(f"Resized mask B&W image size: {mask_resized.size}") 669 | 670 | # Convert B&W mask to RGB for API upload (API expects IMAGE format) 671 | mask_resized_rgb = mask_resized.convert('RGB') 672 | print(f"Converted mask from {mask_resized.mode} to {mask_resized_rgb.mode} for API") 673 | 674 | # Upload mask as RGB image to Gradio 675 | mask_file_path = upload_to_gradio_session(mask_resized_rgb, base_url, session) 676 | 677 | if not mask_file_path: 678 | raise Exception("Failed to upload mask image to Gradio space") 679 | 680 | print(f"Mask image uploaded: {mask_file_path}") 681 | else: 682 | print("No mask image provided - will use base image fallback") 683 | 684 | print(f"Base image uploaded: {base_file_path}") 685 | print(f"Product image uploaded: {product_file_path}") 686 | if mask_file_path: 687 | print(f"Mask image uploaded: {mask_file_path}") 688 | 689 | # Call the VTON API (demo version - no quality parameter) 690 | result_path_or_url = call_vton_api(base_file_path, product_file_path, model_choice, base_url, session, mask_file_path) 691 | 692 | if not result_path_or_url: 693 | raise Exception("VTON API call failed - no result returned") 694 | 695 | # Download the result image 696 | result_image = download_result_image(result_path_or_url, base_url, session) 697 | 698 | if not result_image: 699 | raise Exception("Failed to download result image") 700 | 701 | # Convert result image back to tensor 702 | result_tensor = pil_to_tensor(result_image) 703 | print("✓ VTON processing completed successfully!") 704 | return (result_tensor,) 705 | 706 | except Exception as e: 707 | print(f"Error in VTON API processing: {e}") 708 | # Return a red placeholder image in case of error 709 | placeholder = Image.new('RGB', (512, 512), color=(255, 0, 0)) 710 | placeholder_tensor = pil_to_tensor(placeholder) 711 | print(f"Created error placeholder with shape: {placeholder_tensor.shape}") 712 | return (placeholder_tensor,) 713 | 714 | class VTONAPIPaidNode: 715 | @classmethod 716 | def INPUT_TYPES(cls): 717 | return { 718 | "required": { 719 | "base_person_image": ("IMAGE",), 720 | "product_image": ("IMAGE",), 721 | "model_choice": (["eyewear", "footwear", "full-body", "top garment"], {"default": "eyewear"}), 722 | "api_key": ("STRING", {"default": "ym_your_api_key_here", "multiline": False}), 723 | "quality": (["Normal", "High"], {"default": "Normal"}), 724 | }, 725 | "optional": { 726 | "base_person_mask": ("MASK",), # Optional mask input (MASK type) 727 | } 728 | } 729 | 730 | RETURN_TYPES = ("IMAGE",) 731 | FUNCTION = "process_vton_paid" 732 | CATEGORY = "sm4ll/VTON" 733 | 734 | # Disable caching - always execute even with same inputs 735 | NOT_IDEMPOTENT = True 736 | 737 | def process_vton_paid(self, base_person_image, product_image, model_choice, api_key, quality, base_person_mask=None): 738 | try: 739 | # Generate internal cache-buster to force re-execution 740 | cache_buster = time.time() + random.random() 741 | print(f"🎲 Internal cache-buster: {cache_buster:.6f} (ensures fresh execution)") 742 | 743 | # Validate API key format 744 | if not api_key or not api_key.strip(): 745 | raise Exception("API key is required for paid API access") 746 | 747 | api_key = api_key.strip() 748 | if not api_key.startswith("ym_") or len(api_key) != 32: 749 | raise Exception("Invalid API key format. Expected format: ym_29_characters") 750 | 751 | # Convert quality from display format to API format 752 | quality_api = quality.lower() # "Normal" -> "normal", "High" -> "high" 753 | print(f"🎯 Quality setting: {quality} -> {quality_api}") 754 | 755 | # Use the production API endpoint 756 | base_url = "https://api.yourmirror.io" 757 | 758 | # Debug tensor shapes 759 | print(f"Input base tensor shape: {base_person_image.shape}") 760 | print(f"Input product tensor shape: {product_image.shape}") 761 | 762 | # Convert tensors to PIL images 763 | base_pil = tensor_to_pil(base_person_image) 764 | product_pil = tensor_to_pil(product_image) 765 | 766 | print(f"Converted base image size: {base_pil.size}") 767 | print(f"Converted product image size: {product_pil.size}") 768 | 769 | # Validate minimum size requirements 770 | if base_pil.size[0] < 100 or base_pil.size[1] < 100: 771 | raise Exception(f"Base image too small: {base_pil.size}. Minimum 100x100 required.") 772 | if product_pil.size[0] < 100 or product_pil.size[1] < 100: 773 | raise Exception(f"Product image too small: {product_pil.size}. Minimum 100x100 required.") 774 | 775 | # Resize images to 1.62mpx using Lanczos interpolation 776 | base_resized = resize_to_megapixels(base_pil, 1.62) 777 | product_resized = resize_to_megapixels(product_pil, 1.62) 778 | 779 | print(f"Resized base image size: {base_resized.size}") 780 | print(f"Resized product image size: {product_resized.size}") 781 | 782 | # Ensure images are RGB (sometimes they come as RGBA or other formats) 783 | if base_resized.mode != 'RGB': 784 | print(f"Converting base image from {base_resized.mode} to RGB") 785 | base_resized = base_resized.convert('RGB') 786 | if product_resized.mode != 'RGB': 787 | print(f"Converting product image from {product_resized.mode} to RGB") 788 | product_resized = product_resized.convert('RGB') 789 | 790 | # Validate aspect ratio (VTON models usually expect reasonable aspect ratios) 791 | base_aspect = base_resized.size[0] / base_resized.size[1] 792 | product_aspect = product_resized.size[0] / product_resized.size[1] 793 | print(f"Base image aspect ratio: {base_aspect:.2f}") 794 | print(f"Product image aspect ratio: {product_aspect:.2f}") 795 | 796 | if base_aspect < 0.3 or base_aspect > 3.0: 797 | print(f"⚠️ Warning: Base image has extreme aspect ratio: {base_aspect:.2f}") 798 | if product_aspect < 0.3 or product_aspect > 3.0: 799 | print(f"⚠️ Warning: Product image has extreme aspect ratio: {product_aspect:.2f}") 800 | 801 | # Create a session to maintain cookies/state 802 | session = requests.Session() 803 | 804 | # Upload images directly to the production API 805 | print("Uploading base image to production API...") 806 | base_file_path = upload_to_gradio_session(base_resized, base_url, session, is_paid_api=True) 807 | 808 | if not base_file_path: 809 | raise Exception("Failed to upload base image to production API") 810 | 811 | print("Uploading product image to production API...") 812 | product_file_path = upload_to_gradio_session(product_resized, base_url, session, is_paid_api=True) 813 | 814 | if not product_file_path: 815 | raise Exception("Failed to upload product image to production API") 816 | 817 | # Handle optional mask image 818 | mask_file_path = None 819 | if base_person_mask is not None: 820 | print("Processing and uploading mask image...") 821 | print(f"Input mask tensor shape: {base_person_mask.shape}") 822 | 823 | # Convert MASK tensor to B&W PIL image 824 | mask_pil = mask_to_pil(base_person_mask) 825 | print(f"Mask B&W image size: {mask_pil.size}, mode: {mask_pil.mode}") 826 | 827 | # Validate minimum size requirements for mask 828 | if mask_pil.size[0] < 100 or mask_pil.size[1] < 100: 829 | print(f"⚠️ Warning: Mask image is very small ({mask_pil.size}), this might not work well") 830 | 831 | # Resize mask to same target as other images 832 | mask_resized = resize_to_megapixels(mask_pil, 1.62) 833 | print(f"Resized mask B&W image size: {mask_resized.size}") 834 | 835 | # Convert B&W mask to RGB for API upload (API expects IMAGE format) 836 | mask_resized_rgb = mask_resized.convert('RGB') 837 | print(f"Converted mask from {mask_resized.mode} to {mask_resized_rgb.mode} for API") 838 | 839 | # Upload mask as RGB image to production API 840 | mask_file_path = upload_to_gradio_session(mask_resized_rgb, base_url, session, is_paid_api=True) 841 | 842 | if not mask_file_path: 843 | raise Exception("Failed to upload mask image to production API") 844 | 845 | print(f"Mask image uploaded: {mask_file_path}") 846 | else: 847 | print("No mask image provided - will use base image fallback") 848 | 849 | print(f"Base image uploaded: {base_file_path}") 850 | print(f"Product image uploaded: {product_file_path}") 851 | if mask_file_path: 852 | print(f"Mask image uploaded: {mask_file_path}") 853 | 854 | # Call the VTON API with API key and quality setting 855 | result_path_or_url = call_vton_api(base_file_path, product_file_path, model_choice, base_url, session, mask_file_path, api_key, quality_api) 856 | 857 | if not result_path_or_url: 858 | raise Exception("VTON API call failed - no result returned") 859 | 860 | # Download the result image 861 | result_image = download_result_image(result_path_or_url, base_url, session) 862 | 863 | if not result_image: 864 | raise Exception("Failed to download result image") 865 | 866 | # Convert result image back to tensor 867 | result_tensor = pil_to_tensor(result_image) 868 | print("✓ VTON processing completed successfully!") 869 | return (result_tensor,) 870 | 871 | except Exception as e: 872 | print(f"Error in VTON API processing: {e}") 873 | # Return a red placeholder image in case of error 874 | placeholder = Image.new('RGB', (512, 512), color=(255, 0, 0)) 875 | placeholder_tensor = pil_to_tensor(placeholder) 876 | print(f"Created error placeholder with shape: {placeholder_tensor.shape}") 877 | return (placeholder_tensor,) 878 | 879 | def pil_to_base64_data_uri(pil_image): 880 | """Convert PIL image to base64 data URI.""" 881 | buffer = io.BytesIO() 882 | pil_image.save(buffer, format='PNG') 883 | buffer.seek(0) 884 | image_bytes = buffer.getvalue() 885 | base64_string = base64.b64encode(image_bytes).decode('utf-8') 886 | return f"data:image/png;base64,{base64_string}" 887 | 888 | def call_lookbook_api(person_image, garment_images, gender, prompt, quality, api_key, base_url): 889 | """Call Lookbook API with direct HTTP POST using base64 images.""" 890 | try: 891 | print(f"\n🎨 Calling Lookbook API with {len(garment_images)} garments, gender: {gender}, quality: {quality}") 892 | 893 | # Convert person image to base64 data URI 894 | person_b64 = pil_to_base64_data_uri(person_image) 895 | print(f" 📸 Converted person image to base64 ({len(person_b64)} chars)") 896 | 897 | # Convert garment images to base64 data URIs (up to 4 slots, null for empty) 898 | garment_b64_images = [] 899 | for i in range(4): 900 | if i < len(garment_images) and garment_images[i]: 901 | garment_b64 = pil_to_base64_data_uri(garment_images[i]) 902 | garment_b64_images.append(garment_b64) 903 | print(f" 👕 Converted garment {i+1} to base64 ({len(garment_b64)} chars)") 904 | else: 905 | garment_b64_images.append(None) 906 | 907 | # Build API payload with base64 data in FileData format 908 | payload = { 909 | "person_image": {"path": person_b64, "meta": {"_type": "gradio.FileData"}}, 910 | "garment_images": [ 911 | {"path": garment_b64, "meta": {"_type": "gradio.FileData"}} if garment_b64 else None 912 | for garment_b64 in garment_b64_images 913 | ], 914 | "quality": quality.lower(), 915 | "mode": gender.lower(), 916 | "api_key": api_key 917 | } 918 | 919 | # Add prompt if provided 920 | if prompt and prompt.strip(): 921 | payload["prompt"] = prompt.strip() 922 | 923 | print(f" 📤 Sending payload with base64 images to /lookbook") 924 | 925 | # Send POST request to /lookbook endpoint 926 | lookbook_url = f"{base_url}/lookbook" 927 | print(f" 🚀 Posting to: {lookbook_url}") 928 | 929 | response = requests.post( 930 | lookbook_url, 931 | json=payload, 932 | headers={"Content-Type": "application/json"}, 933 | timeout=600 # 10 minutes for processing 934 | ) 935 | 936 | print(f" 📨 Response status: {response.status_code}") 937 | 938 | if response.status_code == 200: 939 | result_data = response.json() 940 | print(f" ✅ Success! Response contains data: {bool(result_data.get('data'))}") 941 | 942 | # Extract result images from response 943 | if result_data.get("data") and len(result_data["data"]) > 0: 944 | return result_data["data"][0] # Return first result image 945 | else: 946 | print(f" ❌ No data in response: {result_data}") 947 | return None 948 | else: 949 | print(f" ❌ API failed: {response.status_code} - {response.text}") 950 | return None 951 | 952 | except Exception as e: 953 | print(f" ❌ Lookbook API error: {e}") 954 | return None 955 | 956 | class VTONLookbookNode: 957 | @classmethod 958 | def INPUT_TYPES(cls): 959 | return { 960 | "required": { 961 | "person_image": ("IMAGE",), 962 | "garment_1": ("IMAGE",), 963 | "mode": (["Male", "Female"], {"default": "Female"}), 964 | "quality": (["Normal", "High"], {"default": "Normal"}), 965 | "prompt": ("STRING", {"default": "", "multiline": True}), 966 | "api_key": ("STRING", {"default": "ym_your_api_key_here", "multiline": False}), 967 | }, 968 | "optional": { 969 | "garment_2": ("IMAGE",), 970 | "garment_3": ("IMAGE",), 971 | "garment_4": ("IMAGE",), 972 | } 973 | } 974 | 975 | RETURN_TYPES = ("IMAGE",) 976 | FUNCTION = "process_lookbook" 977 | CATEGORY = "sm4ll/VTON" 978 | 979 | # Disable caching - always execute even with same inputs 980 | NOT_IDEMPOTENT = True 981 | 982 | def process_lookbook(self, person_image, garment_1, mode, quality, prompt, api_key, garment_2=None, garment_3=None, garment_4=None): 983 | try: 984 | # Generate internal cache-buster to force re-execution 985 | cache_buster = time.time() + random.random() 986 | print(f"🎲 Internal cache-buster: {cache_buster:.6f} (ensures fresh execution)") 987 | 988 | # Validate API key format 989 | if not api_key or not api_key.strip(): 990 | raise Exception("API key is required for lookbook API access") 991 | 992 | api_key = api_key.strip() 993 | if not api_key.startswith("ym_") or len(api_key) != 32: 994 | raise Exception("Invalid API key format. Expected format: ym_29_characters") 995 | 996 | # Use the apiservice endpoint for lookbook 997 | base_url = "https://apiservice.yourmirror.io" 998 | 999 | # Convert tensors to PIL images and validate 1000 | person_pil = tensor_to_pil(person_image) 1001 | garment_1_pil = tensor_to_pil(garment_1) 1002 | 1003 | print(f"Person image size: {person_pil.size}") 1004 | print(f"Garment 1 size: {garment_1_pil.size}") 1005 | 1006 | # Validate minimum size requirements 1007 | if person_pil.size[0] < 100 or person_pil.size[1] < 100: 1008 | raise Exception(f"Person image too small: {person_pil.size}. Minimum 100x100 required.") 1009 | if garment_1_pil.size[0] < 100 or garment_1_pil.size[1] < 100: 1010 | raise Exception(f"Garment 1 image too small: {garment_1_pil.size}. Minimum 100x100 required.") 1011 | 1012 | # Resize images to 1.62mpx using Lanczos interpolation 1013 | person_resized = resize_to_megapixels(person_pil, 1.62) 1014 | garment_1_resized = resize_to_megapixels(garment_1_pil, 1.62) 1015 | 1016 | # Ensure images are RGB 1017 | if person_resized.mode != 'RGB': 1018 | person_resized = person_resized.convert('RGB') 1019 | if garment_1_resized.mode != 'RGB': 1020 | garment_1_resized = garment_1_resized.convert('RGB') 1021 | 1022 | # Process all garment images (garment_1 is required, others are optional) 1023 | garment_images = [garment_1_resized] 1024 | optional_garments = [garment_2, garment_3, garment_4] 1025 | 1026 | for i, garment in enumerate(optional_garments): 1027 | if garment is not None: 1028 | garment_pil = tensor_to_pil(garment) 1029 | print(f"Garment {i+2} size: {garment_pil.size}") 1030 | 1031 | if garment_pil.size[0] < 100 or garment_pil.size[1] < 100: 1032 | print(f"⚠️ Warning: Garment {i+2} image is very small ({garment_pil.size}), skipping") 1033 | continue 1034 | 1035 | garment_resized = resize_to_megapixels(garment_pil, 1.62) 1036 | if garment_resized.mode != 'RGB': 1037 | garment_resized = garment_resized.convert('RGB') 1038 | 1039 | garment_images.append(garment_resized) 1040 | 1041 | print(f"Processing with {len(garment_images)} garment images") 1042 | 1043 | # Call the Lookbook API directly with base64 images (no upload needed) 1044 | result_url_or_data = call_lookbook_api( 1045 | person_resized, 1046 | garment_images, 1047 | mode, 1048 | prompt, 1049 | quality, 1050 | api_key, 1051 | base_url 1052 | ) 1053 | 1054 | if not result_url_or_data: 1055 | raise Exception("Lookbook API call failed - no result returned") 1056 | 1057 | # Handle result - could be URL or base64 data 1058 | if result_url_or_data.startswith("data:image"): 1059 | # Base64 data URI - decode directly 1060 | print("Result is base64 data URI, decoding...") 1061 | header, data = result_url_or_data.split(",", 1) 1062 | image_bytes = base64.b64decode(data) 1063 | result_image = Image.open(io.BytesIO(image_bytes)) 1064 | else: 1065 | # URL - download the image directly 1066 | print(f"Result is URL, downloading: {result_url_or_data}") 1067 | try: 1068 | response = requests.get(result_url_or_data, timeout=30) 1069 | if response.status_code == 200: 1070 | result_image = Image.open(io.BytesIO(response.content)) 1071 | print(f"Successfully downloaded result image ({len(response.content)} bytes)") 1072 | else: 1073 | print(f"Failed to download result: HTTP {response.status_code}") 1074 | result_image = None 1075 | except Exception as e: 1076 | print(f"Error downloading result: {e}") 1077 | result_image = None 1078 | 1079 | if not result_image: 1080 | raise Exception("Failed to get result image") 1081 | 1082 | # Convert result image back to tensor 1083 | result_tensor = pil_to_tensor(result_image) 1084 | print("✓ Lookbook processing completed successfully!") 1085 | return (result_tensor,) 1086 | 1087 | except Exception as e: 1088 | print(f"Error in Lookbook processing: {e}") 1089 | # Return a red placeholder image in case of error 1090 | placeholder = Image.new('RGB', (512, 512), color=(255, 0, 0)) 1091 | placeholder_tensor = pil_to_tensor(placeholder) 1092 | print(f"Created error placeholder with shape: {placeholder_tensor.shape}") 1093 | return (placeholder_tensor,) 1094 | 1095 | NODE_CLASS_MAPPINGS = { 1096 | "VTONAPINode": VTONAPINode, 1097 | "VTONAPIPaidNode": VTONAPIPaidNode, 1098 | "VTONLookbookNode": VTONLookbookNode 1099 | } 1100 | 1101 | NODE_DISPLAY_NAME_MAPPINGS = { 1102 | "VTONAPINode": "sm4ll Wrapper Sampler - Demo Version", 1103 | "VTONAPIPaidNode": "sm4ll Wrapper Sampler - Paid API", 1104 | "VTONLookbookNode": "sm4ll Wrapper Lookbook Sampler - Paid API" 1105 | } 1106 | --------------------------------------------------------------------------------