├── .github └── workflows │ └── publish.yml ├── LICENSE ├── README.md ├── __init__.py ├── controlnet_fp8_node.py ├── examples ├── gguf_quantizer_workflow.json ├── workflow_controlnet_fp8_quantization-fast.json ├── workflow_integrated_quantization.json └── workflow_quantize.json ├── gguf_quantizer_node.py ├── nodes.py ├── pyproject.toml ├── requirements.txt └── web └── appearance.js /.github/workflows/publish.yml: -------------------------------------------------------------------------------- 1 | name: Publish to Comfy registry 2 | on: 3 | workflow_dispatch: 4 | push: 5 | branches: 6 | - main 7 | - master 8 | paths: 9 | - "pyproject.toml" 10 | 11 | permissions: 12 | issues: write 13 | 14 | jobs: 15 | publish-node: 16 | name: Publish Custom Node to registry 17 | runs-on: ubuntu-latest 18 | if: ${{ github.repository_owner == 'lum3on' }} 19 | steps: 20 | - name: Check out code 21 | uses: actions/checkout@v4 22 | with: 23 | submodules: true 24 | - name: Publish Custom Node 25 | uses: Comfy-Org/publish-node-action@v1 26 | with: 27 | ## Add your own personal access token to your Github Repository secrets and reference it here. 28 | personal_access_token: ${{ secrets.REGISTRY_ACCESS_TOKEN }} 29 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 lum3on 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # ComfyUI Model Quantizer 2 | 3 | A comprehensive custom node pack for ComfyUI that provides advanced tools for quantizing model weights to lower precision formats like FP16, BF16, and true FP8 types, with specialized support for ControlNet models. 4 | 5 | 6 | ![image](https://github.com/user-attachments/assets/070b741d-e682-4e08-a4b4-5be8b2abd64f) 7 | 8 | 9 | ## Overview 10 | 11 | This node pack provides powerful quantization tools directly within ComfyUI, including: 12 | 13 | ### Standard Quantization Nodes 14 | 1. **Model To State Dict**: Extracts the state dictionary from a model object and attempts to normalize keys. 15 | 2. **Quantize Model to FP8 Format**: Converts model weights directly to `float8_e4m3fn` or `float8_e5m2` format (requires CUDA). 16 | 3. 
**Quantize Model Scaled**: Applies simulated FP8 scaling (per-tensor or per-channel) and then casts the model to `float16`, `bfloat16`, or keeps the original format.
17 | 4. **Save As SafeTensor**: Saves the processed state dictionary to a `.safetensors` file at a specified path.
18 | 
19 | ### NEW: ControlNet FP8 Quantization Nodes
20 | 5. **ControlNet FP8 Quantizer**: Advanced FP8 quantization specifically designed for ControlNet models, with precision-aware quantization, tensor calibration, and ComfyUI folder integration.
21 | 6. **ControlNet Metadata Viewer**: Analyzes and displays ControlNet model metadata, tensor information, and structure for debugging and optimization.
22 | 
23 | ### NEW: GGUF Model Quantization
24 | 7. **GGUF Quantizer 👾**: Advanced GGUF quantization wrapper around City96's GGUF tools, optimized for diffusion models including WAN, HunyuanVid, and FLUX. Supports multiple quantization levels (F16, Q4_K_M, Q5_0, Q8_0, etc.) with automatic architecture detection and 5D tensor handling.
25 | 
26 | ## Installation
27 | 
28 | 1. Clone or download this repository into your ComfyUI's `custom_nodes` directory.
29 |    * Example using git:
30 |      ```bash
31 |      cd ComfyUI/custom_nodes
32 |      git clone https://github.com/lum3on/ComfyUI-ModelQuantizer.git ComfyUI-ModelQuantizer
33 |      # Clones the node pack into the ComfyUI-ModelQuantizer folder
34 |      ```
35 |    * Alternatively, download the ZIP and extract it into `ComfyUI/custom_nodes/ComfyUI-ModelQuantizer`.
36 | 
37 | 2. Install dependencies:
38 |    ```bash
39 |    cd ComfyUI/custom_nodes/ComfyUI-ModelQuantizer
40 |    pip install -r requirements.txt
41 |    ```
42 | 
43 | 3. **For ControlNet quantization**, ensure your ControlNet models are in the correct folder:
44 |    ```
45 |    ComfyUI/models/controlnet/
46 |    ├── control_v11p_sd15_canny.safetensors
47 |    ├── control_v11p_sd15_openpose.safetensors
48 |    └── ...
49 |    ```
50 | 
51 | 4. Restart ComfyUI.
52 | 
53 | ## Usage
54 | 
55 | ### Model To State Dict
56 | * **Category:** `Model Quantization/Utils`
57 | * **Function:** Extracts the state dict from a MODEL object, stripping common prefixes.
58 | * **Inputs**:
59 |     * `model`: The input `MODEL` object.
60 | * **Outputs**:
61 |     * `model_state_dict`: The extracted state dictionary.
62 | 
63 | ### Quantize Model to FP8 Format
64 | * **Category:** `Model Quantization/FP8 Direct`
65 | * **Function:** Converts model weights directly to a specific FP8 format. Requires CUDA.
66 | * **Inputs**:
67 |     * `model_state_dict`: The state dictionary to quantize.
68 |     * `fp8_format`: The target FP8 format (`float8_e5m2` or `float8_e4m3fn`).
69 | * **Outputs**:
70 |     * `quantized_model_state_dict`: The state dictionary with FP8 tensors.
71 | 
72 | ### Quantize Model Scaled
73 | * **Category:** `Model Quantization`
74 | * **Function:** Applies simulated FP8 value scaling and then casts to FP16, BF16, or keeps the original dtype. Useful for size reduction with good compatibility.
75 | * **Inputs**:
76 |     * `model_state_dict`: The state dictionary to quantize.
77 |     * `scaling_strategy`: How to simulate scaling (`per_tensor` or `per_channel`).
78 |     * `processing_device`: Where to perform calculations (`Auto`, `CPU`, `GPU`).
79 |     * `output_dtype`: Final data type (`Original`, `float16`, `bfloat16`). Defaults to `float16`.
80 | * **Outputs**:
81 |     * `quantized_model_state_dict`: The processed state dictionary.
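
Conceptually, the scaled path rescales each tensor into the FP8 range, round-trips it through the FP8 format, and then stores the result in the chosen output dtype. The sketch below only illustrates that idea for the per-tensor case; the helper name is ours and it is not the node's exact implementation:

```python
import torch

def simulate_fp8_then_cast(state_dict, output_dtype=torch.float16):
    """Illustrative per-tensor simulated-FP8 pass followed by a cast."""
    fp8_max = 448.0  # largest magnitude representable in float8_e4m3fn
    out = {}
    for name, tensor in state_dict.items():
        if not isinstance(tensor, torch.Tensor) or not tensor.is_floating_point():
            out[name] = tensor  # leave non-float entries untouched
            continue
        t = tensor.float()  # do the scaling math in float32
        scale = t.abs().max().clamp(min=1e-8) / fp8_max
        scaled = (t / scale).clamp(-fp8_max, fp8_max)
        if hasattr(torch, "float8_e4m3fn"):
            # Round-trip through FP8 to reproduce its precision loss
            scaled = scaled.to(torch.float8_e4m3fn).float()
        out[name] = (scaled * scale).to(output_dtype)
    return out
```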
82 | 83 | ### Save As SafeTensor 84 | * **Category:** `Model Quantization/Save` 85 | * **Function:** Saves the processed state dictionary to a `.safetensors` file. 86 | * **Inputs**: 87 | * `quantized_model_state_dict`: The state dictionary to save. 88 | * `absolute_save_path`: The full path (including filename) where the model will be saved. 89 | * **Outputs**: None (Output node). 90 | 91 | ### ControlNet FP8 Quantizer 92 | * **Category:** `Model Quantization/ControlNet` 93 | * **Function:** Advanced FP8 quantization specifically designed for ControlNet models with precision-aware quantization and tensor calibration. 94 | * **Inputs**: 95 | * `controlnet_model`: Dropdown selection of ControlNet models from `models/controlnet/` folder 96 | * `fp8_format`: FP8 format (`float8_e4m3fn` recommended, or `float8_e5m2`) 97 | * `quantization_strategy`: `per_tensor` (faster) or `per_channel` (better quality) 98 | * `activation_clipping`: Enable percentile-based outlier handling (recommended) 99 | * `custom_output_name`: Optional custom filename for output 100 | * `calibration_samples`: Number of samples for tensor calibration (10-1000, default: 100) 101 | * `preserve_metadata`: Preserve original metadata in output file 102 | * **Outputs**: 103 | * `status`: Operation status and result message 104 | * `metadata_info`: JSON-formatted metadata information 105 | * `quantization_stats`: Detailed compression statistics and ratios 106 | 107 | ### ControlNet Metadata Viewer 108 | * **Category:** `Model Quantization/ControlNet` 109 | * **Function:** Analyzes and displays ControlNet model metadata, tensor information, and structure. 110 | * **Inputs**: 111 | * `controlnet_model`: Dropdown selection of ControlNet models from `models/controlnet/` folder 112 | * **Outputs**: 113 | * `metadata`: JSON-formatted original metadata 114 | * `tensor_info`: Detailed tensor information including shapes, dtypes, and sizes 115 | * `model_analysis`: Model structure analysis including layer types and statistics 116 | 117 | ### GGUF Quantizer 👾 118 | * **Category:** `Model Quantization/GGUF` 119 | * **Function:** Advanced GGUF quantization wrapper around City96's GGUF tools for diffusion models. Supports automatic architecture detection and multiple quantization formats. 120 | * **Inputs**: 121 | * `model`: Input MODEL object (UNET/diffusion model) 122 | * `quantization_type`: Target quantization format (`F16`, `Q4_K_M`, `Q5_0`, `Q8_0`, `ALL`, etc.) 123 | * `output_path_template`: Output path template (relative or absolute) 124 | * `is_absolute_path`: Toggle between relative (ComfyUI output) and absolute path modes 125 | * `setup_environment`: Run llama.cpp setup if needed 126 | * `verbose_logging`: Enable detailed debug logging 127 | * **Outputs**: 128 | * `status_message`: Operation status and detailed progress information 129 | * `output_gguf_path_or_dir`: Path to generated GGUF file(s) 130 | 131 | **Supported Models:** 132 | - ✅ **WAN** (Weights Are Not) - Video generation models 133 | - ✅ **HunyuanVid** - Hunyuan video diffusion models 134 | - ✅ **FLUX** - FLUX diffusion models with proper tensor handling 135 | - 🚧 **LTX** - Coming soon 136 | - 🚧 **HiDream** - Coming soon 137 | 138 | ## Example Workflows 139 | 140 | ### Standard Model Quantization 141 | 1. Load a model using a standard loader (e.g., `Load Checkpoint`). 142 | 2. Connect the `MODEL` output to the `Model To State Dict` node. 143 | 3. Connect the `model_state_dict` output from `Model To State Dict` to `Quantize Model Scaled`. 144 | 4. 
In `Quantize Model Scaled`, select your desired `scaling_strategy` and set `output_dtype` to `float16` (for size reduction). 145 | 5. Connect the `quantized_model_state_dict` output from `Quantize Model Scaled` to the `Save Model as SafeTensor` node. 146 | 6. Specify the `absolute_save_path` in the `Save Model as SafeTensor` node. 147 | 7. Queue the prompt. 148 | 8. Restart ComfyUI or refresh loaders to find the saved model. 149 | 150 | ### ControlNet FP8 Quantization 151 | 1. Add `ControlNet FP8 Quantizer` node to your workflow. 152 | 2. Select your ControlNet model from the dropdown (automatically populated from `models/controlnet/`). 153 | 3. Configure settings: 154 | * **FP8 Format**: `float8_e4m3fn` (recommended for most cases) 155 | * **Strategy**: `per_channel` (better quality) or `per_tensor` (faster) 156 | * **Activation Clipping**: `True` (recommended for better quality) 157 | 4. Execute the workflow - quantized model automatically saved to `models/controlnet/quantized/`. 158 | 5. Use `ControlNet Metadata Viewer` to analyze original vs quantized models. 159 | 160 | ### Batch ControlNet Processing 161 | 1. Add multiple `ControlNet FP8 Quantizer` nodes. 162 | 2. Select different ControlNet models in each node. 163 | 3. Use consistent settings across all nodes. 164 | 4. Execute to process multiple models simultaneously. 165 | 166 | ### GGUF Model Quantization 167 | 1. Load your diffusion model using standard ComfyUI loaders. 168 | 2. Add `GGUF Quantizer 👾` node to your workflow. 169 | 3. Connect the `MODEL` output to the GGUF quantizer input. 170 | 4. Configure settings: 171 | * **Quantization Type**: `Q4_K_M` (recommended balance), `Q8_0` (higher quality), or `ALL` (generate multiple formats) 172 | * **Output Path**: Specify where to save (e.g., `models/unet/quantized/`) 173 | * **Verbose Logging**: Enable for detailed progress information 174 | 5. Execute workflow - quantized GGUF files will be saved to specified location. 175 | 6. Use quantized models with ComfyUI-GGUF loader nodes. 176 | 177 | **Note**: GGUF quantization requires significant RAM (96GB+) and processing time varies by model size. 
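
For reference, the ControlNet FP8 workflow above can also be reproduced as a plain Python script by driving the `ControlNetFP8Quantizer` class from `controlnet_fp8_node.py` directly. A minimal sketch, run from the node pack's folder; the model filename and output path are placeholders:

```python
from controlnet_fp8_node import ControlNetFP8Quantizer

# Configure the quantizer the same way the node does
quantizer = ControlNetFP8Quantizer(
    fp8_format="float8_e4m3fn",
    quantization_strategy="per_channel",
    activation_clipping=True,
)

# Load the original model together with its safetensors metadata
state_dict, metadata = quantizer.load_safetensors_with_metadata(
    "ComfyUI/models/controlnet/control_v11p_sd15_canny.safetensors"
)

# Quantize every eligible floating-point tensor and save the result
quantized = quantizer.quantize_state_dict(state_dict)
quantizer.save_quantized_model(
    quantized,
    "ComfyUI/models/controlnet/quantized/control_v11p_sd15_canny_fp8.safetensors",
    original_metadata=metadata,
)
```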
178 | 179 | ## Features 180 | 181 | ### Advanced ControlNet Quantization 182 | - **Precision-aware quantization** with tensor calibration and percentile-based scaling 183 | - **Two FP8 formats**: `float8_e4m3fn` (recommended) and `float8_e5m2` 184 | - **Quantization strategies**: per-tensor (faster) and per-channel (better quality) 185 | - **Automatic ComfyUI integration** with dropdown model selection 186 | - **Smart output management** - quantized models saved to `models/controlnet/quantized/` 187 | - **Comprehensive analysis** with metadata viewer and detailed statistics 188 | - **Fallback logic** for compatibility across different PyTorch versions 189 | 190 | ### Technical Capabilities 191 | - **~50% size reduction** with maintained quality 192 | - **Advanced tensor calibration** using statistical analysis 193 | - **Activation clipping** with outlier handling 194 | - **Metadata preservation** with quantization information 195 | - **Error handling** with graceful fallbacks 196 | - **Progress tracking** and detailed logging 197 | 198 | ### ComfyUI Integration 199 | - **Automatic model detection** from `models/controlnet/` folder 200 | - **Dropdown selection** - no manual path entry needed 201 | - **Auto-generated filenames** with format and strategy information 202 | - **Organized output** in dedicated quantized subfolder 203 | - **Seamless workflow integration** with existing ControlNet nodes 204 | 205 | ## Requirements 206 | 207 | ### Core Dependencies 208 | * PyTorch 2.0+ (for FP8 support, usually included with ComfyUI) 209 | * `safetensors` >= 0.3.1 210 | * `tqdm` >= 4.65.0 211 | 212 | ### Additional Dependencies (for ControlNet nodes) 213 | * `tensorflow` >= 2.13.0 (optional, for advanced optimization) 214 | * `tensorflow-model-optimization` >= 0.7.0 (optional) 215 | 216 | ### Hardware 217 | * CUDA-enabled GPU recommended for FP8 operations 218 | * CPU fallback available for compatibility 219 | 220 | ### GGUF Quantization Requirements 221 | * **Minimum 96GB RAM** - Required for processing large diffusion models 222 | * **Decent GPU** - For model loading and processing (VRAM requirements vary by model size) 223 | * **Storage Space** - GGUF files can be large during processing (temporary files cleaned up automatically) 224 | * **Python 3.8+** with PyTorch 2.0+ 225 | 226 | ## Troubleshooting 227 | 228 | ### ControlNet Nodes Not Appearing 229 | 1. Ensure all dependencies are installed: `pip install -r requirements.txt` 230 | 2. Check that ControlNet models are in `ComfyUI/models/controlnet/` folder 231 | 3. Restart ComfyUI completely 232 | 4. Check console for import errors 233 | 234 | ### "No models found" in Dropdown 235 | 1. Place ControlNet models in `ComfyUI/models/controlnet/` folder 236 | 2. Supported formats: `.safetensors`, `.pth` 237 | 3. Check file permissions 238 | 4. 
Use manual path input as fallback if needed
239 | 
240 | ### Quantization Errors
241 | - **"quantile() input tensor must be either float or double dtype"**: Fixed in the latest version
242 | - **CUDA out of memory**: Use CPU processing or reduce batch size
243 | - **FP8 not supported**: Upgrade PyTorch to 2.0+ or use the CPU fallback
244 | 
245 | ### Performance Tips
246 | - **For best quality**: Use `per_channel` + `activation_clipping` + `float8_e4m3fn`
247 | - **For speed**: Use `per_tensor` and reduce `calibration_samples`
248 | - **Memory issues**: Process models one at a time
249 | 
250 | ## Workflow Examples
251 | 
252 | Pre-made workflow JSON files are available in the `examples/` folder:
253 | - `gguf_quantizer_workflow.json` - GGUF quantization of a diffusion model
254 | - `workflow_controlnet_fp8_quantization-fast.json` - Basic ControlNet FP8 quantization
255 | - `workflow_integrated_quantization.json` - Integration with the standard quantization nodes
256 | - `workflow_quantize.json` - Standard model quantization
257 | 
258 | ## Development Roadmap & TODO
259 | 
260 | ### Completed Features ✅
261 | #### Standard Quantization
262 | - [x] **FP16 Quantization** - Standard half-precision quantization
263 | - [x] **BF16 Quantization** - Brain floating-point 16-bit format
264 | - [x] **FP8 Direct Quantization** - True FP8 formats (float8_e4m3fn, float8_e5m2)
265 | - [x] **FP8 Scaled Quantization** - Simulated FP8 with scaling strategies
266 | - [x] **Per-Tensor & Per-Channel Scaling** - Multiple quantization strategies
267 | - [x] **State Dict Extraction** - Model to state dictionary conversion
268 | - [x] **SafeTensors Export** - Reliable model saving format
269 | 
270 | #### ControlNet FP8 Integration
271 | - [x] **ControlNet FP8 Quantizer** - Specialized FP8 quantization for ControlNet models
272 | - [x] **Precision-Aware Quantization** - Advanced tensor calibration and scaling
273 | 
274 | #### GGUF Quantization
275 | - [x] **WAN Model Support** - Complete with 5D tensor handling
276 | - [x] **HunyuanVid Model Support** - Architecture detection and conversion
277 | - [x] **FLUX Model Support** - Proper tensor prefix handling and quantization
278 | - [x] **Automatic Architecture Detection** - Smart model type detection
279 | - [x] **5D Tensor Handling** - Special handling for complex tensor shapes
280 | - [x] **Path Management** - Robust absolute/relative path handling
281 | - [x] **Multiple GGUF Formats** - F16, Q4_K_M, Q5_0, Q8_0, and more
282 | 
283 | ### Upcoming Features 🚧
284 | - [ ] **LTX Model Support** - Integration planned for next release
285 | - [ ] **HiDream Model Support** - Integration planned for next release
286 | - [ ] **DFloat11 Quantization** - Ultra-low precision format coming soon
287 | - [ ] **Memory Optimization** - Reduce RAM requirements where possible
288 | - [ ] **Batch Processing** - Support for multiple models in a single operation
289 | 
290 | ### Known Issues
291 | - [ ] **High RAM Requirements** - Currently requires 96GB+ RAM for large models
292 | - [ ] **Processing Time** - Large models can take significant time to process
293 | - [ ] **Temporary File Cleanup** - Ensure all temporary files are properly cleaned up
294 | 
295 | ## Acknowledgments
296 | 
297 | This project wraps and extends [City96's GGUF tools](https://github.com/city96/ComfyUI-GGUF) for diffusion model quantization. Special thanks to City96 for the excellent GGUF implementation, and to the broader ComfyUI community for their contributions.
298 | 299 | ## License 300 | 301 | MIT (Or your chosen license) 302 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- 1 | # __init__.py 2 | # This file is necessary to make Python treat the directory as a package. 3 | # It's also where ComfyUI looks for node mappings. 4 | 5 | import sys 6 | import traceback 7 | 8 | # Import node classes from nodes.py 9 | # Renamed QuantizeScaled to QuantizeModel 10 | try: 11 | from .nodes import ModelToStateDict, QuantizeFP8Format, QuantizeModel, SaveAsSafeTensor 12 | except ImportError: 13 | # Fallback for direct execution or when relative imports fail 14 | from nodes import ModelToStateDict, QuantizeFP8Format, QuantizeModel, SaveAsSafeTensor 15 | 16 | # Import ControlNet FP8 quantization nodes 17 | ControlNetFP8QuantizeNode = None 18 | ControlNetMetadataViewerNode = None 19 | CONTROLNET_NODES_AVAILABLE = False 20 | 21 | try: 22 | try: 23 | from .controlnet_fp8_node import ControlNetFP8QuantizeNode, ControlNetMetadataViewerNode 24 | except ImportError: 25 | from controlnet_fp8_node import ControlNetFP8QuantizeNode, ControlNetMetadataViewerNode 26 | CONTROLNET_NODES_AVAILABLE = True 27 | print("✅ ControlNet FP8 nodes imported successfully") 28 | except Exception as e: 29 | print(f"❌ Failed to import ControlNet FP8 nodes: {e}") 30 | print(f"❌ ControlNet error details: {traceback.format_exc()}") 31 | 32 | # Import GGUF quantization nodes 33 | GGUFQuantizerNode = None # For the new GGUF node 34 | GGUF_NODES_AVAILABLE = False 35 | 36 | try: 37 | try: 38 | from .gguf_quantizer_node import GGUFQuantizerNode # Updated filename 39 | except ImportError: 40 | from gguf_quantizer_node import GGUFQuantizerNode 41 | GGUF_NODES_AVAILABLE = True 42 | print("✅ GGUF quantization nodes imported successfully") 43 | except Exception as e: 44 | print(f"❌ Failed to import GGUF quantization nodes: {e}") 45 | print(f"❌ GGUF error details: {traceback.format_exc()}") 46 | GGUF_NODES_AVAILABLE = False 47 | 48 | 49 | # A dictionary that ComfyUI uses to map node class names to node objects 50 | NODE_CLASS_MAPPINGS = { 51 | # Node Utils 52 | "ModelToStateDict": ModelToStateDict, 53 | # Direct FP8 Conversion 54 | "QuantizeFP8Format": QuantizeFP8Format, 55 | # Scaled Quantization + Casting (FP16/BF16) 56 | "QuantizeModel": QuantizeModel, # Renamed class QuantizeScaled -> QuantizeModel 57 | # Saving Node 58 | "SaveAsSafeTensor": SaveAsSafeTensor, 59 | } 60 | 61 | # Add ControlNet nodes if available 62 | if CONTROLNET_NODES_AVAILABLE and ControlNetFP8QuantizeNode is not None: 63 | NODE_CLASS_MAPPINGS.update({ 64 | # ControlNet FP8 Quantization Nodes 65 | "ControlNetFP8QuantizeNode": ControlNetFP8QuantizeNode, 66 | "ControlNetMetadataViewerNode": ControlNetMetadataViewerNode, 67 | }) 68 | print("✅ ControlNet FP8 nodes registered in NODE_CLASS_MAPPINGS") 69 | 70 | # Add GGUF nodes if available 71 | if GGUF_NODES_AVAILABLE and GGUFQuantizerNode is not None: 72 | NODE_CLASS_MAPPINGS["GGUFQuantizerNode"] = GGUFQuantizerNode # New GGUF node 73 | print("✅ GGUF quantization nodes registered in NODE_CLASS_MAPPINGS") 74 | else: 75 | print(f"❌ GGUF nodes NOT registered. 
Available: {GGUF_NODES_AVAILABLE}, Node: {GGUFQuantizerNode}") 76 | 77 | 78 | # A dictionary that ComfyUI uses to map node class names to display names 79 | NODE_DISPLAY_NAME_MAPPINGS = { 80 | "ModelToStateDict": "Model To State Dict", 81 | "QuantizeFP8Format": "Quantize Model to FP8 Format", 82 | "QuantizeModel": "Quantize Model Scaled", # Display name for the renamed node 83 | "SaveAsSafeTensor": "Save Model as SafeTensor", 84 | } 85 | 86 | # Add ControlNet display names if available 87 | if CONTROLNET_NODES_AVAILABLE: 88 | NODE_DISPLAY_NAME_MAPPINGS.update({ 89 | # ControlNet FP8 Quantization Node Display Names 90 | "ControlNetFP8QuantizeNode": "ControlNet FP8 Quantizer", 91 | "ControlNetMetadataViewerNode": "ControlNet Metadata Viewer", 92 | }) 93 | print("✅ ControlNet FP8 display names registered") 94 | 95 | # Add GGUF display names if available 96 | if GGUF_NODES_AVAILABLE and GGUFQuantizerNode is not None: # Check new node for display name 97 | NODE_DISPLAY_NAME_MAPPINGS.update({ 98 | "GGUFQuantizerNode": "GGUF Quantizer 👾", # Display name for GGUF 99 | }) 100 | print("✅ GGUF quantization display names registered") 101 | 102 | 103 | # Optional: Print a message to the console when the extension is loaded 104 | print("----------------------------------------------------") 105 | print("--- ComfyUI Quantization Node Pack Loaded ---") 106 | print("--- Renamed QuantizeScaled to QuantizeModel ---") 107 | # ... (other existing print messages you want to keep) ... 108 | print("--- NEW: ControlNet FP8 Quantization Nodes ---") 109 | print("--- NEW: GGUF Model Quantization ---") 110 | print("--- Developed by [Lum3on] ---") # Remember to change this! 111 | print("--- Version 0.8.2 ---") # Incremented version 112 | print("----------------------------------------------------") 113 | 114 | # Tell ComfyUI where to find web files (for appearance.js) 115 | WEB_DIRECTORY = "./web" 116 | 117 | __all__ = ['NODE_CLASS_MAPPINGS', 'NODE_DISPLAY_NAME_MAPPINGS', 'WEB_DIRECTORY'] -------------------------------------------------------------------------------- /controlnet_fp8_node.py: -------------------------------------------------------------------------------- 1 | # controlnet_fp8_node.py 2 | # ControlNet-specific FP8 quantization node for ComfyUI 3 | # Based on the provided safetensors helper script with ComfyUI integration 4 | 5 | import torch 6 | import json 7 | import os 8 | from safetensors.torch import load_file, save_file 9 | from tqdm import tqdm 10 | from typing import Dict, Any, Tuple, Optional, Union 11 | 12 | # ComfyUI imports for folder management 13 | try: 14 | import folder_paths 15 | COMFYUI_AVAILABLE = True 16 | print("✅ ComfyUI folder_paths imported successfully") 17 | except ImportError: 18 | COMFYUI_AVAILABLE = False 19 | print("⚠️ ComfyUI folder_paths not available - using manual paths") 20 | 21 | 22 | def get_controlnet_models(): 23 | """Get list of available ControlNet models from ComfyUI's models/controlnet folder.""" 24 | if COMFYUI_AVAILABLE: 25 | try: 26 | # Get ControlNet models from ComfyUI's folder system 27 | controlnet_models = folder_paths.get_filename_list("controlnet") 28 | if controlnet_models: 29 | print(f"✅ Found {len(controlnet_models)} ControlNet models") 30 | return controlnet_models 31 | else: 32 | print("⚠️ No ControlNet models found in models/controlnet folder") 33 | return ["No models found"] 34 | except Exception as e: 35 | print(f"⚠️ Error accessing ControlNet models: {e}") 36 | return ["Error accessing models"] 37 | else: 38 | # Fallback for when ComfyUI 
folder_paths is not available 39 | return ["manual_path_required"] 40 | 41 | 42 | def get_controlnet_model_path(model_name): 43 | """Get full path to a ControlNet model.""" 44 | if COMFYUI_AVAILABLE and model_name != "manual_path_required" and model_name != "No models found": 45 | try: 46 | return folder_paths.get_full_path("controlnet", model_name) 47 | except Exception as e: 48 | print(f"⚠️ Error getting model path for {model_name}: {e}") 49 | return None 50 | return None 51 | 52 | 53 | def get_output_folder(): 54 | """Get the output folder for quantized models.""" 55 | if COMFYUI_AVAILABLE: 56 | try: 57 | # Try to get the controlnet folder and create a quantized subfolder 58 | controlnet_folder = folder_paths.get_folder_paths("controlnet")[0] 59 | quantized_folder = os.path.join(controlnet_folder, "quantized") 60 | os.makedirs(quantized_folder, exist_ok=True) 61 | return quantized_folder 62 | except Exception as e: 63 | print(f"⚠️ Error creating quantized folder: {e}") 64 | return "models/controlnet/quantized" 65 | return "models/controlnet/quantized" 66 | 67 | class ControlNetFP8Quantizer: 68 | """ 69 | Advanced FP8 quantizer specifically designed for ControlNet models. 70 | Supports precision-aware quantization with tensor calibration and fallback logic. 71 | """ 72 | 73 | def __init__(self, 74 | fp8_format: str = "float8_e4m3fn", 75 | quantization_strategy: str = "per_tensor", 76 | activation_clipping: bool = True, 77 | calibration_samples: int = 100): 78 | """ 79 | Initialize the ControlNet FP8 quantizer. 80 | 81 | Args: 82 | fp8_format: FP8 format to use ('float8_e4m3fn' or 'float8_e5m2') 83 | quantization_strategy: 'per_tensor' or 'per_channel' 84 | activation_clipping: Whether to apply activation clipping 85 | calibration_samples: Number of samples for tensor calibration 86 | """ 87 | if not hasattr(torch, fp8_format): 88 | raise ValueError(f"Unsupported FP8 format: {fp8_format}") 89 | 90 | self.fp8_format = fp8_format 91 | self.quantization_strategy = quantization_strategy 92 | self.activation_clipping = activation_clipping 93 | self.calibration_samples = calibration_samples 94 | self.scale_factors = {} 95 | self.metadata = {} 96 | 97 | # FP8 format specific parameters 98 | if fp8_format == "float8_e4m3fn": 99 | self.max_val = 448.0 # Maximum representable value for e4m3fn 100 | self.min_val = -448.0 101 | else: # float8_e5m2 102 | self.max_val = 57344.0 # Maximum representable value for e5m2 103 | self.min_val = -57344.0 104 | 105 | def _analyze_tensor_statistics(self, tensor: torch.Tensor, layer_name: str) -> Dict[str, float]: 106 | """Analyze tensor statistics for calibration.""" 107 | with torch.no_grad(): 108 | # Ensure tensor is float for statistical operations 109 | if not tensor.is_floating_point(): 110 | working_tensor = tensor.float() 111 | else: 112 | working_tensor = tensor 113 | 114 | stats = { 115 | 'mean': working_tensor.mean().item(), 116 | 'std': working_tensor.std().item(), 117 | 'min': working_tensor.min().item(), 118 | 'max': working_tensor.max().item(), 119 | 'abs_max': working_tensor.abs().max().item(), 120 | 'sparsity': (working_tensor == 0).float().mean().item() 121 | } 122 | 123 | # Calculate percentiles for better calibration 124 | try: 125 | flattened = working_tensor.flatten() 126 | stats['p99'] = torch.quantile(torch.abs(flattened), 0.99).item() 127 | stats['p95'] = torch.quantile(torch.abs(flattened), 0.95).item() 128 | except Exception as e: 129 | print(f"[ControlNetFP8Quantizer] Warning: percentile calculation failed for {layer_name}: {e}") 130 | 
stats['p99'] = stats['abs_max'] 131 | stats['p95'] = stats['abs_max'] 132 | 133 | return stats 134 | 135 | def _calculate_optimal_scale(self, tensor: torch.Tensor, layer_name: str) -> torch.Tensor: 136 | """Calculate optimal scaling factor for quantization.""" 137 | device = tensor.device 138 | dtype = tensor.dtype 139 | 140 | # Ensure tensor is float for quantile operations 141 | if not tensor.is_floating_point(): 142 | # Convert to float32 for calculations 143 | working_tensor = tensor.float() 144 | target_dtype = torch.float32 145 | else: 146 | working_tensor = tensor 147 | target_dtype = dtype 148 | 149 | if self.quantization_strategy == "per_tensor": 150 | if self.activation_clipping: 151 | # Use 99th percentile for better outlier handling 152 | abs_tensor = torch.abs(working_tensor) 153 | try: 154 | scale_val = torch.quantile(abs_tensor.flatten(), 0.99) 155 | except Exception as e: 156 | print(f"[ControlNetFP8Quantizer] Warning: quantile failed for {layer_name}, using max: {e}") 157 | scale_val = torch.max(abs_tensor) 158 | else: 159 | scale_val = torch.max(torch.abs(working_tensor)) 160 | 161 | # Ensure scale is not zero 162 | scale = torch.max(scale_val, torch.tensor(1e-8, device=device, dtype=target_dtype)) 163 | 164 | elif self.quantization_strategy == "per_channel": 165 | # Assume first dimension is the channel dimension for ControlNet 166 | if working_tensor.ndim >= 2: 167 | dims_to_reduce = list(range(1, working_tensor.ndim)) 168 | if self.activation_clipping: 169 | # Per-channel percentile-based scaling 170 | abs_tensor = torch.abs(working_tensor) 171 | # Reshape for percentile calculation per channel 172 | reshaped = abs_tensor.view(working_tensor.shape[0], -1) 173 | try: 174 | scale = torch.quantile(reshaped, 0.99, dim=1, keepdim=False) 175 | # Reshape scale to match tensor dimensions for broadcasting 176 | for _ in range(len(dims_to_reduce)): 177 | scale = scale.unsqueeze(-1) 178 | except Exception as e: 179 | print(f"[ControlNetFP8Quantizer] Warning: per-channel quantile failed for {layer_name}, using max: {e}") 180 | scale = torch.amax(abs_tensor, dim=dims_to_reduce, keepdim=True) 181 | else: 182 | scale = torch.amax(torch.abs(working_tensor), dim=dims_to_reduce, keepdim=True) 183 | else: 184 | # Fallback to per-tensor for 1D tensors 185 | scale_val = torch.max(torch.abs(working_tensor)) 186 | scale = torch.max(scale_val, torch.tensor(1e-8, device=device, dtype=target_dtype)) 187 | 188 | # Ensure scale has minimum value to prevent division by zero 189 | scale = torch.clamp(scale, min=1e-8) 190 | 191 | return scale 192 | 193 | def _quantize_tensor_fp8(self, tensor: torch.Tensor, layer_name: str) -> torch.Tensor: 194 | """Quantize a single tensor to FP8 format with advanced calibration.""" 195 | if not tensor.is_floating_point(): 196 | return tensor 197 | 198 | original_device = tensor.device 199 | original_dtype = tensor.dtype 200 | 201 | # Move to CUDA if available for FP8 operations 202 | target_device = torch.device("cuda") if torch.cuda.is_available() else original_device 203 | tensor_on_device = tensor.to(target_device) 204 | 205 | # Calculate optimal scale 206 | scale = self._calculate_optimal_scale(tensor_on_device, layer_name) 207 | 208 | # Store scale factor for debugging/analysis 209 | if self.quantization_strategy == "per_tensor": 210 | self.scale_factors[layer_name] = scale.item() 211 | else: 212 | self.scale_factors[layer_name] = scale.squeeze().tolist() if scale.numel() > 1 else scale.item() 213 | 214 | # Perform quantization simulation 215 | # Scale tensor 
to FP8 range 216 | scaled_tensor = tensor_on_device / scale 217 | 218 | # Clamp to FP8 representable range 219 | if self.activation_clipping: 220 | # Use format-specific ranges 221 | if self.fp8_format == "float8_e4m3fn": 222 | clamped_tensor = torch.clamp(scaled_tensor, -448.0, 448.0) 223 | else: # float8_e5m2 224 | clamped_tensor = torch.clamp(scaled_tensor, -57344.0, 57344.0) 225 | else: 226 | clamped_tensor = scaled_tensor 227 | 228 | # Convert to target FP8 format 229 | try: 230 | target_dtype = getattr(torch, self.fp8_format) 231 | quantized_tensor = clamped_tensor.to(dtype=target_dtype) 232 | 233 | # Convert back to original dtype for compatibility (if needed) 234 | # For true FP8 storage, keep the FP8 dtype 235 | result_tensor = quantized_tensor 236 | 237 | except Exception as e: 238 | print(f"[ControlNetFP8Quantizer] Warning: FP8 conversion failed for {layer_name}: {e}") 239 | print(f"[ControlNetFP8Quantizer] Falling back to simulated quantization") 240 | 241 | # Fallback: simulate quantization effects without actual FP8 conversion 242 | # This maintains compatibility while approximating FP8 behavior 243 | simulated_quantized = torch.round(clamped_tensor * 127.0) / 127.0 * scale 244 | result_tensor = simulated_quantized.to(dtype=original_dtype) 245 | 246 | return result_tensor.to(original_device) 247 | 248 | def quantize_state_dict(self, state_dict: Dict[str, torch.Tensor]) -> Dict[str, torch.Tensor]: 249 | """Quantize an entire state dictionary.""" 250 | quantized_state_dict = {} 251 | 252 | # Filter tensors that should be quantized 253 | # Only quantize floating point tensors with sufficient size 254 | quantizable_tensors = {} 255 | skipped_tensors = {} 256 | 257 | for name, tensor in state_dict.items(): 258 | if not isinstance(tensor, torch.Tensor): 259 | skipped_tensors[name] = "Not a tensor" 260 | continue 261 | 262 | # Skip very small tensors (likely bias terms or scalars) 263 | if tensor.numel() < 4: 264 | skipped_tensors[name] = f"Too small ({tensor.numel()} elements)" 265 | continue 266 | 267 | # Skip non-floating point tensors 268 | if not tensor.is_floating_point(): 269 | skipped_tensors[name] = f"Non-float dtype ({tensor.dtype})" 270 | continue 271 | 272 | # Skip tensors that are likely indices or embeddings 273 | if any(keyword in name.lower() for keyword in ['index', 'embedding', 'position']): 274 | skipped_tensors[name] = "Likely embedding/index tensor" 275 | continue 276 | 277 | quantizable_tensors[name] = tensor 278 | 279 | print(f"[ControlNetFP8Quantizer] Quantizing {len(quantizable_tensors)} tensors to {self.fp8_format}") 280 | print(f"[ControlNetFP8Quantizer] Skipping {len(skipped_tensors)} tensors") 281 | print(f"[ControlNetFP8Quantizer] Strategy: {self.quantization_strategy}, Clipping: {self.activation_clipping}") 282 | 283 | # Log some skipped tensors for debugging 284 | if skipped_tensors: 285 | sample_skipped = list(skipped_tensors.items())[:3] 286 | for name, reason in sample_skipped: 287 | print(f"[ControlNetFP8Quantizer] Skipped '{name}': {reason}") 288 | 289 | # Progress bar for quantization 290 | progress_bar = tqdm( 291 | quantizable_tensors.items(), 292 | desc=f"FP8 Quantization ({self.fp8_format})", 293 | unit="tensor", 294 | leave=False 295 | ) 296 | 297 | for name, tensor in progress_bar: 298 | progress_bar.set_postfix({"layer": name[:30] + "..." 
if len(name) > 30 else name}) 299 | 300 | try: 301 | # Analyze tensor statistics 302 | stats = self._analyze_tensor_statistics(tensor, name) 303 | 304 | # Quantize tensor 305 | quantized_tensor = self._quantize_tensor_fp8(tensor.clone(), name) 306 | quantized_state_dict[name] = quantized_tensor 307 | 308 | # Log statistics for important layers 309 | if any(keyword in name.lower() for keyword in ['conv', 'linear', 'attention', 'norm']): 310 | print(f"[ControlNetFP8Quantizer] {name}: " 311 | f"abs_max={stats['abs_max']:.6f}, " 312 | f"sparsity={stats['sparsity']:.3f}, " 313 | f"scale={self.scale_factors.get(name, 'N/A')}") 314 | 315 | except Exception as e: 316 | print(f"[ControlNetFP8Quantizer] Error quantizing {name}: {e}") 317 | # Keep original tensor if quantization fails 318 | quantized_state_dict[name] = tensor 319 | 320 | # Copy non-quantizable tensors 321 | for name, tensor in state_dict.items(): 322 | if name not in quantized_state_dict: 323 | quantized_state_dict[name] = tensor 324 | 325 | return quantized_state_dict 326 | 327 | def load_safetensors_with_metadata(self, file_path: str) -> Tuple[Dict[str, torch.Tensor], Dict[str, Any]]: 328 | """Load safetensors file and extract metadata.""" 329 | # Read metadata from safetensors header 330 | with open(file_path, 'rb') as f: 331 | header_size = int.from_bytes(f.read(8), 'little') 332 | header_json = f.read(header_size).decode('utf-8') 333 | header = json.loads(header_json) 334 | metadata = header.get('__metadata__', {}) 335 | 336 | # Load the actual tensors 337 | state_dict = load_file(file_path) 338 | 339 | self.metadata = metadata 340 | return state_dict, metadata 341 | 342 | def save_quantized_model(self, 343 | quantized_state_dict: Dict[str, torch.Tensor], 344 | save_path: str, 345 | original_metadata: Optional[Dict[str, Any]] = None) -> bool: 346 | """Save quantized model with updated metadata.""" 347 | try: 348 | # Prepare metadata - ensure all values are strings for safetensors compatibility 349 | updated_metadata = {} 350 | if original_metadata: 351 | # Convert all original metadata values to strings 352 | for key, value in original_metadata.items(): 353 | updated_metadata[key] = str(value) 354 | 355 | # Add quantization metadata 356 | updated_metadata.update({ 357 | "quantization_format": self.fp8_format, 358 | "quantization_strategy": self.quantization_strategy, 359 | "activation_clipping": str(self.activation_clipping), 360 | "quantizer_version": "ControlNetFP8Quantizer_v1.0", 361 | "scale_factors_sample": str(list(self.scale_factors.items())[:3]) # Sample for debugging 362 | }) 363 | 364 | # Ensure directory exists 365 | os.makedirs(os.path.dirname(save_path), exist_ok=True) 366 | 367 | # Move tensors to CPU for saving 368 | cpu_state_dict = {} 369 | for name, tensor in quantized_state_dict.items(): 370 | if isinstance(tensor, torch.Tensor): 371 | cpu_state_dict[name] = tensor.cpu() 372 | else: 373 | cpu_state_dict[name] = tensor 374 | 375 | # Save with metadata 376 | save_file(cpu_state_dict, save_path, metadata=updated_metadata) 377 | 378 | print(f"[ControlNetFP8Quantizer] Successfully saved quantized model to: {save_path}") 379 | return True 380 | 381 | except Exception as e: 382 | print(f"[ControlNetFP8Quantizer] Error saving model: {e}") 383 | return False 384 | 385 | 386 | # ComfyUI Node Implementation 387 | class ControlNetFP8QuantizeNode: 388 | """ 389 | ComfyUI node for ControlNet FP8 quantization with advanced features. 390 | Supports loading, quantizing, and saving ControlNet models in FP8 format. 
391 | """ 392 | 393 | @classmethod 394 | def INPUT_TYPES(cls): 395 | # Get available ControlNet models 396 | controlnet_models = get_controlnet_models() 397 | 398 | input_types = { 399 | "required": { 400 | "controlnet_model": (controlnet_models, { 401 | "default": controlnet_models[0] if controlnet_models else "No models found" 402 | }), 403 | "fp8_format": (["float8_e4m3fn", "float8_e5m2"], { 404 | "default": "float8_e4m3fn" 405 | }), 406 | "quantization_strategy": (["per_tensor", "per_channel"], { 407 | "default": "per_tensor" 408 | }), 409 | "activation_clipping": ("BOOLEAN", { 410 | "default": True 411 | }), 412 | }, 413 | "optional": { 414 | "custom_output_name": ("STRING", { 415 | "default": "", 416 | "multiline": False, 417 | "placeholder": "Custom output filename (optional)" 418 | }), 419 | "calibration_samples": ("INT", { 420 | "default": 100, 421 | "min": 10, 422 | "max": 1000, 423 | "step": 10 424 | }), 425 | "preserve_metadata": ("BOOLEAN", { 426 | "default": True 427 | }), 428 | } 429 | } 430 | 431 | # Add manual path option if ComfyUI folder system is not available 432 | if not COMFYUI_AVAILABLE or "manual_path_required" in controlnet_models: 433 | input_types["optional"]["manual_path"] = ("STRING", { 434 | "default": "", 435 | "multiline": False, 436 | "placeholder": "Manual path to ControlNet model (if not using dropdown)" 437 | }) 438 | 439 | return input_types 440 | 441 | RETURN_TYPES = ("STRING", "STRING", "STRING") 442 | RETURN_NAMES = ("status", "metadata_info", "quantization_stats") 443 | FUNCTION = "quantize_controlnet" 444 | CATEGORY = "Model Quantization/ControlNet" 445 | OUTPUT_NODE = True 446 | 447 | def quantize_controlnet(self, 448 | controlnet_model: str, 449 | fp8_format: str, 450 | quantization_strategy: str, 451 | activation_clipping: bool, 452 | custom_output_name: str = "", 453 | calibration_samples: int = 100, 454 | preserve_metadata: bool = True, 455 | manual_path: str = ""): 456 | """ 457 | Main function to quantize ControlNet models to FP8 format. 
458 | """ 459 | try: 460 | # Determine the actual model path 461 | if manual_path and os.path.exists(manual_path): 462 | # Use manual path if provided and exists 463 | safetensors_path = manual_path 464 | print(f"[ControlNetFP8QuantizeNode] Using manual path: {safetensors_path}") 465 | else: 466 | # Use dropdown selection 467 | safetensors_path = get_controlnet_model_path(controlnet_model) 468 | if not safetensors_path: 469 | error_msg = f"Could not find model: {controlnet_model}" 470 | print(f"[ControlNetFP8QuantizeNode] Error: {error_msg}") 471 | return (f"ERROR: {error_msg}", "", "") 472 | print(f"[ControlNetFP8QuantizeNode] Using selected model: {controlnet_model}") 473 | 474 | # Validate that the file exists 475 | if not os.path.exists(safetensors_path): 476 | error_msg = f"Model file not found: {safetensors_path}" 477 | print(f"[ControlNetFP8QuantizeNode] Error: {error_msg}") 478 | return (f"ERROR: {error_msg}", "", "") 479 | 480 | # Generate output path 481 | output_folder = get_output_folder() 482 | if custom_output_name: 483 | output_filename = custom_output_name 484 | if not output_filename.endswith('.safetensors'): 485 | output_filename += '.safetensors' 486 | else: 487 | base_name = os.path.splitext(os.path.basename(safetensors_path))[0] 488 | output_filename = f"{base_name}_fp8_{fp8_format}.safetensors" 489 | 490 | output_path = os.path.join(output_folder, output_filename) 491 | print(f"[ControlNetFP8QuantizeNode] Output path: {output_path}") 492 | 493 | # Initialize quantizer 494 | quantizer = ControlNetFP8Quantizer( 495 | fp8_format=fp8_format, 496 | quantization_strategy=quantization_strategy, 497 | activation_clipping=activation_clipping, 498 | calibration_samples=calibration_samples 499 | ) 500 | 501 | print(f"[ControlNetFP8QuantizeNode] Loading model from: {safetensors_path}") 502 | 503 | # Load model and metadata 504 | state_dict, metadata = quantizer.load_safetensors_with_metadata(safetensors_path) 505 | 506 | # Analyze model structure 507 | total_tensors = len(state_dict) 508 | quantizable_tensors = sum(1 for v in state_dict.values() 509 | if isinstance(v, torch.Tensor) and v.is_floating_point()) 510 | 511 | print(f"[ControlNetFP8QuantizeNode] Model loaded: {total_tensors} total tensors, " 512 | f"{quantizable_tensors} quantizable") 513 | 514 | # Perform quantization 515 | print(f"[ControlNetFP8QuantizeNode] Starting quantization...") 516 | quantized_state_dict = quantizer.quantize_state_dict(state_dict) 517 | 518 | # Calculate statistics 519 | original_size = sum(v.numel() * v.element_size() for v in state_dict.values() 520 | if isinstance(v, torch.Tensor)) 521 | quantized_size = sum(v.numel() * v.element_size() for v in quantized_state_dict.values() 522 | if isinstance(v, torch.Tensor)) 523 | 524 | compression_ratio = original_size / quantized_size if quantized_size > 0 else 1.0 525 | 526 | # Save quantized model 527 | save_metadata = metadata if preserve_metadata else {} 528 | success = quantizer.save_quantized_model(quantized_state_dict, output_path, save_metadata) 529 | 530 | if success: 531 | status_msg = f"SUCCESS: Quantized model saved to {output_path}" 532 | 533 | # Prepare metadata info 534 | metadata_info = json.dumps({ 535 | "original_metadata": metadata, 536 | "quantization_metadata": { 537 | "fp8_format": fp8_format, 538 | "quantization_strategy": quantization_strategy, 539 | "activation_clipping": activation_clipping, 540 | "calibration_samples": calibration_samples 541 | } 542 | }, indent=2) 543 | 544 | # Prepare quantization statistics 545 | 
stats_info = json.dumps({ 546 | "total_tensors": total_tensors, 547 | "quantizable_tensors": quantizable_tensors, 548 | "original_size_mb": round(original_size / (1024 * 1024), 2), 549 | "quantized_size_mb": round(quantized_size / (1024 * 1024), 2), 550 | "compression_ratio": round(compression_ratio, 2), 551 | "scale_factors_sample": dict(list(quantizer.scale_factors.items())[:5]) 552 | }, indent=2) 553 | 554 | print(f"[ControlNetFP8QuantizeNode] Quantization completed successfully!") 555 | print(f"[ControlNetFP8QuantizeNode] Compression ratio: {compression_ratio:.2f}x") 556 | 557 | return (status_msg, metadata_info, stats_info) 558 | else: 559 | error_msg = "Failed to save quantized model" 560 | return (f"ERROR: {error_msg}", "", "") 561 | 562 | except Exception as e: 563 | error_msg = f"Quantization failed: {str(e)}" 564 | print(f"[ControlNetFP8QuantizeNode] Error: {error_msg}") 565 | import traceback 566 | traceback.print_exc() 567 | return (f"ERROR: {error_msg}", "", "") 568 | 569 | 570 | class ControlNetMetadataViewerNode: 571 | """ 572 | ComfyUI node for viewing ControlNet model metadata and structure. 573 | """ 574 | 575 | @classmethod 576 | def INPUT_TYPES(cls): 577 | # Get available ControlNet models 578 | controlnet_models = get_controlnet_models() 579 | 580 | input_types = { 581 | "required": { 582 | "controlnet_model": (controlnet_models, { 583 | "default": controlnet_models[0] if controlnet_models else "No models found" 584 | }), 585 | } 586 | } 587 | 588 | # Add manual path option if ComfyUI folder system is not available 589 | if not COMFYUI_AVAILABLE or "manual_path_required" in controlnet_models: 590 | input_types["optional"] = { 591 | "manual_path": ("STRING", { 592 | "default": "", 593 | "multiline": False, 594 | "placeholder": "Manual path to ControlNet model (if not using dropdown)" 595 | }) 596 | } 597 | 598 | return input_types 599 | 600 | RETURN_TYPES = ("STRING", "STRING", "STRING") 601 | RETURN_NAMES = ("metadata", "tensor_info", "model_analysis") 602 | FUNCTION = "analyze_model" 603 | CATEGORY = "Model Quantization/ControlNet" 604 | OUTPUT_NODE = True 605 | 606 | def analyze_model(self, controlnet_model: str, manual_path: str = ""): 607 | """Analyze and display ControlNet model information.""" 608 | try: 609 | # Determine the actual model path 610 | if manual_path and os.path.exists(manual_path): 611 | # Use manual path if provided and exists 612 | safetensors_path = manual_path 613 | print(f"[ControlNetMetadataViewerNode] Using manual path: {safetensors_path}") 614 | else: 615 | # Use dropdown selection 616 | safetensors_path = get_controlnet_model_path(controlnet_model) 617 | if not safetensors_path: 618 | error_msg = f"Could not find model: {controlnet_model}" 619 | print(f"[ControlNetMetadataViewerNode] Error: {error_msg}") 620 | return (f"ERROR: {error_msg}", "", "") 621 | print(f"[ControlNetMetadataViewerNode] Analyzing model: {controlnet_model}") 622 | 623 | # Validate that the file exists 624 | if not os.path.exists(safetensors_path): 625 | error_msg = f"Model file not found: {safetensors_path}" 626 | print(f"[ControlNetMetadataViewerNode] Error: {error_msg}") 627 | return (f"ERROR: {error_msg}", "", "") 628 | 629 | # Load metadata 630 | with open(safetensors_path, 'rb') as f: 631 | header_size = int.from_bytes(f.read(8), 'little') 632 | header_json = f.read(header_size).decode('utf-8') 633 | header = json.loads(header_json) 634 | metadata = header.get('__metadata__', {}) 635 | 636 | # Load tensors for analysis 637 | state_dict = 
load_file(safetensors_path) 638 | 639 | # Analyze tensor information 640 | tensor_analysis = {} 641 | total_params = 0 642 | dtype_counts = {} 643 | 644 | for name, tensor in state_dict.items(): 645 | if isinstance(tensor, torch.Tensor): 646 | total_params += tensor.numel() 647 | dtype_str = str(tensor.dtype) 648 | dtype_counts[dtype_str] = dtype_counts.get(dtype_str, 0) + 1 649 | 650 | tensor_analysis[name] = { 651 | "shape": list(tensor.shape), 652 | "dtype": dtype_str, 653 | "device": str(tensor.device), 654 | "numel": tensor.numel(), 655 | "size_mb": round(tensor.numel() * tensor.element_size() / (1024 * 1024), 4) 656 | } 657 | 658 | # Model analysis 659 | model_analysis = { 660 | "total_tensors": len(state_dict), 661 | "total_parameters": total_params, 662 | "total_size_mb": round(sum(t.numel() * t.element_size() for t in state_dict.values() 663 | if isinstance(t, torch.Tensor)) / (1024 * 1024), 2), 664 | "dtype_distribution": dtype_counts, 665 | "layer_types": self._analyze_layer_types(list(state_dict.keys())) 666 | } 667 | 668 | # Format outputs 669 | metadata_str = json.dumps(metadata, indent=2) if metadata else "No metadata found" 670 | tensor_info_str = json.dumps(tensor_analysis, indent=2) 671 | analysis_str = json.dumps(model_analysis, indent=2) 672 | 673 | return (metadata_str, tensor_info_str, analysis_str) 674 | 675 | except Exception as e: 676 | error_msg = f"Analysis failed: {str(e)}" 677 | print(f"[ControlNetMetadataViewerNode] Error: {error_msg}") 678 | return (f"ERROR: {error_msg}", "", "") 679 | 680 | def _analyze_layer_types(self, layer_names): 681 | """Analyze the types of layers in the model.""" 682 | layer_types = {} 683 | for name in layer_names: 684 | if 'conv' in name.lower(): 685 | layer_types['convolution'] = layer_types.get('convolution', 0) + 1 686 | elif 'linear' in name.lower() or 'fc' in name.lower(): 687 | layer_types['linear'] = layer_types.get('linear', 0) + 1 688 | elif 'norm' in name.lower() or 'bn' in name.lower(): 689 | layer_types['normalization'] = layer_types.get('normalization', 0) + 1 690 | elif 'attention' in name.lower() or 'attn' in name.lower(): 691 | layer_types['attention'] = layer_types.get('attention', 0) + 1 692 | elif 'embed' in name.lower(): 693 | layer_types['embedding'] = layer_types.get('embedding', 0) + 1 694 | else: 695 | layer_types['other'] = layer_types.get('other', 0) + 1 696 | return layer_types 697 | -------------------------------------------------------------------------------- /examples/gguf_quantizer_workflow.json: -------------------------------------------------------------------------------- 1 | { 2 | "id": "fac53e6c-027e-4d62-b631-9502460b54fa", 3 | "revision": 0, 4 | "last_node_id": 12, 5 | "last_link_id": 11, 6 | "nodes": [ 7 | { 8 | "id": 12, 9 | "type": "UNETLoader", 10 | "pos": [ 11 | -116.2187271118164, 12 | 70.46971893310547 13 | ], 14 | "size": [ 15 | 270, 16 | 82 17 | ], 18 | "flags": {}, 19 | "order": 0, 20 | "mode": 0, 21 | "inputs": [], 22 | "outputs": [ 23 | { 24 | "name": "MODEL", 25 | "type": "MODEL", 26 | "links": [ 27 | 11 28 | ] 29 | } 30 | ], 31 | "properties": { 32 | "cnr_id": "comfy-core", 33 | "ver": "0.3.40", 34 | "Node name for S&R": "UNETLoader", 35 | "enableTabs": false, 36 | "tabWidth": 65, 37 | "tabXOffset": 10, 38 | "hasSecondTab": false, 39 | "secondTabText": "Send Back", 40 | "secondTabOffset": 80, 41 | "secondTabWidth": 65, 42 | "widget_ue_connectable": {} 43 | }, 44 | "widgets_values": [ 45 | "DG_Wan_1_3b_t2v_boost_stock_V1_new.safetensors", 46 | "default" 47 | ] 48 | }, 49 | { 50 
| "id": 9, 51 | "type": "PreviewAny", 52 | "pos": [ 53 | 599.5283813476562, 54 | 65.43692779541016 55 | ], 56 | "size": [ 57 | 241.1754913330078, 58 | 389.3494873046875 59 | ], 60 | "flags": {}, 61 | "order": 2, 62 | "mode": 0, 63 | "inputs": [ 64 | { 65 | "name": "source", 66 | "type": "*", 67 | "link": 8 68 | } 69 | ], 70 | "outputs": [], 71 | "properties": { 72 | "cnr_id": "comfy-core", 73 | "ver": "0.3.40", 74 | "Node name for S&R": "PreviewAny", 75 | "enableTabs": false, 76 | "tabWidth": 65, 77 | "tabXOffset": 10, 78 | "hasSecondTab": false, 79 | "secondTabText": "Send Back", 80 | "secondTabOffset": 80, 81 | "secondTabWidth": 65, 82 | "widget_ue_connectable": {} 83 | }, 84 | "widgets_values": [] 85 | }, 86 | { 87 | "id": 2, 88 | "type": "GGUFQuantizerNode", 89 | "pos": [ 90 | 169.37864685058594, 91 | 62.260032653808594 92 | ], 93 | "size": [ 94 | 400, 95 | 200 96 | ], 97 | "flags": {}, 98 | "order": 1, 99 | "mode": 0, 100 | "inputs": [ 101 | { 102 | "name": "model", 103 | "type": "MODEL", 104 | "link": 11 105 | } 106 | ], 107 | "outputs": [ 108 | { 109 | "name": "status_message", 110 | "type": "STRING", 111 | "slot_index": 0, 112 | "links": [ 113 | 8 114 | ] 115 | }, 116 | { 117 | "name": "output_gguf_path_or_dir", 118 | "type": "STRING", 119 | "slot_index": 1, 120 | "links": [ 121 | 3 122 | ] 123 | } 124 | ], 125 | "properties": { 126 | "aux_id": "lum3on/ComfyUI-ModelQuantizer", 127 | "ver": "7c4af70596c4e57e15284eca28487254b369c633", 128 | "Node name for S&R": "GGUFQuantizerNode", 129 | "enableTabs": false, 130 | "tabWidth": 65, 131 | "tabXOffset": 10, 132 | "hasSecondTab": false, 133 | "secondTabText": "Send Back", 134 | "secondTabOffset": 80, 135 | "secondTabWidth": 65, 136 | "widget_ue_connectable": {} 137 | }, 138 | "widgets_values": [ 139 | "Q5_0", 140 | "C:\\Users\\RAIIN Studios\\Documents\\protable\\ComfyUI\\models\\unet\\ace", 141 | true, 142 | true, 143 | true 144 | ] 145 | } 146 | ], 147 | "links": [ 148 | [ 149 | 3, 150 | 2, 151 | 1, 152 | 4, 153 | 0, 154 | "STRING" 155 | ], 156 | [ 157 | 7, 158 | 1, 159 | 0, 160 | 7, 161 | 0, 162 | "*" 163 | ], 164 | [ 165 | 8, 166 | 2, 167 | 0, 168 | 9, 169 | 0, 170 | "*" 171 | ], 172 | [ 173 | 11, 174 | 12, 175 | 0, 176 | 2, 177 | 0, 178 | "MODEL" 179 | ] 180 | ], 181 | "groups": [ 182 | { 183 | "id": 2, 184 | "title": "Group", 185 | "bounding": [ 186 | -128.0432891845703, 187 | -10.42851448059082, 188 | 977.8348999023438, 189 | 477.9517822265625 190 | ], 191 | "color": "#3f789e", 192 | "font_size": 24, 193 | "flags": {} 194 | } 195 | ], 196 | "config": {}, 197 | "extra": { 198 | "ue_links": [], 199 | "ds": { 200 | "scale": 0.7972024500000005, 201 | "offset": [ 202 | 158.59439579954568, 203 | 333.5982287869171 204 | ] 205 | }, 206 | "links_added_by_ue": [], 207 | "frontendVersion": "1.21.7", 208 | "VHS_latentpreview": false, 209 | "VHS_latentpreviewrate": 0, 210 | "VHS_MetadataImage": true, 211 | "VHS_KeepIntermediate": true 212 | }, 213 | "version": 0.4 214 | } -------------------------------------------------------------------------------- /examples/workflow_controlnet_fp8_quantization-fast.json: -------------------------------------------------------------------------------- 1 | { 2 | "id": "a6338bdf-6b8a-421c-acc9-cd6aaa53fdc9", 3 | "revision": 0, 4 | "last_node_id": 20, 5 | "last_link_id": 23, 6 | "nodes": [ 7 | { 8 | "id": 5, 9 | "type": "ControlNetFP8QuantizeNode", 10 | "pos": [ 11 | 466.84832763671875, 12 | 164.81739807128906 13 | ], 14 | "size": [ 15 | 718.7396240234375, 16 | 304.13946533203125 17 | ], 18 | "flags": {}, 19 | 
"order": 1, 20 | "mode": 0, 21 | "inputs": [ 22 | { 23 | "name": "custom_output_name", 24 | "shape": 7, 25 | "type": "STRING", 26 | "widget": { 27 | "name": "custom_output_name" 28 | }, 29 | "link": 23 30 | } 31 | ], 32 | "outputs": [ 33 | { 34 | "name": "status", 35 | "type": "STRING", 36 | "slot_index": 0, 37 | "links": [] 38 | }, 39 | { 40 | "name": "metadata_info", 41 | "type": "STRING", 42 | "slot_index": 1, 43 | "links": [] 44 | }, 45 | { 46 | "name": "quantization_stats", 47 | "type": "STRING", 48 | "slot_index": 2, 49 | "links": [] 50 | } 51 | ], 52 | "properties": { 53 | "aux_id": "lum3on/ComfyUI-ModelQuantizer", 54 | "ver": "07592b8f05b40a2a185be9c7c7539a427b466c5d", 55 | "Node name for S&R": "ControlNetFP8QuantizeNode", 56 | "enableTabs": false, 57 | "tabWidth": 65, 58 | "tabXOffset": 10, 59 | "hasSecondTab": false, 60 | "secondTabText": "Send Back", 61 | "secondTabOffset": 80, 62 | "secondTabWidth": 65, 63 | "widget_ue_connectable": {} 64 | }, 65 | "widgets_values": [ 66 | "flux\\Flux.1-dev-Controlnet-Upscaler.safetensors", 67 | "float8_e4m3fn", 68 | "per_tensor", 69 | true, 70 | "fluxcn-up.fp8", 71 | 100, 72 | true 73 | ], 74 | "color": "#432", 75 | "bgcolor": "#653" 76 | }, 77 | { 78 | "id": 20, 79 | "type": "String Literal", 80 | "pos": [ 81 | 479.8630065917969, 82 | 522.4603881835938 83 | ], 84 | "size": [ 85 | 705.0504760742188, 86 | 210.10101318359375 87 | ], 88 | "flags": {}, 89 | "order": 0, 90 | "mode": 0, 91 | "inputs": [], 92 | "outputs": [ 93 | { 94 | "name": "STRING", 95 | "type": "STRING", 96 | "links": [ 97 | 23 98 | ] 99 | } 100 | ], 101 | "title": "outputname", 102 | "properties": { 103 | "cnr_id": "comfy-image-saver", 104 | "ver": "65e6903eff274a50f8b5cd768f0f96baf37baea1", 105 | "widget_ue_connectable": {}, 106 | "Node name for S&R": "String Literal", 107 | "enableTabs": false, 108 | "tabWidth": 65, 109 | "tabXOffset": 10, 110 | "hasSecondTab": false, 111 | "secondTabText": "Send Back", 112 | "secondTabOffset": 80, 113 | "secondTabWidth": 65 114 | }, 115 | "widgets_values": [ 116 | "" 117 | ] 118 | } 119 | ], 120 | "links": [ 121 | [ 122 | 23, 123 | 20, 124 | 0, 125 | 5, 126 | 0, 127 | "STRING" 128 | ] 129 | ], 130 | "groups": [ 131 | { 132 | "id": 2, 133 | "title": "FP8 E4M3FN Quantization", 134 | "bounding": [ 135 | 420.9110412597656, 136 | 68.4666519165039, 137 | 791.4306030273438, 138 | 695.5747680664062 139 | ], 140 | "color": "#8A8", 141 | "font_size": 24, 142 | "flags": {} 143 | } 144 | ], 145 | "config": {}, 146 | "extra": { 147 | "ds": { 148 | "scale": 0.3719008264462851, 149 | "offset": [ 150 | 1014.0693446689652, 151 | 50.9154350884251 152 | ] 153 | }, 154 | "ue_links": [], 155 | "links_added_by_ue": [], 156 | "frontendVersion": "1.20.6", 157 | "VHS_latentpreview": false, 158 | "VHS_latentpreviewrate": 0, 159 | "VHS_MetadataImage": true, 160 | "VHS_KeepIntermediate": true 161 | }, 162 | "version": 0.4 163 | } -------------------------------------------------------------------------------- /examples/workflow_integrated_quantization.json: -------------------------------------------------------------------------------- 1 | { 2 | "id": "dd73e3fe-ef87-4f21-ad20-fbc711ebc0f7", 3 | "revision": 0, 4 | "last_node_id": 18, 5 | "last_link_id": 22, 6 | "nodes": [ 7 | { 8 | "id": 9, 9 | "type": "ControlNetFP8QuantizeNode", 10 | "pos": [ 11 | 614.6102294921875, 12 | 109.16492462158203 13 | ], 14 | "size": [ 15 | 400, 16 | 280 17 | ], 18 | "flags": {}, 19 | "order": 3, 20 | "mode": 0, 21 | "inputs": [ 22 | { 23 | "name": "custom_output_name", 24 | "shape": 7, 25 
| "type": "STRING", 26 | "widget": { 27 | "name": "custom_output_name" 28 | }, 29 | "link": 22 30 | } 31 | ], 32 | "outputs": [ 33 | { 34 | "name": "status", 35 | "type": "STRING", 36 | "links": [] 37 | }, 38 | { 39 | "name": "metadata_info", 40 | "type": "STRING", 41 | "links": [ 42 | 10 43 | ] 44 | }, 45 | { 46 | "name": "quantization_stats", 47 | "type": "STRING", 48 | "links": [] 49 | } 50 | ], 51 | "properties": { 52 | "aux_id": "lum3on/ComfyUI-ModelQuantizer", 53 | "ver": "07592b8f05b40a2a185be9c7c7539a427b466c5d", 54 | "Node name for S&R": "ControlNetFP8QuantizeNode", 55 | "enableTabs": false, 56 | "tabWidth": 65, 57 | "tabXOffset": 10, 58 | "hasSecondTab": false, 59 | "secondTabText": "Send Back", 60 | "secondTabOffset": 80, 61 | "secondTabWidth": 65, 62 | "widget_ue_connectable": {} 63 | }, 64 | "widgets_values": [ 65 | "models/controlnet/control_v11p_sd15_openpose.safetensors", 66 | "float8_e4m3fn", 67 | "per_channel", 68 | true, 69 | "models/controlnet/quantized/control_v11p_sd15_openpose_fp8.safetensors", 70 | 100, 71 | true 72 | ], 73 | "color": "#432", 74 | "bgcolor": "#653" 75 | }, 76 | { 77 | "id": 18, 78 | "type": "PrimitiveNode", 79 | "pos": [ 80 | 1038.4066162109375, 81 | 132.72703552246094 82 | ], 83 | "size": [ 84 | 310.17529296875, 85 | 140.4438018798828 86 | ], 87 | "flags": {}, 88 | "order": 0, 89 | "mode": 0, 90 | "inputs": [], 91 | "outputs": [ 92 | { 93 | "name": "STRING", 94 | "type": "STRING", 95 | "widget": { 96 | "name": "custom_output_name" 97 | }, 98 | "links": [ 99 | 22 100 | ] 101 | } 102 | ], 103 | "title": "output path", 104 | "properties": { 105 | "Run widget replace on values": false 106 | }, 107 | "widgets_values": [ 108 | "models/controlnet/quantized/control_v11p_sd15_openpose_fp8.safetensors" 109 | ] 110 | }, 111 | { 112 | "id": 5, 113 | "type": "SaveAsSafeTensor", 114 | "pos": [ 115 | -179.10694885253906, 116 | 411.8915100097656 117 | ], 118 | "size": [ 119 | 350, 120 | 100 121 | ], 122 | "flags": {}, 123 | "order": 7, 124 | "mode": 0, 125 | "inputs": [ 126 | { 127 | "name": "quantized_model_state_dict", 128 | "type": "MODEL_STATE_DICT", 129 | "link": 4 130 | } 131 | ], 132 | "outputs": [], 133 | "properties": { 134 | "aux_id": "lum3on/ComfyUI-ModelQuantizer", 135 | "ver": "07592b8f05b40a2a185be9c7c7539a427b466c5d", 136 | "Node name for S&R": "SaveAsSafeTensor", 137 | "enableTabs": false, 138 | "tabWidth": 65, 139 | "tabXOffset": 10, 140 | "hasSecondTab": false, 141 | "secondTabText": "Send Back", 142 | "secondTabOffset": 80, 143 | "secondTabWidth": 65, 144 | "widget_ue_connectable": {} 145 | }, 146 | "widgets_values": [ 147 | "models/quantized/diffusion_model_fp8_direct.safetensors" 148 | ] 149 | }, 150 | { 151 | "id": 3, 152 | "type": "QuantizeFP8Format", 153 | "pos": [ 154 | -176.1340789794922, 155 | 237.16087341308594 156 | ], 157 | "size": [ 158 | 350, 159 | 120 160 | ], 161 | "flags": {}, 162 | "order": 5, 163 | "mode": 0, 164 | "inputs": [ 165 | { 166 | "name": "model_state_dict", 167 | "type": "MODEL_STATE_DICT", 168 | "link": 2 169 | } 170 | ], 171 | "outputs": [ 172 | { 173 | "name": "quantized_model_state_dict", 174 | "type": "MODEL_STATE_DICT", 175 | "links": [ 176 | 4 177 | ] 178 | } 179 | ], 180 | "properties": { 181 | "aux_id": "lum3on/ComfyUI-ModelQuantizer", 182 | "ver": "07592b8f05b40a2a185be9c7c7539a427b466c5d", 183 | "Node name for S&R": "QuantizeFP8Format", 184 | "enableTabs": false, 185 | "tabWidth": 65, 186 | "tabXOffset": 10, 187 | "hasSecondTab": false, 188 | "secondTabText": "Send Back", 189 | "secondTabOffset": 80, 190 | 
"secondTabWidth": 65, 191 | "widget_ue_connectable": {} 192 | }, 193 | "widgets_values": [ 194 | "float8_e4m3fn" 195 | ] 196 | }, 197 | { 198 | "id": 2, 199 | "type": "ModelToStateDict", 200 | "pos": [ 201 | 226.8387908935547, 202 | 108.50463104248047 203 | ], 204 | "size": [ 205 | 300, 206 | 100 207 | ], 208 | "flags": { 209 | "collapsed": true 210 | }, 211 | "order": 4, 212 | "mode": 0, 213 | "inputs": [ 214 | { 215 | "name": "model", 216 | "type": "MODEL", 217 | "link": 21 218 | } 219 | ], 220 | "outputs": [ 221 | { 222 | "name": "model_state_dict", 223 | "type": "MODEL_STATE_DICT", 224 | "links": [ 225 | 2, 226 | 3 227 | ] 228 | } 229 | ], 230 | "properties": { 231 | "aux_id": "lum3on/ComfyUI-ModelQuantizer", 232 | "ver": "07592b8f05b40a2a185be9c7c7539a427b466c5d", 233 | "Node name for S&R": "ModelToStateDict", 234 | "enableTabs": false, 235 | "tabWidth": 65, 236 | "tabXOffset": 10, 237 | "hasSecondTab": false, 238 | "secondTabText": "Send Back", 239 | "secondTabOffset": 80, 240 | "secondTabWidth": 65, 241 | "widget_ue_connectable": {} 242 | }, 243 | "widgets_values": [] 244 | }, 245 | { 246 | "id": 4, 247 | "type": "QuantizeModel", 248 | "pos": [ 249 | 220.89305114746094, 250 | 178.23707580566406 251 | ], 252 | "size": [ 253 | 350, 254 | 160 255 | ], 256 | "flags": {}, 257 | "order": 6, 258 | "mode": 0, 259 | "inputs": [ 260 | { 261 | "name": "model_state_dict", 262 | "type": "MODEL_STATE_DICT", 263 | "link": 3 264 | } 265 | ], 266 | "outputs": [ 267 | { 268 | "name": "quantized_model_state_dict", 269 | "type": "MODEL_STATE_DICT", 270 | "links": [ 271 | 5 272 | ] 273 | } 274 | ], 275 | "properties": { 276 | "aux_id": "lum3on/ComfyUI-ModelQuantizer", 277 | "ver": "07592b8f05b40a2a185be9c7c7539a427b466c5d", 278 | "Node name for S&R": "QuantizeModel", 279 | "enableTabs": false, 280 | "tabWidth": 65, 281 | "tabXOffset": 10, 282 | "hasSecondTab": false, 283 | "secondTabText": "Send Back", 284 | "secondTabOffset": 80, 285 | "secondTabWidth": 65, 286 | "widget_ue_connectable": {} 287 | }, 288 | "widgets_values": [ 289 | "per_channel", 290 | "Auto", 291 | "float16" 292 | ] 293 | }, 294 | { 295 | "id": 6, 296 | "type": "SaveAsSafeTensor", 297 | "pos": [ 298 | 220.89305114746094, 299 | 390.1286926269531 300 | ], 301 | "size": [ 302 | 350, 303 | 100 304 | ], 305 | "flags": {}, 306 | "order": 8, 307 | "mode": 0, 308 | "inputs": [ 309 | { 310 | "name": "quantized_model_state_dict", 311 | "type": "MODEL_STATE_DICT", 312 | "link": 5 313 | } 314 | ], 315 | "outputs": [], 316 | "properties": { 317 | "aux_id": "lum3on/ComfyUI-ModelQuantizer", 318 | "ver": "07592b8f05b40a2a185be9c7c7539a427b466c5d", 319 | "Node name for S&R": "SaveAsSafeTensor", 320 | "enableTabs": false, 321 | "tabWidth": 65, 322 | "tabXOffset": 10, 323 | "hasSecondTab": false, 324 | "secondTabText": "Send Back", 325 | "secondTabOffset": 80, 326 | "secondTabWidth": 65, 327 | "widget_ue_connectable": {} 328 | }, 329 | "widgets_values": [ 330 | "models/quantized/diffusion_model_scaled_fp16.safetensors" 331 | ] 332 | }, 333 | { 334 | "id": 16, 335 | "type": "UNETLoader", 336 | "pos": [ 337 | -188.74542236328125, 338 | 97.2307357788086 339 | ], 340 | "size": [ 341 | 270, 342 | 82 343 | ], 344 | "flags": {}, 345 | "order": 1, 346 | "mode": 0, 347 | "inputs": [], 348 | "outputs": [ 349 | { 350 | "name": "MODEL", 351 | "type": "MODEL", 352 | "links": [ 353 | 21 354 | ] 355 | } 356 | ], 357 | "properties": { 358 | "cnr_id": "comfy-core", 359 | "ver": "0.3.37", 360 | "widget_ue_connectable": {}, 361 | "Node name for S&R": "UNETLoader", 362 | 
"enableTabs": false, 363 | "tabWidth": 65, 364 | "tabXOffset": 10, 365 | "hasSecondTab": false, 366 | "secondTabText": "Send Back", 367 | "secondTabOffset": 80, 368 | "secondTabWidth": 65 369 | }, 370 | "widgets_values": [ 371 | "DG_Wan_1_3b_t2v_boost_stock_V1_new.safetensors", 372 | "default" 373 | ] 374 | }, 375 | { 376 | "id": 12, 377 | "type": "Note", 378 | "pos": [ 379 | 134.60061645507812, 380 | 563.052978515625 381 | ], 382 | "size": [ 383 | 750, 384 | 150 385 | ], 386 | "flags": {}, 387 | "order": 2, 388 | "mode": 0, 389 | "inputs": [], 390 | "outputs": [], 391 | "properties": { 392 | "text": "Integrated Quantization Workflow\n\nLeft side: Standard diffusion model quantization using existing nodes\n- LoadCheckpoint → ModelToStateDict → QuantizeFP8Format/QuantizeModel → SaveAsSafeTensor\n\nRight side: ControlNet-specific FP8 quantization using new nodes\n- ControlNetMetadataViewerNode → ControlNetFP8QuantizeNode\n\nThis demonstrates how the new ControlNet nodes complement the existing quantization workflow.", 393 | "widget_ue_connectable": {} 394 | }, 395 | "widgets_values": [ 396 | "Integrated Quantization Workflow\n\nLeft side: Standard diffusion model quantization using existing nodes\n- LoadCheckpoint → ModelToStateDict → QuantizeFP8Format/QuantizeModel → SaveAsSafeTensor\n\nRight side: ControlNet-specific FP8 quantization using new nodes\n- ControlNetMetadataViewerNode → ControlNetFP8QuantizeNode\n\nThis demonstrates how the new ControlNet nodes complement the existing quantization workflow." 397 | ], 398 | "color": "#432", 399 | "bgcolor": "#653" 400 | } 401 | ], 402 | "links": [ 403 | [ 404 | 2, 405 | 2, 406 | 0, 407 | 3, 408 | 0, 409 | "MODEL_STATE_DICT" 410 | ], 411 | [ 412 | 3, 413 | 2, 414 | 0, 415 | 4, 416 | 0, 417 | "MODEL_STATE_DICT" 418 | ], 419 | [ 420 | 4, 421 | 3, 422 | 0, 423 | 5, 424 | 0, 425 | "MODEL_STATE_DICT" 426 | ], 427 | [ 428 | 5, 429 | 4, 430 | 0, 431 | 6, 432 | 0, 433 | "MODEL_STATE_DICT" 434 | ], 435 | [ 436 | 10, 437 | 9, 438 | 1, 439 | 10, 440 | 0, 441 | "STRING" 442 | ], 443 | [ 444 | 21, 445 | 16, 446 | 0, 447 | 2, 448 | 0, 449 | "MODEL" 450 | ], 451 | [ 452 | 22, 453 | 18, 454 | 0, 455 | 9, 456 | 0, 457 | "STRING" 458 | ] 459 | ], 460 | "groups": [ 461 | { 462 | "id": 1, 463 | "title": "Standard Model Quantization", 464 | "bounding": [ 465 | -199.10694885253906, 466 | 30, 467 | 391.63507080078125, 468 | 495.94573974609375 469 | ], 470 | "color": "#A88", 471 | "font_size": 24, 472 | "flags": {} 473 | }, 474 | { 475 | "id": 2, 476 | "title": "ControlNet FP8 Quantization", 477 | "bounding": [ 478 | 598.7603759765625, 479 | 36.06060791015625, 480 | 770, 481 | 490 482 | ], 483 | "color": "#8A8", 484 | "font_size": 24, 485 | "flags": {} 486 | }, 487 | { 488 | "id": 3, 489 | "title": "Scaled Model Quantization", 490 | "bounding": [ 491 | 210.89305114746094, 492 | 34.904640197753906, 493 | 370, 494 | 465.2240295410156 495 | ], 496 | "color": "#A88", 497 | "font_size": 24, 498 | "flags": {} 499 | } 500 | ], 501 | "config": {}, 502 | "extra": { 503 | "ds": { 504 | "scale": 0.5559917313492252, 505 | "offset": [ 506 | 924.9485001083959, 507 | 104.78444059193227 508 | ] 509 | }, 510 | "ue_links": [], 511 | "frontendVersion": "1.20.6", 512 | "VHS_latentpreview": false, 513 | "VHS_latentpreviewrate": 0, 514 | "VHS_MetadataImage": true, 515 | "VHS_KeepIntermediate": true 516 | }, 517 | "version": 0.4 518 | } -------------------------------------------------------------------------------- /examples/workflow_quantize.json: 
-------------------------------------------------------------------------------- 1 | { 2 | "id": "0c605522-962a-4c82-9089-e97d4235261f", 3 | "revision": 0, 4 | "last_node_id": 18, 5 | "last_link_id": 18, 6 | "nodes": [ 7 | { 8 | "id": 14, 9 | "type": "SaveAsSafeTensor", 10 | "pos": [ 11 | 510.0681457519531, 12 | 550.5472412109375 13 | ], 14 | "size": [ 15 | 363.431396484375, 16 | 83.39691162109375 17 | ], 18 | "flags": {}, 19 | "order": 10, 20 | "mode": 4, 21 | "inputs": [ 22 | { 23 | "name": "quantized_model_state_dict", 24 | "type": "MODEL_STATE_DICT", 25 | "link": 14 26 | } 27 | ], 28 | "outputs": [], 29 | "properties": { 30 | "Node name for S&R": "SaveAsSafeTensor", 31 | "enableTabs": false, 32 | "tabWidth": 65, 33 | "tabXOffset": 10, 34 | "hasSecondTab": false, 35 | "secondTabText": "Send Back", 36 | "secondTabOffset": 80, 37 | "secondTabWidth": 65 38 | }, 39 | "widgets_values": [ 40 | "C:\\Users\\RAIIN Studios\\Documents\\protable\\ComfyUI\\models\\fluxfill-fp8e4m3fn.safetensors" 41 | ] 42 | }, 43 | { 44 | "id": 16, 45 | "type": "CheckpointLoaderSimple", 46 | "pos": [ 47 | -421.72259521484375, 48 | 332.6713562011719 49 | ], 50 | "size": [ 51 | 290.4454650878906, 52 | 98 53 | ], 54 | "flags": {}, 55 | "order": 0, 56 | "mode": 4, 57 | "inputs": [], 58 | "outputs": [ 59 | { 60 | "name": "MODEL", 61 | "type": "MODEL", 62 | "links": [ 63 | 17 64 | ] 65 | }, 66 | { 67 | "name": "CLIP", 68 | "type": "CLIP", 69 | "links": null 70 | }, 71 | { 72 | "name": "VAE", 73 | "type": "VAE", 74 | "links": null 75 | } 76 | ], 77 | "properties": { 78 | "cnr_id": "comfy-core", 79 | "ver": "0.3.32", 80 | "Node name for S&R": "CheckpointLoaderSimple", 81 | "enableTabs": false, 82 | "tabWidth": 65, 83 | "tabXOffset": 10, 84 | "hasSecondTab": false, 85 | "secondTabText": "Send Back", 86 | "secondTabOffset": 80, 87 | "secondTabWidth": 65 88 | }, 89 | "widgets_values": [ 90 | "SD1.5\\epicphotogasm_ultimateFidelity.safetensors" 91 | ] 92 | }, 93 | { 94 | "id": 17, 95 | "type": "UnetLoaderGGUF", 96 | "pos": [ 97 | -415.4770202636719, 98 | 572.0929565429688 99 | ], 100 | "size": [ 101 | 270, 102 | 58 103 | ], 104 | "flags": {}, 105 | "order": 1, 106 | "mode": 4, 107 | "inputs": [], 108 | "outputs": [ 109 | { 110 | "name": "MODEL", 111 | "type": "MODEL", 112 | "links": [ 113 | 18 114 | ] 115 | } 116 | ], 117 | "properties": { 118 | "cnr_id": "comfyui-gguf", 119 | "ver": "47bec6147569a138dd30ad3e14f190a36a3be456", 120 | "Node name for S&R": "UnetLoaderGGUF", 121 | "enableTabs": false, 122 | "tabWidth": 65, 123 | "tabXOffset": 10, 124 | "hasSecondTab": false, 125 | "secondTabText": "Send Back", 126 | "secondTabOffset": 80, 127 | "secondTabWidth": 65 128 | }, 129 | "widgets_values": [ 130 | "flux1-dev-Q8_0.gguf" 131 | ] 132 | }, 133 | { 134 | "id": 5, 135 | "type": "UNETLoader", 136 | "pos": [ 137 | -422.3947448730469, 138 | 134.14698791503906 139 | ], 140 | "size": [ 141 | 287.13525390625, 142 | 83.07095336914062 143 | ], 144 | "flags": {}, 145 | "order": 2, 146 | "mode": 0, 147 | "inputs": [], 148 | "outputs": [ 149 | { 150 | "name": "MODEL", 151 | "type": "MODEL", 152 | "links": [ 153 | 16 154 | ] 155 | } 156 | ], 157 | "properties": { 158 | "cnr_id": "comfy-core", 159 | "ver": "0.3.32", 160 | "Node name for S&R": "UNETLoader", 161 | "enableTabs": false, 162 | "tabWidth": 65, 163 | "tabXOffset": 10, 164 | "hasSecondTab": false, 165 | "secondTabText": "Send Back", 166 | "secondTabOffset": 80, 167 | "secondTabWidth": 65 168 | }, 169 | "widgets_values": [ 170 | "flux\\flux1-fill-dev.safetensors", 171 | "default" 172 | 
] 173 | }, 174 | { 175 | "id": 13, 176 | "type": "QuantizeModel", 177 | "pos": [ 178 | 139.14344787597656, 179 | 542.1537475585938 180 | ], 181 | "size": [ 182 | 345.158203125, 183 | 106 184 | ], 185 | "flags": {}, 186 | "order": 8, 187 | "mode": 4, 188 | "inputs": [ 189 | { 190 | "name": "model_state_dict", 191 | "type": "MODEL_STATE_DICT", 192 | "link": 13 193 | } 194 | ], 195 | "outputs": [ 196 | { 197 | "name": "quantized_model_state_dict", 198 | "type": "MODEL_STATE_DICT", 199 | "links": [ 200 | 14 201 | ] 202 | } 203 | ], 204 | "properties": { 205 | "Node name for S&R": "QuantizeModel", 206 | "enableTabs": false, 207 | "tabWidth": 65, 208 | "tabXOffset": 10, 209 | "hasSecondTab": false, 210 | "secondTabText": "Send Back", 211 | "secondTabOffset": 80, 212 | "secondTabWidth": 65 213 | }, 214 | "widgets_values": [ 215 | "per_tensor", 216 | "Auto", 217 | "float16" 218 | ] 219 | }, 220 | { 221 | "id": 9, 222 | "type": "Note", 223 | "pos": [ 224 | 934.3568115234375, 225 | 514.1301879882812 226 | ], 227 | "size": [ 228 | 371.62591552734375, 229 | 149.91725158691406 230 | ], 231 | "flags": {}, 232 | "order": 3, 233 | "mode": 0, 234 | "inputs": [], 235 | "outputs": [], 236 | "properties": {}, 237 | "widgets_values": [ 238 | "Available Scaling Strategies:\n\n- per_tensor : Uses a single scale factor for the entire tensor (fast, less precise).\n\n- per_channel : Computes a separate scale for each output channel (more accurate)." 239 | ], 240 | "color": "#ffbbff", 241 | "bgcolor": "#f1a7fb" 242 | }, 243 | { 244 | "id": 11, 245 | "type": "QuantizeFP8Format", 246 | "pos": [ 247 | 459.6412048339844, 248 | 222.2886199951172 249 | ], 250 | "size": [ 251 | 367.9277648925781, 252 | 59.94718933105469 253 | ], 254 | "flags": {}, 255 | "order": 7, 256 | "mode": 0, 257 | "inputs": [ 258 | { 259 | "name": "model_state_dict", 260 | "type": "MODEL_STATE_DICT", 261 | "link": 11 262 | } 263 | ], 264 | "outputs": [ 265 | { 266 | "name": "quantized_model_state_dict", 267 | "type": "MODEL_STATE_DICT", 268 | "links": [ 269 | 12 270 | ] 271 | } 272 | ], 273 | "properties": { 274 | "Node name for S&R": "QuantizeFP8Format", 275 | "enableTabs": false, 276 | "tabWidth": 65, 277 | "tabXOffset": 10, 278 | "hasSecondTab": false, 279 | "secondTabText": "Send Back", 280 | "secondTabOffset": 80, 281 | "secondTabWidth": 65 282 | }, 283 | "widgets_values": [ 284 | "float8_e4m3fn" 285 | ] 286 | }, 287 | { 288 | "id": 15, 289 | "type": "Any Switch (rgthree)", 290 | "pos": [ 291 | -70.95974731445312, 292 | 314.68389892578125 293 | ], 294 | "size": [ 295 | 166.72030639648438, 296 | 108.85087585449219 297 | ], 298 | "flags": { 299 | "collapsed": false 300 | }, 301 | "order": 5, 302 | "mode": 0, 303 | "inputs": [ 304 | { 305 | "name": "any_01", 306 | "type": "MODEL", 307 | "link": 16 308 | }, 309 | { 310 | "name": "any_02", 311 | "type": "MODEL", 312 | "link": 17 313 | }, 314 | { 315 | "name": "any_03", 316 | "type": "MODEL", 317 | "link": 18 318 | }, 319 | { 320 | "name": "any_04", 321 | "type": "MODEL", 322 | "link": null 323 | }, 324 | { 325 | "name": "any_05", 326 | "type": "MODEL", 327 | "link": null 328 | } 329 | ], 330 | "outputs": [ 331 | { 332 | "dir": 4, 333 | "label": "MODEL", 334 | "name": "*", 335 | "shape": 3, 336 | "type": "MODEL", 337 | "links": [ 338 | 15 339 | ] 340 | } 341 | ], 342 | "properties": { 343 | "cnr_id": "rgthree-comfy", 344 | "ver": "32142fe476878a354dda6e2d4b5ea98960de3ced" 345 | }, 346 | "widgets_values": [] 347 | }, 348 | { 349 | "id": 2, 350 | "type": "ModelToStateDict", 351 | "pos": [ 352 | 
145.28746032714844, 353 | 277.02728271484375 354 | ], 355 | "size": [ 356 | 291.43902587890625, 357 | 79.07095336914062 358 | ], 359 | "flags": {}, 360 | "order": 6, 361 | "mode": 0, 362 | "inputs": [ 363 | { 364 | "name": "model", 365 | "type": "MODEL", 366 | "link": 15 367 | } 368 | ], 369 | "outputs": [ 370 | { 371 | "name": "model_state_dict", 372 | "type": "MODEL_STATE_DICT", 373 | "slot_index": 0, 374 | "links": [ 375 | 11, 376 | 13 377 | ] 378 | } 379 | ], 380 | "properties": { 381 | "Node name for S&R": "ModelToStateDict", 382 | "enableTabs": false, 383 | "tabWidth": 65, 384 | "tabXOffset": 10, 385 | "hasSecondTab": false, 386 | "secondTabText": "Send Back", 387 | "secondTabOffset": 80, 388 | "secondTabWidth": 65 389 | }, 390 | "widgets_values": [] 391 | }, 392 | { 393 | "id": 4, 394 | "type": "SaveAsSafeTensor", 395 | "pos": [ 396 | 456.5610656738281, 397 | 336.84515380859375 398 | ], 399 | "size": [ 400 | 379.96441650390625, 401 | 69.17916870117188 402 | ], 403 | "flags": {}, 404 | "order": 9, 405 | "mode": 0, 406 | "inputs": [ 407 | { 408 | "name": "quantized_model_state_dict", 409 | "type": "MODEL_STATE_DICT", 410 | "link": 12 411 | } 412 | ], 413 | "outputs": [], 414 | "properties": { 415 | "Node name for S&R": "SaveAsSafeTensor", 416 | "enableTabs": false, 417 | "tabWidth": 65, 418 | "tabXOffset": 10, 419 | "hasSecondTab": false, 420 | "secondTabText": "Send Back", 421 | "secondTabOffset": 80, 422 | "secondTabWidth": 65 423 | }, 424 | "widgets_values": [ 425 | "C:\\Users\\RAIIN Studios\\Documents\\protable\\ComfyUI\\models\\fluxfill-fp8e4m3fn.safetensors" 426 | ] 427 | }, 428 | { 429 | "id": 18, 430 | "type": "Note", 431 | "pos": [ 432 | 883.9888916015625, 433 | 249.35699462890625 434 | ], 435 | "size": [ 436 | 415.2137145996094, 437 | 122.82212829589844 438 | ], 439 | "flags": {}, 440 | "order": 4, 441 | "mode": 0, 442 | "inputs": [], 443 | "outputs": [], 444 | "properties": {}, 445 | "widgets_values": [ 446 | "If you want to save the tensor file you will always have to name the model correctly in your path, like in this example:\nC:\\Users\\RAIIN Studios\\Documents\\protable\\ComfyUI\\models\\fluxfill-fp8e4m3fn.safetensors" 447 | ], 448 | "color": "#ffbbff", 449 | "bgcolor": "#f1a7fb" 450 | } 451 | ], 452 | "links": [ 453 | [ 454 | 11, 455 | 2, 456 | 0, 457 | 11, 458 | 0, 459 | "MODEL_STATE_DICT" 460 | ], 461 | [ 462 | 12, 463 | 11, 464 | 0, 465 | 4, 466 | 0, 467 | "MODEL_STATE_DICT" 468 | ], 469 | [ 470 | 13, 471 | 2, 472 | 0, 473 | 13, 474 | 0, 475 | "MODEL_STATE_DICT" 476 | ], 477 | [ 478 | 14, 479 | 13, 480 | 0, 481 | 14, 482 | 0, 483 | "MODEL_STATE_DICT" 484 | ], 485 | [ 486 | 15, 487 | 15, 488 | 0, 489 | 2, 490 | 0, 491 | "MODEL" 492 | ], 493 | [ 494 | 16, 495 | 5, 496 | 0, 497 | 15, 498 | 0, 499 | "MODEL" 500 | ], 501 | [ 502 | 17, 503 | 16, 504 | 0, 505 | 15, 506 | 1, 507 | "MODEL" 508 | ], 509 | [ 510 | 18, 511 | 17, 512 | 0, 513 | 15, 514 | 2, 515 | "MODEL" 516 | ] 517 | ], 518 | "groups": [ 519 | { 520 | "id": 1, 521 | "title": "fp8", 522 | "bounding": [ 523 | 126.80980682373047, 524 | 139.11367797851562, 525 | 728.5076293945312, 526 | 311.2386169433594 527 | ], 528 | "color": "#3f789e", 529 | "font_size": 24, 530 | "flags": {} 531 | }, 532 | { 533 | "id": 2, 534 | "title": "fp/bf16", 535 | "bounding": [ 536 | 129.14346313476562, 537 | 468.5540771484375, 538 | 775.532958984375, 539 | 198.77328491210938 540 | ], 541 | "color": "#3f789e", 542 | "font_size": 24, 543 | "flags": {} 544 | }, 545 | { 546 | "id": 3, 547 | "title": "Diffmodel", 548 | "bounding": [ 549 | 
-431.82818603515625, 550 | 61.81802749633789, 551 | 304.28436279296875, 552 | 173.7981414794922 553 | ], 554 | "color": "#3f789e", 555 | "font_size": 24, 556 | "flags": {} 557 | }, 558 | { 559 | "id": 4, 560 | "title": "ckpt", 561 | "bounding": [ 562 | -430.8782958984375, 563 | 260.8616638183594, 564 | 310.4454650878906, 565 | 181.60000610351562 566 | ], 567 | "color": "#3f789e", 568 | "font_size": 24, 569 | "flags": {} 570 | }, 571 | { 572 | "id": 5, 573 | "title": "Unet", 574 | "bounding": [ 575 | -429.7647399902344, 576 | 502.79107666015625, 577 | 310.23577880859375, 578 | 143.02545166015625 579 | ], 580 | "color": "#3f789e", 581 | "font_size": 24, 582 | "flags": {} 583 | } 584 | ], 585 | "config": {}, 586 | "extra": { 587 | "frontendVersion": "1.18.9", 588 | "ue_links": [], 589 | "VHS_latentpreview": false, 590 | "VHS_latentpreviewrate": 0, 591 | "VHS_MetadataImage": true, 592 | "VHS_KeepIntermediate": true 593 | }, 594 | "version": 0.4 595 | } 596 | -------------------------------------------------------------------------------- /gguf_quantizer_node.py: -------------------------------------------------------------------------------- 1 | # gguf_quantizer_node.py 2 | import os 3 | import subprocess 4 | import shutil 5 | import sys 6 | import platform 7 | import tempfile 8 | import uuid # For unique temporary file names 9 | from safetensors.torch import save_file # For saving the model state_dict 10 | 11 | # ComfyUI imports 12 | try: 13 | import folder_paths 14 | # import comfy.model_management # For type checking or detailed inspection if needed 15 | COMFYUI_AVAILABLE = True 16 | except ImportError: 17 | COMFYUI_AVAILABLE = False 18 | # Fallback for paths if ComfyUI is not fully available 19 | class folder_paths: 20 | @staticmethod 21 | def get_input_directory(): return os.path.join(os.path.dirname(os.path.abspath(__file__)), "inputs") 22 | @staticmethod 23 | def get_output_directory(): return os.path.join(os.path.dirname(os.path.abspath(__file__)), "outputs") 24 | @staticmethod 25 | def get_temp_directory(): return os.path.join(os.path.dirname(os.path.abspath(__file__)), "temp") 26 | @staticmethod 27 | def get_folder_paths(folder_name): return [os.path.join(folder_paths.get_input_directory(), folder_name)] 28 | @staticmethod 29 | def get_filename_list(folder_name): # Not directly used by this node version 30 | pass 31 | 32 | 33 | # --- GGUFImageQuantizer Core Logic --- 34 | class GGUFImageQuantizer: 35 | def __init__(self, base_node_dir: str, verbose: bool = True): 36 | self.base_node_dir = base_node_dir 37 | self.verbose = verbose 38 | self.llama_cpp_src_dir = os.path.join(self.base_node_dir, "llama_cpp_src") 39 | 40 | self.quantize_exe_name = "llama-quantize.exe" if platform.system() == "Windows" else "llama-quantize" 41 | 42 | # Initial path guess, might be refined after build 43 | self.compiled_quantize_exe_path = os.path.join( 44 | self.llama_cpp_src_dir, "build", "bin", self.quantize_exe_name 45 | ) 46 | if platform.system() == "Windows" and not os.path.exists(self.compiled_quantize_exe_path): 47 | self.compiled_quantize_exe_path = os.path.join( 48 | self.llama_cpp_src_dir, "build", "bin", "Release", self.quantize_exe_name 49 | ) 50 | 51 | gguf_scripts_subdir = "gguf" # Assumes a 'gguf' subdir in the node's directory for these scripts 52 | self.convert_script = os.path.join(self.base_node_dir, gguf_scripts_subdir, "convert.py") 53 | self.fix_5d_script = os.path.join(self.base_node_dir, gguf_scripts_subdir, "fix_5d_tensors.py") 54 | self.patch_file = 
os.path.join(self.base_node_dir, gguf_scripts_subdir, "lcpp.patch") 55 | self.fix_lines_script = os.path.join(self.base_node_dir, gguf_scripts_subdir, "fix_lines_ending.py") 56 | 57 | self.current_model_arch = None 58 | if self.verbose: 59 | print("DEBUG: GGUFImageQuantizer initialized.") 60 | 61 | def _get_python_executable(self): 62 | if self.verbose: 63 | print("DEBUG: _get_python_executable called.") 64 | return sys.executable if sys.executable else "python" 65 | 66 | 67 | def _run_subprocess(self, command: list, cwd: str = None, desc: str = ""): 68 | if desc and self.verbose: 69 | print(f"[GGUF Image Quantizer] DEBUG: Running: {desc} (Command: {' '.join(command)}) (CWD: {cwd if cwd else 'None'})") 70 | try: 71 | process = subprocess.Popen( 72 | command, 73 | stdout=subprocess.PIPE, 74 | stderr=subprocess.PIPE, 75 | cwd=cwd, 76 | text=True, 77 | encoding='utf-8', 78 | errors='ignore' 79 | ) 80 | stdout, stderr = process.communicate(timeout=600) # 10 minute timeout 81 | 82 | # --- START VERBOSE MODIFICATION --- 83 | if self.verbose and stdout and stdout.strip(): # Only print if there's actual output 84 | print(f"[GGUF Image Quantizer] STDOUT from '{desc}':\n{stdout.strip()}") 85 | if stderr and stderr.strip(): # Always print stderr, even if returncode is 0, as it might contain warnings 86 | print(f"[GGUF Image Quantizer] STDERR from '{desc}':\n{stderr.strip()}") 87 | # --- END VERBOSE MODIFICATION --- 88 | 89 | if process.returncode != 0: 90 | print(f"[GGUF Image Quantizer] Error during '{desc}' (Return Code: {process.returncode}). See STDERR above if any.") 91 | return False, stdout, stderr 92 | 93 | if self.verbose: 94 | print(f"[GGUF Image Quantizer] Success: {desc}") 95 | return True, stdout, stderr 96 | except subprocess.TimeoutExpired: 97 | print(f"[GGUF Image Quantizer] Timeout during '{desc}' after 10 minutes.") 98 | return False, "", "TimeoutExpired" 99 | except Exception as e: 100 | print(f"[GGUF Image Quantizer] Exception during '{desc}': {e}") 101 | if self.verbose: 102 | import traceback 103 | print(f"DEBUG: Traceback for _run_subprocess exception: {traceback.format_exc()}") 104 | return False, "", str(e) 105 | 106 | def setup_llama_cpp(self): 107 | if self.verbose: 108 | print("[GGUF Image Quantizer] DEBUG: Starting setup_llama_cpp...") 109 | os.makedirs(self.llama_cpp_src_dir, exist_ok=True) 110 | if self.verbose: 111 | print(f"[GGUF Image Quantizer] DEBUG: Ensured llama_cpp_src_dir exists: {self.llama_cpp_src_dir}") 112 | 113 | gguf_scripts_dir = os.path.join(self.base_node_dir, "gguf") 114 | if os.path.exists(self.fix_lines_script) and os.path.exists(self.patch_file): 115 | if self.verbose: 116 | print("[GGUF Image Quantizer] DEBUG: Found fix_lines_script and patch_file. Attempting to fix line endings for patch file.") 117 | self._run_subprocess( 118 | [self._get_python_executable(), self.fix_lines_script, self.patch_file], 119 | cwd=gguf_scripts_dir, 120 | desc="Fix patch file line endings" 121 | ) 122 | else: 123 | if self.verbose: 124 | print(f"[GGUF Image Quantizer] DEBUG: Skipping fix line endings. fix_lines_script exists: {os.path.exists(self.fix_lines_script)}, patch_file exists: {os.path.exists(self.patch_file)}") 125 | 126 | 127 | git_repo_path = os.path.join(self.llama_cpp_src_dir, ".git") 128 | if not os.path.exists(git_repo_path): 129 | if self.verbose: 130 | print(f"[GGUF Image Quantizer] DEBUG: .git directory not found at {git_repo_path}. 
Cloning llama.cpp.") 131 | success, _, _ = self._run_subprocess( 132 | ["git", "clone", "https://github.com/ggerganov/llama.cpp.git", self.llama_cpp_src_dir], 133 | desc="Clone llama.cpp" 134 | ) 135 | if not success: 136 | if self.verbose: 137 | print("[GGUF Image Quantizer] DEBUG: Cloning llama.cpp failed.") 138 | return False 139 | else: 140 | if self.verbose: 141 | print("[GGUF Image Quantizer] DEBUG: llama.cpp repository already cloned. Fetching updates...") 142 | self._run_subprocess(["git", "fetch", "--tags"], cwd=self.llama_cpp_src_dir, desc="Git fetch llama.cpp tags") 143 | 144 | readme_checkout_tag = "b3962" 145 | if self.verbose: 146 | print(f"[GGUF Image Quantizer] DEBUG: Checking out llama.cpp tag: {readme_checkout_tag}...") 147 | success, _, _ = self._run_subprocess( 148 | ["git", "checkout", f"tags/{readme_checkout_tag}"], cwd=self.llama_cpp_src_dir, desc=f"Checkout tag {readme_checkout_tag}" 149 | ) 150 | if not success: 151 | if self.verbose: 152 | print(f"[GGUF Image Quantizer] DEBUG: Failed to checkout tag {readme_checkout_tag}. Trying git pull and re-checkout.") 153 | self._run_subprocess(["git", "pull"], cwd=self.llama_cpp_src_dir, desc="Git pull after failed checkout") 154 | success, _, _ = self._run_subprocess( 155 | ["git", "checkout", f"tags/{readme_checkout_tag}"], cwd=self.llama_cpp_src_dir, desc=f"Retry checkout tag {readme_checkout_tag}" 156 | ) 157 | if not success: 158 | if self.verbose: 159 | print(f"[GGUF Image Quantizer] DEBUG: Critical: Failed to checkout required llama.cpp tag {readme_checkout_tag}. Patching and compilation may fail.") 160 | # return False # Or allow to proceed with caution 161 | 162 | patch_check_file = os.path.join(self.llama_cpp_src_dir, "gguf-py", "gguf", "constants.py") 163 | patch_applied_sentinel = "LLM_ARCH_FLUX" 164 | 165 | already_applied = False 166 | if os.path.exists(patch_check_file): 167 | try: 168 | print(f"DEBUG: Checking for patch sentinel '{patch_applied_sentinel}' in {patch_check_file}") 169 | with open(patch_check_file, 'r', encoding='utf-8', errors='ignore') as f_check: 170 | if patch_applied_sentinel in f_check.read(): 171 | already_applied = True 172 | print("[GGUF Image Quantizer] DEBUG: Patch sentinel found. Assuming patch is applied.") 173 | except Exception as e: 174 | print(f"[GGUF Image Quantizer] DEBUG: Warning: Could not check if patch was applied due to: {e}") 175 | else: 176 | print(f"DEBUG: Patch check file {patch_check_file} does not exist.") 177 | 178 | 179 | if not already_applied: 180 | if self.verbose: 181 | print("[GGUF Image Quantizer] DEBUG: Patch not detected as applied.") 182 | if not os.path.exists(self.patch_file): 183 | print(f"[GGUF Image Quantizer] DEBUG: Error: Patch file not found at {self.patch_file}. 
Cannot apply patch.") 184 | return False 185 | 186 | if self.verbose: 187 | print("[GGUF Image Quantizer] DEBUG: Attempting to reverse any existing patches (best effort)...") 188 | self._run_subprocess( 189 | ["git", "apply", "--reverse", "--reject", self.patch_file], 190 | cwd=self.llama_cpp_src_dir, 191 | desc="Reverse existing patches" 192 | ) 193 | if self.verbose: 194 | print("[GGUF Image Quantizer] DEBUG: Applying lcpp.patch...") 195 | success, stdout_patch, stderr_patch = self._run_subprocess( 196 | ["git", "apply", "--ignore-whitespace", self.patch_file], 197 | cwd=self.llama_cpp_src_dir, 198 | desc="Apply lcpp.patch" 199 | ) 200 | if not success: 201 | if self.verbose: 202 | print(f"[GGUF Image Quantizer] DEBUG: Failed to apply patch.") 203 | if os.path.exists(patch_check_file): 204 | if self.verbose: 205 | print(f"DEBUG: Re-checking for patch sentinel in {patch_check_file} after failed apply.") 206 | with open(patch_check_file, 'r', encoding='utf-8', errors='ignore') as f_check_after_fail: 207 | if patch_applied_sentinel in f_check_after_fail.read(): 208 | if self.verbose: 209 | print("[GGUF Image Quantizer] DEBUG: Patch sentinel FOUND despite 'git apply' error. Proceeding cautiously.") 210 | else: 211 | if self.verbose: 212 | print("[GGUF Image Quantizer] DEBUG: Patch sentinel NOT FOUND after 'git apply' error. Setup failed.") 213 | return False 214 | else: 215 | if self.verbose: 216 | print(f"[GGUF Image Quantizer] DEBUG: Patch check file {patch_check_file} not found after 'git apply' error. Setup failed.") 217 | return False 218 | else: 219 | if self.verbose: 220 | print("[GGUF Image Quantizer] DEBUG: Patch already applied or sentinel found. Skipping patch application.") 221 | 222 | 223 | build_dir = os.path.join(self.llama_cpp_src_dir, "build") 224 | os.makedirs(build_dir, exist_ok=True) 225 | if self.verbose: 226 | print(f"[GGUF Image Quantizer] DEBUG: Ensured build directory exists: {build_dir}") 227 | 228 | cmake_cache_file = os.path.join(build_dir, "CMakeCache.txt") 229 | if not os.path.exists(cmake_cache_file): 230 | if self.verbose: 231 | print("[GGUF Image Quantizer] DEBUG: CMakeCache.txt not found. Configuring CMake for llama-quantize (CPU build)...") 232 | cmake_cmd = ["cmake", "..", "-DLLAMA_ACCELERATE=OFF", "-DLLAMA_METAL=OFF", "-DLLAMA_CUDA=OFF", "-DLLAMA_VULKAN=OFF", "-DLLAMA_SYCL=OFF", "-DLLAMA_OPENCL=OFF", "-DLLAMA_BLAS=OFF", "-DLLAMA_LAPACK=OFF"] 233 | success, _, _ = self._run_subprocess(cmake_cmd, cwd=build_dir, desc="CMake configuration") 234 | if not success: 235 | if self.verbose: 236 | print("[GGUF Image Quantizer] DEBUG: CMake configuration failed.") 237 | return False 238 | else: 239 | if self.verbose: 240 | print("[GGUF Image Quantizer] DEBUG: CMake cache found. Assuming already configured. 
Skipping CMake configuration.") 241 | 242 | if self.verbose: 243 | print("[GGUF Image Quantizer] DEBUG: Building llama-quantize target...") 244 | cmake_build_cmd = ["cmake", "--build", ".", "--target", "llama-quantize"] 245 | if platform.system() == "Windows": 246 | cmake_build_cmd.extend(["--config", "Release"]) 247 | 248 | success, _, _ = self._run_subprocess(cmake_build_cmd, cwd=build_dir, desc="CMake build llama-quantize") 249 | if not success: 250 | if self.verbose: 251 | print("[GGUF Image Quantizer] DEBUG: CMake build llama-quantize failed.") 252 | return False 253 | 254 | self.compiled_quantize_exe_path = os.path.join(self.llama_cpp_src_dir, "build", "bin", self.quantize_exe_name) 255 | if platform.system() == "Windows" and not os.path.exists(self.compiled_quantize_exe_path): 256 | self.compiled_quantize_exe_path = os.path.join(self.llama_cpp_src_dir, "build", "bin", "Release", self.quantize_exe_name) 257 | 258 | if not os.path.exists(self.compiled_quantize_exe_path): 259 | alt_path = os.path.join(build_dir, self.quantize_exe_name) 260 | if os.path.exists(alt_path): 261 | self.compiled_quantize_exe_path = alt_path 262 | if self.verbose: 263 | print(f"[GGUF Image Quantizer] DEBUG: Found llama-quantize at alternate path: {alt_path}") 264 | else: 265 | if self.verbose: 266 | print(f"[GGUF Image Quantizer] DEBUG: Compiled llama-quantize not found at expected paths after build.") 267 | return False 268 | if self.verbose: 269 | print(f"[GGUF Image Quantizer] DEBUG: llama-quantize path set to: {self.compiled_quantize_exe_path}") 270 | 271 | if self.verbose: 272 | print("[GGUF Image Quantizer] DEBUG: llama.cpp environment setup complete.") 273 | return True 274 | 275 | def convert_model_to_initial_gguf(self, model_src_path: str, temp_conversion_dir: str): 276 | if self.verbose: 277 | print(f"[GGUF Image Quantizer] DEBUG: Starting convert_model_to_initial_gguf. Src: {model_src_path}, TempDir: {temp_conversion_dir}") 278 | if not os.path.exists(self.convert_script): 279 | print(f"DEBUG: Error: GGUF convert.py script not found at {self.convert_script}") 280 | return None, None 281 | 282 | base_name = os.path.splitext(os.path.basename(model_src_path))[0] 283 | # Specify the exact output path to avoid filename confusion 284 | expected_gguf_name_f16 = f"{base_name}-F16.gguf" 285 | expected_output_path = os.path.join(temp_conversion_dir, expected_gguf_name_f16) 286 | cmd = [self._get_python_executable(), self.convert_script, "--src", model_src_path, "--dst", expected_output_path] 287 | 288 | if self.verbose: 289 | print(f"[GGUF Image Quantizer] DEBUG: About to run convert.py. 
Command: {' '.join(cmd)}") 290 | success, stdout, stderr = self._run_subprocess( 291 | cmd, 292 | cwd=temp_conversion_dir, 293 | desc="Convert model to initial GGUF (FP16)" 294 | ) 295 | if not success: 296 | if self.verbose: 297 | print(f"[GGUF Image Quantizer] DEBUG: convert.py execution failed.") 298 | return None, None 299 | 300 | initial_gguf_path = None 301 | model_arch = None 302 | 303 | for line in stdout.splitlines(): 304 | line_lower = line.lower() 305 | if "model architecture:" in line_lower: 306 | model_arch = line_lower.split("model architecture:")[-1].strip() 307 | if self.verbose: 308 | print(f"DEBUG: Parsed model_arch (from 'model architecture:'): {model_arch}") 309 | break 310 | elif "llm_arch =" in line: 311 | model_arch = line.split("=")[-1].strip().replace("'", "").replace('"',"") 312 | if self.verbose: 313 | print(f"DEBUG: Parsed model_arch (from 'llm_arch ='): {model_arch}") 314 | break 315 | 316 | # Check if the file was created at the expected path (which we specified with --dst) 317 | if os.path.exists(expected_output_path): 318 | initial_gguf_path = expected_output_path 319 | if self.verbose: 320 | print(f"DEBUG: Found initial GGUF at expected path: {initial_gguf_path}") 321 | else: 322 | if self.verbose: 323 | print(f"DEBUG: Expected GGUF file not found at {expected_output_path}. Scanning directory {temp_conversion_dir}...") 324 | for fname in os.listdir(temp_conversion_dir): 325 | if fname.lower().endswith(".gguf"): 326 | initial_gguf_path = os.path.join(temp_conversion_dir, fname) 327 | if self.verbose: 328 | print(f"[GGUF Image Quantizer] DEBUG: Found GGUF file by scan: {fname}") 329 | break 330 | 331 | if not initial_gguf_path: 332 | if self.verbose: 333 | print(f"[GGUF Image Quantizer] DEBUG: Could not find the output GGUF file in {temp_conversion_dir}.") 334 | return None, None 335 | 336 | if model_arch: 337 | self.current_model_arch = model_arch.lower() 338 | if self.verbose: 339 | print(f"[GGUF Image Quantizer] DEBUG: Detected model architecture from script output: '{self.current_model_arch}'") 340 | else: 341 | if self.verbose: 342 | print("DEBUG: Model architecture not found in script output. Attempting to guess from filename.") 343 | fn_lower = os.path.basename(initial_gguf_path).lower() 344 | if "clip" in fn_lower: self.current_model_arch = "clip" 345 | elif "siglip" in fn_lower: self.current_model_arch = "siglip" 346 | elif "flux" in fn_lower: self.current_model_arch = "flux" 347 | 348 | if self.current_model_arch: 349 | if self.verbose: 350 | print(f"[GGUF Image Quantizer] DEBUG: Guessed model architecture from filename: '{self.current_model_arch}'") 351 | else: 352 | if self.verbose: 353 | print("[GGUF Image Quantizer] DEBUG: Warning: Model architecture could not be determined.") 354 | 355 | if self.verbose: 356 | print(f"[GGUF Image Quantizer] DEBUG: Initial GGUF created at: {initial_gguf_path}. Architecture: {self.current_model_arch}") 357 | return initial_gguf_path, self.current_model_arch 358 | 359 | 360 | def quantize_gguf(self, initial_gguf_path_in_temp: str, quant_type: str, final_output_gguf_path: str): 361 | if self.verbose: 362 | print(f"[GGUF Image Quantizer] DEBUG: Starting quantize_gguf. 
Initial: {initial_gguf_path_in_temp}, Type: {quant_type}, Final: {final_output_gguf_path}") 363 | if not os.path.exists(self.compiled_quantize_exe_path): 364 | print(f"DEBUG: Error: Compiled llama-quantize not found at {self.compiled_quantize_exe_path}") 365 | return None 366 | if not os.path.exists(initial_gguf_path_in_temp): 367 | print(f"DEBUG: Error: Initial GGUF file not found for quantization: {initial_gguf_path_in_temp}") 368 | return None 369 | 370 | os.makedirs(os.path.dirname(final_output_gguf_path), exist_ok=True) 371 | if self.verbose: 372 | print(f"DEBUG: Ensured output directory for quantized file exists: {os.path.dirname(final_output_gguf_path)}") 373 | 374 | cmd = [self.compiled_quantize_exe_path, initial_gguf_path_in_temp, final_output_gguf_path, quant_type.upper()] 375 | if self.verbose: 376 | print(f"[GGUF Image Quantizer] DEBUG: About to run llama-quantize. Command: {' '.join(cmd)}") 377 | success, stdout_quant, stderr_quant = self._run_subprocess(cmd, desc=f"Convert/Quantize GGUF to {quant_type}") 378 | 379 | if not success or not os.path.exists(final_output_gguf_path): 380 | if self.verbose: 381 | print(f"[GGUF Image Quantizer] DEBUG: Failed to process to {quant_type} or output file not found: {final_output_gguf_path}") 382 | return None 383 | 384 | if self.verbose: 385 | print(f"[GGUF Image Quantizer] DEBUG: Successfully processed to {quant_type}: {final_output_gguf_path}") 386 | return final_output_gguf_path 387 | 388 | def apply_5d_fix_if_needed(self, target_final_gguf_path: str, model_arch: str, gguf_scripts_dir: str): 389 | if self.verbose: 390 | print(f"DEBUG: Starting apply_5d_fix_if_needed. Target: {target_final_gguf_path}, Arch: {model_arch}, ScriptsDir: {gguf_scripts_dir}") 391 | if not model_arch: 392 | if self.verbose: 393 | print("[GGUF Image Quantizer] DEBUG: No model architecture provided; skipping 5D tensor fix.") 394 | return target_final_gguf_path 395 | 396 | fix_safetensor_filename = f"fix_5d_tensors_{model_arch.lower()}.safetensors" 397 | fix_safetensor_path = os.path.join(gguf_scripts_dir, fix_safetensor_filename) 398 | if self.verbose: 399 | print(f"DEBUG: Expected 5D fix definition file path: {fix_safetensor_path}") 400 | 401 | if not os.path.exists(fix_safetensor_path): 402 | if self.verbose: 403 | print(f"[GGUF Image Quantizer] DEBUG: No 5D fix definition file found for arch '{model_arch}' at {fix_safetensor_path}. Skipping 5D fix.") 404 | return target_final_gguf_path 405 | 406 | if self.verbose: 407 | print(f"[GGUF Image Quantizer] DEBUG: Applying 5D tensor fix for model arch: {model_arch} using {fix_safetensor_path}") 408 | if not os.path.exists(self.fix_5d_script): 409 | print(f"DEBUG: Error: fix_5d_tensors.py script not found at {self.fix_5d_script}") 410 | return None 411 | if not os.path.exists(target_final_gguf_path): 412 | print(f"DEBUG: Error: Target GGUF for 5D fix not found: {target_final_gguf_path}") 413 | return None 414 | 415 | cmd = [self._get_python_executable(), self.fix_5d_script, 416 | "--src", target_final_gguf_path, 417 | "--dst", target_final_gguf_path, 418 | "--fix", fix_safetensor_path, 419 | "--overwrite"] 420 | 421 | if self.verbose: 422 | print(f"[GGUF Image Quantizer] DEBUG: About to run fix_5d_tensors.py. 
Command: {' '.join(cmd)}") 423 | success, stdout_fix, stderr_fix = self._run_subprocess( 424 | cmd, 425 | cwd=gguf_scripts_dir, 426 | desc="Apply 5D tensor fix" 427 | ) 428 | if not success: 429 | if self.verbose: 430 | print(f"[GGUF Image Quantizer] DEBUG: Failed to apply 5D fix to {target_final_gguf_path}.") 431 | return None 432 | 433 | if self.verbose: 434 | print(f"[GGUF Image Quantizer] DEBUG: 5D tensor fix applied. Final model at: {target_final_gguf_path}") 435 | return target_final_gguf_path 436 | 437 | 438 | # --- ComfyUI Node --- 439 | class GGUFQuantizerNode: 440 | QUANT_TYPES = sorted(["F16", "BF16", "Q4_0", "Q4_K_S", "Q4_K_M", "Q5_0", "Q5_K_S", "Q5_K_M", "Q6_K", "Q8_0", 441 | "IQ2_XS", "IQ2_S", "IQ3_XXS", "IQ3_S", "IQ3_M", "IQ4_NL", "IQ4_XS", 442 | "Q2_K", "Q3_K_S", "Q3_K_M", "Q3_K_L"]) 443 | 444 | @classmethod 445 | def INPUT_TYPES(cls): 446 | extended_quant_types = cls.QUANT_TYPES + ["ALL"] 447 | return { 448 | "required": { 449 | "model": ("MODEL",), 450 | "quantization_type": (extended_quant_types, {"default": "Q4_K_M"}), 451 | "output_path_template": ("STRING", {"default": "gguf_quantized/piped_model", "multiline": False, "placeholder": "folder/name_core OR /abs_path/folder/name_core"}), 452 | "is_absolute_path": ("BOOLEAN", {"default": False, "label_on": "Absolute Path Mode", "label_off": "Relative to ComfyUI Output Dir"}), 453 | "setup_environment": ("BOOLEAN", {"default": False, "label_on": "Run Setup First (llama.cpp)", "label_off": "Skip Setup (if already done)"}), 454 | "verbose_logging": ("BOOLEAN", {"default": True, "label_on": "Verbose Debug Logging", "label_off": "Minimal Logging"}), 455 | }, 456 | } 457 | 458 | RETURN_TYPES = ("STRING", "STRING",) 459 | RETURN_NAMES = ("status_message", "output_gguf_path_or_dir",) 460 | FUNCTION = "quantize_diffusion_model" 461 | CATEGORY = "Model Quantization/GGUF" 462 | OUTPUT_NODE = True 463 | 464 | def quantize_diffusion_model(self, model, quantization_type: str, 465 | output_path_template: str, is_absolute_path: bool, 466 | setup_environment: bool, verbose_logging: bool): 467 | 468 | base_node_dir = os.path.dirname(os.path.abspath(__file__)) 469 | quantizer = GGUFImageQuantizer(base_node_dir, verbose=verbose_logging) 470 | status_messages = ["DEBUG: Starting GGUF Image Quantization Node..."] 471 | if verbose_logging: 472 | print("DEBUG: quantize_diffusion_model called with parameters:") 473 | print(f"DEBUG: quantization_type: {quantization_type}") 474 | print(f"DEBUG: output_path_template: {output_path_template}") 475 | print(f"DEBUG: is_absolute_path: {is_absolute_path}") 476 | print(f"DEBUG: setup_environment: {setup_environment}") 477 | print(f"DEBUG: verbose_logging: {verbose_logging}") 478 | 479 | 480 | if setup_environment: 481 | status_messages.append("DEBUG: Attempting llama.cpp environment setup...") 482 | if verbose_logging: 483 | print("DEBUG: Calling quantizer.setup_llama_cpp()") 484 | if not quantizer.setup_llama_cpp(): # This method now has its own DEBUG prints 485 | status_messages.append("❌ Error: llama.cpp environment setup failed. 
Check console.") 486 | if verbose_logging: 487 | print("DEBUG: quantizer.setup_llama_cpp() returned False.") 488 | return ("\n".join(status_messages), "") 489 | status_messages.append("✅ llama.cpp environment setup successful.") 490 | if verbose_logging: 491 | print("DEBUG: quantizer.setup_llama_cpp() returned True.") 492 | elif not os.path.exists(quantizer.compiled_quantize_exe_path): 493 | status_messages.append(f"❌ Error: llama-quantize not found at '{quantizer.compiled_quantize_exe_path}' and setup was skipped. Run with 'setup_environment=True' at least once.") 494 | if verbose_logging: 495 | print(f"DEBUG: llama-quantize not found at {quantizer.compiled_quantize_exe_path} and setup_environment is False.") 496 | return ("\n".join(status_messages), "") 497 | else: 498 | if verbose_logging: 499 | print(f"DEBUG: Skipping llama.cpp setup. Found llama-quantize at {quantizer.compiled_quantize_exe_path}") 500 | 501 | 502 | temp_model_input_path = None 503 | derived_model_name_for_output = "piped_unet_model" 504 | 505 | try: 506 | if verbose_logging: 507 | print("DEBUG: Entering UNET state_dict extraction and model name determination block.") 508 | unet_state_dict = None 509 | 510 | if hasattr(model, 'model') and hasattr(model.model, 'state_dict'): 511 | if verbose_logging: 512 | print("DEBUG: Trying to extract state_dict from model.model") 513 | unet_state_dict = model.model.state_dict() 514 | status_messages.append("✅ Extracted UNET state_dict from model.model") 515 | if hasattr(model, 'model_config'): 516 | m_config = model.model_config 517 | name_src = getattr(m_config, 'filename', getattr(m_config, 'name', None)) 518 | if isinstance(name_src, str) and name_src.strip() and not any(x in name_src.lower() for x in ["unet.json", "config.json"]): 519 | derived_model_name_for_output = os.path.splitext(os.path.basename(name_src))[0] 520 | elif hasattr(m_config, 'original_config_path') and isinstance(getattr(m_config, 'original_config_path', None), str): 521 | derived_model_name_for_output = os.path.splitext(os.path.basename(m_config.original_config_path))[0] 522 | if verbose_logging: 523 | print(f"DEBUG: Path 1: derived_model_name_for_output = {derived_model_name_for_output}") 524 | 525 | elif hasattr(model, 'model') and hasattr(model.model, 'model') and hasattr(model.model.model, 'state_dict'): 526 | if verbose_logging: 527 | print("DEBUG: Trying to extract state_dict from model.model.model") 528 | unet_state_dict = model.model.model.state_dict() 529 | status_messages.append("✅ Extracted UNET state_dict from model.model.model") 530 | m_config = getattr(model.model, 'model_config', getattr(model, 'model_config', None)) 531 | if m_config: 532 | name_src = getattr(m_config, 'filename', getattr(m_config, 'name', None)) 533 | if isinstance(name_src, str) and name_src.strip() and not any(x in name_src.lower() for x in ["unet.json", "config.json"]): 534 | derived_model_name_for_output = os.path.splitext(os.path.basename(name_src))[0] 535 | if verbose_logging: 536 | print(f"DEBUG: Path 2: derived_model_name_for_output = {derived_model_name_for_output}") 537 | 538 | elif hasattr(model, 'diffusion_model') and hasattr(model.diffusion_model, 'state_dict'): 539 | if verbose_logging: 540 | print("DEBUG: Trying to extract state_dict from model.diffusion_model") 541 | unet_state_dict = model.diffusion_model.state_dict() 542 | status_messages.append("✅ Extracted UNET state_dict from model.diffusion_model") 543 | m_config = getattr(model, 'model_config', None) 544 | if not m_config and 
hasattr(model.diffusion_model, 'config'): 545 | diffusers_conf = getattr(model.diffusion_model, 'config', None) 546 | name_or_path = getattr(diffusers_conf, '_name_or_path', "") 547 | if isinstance(name_or_path, str) and name_or_path.strip(): 548 | derived_model_name_for_output = os.path.basename(name_or_path) 549 | derived_model_name_for_output = os.path.splitext(derived_model_name_for_output)[0] if not os.path.isdir(os.path.join(".", name_or_path)) else derived_model_name_for_output 550 | elif m_config: 551 | name_src = getattr(m_config, 'filename', getattr(m_config, 'name', None)) 552 | if isinstance(name_src, str) and name_src.strip() and not any(x in name_src.lower() for x in ["unet.json", "config.json"]): 553 | derived_model_name_for_output = os.path.splitext(os.path.basename(name_src))[0] 554 | if verbose_logging: 555 | print(f"DEBUG: Path 3: derived_model_name_for_output = {derived_model_name_for_output}") 556 | 557 | elif hasattr(model, 'state_dict'): 558 | if verbose_logging: 559 | print("DEBUG: Trying to extract state_dict directly from model object") 560 | unet_state_dict = model.state_dict() 561 | status_messages.append("✅ Extracted state_dict directly from input model object") 562 | direct_conf = getattr(model, 'config', getattr(model, 'model_config', None)) 563 | if direct_conf: 564 | name_or_path = getattr(direct_conf, '_name_or_path', getattr(direct_conf, 'filename', getattr(direct_conf, 'name', None))) 565 | if isinstance(name_or_path, str) and name_or_path.strip() and not any(x in name_or_path.lower() for x in ["unet.json", "config.json"]): 566 | derived_model_name_for_output = os.path.basename(name_or_path) 567 | derived_model_name_for_output = os.path.splitext(derived_model_name_for_output)[0] if not os.path.isdir(os.path.join(".", name_or_path)) else derived_model_name_for_output 568 | if verbose_logging: 569 | print(f"DEBUG: Path 4: derived_model_name_for_output = {derived_model_name_for_output}") 570 | 571 | if unet_state_dict is None: 572 | if verbose_logging: 573 | print("DEBUG: UNET state_dict is None after all checks.") 574 | model_type_info = f"Type of input model: {type(model)}." 575 | model_attrs_str = "" 576 | try: 577 | model_attrs_str = f"Non-callable attributes: {', '.join(sorted(attr for attr in dir(model) if not callable(getattr(model, attr, None)) and not attr.startswith('__')))}" 578 | except: model_attrs_str = "Could not inspect model attributes." 579 | error_msg = ( 580 | "❌ Error: Could not extract UNET state_dict. 
The input 'model' doesn't match known ComfyUI MODEL structures or provide a direct state_dict.\n" 581 | f"{model_type_info}\n{model_attrs_str[:1500]}" 582 | ) 583 | status_messages.append(error_msg) 584 | return ("\n".join(status_messages), "") # Critical error, return 585 | 586 | status_messages.append(f"Using derived base name for output files: '{derived_model_name_for_output}'") 587 | if verbose_logging: 588 | print(f"DEBUG: Final derived_model_name_for_output: {derived_model_name_for_output}") 589 | 590 | temp_dir_for_input_model_sf = folder_paths.get_temp_directory() 591 | os.makedirs(temp_dir_for_input_model_sf, exist_ok=True) 592 | temp_model_input_path = os.path.join(temp_dir_for_input_model_sf, f"temp_unet_{derived_model_name_for_output}_{uuid.uuid4()}.safetensors") 593 | 594 | if verbose_logging: 595 | print(f"DEBUG: About to save UNET state_dict to temporary file: {temp_model_input_path}") 596 | save_file(unet_state_dict, temp_model_input_path) 597 | status_messages.append(f"✅ UNET state_dict saved to temporary file: {os.path.basename(temp_model_input_path)}") 598 | if verbose_logging: 599 | print(f"DEBUG: UNET state_dict saved successfully.") 600 | src_model_path_for_convert = temp_model_input_path 601 | 602 | except Exception as e: 603 | if verbose_logging: 604 | print(f"DEBUG: Exception during UNET state_dict extraction or saving: {e}") 605 | if temp_model_input_path and os.path.exists(temp_model_input_path): 606 | try: os.remove(temp_model_input_path) 607 | except: pass 608 | import traceback 609 | tb_str = traceback.format_exc() 610 | if verbose_logging: 611 | print(f"DEBUG: Traceback for state_dict exception: {tb_str}") 612 | status_messages.append(f"❌ Error during UNET state_dict extraction or saving: {e}\n{tb_str}") 613 | return ("\n".join(status_messages), "") 614 | 615 | status_messages.append(f"Preparing to convert & quantize using temporary UNET: {src_model_path_for_convert}") 616 | if verbose_logging: 617 | print(f"DEBUG: src_model_path_for_convert is set to: {src_model_path_for_convert}") 618 | 619 | # --- Determine Final Output Directory and Filename Core --- 620 | if verbose_logging: 621 | print("DEBUG: Starting output path determination block.") 622 | path_template_str = output_path_template.strip() 623 | filename_core = derived_model_name_for_output 624 | output_directory_part = "" 625 | 626 | if not path_template_str: 627 | if verbose_logging: 628 | print("DEBUG: output_path_template is empty.") 629 | if is_absolute_path: 630 | status_messages.append("❌ Error: 'output_path_template' cannot be empty when 'is_absolute_path' is True.") 631 | if verbose_logging: 632 | print("DEBUG: Error - output_path_template empty in absolute_path mode.") 633 | if src_model_path_for_convert and os.path.exists(src_model_path_for_convert): os.remove(src_model_path_for_convert) 634 | return ("\n".join(status_messages), "") 635 | else: 636 | output_directory_part = "gguf_quantized" 637 | final_output_directory = os.path.join(folder_paths.get_output_directory(), output_directory_part) 638 | if verbose_logging: 639 | print(f"DEBUG: Relative mode, empty template. 
Subdir: {output_directory_part}, Full dir: {final_output_directory}") 640 | else: 641 | if verbose_logging: 642 | print(f"DEBUG: output_path_template provided: '{path_template_str}'") 643 | norm_template = os.path.normpath(path_template_str) 644 | user_basename = os.path.basename(norm_template) 645 | user_dirname = os.path.dirname(norm_template) 646 | if verbose_logging: 647 | print(f"DEBUG: norm_template: {norm_template}, user_basename: {user_basename}, user_dirname: {user_dirname}") 648 | 649 | # Check if the path template is a directory path or a file path 650 | # If it's a directory path (ends with separator, or basename has no extension and looks like a folder name), 651 | # use the entire path as the directory and keep the original filename_core 652 | is_directory_path = ( 653 | path_template_str.endswith(os.path.sep) or 654 | path_template_str.endswith('/') or 655 | (user_basename and 656 | not '.' in user_basename and 657 | len(user_basename) > 0 and 658 | # Common directory names or patterns that suggest it's a directory 659 | (user_basename.lower() in ['models', 'unet', 'checkpoints', 'gguf', 'output', 'quantized'] or 660 | user_basename.lower().endswith('_models') or 661 | user_basename.lower().endswith('_output') or 662 | # If the parent directory exists and this looks like a subdirectory 663 | (user_dirname and os.path.exists(user_dirname)))) 664 | ) 665 | 666 | if is_directory_path: 667 | # This is a directory path 668 | output_directory_part = norm_template 669 | # Keep the original filename_core (derived_model_name_for_output) 670 | if verbose_logging: 671 | print(f"DEBUG: Detected directory path. Using entire path as directory: {output_directory_part}") 672 | print(f"DEBUG: Keeping original filename_core: {filename_core}") 673 | else: 674 | # This is a file path (directory/filename_core) 675 | if user_basename: 676 | filename_core = user_basename 677 | output_directory_part = user_dirname 678 | if verbose_logging: 679 | print(f"DEBUG: Detected file path. filename_core set to: {filename_core}") 680 | print(f"DEBUG: output_directory_part set to: {output_directory_part}") 681 | 682 | if is_absolute_path: 683 | if verbose_logging: 684 | print("DEBUG: Absolute path mode.") 685 | if not user_dirname and user_basename: 686 | status_messages.append(f"❌ Error: Absolute path template '{path_template_str}' must include an absolute directory, not just a filename.") 687 | if verbose_logging: 688 | print(f"DEBUG: Error - Absolute template '{path_template_str}' lacks directory part.") 689 | if src_model_path_for_convert and os.path.exists(src_model_path_for_convert): os.remove(src_model_path_for_convert) 690 | return ("\n".join(status_messages), "") 691 | 692 | if not os.path.isabs(output_directory_part): 693 | status_messages.append(f"❌ Error: The directory part '{output_directory_part}' from template '{path_template_str}' is not an absolute path, but 'is_absolute_path' is True.") 694 | if verbose_logging: 695 | print(f"DEBUG: Error - Dir part '{output_directory_part}' is not absolute.") 696 | if src_model_path_for_convert and os.path.exists(src_model_path_for_convert): os.remove(src_model_path_for_convert) 697 | return ("\n".join(status_messages), "") 698 | final_output_directory = output_directory_part 699 | if verbose_logging: 700 | print(f"DEBUG: Absolute mode. 
Final output directory: {final_output_directory}") 701 | else: 702 | if verbose_logging: 703 | print("DEBUG: Relative path mode.") 704 | if os.path.isabs(output_directory_part): 705 | abs_part_warning = f"⚠️ Warning: Path template '{path_template_str}' has an absolute directory part ('{output_directory_part}') in relative mode. This absolute part will be used directly under ComfyUI's output directory, e.g., 'ComfyUI/output{output_directory_part.lstrip(os.path.sep)}'." 706 | status_messages.append(abs_part_warning) 707 | if verbose_logging: 708 | print(f"DEBUG: {abs_part_warning}") 709 | final_output_directory = os.path.join(folder_paths.get_output_directory(), output_directory_part.lstrip(os.path.sep)) 710 | else: 711 | final_output_directory = os.path.join(folder_paths.get_output_directory(), output_directory_part) 712 | if verbose_logging: 713 | print(f"DEBUG: Relative mode. Final output directory: {final_output_directory}") 714 | 715 | try: 716 | if verbose_logging: 717 | print(f"DEBUG: Attempting to create final output directory: {final_output_directory}") 718 | os.makedirs(final_output_directory, exist_ok=True) 719 | status_messages.append(f"Output directory set to: {final_output_directory}") 720 | if verbose_logging: 721 | print(f"DEBUG: Successfully ensured final output directory exists.") 722 | except Exception as e_mkdir: 723 | status_messages.append(f"❌ Error creating output directory '{final_output_directory}': {e_mkdir}") 724 | if verbose_logging: 725 | print(f"DEBUG: Exception creating output directory: {e_mkdir}") 726 | if src_model_path_for_convert and os.path.exists(src_model_path_for_convert): os.remove(src_model_path_for_convert) 727 | return ("\n".join(status_messages), "") 728 | 729 | # --- GGUF Conversion and Quantization --- 730 | final_return_path = "" 731 | gguf_scripts_dir = os.path.join(base_node_dir, "gguf") 732 | if verbose_logging: 733 | print(f"DEBUG: gguf_scripts_dir for 5D fix: {gguf_scripts_dir}") 734 | 735 | try: 736 | if verbose_logging: 737 | print("DEBUG: Entering main GGUF processing block (with tempfile.TemporaryDirectory).") 738 | with tempfile.TemporaryDirectory(prefix="gguf_convert_temp_") as temp_dir_for_convert_outputs: 739 | status_messages.append(f"Using temporary directory for GGUF conversion: {temp_dir_for_convert_outputs}") 740 | if verbose_logging: 741 | print(f"DEBUG: temp_dir_for_convert_outputs: {temp_dir_for_convert_outputs}") 742 | 743 | if verbose_logging: 744 | print(f"DEBUG: Calling quantizer.convert_model_to_initial_gguf with src: {src_model_path_for_convert}, temp_dir: {temp_dir_for_convert_outputs}") 745 | initial_gguf_path_in_temp, model_arch = quantizer.convert_model_to_initial_gguf(src_model_path_for_convert, temp_dir_for_convert_outputs) 746 | # quantizer.convert_model_to_initial_gguf has its own DEBUG prints 747 | if not initial_gguf_path_in_temp: 748 | status_messages.append("❌ Error: Failed to convert model to initial GGUF (F16/BF16). Check console for convert.py script errors.") 749 | if verbose_logging: 750 | print("DEBUG: quantizer.convert_model_to_initial_gguf failed (returned None).") 751 | raise ValueError("Initial GGUF conversion failed (convert.py error)") 752 | 753 | status_messages.append(f"✅ Initial GGUF created in temp: {os.path.basename(initial_gguf_path_in_temp)}") 754 | if model_arch: status_messages.append(f"Detected model architecture: {model_arch}") 755 | else: status_messages.append("⚠️ Warning: Model architecture unknown. 
5D tensor fix might be skipped.") 756 | if verbose_logging: 757 | print(f"DEBUG: Initial GGUF: {initial_gguf_path_in_temp}, Arch: {model_arch}") 758 | 759 | 760 | quant_types_to_process = [] 761 | process_all_mode = quantization_type.upper() == "ALL" 762 | if process_all_mode: 763 | quant_types_to_process = self.QUANT_TYPES 764 | final_return_path = final_output_directory 765 | status_messages.append(f"Processing ALL {len(quant_types_to_process)} quantization types: {', '.join(quant_types_to_process)}") 766 | if verbose_logging: 767 | print(f"DEBUG: 'ALL' mode selected. Processing types: {quant_types_to_process}. final_return_path set to dir: {final_return_path}") 768 | else: 769 | quant_types_to_process = [quantization_type] 770 | if verbose_logging: 771 | print(f"DEBUG: Single mode selected. Processing type: {quantization_type}") 772 | 773 | successful_outputs_count = 0 774 | 775 | for idx, q_type in enumerate(quant_types_to_process): 776 | q_type_upper = q_type.upper() 777 | current_loop_status = [f"\n--- Processing type: {q_type_upper} ({idx+1}/{len(quant_types_to_process)}) ---"] 778 | if verbose_logging: 779 | print(f"DEBUG: Loop {idx+1}/{len(quant_types_to_process)} - Processing type: {q_type_upper}") 780 | 781 | current_q_final_gguf_name = f"{filename_core}_{q_type_upper}.gguf" 782 | current_q_final_gguf_path = os.path.join(final_output_directory, current_q_final_gguf_name) 783 | if verbose_logging: 784 | print(f"DEBUG: Target output path for this type: {current_q_final_gguf_path}") 785 | 786 | if verbose_logging: 787 | print(f"DEBUG: Calling quantizer.quantize_gguf for {q_type_upper}. Input: {initial_gguf_path_in_temp}, Output: {current_q_final_gguf_path}") 788 | processed_gguf_path = quantizer.quantize_gguf(initial_gguf_path_in_temp, q_type_upper, current_q_final_gguf_path) 789 | # quantizer.quantize_gguf has its own DEBUG prints 790 | 791 | if not processed_gguf_path: 792 | current_loop_status.append(f"❌ Error: Failed to process/quantize to {q_type_upper}.") 793 | status_messages.extend(current_loop_status) 794 | if verbose_logging: 795 | print(f"DEBUG: quantizer.quantize_gguf failed for {q_type_upper}. Skipping this type.") 796 | continue 797 | 798 | current_loop_status.append(f"✅ Model processed to {q_type_upper}: {os.path.basename(processed_gguf_path)}") 799 | if verbose_logging: 800 | print(f"DEBUG: Successfully processed to {q_type_upper}. Path: {processed_gguf_path}") 801 | 802 | if model_arch and processed_gguf_path: 803 | if verbose_logging: 804 | print(f"DEBUG: Model arch '{model_arch}' known. Calling quantizer.apply_5d_fix_if_needed for {processed_gguf_path}") 805 | fixed_path_after_5d = quantizer.apply_5d_fix_if_needed(processed_gguf_path, model_arch, gguf_scripts_dir) 806 | # quantizer.apply_5d_fix_if_needed has its own DEBUG prints 807 | if fixed_path_after_5d is None: 808 | current_loop_status.append(f"❌ Error during 5D tensor fix for {q_type_upper}. 
File '{os.path.basename(processed_gguf_path)}' might be corrupted.") 809 | if verbose_logging: 810 | print(f"DEBUG: 5D fix failed for {q_type_upper}.") 811 | elif fixed_path_after_5d == processed_gguf_path: 812 | current_loop_status.append(f"✅ 5D tensor fix check/apply complete for {q_type_upper}.") 813 | if verbose_logging: 814 | print(f"DEBUG: 5D fix check/apply complete for {q_type_upper}.") 815 | successful_outputs_count +=1 816 | if not process_all_mode: final_return_path = processed_gguf_path 817 | elif not model_arch: 818 | current_loop_status.append(f"ℹ️ Skipping 5D tensor fix for {q_type_upper} (model architecture unknown).") 819 | if verbose_logging: 820 | print(f"DEBUG: Skipping 5D fix for {q_type_upper} (no model_arch).") 821 | successful_outputs_count +=1 822 | if not process_all_mode: final_return_path = processed_gguf_path 823 | else: # This case should ideally not be reached if processed_gguf_path was None and continue was hit. 824 | if verbose_logging: 825 | print(f"DEBUG: Fallthrough case after 5D fix logic for {q_type_upper} (processed_gguf_path might be None or arch unknown). This indicates an issue if processed_gguf_path was valid.") 826 | if processed_gguf_path : # If quantize was successful but arch unknown for fix 827 | successful_outputs_count +=1 828 | if not process_all_mode: final_return_path = processed_gguf_path 829 | 830 | 831 | status_messages.extend(current_loop_status) 832 | 833 | if successful_outputs_count == 0: 834 | if verbose_logging: 835 | print("DEBUG: No GGUF files were successfully created or processed in the loop.") 836 | raise ValueError("No GGUF files were successfully created or processed during quantization loop.") 837 | 838 | status_messages.append(f"\n🎉 Successfully processed. {successful_outputs_count} GGUF file(s) created/updated in '{final_output_directory}'.") 839 | if verbose_logging: 840 | print(f"DEBUG: Loop finished. 
Successful outputs: {successful_outputs_count}.") 841 | 842 | if verbose_logging: 843 | print("DEBUG: Exited GGUF processing block (tempfile.TemporaryDirectory scope ended).") 844 | 845 | except Exception as e: 846 | if verbose_logging: 847 | print(f"DEBUG: Exception during main GGUF processing block: {e}") 848 | status_messages.append(f"\n❌ An critical error occurred during GGUF processing: {e}") 849 | import traceback 850 | tb_str = traceback.format_exc() 851 | if verbose_logging: 852 | print(f"DEBUG: Traceback for GGUF processing exception: {tb_str}") 853 | status_messages.append(f"Traceback: {tb_str}") 854 | final_return_path = "" 855 | finally: 856 | if verbose_logging: 857 | print("DEBUG: Entering final cleanup block (finally).") 858 | if temp_model_input_path and os.path.exists(temp_model_input_path): 859 | try: 860 | if verbose_logging: 861 | print(f"DEBUG: Removing temporary input UNET: {temp_model_input_path}") 862 | os.remove(temp_model_input_path) 863 | status_messages.append(f"🗑️ Cleaned up temporary input UNET: {os.path.basename(temp_model_input_path)}") 864 | if verbose_logging: 865 | print(f"DEBUG: Successfully removed {temp_model_input_path}") 866 | except Exception as e_rem: 867 | status_messages.append(f"⚠️ Warning: Failed to clean temporary UNET file '{temp_model_input_path}': {e_rem}") 868 | if verbose_logging: 869 | print(f"DEBUG: Failed to remove {temp_model_input_path}: {e_rem}") 870 | else: 871 | if verbose_logging: 872 | print(f"DEBUG: No temporary input UNET file to remove (Path: {temp_model_input_path}, Exists: {os.path.exists(temp_model_input_path) if temp_model_input_path else 'N/A'})") 873 | 874 | if not final_return_path: 875 | status_messages.append(f"\n❌ Processing failed. No valid output path determined. Check logs.") 876 | if verbose_logging: 877 | print("DEBUG: final_return_path is empty at the end. Processing failed.") 878 | return ("\n".join(status_messages), "") 879 | 880 | if not process_all_mode and not os.path.exists(final_return_path): 881 | status_messages.append(f"\n❌ Error: Final GGUF file '{final_return_path}' not found after processing.") 882 | if verbose_logging: 883 | print(f"DEBUG: Single mode, but final_return_path '{final_return_path}' does not exist.") 884 | return ("\n".join(status_messages), "") 885 | 886 | if verbose_logging: 887 | print(f"DEBUG: Returning from quantize_diffusion_model. Status messages collected. Final return path: {final_return_path}") 888 | return ("\n".join(status_messages), final_return_path) 889 | 890 | 891 | # ComfyUI Registration is handled by __init__.py 892 | -------------------------------------------------------------------------------- /nodes.py: -------------------------------------------------------------------------------- 1 | # nodes.py 2 | # This file contains the implementation of your custom nodes. 3 | 4 | import torch 5 | from safetensors.torch import save_file 6 | from tqdm import tqdm 7 | import os 8 | 9 | # --- Helper Classes (Using original simpler scaling logic as provided by user) --- 10 | 11 | class FP8Quantizer: 12 | """A class to apply FP8 quantization to a state_dict.""" 13 | 14 | def __init__(self, quant_dtype: str = "float8_e5m2"): 15 | if not hasattr(torch, quant_dtype): 16 | raise ValueError(f"Unsupported quant_dtype: {quant_dtype}. 
PyTorch does not have this attribute.") 17 | self.quant_dtype = quant_dtype 18 | self.scale_factors = {} # Not used in current quantize_weights but kept for potential future use 19 | 20 | def quantize_weights(self, weight: torch.Tensor, layer_name: str) -> torch.Tensor: 21 | """Quantizes a weight tensor to the specified FP8 format using simple scaling.""" 22 | if not weight.is_floating_point(): 23 | return weight # Only quantize floating point tensors 24 | 25 | original_device = weight.device 26 | # Ensure FP8 conversion happens on CUDA if possible, as FP8 types require it 27 | can_use_cuda = torch.cuda.is_available() 28 | target_device = torch.device("cuda") if can_use_cuda else torch.device("cpu") 29 | 30 | # Warn if trying FP8 on CPU without CUDA 31 | if not can_use_cuda and "float8" in self.quant_dtype: 32 | print(f"[FP8Quantizer] Warning: CUDA not available. True {self.quant_dtype} conversion requires CUDA. Attempting on CPU, but results may be unexpected or errors may occur.") 33 | target_device = torch.device("cpu") # Try on CPU despite warning 34 | 35 | weight_on_target = weight.to(target_device) 36 | 37 | max_val = torch.max(torch.abs(weight_on_target)) 38 | if max_val == 0: 39 | # For zero tensor, just cast to target dtype on the target device 40 | target_torch_dtype = getattr(torch, self.quant_dtype) 41 | return torch.zeros_like(weight_on_target, dtype=target_torch_dtype) 42 | else: 43 | # Using the simple scaling from user's provided script 44 | scale = max_val / 127.0 45 | # Clamp scale to avoid division by zero if max_val is extremely small 46 | scale = torch.max(scale, torch.tensor(1e-12, device=target_device, dtype=weight_on_target.dtype)) 47 | 48 | # Quantize: scale, round, unscale (simulated int8 range mapping) 49 | quantized_weight_simulated = torch.round(weight_on_target / scale * 127.0) / 127.0 * scale 50 | 51 | # Final cast to the target FP8 dtype 52 | target_torch_dtype = getattr(torch, self.quant_dtype) 53 | quantized_weight = quantized_weight_simulated.to(dtype=target_torch_dtype) 54 | 55 | # Return on the device where conversion happened (target_device, ideally CUDA) 56 | return quantized_weight 57 | 58 | def apply_quantization(self, state_dict: dict) -> dict: 59 | """Applies direct FP8 quantization to all applicable weights.""" 60 | quantized_state_dict = {} 61 | eligible_tensors = {name: param for name, param in state_dict.items() if isinstance(param, torch.Tensor) and param.is_floating_point()} 62 | progress_bar = tqdm(eligible_tensors.items(), desc=f"Quantizing to {self.quant_dtype}", unit="tensor", leave=False) 63 | 64 | for name, param in progress_bar: 65 | # quantize_weights handles device logic now 66 | quantized_state_dict[name] = self.quantize_weights(param.clone(), name) 67 | 68 | for name, param in state_dict.items(): 69 | if name not in quantized_state_dict: 70 | quantized_state_dict[name] = param 71 | return quantized_state_dict 72 | 73 | class FP8ScaledQuantizer: 74 | """ 75 | Simulated FP8 quantizer using 8-bit scaled float approximation based on user provided script. 76 | Operations are performed on the device of the input tensors. 77 | (Used internally by QuantizeModel node for the value simulation step). 
78 | """ 79 | def __init__(self, scaling_strategy: str = "per_tensor"): 80 | self.scaling_strategy = scaling_strategy 81 | self.scale_factors = {} # Stores the calculated scales (Python floats or lists) 82 | 83 | def _quantize_fp8_simulated(self, tensor: torch.Tensor, scale: torch.Tensor) -> torch.Tensor: 84 | """Simulate quantization by scaling, clamping to 8-bit range, and dequantizing.""" 85 | # Ensure scale is a tensor on the correct device and dtype 86 | scale = scale.to(device=tensor.device, dtype=tensor.dtype) 87 | # Prevent division by zero 88 | scale = torch.where(scale == 0, torch.tensor(1e-9, device=tensor.device, dtype=tensor.dtype), scale) 89 | 90 | # Perform simulation: scale, round, clamp, unscale 91 | quantized_intermediate = tensor / scale * 127.0 92 | quantized = torch.round(quantized_intermediate).clamp_(-127.0, 127.0) 93 | dequantized = quantized / 127.0 * scale 94 | return dequantized 95 | 96 | def quantize_weights(self, weight: torch.Tensor, layer_name: str) -> torch.Tensor: 97 | """Applies the simulated quantization based on the chosen strategy.""" 98 | if not isinstance(weight, torch.Tensor) or not weight.is_floating_point(): 99 | return weight # Skip non-float tensors 100 | 101 | current_device = weight.device 102 | 103 | if self.scaling_strategy == "per_tensor": 104 | scale_val = torch.max(torch.abs(weight)) 105 | # Ensure scale_val is a tensor for consistent handling in _quantize_fp8_simulated 106 | scale = scale_val if scale_val != 0 else torch.tensor(1.0, device=current_device, dtype=weight.dtype) 107 | self.scale_factors[layer_name] = scale.item() # Store the scale value 108 | quantized_weight = self._quantize_fp8_simulated(weight, scale) 109 | 110 | elif self.scaling_strategy == "per_channel": 111 | if weight.ndim < 2: # Fallback to per-tensor for 1D tensors 112 | scale_val = torch.max(torch.abs(weight)) 113 | scale = scale_val if scale_val != 0 else torch.tensor(1.0, device=current_device, dtype=weight.dtype) 114 | self.scale_factors[layer_name] = scale.item() 115 | quantized_weight = self._quantize_fp8_simulated(weight, scale) 116 | else: 117 | # Assume channel dimension is 0 for typical Conv layers 118 | # For Linear layers (e.g., [out_features, in_features]), dim 0 is also common. 119 | # If weights are [in, out], dim 1 might be needed. Defaulting to dim 0. 
120 | channel_dim = 0 121 | dims_to_reduce = [d for d in range(weight.ndim) if d != channel_dim] 122 | if not dims_to_reduce: # Handle edge case if channel_dim is the only dim somehow 123 | scale_val = torch.max(torch.abs(weight)) 124 | scale = scale_val if scale_val != 0 else torch.tensor(1.0, device=current_device, dtype=weight.dtype) 125 | self.scale_factors[layer_name] = scale.item() 126 | else: 127 | scale = torch.amax(torch.abs(weight), dim=dims_to_reduce, keepdim=True) 128 | # Store scales as a list of floats 129 | self.scale_factors[layer_name] = scale.squeeze().tolist() 130 | 131 | quantized_weight = self._quantize_fp8_simulated(weight, scale) 132 | else: 133 | raise ValueError(f"Unknown scaling strategy: {self.scaling_strategy}") 134 | 135 | # The output tensor retains the original dtype but has modified values 136 | return quantized_weight 137 | 138 | def apply_quantization(self, state_dict: dict) -> dict: 139 | """Applies simulated FP8 quantization to all applicable weights.""" 140 | quantized_state_dict = {} 141 | # Process only floating point tensors 142 | eligible_tensors = {name: param for name, param in state_dict.items() if isinstance(param, torch.Tensor) and param.is_floating_point()} 143 | progress_bar = tqdm(eligible_tensors.items(), desc=f"Applying scaled ({self.scaling_strategy}) quantization", unit="tensor", leave=False) 144 | 145 | for name, param in progress_bar: 146 | # Pass the clone to avoid modifying the original dict if errors occur mid-way 147 | quantized_state_dict[name] = self.quantize_weights(param.clone(), name) 148 | 149 | # Add back non-floating point tensors and non-tensor data 150 | for name, param in state_dict.items(): 151 | if name not in quantized_state_dict: 152 | quantized_state_dict[name] = param 153 | return quantized_state_dict 154 | 155 | # --- ComfyUI Nodes --- 156 | 157 | class ModelToStateDict: 158 | @classmethod 159 | def INPUT_TYPES(s): return {"required": {"model": ("MODEL",)}} 160 | RETURN_TYPES = ("MODEL_STATE_DICT",); RETURN_NAMES = ("model_state_dict",) 161 | FUNCTION = "get_state_dict"; CATEGORY = "Model Quantization/Utils" 162 | def get_state_dict(self, model): 163 | print("[ModelToStateDict] Attempting to extract state_dict...") 164 | if not hasattr(model, 'model') or not hasattr(model.model, 'state_dict'): 165 | print("[ModelToStateDict] Error: Invalid MODEL structure."); return ({},) 166 | try: 167 | original_state_dict = model.model.state_dict() 168 | print(f"[ModelToStateDict] Original keys sample: {list(original_state_dict.keys())[:5]}") 169 | state_dict_to_return = original_state_dict; prefixes_to_try = ["diffusion_model.", "model."]; prefix_found = False 170 | for prefix in prefixes_to_try: 171 | num_keys = len(original_state_dict); 172 | if num_keys == 0: break 173 | matches = sum(1 for k in original_state_dict if k.startswith(prefix)) 174 | if matches > 0 and (matches / num_keys > 0.5 or matches == num_keys): 175 | print(f"[ModelToStateDict] Stripping prefix '{prefix}'...") 176 | state_dict_to_return = {k[len(prefix):] if k.startswith(prefix) else k: v for k, v in original_state_dict.items()} 177 | print(f"[ModelToStateDict] New keys sample: {list(state_dict_to_return.keys())[:5]}"); prefix_found = True; break 178 | if not prefix_found: print("[ModelToStateDict] No common prefixes stripped.") 179 | dtypes = {}; total = 0 180 | for k, v in state_dict_to_return.items(): 181 | if isinstance(v, torch.Tensor): total += 1; dt = str(v.dtype); dtypes[dt] = dtypes.get(dt, 0) + 1 182 | print(f"[ModelToStateDict] DEBUG: Output 
Tensors: {total}, Dtypes: {dtypes}") 183 | return (state_dict_to_return,) 184 | except Exception as e: print(f"[ModelToStateDict] Error: {e}"); return ({},) 185 | 186 | class QuantizeFP8Format: # Direct FP8 conversion node 187 | @classmethod 188 | def INPUT_TYPES(s): return { "required": { "model_state_dict": ("MODEL_STATE_DICT",), "fp8_format": (["float8_e4m3fn", "float8_e5m2"], {"default": "float8_e5m2"}), } } 189 | RETURN_TYPES = ("MODEL_STATE_DICT",); RETURN_NAMES = ("quantized_model_state_dict",) 190 | FUNCTION = "quantize_model"; CATEGORY = "Model Quantization/FP8 Direct" 191 | def quantize_model(self, model_state_dict: dict, fp8_format: str): 192 | print(f"[QuantizeFP8Format] To {fp8_format}. Keys(sample): {list(model_state_dict.keys())[:3]}") 193 | if not isinstance(model_state_dict, dict) or not model_state_dict: print("[QuantizeFP8Format] Invalid input."); return ({},) 194 | try: 195 | quantizer = FP8Quantizer(quant_dtype=fp8_format) # Uses helper class with simple scaling 196 | quantized_state_dict = quantizer.apply_quantization(model_state_dict) 197 | found = False; 198 | for n, p in quantized_state_dict.items(): 199 | if isinstance(p, torch.Tensor) and "float8" in str(p.dtype): print(f"[QuantizeFP8Format] Sample '{n}' dtype: {p.dtype}, dev: {p.device}"); found=True; break 200 | if not found: print(f"[QuantizeFP8Format] No tensor converted to {fp8_format}.") 201 | print("[QuantizeFP8Format] Complete."); return (quantized_state_dict,) 202 | except Exception as e: print(f"[QuantizeFP8Format] Error: {e}"); return (model_state_dict,) 203 | 204 | class QuantizeModel: # <<< RENAMED CLASS from QuantizeScaled 205 | """ 206 | Applies simulated FP8 scaling (per-tensor/per-channel) and casts 207 | to a specified output dtype (float16, bfloat16, or Original). 
208 | """ 209 | # Removed "FP8" from the list as requested 210 | OUTPUT_DTYPES_LIST = ["Original", "float16", "bfloat16"] 211 | 212 | @classmethod 213 | def INPUT_TYPES(s): 214 | return { 215 | "required": { 216 | "model_state_dict": ("MODEL_STATE_DICT",), 217 | "scaling_strategy": (["per_tensor", "per_channel"], {"default": "per_tensor"}), 218 | "processing_device": (["Auto", "CPU", "GPU"], {"default": "Auto"}), 219 | # Default to float16 for size reduction, user can choose others 220 | "output_dtype": (s.OUTPUT_DTYPES_LIST, {"default": "float16"}), 221 | } 222 | } 223 | 224 | RETURN_TYPES = ("MODEL_STATE_DICT",) 225 | RETURN_NAMES = ("quantized_model_state_dict",) 226 | FUNCTION = "quantize_model_scaled" # Internal function name can stay 227 | CATEGORY = "Model Quantization" # More general category 228 | 229 | def quantize_model_scaled(self, model_state_dict: dict, scaling_strategy: str, processing_device: str, output_dtype: str): 230 | # Log using the new class name 231 | print(f"[QuantizeModel] Strategy: {scaling_strategy}, Device: {processing_device}, Output Dtype: {output_dtype}") 232 | 233 | if not isinstance(model_state_dict, dict) or not model_state_dict: 234 | print("[QuantizeModel] Error: Input model_state_dict is invalid."); 235 | return (model_state_dict if isinstance(model_state_dict, dict) else {},) 236 | 237 | # Determine processing device 238 | current_processing_device_str = "cpu" 239 | if processing_device == "Auto": 240 | first_tensor_device = next((p.device for p in model_state_dict.values() if isinstance(p, torch.Tensor)), torch.device("cpu")) 241 | current_processing_device_str = str(first_tensor_device) 242 | elif processing_device == "CPU": current_processing_device_str = "cpu" 243 | elif processing_device == "GPU": 244 | if torch.cuda.is_available(): current_processing_device_str = "cuda" 245 | else: print("[QuantizeModel] Warning: GPU selected, CUDA unavailable. Defaulting to CPU."); current_processing_device_str = "cpu" 246 | current_processing_device = torch.device(current_processing_device_str) 247 | print(f"[QuantizeModel] Value scaling simulation target device: {current_processing_device}") 248 | 249 | # Move input state_dict to the processing device 250 | state_dict_on_processing_device = {} 251 | for name, param in model_state_dict.items(): 252 | if isinstance(param, torch.Tensor): 253 | state_dict_on_processing_device[name] = param.to(current_processing_device) 254 | else: state_dict_on_processing_device[name] = param 255 | 256 | scaled_state_dict = {}; final_state_dict = {} 257 | 258 | try: 259 | # Perform FP8 value simulation using FP8ScaledQuantizer helper (simple scaling version) 260 | quantizer = FP8ScaledQuantizer(scaling_strategy=scaling_strategy) 261 | scaled_state_dict = quantizer.apply_quantization(state_dict_on_processing_device) 262 | print(f"[QuantizeModel] FP8 value scaling simulation performed on {current_processing_device}.") 263 | 264 | # Cast to final output_dtype (Original, float16, bfloat16) 265 | if output_dtype == "Original": 266 | print("[QuantizeModel] Output Dtype: Original. 
No further dtype casting.") 267 | final_state_dict = scaled_state_dict 268 | else: 269 | # output_dtype is guaranteed to be 'float16' or 'bfloat16' 270 | try: 271 | target_torch_dtype = getattr(torch, output_dtype) 272 | print(f"[QuantizeModel] Casting output to {output_dtype} ({target_torch_dtype})...") 273 | for name, param in scaled_state_dict.items(): 274 | if isinstance(param, torch.Tensor) and param.is_floating_point(): 275 | final_state_dict[name] = param.to(dtype=target_torch_dtype) 276 | else: 277 | final_state_dict[name] = param # Pass non-float tensors or non-tensors 278 | print(f"[QuantizeModel] Casting to {output_dtype} complete.") 279 | except AttributeError: # Should not happen with the restricted list 280 | print(f"[QuantizeModel] Error: Invalid torch dtype '{output_dtype}'. Using scaled tensors without final casting.") 281 | final_state_dict = scaled_state_dict 282 | except Exception as e_cast: 283 | print(f"[QuantizeModel] Error during casting loop to {output_dtype}: {e_cast}. Using scaled tensors without final casting for affected tensors.") 284 | for name_done, param_done in final_state_dict.items(): pass 285 | for name_rem, param_rem in scaled_state_dict.items(): 286 | if name_rem not in final_state_dict: final_state_dict[name_rem] = param_rem 287 | 288 | # Verification log 289 | for name, param in final_state_dict.items(): 290 | if isinstance(param, torch.Tensor) and param.is_floating_point(): 291 | print(f"[QuantizeModel] Sample output tensor '{name}' final dtype: {param.dtype}, device: {param.device}") 292 | break 293 | print(f"[QuantizeModel] Processing complete.") 294 | return (final_state_dict,) 295 | except Exception as e: 296 | print(f"[QuantizeModel] Major error during processing: {e}") 297 | return (model_state_dict,) 298 | 299 | 300 | class SaveAsSafeTensor: # No changes needed 301 | @classmethod 302 | def INPUT_TYPES(s): return { "required": { "quantized_model_state_dict": ("MODEL_STATE_DICT",), "absolute_save_path": ("STRING", {"default": "C:/temp/quantized_model.safetensors", "multiline": False}), } } 303 | RETURN_TYPES = () ; OUTPUT_NODE = True ; FUNCTION = "save_model"; CATEGORY = "Model Quantization/Save" 304 | def save_model(self, quantized_model_state_dict: dict, absolute_save_path: str): 305 | print(f"[SaveAsSafeTensor] Saving to: {absolute_save_path}") 306 | if not isinstance(quantized_model_state_dict, dict) or not quantized_model_state_dict: print("[SaveAsSafeTensor] Error: Input invalid."); return {"ui": {"text": ["Error: Input invalid."]}} 307 | if not absolute_save_path: print("[SaveAsSafeTensor] Error: Path empty."); return {"ui": {"text": ["Error: Path empty."]}} 308 | if not absolute_save_path.lower().endswith(".safetensors"): absolute_save_path += ".safetensors"; print(f"[SaveAsSafeTensor] Appended .safetensors") 309 | try: 310 | output_dir = os.path.dirname(absolute_save_path); 311 | if output_dir and not os.path.exists(output_dir): os.makedirs(output_dir, exist_ok=True); print(f"[SaveAsSafeTensor] Created dir.") 312 | cpu_state_dict = {}; dtype_counts = {}; total_tensors = 0 313 | for k, v in quantized_model_state_dict.items(): 314 | if isinstance(v, torch.Tensor): 315 | total_tensors += 1; tensor_to_save = v.cpu() if v.device.type != 'cpu' else v; cpu_state_dict[k] = tensor_to_save 316 | dt_str = str(tensor_to_save.dtype); dtype_counts[dt_str] = dtype_counts.get(dt_str, 0) + 1 317 | else: cpu_state_dict[k] = v 318 | print(f"[SaveAsSafeTensor] DEBUG: Tensors: {total_tensors}, Dtypes: {dtype_counts}") 319 | save_file(cpu_state_dict, 
absolute_save_path) 320 | print(f"[SaveAsSafeTensor] Saved successfully."); return {"ui": {"text": [f"Saved: {absolute_save_path}"]}} 321 | except Exception as e: print(f"[SaveAsSafeTensor] Error saving: {e}"); return {"ui": {"text": [f"Error: {e}"]}} 322 | 323 | # --- Main (for testing outside ComfyUI, not strictly necessary for the plugin) --- 324 | # (Test block remains the same as previous version, it already tests QuantizeModel) 325 | if __name__ == '__main__': 326 | print("--- Testing Quantization Nodes (Renamed QuantizeModel) ---") 327 | 328 | class MockCoreModel(torch.nn.Module): 329 | def __init__(self): super().__init__(); self.layer1 = torch.nn.Linear(10,10).float(); self.layer2=torch.nn.Linear(10,10).float() 330 | def forward(self,x): return self.layer2(self.layer1(x)) 331 | def state_dict(self, *args, **kwargs): return {k:v.clone() for k,v in super().state_dict(*args,**kwargs).items()} 332 | class MockModelWithPrefix(torch.nn.Module): 333 | def __init__(self): super().__init__(); self.model = MockCoreModel() # Use "model." prefix 334 | def forward(self,x): return self.model(x) 335 | class MockModelPatcher: 336 | def __init__(self): self.model = MockModelWithPrefix(); [p.data.normal_().float() for p in self.model.parameters()] 337 | 338 | mock_comfy_model = MockModelPatcher() 339 | node_to_sd = ModelToStateDict() 340 | base_sd_tuple = node_to_sd.get_state_dict(mock_comfy_model) 341 | base_sd = base_sd_tuple[0] if base_sd_tuple else {} 342 | if not base_sd or 'layer1.weight' not in base_sd: print("ModelToStateDict failed."); exit() # Check unprefixed key 343 | print(f"Base SD 'layer1.weight' dtype: {base_sd['layer1.weight'].dtype}, device: {base_sd['layer1.weight'].device}") 344 | 345 | # Test the renamed node 346 | node_quantize = QuantizeModel() 347 | print("\n--- Test QuantizeModel ---") 348 | 349 | # Test Case 1: Output float16 350 | print("Testing QuantizeModel: Output Dtype = float16, Device = CPU") 351 | result_fp16_tuple = node_quantize.quantize_model_scaled(base_sd, "per_tensor", "CPU", "float16") 352 | result_fp16 = result_fp16_tuple[0] if result_fp16_tuple else {} 353 | if result_fp16 and 'layer1.weight' in result_fp16: 354 | tensor = result_fp16['layer1.weight'] 355 | print(f" Output 'layer1.weight' dtype: {tensor.dtype} (Expected float16), device: {tensor.device}") 356 | assert tensor.dtype == torch.float16 357 | assert tensor.device.type == 'cpu' 358 | else: print(" Test Case 1 Failed.") 359 | 360 | # Test Case 2: Output Original 361 | print("\nTesting QuantizeModel: Output Dtype = Original, Device = CPU") 362 | result_orig_tuple = node_quantize.quantize_model_scaled(base_sd, "per_tensor", "CPU", "Original") 363 | result_orig = result_orig_tuple[0] if result_orig_tuple else {} 364 | if result_orig and 'layer1.weight' in result_orig: 365 | tensor = result_orig['layer1.weight'] 366 | print(f" Output 'layer1.weight' dtype: {tensor.dtype} (Expected {base_sd['layer1.weight'].dtype}), device: {tensor.device}") 367 | assert tensor.dtype == base_sd['layer1.weight'].dtype # Should match original 368 | assert tensor.device.type == 'cpu' 369 | else: print(" Test Case 2 Failed.") 370 | 371 | print("\n--- Testing Complete ---") -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [project] 2 | name = "modelquantizer" 3 | description = "Nodes for converting model weights to FP8, BF16, and FP16 formats." 
4 | version = "1.0.0" 5 | license = {file = "LICENSE"} 6 | dependencies = ["torch>=2.0.0", "safetensors>=0.3.1", "tqdm>=4.65.0"] 7 | 8 | [project.urls] 9 | Repository = "https://github.com/lum3on/ComfyUI-ModelQuantizer" 10 | # Used by Comfy Registry https://comfyregistry.org 11 | 12 | [tool.comfy] 13 | PublisherId = "" 14 | DisplayName = "ComfyUI-ModelQuantizer" 15 | Icon = "" 16 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | # Core dependencies for all quantization features 2 | torch>=2.0.0 3 | safetensors>=0.3.1 4 | tqdm>=4.65.0 5 | numpy>=1.17.0 6 | 7 | # Additional dependencies for ControlNet FP8 quantization 8 | tensorflow>=2.13.0 9 | tensorflow-model-optimization>=0.7.0 10 | 11 | # Additional dependencies for GGUF quantization 12 | # Note: GGUF package is included in the gguf/ directory (from City96's tools) 13 | # The following are required for GGUF functionality: 14 | huggingface_hub>=0.16.0 15 | requests>=2.28.0 16 | 17 | # Optional dependencies for enhanced functionality 18 | # These are recommended but not strictly required: 19 | # sentencepiece>=0.1.98 # For advanced GGUF model processing 20 | # pyyaml>=5.1 # For configuration file support 21 | 22 | # System requirements for GGUF quantization: 23 | # - Minimum 96GB RAM (for large diffusion models) 24 | # - CUDA-compatible GPU (recommended) 25 | # - Python 3.8+ with PyTorch 2.0+ 26 | # - Sufficient storage space for temporary files during processing 27 | -------------------------------------------------------------------------------- /web/appearance.js: -------------------------------------------------------------------------------- 1 | import { app } from "../../scripts/app.js"; 2 | 3 | app.registerExtension({ 4 | name: "ComfyUI-ModelQuantizer.appearance", 5 | async nodeCreated(node) { 6 | // Model Quantization nodes styling - Apply styling 7 | if (node.comfyClass === "ModelToStateDict" || 8 | node.comfyClass === "QuantizeFP8Format" || 9 | node.comfyClass === "QuantizeModel" || 10 | node.comfyClass === "SaveAsSafeTensor" || 11 | node.comfyClass === "ControlNetFP8QuantizeNode" || 12 | node.comfyClass === "ControlNetMetadataViewerNode" || 13 | node.comfyClass === "GGUFQuantizerNode") { 14 | node.color = "#f9918b"; 15 | node.bgcolor = "#a1cfa9"; 16 | } 17 | } 18 | }); --------------------------------------------------------------------------------