├── .gitattributes
├── .gitignore
├── README.md
├── assets
    ├── 16_comparison.gif
    ├── 1_comparison.gif
    ├── 3_comparison.gif
    ├── 4_comparison.gif
    ├── 7_comparison.gif
    ├── 8_comparison.gif
    ├── Cogview4
    │   ├── cfg.png
    │   └── ours.png
    ├── HiDream
    │   ├── cat_cfg.png
    │   └── cat_ours.png
    ├── Qwen2.5
    │   ├── output-origin.mp3
    │   └── output-ours.mp3
    ├── easycontrol
    │   ├── image.webp
    │   ├── image_CFG.webp
    │   └── image_CFG_zero_star.webp
    ├── flux
    │   ├── image_cfg.png
    │   ├── image_ours.png
    │   └── lora
    │   │   ├── image_cfg_ds.png
    │   │   └── image_ours_ds.png
    ├── hunyuan
    │   ├── 376559893_output_cfg.gif
    │   └── 376559893_output_ours.gif
    ├── repo_teaser.jpg
    ├── sd3
    │   ├── output_cfg.png
    │   └── output_ours.png
    └── wan2.1
    │   ├── 1270611998_base.gif
    │   ├── 1270611998_ours.gif
    │   ├── 1306980124_base.gif
    │   ├── 1306980124_ours.gif
    │   ├── 1322140014_base.gif
    │   ├── 1322140014_ours.gif
    │   ├── 158241056_base.gif
    │   ├── 158241056_ours.gif
    │   ├── I2V_CFG.gif
    │   ├── I2V_Ours.gif
    │   ├── i2v-14B_832_480_cfg_3549111921.gif
    │   ├── i2v-14B_832_480_ours_3549111921.gif
    │   └── i2v_input.JPG
├── demo.py
├── models
    ├── Cogview4
    │   ├── infer.py
    │   └── pipeline.py
    ├── HiDream
    │   └── pipeline.py
    ├── Qwen2.5
    │   ├── infer.py
    │   └── qw_model.py
    ├── easycontrol
    │   ├── infer.py
    │   └── src
    │   │   ├── __init__.py
    │   │   ├── layers_cache.py
    │   │   ├── lora_helper.py
    │   │   ├── pipeline.py
    │   │   └── transformer_flux.py
    ├── flux
    │   ├── Guidance_distilled.py
    │   ├── infer_lora.py
    │   └── pipeline.py
    ├── hunyuan
    │   ├── pipeline.py
    │   └── t2v.py
    ├── sd
    │   ├── infer.py
    │   └── sd3_pipeline.py
    └── wan
    │   ├── T2V_infer.py
    │   ├── image2video_cfg_zero_star.py
    │   └── wan_pipeline.py
├── requirements.txt
└── tools
    └── convert_to_gif.py

/.gitattributes:
--------------------------------------------------------------------------------
*.gif filter=lfs diff=lfs merge=lfs -text

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# .*
*.py[cod]
# *.jpg
*.jpeg
# *.png
# *.gif
*.bmp
*.mp4
*.mov
*.mkv
*.log
*.zip
*.pt
*.pth
*.ckpt
*.safetensors
*.json
# *.txt
*.backup
*.pkl
*.html
*.pdf
*.whl
cache
__pycache__/
storage/
samples/
!.gitignore
!requirements.txt
.DS_Store
*DS_Store

generated_videos
output

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models

---

🔥 [Huggingface demo for Ghibli style generation](https://huggingface.co/spaces/jamesliu1217/EasyControl_Ghibli) supported by [EasyControl](https://github.com/Xiaojiu-z/EasyControl).

⚡️ [Huggingface demo](https://huggingface.co/spaces/weepiess2383/CFG-Zero-Star) now supports text-to-image generation with SD3 and SD3.5.

💰 Bonus tip: You can even use pure zero-init (zeroing out the prediction of the first step) as a quick test: if it improves your flow-matching model a lot, it may indicate that the model has not converged yet.

**🧪 Usage Tip: Use both optimized-scale and zero-init together. Adjust the zero-init steps based on total inference steps; 4% is generally a good starting point.**

## 🔥 Update and News
- [2025.4.14] [HiDream](https://github.com/HiDream-ai/HiDream-I1) is supported now!
- [2025.4.14] 🔥 Supported by [sdnext](https://github.com/vladmandic/sdnext/blob/dev/CHANGELOG.md#update-for-2025-04-13) now!
- [2025.4.6] 📙 Supported by [EasyControl](https://github.com/Xiaojiu-z/EasyControl) now!
- [2025.4.4] 🤗 Supported by [Diffusers](https://github.com/huggingface/diffusers) now!
- [2025.4.2] 🙌 Mentioned by [Wan2.1](https://github.com/Wan-Video/Wan2.1)!
- [2025.4.1] Qwen2.5-Omni is supported now!
- [2025.3.30] Hunyuan is officially supported now!
- [2025.3.29] Flux is officially supported now!
- [2025.3.29] Both Wan2.1-14B I2V & T2V are now supported!
- [2025.3.28] Wan2.1-14B T2V is now supported! (Note: The default setting has been updated to zero out 4% of total steps for this scenario.)
- [2025.3.27] 📙 Supported by [ComfyUI-KJNodes](https://github.com/kijai/ComfyUI-KJNodes) now!
- [2025.03.26] 📙 Supported by [Wan2.1GP](https://github.com/deepbeepmeep/Wan2GP) now!
- [2025.03.25] The paper, demo, and code have been officially released.
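
The usage tip above (optimized-scale plus zero-init, with roughly 4% of steps zeroed) can be sketched in a few lines. This is a minimal, dependency-free illustration and not the code from the pipelines in `models/`. It assumes that "optimized-scale" means projecting the conditional prediction onto the unconditional one, and that zero-init applies to the first few sampler steps. The helper names (`optimized_scale`, `guided_prediction`, `zero_init_step_count`) and the rounding rule are hypothetical.

```python
def optimized_scale(v_cond, v_uncond, eps=1e-8):
    """Assumed form of the optimized scale: <v_cond, v_uncond> / ||v_uncond||^2."""
    dot = sum(c * u for c, u in zip(v_cond, v_uncond))
    sq_norm = sum(u * u for u in v_uncond) + eps  # eps guards against division by zero
    return dot / sq_norm


def zero_init_step_count(total_steps, fraction=0.04):
    """Hypothetical helper: number of initial steps to zero out (about 4%, at least 1)."""
    return max(1, round(total_steps * fraction))


def guided_prediction(v_cond, v_uncond, guidance_scale, step, zero_steps):
    """Combine conditional/unconditional predictions for one sampler step.

    Steps with index below `zero_steps` return an all-zero prediction (zero-init);
    later steps apply classifier-free guidance around the rescaled unconditional
    prediction.
    """
    if step < zero_steps:
        return [0.0] * len(v_cond)
    s = optimized_scale(v_cond, v_uncond)
    return [s * u + guidance_scale * (c - s * u) for c, u in zip(v_cond, v_uncond)]


if __name__ == "__main__":
    total_steps = 50
    zero_steps = zero_init_step_count(total_steps)
    # Toy 2-element "predictions" standing in for flattened model outputs.
    v_cond, v_uncond = [1.0, 1.0], [1.0, 0.0]
    for step in range(3):
        print(step, guided_prediction(v_cond, v_uncond, 3.0, step, zero_steps))
```

Setting `guidance_scale` to 1 returns the conditional prediction unchanged, which is a quick sanity check when wiring something like this into a sampler loop.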

## Community Works
If you find that CFG-Zero* helps improve your model, we'd love to hear about it!

Thanks to the following projects for supporting our method!
- [blissful-tuner](https://github.com/Sarania/blissful-tuner/tree/main)
- [SD.Next](https://github.com/vladmandic/sdnext)
- [EasyControl](https://huggingface.co/spaces/jamesliu1217/EasyControl_Ghibli)
- [ComfyUI-KJNodes](https://github.com/kijai/ComfyUI-KJNodes)
- [Wan2.1GP](https://github.com/deepbeepmeep/Wan2GP)
- [ComfyUI](https://github.com/comfyanonymous/ComfyUI) **Note that ComfyUI's implementation differs from ours.**

## 📑 Todo List
- Wan2.1
  - [x] 14B Text-to-Video
  - [x] 14B Image-to-Video
- Hunyuan
  - [x] Text-to-Video
- SD3/SD3.5
  - [x] Text-to-Image
- Flux
  - [x] Text-to-Image (Guidance-distilled version)
  - [x] LoRA
- CogView4
  - [x] Text-to-Image
- Qwen2.5-Omni
  - [x] Audio generation
- EasyControl
  - [x] Ghibli-Style Portrait Generation
- HiDream
  - [x] text2image pipeline

## :astonished: Gallery

[Images: CFG | CFG-Zero*]
Prompt: "A cat walks on the grass, realistic"
Seed: 1322140014

[Images: CFG | CFG-Zero*]
Prompt: "A dynamic interaction between the ocean and a large rock. The rock, with its rough texture and jagged edges, is partially submerged in the water, suggesting it is a natural feature of the coastline. The water around the rock is in motion, with white foam and waves crashing against the rock, indicating the force of the ocean's movement. The background is a vast expanse of the ocean, with small ripples and waves, suggesting a moderate sea state. The overall style of the scene is a realistic depiction of a natural landscape, with a focus on the interplay between the rock and the water."
Seed: 1306980124

[Images: CFG | CFG-Zero*]
Prompt: "A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about."
Seed: 1270611998

[Images: CFG | CFG-Zero*]
Prompt: "The camera follows behind a white vintage SUV with a black roof rack as it speeds up a steep dirt road surrounded by pine trees on a steep mountain slope, dust kicks up from it’s tires, the sunlight shines on the SUV as it speeds along the dirt road, casting a warm glow over the scene. The dirt road curves gently into the distance, with no other cars or vehicles in sight. The trees on either side of the road are redwoods, with patches of greenery scattered throughout. The car is seen from the rear following the curve with ease, making it seem as if it is on a rugged drive through the rugged terrain. The dirt road itself is surrounded by steep hills and mountains, with a clear blue sky above with wispy clouds."
Seed: 158241056

[Images: Input Image | CFG | CFG-Zero*]
Prompt: "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
Seed: 0

[Images: Input Image | CFG | CFG-Zero*]
Prompt: "Summer beach vacation style. A white cat wearing sunglasses lounges confidently on a surfboard, gently bobbing with the ocean waves under the bright sun. The cat exudes a cool, laid-back attitude. After a moment, it casually reaches into a small bag, pulls out a cigarette, and lights it. A thin stream of smoke drifts into the salty breeze as the cat takes a slow drag, maintaining its nonchalant pose beneath the clear blue sky."
Seed: 3549111921

[Images: CFG | CFG-Zero*]
Prompt: "a tiny astronaut hatching from an egg on the moon."
Seed: 105297965

[Images: CFG | CFG-Zero*]
Prompt: "Death Stranding Style. A solitary figure in a futuristic suit with a large, intricate backpack stands on a grassy cliff, gazing at a vast, mist-covered landscape composed of rugged mountains and low valleys beneath a rainy, overcast sky. Raindrops streak softly through the air, and puddles glisten on the uneven ground. Above the horizon, an ethereal, upside-down rainbow arcs downward through the gray clouds — its surreal, inverted shape adding an otherworldly touch to the haunting scene. A soft glow from distant structures illuminates the depth of the valley, enhancing the mysterious atmosphere. The contrast between the rain-soaked greenery and jagged rocky terrain adds texture and detail, amplifying the sense of solitude, exploration, and the anticipation of unknown adventures beyond the horizon."
Seed: 875187112
LoRA: https://civitai.com/models/46080/death-stranding

[Images: CFG | CFG-Zero*]
Prompt: "In an ornate, historical hall, a massive tidal wave peaks and begins to crash. A man is surfing, cinematic film shot in 35mm. High quality, high defination."
Seed: 376559893

[Images: CFG | CFG-Zero*]
Prompt: "A capybara holding a sign that reads Hello World"
Seed: 811677707

[Images: Source Image | CFG | CFG-Zero*]

[Images: CFG | CFG-Zero*]
Prompt: "A vibrant cherry red sports car sits proudly under the gleaming sun, its polished exterior smooth and flawless, casting a mirror-like reflection. The car features a low, aerodynamic body, angular headlights that gaze forward like predatory eyes, and a set of black, high-gloss racing rims that contrast starkly with the red. A subtle hint of chrome embellishes the grille and exhaust, while the tinted windows suggest a luxurious and private interior. The scene conveys a sense of speed and elegance, the car appearing as if it's about to burst into a sprint along a coastal road, with the ocean's azure waves crashing in the background."
Seed: 42

[Images: CFG | CFG-Zero*]
Prompt: "A cat holding a sign that says \"Hi-Dreams.ai\"."
Seed: 0