├── LICENSE ├── README.md └── assets ├── efficiency.jpg ├── flash_attn.jpg ├── high_resolution.jpg ├── logo.jpg ├── logo2.jpg ├── pipeline.jpg ├── results.jpg ├── results2.jpg ├── scaling.jpg ├── teaser.jpg └── visual.jpg /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 Hang Guo 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |

2 | 3 |

4 | 5 |
6 | 7 | **2K resolution image generation on a single 3090 GPU** 🏔️ 8 | 9 | 10 |

11 | FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning 12 |

13 | 14 | [Hang Guo](https://csguoh.github.io/), [Yawei Li](https://yaweili.bitbucket.io/), [Taolin Zhang](https://github.com/taolinzhang), [Jiangshan Wang](https://scholar.google.com.hk/citations?user=HoKoCv0AAAAJ&hl=zh-CN&oi=ao), [Tao Dai](https://scholar.google.com.hk/citations?user=MqJNdaAAAAAJ&hl=zh-CN&oi=ao), [Shu-Tao Xia](https://scholar.google.com.hk/citations?hl=zh-CN&user=koAXTXgAAAAJ), [Luca Benini](https://ee.ethz.ch/the-department/people-a-z/person-detail.luca-benini.html) 15 | 16 | ![visitors](https://visitor-badge.laobi.icu/badge?page_id=cshguo.FastVAR) 17 | [![arXiv](https://img.shields.io/badge/arXiv-2503.23367-b31b1b.svg)](https://arxiv.org/pdf/2503.23367) 18 | 19 |
20 | 21 | > **Abstract:** Visual Autoregressive (VAR) modeling has gained popularity for its shift towards next-scale prediction. However, existing VAR paradigms process the entire token map at each scale step, causing complexity and runtime to scale dramatically with image resolution. To address this challenge, we propose FastVAR, a post-training acceleration method for efficient resolution scaling with VARs. Our key finding is that the majority of latency arises from the large-scale steps, where most tokens have already converged. Leveraging this observation, we develop a cached token pruning strategy that forwards only pivotal tokens for scale-specific modeling while using cached tokens from previous scale steps to restore the pruned slots. This significantly reduces the number of forwarded tokens and improves efficiency at larger resolutions. Experiments show that the proposed FastVAR can further speed up FlashAttention-accelerated VAR by 2.7× with a negligible performance drop of <1%. We further extend FastVAR to zero-shot generation of higher-resolution images. In particular, FastVAR can generate a 2K image with a 15GB memory footprint in 1.5s on a single NVIDIA 3090 GPU. 22 | 23 | 24 | 25 | ⭐ If this work is helpful to you, please star this repo. Thanks! 🤗 26 | 27 | ## ✨ Highlights 28 | 29 | 30 | 1️⃣ **Faster VAR Generation without Perceptual Loss** 31 | 32 |

33 | 34 |

35 | 36 | 2️⃣ **High-resolution Image Generation (even 2K images on a single 3090 GPU)** 37 | 38 |

39 | 40 |

41 | 42 | 43 | 3️⃣ **Promising Resolution Scalability (almost linear complexity)** 44 | 45 |

46 | 47 |
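To give a rough sense of why the scalability highlight above is close to linear, here is a small back-of-the-envelope count of forwarded tokens per scale step. The scale schedule and the fixed keep budget below are invented purely for illustration and are not the actual settings of VAR, HART, or Infinity.

```python
# Illustrative only: an assumed next-scale schedule and an assumed cap on the
# number of "pivotal" tokens kept at the large-scale steps.
scales = [1, 2, 4, 8, 16, 32, 64]              # hypothetical token-map side lengths
full = [s * s for s in scales]                  # tokens a vanilla VAR forwards per step
budget = 1024                                   # hypothetical per-step pivotal-token cap
pruned = [min(s * s, budget) for s in scales]   # tokens forwarded with cached token pruning

print("vanilla VAR tokens per step :", full)    # the last few steps dominate the total cost
print("pruned tokens per step      :", pruned)  # large steps are capped at the budget
print("totals:", sum(full), "vs", sum(pruned))
```

Because the number of forwarded tokens at the largest scales is capped, the per-step attention and MLP cost stops growing with the full token map, which is what makes resolution scaling nearly linear.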

48 | 49 | 50 | ## 📑 Contents 51 | 52 | - [News](#news) 53 | - [TODO](#todo) 54 | - [Pipeline](#pipeline) 55 | - [Results](#results) 56 | - [Citation](#cite) 57 | 58 | ## 🆕 News 59 | 60 | - **2025-03-30:** arXiv paper available. 61 | - **2025-04-04:** This repo is released. 62 | 63 | 64 | ## ☑️ TODO 65 | 66 | - [x] arXiv version available 67 | - [ ] Release code 68 | - [ ] Further improvements 69 | 70 | 71 | ## 👀 Pipeline 72 | 73 | Our FastVAR introduces **"cached token pruning"**, which operates on the large-scale steps of VAR models and is **training-free** and **generic** across various VAR backbones. A minimal conceptual sketch is given below. 74 | 75 |

76 | 77 |
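The snippet below is a minimal PyTorch-style sketch of one large-scale step with cached token pruning, written only to make the idea concrete ahead of the official code release. The pivotal-token criterion (change relative to the cached previous-scale tokens), the `keep_ratio`, and the `blocks` interface are assumptions for illustration, not the actual FastVAR implementation.

```python
import torch

def cached_token_pruning_step(x, cached_prev, blocks, keep_ratio=0.4):
    """Illustrative sketch of a FastVAR-style large-scale step.

    x:           (B, N, C) token map of the current scale step
    cached_prev: (B, N, C) token map cached from the previous scale step,
                 already upsampled to the current resolution
    blocks:      iterable of transformer blocks mapping (B, k, C) -> (B, k, C)

    The scoring rule and keep_ratio are assumptions for illustration only.
    """
    B, N, C = x.shape
    k = max(1, int(N * keep_ratio))

    # 1) Pick "pivotal" tokens, here assumed to be those that changed most
    #    with respect to the cached previous-scale tokens.
    change = (x - cached_prev).norm(dim=-1)            # (B, N)
    keep_idx = change.topk(k, dim=1).indices           # (B, k)
    gather_idx = keep_idx.unsqueeze(-1).expand(-1, -1, C)

    # 2) Forward only the pivotal tokens through the transformer blocks.
    x_keep = torch.gather(x, 1, gather_idx)            # (B, k, C)
    for blk in blocks:
        x_keep = blk(x_keep)

    # 3) Restore the pruned slots from the cache and write the computed
    #    tokens back into their original positions.
    out = cached_prev.clone()
    out.scatter_(1, gather_idx, x_keep)
    return out
```

Only the k pivotal tokens pass through attention and MLP layers, so the cost of the largest scale steps no longer grows with the full token map, while the cached tokens keep the pruned slots filled.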

78 | 79 | 80 | ## 🥇 Results 81 | 82 | Our FastVAR achieves a **2.7x** speedup with a **<1%** performance drop, even on top of [FlashAttention](https://arxiv.org/abs/2205.14135)-accelerated setups. 83 | 84 | Detailed results can be found in the paper. 85 | 86 |
87 | Quantitative Results on the GenEval benchmark (click to expand) 88 | 89 |

90 | 91 |

92 |
93 | 94 | 95 |
96 | Quantitative Results on the MJHQ30K benchmark (click to expand) 97 | 98 |

99 | 100 |

101 |
102 | 103 | 104 |
105 | Comparison and combination with FlashAttention (click to expand) 106 | 107 |

108 | 109 |

110 |
111 | 112 | 113 | 114 | ## 🥰 Citation 115 | 116 | Please cite us if our work is useful for your research. 117 | 118 | ``` 119 | @article{guo2025fastvar, 120 | title={FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning}, 121 | author={Guo, Hang and Li, Yawei and Zhang, Taolin and Wang, Jiangshan and Dai, Tao and Xia, Shu-Tao and Benini, Luca}, 122 | journal={arXiv preprint arXiv:2503.23367}, 123 | year={2025} 124 | } 125 | ``` 126 | 127 | ## License 128 | 129 | Since this work is built on pre-trained VAR models, users should follow the licenses of the corresponding backbone models, such as [HART (MIT License)](https://github.com/mit-han-lab/hart) and [Infinity (MIT License)](https://github.com/FoundationVision/Infinity?tab=readme-ov-file). 130 | 131 | 132 | ## Contact 133 | 134 | If you have any questions, feel free to contact me at cshguo@gmail.com. 135 | 136 | -------------------------------------------------------------------------------- /assets/efficiency.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/efficiency.jpg -------------------------------------------------------------------------------- /assets/flash_attn.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/flash_attn.jpg -------------------------------------------------------------------------------- /assets/high_resolution.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/high_resolution.jpg -------------------------------------------------------------------------------- /assets/logo.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/logo.jpg -------------------------------------------------------------------------------- /assets/logo2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/logo2.jpg -------------------------------------------------------------------------------- /assets/pipeline.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/pipeline.jpg -------------------------------------------------------------------------------- /assets/results.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/results.jpg -------------------------------------------------------------------------------- /assets/results2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/results2.jpg -------------------------------------------------------------------------------- /assets/scaling.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/scaling.jpg 
-------------------------------------------------------------------------------- /assets/teaser.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/teaser.jpg -------------------------------------------------------------------------------- /assets/visual.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/csguoh/FastVAR/e1c2112799733b174152c6fd6ded32230fda8fbc/assets/visual.jpg --------------------------------------------------------------------------------