# (ICLR 2025) Qihoo-T2X

This is the official reproduction of [Qihoo-T2X](https://360cvgroup.github.io/Qihoo-T2X/), which represents a groundbreaking DiT architecture paradigm designed for Text-to-Any tasks.

**[QIHOO-T2X: AN EFFICIENT PROXY-TOKENIZED DIFFUSION TRANSFORMER FOR TEXT-TO-ANY-TASK](https://arxiv.org/pdf/2409.04005)**

Jing Wang*, Ao Ma*†, Jiasong Feng*, Dawei Leng‡, Yuhui Yin, Xiaodan Liang‡ (*Equal Contribution, †Project Lead, ‡Corresponding Authors)

[![arXiv](https://img.shields.io/badge/arXiv-2409.04005-b31b1b.svg)](https://arxiv.org/pdf/2409.04005)
[![Project Page](https://img.shields.io/badge/Project-Website-green)](https://360cvgroup.github.io/Qihoo-T2X/)
[![weixin](https://img.shields.io/badge/-WeChat@量子位-000000?logo=wechat&logoColor=07C160)](https://mp.weixin.qq.com/s/UUqtHn7f8zdeINA9eUNlFg)


## 📰 News
- **[2025.02.11]** 🔥 We have open-sourced our model and inference code in [Ascend/MindSpeed-MM](https://gitee.com/ascend/MindSpeed-MM/tree/master/examples/qihoo_t2x#https://gitee.com/link?target=https%3A%2F%2Fhuggingface.co%2FQwen%2FQwen2-VL-2B-Instruct%2Ftree%2Fmain).
- **[2025.01.22]** 🔥 Our paper has been accepted for presentation at ICLR 2025.
- **[2024.09.12]** We created a project [homepage](https://360cvgroup.github.io/Qihoo-T2X/) featuring galleries for Qihoo-T2X.


## BibTeX
```
@misc{wang2024qihoot2xefficiencyfocuseddiffusiontransformer,
      title={Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task},
      author={Jing Wang and Ao Ma and Jiasong Feng and Dawei Leng and Yuhui Yin and Xiaodan Liang},
      year={2024},
      eprint={2409.04005},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2409.04005},
}
```