├── .gitignore
├── README.md
└── assets
    ├── aiwalker.png
    └── zhushou.png

/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# system
.DS_Store

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# ICCV2023-paper-code
![](./assets/aiwalker.png)

Continuously updated ICCV 2023 papers, code, and related information. Welcome to follow AIWalker. The list mainly focuses on the directions below; for more CV/AI resources, add the AIWalker assistant [AIWalker-zhushou] (scan the QR code at the bottom of this page).


- [Backbone](#backbone)
- [Detection](#detection)
- [Segmentation](#segmentation)
- [Knowledge Distillation](#knowledge-distillation)
- [Diffusion](#diffusion)
- [Depth](#depth)
- [Restoration](#restoration)
- [Super-Resolution](#super-resolution)
- [Deblurring](#deblurring)
- [Low-Light Image Enhancement](#low-light-image-enhancement)
- [IQA/IAA](#iqaiaa)
- [Other](#other)
- [Dataset](#dataset)
- ....


## Backbone
ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices
- Paper: https://arxiv.org/pdf/2303.09730.pdf

Rethinking Mobile Block for Efficient Attention-based Models
- Paper: https://arxiv.org/abs/2301.01146
- Code: https://github.com/zhangzjn/EMO

UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
- Paper: https://arxiv.org/abs/2211.09552
- Code: https://github.com/OpenGVLab/UniFormerV2

Unmasked Teacher: Towards Training-Efficient Video Foundation Models
- Paper: https://arxiv.org/abs/2303.16058
- Code: https://github.com/OpenGVLab/unmasked_teacher

A Unified Continual Learning Framework with General Parameter-Efficient Tuning
- Paper: https://arxiv.org/abs/2303.10070
- Code: https://github.com/gqk/LAE

Scale-Aware Modulation Meet Transformer
- Paper: https://arxiv.org/abs/2307.08579
- Code: https://github.com/AFeng-x/SMT

Improving Zero-Shot Generalization for CLIP with Synthesized Prompts
- Paper: https://arxiv.org/abs/2307.07397
- Code: https://github.com/mrflogs/SHIP

DreamTeacher: Pretraining Image Backbones with Deep Generative Models
- Paper: https://arxiv.org/abs/2307.07487
- Home: https://research.nvidia.com/labs/toronto-ai/DreamTeacher/

ShiftNAS: Improving One-shot NAS via Probability Shift
- Paper: https://arxiv.org/abs/2307.08300
- Code: https://github.com/bestfleer/ShiftNAS

MULLER: Multilayer Laplacian Resizer for Vision
- Paper: https://arxiv.org/pdf/2304.02859.pdf

FLatten Transformer: Vision Transformer with Focused Linear Attention
- Paper: TODO
- Code: https://github.com/LeapLabTHU/FLatten-Transformer

Not All Features Matter: Enhancing Few-shot CLIP with Adaptive Prior
- Paper: https://arxiv.org/abs/2304.01195
- Code: https://github.com/yangyangyang127/APE

Tuning Pre-trained Model via Moment Probing
- Paper: https://arxiv.org/abs/2307.11342
- Code: https://github.com/mingzeG/Moment-Probing

Strip-MLP: Efficient Token Interaction for Vision MLP
- Paper: https://arxiv.org/pdf/2307.11458.pdf
- Code: https://github.com/Med-Process/Strip_MLP

Adaptive Frequency Filters As Efficient Global Token Mixers
- Paper: https://arxiv.org/abs/2307.14008

Learning Concise and Descriptive Attributes for Visual Recognition
- Paper: https://arxiv.org/abs/2308.03685

## Detection
FemtoDet: an object detection baseline for energy versus performance tradeoffs
- Paper: https://arxiv.org/pdf/2301.06719.pdf
- Code: https://github.com/yh-pengtu/FemtoDet

Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment
- Paper: https://arxiv.org/pdf/2207.13085.pdf
- Code: https://github.com/Atten4Vis/GroupDETR

Large Selective Kernel Network for Remote Sensing Object Detection
- Paper: https://arxiv.org/abs/2303.09030
- Code: https://github.com/zcablii/LSKNet

DiffusionDet: Diffusion Model for Object Detection
- Paper: https://arxiv.org/abs/2211.09788
- Code: https://github.com/ShoufaChen/DiffusionDet

DETRs with Collaborative Hybrid Assignments Training
- Paper: https://arxiv.org/pdf/2211.12860.pdf
- Code: https://github.com/Sense-X/Co-DETR

MIMDet: Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
- Paper: https://arxiv.org/abs/2204.02964
- Code: https://github.com/hustvl/MIMDet

Detection Transformer with Stable Matching
- Paper: https://arxiv.org/abs/2304.04742
- Code: https://github.com/IDEA-Research/Stable-DINO

Random Boxes Are Open-world Object Detectors
- Paper: https://arxiv.org/abs/2307.08249
- Code: https://github.com/scuwyh2000/RandBox

AlignDet: Aligning Pre-training and Fine-tuning in Object Detection
- Paper: https://arxiv.org/abs/2307.11077
- Home: https://liming-ai.github.io/AlignDet

Cascade-DETR: Delving into High-Quality Universal Object Detection
- Paper: https://arxiv.org/abs/2307.11035
- Code: https://github.com/SysCV/cascade-detr

Deep Directly-Trained Spiking Neural Networks for Object Detection
- Paper: https://arxiv.org/abs/2307.11411

COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts
- Paper: https://arxiv.org/abs/2307.12730
- Code: https://github.com/alibaba/easyrobust/tree/main/benchmarks/coco_o

Less is More: Focus Attention for Efficient DETR
- Paper: https://arxiv.org/abs/2307.12612
- Code: https://github.com/huawei-noah/noah-research/tree/master/Focus-DETR

Spatial Self-Distillation for Object Detection with Inaccurate Bounding Boxes
- Paper: https://arxiv.org/abs/2307.12101
- Code: https://github.com/ucas-vg/PointTinyBenchmark/tree/SSD-Det

RecursiveDet: End-to-End Region-based Recursive Object Detection
- Paper: https://arxiv.org/abs/2307.13619
- Code: https://github.com/bravezzzzzz/RecursiveDet

## Segmentation
Segment Anything
- Home: https://segment-anything.com/
- Paper: https://arxiv.org/abs/2304.02643
- Code: https://github.com/facebookresearch/segment-anything

SegGPT: Segmenting Everything in Context
- Paper: https://arxiv.org/abs/2304.03284
- Code: https://github.com/baaivision/Painter

VLPart: Going Denser with Open-Vocabulary Part Segmentation
- Paper: https://arxiv.org/abs/2305.11173
- Code: https://github.com/facebookresearch/VLPart

Referring Image Segmentation Using Text Supervision
- Paper: TODO
- Code: https://github.com/fawnliu/WRIS_ICCV2023

EfficientViT: Lightweight Multi-Scale Attention for On-Device Semantic Segmentation
- Paper: https://arxiv.org/pdf/2205.14756.pdf
- Code: https://github.com/mit-han-lab/efficientvit

A Simple Framework for Open-Vocabulary Segmentation and Detection
- Paper: https://arxiv.org/pdf/2303.08131.pdf
- Code: https://github.com/IDEA-Research/OpenSeeD

Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation
- Paper: https://arxiv.org/abs/2303.13399

Dynamic Snake Convolution based on Topological Geometric Constraints for Tubular Structure Segmentation
- Paper: https://arxiv.org/abs/2307.08388

OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation
- Paper: https://arxiv.org/abs/2307.09356
- Code: https://github.com/wudongming97/OnlineRefer

Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation
- Paper: https://arxiv.org/abs/2307.11545
- Code: https://github.com/kkakkkka/ETRIS

Exploring Transformers for Open-world Instance Segmentation
- Paper: https://arxiv.org/abs/2308.04206


## Knowledge Distillation
From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels
- Paper: https://arxiv.org/abs/2303.13005
- Code: https://github.com/yzd-v/cls_KD

DOT: A Distillation-Oriented Trainer
- Paper: https://arxiv.org/abs/2307.08436

Cumulative Spatial Knowledge Distillation for Vision Transformers
- Paper: https://arxiv.org/abs/2307.08500

Class-relation Knowledge Distillation for Novel Class Discovery
- Paper: https://arxiv.org/abs/2307.09158
- Code: https://github.com/kleinzcy/Cr-KD-NCD

EMQ: Evolving Training-free Proxies for Automated Mixed Precision Quantization
- Code: https://github.com/lilujunai/EMQ-series

Rethinking Data Distillation: Do Not Overlook Calibration
- Paper: https://arxiv.org/abs/2307.12463

## Diffusion
MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing
- Paper: https://arxiv.org/abs/2304.08465
- Home: https://ljzycmd.github.io/projects/MasaCtrl/

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
- Paper: https://arxiv.org/abs/2212.11565
- Code: https://github.com/showlab/Tune-A-Video

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
- Paper: https://arxiv.org/abs/2303.09535
- Code: https://github.com/ChenyangQiQi/FateZero

Expressive Text-to-Image Generation with Rich Text
- Paper: https://arxiv.org/abs/2304.06720
- Home: https://rich-text-to-image.github.io/

Ablating Concepts in Text-to-Image Diffusion Models
- Paper: https://arxiv.org/abs/2303.13516
- Home: https://www.cs.cmu.edu/~concept-ablation/
- Code: https://github.com/nupurkmr9/concept-ablation

Evaluating Data Attribution for Text-to-Image Models
- Paper: https://arxiv.org/abs/2306.09345
- Home: https://peterwang512.github.io/GenDataAttribution/
- Code: https://github.com/peterwang512/GenDataAttribution

Masked Diffusion Transformer is a Strong Image Synthesizer
- Paper: TODO
- Code: TODO

SVDiff: Compact Parameter Space for Diffusion Fine-tuning
- Paper: https://arxiv.org/abs/2303.11305

BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
- Paper: https://arxiv.org/abs/2307.10816

TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition
- Code: https://github.com/Shilin-LU/TF-ICON


## Depth
Neural Video Depth Stabilizer
- Paper: https://arxiv.org/abs/2307.08695

Kick Back & Relax: Learning to Reconstruct the World by Watching SlowTV
- Paper: https://arxiv.org/abs/2307.10713
- Code: https://github.com/jspenmar/slowtv_monodepth

MAMo: Leveraging Memory and Attention for Monocular Video Depth Estimation
- Paper: https://arxiv.org/abs/2307.14336

Uncertainty Guided Adaptive Warping for Robust and Efficient Stereo Matching
- Paper: https://arxiv.org/abs/2307.14071

VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation
- Paper: https://arxiv.org/abs/2303.08340
- Code: https://github.com/XiaoyuShi97/VideoFlow

Learning Depth Estimation for Transparent and Mirror Surfaces
- Paper: https://arxiv.org/pdf/2307.15052.pdf

## Restoration
Adaptive Nonlinear Latent Transformation for Conditional Face Editing
- Paper: https://arxiv.org/abs/2307.07790
- Code: https://github.com/Hzzone/AdaTrans

Towards Authentic Face Restoration with Iterative Diffusion Models and Beyond
- Paper: https://arxiv.org/abs/2307.08996

DiffIR: Efficient Diffusion Model for Image Restoration
- Paper: https://arxiv.org/abs/2303.09472
- Code: https://github.com/Zj-BinXia/DiffIR

Physics-Driven Turbulence Image Restoration with Stochastic Refinement
- Paper: https://arxiv.org/pdf/2307.10603
- Code: https://github.com/VITA-Group/PiRN

Learning Image-Adaptive Codebooks for Class-Agnostic Image Restoration
- Paper: https://arxiv.org/abs/2306.06513
- Code: https://github.com/kechunl/AdaCode

Under-Display Camera Image Restoration with Scattering Effect
- Paper: https://arxiv.org/abs/2308.04163
- Code: https://github.com/NamecantbeNULL/SRUDC

From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal
- Paper: https://arxiv.org/abs/2308.03867

GlowGAN: Unsupervised Learning of HDR Images from LDR Images in the Wild
- Paper: https://arxiv.org/pdf/2211.12352.pdf


## Super-Resolution
SRFormer: Permuted Self-Attention for Single Image Super-Resolution
- Paper: https://arxiv.org/abs/2303.09735
- Code: https://github.com/HVision-NKU/SRFormer

SAFMN: Spatially-Adaptive Feature Modulation for Efficient Image Super-Resolution
- Paper: https://arxiv.org/pdf/2302.13800.pdf
- Code: https://github.com/sunny2109/SAFMN

DLGSANet: Lightweight Dynamic Local and Global Self-Attention Network for Image Super-Resolution
- Paper: https://arxiv.org/pdf/2301.02031.pdf
- Code: https://github.com/NeonLeexiang/DLGSANet

Dual Aggregation Transformer for Image Super-Resolution
- Paper: https://arxiv.org/abs/2308.03364
- Code: https://github.com/zhengchen1999/DAT

A Benchmark for Chinese-English Scene Text Image Super-resolution
- Paper: https://arxiv.org/abs/2308.03262
- Code: https://github.com/mjq11302010044/Real-CE

## Deblurring
Multi-scale Residual Low-Pass Filter Network for Image Deblurring
- Paper: TODO
- Code: TODO


## Low-Light Image Enhancement
Implicit Neural Representation for Cooperative Low-light Image Enhancement
- Paper: https://arxiv.org/abs/2303.11722
- Code: https://github.com/Ysz2022/NeRCo

Iterative Prompt Learning for Unsupervised Backlit Image Enhancement
- Paper: https://arxiv.org/abs/2303.17569
- Home: https://zhexinliang.github.io/CLIP_LIT_page/

ExposureDiffusion: Learning to Expose for Low-light Image Enhancement
- Paper: https://arxiv.org/abs/2307.07710

## IQA/IAA
Delegate Transformer for Image Color Aesthetics Assessment
- Paper: TODO
- Code: https://github.com/woshidandan/Image-Color-Aesthetics-Assessment

Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives
- Paper: https://arxiv.org/abs/2211.04894
- Code: https://github.com/VQAssessment/DOVER

On the Effectiveness of Spectral Discriminators for Perceptual Quality Improvement
- Paper: https://arxiv.org/abs/2307.12027
- Code: https://github.com/Luciennnnnnn/DualFormer

## Other
Fast Full-frame Video Stabilization with Iterative Optimization
- Paper: https://arxiv.org/abs/2307.12774


## Dataset
LPFF: A Portrait Dataset for Face Generators Across Large Poses
- Paper: https://arxiv.org/abs/2303.14407
- Code: https://github.com/oneThousand1000/LPFF-dataset


## AIWalker Assistant
![](./assets/zhushou.png)
--------------------------------------------------------------------------------
/assets/aiwalker.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HappyAIWalker/ICCV2023-paper-code/32bc92637edd0204dc53db40bb1ff6b5d0c98fe5/assets/aiwalker.png
--------------------------------------------------------------------------------
/assets/zhushou.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/HappyAIWalker/ICCV2023-paper-code/32bc92637edd0204dc53db40bb1ff6b5d0c98fe5/assets/zhushou.png
--------------------------------------------------------------------------------