├── 论文复现-lab4ai.cn创作者群.png
├── contributor-kit
│   ├── 1-选题阶段-论文筛选表.xlsx
│   ├── 2-成果提交-项目上架表.xlsx
│   └── README.md
├── .github
│   ├── workflows
│   │   ├── sync-to-gitee.yml
│   │   └── pages.yml
│   ├── REWARDS.md
│   └── ISSUE_TEMPLATE
│       └── paper_application.yml
├── LICENSE
├── REWARDS.md
├── CRITERIA.md
├── DELIVERABLES.md
├── WORKFLOW.md
├── .gitignore
├── generate_html.py
├── README.md
└── data.csv

/论文复现-lab4ai.cn创作者群.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Lab4AI-Hub/PaperHub/HEAD/论文复现-lab4ai.cn创作者群.png
--------------------------------------------------------------------------------
/contributor-kit/1-选题阶段-论文筛选表.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Lab4AI-Hub/PaperHub/HEAD/contributor-kit/1-选题阶段-论文筛选表.xlsx
--------------------------------------------------------------------------------
/contributor-kit/2-成果提交-项目上架表.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Lab4AI-Hub/PaperHub/HEAD/contributor-kit/2-成果提交-项目上架表.xlsx
--------------------------------------------------------------------------------
/contributor-kit/README.md:
--------------------------------------------------------------------------------
1 | # Contributor Kit (贡献者工具包)
2 | 
3 | Hello, and welcome to the Lab4AI Contributor Kit!
4 | 5 | 这个文件夹包含了您在参与论文复现项目时,所需要的信息收集表格和模板文件。请根据 **[贡献指南 (CONTRIBUTING.md)](../CONTRIBUTING.md)** 中的步骤,来使用这里的文件。 6 | 7 | **文件列表:** 8 | - `1-选题阶段-论文筛选表.xlsx` 9 | - `2-成果提交-项目上架表.xlsx` 10 | - `3-代码参考-Notebook示例.ipynb` 11 | -------------------------------------------------------------------------------- /.github/workflows/sync-to-gitee.yml: -------------------------------------------------------------------------------- 1 | # sync-to-gitee.yml (全新终极版) 2 | 3 | name: Sync to Gitee 4 | 5 | on: 6 | push: 7 | branches: 8 | - main # 确保这是您的主分支名 9 | 10 | jobs: 11 | mirror: 12 | runs-on: ubuntu-latest 13 | steps: 14 | - name: 'Checkout' 15 | uses: actions/checkout@v3 16 | with: 17 | fetch-depth: 0 18 | - name: 'Mirror to Gitee' 19 | uses: pixta-dev/repository-mirroring-action@v1 20 | with: 21 | target_repo_url: 22 | ${{ secrets.GITEE_REPO_URL }} 23 | ssh_private_key: # 这个名字虽然叫 ssh_private_key,但我们实际上会传入一个空值,因为我们不用它了 24 | false 25 | -------------------------------------------------------------------------------- /.github/REWARDS.md: -------------------------------------------------------------------------------- 1 | ### **💎 Lab4AI 创作者激励计划 💎** 2 | 3 | 我们为不同类型的贡献提供了丰厚且多样化的算力支持,助您在AI的探索之路上行得更远! 4 | 5 | #### **核心贡献:论文复现** 6 | *(每位贡献者可申请1-2篇)* 7 | 8 | * **作者亲临:复现本人署名论文 (顶会/顶刊)** 9 | * 获得价值 **¥800 / 篇** 的算力支持 10 | * *构成:¥300 基础复现算力 + ¥300 完成奖励 + ¥200 原作者特别贡献奖励* 11 | 12 | * **前沿探索:复现他人优秀论文 (顶会/顶刊)** 13 | * 获得价值 **¥600 / 篇** 的算力支持 14 | * *构成:¥300 基础复现算力 + ¥300 完成奖励* 15 | 16 | #### **社区分享与推广** 17 | 18 | * **直播分享** 19 | * 参加我们的官方直播,深度解析您复现的论文,将获得价值 **¥2000 / 篇** 的算力奖励。 20 | 21 | * **内容传播** 22 | * 在您的个人公众号、博客、知乎等平台,撰写文章宣传和介绍 Lab4AI 平台,将获得价值 **¥200 / 篇** 的算力奖励。 23 | 24 | * **邀请新成员** 25 | * 分享您的个人邀请码,每成功邀请一位新用户注册并登录平台,您将获得 **¥10 / 人** 的算力奖励! 26 | 27 | --- 28 | 您的每一次分享,都是社区最宝贵的财富。再次感谢您的时间与努力! 
29 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2025 ReproCode-Lab 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /REWARDS.md: -------------------------------------------------------------------------------- 1 | # 💎 Lab4AI 论文复现创作者激励计划 💎 2 | 3 | ## 引言 4 | 5 | 在Lab4AI,我们坚信每一次高质量的论文复现都是对AI知识边界的宝贵拓展。为了感谢并激励每一位投身于这项工作的贡献者,我们特别设立了“创作者激励计划”,为您提供实实在在的算力支持。 6 | 7 | 所有成功的贡献者,除了获得算力奖励外,都将被荣誉地加入到我们官方GitHub组织的 **贡献者名单** 中。 8 | 9 | ## 🎉 新人与社区福利 (New!) 10 | 11 | 我们为每一位新加入的AI探索者准备了入门算力: 12 | 13 | * **注册有礼:** 新用户注册账号,即可自动获得 ¥30 平台算力。 14 | * **入群福利:** 加入官方交流群,联系社区运营人员,可额外获得 ¥20 平台算力。 15 | 16 | ## 核心贡献:论文复现算力支持 17 | 18 | 我们根据复现论文的类型,提供以下分级、分阶段的算力奖励: 19 | 20 | ### 1. 
作者亲临:复现本人署名论文 (顶会/顶刊) 21 | 22 | 如果您选择复现的是您本人署名、且发表在高水平会议/期刊的论文: 23 | 24 | > 您将获得总价值 ¥400 / 篇 的平台算力支持。 25 | 26 | **奖励构成 (分阶段发放):** 27 | 28 | * **¥100 基础启动算力:** 在您的选题申请通过审核、任务正式启动时发放。 29 | * **¥300 成功上架奖励:** 在您的复现成果最终审核通过并成功上架后发放。(包含 ¥100 基础复现奖励 + ¥200 原作者特别贡献奖励) 30 | 31 | ### 2. 前沿探索:复现他人优秀论文 (顶会/顶刊) 32 | 33 | 如果您选择复现的是其他作者发表在高水平会议/期刊的优秀论文: 34 | 35 | > 您将获得总价值 ¥200 / 篇 的平台算力支持。 36 | 37 | **奖励构成 (分阶段发放):** 38 | 39 | * **¥100 基础启动算力:** 在您的选题申请通过审核、任务正式启动时发放。 40 | * **¥100 成功上架奖励:** 在您的复现成果最终审核通过并成功上架后发放。 41 | 42 | ## 奖励发放说明 43 | 44 | * 算力奖励通过直接充值到账户或者以算力兑换码的形式发放,请注意查收。 45 | * 请确保在任务启动前(或申请新人福利时),向我们的社区运营人员提供您准确的用户ID或注册手机号。 46 | * 奖励算力有效期通常为 1年,具体以充值到账时的说明为准。 47 | 48 | **我们期待您的加入,与我们一同构建一个充满活力、知识共享的AI实践社区!** 49 | -------------------------------------------------------------------------------- /.github/workflows/pages.yml: -------------------------------------------------------------------------------- 1 | # --- 最终黄金版本:pages.yml --- 2 | name: Build and Deploy Paper List to GitHub Pages 3 | 4 | on: 5 | push: 6 | branches: 7 | - main 8 | workflow_dispatch: 9 | 10 | permissions: 11 | contents: read 12 | pages: write 13 | id-token: write 14 | 15 | concurrency: 16 | group: "pages" 17 | cancel-in-progress: true 18 | 19 | jobs: 20 | build: 21 | runs-on: ubuntu-latest 22 | steps: 23 | - name: Checkout repository 24 | uses: actions/checkout@v4 25 | 26 | - name: Set up Python 27 | uses: actions/setup-python@v5 28 | with: 29 | python-version: '3.10' 30 | 31 | - name: Install dependencies 32 | run: pip install pandas lxml 33 | 34 | - name: Run script to generate webpage 35 | run: python generate_html.py 36 | 37 | - name: Upload artifact 38 | uses: actions/upload-pages-artifact@v3 39 | with: 40 | path: ./dist 41 | 42 | deploy: 43 | needs: build 44 | environment: 45 | name: github-pages 46 | url: ${{ steps.deployment.outputs.page_url }} 47 | runs-on: ubuntu-latest 48 | steps: 49 | - name: Deploy to GitHub Pages 50 | id: deployment 51 | uses: actions/deploy-pages@v4 52 | 
-------------------------------------------------------------------------------- /CRITERIA.md: -------------------------------------------------------------------------------- 1 | # 论文筛选与复现标准 (Paper Selection & Reproduction Criteria) 2 | 3 | ### 引言 4 | 5 | 一篇高质量的论文复现,是对前沿学术思想的深度解读和宝贵实践。为了确保Lab4AI社区收录的论文复现项目都具备高水准、前沿性和学习价值,我们制定了以下筛选标准。 6 | 7 | 当您推荐一篇新论文时,或者我们评估一篇论文是否值得被纳入官方清单时,主要会从**学术影响力**、**技术价值**和**复现可行性**三个维度进行综合考量。 8 | 9 | --- 10 | 11 | ### **一、 学术影响力 (Academic Impact)** 12 | 13 | 我们优先考虑那些在学术界和社区中已获得广泛认可的论文。 14 | 15 | * **会议/期刊级别 (Venue Tier)** 16 | * 论文应发表在**公认的高水平学术会议或期刊**上。(例如:CCF-A/B类会议,中科院一/二区期刊等)。**顶会/顶刊论文优先**。 17 | 18 | * **发表时间 (Recency)** 19 | * 我们鼓励复现**近3年内**发表的论文,以保证内容的前沿性。特别经典或有重大影响力的早期论文也可酌情考虑。 20 | 21 | * **社区关注度 (Community Interest)** 22 | * 该论文是否在开源社区(如GitHub)已有相关的实现或引起了较多讨论?这通常意味着论文具备较高的实践价值和复现可行性。 23 | 24 | --- 25 | 26 | ### **二、 技术价值与复现范围 (Technical Value & Scope)** 27 | 28 | 我们关注论文本身的技术深度以及复现工作的价值。 29 | 30 | * **创新性与启发性 (Innovation & Insight)** 31 | * 论文是否在**模型架构、训练方法、核心算法**或**应用思路**上提出了显著的创新?复现它能否为社区带来新的启发? 
32 | 33 | * **复现范围要求 (Reproduction Scope)** 34 | * 我们希望复现工作能涵盖论文的核心贡献。因此,推荐的论文应能够支持**预训练 (Pre-training)、微调 (Fine-tuning) 或 推理 (Inference)** 中的**至少一项或多项**。 35 | * **请注意**:**纯粹只有推理阶段**的项目或论文,通常**不**在我们优先考虑的范围内。 36 | 37 | --- 38 | 39 | ### **三、 复现可行性 (Feasibility)** 40 | 41 | 为了确保贡献者的投入能顺利产出成果,并保证成果能被社区其他成员复用,我们对论文的可行性有如下要求: 42 | 43 | * **代码与资源的开放性** 44 | * 原论文应提供**开源的核心代码**,或者其方法描述足够清晰,具备较高的可复现潜力。 45 | * 实验依赖的**核心模型和数据集**必须是公开可获取的。 46 | 47 | * **硬件资源要求 (H-Card Compatibility)** 48 | * 论文核心流程(训练/微调/推理)所需的计算资源,应在**Lab4AI平台可支持的范围内**。 49 | * **硬性要求**:原则上,核心实验必须能够在我们提供的**H系列计算卡(如H800)**或其他同等级别的硬件上完成。 50 | 51 | * **清晰度与复现门槛** 52 | * 原论文对**实验设置、超参数、环境配置**等细节的描述是否清晰、完整,以降低社区成员的复现门槛。 53 | 54 | --- 55 | 56 | ### **总结** 57 | 58 | 我们最希望看到的,是那些**发表在高水平会议/期刊、技术上有亮点、与当前热点相关,并且具备在Lab4AI平台H卡环境下成功复现潜力**的论文。 59 | 60 | 如果您不确定您的想法是否完全符合所有标准,**请不要犹豫,仍然可以通过Issue向我们推荐**!我们非常乐意与您一同探讨和评估论文的价值。 61 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/paper_application.yml: -------------------------------------------------------------------------------- 1 | # .github/ISSUE_TEMPLATE/paper_application.yml 2 | 3 | name: "📄 论文复现申请 / 成果提交" 4 | description: "用于申请新的论文复现任务,或更新已有任务的进度/提交最终成果" 5 | title: "[论文复现] <请在此填写论文标题>" 6 | labels: ["paper-repro", "needs-review"] 7 | body: 8 | - type: markdown 9 | attributes: 10 | value: | 11 | ### 感谢您参与 Lab4AI 的论文复现! 12 | 请根据您当前所处的阶段,填写或更新以下信息。 13 | 14 | - type: dropdown 15 | id: stage 16 | attributes: 17 | label: 当前阶段 18 | description: "请选择您本次提交Issue的目的" 19 | options: 20 | - "1. 提交选题申请" 21 | - "2. 
提交最终成果" 22 | validations: 23 | required: true 24 | 25 | - type: input 26 | id: paper-title 27 | attributes: 28 | label: 论文标题 29 | description: "请填写您希望复现或正在复现的论文完整标题。" 30 | placeholder: "例如:Attention Is All You Need" 31 | validations: 32 | required: true 33 | 34 | - type: input 35 | id: paper-link 36 | attributes: 37 | label: 论文链接 (ArXiv/期刊等) 38 | description: "请提供原始论文的公开访问链接。" 39 | placeholder: "https://arxiv.org/abs/1706.03762" 40 | validations: 41 | required: true 42 | 43 | # --- 👇 这里是新增的“所属机构”字段 👇 --- 44 | - type: input 45 | id: affiliation 46 | attributes: 47 | label: 您的所属机构 (Your Affiliation) 48 | description: "请填写您所在的学校、企业或团队名称。例如:北京大学、Lab4AI内部团队、开源社区贡献者。" 49 | placeholder: "例如:清华大学" 50 | validations: 51 | required: true # <--- 设置为必填项 52 | # --- 👆 新增字段结束 👆 --- 53 | 54 | - type: textarea 55 | id: details 56 | attributes: 57 | label: 详细说明 58 | description: | 59 | - 如果是【选题申请】,请在此处简述理由,并**拖拽上传**填写好的《选题阶段表.xlsx》。 60 | - 如果是【提交最终成果】,请在此处**拖拽上传**填写好的《论文上架信息表.xlsx》,并附上您在平台项目工作空间的**只读分享链接**(如有)。 61 | placeholder: "请根据您选择的“当前阶段”进行详细说明..." 62 | validations: 63 | required: true 64 | 65 | - type: checkboxes 66 | id: checklist 67 | attributes: 68 | label: "确认清单 (仅选题申请时必填)" 69 | options: 70 | - label: 我已阅读并理解所有贡献指南文档。 71 | - label: 我确认我推荐的论文符合筛选标准。 72 | - label: 我已为本项目点亮Star ⭐。 73 | -------------------------------------------------------------------------------- /DELIVERABLES.md: -------------------------------------------------------------------------------- 1 | # 论文复现成果提交说明 (Deliverables Guide) 2 | 3 | ### 引言 4 | 5 | 恭喜您完成了论文复现!这是整个流程中最激动人心的一步。 6 | 7 | 为了能顺利地将您的成果审核、迁移并上架到Lab4AI官方平台,请在通知我们审核**之前**,仔细阅读本指南,并准备好以下**两大部分**内容: 8 | 9 | 1. **一份结构标准化的项目文件**(在您的Lab4AI工作空间内) 10 | 2. 
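**A fully completed Submission Info Sheet (《论文上架信息表》)** (you need to download it and fill it in)
11 | 
12 | ---
13 | 
14 | ### **Part 1: Project Files Checklist**
15 | 
16 | Make sure your project workspace on the Lab4AI platform strictly follows the standard file structure defined in the **[Reproducer's Guide](https://github.com/Lab4AI-Hub/PaperHub/blob/main/WORKFLOW.md)**. Our review starts from this structure.
17 | 
18 | ---
19 | 
20 | ### **Part 2: Filling In the Submission Info Sheet (Submission Info Checklist)**
21 | 
22 | This sheet is the **primary source** from which we enter your project into the platform.
23 | 
24 | * **Steps**:
25 |     1. Click the link below to **download** the Submission Info Sheet directly:
26 |        **[➡️ Download: 2-成果提交-项目上架表.xlsx](https://github.com/Lab4AI-Hub/PaperHub/raw/main/contributor-kit/2-成果提交-项目上架表.xlsx)**
27 |        *(Note: clicking the link starts the download immediately.)*
28 |     2. After downloading, **fill in every field** in the sheet (field descriptions follow).
29 |     3. When you @ the maintainers in the GitHub Issue where you applied for the task, **upload the completed sheet as an attachment**.
30 | 
31 | #### **Field Descriptions:**
32 | 
33 | ##### **Project Overview Fields (For Quick View)**
34 | * `ID`: (may be left blank; we assign IDs centrally)
35 | * `项目名称` (project name): the name you give this reproduction project.
36 | * `项目简介` (project summary): an engaging, concise overview (100-200 characters recommended).
37 | * `项目封面(4:3)` (cover image): prepare an image file with a 4:3 aspect ratio and submit it together with the sheet.
38 | * `完成时间` (completion date): the approximate date you finished the reproduction.
39 | * `项目作者` (project author): the author nickname you want displayed on the platform.
40 | * `您的user-id或注册手机号` (your user ID or registered phone number): used to issue your compute reward, so make sure it is accurate.
41 | * `标签` (tags): 3-5 of the most relevant technical or domain tags, separated by commas `,`.
42 | * `虚拟环境路径` (virtual environment path): (e.g. `/workspace/envs/xxx`)
43 | * `项目存储路径` (project storage path): (e.g. `/workspace/codelab/xxx`)
44 | * `模型原始路径` (model path): (e.g. `/workspace/codelab/xxx/model/`)
45 | * `数据原始路径` (data path): (e.g. `/workspace/codelab/xxx/dataset/`)
46 | * `代码原始路径` (code path): (e.g. `/workspace/codelab/xxx/code/`)
47 | * `你的readme原始路径` (path to your readme): (e.g. `/workspace/codelab/xxx/code/demo.ipynb`)
48 | * `所需资源` (required resources): (e.g. `1 * H800`)
49 | * `按照你的readme执行大约耗时多久` (approximate run time when following your readme): (e.g. about 2 hours)
50 | 
51 | ##### **Paper Detail Fields (For Paper Details)**
52 | * `论文名称` (paper title):
53 | * `论文作者` (paper authors):
54 | * `论文链接` (paper link):
55 | * `论文摘要` (abstract):
56 | * `论文年份` (publication year):
57 | * `来源标签(会议期刊)` (venue tag): (e.g. `CVPR`, `ACL`)
58 | * `数据集` (dataset):
59 | * `数据集介绍` (dataset description):
60 | * `github链接` (GitHub link): (official or unofficial implementation of the original paper)
61 | * `案例类型` (case type): (enter `已复现`, meaning "reproduced")
62 | * `readme`: the descriptive text shown on the platform for this reproduction project
63 | 
64 | ---
65 | 
66 | ### **Final Submission**
67 | 
68 | Once you have confirmed that **the project files are organized** and that **《论文上架信息表.xlsx》 is filled in**, go to the GitHub Issue where you applied for the task, @ the maintainers, upload the sheet as an attachment, and make your final submission!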
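
A quick self-check before uploading can catch blank fields. The sketch below is illustrative only: it assumes the sheet's single data row has already been loaded into a plain `dict` keyed by the column names listed above (how you read the `.xlsx` is up to you), it checks only a subset of those columns, and `find_problems` is a hypothetical helper, not part of any Lab4AI tooling.

```python
# Column names are copied from the field descriptions above. The list is a
# subset chosen for illustration; `ID` is skipped because it may be left blank.
REQUIRED_FIELDS = [
    "项目名称", "项目简介", "完成时间", "项目作者",
    "您的user-id或注册手机号", "标签",
    "虚拟环境路径", "项目存储路径", "模型原始路径",
    "数据原始路径", "代码原始路径", "你的readme原始路径",
    "所需资源", "论文名称", "论文作者", "论文链接",
]


def find_problems(row: dict) -> list[str]:
    """Return human-readable problems for one sheet row (empty list = OK)."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat missing keys and whitespace-only cells as blank.
        if not str(row.get(field, "")).strip():
            problems.append(f"missing: {field}")

    # The guide asks for 3-5 comma-separated tags; accept both the ASCII
    # comma and the full-width comma.
    raw_tags = str(row.get("标签", "")).replace(",", ",")
    tags = [t.strip() for t in raw_tags.split(",") if t.strip()]
    if tags and not 3 <= len(tags) <= 5:
        problems.append(f"标签: expected 3-5 tags, got {len(tags)}")
    return problems
```

An empty list means the checked fields are filled in; anything else names what still needs attention.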
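69 | 
--------------------------------------------------------------------------------
/WORKFLOW.md:
--------------------------------------------------------------------------------
1 | # Paper Reproduction Workflow (论文复现流程指南)
2 | 
3 | ### Introduction
4 | 
5 | Hello, and welcome to the Lab4AI community of paper-reproduction contributors! This guide walks you through the whole process, from claiming a task to getting your final result published.
6 | 
7 | Throughout the process, **what we value most is the engineering quality of your final result**. Please read and follow the file-structure and environment-setup standards defined in the [**Core Requirements**] section below.
8 | 
9 | ---
10 | 
11 | ### **Contribution Workflow at a Glance**
12 | 
13 | Our collaboration follows five steps:
14 | 
15 | * **Step 1: Submit a topic application on GitHub**
16 | > Pick or propose a paper you are interested in, then file your application with the standard template in the **[Issues](https://github.com/Lab4AI-Hub/PaperHub/issues)** section of our **PaperHub** repository.
17 | 
18 | * **Step 2: Review and preparation**
19 | > Our team reviews your application. Once it is approved, you receive the **startup compute** the project needs.
20 | 
21 | * **Step 3: Create an instance on the Lab4AI platform and reproduce the paper**
22 | > **[Note the current process]** Log in to the Lab4AI platform, click **[新建实例] (New Instance) in the right-hand menu** to create your development environment, and start the reproduction following our engineering standards.
23 | 
24 | * **Step 4: Submit your results on GitHub**
25 | > **[Note the current process]** When the reproduction is done, fill in the **Submission Info Sheet (《论文上架信息表》)**, return to the GitHub Issue where you applied for the task, and **upload the sheet as an attachment** for review.
26 | 
27 | * **Step 5: Publication and rewards**
28 | > After we approve the submission, your project is officially listed and showcased, and you receive a generous compute reward.
29 | 
30 | ---
31 | 
32 | ### **[Core Requirements] Project Engineering Standards**
33 | 
34 | #### **1. Standard File Structure**
35 | 
36 | To keep every reproduction consistent and of high quality, we ask all contributors to organize their final deliverables **inside their project workspace on the Lab4AI platform**, strictly following the structure below:
37 | 
38 | ├── workspace/user-data/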
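39 | │   ├── codelab/
40 | │   │   └── [your project name]/   # core working directory of the project
41 | │   │       ├── model/             # all model weight files
42 | │   │       ├── dataset/           # dataset files
43 | │   │       ├── code/              # all core source code, plus the instruction file (Notebook or readme)
*An instruction file is **required**: it guides users through reproducing your project. For a detailed sample, see [**3. Instruction File Sample**].*
44 | │   │       └── README.md          # project workflow guide (this document)

45 | │   └── envs/
46 | │       └── [your project name]/   # project-specific environment directory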
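
The layout above can be created, or checked before submission, with a few lines of Python. This is a minimal sketch under stated assumptions: the workspace root is passed in explicitly rather than hard-coded, and `scaffold_project` and `missing_entries` are hypothetical names used only for illustration, not part of any Lab4AI tooling.

```python
from pathlib import Path

# Sub-directories taken from the tree above.
PROJECT_SUBDIRS = ("model", "dataset", "code")


def scaffold_project(root: Path, project: str) -> list[Path]:
    """Create the standard layout under `root` and return the created paths."""
    project_dir = root / "codelab" / project
    env_dir = root / "envs" / project
    created = [project_dir / sub for sub in PROJECT_SUBDIRS] + [env_dir]
    for path in created:
        # parents=True builds codelab/ and envs/ on demand;
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
    readme = project_dir / "README.md"
    readme.touch(exist_ok=True)
    return created + [readme]


def missing_entries(root: Path, project: str) -> list[str]:
    """List expected entries that do not exist yet (empty list = layout OK)."""
    project_dir = root / "codelab" / project
    expected = [project_dir / sub for sub in PROJECT_SUBDIRS]
    expected += [project_dir / "README.md", root / "envs" / project]
    return [str(p.relative_to(root)) for p in expected if not p.exists()]
```

Calling `missing_entries(Path("workspace/user-data"), "my-project")` before requesting a review gives a quick list of anything still absent; the project name here is only an example.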
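47 | **Folder details:**
48 | 
49 | * **`codelab/[your project name]/`**: the folder that holds the core content of the project.
50 |     * `model/` & `dataset/`: hold the model and the dataset, respectively. **Note**: once your project is officially published, these two folders become **read-only** for regular users.
51 |     * `code/`: holds all of your source code. This folder stays **readable and writable** after publication. Therefore, **every file produced while the code runs (newly saved models, logs, result figures, and so on) must be written to `code/` or one of its subfolders**.
52 | 
53 | #### **2. Environment Setup and Usage**
54 | 
55 | * **1. Create a dedicated virtual environment**
56 |     * Every project must have its own Conda virtual environment so that dependencies stay isolated.
57 |     * **Example commands** (replace `aaa` with your environment name):
58 |     ```bash
59 |     # Create a Python 3.12 environment named aaa
60 |     conda create -n aaa python=3.12 -y
61 | 
62 |     # List existing conda environments
63 |     conda env list
64 | 
65 |     # Activate the newly created environment
66 |     conda activate aaa
67 |     ```
68 | 
69 | * **2. Use your environment in Jupyter Notebook**
70 |     * For Jupyter to recognize and use the Conda environment you just created, you need to register a "kernel" for it.
71 |     * First, inside your **activated** Conda environment, install the `ipykernel` package:
72 |     ```bash
73 |     pip install ipykernel
74 |     ```
75 |     * Then run the kernel registration command, replacing `aaa` with your own environment name. (`kernel_install` is the command this guide gives; if it is not available in your environment, the standard ipykernel equivalent is `python -m ipykernel install --user --name aaa --display-name "Python(aaa)"`.)
76 |     ```bash
77 |     # Example: register a kernel named "Python(aaa)" for the environment aaa
78 |     kernel_install --name aaa --display-name "Python(aaa)"
79 |     ```
80 |     * After that, **refresh** the Jupyter Notebook page of your project. The `"Python(aaa)"` kernel you just registered should now appear in the kernel selector at the top right.
81 | * **3. Use your environment in VSCode**
82 |     * VSCode detects your newly created Conda environment automatically, so switching is quick.
83 |     * **Step 1: Select the Python interpreter**
84 |         * Make sure the Python extension is installed in VS Code.
85 |         * Open the Command Palette with `Ctrl+Shift+P` (Windows/Linux) or `Cmd+Shift+P` (macOS).
86 |         * Type and select `Python: Select Interpreter`.
87 | 
88 |     * **Step 2: Select your Conda environment**
89 |         * In the list that appears, find and click the environment you just created (for example, the Conda environment named `aaa`).
90 |         * After you select it, the status bar at the bottom right shows the environment name, which confirms the switch.
91 | 
92 | 
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[codz]
4 | *$py.class
5 | 
6 | # C extensions
7 | *.so
8 | 
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | share/python-wheels/
24 | *.egg-info/
25 | .installed.cfg
26 | *.egg
27 | MANIFEST
28 | 
29 | # PyInstaller
30 | # Usually these files are written by a python script from a template
31 | # before PyInstaller builds the exe, so as to inject date/other infos into it.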
32 | *.manifest 33 | *.spec 34 | 35 | # Installer logs 36 | pip-log.txt 37 | pip-delete-this-directory.txt 38 | 39 | # Unit test / coverage reports 40 | htmlcov/ 41 | .tox/ 42 | .nox/ 43 | .coverage 44 | .coverage.* 45 | .cache 46 | nosetests.xml 47 | coverage.xml 48 | *.cover 49 | *.py.cover 50 | .hypothesis/ 51 | .pytest_cache/ 52 | cover/ 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | local_settings.py 61 | db.sqlite3 62 | db.sqlite3-journal 63 | 64 | # Flask stuff: 65 | instance/ 66 | .webassets-cache 67 | 68 | # Scrapy stuff: 69 | .scrapy 70 | 71 | # Sphinx documentation 72 | docs/_build/ 73 | 74 | # PyBuilder 75 | .pybuilder/ 76 | target/ 77 | 78 | # Jupyter Notebook 79 | .ipynb_checkpoints 80 | 81 | # IPython 82 | profile_default/ 83 | ipython_config.py 84 | 85 | # pyenv 86 | # For a library or package, you might want to ignore these files since the code is 87 | # intended to run in multiple environments; otherwise, check them in: 88 | # .python-version 89 | 90 | # pipenv 91 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 92 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 93 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 94 | # install all needed dependencies. 95 | #Pipfile.lock 96 | 97 | # UV 98 | # Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control. 99 | # This is especially recommended for binary packages to ensure reproducibility, and is more 100 | # commonly ignored for libraries. 101 | #uv.lock 102 | 103 | # poetry 104 | # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control. 105 | # This is especially recommended for binary packages to ensure reproducibility, and is more 106 | # commonly ignored for libraries. 
107 | # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control 108 | #poetry.lock 109 | #poetry.toml 110 | 111 | # pdm 112 | # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control. 113 | # pdm recommends including project-wide configuration in pdm.toml, but excluding .pdm-python. 114 | # https://pdm-project.org/en/latest/usage/project/#working-with-version-control 115 | #pdm.lock 116 | #pdm.toml 117 | .pdm-python 118 | .pdm-build/ 119 | 120 | # pixi 121 | # Similar to Pipfile.lock, it is generally recommended to include pixi.lock in version control. 122 | #pixi.lock 123 | # Pixi creates a virtual environment in the .pixi directory, just like venv module creates one 124 | # in the .venv directory. It is recommended not to include this directory in version control. 125 | .pixi 126 | 127 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm 128 | __pypackages__/ 129 | 130 | # Celery stuff 131 | celerybeat-schedule 132 | celerybeat.pid 133 | 134 | # SageMath parsed files 135 | *.sage.py 136 | 137 | # Environments 138 | .env 139 | .envrc 140 | .venv 141 | env/ 142 | venv/ 143 | ENV/ 144 | env.bak/ 145 | venv.bak/ 146 | 147 | # Spyder project settings 148 | .spyderproject 149 | .spyproject 150 | 151 | # Rope project settings 152 | .ropeproject 153 | 154 | # mkdocs documentation 155 | /site 156 | 157 | # mypy 158 | .mypy_cache/ 159 | .dmypy.json 160 | dmypy.json 161 | 162 | # Pyre type checker 163 | .pyre/ 164 | 165 | # pytype static type analyzer 166 | .pytype/ 167 | 168 | # Cython debug symbols 169 | cython_debug/ 170 | 171 | # PyCharm 172 | # JetBrains specific template is maintained in a separate JetBrains.gitignore that can 173 | # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore 174 | # and can be added to the global gitignore or merged into this file. 
For a more nuclear 175 | # option (not recommended) you can uncomment the following to ignore the entire idea folder. 176 | #.idea/ 177 | 178 | # Abstra 179 | # Abstra is an AI-powered process automation framework. 180 | # Ignore directories containing user credentials, local state, and settings. 181 | # Learn more at https://abstra.io/docs 182 | .abstra/ 183 | 184 | # Visual Studio Code 185 | # Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore 186 | # that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore 187 | # and can be added to the global gitignore or merged into this file. However, if you prefer, 188 | # you could uncomment the following to ignore the entire vscode folder 189 | # .vscode/ 190 | 191 | # Ruff stuff: 192 | .ruff_cache/ 193 | 194 | # PyPI configuration file 195 | .pypirc 196 | 197 | # Cursor 198 | # Cursor is an AI-powered code editor. `.cursorignore` specifies files/directories to 199 | # exclude from AI features like autocomplete and code analysis. 
Recommended for sensitive data
200 | # refer to https://docs.cursor.com/context/ignore-files
201 | .cursorignore
202 | .cursorindexingignore
203 | 
204 | # Marimo
205 | marimo/_static/
206 | marimo/_lsp/
207 | __marimo__/
208 | 
--------------------------------------------------------------------------------
/generate_html.py:
--------------------------------------------------------------------------------
1 | # --- Final version: generate_html.py ---
2 | import pandas as pd
3 | import os
4 | from urllib.parse import quote_plus
5 | import html
6 | import sys
7 | 
8 | # --- Global configuration ---
9 | CONFIG = {
10 |     "csv_path": "data.csv",
11 |     "output_dir": "dist",
12 |     "output_filename": "index.html",
13 |     "repo_url": "https://github.com/Lab4AI-Hub/PaperHub",
14 |     "issue_template": "paper_application.yml"  # must match a file under .github/ISSUE_TEMPLATE/
15 | }
16 | 
17 | def create_github_issue_url(title=""):
18 |     base_url = f"{CONFIG['repo_url']}/issues/new"
19 |     template = CONFIG['issue_template']
20 |     if title:
21 |         encoded_title = quote_plus(f"[选题申请] {title}")
22 |         return f"{base_url}?template={template}&title={encoded_title}"
23 |     else:
24 |         return f"{base_url}?template={template}"
25 | 
26 | def generate_html_from_csv(df):
27 |     print(f"开始处理 {len(df)} 行数据...")
28 |     html_rows = []
29 |     for index, row in df.iterrows():
30 |         try:
31 |             paper_title = html.escape(str(row.get('论文名称', '')))
32 |             authors = html.escape(str(row.get('论文作者', '')))
33 |             conference = html.escape(str(row.get('来源标签\n(会议期刊)', '')))
34 |             year = str(row.get('论文年份', ''))
35 |             paper_link = str(row.get('论文链接', ''))
36 |             github_link = str(row.get('github链接', ''))
37 |             form = html.escape(str(row.get('形式', '')))
38 |             status = str(row.get('认领状态', ''))
39 | 
40 |             title_authors_html = f"{paper_title}
{authors}" 41 | conference_year_html = f"{conference.replace(chr(10), ' ')} {year}" 42 | 43 | links_html_parts = [] 44 | if paper_link: links_html_parts.append(f'原文') 45 | if github_link: links_html_parts.append(f'代码') 46 | links_html = ' | '.join(links_html_parts) if links_html_parts else 'N/A' 47 | 48 | status_action_html = f'{status}' 49 | if status == '待认领💡': 50 | claim_url = create_github_issue_url(paper_title) 51 | status_action_html = f'📝 申请任务' 52 | 53 | html_rows.append(f""" 54 | 55 | {title_authors_html} 56 | {conference_year_html} 57 | {form} 58 | {links_html} 59 | {status_action_html} 60 | 61 | """) 62 | except Exception as e: 63 | print(f"警告:处理第 {index + 2} 行数据时发生错误: {e}") 64 | continue 65 | print("所有数据行处理完毕。") 66 | return "".join(html_rows) 67 | 68 | def main(): 69 | print("脚本开始运行...") 70 | try: 71 | df = pd.read_csv(CONFIG['csv_path'], encoding='utf-8-sig') 72 | df = df.fillna('') 73 | print(f"成功读取 {CONFIG['csv_path']} 文件,共 {len(df)} 条记录。") 74 | except Exception as e: 75 | print(f"读取CSV文件时发生致命错误: {e}。脚本终止。") 76 | sys.exit(1) 77 | 78 | table_content = generate_html_from_csv(df) 79 | propose_new_url = create_github_issue_url() 80 | 81 | html_template = f""" 82 | 83 | 84 | Lab4AI 待复现论文清单 85 | 86 | 99 | 100 |
101 |
102 |

Lab4AI 待复现论文清单

103 |

在申请任务前,请务必仔细阅读我们的 贡献流程和奖励规则

104 | 105 |
106 | 107 | 108 | 109 | 110 | {table_content} 111 |
论文名称 & 作者会议/期刊 & 年份形式相关链接状态 / 操作
112 |
113 | 114 | 115 | 118 | 119 | """ 120 | 121 | output_dir = CONFIG['output_dir'] 122 | if not os.path.exists(output_dir): os.makedirs(output_dir) 123 | output_path = os.path.join(output_dir, CONFIG['output_filename']) 124 | with open(output_path, 'w', encoding='utf-8') as f: f.write(html_template) 125 | print(f"网页已成功生成到: {output_path}") 126 | 127 | if __name__ == '__main__': main() 128 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # PaperHub: 高质量的论文复现中心 2 | 3 | 欢迎来到 **PaperHub**!这是 **Lab4AI** 社区的论文复现库,致力于提供高质量的论文复现。 4 | 5 | --- 6 | 7 |
8 |

前往“大模型实验室”获取最佳体验

9 |

所有项目均可在我们的“大模型实验室”内容社区中找到对应的教程和一键式运行环境,为您提供强大的算力支持。

10 |
11 | 12 | 访问大模型实验室 ➡️ 13 | 14 |
15 | 16 | --- 17 | 18 | ## 👋 如何贡献一篇论文复现? 19 | 20 | 我们热烈欢迎每一位对AI充满热情的你,加入我们的论文复现贡献者行列!我们为您设计了一套清晰的贡献路径,并准备了丰厚的算力奖励,期待您的参与能点亮社区。 21 | 22 | **【关于提交流程的重要说明】**:我们正在持续优化贡献者体验,未来将上线**平台内的一站式创作与提交**功能。在此之前,请暂时按照以下流程进行成果提交。 23 | 24 | ### **贡献步骤概览** 25 | 26 | 27 | 28 | 31 | 39 | 40 | 41 | 44 | 50 | 51 | 52 | 55 | 65 | 66 | 67 | 70 | 76 | 77 |
29 | 30 | 32 |

第一步:选题与申请

33 |

34 | 前置要求:请先为本项目点亮一颗 Star ⭐! 35 |
36 | 您可以从论文复现清单选择或推荐新论文(推荐需先微信联系)。确定后,请通过 提交Issue 的方式进行正式申请。 37 |

38 |
42 | 43 | 45 |

第二步:在线复现

46 |

47 | 申请通过后,您将获得启动算力。请根据《复现者指南》中的文件结构要求,在您自己的Lab4AI平台工作空间内开始复现。 48 |

49 |
53 | 54 | 56 |

第三步:提交审核

57 |

58 | 复现完成后,请按照《成果提交说明》,准备好所有材料,特别是填写完整的《论文上架信息表》。 59 |
60 | 然后,请回到您当初申请任务的GitHub Issue,在评论区 再次联系我们,并将《论文上架信息表》作为附件上传,通知我们进行审核。 61 |
62 | (我们正在努力开发平台内的在线提交通道,未来将更加便捷!) 63 |

64 |
68 | 69 | 71 |

第四步:成功发布

72 |

73 | 审核通过后,您的成果将被官方收录和展示,同时您将根据《创作者激励计划》获得丰厚奖励! 74 |

75 |
78 | 79 | ### **核心文档库 & 申请入口** 80 | 81 | 在开始您的贡献之旅前,请务必仔细阅读以下核心指南,并**下载所需的表格模板**。**所有详细信息和申请入口都在这里**。 82 | 83 | 84 | 85 | 100 | 115 | 116 | 117 | 123 | 129 | 130 |
86 |

87 | 📄 复现者指南 (Reproducer's Guide) 88 |

89 |

90 | 行动手册(SOP):详细介绍从申请到提交的每一步。 91 |

92 |
93 | ✨ 本阶段需下载: 94 |
95 | 96 | 下载选题模板 97 | 98 |
99 |
101 |

102 | 📝 成果提交说明 (Deliverables Guide) 103 |

104 |

105 | 交付物清单:清晰列出您需要准备的文件和信息。 106 |

107 |
108 | ✨ 本阶段需下载: 109 |
110 | 111 | 下载上架模板 112 | 113 |
114 |
118 |

119 | ✅ 论文筛选标准 (Criteria) 120 |

121 |

选题准则:定义了一篇论文是否值得被复现的前置条件。

122 |
124 |

125 | 💎 创作者激励计划 (Rewards) 126 |

127 |

权益手册:详细说明不同贡献所能获得的丰厚算力奖励。

128 |
131 | 132 |

133 | 134 | 开始申请 135 | 136 |

137 | 138 | --- 139 | 140 | ### 🗺️ 论文复现清单 141 | 142 | 我们已经筛选并整理了一份详细的待复现论文清单。这不仅是我们的工作计划,更是我们邀请您参与共建的蓝图。 143 | 144 |
145 | 146 | 查看待复现论文清单 (Roadmap) 147 | 148 |
149 | 150 | 151 | --- 152 | 153 | 154 | 155 | ### ✅ 已完成的复现项目 (Completed Reproductions) 156 | 157 | | 论文名称 & 作者 | 会议来源 & 年份 | 论文链接 | 前往平台体验 | 158 | | :--- | :--- | :--- | :--- | 159 | | **Attention Is All You Need**
*Ashish Vaswani, et al.* | NeurIPS 2017| [📄 arXiv](https://arxiv.org/abs/1706.03762) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=e90aa38fdff9420e8902bc71909fa005&type=paper) | 160 | | **Can We Get Rid of Handcrafted Feature Extractors?**
*Lei Su, et al.* | AAAI 2025| [📄 arXiv](https://arxiv.org/abs/2412.14598) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=97a182e56e904e92a0fe240f1f114709&type=paper) | 161 | | **MOMENT: A Family of Open Time-series Foundation Models**
*Mononito Goswami, et al.* | ICML 2025| [📄 arXiv](https://arxiv.org/abs/2402.03885) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=05087484a3264a9c8b8a2c616e7cce0b&type=paper) | 162 | | **Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data**
*Kashun Shum,et al.* | EMNLP 2023 | [📄 arXiv](https://arxiv.org/abs/2302.12822) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=c76b88e732cf41949b54515bdd319808&type=paper) | 163 | | **Chronos: Learning the Language of Time Series**
*Abdul Fatir Ansari, et al.* | other 2024| [📄 arXiv](https://arxiv.org/pdf/2403.07815) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=6dd7daeec6584f61856876474b860e09&type=paper) | 164 | | **Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis**
*Yu Yuan, et al.* | CVPR 2025| [📄 arXiv](https://arxiv.org/abs/2412.02168) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=2d412a08880f477ca5362af3ef8c14f2&type=paper) | 165 | | **PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data**
*Shijie Huang, et al.* | ICCV 2025| [📄 arXiv](https://arxiv.org/abs/2502.14397) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=673e78d51fcc4bca89b636a52affa6b4&type=paper) | 166 | | **Self-Instruct: Aligning Language Models with Self-Generated Instructions**
*Yizhong Wang, et al.* | ACL 2023| [📄 arXiv](https://arxiv.org/abs/2212.10560) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=2bbf2f4971f74c6e8def26879233f2fe&type=paper) | 167 | | **RobustSAM: Segment Anything Robustly on Degraded Images**
*Wei-Ting Chen, et al.* | CVPR 2024| [📄 arXiv](https://arxiv.org/abs/2406.09627) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=7a23bf525a38476c952df14e72ecce23&type=paper) | 168 | | **Side Adapter Network for Open-Vocabulary Semantic Segmentation**
*Mengde Xu, et al.* |CVPR 2023| [📄 arXiv](https://arxiv.org/pdf/2302.12242) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=5944c280114e41508d99e1fd85cbf78e&type=paper) | 169 | | **Improving day-ahead Solar Irradiance Time Series Forecasting by Leveraging Spatio-Temporal Context**
*Oussama Boussif, et al.* | NIPS 2023| [📄 arXiv](https://arxiv.org/abs/2306.01112) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=77ab6c82cc9444938c4bbbdd6709709a&type=paper) | 170 | | **Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting**
*Kashif Rasul, et al.* | other 2023| [📄 arXiv](https://arxiv.org/abs/2310.08278) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=a05e09588c4c475aac14354cae04986d&type=paper) | 171 | | **CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos**
*Lei Nikita Karaev, et al.* | CVPR 2024| [📄 arXiv](https://arxiv.org/abs/2410.11831) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=269a8dc6d24a49b29d39de138b443a43&type=paper) | 172 | | **Unified Training of Universal Time Series Forecasting Transformers**
*Gerald Woo, et al.* | ICML 2024| [📄 arXiv](https://arxiv.org/abs/2402.02592) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=af34d430edf14f56910c6cfc05c4ee89&type=paper) | 173 | | **A decoder-only foundation model for time-series forecasting**
*Abhimanyu Das, et al.* | ICML 2024| [📄 arXiv](https://arxiv.org/abs/2310.10688) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=e78bad4ea95944af8fa0eeb98f69179f&type=paper) | 174 | | **Timer: Generative Pre-trained Transformers Are Large Time Series Models**
*Yong Liu, et al.* | ICML 2024| [📄 arXiv](https://arxiv.org/abs/2402.02368) | [➡️ **立即体验**](https://www.lab4ai.cn/paper/detail?id=34d7c1e35c514d77b88762f18298e999&type=paper) | 175 | | **LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant**
*Yikun Liu, et al.* | other 2024 | [📄 arXiv](https://arxiv.org/abs/2412.01720) | [➡️ **Try it now**](https://www.lab4ai.cn/paper/detail?id=3c0069c96f60404b948ed30fd498fe7f&type=paper) | 176 | 177 | --- 178 | 179 | ### ✨ Explore More Paper Reproductions 180 | 181 | Want to try one-click reproduction and online development for more cutting-edge papers? Visit our official platform: 182 | 183 |

184 | 185 | Visit the Lab4AI Paper Plaza 186 | 187 |

188 | -------------------------------------------------------------------------------- /data.csv: -------------------------------------------------------------------------------- 1 | ID,认领状态,论文名称,论文作者,论文链接,论文年份,"来源标签 2 | (会议期刊)",github链接,形式 3 | 1,💡 未复现,Attention is all you need,"Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, Illia Polosukhin",https://papers.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf,2017,NIPS,https://github.com/jadore801120/attention-is-all-you-need-pytorch, 4 | 2,💡 未复现,Improving language understanding by generative pre-training,"Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever",https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf,2018,OpenAI,, 5 | 3,💡 未复现,BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,"Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova",https://aclanthology.org/N19-1423/?utm_campaign=The+Batch&utm_source=hs_email&utm_medium=email&_hsenc=p2ANqtz-_m9bbH_7ECE1h3lZ3D61TYg52rKpifVNjL4fvJ85uqggrXsWDBTB7YooFLJeNXHWqhvOyC,2019,NAACL,, 6 | 4,💡 未复现,Language models are unsupervised multitask learners,"Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever",https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf,2019,OpenAI,, 7 | 5,💡 未复现,Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer,"Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. 
Liu",https://www.jmlr.org/papers/v21/20-074.html,2020,JMLR,https://github.com/google-research/text-to-text-transfer-transformer, 8 | 6,💡 未复现,"From Local to Global: A GraphRAG Approach to 9 | Query-Focused Summarization","Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, Dasha Metropolitansky, Robert Osazuwa Ness, Jonathan Larson",https://arxiv.org/abs/2404.16130,2024,,https://github.com/microsoft/graphrag, 10 | 7,💡 未复现,Language Models are Few-Shot Learners,"Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel Ziegler, Jeffrey Wu, Clemens Winter, Chris Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei",https://proceedings.neurips.cc/paper_files/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html?utm_source=transaction&utm_medium=email&utm_campaign=linkedin_newsletter,2020,NIPS,https://github.com/openai/gpt-3, 11 | 8,💡 未复现,LoRA: Low-Rank Adaptation of Large Language Models,"Edward J Hu,yelong shen,Phillip Wallis,Zeyuan Allen-Zhu,Yuanzhi Li,Shean Wang,Lu Wang,Weizhu Chen",https://openreview.net/forum?id=nZeVKeeFYf9,2022,ICLR,https://github.com/microsoft/LoRA, 12 | 9,💡 未复现,Finetuned Language Models are Zero-Shot Learners,"Jason Wei,Maarten Bosma,Vincent Zhao,Kelvin Guu,Adams Wei Yu,Brian Lester,Nan Du,Andrew M. 
Dai,Quoc V Le",https://openreview.net/forum?id=gEZrGCozdqR&ref=morioh.com&utm_source=morioh.com,2022,ICLR,https://github.com/google-research/FLAN, 13 | 10,💡 未复现,LLaMA: Open and Efficient Foundation Language Models,"Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample",https://arxiv.org/abs/2302.13971,2023,,https://github.com/meta-llama/llama, 14 | 11,💡 未复现,Self-Consistency Improves Chain of Thought Reasoning in Language Models,"Xuezhi Wang,Jason Wei,Dale Schuurmans,Quoc V Le,Ed H. Chi,Sharan Narang,Aakanksha Chowdhery,Denny Zhou",https://openreview.net/forum?id=1PL1NIMMrw&utm_source=chatgpt.com,2023,ICLR,https://github.com/dj-sorry/self_consistency, 15 | 12,💡 未复现,Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers,"Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Shuming Ma, Zhifang Sui, Furu Wei",https://aclanthology.org/2023.findings-acl.247/,2023,ACL,https://github.com/microsoft/LMOps/tree/main/understand_icl, 16 | 13,💡 未复现,Toolformer: Language Models Can Teach Themselves to Use Tools,"Timo Schick, Jane Dwivedi-Yu, Roberto Dessi, Roberta Raileanu, Maria Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom",https://proceedings.neurips.cc/paper_files/paper/2023/hash/d842425e4bf79ba039352da0f658a906-Abstract-Conference.html,2023,NIPS,https://github.com/lucidrains/toolformer-pytorch, 17 | 14,💡 未复现,DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature,"Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D Manning, Chelsea Finn",https://proceedings.mlr.press/v202/mitchell23a.html,2023,PMLR,https://github.com/eric-mitchell/detect-gpt, 18 | 15,💡 未复现,Recitation-augmented language models,"Zhiqing Sun,Xuezhi Wang,Yi Tay,Yiming Yang,Denny 
Zhou",https://openreview.net/forum?id=-cqvvvb-NkI,2023,ICLR,https://github.com/Edward-Sun/RECITE, 19 | 16,💡 未复现,Self-Instruct: Aligning Language Models with Self-Generated Instructions,"Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi",https://aclanthology.org/2023.acl-long.754/,2023,ACL,https://github.com/yizhongw/self-instruct, 20 | 17,💡 未复现,Automatic chain of thought prompting in large language models,"Zhuosheng Zhang,Aston Zhang,Mu Li,Alex Smola",https://openreview.net/forum?id=5NTt8GFjUHkr,2023,ICLR,https://github.com/amazon-science/auto-cot, 21 | 18,💡 未复现,REALM: retrieval-augmented language model pre-training,"Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang",https://dl.acm.org/doi/abs/10.5555/3524938.3525306,2020,ICML,https://github.com/google-research/language/tree/master/language/realm, 22 | 19,💡 未复现,Language Is Not All You Need: Aligning Perception with Language Models,"Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Nils Bjorck, Vishrav Chaudhary, Subhojit Som, XIA SONG, Furu Wei",https://proceedings.neurips.cc/paper_files/paper/2023/hash/e425b75bac5742a008d643826428787c-Abstract-Conference.html,2023,NIPS,https://github.com/microsoft/unilm/tree/master/unilm-v1, 23 | 20,💡 未复现,Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data,"Kashun Shum, Shizhe Diao, Tong Zhang",https://aclanthology.org/2023.findings-emnlp.811/,2023,EMNLP,https://github.com/SHUMKASHUN/Automate-CoT, 24 | 21,💡 未复现,Weakly Supervised Semantic Segmentation via Alternate Self-Dual Teaching,"Dingwen Zhang, Wenyuan Zeng, Guangyu Guo, Chaowei Fang, Lechao Cheng, Ming-Ming Cheng, Junwei Han",https://arxiv.org/abs/2112.09459,2025,TIP,, 25 | 22,💡 未复现,Text to Image for Multi-Label Image Recognition with Joint Prompt-Adapter Learning,"Chun-Mei Feng, Kai Yu, Xinxing Xu, 
Salman Khan, Rick Siow Mong Goh, Wangmeng Zuo, Yong Liu",https://arxiv.org/abs/2506.10575,2024,TPAMI,, 26 | 23,💡 未复现,Segment Concealed Objects with Incomplete Supervision,"Chunming He, Kai Li, Yachao Zhang, Ziyun Yang, Youwei Pang, Longxiang Tang, Chengyu Fang, Yulun Zhang, Linghe Kong, Xiu Li, Sina Farsiu",https://arxiv.org/pdf/2506.08955,2025,TPAMI,https://github.com/ChunmingHe/SEE, 27 | 24,💡 未复现,Event-based Stereo Depth Estimation: A Survey,"Suman Ghosh, Guillermo Gallego",https://arxiv.org/abs/2409.17680,2025,TPAMI,, 28 | 25,💡 未复现,Efficient Low-Resolution Face Recognition via Bridge Distillation,"Shiming Ge, Shengwei Zhao, Chenyu Li, Yu Zhang, Jia Li",https://arxiv.org/abs/2409.11786,2024,TIP,, 29 | 26,💡 未复现,Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning,"Bingqian Lin, Yanxin Long, Yi Zhu, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Liang Lin",https://arxiv.org/abs/2403.05770,2023,TPAMI,https://github.com/YicongHong/Recurrent-VLN-BERT, 30 | 27,💡 未复现,Bayesian Estimate of Mean Proper Scores for Diversity-Enhanced Active Learning,"Wei Tan, Lan Du, Wray Buntine",https://arxiv.org/abs/2312.10116,2023,TPAMI,, 31 | 28,💡 未复现,Paragraph-to-Image Generation with Information-Enriched Diffusion Model,"Weijia Wu, Zhuang Li, Yefei He, Mike Zheng Shou, Chunhua Shen, Lele Cheng, Yan Li, Tingting Gao, Di Zhang",https://arxiv.org/abs/2311.14284,2025,IJCV,https://github.com/weijiawu/ParaDiffusion, 32 | 29,💡 未复现,Inherit with Distillation and Evolve with Contrast: Exploring Class Incremental Semantic Segmentation Without Exemplar Memory,"Danpei Zhao, Bo Yuan, Zhenwei Shi",https://arxiv.org/abs/2309.15413,2023,TPAMI,, 33 | 30,💡 未复现,Dual Compensation Residual Networks for Class Imbalanced Learning,"Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen",https://arxiv.org/abs/2308.13165,2023,TPAMI,, 34 | 31,💡 未复现,End-to-end Alternating Optimization for Real-World Blind Super Resolution,"Zhengxiong Luo, Yan Huang, Shang Li, Liang Wang, Tieniu 
Tan",https://arxiv.org/abs/2308.08816,2023,IJCV,https://github.com/greatlog/RealDAN, 35 | 32,💡 未复现,YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection,"Yuming Chen, Xinbin Yuan, Jiabao Wang, Ruiqi Wu, Xiang Li, Qibin Hou, Ming-Ming Cheng",https://arxiv.org/abs/2308.05480,2025,TPAMI,https://github.com/FishAndWasabi/YOLO-MS, 36 | 33,💡 未复现,"A Survey on Graph Neural Networks for Time Series: Forecasting, Classification, Imputation, and Anomaly Detection","Ming Jin, Huan Yee Koh, Qingsong Wen, Daniele Zambon, Cesare Alippi, Geoffrey I. Webb, Irwin King, Shirui Pan",https://arxiv.org/abs/2307.03759,2024,TPAMI,https://github.com/KimMeen/Awesome-GNN4TS, 37 | 34,💡 未复现,SplatFlow: Learning Multi-frame Optical Flow via Splatting,"Bo Wang, Yifan Zhang, Jian Li, Yang Yu, Zhenping Sun, Li Liu, Dewen Hu",https://arxiv.org/abs/2306.08887,2024,IJCV,https://github.com/wwsource/SplatFlow, 38 | 35,💡 未复现,Towards Expressive Spectral-Temporal Graph Neural Networks for Time Series Forecasting,"Ming Jin, Guangsi Shi, Yuan-Fang Li, Bo Xiong, Tian Zhou, Flora D. 
Salim, Liang Zhao, Lingfei Wu, Qingsong Wen, Shirui Pan",https://arxiv.org/abs/2305.06587,2025,TPAMI,, 39 | 36,💡 未复现,Efficient Halftoning via Deep Reinforcement Learning,"Haitian Jiang, Dongliang Xiong, Xiaowen Jiang, Li Ding, Liang Chen, Kai Huang",https://arxiv.org/abs/2304.12152,2023,TIP,, 40 | 37,💡 未复现,PAC-Bayes Bounds for Bandit Problems: A Survey and Experimental Comparison,"Hamish Flynn, David Reeb, Melih Kandemir, Jan Peters",https://arxiv.org/abs/2211.16110,2023,TPAMI,, 41 | 38,💡 未复现,Salient Object Detection via Dynamic Scale Routing,"Zhenyu Wu, Shuai Li, Chenglizhao Chen, Hong Qin, Aimin Hao",https://arxiv.org/abs/2210.13821,2022,TIP,https://github.com/wuzhenyubuaa/DPNet, 42 | 39,💡 未复现,Twin Contrastive Learning for Online Clustering,"Yunfan Li, Mouxing Yang, Dezhong Peng, Taihao Li, Jiantao Huang, Xi Peng",https://arxiv.org/abs/2210.11680,2022,IJCV,, 43 | 40,💡 未复现,Kernel-Based Generalized Median Computation for Consensus Learning,"Andreas Nienkötter, Xiaoyi Jiang",https://arxiv.org/abs/2209.10208,2022,TPAMI,, 44 | 41,💡 未复现,A Tale of HodgeRank and Spectral Method: Target Attack Against Rank Aggregation Is the Fixed Point of Adversarial Game,"Ke Ma, Qianqian Xu, Jinshan Zeng, Guorong Li, Xiaochun Cao, Qingming Huang",https://arxiv.org/abs/2209.05742,2022,TPAMI,, 45 | 42,💡 未复现,Boosting Night-time Scene Parsing with Learnable Frequency,"Zhifeng Xie, Sen Wang, Ke Xu, Zhizhong Zhang, Xin Tan, Yuan Xie, Lizhuang Ma",https://arxiv.org/abs/2208.14241,2023,TIP,, 46 | 43,💡 未复现,SiamMask: A Framework for Fast Online Object Tracking and Segmentation,"Weiming Hu, Qiang Wang, Li Zhang, Luca Bertinetto, Philip H. S. 
Torr",https://arxiv.org/abs/2207.02088,2022,TPAMI,, 47 | 44,💡 未复现,SERE: Exploring Feature Self-relation for Self-supervised Transformer,"Zhong-Yu Li, Shanghua Gao, Ming-Ming Cheng",https://arxiv.org/abs/2206.05184,2023,TPAMI,https://github.com/MCG-NKU/SERE, 48 | 45,💡 未复现,Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis,"Maciej Besta, Torsten Hoefler",https://arxiv.org/abs/2205.09702,2023,TPAMI,, 49 | 46,💡 未复现,Efficient Burst Raw Denoising with Variance Stabilization and Multi-frequency Denoising Network,"Dasong Li, Yi Zhang, Ka Lung Law, Xiaogang Wang, Hongwei Qin, Hongsheng Li",https://arxiv.org/abs/2205.04721,2022,IJCV,, 50 | 47,💡 未复现,Incomplete Gamma Kernels: Generalizing Locally Optimal Projection Operators,"Patrick Stotko, Michael Weinmann, Reinhard Klein",https://arxiv.org/abs/2205.01087,2024,TPAMI,https://github.com/stotko/incomplete-gamma-kernels, 51 | 48,💡 未复现,From Slow Bidirectional to Fast Autoregressive Video Diffusion Models,"Tianwei Yin, Qiang Zhang, Richard Zhang, William T. Freeman, Fredo Durand, Eli Shechtman, Xun Huang",https://arxiv.org/abs/2412.07772,2025,CVPR,https://github.com/tianweiy/CausVid, 52 | 49,💡 未复现,FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention,"Guangxuan Xiao, Tianwei Yin, William T. 
Freeman, Frédo Durand, Song Han",https://arxiv.org/abs/2305.10431,2024,IJCV,https://github.com/mit-han-lab/fastcomposer, 53 | 50,💡 未复现,ROGRAG: A Robustly Optimized GraphRAG Framework,"Zhefan Wang, Huanjun Kong, Jie Ying, Wanli Ouyang, Nanqing Dong",https://arxiv.org/abs/2503.06474,2025,ACL,https://github.com/tpoisonooo/ROGRAG, 54 | 51,💡 未复现,Dual Encoder GAN Inversion for High-Fidelity 3D Head Reconstruction from Single Images,"Bahri Batuhan Bilecen, Ahmet Berke Gokmen, Aysegul Dundar",https://arxiv.org/abs/2409.20530,2024,NIPS,https://github.com/berkegokmen1/dual-enc-3d-gan-inversion, 55 | 52,💡 未复现,Can We Leave Deepfake Data Behind in Training Deepfake Detector,"Jikang Cheng, Zhiyuan Yan, Ying Zhang, Yuhao Luo, Zhongyuan Wang, Chen Li",https://arxiv.org/pdf/2408.17052,2024,NIPS,https://github.com/beautyremain/ProDet, 56 | 53,💡 未复现,VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset,"Sihan Chen, Handong Li, Qunbo Wang, Zijia Zhao, Mingzhen Sun, Xinxin Zhu, Jing Liu",https://arxiv.org/abs/2305.18500,2023,NIPS,https://github.com/TXH-mercury/VAST, 57 | 54,💡 未复现,MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors,"Riku Murai, Eric Dexheimer, Andrew J. Davison",https://arxiv.org/abs/2412.12392,2025,CVPR,https://github.com/rmurai0610/MASt3R-SLAM, 58 | 55,💡 未复现,MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM,"Vladimir Yugay, Theo Gevers, Martin R. 
Oswald",https://arxiv.org/abs/2411.16785,2025,CVPR,https://github.com/VladimirYugay/MAGiC-SLAM, 59 | 56,💡 未复现,Murre: Multi-view Reconstruction via SfM-guided Monocular Depth Estimation,"Haoyu Guo, He Zhu, Sida Peng, Haotong Lin, Yunzhi Yan, Tao Xie, Wenguan Wang, Xiaowei Zhou, Hujun Bao",https://arxiv.org/abs/2503.14483,2025,CVPR,https://github.com/zju3dv/Murre, 60 | 57,💡 未复现,STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution,"Rui Xie, Yinhong Liu, Penghao Zhou, Chen Zhao, Jun Zhou, Kai Zhang, Zhenyu Zhang, Jian Yang, Zhenheng Yang, Ying Tai",https://arxiv.org/abs/2501.02976,2025,ICCV,https://github.com/NJU-PCALab/STAR, 61 | 58,💡 未复现,SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models,"Jaerin Lee, Daniel Sungho Jung, Kanggeon Lee, Kyoung Mu Lee",https://arxiv.org/abs/2403.09055,2025,CVPR,https://github.com/ironjr/semantic-draw, 62 | 59,💡 未复现,HSMR: Reconstructing Humans with a Biomechanically Accurate Skeleton,"Yan Xia, Xiaowei Zhou, Etienne Vouga, Qixing Huang, Georgios Pavlakos",https://arxiv.org/abs/2503.21751,2025,CVPR,https://github.com/IsshikiHugh/HSMR, 63 | 60,💡 未复现,RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness,"Tianyu Yu, Haoye Zhang, Qiming Li, Qixin Xu, Yuan Yao, Da Chen, Xiaoman Lu, Ganqu Cui, Yunkai Dang, Taiwen He, Xiaocheng Feng, Jun Song, Bo Zheng, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun",https://arxiv.org/abs/2405.17220,2025,CVPR,https://github.com/RLHF-V/RLAIF-V, 64 | 61,💡 未复现,DFormer:Rethinking RGBD Representation Learning for Semantic Segmentation,"Bowen Yin, Xuying Zhang, Zhongyu Li, Li Liu, Ming-Ming Cheng, Qibin Hou",https://arxiv.org/abs/2309.09668,2025,CVPR,https://github.com/VCIP-RGBD/DFormer, 65 | 62,💡 未复现,GaussianCity: Generative Gaussian Splatting for Unbounded 3D City Generation,"Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, Ziwei Liu",https://arxiv.org/abs/2406.06526,2025,CVPR,https://github.com/hzxie/GaussianCity, 66 | 63,💡 
未复现,PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos,"Hanxiao Jiang, Hao-Yu Hsu, Kaifeng Zhang, Hsin-Ni Yu, Shenlong Wang, Yunzhu Li",https://arxiv.org/abs/2503.17973,2025,ICCV,https://github.com/Jianghanxiao/PhysTwin, 67 | 64,💡 未复现,UniGoal: Towards Universal Zero-shot Goal-oriented Navigation,"Hang Yin, Xiuwei Xu, Lingqing Zhao, Ziwei Wang, Jie Zhou, Jiwen Lu",https://arxiv.org/abs/2503.10630,2025,CVPR,https://github.com/bagh2178/UniGoal, 68 | 65,💡 未复现,Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction,"Jixuan Fan, Wanhua Li, Yifei Han, Yansong Tang",https://arxiv.org/abs/2412.04887,2025,ICCV,https://github.com/Jixuan-Fan/Momentum-GS, 69 | 66,💡 未复现,MINIMA: Modality Invariant Image Matching,"Jiangwei Ren, Xingyu Jiang, Zizhuo Li, Dingkang Liang, Xin Zhou, Xiang Bai",https://arxiv.org/abs/2412.19412,2025,CVPR,https://github.com/LSXI7/MINIMA, 70 | 67,💡 未复现,Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video,"David Yifan Yao, Albert J. 
Zhai, Shenlong Wang",https://arxiv.org/abs/2503.21761,2025,CVPR,https://github.com/Davidyao99/uni4d, 71 | 68,💡 未复现,Diffusion Renderer: Neural Inverse and Forward Rendering with Video Diffusion Models,"Ruofan Liang, Zan Gojcic, Huan Ling, Jacob Munkberg, Jon Hasselgren, Zhi-Hao Lin, Jun Gao, Alexander Keller, Nandita Vijaykumar, Sanja Fidler, Zian Wang",https://arxiv.org/abs/2501.18590,2025,CVPR,https://github.com/52CV/CVPR-2025-Papers, 72 | 69,💡 未复现,Linear Programming Bounds on k-Uniform States,"Yu Ning, Fei Shi, Tao Luo, Xiande Zhang",https://arxiv.org/abs/2503.02222,2025,ICCV,https://github.com/Epona-World-Model/Epona, 73 | 70,💡 未复现,LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation,"Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen, Tengfei Wang, Gang Zeng, Ziwei Liu",https://arxiv.org/abs/2402.05054,2024,ECCV,https://github.com/3DTopia/LGM, 74 | 71,💡 未复现,VideoMamba: State Space Model for Efficient Video Understanding,"Kunchang Li, Xinhao Li, Yi Wang, Yinan He, Yali Wang, Limin Wang, Yu Qiao",https://arxiv.org/abs/2403.06977,2024,ECCV,https://github.com/OpenGVLab/video-mamba-suite, 75 | 72,💡 未复现,DriveLM: Driving with Graph Visual Question Answering,"Chonghao Sima, Katrin Renz, Kashyap Chitta, Li Chen, Hanxue Zhang, Chengen Xie, Jens Beißwenger, Ping Luo, Andreas Geiger, Hongyang Li",https://arxiv.org/abs/2312.14150,2024,ECCV,https://github.com/OpenDriveLab/DriveLM, 76 | 73,💡 未复现,GRiT: A Generative Region-to-text Transformer for Object Understanding,"Jialian Wu, Jianfeng Wang, Zhengyuan Yang, Zhe Gan, Zicheng Liu, Junsong Yuan, Lijuan Wang",https://arxiv.org/abs/2212.00280,2024,ECCV,https://github.com/JialianW/GRiT, 77 | 74,💡 未复现,PointLLM: Empowering Large Language Models to Understand Point Clouds,"Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin",https://arxiv.org/abs/2308.16911,2024,ECCV,https://github.com/OpenRobotLab/PointLLM, 78 | 75,💡 未复现,nuCraft: Crafting High Resolution 3D Semantic Occupancy for Unified 3D Scene 
Understanding,"Benjin Zhu, Zhe Wang, and Hongsheng Li",https://www.ecva.net/papers/eccv_2024/papers_ECCV/papers/00730.pdf,2024,ECCV,/, 79 | 76,💡 未复现,Adversarial Diffusion Distillation,"Axel Sauer, Dominik Lorenz, Andreas Blattmann, Robin Rombach",https://arxiv.org/abs/2311.17042,2024,ECCV,https://github.com/AMD-AIG-AIMA/AMD-Diffusion-Distillation, 80 | 77,💡 未复现,Generative Image Dynamics,"Zhengqi Li, Richard Tucker, Noah Snavely, Aleksander Holynski",https://arxiv.org/abs/2309.07906,2024,CVPR,https://generative-dynamics.github.io/,最佳论文 81 | 78,💡 未复现,Rich Human Feedback for Text-to-Image Generation,"Youwei Liang, Junfeng He, Gang Li, Peizhao Li, Arseniy Klimovskiy, Nicholas Carolan, Jiao Sun, Jordi Pont-Tuset, Sarah Young, Feng Yang, Junjie Ke, Krishnamurthy Dj Dvijotham, Katie Collins, Yiwen Luo, Yang Li, Kai J Kohlhoff, Deepak Ramachandran, Vidhya Navalpakkam",https://arxiv.org/pdf/2312.10240,2024,CVPR,https://github.com/youweiliang/RichHF,最佳论文 82 | 79,💡 未复现,Mip-Splatting: Alias-free 3D Gaussian Splatting,"Zehao Yu, Anpei Chen, Binbin Huang, Torsten Sattler, Andreas Geiger",https://arxiv.org/abs/2311.16493,2024,CVPR,https://github.com/autonomousvision/mip-splatting,最佳学生论文 83 | 80,💡 未复现,BioCLIP: A Vision Foundation Model for the Tree of Life,"Samuel Stevens, Jiaman Wu, Matthew J Thompson, Elizabeth G Campolongo, Chan Hee Song, David Edward Carlyn, Li Dong, Wasila M Dahdul, Charles Stewart, Tanya Berger-Wolf, Wei-Lun Chao, Yu Su",https://arxiv.org/abs/2311.18803,2024,CVPR,https://github.com/Imageomics/bioclip,最佳学生论文 84 | 81,💡 未复现,4D Gaussian Splatting for Real-Time Dynamic Scene Rendering,"Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang",https://arxiv.org/abs/2310.08528,2024,CVPR,https://github.com/hustvl/4DGaussians, 85 | 82,💡 未复现,Depth Anything: Unleashing The Power of Large-Scale Unlabeled Data,"Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang 
Zhao",https://arxiv.org/abs/2401.10891,2024,CVPR,https://github.com/LiheYoung/Depth-Anything, 86 | 83,💡 未复现,LISA: Reasoning Segmentation Via Large Language Model,"Xin Lai, Zhuotao Tian, Yukang Chen, Yanwei Li, Yuhui Yuan, Shu Liu, Jiaya Jia",https://arxiv.org/abs/2308.00692,2024,CVPR,https://github.com/dvlab-research/LISA, 87 | 84,💡 未复现,InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks,"Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai",https://arxiv.org/abs/2312.14238,2024,CVPR,https://github.com/OpenGVLab/InternVL, 88 | 85,💡 未复现,MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark,"Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen",https://arxiv.org/abs/2311.16502,2024,CVPR,https://github.com/MMMU-Benchmark/MMMU, 89 | 86,💡 未复现,EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything,"Yunyang Xiong, Bala Varadarajan, Lemeng Wu, Xiaoyu Xiang, Fanyi Xiao, Chenchen Zhu, Xiaoliang Dai, Dilin Wang, Fei Sun, Forrest Iandola, Raghuraman Krishnamoorthi, Vikas Chandra",https://arxiv.org/abs/2312.00863,2024,CVPR,https://github.com/yformer/EfficientSAM, 90 | 87,💡 未复现,Improved Baselines with Visual Instruction Tuning (LLaVA-1.5),"Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee",https://arxiv.org/abs/2310.03744,2024,CVPR,https://github.com/LLaVA-VL/LLaVA-NeXT, 91 | 88,💡 未复现,DemoFusion: Democratising High-Resolution Image Generation With No $$$,"Ruoyi Du, Dongliang Chang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma",https://arxiv.org/abs/2311.16973,2024,CVPR,https://github.com/PRIS-CV/DemoFusion, 92 | 89,💡 未复现,ViewDiff: 3D-Consistent Image Generation with 
Text-to-Image Models,"Lukas Höllein, Aljaž Božič, Norman Müller, David Novotny, Hung-Yu Tseng, Christian Richardt, Michael Zollhöfer, Matthias Nießner",https://arxiv.org/abs/2403.01807,2024,CVPR,https://github.com/facebookresearch/ViewDiff, 93 | 90,💡 未复现,OmniGlue: Generalizable Feature Matching with Foundation Model Guidance,"Hanwen Jiang, Arjun Karpur, Bingyi Cao, Qixing Huang, Andre Araujo",https://arxiv.org/abs/2405.12979,2024,CVPR,https://github.com/google-research/omniglue, 94 | 91,💡 未复现,DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks,"Jiaxin Zhang, Dezhi Peng, Chongyu Liu, Peirong Zhang, Lianwen Jin",https://arxiv.org/abs/2405.04408,2024,CVPR,https://github.com/ZZZHANG-jx/DocRes, 95 | 92,💡 未复现,MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training,"Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Raviteja Vemulapalli, Oncel Tuzel",https://arxiv.org/abs/2311.17049,2024,CVPR,https://github.com/apple/ml-mobileclip, 96 | 93,💡 未复现,Describing Differences in Image Sets with Natural Language,"Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy",https://arxiv.org/abs/2312.02974,2024,CVPR,https://github.com/Understanding-Visual-Datasets/VisDiff, 97 | 94,💡 未复现,XFeat: Accelerated Features for Lightweight Image Matching,"Guilherme Potje, Felipe Cadar, Andre Araujo, Renato Martins, Erickson R. 
Nascimento",https://arxiv.org/abs/2404.19174,2024,CVPR,https://github.com/verlab/accelerated_features, 98 | 95,💡 未复现,pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction,"David Charatan, Sizhe Li, Andrea Tagliasacchi, Vincent Sitzmann",https://arxiv.org/abs/2312.12337,2024,CVPR,, 99 | 96,💡 未复现,GPT4Point: A Unified Framework for Point-Language Understanding and Generation,"Zhangyang Qi, Ye Fang, Zeyi Sun, Xiaoyang Wu, Tong Wu, Jiaqi Wang, Dahua Lin, Hengshuang Zhao",https://arxiv.org/abs/2312.02980,2024,CVPR,, 100 | 97,💡 未复现,Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks,"Bin Xiao, Haiping Wu, Weijian Xu, Xiyang Dai, Houdong Hu, Yumao Lu, Michael Zeng, Ce Liu, Lu Yuan",https://arxiv.org/abs/2311.06242,2024,CVPR,, 101 | 98,💡 未复现,Identity-Preserving Text-to-Video Generation by Frequency Decomposition,"Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan",https://arxiv.org/abs/2411.17440,2025,CVPR,https://github.com/PKU-YuanGroup/ConsisID, 102 | 99,💡 未复现,Cinemo: Consistent and Controllable Image Animation with Motion Diffusion Models,"Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Yuan-Fang Li, Cunjian Chen, Yu Qiao",https://arxiv.org/abs/2407.15642,2025,CVPR,https://github.com/maxin-cn/Cinemo, 103 | 100,💡 未复现,X-Dyna: Expressive Dynamic Human Image Animation,"Di Chang, Hongyi Xu, You Xie, Yipeng Gao, Zhengfei Kuang, Shengqu Cai, Chenxu Zhang, Guoxian Song, Chao Wang, Yichun Shi, Zeyuan Chen, Shijie Zhou, Linjie Luo, Gordon Wetzstein, Mohammad Soleymani",https://arxiv.org/abs/2501.10021,2025,CVPR,https://github.com/bytedance/X-Dyna, 104 | 101,💡 未复现,PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation,"Qiyao Xue, Xiangyu Yin, Boyuan Yang, Wei Gao",https://arxiv.org/pdf/2412.00596,2025,CVPR,https://github.com/pittisl/PhyT2V, 105 | 102,💡 未复现,Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model,"Feng Liu, 
Shiwei Zhang, Xiaofeng Wang, Yujie Wei, Haonan Qiu, Yuzhong Zhao, Yingya Zhang, Qixiang Ye, Fang Wan",https://arxiv.org/abs/2411.19108,2025,CVPR,https://github.com/ali-vilab/TeaCache, 106 | 103,💡 未复现,AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion,"Mingzhen Sun, Weining Wang, Gen Li, Jiawei Liu, Jiahui Sun, Wanquan Feng, Shanshan Lao, SiYu Zhou, Qian He, Jing Liu",https://arxiv.org/abs/2503.07418,2025,CVPR,https://github.com/iva-mzsun/AR-Diffusion, 107 | 104,💡 未复现,Number it: Temporal Grounding Videos like Flipping Manga,"Yongliang Wu, Xinting Hu, Yuyang Sun, Yizhou Zhou, Wenbo Zhu, Fengyun Rao, Bernt Schiele, Xu Yang",https://arxiv.org/abs/2411.10332,2025,CVPR,https://github.com/yongliang-wu/NumPro, 108 | 105,💡 未复现,Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing,"Hanhui Wang, Yihua Zhang, Ruizheng Bai, Yue Zhao, Sijia Liu, Zhengzhong Tu",https://arxiv.org/abs/2411.16832,2025,CVPR,https://github.com/taco-group/FaceLock, 109 | 106,💡 未复现,h-Edit: Effective and Flexible Diffusion-Based Editing via Doob’s h-Transform,"Toan Nguyen, Kien Do, Duc Kieu, Thin Nguyen",https://arxiv.org/abs/2503.02187,2025,CVPR,https://github.com/nktoan/h-edit, 110 | 107,💡 未复现,OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels,"Meng Lou, Yizhou Yu",https://arxiv.org/abs/2502.20087,2025,CVPR,https://github.com/LMMMEng/OverLoCK,oral 111 | 108,💡 未复现,Alias-Free Latent Diffusion Models:Improving Fractional Shift Equivariance of Diffusion Latent Space,"Yifan Zhou, Zeqi Xiao, Shuai Yang, Xingang Pan",https://arxiv.org/pdf/2503.09419,2025,CVPR,https://github.com/SingleZombie/AFLDM,oral 112 | 109,💡 未复现,3D Student Splatting and Scooping,"Jialin Zhu, Jiangbei Yue, Feixiang He, He Wang",https://arxiv.org/abs/2503.10148,2025,CVPR,https://github.com/realcrane/3D-student-splating-and-scooping,oral 113 | 110,💡 未复现,CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View 
Diffusion Models,"Felix Taubner, Ruihang Zhang, Mathieu Tuli, David B. Lindell",https://arxiv.org/pdf/2412.12093,2025,CVPR,https://github.com/felixtaubner/cap4d/,oral 114 | 111,💡 未复现,Multi-view Reconstruction via SfM-guided Monocular Depth Estimation,"Haoyu Guo, He Zhu, Sida Peng, Haotong Lin, Yunzhi Yan, Tao Xie, Wenguan Wang, Xiaowei Zhou, Hujun Bao",https://arxiv.org/pdf/2503.14483,2025,CVPR,https://github.com/zju3dv/Murre,oral 115 | 112,💡 未复现,Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models,"Zhejun Zhang, Peter Karkus, Maximilian Igl, Wenhao Ding, Yuxiao Chen, Boris Ivanovic, Marco Pavone",https://arxiv.org/pdf/2412.05334,2025,CVPR,https://github.com/NVlabs/catk,oral 116 | 113,💡 未复现,CustAny: Customizing Anything from A Single Example,"Lingjie Kong, Kai Wu, Xiaobin Hu, Wenhui Han, Jinlong Peng, Chengming Xu, Donghao Luo, Mengtian Li, Jiangning Zhang, Chengjie Wang, Yanwei Fu",https://arxiv.org/pdf/2406.11643v4,2025,CVPR,https://github.com/LingjieKong-fdu/CustAny,oral 117 | 114,💡 未复现,VGGT:Visual Geometry Grounded Transformer,"Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, David Novotny",https://arxiv.org/pdf/2503.11651,2025,CVPR,https://github.com/facebookresearch/vggt,"oral,Award Candidate" 118 | 115,💡 未复现,Navigation World Models,"Amir Bar, Gaoyue Zhou, Danny Tran, Trevor Darrell, Yann LeCun",https://arxiv.org/pdf/2412.03572,2025,CVPR,https://github.com/facebookresearch/nwm/,oral 119 | 116,💡 未复现,"MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos","Zhengqi Li, Richard Tucker, Forrester Cole, Qianqian Wang, Linyi Jin, Vickie Ye, Angjoo Kanazawa, Aleksander Holynski, Noah Snavely",https://arxiv.org/pdf/2412.04463,2025,CVPR,https://github.com/mega-sam/mega-sam,oral 120 | 117,💡 未复现,FoundationStereo: Zero-Shot Stereo Matching,"Bowen Wen, Matthew Trepte, Joseph Aribido, Jan Kautz, Orazio Gallo, Stan 
Birchfield",https://arxiv.org/pdf/2501.09898,2025,CVPR,https://github.com/NVlabs/FoundationStereo/,"oral,Award Candidate" 121 | 118,💡 未复现,The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition,"Otto Brookes, Maksim Kukushkin, Majid Mirmehdi, Colleen Stephens, Paula Dieguez, Thurston C. Hicks, Sorrel Jones, Kevin Lee, Maureen S. McCarthy, Amelia Meier, Emmanuelle Normand, Erin G. Wessling, Roman M. Wittig, Kevin Langergraber, Klaus Zuberbühler, Lukas Boesch, Thomas Schmid, Mimi Arandjelovic, Hjalmar Kühl, Tilo Burghardt",https://arxiv.org/pdf/2502.21201,2025,CVPR,https://obrookes.github.io/panaf-fgbg.github.io/,"oral,Award Candidate" 122 | 119,💡 未复现,Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing,"Bingliang Zhang, Wenda Chu, Julius Berner, Chenlin Meng, Anima Anandkumar, Yang Song",https://arxiv.org/pdf/2407.01521,2025,CVPR,https://github.com/zhangbingliang2019/DAPS,oral 123 | 120,💡 未复现,MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds,"Zhenggang Tang, Yuchen Fan, Dilin Wang, Hongyu Xu, Rakesh Ranjan, Alexander Schwing, Zhicheng Yan",https://arxiv.org/pdf/2412.06974,2025,CVPR,https://github.com/facebookresearch/mvdust3r,oral 124 | 121,💡 未复现,DIFIX3D+: Improving 3D Reconstructions with Single-Step Diffusion Models,"Jay Zhangjie Wu, Yuxuan Zhang, Haithem Turki, Xuanchi Ren, Jun Gao, Mike Zheng Shou, Sanja Fidler, Zan Gojcic, Huan Ling",https://arxiv.org/pdf/2503.01774,2025,CVPR,https://github.com/nv-tlabs/Difix3D,"oral,Award Candidate" 125 | 122,💡 未复现,DIFFUSIONRENDERER: Neural Inverse and Forward Rendering with Video Diffusion Models,"Ruofan Liang, Zan Gojcic, Huan Ling, Jacob Munkberg, Jon Hasselgren, Zhi-Hao Lin, Jun Gao, Alexander Keller, Nandita Vijaykumar, Sanja Fidler, Zian Wang",https://arxiv.org/pdf/2501.18590,2025,CVPR,https://research.nvidia.com/labs/toronto-ai/DiffusionRenderer/,oral 126 | 123,💡 未复现,OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved 
Image-Text Generation,"Pengfei Zhou, Xiaopeng Peng, Jiajun Song, Chuanhao Li, Zhaopan Xu, Yue Yang, Ziyao Guo, Hao Zhang, Yuqi Lin, Yefei He, Lirui Zhao, Shuo Liu, Tianhua Li, Yuxuan Xie, Xiaojun Chang, Yu Qiao, Wenqi Shao, Kaipeng Zhang",https://arxiv.org/pdf/2411.18499,2025,CVPR,https://github.com/LanceZPF/OpenING,oral 127 | 124,💡 未复现,RandAR: Decoder-only Autoregressive Visual Generation in Random Orders,"Ziqi Pang, Tianyuan Zhang, Fujun Luan, Yunze Man, Hao Tan, Kai Zhang, William T. Freeman, Yu-Xiong Wang",https://arxiv.org/pdf/2412.01827,2025,CVPR,https://github.com/ziqipang/RandAR,oral 128 | 125,💡 未复现,AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea,"Qifan Yu, Wei Chow, Zhongqi Yue, Kaihang Pan, Yang Wu, Xiaoyang Wan, Juncheng Li, Siliang Tang, Hanwang Zhang, Yueting Zhuang",https://arxiv.org/pdf/2411.15738,2025,CVPR,https://github.com/DCDmllm/AnyEdit,oral 129 | 126,💡 未复现,VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection,"Songhao Han, Wei Huang, Hairong Shi, Le Zhuo, Xiu Su, Shifeng Zhang, Xu Zhou, Xiaojuan Qi, Yue Liao, Si Liu",https://arxiv.org/pdf/2411.14794,2025,CVPR,https://github.com/hshjerry/VideoEspresso,oral 130 | 127,💡 未复现,SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images,"Kaiyu Li, Ruixun Liu, Xiangyong Cao, Xueru Bai, Feng Zhou, Deyu Meng, Zhi Wang",https://arxiv.org/pdf/2410.01768,2025,CVPR,https://github.com/likyoo/SegEarth-OV,oral 131 | 128,💡 未复现,Minority-Focused Text-to-Image Generation via Prompt Optimization,"Soobin Um, Jong Chul Ye",https://arxiv.org/pdf/2410.07838,2025,CVPR,https://github.com/soobin-um/MinorityPrompt,oral 132 | 129,💡 未复现,Autoregressive Distillation of Diffusion Transformers,"Yeongmin Kim, Sotiris Anagnostidis, Yuming Du, Edgar Schönfeld, Jonas Kohler, Markos Georgopoulos, Albert Pumarola, Ali Thabet, Artsiom Sanakoyeu",https://arxiv.org/pdf/2504.11295,2025,CVPR,https://github.com/alsdudrla10/ARD,oral 133 
| 130,💡 未复现,Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models,"Matt Deitke, Christopher Clark, Sangho Lee, Rohun Tripathi, Yue Yang, Jae Sung Park, Mohammadreza Salehi, Niklas Muennighoff, Kyle Lo, Luca Soldaini, Jiasen Lu, Taira Anderson, Erin Bransom, Kiana Ehsani, Huong Ngo, YenSung Chen, Ajay Patel, Mark Yatskar, Chris Callison-Burch, Andrew Head, Rose Hendrix, Favyen Bastani, Eli VanderBilt, Nathan Lambert, Yvonne Chou, Arnavi Chheda, Jenna Sparks, Sam Skjonsberg, Michael Schmitz, Aaron Sarnat, Byron Bischoff, Pete Walsh, Chris Newell, Piper Wolters, Tanmay Gupta, Kuo-Hao Zeng, Jon Borchardt, Dirk Groeneveld, Crystal Nam, Sophie Lebrecht, Caitlin Wittlif, Carissa Schoenick, Oscar Michel, Ranjay Krishna, Luca Weihs, Noah A. Smith, Hannaneh Hajishirzi, Ross Girshick, Ali Farhadi, Aniruddha Kembhavi",https://arxiv.org/pdf/2409.17146,2025,CVPR,https://github.com/allenai/molmo,oral 134 | 131,💡 未复现,Continuous 3D Perception Model with Persistent State,"Qianqian Wang, Yifei Zhang, Aleksander Holynski, Alexei A.
Efros, Angjoo Kanazawa",https://arxiv.org/pdf/2501.12387,2025,CVPR,https://github.com/CUT3R/CUT3R,oral 135 | 132,💡 未复现,Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content,"Zicheng Zhang, Tengchuan Kou, Shushi Wang, Chunyi Li, Wei Sun, Wei Wang, Xiaoyu Li, Zongyu Wang, Xuezhi Cao, Xiongkuo Min, Xiaohong Liu, Guangtao Zhai",https://arxiv.org/pdf/2503.02357,2025,CVPR,https://github.com/zzc-1998/Q-Eval,oral 136 | 133,💡 未复现,Video-XL:Extra-Long Vision Language Model for Hour-Scale Video Understanding,"Yan Shu, Zheng Liu, Peitian Zhang, Minghao Qin, Junjie Zhou, Zhengyang Liang, Tiejun Huang, Bo Zhao",https://arxiv.org/pdf/2409.14485,2025,CVPR,https://github.com/VectorSpaceLab/Video-XL,oral 137 | 134,💡 未复现,Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise,"Ryan Burgert, Yuancheng Xu, Wenqi Xian, Oliver Pilarski, Pascal Clausen, Mingming He, Li Ma, Yitong Deng, Lingxiao Li, Mohsen Mousavi, Michael Ryoo, Paul Debevec, Ning Yu",https://arxiv.org/pdf/2501.08331,2025,CVPR,https://github.com/Eyeline-Research/Go-with-the-Flow,oral 138 | 135,💡 未复现,CleanDIFT: Diffusion Features without Noise,"Nick Stracke, Stefan Andreas Baumann, Kolja Bauer, Frank Fundel, Björn Ommer",https://arxiv.org/pdf/2412.03439,2025,CVPR,https://github.com/CompVis/cleandift,oral 139 | 136,💡 未复现,CraftsMan3D: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner,"Weiyu Li, Jiarui Liu, Hongyu Yan, Rui Chen, Yixun Liang, Xuelin Chen, Ping Tan, Xiaoxiao Long",https://arxiv.org/pdf/2405.14979,2025,CVPR,https://github.com/wyysf-98/CraftsMan3D,oral 140 | 137,💡 未复现,DreamRelation: Bridging Customization and Relation Generation,"Qingyu Shi, Lu Qi, Jianzong Wu, Jinbin Bai, Jingbo Wang, Yunhai Tong, Xiangtai Li",https://arxiv.org/pdf/2410.23280,2025,CVPR,https://github.com/Shi-qingyu/DreamRelation,oral 141 | 138,💡 未复现,Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens,"Kaihang Pan, Wang Lin, 
Zhongqi Yue, Tenglong Ao, Liyu Jia, Wei Zhao, Juncheng Li, Siliang Tang, Hanwang Zhang",https://arxiv.org/pdf/2504.14666,2025,CVPR,https://github.com/selftok-team/SelftokTokenizer/,oral 142 | 139,💡 未复现,3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion,"Zhaoxi Chen, Jiaxiang Tang, Yuhao Dong, Ziang Cao, Fangzhou Hong, Yushi Lan, Tengfei Wang, Haozhe Xie, Tong Wu, Shunsuke Saito, Liang Pan, Dahua Lin, Ziwei Liu",https://openaccess.thecvf.com/content/CVPR2025/html/Chen_3DTopia-XL_Scaling_High-quality_3D_Asset_Generation_via_Primitive_Diffusion_CVPR_2025_paper.html,2025,CVPR,https://github.com/3DTopia/3DTopia-XL,highlight 143 | 140,💡 未复现,"AnySat: One Earth Observation Model for Many Resolutions, Scales, and Modalities","Guillaume Astruc, Nicolas Gonthier, Clement Mallet, Loic Landrieu",https://arxiv.org/abs/2412.14123,2025,CVPR,https://github.com/gastruc/AnySat,highlight 144 | 141,💡 未复现,Cross-modal Causal Relation Alignment for Video Question Grounding,"Weixing Chen, Yang Liu, Binglin Chen, Jiandong Su, Yongsen Zheng, Liang Lin",https://arxiv.org/abs/2503.07635,2025,CVPR,https://github.com/WissingChen/CRA-GQA,highlight 145 | 142,💡 未复现,DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos,"Wenbo Hu, Xiangjun Gao, Xiaoyu Li, Sijie Zhao, Xiaodong Cun, Yong Zhang, Long Quan, Ying Shan",https://arxiv.org/abs/2409.02095,2025,CVPR,https://github.com/Tencent/DepthCrafter,highlight 146 | 143,💡 未复现,DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery,"Jiadong Tang, Yu Gao, Dianyi Yang, Liqi Yan, Yufeng Yue, Yi Yang",https://arxiv.org/abs/2503.16964,2025,CVPR,https://github.com/BITyia/DroneSplat,highlight 147 | 144,💡 未复现,Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility,"Yidi Li, Jun Xiao, Zhengda Lu, Yiqun Wang, Haiyong Jiang",https://arxiv.org/abs/2505.21377,2025,CVPR,https://github.com/chenxinl/Dream3DVG,highlight 148 | 145,💡 未复现,Extrapolating 
and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think,"Jie Tian, Xiaoye Qu, Zhenyi Lu, Wei Wei, Sichen Liu, Yu Cheng",https://arxiv.org/abs/2503.00948,2025,CVPR,https://github.com/Chuge0335/EDG,highlight 149 | 146,💡 未复现,ETAP: Event-based Tracking of Any Point,"Friedhelm Hamann, Daniel Gehrig, Filbert Febryanto, Kostas Daniilidis, Guillermo Gallego",https://arxiv.org/abs/2412.00133,2025,CVPR,https://github.com/tub-rip/ETAP,highlight 150 | 147,💡 未复现,Flash3D: Super-scaling Point Transformers through Joint Hardware-Geometry Locality,"Liyan Chen, Gregory P. Meyer, Zaiwei Zhang, Eric M. Wolff, Paul Vernaza",https://arxiv.org/abs/2412.16481,2025,CVPR,https://github.com/liyanc/Flash3DTransformer,highlight 151 | 148,💡 未复现,Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution,"Qihao Liu, Xi Yin, Alan Yuille, Andrew Brown, Mannat Singh",https://arxiv.org/abs/2412.15213,2025,CVPR,https://github.com/qihao067/CrossFlow,highlight 152 | 149,💡 未复现,Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis,"Yu Yuan, Xijun Wang, Yichen Sheng, Prateek Chennuri, Xingguang Zhang, Stanley Chan",https://arxiv.org/abs/2412.02168,2025,CVPR,https://github.com/pandayuanyu/generative-photography,highlight 153 | 150,💡 未复现,Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation,"Hadi Alzayer, Philipp Henzler, Jonathan T. Barron, Jia-Bin Huang, Pratul P. 
Srinivasan, Dor Verbin",https://arxiv.org/abs/2412.15211,2025,CVPR,https://relight-to-reconstruct.github.io/,highlight 154 | 151,💡 未复现,Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction,"Seungtae Nam, Xiangyu Sun, Gyeongjin Kang, Younggeun Lee, Seungjun Oh, Eunbyung Park",https://arxiv.org/abs/2412.06234,2025,CVPR,https://github.com/stnamjef/GenerativeDensification,highlight 155 | 152,💡 未复现,GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control,"Xuanchi Ren, Tianchang Shen, Jiahui Huang, Huan Ling, Yifan Lu, Merlin Nimier-David, Thomas Müller, Alexander Keller, Sanja Fidler, Jun Gao",https://arxiv.org/abs/2503.03751,2025,CVPR,https://github.com/nv-tlabs/GEN3C,highlight 156 | 153,💡 未复现,Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity,"Huaxin Zhang, Xiaohao Xu, Xiang Wang, Jialong Zuo, Xiaonan Huang, Changxin Gao, Shanjun Zhang, Li Yu, Nong Sang",https://openaccess.thecvf.com/content/CVPR2025/html/Zhang_Holmes-VAU_Towards_Long-term_Video_Anomaly_Understanding_at_Any_Granularity_CVPR_2025_paper.html,2025,CVPR,https://github.com/pipixin321/HolmesVAU,highlight 157 | 154,💡 未复现,HELVIPAD: A Real-World Dataset for Omnidirectional Stereo Depth Estimation,"Mehdi Zayene, Jannik Endres, Albias Havolli, Charles Corbière, Salim Cherkaoui, Alexandre Kontouli, Alexandre Alahi",https://arxiv.org/abs/2411.18335,2025,CVPR,https://github.com/vita-epfl/Helvipad,highlight 158 | 155,💡 未复现,ImViD: Immersive Volumetric Videos for Enhanced VR Engagement,"Zhengxian Yang, Shi Pan, Shengqi Wang, Haoxiang Wang, Li Lin, Guanjun Li, Zhengqi Wen, Borong Lin, Jianhua Tao, Tao Yu",https://arxiv.org/abs/2503.14359,2025,CVPR,https://github.com/Metaverse-AI-Lab-THU/ImViD,highlight 159 | 156,💡 未复现,Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning,"Hanxun Yu, Wentong Li, Song Wang, Junbo Chen, Jianke 
Zhu",https://openaccess.thecvf.com/content/CVPR2025/html/Yu_Inst3D-LMM_Instance-Aware_3D_Scene_Understanding_with_Multi-modal_Instruction_Tuning_CVPR_2025_paper.html,2025,CVPR,https://github.com/hanxunyu/Inst3D-LMM,highlight 160 | 157,💡 未复现,Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene,"Shengqiong Wu, Hao Fei, Jingkang Yang, Xiangtai Li, Juncheng Li, Hanwang Zhang, Tat-seng Chua",https://arxiv.org/abs/2503.15019,2025,CVPR,https://github.com/ChocoWu/PSG-4D-LLM,highlight 161 | 158,💡 未复现,Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation,"Xin Zhang, Robby T. Tan",https://arxiv.org/abs/2504.03193,2025,CVPR,https://github.com/devinxzhang/MFuser,highlight 162 | 159,💡 未复现,MammAlps: A Multi-view Video Behavior Monitoring Dataset of Wild Mammals in the Swiss Alps,"Valentin Gabeff, Haozhe Qi, Brendan Flaherty, Gencer Sumbül, Alexander Mathis, Devis Tuia",https://arxiv.org/abs/2503.18223,2025,CVPR,https://github.com/eceo-epfl/MammAlps,highlight 163 | 160,💡 未复现,Matrix3D: Large Photogrammetry Model All-in-One,"Yuanxun Lu, Jingyang Zhang, Tian Fang, Jean-Daniel Nahmias, Yanghai Tsin, Long Quan, Xun Cao, Yao Yao, Shiwei Li",https://arxiv.org/abs/2502.07685,2025,CVPR,https://github.com/apple/ml-matrix3d,highlight 164 | 161,💡 未复现,MITracker: Multi-View Integration for Visual Object Tracking,"Mengjie Xu, Yitao Zhu, Haotian Jiang, Jiaming Li, Zhenrong Shen, Sheng Wang, Haolin Huang, Xinyu Wang, Qing Yang, Han Zhang, Qian Wang",https://arxiv.org/abs/2502.20111,2025,CVPR,https://github.com/XuM007/MITracker,highlight 165 | 162,💡 未复现,Open-Canopy: Towards Very High Resolution Forest Monitoring,"Fajwel Fogel, Yohann Perron, Nikola Besic, Laurent Saint-André, Agnès Pellissier-Tanon, Martin Schwartz, Thomas Boudras, Ibrahim Fayad, Alexandre d'Aspremont, Loic Landrieu, Philippe 
Ciais",https://openaccess.thecvf.com/content/CVPR2025/html/Fogel_Open-Canopy_Towards_Very_High_Resolution_Forest_Monitoring_CVPR_2025_paper.html,2025,CVPR,https://github.com/fajwel/Open-Canopy,highlight 166 | 163,💡 未复现,OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation,"Hui Li, Mingwang Xu, Yun Zhan, Shan Mu, Jiaye Li, Kaihui Cheng, Yuxuan Chen, Tan Chen, Mao Ye, Jingdong Wang, Siyu Zhu",https://arxiv.org/abs/2412.00115,2025,CVPR,https://github.com/fudan-generative-vision/OpenHumanVid,highlight 167 | 164,💡 未复现,OpticalNet: An Optical Imaging Dataset and Benchmark Beyond the Diffraction Limit,"Benquan Wang, Ruyi An, Jin-Kyu So, Sergei Kurdiumov, Eng Aik Chan, Giorgio Adamo, Yuhan Peng, Yewen Li, Bo An",https://openaccess.thecvf.com/content/CVPR2025/html/Wang_OpticalNet_An_Optical_Imaging_Dataset_and_Benchmark_Beyond_the_Diffraction_CVPR_2025_paper.html,2025,CVPR,https://github.com/Deep-See/OpticalNet,highlight 168 | 165,💡 未复现,Optimizing for the Shortest Path in Denoising Diffusion Model,"Ping Chen, Xingpeng Zhang, Zhaoxiang Liu, Huan Hu, Xiang Liu, Kai Wang, Min Wang, Yanlin Qian, Shiguo Lian",https://arxiv.org/abs/2503.03265,2025,CVPR,https://github.com/UnicomAI/ShortDF,highlight 169 | 166,💡 未复现,Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval,"Yuanmin Tang, Xiaoting Qin, Jue Zhang, Jing Yu, Gaopeng Gou, Gang Xiong, Qingwei Ling, Saravan Rajmohan, Dongmei Zhang, Qi Wu",https://arxiv.org/abs/2412.11077,2025,CVPR,https://github.com/Pter61/osrcir,highlight 170 | 167,💡 未复现,SmartCLIP: Modular Vision-language Alignment with Identification Guarantees,"Shaoan Xie, Lingjing Lingjing, Yujia Zheng, Yu Yao, Zeyu Tang, Eric P. 
Xing, Guangyi Chen, Kun Zhang",https://openaccess.thecvf.com/content/CVPR2025/html/Xie_SmartCLIP_Modular_Vision-language_Alignment_with_Identification_Guarantees_CVPR_2025_paper.html,2025,CVPR,https://openaccess.thecvf.com/content/CVPR2025/html/Xie_SmartCLIP_Modular_Vision-language_Alignment_with_Identification_Guarantees_CVPR_2025_paper.html,highlight 171 | 168,💡 未复现,Structured 3D Latents for Scalable and Versatile 3D Generation,"Jianfeng Xiang, Zelong Lv, Sicheng Xu, Yu Deng, Ruicheng Wang, Bowen Zhang, Dong Chen, Xin Tong, Jiaolong Yang",https://arxiv.org/abs/2412.01506,2025,CVPR,https://github.com/microsoft/TRELLIS,highlight 172 | 169,💡 未复现,StyleSSP: Sampling StartPoint Enhancement for Training-free Diffusion-based Method for Style Transfer,"Ruojun Xu, Weijie Xi, Xiaodi Wang, Yongbo Mao, Zach Cheng",https://arxiv.org/abs/2501.11319,2025,CVPR,https://github.com/bytedance/StyleSSP,highlight 173 | 170,💡 未复现,Towards Autonomous Micromobility through Scalable Urban Simulation,"Wayne Wu, Honglin He, Chaoyuan Zhang, Jack He, Seth Z. 
Zhao, Ran Gong, Quanyi Li, Bolei Zhou",https://arxiv.org/abs/2505.00690,2025,CVPR,https://github.com/metadriverse/urban-sim,highlight 174 | 171,💡 未复现,UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models,"Yuning Han, Bingyin Zhao, Rui Chu, Feng Luo, Biplab Sikdar, Yingjie Lao",https://arxiv.org/abs/2412.11441,2025,CVPR,https://github.com/TheLaoLab/UIBDiffusion,highlight 175 | 172,💡 未复现,UCOD-DPL: Unsupervised Camouflaged Object Detection via Dynamic Pseudo-label Learning,"Weiqi Yan, Lvhai Chen, Huaijia Kou, Shengchuan Zhang, Yan Zhang, Liujuan Cao",https://openaccess.thecvf.com/content/CVPR2025/html/Yan_UCOD-DPL_Unsupervised_Camouflaged_Object_Detection_via_Dynamic_Pseudo-label_Learning_CVPR_2025_paper.html,2025,CVPR,https://github.com/Heartfirey/UCOD-DPL,highlight 176 | 173,💡 未复现,Video Depth Anything: Consistent Depth Estimation for Super-Long Videos,"Sili Chen, Hengkai Guo, Shengnan Zhu, Feihu Zhang, Zilong Huang, Jiashi Feng, Bingyi Kang",https://arxiv.org/abs/2501.12375,2025,CVPR,https://github.com/DepthAnything/Video-Depth-Anything,highlight 177 | 174,💡 未复现,World-consistent Video Diffusion with Explicit 3D Modeling,"Qihang Zhang, Shuangfei Zhai, Miguel Ángel Bautista Martin, Kevin Miao, Alexander Toshev, Joshua Susskind, Jiatao Gu",https://openaccess.thecvf.com/content/CVPR2025/html/Zhang_World-consistent_Video_Diffusion_with_Explicit_3D_Modeling_CVPR_2025_paper.html,2025,CVPR,https://zqh0253.github.io/wvd/,highlight 178 | 175,💡 未复现,Your ViT is Secretly an Image Segmentation Model,"Tommie Kerssies, Niccolò Cavagnero, Alexander Hermans, Narges Norouzi, Giuseppe Averta, Bastian Leibe, Gijs Dubbelman, Daan de Geus",https://openaccess.thecvf.com/content/CVPR2025/html/Kerssies_Your_ViT_is_Secretly_an_Image_Segmentation_Model_CVPR_2025_paper.html,2025,CVPR,https://github.com/tue-mps/EoMT,highlight 179 | 176,💡 未复现,WonderWorld: Interactive 3D Scene Generation from a Single Image,"Hong-Xing Yu, Haoyi Duan, Charles Herrmann, William T. 
Freeman, Jiajun Wu",https://arxiv.org/abs/2406.09394,2025,CVPR,https://github.com/KovenYu/WonderWorld,highlight 180 | 177,💡 未复现,Relightable Gaussian Codec Avatars,"Shunsuke Saito, Gabriel Schwartz, Tomas Simon, Junxuan Li, Giljoo Nam",https://arxiv.org/pdf/2312.03704,2024,CVPR,https://github.com/facebookresearch/goliath,oral 181 | 178,💡 未复现,Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following,"Yutong Feng, Biao Gong, Di Chen, Yujun Shen, Yu Liu, Jingren Zhou",https://arxiv.org/pdf/2311.17002,2024,CVPR,https://github.com/ali-vilab/Ranni,oral 182 | 179,💡 未复现,Rethinking Inductive Biases for Surface Normal Estimation,"Gwangbin Bae, Andrew J. Davison",https://arxiv.org/pdf/2403.00712,2024,CVPR,https://github.com/baegwangbin/DSINE,oral 183 | 180,💡 未复现,PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness,"Anh-Quan Cao, Angela Dai, Raoul de Charette",https://arxiv.org/abs/2312.02158,2024,CVPR,https://github.com/astra-vision/PaSCo,oral 184 | 181,💡 未复现,Transcriptomics-guided Slide Representation Learning in Computational Pathology,"Guillaume Jaume, Lukas Oldenburg, Anurag Vaidya, Richard J. Chen, Drew F. K. Williamson, Thomas Peeters, Andrew H. 
Song, Faisal Mahmood",https://arxiv.org/pdf/2405.11618,2024,CVPR,https://github.com/mahmoodlab/TANGLE,oral 185 | 182,💡 未复现,DiffusionLight: Light Probes for Free by Painting a Chrome Ball,"Pakkapon Phongthawee, Worameth Chinchuthakun, Nontaphat Sinsunthithet, Amit Raj, Varun Jampani, Pramook Khungurn, Supasorn Suwajanakorn",https://arxiv.org/pdf/2312.09168,2024,CVPR,https://github.com/DiffusionLight/DiffusionLight?tab=readme-ov-file,oral 186 | 183,💡 未复现,URHand: Universal Relightable Hands,"Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito",https://arxiv.org/pdf/2401.05334,2024,CVPR,https://github.com/facebookresearch/goliath,oral 187 | 184,💡 未复现,Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation,"Bingxin Ke, Anton Obukhov, Shengyu Huang, Nando Metzger, Rodrigo Caye Daudt, Konrad Schindler",https://arxiv.org/pdf/2312.02145,2024,CVPR,https://github.com/prs-eth/marigold,oral 188 | 185,💡 未复现,PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics,"Tianyi Xie, Zeshun Zong, Yuxing Qiu, Xuan Li, Yutao Feng, Yin Yang, Chenfanfu Jiang",https://arxiv.org/pdf/2311.12198,2024,CVPR,https://github.com/XPandora/PhysGaussian,oral 189 | 186,💡 未复现,HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting,"Xian Liu, Xiaohang Zhan, Jiaxiang Tang, Ying Shan, Gang Zeng, Dahua Lin, Xihui Liu, Ziwei Liu",https://arxiv.org/pdf/2311.17061,2024,CVPR,https://github.com/alvinliu0/HumanGaussian,oral 190 | 187,💡 未复现,Prompt Highlighter: Interactive Control for Multi-Modal LLMs,"Yuechen Zhang, Shengju Qian, Bohao Peng, Shu Liu, Jiaya Jia",https://arxiv.org/pdf/2312.04302,2024,CVPR,https://github.com/dvlab-research/Prompt-Highlighter/,oral 191 | 188,💡 未复现,Cooperation Does Matter: 
Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation,"Qi Yang, Xing Nie, Tong Li, Pengfei Gao, Ying Guo, Cheng Zhen, Pengfei Yan, Shiming Xiang",https://arxiv.org/pdf/2312.06462,2024,CVPR,https://arxiv.org/pdf/2312.06462,oral 192 | 189,💡 未复现,Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution,"Shangchen Zhou, Peiqing Yang, Jianyi Wang, Yihang Luo, Chen Change Loy",https://arxiv.org/pdf/2312.06640,2024,CVPR,https://github.com/sczhou/Upscale-A-Video,oral 193 | 190,💡 未复现,Putting the Object Back into Video Object Segmentation,"Ho Kei Cheng, Seoung Wug Oh, Brian Price, Joon-Young Lee, Alexander Schwing",https://arxiv.org/pdf/2310.12982,2024,CVPR,https://github.com/hkchengrex/Cutie,oral 194 | 191,💡 未复现,InstanceDiffusion: Instance-level Control for Image Generation,"Xudong Wang, Trevor Darrell, Sai Saketh Rambhatla, Rohit Girdhar, Ishan Misra",https://arxiv.org/pdf/2402.03290,2024,CVPR,https://github.com/frank-xwang/InstanceDiffusion,oral 195 | 192,💡 未复现,OMG-Seg: Is One Model Good Enough For All Segmentation?,"Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, Wenwei Zhang, Yining Li, Kai Chen, Chen Change Loy",https://arxiv.org/pdf/2401.10229,2024,CVPR,https://github.com/lxtGH/OMG-Seg,oral 196 | 193,💡 未复现,Towards Language-Driven Video Inpainting via Multimodal Large Language Models,"Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Ziwei Liu, Chen Change Loy",https://arxiv.org/pdf/2401.10226,2024,CVPR,https://github.com/jianzongwu/Language-Driven-Video-Inpainting,oral 197 | 194,💡 未复现,VBench: Comprehensive Benchmark Suite for Video Generative Models,"Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei 
Liu",https://arxiv.org/pdf/2311.17982,2024,CVPR,https://github.com/Vchitect/VBench,oral 198 | 195,💡 未复现,PIGEON: Predicting Image Geolocations,"Lukas Haas, Michal Skreta, Silas Alberti, Chelsea Finn",https://arxiv.org/pdf/2307.05845,2024,CVPR,https://github.com/LukasHaas/PIGEON,oral 199 | 196,💡 未复现,DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis,"Yuming Gu, You Xie, Hongyi Xu, Guoxian Song, Yichun Shi, Di Chang, Jing Yang, Linjie Luo",https://arxiv.org/pdf/2312.13016,2024,CVPR,https://github.com/FreedomGu/DiffPortrait3D/,highlight 200 | 197,💡 未复现,Domain Prompt Learning with Quaternion Networks,"Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang",https://arxiv.org/abs/2312.08878,2024,CVPR,https://github.com/caoql98/DPLQ,highlight 201 | 198,💡 未复现,DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing,"Yujun Shi, Chuhui Xue, Jun Hao Liew, Jiachun Pan, Hanshu Yan, Wenqing Zhang, Vincent Y. F. Tan, Song Bai",https://arxiv.org/pdf/2306.14435,2024,CVPR,https://github.com/Yujun-Shi/DragDiffusion,highlight 202 | 199,💡 未复现,Fast ODE-based Sampling for Diffusion Models in Around 5 Steps,"Zhenyu Zhou, Defang Chen, Can Wang, Chun Chen",https://arxiv.org/pdf/2312.00094,2024,CVPR,https://github.com/zju-pi/diff-sampler,highlight 203 | 200,💡 未复现,FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis,"Feng Liang, Bichen Wu, Jialiang Wang, Licheng Yu, Kunpeng Li, Yinan Zhao, Ishan Misra, Jia-Bin Huang, Peizhao Zhang, Peter Vajda, Diana Marculescu",https://arxiv.org/pdf/2312.17681,2024,CVPR,https://jeff-liangf.github.io/projects/flowvid/,highlight 204 | 201,💡 未复现,General Object Foundation Model for Images and Videos at Scale,"Junfeng Wu, Yi Jiang, Qihao Liu, Zehuan Yuan, Xiang Bai, Song Bai",https://arxiv.org/pdf/2312.09158,2024,CVPR,https://github.com/FoundationVision/GLEE,highlight 205 | 202,💡 未复现,Insect-Foundation: A Foundation Model and Large-scale 1M Dataset for Visual Insect 
Understanding,"Hoang-Quan Nguyen, Thanh-Dat Truong, Xuan Bac Nguyen, Ashley Dowling, Xin Li, Khoa Luu",https://openaccess.thecvf.com/content/CVPR2024/html/Nguyen_Insect-Foundation_A_Foundation_Model_and_Large-scale_1M_Dataset_for_Visual_CVPR_2024_paper.html,2024,CVPR,https://uark-cviu.github.io/projects/insect-foundation/,highlight 206 | 203,💡 未复现,"Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance","Zan Wang, Yixin Chen, Baoxiong Jia, Puhao Li, Jinlu Zhang, Jingze Zhang, Tengyu Liu, Yixin Zhu, Wei Liang, Siyuan Huang",https://arxiv.org/abs/2403.18036,2024,CVPR,https://github.com/afford-motion/afford-motion,highlight 207 | 204,💡 未复现,Object Recognition as Next Token Prediction,"Kaiyu Yue, Bor-Chun Chen, Jonas Geiping, Hengduo Li, Tom Goldstein, Ser-Nam Lim",https://arxiv.org/abs/2312.02142,2024,CVPR,https://github.com/KaiyuYue/nxtp,highlight 208 | 205,💡 未复现,RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models,"Ozgur Kara, Bariscan Kurtkaya, Hidir Yesiltepe, James M. Rehg, Pinar Yanardag",https://arxiv.org/abs/2312.04524,2024,CVPR,https://github.com/rehglab/RAVE,highlight 209 | 206,💡 未复现,Readout Guidance: Learning Control from Diffusion Features,"Grace Luo, Trevor Darrell, Oliver Wang, Dan B Goldman, Aleksander Holynski",https://arxiv.org/abs/2312.02150,2024,CVPR,https://github.com/google-research/readout_guidance,highlight 210 | 207,💡 未复现,Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark,"Ziyang Chen, Israel D.
Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander Richard",https://arxiv.org/abs/2403.18821,2024,CVPR,https://github.com/facebookresearch/real-acoustic-fields,highlight 211 | 208,💡 未复现,RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D,"Lingteng Qiu, Guanying Chen, Xiaodong Gu, Qi Zuo, Mutian Xu, Yushuang Wu, Weihao Yuan, Zilong Dong, Liefeng Bo, Xiaoguang Han",https://arxiv.org/abs/2311.16918,2024,CVPR,https://github.com/modelscope/RichDreamer,highlight 212 | 209,💡 未复现,RobustSAM: Segment Anything Robustly on Degraded Images,"Wei-Ting Chen, Yu-Jiet Vong, Sy-Yen Kuo, Sizhuo Ma, Jian Wang",https://arxiv.org/abs/2406.09627,2024,CVPR,https://github.com/robustsam/RobustSAM,highlight 213 | 210,💡 未复现,Scaling Up Dynamic Human-Scene Interaction Modeling,"Nan Jiang, Zhiyuan Zhang, Hongjie Li, Xiaoxuan Ma, Zan Wang, Yixin Chen, Tengyu Liu, Yixin Zhu, Siyuan Huang",https://arxiv.org/abs/2403.08629,2024,CVPR,https://github.com/jnnan/trumans_utils,highlight 214 | 211,💡 未复现,SDDGR: Stable Diffusion-based Deep Generative Replay for Class Incremental Object Detection,"Junsu Kim, Hoseong Cho, Jihyeon Kim, Yihalem Yimolal Tiruneh, Seungryul Baek",https://arxiv.org/abs/2402.17323,2024,CVPR,/,highlight 215 | 212,💡 未复现,SpatialTracker: Tracking Any 2D Pixels in 3D Space,"Yuxi Xiao, Qianqian Wang, Shangzhan Zhang, Nan Xue, Sida Peng, Yujun Shen, Xiaowei Zhou",https://arxiv.org/abs/2404.04319,2024,CVPR,https://github.com/henry123-boy/SpaTracker,highlight 216 | 213,💡 未复现,TFMQ-DM:Temporal Feature Maintenance Quantization for Diffusion Models,"Yushi Huang, Ruihao Gong, Jing Liu, Tianlong Chen, Xianglong Liu",https://openaccess.thecvf.com/content/CVPR2024/html/Huang_TFMQ-DM_Temporal_Feature_Maintenance_Quantization_for_Diffusion_Models_CVPR_2024_paper.html,2024,CVPR,https://github.com/ModelTC/TFMQ-DM,highlight 217 | 214,💡 未复现,Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval,"Jiamian Wang, Guohao Sun, 
Pichao Wang, Dongfang Liu, Sohail Dianat, Majid Rabbani, Raghuveer Rao, Zhiqiang Tao",https://arxiv.org/abs/2403.17998,2024,CVPR,https://github.com/patrick-0817/T-MASS-text-video-retrieval,highlight 218 | 215,💡 未复现,Towards Learning a Generalist Model for Embodied Navigation,"Duo Zheng, Shijia Huang, Lin Zhao, Yiwu Zhong, Liwei Wang",https://arxiv.org/abs/2312.02010,2024,CVPR,https://github.com/LaVi-Lab/NaviLLM,highlight 219 | 216,💡 未复现,UniMODE: Unified Monocular 3D Object Detection,"Zhuoling Li, Xiaogang Xu, SerNam Lim, Hengshuang Zhao",https://arxiv.org/abs/2402.18573,2024,CVPR,https://github.com/Lizhuoling/UniMODE,highlight 220 | 217,💡 未复现,Unsupervised Keypoints from Pretrained Diffusion Models,"Eric Hedlin, Gopal Sharma, Shweta Mahajan, Xingzhe He, Hossam Isack, Abhishek Kar, Helge Rhodin, Andrea Tagliasacchi, Kwang Moo Yi",https://arxiv.org/abs/2312.00065,2024,CVPR,https://github.com/ubc-vision/StableKeypoints,highlight 221 | 218,💡 未复现,VA3: Virtually Assured Amplification Attack on Probabilistic Copyright Protection for Text-to-Image Generative Models,"Xiang Li, Qianli Shen, Kenji Kawaguchi",https://arxiv.org/abs/2312.00057,2024,CVPR,https://github.com/South7X/VA3,highlight 222 | 219,💡 未复现,VecFusion: Vector Font Generation with Diffusion,"Vikas Thamizharasan, Difan Liu, Shantanu Agarwal, Matthew Fisher, Michael Gharbi, Oliver Wang, Alec Jacobson, Evangelos Kalogerakis",https://arxiv.org/abs/2312.10540,2024,CVPR,https://vikastmz.github.io/VecFusion/,highlight 223 | 220,💡 未复现,Wonder3D: Single Image to 3D using Cross-Domain Diffusion,"Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, Yuan Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, Wenping Wang",https://arxiv.org/abs/2310.15008,2024,CVPR,https://github.com/xxlong0/Wonder3D,highlight 224 | 221,💡 未复现,VTimeLLM: Empower LLM to Grasp Video Moments,"Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu
Zhu",https://arxiv.org/abs/2311.18445,2024,CVPR,https://github.com/huangb23/VTimeLLM,highlight 225 | 222,💡 未复现,LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds,"Lingteng Qiu, Xiaodong Gu, Peihao Li, Qi Zuo, Weichao Shen, Junfei Zhang, Kejie Qiu, Weihao Yuan, Guanying Chen, Zilong Dong, Liefeng Bo",https://arxiv.org/pdf/2503.10625v1,2025,ICCV,https://github.com/aigc3d/LHM?tab=readme-ov-file,poster 226 | 223,💡 未复现,EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer,"Yuxuan Zhang, Yirui Yuan, Yiren Song, Haofan Wang, Jiaming Liu",https://arxiv.org/abs/2503.07027,2025,ICCV,https://github.com/Xiaojiu-z/EasyControl,poster 227 | 224,💡 未复现,MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images,"Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, Jianfei Cai",https://arxiv.org/pdf/2403.14627,2024,ECCV,https://github.com/donydchen/mvsplat,Oral 228 | 225,💡 未复现,FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting,"Zehao Zhu, Zhiwen Fan, Yifan Jiang, Zhangyang Wang",https://arxiv.org/abs/2312.00451,2024,ECCV,https://github.com/VITA-Group/FSGS,poster 229 | 226,💡 未复现,ZigMa: A DiT-style Zigzag Mamba Diffusion Model,"Vincent Tao Hu, Stefan Andreas Baumann, Ming Gui, Olga Grebenkova, Pingchuan Ma, Johannes Schusterbauer, Björn Ommer",https://arxiv.org/pdf/2403.13802,2024,ECCV,https://github.com/CompVis/zigma,poster 230 | 227,💡 未复现,Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors,"Tongkun Guan, Wei Shen, Xue Yang, Xuehui Wang, Xiaokang Yang",https://arxiv.org/pdf/2312.05286,2024,ECCV,https://github.com/SJTU-DeepVisionLab/FreeReal,poster 232 | 228,💡 未复现,PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer,"Tongkun Guan, Chengyu Lin, Wei Shen, Xiaokang Yang",https://arxiv.org/pdf/2407.07764,2024,ECCV,https://github.com/SJTU-DeepVisionLab/PosFormer,poster 233 | 229,💡 未复现,Fully
Sparse 3D Occupancy Prediction,"Haisong Liu, Yang Chen, Haiguang Wang, Zetong Yang, Tianyu Li, Jia Zeng, Li Chen, Hongyang Li, Limin Wang",https://arxiv.org/abs/2312.17118,2024,ECCV,https://github.com/MCG-NJU/SparseOcc,poster 234 | 230,💡 未复现,NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields,"Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini, Adrien Gaidon, Zsolt Kira, Rares Ambrus",https://arxiv.org/abs/2404.01300,2024,ECCV,https://github.com/zubair-irshad/NeRF-MAE,Poster 235 | 231,💡 未复现,ControlCap: Controllable Region-level Captioning,"Yuzhong Zhao, Yue Liu, Zonghao Guo, Weijia Wu, Chen Gong, Fang Wan, Qixiang Ye",https://arxiv.org/abs/2401.17910,2024,ECCV,https://github.com/callsys/ControlCap,Poster 236 | 232,💡 未复现,GiT: Towards Generalist Vision Transformer through Universal Language Interface,"Haiyang Wang, Hao Tang, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, Liwei Wang",https://arxiv.org/abs/2403.09394,2024,ECCV,https://github.com/Haiyang-W/GiT,Oral 237 | 233,💡 未复现,Relation DETR: Exploring Explicit Position Relation Prior for Object Detection,"Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan",https://arxiv.org/abs/2407.11699v1,2024,ECCV,https://github.com/xiuqhou/Relation-DETR,Oral 238 | 234,💡 未复现,FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification,"Yu Tian, Congcong Wen, Min Shi, Muhammad Muneeb Afzal, Hao Huang, Muhammad Osama Khan, Yan Luo, Yi Fang, Mengyu Wang",https://arxiv.org/abs/2407.08813,2024,ECCV,https://github.com/Harvard-Ophthalmology-AI-Lab/FairDomain,poster 239 | 235,💡 未复现,OneRestore: A Universal Restoration Framework for Composite Degradation,"Yu Guo, Yuan Gao, Yuxu Lu, Huilin Zhu, Ryan Wen Liu, Shengfeng He",https://arxiv.org/abs/2407.04621,2024,ECCV,https://github.com/gy65896/OneRestore,poster 240 | 236,💡 未复现,VideoStudio: Generating Consistent-Content and Multi-Scene Videos,"Fuchen 
Long, Zhaofan Qiu, Ting Yao, Tao Mei",https://arxiv.org/pdf/2401.01256,2024,ECCV,https://github.com/FuchenUSTC/VideoStudio,poster 241 | 237,💡 未复现,Zero-shot Object Counting with Good Exemplars,"Huilin Zhu, Jingling Yuan, Zhengwei Yang, Yu Guo, Zheng Wang, Xian Zhong, Shengfeng He",https://arxiv.org/abs/2407.04948,2024,ECCV,https://github.com/HopooLinZ/VA-Count,poster 242 | 238,💡 未复现,SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments,"Niklas Gard, Anna Hilsmann, Peter Eisert",https://arxiv.org/abs/2404.10527,2024,ECCV,https://github.com/fraunhoferhhi/spvloc,Oral 243 | 239,💡 未复现,Stereo Any Video: Temporally Consistent Stereo Matching,Junpeng Jing;Weixun Luo;Ye Mao; Krystian Mikolajczyk,https://arxiv.org/html/2503.05549v1#S4,2025,ICCV,https://github.com/TomTomTommi/stereoanyvideo,highlight 244 | 240,💡 未复现,InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity,"Liming Jiang, Qing Yan, Yumin Jia, Zichuan Liu, Hao Kang, Xin Lu",https://arxiv.org/pdf/2503.16418,2025,ICCV,https://github.com/bytedance/InfiniteYou,highlight 245 | 241,💡 未复现,MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization,"Yiwen Chen, Yikai Wang, Yihao Luo, Zhengyi Wang, Zilong Chen, Jun Zhu, Chi Zhang, Guosheng Lin",https://arxiv.org/abs/2408.02555,2025,ICCV,https://github.com/buaacyw/MeshAnythingV2,poster 246 | 242,💡 未复现,From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers,"Jiacheng Liu, Chang Zou, Yuanhuiyi Lyu, Junjie Chen, Linfeng Zhang",https://arxiv.org/abs/2503.06923,2025,ICCV,https://github.com/Shenyi-Z/TaylorSeer,poster 247 | 243,💡 未复现,MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion,"Shuyuan Tu, Qi Dai, Zihao Zhang, Sicheng Xie, Zhi-Qi Cheng, Chong Luo, Xintong Han, Zuxuan Wu, Yu-Gang Jiang",https://arxiv.org/abs/2405.20325,2025,ICCV,https://github.com/Francis-Rings/MotionFollower?tab=readme-ov-file,poster 248 | 244,💡 未复现,AnimateAnyMesh: A Feed-Forward 4D 
Foundation Model for Text-Driven Universal Mesh Animation,"Zijie Wu, Chaohui Yu, Fan Wang, Xiang Bai",https://arxiv.org/abs/2506.09982,2025,ICCV,https://github.com/JarrentWu1031/AnimateAnyMesh,poster 249 | 245,💡 未复现,VSSD: Vision Mamba with Non-Causal State Space Duality,"Yuheng Shi, Minjing Dong, Mingjia Li, Chang Xu",https://arxiv.org/abs/2407.18559,2025,ICCV,https://github.com/YuHengsss/VSSD?tab=readme-ov-file,poster 250 | 246,💡 未复现,FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model,Yukang Cao、Chenyang Si、Jinghao Wang、Ziwei Liu,https://arxiv.org/html/2507.01953v1#S4,2025,ICCV,https://github.com/yukangcao/FreeMorph,poster 251 | 247,💡 未复现,GENMO: A GENeralist Model for Human MOtion,"Jiefeng Li, Jinkun Cao, Haotian Zhang, Davis Rempe, Jan Kautz, Umar Iqbal, Ye Yuan 252 | ",https://arxiv.org/abs/2505.01425,2025,ICCV,https://research.nvidia.com/labs/dair/genmo/,poster 253 | 248,💡 未复现,Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers,"Weiming Ren, Wentao Ma, Huan Yang, Cong Wei, Ge Zhang, Wenhu Chen",https://arxiv.org/abs/2503.11579,2025,ICCV,https://github.com/TIGER-AI-Lab/Vamba,poster 254 | 249,💡 未复现,Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning,"Yijun Yang, Zhao-Yang Wang, Qiuping Liu, Shuwen Sun, Kang Wang, Rama Chellappa, Zongwei Zhou, Alan Yuille, Lei Zhu, Yu-Dong Zhang, Jieneng Chen",https://arxiv.org/abs/2506.02327,2025,ICCV,https://github.com/scott-yjyang/MeWM,poster 255 | 250,💡 未复现,"Where, What, Why: Towards Explainable Driver Attention Prediction","Yuchen Zhou, Jiayu Tang, Xiaoyan Xiao, Yueyao Lin, Linkai Liu, Zipeng Guo, Hao Fei, Xiaobo Xia, Chao Gou",https://arxiv.org/pdf/2506.23088,2025,ICCV,https://github.com/yuchen2199/Explainable-Driver-Attention-Prediction,poster 256 | 251,💡 未复现,BirdCollect: A Comprehensive Benchmark for Analyzing Dense Bird Flock Attributes,"Kshitiz, Sonu Shreshtha, Bikash Dutta, Muskan Dosi, Mayank Vatsa, Richa Singh, Saket Anand, Sudeep Sarkar, Sevaram 
Mali Parihar 257 | ",https://par.nsf.gov/servlets/purl/10545625,2024,AAAI,," 258 | Technical" 259 | 252,💡 未复现,MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records,"Scott L. Fleming, Alejandro Lozano, William J. Haberkorn, Jenelle A. Jindal, Eduardo Reis, Rahul Thapa, Louis Blankemeier, Julian Z. Genkins, Ethan Steinberg, Ashwin Nayak, Birju Patel, Chia-Chun Chiang, Alison Callahan, Zepeng Huo, Sergios Gatidis, Scott Adams, Oluseyi Fayanju, Shreya J. Shah, Thomas Savage, Ethan Goh, Akshay S. Chaudhari, Nima Aghaeepour, Christopher Sharp, Michael A. Pfeffer, Percy Liang, Jonathan H. Chen, Keith E. Morse, Emma P. Brunskill, Jason A. Fries, Nigam H. Shah",https://arxiv.org/pdf/2308.14089,2023,AAAI,https://github.com/som-shahlab/medalign,Technical 260 | 253,💡 未复现,"Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media","Liam Hebert, Gaurav Sahu, Yuxuan Guo, Nanda Kishore Sreenivas, Lukasz Golab, Robin Cohen",https://arxiv.org/pdf/2307.09312,2024,AAAI,https://github.com/liamhebert/MultiModalDiscussionTransformer,Technical 261 | 254,💡 未复现,Quantile-Regression-Ensemble: A Deep Learning Algorithm for Downscaling Extreme Precipitation,"Thomas Bailie, Yun Sing Koh, Neelesh Rampal, Peter B. 
Gibson 262 | ",https://ojs.aaai.org/index.php/AAAI/article/view/30193,2024,AAAI,,Technical 263 | 255,💡 未复现,Spatial-Logic-Aware Weakly Supervised Learning for Flood Mapping on Earth Imagery,"Zelin Xu, Tingsong Xiao, Wenchong He, Yu Wang, Zhe Jiang, Shigang Chen, Yiqun Xie, Xiaowei Jia, Da Yan, Yang Zhou",https://ojs.aaai.org/index.php/AAAI/article/view/30253,2024,AAAI,https://github.com/spatialdatasciencegroup/SLWSL,Technical 264 | 256,💡 未复现,Vector Field Oriented Diffusion Model for Crystal Material Generation,"Astrid Klipfel, Yael Fregier , Adlane Sayede, Zied Bouraoui 265 | 266 | ",https://arxiv.org/pdf/2401.05402,2023,AAAI,https://github.com/WanyuGroup/AI-for-Crystal-Materials,Technical 267 | 257,💡 未复现,Periodic Graph Transformers for Crystal Material Property Prediction,"Keqiang Yan,Yi Liu,Yuchao Lin,Shuiwang Ji",https://arxiv.org/pdf/2209.11807,2022,AAAI,https://github.com/WanyuGroup/AI-for-Crystal-Materials,Technical 268 | 258,💡 未复现,ContraNovo: A Contrastive Learning Approach to Enhance De Novo Peptide Sequencing,"Zhi Jin, Sheng Xu, Xiang Zhang, Tianze Ling, Nanqing Dong, Wanli Ouyang, Zhiqiang Gao, Cheng Chang, Siqi Sun",https://arxiv.org/pdf/2312.11584,2023,AAAI,https://github.com/BEAM-Labs/ContraNovo,Technical 269 | 259,💡 未复现,Dual-Channel Learning Framework for Drug-Drug Interaction Prediction via Relation-Aware Heterogeneous Graph Transformer,"Xiaorui Su, Pengwei Hu, Zhu-Hong You, Philip S. 
Yu, Lun Hu 270 | 271 | ",https://ojs.aaai.org/index.php/AAAI/article/view/27777,2024,AAAI,https://github.com/Blair1213/TIGER,Technical 272 | 260,💡 未复现,Explore 3D Dance Generation via Reward Model from Automatically-Ranked Demonstrations,"Zilin Wang, Haolin Zhuang, Lu Li, Yinmin Zhang, Junjie Zhong, Jun Chen, Yu Yang, Boshi Tang, Zhiyong Wu 273 | ",https://arxiv.org/pdf/2312.11442,2023,AAAI,https://github.com/liutaocode/talking-face-arxiv-daily,Technical 274 | 261,💡 未复现,Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables,"Haisong Gong, Weizhi Xu, Shu Wu, Qiang Liu, Liang Wang 275 | 276 | ",https://arxiv.org/pdf/2402.13028,2024,AAAI,https://github.com/Deno-V/HeterFC,Technical 277 | 262,💡 未复现,PosDiffNet: Positional Neural Diffusion for Point Cloud Registration in a Large Field of View with Perturbations,"Rui She, Sijie Wang, Qiyu Kang, Kai Zhao, Yang Song, Wee Peng Tay, Tianyu Geng",https://arxiv.org/pdf/2401.03167,2024,AAAI,https://github.com/AI-IT-AVs/PosDiffNet,Technical 278 | 263,💡 未复现,SeGA: Preference-Aware Self-Contrastive Learning with Prompts for Anomalous User Detection on Twitter,"Ying-Ying Chang, Wei-Yao Wang, Wen-Chih Peng",https://arxiv.org/pdf/2312.11553,2023,AAAI,https://github.com/ying0409/SeGA,Technical 279 | 264,💡 未复现,Text-Guided Molecule Generation with Diffusion Language Model,"Haisong Gong, Qiang Liu, Shu Wu, Liang Wang 280 | ",https://arxiv.org/pdf/2402.13040,2024,AAAI,https://github.com/Deno-V/tgm-dlm,Technical 281 | 265,💡 未复现,Unsupervised Gene-Cell Collective Representation Learning with Optimal Transport,"Jixiang Yu, Nanjun Chen, Ming Gao, Xiangtao Li, Ka-Chun Wong",https://ojs.aaai.org/index.php/AAAI/article/view/27789,2024,AAAI,,Technical 282 | 266,💡 未复现,Bi-directional Adapter for Multi-modal Tracking,"Bing Cao, Junliang Guo, Pengfei Zhu, Qinghua Hu",https://arxiv.org/pdf/2312.10611,2023,AAAI,https://github.com/SparkTempest/BAT.,Technical 283 | 267,💡 未复现,DanceAnyWay: Synthesizing Beat-Guided 3D Dances with Randomized 
Temporal Contrastive Learning,"Aneesh Bhattacharya, Manas Paranjape, Uttaran Bhattacharya, Aniket Bera 284 | 285 | ",https://arxiv.org/pdf/2303.03870,2024,AAAI,https://github.com/aneeshbhattacharya/DanceAnyWay,Technical 286 | 268,💡 未复现,Deep Linear Array Pushbroom Image Restoration: A Degradation Pipeline and Jitter-Aware Restoration Network,"Zida Chen, Ziran Zhang, Haoying Li, Menghao Li, Yueting Chen, Qi Li, Huajun Feng, Zhihai Xu, Shiqi Chen 287 | 288 | ",https://arxiv.org/pdf/2401.08171,2024,AAAI,https://github.com/JHW2000/JARNet,Technical 289 | 269,💡 未复现,DiffSED: Sound Event Detection with Denoising Diffusion,"Swapnil Bhosale, Sauradip Nag, Diptesh Kanojia, Jiankang Deng, Xiatian Zhu 290 | ",https://arxiv.org/pdf/2308.07293,2023,AAAI,https://github.com/sauradip/DiffSED,Technical 291 | 270,💡 未复现,Domain-Controlled Prompt Learning,"Qinglong Cao, Zhengqin Xu, Yuntian Chen, Chao Ma, Xiaokang Yang",https://arxiv.org/pdf/2310.07730,2023,AAAI,https://github.com/ caoql98/DCPL.,Technical 292 | 271,💡 未复现,DreamIdentity: Improved Editability for Efficient Face-identity Preserved Image Generation,"Zhuowei Chen, Shancheng Fang, Wei Liu, Qian He, Mengqi Huang, Yongdong Zhang, Zhendong Mao 293 | 294 | ",https://arxiv.org/pdf/2307.00300,2023,AAAI,https://dreamidentity.github.io/,Technical 295 | 272,💡 未复现,DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models,"Namhyuk Ahn, Junsoo Lee, Chunggi Lee, Kunhee Kim, Daesik Kim, Seung-Hun Nam, Kibeom Hong 296 | 297 | ",https://arxiv.org/pdf/2309.06933,2023,AAAI,https://nmhkahn.github.io/dreamstyler/,Technical 298 | 273,💡 未复现,Evaluate Geometry of Radiance Fields with Low-frequency Color Prior,"Namhyuk Ahn, Junsoo Lee, Chunggi Lee, Kunhee Kim, Daesik Kim, Seung-Hun Nam, Kibeom Hong",https://arxiv.org/pdf/2304.04351,2024,AAAI,https://github.com/qihangGH/IMRC,Technical 299 | 274,💡 未复现,EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE,"Junyi Chen, Longteng Guo, Jia Sun, Shuai Shao, 
Zehuan Yuan, Liang Lin, Dongyu Zhang 300 | 301 | ",https://arxiv.org/pdf/2308.11971,2024,AAAI,https://github.com/ssyze/EVE,Technical 302 | 275,💡 未复现,Exploiting Polarized Material Cues for Robust Car Detection,"Wen Dong, Haiyang Mei, Ziqi Wei, Ao Jin, Sen Qiu, Qiang Zhang, Xin Yang",https://arxiv.org/pdf/2401.02606,2024,AAAI,https://github.com/wind1117/AAAI24-PCDNet,Technical 303 | 276,💡 未复现,Federated Modality-specific Encoders and Multimodal Anchors for Personalized Brain Tumor Segmentation,"Qian Dai, Dong Wei, Hong Liu, Jinghan Sun, Liansheng Wang, Yefeng Zheng 304 | 305 | ",https://arxiv.org/pdf/2403.11803,2024,AAAI,https://github.com/QDaiing/FedMEMA/blob/main/README.md,Technical 306 | 277,💡 未复现,Markerless Multi-view 3D Human Pose Estimation: a survey,"Ana Filipa Rodrigues Nogueira,Hélder P. Oliveira,Luís F. Teixeira",https://arxiv.org/html/2407.03817v2,2024,AAAI,https://github.com/DoUntilFalse/FusionFormer,Technical 307 | 278,💡 未复现,Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection,"Songmin Dai, Yifan Wu, Xiaoqiang Li, Xiangyang Xue 308 | 309 | ",https://arxiv.org/pdf/2312.15911,2023,AAAI,https://github.com/M-3LAB/awesome-industrial-anomaly-detection,Technical 310 | 279,💡 未复现,SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements,"Haiyang Xie, Xi Shen, Shihua Huang, Qirui Wang, Zheng Wang 311 | ",https://arxiv.org/pdf/2503.07101,2025,AAAI,ocean146.github.io/SimROD2025/,Technical 312 | 280,💡 未复现,Hyp-OW: Exploiting Hierarchical Structure Learning with Hyperbolic Distance Enhances Open World Object Detection,"Thang Doan, Xin Li,Sima Behpour,Wenbin He,Liang Gou,Liu Ren",https://arxiv.org/pdf/2306.14291,2024,AAAI,https://github.com/boschresearch/Hyp-OW,Technical 313 | 281,💡 未复现,iDet3D: Towards Efficient Interactive Object Detection for LiDAR Point Clouds,"Dongmin Choi, Wonwoo Cho, Kangyeol Kim, Jaegul Choo 314 | 315 | 
",https://arxiv.org/pdf/2312.15449,2023,AAAI,https://github.com/zhulf0804/3D-PointCloud/blob/master/README.md,Technical 316 | 282,💡 未复现,Image Safeguarding: Reasoning with Conditional Vision Language Model and Obfuscating Unsafe Content Counterfactually,"Mazal Bethany, Brandon Wherry, Nishant Vishwamitra, Peyman Najafirad 317 | 318 | ",https://arxiv.org/pdf/2401.11035,2024,AAAI,https://github.com/SecureAIAutonomyLab/ConditionalVLM,Technical 319 | 283,💡 未复现,Improving Diffusion-Based Image Synthesis with Context Prediction,"Ling Yang, Jingwei Liu, Shenda Hong, Zhilong Zhang, Zhilin Huang, Zheming Cai, Wentao Zhang, Bin Cui 320 | 321 | ",https://arxiv.org/pdf/2401.02015,2024,AAAI,,Technical 322 | 284,💡 未复现,"UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting 323 | 324 | ","Ziyi Wang, Yanran Zhang, Jie Zhou, Jiwen Lu 325 | 326 | ",https://arxiv.org/html/2506.09952v1,2025,AAAI,https://github.com/wangzy22/UniPre3D,Technical 327 | 285,💡 未复现,SpatioTemporal Difference Network for Video Depth Super-Resolution,"Zhengxue Wang, Yuan Wu, Xiang Li, Zhiqiang Yan, Jian Yang 328 | 329 | ",https://www.arxiv.org/pdf/2508.01259,2025,AAAI,https://github.com/yanzq95/STDNet,Technical 330 | 286,💡 未复现,Iterative Token Evaluation and Refinement for Real-World Super-Resolution,"Chaofeng Chen, Shangchen Zhou, Liang Liao, Haoning Wu, Wenxiu Sun, Qiong Yan, Weisi Lin",https://arxiv.org/pdf/2312.05616,2023,AAAI,https://github.com/chaofengc/ITER,Technical 331 | 287,💡 未复现,LDMVFI: Video Frame Interpolation with Latent Diffusion Models,"Duolikun Danier, Fan Zhang, David Bull",https://arxiv.org/pdf/2303.09508,2023,AAAI,https://github.com/danier97/LDMVFI,Technical 332 | 288,💡 未复现,Joint Demosaicing and Denoising for Spike Camera,"Yanchen Dong, Ruiqin Xiong, Jing Zhao, Jian Zhang, Xiaopeng Fan, Shuyuan Zhu, Tiejun Huang",https://ojs.aaai.org/index.php/AAAI/article/view/27924,2024,AAAI,https://github.com/csycdong/SJDD-Net,Technical 333 | 289,💡 未复现,De-LightSAM: 
Modality-Decoupled Lightweight SAM for Generalizable Medical Segmentation,"Qing Xu, Jiaxuan Li, Xiangjian He, Chenxin Li, Fiseha B. Tesem, Wenting Duan, Zhen Chen, Rong Qu, Jonathan M. Garibaldi,Chang Wen Chen",https://arxiv.org/pdf/2407.14153,2024,AAAI,https://github.com/xq141839/De-LightSAM,Technical 334 | 290,💡 未复现,LogoStyleFool: Vitiating Video Recognition Systems via Logo Style Transfer,"Yuxin Cao, Ziyu Zhao, Xi Xiao, Derui Wang, Minhui Xue, Jin Lu 335 | 336 | ",https://arxiv.org/pdf/2312.09935,2024,AAAI,,Technical 337 | 291,💡 未复现,M-BEV: Masked BEV Perception for Robust Autonomous Driving,"Siran Chen, Yue Ma, Yu Qiao, Yali Wang 338 | 339 | ",https://arxiv.org/pdf/2312.12144,2023,AAAI,https://github.com/Sranc3/M-BEV,Technical 340 | 292,💡 未复现,MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance,"Ernie Chu, Tzuhsuan Huang, Shuo-Yen Lin, Jun-Cheng Chen",https://arxiv.org/pdf/2308.10079,2023,AAAI,https://medm2023.github.io/,Technical 341 | 293,💡 未复现,Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation,"Hui Fu, Zeqing Wang, Ke Gong, Keze Wang, Tianshui Chen, Haojie Li, Haifeng Zeng, Wenxiong Kang 342 | 343 | ",https://arxiv.org/pdf/2312.10877,2023,AAAI,https://zeqingwang.github.io/Mimic/,Technical 344 | 294,💡 未复现,NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement,"Marcos V. Conde, Javier Vazquez-Corral, Michael S. 
Brown, Radu Timofte 345 | 346 | ",https://arxiv.org/pdf/2306.11920,2023,AAAI,https://github.com/mv-lab/nilut,Technical 347 | 295,💡 未复现,A Range-Null Space Decomposition Approach for Fast and Flexible Spectral Compressive Imaging,"Junyu Wang, Shijie Wang, Ruijie Zhang, Zengqiang Zheng, Wenyu Liu, Xinggang Wang",https://arxiv.org/pdf/2305.09746,2023,AAAI, https://github.com/ hustvl/RND-SCI.,Technical 348 | 296,💡 未复现,"Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding 349 | ","Yubin Gu, Yuan Meng, Xiaoshuai Sun, Jiayi Ji, Weijian Ruan, Rongrong Ji",https://arxiv.org/html/2411.16217v1,2024,AAAI,https://github.com/c-yn/OKNet,Technical 350 | 297,💡 未复现,Revisiting Point Cloud Completion: Are We Ready For The Real-World?,"Stuti Pathak, Prashant Kumar, Dheeraj Baiju, Nicholus Mboga, Gunther Steenackers, Rudi Penne",https://arxiv.org/html/2411.17580v4,2024,AAAI,,Technical 351 | 298,💡 未复现,"PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify 352 | ","Zhengqing Wang, Jiacheng Chen, Yasutaka Furukawa 353 | 354 | ",https://arxiv.org/html/2406.00259v2,2024,AAAI,https://puzzlefusion-plusplus.github.io,Technical 355 | 299,💡 未复现,PNeSM: Arbitrary 3D Scene Stylization via Prompt-Based Neural Style Mapping,"Jiafu Chen, Wei Xing, Jiakai Sun, Tianyi Chu, Yiling Huang, Boyan Ji, Lei Zhao, Huaizhong Lin, Haibo Chen, Zhizhong Wang",https://arxiv.org/pdf/2403.08252,2024,AAAI,https://github.com/chenyingshu/advances_3d_neural_stylization,Technical 356 | 300,💡 未复现,PPEA-Depth: Progressive Parameter-Efficient Adaptation for Self-Supervised Monocular Depth Estimation,"Yue-Jiang Dong, Yuan-Chen Guo, Ying-Tian Liu, Fang-Lue Zhang, Song-Hai Zhang 357 | ",https://arxiv.org/pdf/2312.13066,2024,AAAI,https://github.com/YuejiangDong,Technical 358 | 301,💡 未复现,ResMatch: Residual Attention Learning for Local Feature Matching,"Yuxin Deng,Jiayi Ma",https://arxiv.org/pdf/2307.05180,2023,AAAI,https://github.com/ACuOoOoO/ResMatch,Technical 359 | 
302,💡 未复现,Bidirectional Multi-Scale Implicit Neural Representations for Image Deraining,"Xiang Chen,Jinshan Pan,Jiangxin Dong",https://arxiv.org/pdf/2404.01547?,2024,AAAI,github.com/cschenxiang/NeRD-Rain,Technical 360 | 303,💡 未复现,Self-Supervised Bird’s Eye View Motion Prediction with Cross-Modality Signals,"Shaoheng Fang, Zuhong Liu, Mingyu Wang, Chenxin Xu, Yiqi Zhong, Siheng Chen 361 | 362 | ",https://arxiv.org/pdf/2401.11499,2024,AAAI,https://github.com/bshfang/selfsupervised-motion,Technical 363 | 304,💡 未复现,SHaRPose: Sparse High-Resolution Representation for Human Pose Estimation,"Xiaoqi An, Lin Zhao, Chen Gong, Nannan Wang, Di Wang, Jian Yang 364 | 365 | ",https://arxiv.org/pdf/2312.10758,2023,AAAI,https://github.com/AnxQ/sharpose,Technical 366 | 305,💡 未复现,Simple Image-level Classification Improves Open-vocabulary Object Detection,"Ruohuan Fang, Guansong Pang, Xiao Bai 367 | 368 | ",https://arxiv.org/pdf/2312.10439,2023,AAAI,https://github.com/mala-lab/SIC-CADS,Technical 369 | 306,💡 未复现,SparseGNV: Generating Novel Views of Indoor Scenes with Sparse Input Views,Weihao Cheng Yan-Pei Cao Ying Shan,https://arxiv.org/pdf/2305.07024,2023,AAAI,https://github.com/xt4d/SparseGNV,Technical 370 | 307,💡 未复现,"A Systematic Investigation on Deep Learning-Based Omnidirectional Image and Video Super-Resolution 371 | ","Qianqian Zhao, Chunle Guo, Tianyi Zhang, Junpei Zhang, Peiyang Jia, Tan Su, Wenjie Jiang, Chongyi Li",https://arxiv.org/html/2506.06710v1,2025,AAAI,https://github.com/nqian1/Survey-on-ODISR-and-ODVSR,Technical 372 | 308,💡 未复现,TDeLTA: A Light-weight and Robust Table Detection Method based on Learning Text Arrangement,"Yang Fan, Xiangping Wu, Qingcai Chen, Heng Li, Yan Huang, Zhixiang Cai, Qitian Wu 373 | 374 | ",https://arxiv.org/pdf/2312.11043,2023,AAAI,,Technical 375 | 309,💡 未复现,Variance-insensitive and Target-preserving Mask Refinement for Interactive Image Segmentation,"Chaowei Fang, Ziyin Zhou, Junye Chen, Hanjing Su, Qingyao Wu, Guanbin Li 376 | 377 | 
",https://arxiv.org/pdf/2312.14387,2023,AAAI,,Technical 378 | 310,💡 未复现,VIXEN: Visual Text Comparison Network for Image Difference Captioning,"Alexander Black, Jing Shi, Yifei Fan, Tu Bui, John Collomosse 379 | 380 | ",https://arxiv.org/pdf/2402.19119,2024,AAAI,http://github.com/alexblck/vixen,Technical 381 | 311,💡 未复现,Weak Distribution Detectors Lead to Stronger Generalizability of Vision-Language Prompt Tuning,"Kun Ding, Haojian Zhang, Qiang Yu, Ying Wang, Shiming Xiang, Chunhong Pan 382 | 383 | ",https://arxiv.org/pdf/2404.00603,2024,AAAI,https://github.com/kding1225,Technical 384 | 312,💡 未复现,WebVLN: Vision-and-Language Navigation on Websites,"Qi Chen, Dileepa Pitawela, Chongyang Zhao, Gengze Zhou, Hsiang-Ting Chen, Qi Wu 385 | 386 | ",https://arxiv.org/pdf/2312.15820,2023,AAAI,https://github.com/WebVLN/WebVLN,Technical 387 | 313,💡 未复现,WeditGAN: Few-Shot Image Generation via Latent Space Relocation,"Yuxuan Duan, Li Niu, Yan Hong, Liqing Zhang 388 | 389 | ",https://arxiv.org/pdf/2305.06671,2024,AAAI,https://github.com/Ldhlwh/WeditGAN,Technical 390 | 314,💡 未复现,3D Visibility-aware Generalizable Neural Radiance Fields for Interacting Hands,"Xuan Huang, Hanhui Li, Zejun Yang, Zhisheng Wang, Xiaodan Liang 391 | 392 | ",https://arxiv.org/pdf/2401.00979,2024,AAAI,https://github.com/XuanHuang0/VANeRF,Technical 393 | 315,💡 未复现,"MAC: A Benchmark for Multiple Attribute Compositional Zero-Shot Learning 394 | ","Xuan Huang, Hanhui Li, Zejun Yang, Zhisheng Wang, Xiaodan Liang 395 | 396 | ",https://arxiv.org/html/2406.12757v2,2025,AAAI,https://github.com/Yanyi-Zhang/Awesome-Compositional-Zero-Shot,Technical 397 | 316,💡 未复现,A General Implicit Framework for Fast NeRF Composition and Rendering,"Xinyu Gao, Ziyi Yang, Yunlu Zhao, Yuxiang Sun, Xiaogang Jin, Changqing Zou 398 | 399 | ",https://arxiv.org/pdf/2308.04669,2024,AAAI,https://github.com/EricLee0224/awesome-nerf-editing,Technical 400 | 317,💡 未复现,A User-Friendly Framework for Generating Model-Preferred Prompts in Text-to-Image 
Synthesis,"Nailei Hei, Qianyu Guo, Zihao Wang, Yan Wang, Haofen Wang, Wenqiang Zhang 401 | 402 | ",https://arxiv.org/pdf/2402.12760,2024,AAAI,https://github.com/Naylenv/UF-FGTG,Technical 403 | 318,💡 未复现,AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models,"Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, Jinqiao Wang 404 | 405 | ",https://arxiv.org/pdf/2308.15366,2023,AAAI,https://github.com/CASIA-IVA-Lab/AnomalyGPT,Technical 406 | 319,💡 未复现,"Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors 407 | ","Wei Shang, Dongwei Ren, Wanying Zhang, Yuming Fang, Wangmeng Zuo, Kede Ma 408 | 409 | ",https://arxiv.org/html/2407.09919v1,2024,AAAI,https://github.com/shangwei5/ST-AVSR,Technical 410 | 320,💡 未复现,BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions,"Wenbo Hu, Yifan Xu, Yi Li, Weiyue Li, Zeyuan Chen, Zhuowen Tu 411 | 412 | ",https://arxiv.org/pdf/2308.09936,2023,AAAI,https://github.com/mlpc-ucsd/BLIVA,Technical 413 | 321,💡 未复现,Depth-Guided Robust and Fast Point Cloud Fusion NeRF for Sparse Input Views,"Shuai Guo, Qiuwen Wang, Yijie Gao, Rong Xie, Li Song 414 | 415 | ",https://arxiv.org/pdf/2403.02063,2024,AAAI,https://github.com/yangjiheng/nerf_and_beyond_docs,Technical 416 | 322,💡 未复现,Distribution Matching for Multi-Task Learning of Classification Tasks: a Large-Scale Study on Faces & Beyond,"Dimitrios Kollias, Viktoriia Sharmanska, Stefanos Zafeiriou 417 | 418 | ",https://arxiv.org/pdf/2401.01219,2024,AAAI,,Technical 419 | 323,💡 未复现,"Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration 420 | ","Ting Lei, Shaofeng Yin, Qingchao Chen, Yuxin Peng, Yang Liu 421 | 422 | ",https://arxiv.org/html/2508.03207v1,2025,AAAI,https://github.com/ltttpku/INP-CC,Technical 423 | 324,💡 未复现,Expand-and-Quantize: Unsupervised Semantic Segmentation Using High-Dimensional Space and Product Quantization,"Jiyoung Kim, Kyuhong Shim, Insu Lee, Byonghyo 
Shim",https://arxiv.org/pdf/2312.07342,2023,AAAI,,Technical 424 | 325,💡 未复现,Expediting Contrastive Language-Image Pretraining via Self-distilled Encoders,"Bumsoo Kim, Jinhyung Kim, Yeonsik Jo, Seung Hwan Kim 425 | 426 | ",https://arxiv.org/pdf/2312.12659,2023,AAAI,https://github.com/openai/CLIP,Technical 427 | 326,💡 未复现,Exploring Self- and Cross-Triplet Correlations for Human-Object Interaction Detection,"Weibo Jiang, Weihong Ren, Jiandong Tian, Liangqiong Qu, Zhiyong Wang, Honghai Liu 428 | 429 | ",https://arxiv.org/pdf/2401.05676,2024,AAAI,https://github.com/52CV/ICCV-2021-Papers/blob/main/ICCV2021.md?plain=1,Technical 430 | 327,💡 未复现,Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering,"Qijun Gan, Wentong Li, Jinwei Ren, Jianke Zhu 431 | 432 | ",https://arxiv.org/pdf/2407.05680,2024,AAAI,https://github.com/agnJason/FMHR,Technical 433 | 328,💡 未复现,Frequency-Adaptive Pan-Sharpening with Mixture of Experts,"Xuanhua He, Keyu Yan, Rui Li, Chengjun Xie, Jie Zhang, Man Zhou 434 | 435 | ",https://arxiv.org/pdf/2401.02151,2024,AAAI,https://github.com/alexhe101/FAME-Net,Technical 436 | 329,💡 未复现,Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation,"Xiang Gao, Zhengbo Xu, Junhan Zhao, Jiaying Liu",https://arxiv.org/pdf/2407.03006,2025,AAAI,https://xianggao1102.github.io/FCDiffusion/,Technical 437 | 330,💡 未复现,GSN: Generalisable Segmentation in Neural Radiance Field,"Vinayak Gupta, Rahul Goel, Sirikonda Dhawal, P. J. 
Narayanan 438 | 439 | ",https://arxiv.org/pdf/2402.04632,2024,AAAI,https://vinayak-vg.github.io/GSN/,Technical 440 | 331,💡 未复现,Hand-Centric Motion Refinement for 3D Hand-Object Interaction via Hierarchical Spatial-Temporal Modeling,"Yuze Hao, Jianrong Zhang, Tao Zhuo, Fuan Wen, Hehe Fan 441 | 442 | ",https://arxiv.org/pdf/2401.15987,2024,AAAI,https://github.com/Holiday888/HST-Net,Technical 443 | 332,💡 未复现,High-Fidelity Diffusion-based Image Editing,"Chen Hou, Guoqiang Wei, Zhibo Chen 444 | 445 | ",https://arxiv.org/pdf/2312.15707,2024,AAAI,,Technical 446 | 333,💡 未复现,HuTuMotion: Human-Tuned Navigation of Latent Motion Diffusion Models with Minimal Feedback,"Gaoge Han, Shaoli Huang, Mingming Gong, Jinglei Tang 447 | 448 | ",https://arxiv.org/pdf/2312.12227,2023,AAAI,https://github.com/GuyTevet/motion-diffusion-model,Technical 449 | 334,💡 未复现,"Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model 450 | ","Danni Yang, Ruohan Dong, Jiayi Ji, Yiwei Ma, Haowei Wang, Xiaoshuai Sun, Rongrong Ji 451 | 452 | ",https://arxiv.org/html/2407.05352v1,2024,AAAI,https://github.com/nini0919/DiffPNG,Technical 453 | 335,💡 未复现,2DMamba: Efficient State Space Model for Image Representation with Applications on Giga-Pixel Whole Slide Image Classification,"Jingwei Zhang, Anh Tien Nguyen, Xi Han, Vincent Quoc-Huy Trinh, Hong Qin, Dimitris Samaras, Mahdi S. 
Hosseini 454 | 455 | ",https://arxiv.org/pdf/2412.00678,2025,CVPR,https://github.com/AtlasAnalyticsLab/2DMamba,poster 456 | 336,💡 未复现,3D Dental Model Segmentation with Geometrical Boundary Preserving,"Shufan Xi, Zexian Liu, Junlin Chang, Hongyu Wu, Xiaogang Wang, Aimin Hao 457 | 458 | ",https://arxiv.org/pdf/2503.23702,2025,CVPR,github.com/XiShuFan/CrossTooth_CVPR2025,poster 459 | 337,💡 未复现,3D Gaussian Head Avatars with Expressive Dynamic Appearances by Compact Tensorial Representations,"Yating Wang, Xuan Wang, Ran Yi, Yanbo Fan, Jichen Hu, Jingcheng Zhu, Lizhuang Ma 460 | 461 | ",https://arxiv.org/pdf/2504.14967,2025,CVPR,https://github.com/ant-research/TensorialGaussianAvatar,poster 462 | 338,💡 未复现,3D Gaussian Inpainting with Depth-Guided Cross-View Consistency,"Sheng-Yu Huang, Zi-Ting Chou, Yu-Chiang Frank Wang 463 | 464 | ",https://arxiv.org/pdf/2502.11801,2025,CVPR,https://github.com/peterjohnsonhuang/3dgic,poster 465 | 339,💡 未复现,3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination,"Jianing Yang, Xuweiyi Chen, Nikhil Madaan, Madhavan Iyengar, Shengyi Qian, David F. 
Fouhey, Joyce Chai 466 | 467 | ",https://arxiv.org/pdf/2406.05132,2025,CVPR,https://3d-grand.github.io/,poster 468 | 340,💡 未复现,3D-GSW: 3D Gaussian Splatting for Robust Watermarking,"Youngdong Jang, Hyunje Park, Feng Yang, Heeju Ko, Euijin Choo, Sangpil Kim 469 | 470 | ",https://arxiv.org/pdf/2409.13222,2025,CVPR,kuai-lab.github.io/cvpr20253dgsw/,poster 471 | 341,💡 未复现,3D-HGS: 3D Half-Gaussian Splatting ,Haolin Li,Jinyang Liu,Mario Sznaier,Octavia Camps,https://arxiv.org/pdf/2406.02720,2025,CVPR,lihaolin88.github.io/CVPR-2025-3DHGS,poster 472 | 342,💡 未复现,3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer,"Jiajun Deng, Tianyu He, Li Jiang, Tianyu Wang, Feras Dayoub, Ian Reid 473 | 474 | ",https://arxiv.org/pdf/2501.01163,2025,CVPR,https://github.com/djiajunustc/3D-LLaVA,poster 475 | 343,💡 未复现,3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning,"Yuncong Yang, Han Yang, Jiachen Zhou, Peihao Chen, Hongxin Zhang, Yilun Du, Chuang Gan 476 | 477 | ",https://arxiv.org/pdf/2411.17735,2025,CVPR,https://umass-embodied-agi.github.io/3D-Mem/,poster 478 | 344,💡 未复现,3DENHANCER: Consistent Multi-View Diffusion for 3D Enhancement,"Yihang Luo, Shangchen Zhou, Yushi Lan, Xingang Pan, Chen Change Loy 479 | 480 | ",https://arxiv.org/pdf/2412.18565,2025,CVPR,https://github.com/Luo-Yihang/3DEnhancer,poster 481 | 345,💡 未复现,3DGUT: Enabling Distorted Cameras and Secondary Rays in Gaussian Splatting,"Qi Wu, Janick Martinez Esturo, Ashkan Mirzaei, Nicolas Moenne-Loccoz, Zan Gojcic 482 | 483 | ",https://arxiv.org/pdf/2412.12507,2025,CVPR,https://github.com/nv-tlabs/3dgrut,poster 484 | 346,💡 未复现,4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models,"Wanhua Li, Renping Zhou, Jiawei Zhou, Yingwei Song, Johannes Herter, Minghan Qin, Gao Huang, Hanspeter Pfister 485 | 486 | ",https://arxiv.org/pdf/2503.10437,2025,CVPR,https://4d-langsplat.github.io/,poster 487 | 347,💡 未复现,4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface 
Gaussians,Hidenobu Matsuki Gwangbin Bae Andrew J. Davison,https://arxiv.org/pdf/2505.22859,2025,CVPR,https://muskie82.github.io/4dtam/,poster 488 | 348,💡 未复现,A Bias-Free Training Paradigm for More General AI-generated Image Detection,"Fabrizio Guillaro, Giada Zingarini, Ben Usman, Avneesh Sud, Davide Cozzolino, Luisa Verdoliva 489 | 490 | ",https://arxiv.org/pdf/2412.17671,2025,CVPR,https://grip-unina.github.io/B-Free/,poster 491 | 349,💡 未复现,A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning,"Xin Wen, Bingchen Zhao, Yilun Chen, Jiangmiao Pang, Xiaojuan Qi 492 | 493 | ",https://arxiv.org/pdf/2503.06960,2025,CVPR,https://github.com/CVMI-Lab/SlotMIM,poster 494 | 350,💡 未复现,Bidirectional Copy-Paste for Semi-Supervised Medical Image Segmentation,,,2023,CVPR,https://github.com/DeepMed-Lab-ECNU/BCP, 495 | 351,💡 未复现,Knowledge Graph Embedding by Double Limit Scoring Loss,,,2022,TKDE,https://github.com/IIE-UCAS/Knowledge-Embedding-with-Double-Loss, 496 | 352,💡 未复现,Unleashing the Power of Imbalanced Modality Information for Multi-modal Knowledge Graph Completion,,,2024,LREC-COLING,https://github.com/zjukg/AdaMF-MAT/blob/main/README.md, 497 | 353,💡 未复现,NativE: Multi-modal Knowledge Graph Completion in the Wild,,,2024,SIGIR,https://github.com/zjukg/NATIVE, 498 | 354,💡 未复现,Multimodal Contextual Interactions of Entities: A Modality Circular Fusion Approach for Link Prediction,,,2024,ACM MM,https://github.com/MoCiGitHub/MoCi, 499 | 355,💡 未复现,APKGC: Noise-enhanced Multi-Modal Knowledge Graph Completion with Attention Penalty,,,2025,AAAI,https://github.com/HubuKG/APKGC, 500 | 356,💡 未复现,Adaptive Keyframe Sampling for Long Video Understanding,,,2025,CVPR,https://github.com/ncTimTang/AKS, 501 | 357,💡 未复现,InsTaG: Learning Personalized 3D Talking Head from Few-Second Video,,,2025,CVPR,https://github.com/Fictionarry/InsTaG#, 502 | 358,💡 未复现,VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long 
Videos,,,2025,CVPR,https://github.com/Ziyang412/VideoTree, 503 | 359,💡 未复现,Learning Spatiotemporal Inconsistency via Thumbnail Layout for Face Deepfake Detection,,,2024,IJCV,https://github.com/rainy-xu/TALL4Deepfake, 504 | 360,💡 未复现,Multi-Grained Multimodal Interaction Network for Entity Linking,,,2023,KDD,https://github.com/pengfei-luo/MIMIC, 505 | 361,💡 未复现,UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models,,,2024,CIKM,https://github.com/Javkonline/UniMEL., 506 | 362,💡 未复现,Bridging Gaps in Content and Knowledge for Multimodal Entity Linking,,,2024,MM,https://github.com/pengfei-luo/FissFuse, 507 | 363,💡 未复现,KGMEL: Knowledge Graph-Enhanced Multimodal Entity Linking,,,2025,SIGIR,https://github.com/juyeonnn/KGMEL., 508 | 364,💡 未复现,Multi-level Matching Network for Multimodal Entity Linking,,,2025,KDD,https://github.com/zhiweihu1103/MEL-M3EL, 509 | 365,💡 未复现,Cross-modality image synthesis from TOF-MRA to CTA using diffusion-based models,,,2025,MIA,https://github.com/alexander-koch/xmodality?tab=readme-ov-file, 510 | 366,💡 未复现,vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation,,,2025,CVPR,https://github.com/bwittmann/vesselFM, 511 | 367,💡 未复现,Topology-Aware Uncertainty for Image Segmentation,,,2023,NIPS,https://github.com/Saumya-Gupta-26/struct-uncertainty, 512 | 368,💡 未复现,SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures,,,2025,CVPR,https://github.com/Karl1109/SCSegamba, 513 | 369,💡 未复现,Hybrid-Segmentor: Hybrid approach for automated fine-grained crack segmentation in civil infrastructure,,,2025,Automation in Construction,https://github.com/junegoo94/Hybrid-Segmentor, 514 | 370,💡 未复现,Enriching Information and Preserving Semantic Consistency in Expanding Curvilinear Object Segmentation Datasets,,,2024,ECCV,https://github.com/tanlei0/COSTG, 515 | 371,💡 未复现,Cooperative Classification and Rationalization for Graph Generalization,,,2024,WWW,https://github.com/yuelinan/Codes-of-C2R, 
372,💡 未复现,VATr++: Choose Your Words Wisely for Handwritten Text Generation,,,2024,TPAMI,https://github.com/EDM-Research/VATr-pp,
373,💡 未复现,Denoising Diffusion Probabilistic Models,,,2020,other,https://github.com/hojonathanho/diffusion,
374,💡 未复现,Long-Tail Learning with Foundation Model: Heavy Fine-Tuning Hurts,,,2024,ICML,https://github.com/shijxcs/LIFT,
375,💡 未复现,Identity-Preserving Text-to-Video Generation by Frequency Decomposition,,,2025,CVPR,https://github.com/PKU-YuanGroup/ConsisID,
376,💡 未复现,Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation,,https://arxiv.org/pdf/2406.02548,2025,ICLR (Oral),https://github.com/aminebdj/OpenYOLO3D,oral
377,💡 未复现,(ResNet)Deep Residual Learning for Image Recognition,,https://arxiv.org/pdf/1512.03385.pdf,2016,CVPR,,
378,💡 未复现,(AlexNet)ImageNet Classification with Deep Convolutional Neural Networks,,https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf,2012,NeurIPS (NIPS),,
379,💡 未复现,(GAN)Generative Adversarial Networks,,https://arxiv.org/pdf/1406.2661.pdf,2014,NeurIPS (NIPS),,
380,💡 未复现,(DQN)Human-Level Control Through Deep Reinforcement Learning,,https://www.nature.com/articles/nature14236,2015,Nature,,
381,💡 未复现,Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,,https://arxiv.org/pdf/1506.01497,2015,TPAMI,,
382,💡 未复现,U-Net: Convolutional Networks for Biomedical Image Segmentation,,https://arxiv.org/pdf/1505.04597.pdf,2015,MICCAI,,
383,💡 未复现,BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding,,https://arxiv.org/pdf/1810.04805.pdf,2019,NAACL-HLT,,
384,💡 未复现,(VGGNet)Very Deep Convolutional Networks for Large-Scale Image Recognition,,https://arxiv.org/pdf/1409.1556.pdf,2015,ICLR,,
385,💡 未复现,"(YOLO v1)You Only Look Once: Unified, Real-Time Object Detection",,https://arxiv.org/pdf/1506.02640.pdf,2016,CVPR,,
386,💡 未复现,(FCN)Fully Convolutional Networks for Semantic Segmentation,,https://arxiv.org/pdf/1411.4038.pdf,2015,CVPR,,
387,💡 未复现,(DenseNet)Densely Connected Convolutional Networks,,https://arxiv.org/pdf/1608.06993.pdf,2017,CVPR,,
388,💡 未复现,RAFT: Recurrent All-Pairs Field Transforms for Optical Flow,,https://arxiv.org/pdf/2003.12039.pdf,2020,ECCV,,
389,💡 未复现,"(Inception, GoogLeNet)Going Deeper with Convolutions",,https://arxiv.org/pdf/1409.4842.pdf,2015,CVPR,,
--------------------------------------------------------------------------------