# Awesome Affordance Learning

Curated collection of papers and resources on Affordance Learning.
> Exploring Embodied AI and embodied perception? We hope this collection proves useful in your journey. If you'd like to support the project, feel free to star the repo and share it with your peers. Contributions are warmly welcome!

---

## News

> This list is **actively maintained**, and community contributions are always appreciated!
> Feel free to [open a pull request](https://github.com/hq-King/Awesome-Affordance-Learning/pulls) if you find any relevant papers.

- `2025-05`: **Repository launched to curate a comprehensive list of Affordance Learning papers.**

---
- [Introduction](#introduction)
- [Papers](#papers)
  - [2D Affordance Perception](#2d-affordance-perception)
  - [3D Affordance Perception](#3d-affordance-perception)
  - [Affordance Reasoning](#affordance-reasoning)
  - [Affordance-based Grasping & Manipulation](#affordance-based-grasping--manipulation)

---
## Introduction
This repository offers a **comprehensive and up-to-date collection** of research papers on **Affordance Learning**.

> As robots (embodied agents) are increasingly integrated into real-world applications, their ability to manipulate objects is in ever greater demand; affordance describes where and how to interact with an object.

This list covers:
- 2D Affordance Perception
- 3D Affordance Perception
- Affordance Reasoning
- Affordance-based Grasping & Manipulation

Whether you're a researcher, developer, or enthusiast, this is your go-to hub for exploring embodied perception.

---
## Papers
### 2D Affordance Perception
1. **AffordanceSAM: Segment Anything Once More in Affordance Grounding.**
   *Dengyang Jiang, Mengmeng Wang, Teli Ma, Hengzhuang Li, Yong Liu, Guang Dai, Lei Zhang.* [[abs](https://arxiv.org/abs/2504.15650)], arXiv 2025.04

2. **GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation.**
   *Teli Ma, Jia Zheng, Zifan Wang, Ziyao Gao, Jiaming Zhou, Junwei Liang.* [[abs](https://arxiv.org/abs/2505.11865)], arXiv 2025.05

3. **One-Shot Open Affordance Learning with Foundation Models.**
   *Gen Li, Deqing Sun, Laura Sevilla-Lara, Varun Jampani.* [[abs](https://arxiv.org/abs/2311.17776)], CVPR 2024

4. **Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors.**
   *Peiran Xu, Yadong Mu.* [[abs](https://openreview.net/pdf?id=0823rvTIhs)], ICLR 2025

5. **AffordanceLLM: Grounding Affordance from Vision Language Models.**
   *Shengyi Qian, Weifeng Chen, Min Bai, Xiong Zhou, Zhuowen Tu, Li Erran Li.* [[abs](https://arxiv.org/abs/2401.06341)], CVPRW 2024

6. **Learning Affordance Grounding from Exocentric Images.**
   *Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, Dacheng Tao.* [[abs](https://arxiv.org/abs/2203.09905)], CVPR 2022

7. **One-Shot Affordance Detection.**
   *Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, Dacheng Tao.* [[abs](https://arxiv.org/abs/2106.14747)], IJCAI 2021

8. **WorldAfford: Affordance Grounding based on Natural Language Instructions.**
   *Changmao Chen, Yuren Cong, Zhen Kan.* [[abs](https://arxiv.org/abs/2405.12461)], arXiv 2024.05

9. **LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding.**
   *Gen Li, Varun Jampani, Deqing Sun, Laura Sevilla-Lara.* [[abs](https://arxiv.org/abs/2303.09665)], CVPR 2023

10. **Affordance Grounding From Demonstration Video To Target Image.**
    *Joya Chen, Difei Gao, Kevin Qinghong Lin, Mike Zheng Shou.* [[abs](https://arxiv.org/abs/2303.14644)], CVPR 2023

11. **One-Shot Transfer of Affordance Regions? AffCorrs!**
    *Denis Hadjivelichkov, Sicelukwanda Zwane, Marc Peter Deisenroth, Lourdes Agapito, Dimitrios Kanoulas.* [[abs](https://arxiv.org/abs/2303.09665)], CoRL 2023

12. **Text-driven Affordance Learning from Egocentric Vision.**
    *Tomoya Yoshida, Shuhei Kurita, Taichi Nishimura, Shinsuke Mori.* [[abs](https://arxiv.org/abs/2309.10932)], arXiv 2023.09

13. **MaskPrompt: Open-Vocabulary Affordance Segmentation with Object Shape Mask Prompts.**
    *Dongpan Chen, Dehui Kong, Jinghua Li, Baocai Yin.* [[abs](https://ojs.aaai.org/index.php/AAAI/article/view/32200)], AAAI 2025

### 3D Affordance Perception

1. **3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding.**
   *Shengheng Deng, Xun Xu, Chaozheng Wu, Ke Chen, Kui Jia.* [[abs](https://arxiv.org/abs/2103.16397)], CVPR 2021

2. **Where2Explore: Few-shot Affordance Learning for Unseen Novel Categories of Articulated Objects.**
   *Chuanruo Ning, Ruihai Wu, Haoran Lu, Kaichun Mo, Hao Dong.* [[abs](https://arxiv.org/abs/2103.16397)], NeurIPS 2023

3. **LASO: Language-guided Affordance Segmentation on 3D Object.**
   *Yicong Li, Na Zhao, Junbin Xiao, Chun Feng, Xiang Wang, Tat-seng Chua.* [[abs](https://openaccess.thecvf.com/content/CVPR2024/html/Li_LASO_Language-guided_Affordance_Segmentation_on_3D_Object_CVPR_2024_paper.html)], CVPR 2024

4. **SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes.**
   *Shengheng Deng, Xun Xu, Chaozheng Wu, Ke Chen, Kui Jia.* [[abs](https://openaccess.thecvf.com/content/CVPR2024/html/Delitzas_SceneFun3D_Fine-Grained_Functionality_and_Affordance_Understanding_in_3D_Scenes_CVPR_2024_paper.html)], CVPR 2024

5. **Grounding 3D Object Affordance from 2D Interactions in Images.**
   *Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Jiebo Luo, Zheng-Jun Zha.* [[abs](https://arxiv.org/abs/2303.10437)], ICCV 2023

6. **Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding.**
   *Xianqiang Gao, Pingrui Zhang, Delin Qu, Dong Wang, Zhigang Wang, Yan Ding, Bin Zhao.* [[abs](https://arxiv.org/abs/2408.13024)], AAAI 2025

7. **LEMON: Learning 3D Human-Object Interaction Relation from 2D Images.**
   *Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Zheng-Jun Zha.* [[abs](https://arxiv.org/abs/2312.08963)], CVPR 2024

8. **Open-Vocabulary Affordance Detection in 3D Point Clouds.**
   *Toan Nguyen, Minh Nhat Vu, An Vuong, Dzung Nguyen, Thieu Vo, Ngan Le, Anh Nguyen.* [[abs](https://arxiv.org/abs/2303.02401)], IROS 2023

9. **Open-Vocabulary Affordance Detection using Knowledge Distillation and Text-Point Correlation.**
   *Tuan Van Vo, Minh Nhat Vu, Baoru Huang, Toan Nguyen, Ngan Le, Thieu Vo, Anh Nguyen.* [[abs](https://arxiv.org/abs/2309.10932)], ICRA 2024

10. **3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds.**
    *Hengshuo Chu, Xiang Deng, Qi Lv, Xiaoyang Chen, Yinchuan Li, Jianye Hao, Liqiang Nie.* [[abs](https://arxiv.org/abs/2502.20041)], ICLR 2025

11. **Grounding 3D Scene Affordance From Egocentric Interactions.**
    *Cuiyu Liu, Wei Zhai, Yuhang Yang, Hongchen Luo, Sen Liang, Yang Cao, Zheng-Jun Zha.* [[abs](https://arxiv.org/abs/2409.19650)], arXiv 2024.09

12. **Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions.**
    *He Zhu, Quyu Kong, Kechun Xu, Xunlong Xia, Bing Deng, Jieping Ye, Rong Xiong, Yue Wang.* [[abs](https://arxiv.org/abs/2504.04744)], CVPR 2025

13. **GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency.**
    *Dongyue Lu, Lingdong Kong, Tianxin Huang, Gim Hee Lee.* [[abs](https://arxiv.org/abs/2412.09511)], CVPR 2025

14. **GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding.**
    *Yawen Shao, Wei Zhai, Yuhang Yang, Hongchen Luo, Yang Cao, Zheng-Jun Zha.* [[abs](https://arxiv.org/abs/2411.19626)], CVPR 2025

15. **SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model.**
    *Chunlin Yu, Hanqing Wang, Ye Shi, Haoyang Luo, Sibei Yang, Jingyi Yu, Jingya Wang.* [[abs](https://arxiv.org/abs/2412.01550)], CVPR 2025

16. **3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians.**
    *Zeming Wei, Junyi Lin, Yang Liu, Weixing Chen, Jingzhou Luo, Guanbin Li, Liang Lin.* [[abs](https://arxiv.org/abs/2504.11218)], arXiv 2025.04

### Affordance Reasoning

1. **AffordanceLLM: Grounding Affordance from Vision Language Models.**
   *Shengyi Qian, Weifeng Chen, Min Bai, Xiong Zhou, Zhuowen Tu, Li Erran Li.* [[abs](https://arxiv.org/abs/2401.06341)], CVPRW 2024

2. **SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model.**
   *Chunlin Yu, Hanqing Wang, Ye Shi, Haoyang Luo, Sibei Yang, Jingyi Yu, Jingya Wang.* [[abs](https://arxiv.org/abs/2412.01550)], CVPR 2025

3. **3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds.**
   *Hengshuo Chu, Xiang Deng, Qi Lv, Xiaoyang Chen, Yinchuan Li, Jianye Hao, Liqiang Nie.* [[abs](https://arxiv.org/abs/2502.20041)], ICLR 2025

4. **Affordance Benchmark for MLLMs.**
   *Junying Wang, Wenzhe Li, Yalun Wu, Yingji Liang, Yijin Guo, Chunyi Li, Haodong Duan, Zicheng Zhang, Guangtao Zhai.* [[abs](https://arxiv.org/abs/2506.00893)], arXiv 2025.06

### Affordance-based Grasping & Manipulation

1. **OVAL-Prompt: Open-Vocabulary Affordance Localization for Robot Manipulation through LLM Affordance-Grounding.**
   *Edmond Tong, Anthony Opipari, Stanley Lewis, Zhen Zeng, Odest Chadwicke Jenkins.* [[abs](https://arxiv.org/abs/2404.11000)], arXiv 2024.04

2. **Learning 6-DoF Task-oriented Grasp Detection via Implicit Estimation and Visual Affordance.**
   *Edmond Tong, Anthony Opipari, Stanley Lewis, Zhen Zeng, Odest Chadwicke Jenkins.* [[abs](https://ieeexplore.ieee.org/abstract/document/9981900)], IROS 2022

3. **Affordance-Driven Next-Best-View Planning for Robotic Grasping.**
   *Xuechao Zhang, Dong Wang, Sun Han, Weichuang Li, Bin Zhao, Zhigang Wang, Xiaoming Duan, Chongrong Fang, Xuelong Li, Jianping He.* [[abs](https://arxiv.org/abs/2309.09556)], arXiv 2023.09

4. **Learning Generalizable Dexterous Manipulation from Human Grasp Affordance.**
   *Yueh-Hua Wu, Jiashun Wang, Xiaolong Wang.* [[abs](https://arxiv.org/abs/2204.02320)], arXiv 2022.04

5. **UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation.**
   *Yihe Tang, Wenlong Huang, Yingke Wang, Chengshu Li, Roy Yuan, Ruohan Zhang, Jiajun Wu, Li Fei-Fei.* [[abs](https://gpt-affordance.github.io/)], ICRA 2025

6. **AffordDP: Generalizable Diffusion Policy with Transferable Affordance.**
   *Shijie Wu, Yihang Zhu, Yunao Huang, Kaizhen Zhu, Jiayuan Gu, Jingyi Yu, Ye Shi, Jingya Wang.* [[abs](https://arxiv.org/abs/2412.03142)], CVPR 2025

7. **GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation.**
   *Teli Ma, Jia Zheng, Zifan Wang, Ziyao Gao, Jiaming Zhou, Junwei Liang.* [[abs](https://arxiv.org/abs/2505.11865)], arXiv 2025.05

8. **GLOVER: Generalizable Open-Vocabulary Affordance Reasoning for Task-Oriented Grasping.**
   *Teli Ma, Zifan Wang, Jiaming Zhou, Mengmeng Wang, Junwei Liang.* [[abs](https://arxiv.org/abs/2505.11865)], arXiv 2024.11

9. **BiAssemble: Learning Collaborative Affordance for Bimanual Geometric Assembly.**
   *Yan Shen, Ruihai Wu, Yubin Ke, Xinyuan Song, Zeyi Li, Xiaoqi Li, Hongwei Fan, Haoran Lu, Hao Dong.* [[abs](https://sites.google.com/view/biassembly)], ICML 2025

10. **GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation.**
    *Ruihai Wu, Ziyu Zhu, Yuran Wang, Yue Chen, Jiarui Wang, Hao Dong.* [[abs](https://arxiv.org/abs/2503.09243)], CVPR 2025

11. **NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation.**
    *Ran Xu, Yan Shen, Xiaoqi Li, Ruihai Wu, Hao Dong.* [[abs](https://arxiv.org/abs/2403.08355)], RAL 2024

12. **ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models.**
    *Siyuan Huang, Iaroslav Ponomarenko, Zhengkai Jiang, Xiaoqi Li, Xiaobin Hu, Peng Gao, Hongsheng Li, Hao Dong.* [[abs](https://arxiv.org/abs/2403.08355)], IROS 2024

13. **Learning Environment-aware Affordance for 3D Articulated Object Manipulation under Occlusions.**
    *Ruihai Wu, Kai Cheng, Yan Zhao, Chuanruo Ning, Guanqi Zhan, Hao Dong.* [[abs](https://arxiv.org/abs/2309.07510)], NeurIPS 2023

14. **DefoAfford: Learning Foresightful Dense Visual Affordance for Deformable Object Manipulation.**
    *Ruihai Wu, Chuanruo Ning, Hao Dong.* [[abs](https://arxiv.org/abs/2303.11057)], ICCV 2023

15. **RLAfford: End-to-End Affordance Learning for Robotic Manipulation.**
    *Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong.* [[abs](https://arxiv.org/abs/2209.12941)], ICRA 2023

16. **DualAfford: Learning Collaborative Visual Affordance for Dual-gripper Object Manipulation.**
    *Yan Zhao, Ruihai Wu, Zhehuan Chen, Yourong Zhang, Qingnan Fan, Kaichun Mo, Hao Dong.* [[abs](https://arxiv.org/abs/2207.01971)], ICLR 2023

17. **Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation.**
    *Yuanchen Ju, Kaizhe Hu, Guowei Zhang, Gu Zhang, Mingrun Jiang, Huazhe Xu.* [[abs](https://arxiv.org/abs/2401.07487)], ECCV 2024

18. **AffordDexGrasp: Open-set Language-guided Dexterous Grasp with Generalizable-Instructive Affordance.**
    *Yi-Lin Wei, Mu Lin, Yuhao Lin, Jian-Jian Jiang, Xiao-Ming Wu, Ling-An Zeng, Wei-Shi Zheng.* [[abs](https://arxiv.org/abs/2503.07360)], arXiv 2025.03

19. **Learning Precise Affordances from Egocentric Videos for Robotic Manipulation.**
    *Gen Li, Nikolaos Tsagkas, Jifei Song, Ruaridh Mon-Williams, Sethu Vijayakumar, Kun Shao, Laura Sevilla-Lara.* [[abs](https://arxiv.org/abs/2408.10123)], arXiv 2024.08

20. **DORA: Object Affordance-Guided Reinforcement Learning for Dexterous Robotic Manipulation.**
    *Lei Zhang, Soumya Mondal, Zhenshan Bing, Kaixin Bai, Diwen Zheng, Zhaopeng Chen, Alois Christian Knoll, Jianwei Zhang.* [[abs](https://arxiv.org/abs/2505.14819)], arXiv 2025.05

21. **A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation.**
    *Rongtao Xu, Jian Zhang, Minghao Guo, Youpeng Wen, Haoting Yang, Min Lin, Jianzheng Huang, Zhe Li, Kaidong Zhang, Liqiong Wang, Yuxuan Kuang, Meng Cao, Feng Zheng, Xiaodan Liang.* [[abs](https://arxiv.org/abs/2504.12636)], arXiv 2025.04

---

## Contributing
Help us grow this repository! If you know any valuable works we've missed, don't hesitate to contribute; every suggestion makes a difference!

We welcome and appreciate all contributions! Here's how you can help:

- **Add or Update a Paper**
  Contribute by adding a new paper or improving the details of an existing one. Please consider the most appropriate category for the work.

- **Use Consistent Formatting**
  Follow the format of the existing entries to maintain clarity and consistency across the list.

- **Include the Abstract Link**
  If the paper is from arXiv, use the `/abs/` link format for the abstract (e.g., `https://arxiv.org/abs/xxxx.xxxxx`).

- **Explain Your Edit (Optional but Helpful)**
  A short note on why you think the paper deserves to be added or updated is appreciated and helps maintainers process your PR faster.

> **Don't worry about getting everything perfect!**
> Minor mistakes are totally fine; we'll help fix them. What matters most is your contribution. Let's highlight your awesome work together!

---
## Acknowledgement
Thanks to the wonderful project [Awesome-LLM-Empathy](https://github.com/JhCircle/Awesome-LLM-Empathy), on which this project is built.

## License

This project is licensed under the [MIT License](https://opensource.org/licenses/MIT).

## Contributors