
# Awesome-Affordance-Learning


Curated collection of papers and resources on Affordance Learning.

[![Awesome](https://awesome.re/badge.svg)](https://github.com/hq-King/Awesome-Affordance-Learning)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
![Last Commit](https://img.shields.io/github/last-commit/hq-King/Awesome-Affordance-Learning?color=green)
![PRs Welcome](https://img.shields.io/badge/PRs-Welcome-red)
![GitHub Repo stars](https://img.shields.io/github/stars/hq-King/Awesome-Affordance-Learning?style=social)

> 🧭 Exploring Embodied AI and embodied perception? We hope this collection proves useful in your journey. If you'd like to support the project, feel free to ⭐️ the repo and share it with your peers. Contributions are warmly welcome!

---

## 🔥 News

> 📢 This list is **actively maintained**, and community contributions are always appreciated!
> Feel free to [open a pull request](https://github.com/hq-King/Awesome-Affordance-Learning/pulls) if you find any relevant papers.

- 🎉 `2025-05`: **Repository launched to curate a comprehensive list of Affordance Learning papers.**

---
- [🌟 Introduction](#-introduction)
- [📜 Papers](#-papers)
  - 💡 2D Affordance Perception
  - 🗣️ 3D Affordance Perception
  - 🧪 Affordance Reasoning
  - 📚 Affordance-based Grasping & Manipulation

---
## 🌟 Introduction
This repository offers a **comprehensive and up-to-date collection** of research papers on **Affordance Learning**.

> As robots (embodied agents) are increasingly integrated into real-world applications, their ability to manipulate objects is in growing demand; affordance describes where and how to interact with objects.

This list spans:
- 2D Affordance Perception
- 3D Affordance Perception
- Affordance Reasoning
- Affordance-based Grasping & Manipulation

Whether you're a researcher, developer, or enthusiast, this is your go-to hub for exploring embodied perception.

---
## 📜 Papers
### 💡 2D Affordance Perception
1. **AffordanceSAM: Segment Anything Once More in Affordance Grounding.**
*Dengyang Jiang, Mengmeng Wang, Teli Ma, Hengzhuang Li, Yong Liu, Guang Dai, Lei Zhang.* [[abs](https://arxiv.org/abs/2504.15650)], arXiv 2025.04

2. **GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation.**
*Teli Ma, Jia Zheng, Zifan Wang, Ziyao Gao, Jiaming Zhou, Junwei Liang.* [[abs](https://arxiv.org/abs/2505.11865)], arXiv 2025.05

3. **One-Shot Open Affordance Learning with Foundation Models.**
*Gen Li, Deqing Sun, Laura Sevilla-Lara, Varun Jampani.* [[abs](https://arxiv.org/abs/2311.17776)], CVPR 2024

4. **Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors.**
*Peiran Xu, Yadong Mu.* [[abs](https://openreview.net/pdf?id=0823rvTIhs)], ICLR 2025

5. **AffordanceLLM: Grounding Affordance from Vision Language Models.**
*Shengyi Qian, Weifeng Chen, Min Bai, Xiong Zhou, Zhuowen Tu, Li Erran Li.* [[abs](https://arxiv.org/abs/2401.06341)], CVPRW 2024

6. **Learning Affordance Grounding from Exocentric Images.**
*Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, Dacheng Tao.* [[abs](https://arxiv.org/abs/2203.09905)], CVPR 2022

7. **One-Shot Affordance Detection.**
*Hongchen Luo, Wei Zhai, Jing Zhang, Yang Cao, Dacheng Tao.* [[abs](https://arxiv.org/abs/2106.14747)], IJCAI 2021

8. **WorldAfford: Affordance Grounding based on Natural Language Instructions.**
*Changmao Chen, Yuren Cong, Zhen Kan.* [[abs](https://arxiv.org/abs/2405.12461)], arXiv 2024.05

9. **LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding.**
*Gen Li, Varun Jampani, Deqing Sun, Laura Sevilla-Lara.* [[abs](https://arxiv.org/abs/2303.09665)], CVPR 2023

10. **Affordance Grounding From Demonstration Video To Target Image.**
*Joya Chen, Difei Gao, Kevin Qinghong Lin, Mike Zheng Shou.* [[abs](https://arxiv.org/abs/2303.14644)], CVPR 2023

11. **One-Shot Transfer of Affordance Regions? AffCorrs!**
*Denis Hadjivelichkov, Sicelukwanda Zwane, Marc Peter Deisenroth, Lourdes Agapito, Dimitrios Kanoulas.* [[abs](https://arxiv.org/abs/2303.09665)], CoRL 2023

12. **Text-driven Affordance Learning from Egocentric Vision.**
*Tomoya Yoshida, Shuhei Kurita, Taichi Nishimura, Shinsuke Mori.* [[abs](https://arxiv.org/abs/2309.10932)], arXiv 2023.09

13. **MaskPrompt: Open-Vocabulary Affordance Segmentation with Object Shape Mask Prompts.**
*Dongpan Chen, Dehui Kong, Jinghua Li, Baocai Yin.* [[abs](https://ojs.aaai.org/index.php/AAAI/article/view/32200)], AAAI 2025

### 🗣️ 3D Affordance Perception

1. **3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding.**
*Shengheng Deng, Xun Xu, Chaozheng Wu, Ke Chen, Kui Jia.* [[abs](https://arxiv.org/abs/2103.16397)], CVPR 2021

2. **Where2Explore: Few-Shot Affordance Learning for Unseen Novel Categories of Articulated Objects.**
*Chuanruo Ning, Ruihai Wu, Haoran Lu, Kaichun Mo, Hao Dong.* [[abs](https://arxiv.org/abs/2103.16397)], NeurIPS 2023

3. **LASO: Language-guided Affordance Segmentation on 3D Object.**
*Yicong Li, Na Zhao, Junbin Xiao, Chun Feng, Xiang Wang, Tat-Seng Chua.* [[abs](https://openaccess.thecvf.com/content/CVPR2024/html/Li_LASO_Language-guided_Affordance_Segmentation_on_3D_Object_CVPR_2024_paper.html)], CVPR 2024

4. **SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes.**
*Shengheng Deng, Xun Xu, Chaozheng Wu, Ke Chen, Kui Jia.* [[abs](https://openaccess.thecvf.com/content/CVPR2024/html/Delitzas_SceneFun3D_Fine-Grained_Functionality_and_Affordance_Understanding_in_3D_Scenes_CVPR_2024_paper.html)], CVPR 2024

5. **Grounding 3D Object Affordance from 2D Interactions in Images.**
*Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Jiebo Luo, Zheng-Jun Zha.* [[abs](https://arxiv.org/abs/2303.10437)], ICCV 2023

6. **Learning 2D Invariant Affordance Knowledge for 3D Affordance Grounding.**
*Xianqiang Gao, Pingrui Zhang, Delin Qu, Dong Wang, Zhigang Wang, Yan Ding, Bin Zhao.* [[abs](https://arxiv.org/abs/2408.13024)], AAAI 2025

7. **LEMON: Learning 3D Human-Object Interaction Relation from 2D Images.**
*Yuhang Yang, Wei Zhai, Hongchen Luo, Yang Cao, Zheng-Jun Zha.* [[abs](https://arxiv.org/abs/2312.08963)], CVPR 2024

8. **Open-Vocabulary Affordance Detection in 3D Point Clouds.**
*Toan Nguyen, Minh Nhat Vu, An Vuong, Dzung Nguyen, Thieu Vo, Ngan Le, Anh Nguyen.* [[abs](https://arxiv.org/abs/2303.02401)], IROS 2023

9. **Open-Vocabulary Affordance Detection using Knowledge Distillation and Text-Point Correlation.**
*Tuan Van Vo, Minh Nhat Vu, Baoru Huang, Toan Nguyen, Ngan Le, Thieu Vo, Anh Nguyen.* [[abs](https://arxiv.org/abs/2309.10932)], ICRA 2024

10. **3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds.**
*Hengshuo Chu, Xiang Deng, Qi Lv, Xiaoyang Chen, Yinchuan Li, Jianye Hao, Liqiang Nie.* [[abs](https://arxiv.org/abs/2502.20041)], ICLR 2025

11. **Grounding 3D Scene Affordance From Egocentric Interactions.**
*Cuiyu Liu, Wei Zhai, Yuhang Yang, Hongchen Luo, Sen Liang, Yang Cao, Zheng-Jun Zha.* [[abs](https://arxiv.org/abs/2409.19650)], arXiv 2024.09

12. **Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions.**
*He Zhu, Quyu Kong, Kechun Xu, Xunlong Xia, Bing Deng, Jieping Ye, Rong Xiong, Yue Wang.* [[abs](https://arxiv.org/abs/2504.04744)], CVPR 2025

13. **GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency.**
*Dongyue Lu, Lingdong Kong, Tianxin Huang, Gim Hee Lee.* [[abs](https://arxiv.org/abs/2412.09511)], CVPR 2025

14. **GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding.**
*Yawen Shao, Wei Zhai, Yuhang Yang, Hongchen Luo, Yang Cao, Zheng-Jun Zha.* [[abs](https://arxiv.org/abs/2411.19626)], CVPR 2025

15. **SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model.**
*Chunlin Yu, Hanqing Wang, Ye Shi, Haoyang Luo, Sibei Yang, Jingyi Yu, Jingya Wang.* [[abs](https://arxiv.org/abs/2412.01550)], CVPR 2025

16. **3DAffordSplat: Efficient Affordance Reasoning with 3D Gaussians.**
*Zeming Wei, Junyi Lin, Yang Liu, Weixing Chen, Jingzhou Luo, Guanbin Li, Liang Lin.* [[abs](https://arxiv.org/abs/2504.11218)], arXiv 2025.04

### 🧪 Affordance Reasoning

1. **AffordanceLLM: Grounding Affordance from Vision Language Models.**
*Shengyi Qian, Weifeng Chen, Min Bai, Xiong Zhou, Zhuowen Tu, Li Erran Li.* [[abs](https://arxiv.org/abs/2401.06341)], CVPRW 2024

2. **SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model.**
*Chunlin Yu, Hanqing Wang, Ye Shi, Haoyang Luo, Sibei Yang, Jingyi Yu, Jingya Wang.* [[abs](https://arxiv.org/abs/2412.01550)], CVPR 2025

3. **3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds.**
*Hengshuo Chu, Xiang Deng, Qi Lv, Xiaoyang Chen, Yinchuan Li, Jianye Hao, Liqiang Nie.* [[abs](https://arxiv.org/abs/2502.20041)], ICLR 2025

4. **Affordance Benchmark for MLLMs.**
*Junying Wang, Wenzhe Li, Yalun Wu, Yingji Liang, Yijin Guo, Chunyi Li, Haodong Duan, Zicheng Zhang, Guangtao Zhai.* [[abs](https://arxiv.org/abs/2506.00893)], arXiv 2025.06

### 📚 Affordance-based Grasping & Manipulation

1. **OVAL-Prompt: Open-Vocabulary Affordance Localization for Robot Manipulation through LLM Affordance-Grounding.**
*Edmond Tong, Anthony Opipari, Stanley Lewis, Zhen Zeng, Odest Chadwicke Jenkins.* [[abs](https://arxiv.org/abs/2404.11000)], arXiv 2024.04

2. **Learning 6-DoF Task-oriented Grasp Detection via Implicit Estimation and Visual Affordance.**
*Edmond Tong, Anthony Opipari, Stanley Lewis, Zhen Zeng, Odest Chadwicke Jenkins.* [[abs](https://ieeexplore.ieee.org/abstract/document/9981900)], IROS 2022

3. **Affordance-Driven Next-Best-View Planning for Robotic Grasping.**
*Xuechao Zhang, Dong Wang, Sun Han, Weichuang Li, Bin Zhao, Zhigang Wang, Xiaoming Duan, Chongrong Fang, Xuelong Li, Jianping He.* [[abs](https://arxiv.org/abs/2309.09556)], arXiv 2023.09

4. **Learning Generalizable Dexterous Manipulation from Human Grasp Affordance.**
*Yueh-Hua Wu, Jiashun Wang, Xiaolong Wang.* [[abs](https://arxiv.org/abs/2204.02320)], arXiv 2022.04

5. **UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation.**
*Yihe Tang, Wenlong Huang, Yingke Wang, Chengshu Li, Roy Yuan, Ruohan Zhang, Jiajun Wu, Li Fei-Fei.* [[abs](https://gpt-affordance.github.io/)], ICRA 2025

6. **AffordDP: Generalizable Diffusion Policy with Transferable Affordance.**
*Shijie Wu, Yihang Zhu, Yunao Huang, Kaizhen Zhu, Jiayuan Gu, Jingyi Yu, Ye Shi, Jingya Wang.* [[abs](https://arxiv.org/abs/2412.03142)], CVPR 2025

7. **GLOVER++: Unleashing the Potential of Affordance Learning from Human Behaviors for Robotic Manipulation.**
*Teli Ma, Jia Zheng, Zifan Wang, Ziyao Gao, Jiaming Zhou, Junwei Liang.* [[abs](https://arxiv.org/abs/2505.11865)], arXiv 2025.05

8. **GLOVER: Generalizable Open-Vocabulary Affordance Reasoning for Task-Oriented Grasping.**
*Teli Ma, Zifan Wang, Jiaming Zhou, Mengmeng Wang, Junwei Liang.* [[abs](https://arxiv.org/abs/2505.11865)], arXiv 2024.11

9. **BiAssemble: Learning Collaborative Affordance for Bimanual Geometric Assembly.**
*Yan Shen, Ruihai Wu, Yubin Ke, Xinyuan Song, Zeyi Li, Xiaoqi Li, Hongwei Fan, Haoran Lu, Hao Dong.* [[abs](https://sites.google.com/view/biassembly)], ICML 2025

10. **GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation.**
*Ruihai Wu, Ziyu Zhu, Yuran Wang, Yue Chen, Jiarui Wang, Hao Dong.* [[abs](https://arxiv.org/abs/2503.09243)], CVPR 2025

11. **NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation.**
*Ran Xu, Yan Shen, Xiaoqi Li, Ruihai Wu, Hao Dong.* [[abs](https://arxiv.org/abs/2403.08355)], RA-L 2024

12. **ManipVQA: Injecting Robotic Affordance and Physically Grounded Information into Multi-Modal Large Language Models.**
*Siyuan Huang, Iaroslav Ponomarenko, Zhengkai Jiang, Xiaoqi Li, Xiaobin Hu, Peng Gao, Hongsheng Li, Hao Dong.* [[abs](https://arxiv.org/abs/2403.08355)], IROS 2024

13. **Learning Environment-aware Affordance for 3D Articulated Object Manipulation under Occlusions.**
*Ruihai Wu, Kai Cheng, Yan Zhao, Chuanruo Ning, Guanqi Zhan, Hao Dong.* [[abs](https://arxiv.org/abs/2309.07510)], NeurIPS 2023

14. **DefoAfford: Learning Foresightful Dense Visual Affordance for Deformable Object Manipulation.**
*Ruihai Wu, Chuanruo Ning, Hao Dong.* [[abs](https://arxiv.org/abs/2303.11057)], ICCV 2023

15. **RLAfford: End-to-End Affordance Learning for Robotic Manipulation.**
*Yiran Geng, Boshi An, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong.* [[abs](https://arxiv.org/abs/2209.12941)], ICRA 2023

16. **DualAfford: Learning Collaborative Visual Affordance for Dual-gripper Object Manipulation.**
*Yan Zhao, Ruihai Wu, Zhehuan Chen, Yourong Zhang, Qingnan Fan, Kaichun Mo, Hao Dong.* [[abs](https://arxiv.org/abs/2207.01971)], ICLR 2023

17. **Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation.**
*Yuanchen Ju, Kaizhe Hu, Guowei Zhang, Gu Zhang, Mingrun Jiang, Huazhe Xu.* [[abs](https://arxiv.org/abs/2401.07487)], ECCV 2024

18. **AffordDexGrasp: Open-set Language-guided Dexterous Grasp with Generalizable-Instructive Affordance.**
*Yi-Lin Wei, Mu Lin, Yuhao Lin, Jian-Jian Jiang, Xiao-Ming Wu, Ling-An Zeng, Wei-Shi Zheng.* [[abs](https://arxiv.org/abs/2503.07360)], arXiv 2025.03

19. **Learning Precise Affordances from Egocentric Videos for Robotic Manipulation.**
*Gen Li, Nikolaos Tsagkas, Jifei Song, Ruaridh Mon-Williams, Sethu Vijayakumar, Kun Shao, Laura Sevilla-Lara.* [[abs](https://arxiv.org/abs/2408.10123)], arXiv 2024.08

20. **DORA: Object Affordance-Guided Reinforcement Learning for Dexterous Robotic Manipulation.**
*Lei Zhang, Soumya Mondal, Zhenshan Bing, Kaixin Bai, Diwen Zheng, Zhaopeng Chen, Alois Christian Knoll, Jianwei Zhang.* [[abs](https://arxiv.org/abs/2505.14819)], arXiv 2025.05

21. **A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation.**
*Rongtao Xu, Jian Zhang, Minghao Guo, Youpeng Wen, Haoting Yang, Min Lin, Jianzheng Huang, Zhe Li, Kaidong Zhang, Liqiong Wang, Yuxuan Kuang, Meng Cao, Feng Zheng, Xiaodan Liang.* [[abs](https://arxiv.org/abs/2504.12636)], arXiv 2025.04

---

## 🎉 Contributing
⭐ Help us grow this repository! If you know any valuable works we've missed, don't hesitate to contribute; every suggestion makes a difference!

We welcome and appreciate all contributions! Here's how you can help:

- 📄 **Add or Update a Paper**
  Contribute by adding a new paper or improving the details of an existing one. Please choose the most appropriate category for the work.

- ✍️ **Use Consistent Formatting**
  Follow the format of the existing entries to maintain clarity and consistency across the list.

- 🔗 **Include Abstract Link**
  If the paper is from arXiv, use the `/abs/` link format for the abstract (e.g., `https://arxiv.org/abs/xxxx.xxxxx`).

- 💡 **Explain Your Edit (Optional but Helpful)**
  A short note on why the paper deserves to be added or updated helps maintainers process your PR faster.

> **✅ Don't worry about getting everything perfect!**
> Minor mistakes are totally fine; we'll help fix them. What matters most is your contribution. Let's highlight your awesome work together!

---
## Acknowledgement
Thanks to the wonderful project [Awesome-LLM-Empathy](https://github.com/JhCircle/Awesome-LLM-Empathy), upon which this repository is built.

## 📄 License

This project is licensed under the [MIT License](https://opensource.org/licenses/MIT).

## Contributors