# Awesome In-Context Reinforcement Learning

This is a collection of research papers on In-Context Reinforcement Learning (ICRL). The repository will be updated regularly to track the frontier.

_Curated by dunnolab._

-----
Please feel free to open a [PR](https://github.com/dunnolab/awesome-in-context-rl/pulls) with new papers and resources you believe are relevant and awesome.
```
format:
- [title](paper link)
  - author1, author2, and author3...
```

## Papers
### 2025
- [In-Context Reinforcement Learning via Communicative World Models](https://arxiv.org/abs/2508.06659)
  - Fernando Martinez-Lopez, Tao Li, Yingdong Lu, Juntao Chen
- [Reward Is Enough: LLMs Are In-Context Reinforcement Learners](https://arxiv.org/abs/2506.06303)
  - Kefan Song, Amir Moeini, Peng Wang, Lei Gong, Rohan Chandra, Yanjun Qi, Shangtong Zhang
- [Filtering Learning Histories Enhances In-Context Reinforcement Learning](https://arxiv.org/pdf/2505.15143)
  - Weiqin Chen, Xinjie Zhang, Dharmashankar Subramanian, Santiago Paternain
- [OmniRL: In-Context Reinforcement Learning by Large-Scale Meta-Training in Randomized Worlds](https://arxiv.org/abs/2502.02869)
  - Fan Wang, Pengtao Shao, Yiming Zhang, Bo Yu, Shaoshan Liu, Ning Ding, Yang Cao, Yu Kang, Haifeng Wang
- [A **Survey** of In-Context Reinforcement Learning](https://arxiv.org/abs/2502.07978)
  - Amir Moeini, Jiuqi Wang, Jacob Beck, Ethan Blaser, Shimon Whiteson, Rohan Chandra, Shangtong Zhang
- [Yes, Q-learning Helps Offline In-Context RL](https://arxiv.org/abs/2502.17666)
  - Denis Tarasov, Alexander Nikulin, Ilya Zisman, Albina Klepach, Andrei Polubarov, Nikita Lyubaykin, Alexander Derevyagin, Igor Kiselev, Vladislav Kurenkov
- [Vintix: Action Model via In-Context Reinforcement Learning](https://arxiv.org/abs/2501.19400)
  - Andrey Polubarov, Nikita Lyubaykin, Alexander Derevyagin, Ilya Zisman, Denis Tarasov, Alexander Nikulin, Vladislav Kurenkov
- [Training a Generally Curious Agent](https://arxiv.org/abs/2502.17543)
  - Fahim Tajwar, Yiding Jiang, Abitha Thankaraj, Sumaita Sadia Rahman, J Zico Kolter, Jeff Schneider, Ruslan Salakhutdinov

### 2024
- [LMAct: A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations](https://arxiv.org/abs/2412.01441)
  - Anian Ruoss, Fabio Pardo, Harris Chan, Bonnie Li, Volodymyr Mnih, Tim Genewein
- [Meta-Reinforcement Learning Robust to Distributional Shift Via Performing Lifelong In-Context Learning](https://proceedings.mlr.press/v235/xu24o.html)
  - Tengye Xu, Zihao Li, Qinyuan Ren
- [AMAGO-2: Breaking the Multi-Task Barrier in Meta-Reinforcement Learning with Transformers](https://arxiv.org/abs/2411.11188)
  - Jake Grigsby, Justin Sasek, Samyak Parajuli, Daniel Adebi, Amy Zhang, Yuke Zhu
- [LLMs Are In-Context Reinforcement Learners](https://arxiv.org/abs/2410.05362)
  - Giovanni Monea, Antoine Bosselut, Kianté Brantley, Yoav Artzi
- [EVOLvE: Evaluating and Optimizing LLMs For Exploration](https://arxiv.org/abs/2410.06238)
  - Allen Nie, Yi Su, Bo Chang, Jonathan N. Lee, Ed H. Chi, Quoc V. Le, Minmin Chen
- [Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models](https://arxiv.org/abs/2410.01280)
  - Can Demircan, Tankred Saanum, Akshay K. Jagadish, Marcel Binz, Eric Schulz
- [ReLIC: A Recipe for 64k Steps of In-Context Reinforcement Learning for Embodied AI](https://arxiv.org/abs/2410.02751)
  - Ahmad Elawady, Gunjan Chhablani, Ram Ramrakhya, Karmesh Yadav, Dhruv Batra, Zsolt Kira, Andrew Szot
- [Retrieval-Augmented Decision Transformer: External Memory for In-context RL](https://arxiv.org/abs/2410.07071)
  - Thomas Schmied, Fabian Paischer, Vihang Patil, Markus Hofmarcher, Razvan Pascanu, Sepp Hochreiter
- [Random Policy Enables In-Context Reinforcement Learning within Trust Horizons](https://arxiv.org/pdf/2410.19982)
  - Weiqin Chen, Santiago Paternain
- [Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning](https://arxiv.org/abs/2406.00392)
  - Jonathan Cook, Chris Lu, Edward Hughes, Joel Z. Leibo, Jakob Foerster
- [XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning](https://arxiv.org/abs/2406.08973)
  - Alexander Nikulin, Ilya Zisman, Alexey Zemtsov, Viacheslav Sinii, Vladislav Kurenkov, Sergey Kolesnikov
- [Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling](https://arxiv.org/abs/2406.00079)
  - Sili Huang, Jifeng Hu, Zhejian Yang, Liwei Yang, Tao Luo, Hechang Chen, Lichao Sun, Bo Yang
- [Pretraining Decision Transformers with Reward Prediction for In-Context Multi-task Structured Bandit Learning](https://arxiv.org/abs/2406.05064)
  - Subhojyoti Mukherjee, Josiah P. Hanna, Qiaomin Xie, Robert Nowak
- [Transformers Learn Temporal Difference Methods for In-Context Reinforcement Learning](https://arxiv.org/abs/2405.13861)
  - Jiuqi Wang, Ethan Blaser, Hadi Daneshmand, Shangtong Zhang
- [In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought](https://arxiv.org/abs/2405.20692)
  - Sili Huang, Jifeng Hu, Hechang Chen, Lichao Sun, Bo Yang
- [In-Context Reinforcement Learning Without Optimal Action Labels](https://openreview.net/forum?id=8Dey9wo2qA)
  - Juncheng Dong, Moyang Guo, Ethan X Fang, Zhuoran Yang, Vahid Tarokh
- [Can Large Language Models Explore In-Context?](https://arxiv.org/abs/2403.15371)
  - Akshay Krishnamurthy, Keegan Harris, Dylan J. Foster, Cyril Zhang, Aleksandrs Slivkins
- [Large Language Models As Evolution Strategies](https://arxiv.org/abs/2402.18381)
  - Robert Tjarko Lange, Yingtao Tian, Yujin Tang

### 2023
- [Generalization to New Sequential Decision Making Tasks with In-Context Learning](https://arxiv.org/abs/2312.03801)
  - Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu
- [Emergence of In-Context Reinforcement Learning from Noise Distillation](https://arxiv.org/abs/2312.12275)
  - Ilya Zisman, Vladislav Kurenkov, Alexander Nikulin, Viacheslav Sinii, Sergey Kolesnikov
- [In-Context Reinforcement Learning for Variable Action Spaces](https://arxiv.org/abs/2312.13327)
  - Viacheslav Sinii, Alexander Nikulin, Vladislav Kurenkov, Ilya Zisman, Sergey Kolesnikov
- [AMAGO: Scalable In-Context Reinforcement Learning for Adaptive Agents](https://arxiv.org/abs/2310.09971)
  - Jake Grigsby, Linxi Fan, Yuke Zhu
- [Cross-Episodic Curriculum for Transformer Agents](https://arxiv.org/abs/2310.08549)
  - Lucy Xiaoyang Shi, Yunfan Jiang, Jake Grigsby, Linxi "Jim" Fan, Yuke Zhu
- [Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining](https://arxiv.org/abs/2310.08566)
  - Licong Lin, Yu Bai, Song Mei
- [Large Language Models as General Pattern Machines](https://arxiv.org/abs/2307.04721)
  - Suvir Mirchandani, Fei Xia, Pete Florence, Brian Ichter, Danny Driess, Montserrat Gonzalez Arenas, Kanishka Rao, Dorsa Sadigh, Andy Zeng
- [First-Explore, then Exploit: Meta-Learning Intelligent Exploration](https://arxiv.org/abs/2307.02276)
  - Ben Norman, Jeff Clune
- [Supervised Pretraining Can Learn In-Context Reinforcement Learning](https://arxiv.org/abs/2306.14892)
  - Jonathan N. Lee, Annie Xie, Aldo Pacchiano, Yash Chandak, Chelsea Finn, Ofir Nachum, Emma Brunskill
- [Structured State Space Models for In-Context Reinforcement Learning](https://arxiv.org/abs/2303.03982)
  - Chris Lu, Yannick Schroecker, Albert Gu, Emilio Parisotto, Jakob Foerster, Satinder Singh, Feryal Behbahani
- [Human-Timescale Adaptation in an Open-Ended Task Space](https://arxiv.org/abs/2301.07608)
  - Adaptive Agent Team, Jakob Bauer, Kate Baumli, Satinder Baveja, Feryal Behbahani, Avishkar Bhoopchand, Nathalie Bradley-Schmieg, Michael Chang, Natalie Clay, Adrian Collister, Vibhavari Dasagi, Lucy Gonzalez, Karol Gregor, Edward Hughes, Sheleem Kashem, Maria Loks-Thompson, Hannah Openshaw, Jack Parker-Holder, Shreya Pathak, Nicolas Perez-Nieves, Nemanja Rakicevic, Tim Rocktäschel, Yannick Schroecker, Jakub Sygnowski, Karl Tuyls, Sarah York, Alexander Zacherl, Lei Zhang
- [Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration](https://arxiv.org/abs/2302.04250)
  - Chentian Jiang, Nan Rosemary Ke, Hado van Hasselt
- [Towards General-Purpose In-Context Learning Agents](https://openreview.net/forum?id=zDTqQVGgzH)
  - Louis Kirsch, James Harrison, C. Daniel Freeman, Jascha Sohl-Dickstein, Jürgen Schmidhuber

### Before 2023
- [In-context Reinforcement Learning with Algorithm Distillation](https://arxiv.org/abs/2210.14215)
  - Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, DJ Strouse, Steven Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih
- [Large Language Models can Implement Policy Iteration](https://arxiv.org/abs/2210.03821)
  - Ethan Brooks, Logan Walls, Richard L. Lewis, Satinder Singh

### Publication Year Unknown