└── README.md

/README.md:
--------------------------------------------------------------------------------
# Awesome-World-Models
This repository is a collection of research papers on World Models. It aims to provide a useful resource for those interested in this field.

>World Models are a class of models in the field of artificial intelligence that aim to create a simplified, internal representation of the external world. These models are designed to predict the future state of the environment based on current observations and past experiences, allowing an agent to make informed decisions.

## World Model Papers
1. **Learning to Model the World with Language.** arXiv 2023. [paper](https://arxiv.org/pdf/2308.01399.pdf)

   *Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan.*

   >language helps agents predict the future

2. **Unifying (Machine) Vision via Counterfactual World Modeling.** arXiv 2023. [paper](https://arxiv.org/pdf/2306.01828.pdf)

   *Bear, Daniel M., Kevin Feigelis, Honglin Chen, Wanhee Lee, Rahul Venkatesh, Klemen Kotar, Alex Durango, and Daniel LK Yamins.*

3. **World Models** NIPS 2018. [paper](https://arxiv.org/pdf/1803.10122.pdf) [demo](https://worldmodels.github.io/)

   *Ha, David, and Jürgen Schmidhuber.*

4. **A Control-Centric Benchmark for Video Prediction.** ICLR 2023. [paper](https://arxiv.org/pdf/2304.13723.pdf)

   *Tian, Stephen, Chelsea Finn, and Jiajun Wu.*

5. **Transformers are sample efficient world models.** ICLR 2023. [paper](https://arxiv.org/pdf/2209.00588.pdf)
   *Micheli, Vincent, Eloi Alonso, and François Fleuret.*

6. **Towards Efficient World Models** ICML 2023 Workshops. [paper](https://openreview.net/pdf?id=o8IDoZggqO)
   *Eloi Alonso, Vincent Micheli, and François Fleuret.*

7. **Learning latent dynamics for planning from pixels.** PMLR 2019.
[paper](https://arxiv.org/pdf/1811.04551.pdf)
   *Hafner, Danijar, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, and James Davidson.*

## Video Model Papers
1. **MAGVIT: Masked Generative Video Transformer.** CVPR 2023. [paper](https://arxiv.org/pdf/2212.05199.pdf) [demo](https://magvit.cs.cmu.edu/) [code](https://github.com/google-research/magvit)

   *Yu, Lijun, Yong Cheng, Kihyuk Sohn, José Lezama, Han Zhang, Huiwen Chang, Alexander G. Hauptmann et al.*

   >3D VQ + MaskGIT = 37 fps sampling on a V100

2. **Diffusion Models for Video Prediction and Infilling.** TMLR 2022. [paper](https://arxiv.org/pdf/2206.07696.pdf) [code](https://github.com/Tobi-r9/RaMViD)

   *Tobias Höppe, Arash Mehrjou, Stefan Bauer, Didrik Nielsen, Andrea Dittadi*

3. **Unsupervised Learning for Physical Interaction through Video Prediction** NeurIPS 2016. [paper](https://proceedings.neurips.cc/paper_files/paper/2016/file/d9d4f495e875a2e075a1a4a6e1b9770f-Paper.pdf)

   *Finn, Chelsea, Ian Goodfellow, and Sergey Levine.*

4. **Unsupervised Learning of Video Representations using LSTMs.** ICML 2015. [paper](https://arxiv.org/pdf/1502.04681v3.pdf)
   *Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov.*

## Action Model Papers
1. **Decision Transformer: Reinforcement Learning via Sequence Modeling** NeurIPS 2021. [paper](https://arxiv.org/pdf/2106.01345.pdf)

   *Chen, Lili, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Misha Laskin, Pieter Abbeel, Aravind Srinivas, and Igor Mordatch.*

2. **Diffusion Policy: Visuomotor Policy Learning via Action Diffusion** RSS 2023.
[paper](https://arxiv.org/pdf/2303.04137.pdf) [demo](https://diffusion-policy.cs.columbia.edu/)
   *Cheng Chi, Siyuan Feng, Yilun Du, Zhenjia Xu, Eric Cousineau, Benjamin Burchfiel, Shuran Song*

--------------------------------------------------------------------------------