# State Representation Learning for Control: An Overview [arXiv](https://arxiv.org/abs/1802.04181)


## Abstract

Representation learning algorithms are designed to learn abstract features that characterize data.
State representation learning (SRL) focuses on a particular kind of representation learning in which the learned features are low-dimensional, evolve through time, and are influenced by the actions of an agent.
Because the learned representation captures the variations in the environment generated by the agent's actions, this kind of representation is particularly suitable for robotics and control scenarios.
In particular, the low dimensionality helps to overcome the curse of dimensionality, makes the representation easier for humans to interpret and use, and can improve the performance and speed of policy learning algorithms such as reinforcement learning.
This survey covers the state of the art in state representation learning over recent years. It reviews different SRL methods that involve interaction with the environment, their implementations, and their applications to robotics control tasks (simulated or real). In particular, it highlights how generic learning objectives are exploited differently by the reviewed algorithms. Finally, it discusses evaluation methods for assessing the learned representation and summarizes current and future lines of research.


# Learning objectives for SRL (state representation learning)

:one: Learning by reconstructing the observation<br>
:two: Learning a forward model<br>
:three: Learning an inverse model<br>
:four: Using feature adversarial learning<br>
:five: Exploiting reward<br>
:six: Other objective functions<br>
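
As a rough, illustrative sketch (the notation below is assumed for this README and is not taken from any single paper in the list): writing `o_t` for the observation, `s_t = phi(o_t)` for the learned state, `a_t` for the action and `r_{t+1}` for the reward, objectives :one:, :two:, :three: and :five: typically correspond to losses of the following form.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Illustrative notation only (assumed, not taken from the survey's papers):
% \phi is the encoder, \psi a decoder, f a forward model, g an inverse model,
% \rho a reward model, and \ell a classification or regression loss on actions.
\begin{align*}
\mathcal{L}_{\text{reconstruction}} &= \lVert \psi(s_t) - o_t \rVert^2        && \text{(objective 1)} \\
\mathcal{L}_{\text{forward}}        &= \lVert f(s_t, a_t) - s_{t+1} \rVert^2  && \text{(objective 2)} \\
\mathcal{L}_{\text{inverse}}        &= \ell\big(g(s_t, s_{t+1}),\, a_t\big)   && \text{(objective 3)} \\
\mathcal{L}_{\text{reward}}         &= \big(\rho(s_t, a_t) - r_{t+1}\big)^2   && \text{(objective 5)}
\end{align*}
\end{document}
```

Adversarial feature learning (:four:) instead trains the encoder against a discriminator's classification loss, while the "other" objectives (:six:) cover hand-crafted terms such as robotic priors. The tags after each paper below indicate which of these objectives it combines.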

- **Deep Spatial Autoencoders for Visuomotor Learning** (2015) :one: :six:<br>
*Chelsea Finn, Xin Yu Tan, Yan Duan, Trevor Darrell, Sergey Levine, Pieter Abbeel* [arXiv](https://arxiv.org/abs/1509.06113) [pdf](https://arxiv.org/pdf/1509.06113)

- **Goal-Driven Dimensionality Reduction for Reinforcement Learning (rwPCA)** (2017) :one: :five:<br>
*Simone Parisi, Simon Ramstedt, Jan Peters* [pdf](http://www.ausy.tu-darmstadt.de/uploads/Site/EditPublication/parisi2017iros.pdf)

- **Disentangling the independently controllable factors of variation by interacting with the world** (2017) :one: :six:<br>
*Valentin Thomas, Emmanuel Bengio, William Fedus, Jules Pondard, Philippe Beaudoin, Hugo Larochelle, Joelle Pineau, Doina Precup, Yoshua Bengio* [pdf](http://acsweb.ucsd.edu/~wfedus/pdf/ICF_NIPS_2017_workshop.pdf)

- **Independently Controllable Factors** (2017) :one: :six:<br>
*Valentin Thomas, Jules Pondard, Emmanuel Bengio, Marc Sarfati, Philippe Beaudoin, Marie-Jean Meurs, Joelle Pineau, Doina Precup, Yoshua Bengio* [arXiv](https://arxiv.org/abs/1708.01289) [pdf](https://arxiv.org/pdf/1708.01289)

- **Learn to swing up and balance a real pole based on raw visual input data** (2012) :one:<br>
*Jan Mattner, Sascha Lange, Martin Riedmiller* [pdf](https://pdfs.semanticscholar.org/d64b/08436f690df800a037eba759fcc6f0d971be.pdf)

- **Dimensionality Reduced Reinforcement Learning for Assistive Robots** (2016) :one:<br>
*William Curran, Tim Brys, David Aha, Matthew Taylor, William D. Smart* [pdf](https://www.google.fr/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0ahUKEwiugbGon7TZAhWBKMAKHYE4DlYQFggpMAA&url=https%3A%2F%2Fwww.aaai.org%2Focs%2Findex.php%2FFSS%2FFSS16%2Fpaper%2Fdownload%2F14076%2F13660&usg=AOvVaw3g6Vz6YhKbdC6bLn-QN8GI)

- **Using PCA to Efficiently Represent State Spaces** (2015) :one:<br>
*William Curran et al.* [arXiv](https://arxiv.org/abs/1505.00322) [pdf](https://arxiv.org/pdf/1505.00322.pdf)

- **Deep Kalman Filters** (2015) :one: :two:<br>
*Rahul G. Krishnan, Uri Shalit, David Sontag* [arXiv](https://arxiv.org/abs/1511.05121) [pdf](https://arxiv.org/pdf/1511.05121.pdf)

- **Learning to linearize under uncertainty** (2015) :one: :two:<br>
*R. Goroshin, M. Mathieu, Y. LeCun* [arXiv](https://arxiv.org/abs/1506.03011) [pdf](https://arxiv.org/pdf/1506.03011.pdf)

- **Embed to control: A locally linear latent dynamics model for control from raw images** (2015) :one: :two:<br>
*Manuel Watter et al.* [arXiv](https://arxiv.org/abs/1506.07365) [pdf](https://pdfs.semanticscholar.org/21c9/dd68b908825e2830b206659ae6dd5c5bfc02.pdf)

- **Learning State Representation for Deep Actor-Critic Control** (2016) :two: :five:<br>
*Jelle Munk, Jens Kober, Robert Babuška* [pdf](http://www.jenskober.de/MunkCDC2016.pdf)

- **Stable reinforcement learning with autoencoders for tactile and visual data** (2016) :one: :two:<br>
*Herke van Hoof, Nutan Chen, Maximilian Karl, Patrick van der Smagt, Jan Peters* [pdf](https://brml.org/uploads/tx_sibibtex/Hoof2016.pdf)

- **Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data** (2017) :one: :two:<br>
*Maximilian Karl, Maximilian Soelch, Justin Bayer, Patrick van der Smagt* [arXiv](https://arxiv.org/abs/1605.06432) [pdf](https://openreview.net/pdf?id=HyTqHL5xg)

- **Value Prediction Network** (2017) :two: :five:<br>
*Junhyuk Oh, Satinder Singh, Honglak Lee* [arXiv](https://arxiv.org/abs/1707.03497) [pdf](https://arxiv.org/pdf/1707.03497)

- **Data-efficient learning of feedback policies from image pixels using deep dynamical models** (2015) :one: :two:<br>
*J.-A. M. Assael, Niklas Wahlström, Thomas B. Schön, Marc Peter Deisenroth* [arXiv](https://arxiv.org/abs/1510.02173) [pdf](https://arxiv.org/pdf/1510.02173)

- **Learning deep dynamical models from image pixels** (2014) :one: :two:<br>
*Niklas Wahlström, Thomas B. Schön, Marc Peter Deisenroth* [arXiv](https://arxiv.org/abs/1410.7550) [pdf](https://arxiv.org/pdf/1410.7550)

- **From pixels to torques: Policy learning with deep dynamical models** (2015) :one: :two:<br>
*Niklas Wahlström, Thomas B. Schön, Marc Peter Deisenroth* [arXiv](https://arxiv.org/abs/1502.02251) [pdf](https://arxiv.org/pdf/1502.02251)

- **Loss is its own Reward: Self-Supervision for Reinforcement Learning** (2016) :three:<br>
*Evan Shelhamer, Parsa Mahmoudieh, Max Argus, Trevor Darrell* [arXiv](https://arxiv.org/abs/1612.07307) [pdf](https://arxiv.org/pdf/1612.07307.pdf)

- **Curiosity-driven Exploration by Self-supervised Prediction** (2017) :two: :three:<br>
*Deepak Pathak et al.* [pdf](http://juxi.net/workshop/deep-learning-robotic-vision-cvpr-2017/papers/23.pdf)<br>
Self-supervised approach.

- **InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets** (2016) :one: :four:<br>
*Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel* [arXiv](https://arxiv.org/abs/1606.03657) [pdf](https://arxiv.org/pdf/1606.03657.pdf)

- **Adversarial Feature Learning** (2016) :one: :four:<br>
*Jeff Donahue, Philipp Krähenbühl, Trevor Darrell* [arXiv](https://arxiv.org/abs/1605.09782) [pdf](https://arxiv.org/pdf/1605.09782)

- **PVEs: Position-Velocity Encoders for Unsupervised Learning of Structured State Representations** (2017) :six:<br>
*Rico Jonschkowski, Roland Hafner, Jonathan Scholz, Martin Riedmiller* [arXiv](https://arxiv.org/abs/1705.09805) [pdf](https://arxiv.org/pdf/1705.09805)

- **Learning State Representations with Robotic Priors** (2015) :five: :six:<br>
*Rico Jonschkowski, Oliver Brock* [pdf](https://pdfs.semanticscholar.org/dc93/f6d1b704abf12bbbb296f4ec250467bcb882.pdf)

- **Unsupervised state representation learning with robotic priors: a robustness benchmark** (2017) :five: :six:<br>
*Timothée Lesort, Mathieu Seurin, Xinrui Li, Natalia Díaz Rodríguez, David Filliat* [arXiv](https://arxiv.org/abs/1709.05185) [pdf](https://arxiv.org/pdf/1709.05185.pdf)


## Related Survey

- **Autonomous learning of state representations for control** (2015)<br>
*Wendelin Böhmer, Jost Tobias Springenberg, Joschka Boedecker, Martin Riedmiller, Klaus Obermayer* [pdf](http://www.ni.tu-berlin.de/fileadmin/fg215/articles/boehmer15b.pdf#cite.Lagoudakis03)


## Citation

If you find this repo useful, please cite the relevant paper:<br>

```
@article{Lesort2018StateRL,
  title={State representation learning for control: An overview},
  author={Timoth{\'e}e Lesort and Natalia D{\'i}az Rodr{\'i}guez and Jean-Fran{\c{c}}ois Goudou and David Filliat},
  journal={Neural Networks},
  year={2018},
  volume={108},
  pages={379--392}
}
```