├── README.md
├── example.py
├── images
│   ├── tc_sincos.png
│   ├── tc_vis.gif
│   ├── tcoverview.png
│   └── tilingstotiles.png
└── tilecoding.py
/README.md:
--------------------------------------------------------------------------------
1 | # Tile Coding
2 |
3 | [Tile coding](http://incompleteideas.net/book/ebook/node88.html#SECTION04232000000000000000) is a coarse coding method that uses several offset tilings to produce binary feature vectors for points in a continuous space.
4 |
5 | At a high level, tile coding converts a point in an n-dimensional space into a binary feature vector such that the vectors of nearby points have many elements in common, while the vectors of distant points have few or none. It works by covering a continuous space with *tiles*, where each tile has a corresponding index in a vector. The tiles can be any shape, but are typically n-dimensional hyperrectangles for computational convenience. The binary feature vector for a point in the space has a ```1``` at the indices of the tiles that contain the point, and a ```0``` everywhere else:
6 |
7 |
8 |
9 |
10 |
11 | Tile coding lays tiles over the continuous space through the use of *tilings*. A tiling can be thought of as an n-dimensional grid of tiles with potentially different scales along each dimension. Several offset tilings are then placed over the space to create regions of overlapping tiles. A useful property of laying tiles this way is that the number of active tiles always equals the number of tilings, because a point falls in exactly one tile of each tiling:
12 |
13 |
14 |
15 |
16 |
17 |
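
The one-active-tile-per-tiling property can be seen in a minimal 1-D sketch. This is not the repository's `TileCoder`: the helper name `active_tiles_1d` and the evenly spaced offsets (`t / tilings` of a tile width for tiling `t`) are assumptions made for illustration.

```python
import numpy as np

def active_tiles_1d(x, lo, hi, tiles, tilings):
    """Return one active tile index per tiling for a scalar x in [lo, hi]."""
    # position of x measured in tile widths
    scaled = (x - lo) * tiles / (hi - lo)
    # assumed offsets: tiling t is shifted by t / tilings of a tile width
    offsets = np.arange(tilings) / tilings
    # tile hit inside each tiling (exactly one per tiling)
    coords = np.floor(scaled + offsets).astype(int)
    # each tiling owns its own block of (tiles + 1) indices
    return coords + np.arange(tilings) * (tiles + 1)

a = active_tiles_1d(0.30, 0.0, 1.0, tiles=4, tilings=4)
b = active_tiles_1d(0.35, 0.0, 1.0, tiles=4, tilings=4)
c = active_tiles_1d(0.90, 0.0, 1.0, tiles=4, tilings=4)

print(a, len(a))                   # always `tilings` active tiles
print(np.intersect1d(a, b).size)   # nearby points share most tiles
print(np.intersect1d(a, c).size)   # distant points share none here
```

Each tiling gets `tiles + 1` indices because shifting a tiling pushes points near the upper limit into one extra tile.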
18 | # Dependencies
19 |
20 | * numpy
21 |
22 | # Usage
23 |
24 | A tile coder is instantiated with the following arguments:
25 |
26 | * A list of the number of tiles spanning each dimension
27 | * A list of tuples containing the value limits of each dimension
28 | * The number of tilings
29 | * (Optional) A function returning a list of tiling offsets along each dimension, given the number of dimensions
30 |   * Default: consecutive odd numbers (Miller & Glanz, 1996)
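
To make the optional offset argument more concrete, the sketch below reproduces the offset computation from `tilecoding.py` as a standalone function. The name `tiling_offsets` and the `uniform` alternative are hypothetical and only serve to show what shape of function the argument expects.

```python
import numpy as np

def tiling_offsets(n_dims, tilings, offset=lambda n: 2 * np.arange(n) + 1):
    """Per-tiling, per-dimension shifts, in units of one tile width."""
    # row t is [t, t, ...], one entry per dimension
    steps = np.repeat([np.arange(tilings)], n_dims, 0).T
    # scale by the per-dimension offset, then keep the fractional part
    return offset(n_dims) * steps / float(tilings) % 1

# default: consecutive odd numbers (1, 3, 5, ...) per dimension
default = tiling_offsets(2, 8)

# hypothetical alternative: the same shift along every dimension
uniform = tiling_offsets(2, 8, offset=lambda n: np.ones(n, dtype=int))

print(default[3])  # shifts of the fourth tiling under the default offsets
print(uniform[3])  # shifts of the fourth tiling under uniform offsets
```

With uniform offsets every tiling is displaced along the diagonal, whereas the odd-number default spreads the tilings in different directions.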
31 |
32 | Once instantiated, it uses ```__getitem__()``` to take a coordinate of a continuous space and return a numpy array with the indices of the active tiles. That is, it implicitly produces a binary vector of active tiles by returning the locations of the vector which have a ```1```. The instance's ```n_tiles``` property will give the total number of tiles across all of the tilings, which corresponds to the tile-coded binary feature vector's length.
33 |
34 | ## A Simple Example
35 |
36 | Suppose we want to tile a continuous 2-dimensional space where the values of each dimension range from ```0``` to ```10```. For this example, each tiling consists of ```10``` tiles spanning the complete range of values along each dimension (a ```10×10``` tiling), and we use ```8``` tilings (with the default tiling offsets).
37 |
38 | First, we import the tile coder:
39 |
40 | ```python
41 | from tilecoding import TileCoder
42 | ```
43 |
44 | Next, we specify the number of tiles spanning each dimension (tiling dimensions), the value limits of each dimension, and the number of tilings:
45 |
46 | ```python
47 | # number of tiles spanning each dimension
48 | tiles_per_dim = [10, 10]
49 | # value limits of each dimension
50 | lims = [(0.0, 10.0), (0.0, 10.0)]
51 | # number of tilings
52 | tilings = 8
53 | ```
54 |
55 | We can now instantiate a tile coder (which we'll denote ```T```):
56 |
57 | ```python
58 | T = TileCoder(tiles_per_dim, lims, tilings)
59 | ```
60 |
61 | The tile coder can then return the active tiles for given ```(x, y)``` coordinates in this 2-dimensional space via ```T[x, y]```:
62 |
63 | ```python
64 | # get active tiles for location (3.6, 7.21)
65 | >>> T[3.6, 7.21]
66 | array([ 80, 201, 322, 443, 565, 697, 807, 928])
67 |
68 | # a nearby point, differs from (3.6, 7.21) by 1 tile
69 | >>> T[3.7, 7.21]
70 | array([ 80, 201, 322, 444, 565, 697, 807, 928])
71 |
72 | # a slightly farther point, differs from (3.6, 7.21) by 5 tiles
73 | >>> T[4.1, 7.10]
74 | array([ 81, 202, 323, 444, 565, 686, 807, 928])
75 |
76 | # a much farther point, no tiles in common with (3.6, 7.21)
77 | >>> T[6.6, 9.14]
78 | array([105, 226, 347, 468, 590, 722, 832, 953])
79 | ```
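
The index arrays above can be expanded into explicit binary feature vectors to count how many tiles two points share. The sketch below assumes the vector length follows the `n_tiles` formula in `tilecoding.py` (each `10×10` tiling is stored as an `11×11` grid). The helper `to_binary` is a hypothetical name.

```python
import numpy as np

# 8 tilings, each stored as an 11 x 11 grid (10 tiles plus 1 for the offset)
n_tiles = 8 * 11 * 11

def to_binary(active, n_tiles):
    """Expand active tile indices into an explicit 0/1 feature vector."""
    phi = np.zeros(n_tiles, dtype=int)
    phi[active] = 1
    return phi

# index arrays copied from the session above
base = to_binary([80, 201, 322, 443, 565, 697, 807, 928], n_tiles)   # (3.6, 7.21)
near = to_binary([80, 201, 322, 444, 565, 697, 807, 928], n_tiles)   # (3.7, 7.21)
mid = to_binary([81, 202, 323, 444, 565, 686, 807, 928], n_tiles)    # (4.1, 7.10)
far = to_binary([105, 226, 347, 468, 590, 722, 832, 953], n_tiles)   # (6.6, 9.14)

# the dot product of two binary vectors counts their shared active tiles
print(base @ near, base @ mid, base @ far)  # 7 3 0
```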
80 |
81 | Below is a visualization of how the active tiles of the ```8``` tilings are computed for location ```(3.6, 7.21)```:
82 |
83 |
84 |
85 |
86 |
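
The same computation can be traced step by step in code. This is a sketch that mirrors the arithmetic in `tilecoding.py` for the 2-dimensional example, assuming the default odd-number offsets. The variable names (`grid`, `scaled`, `shifts`, `within`) are illustrative and do not appear in the library.

```python
import numpy as np

point = np.array([3.6, 7.21])
tiles_per_dim = np.array([10, 10])
lims = np.array([(0.0, 10.0), (0.0, 10.0)])
tilings = 8

# each tiling is stored with one extra tile per dimension to absorb the shift
grid = tiles_per_dim + 1

# position of the point measured in tile widths
scaled = (point - lims[:, 0]) * tiles_per_dim / (lims[:, 1] - lims[:, 0])

# default offsets: tiling t is shifted by (t/8, 3t/8) mod 1 tile widths
shifts = (2 * np.arange(2) + 1) * np.arange(tilings)[:, None] / tilings % 1

# integer tile coordinates of the point within each tiling, shape (8, 2)
coords = (scaled + shifts).astype(int)

# flatten each 2-D coordinate, then add the tiling's base index
within = coords[:, 0] + coords[:, 1] * grid[0]
active = within + np.arange(tilings) * grid.prod()

print(active)  # [ 80 201 322 443 565 697 807 928]
```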
87 | ## Function Approximation Example
88 |
89 | Suppose we want to approximate a continuous 2-dimensional function with a function that's linear in a tile-coded vector of binary features. Let's approximate ```f(x, y) = sin(x) + cos(y)``` where the values of both ```x``` and ```y``` range from ```0``` to ```2π```, and we only have access to *noisy*, *online* samples of the function (within the specified range).
90 |
91 | We'll use a tile coder with ```8``` tilings, each consisting of ```8``` tiles spanning the range of values along each dimension (an ```8×8``` tiling):
92 |
93 | ```python
94 | import numpy as np
95 | from tilecoding import TileCoder
96 |
97 | # tile coder tiling dimensions, value limits, number of tilings
98 | tiles_per_dim = [8, 8]
99 | lims = [(0.0, 2.0 * np.pi), (0.0, 2.0 * np.pi)]
100 | tilings = 8
101 |
102 | # create tile coder
103 | T = TileCoder(tiles_per_dim, lims, tilings)
104 | ```
105 |
106 | The following function will produce a noisy sample from our target function:
107 |
108 | ```python
109 | # target function with gaussian noise
110 | def target_fn(x, y):
111 |     return np.sin(x) + np.cos(y) + 0.1 * np.random.randn()
112 | ```
113 |
114 | Our approximate (linear) function can be represented with a set of weights, one for each tile in the tile coder's tilings. The function's output can then be computed as a dot product between this weight vector and the tile-coded feature vector for a given coordinate. We can get the total number of tiles across all of the tile coder's tilings (the feature vector length) using ```T.n_tiles```:
115 |
116 | ```python
117 | # linear function weight vector
118 | w = np.zeros(T.n_tiles)
119 | ```
120 |
121 | We'll then take 10,000 online samples (i.e., we don't store them and only work with the most recent sample) at random locations of the target function. We can update our weights using [stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) (SGD) in the *mean squared error* between the samples and our linear function's estimates. Note that because we're using a *binary representation*, we can evaluate our linear function using the indices of the active tiles with ```w[active_tiles].sum()```, as opposed to computing a dot product between our weight vector and an explicit binary feature vector.
122 |
123 | ```python
124 | # step size for SGD
125 | alpha = 0.1 / tilings
126 |
127 | # learn from 10,000 samples
128 | for i in range(10000):
129 |     # get noisy sample from target function at random location
130 |     x, y = 2.0 * np.pi * np.random.rand(2)
131 |     target = target_fn(x, y)
132 |     # get prediction from active tiles at that location
133 |     tiles = T[x, y]
134 |     pred = w[tiles].sum()
135 |     # update weights with SGD
136 |     w[tiles] += alpha * (target - pred)
137 | ```
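
The sketch below checks two things about the update above, using made-up weights and a hypothetical set of active tiles rather than a real `TileCoder`. First, the sparse update matches a dense SGD step on half the squared error with an explicit binary vector. Second, dividing the step size by the number of tilings means each update moves the prediction 10% of the way toward the target.

```python
import numpy as np

tilings = 8
n_tiles = tilings * 9 * 9   # 8 x 8 tilings are stored as 9 x 9 grids
alpha = 0.1 / tilings

rng = np.random.default_rng(0)
w = rng.normal(size=n_tiles)            # arbitrary weights for illustration
tiles = np.arange(tilings) * 81 + 40    # hypothetical active tiles, one per tiling

# explicit binary feature vector for the same active tiles
phi = np.zeros(n_tiles)
phi[tiles] = 1.0

pred = w[tiles].sum()
print(np.isclose(pred, w @ phi))        # indexed sum equals the dot product

target = 1.0
# dense SGD step: the gradient of 0.5 * (target - w @ phi)**2 is -(target - pred) * phi
w_dense = w + alpha * (target - pred) * phi
# sparse step as written in the README
w[tiles] += alpha * (target - pred)

new_pred = w[tiles].sum()
moved = (new_pred - pred) / (target - pred)
print(np.allclose(w, w_dense), moved)   # same weights; fraction of the error removed
```

Because exactly `tilings` weights are active, a per-weight step of `alpha` changes the prediction by `alpha * tilings` times the error, which is `0.1` here.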
138 |
139 | We can check how good our learned approximation is by evaluating our approximate function against the true target function at various points:
140 |
141 | ```python
142 | # check approximate value at (2.5, 3.1)
143 | >>> tiles = T[2.5, 3.1]
144 | >>> w[tiles].sum()
145 | -0.40287006579746704
146 | # compare to true value at (2.5, 3.1)
147 | >>> np.sin(2.5) + np.cos(3.1)
148 | -0.40066300616932304
149 | ```
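
A single point is a noisy check, so one option is to average the error over a grid of points. The sketch below is self-contained: it embeds a condensed copy of the `TileCoder` class from `tilecoding.py` so that it runs on its own, repeats the training loop with a seeded generator, and reports the root-mean-squared error against the noise-free function. The seed, the `50×50` grid, and the names `rng`, `grid`, and `rmse` are assumptions for illustration.

```python
import numpy as np

class TileCoder:
    # condensed copy of tilecoding.py so this snippet runs on its own
    def __init__(self, tiles_per_dim, value_limits, tilings, offset=lambda n: 2 * np.arange(n) + 1):
        tiling_dims = np.array(np.ceil(tiles_per_dim), dtype=int) + 1
        self._offsets = offset(len(tiles_per_dim)) * \
            np.repeat([np.arange(tilings)], len(tiles_per_dim), 0).T / float(tilings) % 1
        self._limits = np.array(value_limits)
        self._norm_dims = np.array(tiles_per_dim) / (self._limits[:, 1] - self._limits[:, 0])
        self._tile_base_ind = np.prod(tiling_dims) * np.arange(tilings)
        self._hash_vec = np.array([np.prod(tiling_dims[0:i]) for i in range(len(tiles_per_dim))])
        self.n_tiles = tilings * np.prod(tiling_dims)

    def __getitem__(self, x):
        off_coords = ((x - self._limits[:, 0]) * self._norm_dims + self._offsets).astype(int)
        return self._tile_base_ind + np.dot(off_coords, self._hash_vec)

rng = np.random.default_rng(0)  # seeded so the run is repeatable
tilings = 8
T = TileCoder([8, 8], [(0.0, 2.0 * np.pi), (0.0, 2.0 * np.pi)], tilings)
w = np.zeros(T.n_tiles)
alpha = 0.1 / tilings

# same online SGD loop as above, using the seeded generator
for _ in range(10000):
    x, y = 2.0 * np.pi * rng.random(2)
    target = np.sin(x) + np.cos(y) + 0.1 * rng.standard_normal()
    tiles = T[x, y]
    w[tiles] += alpha * (target - w[tiles].sum())

# compare against the noise-free function on a 50 x 50 grid
grid = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
sq_err = [(w[T[gx, gy]].sum() - (np.sin(gx) + np.cos(gy))) ** 2
          for gx in grid for gy in grid]
rmse = np.sqrt(np.mean(sq_err))
print('grid RMSE:', rmse)
```

For reference, an all-zero weight vector would score an RMSE of about `1.0` on this function, so values well below that indicate the approximation has learned the shape.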
150 |
151 | Alternatively, we can plot a surface of our learned approximation (e.g., with matplotlib):
152 |
153 | ```python
154 | import matplotlib.pyplot as plt
155 | from mpl_toolkits.mplot3d import Axes3D
156 |
157 | # resolution
158 | res = 200
159 |
160 | # (x, y) space to evaluate
161 | x = np.arange(0.0, 2.0 * np.pi, 2.0 * np.pi / res)
162 | y = np.arange(0.0, 2.0 * np.pi, 2.0 * np.pi / res)
163 |
164 | # map the function across the above space
165 | z = np.zeros([len(x), len(y)])
166 | for i in range(len(x)):
167 |     for j in range(len(y)):
168 |         tiles = T[x[i], y[j]]
169 |         z[i, j] = w[tiles].sum()
170 |
171 | # plot function
172 | fig = plt.figure()
173 | ax = fig.add_subplot(111, projection='3d')
174 | X, Y = np.meshgrid(x, y, indexing='ij')
175 | surf = ax.plot_surface(X, Y, z, cmap=plt.get_cmap('hot'))
176 | plt.show()
177 | ```
178 |
179 |
180 |
181 |
182 |
--------------------------------------------------------------------------------
/example.py:
--------------------------------------------------------------------------------
1 | def example():
2 |     import numpy as np
3 |     import matplotlib.pyplot as plt
4 |     from mpl_toolkits.mplot3d import Axes3D
5 |     import time
6 |
7 |     from tilecoding import TileCoder
8 |
9 |     # tile coder dimensions, limits, tilings
10 |     tiles_per_dim = [8, 8]
11 |     lims = [(0.0, 2.0 * np.pi), (0.0, 2.0 * np.pi)]
12 |     tilings = 8
13 |
14 |     # create tile coder
15 |     T = TileCoder(tiles_per_dim, lims, tilings)
16 |
17 |     # target function with gaussian noise
18 |     def target_ftn(x, y):
19 |         return np.sin(x) + np.cos(y) + 0.1 * np.random.randn()
20 |
21 |     # linear function weight vector, step size for SGD
22 |     w = np.zeros(T.n_tiles)
23 |     alpha = 0.1 / tilings
24 |
25 |     # take 10,000 samples of target function, output mse of batches of 100 points
26 |     timer = time.time()
27 |     batch_size = 100
28 |     for batches in range(100):
29 |         mse = 0.0
30 |         for b in range(batch_size):
31 |             x = lims[0][0] + np.random.rand() * (lims[0][1] - lims[0][0])
32 |             y = lims[1][0] + np.random.rand() * (lims[1][1] - lims[1][0])
33 |             target = target_ftn(x, y)
34 |             tiles = T[x, y]
35 |             w[tiles] += alpha * (target - w[tiles].sum())
36 |             mse += (target - w[tiles].sum()) ** 2
37 |         mse /= batch_size
38 |         print('samples:', (batches + 1) * batch_size, 'batch_mse:', mse)
39 |     print('elapsed time:', time.time() - timer)
40 |
41 |     # get learned function
42 |     print('mapping function...')
43 |     res = 200
44 |     x = np.arange(lims[0][0], lims[0][1], (lims[0][1] - lims[0][0]) / res)
45 |     y = np.arange(lims[1][0], lims[1][1], (lims[1][1] - lims[1][0]) / res)
46 |     z = np.zeros([len(x), len(y)])
47 |     for i in range(len(x)):
48 |         for j in range(len(y)):
49 |             tiles = T[x[i], y[j]]
50 |             z[i, j] = w[tiles].sum()
51 |
52 |     # plot (z is indexed [x, y], so the mesh must use 'ij' indexing)
53 |     fig = plt.figure()
54 |     ax = fig.add_subplot(111, projection='3d')
55 |     X, Y = np.meshgrid(x, y, indexing='ij')
56 |     surf = ax.plot_surface(X, Y, z, cmap=plt.get_cmap('hot'))
57 |     plt.show()
58 |
59 | if __name__ == '__main__':
60 |     example()
61 |
--------------------------------------------------------------------------------
/images/tc_sincos.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/MeepMoop/tilecoding/1d99fe313c1b5712089c96a9df0812f79af96e0f/images/tc_sincos.png
--------------------------------------------------------------------------------
/images/tc_vis.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/MeepMoop/tilecoding/1d99fe313c1b5712089c96a9df0812f79af96e0f/images/tc_vis.gif
--------------------------------------------------------------------------------
/images/tcoverview.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/MeepMoop/tilecoding/1d99fe313c1b5712089c96a9df0812f79af96e0f/images/tcoverview.png
--------------------------------------------------------------------------------
/images/tilingstotiles.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/MeepMoop/tilecoding/1d99fe313c1b5712089c96a9df0812f79af96e0f/images/tilingstotiles.png
--------------------------------------------------------------------------------
/tilecoding.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | import numpy as np
3 |
4 | class TileCoder:
5 |     def __init__(self, tiles_per_dim, value_limits, tilings, offset=lambda n: 2 * np.arange(n) + 1):
6 |         tiling_dims = np.array(np.ceil(tiles_per_dim), dtype=int) + 1
7 |         self._offsets = offset(len(tiles_per_dim)) * \
8 |             np.repeat([np.arange(tilings)], len(tiles_per_dim), 0).T / float(tilings) % 1
9 |         self._limits = np.array(value_limits)
10 |         self._norm_dims = np.array(tiles_per_dim) / (self._limits[:, 1] - self._limits[:, 0])
11 |         self._tile_base_ind = np.prod(tiling_dims) * np.arange(tilings)
12 |         self._hash_vec = np.array([np.prod(tiling_dims[0:i]) for i in range(len(tiles_per_dim))])
13 |         self._n_tiles = tilings * np.prod(tiling_dims)
14 |
15 |     def __getitem__(self, x):
16 |         off_coords = ((x - self._limits[:, 0]) * self._norm_dims + self._offsets).astype(int)
17 |         return self._tile_base_ind + np.dot(off_coords, self._hash_vec)
18 |
19 |     @property
20 |     def n_tiles(self):
21 |         return self._n_tiles
22 |
--------------------------------------------------------------------------------