├── .gitignore
├── README.md
└── prob-rob
    ├── Probabilistic-robotics.apkg
    └── ch3
        ├── ch3-solutions.tex
        └── understanding-covariance.tex

/.gitignore:
--------------------------------------------------------------------------------
1 | # Compiled files
2 | *.o
3 | *.so
4 | *.rlib
5 | *.dll
6 | 
7 | # Executables
8 | *.exe
9 | 
10 | # Generated by Cargo
11 | /target/
12 | build/*
13 | reference/*
14 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | probabilistic-robotics
2 | ======================
3 | 
4 | I'm building a robotics curriculum for a K-8 school, and as part of that
5 | process I'm trying to master the computational parts of mobile robotics (I also
6 | want to build a robotic mower for my pasture...). I'm going through two books
7 | to do that: [Introduction to Autonomous Mobile Robots](http://amzn.com/0262015358)
8 | and [Probabilistic Robotics](http://amzn.com/0262201623).
9 | 
10 | These are my notes, in the form of Anki flash cards, along with my attempts at
11 | solutions to all of the exercises in both books. I will keep the two books'
12 | answer sets separate for easier reference.
13 | 
14 | I'm hoping that this repo will turn into a kind of self-directed MOOC for
15 | mobile robotics. I like online courses a lot, but I tend to think they don't go
16 | deep enough most of the time. That will definitely not be the case here: if we
17 | read the books and do all of the note cards and exercises, my hope is that
18 | we'll end up with deep knowledge of the subject.
19 | 
20 | ### Caveat Emptor
21 | 
22 | I'm not an expert in this stuff (yet), and I don't have an external reference
23 | for checking my work, so it's likely that some of my solutions will be
24 | incomplete, incorrect, or otherwise flawed! I would be extremely happy to get
25 | pull requests with corrections, alternative solutions, additional flashcards
26 | (or better versions of the existing ones), or anything else that fits with the
27 | idea of self-teaching mobile robotics. So far, both of these books are
28 | excellent, and I highly recommend them if you're interested in this kind of
29 | stuff. If you do read them, please check back here from time to time to see
30 | what we've come up with for study aids.
31 | 
32 | ### What's in here
33 | 
34 | 1. Anki decks, one for Probabilistic Robotics and the other for Intro to
35 | Autonomous Mobile Robots. To use them, you'll need to get [Anki](http://ankisrs.net/).
36 | 2. LaTeX files for my solutions to the exercises. I try to include a lot of
37 | explanation of why I'm doing certain things, as a way to help my future self
38 | when studying this stuff later. Hopefully you'll find them handy as well.
39 | 3. Miscellaneous programs from the books or other sources of inspiration,
40 | probably mostly written in Rust, Python, or Julia. I welcome pull requests with
41 | bug fixes and alternative solutions or example programs, in any language.
42 | 
43 | 
44 | 
--------------------------------------------------------------------------------
/prob-rob/Probabilistic-robotics.apkg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aethertap/probabilistic-robotics/a6c22d6332f91addb0b617347cad5583d651c418/prob-rob/Probabilistic-robotics.apkg
--------------------------------------------------------------------------------
/prob-rob/ch3/ch3-solutions.tex:
--------------------------------------------------------------------------------
1 | 
2 | \documentclass[10pt]{article}
3 | \usepackage{amsmath}
4 | \begin{document}
5 | \title{Solutions for Chapter 3 of Probabilistic Robotics}
6 | 
7 | \section{Problem 1}
8 | \begin{enumerate}
9 | \item \textit{In this and the following exercises, you are asked to design a Kalman
10 | filter for a simple dynamical system: a car with linear dynamics moving in a
11 | linear environment. Assume $\delta t = 1$ for simplicity. The position of the
12 | car at time $t$ is given by $x_t$. Its velocity is given by $\dot{x}_t$, and its
13 | acceleration [from time $t-1$ to time $t$] is given by $\ddot{x}_t$. Suppose the
14 | acceleration is set randomly at each point in time, according to a Gaussian with
15 | zero mean and covariance $\sigma^2 = 1$.}
16 | 
17 | One thing I note: At first it sounded to me like we would know what the
18 | acceleration value was at each time step. This threw me off for a long time,
19 | until I realized that this is more like modeling a random wind blowing - you
20 | don't know what the effect is at any given moment; all you know is its
21 | statistical behavior. Since we don't \textit{know} the value of the
22 | acceleration, it \textit{can't} be part of our state! That may be obvious to
23 | others reading the problem, but I worked on it for two days (!) before I figured
24 | it out. C'est la vie.
25 | 
26 | \begin{enumerate}
27 | \item \textit{What is a minimal state vector for the Kalman filter (so that
28 | the resulting system is Markovian)?}
29 | 
30 | In order to be Markovian, we have to have a state vector such that the
31 | future and the past are independent given the present state. If the state
32 | vector has the position and velocity, this condition is met. All of the
33 | acceleration values of the past are completely summarized in the position
34 | and velocity, so keeping acceleration as a state variable doesn't tell us
35 | anything new in terms of predicting the future. The acceleration is set
36 | randomly at each time step, but given that we know the position and velocity
37 | we don't need to know any past acceleration in order to compute the future
38 | given the state and the control (which in this case would be setting the
39 | acceleration).
40 | 
41 | Our state vector is thus:
42 | $$\begin{pmatrix}x_t \\ \dot{x}_t\end{pmatrix}$$
43 | 
44 | \item \textit{For your state vector, design the state transition probability
45 | $p(x_t | u_t,x_{t-1})$. Hint: this transition function will possess linear
46 | matrices $A$ and $B$ and a noise covariance $R$.}
47 | 
48 | We will use the moments parameterization since we're building a Kalman filter.
49 | The state transition function is of the form
50 | $$x_t = A x_{t-1} + B u_t + \epsilon_t$$
51 | 
52 | where $\epsilon_t$ is a zero-mean Gaussian random variable driven by the random
53 | acceleration given in the problem. Since we never know the actual acceleration,
54 | it can't be treated as a control input; instead we incorporate it as error
55 | (process noise) in the position and velocity.
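For completeness, part (b) literally asks for the transition \textit{probability},
which is just the model above written out as a density: a multivariate normal over
the state with mean $A x_{t-1} + B u_t$ and covariance $R = cov(\epsilon_t)$. This
is only a restatement of the standard linear-Gaussian form, nothing new:

\begin{gather}
p(x_t | u_t, x_{t-1}) = \det\left(2\pi R\right)^{-\frac{1}{2}}
\exp\left\{-\frac{1}{2}\left(x_t - A x_{t-1} - B u_t\right)^T R^{-1}
\left(x_t - A x_{t-1} - B u_t\right)\right\}
\end{gather}

(One wrinkle: the $R$ we derive below turns out to be singular, so the formal
$R^{-1}$ above doesn't exist for this particular problem; the prediction
equations we actually use never need to invert $R$, so this is just for
reference.)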
56 | 
57 | The mean for our distribution is given by $A x_{t-1} + B u_t$, and the
58 | covariance is given by $A\Sigma_{t-1}A^T + cov(\epsilon_t)$. The
59 | $A\Sigma_{t-1}A^T$ term carries forward the uncertainty from the previous state
60 | updated by the state transition function (i.e. we need to apply the same
61 | transformation to our error bounds that we apply to the estimate, otherwise
62 | the error bounds will become meaningless). The $cov(\epsilon_t)$ term is the
63 | new error added in by the random acceleration.
64 | 
65 | All that remains is to use the equations of motion to derive matrices for $A$
66 | and $B$. We know that:
67 | \begin{gather}
68 | x_t = x_{t-1} + \left(\frac{\dot{x}_{t-1}+\dot{x}_t}{2}\right) \Delta_t \\
69 | \dot{x}_t = \dot{x}_{t-1} + \ddot{x}_t\Delta_t
70 | \end{gather}
71 | 
72 | Using our expression for $\dot{x}_t$ in the first equation yields:
73 | 
74 | \begin{gather}
75 | x_t = x_{t-1} + \dot{x}_{t-1}\Delta_t + \frac{1}{2}\ddot{x}_t\Delta_t^2
76 | \end{gather}
77 | 
78 | We use the average velocity between the start and end of the time slice to
79 | calculate the new $x_t$ value, because the velocity has changed in the
80 | interval as a result of the acceleration ($\ddot{x}_{t}$).
81 | 
82 | There is no control action in this model, since there is no explicit (known)
83 | acceleration (i.e. the acceleration value is just a probability distribution
84 | and we don't know its value at any given time). That means $B u_t = \vec{0}$.
85 | 
86 | Now we can write our prediction of the overall mean, in matrix form:
87 | 
88 | $$
89 | \overline{\mu}_t = \begin{pmatrix}x_t \\ \dot{x}_t\end{pmatrix} =
90 | \begin{pmatrix}1 & \Delta_t \\ 0 & 1\end{pmatrix} \begin{pmatrix}x_{t-1} \\
91 | \dot{x}_{t-1} \end{pmatrix}
92 | $$
93 | 
94 | Next we need to model the covariance. Since we are given the variance of the
95 | acceleration (=1), we have to figure out how that maps into the variance of the
96 | two state variables and build the process-noise covariance matrix accordingly.
97 | 
98 | From our equations of motion, we know that $x_t$ depends on $\ddot{x}_t$
99 | scaled by $\frac{\Delta_t^2}{2}$, and we know that $\dot{x}_t$ depends on
100 | $\ddot{x}_t$ scaled by $\Delta_t$.
101 | 
102 | That gives us what we need to figure out how the random acceleration should
103 | affect our estimates for position and velocity. The standard deviation of the
104 | velocity picks up the acceleration's standard deviation scaled by $\Delta_t$,
105 | and that of the position picks it up scaled by $\frac{\Delta_t^2}{2}$.
106 | 
107 | If we knew what the acceleration was at each step, we could add a correction
108 | term to the above expression for the mean to get an exact answer. This is what
109 | that term would look like:
110 | \begin{gather}
111 | \delta_t = \begin{pmatrix}\frac{\Delta_t^2}{2} \\ \Delta_t\end{pmatrix} \ddot{x}_t
112 | \end{gather}
113 | 
114 | However, we don't have knowledge of $\ddot{x}_t$; all we know is that it's
115 | Gaussian with mean zero and standard deviation 1. We therefore need to turn that
116 | equation into a multivariate Gaussian distribution, which when added to our
117 | mean will yield the total probability distribution for $x_t$. In order to do
118 | that, we need to calculate a covariance matrix based on the vector $\delta_t$
119 | (which is telling us how much the acceleration impacts each of the state
120 | variables).
121 | 
122 | We know the variance of the acceleration, but we need to figure out how that
123 | maps into the variance of the position and velocity. Our $\delta_t$ vector gives
124 | us the scaling factors - if we knew the standard deviation of $\ddot{x}_t$,
125 | then we could scale it by the values in $\delta_t$ to see how it would change
126 | the standard deviation of $x_t$ and $\dot{x}_t$. However, since we're working
127 | with a Gaussian distribution, we want to use variance. That means that we're
128 | working with the squared standard deviation, so we need to apply the
129 | appropriate analogous operation to $\delta_t$ in order to make a covariance
130 | matrix. We can do that by taking $cov(\delta_t)$, which is the coefficient
131 | vector times its own transpose, scaled by the acceleration's variance (see
132 | understanding-covariance.tex in this directory for a derivation of \textit{why}):
133 | 
134 | \begin{align}
135 | R = cov\left(\delta_t\right) &=
136 | \begin{pmatrix}\frac{\Delta_t^2}{2} \\ \Delta_t\end{pmatrix}
137 | \begin{pmatrix}\frac{\Delta_t^2}{2} & \Delta_t\end{pmatrix} \sigma_{\ddot{x}_t}^2 \\
138 | &=
139 | \begin{pmatrix}\frac{\Delta_t^4}{4} & \frac{\Delta_t^3}{2} \\
140 | \frac{\Delta_t^3}{2} & \Delta_t^2 \end{pmatrix} \sigma_{\ddot{x}_t}^2
141 | \end{align}
142 | 
143 | Now we have the moments of our probability $p(x_t | u_t,x_{t-1})$:
144 | \begin{gather}
145 | \overline{\mu}_t = \begin{pmatrix}1 & \Delta_t \\ 0 & 1\end{pmatrix}
146 | \begin{pmatrix}x_{t-1} \\ \dot{x}_{t-1}\end{pmatrix}\\
147 | \overline{\Sigma}_t = \begin{pmatrix}1 & \Delta_t \\ 0 & 1\end{pmatrix}
148 | \Sigma_{t-1}
149 | \begin{pmatrix}1 & 0 \\ \Delta_t & 1\end{pmatrix} +
150 | \begin{pmatrix}\frac{\Delta_t^4}{4} & \frac{\Delta_t^3}{2} \\
151 | \frac{\Delta_t^3}{2} & \Delta_t^2 \end{pmatrix} \sigma_{\ddot{x}_t}^2
152 | \end{gather}
153 | 
154 | We have to provide an initial value for $\Sigma_0$. In this case, I think it's
155 | reasonable to assume that it is $0$. The uncertainty from the random
156 | accelerations at each timestep will propagate into $\Sigma_t$, so it will not
157 | remain $0$ for long.
158 | 
159 | Later, the Kalman gain will be used to select how much to weight the
160 | prediction versus the correction (measurement), based on the relative
161 | magnitudes of the covariances.
162 | 
163 | \item \textit{Implement the state prediction step of the Kalman filter. Assume
164 | that at time $t=0$ we know $x_0 = \dot{x}_0 = \ddot{x}_0 = 0$. Compute the state
165 | distributions for times $t=1,2,...,5$.} (See the Python sketch after Problem 2, below.)
166 | 
167 | \item \textit{For each value of $t$, plot the joint posterior over $x$ and
168 | $\dot{x}$ in a diagram, where $x$ is the horizontal and $\dot{x}$ is the
169 | vertical axis. For each posterior, you are asked to plot an uncertainty
170 | ellipse, which is the ellipse of points that are one standard deviation away
171 | from the mean. Hint: if you do not have access to a mathematics library, you
172 | can create those ellipses by analyzing the eigenvalues of the covariance
173 | matrix.}
174 | 
175 | \item \textit{What will happen to the correlation between $x_t$ and $\dot{x}_t$
176 | as $t\to \infty$?}
177 | 
178 | \end{enumerate}
179 | 
180 | \item \textit{In Chapter 3.2.4, we derived the prediction step of the KF. This
181 | step is often derived with Z transforms or Fourier transforms, using the
182 | Convolution theorem. Re-derive the prediction step using transforms.}
183 | 
184 | I didn't do this one; it seems like a lot more work than it's worth for my
185 | goals. Maybe I'll come back to it...
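As a partial answer to parts (c) and (d) of Problem 1 above, here is a minimal
Python/NumPy sketch of the prediction loop just derived. It assumes $\Delta_t = 1$,
$\sigma^2 = 1$, and $\Sigma_0 = 0$ as discussed, and simply prints
$\overline{\Sigma}_t$ for $t = 1,\dots,5$, along with the $1\sigma$ ellipse
semi-axes obtained from the eigen-decomposition (per the hint in part (d)) and
the correlation coefficient. Treat it as a numerical check rather than a
polished solution.

\begin{verbatim}
import numpy as np

dt = 1.0          # Delta_t = 1, as given in the problem
sigma2 = 1.0      # variance of the random acceleration

A = np.array([[1.0, dt],
              [0.0, 1.0]])
delta = np.array([[dt**2 / 2.0],
                  [dt]])
R = delta @ delta.T * sigma2   # process noise covariance derived above

mu = np.zeros((2, 1))          # x_0 = xdot_0 = 0
Sigma = np.zeros((2, 2))       # Sigma_0 = 0, as assumed above

for t in range(1, 6):
    # Prediction step; there is no control, so B*u_t contributes nothing
    mu = A @ mu
    Sigma = A @ Sigma @ A.T + R

    # Eigen-decomposition gives the axes of the 1-sigma uncertainty ellipse
    evals, evecs = np.linalg.eigh(Sigma)
    semi_axes = np.sqrt(np.maximum(evals, 0.0))  # clip tiny negative round-off
    corr = Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1])
    print(f"t={t}: Sigma={Sigma.tolist()}, "
          f"semi-axes={semi_axes}, corr={corr:.3f}")
\end{verbatim}

The printed correlation coefficient is also a quick way to start building
intuition for the last part of the problem.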
186 | 
187 | \item \textit{We noted in the text that the EKF linearization is an
188 | approximation. To see how bad this approximation is, we ask you to work out
189 | an example. Suppose we have a mobile robot operating in a planar environment.
190 | Its state is its $x$-$y$ location and its global heading $\theta$. Suppose we know
191 | $x$ and $y$ with high certainty, but the orientation $\theta$ is unknown. This is
192 | reflected in our initial estimate:}
193 | 
194 | \begin{align}
195 | \mu &= \begin{pmatrix}0 \\ 0 \\ 0\end{pmatrix} \\
196 | \Sigma &= \begin{pmatrix}0.01 & 0 & 0\\ 0 & 0.01 & 0 \\ 0 & 0 & 10000\end{pmatrix}
197 | \end{align}
198 | 
199 | \begin{enumerate}
200 | \item \textit{Draw, graphically, your best model of the posterior over
201 | the robot pose after the robot moves $d=1$ units forward. For this
202 | exercise, we assume that the robot moves flawlessly without any noise.
203 | Thus, the expected location of the robot after motion will be:}
204 | \begin{align}
205 | \begin{pmatrix}x' \\ y' \\ \theta'\end{pmatrix} =
206 | \begin{pmatrix}x + \cos\theta \\ y + \sin\theta \\ \theta\end{pmatrix}
207 | \end{align}
208 | \item \textit{Now develop this motion into a prediction step for the
209 | EKF. For that, you have to generate a new Gaussian estimate of the
210 | robot pose using the linearized model. You should give the exact
211 | mathematical equations for each of these steps, and state the Gaussian
212 | that results.}
213 | \item \textit{Draw the uncertainty ellipse of the Gaussian and
214 | compare it with your intuitive solution.}
215 | \item \textit{Now incorporate a measurement. Our measurement shall be
216 | a noisy projection of the x-coordinate of the robot, with covariance
217 | $Q=0.01$. Specify the measurement model. Now apply the measurement
218 | both to your intuitive posterior, and formally to the EKF estimate using
219 | the standard EKF machinery. Give the exact result of the EKF, and compare
220 | it with the result of your intuitive analysis.}
221 | \item \textit{Discuss the difference between your estimate of the
222 | posterior, and the Gaussian produced by the EKF. How significant
223 | are those differences? What can be changed to make the approximation
224 | more accurate? What would have happened if the initial orientation had
225 | been known, but not the robot's y-coordinate?}
226 | \end{enumerate}
227 | 
228 | \item \textit{The Kalman filter in Table 3.1 lacked a constant additive term
229 | in the motion and the measurement models. Extend this algorithm to contain
230 | such terms.}
231 | 
232 | \item \textit{Prove (via example) the existence of a sparse information
233 | matrix in multivariate Gaussians (of dimension $d$) that correlate all $d$
234 | variables with correlation coefficients that are $\epsilon$-close to 1. We
235 | say an information matrix is sparse if all but a constant number of elements
236 | in each row and column are zero.}
237 | \end{enumerate}
238 | 
239 | \end{document}
240 | 
--------------------------------------------------------------------------------
/prob-rob/ch3/understanding-covariance.tex:
--------------------------------------------------------------------------------
1 | \documentclass[12pt]{article}
2 | \usepackage[margin=0.5in]{geometry}
3 | \usepackage{amsmath}
4 | \author{Erik Lee}
5 | \date{2014-10-24}
6 | \title{A digression on the mysteries of covariance matrices}
7 | \begin{document}
8 | \maketitle
9 | 
10 | I struggled for a long time trying to get some intuition about why it is that
11 | the covariance matrix should be given by $\vec{v} \vec{v}^T$, where $\vec{v}$ is
12 | the vector of coefficients for the source of variance in the equations of
13 | motion. That sentence is a tortured mess; let me try it with math.
14 | 
15 | If we have these equations of motion:
16 | 
17 | \begin{align}
18 | x_t &= x_{t-1} + \dot{x}_{t-1} \Delta_t + \frac{\Delta_t^2}{2} \ddot{x}_t \\
19 | \dot{x}_t &= \dot{x}_{t-1} + \Delta_t \ddot{x}_t
20 | \end{align}
21 | 
22 | and we want to figure out how the variance of $\ddot{x}$ affects the variances
23 | of $x$ and $\dot{x}$, why should the answer be to make a vector out of the
24 | coefficients of $\ddot{x}$ from each equation and multiply it by its transpose:
25 | 
26 | \begin{align}
27 | cov_a(x,\dot{x}) &= \begin{pmatrix}\frac{\Delta_t^2}{2} \\ \Delta_t\end{pmatrix}
28 | \begin{pmatrix}\frac{\Delta_t^2}{2} & \Delta_t\end{pmatrix}\sigma_a^2\\
29 | &= \begin{pmatrix}\frac{\Delta_t^4}{4} & \frac{\Delta_t^3}{2} \\
30 | \frac{\Delta_t^3}{2} & \Delta_t^2 \end{pmatrix}\sigma_a^2
31 | \end{align}
32 | 
33 | To try to figure this out, I decided to start at the base definition of
34 | covariance and work from there. tl;dr: Since $x$ and $\dot{x}$ are linear
35 | functions with respect to $\ddot{x}$, their covariance term is just the
36 | product of their $\ddot{x}$ coefficients (times $\sigma_a^2$), and their
37 | variances are the squares of those coefficients. If you do the number
38 | crunching, you see that the vector product above gives you all possible
39 | pairwise products of the coefficients, so it's a convenient way to get the
40 | data you need. It also happens to organize it in a way that's predictable,
41 | and can thus be used in the formula for a multivariate Gaussian distribution.
42 | Read on below if you want to see the math I did to get there.
43 | 
44 | Here's the setup: We have two variables that are functions of each other and of
45 | a third variable. The third variable (in this case acceleration) has a variance,
46 | and we want to see how the variance of that third variable propagates
47 | through to the other two variables, given their function definitions:
48 | 
49 | \begin{align}
50 | x &= f(y,a)\\
51 | y &= g(a) \\
52 | cov(x,y) &= E(x y) - E(x)E(y) \\
53 | &= \sum_i f(y,a_i)g(a_i)p(a_i) - \left(\sum_j f(y,a_j)p(a_j)\right)\left(\sum_k g(a_k)p(a_k)\right)
54 | \end{align}
55 | 
56 | Now, that pretty much stopped me, so I decided to assume the functions $f$ and
57 | $g$ are linear in $a$ (which happens to be the case here):
58 | 
59 | \begin{align}
60 | x &= f(y,a)\\
61 | &= f_1(y)F_1a + f_2(y)F_2 + F_3\\
62 | y &= g(a) \\
63 | &= G_1g_1(a) + G_2
64 | \end{align}
65 | 
66 | Now, since all we care about is how changes in $a$ affect the outcome, we can
67 | treat everything that doesn't have an $a$ term as constant, and simplify our
68 | expressions to this:
69 | 
70 | \begin{align}
71 | x &= f(a)\\
72 | &= F_1 a + F_2\\
73 | y &= g(a) \\
74 | &= G_1 a + G_2
75 | \end{align}
76 | 
77 | Now we plug those formulas into the covariance definition to see what the
78 | covariance of $x$ and $y$ is:
79 | 
80 | \begin{align}
81 | & cov(x,y) = E\left(f(a)g(a)\right) -
82 | E\left(f(a)\right)E\left(g(a)\right) \\
83 | &= \sum_i \left[\left(F_1 a_i + F_2\right)\left(G_1 a_i + G_2 \right) -
84 | \left(\sum_j (F_1 a_j+ F_2) p(a_j)\right)
85 | \left(\sum_k (G_1 a_k + G_2) p(a_k)\right)\right]p(a_i)\\
86 | &= \sum_i \left[\left (F_1 G_1 a_i^2 + (F_1 G_2 + F_2 G_1)a_i + F_2 G_2\right) -
87 | \left(F_2 + F_1 \sum_j a_j p(a_j)\right)\left(G_2 + G_1\sum_k a_k
88 | p(a_k)\right)\right]p(a_i)\\
89 | &= \sum_i \left[\left (F_1 G_1 a_i^2 + (F_1 G_2 + F_2 G_1)a_i+F_2 G_2\right) -
90 | \left(F_2 + F_1 E(a)\right)\left(G_2+G_1 E(a)\right)\right]p(a_i)
91 | \end{align}
92 | 
93 | In the above, we just noticed that the sum terms on the right simplify down to
94 | the expected value of $a$ passed through the original functions $f$ and $g$,
95 | which is a constant with respect to the outer sum, so we can pull it out:
96 | 
97 | \begin{align}
98 | &cov(x,y) = \sum_i \left[\left (F_1 G_1 a_i^2 + (F_1 G_2 + F_2 G_1)a_i+F_2 G_2\right) -
99 | \left(F_2 + F_1 E(a)\right)\left(G_2+G_1 E(a)\right)\right]p(a_i)\\
100 | &= \left(\sum_i \left (F_1 G_1 a_i^2 + (F_1 G_2 + F_2 G_1)a_i+F_2 G_2\right)p(a_i)\right) -
101 | \left(F_2 G_2 + (F_2 G_1 + F_1 G_2) E(a) + F_1 G_1 E(a)^2\right)
102 | \end{align}
103 | 
104 | Now we can start to see a pattern. The left term looks a lot like $E(a^2)$ (with
105 | some extra stuff), and the right side looks like $E(a)E(a)$ (with some extra
106 | stuff):
107 | 
108 | \begin{align}
109 | &cov(x,y) = \left(\sum_i (F_1 G_2 + F_2 G_1)a_ip(a_i)\right) + F_1 G_1 E(a^2)
110 | -(F_2 G_1 + F_1 G_2) E(a) - F_1 G_1 E(a)^2
111 | \end{align}
112 | 
113 | Now we notice that the two terms that aren't quadratic both cancel!
114 | 
115 | \begin{align}
116 | &cov(x,y) = \left(\sum_i (F_1 G_2 + F_2 G_1)a_ip(a_i)\right) + F_1 G_1 E(a^2)
117 | -(F_2 G_1 + F_1 G_2) E(a) - F_1 G_1 E(a)^2\\
118 | &= (F_1 G_2 + F_2 G_1)E(a) -(F_2 G_1 + F_1 G_2) E(a) + F_1 G_1 E(a^2) - F_1
119 | G_1 E(a)^2\\
120 | &= F_1 G_1 E(a^2) - F_1 G_1 E(a)^2\\
121 | &= F_1 G_1 (E(a^2) - E(a)^2) \\
122 | &= F_1 G_1 \sigma_a^2
123 | \end{align}
124 | 
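Before summing up, here is a quick numerical sanity check of that result in
Python/NumPy. This is just me double-checking the algebra by simulation; the
particular values of $F_1$, $F_2$, $G_1$, $G_2$, and $\sigma_a$ below are
arbitrary made-up numbers.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
F1, F2 = 0.5, 3.0      # arbitrary coefficients for x = F1*a + F2
G1, G2 = 1.7, -2.0     # arbitrary coefficients for y = G1*a + G2
sigma_a = 1.3          # arbitrary standard deviation for a

a = rng.normal(0.0, sigma_a, size=1_000_000)
x = F1 * a + F2
y = G1 * a + G2

# Empirical covariance vs. the derived result F1*G1*sigma_a^2
print(np.cov(x, y)[0, 1])       # close to 1.4365 (up to sampling noise)
print(F1 * G1 * sigma_a**2)     # 1.4365 exactly
\end{verbatim}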
125 | Phew! Okay, so what we have learned is that the constant factors in the linear
126 | equation all cancel out (no surprise, since we're measuring variance). All we
127 | have to do to figure out what $a$'s variance contributes to $x$'s and $y$'s
128 | variances is to take the coefficients of the terms with $a$ in them from the
129 | equations for $x$ and $y$ and multiply them together! So, it turns out that the
130 | vector product
131 | 
132 | \begin{align}
133 | cov(\vec{x}_a) &= \begin{pmatrix}x_a \\ \dot{x}_a\end{pmatrix}\begin{pmatrix}x_a & \dot{x}_a\end{pmatrix}\sigma_a^2\\
134 | &= \begin{pmatrix}\frac{\Delta_t^2}{2} \\ \Delta_t\end{pmatrix}
135 | \begin{pmatrix}\frac{\Delta_t^2}{2} & \Delta_t\end{pmatrix}\sigma_a^2\\
136 | &= \begin{pmatrix}\frac{\Delta_t^4}{4} & \frac{\Delta_t^3}{2} \\
137 | \frac{\Delta_t^3}{2} & \Delta_t^2 \end{pmatrix}\sigma_a^2
138 | \end{align}
139 | 
140 | (where $x_a$ and $\dot{x}_a$ are the $a$ coefficients of $x$ and $\dot{x}$)
141 | gives us exactly what we're looking for. In general, the covariance of any two
142 | of the variables follows the coefficient product rule above, so what we want is
143 | a matrix that holds all possible pairwise products of the coefficients from the
144 | vector. This is exactly what the outer product above yields, and it is why you
145 | can take the $a$ coefficients, multiply them together, and make a covariance matrix.
146 | 
147 | Hopefully that is as useful to your understanding as it was to mine. I couldn't
148 | see the relationship between the $a$ coefficients of $x,y$ and the covariance
149 | matrix before doing this.
150 | 
151 | \end{document}
152 | 
--------------------------------------------------------------------------------