├── Data
    ├── 2
    │   └── example.csv
    ├── 3
    │   ├── data.zip
    │   └── example.csv
    ├── 4
    │   ├── data.zip
    │   └── example.csv
    ├── 5
    │   └── data.zip
    ├── 6
    │   └── data.zip
    └── 7
    │   └── data.zip
├── LICENSE
├── README.md
└── R_Markdown
    ├── 1-basics.RMd
    ├── 2-recoding-data.Rmd
    ├── 3-importing-external-data.Rmd
    ├── 4-attribute-joins.Rmd
    ├── 5-basic-maps.Rmd
    ├── 6-basic-spatial-analysis.Rmd
    ├── 7-converting-coordinates.Rmd
    └── common-error-msg.Rmd


/Data/2/example.csv:
--------------------------------------------------------------------------------
1 | Name,Age,Place,School,Degree
2 | John,20,Liverpool,Hillside High School,Geography BA (Hons)
3 | Rachel,21,Norwich,Colman High School,Geography & Archaeology BA (Joint Hons)
4 | Helen,34,Liverpool,Hillside High School,Geography BA (Hons)
5 | Mia,20,Liverpool,Central High School,Geography BA (Hons)
6 | Carl,26,Exeter,Central High School,Geography BSc (Hons)
7 | Kerryn,21,Exeter,Central High School,Geography BSc (Hons)


--------------------------------------------------------------------------------
/Data/3/data.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alexsingleton/R-Tutorial-Materials/d5d62b21bee53fc9fceab8c45b240a27560f6c26/Data/3/data.zip


--------------------------------------------------------------------------------
/Data/3/example.csv:
--------------------------------------------------------------------------------
1 | Header text we want to ignore
2 | Name,Age,Place,School
3 | John,20,Liverpool,Hillside High School
4 | Rachel,21,Norwich,Colman High School
5 | Helen,34,Liverpool,Hillside High School
6 | 


--------------------------------------------------------------------------------
/Data/4/data.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alexsingleton/R-Tutorial-Materials/d5d62b21bee53fc9fceab8c45b240a27560f6c26/Data/4/data.zip


--------------------------------------------------------------------------------
/Data/4/example.csv:
--------------------------------------------------------------------------------
1 | Header text we want to ignore
2 | Name,Age,Place,School
3 | John,20,Liverpool,Hillside High School
4 | Rachel,21,Norwich,Colman High School
5 | Helen,34,Liverpool,Hillside High School
6 | 


--------------------------------------------------------------------------------
/Data/5/data.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alexsingleton/R-Tutorial-Materials/d5d62b21bee53fc9fceab8c45b240a27560f6c26/Data/5/data.zip


--------------------------------------------------------------------------------
/Data/6/data.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alexsingleton/R-Tutorial-Materials/d5d62b21bee53fc9fceab8c45b240a27560f6c26/Data/6/data.zip


--------------------------------------------------------------------------------
/Data/7/data.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/alexsingleton/R-Tutorial-Materials/d5d62b21bee53fc9fceab8c45b240a27560f6c26/Data/7/data.zip


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | The MIT License (MIT)
 2 | 
 3 | Copyright (c) 2014 alexsingleton
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of
 6 | this software and associated documentation files (the "Software"), to deal in
 7 | the Software without restriction, including without limitation the rights to
 8 | use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
 9 | the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
17 | FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
18 | COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
19 | IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
20 | CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
21 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Using R as a GIS
2 | ====================
3 | 
4 | This repository provides the code for a series of R tutorials that illustrate the use of R as a GIS. The tutorials are written in R Markdown; PDF versions are available to download [here](http://www.alex-singleton.com/R-Tutorial-Materials/).
5 | 


--------------------------------------------------------------------------------
/R_Markdown/1-basics.RMd:
--------------------------------------------------------------------------------
  1 | ```{r set-options, echo=FALSE,comment=NA, cache=FALSE}
  2 | options(width=65)
  3 | ```
  4 | 
  5 | 1. R Basics - How do I use R?
  6 | ========
  7 | 
  8 | R is not a traditional program like Word, Excel or Chrome. Instead of using the mouse to click on menus, you type in commands, which R then runs. Initially this might be a bit harder to learn, but it means that you can easily rerun the same set of commands without having to remember which menus you clicked. You also have a record of the work you have done. Both of these are very useful, as you will see later on. 
  9 | 
 10 | To get started, just click the R icon, and a new window (called the R console) will appear: 
 11 | 
 12 | ![](r-screenshot.jpg) 
 13 | 
 14 | -----
 15 | 
 16 | >> _One key concept you need to know when using R is the working directory. This is the folder where you keep your R files and any other files you happen to be using. Usually this will be set to `M:/R work` in these helpsheets._ 
 17 | 
 18 | >> _If you want to save files elsewhere, this is fine (just put in the appropriate file path). To find out where your current working directory is, run `getwd()`. To set your working directory, run `setwd("M:/R work")`. Make sure that the directory exists - otherwise R may start to give error messages._
 19 | 
 20 | -----
 21 | 
 22 | ### R as a calculator
 23 | 
 24 | At the most basic level R can be used as a calculator. Try typing the following and then press enter/return.
 25 | 
 26 | ```{r,results="hide"}
 27 | 	6 + 8
 28 | ```
 29 | 
 30 | This should output:
 31 | 
 32 | ```{r,echo=FALSE,comment=NA}
 33 |   6+8
 34 | ```
 35 | 
 36 | Don't worry about the `[1]` for the moment - just note that R printed out `14` since this is the answer to the sum you typed in. These helpsheets will contain a mix of things you should type in (such as `6 + 8` above) and things R will output (`14` above). They will always use the same style. 
 37 | 
 38 | R uses * for multiplication, so for example 5 times 4 is:
 39 | 
 40 | ```{r,results='hide'}
 41 |   5 * 4
 42 | ```
 43 | 
 44 | Which outputs:
 45 | 
 46 | ```{r,echo=FALSE,comment=NA}
 47 |   5 * 4
 48 | ```
 49 | 
 50 | You also have '`-`' for subtraction and '`/`' for division:
 51 | 
 52 | ```{r,results='hide'}
 53 |   15 - 8
 54 | ```
 55 | 
 56 | ```{r,echo=FALSE,comment=NA}
 57 |   15 - 8
 58 | ```
 59 | 
 60 | ```{r,results='hide'}
 61 |   125  /  5
 62 | ```
 63 | 
 64 | ```{r,echo=FALSE,comment=NA}
 65 |   125 / 5
 66 | ```
 67 | 
 68 | R also has functions like square root, sine, cosine and so on. For example:
 69 | 
 70 | ```{r,results='hide'}
 71 |   sqrt(25)
 72 | ```
 73 | 
 74 | ```{r,echo=FALSE,comment=NA}
 75 |   sqrt(25)
 76 | ```
 77 | 
 78 | These expressions can also be combined all in one line:
 79 | 
 80 | ```{r,results='hide'}
 81 |   sqrt(5 + 6 * 10 / 4)
 82 | ```
 83 | 
 84 | ```{r,echo=FALSE,comment=NA}
 85 |   sqrt(5 + 6 * 10 / 4)
 86 | ```
 87 | 
 88 | -----
 89 | 
 90 | >>_Note: It is always important to remember the order of combined sums like the one above. Remember that `5 + 6 * 10 / 4` is not the same as `(5 + 6) * 10 / 4`. Remember to put in brackets if they are required (see http://www.mathsisfun.com/operation-order-bodmas.html or http://en.wikipedia.org/wiki/Order\_of\_operations)_.
 91 | 
 92 | -----
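
As a quick check of the note above, the following sketch evaluates both versions of the sum so that the effect of the brackets is visible. The variable names `without.brackets` and `with.brackets` are made up for this illustration and are not used elsewhere in the helpsheets.

```r
# Multiplication and division are carried out before addition:
# 6 * 10 / 4 is worked out first, and 5 is added afterwards
without.brackets <- 5 + 6 * 10 / 4

# Brackets force the addition to happen first:
# (5 + 6) is worked out first, then multiplied by 10 and divided by 4
with.brackets <- (5 + 6) * 10 / 4

# Print both results to compare them
without.brackets
with.brackets
```

The two results differ, which is why it is worth adding brackets whenever the intended order is not obvious.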
 93 | 
 94 | ### Variables
 95 | 
 96 | You can also assign numbers and results from calculations to variables as follows:
 97 | 
 98 | ```{r,results='hide'}
 99 |   price <- 300
100 | ```
101 | 
102 | This has stored the value `300` in the variable `price`. The `<-` symbol puts the value on the right into the variable on the left. It is typed with a `<` followed by a `-`. 
103 | 
104 | We can then do calculations using this variable in the same way as the numbers above. For example if we wanted to reduce the `price` by 20% we could do this:
105 | 
106 | ```{r,results='hide'}
107 |   price - price * 0.2
108 | ```
109 | 
110 | ```{r,echo=FALSE,comment=NA}
111 |   price - price * 0.2
112 | ```
113 | 
114 | Or use multiple variables in a calculation:
115 | 
116 | ```{r,results='hide'}
117 |   discount <- price * 0.2
118 |   price - discount
119 | ```
120 | 
121 | ```{r,echo=FALSE,comment=NA}
122 |   discount <- price * 0.2
123 |   price - discount
124 | ```
125 | 
126 | -----
127 | 
128 | >> _Remember in R that variables are case-sensitive (a bit like passwords). This means that `price` is not the same variable as `Price`. Try it with `discount <- Price * 0.2`. What happens? It gives you an error message like this:_
129 | 
130 | >> _`Error: object 'Price' not found`_
131 | 
132 | >>_This means it can't find the object / variable `Price`._ 
133 | 
134 | >>_If you ever want to check which variables are defined in your "workspace", just run `ls()` and R will print a list of the variables you have. `rm(x)` where `x` is the variable will remove the variable - note there is no undo! Try `rm(price)` which will remove the variable price._ 
135 | 
136 | -----
137 | 
138 | R can also work with lists of numbers as well as individual ones. You can specify a list of numbers using the `c` function. Suppose you have a list of house prices, specified in thousands of pounds. You could store them in a variable called house.prices like this:
139 | 
140 | <!-- ?this is something the students should type in to get practice? -->
141 | 
142 | ```{r,results='hide'}
143 |   house.prices <- c(120, 150, 212, 99, 199, 299, 159)
144 |   house.prices
145 | ```
146 | 
147 | ```{r,echo=FALSE,comment=NA}
148 |   house.prices <- c(120, 150, 212, 99, 199, 299, 159)
149 |   house.prices
150 | ```
151 | 
152 | Variable names can contain full stops in them, like the `house.prices` example above; they still work in the same way. 
153 | 
154 | You can apply functions to the list. For example, to take the average of a list, enter:
155 | 
156 | ```{r,results='hide'}
157 |   mean(house.prices)
158 | ```
159 | 
160 | ```{r,echo=FALSE,comment=NA}
161 |   mean(house.prices)
162 | ```
163 | 
164 | If the house prices are in thousands of pounds, then this tells us that the mean house price is `176.9` thousand pounds. Note here that on your display, the answer may be displayed with more significant digits, so you may have something like `176.8571` as the mean value.
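
To see what `mean()` is doing, the sketch below works out the same average by hand, dividing the total of the values by how many there are. It assumes the seven house prices entered above; `total` and `count` are names chosen just for this example.

```r
# The seven house prices from above (in thousands of pounds)
house.prices <- c(120, 150, 212, 99, 199, 299, 159)

# Add up all of the values, and count how many there are
total <- sum(house.prices)
count <- length(house.prices)

# Dividing the total by the count gives the same answer as mean()
total / count
mean(house.prices)

# round() reduces the number of decimal places that are shown
round(mean(house.prices), 1)
```

The first two results should match each other, and the rounded value is the one quoted in the text.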
165 | 
166 | ### Data Frames
167 | 
168 | Data frames are an important component of R and worth spending some time on. They are like a spreadsheet, in that they can have columns of related information. We are going to create something like this:
169 | 
170 | House Price | Burglary Rate
171 | --- | ---
172 | 200 | 0
173 | 130 | 7
174 | 200 | 0
175 | 200 | 0
176 | ... | ...
177 | 
178 | Add the following two lists into R, by copying and pasting the code:
179 | 
180 | ```{r,results='hide'}
181 | house.prices <- c(200, 130, 200, 200, 180, 140, 65, 220, 180, 200, 210, 170, 
182 |     180, 160, 180, 130, 240, 180, 170, 230, 150, 200, 200, 210, 220, 180, 200, 
183 |     210, 150, 200, 230, 120, 180, 180, 190, 72, 80, 190, 220, 150, 200, 170, 
184 |     170, 230, 200, 160, 140, 100, 140, 170, 180, 260, 170, 230, 190, 220, 140, 
185 |     220, 120, 96, 210, 170, 180, 140, 150, 67, 200, 230, 140, 230, 83, 170, 
186 |     200, 210, 240, 180, 200, 210, 250, 140, 130, 190, 110, 160, 150, 230, 160, 
187 |     210, 200, 230, 210, 190, 120, 180, 87, 160, 190, 190, 230, 180, 110, 200, 
188 |     250, 180, 200, 130, 180, 190, 190, 230, 210, 210, 150, 190, 210, 200, 210, 
189 |     170)
190 | 
191 | burg.rates <- c(0, 7, 0, 0, 6, 19, 32, 0, 0, 0, 15, 6, 12, 8, 7, 6, 0, 0, 6, 
192 |     0, 7, 0, 0, 0, 0, 0, 0, 0, 17, 0, 0, 21, 7, 12, 7, 36, 18, 0, 0, 7, 6, 0, 
193 |     0, 0, 0, 0, 13, 22, 0, 0, 0, 7, 12, 7, 5, 11, 0, 0, 13, 13, 0, 6, 15, 6, 
194 |     17, 37, 0, 6, 6, 5, 24, 0, 0, 0, 0, 0, 0, 0, 5, 15, 0, 5, 6, 0, 0, 0, 13, 
195 |     0, 6, 0, 0, 0, 23, 6, 13, 15, 6, 0, 0, 7, 7, 0, 0, 0, 0, 19, 13, 0, 0, 0, 
196 |     6, 9, 0, 0, 0, 0, 0, 5)
197 | ```
198 | 
199 | Now, before we go any further we need to make sure that all the data have been entered into R correctly. We can see how many items are in each variable using `length(x)` where `x` is the variable name.
200 | 
201 | ```{r,results='hide'}
202 |   length(house.prices)
203 |   length(burg.rates)
204 | ```
205 | 
206 | You should get `118` for both. If you don't, try adding the numbers into the variables again. 
207 | 
208 | For any variable, if you just type its name (e.g. `burg.rates`) R will list all of the values contained within it:
209 | 
210 | ```{r,echo=FALSE,comment=NA}
211 |   burg.rates
212 | ```
213 | 
214 | This command shows all of the values, along with some numbers in square brackets. These give the position in the list of the first number of each row. For the example above, the second row begins with `[21]`, which means that the first number in this row (a `7` in this case) is the 21st number in the list. These markers make it easier to find the position of numbers further along the list.
215 | 
216 | -----
217 | 
218 | >>_A handy hint to remember is that pressing up on the keyboard will get R to show the previous command you typed - handy if you want to repeat something, or make a small correction. Pressing up again will take you to further previous commands, and so on. Try this now._
219 | 
220 | >>_You can also use the "`history()`" command, which shows the commands that you have typed recently; in the R console on Windows it opens them in a new window. How "`history()`" behaves depends on the platform and console you are using, so it may work differently, or not at all, on OS X or Ubuntu._
221 | 
222 | -----
223 | 
224 | We can now merge these two lists together (`house.prices` and `burg.rates`) using a data frame. You can think of it as a bit like a spreadsheet where all relevant data are stored together as a set of columns. This is similar to the data set storage in SPSS where each variable corresponds to a column and each case (or observation) corresponds to a row. However, while SPSS can only have one data set active at a time, in R you can have several of them, similar to multiple sheets in an Excel workbook. These are stored in your workspace. 
225 | 
226 | To create a data frame containing the two lists enter:
227 | 
228 | ```{r,results='hide'}
229 |   hp.data <- data.frame(Burglary = burg.rates, Price = house.prices)
230 | ```
231 | 
232 | Then type in its name to list it:
233 | 
234 | ```{r,results='hide'}
235 |   hp.data
236 | ```
237 |  
238 | A little bit of explanation: *read this to understand what has just happened!*
239 | 
240 | The function `data.frame` takes all of the variables that you wish to have as columns. The `Burglary = burg.rates` argument creates a column in the data frame called `Burglary`, containing the values in the variable `burg.rates` created earlier. Similarly, `Price = house.prices` creates a column called `Price` containing the values from `house.prices`. The new data frame is stored in an object called `hp.data` (an object in R is similar to a variable, although it can be more complex, so it can contain more sophisticated things like data frames, not just a list of values). Typing in the name of a data frame object (once it has been created) lists the values in the columns.
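
The same idea can be tried on a much smaller scale. The sketch below builds a four-row data frame from the first four values of each list above; the names `burg.sample`, `price.sample` and `small.data` are invented for this illustration. Note that the two lists must have the same number of values, one per row.

```r
# Two short lists of equal length (the first four values of each list above)
burg.sample <- c(0, 7, 0, 0)
price.sample <- c(200, 130, 200, 200)

# Each argument of the form  ColumnName = values  becomes one column
small.data <- data.frame(Burglary = burg.sample, Price = price.sample)

# List the data frame, then check its column names and structure
small.data
names(small.data)
str(small.data)
```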
241 | 
242 | ### Summarising Data
243 | 
244 | With a data frame, we can use the `ncol()` and `nrow()` commands to see how many columns and rows are in the data frame. Try this now:
245 | 
246 | ```{r,results='hide'}
247 |   ncol(hp.data)
248 | ```
249 | 
250 | ```{r,echo=FALSE,comment=NA}
251 |   ncol(hp.data)
252 | ```
253 | 
254 | ```{r,results='hide'}
255 |   nrow(hp.data)
256 | ```
257 | 
258 | ```{r,echo=FALSE,comment=NA}
259 |   nrow(hp.data)
260 | ```
261 | 
262 | We can also get R to give summary values of the data frame by typing:
263 | 
264 | ```{r,results='hide'}
265 |   summary(hp.data)
266 | ```
267 | 
268 | ```{r,echo=FALSE,comment=NA}
269 |   summary(hp.data)
270 | ```
271 | 
272 | For each column, a number of values are listed:
273 | 
274 | Item | Description
275 | --- | ---
276 | Min. | The smallest value in the column
277 | 1st Qu. | The first quartile (the value ¼ of the way along a sorted list of values)
278 | Median | The median (the value ½ of the way along a sorted list of values)
279 | Mean | The average of the column
280 | 3rd Qu. | The third quartile (the value ¾ of the way along a sorted list of values)
281 | Max. | The largest value in the column
282 | 
283 | Between these numbers, an impression of the spread of values of each variable can be obtained. In particular it is possible to see that the median house price in this area by neighborhood ranges from £65,000 to £260,000, and that half of the prices lie between £152,500 and £210,000. Also, it can be seen that since the median measured burglary rate is zero, then at least half of the areas had no burglaries in the month when counts were compiled.
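
The values reported by `summary()` can also be requested individually. The sketch below uses a short, made-up selection of eight prices (the first eight house prices above) stored in a variable called `prices`, which is a name chosen only for this example. When a quartile falls between two values in the sorted list, R interpolates between them by default.

```r
# A short list of prices (thousands of pounds), used only for this example
prices <- c(200, 130, 200, 200, 180, 140, 65, 220)

# The six summary values in one go
summary(prices)

# The same quantities requested one at a time
min(prices)
median(prices)
max(prices)

# quantile() returns the 1st quartile, median and 3rd quartile
quantile(prices, probs = c(0.25, 0.5, 0.75))
```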
284 | 
285 | ### Data Subsets
286 | 
287 | As we saw above, you can get a summary of a data set using `summary()`, which gives some summary statistics for the data set specified between the brackets. As well as printing out the whole data set by typing the object's name (`hp.data`), we can get R to output the first 6 rows using the `head()` command. 
288 | 
289 | ```{r,results='hide'}
290 | head(hp.data)
291 | ```
292 | 
293 | We can also specify a specific row and/or column using either numbers in square brackets `[]` or column/row names. For example:
294 | 
295 | ```{r,results='hide'}
296 | hp.data[15,2] 
297 | ```
298 | 
299 | This prints the value in the 15th row and the 2nd column, `180` in our case. You can also refer to rows and columns by name, using quote marks:
300 | 
301 | ```{r,results='hide'}
302 | hp.data[15,"Price"]
303 | ```
304 | 
305 | You can also get a range of rows, using a colon:
306 | 
307 | ```{r,results='hide'}
308 | hp.data[10:15,"Price"]
309 | ```
310 | 
311 | This lists the house prices in rows 10-15 of the data set.
312 | 
313 | If you want to specify a full column (i.e. all of the burglary rates), just leave the part where you would write the row range empty:
314 | 
315 | ```{r,results='hide'}
316 | hp.data[ ,"Burglary"]
317 | ```
318 | 
319 | You can use a similar approach to select a row of the data:
320 | 
321 | ```{r,results='hide'}
322 | hp.data[12, ]
323 | ```
324 | 
325 | This gives Burglary and Price (i.e. house price) values for the 12th row in the data frame.
326 | 
327 | Another way of selecting columns is to use the $ (dollar) approach.
328 | 
329 | ```{r,results='hide'}
330 | hp.data$Price
331 | ```
332 | 
333 | This prints the column called `Price`.
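
The different ways of selecting data can be compared side by side on a small example. This sketch assumes a six-row data frame called `small.hp` (a name invented here) built from the first six values of each list, so that each result is easy to check by eye.

```r
# A small data frame with the same column names as hp.data
small.hp <- data.frame(Burglary = c(0, 7, 0, 0, 6, 19),
                       Price = c(200, 130, 200, 200, 180, 140))

small.hp[2, 2]             # row 2, column 2 (by position)
small.hp[2, "Price"]       # the same value, with the column given by name
small.hp[2:4, "Price"]     # rows 2 to 4 of the Price column
small.hp[ , "Burglary"]    # every row of the Burglary column
small.hp[2, ]              # every column of row 2
small.hp$Price             # the whole Price column, using the $ approach
```

Selecting a single column gives back a plain list of values, whereas selecting a whole row gives back a one-row data frame.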
334 | 
335 | ### Graphics
336 | 
337 | There are a range of simple graphics that R can do to help you understand your data. 
338 | 
339 | Firstly a histogram:
340 | 
341 | ```{r,results='hide',eval=FALSE}
342 |   hist(burg.rates)
343 | ```
344 | 
345 | ```{r,results='hide',echo=FALSE,comment=NA,warning=FALSE}
346 |   pdf('plot1.pdf',4, 4)
347 |   hist(burg.rates)
348 |   dev.off()
349 | ```
350 | 
351 | \begin{center}
352 | \includegraphics{plot1.pdf}
353 | \par
354 | \end{center}
355 | 
356 | A new window will appear with the histogram in, and you can copy and paste this into Word, PowerPoint or elsewhere. R will generally give basic plots unless you tell it otherwise. To get a histogram with red bars, enter:
357 | 
358 | ```{r,results='hide',eval=FALSE}
359 |   hist(burg.rates, col = "red")
360 | ```
361 | 
362 | ```{r,results='hide',echo=FALSE,comment=NA,warning=FALSE}
363 |   pdf('plot2.pdf', 4, 4)
364 |   hist(burg.rates, col = "red")
365 |   dev.off()
366 | ```
367 | 
368 | \begin{center}
369 | \includegraphics{plot2.pdf}
370 | \par
371 | \end{center}
372 | 
373 | To add a title (main), x-axis label (xlab) and y-axis label (ylab), use:
374 | 
375 | ```{r,results='hide',eval=FALSE}
376 |   hist(burg.rates, col = "red", main = "Burglaries per 1000 households", xlab = "Rate", 
377 |     ylab = "Frequency")
378 | ```
379 | 
380 | ```{r,results='hide',echo=FALSE,comment=NA,warning=FALSE}
381 |   pdf('plot3.pdf', 4, 4)
382 |   hist(burg.rates, col = "red", main = "Burglaries per 1000 households", xlab = "Rate", 
383 |     ylab = "Frequency")
384 |   dev.off()
385 | ```
386 | 
387 | \begin{center}
388 | \includegraphics{plot3.pdf}
389 | \par
390 | \end{center}
391 | 
392 | You can also see the relationship between the two variables (median house price and burglary rates) by creating a scatter plot:
393 | 
394 | ```{r,results='hide',eval=FALSE}
395 |   plot(burg.rates, house.prices, main = "Burglary vs. House Price", 
396 |        xlab = "Burglaries (per 1000 households)", 
397 |        ylab = "Median House Price (1000s Pounds)")
398 | ```
399 | 
400 | ```{r,results='hide',echo=FALSE,comment=NA,warning=FALSE}
401 |   pdf('plot4.pdf', 4, 4)
402 |   plot(burg.rates, house.prices, main = "Burglary vs. House Price", 
403 |        xlab = "Burglaries (per 1000 households)", 
404 |        ylab = "Median House Price (1000s Pounds)")
405 |   dev.off()
406 | ```
407 | 
408 | \begin{center}
409 | \includegraphics{plot4.pdf}
410 | \par
411 | \end{center}
412 | 
413 | This shows that there is a relationship between the two quantities, although there is still a fair amount of randomness as well. The points show there is a general tendency for house prices to fall as burglary rate increases, but that there are other factors affecting house prices as well.
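
One way to put a number on this tendency is the correlation coefficient, which R calculates with `cor()`. This is not part of the original helpsheet; the sketch below is only an illustration, and it assumes just the first twelve values of each list (stored in the made-up variables `burg.sample` and `price.sample`) rather than the full data set.

```r
# First twelve values of each list, used only for this illustration
burg.sample <- c(0, 7, 0, 0, 6, 19, 32, 0, 0, 0, 15, 6)
price.sample <- c(200, 130, 200, 200, 180, 140, 65, 220, 180, 200, 210, 170)

# cor() returns a value between -1 and 1
r <- cor(burg.sample, price.sample)
r
```

A negative value indicates that house prices tend to be lower where the burglary rate is higher. A value well away from -1 reflects the remaining scatter in the plot.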


--------------------------------------------------------------------------------
/R_Markdown/2-recoding-data.Rmd:
--------------------------------------------------------------------------------
  1 | ```{r set-options, echo=FALSE,comment=NA, cache=FALSE}
  2 | options(width=87)
  3 | ```
  4 | 
  5 | 2. Reworking and Recoding Data
  6 | =================
  7 | 
  8 | Often when working on a project you will have a data set that contains information you don't need for your analysis, or attributes that aren't specified as you require. This helpsheet explains how to remove attributes and add new ones. 
  9 | 
 10 | For example, let's say we have a data set as follows:
 11 | 
 12 | Name | Age | Place | School | Degree
 13 | --- | --- | --- | --- | ---
 14 | John | 20 | Liverpool | Hillside High School | Geography BA (Hons)
 15 | Rachel | 21 | Norwich | Colman High School | Geography & Archaeology BA (Joint Hons)
 16 | ... | ... | ... | ... | ...
 17 | 
 18 | And we are only interested in people's age for this exercise. As such, we don't need all of the other data. 
 19 | 
 20 | Before we start, we need to set up the working directory and read in the data:
 21 | 
 22 | >>*For more information on working directories, see the worksheet '1. R Basics'. Remember to create the folder `R work` if it doesn't exist already.*
 23 | 
 24 | ```{r,echo=FALSE,comment=NA,results='hide'}
 25 |   setwd("/Users/nickbearman/Dropbox/r-helpsheets/helpsheets/2-recoding-data")
 26 | ```
 27 | <!---```{r,echo=FALSE,comment=NA,results='hide'}
 28 |   setwd("C:/Users/Nick & Louise/Dropbox/r-helpsheets/worksheets/2-recoding-data")
 29 | ``` -->
 30 | ```{r,echo=FALSE,comment=NA,results='hide'}
 31 |   file_location <- "example.csv"
 32 |   data <- read.csv(file_location, header = TRUE)
 33 | ```
 34 | ```{r,eval=FALSE,results='hide'}
 35 |   # Set working directory
 36 |   setwd("M:/R work")
 37 |   # Read data from the web  
 38 |   data <- read.csv("http://data.alex-singleton.com/r-helpsheets/2/example.csv", header = TRUE)
 39 | ```
 40 | 
 41 | We will now display this to check it has been read in correctly:
 42 | 
 43 | ```{r,results='hide'}
 44 |   data
 45 | ```
 46 | 
 47 | Which should give you this:
 48 | 
 49 | ```{r,echo=FALSE,comment=NA}
 50 |   data
 51 | ```
 52 | 
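 53 | The `subset` command can be used to extract just the specified columns (and/or rows) from the data set. For example, the first command below keeps only the `Name` and `Age` columns, and the second also restricts the rows to people whose `Place` is Liverpool: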
 54 | 
 55 | ```{r,results='hide'}
 56 |   subset(data, select = c("Name", "Age"))
 57 | ```
 58 |  
 59 | ```{r,results='hide'}
 60 |   subset(data, Place == "Liverpool", select = c("Name", "Age"))
 61 | ```
 62 | 
 63 | We can also store this as a new object:
 64 | 
 65 | ```{r,results='hide'}
 66 |   data.Liverpool <- subset(data, Place == "Liverpool", select = c("Name", "Age"))
 67 | ```
 68 | 
 69 | Because the statement assigns the output of the subset function to the new object called `"data.Liverpool"`, nothing will be printed. As such, we can check by typing `data.Liverpool`:
 70 | 
 71 | ```{r,echo=FALSE,comment=NA}
 72 |   data.Liverpool
 73 | ```
 74 | 
 75 | Adding a column to a data frame is done using the $ symbol. We will initially store `NA` (i.e. no value) in the column. 
 76 | 
 77 | ```{r,results='hide'}
 78 |   data.Liverpool$diff100 <- NA
 79 | ```
 80 | 
 81 | We use the same principle to calculate the difference between each person's age and 100:
 82 | 
 83 | ```{r,results='hide'}
 84 |   data.Liverpool$diff100 <- 100 - data.Liverpool$Age
 85 | ```
 86 | 
 87 | Perhaps we decide that we don't like the label of the first column "`Name`" and that it would be more appropriate to call it "`FirstName`". To make this change we create a variable with the column labels that we want:
 88 | 
 89 | ```{r,results='hide'}
 90 |   new_column_names <- c("FirstName","Age","diff100")
 91 | ```
 92 | 
 93 | When doing this it is always a good idea to check that the length of the object we have just created (it should be `3`) is the same as the number of columns in our data frame. 
 94 | 
 95 | ```{r,results='hide'}
 96 | length(new_column_names)
 97 | ```
 98 | 
 99 | ```{r,results='hide'}
100 | ncol(data.Liverpool)
101 | ```
102 | 
103 | We can then apply the new column names to the data frame:
104 | 
105 | ```{r,results='hide'}
106 | colnames(data.Liverpool) <- new_column_names
107 | ```
108 | 
109 | Check the data frame now, and the names should be changed. 
110 | 
111 | ```{r,echo=FALSE,comment=NA}
112 |   data.Liverpool
113 | ```
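
If only one column needs a new name, an alternative is to change that single name and leave the others alone. The sketch below is one possible way of doing this, not the method used in the helpsheet. It assumes a small made-up data frame called `people` so that it can be run on its own.

```r
# A small data frame to experiment with
people <- data.frame(Name = c("John", "Helen", "Mia"), Age = c(20, 34, 20))

# Find the column currently called "Name" and replace just that name
names(people)[names(people) == "Name"] <- "FirstName"

# Check the result
names(people)
```

Because the column is found by its current name, this approach does not depend on the number or order of the columns.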
114 | 
115 | Instead of recording people's age in years, perhaps we just need this in two categories - 21 and over, and under 21. We can _recode_ the `Age` variable into a new variable as follows:
116 | 
117 | ```{r,results='hide'}
118 |   data.Liverpool$AgeCat[data.Liverpool$Age < 21] <- "Under 21"
119 |   data.Liverpool$AgeCat[data.Liverpool$Age >= 21] <- "21 or over"
120 | ```
121 | 
122 | This will create the new variable, `AgeCat`. To see what has happened to the object, print the `data.Liverpool` again:
123 | 
124 | ```{r,echo=FALSE,comment=NA}
125 |   data.Liverpool
126 | ```
127 | 
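
The same recoding can also be written in a single step with the `ifelse()` function, which tests every value and picks one of two labels. This is offered as an alternative sketch rather than the helpsheet's own approach. The variables `ages` and `age.cat` are made up for the example, using the ages of the three people from Liverpool.

```r
# Ages of the three people from Liverpool (John, Helen and Mia)
ages <- c(20, 34, 20)

# For each age: if it is below 21 use the first label, otherwise the second
age.cat <- ifelse(ages < 21, "Under 21", "21 or over")
age.cat

# Count how many people fall into each category
table(age.cat)
```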


--------------------------------------------------------------------------------
/R_Markdown/3-importing-external-data.Rmd:
--------------------------------------------------------------------------------
  1 | 3. Importing External Data
  2 | ===============
  3 | 
  4 | Often one of the first steps when doing a project in R is to import some data. This helpsheet will cover reading in a CSV file and a Shapefile. A CSV file is a basic format for data; a Shapefile is a collection of files that relate to geographic features (points, lines or polygons), associated attribute data and their projection information. Once such files have been read into R, you might need to tidy them up before doing any analysis - see helpsheet "2. Reworking and Recoding Data", for more information. 
  5 | 
  6 | ### CSV Files
  7 | 
  8 | CSV (Comma Separated Values) files typically look like this when opened in a text editor:
  9 | 
 10 | ```
 11 |   colname1,colname2,....
 12 |   row1value,row1value,....
 13 |   row2value,row2value,....
 14 | ```
 15 | 
 16 | Each column is separated by a comma, and each row by a line break. We will now read an example CSV file into a data frame in R. This is available as a file (`example.csv`) which we will use in this exercise, and looks like:
 17 | 
 18 | ```
 19 |   Header text we want to ignore
 20 |   Name,Age,Place,School  
 21 |   John,20,Liverpool,Hillside High School  
 22 |   Rachel,21,Norwich,Colman High School  
 23 |   Helen,34,Liverpool,Hillside High School
 24 | ```
 25 | 
 26 | To read the file in, run this command:
 27 | ```{r,echo=FALSE,comment=NA,results='hide'}
 28 |   setwd("/Users/nickbearman/Dropbox/r-helpsheets/helpsheets/3-importing-external-data")
 29 |   file_location <- "example.csv"
 30 | ```
 31 | <!---
 32 | ```{r,echo=FALSE,comment=NA,results='hide'}
 33 |   setwd("C:/Users/Nick & Louise/Dropbox/r-helpsheets/helpsheets/3-importing-external-data")
 34 | ```
 35 | -->
 36 | ```{r,echo=FALSE,comment=NA,results='hide'}
 37 |   file_location <- "example.csv"
 38 |   data <- read.csv(file_location, header = TRUE, skip = 1)
 39 | ```
 40 | ```{r,eval=FALSE,results='hide'}
 41 |   # Set working directory
 42 |   setwd("M:/R work")
 43 |   # Read data from the web
 44 |   file_location <- "http://data.alex-singleton.com/r-helpsheets/3/example.csv"
 45 |   data <- read.csv(file_location, header = TRUE, skip = 1)
 46 | ```
 47 | 
 48 | And to check that it has been input correctly, which is always a good idea with R, run:
 49 | 
 50 | ```{r,results='hide'}
 51 |   data
 52 | ```
 53 | 
 54 | This should output:
 55 | 
 56 | ```{r,echo=FALSE,comment=NA}
 57 |   data
 58 | ```
 59 | 
 60 | Here, the object we created is called `"data"` and the function that we used is called `"read.csv"`, which has a number of options: 
 61 | 
 62 | 1. '`file_location`' is where the file is stored: either a web address, as above, or a file name within your working directory (see helpsheet "1. R Basics" for more details).
 63 | 
 64 | 2. '`header = TRUE`' tells R that the CSV file has some header information (column names) in it, in this case `Name`, `Age`, `Place` and `School`.
 65 | 
 66 | 3. '`skip = 1`' tells R to ignore the first line of the CSV file as we don't want this in the data set. This was specified as `"Header text we want to ignore"` in the file. Remember that R is case-sensitive, so these options must be typed in lower case.
 67 | 
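
To experiment with these options without downloading anything, the contents of a CSV file can be supplied directly as text. The sketch below assumes the same five lines as `example.csv`, typed in by hand and stored in a made-up variable called `csv.text`. The `text` argument of `read.csv` is used in place of a file location.

```r
# The contents of the example file, written out as a piece of text
csv.text <- "Header text we want to ignore
Name,Age,Place,School
John,20,Liverpool,Hillside High School
Rachel,21,Norwich,Colman High School
Helen,34,Liverpool,Hillside High School"

# skip = 1 ignores the first line; header = TRUE treats the next line as column names
people <- read.csv(text = csv.text, header = TRUE, skip = 1)

# Check the result
people
colnames(people)
```

Leaving out `skip = 1` would make R treat the unwanted first line as the column names, so it is a useful option to try changing.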
 68 | We can now look at the object `data` in the normal way and, for example, check the column names using:
 69 | 
 70 | ```{r,results='hide'}
 71 |   colnames(data)
 72 | ```
 73 | 
 74 | Which should output:
 75 | 
 76 | ```{r,echo=FALSE,comment=NA}
 77 |   colnames(data)
 78 | ```
 79 | 
 80 | If you want to rename columns or "recode" the attributes of your data, see helpsheet "2. Reworking and Recoding Data".
 81 | 
 82 | ### Shapefiles
 83 | 
 84 | Shapefiles contain geographic data that we can also read into R, but to do this R needs some additional packages. These are already installed, but just need to be loaded. 
 85 | 
 86 | To do this, run these commands:
 87 | 
 88 | ```{r,results='hide',message=FALSE}
 89 |   library(sp)
 90 |   library(rgeos)
 91 |   library(maptools)
 92 |   library(RColorBrewer)
 93 |   library(GISTools)
 94 |   library(rgdal)
 95 | ```
 96 | 
 97 | When you load each package, R will write some output to the console. Check for any error messages, and if everything seems to have worked, continue to the next section. 
 98 | 
 99 | We can read in a Shapefile and then display it in R. 
100 | 
101 | ```{r,echo=FALSE,comment=NA,results='hide'}
102 |   setwd("/Users/nickbearman/Dropbox/r-helpsheets/helpsheets/3-importing-external-data")
103 | ```
104 | <!---
105 | ```{r,echo=FALSE,comment=NA,results='hide'}
106 |   setwd("C:/Users/Nick & Louise/Dropbox/r-helpsheets/helpsheets/3-importing-external-data")
107 | ```
108 | -->
109 | ```{r,eval=FALSE,results='hide'}
110 |   # Set working directory
111 |   setwd("M:/R work")
112 |   # Download data.zip from the web
113 |   download.file("http://data.alex-singleton.com/r-helpsheets/3/data.zip", "data.zip")
114 |   # Unzip file
115 |   unzip("data.zip")
116 | ```
117 | ```{r,results='hide'}
118 |   # Read in Shapefile
119 |   Wards <- readOGR(".", "england-caswa_2001")
120 | ```
121 | 
122 | 
123 | ```{r,eval=FALSE,highlight=TRUE}
124 | plot(Wards)
125 | ```
126 | 
127 | ```{r,results='hide',echo=FALSE,comment=NA,warning=FALSE}
128 |   pdf('plot1.pdf', 4, 4)
129 |   plot(Wards)
130 |   dev.off()
131 | ```
132 | 
133 | ![Image](plot1.pdf)\
134 |   
135 |   
136 | The object `Wards` now contains the contents of the Shapefile, stored as a new type of object called a SpatialPolygonsDataFrame. If the Shapefile had contained lines (e.g. roads), this would be a SpatialLinesDataFrame, or for points, a SpatialPointsDataFrame. These object types contain the spatial information (e.g. the boundary locations) as well as attribute data for each of the spatial features (e.g. each ward). The SpatialPolygonsDataFrame contains a number of different 'slots', each of which holds different information. Use the `slotNames` function to get a list of the different slots:
137 | 
138 | ```{r,results='hide'}
139 |   slotNames(Wards)
140 | ```
141 | 
142 | ```{r,echo=FALSE,comment=NA}
143 |   slotNames(Wards)
144 | ```
145 | 
146 | The slot `data` contains the attribute information for the Shapefile, and this is accessed using the `@` symbol:
147 | 
148 | ```{r,results='hide'}
149 |   head(Wards@data)
150 | ```
151 | 
152 | ```{r,echo=FALSE,comment=NA}
153 |   head(Wards@data)
154 | ```
155 | 
156 | The data slot can be accessed in the same way as any standard data frame. 
157 | 
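Slots and the `@` operator are general features of R's S4 objects rather than something specific to spatial data. As a minimal sketch that runs without any spatial packages, the toy class below shows `slotNames()` and `@` in action, and that the data slot then behaves like any other data frame. The class `ToyLayer`, its slots and its values are all made up for illustration.

```r
library(methods)

# A made-up S4 class standing in for a SpatialPolygonsDataFrame (illustration only)
setClass("ToyLayer", representation(data = "data.frame", label = "character"))

toy <- new("ToyLayer",
           data = data.frame(ons_label = c("00BYGC", "00BYFN", "00BYFU"),
                             households = c(120, 340, 95)),
           label = "example wards")

slotNames(toy)          # "data" "label"
colnames(toy@data)      # column names, as for any data frame
toy@data$households     # a single column

# Subset rows of the data slot in the usual way
big <- toy@data[toy@data$households > 100, ]
nrow(big)               # 2
```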


--------------------------------------------------------------------------------
/R_Markdown/4-attribute-joins.Rmd:
--------------------------------------------------------------------------------
  1 | 4. Joining Data
  2 | =====
  3 | 
  4 | When doing spatial data analysis, it is quite common to need to merge different data sets together. There are two main ways of doing this: firstly, using the `merge()` command, which matches attribute data in data frames; and secondly, using the match technique, which works with attribute data in shape files. 
  5 | 
  6 | ### Merging Attribute Data in Data Frames
  7 | 
  8 | The `merge()` function allows us to take two data sets and combine them into one, based on a common variable. To test this, import the following data by running this command:
  9 | 
 10 | ```{r,echo=FALSE,comment=NA,results='hide'}
 11 |   setwd("/Users/nickbearman/Dropbox/r-helpsheets/helpsheets/4-attribute-joins")
 12 |   file_location <- "example.csv"
 13 | ```
 14 | <!---
 15 | ```{r,echo=FALSE,comment=NA,results='hide'}
 16 |   setwd("C:/Users/Nick & Louise/Dropbox/r-helpsheets/helpsheets/4-attribute-joins")
 17 | ```
 18 | -->
 19 | ```{r,echo=FALSE,comment=NA,results='hide'}
 20 |   file_location <- "example.csv"
 21 |   data <- read.csv(file_location, header = TRUE, skip = 1)
 22 | ```
 23 | ```{r,eval=FALSE,results='hide'}
 24 |   # Set working directory
 25 |   setwd("M:/R work")
 26 |   # Read data from the web  
 27 |   data <- read.csv("http://data.alex-singleton.com/r-helpsheets/4/example.csv", header = TRUE, skip = 1)
 28 | ```
 29 | 
 30 | ```{r,results='hide',echo=FALSE,comment=NA}
 31 | data <- read.csv(file_location, header = TRUE, skip = 1)
 32 | ```
 33 | 
 34 | And to check that it has imported correctly, which is always a good idea, run:
 35 | 
 36 | ```{r,results='hide'}
 37 |   data
 38 | ```
 39 | 
 40 | Which should output:
 41 | 
 42 | ```{r,echo=FALSE,comment=NA}
 43 |   data
 44 | ```
 45 | 
 46 | You now need to create another data frame which we will use as an example. You could create another csv file and import this; however, we will illustrate another way of achieving this by combining a series of vectors.
 47 | 
 48 | ```{r,results='hide'}
 49 |   # Create a person vector
 50 |   Person <- c("Paul", "Mike", "John", "Helen", "Mia", "Leo", "Rachel")
 51 |   # Create a favourite functions vector
 52 |   Function <- c("merge()", "read.csv()", "colnames()", "ncol()", "length()", "getwd()", "save.image()")
 53 |   # We can now join these two vectors into a new data frame of favourite functions
 54 |   fav_fun <- data.frame(Person, Function)
 55 |   # View the fav_fun
 56 |   fav_fun
 57 | ```
 58 | Which should look like this:
 59 | 
 60 | ```{r,echo=FALSE,comment=NA}
 61 |   fav_fun
 62 | ```
 63 | 
 64 | We now have two data sets; `data`, which contains a list of people, locations and schools and `fav_fun`, which contains a list including those people as well as additional people who have attended R workshops. 
 65 | 
 66 | The next step is to combine the two. What we are going to do is select the people in the `fav_fun` data frame who also appear in the `data` data frame, and copy their favourite R function into a new data frame, along with all the information from `data`. 
 67 | 
 68 | We will refer to the two data frames as `x` and `y`. The `x` data frame is `data` and the `y` data frame is `fav_fun`. In `x`, the column containing the list of people is called "Name", and in `y`, it is called "Person". The parameters of the merge function first accept the two table names, and then the lookup columns as `by.x` and `by.y`. You should also include `all.x=TRUE` as a final parameter. This tells the function to keep all the records in `x`, but only those in `y` that match. 
 69 | 
 70 | ```{r,results='hide'}
 71 |   People_And_Functions <- merge(data, fav_fun, by.x = "Name", by.y = "Person", all.x = TRUE)
 72 | ```
 73 | 
 74 | To see what this command has done, type `People_And_Functions` to show the content of the new data frame. This should look like:
 75 | 
 76 | ```{r,echo=FALSE,comment=NA}
 77 |   People_And_Functions
 78 | ```
 79 | 
 80 | If the `by` columns had the same name in both `x` and `y` (e.g. both called "Name"), we could specify this more simply with `by="column name"` rather than `by.x` and `by.y`. Finally, a critical issue when making any join is ensuring that the "`by`" columns are in the same format.
 81 | 
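The effect of `all.x = TRUE`, and the point about matching formats, can be seen on a cut-down pair of data frames. This is a sketch with made-up object names (`people`, `funs`, `funs_untidy`); the stray trailing space in `"John "` is introduced deliberately to show a format mismatch.

```r
people <- data.frame(Name = c("John", "Rachel", "Helen"), Age = c(20, 21, 34),
                     stringsAsFactors = FALSE)
funs   <- data.frame(Person = c("Paul", "John", "Helen"),
                     Function = c("merge()", "colnames()", "ncol()"),
                     stringsAsFactors = FALSE)

# all.x = TRUE keeps every row of x; Rachel has no match, so her Function is NA
keep_all <- merge(people, funs, by.x = "Name", by.y = "Person", all.x = TRUE)
nrow(keep_all)                                  # 3
keep_all$Function[keep_all$Name == "Rachel"]    # NA

# Without all.x, only people present in both data frames are returned
matched_only <- merge(people, funs, by.x = "Name", by.y = "Person")
nrow(matched_only)                              # 2

# A format mismatch (here a trailing space) silently prevents a match
funs_untidy <- funs
funs_untidy$Person[funs_untidy$Person == "John"] <- "John "
untidy <- merge(people, funs_untidy, by.x = "Name", by.y = "Person", all.x = TRUE)
sum(!is.na(untidy$Function))                    # 1 (only Helen matches)

# Cleaning the key column, for example with trimws(), restores the match
funs_untidy$Person <- trimws(funs_untidy$Person)
tidy <- merge(people, funs_untidy, by.x = "Name", by.y = "Person", all.x = TRUE)
sum(!is.na(tidy$Function))                      # 2
```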
 82 | ### Match data in a Shapefile
 83 | 
 84 | The `match()` function works in a very similar way to `merge()` but can be used to append attribute data to a shape file. `merge()` will often cause errors when working with spatial data frames.
 85 | 
 86 | Load the required packages and example shapefile from helpsheet "3. Importing External Data". 
 87 | 
 88 | ```{r,results='hide',message=FALSE}
 89 |   library(rgdal)
 90 | ```
 91 | ```{r,echo=FALSE,comment=NA,results='hide'}
 92 |   setwd("/Users/nickbearman/Dropbox/r-helpsheets/helpsheets/4-attribute-joins")
 93 |   file_location <- "example.csv"
 94 | ```
 95 | <!---
 96 | ```{r,echo=FALSE,comment=NA,results='hide'}
 97 |   setwd("C:/Users/Nick & Louise/Dropbox/r-helpsheets/worksheets/4-attribute-joins")
 98 | ```
 99 | -->
100 | ```{r,eval=FALSE,results='hide'}
101 |   # Set working directory
102 |   setwd("M:/R work")
103 |   # Download data.zip from the web
104 |   download.file("http://data.alex-singleton.com/r-helpsheets/4/data.zip", "data.zip")
105 |   # Unzip file
106 |   unzip("data.zip")
107 | ```
108 | ```{r,results='hide'}
109 |   # Read in shape file
110 |   Wards <- readOGR(".", "england_caswa_2001")
111 | ```
112 | 
113 | ```{r,eval=FALSE,results='hide'} 
114 |   # Plot Wards to check it has been imported correctly
115 |   plot(Wards)
116 | ```
117 | 
118 | ```{r,results='hide',echo=FALSE,comment=NA,warning=FALSE}
119 |   pdf('plot1.pdf', 4, 4)
120 |   plot(Wards)
121 |   dev.off()
122 | ```
123 | 
124 | ![Image](plot1.pdf)\
125 |   
126 |   
127 | We now have the content of the `Wards` shapefile in R. Have a look at the content of the data in the data slot:
128 | 
129 | ```{r,results='hide'}
130 |   head(Wards@data)
131 | ```
132 | 
133 | ```{r,echo=FALSE,comment=NA}
134 |   head(Wards@data)
135 | ```
136 | 
137 | We are now going to append the following data onto it, which are index scores for the rate of diabetes prevalence:
138 | 
139 | Ward | Rate
140 | --- | ---
141 | 00BYGC | 50
142 | 00BYFN | 198
143 | 00BYFU | 56
144 | 00BYFC | 78
145 | 00BYFG | 123
146 | 00BYFS | 21
147 | 
148 | Run this code to create this data frame:
149 | 
150 | ```{r,results='hide'}
151 |   # Create an ons code vector
152 |   Ward <- c("00BYGC", "00BYFN", "00BYFU", "00BYFC", "00BYFG", "00BYFS")
153 |   # Create a rate vector
154 |   Rate <- c(50, 198, 56, 78, 123, 21)
155 |   # We can now join these two vectors into a new data frame of wards_diabetes
156 |   wards_diabetes <- data.frame(Ward, Rate)
157 |   # View the wards_diabetes
158 |   wards_diabetes
159 | ```
160 | This should look like:
161 | 
162 | ```{r,echo=FALSE,comment=NA}
163 |   wards_diabetes
164 | ```
165 | 
166 | We can then use the `match()` function to append these diabetes rates on to `Wards@data`, by matching the `Ward` column from the `wards_diabetes` data frame to the `ons_label` column in the data slot of the wards SpatialPolygonsDataFrame.
167 | 
168 | ```{r,results='hide'}
169 |   Wards@data <- data.frame(Wards@data, wards_diabetes[match(Wards@data[, "ons_label"], wards_diabetes[, "Ward"]), ])
170 | ```
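To make the indexing in that line easier to follow, here is the same idea on plain data frames. It is only a sketch: `attrs` stands in for `Wards@data`, and the code `00BYXX` is a made-up ward with no rate. The point to notice is that the rows of `attrs` keep their original order, which matters for spatial objects because each row must stay lined up with its polygon.

```r
# Stand-in for Wards@data (illustrative values)
attrs <- data.frame(ons_label = c("00BYFC", "00BYXX", "00BYGC"),
                    stringsAsFactors = FALSE)
rates <- data.frame(Ward = c("00BYGC", "00BYFC"), Rate = c(50, 78),
                    stringsAsFactors = FALSE)

# For each ons_label, match() gives the row of rates holding that ward (NA if none)
idx <- match(attrs$ons_label, rates$Ward)
idx                       # 2 NA 1

# Indexing rates by idx lines its rows up with attrs, one row per ward, in attrs' order
joined <- data.frame(attrs, rates[idx, ])
joined$ons_label          # "00BYFC" "00BYXX" "00BYGC" (order unchanged)
joined$Rate               # 78 NA 50
```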
171 | 
172 | And to check, run:
173 | 
174 | ```{r,results='hide'}
175 |   head(Wards@data)
176 | ```
177 | 
178 | ```{r,echo=FALSE,comment=NA}
179 |   head(Wards@data)
180 | ```
181 | 
182 | We have now appended the data, but also have the ward listed twice. To remove this, run:
183 | 
184 | ```{r,results='hide'}
185 |   Wards@data$Ward <- NULL
186 |   head(Wards@data)
187 | ```
188 | 
189 | Which changes `Wards@data` to:
190 | 
191 | ```{r,echo=FALSE,comment=NA}
192 |   head(Wards@data)
193 | ```
194 | 


--------------------------------------------------------------------------------
/R_Markdown/5-basic-maps.Rmd:
--------------------------------------------------------------------------------
  1 | # 5. Basic Maps
  2 | 
  3 | This helpsheet shows you how to make a simple map using the `GISTools` package. 
  4 | 
  5 | To start with, we need to load the `GISTools` package as well as some other packages we need:
  6 | 
  7 | ```{r,results='hide', message=FALSE}
  8 |   library(rgdal)
  9 |   library(GISTools)
 10 |   library(RColorBrewer)
 11 | ```
 12 | 
 13 | We also need to load some data, which in this example relate to Lower Layer Super Output Areas (LSOAs) within Liverpool, along with an outline of England. The following commands will set your working directory, then download, unzip and load the data files. 
 14 | 
 15 | ```{r,echo=FALSE,comment=NA,results='hide'}
 16 |   setwd("/Users/nickbearman/Dropbox/r-helpsheets/helpsheets/5-basic-maps")
 17 | ```
 18 | <!---
 19 | ```{r,echo=FALSE,comment=NA,results='hide'}
 20 |   setwd("C:/Users/Nick & Louise/Dropbox/r-helpsheets/helpsheets/5-basic-maps")
 21 | ```
 22 | -->
 23 | ```{r,eval=FALSE,results='hide'}
 24 |   # Set working directory
 25 |   setwd("M:/R work")
 26 |   # Download data.zip from the web
 27 |   download.file("http://data.alex-singleton.com/r-helpsheets/5/data.zip", "data.zip")
 28 |   # Unzip file
 29 |   unzip("data.zip")
 30 | ```
 31 | ```{r,results='hide'}
 32 |   # Read in both shapefiles
 33 |   LSOA <- readOGR(".", "england_LSOA_2011_dwelling_count")
 34 |   outline <- readOGR(".", "England_ol_2011_gen_clipped")
 35 | ```
 36 | 
 37 | We can do a very basic plot of the map using:
 38 | 
 39 | ```{r,eval=FALSE,highlight=TRUE}
 40 |   plot(LSOA)
 41 | ```
 42 | 
 43 | Which gives us a map, just showing the boundaries of the LSOAs. 
 44 |   
 45 |   
 46 |   
 47 | ```{r,results='hide',echo=FALSE,comment=NA,warning=FALSE,fig.cap = "Map of LSOAs in Liverpool"}
 48 |   pdf('plot1.pdf', 5, 5)
 49 |   plot(LSOA)
 50 |   dev.off()
 51 | ```
 52 | 
 53 | ![Image](plot1.pdf)\
 54 |   
 55 | We can also plot an outline of England in a similar way.
 56 | 
 57 | ```{r,eval=FALSE,highlight=TRUE}
 58 | plot(outline)
 59 | ```
 60 | 
 61 | This replaces the first map, but we can get R to overlay one on top of the other by adding the argument `add = TRUE`. The order of the plots is key here: R will maintain the scale and extent of the first map. We can also set the border colour to red (`border = "red"`) and the fill colour to a shade of blue (`col = "#2C7FB820"`). These represent two ways of specifying colours: by name, or as a HEX code. The HEX code here contains eight characters. The first six are a standard HEX colour code, and the final two set the opacity (alpha), also in hexadecimal, from `00` (fully transparent) to `FF` (fully opaque); `20` is 32 out of 255, so the fill is about 13% opaque. To view the named colours that can be used in R, have a look at http://research.stowers-institute.org/efg/R/Color/Chart/ColorChart.pdf. _Sometimes when running R in Windows, the transparency option will not work and the area is filled with a solid colour. In this case, remove the `col = "#2C7FB820"` section from the plot command to generate just a red outline._
 62 | 
 63 | 
 64 | ```{r,eval=FALSE,highlight=TRUE}
 65 |   # Plot the LSOA Map
 66 |   plot(LSOA)
 67 |   # Overplot the outline map
 68 |   plot(outline, add = TRUE, border = "red", col = "#2C7FB820")
 69 | ```
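The eight-character colour code can be unpacked with base R's `col2rgb()`, which makes the role of the last two characters explicit. This is a small check rather than part of the mapping workflow, and the object names are illustrative.

```r
# Split the colour into red, green, blue and alpha (opacity), each on a 0-255 scale
parts <- col2rgb("#2C7FB820", alpha = TRUE)
parts[, 1]                                   # red 44, green 127, blue 184, alpha 32

# Opacity as a proportion: hex 20 is 32, and 32 / 255 is about 0.13
opacity <- parts["alpha", 1] / 255
round(opacity, 2)                            # 0.13

# rgb() builds the same code back from its components
rebuilt <- rgb(44, 127, 184, 32, maxColorValue = 255)
rebuilt                                      # "#2C7FB820"
```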
 70 | 
 71 | ```{r,results='hide',echo=FALSE,comment=NA}
 72 |   pdf('plot2.pdf', 5, 5)
 73 |   plot(LSOA)
 74 |   plot(outline, add = TRUE, border = "red", col = "#2C7FB820")
 75 |   dev.off()
 76 | ```
 77 | 
 78 | \begin{center}
 79 | \includegraphics{plot2.pdf}
 80 | \par
 81 | \end{center}
 82 |     
 83 | The LSOA data frame contains some more information, which we can see by looking in the data slot of the object: 
 84 | 
 85 | ```{r,results='hide'}
 86 |   head(LSOA@data)
 87 | ```
 88 | 
 89 | ```{r,echo=FALSE,comment=NA}
 90 |   head(LSOA@data)
 91 | ```
 92 | 
 93 | This shows us that the shape file contains a field called '`COUNT_DWEL`' which contains the count of the number of dwellings in each LSOA. We can use this to create a choropleth map with:
 94 | 
 95 | ```{r,eval=FALSE,highlight=TRUE}
 96 |   choropleth(LSOA, LSOA$COUNT_DWEL)
 97 | ```
 98 | 
 99 | ```{r,results='hide',echo=FALSE,comment=NA}
100 |   pdf('plot3.pdf', 5, 5)
101 |   choropleth(LSOA, LSOA$COUNT_DWEL)
102 |   dev.off()
103 | ```
104 | 
105 | \begin{center}
106 | \includegraphics{plot3.pdf}
107 | \par
108 | \end{center}
109 |     
110 | This map is ok, but we can easily make it more effective with a few extra commands. The new commands include:
111 | 
112 | 1. `brewer.pal` which returns a set of colours from a range of pre-set palettes that look good on maps. In this case, we are getting `5` colours from the `"Blues"` palette. For more information on the R command, type '`?brewer.pal`' into R; for more information on the concept, see http://colorbrewer.org. 
113 | 
114 | 1. `auto.shading` which categorises the data we want to show on the map (in this case, `LSOA$COUNT_DWEL`) into the specified number of categories (`5`), coloured with the specified colours (`cols = brewer.pal(5, "Blues")`).
115 | 
116 | 1. `choro.legend` and `north.arrow` both have a set of coordinates as one of their parameters (e.g. `331089, 384493`). These say where the object is located on the map. You may have to fiddle with these to get the spacing correct (see note below).
117 | 
118 | Run the commands below in R, and read the text below for more information. 
119 | 
120 | ```{r,eval=FALSE,highlight=TRUE}
121 |   # Set colour and number of classes
122 |   shades <- auto.shading(LSOA$COUNT_DWEL, n = 5, cols = brewer.pal(5, "Blues"))
123 |   # Draw the map
124 |   choropleth(LSOA, LSOA$COUNT_DWEL, shades)
125 |   # Add a legend
126 |   choro.legend(331089, 384493, shades, fmt = "%g", title = "Count of Dwellings")
127 |   # Add a title to the map
128 |   title("Count of Dwellings by LSOA, 2011")
129 |   # Add North arrow
130 |   north.arrow(332308, 387467, 300)
131 |   # Draw a box around the map
132 |   box(which = "outer")
133 | ```
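To get a feel for what classifying values into five shaded categories involves, the sketch below does something similar in base R using quantile breaks. It is an approximation for illustration only, not the code that `auto.shading` runs internally. The values in `dwellings` are made up, and `colorRampPalette()` stands in for `brewer.pal()` so that the example needs no extra packages.

```r
# Made-up dwelling counts for ten areas
dwellings <- c(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
n_classes <- 5

# Four break points split the values into five classes of roughly equal size
breaks <- quantile(dwellings, probs = seq(0, 1, length.out = n_classes + 1)[2:n_classes])

# Class number (1 to 5) for each area
class_id <- findInterval(dwellings, breaks) + 1
table(class_id)            # two areas in each class

# Five colours from light to dark blue, and the colour assigned to each area
shades_demo <- colorRampPalette(c("#EFF3FF", "#08519C"))(n_classes)
area_cols <- shades_demo[class_id]
```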
134 | 
135 | ```{r,results='hide',echo=FALSE,comment=NA}
136 |   pdf('plot4.pdf', 7, 7)
137 |   # Set colour and number of classes
138 |   shades <- auto.shading(LSOA$COUNT_DWEL, n = 5, cols = brewer.pal(5, "Blues"))
139 |   # Draw the map
140 |   choropleth(LSOA, LSOA$COUNT_DWEL, shades)
141 |   # Add a legend
142 |   choro.legend(328089, 384493, shades, fmt = "%g", title = "Count of Dwellings")
143 |   # Add a title to the map
144 |   title("Count of Dwellings by LSOA, 2011")
145 |   # Add North arrow
146 |   north.arrow(332308, 387467, 300)
147 |   # Draw a box around the map
148 |   box(which = "outer")
149 |   dev.off()
150 | ```
151 | 
152 | See the next page for the map. 
153 | 
154 | You might find you will need to adjust the location or size of the legend to get this to fit onto your plot correctly. To find a new set of location coordinates, type `locator()` into the R console and press enter. After doing this, when you hover over the plot, the mouse pointer will turn into a cross. If you click, and then right-click and choose 'Stop', the location of the click is printed to the console, and you can use these coordinates to re-position items in the plot.
155 | 
156 | To change the size of the legend, use the `cex = ` argument. Update the `choro.legend` line to read `choro.legend(328089, 384493, shades, fmt = "%g", title = "Count of Dwellings", cex = 1.1)` and see what happens. The `cex` value is a multiplier which increases or decreases the size of the legend. Experiment with this until you find something that works well. 
157 | 
158 | For more information on the `GISTools` package, have a look at http://cran.r-project.org/web/packages/GISTools/GISTools.pdf. 
159 | 
160 | \begin{center}
161 | \includegraphics{plot4.pdf}
162 | \par
163 | \end{center}
164 |     
165 | 


--------------------------------------------------------------------------------
/R_Markdown/6-basic-spatial-analysis.Rmd:
--------------------------------------------------------------------------------
  1 | # 6. Basic Spatial Analysis
  2 | 
  3 | This helpsheet will explore a variety of basic spatial analysis techniques, including *clipping*, *point in polygon* and *buffering*.
  4 | 
  5 | ### Clipping
  6 | 
  7 | Clipping allows us to use one set of boundaries to cut another, a bit like using a cookie cutter.
  8 | 
  9 | ```{r,results='hide',message=FALSE}
 10 |   # Load the Libraries
 11 |   library(rgdal)
 12 |   library(maptools)
 13 |   library(rgeos)
 14 |   library(stringr)
 15 | ```
 16 | 
 17 | ```{r,echo=FALSE,results='hide'}
 18 |   setwd("/Users/nickbearman/Dropbox/r-helpsheets/helpsheets/6-basic-spatial-analysis")
 19 | ```
 20 | <!---
 21 | ```{r,echo=FALSE,results='hide'}
 22 |   setwd("C:/Users/Nick & Louise/Dropbox/r-helpsheets/helpsheets/6-basic-spatial-analysis")
 23 | ```
 24 | -->
 25 | ```{r,eval=FALSE,results='hide'}
 26 |   # Set working directory
 27 |   setwd("M:/R work")
 28 |   # Download data.zip from the web
 29 |   download.file("http://data.alex-singleton.com/r-helpsheets/6/data.zip", "data.zip")
 30 |   # Unzip file
 31 |   unzip("data.zip")
 32 | ```
 33 | ```{r,results='hide'}
 34 |   # Read in both shape files
 35 |   LSOA <- readOGR(".", "england_LSOA_2011")
 36 |   outline <- readOGR(".", "England-outline")
 37 | ```
 38 | 
 39 | First of all, we can plot the LSOA zones in Liverpool. 
 40 | 
 41 | ```{r,results='hide',eval=FALSE}
 42 |   # Plot the LSOA Map
 43 |   plot(LSOA)
 44 | ```
 45 | 
 46 | ```{r,results='hide',echo=FALSE,warning=FALSE}
 47 |   pdf('plot1.pdf', 5, 5)
 48 |   plot(LSOA)
 49 |   dev.off()
 50 | ```
 51 | 
 52 | ![Image](plot1.pdf)\
 53 |   
 54 |     
 55 |     
 56 | We can also plot the England outline, but if we just run `plot(outline)` it will replace the LSOA plot on the display. To add the `outline` layer to the existing plot window, we can run the code below, which uses `add = TRUE` to overlay it, sets the border colour to red (`border = "red"`) and sets the fill colour to a shade of blue (`col = "#2C7FB820"`). These represent two ways of specifying colours: by name, or as a HEX code. The HEX code here contains eight characters. The first six are a standard HEX colour code, and the final two set the opacity (alpha), also in hexadecimal, from `00` (fully transparent) to `FF` (fully opaque); `20` is 32 out of 255, so the fill is about 13% opaque. To view the named colours that can be used in R, have a look at http://research.stowers-institute.org/efg/R/Color/Chart/ColorChart.pdf. _Sometimes when running R in Windows, the transparency option will not work and the area is filled with a solid colour. In this case, remove the `col = "#2C7FB820"` section from the plot command to generate just a red outline._
 57 | 
 58 | ```{r,results='hide',eval=FALSE}
 59 |   # Overplot the outline map
 60 |   plot(outline, add = TRUE, border = "red", col = "#2C7FB820")
 61 | ```
 62 | 
 63 | ```{r,results='hide',echo=FALSE,warning=FALSE}
 64 |   pdf('plot2.pdf', 5, 5)
 65 |   plot(LSOA)
 66 |   plot(outline, add = TRUE, border = "red", col = "#2C7FB820")
 67 |   dev.off()
 68 | ```
 69 | 
 70 | \begin{center}
 71 | \includegraphics{plot2.pdf}
 72 | \par
 73 | \end{center}
 74 | 
 75 | As you will notice, the LSOA boundaries cross the River Mersey and stop at the river centre line. This doesn't look very nice, so we can tidy this up by getting R to clip the LSOA boundaries where they cross the England outline border. 
 76 | 
 77 | We do this using the `gIntersection` command, passing it the two layer variables (`LSOA` and `outline`). Setting `byid = TRUE` tells R to intersect each LSOA with the outline separately, rather than treating all the LSOAs as one combined shape, and `id = my_area_id` labels each resulting polygon with its LSOA zone code. Be aware that the `gIntersection` command may take up to 90 seconds to run, so do not worry if your computer appears to freeze. Just wait for the command to complete. 
 78 | 
 79 | ```{r,results='hide',eval=FALSE}
 80 |   # set the area we want to cut
 81 |   my_area_id <- as.character(LSOA@data$ZONECODE)
 82 |   # run the Intersection command, saving output to clipLSOA, this may take anywhere up to 90 seconds to run
 83 |   clipLSOA <- gIntersection(LSOA, outline, byid = TRUE, id = my_area_id)
 84 |   #replot the map as above to see what we have done
 85 |   plot(clipLSOA)
 86 |   plot(outline, add = TRUE, border = "red", col = "#2C7FB820")
 87 | ```
 88 | 
 89 | ```{r,results='hide',echo=FALSE,warning=FALSE}
 90 |   pdf('plot3.pdf', 5, 5)
 91 |   # set the area we want to cut
 92 |   my_area_id <- as.character(LSOA@data$ZONECODE)
 93 |   # run the Intersection command, saving output to clipLSOA, this may take a few seconds to run
 94 |   clipLSOA <- gIntersection(LSOA, outline, byid = TRUE, id = my_area_id)
 95 |   #replot the map as above to see what we have done
 96 |   plot(clipLSOA)
 97 |   plot(outline, add = TRUE, border = "red", col = "#2C7FB820")
 98 |   dev.off()
 99 | ```
100 | 
101 | \begin{center}
102 | \includegraphics{plot3.pdf}
103 | \par
104 | \end{center}
105 | 
106 | We have now removed the parts of the LSOAs that overlap the coastline, and the map looks much more attractive.
107 | 
108 | ### Point in Polygon Analysis
109 | 
110 | Point in polygon analysis is useful when you want to create a subset of points from a larger set based on their spatial location. In this example we will load a list of locations that relate to all doctors surgeries in England, and use the polygons of ward boundaries in Leeds to create a subset of the Leeds doctors surgeries. To begin with, we need to load the libraries and get the GP and Wards data.
111 | 
112 | ```{r,results='hide'}
113 |   # Load the Libraries
114 |   library(maptools)
115 |   library(rgeos)
116 | ```
117 | 
118 | ```{r,echo=FALSE,results='hide'}
119 |   setwd("/Users/nickbearman/Dropbox/r-helpsheets/helpsheets/6-basic-spatial-analysis")
120 | ```
121 | <!---
122 | ```{r,echo=FALSE,results='hide'}
123 |   setwd("C:/Users/Nick & Louise/Dropbox/r-helpsheets/worksheets/6-basic-spatial-analysis")
124 | ```
125 | -->
126 | ```{r,eval=FALSE,results='hide'}
127 |   # Set working directory
128 |   setwd("M:/R work")
129 |   # Download data.zip from the web
130 |   download.file("http://data.alex-singleton.com/r-helpsheets/6/data.zip", "data.zip")
131 |   # Unzip file
132 |   unzip("data.zip")
133 | ```
134 | ```{r,results='hide'}
135 |   # Read in shapefile
136 |   Wards <- readShapeSpatial("CAS-leeds", proj4string = CRS("+init=epsg:27700"))
137 | ```
138 | 
139 | It's worth having a quick look at the Leeds data so we know what it looks like:
140 | 
141 | ```{r,eval=FALSE,results='hide'} 
142 |   # Plot Wards to check it has been read in correctly
143 |   plot(Wards)
144 | ```
145 | 
146 | ```{r,echo=FALSE,results='hide',warning=FALSE}
147 |   pdf('plot4.pdf', 5, 5)
148 |   plot(Wards)
149 |   dev.off()
150 | ```
151 | 
152 | \begin{center}
153 | \includegraphics{plot4.pdf}
154 | \par
155 | \end{center}
156 | 
157 | The doctors surgeries data is quite untidy - once we've read it in, we need to remove some extra columns that we don't need, and rename the ones we do.
158 | 
159 | ```{r,results='hide',comment=NA}
160 |   # Get Data
161 |   GP <- read.csv("General Practices 2006.csv", header = TRUE, skip = 3)
162 | 
163 |   # Extract the columns we want
164 |   GP <- subset(GP, select =c("Practice.Doctor.s.Name", "Easting", "Northing"))
165 | 
166 |   # Rename the columns to something more helpful
167 |   colnames(GP) <- c("Surgery", "Easting", "Northing")
168 | ```
169 | 
170 | ```{r,eval=FALSE,results='hide'} 
171 |   # Do a plot to check what the data look like
172 |   plot(GP$Easting, GP$Northing)
173 | ```
174 | 
175 | ```{r,echo=FALSE,results='hide',warning=FALSE}
176 |   pdf('plot5.pdf', 6, 6)
177 |   # Do a plot to check what the data look like
178 |   plot(GP$Easting, GP$Northing)
179 |   dev.off()
180 | ```
181 | 
182 | \begin{center}
183 | \includegraphics{plot5.pdf}
184 | \par
185 | \end{center}
186 | 
187 | This should look like the above. The next stage is to convert the data into a SpatialPointsDataFrame. 
188 | 
189 | ```{r,results='hide'}
190 |   # Remove those GP without Easting or Northing
191 |   GP <- subset(GP, Easting != "" & Northing != "")
192 |   # Create a unique ID for each GP
193 |   GP$GP_ID <- 1:nrow(GP)
194 |   # Create the SpatialPointsDataFrame
195 |   GP_SP <- SpatialPointsDataFrame(coords = c(GP[2], GP[3]), data = data.frame(GP$Surgery, GP$GP_ID), proj4string = CRS("+init=epsg:27700"))                                     
196 | ```
197 | 
198 | The first line contains a `subset` command which removes any of the entries which have a blank value for Northings or Eastings. `!=` means 'not equal to' and `&` means 'AND' so in "English" the command reads "overwrite the GP data frame with a subset of the GP data frame where the Easting field is not blank and the Northing field is not blank".
199 | 
200 | In a SpatialPointsDataFrame each entry must have a unique ID, so the second line creates an ID in the column `GP_ID`. The third line brings together the different elements to create the SpatialPointsDataFrame, `GP_SP`. `GP[2]` and `GP[3]` are the `Easting` and `Northing` columns respectively, and the `data =` section tells R which bits of the data frame to include. In this case we only want the surgery name (`GP$Surgery`) and the ID number (`GP$GP_ID`). The final term (`proj4string`) specifies which projection the data set is in, which in this case is British National Grid (`epsg:27700`).
201 | 
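The `subset` and ID steps can be tried on a few made-up rows before running them on the full GP file. This is a sketch: the surgery names and coordinates below are invented, and the coordinates are stored as text only so that blank entries can be shown.

```r
# Invented example rows; two have a missing coordinate
GP_demo <- data.frame(Surgery  = c("A", "B", "C", "D"),
                      Easting  = c("430100", "", "428500", "431250"),
                      Northing = c("433900", "435000", "", "434400"),
                      stringsAsFactors = FALSE)

# Keep only rows where both coordinates are present
GP_demo <- subset(GP_demo, Easting != "" & Northing != "")

# Give each remaining surgery a unique ID
GP_demo$GP_ID <- seq_len(nrow(GP_demo))
GP_demo$Surgery   # "A" "D"
GP_demo$GP_ID     # 1 2
```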
202 | ```{r,eval=FALSE,results='hide'} 
203 |   # Show the results
204 |   plot(GP_SP)       
205 | ```
206 | 
207 | ```{r,echo=FALSE,results='hide',warning=FALSE}
208 |   pdf('plot6.pdf', 6, 6)
209 |   # Show the results
210 |   plot(GP_SP)       
211 |   dev.off()
212 | ```
213 | 
214 | \begin{center}
215 | \includegraphics{plot6.pdf}
216 | \par
217 | \end{center}
218 | 
219 | This plot will look similar to the previous one, but the data are now stored in a SpatialPointsDataFrame. We can now carry out the point in polygon analysis, i.e. select those points which lie within the boundary of Leeds. The fourth command below uses `!is.na`. `is.na` is a function that tests whether a value is `NA`, and `!` means 'not', so the command keeps the rows where the value of `GP_SP@data$label` is not `NA`. 
220 | 
221 | ```{r,results='hide'}
222 |   # point in polygon - returns a dataframe of the attributes of the polygons
223 |   # that the point is within.
224 |   o <- over(GP_SP, Wards)
225 | 
226 |   # Many of these will be NA values - because most GPs are not in Leeds!
227 |   head(o)                         
228 | 
229 |   # Add the attributes back onto the GP_SP SpatialPointsDataFrame (they are the same length)
230 |   GP_SP@data <- cbind(GP_SP@data, o)
231 | 
232 |   # Use the NA values to remove those points not within Leeds
233 |   GP_SP_Leeds <- GP_SP[!is.na(GP_SP@data$label), ]                       
234 | ```
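For intuition about what a point in polygon test does, here is a hand-rolled sketch in base R using the ray-casting (even-odd) rule on a single square. It is not how `over()` is implemented, and the function `point_in_polygon`, the square and the points are all invented for illustration. It does reproduce the same pattern as above: a label for points inside, `NA` for points outside, and then `!is.na()` to keep the matches.

```r
# Ray casting: count how many polygon edges a horizontal ray from the point crosses;
# an odd number of crossings means the point is inside
point_in_polygon <- function(px, py, vx, vy) {
  inside <- FALSE
  j <- length(vx)                       # previous vertex, starting with the last one
  for (i in seq_along(vx)) {
    if ((vy[i] > py) != (vy[j] > py)) { # this edge spans the point's y value
      x_cross <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      if (px < x_cross) inside <- !inside   # crossing lies to the right of the point
    }
    j <- i
  }
  inside
}

# An invented square "ward" and three invented points
sq_x <- c(0, 10, 10, 0)
sq_y <- c(0, 0, 10, 10)
pts <- data.frame(id = 1:3, x = c(5, 15, 2), y = c(5, 5, 9))

# Mimic the shape of the over() result: a label for points inside, NA otherwise
hit <- mapply(point_in_polygon, pts$x, pts$y,
              MoreArgs = list(vx = sq_x, vy = sq_y))
pts$label <- ifelse(hit, "Square ward", NA)

# Keep only the points that fall inside
pts_inside <- pts[!is.na(pts$label), ]
pts_inside$id   # 1 3
```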
235 | 
236 | ```{r,eval=FALSE,results='hide'} 
237 |   # Map your results
238 |   plot(GP_SP_Leeds)        
239 | ```
240 | 
241 | ```{r,echo=FALSE,results='hide',warning=FALSE}
242 |   pdf('plot7.pdf', 6, 6)
243 |   # Map your results
244 |   plot(GP_SP_Leeds)          
245 |   dev.off()
246 | ```
247 | 
248 | \begin{center}
249 | \includegraphics{plot7.pdf}
250 | \par
251 | \end{center}
252 | 
253 | We can also plot the points over the Leeds wards:
254 | 
255 | ```{r,eval=FALSE,results='hide'} 
256 |   plot(Wards)
257 |   plot(GP_SP_Leeds, add = TRUE)     
258 | ```
259 | 
260 | ```{r,echo=FALSE,results='hide',warning=FALSE}
261 |   pdf('plot8.pdf', 6, 6)
262 |   plot(Wards)
263 |   plot(GP_SP_Leeds, add = TRUE)        
264 |   dev.off()
265 | ```
266 | 
267 | \begin{center}
268 | \includegraphics{plot8.pdf}
269 | \par
270 | \end{center}
271 | 
272 | We can also view the data in the `GP_SP_Leeds` data frame. 
273 | 
274 | ```{r,results='hide'}
275 |   # View the data slot of the results
276 |   head(GP_SP_Leeds@data)
277 | ```
278 | 
279 | ```{r,echo=FALSE,comment=NA}
280 |   head(GP_SP_Leeds@data)
281 | ```
282 | 
283 | 
284 | ### Buffers
285 | 
286 | >> _This section looks at buffers. It carries on from the section on points in polygon, so make sure you complete that section first._
287 | 
288 | Buffers are often used in spatial analysis to define the context of points. In this example we will calculate a buffer around the doctors surgeries representing a 20 minute walking distance, based on an average walking speed of 3 mph, which is one mile, or around 1608m. 
289 | 
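The buffer width is given in metres, because British National Grid coordinates are measured in metres, so it is worth checking the conversion from walking speed and time. A minimal sketch, where the helper `walking_distance_m` is a made-up name and 1609.344 is the number of metres in a mile:

```r
# Distance in metres covered at a given speed (mph) over a given time (minutes)
walking_distance_m <- function(speed_mph, minutes) {
  metres_per_mile <- 1609.344
  speed_mph * (minutes / 60) * metres_per_mile
}

walking_distance_m(3, 10)   # 804.672, i.e. half a mile
walking_distance_m(3, 20)   # 1609.344, i.e. one mile
```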
290 | The rgeos package has a function called `gBuffer()` that can be used to create buffers around points, lines or polygon objects. In the following example we create a new SpatialPolygons object called `GP_SP_Leeds_Buffers`. SpatialPolygons objects do not have a data slot, so this then needs to be converted into a SpatialPolygonsDataFrame object by joining the `@data` from `GP_SP_Leeds` back onto `GP_SP_Leeds_Buffers`.
291 | 
292 | ```{r,results='hide'}
293 |   # buffers
294 |   GP_SP_Leeds_Buffers <- gBuffer(GP_SP_Leeds, width = 1608, byid = TRUE)
295 |   
296 |   # Convert GP_SP_Leeds_Buffers into a SpatialPolygonsDataFrame (rather than
297 |   # SpatialPolygons) by joining the data of the GP_SP_Leeds
298 |   # SpatialPointsDataFrame
299 |   GP_SP_Leeds_Buffers <- SpatialPolygonsDataFrame(GP_SP_Leeds_Buffers, GP_SP_Leeds@data)
300 | ```
301 | 
302 | We can also now plot this on top of the Wards map. 
303 | 
304 | ```{r,eval=FALSE,results='hide'} 
305 |   # Ward boundaries
306 |   plot(Wards, axes = FALSE, col = "#6E7B8B", border = "#CAE1FF")
307 |   # GP locations
308 |   plot(GP_SP_Leeds, pch = 19, cex = 0.4, col = "#5CACEE", add = TRUE)
309 |   # catchment buffers
310 |   plot(GP_SP_Leeds_Buffers, axes = FALSE, col = NA, border = "red", add = TRUE) 
311 | ```
312 | 
313 | ```{r,echo=FALSE,results='hide',warning=FALSE}
314 |   pdf('plot9.pdf', 6, 6)
315 |   # Ward boundaries
316 |   plot(Wards, axes = FALSE, col = "#6E7B8B", border = "#CAE1FF")
317 |   # GP locations
318 |   plot(GP_SP_Leeds, pch = 19, cex = 0.4, col = "#5CACEE", add = TRUE)
319 |   # catchment buffers
320 |   plot(GP_SP_Leeds_Buffers, axes = FALSE, col = NA, border = "red", add = TRUE)      
321 |   dev.off()
322 | ```
323 | 
324 | \begin{center}
325 | \includegraphics{plot9.pdf}
326 | \par
327 | \end{center}
328 | 


--------------------------------------------------------------------------------
/R_Markdown/7-converting-coordinates.Rmd:
--------------------------------------------------------------------------------
  1 | ```{r set-options, echo=FALSE,comment=NA, cache=FALSE}
  2 | options(width=62)
  3 | ```
  4 | 
  5 | # 7. Converting Coordinates
  6 | 
  7 | Sometimes you will need to convert spatial data from one coordinate system to another. This is often called reprojecting as different coordinate systems typically use different projections; i.e. the way in which the curved Earth is represented as a flat surface. There are lots of different projections, including the Mercator and Gall-Peters projections, as shown below:
  8 | 
  9 | ![The Mercator projection on the left and the Gall-Peters projection on the right. _Images from http://en.wikipedia.org/wiki/File:Mercator\_projection\_SW.jpg and http://en.wikipedia.org/wiki/File:Gall%E2%80%93Peters\_projection\_SW.jpg._](Mercator_projection_SW-Gall-Peters-projection_SW.jpg)
 10 | 
 11 | This helpsheet will take you through the process of converting BNG (British National Grid) coordinates (Eastings and Northings) to Latitude and Longitude, which requires reprojection between the OSGB36 and WGS84 datums. The same principle can be applied to any other reprojection. 
 12 | 
 13 | ### Setup
 14 | 
 15 | There are some initial commands we need to run to set up R for this exercise: first, loading the required library, and second, declaring some variables for the two coordinate systems we will be using. 
 16 | 
 17 | ```{r,results='hide',message=FALSE}
 18 |   # Load the packages
 19 |   library(rgdal)
 20 | 
 21 |   # Variables for holding the coordinate system types (see: http://www.epsg.org/ for details)
 22 |   ukgrid <- "+init=epsg:27700"
 23 |   latlong <- "+init=epsg:4326"
 24 | ```
 25 | 
 26 | We will use data on the locations of doctors surgeries as an example. Download and import it using the following commands:
 27 | 
 28 | ```{r,echo=FALSE,results='hide'}
 29 |   setwd("/Users/nickbearman/Dropbox/r-helpsheets/helpsheets/7-converting-coordinates")
 30 | ```
 31 | <!---
 32 | ```{r,echo=FALSE,results='hide'}
 33 |   setwd("C:/Users/Nick & Louise/Dropbox/r-helpsheets/helpsheets/7-converting-coordinates")
 34 | ```
 35 | -->
 36 | ```{r,eval=FALSE,results='hide'}
 37 |   # Set working directory
 38 |   setwd("M:/R work")
 39 | 
 40 |   # Download data.zip from the web
 41 |   download.file("http://data.alex-singleton.com/r-helpsheets/7/data.zip", "data.zip")
 42 | 
 43 |   # Unzip file
 44 |   unzip("data.zip")
 45 | ```
 46 | ```{r,results='hide',comment=NA}
 47 |   # Get doctors surgeries data
 48 |   GP <- read.csv("General Practices 2006.csv", header = TRUE, skip = 3)
 49 | 
 50 |   # Extract the columns we want
 51 |   GP <- subset(GP, select = c("Practice.Doctor.s.Name", "Easting", "Northing"))
 52 | 
 53 |   # Rename the columns to something more helpful
 54 |   colnames(GP) <- c("Surgery", "Easting", "Northing")
 55 | ```
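
The `skip` argument tells `read.csv()` how many lines to ignore before it starts reading, which is useful when a file begins with title or notes lines above the column names. The following is a minimal sketch using a few lines of inline text rather than the GP file. The text and the object name `people` are illustrative only, and `text =` is used here in place of a file name.

```r
# A small CSV with one line of header text above the column names
raw_text <- "Header text we want to ignore
Name,Age,Place,School
John,20,Liverpool,Hillside High School
Rachel,21,Norwich,Colman High School"

# skip = 1 ignores the first line, so the second line supplies the column names
people <- read.csv(text = raw_text, header = TRUE, skip = 1)

names(people)  # column names taken from the second line of the text
nrow(people)   # number of data rows read
```

For the GP file the same idea applies with `skip = 3`, as three lines sit above the column names.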
 56 | 
 57 | We now have the doctors surgeries, with their eastings and northings. To show the first few rows, run:
 58 | 
 59 | ```{r,warning=FALSE,comment=NA}
 60 |   head(GP)
 61 | ```
 62 | 
 63 | We next need to convert the GP object from a data frame into a Spatial Data Frame. 
 64 | 
 65 | ```{r,results='hide',comment=NA}
 66 |   # Keep only those doctors surgeries that have both an Easting and a Northing
 67 |   GP <- subset(GP, Easting != "" & Northing != "")
 68 |   # Create a unique ID for each GP
 69 |   GP$GP_ID <- 1:nrow(GP)
 70 |   # Create coordinates variable
 71 |   coords <- cbind(Easting = as.numeric(as.character(GP$Easting)), Northing = as.numeric(as.character(GP$Northing)))
 72 |   # Create the SpatialPointsDataFrame
 73 |   GP_SP <- SpatialPointsDataFrame(coords, data = data.frame(GP$Surgery, GP$GP_ID), proj4string = CRS(ukgrid))
 74 | ```
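
Two details in this step often cause trouble: the logical operator used in the filter, and the conversion of the coordinates to numbers. The sketch below uses made-up values (the object `gp_demo` and its contents are invented for illustration). It shows that `&` keeps only the rows where both coordinates are present, whereas `|` also keeps rows where just one is present. It also shows why `as.character()` comes before `as.numeric()` when a column has been read as a factor, which is the default for text columns in versions of R before 4.0.

```r
# Made-up surgeries; two rows have a missing coordinate
gp_demo <- data.frame(
  Surgery  = c("A", "B", "C", "D"),
  Easting  = c("430100", "", "431250", "429900"),
  Northing = c("433500", "434000", "", "435100"),
  stringsAsFactors = TRUE
)

# '|' keeps a row if EITHER coordinate is present; '&' needs BOTH
kept_or  <- subset(gp_demo, Easting != "" | Northing != "")
kept_and <- subset(gp_demo, Easting != "" & Northing != "")
nrow(kept_or)   # all four rows survive
nrow(kept_and)  # only the two complete rows survive

# Converting a factor directly returns its internal level codes, not the values
as.numeric(kept_and$Easting)

# Converting via character returns the actual coordinates
as.numeric(as.character(kept_and$Easting))
```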
 75 | 
 76 | `GP_SP` is now a spatial data frame. We can do a quick `plot(GP_SP)` to see what this looks like. 
 77 | 
 78 | ```{r,eval=FALSE,results='hide'} 
 79 |   # Show the results
 80 |   plot(GP_SP)       
 81 | ```
 82 | 
 83 | ```{r,results='hide',echo=FALSE,warning=FALSE}
 84 |   pdf('plot1.pdf', 5, 5)
 85 |   plot(GP_SP)
 86 |   dev.off()
 87 | ```
 88 | 
 89 | ![Image](plot1.pdf)\
 90 | 
 91 | 
 92 | 
 93 | Because `GP_SP` is now a Spatial Data Frame, we need to use `head(GP_SP@data)` to view its attribute data. 
 94 | 
 95 | ```{r,eval=FALSE} 
 96 |   head(GP_SP@data)
 97 | ```
 98 | 
 99 | ```{r,echo=FALSE,warning=FALSE,comment=NA}
100 |   head(GP_SP@data)
101 | ```
102 | 
103 | You can see that the Eastings and Northings are no longer visible. In fact the eastings and northings are just stored in a different slot of the Spatial Data Frame. Try `head(GP_SP@coords)` instead. 
104 | 
105 | ```{r,eval=FALSE} 
106 |   head(GP_SP@coords)
107 | ```
108 | 
109 | ```{r,echo=FALSE,warning=FALSE,comment=NA}
110 |   head(GP_SP@coords)
111 | ```
112 | 
113 | And there they are! The `coords` slot is a matrix, and its elements can be accessed in the same way as those of a data frame, for example `head(GP_SP@coords[,1])`. See the helpsheet "1. R Basics" for more information on data frames. 
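
Indexing the coordinates uses the usual `[row, column]` notation. Here is a minimal sketch with a made-up matrix (the object `coords_demo` and its values are invented for illustration), built with `cbind()` in the same way as `coords` earlier in this helpsheet:

```r
# A small coordinate matrix with named columns
coords_demo <- cbind(Easting  = c(430100, 429900, 431250),
                     Northing = c(433500, 435100, 434000))

is.matrix(coords_demo)      # check that this is a matrix
coords_demo[, 1]            # first column, selected by position
coords_demo[, "Northing"]   # a column, selected by name
coords_demo[2, ]            # both coordinates of the second point
```

The same expressions should work on `GP_SP@coords`, for example `GP_SP@coords[, "Easting"]`.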
114 | 
115 | Now we can run the command to reproject from British National Grid (Eastings and Northings) into WGS84 (Latitude and Longitude).
116 | 
117 | ```{r,results='hide',comment=NA}
118 |   # Convert from Eastings and Northings to Latitude and Longitude
119 |   GP_SP_LL <- spTransform(GP_SP, CRS(latlong))
120 |   # We also need to rename the coordinate columns
121 |   colnames(GP_SP_LL@coords)[colnames(GP_SP_LL@coords)=="Easting"] <- "Longitude"
122 |   colnames(GP_SP_LL@coords)[colnames(GP_SP_LL@coords)=="Northing"] <- "Latitude"
123 | ```
124 | 
125 | ```{r,results='hide'}
126 |   head(GP_SP_LL@coords)
127 | ```
128 | 
129 | ```{r,echo=FALSE,warning=FALSE,comment=NA}
130 |   head(GP_SP_LL@coords)
131 | ```
132 | 
133 | Now the data are in Latitude and Longitude.


--------------------------------------------------------------------------------
/R_Markdown/common-error-msg.Rmd:
--------------------------------------------------------------------------------
 1 | # Why doesn't my code work? - Common things to check
 2 | 
 3 | There could be many reasons why your code doesn't work, but that doesn't mean all is lost. These are the most common things you should check:
 4 | 
 5 | ### Error Messages
 6 | 
 7 | Read the error message - R can sometimes be a bit cryptic with error messages, but they usually point you in the right direction. Most of the time it involves checking exactly what you typed - typos are very common in R. Remember you can press 'up' on the keyboard to see the last command and edit it - you don't need to type out the whole thing again. You can also run the `history()` command to see all of your previous commands. 
 8 | 
 9 | Here are some hints on specific error messages:
10 | 
11 | 1. "`Error: unexpected ','`" is fairly self-explanatory - remove the extra comma!
12 | 
13 | 1. "`Error: unexpected symbol`" could mean that you've missed an `=` sign, a quote mark `'` or some other small but vital piece of syntax.
14 | 
15 | 1. "`Error: object not found`" means that R can't find the object you are referring to. Remember R is case sensitive (i.e. the lower case and CAPITAL letters must match exactly when referring to an object) so `House.prices` is not the same object as `house.prices`. Also check that you've spelt the object name correctly. You can use '`ls()`' to give you a list of all the current objects in R. 
16 | 
17 | If you get a different error message, or no message, check exactly what you have typed. If you can't see anything wrong, get the person sitting next to you to check - a second pair of eyes is often useful. 
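
It can help to trigger these errors on purpose so they are easier to recognise later. The sketch below is for illustration only (the object names are made up). It uses `tryCatch()` to capture each message as text rather than stopping the script, and `parse()` to check a line of code without running it.

```r
# An object with a capital H
House.prices <- c(250000, 310000)

# Referring to it with a lower case h fails: R is case sensitive
msg_case <- tryCatch(mean(house.prices),
                     error = function(e) conditionMessage(e))
msg_case    # the message should mention that the object was not found

# Two names with nothing between them (e.g. a missing operator or comma)
msg_symbol <- tryCatch(parse(text = "x y"),
                       error = function(e) conditionMessage(e))
msg_symbol  # the message should mention an unexpected symbol

# A stray comma at the end of a command
msg_comma <- tryCatch(parse(text = "x <- 1,"),
                      error = function(e) conditionMessage(e))
msg_comma   # the message should mention an unexpected ','

# List the objects that do exist, to check spelling and capitalisation
ls()
```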
18 | 
19 | ### Packages
20 | 
21 | Sometimes missing packages can be a problem.
22 | 
23 | 1. Remember when using packages there are two stages to this - installing the package, and then loading the package (using the `library()` command). 
24 | 
25 | 2. The install command looks like this: `install.packages("maptools", dependencies = TRUE)` where `maptools` is the package name in this case. When you do this it may ask you to select a mirror by opening a new window - just click one of the UK ones to continue. 
26 | 
27 | 3. If R says `Error: package 'sp' required by 'maptools' could not be found` it means it couldn't install the `sp` package for some reason - try installing it separately (`install.packages("sp", dependencies = TRUE)`) and then install `maptools`. 
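
The two stages (install, then load) can be combined so that a package is only installed when it is missing. The following is a minimal sketch, not a recommended standard: the helper name `load_package` is made up for this example, and it is demonstrated on `stats`, which ships with R, so no installation is actually triggered.

```r
# Install a package only if it is not already available, then load it
# NOTE: load_package is a hypothetical helper written for this example
load_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Stage 1: install (only runs when the package is missing)
    install.packages(pkg, dependencies = TRUE)
  }
  # Stage 2: load; character.only = TRUE lets us pass the name as text
  library(pkg, character.only = TRUE)
}

# 'stats' is installed with R itself, so this only loads it
load_package("stats")
```

Replacing `"stats"` with another package name, such as `"sp"`, would install that package first if needed.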
28 | 
29 | 
30 | ### What does `x` do?
31 | 
32 | If you're not sure what a particular function does, type `?`, followed by the function (e.g. `?summary`) and R will open the help file for that tool (`summary` in this case). You could also Google 'R summary' which should generate some useful results. 
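
Two other base R functions can help when you only half remember a function. The short sketch below uses `summary` and `read.csv` purely as examples:

```r
# List the names of available objects that contain "summary"
apropos("summary")

# Show the arguments a function accepts, without opening the help file
args(read.csv)
```

`apropos()` is useful when you are unsure of the exact name, and `args()` is a quick reminder of argument names such as `header` and `skip`.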
33 | 


--------------------------------------------------------------------------------