├── .DS_Store ├── .Rprofile ├── .gitignore ├── .hugo_build.lock ├── 01-gds.Rmd ├── 01-gds.html ├── 02-spatial-data.Rmd ├── 02-spatial-data.html ├── 03-spatial_weights.Rmd ├── 03-spatial_weights.html ├── 04-spatial_econometrics.Rmd ├── 04-spatial_econometrics.html ├── 05-sentiment-analysis.Rmd ├── 05-sentiment-analysis.html ├── LICENSE ├── README.md ├── data ├── .DS_Store ├── Liverpool_MSOA.dbf ├── Liverpool_MSOA.prj ├── Liverpool_MSOA.shp ├── Liverpool_MSOA.shx ├── Liverpool_OA.dbf ├── Liverpool_OA.prj ├── Liverpool_OA.shp ├── Liverpool_OA.shx ├── Local_Authority_Districts_(May_2021)_UK_BFE_V3 │ ├── LAD_MAY_2021_UK_BFE_V2.cpg │ ├── LAD_MAY_2021_UK_BFE_V2.dbf │ ├── LAD_MAY_2021_UK_BFE_V2.prj │ ├── LAD_MAY_2021_UK_BFE_V2.shp │ ├── LAD_MAY_2021_UK_BFE_V2.shx │ └── Local_Authority_Districts_(May_2021)_UK_BFE_V3.xml ├── census_data.csv ├── census_data2.csv ├── uk-sentiment-data.csv └── uk_geo_tweets_01012019_31012019.csv ├── figs ├── .DS_Store ├── sources_gds.png └── spatial_weight.png ├── index.Rmd ├── index.html ├── index.pdf ├── intro-gds.Rproj ├── refs.bib ├── rmd └── .DS_Store └── skeleton.bib /.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/.DS_Store -------------------------------------------------------------------------------- /.Rprofile: -------------------------------------------------------------------------------- 1 | # REMEMBER to restart R after you modify and save this file! 2 | 3 | # First, execute the global .Rprofile if it exists. You may configure blogdown 4 | # options there, too, so they apply to any blogdown projects. Feel free to 5 | # ignore this part if it sounds too complicated to you. 6 | if (file.exists("~/.Rprofile")) { 7 | base::sys.source("~/.Rprofile", envir = environment()) 8 | } 9 | 10 | # Now set options to customize the behavior of blogdown for this project. 
Below 11 | # are a few sample options; for more options, see 12 | # https://bookdown.org/yihui/blogdown/global-options.html 13 | options( 14 | # to automatically serve the site on RStudio startup, set this option to TRUE 15 | blogdown.serve_site.startup = FALSE, 16 | # to disable knitting Rmd files on save, set this option to FALSE 17 | blogdown.knit.on_save = TRUE, 18 | # build .Rmd to .html (via Pandoc); to build to Markdown, set this option to 'markdown' 19 | blogdown.method = 'html' 20 | ) 21 | 22 | # fix Hugo version 23 | #options(blogdown.hugo.version = "0.83.0") 24 | options(blogdown.hugo.version = "0.101.0") 25 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | .Ruserdata 5 | -------------------------------------------------------------------------------- /.hugo_build.lock: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/.hugo_build.lock -------------------------------------------------------------------------------- /01-gds.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "What is geographic data science?" 
3 | author: "Francisco Rowe ([`@fcorowe`](http://twitter.com/fcorowe))" 4 | date: "`r Sys.Date()`" 5 | output: tint::tintHtml 6 | bibliography: refs.bib 7 | link-citations: yes 8 | --- 9 | 10 | ```{r setup, include=FALSE} 11 | library(tint) 12 | # invalidate cache when the package version changes 13 | knitr::opts_chunk$set(tidy = FALSE, cache.extra = packageVersion('tint'), class.source = "col-source") 14 | options(htmltools.dir.version = FALSE) 15 | ``` 16 | 17 | ```{css, echo=FALSE} 18 | .col-source { 19 | background-color: #E5E7E9; 20 | border: 3px #000000; 21 | } 22 | ``` 23 | 24 | ```{marginfigure} 25 | [**Back**](index.html) \ 26 | 27 | [**Next**](02-spatial-data.html) 28 | ``` 29 | 30 | # Rise of data 31 | 32 | We are experiencing a data revolution. Technological advances in computational power, storage and network platforms have enabled the emergence of *Big Data*. These technological innovations have facilitated the production, processing, analysis and storage of large volumes of digital data. Information that previously could not be stored, or used to be captured using analog devices can now be recorded digitally. We can now digitally generate, store, manage and analyse data that were previously very challenging to access, such as books, newspapers, photographs and art work. Mobile phones, social media platforms, satellites, emails, smart cards, CCTV and The Internet have all led to the current data revolution we are living in. 33 | 34 | ```{marginfigure} 35 | Rowe, F. 2021. [Big Data and Human Geography](https://doi.org/10.31235/osf.io/phz3e). In: Demeritt, D. and Lees L. (eds) *Concise Encyclopedia of Human Geography*. Edward Elgar Encyclopedias in the Social Sciences series. 36 | ``` 37 | 38 | 39 | ![Fig. 1. "Big Data" sources collected through direct, indirect and volunteering systems. Source: Rowe et al (2021)](./figs/sources_gds.png) 40 | 41 | Yet, the rise in data has posed major epistemological, methodological and ethical challenges (Rowe, 2021). 
Data on their own are not enough. We need insights. To this end, *Data Science* has been instrumental in turning data resources into insight and understanding. *Data Science* is understood as the processes and techniques involved in this operation. *Big Data* are often unstructured, fragmented and hard to access due to privacy and confidentiality concerns. Significant data engineering is required, involving the use and design of specialised methods, software and expert knowledge, and linkage to other data sources, in order to use most *Big Data* sources (Arribas-Bel et al., 2021). 42 | 43 | ```{marginfigure} 44 | Arribas-Bel, Dani, Mark Green, Francisco Rowe, and Alex Singleton. 2021. “Open Data Products-a Framework for Creating Valuable Analysis Ready Data.” *Journal of Geographical Systems*. 23 (4): 497–514. https://doi.org/10.1007/s10109-021-00363-5 45 | ``` 46 | 47 | # Geographic Data Science 48 | 49 | ```{marginfigure} 50 | Singleton, A., and Arribas-Bel, D. 2019. “Geographic Data Science.” *Geographical Analysis*. 53 (1): 61–75. https://doi.org/10.1111/gean.12194. 51 | ``` 52 | 53 | Geographic data science is a subfield of research in geography and sits at the intersection between geography and data science (Singleton and Arribas-Bel, 2019). Geographic data science entails a bidirectional relationship between geography and data science. Geographic data science argues for the benefits of *Geography* for *Data Science* to address spatially explicit problems, especially because much *Big Data* are spatial. Explicit consideration of space is not a matter of adding an additional variable in regression models, but of understanding the conceptual and methodological complexities of geographical context, such as *spatial autocorrelation*, *spatial non-stationarity*, *spatial heterogeneity* and *local contextual contingencies* (see Rowe and Arribas-Bel 2022). 
At the same time, *Geography* has much to gain from *Data Science*, particularly in the methodological and technical aspects of working with *Big Data*. 54 | 55 | ```{marginfigure} 56 | Rowe, F. and Arribas-Bel, D. 2022. “Spatial Modelling for Data Scientists.” https://doi.org/10.17605/OSF.IO/8F6XR. 57 | ``` 58 | 59 | -------------------------------------------------------------------------------- /02-spatial-data.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Spatial Data" 3 | author: "Francisco Rowe ([`@fcorowe`](http://twitter.com/fcorowe))" 4 | date: "`r Sys.Date()`" 5 | output: tint::tintHtml 6 | bibliography: skeleton.bib 7 | link-citations: yes 8 | --- 9 | 10 | ```{r setup, include=FALSE} 11 | library(sf) 12 | library(tint) 13 | # invalidate cache when the package version changes 14 | knitr::opts_chunk$set(tidy = FALSE, cache.extra = packageVersion('tint'), class.source = "col-source") 15 | options(htmltools.dir.version = FALSE) 16 | ``` 17 | 18 | ```{css, echo=FALSE} 19 | .col-source { 20 | background-color: #E5E7E9; 21 | border: 3px #000000; 22 | } 23 | ``` 24 | 25 | ```{marginfigure} 26 | [**Back**](01-gds.html) \ 27 | 28 | [**Next**](03-spatial_weights.html) 29 | ``` 30 | 31 | # Fundamental Geographic Data Structures 32 | 33 | Three main structures are generally used to organise geographic data: 34 | 35 | 1. [Vector data structures]{.underline}: Vector data structures record geographic information using points, lines and polygons in a geographic table. These tables contain information about geographic objects. Columns store the attributes or features of geographic objects, and rows represent individual geographic objects. 36 | 37 | 2. [Raster data structures]{.underline}: Raster data structures record geographic data in a uniform way over space in the form of grids. They divide geographic surfaces up into cells of constant size. 
Rows and columns provide information about the geographic location of each grid cell. 38 | 39 | 3. [Spatial graphs]{.underline}: Spatial graphs store connections between objects through space. These connections may derive from geographical topology (e.g. contiguity), distance, or more sophisticated dimensions, such as interaction flows (e.g. human mobility, trade and information). 40 | 41 | Vector data structures tend to dominate the social sciences, as the interest is often in capturing discrete geographic units containing populations. We therefore focus on vector data structures here. 42 | 43 | ## Vector data 44 | 45 | To understand the structure of vector data, let's read a dataset (`Liverpool_OA.shp`) describing output areas within Liverpool in the United Kingdom. To read in the data, we use the function `st_read()` from the package `sf`. `sf` supports geometry collections, which can contain multiple geometry types in a single object. `sf` provides the same functionality previously provided in three separate packages `sp`, `rgdal` and `rgeos` (Robin et al. 2021). `sf` can also be used in combination with `tidyverse`! 46 | 47 | Reading the data set via `sf` returns its geographic metadata (i.e. `Geometry type`, `Dimension`, `Bounding box` and coordinate reference system information on the line beginning `Projected CRS`). 48 | 49 | ```{marginfigure} 50 | For raster data, I would recommend using the package `terra`. 51 | ``` 52 | 53 | ```{marginfigure} 54 | If you are interested in learning more about mapping geographic data, I cannot recommend enough: Lovelace, R., Nowosad, J. and Muenchow, J., 2019. "*Geocomputation with R*". Chapman and Hall/CRC. 55 | ``` 56 | 57 | ```{r} 58 | oa_shp <- st_read("./data/Liverpool_OA.shp") 59 | ``` 60 | 61 | We read an `sf` data frame containing spatial and attribute columns. We can examine the content of the data frame by using the function `head()`. Below, we display only the first four columns. 
The last column in this example contains the geographic information i.e. `geometry`. 62 | 63 | ```{r} 64 | class(oa_shp) 65 | head(oa_shp[,1:4]) 66 | ``` 67 | 68 | Each row represents an output area. Each output area has multiple attributes (i.e. columns): administrative areas codes and geometry, as well as information on the local population in these areas; however, this information is not displayed above (can you access it?). 69 | 70 | The content of the geometry column gives `sf` objects their spatial powers. `oa_shp$geometry` is a 'list column' that contains all the coordinates of the output areas polygons. `sf` objects can be plotted quickly with the base R function `plot()`. 71 | 72 | ```{marginfigure} 73 | For more advanced map making, use dedicated visualisation packages such as `tmap` or `ggplot2`. 74 | ``` 75 | 76 | 77 | ```{r} 78 | plot(oa_shp$geometry) 79 | ``` 80 | We can thematically colour any attributes in the spatial data frame based on a column by passing the name of that column to the plot function. We map the share of unemployed population. We can adjust the key or legend position (`key.pos`), plot axes (`axes`), length of the scale bar (`key.length`), thickness/width of the scale bar (`key.width`), method or number to break the data attribute (`breaks`), line width (`lwd`) and colour of polygon borders (`border`). 81 | 82 | ```{r} 83 | plot(oa_shp["unemp"], key.pos = 4, axes = TRUE, key.width = lcm(1.3), key.length = 1., breaks = "jenks", lwd = 0.1, border = 'grey') 84 | ``` 85 | 86 | Various types of geometries (i.e. lines, points and polygons) exist. 
We can transform vector data into points by running: 87 | 88 | 89 | ```{r, warning=FALSE} 90 | oa_cents = st_centroid(oa_shp) 91 | head(oa_cents[,1:4]) 92 | ``` 93 | And visualise the data by running: 94 | 95 | ```{r} 96 | plot(st_geometry(oa_cents)) 97 | ``` 98 | 99 | 100 | -------------------------------------------------------------------------------- /03-spatial_weights.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Spatial Weights" 3 | author: "Francisco Rowe ([`@fcorowe`](http://twitter.com/fcorowe))" 4 | date: "`r Sys.Date()`" 5 | output: tint::tintHtml 6 | bibliography: skeleton.bib 7 | link-citations: yes 8 | --- 9 | 10 | ```{r setup, include=FALSE} 11 | library(tint) 12 | # handle spatial data 13 | library(sf) 14 | library(spdep) 15 | # manipulate data 16 | library(tidyverse) 17 | library(lubridate) 18 | # create maps 19 | library(tmap) 20 | # nice colour schemes 21 | library(viridis) 22 | library(viridisLite) 23 | # invalidate cache when the package version changes 24 | knitr::opts_chunk$set(tidy = FALSE, cache.extra = packageVersion('tint'), class.source = "col-source") 25 | options(htmltools.dir.version = FALSE) 26 | ``` 27 | 28 | 29 | ```{css, echo=FALSE} 30 | .col-source { 31 | background-color: #E5E7E9; 32 | border: 3px #000000; 33 | } 34 | ``` 35 | 36 | ```{marginfigure} 37 | [**Back**](02-spatial-data.html) \ 38 | 39 | [**Next**](04-spatial_econometrics.html) 40 | ``` 41 | 42 | # Intuition 43 | 44 | Now we will learn the intuition of how we can represent spatial relationships in practice. 45 | We will explore a key concept of spatial analysis: *spatial weights matrices*. 46 | Spatial weights matrices are structured sets of numbers that formally encode spatial associations between observations. 47 | 48 | ![Fig. 1. 
Spatial weights matrix.](./figs/spatial_weight.png) 49 | 50 | Key attributes of a spatial weight matrix: 51 | 52 | - Cell elements represent the extent of spatial interaction between two observations;\ 53 | - The extent of spatial interaction is mediated by spatial proximity. 54 | 55 | Spatial weight matrices can be created in various ways. 56 | We will discuss those most commonly used in practice. 57 | 58 | # Data 59 | 60 | For now, we will only need our local authority (LA) boundaries. 61 | 62 | ```{r} 63 | # clean workspace 64 | rm(list=ls()) 65 | 66 | # read shapefile 67 | la_shp <- st_read("./data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/LAD_MAY_2021_UK_BFE_V2.shp") 68 | 69 | # simplify boundaries 70 | la_shp_simple <- st_simplify(la_shp, 71 | preserveTopology = TRUE, 72 | dTolerance = 500) # .5km 73 | 74 | # ensure geometry is valid 75 | la_shp_simple <- sf::st_make_valid(la_shp_simple) 76 | 77 | head(la_shp_simple[,c(2,3)]) 78 | 79 | ``` 80 | 81 | # Building Spatial Weights 82 | 83 | ## Contiguity-based matrices 84 | 85 | Contiguity weights matrices define spatial connection through the existence of common geographical boundaries. 86 | 87 | ### Queen 88 | 89 | Based on the queen criterion, two spatial units are contiguous if they share at least a vertex (a single point) of their boundaries. 90 | 91 | ```{r} 92 | wm_queen <- poly2nb(la_shp_simple, queen = TRUE) 93 | summary(wm_queen) 94 | ``` 95 | 96 | > How do we interpret the outcome? 
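One way to read a neighbour summary is through the number of links each unit has. The sketch below is a minimal illustration and is not part of the workshop data: it assumes a hypothetical 3 x 3 grid of cells, applies the queen criterion by hand in base R and counts the neighbours of each cell. The names `grid`, `queen` and `n_neigh` are illustrative only.

```r
# Hypothetical 3 x 3 grid of cells (an assumption for illustration),
# numbered row by row so that cell 5 is the centre
grid <- expand.grid(col = 1:3, row = 1:3)

# Absolute differences in row and column position between every pair of cells
d_row <- abs(outer(grid$row, grid$row, "-"))
d_col <- abs(outer(grid$col, grid$col, "-"))

# Queen criterion: two distinct cells are neighbours if they touch at an
# edge or a corner, i.e. neither position differs by more than one step
queen <- (pmax(d_row, d_col) == 1) * 1

# Number of neighbours per cell, and the most connected cell
n_neigh <- rowSums(queen)
n_neigh
which.max(n_neigh)
```

In this toy grid the centre cell has eight neighbours, edge cells five and corner cells three. The summary of a neighbour object for real polygons can be read in the same spirit: how many links each unit has, and which units are the most and least connected.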
97 | 98 | Finding the most connected area: 99 | 100 | ```{r} 101 | la_shp_simple$LAD21NM[373] 102 | ``` 103 | 104 | Its neighbours: 105 | 106 | ```{r} 107 | wm_queen[[373]] 108 | ``` 109 | 110 | Their names: 111 | 112 | ```{r} 113 | la_shp_simple$LAD21NM[c(19, 48, 354, 356, 358, 359, 361, 363, 367, 369, 371, 374)] 114 | ``` 115 | 116 | Visualising the weights matrix: 117 | 118 | ```{r} 119 | coords <- st_centroid(st_geometry(la_shp_simple)) 120 | plot(st_geometry(la_shp_simple), border="grey") 121 | plot(wm_queen, coords, add = TRUE) 122 | ``` 123 | 124 | ### Rook 125 | 126 | The rook criterion defines two observations as neighbours if they share a segment of their boundaries (an edge), rather than just a single point. 127 | For irregular polygons, differences between the rook and queen definitions are minimal and tend to boil down to geocoding. 128 | For regular polygons, such as rasters or grids, differences are more noticeable. 129 | 130 | ```{r} 131 | wm_rook <- poly2nb(la_shp_simple, queen = FALSE) 132 | summary(wm_rook) 133 | ``` 134 | 135 | > Have a go at interpreting and plotting the results. 136 | 137 | ## Distance-based matrices 138 | 139 | Distance-based matrices assign weights to each pair of observations as a function of their geographical proximity. 140 | There are various distance-based matrices, but they share the same intuition. 141 | 142 | ### K-Nearest Neighbours 143 | 144 | One approach is to define weights based on the distances between a reference observation and the set of its *k* closest observations. 145 | For more details, see [the `knearneigh()` documentation](https://r-spatial.github.io/spdep/reference/knearneigh.html). 146 | 147 | ```{r} 148 | col.knn <- knearneigh(coords, k=4) 149 | head(col.knn[[1]], 5) 150 | ``` 151 | 152 | Displaying the network. 
153 | 154 | ```{r} 155 | plot(st_geometry(la_shp_simple), border="grey") 156 | plot(knn2nb(col.knn), coords, add=TRUE) 157 | ``` 158 | 159 | ### Distance Band 160 | 161 | An alternative way to define neighbours is to draw a circle of a certain radius and treat all observations (i.e. centroids) within that radius as neighbours. 162 | 163 | ```{r} 164 | wm_dist <- dnearneigh(coords, 0, 20000, longlat = TRUE) 165 | wm_dist 166 | ``` 167 | 168 | ```{r} 169 | plot(st_geometry(la_shp_simple), border="grey") 170 | plot(wm_dist, coords, add=TRUE) 171 | ``` 172 | 173 | ## Row Standardised Weights Matrices 174 | 175 | A spatial weights matrix with raw values (e.g. 1/0s) is rarely the best approach for analysis and some kind of transformation is required. 176 | 177 | Let's use the queen definition to illustrate the example. 178 | 179 | > Note: `zero.policy = TRUE` allows units with no neighbours to be included in the weights list. 180 | > See what happens if you drop this argument or print `rswm_queen`. 181 | > If you have done that, you may be waiting for the answer. 182 | > Well, the answer is that we have islands in the dataset, and the queen definition does not integrate these places very well. 183 | 184 | The argument `style = "W"` indicates that equal weights are assigned to neighbouring polygons, so that they are row standardised: for a given polygon, we sum across the columns and divide each cell by the total to derive the weights. 185 | 186 | Let's see the weights for polygon 1: 187 | 188 | ```{r} 189 | rswm_queen <- nb2listw(wm_queen, style = "W", zero.policy = TRUE) 190 | rswm_queen$weights[1] 191 | ``` 192 | 193 | This unit has 2 neighbours and each is assigned 0.5 of the total weight. 
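The row standardisation described above can be sketched in base R. This is a minimal example under stated assumptions: the 4 x 4 binary matrix `W` is hypothetical and does not come from the local authority data, and `W_rs` is an illustrative name.

```r
# Hypothetical binary contiguity matrix for four areas (1 = neighbours)
W <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 0,
              1, 1, 0, 1,
              0, 0, 1, 0),
            nrow = 4, byrow = TRUE)

# Row standardisation: divide each cell by its row total,
# so that the weights of each area sum to one
W_rs <- W / rowSums(W)

W_rs[1, ]      # area 1 has two neighbours, each with weight 0.5
rowSums(W_rs)  # every row sums to 1
```

An area with no neighbours would have a row total of zero, so the division is undefined for it. This is the situation that `zero.policy = TRUE` is there to handle in `nb2listw()`.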
194 | 195 | ```{r eval=FALSE, include=FALSE} 196 | file.edit( 197 | tint:::template_resources( 198 | 'tint', '..', 'skeleton', 'skeleton.Rmd' 199 | ) 200 | ) 201 | ``` 202 | 203 | ```{r bib, include=FALSE} 204 | # create a bib file for the R packages used in this document 205 | knitr::write_bib(c('base', 'rmarkdown'), file = 'skeleton.bib') 206 | ``` 207 | -------------------------------------------------------------------------------- /04-spatial_econometrics.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Spatial Econometrics: Fundamentals" 3 | author: "Francisco Rowe ([`@fcorowe`](http://twitter.com/fcorowe))" 4 | date: "`r Sys.Date()`" 5 | output: tint::tintHtml 6 | bibliography: skeleton.bib 7 | link-citations: yes 8 | --- 9 | 10 | ```{r setup, include=FALSE} 11 | library(tint) 12 | # handle spatial data 13 | library(sf) 14 | library(spdep) 15 | # manipulate data 16 | library(tidyverse) 17 | library(lubridate) 18 | # create maps 19 | library(tmap) 20 | # create interactive maps 21 | library(leaflet) 22 | # nice colour schemes 23 | library(viridis) 24 | library(viridisLite) 25 | # invalidate cache when the package version changes 26 | knitr::opts_chunk$set(tidy = FALSE, cache.extra = packageVersion('tint'), class.source = "col-source") 27 | options(htmltools.dir.version = FALSE) 28 | ``` 29 | 30 | 31 | ```{css, echo=FALSE} 32 | .col-source { 33 | background-color: #E5E7E9; 34 | border: 3px #000000; 35 | } 36 | ``` 37 | 38 | ```{marginfigure} 39 | [**Back**](03-spatial_weights.html) \ 40 | ``` 41 | 42 | # Key idea 43 | 44 | We want to analyse the extent of spatial auto-correlation in anti-immigration sentiment based on Twitter data. 45 | 46 | # Data 47 | 48 | We will be using a sample of data obtained via the [Twitter Academic Application Programming Interface (API)](https://developer.twitter.com/en/products/twitter-api/academic-research). 
49 | 50 | I obtained a sample of migration-related geolocated tweets for the United Kingdom. I used a bounding box containing the United Kingdom. Some tweets had the exact location. The majority had information about the name location and were geolocated using their corresponding bounding box. The search terms to identify migration related tweets can be found [here](https://github.com/fcorowe/stigma_covid). The same list of terms was used in Rowe et al (2021). 51 | 52 | ```{marginfigure} 53 | Rowe, F., Mahony, M., Graells-Garrido, E., Rango, M. and Sievers, N., 2021. Using Twitter to track immigration sentiment during early stages of the COVID-19 pandemic. *Data & Policy*, 3. 54 | ``` 55 | 56 | I then used the tweet text content to measure the sentiment using an algorithm known as *VADER* (Valence Aware Dictionary and sEntiment Reasoner). If you are interested in how to do this in *R*, see [this code](05-sentiment-analysis.html). For details on the algorithm, see Hutto and Gilbert (2014) - and on how to interpret the results in the context of migration, see Rowe et al (2021). 57 | 58 | ```{marginfigure} 59 | Hutto, C and Gilbert, E (2014) VADER: A parsimonious rule-based model for sentiment analysis of social media text. In Eighth International Conference on Weblogs and Social Media (ICWSM-14). Menlo Park, CA: *Association for the Advancement of Artificial Intelligence*, pp. 216–225 60 | ``` 61 | 62 | 63 | We now read and inspect the Twitter data 64 | ```{r, output=FALSE, message=FALSE} 65 | # clean workspace 66 | rm(list=ls()) 67 | 68 | # read twitter data 69 | tweet_df <- read_csv("./data/uk-sentiment-data.csv") 70 | 71 | # show head 72 | head(tweet_df) 73 | ``` 74 | 75 | We will be mapping the data so we first transform the non-spatial data frame of tweets to a spatial data frame using the coordinate reference system `crs` `EPSG:4326`. Learn more about CRS in [Lovelace et al (2019) Chapter 7](https://geocompr.robinlovelace.net/reproj-geo-data.html). 
76 | 77 | ```{marginfigure} 78 | Lovelace, R., Nowosad, J. and Muenchow, J., 2019. Geocomputation with R. Chapman and Hall/CRC. 79 | ``` 80 | 81 | ```{r} 82 | # from non-spatial data frame to a spatial data frame 83 | tweet_df.geo <- tweet_df %>% 84 | #filter(compound < -0.05 | compound > 0.05) %>% 85 | st_as_sf(coords = c("long", "lat"), 86 | crs = "EPSG:4326") 87 | ``` 88 | 89 | Second, we read a shapefile containing the polygons for local authority districts in the United Kingdom. We simplify these polygons as they are very detailed and may take a long time to render. We will be using these polygons for data visualisation, so precision is less important. 90 | 91 | ```{r} 92 | # read shapefile 93 | la_shp <- st_read("./data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/LAD_MAY_2021_UK_BFE_V2.shp") 94 | 95 | # simplify boundaries 96 | la_shp_simple <- st_simplify(la_shp, 97 | preserveTopology = TRUE, 98 | dTolerance = 1000) # 1km 99 | 100 | # ensure geometry is valid 101 | la_shp_simple <- sf::st_make_valid(la_shp_simple) 102 | ``` 103 | 104 | # Exploratory Spatial Data Analysis 105 | 106 | Before diving into more sophisticated analysis, a good starting point is to run exploratory spatial data analysis (ESDA). 107 | ESDA techniques are usually divided into two main groups: 108 | (1) **global** spatial autocorrelation, which focuses on the overall trend or the degree of spatial clustering in a variable; 109 | (2) **local** spatial autocorrelation, which focuses on spatial instability: the departure of parts of a map from the general trend. It is useful for identifying hot or cold spots. 110 | 111 | ```{marginfigure} 112 | Recall: **Spatial autocorrelation** relates to the degree to which observations of a variable in neighbouring areas have similar values. 113 | ``` 114 | 115 | A key idea to develop some intuition here is the idea of **spatial randomness** i.e. 
a situation in which the values of a variable are unrelated to location, and therefore the variable's distribution follows no discernible pattern over space. 116 | 117 | Spatial autocorrelation can be defined as the "absence of spatial randomness". 118 | This gives rise to two main classes of autocorrelation: 119 | (1) **Positive** spatial autocorrelation, where similar values tend to group together in nearby locations; and, 120 | (2) **Negative** spatial autocorrelation, where similar values tend to be dispersed, so that nearby locations have dissimilar values. 121 | 122 | Here we will explore spatial autocorrelation by looking at how we can identify its presence, nature, and strength. 123 | 124 | Let's start with some simple exploration of the data by creating a point map. 125 | 126 | We can use `ggplot` to draw the polygons of local authority districts in the United Kingdom. 127 | 128 | ```{r} 129 | p <- ggplot(data = la_shp_simple) + 130 | geom_sf(color = "gray60", 131 | size = 0.1) 132 | p 133 | ``` 134 | 135 | We don't really need the axes or background here, so let's remove them: 136 | 137 | ```{r} 138 | p + 139 | theme_void() 140 | ``` 141 | We can now visualise the tweets using `geom_point`: 142 | 143 | ```{r} 144 | p + 145 | geom_point(data = tweet_df.geo, 146 | aes(color = neg, geometry = geometry), 147 | stat = "sf_coordinates" 148 | ) + 149 | theme_void() 150 | ``` 151 | 152 | We can adjust the colour palette using `scale_color_viridis_c`: 153 | 154 | ```{r} 155 | p + 156 | geom_point(data = tweet_df.geo, 157 | aes(color = neg, geometry = geometry), 158 | stat = "sf_coordinates" 159 | ) + 160 | theme_void() + 161 | scale_color_viridis_c(option = "C") + 162 | # you could also try: scale_colour_distiller(palette = "RdBu", direction = -1) 163 | labs(color= 'Negative sentiment score') 164 | 165 | ``` 166 | If you are not familiar with the geography of the United Kingdom, this map may not be very informative. 
So let's add more context by adding an interactive map using the package `leaflet`. 167 | 168 | ```{r} 169 | leaflet() %>% 170 | addProviderTiles("Stamen.TonerLite") %>% 171 | addCircles(data = tweet_df.geo, 172 | color = "blue") 173 | ``` 174 | 175 | ```{marginfigure} 176 | What do we learn from these maps? 177 | ``` 178 | 179 | There seems to be some slight spatial patterning: similar values tend to cluster together in space. 180 | 181 | ```{marginfigure} 182 | How can we measure this apparent spatial clustering or spatial dependence? 183 | Is it statistically significant? 184 | ``` 185 | 186 | # Spatial lag 187 | 188 | To measure spatial dependence and further explore it, we will need to create a spatial lag. 189 | A spatial lag is the product of a spatial weight matrix and a given variable. 190 | With a row-standardised matrix, the spatial lag of a variable is the average value of that variable in the neighbourhood; that is, using the values of all the areas which are defined as neighbours; hence, the concept of spatial lag is inherently related to the concept of spatial weight matrix. 191 | 192 | ## Creating a spatial weight matrix 193 | 194 | So first let's build and standardise a spatial weight matrix. 195 | For this example, we'll use the 10 nearest neighbours (*k* = 10). 196 | 197 | ```{marginfigure} 198 | Can you try other definitions of spatial weights matrices? 199 | ``` 200 | 201 | 202 | ```{r, warning=FALSE} 203 | # create knn list 204 | coords <- st_centroid(st_geometry(tweet_df.geo)) 205 | col_knn <- knearneigh(coords, k=10) 206 | # create nb object 207 | hnb <- knn2nb(col_knn) 208 | # create spatial weights matrix (note it row-standardizes by default) 209 | hknn <- nb2listw(hnb) 210 | hknn 211 | ``` 212 | 213 | ```{marginfigure} 214 | Have a go at interpreting the summary of the spatial weight matrix. 215 | ``` 216 | 217 | # Creating a spatial lag 218 | 219 | Once we have built a spatial weights matrix, we can compute a spatial lag. 
220 | A spatial lag offers a quantitative way to represent spatial dependence, specifically the degree of connection between geographic units. 221 | 222 | Remember: the spatial lag is the product of a spatial weights matrix and a given variable and amounts to the average value of the variable in the neighbourhood of each observation. 223 | 224 | We use the row-standardised matrix for this and compute the spatial lag of the negative sentiment score. 225 | 226 | ```{r} 227 | neg_lag <- lag.listw(hknn, tweet_df.geo$neg) 228 | head(neg_lag) 229 | ``` 230 | 231 | The spatial lag `neg_lag` for the first observation can be interpreted as follows: a tweet in Islington with a negative sentiment score of 0.033 is surrounded by neighbouring data points which, on average, have a negative sentiment score of 0.0679375. 232 | 233 | # Spatial Autocorrelation 234 | 235 | We start by exploring global spatial autocorrelation. 236 | To this end, we will focus on the Moran Plot and the Moran's I statistic. 237 | 238 | ## Moran Plot 239 | 240 | The Moran Plot is a way of visualising the nature and strength of spatial autocorrelation. 241 | It's essentially a scatter plot between a variable and its spatial lag. 242 | To make the plot easier to interpret, variables are usually standardised. 243 | 244 | ```{r, fig.margin = TRUE, message=FALSE, warning=FALSE} 245 | ggplot(tweet_df.geo, aes(x = neg, y = neg_lag)) + 246 | geom_point() + 247 | geom_smooth(method = "lm") + 248 | ylab("Negative sentiment lag") + 249 | xlab("Negative sentiment") + 250 | theme_classic() 251 | ``` 252 | 253 | ```{r} 254 | tweet_df.geo <- cbind(tweet_df.geo, as.data.frame(neg_lag)) 255 | 256 | tweet_df.geo <- tweet_df.geo %>% 257 | mutate( 258 | st_neg = ( neg - mean(neg)) / sd(neg), 259 | st_neg_lag = ( neg_lag - mean(neg_lag)) / sd(neg_lag) 260 | ) 261 | 262 | ``` 263 | 264 | In a standardised *Moran Plot*, average values are centred around zero and dispersion is expressed in standard deviations. 
265 | The rule of thumb is that values more than two standard deviations above or below the mean can be considered outliers. 266 | A standardised Moran Plot can also be used to visualise *local spatial autocorrelation*. 267 | 268 | ```{marginfigure} 269 | Do you recall what *local spatial autocorrelation* is? 270 | ``` 271 | 272 | We can observe local spatial autocorrelation by partitioning the Moran Plot into four quadrants that represent different situations: 273 | 274 | * High-High (HH): values above average surrounded by values above average. 275 | * Low-Low (LL): values below average surrounded by values below average. 276 | * High-Low (HL): values above average surrounded by values below average. 277 | * Low-High (LH): values below average surrounded by values above average. 278 | 279 | ```{r} 280 | ggplot(tweet_df.geo, aes(x = st_neg, y = st_neg_lag)) + 281 | geom_point() + 282 | geom_smooth(method = "lm") + 283 | geom_hline(yintercept = 0, color = "grey", alpha =.5) + 284 | geom_vline(xintercept = 0, color = "grey", alpha =.5) + 285 | ylab("Negative sentiment lag \n (standardised)") + 286 | xlab("Negative sentiment \n (standardised)") + 287 | theme_classic() 288 | ``` 289 | 290 | ```{marginfigure} 291 | What do we learn from the Moran Plot? 292 | ``` 293 | 294 | ## Moran's I 295 | 296 | To measure global spatial autocorrelation, we can use *Moran's I*. 297 | The Moran Plot and Moran's I are intrinsically related. 298 | The value of Moran’s I corresponds to the slope of the linear fit on the Moran Plot. 299 | We can compute it by running: 300 | 301 | ```{r} 302 | moran.test(tweet_df.geo$neg, listw = hknn, zero.policy = TRUE, na.action = na.omit) 303 | ``` 304 | 305 | ```{marginfigure} 306 | What does Moran's I tell us? 307 | ``` 308 | 309 | # Exogenous spatial effects model 310 | 311 | ```{marginfigure} 312 | Rowe, F. and Arribas-Bel, D. 2022. “Spatial Modelling for Data Scientists.” https://doi.org/10.17605/OSF.IO/8F6XR. 
313 | ``` 314 | 315 | A natural next step is to explore how we can use our spatial lag variable in a regression model and what it can tell us. 316 | So far, we have measured spatial dependence in isolation. 317 | But that spatial dependence could be associated with a particular factor that could be explicitly measured and included in a model. 318 | So it is worth considering spatial dependence in a wider context, analysing its degree once other variables are accounted for in a regression model. 319 | We can do this by plugging our spatial lag variable into a regression model. 320 | But this goes beyond the scope of this workshop. 321 | If you are interested in how to get started with spatial econometrics modelling in *R*, check out [Chapter 6 of our book Spatial Modelling for Data Scientists](https://gdsl-ul.github.io/san/spatialecon.html). 322 | 323 | ```{marginfigure} 324 | Excellent references to continue your learning on spatial econometrics are: 325 | Anselin, Luc. 1988. [Spatial Econometrics: Methods and Models](https://doi.org/10.1007/978-94-015-7799-1). Vol. 4. Springer Science & Business Media. 326 | Anselin, Luc. 2003. [Spatial Externalities, Spatial Multipliers, and Spatial Econometrics.](https://doi.org/10.1177/0160017602250972) International Regional Science Review 26 (2): 153–66. 327 | Anselin, Luc, and Sergio J. Rey. 2014. Modern Spatial Econometrics in Practice: A Guide to GeoDa, GeoDaSpace and PySAL. GeoDa Press LLC. 328 | ``` 329 | 330 | > Final Note: Introducing a spatial lag of an explanatory variable is the most straightforward way of incorporating the notion of spatial dependence in a linear regression framework. 331 | It does not require additional changes to the modelling structure, can be estimated via OLS and the interpretation is similar to interpreting non-spatial variables.
332 | However, other model specifications are more common in the field of spatial econometrics, specifically the **spatial lag** and **spatial error** models. 333 | While both build on the notion of spatial lag, they require a different modelling and estimation strategy. 334 | 335 | 336 | -------------------------------------------------------------------------------- /05-sentiment-analysis.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Sentiment analysis: Appendix" 3 | author: "Francisco Rowe ([`@fcorowe`](http://twitter.com/fcorowe))" 4 | date: "`r Sys.Date()`" 5 | output: tint::tintHtml 6 | bibliography: skeleton.bib 7 | link-citations: yes 8 | --- 9 | 10 | ```{r setup, include=FALSE} 11 | library(tint) 12 | # handle spatial data 13 | library(sf) 14 | library(spdep) 15 | # manipulate data 16 | library(tidyverse) 17 | # sentiment analysis 18 | library(vader) 19 | # create maps 20 | library(tmap) 21 | # nice colour schemes 22 | library(viridis) 23 | library(viridisLite) 24 | # invalidate cache when the package version changes 25 | knitr::opts_chunk$set(tidy = FALSE, cache.extra = packageVersion('tint'), class.source = "col-source") 26 | options(htmltools.dir.version = FALSE) 27 | ``` 28 | 29 | 30 | ```{css, echo=FALSE} 31 | .col-source { 32 | background-color: #E5E7E9; 33 | border: 3px #000000; 34 | } 35 | ``` 36 | 37 | ```{marginfigure} 38 | [**Back**](04-spatial_econometrics.html) \ 39 | ``` 40 | 41 | This notebook contains the code to compute sentiment analysis scores for a sample of tweets about public opinion on migration, originating from the United Kingdom between January 1st and December 31st 2019.
42 | 43 | # Data 44 | ```{r} 45 | df <- read_csv("./data/uk_geo_tweets_01012019_31012019.csv") 46 | head(df) 47 | ``` 48 | 49 | # Compute sentiment scores 50 | 51 | ```{r, warning=FALSE} 52 | vader_sentiment <- vader_df(df$text) 53 | ``` 54 | # Output 55 | 56 | ```{r} 57 | final_df <- cbind(df$tweet_id, df$created_at, df$place_name, df$full_place_name, df$lat, df$long, df$exact_coords, df$place_type, df$country_code, df$username, vader_sentiment) %>% 58 | rename( 59 | tweet_id = "df$tweet_id", 60 | created_at = "df$created_at", 61 | place_name = "df$place_name", 62 | full_place_name = "df$full_place_name", 63 | lat = "df$long", 64 | long = "df$lat", 65 | exact_coords = "df$exact_coords", 66 | place_type = "df$place_type", 67 | country_code = "df$country_code", 68 | username = "df$username" 69 | ) 70 | 71 | ``` 72 | 73 | 74 | # Save 75 | ```{r} 76 | write_csv(final_df, "./data/uk-sentiment-data.csv") 77 | ``` 78 | 79 | 80 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2021 Francisco Rowe 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Introduction to Geographic Data Science 2 | 3 | [Francisco Rowe](http://www.franciscorowe.com) [[`@fcorowe`](http://twitter.com/fcorowe)]1* 4 | 5 | 1 *Geographic Data Science Lab, University of Liverpool, Liverpool, United Kingdom* 6 | 7 | * *Correspondence*: 8 | F.Rowe-Gonzalez@liverpool.ac.uk 9 | 10 | # Description 11 | 12 | This workshop offers an introduction to *Geographic Data Science*. It introduces fundamental concepts of geographic data science using a hands-on approach in *R*. It offers an overview of various types of spatial data, key challenges of working with these data, and some basic analytical techniques. 13 | 14 | # Aims 15 | 16 | This workshop aims to provide an introduction to geographic data science, its core concepts and ideas.
17 | 18 | The workshop is structured as follows: 19 | 20 | * [**What is geographic data science?**](01-gds.html) 21 | * [**Spatial data**](02-spatial-data.html) 22 | * [**Spatial weights**](03-spatial_weights.html) 23 | * [**Spatial autocorrelation**](04-spatial_econometrics.html) 24 | 25 | ## Citation 26 | 27 | If you use the material, code or processed data, you can give appropriate attribution by using the following citation: 28 | 29 | ``` 30 | @article{rowe_gds22, 31 | author = {Francisco Rowe}, 32 | title = {Introduction to Geographic Data Science}, 33 | year = 2022, 34 | url = {fcorowe.github.io/intro-gds/}, 35 | doi = {https://doi.org/10.17605/OSF.IO/VHY2P}, 36 | } 37 | ``` 38 | -------------------------------------------------------------------------------- /data/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/data/.DS_Store -------------------------------------------------------------------------------- /data/Liverpool_MSOA.dbf: -------------------------------------------------------------------------------- 1 | u=!yWMSOA_CDCPLAD_CDCPpopN H_VbadN H_badN H_fairN H_goodN H_VgoodN age_menNage_medNage_60NS_RentNEthnicNillnessN unempNmalesN E02001347 E08000012 7653 188 567 1003 2258 3637 37.932580000000002 38.000000000000000 0.190382856396184 0.287141073657928 0.085456683653469 7653 0.093368700265252 3655 E02001348 E08000012 7222 146 402 982 2316 3376 37.928550000000001 37.000000000000000 0.196206037108834 0.057814485387548 0.044032124065356 7222 0.085006518904824 3526 E02001349 E08000012 7966 114 410 1108 2478 3856 38.091510000000000 38.000000000000000 0.182776801405975 0.066043249561660 0.046447401456189 7966 0.078646650235268 3982 E02001350 E08000012 7856 229 649 1191 2285 3502 38.674259999999997 38.000000000000000 0.211558044806517 0.393327480245830 0.091013238289206 7856 0.116736990154712 3661 E02001351 E08000012 8894 219 
765 1451 2744 3715 37.534970000000001 36.000000000000000 0.182145266471779 0.429871977240398 0.074994378232516 8894 0.135313531353135 4580 E02001352 E08000012 7416 152 570 1222 2348 3124 36.871630000000003 35.000000000000000 0.163565264293420 0.303348653972423 0.089940668824164 7416 0.135404789053592 4206 E02001353 E08000012 7572 115 350 801 2279 4027 36.616480000000003 38.000000000000000 0.145932382461701 0.065900642108821 0.081088219756999 7572 0.063170163170163 3645 E02001354 E08000012 7613 193 598 1176 2279 3367 37.956260000000000 37.000000000000000 0.202285564166557 0.385265700483092 0.048075660055169 7613 0.117351215423303 3642 E02001355 E08000012 6664 154 532 1144 2041 2793 39.106839999999998 38.000000000000000 0.208283313325330 0.233590733590734 0.048619447779112 6664 0.128637059724349 3239 E02001356 E08000012 5584 139 443 884 1576 2542 35.238540000000000 33.000000000000000 0.178366762177650 0.565989847715736 0.075931232091691 5584 0.174468085106383 2549 E02001357 E08000012 6982 192 616 1176 2017 2981 38.345320000000001 38.000000000000000 0.211830421082784 0.493002099370189 0.048839873961616 6982 0.141981613891726 3157 E02001358 E08000012 7069 142 562 1142 2260 2963 37.121940000000002 35.000000000000000 0.187296647333428 0.194782608695652 0.054321686235677 7069 0.145071658379643 3423 E02001359 E08000012 5751 116 355 750 1785 2745 40.416620000000002 42.000000000000000 0.220309511389324 0.158075601374570 0.087810815510346 5751 0.072775388686735 2768 E02001360 E08000012 8358 273 864 1520 2432 3269 38.569389999999999 37.000000000000000 0.208303421871261 0.478692493946731 0.089375448671931 8358 0.160900604063701 4222 E02001361 E08000012 7618 168 590 1207 2478 3175 39.730510000000002 40.000000000000000 0.222893147807824 0.308441558441558 0.041218167498031 7618 0.114317425083241 3611 E02001362 E08000012 7401 177 582 1220 2224 3198 42.437910000000002 44.000000000000000 0.259019051479530 0.247073710850997 0.034049452776652 7401 0.092075892857143 3498 E02001363 
E08000012 7788 206 646 1186 2305 3445 36.527090000000001 35.000000000000000 0.182203389830508 0.432059752944556 0.090010272213662 7788 0.146566647432198 3578 E02001364 E08000012 5311 136 480 1039 1686 1970 37.903590000000001 37.000000000000000 0.199209188476746 0.421011058451817 0.099792882696291 5311 0.176820888685295 2561 E02001365 E08000012 9049 185 558 1295 2978 4033 35.429549999999999 33.000000000000000 0.164990606696873 0.124414880512441 0.113714222566029 9049 0.148479849163328 4480 E02001366 E08000012 8985 204 621 1373 2828 3959 38.399439999999998 37.000000000000000 0.193544796883695 0.166625827003185 0.096716750139121 8985 0.129621380846325 4535 E02001367 E08000012 9304 125 452 1159 2898 4670 42.018490000000000 43.000000000000000 0.259028374892519 0.087077938563087 0.055782459157352 9304 0.061013443640124 4444 E02001368 E08000012 5847 237 710 1121 1609 2170 41.381900000000002 41.000000000000000 0.257739011458868 0.620287868403016 0.082777492731315 5847 0.176013805004314 2836 E02001386 E08000012 7545 262 742 1378 2307 2856 41.923789999999997 43.000000000000000 0.272100728959576 0.495800671892497 0.063618290258449 7545 0.126795752654591 3571 E02001369 E08000012 7655 279 857 1307 2139 3073 39.524360000000001 37.000000000000000 0.222468974526453 0.515862780500387 0.142521227955585 7655 0.147109330280481 3943 E02001370 E08000012 6186 241 749 1165 1766 2265 40.155189999999997 40.000000000000000 0.251374070481733 0.596891849032667 0.143388296152603 6186 0.240120274914089 3068 E02001371 E08000012 7447 178 571 1087 2201 3410 35.511879999999998 34.000000000000000 0.169464213777360 0.388491547464239 0.107560091311938 7447 0.115031129558257 3408 E02001372 E08000012 6221 214 447 925 1887 2748 41.604720000000000 42.000000000000000 0.241761774634303 0.291055718475073 0.061887156405723 6221 0.101611312068399 3018 E02001373 E08000012 9253 180 608 1362 2966 4137 37.661409999999997 37.000000000000000 0.190100507943370 0.194424064563463 0.102345185345293 9253 0.121166306695464 
4528 E02001374 E08000012 8199 187 614 1221 2683 3494 35.841200000000001 33.000000000000000 0.142212464934748 0.286723507917174 0.276253201609952 8199 0.170055010762975 4499 E02001375 E08000012 7195 214 565 1139 2279 2998 40.636550000000000 40.000000000000000 0.241417651146630 0.200187382885696 0.073940236275191 7195 0.114510489510490 3352 E02001376 E08000012 8469 195 612 1317 2723 3622 34.810369999999999 32.000000000000000 0.159995276892195 0.272895223013988 0.349509977565238 8469 0.169152903569526 4285 E02001377 E08000012 11467 205 606 1258 3544 5854 30.371590000000001 23.000000000000000 0.105520188366617 0.381737632291410 0.446411441527863 11467 0.100694444444444 5792 E02001378 E08000012 7972 170 550 1261 2355 3636 39.817360000000001 40.000000000000000 0.217134972403412 0.263143098116390 0.128951329653788 7972 0.095564005069708 3920 E02001380 E08000012 7526 112 359 1005 2331 3719 41.300690000000003 43.000000000000000 0.244751528036141 0.055181264035932 0.072149880414563 7526 0.058174523570712 3622 E02001381 E08000012 9286 175 589 1154 2927 4441 33.851390000000002 30.000000000000000 0.138272668533276 0.225185528756957 0.319620934740469 9286 0.123454111520937 4697 E02001382 E08000012 7098 75 254 791 2149 3829 39.982390000000002 41.000000000000000 0.218089602704987 0.026959022286125 0.085798816568047 7098 0.051639555899819 3438 E02001383 E08000012 7501 139 494 1030 2404 3434 33.231569999999998 28.000000000000000 0.112385015331289 0.569248826291080 0.451939741367818 7501 0.141721491228070 4118 E02001384 E08000012 10762 117 393 1026 3189 6037 29.646439999999998 22.000000000000000 0.094870841850957 0.156812991626491 0.271139193458465 10762 0.087471898630697 5418 E02001385 E08000012 8157 195 584 1155 2468 3755 31.145270000000000 28.000000000000000 0.121981120509991 0.591297163315891 0.708839033958563 8157 0.243305785123967 4245 E02001387 E08000012 8025 60 269 835 2379 4482 40.595260000000003 41.000000000000000 0.233021806853583 0.018659076533839 0.117632398753894 8025 
0.042681512410114 3826 E02001388 E08000012 8860 79 274 845 2667 4995 32.732280000000003 26.000000000000000 0.118623024830700 0.098321342925659 0.174492099322799 8860 0.051818950930626 4471 E02001389 E08000012 6232 123 456 893 1887 2873 36.991819999999997 33.000000000000000 0.144897304236200 0.363101604278075 0.361360718870347 6232 0.131179232256283 3440 E02001390 E08000012 8756 329 929 1448 2474 3576 37.152119999999996 35.000000000000000 0.188784833257195 0.601367855398144 0.282320694380996 8756 0.170153417015342 4363 E02001391 E08000012 5970 154 520 1017 1810 2469 39.277220000000000 40.000000000000000 0.236348408710218 0.411527609834744 0.050921273031826 5970 0.119969336910694 2741 E02001392 E08000012 5717 59 259 843 1912 2644 46.969389999999997 49.000000000000000 0.351757914990380 0.033646322378717 0.086758789574952 5717 0.054375225063018 2778 E02001393 E08000012 7753 70 206 820 2393 4264 42.477750000000000 46.000000000000000 0.273829485360506 0.018629173989455 0.129627241067974 7753 0.037707006369427 3692 E02001394 E08000012 10143 166 597 1220 2941 5219 37.080649999999999 31.000000000000000 0.181997436655822 0.254312354312354 0.172729961549837 10143 0.072007912957468 5098 E02001395 E08000012 6056 76 245 757 1886 3092 41.371369999999999 41.000000000000000 0.235799207397622 0.038667687595712 0.107496697490093 6056 0.052067836953288 2905 E02001396 E08000012 8035 155 567 1066 2511 3736 37.314749999999997 33.000000000000000 0.180584940883634 0.224093925472180 0.146857498444306 8035 0.105012474484010 4006 E02001397 E08000012 7631 138 540 1182 2278 3493 44.723500000000001 46.000000000000000 0.298519198008125 0.247920997920998 0.073515921897523 7631 0.070490485124631 3708 E02001398 E08000012 7758 127 415 1026 2355 3835 39.560070000000003 35.000000000000000 0.205078628512503 0.209179415855355 0.178009796339263 7758 0.075665576833255 3808 E02001399 E08000012 7276 76 327 830 2254 3789 42.108020000000003 44.000000000000000 0.254672897196262 0.075017445917655 
0.082875206157229 7276 0.049947145877378 3524 E02001400 E08000012 6614 154 495 1055 1861 3049 41.600239999999999 43.000000000000000 0.235258542485637 0.264882000704473 0.107801632899909 6614 0.094310921813204 3182 E02001401 E08000012 7872 122 463 1108 2448 3731 43.475230000000003 45.000000000000000 0.287601626016260 0.093667157584683 0.081808943089431 7872 0.059716859716860 3772 E02001402 E08000012 8384 83 385 1021 2511 4384 41.809519999999999 42.000000000000000 0.248449427480916 0.084529505582137 0.110329198473282 8384 0.052943760984183 4140 E02001403 E08000012 6035 154 422 873 1766 2820 36.986739999999998 35.000000000000000 0.176139188069594 0.282294966561070 0.109030654515327 6035 0.118955512572534 3050 E02001404 E08000012 7260 176 566 1069 2196 3253 36.544490000000003 35.000000000000000 0.173691460055096 0.408240887480190 0.073415977961433 7260 0.134609720176730 3452 E02001405 E08000012 9609 300 841 1462 2854 4152 35.377350000000000 32.000000000000000 0.175980851285253 0.542609532980260 0.091580809657613 9609 0.173945409429280 4446 E02006932 E08000012 9949 43 156 566 3148 6036 24.955469999999998 21.000000000000000 0.025932254497939 0.182442986566698 0.379033068650116 9949 0.040554738390418 5172 E02006933 E08000012 5436 93 212 452 1611 3068 33.727919999999997 29.000000000000000 0.105040470934511 0.183413078149920 0.286423841059603 5436 0.056301050175029 3072 E02006934 E08000012 5202 37 102 303 1568 3192 29.359480000000001 26.000000000000000 0.037293348712034 0.060388209920920 0.308150711264898 5202 0.046132596685083 3117 -------------------------------------------------------------------------------- /data/Liverpool_MSOA.prj: -------------------------------------------------------------------------------- 1 | PROJCS["Transverse_Mercator",GEOGCS["GCS_OSGB 
1936",DATUM["D_OSGB_1936",SPHEROID["Airy_1830",6377563.396,299.3249646]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",49],PARAMETER["central_meridian",-2],PARAMETER["scale_factor",0.9996012717],PARAMETER["false_easting",400000],PARAMETER["false_northing",-100000],UNIT["Meter",1]] -------------------------------------------------------------------------------- /data/Liverpool_MSOA.shp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/data/Liverpool_MSOA.shp -------------------------------------------------------------------------------- /data/Liverpool_MSOA.shx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/data/Liverpool_MSOA.shx -------------------------------------------------------------------------------- /data/Liverpool_OA.prj: -------------------------------------------------------------------------------- 1 | PROJCS["Transverse_Mercator",GEOGCS["GCS_OSGB 1936",DATUM["D_OSGB_1936",SPHEROID["Airy_1830",6377563.396,299.3249646]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",49],PARAMETER["central_meridian",-2],PARAMETER["scale_factor",0.9996012717],PARAMETER["false_easting",400000],PARAMETER["false_northing",-100000],UNIT["Meter",1]] -------------------------------------------------------------------------------- /data/Liverpool_OA.shp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/data/Liverpool_OA.shp -------------------------------------------------------------------------------- /data/Liverpool_OA.shx: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/data/Liverpool_OA.shx -------------------------------------------------------------------------------- /data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/LAD_MAY_2021_UK_BFE_V2.cpg: -------------------------------------------------------------------------------- 1 | UTF-8 -------------------------------------------------------------------------------- /data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/LAD_MAY_2021_UK_BFE_V2.dbf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/LAD_MAY_2021_UK_BFE_V2.dbf -------------------------------------------------------------------------------- /data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/LAD_MAY_2021_UK_BFE_V2.prj: -------------------------------------------------------------------------------- 1 | PROJCS["British_National_Grid",GEOGCS["GCS_OSGB_1936",DATUM["D_OSGB_1936",SPHEROID["Airy_1830",6377563.396,299.3249646]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Transverse_Mercator"],PARAMETER["False_Easting",400000.0],PARAMETER["False_Northing",-100000.0],PARAMETER["Central_Meridian",-2.0],PARAMETER["Scale_Factor",0.9996012717],PARAMETER["Latitude_Of_Origin",49.0],UNIT["Meter",1.0]] -------------------------------------------------------------------------------- /data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/LAD_MAY_2021_UK_BFE_V2.shp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/LAD_MAY_2021_UK_BFE_V2.shp 
-------------------------------------------------------------------------------- /data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/LAD_MAY_2021_UK_BFE_V2.shx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/LAD_MAY_2021_UK_BFE_V2.shx -------------------------------------------------------------------------------- /data/Local_Authority_Districts_(May_2021)_UK_BFE_V3/Local_Authority_Districts_(May_2021)_UK_BFE_V3.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | utf8 4 | 5 | 6 | dataset 7 | 8 | 9 | dataset 10 | 11 | 12 | 13 | 14 | ISO 19139 Geographic Information - Metadata - Implementation Specification 15 | 16 | 17 | 2007 18 | 19 | 20 | 21 | 22 | 23 | This file contains the digital vector boundaries for Local Authority Districts in the United Kingdom as at May 2021. The May release boundaries are based on the draft version of Ordnance Survey's Boundary Line product so may be subject to change.The boundaries available are: Full resolution - extent of the realm (usually this is the Mean Low Water mark but in some cases boundaries extend beyond this to include off shore islands).Contains both Ordnance Survey and ONS Intellectual Property Rights. 
REST URL of ArcGIS for INSPIRE View Service – https://ons-inspire.esriuk.com/arcgis/rest/services/Administrative_Boundaries/Local_Authority_Districts_May_2021_UK_BFE/MapServer/exts/InspireView REST URL of ArcGIS for INSPIRE Feature Download Service – https://ons-inspire.esriuk.com/arcgis/rest/services/Administrative_Boundaries/Local_Authority_Districts_May_2021_UK_BFE/MapServer/exts/InspireFeatureDownload REST URL of Feature Access Service – https://ons-inspire.esriuk.com/arcgis/rest/services/Administrative_Boundaries/Local_Authority_Districts_May_2021_UK_BFE/FeatureServer 24 | 25 | 26 | Boundaries 27 | 28 | 29 | Office for National Statistics 30 | 31 | 32 | 33 | 34 | Boundaries 35 | 36 | 37 | Local Authority Districts 38 | 39 | 40 | LAD 41 | 42 | 43 | UK 44 | 45 | 46 | WFS 47 | 48 | 49 | WMS 50 | 51 | 52 | Administrative Boundaries 53 | 54 | 55 | BDY_ADM 56 | 57 | 58 | BDY_LAD 59 | 60 | 61 | MAY_2021 62 | 63 | 64 | 65 | 66 | 67 |
374 | https://www.ons.gov.uk/methodology/geography/licences 375 | 376 | 377 | 378 | 379 | 380 | utf8 381 | 382 | 383 | 384 | 385 | -------------------------------------------------------------------------------- /data/census_data.csv: -------------------------------------------------------------------------------- 1 |
code,ward,pop16_74,higher_managerial,pop,ghealth 2 | E05000886,Allerton and Hunts Cross,10930,1103,14853,7274 3 | E05000887,Anfield,10712,312,14510,6124 4 | E05000888,Belle Vale,10987,432,15004,6129 5 | E05000889,Central,19174,1346,20340,11925 6 | E05000890,Childwall,10410,1123,13908,7219 7 | E05000891,Church,10569,1843,13974,7461 8 | E05000892,Clubmoor,11004,315,15272,6403 9 | E05000893,County,10555,280,14045,5930 10 | E05000894,Cressington,10887,1249,14503,7094 11 | E05000895,Croxteth,10491,644,14561,6992 12 | E05000896,Everton,11151,331,14782,5517 13 | E05000897,Fazakerley,12522,596,16786,7879 14 | E05000898,Greenbank,14121,1156,16132,8990 15 | E05000899,Kensington and Fairfield,11828,373,15377,6495 16 | E05000900,Kirkdale,12691,559,16115,6662 17 | E05000901,Knotty Ash,9598,502,13312,5981 18 | E05000902,Mossley Hill,10645,1529,13816,7322 19 | E05000903,Norris Green,10670,258,15047,6529 20 | E05000904,Old Swan,11989,520,16461,7192 21 | E05000905,Picton,13385,485,17009,7953 22 | E05000906,Princes Park,13315,851,17104,7636 23 | E05000907,Riverside,15107,1617,18422,9001 24 | E05000908,St Michael's,10861,1598,12991,6450 25 | E05000909,Speke-Garston,14531,378,20300,8973 26 | E05000910,Tuebrook and Stoneycroft,12434,537,16489,7302 27 | E05000911,Warbreck,12661,621,16481,7521 28 | E05000912,Wavertree,11478,1099,14772,7268 29 | E05000913,West Derby,10735,838,14382,7013 30 | E05000914,Woolton,9397,1447,12921,6025 31 | E05000915,Yew Tree,12038,626,16746,7717 -------------------------------------------------------------------------------- /data/census_data2.csv: -------------------------------------------------------------------------------- 1 | geo_code,households,socialrented_households 2 | E05000886,6359,827 3 | E05000887,6622,1508 4 | E05000888,6622,2818 5 | E05000889,7139,1311 6 | E05000890,5391,374 7 | E05000891,5884,178 8 | E05000892,6576,2859 9 | E05000893,6745,1564 10 | E05000894,6317,1023 11 | E05000895,6024,1558 12 | E05000896,7383,4279 13 | E05000897,6806,1095 
14 | E05000898,6854,1426 15 | E05000899,7390,2286 16 | E05000900,7751,3655 17 | E05000901,5667,1691 18 | E05000902,5619,573 19 | E05000903,6412,3267 20 | E05000904,7398,1560 21 | E05000905,7563,2629 22 | E05000906,8877,5148 23 | E05000907,9538,3240 24 | E05000908,6495,1321 25 | E05000909,8959,4148 26 | E05000910,7477,1356 27 | E05000911,6909,1042 28 | E05000912,6595,998 29 | E05000913,6066,697 30 | E05000914,6030,803 31 | E05000915,7047,2251 -------------------------------------------------------------------------------- /figs/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/figs/.DS_Store -------------------------------------------------------------------------------- /figs/sources_gds.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/figs/sources_gds.png -------------------------------------------------------------------------------- /figs/spatial_weight.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/figs/spatial_weight.png -------------------------------------------------------------------------------- /index.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Introduction to Geographic Data Science" 3 | author: "Francisco Rowe ([`@fcorowe`](http://twitter.com/fcorowe))" 4 | date: "`r Sys.Date()`" 5 | output: tint::tintHtml 6 | bibliography: skeleton.bib 7 | link-citations: yes 8 | --- 9 | 10 | ```{r setup, include=FALSE} 11 | library(tint) 12 | # invalidate cache when the package version changes 13 | knitr::opts_chunk$set(tidy = FALSE, cache.extra = packageVersion('tint')) 14 | options(htmltools.dir.version = FALSE) 15 
| ``` 16 | 17 | ```{marginfigure} 18 | [**Next**](01-gds.html) 19 | ``` 20 | 21 | # Description 22 | 23 | This workshop offers an introduction to *Geographic Data Science*. It introduces fundamental concepts of geographic data science using a hands-on approach in *R*. It offers an overview of various types of spatial data, key challenges of working with these data, and some basic analytical techniques. 24 | 25 | # Structure 26 | 27 | The workshop is structured as follows: 28 | 29 | * [**What is geographic data science?**](01-gds.html) 30 | * [**Spatial data**](02-spatial-data.html) 31 | * [**Spatial weights**](03-spatial_weights.html) 32 | * [**Spatial autocorrelation**](04-spatial_econometrics.html) 33 | 34 | 35 | # Resources 36 | 37 | All the course material is available on GitHub and you can download it [**here**](https://github.com/fcorowe/udd_gds_course/archive/refs/heads/main.zip). Once you have downloaded it, ensure it is in a safe place on your computer. 38 | 39 | # Computational Environment 40 | 41 | You need the most recent version of R and of the required packages. These can be installed following the instructions provided in our [R installation guide](https://gdsl-ul.github.io/r_install/). 42 | 43 | ## Dependency list 44 | 45 | Ensure you have installed the following libraries: 46 | 47 | * `knitr` 48 | * `leaflet` 49 | * `rgdal` 50 | * `sf` 51 | * `sp` 52 | * `spdep` 53 | * `tidyverse` 54 | * `tint` 55 | * `tmap` 56 | * `viridis` 57 | * `viridisLite` 58 | 59 | 60 | You can get the materials from this course as a [download](https://github.com/fcorowe/intro-gds/archive/refs/heads/main.zip) of a .zip file or by going directly to the [GitHub repository](https://github.com/fcorowe/intro-gds).
61 | 62 | 63 | -------------------------------------------------------------------------------- /index.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/index.pdf -------------------------------------------------------------------------------- /intro-gds.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | BuildType: Website 16 | -------------------------------------------------------------------------------- /refs.bib: -------------------------------------------------------------------------------- 1 | @article{rowe2021big, 2 | title={Big data}, 3 | author={Rowe, F}, 4 | journal={Concise Encyclopaedia of Human Geography, Elgar Encyclopedias in the Social Sciences Series}, 5 | year={2021}, 6 | url={https://doi.org/10.31235/osf.io/phz3e} 7 | } 8 | 9 | @article{arribas2021open, 10 | title={Open data products-A framework for creating valuable analysis ready data}, 11 | author={Arribas-Bel, Dani and Green, Mark and Rowe, Francisco and Singleton, Alex}, 12 | journal={Journal of Geographical Systems}, 13 | volume={23}, 14 | number={4}, 15 | pages={497--514}, 16 | year={2021}, 17 | publisher={Springer} 18 | } 19 | 20 | @Manual{R-knitr, 21 | title = {knitr: A General-Purpose Package for Dynamic Report Generation in R}, 22 | author = {Yihui Xie}, 23 | year = {2022}, 24 | note = {R package version 1.39}, 25 | url = {https://yihui.org/knitr/}, 26 | } 27 | 28 | @Book{knitr2015, 29 | title = {Dynamic Documents with {R} and knitr}, 30 | author = {Yihui Xie}, 31 | publisher = {Chapman and Hall/CRC}, 32 | address = {Boca Raton, Florida}, 33 | year = {2015}, 
34 | edition = {2nd}, 35 | note = {ISBN 978-1498716963}, 36 | url = {https://yihui.org/knitr/}, 37 | } 38 | 39 | @InCollection{knitr2014, 40 | booktitle = {Implementing Reproducible Computational Research}, 41 | editor = {Victoria Stodden and Friedrich Leisch and Roger D. Peng}, 42 | title = {knitr: A Comprehensive Tool for Reproducible Research in {R}}, 43 | author = {Yihui Xie}, 44 | publisher = {Chapman and Hall/CRC}, 45 | year = {2014}, 46 | note = {ISBN 978-1466561595}, 47 | url = {http://www.crcpress.com/product/isbn/9781466561595}, 48 | } 49 | 50 | 51 | @article{singleton2019, 52 | title = {Geographic Data Science}, 53 | author = {Singleton, Alex and {Arribas{-}Bel}, Daniel}, 54 | year = {2019}, 55 | month = {04}, 56 | date = {2019-04-04}, 57 | journal = {Geographical Analysis}, 58 | pages = {61--75}, 59 | volume = {53}, 60 | number = {1}, 61 | doi = {10.1111/gean.12194}, 62 | url = {http://dx.doi.org/10.1111/gean.12194}, 63 | langid = {en} 64 | } 65 | -------------------------------------------------------------------------------- /rmd/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/fcorowe/intro-gds/ddd3458d92677759aee070ab3f6d553e24f6ae77/rmd/.DS_Store -------------------------------------------------------------------------------- /skeleton.bib: -------------------------------------------------------------------------------- 1 | @Manual{R-base, 2 | title = {R: A Language and Environment for Statistical Computing}, 3 | author = {{R Core Team}}, 4 | organization = {R Foundation for Statistical Computing}, 5 | address = {Vienna, Austria}, 6 | year = {2022}, 7 | url = {https://www.R-project.org/}, 8 | } 9 | 10 | @Manual{R-rmarkdown, 11 | title = {rmarkdown: Dynamic Documents for R}, 12 | author = {JJ Allaire and Yihui Xie and Jonathan McPherson and Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley Wickham and Joe Cheng and Winston Chang and Richard Iannone}, 13 | year = 
{2022}, 14 | note = {R package version 2.14}, 15 | url = {https://CRAN.R-project.org/package=rmarkdown}, 16 | } 17 | 18 | @Book{rmarkdown2018, 19 | title = {R Markdown: The Definitive Guide}, 20 | author = {Yihui Xie and J.J. Allaire and Garrett Grolemund}, 21 | publisher = {Chapman and Hall/CRC}, 22 | address = {Boca Raton, Florida}, 23 | year = {2018}, 24 | note = {ISBN 9781138359338}, 25 | url = {https://bookdown.org/yihui/rmarkdown}, 26 | } 27 | 28 | @Book{rmarkdown2020, 29 | title = {R Markdown Cookbook}, 30 | author = {Yihui Xie and Christophe Dervieux and Emily Riederer}, 31 | publisher = {Chapman and Hall/CRC}, 32 | address = {Boca Raton, Florida}, 33 | year = {2020}, 34 | note = {ISBN 9780367563837}, 35 | url = {https://bookdown.org/yihui/rmarkdown-cookbook}, 36 | } 37 | 38 | --------------------------------------------------------------------------------