├── rmarkdown-cheatsheet.pdf
├── UnderGrad_Dissertation_Rmd.pdf
├── README.md
├── RMarkdown_Tutorial_Demo_Rmd.Rmd
├── RMarkdown_Tutorial.R
├── RMarkdown_Demo_2.R
├── RMarkdown_Demo_3.R
└── RMarkdown_Demo_1.R

/rmarkdown-cheatsheet.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ourcodingclub/CC-2-RMarkdown/HEAD/rmarkdown-cheatsheet.pdf

--------------------------------------------------------------------------------
/UnderGrad_Dissertation_Rmd.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ourcodingclub/CC-2-RMarkdown/HEAD/UnderGrad_Dissertation_Rmd.pdf

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# CC-2-RMarkdown
Using R Markdown to construct reproducible code

This repository contains the files necessary to complete the Coding Club R Markdown tutorial - you can check it out at
https://ourcodingclub.github.io/2016/11/24/rmarkdown-1.html

`RMarkdown_Tutorial.R` provides a basic R script to work through alongside the online tutorial material and turn into an R Markdown document.

The data (`edidiv.csv`) were downloaded from the NBN Gateway https://data.nbn.org.uk/ for educational purposes.

`rmarkdown-cheatsheet.pdf` was downloaded from https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf

The 3 demo R scripts (`RMarkdown_Demo_1.R`, `RMarkdown_Demo_2.R`, `RMarkdown_Demo_3.R`) are provided as examples which can be easily turned into R Markdown files.

For more about Coding Club, please see https://ourcodingclub.github.io/

Check out https://ourcodingclub.github.io/workshop/ to learn how you can get involved!
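
As a rough sketch of what turning one of the demo scripts into an R Markdown file involves, the base R snippet below assembles a small `.Rmd` document (YAML header, a line of prose, and two code chunks) from lines adapted from `RMarkdown_Demo_2.R` and writes it to a temporary file. The helper `make_chunk()`, the title, and the file name `demo_sketch.Rmd` are made up for this illustration and are not part of the repository.

```r
# Three backticks, built with strrep() so this example does not nest code fences
fence <- strrep("`", 3)

# Hypothetical helper (not in the repository): wrap lines of R code in a chunk,
# optionally with chunk options such as fig.align="center"
make_chunk <- function(code, options = "") {
  header <- paste0(fence, "{r",
                   if (nzchar(options)) paste0(", ", options) else "",
                   "}")
  c(header, code, fence)
}

# The pieces of an R Markdown file: YAML header, headings, prose and chunks
rmd_lines <- c(
  "---",
  'title: "Loblolly pine growth"',  # placeholder title
  "output: html_document",
  "---",
  "",
  "## Data exploration",
  "",
  "Height of loblolly pines from different seed stocks:",
  "",
  make_chunk(c("pine_growth <- datasets::Loblolly",
               "head(pine_growth)")),
  "",
  make_chunk("boxplot(height ~ Seed, data = pine_growth)",
             options = 'fig.align="center"')
)

# Write the document to a temporary location (assumed file name)
rmd_file <- file.path(tempdir(), "demo_sketch.Rmd")
writeLines(rmd_lines, rmd_file)

# Rendering needs the rmarkdown package and pandoc, so it is left commented out:
# rmarkdown::render(rmd_file)
```

Opening the resulting file in RStudio and knitting it, or running `rmarkdown::render(rmd_file)` where the rmarkdown package and pandoc are installed, should produce an HTML report.
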

We would love to hear your feedback on the tutorial, whether you did it at a Coding Club workshop or online:
https://www.surveymonkey.co.uk/r/F5PDDHV

This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/).

[![License: CC BY-SA 4.0](https://licensebuttons.net/l/by-sa/4.0/80x15.png)](https://creativecommons.org/licenses/by-sa/4.0/)

--------------------------------------------------------------------------------
/RMarkdown_Tutorial_Demo_Rmd.Rmd:
--------------------------------------------------------------------------------
---
title: "R Markdown Tutorial Demo"
author: "John Doe"
date: "21/11/2016"
output: html_document
---

## Preamble

### Packages

```{r, message=FALSE, warning=FALSE}
library(dplyr)   # for data manipulation
library(pander)  # to create pretty tables
```

```{r, include=FALSE}
edidiv <- read.csv("edidiv.csv")
```

## Data Exploration

A preliminary investigation into the biodiversity of Edinburgh, using data from the NBN Gateway https://data.nbn.org.uk/.

### What is the species richness across taxonomic groups?

A table of species richness:
```{r, results='asis'}
richness <-
  edidiv %>%
    group_by(taxonGroup) %>%
    summarise(Species_richness = n_distinct(taxonName))

pander(richness)
```

A barplot of the table above:
```{r, fig.align="center", fig.width=15, fig.height=8}
barplot(richness$Species_richness,
        names.arg = richness$taxonGroup,
        xlab = "Taxa", ylab = "Number of species",
        ylim = c(0, 600)
        )
```

### What is the most common species in each taxonomic group?

A table of the most common species:
```{r}
# Create a table of the most abundant species in each taxonomic group
max_abund <-
  edidiv %>%
    group_by(taxonGroup) %>%
    summarise(taxonName = names(which.max(table(taxonName))))

# Join it to the species richness table
richness_abund <-
  inner_join(richness, max_abund, by = "taxonGroup")
richness_abund <- rename(richness_abund, Most_abundant = taxonName, Taxon = taxonGroup)
```

```{r}
richness_abund <- rename(richness_abund,
                         "Most Abundant" = Most_abundant,
                         "Species Richness" = Species_richness)  # Change the column names
emphasize.italics.cols(3)  # Make the 3rd column italics
pander(richness_abund)     # Create a table
```

--------------------------------------------------------------------------------
/RMarkdown_Tutorial.R:
--------------------------------------------------------------------------------
# Coding Club Workshop 7 R Markdown and reproducible code - Template R script
# Written by John Godlee
# 21/11/16
# University of Edinburgh

# Use this example R script to practice compiling an R Markdown file, using the tutorial materials provided at: ourcodingclub.github.io/2016/11/24/rmarkdown-1.html

# Follow through the tutorial to make a well-commented record of what is going on, so that others can easily follow.

# Loading packages
library(dplyr)

# Loading biodiversity data
# These data are a publicly available set of occurrence records for many animal,
# plant, and fungal species, for 2000-2016, from the NBN Gateway

setwd("")  # add the path to the folder containing edidiv.csv between the quotes
edidiv <- read.csv("edidiv.csv")

# Constructing a table of species richness in each taxonomic group

richness <-
  edidiv %>%
    group_by(taxonGroup) %>%
    summarise(Species_richness = n_distinct(taxonName))

richness

# Creating a barplot of species richness in each taxonomic group

barplot(richness$Species_richness,
        names.arg = richness$taxonGroup,
        xlab = "Taxa", ylab = "Number of species",
        ylim = c(0, 600)
        )

# Determining what the most common species is in each taxonomic group

max_abund <-
  edidiv %>%
    group_by(taxonGroup) %>%
    summarise(taxonName = names(which.max(table(taxonName))))

max_abund

# Joining the two data frames together, using "taxonGroup" as the reference

richness_abund <- inner_join(richness, max_abund, by = "taxonGroup")

# Renaming the headers of the tables, and viewing the data frame

richness_abund <- rename(richness_abund, Most_abundant = taxonName, Taxon = taxonGroup)

richness_abund

# Things to think about:
# - Which bits of code need to be displayed in the final .html file?
# - How can the formatting of the R markdown file be improved?

# Experiment with other demo R scripts in the repo, or your own scripts for further practice!
# - RMarkdown_Demo_1.R
# - RMarkdown_Demo_2.R
# - RMarkdown_Demo_3.R

--------------------------------------------------------------------------------
/RMarkdown_Demo_2.R:
--------------------------------------------------------------------------------
#######################################################
# Example R Markdown Script                           #
# John Godlee                                         #
# 24/Jan/2017                                         #
#######################################################

# Use this example R script to practice compiling an R Markdown file.
# Try to make a well-commented record of what is going on, so that others can easily follow.

# Install and load the relevant packages ----------------------------------------------
library(datasets)  # To get the loblolly pine growth data
library(dplyr)     # To get summary statistics on the data

# Set your working directory to where you have saved your script
setwd("")  # add the path to your folder between the quotes

# Import data -------------------------------------------------------------
pine_growth <- Loblolly  # These data show the height of pine trees at different ages, from different seed stocks
head(pine_growth)

# Investigating the data ------------------------------------------------------------
# Create a simple scatterplot showing the age-height distribution
# This scatterplot can be added to your R Markdown document by putting the code in a code chunk
# Try adding some plain text to your R Markdown document to explain the scatterplot
# Note: heights in the Loblolly dataset are recorded in feet
plot(x = pine_growth$age, y = pine_growth$height, xlab = "Age (Years)", ylab = "Height (ft)", col = pine_growth$Seed)

# Create boxplots to show how different Seed stocks compare in height distribution
boxplot(height ~ Seed, data = pine_growth)

# Use a pipe to get a table of summary statistics for each Seed type
pine_growth_seedsumm <- pine_growth %>%
  group_by(Seed) %>%
  summarise("Mean Height" = mean(height), "STDEV Height" = sd(height), "Median Height" = median(height))

# Use a pipe to get a table of summary statistics for each age
pine_growth_agesumm <- pine_growth %>%
  group_by(age) %>%
  summarise("Mean Height" = mean(height), "STDEV Height" = sd(height), "Median Height" = median(height))

## Make a table of `pine_growth_seedsumm` and `pine_growth_agesumm` in your R Markdown document using pander(). The instructions can be found in the tutorial.

--------------------------------------------------------------------------------
/RMarkdown_Demo_3.R:
--------------------------------------------------------------------------------
#######################################################
# Example R Markdown Script                           #
# John Godlee                                         #
# 24/Jan/2017                                         #
#######################################################

# Use this example R script to practice compiling an R Markdown file.
# Try to make a well-commented record of what is going on, so that others can easily follow.

# Download the data set for this example script from:
# https://github.com/ourcodingclub/Datasets/tree/master/Seedling_Traits

# Install and load the relevant packages ----------------------------------------------
library(dplyr)  # To get summary statistics on the data

# Set your working directory to the folder where you have downloaded the datasets
setwd("")  # add the path to your folder between the quotes

# Import data -------------------------------------------------------------
seedlings <- read.csv("Seedling_Elevation_Traits.csv")

# Investigating the data ------------------------------------------------------------
# Create a scatterplot showing the relationship between `Soil.temp.mean` and `Elevation.m`
# This scatterplot can be added to your R Markdown document by putting the code in a code chunk
# Try adding some plain text to your R Markdown document to explain the scatterplot
plot(x = seedlings$Elevation.m, y = seedlings$Soil.temp.mean)

# Create a set of boxplots showing how `Leaf.thickness.mean.mm` varies by `Species`
boxplot(Leaf.thickness.mean.mm ~ Species,
        col = c("red", "blue", "green", "yellow", "pink", "violet", "orange", "grey", "brown"), data = seedlings)

# Use a pipe to get a table of summary statistics for each Species type

seedlings_specsumm <- seedlings %>%
  group_by(Species) %>%
  summarise("Mean Leaf Thickness" = mean(Leaf.thickness.mean.mm), "Mean Stem Width" = mean(Width.mm), "Mean SPAD" = mean(SPAD.mean))

# Use a pipe to get a table of summary statistics for each Site

seedlings_sitesumm <- seedlings %>%
  group_by(Site) %>%
  summarise("Mean Soil Temp." = mean(Soil.temp.mean), "Mean Elevation" = mean(Elevation.m), "Undergrowth density" = mean(Num.seedlings.comp))

## Make a table of `seedlings_specsumm` and `seedlings_sitesumm` in your R Markdown document using pander(). The instructions can be found in the tutorial.

--------------------------------------------------------------------------------
/RMarkdown_Demo_1.R:
--------------------------------------------------------------------------------
#######################################################
# Example R Markdown Script                           #
# Adapted from:                                       #
# Tidy data and efficient manipulation                #
# Coding Club tutorial                                #
# January 18th 2017                                   #
# Sandra Angers-Blondin (s.angers-blondin@ed.ac.uk)   #
# John Godlee                                         #
# 24/Jan/2017                                         #
#######################################################

# Use this example R script to practice compiling an R Markdown file.
# Try to make a well-commented record of what is going on, so that others can easily follow.

# Download the datasets for this example script from:
# https://github.com/ourcodingclub/CC3-DataManip

# Install and load the relevant packages ----------------------------------------------
library(dplyr)   # an excellent data manipulation package
library(tidyr)   # a package to format your data
library(pander)  # to create pretty tables

# Set your working directory to the folder where you have downloaded the datasets
setwd("")  # add the path to your folder between the quotes

# Import data -------------------------------------------------------------
elongation <- read.csv("EmpetrumElongation.csv", sep = ";")  # stem elongation measurements on crowberry
germination <- read.csv("Germination.csv", sep = ";")        # germination of seeds subjected to toxic solutions

# Tidying the data ------------------------------------------------------------
# Putting the data into long format using gather()
elongation_long <- gather(elongation, Year, Length, c(X2007, X2008, X2009, X2010, X2011, X2012))
# gather() takes its arguments in this order: data, key, value, columns to gather.
# Here we want the lengths (value) to be gathered by year (key).
# Note that, unlike in most R functions, you choose the names given as the second and third arguments yourself.
head(elongation_long)

# Investigating the data ------------------------------------------------------------
# Create a boxplot of `elongation_long` to visualise elongation for each year.
# This set of boxplots can be added to your R Markdown document by putting the code in a code chunk
boxplot(Length ~ Year,
        data = elongation_long,
        xlab = "Year",
        ylab = "Elongation (cm)",
        main = "Annual growth of Empetrum hermaphroditum")

# Use filter() to keep only the rows of `germination` for species `SR`
germinSR <- filter(germination, Species == "SR")

# Let's have a look at the distribution of germination across SR
# This histogram can be added to your R Markdown document by simply putting the code in a code chunk
# Try adding some plain text to your R Markdown document to explain the histogram
hist(germinSR$Nb_seeds_germin, breaks = 8)

# Use mutate() to create a new column of the germination percentage using the total number of seeds and the number of seeds that germinated
germin_percent <- mutate(germination, Percent = Nb_seeds_germin / Nb_seeds_tot * 100)

# Use a pipe to get a table of summary statistics of the germination percentage for each species
germin_summ <- germin_percent %>%
  group_by(Species) %>%
  summarise("Mean germination per" = mean(Percent), "Max germination per" = max(Percent), "Min germination per" = min(Percent))

## Make a table of `germin_summ` in your R Markdown document using pander(). The instructions can be found in the tutorial.

--------------------------------------------------------------------------------