├── MixedModeling-GKHajduk-script.R
├── README.md
└── dragons.RData


/MixedModeling-GKHajduk-script.R:
--------------------------------------------------------------------------------
######################################
#                                    #
#   Mixed effects modeling in R      #
#                                    #
######################################

## authors: Gabriela K Hajduk, based on a workshop developed by Liam Bailey
## contact details: gkhajduk.github.io; email: gkhajduk@gmail.com
## date: 2017-03-09
##

###---- Explore the data -----###

## Load the data and have a look at it

load("dragons.RData")

head(dragons)

## Let's say we want to know how body length affects test scores.

## Have a look at the distribution of the response variable:

hist(dragons$testScore)  # seems close to a normal distribution - good!

## It is good practice to standardise your explanatory variables before proceeding
## (centre on the mean and divide by the standard deviation) - you can use scale() to do that:

dragons$bodyLength2 <- scale(dragons$bodyLength)
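
## Quick check, added here as an illustration rather than as part of the original
## workshop script: after scale() the variable should have a mean of (numerically)
## zero and a standard deviation of one.

mean(dragons$bodyLength2)
sd(dragons$bodyLength2)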

## Back to our question: is test score affected by body length?

###---- Fit all data in one analysis -----###

## One way to analyse these data would be to fit a linear model to all of them,
## ignoring the sites and the mountain ranges for now.

library(lme4)
library(dplyr)

basic.lm <- lm(testScore ~ bodyLength2, data = dragons)

summary(basic.lm)

## Let's plot the data with ggplot2

library(ggplot2)

ggplot(dragons, aes(x = bodyLength, y = testScore)) +
  geom_point() +
  geom_smooth(method = "lm")


### Assumptions?

## Plot the residuals - the red line should be close to flat, like the dashed grey line

plot(basic.lm, which = 1)  # not perfect, but looks alright

## Have a quick look at the Q-Q plot too - points should ideally fall on the diagonal dashed line

plot(basic.lm, which = 2)  # a bit off at the extremes, but that's often the case; again, doesn't look too bad


## However, what about observation independence? Are our data independent?
## We collected multiple samples from eight mountain ranges.
## It's perfectly plausible that data from within each mountain range are more similar
## to each other than data from different mountain ranges - they are correlated.
## Pseudoreplication isn't our friend.

## Have a look at the data to see if the above is true
boxplot(testScore ~ mountainRange, data = dragons)  # certainly looks like something is going on here

## We could also plot it, colouring points by mountain range
ggplot(dragons, aes(x = bodyLength, y = testScore, colour = mountainRange)) +
  geom_point(size = 2) +
  theme_classic() +
  theme(legend.position = "none")

## From the above plots it looks like our mountain ranges vary both in dragon body length and in test scores.
## This suggests that observations from within each range aren't independent. We can't ignore that.

## So what do we do?

###----- Run multiple analyses -----###


## We could run many separate analyses and fit a regression for each of the mountain ranges.

## Let's have a quick look at the data split by mountain range
## We use facet_wrap() to do that

ggplot(dragons, aes(x = bodyLength, y = testScore)) +
  geom_point() +
  facet_wrap(~ mountainRange) +
  xlab("length") +
  ylab("test score")
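
## A minimal sketch of the "separate analyses" idea, not part of the original
## workshop script: fit one regression per mountain range and collect the
## intercepts and slopes. `range.fits` and `range.coefs` are hypothetical names.
## Each fit only sees a fraction of the data, and running many separate tests
## raises the chance of a false positive.

range.fits <- lapply(split(dragons, dragons$mountainRange),
                     function(d) lm(testScore ~ bodyLength2, data = d))
range.coefs <- t(sapply(range.fits, coef))  # one row per mountain range
range.coefs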



###----- Modify the model -----###

## We want to use all the data, but account for the data coming from different mountain ranges

## Let's add mountain range as a fixed effect to our basic.lm

mountain.lm <- lm(testScore ~ bodyLength2 + mountainRange, data = dragons)
summary(mountain.lm)

## Once mountain range is in the model, body length is no longer significant


###----- Mixed effects models -----###



##----- First mixed model -----##

### model

### plots

### summary

### variance accounted for by mountain ranges
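
## A minimal sketch of how the four steps above might be filled in, assuming a
## random intercept for mountain range. `mixed.lmer`, `vc` and `range.share` are
## illustrative names, not taken from the original script.

# model: body length as a fixed effect, mountain range as a random intercept
mixed.lmer <- lmer(testScore ~ bodyLength2 + (1 | mountainRange), data = dragons)

# plots: residuals against fitted values, then a Q-Q plot of the residuals
plot(mixed.lmer)
qqnorm(resid(mixed.lmer))
qqline(resid(mixed.lmer))

# summary
summary(mixed.lmer)

# variance accounted for by mountain ranges: the mountain range variance as a
# share of the variance left over after the fixed effects (range + residual)
vc <- as.data.frame(VarCorr(mixed.lmer))
range.share <- vc$vcov[vc$grp == "mountainRange"] / sum(vc$vcov)
range.share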



##-- implicit vs explicit nesting --##

head(dragons)  # we have site and mountainRange
str(dragons)  # we took samples from three sites per mountain range and eight mountain ranges in total

### create new "sample" variable
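
## One possible way to do this, assuming the site labels are reused across
## mountain ranges (implicit nesting), so that a site is only identified uniquely
## in combination with its range. interaction() builds one factor level per
## mountainRange-site combination that occurs in the data.

dragons$sample <- interaction(dragons$mountainRange, dragons$site, drop = TRUE)
nlevels(dragons$sample)  # number of distinct range-site combinations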


##----- Second mixed model -----##

### model

### summary

### plot
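
## A sketch of a second model with sites nested within mountain ranges. The
## (1 | mountainRange/site) term spells out the nesting explicitly and expands to
## (1 | mountainRange) + (1 | mountainRange:site), so it does not rely on a
## separate combined column. `mixed.lmer2` and `plot.data` are illustrative names.

# model
mixed.lmer2 <- lmer(testScore ~ bodyLength2 + (1 | mountainRange/site), data = dragons)

# summary
summary(mixed.lmer2)

# plot: raw data with the model's fitted values, one panel per mountain range
plot.data <- data.frame(bodyLength = dragons$bodyLength,
                        testScore = dragons$testScore,
                        site = dragons$site,
                        mountainRange = dragons$mountainRange,
                        pred = fitted(mixed.lmer2))

ggplot(plot.data, aes(x = bodyLength, y = testScore, colour = site)) +
  facet_wrap(~ mountainRange, nrow = 2) +
  geom_point() +
  geom_line(aes(y = pred)) +
  theme_classic()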



##----- Model selection for the keen -----##

### full model

### reduced model

### comparison

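## A sketch of one common approach, not necessarily the one used in the workshop:
## a likelihood ratio test between nested models. Both models are fitted with
## maximum likelihood (REML = FALSE), because REML likelihoods are not comparable
## between models with different fixed effects. `full.lmer` and `reduced.lmer`
## are illustrative names.

# full model
full.lmer <- lmer(testScore ~ bodyLength2 + (1 | mountainRange/site),
                  data = dragons, REML = FALSE)

# reduced model: same random effects, body length dropped
reduced.lmer <- lmer(testScore ~ 1 + (1 | mountainRange/site),
                     data = dragons, REML = FALSE)

# comparison
anova(reduced.lmer, full.lmer)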


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# CC-Linear-mixed-models

### Intro tutorial to linear mixed models, available here: https://ourcodingclub.github.io/2017/03/15/mixed-models.html

We would love to hear your feedback on the tutorial, whether you did it at a Coding Club workshop or online:
[https://www.surveymonkey.co.uk/r/HJYGVSF](https://www.surveymonkey.co.uk/r/HJYGVSF)

Check out https://ourcodingclub.github.io/workshop/ to learn how you can get involved!

This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/).

[![License: CC BY-SA 4.0](https://licensebuttons.net/l/by-sa/4.0/80x15.png)](https://creativecommons.org/licenses/by-sa/4.0/)



--------------------------------------------------------------------------------
/dragons.RData:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ourcodingclub/CC-Linear-mixed-models/00ce776eed837c8cc0653287f2d79430667d3b92/dragons.RData


--------------------------------------------------------------------------------