├── .gitignore
├── README.md
├── ggplot-tutorial.Rmd
├── ggplot-tutorial.html
└── ggplot2-for-publications.Rproj

/.gitignore:
--------------------------------------------------------------------------------
.Rproj.user
.Rhistory
.RData
.Ruserdata

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Making figures ready for publication with `ggplot2`

This tutorial offers a step-by-step guide to creating publication-ready figures using `ggplot2` and the data from `palmerpenguins`.

![image](https://user-images.githubusercontent.com/39834789/86522447-78104680-be38-11ea-8330-3d5fc96ceafc.png)

## Install the package & data

```{r, message = FALSE}
# Install the package
remotes::install_github("allisonhorst/palmerpenguins")

# Load the package
library(palmerpenguins)

# Load the data into the Global Environment
data("penguins")
```

## Meet the Penguins

![image](https://user-images.githubusercontent.com/39834789/86522450-7f375480-be38-11ea-9437-9fd2a382aa7b.png)

Artwork by @allison_horst


## What are culmen length & depth?

The culmen is the upper ridge of a bird’s bill. In the simplified penguins data, culmen length and depth are renamed as the variables `bill_length_mm` and `bill_depth_mm` to be more intuitive.
For this penguin data, the culmen (bill) length and depth are measured as shown below:

![image](https://user-images.githubusercontent.com/39834789/86522451-84949f00-be38-11ea-9555-6409579f3b58.png)

Artwork by @allison_horst


## The steps for creating a beautiful scatter plot in `ggplot2`

First we will create a basic scatterplot of `bill_length_mm` against `body_mass_g`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
# Load the package
library(ggplot2)

ggplot(penguins, aes(body_mass_g, bill_length_mm)) + # the data and the x/y mapping
  geom_point()                                       # here we add the points
```

### Change the size of points

We can manually change the size of our data points. The points in the standard plot are quite small, so let's increase their size with `size = 3`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(size = 3)
```

### Change the shape of points

In `ggplot2`, it is possible to change the shape of the points. Here is a quick reference guide:

![shapes](https://user-images.githubusercontent.com/39834789/86522564-8e1f0680-be3a-11ea-9c7b-8cd1f39eb479.png)

The shape of all data points can be changed with e.g. `shape = 8`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(size = 3, shape = 8)
```

Alternatively, we can change the shape of our points based on species with `aes(shape = species)`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(shape = species), size = 3)
```

### Change the opacity of points

You can also change the opacity of the data points using `alpha`. Alpha values must be between 0 and 1, where 0 is fully transparent and 1 is fully opaque.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(shape = species), size = 3, alpha = 0.6)
```

### Adding colour

Now let's explore the different species by adding colour with the code `colour = species`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(shape = species, colour = species), size = 3, alpha = 0.6)
```

This red-green colour combination is not colourblind friendly, so let's change the colour of the points with `scale_colour_manual`. To ensure the shapes match the names we will also use `scale_shape_manual`. Unnamed `values` and `labels` are assigned in the order of the factor levels of `species` (Adelie, Chinstrap, Gentoo), so both vectors are listed in that order.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(shape = species, colour = species), size = 3, alpha = 0.6) +
  # values and labels follow the factor level order: Adelie, Chinstrap, Gentoo
  scale_colour_manual(values = c("#FF6A00", "#C15CCB", "#00868B"),
                      labels = c("Adélie", "Chinstrap", "Gentoo")) +
  scale_shape_manual(values = c(16, 17, 15),
                     labels = c("Adélie", "Chinstrap", "Gentoo"))
```

We won't change the points any more, so let's save the plot as `penguin_plot` so that we can build upon it.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot <- ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(shape = species, colour = species), size = 3, alpha = 0.6) +
  # values and labels follow the factor level order: Adelie, Chinstrap, Gentoo
  scale_colour_manual(values = c("#FF6A00", "#C15CCB", "#00868B"),
                      labels = c("Adélie", "Chinstrap", "Gentoo")) +
  scale_shape_manual(values = c(16, 17, 15),
                     labels = c("Adélie", "Chinstrap", "Gentoo"))
```

### Changing the background

You can change the background of `ggplot2` figures in a variety of ways with:

- `theme_gray()`
- `theme_bw()`
- `theme_linedraw()`
- `theme_light()`
- `theme_minimal()`
- `theme_classic()`
- `theme_void()`
- `theme_dark()`

My personal favourite is `theme_bw`, so we will continue to build our plot with this theme.

We will also remove the thicker (major) grid lines in the background with `panel.grid.major = element_blank()`, and the thinner (minor) grid lines with `panel.grid.minor = element_blank()`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot +
  theme_bw() +                                # set the background theme
  theme(panel.grid.major = element_blank(),   # remove the major lines
        panel.grid.minor = element_blank())   # remove the minor lines
```

### Changing the text size

There are several ways to change the size of the font, but we can quickly change all font sizes at once with the `base_size` argument, e.g. `theme_bw(base_size = 15)`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot <- penguin_plot +
  theme_bw(base_size = 15) +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

penguin_plot
```

### Renaming the axes and legend

You can rename the axes and legend using `labs`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot <- penguin_plot +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)", colour = "Species", shape = "Species")

penguin_plot
```

### Change axes position

In the standard plots, the axis titles sit very close to the plot. We will increase their distance with `vjust`:

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot <- penguin_plot +
  theme(axis.title.y = element_text(vjust = 3)) +  # increase distance from the y-axis
  theme(axis.title.x = element_text(vjust = -1))   # increase distance from the x-axis

penguin_plot
```

### Change legend position

Legends can be positioned in a number of different ways using `legend.position`:

- `theme(legend.position="top")`
- `theme(legend.position="bottom")`
- `theme(legend.position="left")`
- `theme(legend.position="right")`
- `theme(legend.position="none")`

The legend location can also be a numeric vector `c(x, y)`, where x and y are the coordinates of the legend box. Their values should be between 0 and 1: `c(0, 0)` is the “bottom left” and `c(1, 1)` is the “top right” position.
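
From ggplot2 3.5.0 onwards, a plain numeric `legend.position` still works but raises a deprecation warning; the coordinates are instead passed through `legend.position.inside` together with `legend.position = "inside"`. The following is a minimal, version-aware sketch and not part of the original tutorial: it uses the built-in `mtcars` data, and the names `p`, `legend_theme` and `p_inside` are illustrative only.

```r
library(ggplot2)

# Toy plot on a built-in dataset so the sketch runs on its own
p <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point()

# Pick the theme syntax that matches the installed ggplot2 version
legend_theme <- if (utils::packageVersion("ggplot2") >= "3.5.0") {
  # newer syntax: say "inside", then give the coordinates separately
  theme(legend.position = "inside",
        legend.position.inside = c(0.85, 0.2))
} else {
  # older syntax: numeric vector directly in legend.position
  theme(legend.position = c(0.85, 0.2))
}

p_inside <- p + legend_theme
```

Printing `p_inside` should draw the legend in the lower right of the panel on either version.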

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot <- penguin_plot +
  theme(legend.position = c(0.85, 0.2))

penguin_plot
```

### Saving the figure

Publications require high-quality images, and they specify the required size and format on their websites.

#### `ggsave`
`ggsave` is one example of how to save your figures.

It allows you to save high-quality images in a variety of different file types (e.g. "eps", "ps", "tex", "pdf", "jpeg", "tiff", "png", "bmp", "svg", "wmf"). You can also specify the `width` and `height` in "in", "cm", or "mm", and specify the plot resolution with `dpi`.

```{r, eval = FALSE}
# With no `plot` argument, ggsave() saves the last plot that was displayed
ggsave("penguin_plot.pdf",
       dpi = 600,
       width = 100, height = 60, units = "mm")
```

#### `pdf`

An alternative method is to use the base R `pdf` device. This allows you to specify the colour model (e.g. CMYK).

```{r, eval = FALSE}
# pdf() takes width and height in inches, so convert from cm by dividing by 2.54
pdf("ggplot-cmyk.pdf", width = 12 / 2.54, height = 8 / 2.54,
    colormodel = "cmyk")

print(penguin_plot)

dev.off()
```


## More `ggplot2` plots and tricks


### Change the size of points based on other data

Size can be mapped to a variable in the `penguins` data frame. For example, here we have changed the size of the points based on `bill_depth_mm`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(size = bill_depth_mm))
```

### Adding an ellipse

Add 95% confidence ellipses with `stat_ellipse`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
# `level` is a parameter of stat_ellipse(), not an aesthetic, so it goes outside aes()
ellipse <- penguin_plot +
  stat_ellipse(aes(colour = species), level = 0.95)

ellipse
```

### Linear regression line

Add a linear regression line with `geom_smooth(method = "lm")`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
# the method is given as a string so it still works after the plot is stored as `lm`
lm <- penguin_plot +
  geom_smooth(method = "lm", aes(colour = species))

lm
```

### `facet_wrap`

`facet_wrap` is a fantastic tool to split plots based on a specified categorical column. Here we will use the `species` column.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot +
  facet_wrap(~species, ncol = 3, nrow = 1) +  # specifying 3 columns, 1 row
  theme(legend.position = "none")             # remove legend
```

### Changing strip design

It is also possible to change the `size`, `colour`, `face` and `fill` colour of the facet strips with the following code:

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot +
  facet_wrap(~species, ncol = 3, nrow = 1) +  # specifying 3 columns, 1 row
  theme(legend.position = "none") +           # remove legend
  theme(strip.text.x = element_text(size = 16, color = "white", face = "bold"),
        strip.background = element_rect(fill = "black"))
```

### Removing white space and free axes

We can also remove the white space by letting each facet use its own axis ranges with `scales = "free"`.
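
Besides `"free"`, the `scales` argument of `facet_wrap` also accepts `"free_x"` and `"free_y"`, which free only one axis. A minimal sketch of the options on the built-in `mtcars` data follows; the object names are illustrative and not part of the tutorial, whose own `penguin_plot` example comes next.

```r
library(ggplot2)

# Toy scatter plot on a built-in dataset so the sketch runs on its own
base_plot <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point()

fixed_axes <- base_plot + facet_wrap(~cyl)                      # default: all facets share both axes
free_x     <- base_plot + facet_wrap(~cyl, scales = "free_x")   # x range per facet, shared y
free_y     <- base_plot + facet_wrap(~cyl, scales = "free_y")   # y range per facet, shared x
free_both  <- base_plot + facet_wrap(~cyl, scales = "free")     # both ranges per facet
```

Freeing an axis makes better use of the panel area, but readers can no longer compare positions across facets on that axis, so it is worth choosing the least permissive option that removes the empty space.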

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot +
  facet_wrap(~species, scales = "free") +
  theme(legend.position = "none") +  # remove legend
  theme(strip.text.x = element_text(size = 16, color = "white", face = "bold"),
        strip.background = element_rect(fill = "black"))
```

### `facet_grid`: Free facet width

To allow the facets to be different widths, we must use `facet_grid` and `space = "free"`. This can be used with `scales = "free"`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot +
  facet_grid(. ~ species, scales = "free", space = "free") +
  theme(legend.position = "none") +  # remove legend
  theme(strip.text.x = element_text(size = 16, color = "white", face = "bold"),
        strip.background = element_rect(fill = "black"))
```

### Bar plot

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
library(dplyr)

# Mean and standard deviation of body mass per species
penguin_summary <- penguins %>%
  group_by(species) %>%
  summarise(mean = mean(body_mass_g, na.rm = TRUE),
            sd = sd(body_mass_g, na.rm = TRUE))

ggplot(penguin_summary, aes(y = mean, x = species, fill = species)) +
  geom_bar(stat = "identity") +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                width = .1)
```

Now polish the bar plot in the same way as the scatter plot above.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
bar <- ggplot(penguin_summary, aes(y = mean, x = species, fill = species)) +
  geom_bar(stat = "identity") +
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                width = .1) +
  theme_bw(base_size = 20) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(x = "Species", y = "Body Mass (g)") +
  scale_fill_manual(values = c("#FF6A00", "#C15CCB", "#00868B")) +
  theme(legend.position = "none") +
  scale_y_continuous(limits = c(0, 5700)) +
  theme(axis.title.y = element_text(vjust = 3)) +
  theme(axis.title.x = element_text(vjust = -1))

bar
```

### Histogram

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
ggplot(penguins, aes(x = flipper_length_mm, fill = species)) +
  geom_histogram(alpha = 0.4) +
  scale_fill_manual(values = c("#FF6A00", "#C15CCB", "#00868B"))
```

### Density Plot

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
ggplot(penguins, aes(x = flipper_length_mm, fill = species)) +
  geom_density(alpha = 0.4) +
  scale_fill_manual(values = c("#FF6A00", "#C15CCB", "#00868B"))
```

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
flipper <- ggplot(penguins, aes(x = flipper_length_mm, fill = species, colour = species)) +
  geom_density(alpha = 0.4) +
  scale_fill_manual(values = c("#FF6A00", "#C15CCB", "#00868B")) +
  scale_colour_manual(values = c("#FF6A00", "#C15CCB", "#00868B")) +
  theme_bw(base_size = 20) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(x = "Flipper Length (mm)", y = "Density", fill = "Species", colour = "Species") +
  #theme(legend.position = "none") +
  scale_x_continuous(expand = c(0, 0), limits = c(165, 238)) +
  scale_y_continuous(expand = c(0, 0), limits = c(0, 0.065), breaks = seq(0, 0.065, 0.02)) +
  theme(axis.title.y = element_text(vjust = 3)) +
  theme(axis.title.x = element_text(vjust = -1))

flipper
```

### Add a mean line

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
# Mean flipper length per species
flipper_mean <- penguins %>%
  group_by(species) %>%
  summarise(mean = mean(flipper_length_mm, na.rm = TRUE))

# In ggplot2 >= 3.4.0, line thickness is set with `linewidth` rather than `size`
density <- flipper +
  geom_vline(data = flipper_mean, aes(xintercept = mean, colour = species),
             linetype = "dashed", size = 1.3)

density
```

### Box plot

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
# na.omit() drops the rows with missing values (e.g. penguins of unknown sex)
ggplot(na.omit(penguins), aes(x = species, y = flipper_length_mm, fill = sex)) +
  geom_boxplot()
```

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
box_plot <- ggplot(na.omit(penguins), aes(x = species, y = flipper_length_mm, fill = sex)) +
  geom_boxplot() +
  theme_bw(base_size = 20) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(x = "Species", y = "Flipper Length (mm)", fill = "Sex") +
  theme(axis.title.y = element_text(vjust = 3)) +
  theme(axis.title.x = element_text(vjust = -1)) +
  theme(legend.position = c(0.15, 0.8))

box_plot
```

### Arrange plots

There are several different methods that allow you to arrange your plots into different panels.

#### `ggarrange`

Here we will arrange four of the plots that we made above into one figure using `ggarrange`. You can specify the number of rows and columns with `nrow` and `ncol`, and add `labels`.

Note that we previously saved our figures (`ellipse`, `density`, `bar`, `box_plot`) in the Global Environment and we are calling on them here.

```{r, warning=FALSE, fig.align = "center", message = FALSE, fig.width=12, fig.height=8}
library(ggpubr)

ggarrange(ellipse, density, bar, box_plot,
          nrow = 2, ncol = 2,
          labels = c("a", "b", "c", "d"))
```

#### Same axes

If two plots have the same axes, you can remove the axis from one of them and use `align = "v"` to ensure the plots remain the same size.

```{r, warning=FALSE, fig.align = "center", message = FALSE, fig.width=12, fig.height=8}
# Remove y axes text and title
lm <- lm +
  theme(axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

ggarrange(ellipse, lm,
          nrow = 1, ncol = 2,
          labels = c("a", "b"),
          align = "v")
```

#### `patchwork`

`patchwork` is a very straightforward and intuitive package for arranging plots, e.g.:

- `plot1 + plot2` aligns two plots next to each other
- `plot1 / plot2` aligns plot2 under plot1

Let's make a fancy one.

We add the figure labels with e.g. `labs(tag = 'a')`. We will also make the two right-hand plots half the width of the main plot with `plot_layout(widths = c(2, 1))`.
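
Before the full layout, the basic operators can be tried on throwaway plots. This is a minimal sketch, assuming `patchwork` is installed; it uses the built-in `mtcars` data, and the names `p1`, `p2`, `side_by_side`, `stacked` and `wide_left` are illustrative rather than part of the tutorial.

```r
library(ggplot2)
library(patchwork)

# Two toy plots on a built-in dataset so the sketch runs on its own
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(factor(cyl))) + geom_bar()

side_by_side <- p1 + p2   # p2 to the right of p1
stacked      <- p1 / p2   # p2 underneath p1

# `|` also places plots side by side; widths = c(2, 1) makes the
# left column twice as wide as the right one
wide_left <- (p1 | p2) + plot_layout(widths = c(2, 1))
```

Wrapping each sub-layout in parentheses, as done for `wide_left`, keeps the operator precedence explicit when `+`, `|` and `/` are mixed in one expression.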

```{r, warning=FALSE, fig.align = "center", message = FALSE, fig.width=18, fig.height=10}
library(patchwork)

((ellipse + labs(tag = 'a')) | ((bar + labs(tag = 'b')) / (box_plot + labs(tag = 'c')))) +
  plot_layout(widths = c(2, 1))
```

![image](https://user-images.githubusercontent.com/39834789/86522452-88282600-be38-11ea-8095-4d2cfcd60373.png)

Artwork by @CerrenRichards

--------------------------------------------------------------------------------
/ggplot-tutorial.Rmd:
--------------------------------------------------------------------------------
---
title: "ggplot2-tutorial"
author: "Cerren Richards"
date: "7/4/2020"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Overview

This tutorial offers a step-by-step guide for how to create publication-ready figures using `ggplot2` and the data from `palmerpenguins`.


## Install the package & data

```{r, message = FALSE}

# Install the package
remotes::install_github("allisonhorst/palmerpenguins")

# Load the package
library(palmerpenguins)

# Load the data into the Global Environment
data("penguins")

# View the data
head(penguins)

```


## The steps for creating a beautiful scatter plot in `ggplot2`

First we will create a basic scatterplot of `body_mass_g` against `bill_length_mm`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

# Load the package
library(ggplot2)

ggplot(penguins, aes(body_mass_g, bill_length_mm)) + # this is the data
  geom_point()                                       # here we add the points

```


### Change the size of points

We can manually change the size of our datapoints.
The points in the standard plot are quite small, so let's increase their size with `size = 3`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(size = 3)

```


### Change the shape of points

In `ggplot2`, it is possible to change the shape of the points. Here is a quick reference guide:

```{r, echo=FALSE, fig.align = "center"}
#include_graphics("shapes.png")
```

The shape of all data points can be changed with e.g. `shape = 8`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(size = 3, shape = 8)

```


Alternatively, we can change the shape of our points based on species with `aes(shape = species)`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(shape = species), size = 3)

```


### Change the opacity of points

You can also change the opacity of the data points using `alpha`. Alpha values must be between 0 and 1, where 0 is fully transparent and 1 is fully opaque.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(shape = species), size = 3, alpha = 0.6)

```


### Adding colour

Now let's explore the different species by adding colour with the code `colour = species`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(shape = species, colour = species), size = 3, alpha = 0.6)

```


This red-green colour combination is not colourblind friendly, so let's change the colour of the points with `scale_colour_manual`. To ensure the shapes match the names we will also use `scale_shape_manual`. Unnamed `values` and `labels` are assigned in the order of the factor levels of `species` (Adelie, Chinstrap, Gentoo), so both vectors are listed in that order.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(shape = species, colour = species), size = 3, alpha = 0.6) +
  # values and labels follow the factor level order: Adelie, Chinstrap, Gentoo
  scale_colour_manual(values = c("#FF6A00", "#C15CCB", "#00868B"),
                      labels = c("Adélie", "Chinstrap", "Gentoo")) +
  scale_shape_manual(values = c(16, 17, 15),
                     labels = c("Adélie", "Chinstrap", "Gentoo"))

```


We won't change the points any more, so let's save the plot as `penguin_plot` so that we can build upon it.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

penguin_plot <- ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(shape = species, colour = species), size = 3, alpha = 0.6) +
  # values and labels follow the factor level order: Adelie, Chinstrap, Gentoo
  scale_colour_manual(values = c("#FF6A00", "#C15CCB", "#00868B"),
                      labels = c("Adélie", "Chinstrap", "Gentoo")) +
  scale_shape_manual(values = c(16, 17, 15),
                     labels = c("Adélie", "Chinstrap", "Gentoo"))

```

### Changing the background

You can change the background of `ggplot2` figures in a variety of ways with:

- `theme_gray()`
- `theme_bw()`
- `theme_linedraw()`
- `theme_light()`
- `theme_minimal()`
- `theme_classic()`
- `theme_void()`
- `theme_dark()`


```{r, echo=FALSE, warning=FALSE, message = FALSE, fig.align = "center"}

# Same scatter plot drawn once per built-in theme, for a side-by-side comparison
plot <- ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(alpha = 0.3)

a <- plot + theme_gray() + ggtitle("theme_gray")
b <- plot + theme_bw() + ggtitle("theme_bw")
c <- plot + theme_linedraw() + ggtitle("theme_linedraw")
d <- plot + theme_light() + ggtitle("theme_light")
e <- plot + theme_minimal() + ggtitle("theme_minimal")
f <- plot + theme_classic() + ggtitle("theme_classic")
g <- plot + theme_void() + ggtitle("theme_void")
h <- plot + theme_dark() + ggtitle("theme_dark")

library(ggpubr)

ggarrange(a, b, c, d,
          nrow = 2, ncol = 2)

ggarrange(e, f, g, h,
          nrow = 2, ncol = 2)

```


My personal favourite is `theme_bw`, so we will continue to build our plot with this theme.

We will also remove the thicker (major) grid lines in the background with `panel.grid.major = element_blank()`, and the thinner (minor) grid lines with `panel.grid.minor = element_blank()`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot +
  theme_bw() +                                # set the background theme
  theme(panel.grid.major = element_blank(),   # remove the major lines
        panel.grid.minor = element_blank())   # remove the minor lines
```


### Changing the text size

There are several ways to change the size of the font, but we can quickly change all font sizes at once with the `base_size` argument, e.g. `theme_bw(base_size = 15)`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot <- penguin_plot +
  theme_bw(base_size = 15) +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank())

penguin_plot
```

### Renaming the axes and legend

You can rename the axes and legend using `labs`.


```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot <- penguin_plot +
  labs(x = "Body Mass (g)", y = "Bill Length (mm)", colour = "Species", shape = "Species")

penguin_plot
```

### Change axes position

In the standard plots, the axis titles sit very close to the plot.
We will increase their distance with `vjust`:

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

penguin_plot <- penguin_plot +
  theme(axis.title.y = element_text(vjust = 3)) +  # increase distance from the y-axis
  theme(axis.title.x = element_text(vjust = -1))   # increase distance from the x-axis

penguin_plot
```


### Change legend position

Legends can be positioned in a number of different ways using `legend.position`:

- `theme(legend.position="top")`
- `theme(legend.position="bottom")`
- `theme(legend.position="left")`
- `theme(legend.position="right")`
- `theme(legend.position="none")`

The legend location can also be a numeric vector `c(x, y)`, where x and y are the coordinates of the legend box. Their values should be between 0 and 1: `c(0, 0)` is the “bottom left” and `c(1, 1)` is the “top right” position.


```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot <- penguin_plot +
  theme(legend.position = c(0.85, 0.2))

penguin_plot
```


### Saving the figure

Publications require high-quality images, and they specify the required size and format on their websites.

#### `ggsave`
`ggsave` is one example of how to save your figures.

It allows you to save high-quality images in a variety of different file types (e.g. "eps", "ps", "tex", "pdf", "jpeg", "tiff", "png", "bmp", "svg", "wmf"). You can also specify the `width` and `height` in "in", "cm", or "mm", and specify the plot resolution with `dpi`.

```{r, eval = FALSE}

# With no `plot` argument, ggsave() saves the last plot that was displayed
ggsave("penguin_plot.pdf",
       dpi = 600,
       width = 100, height = 60, units = "mm")

```

#### `pdf`

An alternative method is to use the base R `pdf` device.
This allows you to specify the colour model (e.g. CMYK).

```{r, eval = FALSE}

# pdf() takes width and height in inches, so convert from cm by dividing by 2.54
pdf("ggplot-cmyk.pdf", width = 12 / 2.54, height = 8 / 2.54,
    colormodel = "cmyk")

print(penguin_plot)

dev.off()

```


## More `ggplot2` plots and tricks


### Change the size of points based on other data

Size can be mapped to a variable in the `penguins` data frame. For example, here we have changed the size of the points based on `bill_depth_mm`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

ggplot(penguins, aes(body_mass_g, bill_length_mm)) +
  geom_point(aes(size = bill_depth_mm))

```


### Adding an ellipse

Add 95% confidence ellipses with `stat_ellipse`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

# `level` is a parameter of stat_ellipse(), not an aesthetic, so it goes outside aes()
ellipse <- penguin_plot +
  stat_ellipse(aes(colour = species), level = 0.95)

ellipse
```

### Linear regression line

Add a linear regression line with `geom_smooth(method = "lm")`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

# the method is given as a string so it still works after the plot is stored as `lm`
lm <- penguin_plot +
  geom_smooth(method = "lm", aes(colour = species))

lm

```


### `facet_wrap`

`facet_wrap` is a fantastic tool to split plots based on a specified categorical column. Here we will use the `species` column.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

penguin_plot +
  facet_wrap(~species, ncol = 3, nrow = 1) +  # specifying 3 columns, 1 row
  theme(legend.position = "none")             # remove legend

```

### Changing strip design

It is also possible to change the `size`, `colour`, `face` and `fill` colour of the facet strips with the following code:

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

penguin_plot +
  facet_wrap(~species, ncol = 3, nrow = 1) +  # specifying 3 columns, 1 row
  theme(legend.position = "none") +           # remove legend
  theme(strip.text.x = element_text(size = 16, color = "white", face = "bold"),
        strip.background = element_rect(fill = "black"))

```

### Removing white space and free axes

We can also remove the white space by letting each facet use its own axis ranges with `scales = "free"`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}

penguin_plot +
  facet_wrap(~species, scales = "free") +
  theme(legend.position = "none") +  # remove legend
  theme(strip.text.x = element_text(size = 16, color = "white", face = "bold"),
        strip.background = element_rect(fill = "black"))

```

### `facet_grid`: Free facet width

To allow the facets to be different widths, we must use `facet_grid` and `space = "free"`. This can be used with `scales = "free"`.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
penguin_plot +
  facet_grid(.~species, scales = "free", space = "free") +
  theme(legend.position = "none")+ # remove legend
  theme(strip.text.x = element_text(size = 16, color = "white", face = "bold"),
        strip.background = element_rect(fill = "black"))
```

### Bar plot

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
library(dplyr)

# Mean and standard deviation of body mass per species
penguin_summary <- penguins %>%
  group_by(species) %>%
  summarise(mean = mean(body_mass_g, na.rm = TRUE),
            sd = sd(body_mass_g, na.rm = TRUE))

ggplot(penguin_summary, aes(y = mean, x = species, fill = species)) +
  geom_bar(stat = "identity")+
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                width = .1)
```

Now style the bar plot in the same way as the scatter plot above.

```{r, warning=FALSE, fig.align = "center", out.width = '50%'}
bar <- ggplot(penguin_summary, aes(y = mean, x = species, fill = species)) +
  geom_bar(stat = "identity")+
  geom_errorbar(aes(ymin = mean - sd, ymax = mean + sd),
                width = .1)+
  theme_bw(base_size = 20)+
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(x = "Species", y = "Body Mass (g)")+
  scale_fill_manual(values = c("#FF6A00", "#C15CCB", "#00868B"))+
  theme(legend.position = "none") +
  scale_y_continuous(limits = c(0, 5700))+
  theme(axis.title.y = element_text(vjust = 3)) +
  theme(axis.title.x = element_text(vjust = -1))

bar
```

### Histogram

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
ggplot(penguins, aes(x = flipper_length_mm, fill = species)) +
  geom_histogram(alpha = 0.4)+
  scale_fill_manual(values = c("#FF6A00", "#C15CCB", "#00868B"))
```

### Density Plot

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
ggplot(penguins, aes(x = flipper_length_mm, fill = species)) +
  geom_density(alpha = 0.4)+
  scale_fill_manual(values = c("#FF6A00", "#C15CCB", "#00868B"))
```

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
flipper <- ggplot(penguins, aes(x = flipper_length_mm, fill = species, colour = species)) +
  geom_density(alpha = 0.4)+
  scale_fill_manual(values = c("#FF6A00", "#C15CCB", "#00868B"))+
  scale_colour_manual(values = c("#FF6A00", "#C15CCB", "#00868B"))+
  theme_bw(base_size = 20)+
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(x = "Flipper Length (mm)", y = "Density", fill = "Species", colour = "Species")+
  #theme(legend.position = "none") +
  scale_x_continuous(expand = c(0, 0), limits = c(165, 238))+
  scale_y_continuous(expand = c(0, 0), limits = c(0, 0.065), breaks = seq(0, 0.065, 0.02))+
  theme(axis.title.y = element_text(vjust = 3)) +
  theme(axis.title.x = element_text(vjust = -1))

flipper
```

### Add a mean line

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
# Mean flipper length per species
flipper_mean <- penguins %>%
  group_by(species) %>%
  summarise(mean = mean(flipper_length_mm, na.rm = TRUE))

density <- flipper +
  geom_vline(data = flipper_mean, aes(xintercept = mean, colour = species),
             linetype = "dashed", size = 1.3)

density
```

### Box plot

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
# na.omit() drops rows with missing values (e.g. unknown sex)
ggplot(na.omit(penguins), aes(x = species, y = flipper_length_mm, fill = sex)) +
  geom_boxplot()
```

```{r, warning=FALSE, fig.align = "center", message = FALSE, out.width = '50%'}
box_plot <- ggplot(na.omit(penguins), aes(x = species, y = flipper_length_mm, fill = sex)) +
  geom_boxplot()+
  theme_bw(base_size = 20)+
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
  labs(x = "Species", y = "Flipper Length (mm)", fill = "Sex")+
  theme(axis.title.y = element_text(vjust = 3)) +
  theme(axis.title.x = element_text(vjust = -1)) +
  theme(legend.position = c(0.15, 0.8))

box_plot
```

### Arrange plots

There are several different methods that allow you to arrange your plots into different panels.

#### `ggarrange`

Here we will arrange four of the plots that we made above into one figure using `ggarrange`. You can specify the number of rows (`nrow`) and columns (`ncol`) and add panel `labels`.

Note that we previously saved our figures (`ellipse`, `density`, `bar`, `box_plot`) in the Global Environment and we are calling on them here.

```{r, warning=FALSE, fig.align = "center", message = FALSE, fig.width=12, fig.height=8}
library(ggpubr)

ggarrange(ellipse, density, bar, box_plot,
          nrow = 2, ncol = 2,
          labels = c("a", "b", "c", "d"))
```

#### Same axes

If two plots share the same y axis, you can remove it from one of them and use `align = "v"` to ensure the plots remain the same size.

```{r, warning=FALSE, fig.align = "center", message = FALSE, fig.width=12, fig.height=8}
# Remove y axis text, ticks and title
lm <- lm +
  theme(axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

ggarrange(ellipse, lm,
          nrow = 1, ncol = 2,
          labels = c("a", "b"),
          align = "v")
```

#### `patchwork`

`patchwork` is a very straightforward and intuitive package for arranging plots.
For example:

- `plot1 + plot2` places two plots next to each other
- `plot1 / plot2` places plot2 under plot1

Let's make a fancy one.

We add the figure labels with e.g. `labs(tag = 'a')`. We will also make two of our plots half the width of the main plot with `plot_layout(widths = c(2, 1))`.

```{r, warning=FALSE, fig.align = "center", message = FALSE, fig.width=18, fig.height=10}
library(patchwork)

# Main plot on the left, two stacked plots on the right
((ellipse + labs(tag = "a")) |
   ((bar + labs(tag = "b")) / (box_plot + labs(tag = "c")))) +
  plot_layout(widths = c(2, 1))
```

--------------------------------------------------------------------------------
/ggplot2-for-publications.Rproj:
--------------------------------------------------------------------------------
Version: 1.0

RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: Sweave
LaTeX: pdfLaTeX
--------------------------------------------------------------------------------