├── README.md ├── environment.yml ├── test1 └── test1.Rmd └── test2 ├── images └── transform-logical.png └── test2.Rmd /README.md: -------------------------------------------------------------------------------- 1 | # Create and serve `learnr` tutorials using Binder 2 | 3 | ## Interactive Tutorials for R with `learnr` 4 | 5 | [**learnr**](https://rstudio.github.io/learnr/) is a package for creating R tutorials with shiny apps running under the hood. Normally, running them requires you to run a server or purchase a [shiny apps](https://www.shinyapps.io) hosting service. 6 | 7 | However, the Binder service can provide an alternative. The following are two examples of `learnr` tutorials running on Binder. 8 | 9 | shiny/test1: [![Binder](http://mybinder.org/badge_logo.svg)](http://mybinder.org/v2/gh/syoh/learnr-tutorial/master?urlpath=shiny/test1/) 10 | 11 | shiny/test2: [![Binder](http://mybinder.org/badge_logo.svg)](http://mybinder.org/v2/gh/syoh/learnr-tutorial/master?urlpath=shiny/test2/) 12 | 13 | ## Developing/Troubleshooting Tutorials 14 | 15 | The Binder image has Jupyter notebook with R and RStudio installed. RStudio can be invoked to develop and troubleshoot tutorials. 
The following Binder links show how to reach these endpoints: 16 | 17 | R + Jupyter notebook: [![Binder](http://mybinder.org/badge_logo.svg)](http://mybinder.org/v2/gh/syoh/learnr-tutorial/master?filepath=index.ipynb) 18 | 19 | RStudio: [![Binder](http://mybinder.org/badge_logo.svg)](http://mybinder.org/v2/gh/syoh/learnr-tutorial/master?urlpath=rstudio) 20 | -------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | channels: 2 | - conda-forge 3 | dependencies: 4 | - r-base=3.6 5 | - r-tidyverse 6 | - r-shinydashboard 7 | - r-learnr 8 | - r-nycflights13 9 | - r-Lahman 10 | -------------------------------------------------------------------------------- /test1/test1.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Tutorial" 3 | output: learnr::tutorial 4 | runtime: shiny_prerendered 5 | --- 6 | 7 | ```{r setup, include=FALSE} 8 | library(learnr) 9 | knitr::opts_chunk$set(echo = FALSE) 10 | ``` 11 | 12 | 13 | ## Topic 1 14 | 15 | ### Exercise 16 | 17 | *Here's a simple exercise with an empty code chunk provided for entering the answer.* 18 | 19 | Write the R code required to add two plus two: 20 | 21 | ```{r two-plus-two, exercise=TRUE} 22 | 23 | ``` 24 | 25 | ### Exercise with Code 26 | 27 | *Here's an exercise with some prepopulated code as well as `exercise.lines = 5` to provide a bit more initial room to work.* 28 | 29 | Now write a function that adds any two numbers and then call it: 30 | 31 | ```{r add-function, exercise=TRUE, exercise.lines = 5} 32 | add <- function() { 33 | 34 | } 35 | ``` 36 | 37 | ## Topic 2 38 | 39 | ### Exercise with Hint 40 | 41 | *Here's an exercise where the chunk is pre-evaluated via the `exercise.eval` option (so the user can see the default output we'd like them to customize). 
We also add a "hint" to the correct solution via the chunk immediately below labeled `print-limit-hint`.* 42 | 43 | Modify the following code to limit the number of rows printed to 5: 44 | 45 | ```{r print-limit, exercise=TRUE, exercise.eval=TRUE} 46 | mtcars 47 | ``` 48 | 49 | ```{r print-limit-hint} 50 | head(mtcars) 51 | ``` 52 | 53 | ### Quiz 54 | 55 | *You can include any number of single or multiple choice questions as a quiz. Use the `question` function to define a question and the `quiz` function for grouping multiple questions together.* 56 | 57 | Some questions to verify that you understand the purposes of various base and recommended R packages: 58 | 59 | ```{r quiz} 60 | quiz( 61 | question("Which package contains functions for installing other R packages?", 62 | answer("base"), 63 | answer("tools"), 64 | answer("utils", correct = TRUE), 65 | answer("codetools") 66 | ), 67 | question("Which of the R packages listed below are used to create plots?", 68 | answer("lattice", correct = TRUE), 69 | answer("tools"), 70 | answer("stats"), 71 | answer("grid", correct = TRUE) 72 | ) 73 | ) 74 | ``` 75 | 76 | -------------------------------------------------------------------------------- /test2/images/transform-logical.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/syoh/learnr-tutorial/d94968de4e7f8633c3a4f65c337b2522a33c10c4/test2/images/transform-logical.png -------------------------------------------------------------------------------- /test2/test2.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Filter observations" 3 | output: 4 | learnr::tutorial: 5 | progressive: true 6 | allow_skip: true 7 | runtime: shiny_prerendered 8 | description: > 9 | Learn how to filter observations in a data frame. Use `filter()` to extract 10 | observations from a data frame, and use `&`, `|`, and `!` to write logical 11 | tests. 
12 | --- 13 | 14 | ```{r setup, include=FALSE} 15 | library(learnr) 16 | library(tidyverse) 17 | library(nycflights13) 18 | library(Lahman) 19 | 20 | tutorial_options(exercise.timelimit = 60) 21 | knitr::opts_chunk$set(error = TRUE) 22 | ``` 23 | 24 | ## Welcome 25 | 26 | This is a demo tutorial. Compare it to the [source code](https://github.com/rstudio/learnr/tree/master/inst/tutorials/ex-data-filter/ex-data-filter.Rmd) that made it. 27 | 28 | ### 29 | 30 | In this tutorial, you will learn how to: 31 | 32 | * use `filter()` to extract observations from a data frame or tibble 33 | * write logical tests in R 34 | * combine logical tests with Boolean operators 35 | * handle missing values within logical tests 36 | 37 | The readings in this tutorial follow [_R for Data Science_](http://r4ds.had.co.nz/), section 5.2. 38 | 39 | ### Prerequisites 40 | 41 | To practice these skills, we will use the `flights` data set from the nycflights13 package. This data frame comes from the US [Bureau of Transportation Statistics](http://www.transtats.bts.gov/DatabaseInfo.asp?DB_ID=120&Link=0) and contains all `r format(nrow(nycflights13::flights), big.mark = ",")` flights that departed from New York City in 2013. It is documented in `?flights`. 42 | 43 | We will also use the ggplot2 package to visualize the data. 44 | 45 | If you are ready to begin, click on! 46 | 47 | ## Filter rows with `filter()` 48 | 49 | ### filter() 50 | 51 | `filter()` lets you use a logical test to extract specific rows from a data frame. To use `filter()`, pass it the data frame followed by one or more logical tests. `filter()` will return every row that passes each logical test. 52 | 53 | So for example, we can use `filter()` to select every flight in flights that departed on January 1st. 
Click Submit Answer to give it a try: 54 | 55 | ```{r filter1, exercise = TRUE, exercise.eval = FALSE} 56 | filter(flights, month == 1, day == 1) 57 | ``` 58 | 59 | 60 | ### output 61 | 62 | Like all dplyr functions, `filter()` returns a new data frame for you to save or use. It doesn't overwrite the old data frame. 63 | 64 | If you want to save the output of `filter()`, you'll need to use the assignment operator, `<-`. 65 | 66 | Rerun the command in the code chunk below, but first arrange to save the output to an object named `jan1`. 67 | 68 | ```{r filter2, exercise = TRUE, exercise.eval = FALSE} 69 | filter(flights, month == 1, day == 1) 70 | ``` 71 | 72 | ```{r filter2-solution} 73 | jan1 <- filter(flights, month == 1, day == 1) 74 | ``` 75 | 76 | ### 77 | 78 | Good job! You can now see the results by running the name `jan1` by itself. Or you can pass `jan1` to a function that takes data frames as input. 79 | 80 | Did you notice that this code used the double equal operator, `==`? `==` is one of R's logical comparison operators. Comparison operators are key to using `filter()`, so let's take a look at them. 81 | 82 | ## Logical Comparisons 83 | 84 | ### Comparison operators 85 | 86 | R provides a suite of comparison operators that you can use to compare values: `>`, `>=`, `<`, `<=`, `!=` (not equal), and `==` (equal). Each creates a logical test. For example, is `pi` greater than three? 87 | 88 | ```{r} 89 | pi > 3 90 | ``` 91 | 92 | ### 93 | 94 | When you place a logical test inside of `filter()`, `filter()` applies the test to each row in the data frame and then returns the rows that pass, as a new data frame. 95 | 96 | Our code above returned every row whose month value was equal to one _and_ whose day value was equal to one. 97 | 98 | ### Watch out! 99 | 100 | When you start out with R, the easiest mistake to make is to test for equality with `=` instead of `==`.
When this happens you'll get an informative error: 101 | 102 | ```{r, error = TRUE} 103 | filter(flights, month = 1) 104 | ``` 105 | 106 | ### Multiple tests 107 | 108 | If you give `filter()` more than one logical test, `filter()` will combine the tests with an implied "and." In other words, `filter()` will return only the rows that return `TRUE` for every test. You can combine tests in other ways with Boolean operators... 109 | 110 | ## Boolean operators 111 | 112 | ### &, |, and ! 113 | 114 | R uses Boolean operators to combine multiple logical comparisons into a single logical test. These include `&` (_and_), `|` (_or_), `!` (_not_ or _negation_), and `xor()` (_exclusive or_). 115 | 116 | Both `|` and `xor()` will return `TRUE` if one or the other logical comparison returns `TRUE`. `xor()` differs from `|` in that it will return `FALSE` if both logical comparisons return `TRUE`. The name _xor_ stands for _exclusive or_. 117 | 118 | Study the diagram below to get a feel for how these operators work. 119 | 120 | ```{r fig1, echo = FALSE, out.width = "100%", fig.cap = "In the figure above, `x` is the left-hand circle, `y` is the right-hand circle, and the shaded regions show which parts each operator selects."} 121 | knitr::include_graphics("images/transform-logical.png") 122 | ``` 123 | 124 | ### Test Your Knowledge 125 | 126 | ```{r logicals, echo = FALSE} 127 | question("What will the following code return? `filter(flights, month == 11 | month == 12)`", 128 | answer("Every flight that departed in November _or_ December", correct = TRUE), 129 | answer("Every flight that departed in November _and_ December", message = "Technically a flight could not have departed in November _and_ December unless it departed twice."), 130 | answer("Every flight _except for_ those that departed in November or December"), 131 | answer("An error.
This is an incorrect way to combine tests.", message = "The next section will say a little more about combining tests."), 132 | allow_retry = TRUE 133 | ) 134 | ``` 135 | 136 | ### Common mistakes 137 | 138 | In R, the order of operations doesn't work like English. You can't write `filter(flights, month == 11 | 12)`, even though you might say "find all flights that departed in November or December". Be sure to write out a _complete_ test on each side of a Boolean operator. 139 | 140 | Here are four more tips to help you use logical tests and Boolean operators in R: 141 | 142 | ### 143 | 144 | 1. A useful short-hand for this problem is `x %in% y`. This will select every row where `x` is one of the values in `y`. We could use it to rewrite the code in the question above: 145 | 146 | ```{r, eval = FALSE} 147 | nov_dec <- filter(flights, month %in% c(11, 12)) 148 | ``` 149 | 150 | ### 151 | 152 | 2. Sometimes you can simplify complicated subsetting by remembering De Morgan's laws: `!(x & y)` is the same as `!x | !y`, and `!(x | y)` is the same as `!x & !y`. For example, if you wanted to find flights that weren't delayed (on arrival or departure) by more than two hours, you could use either of the following two filters: 153 | 154 | ```{r, eval = FALSE} 155 | filter(flights, !(arr_delay > 120 | dep_delay > 120)) 156 | filter(flights, arr_delay <= 120, dep_delay <= 120) 157 | ``` 158 | 159 | ### 160 | 161 | 3. As well as `&` and `|`, R also has `&&` and `||`. Don't use them with `filter()`! You'll learn when you should use them later. 162 | 163 | ### 164 | 165 | 4. Whenever you start using complicated, multipart expressions in `filter()`, consider making them explicit variables instead. That makes it much easier to check your work. You'll learn how to create new variables shortly. 166 | 167 | ## Missing values 168 | 169 | ### NA 170 | 171 | Missing values can make comparisons tricky in R. R uses `NA` to represent missing or unknown values.
`NA`s are "contagious" because almost any operation involving an unknown value (`NA`) will also be unknown (`NA`). For example, can you determine what value these expressions that use missing values should evaluate to? Make a prediction and then click "Submit Answer". 172 | 173 | ```{r nas, exercise = TRUE, exercise.eval = FALSE} 174 | NA > 5 175 | 10 == NA 176 | NA + 10 177 | NA / 2 178 | ``` 179 | 180 | ```{r nas-check} 181 | "In every case, R does not have enough information to compute a result. Hence, each result is an unknown value, `NA`." 182 | ``` 183 | 184 | ### is.na() 185 | 186 | The most confusing result is this one: 187 | 188 | ```{r} 189 | NA == NA 190 | ``` 191 | 192 | It's easiest to understand why the result is `NA` with a bit more context: 193 | 194 | ```{r} 195 | # Let x be Mary's age. We don't know how old she is. 196 | x <- NA 197 | 198 | # Let y be John's age. We don't know how old he is. 199 | y <- NA 200 | 201 | # Are John and Mary the same age? 202 | x == y 203 | # We don't know! 204 | ``` 205 | 206 | If you want to determine if a value is missing, use `is.na()`: 207 | 208 | ```{r} 209 | is.na(x) 210 | ``` 211 | 212 | ### filter() and NAs 213 | 214 | `filter()` only includes rows where the condition is `TRUE`; it excludes both `FALSE` and `NA` values. If you want to preserve missing values, ask for them explicitly: 215 | 216 | ```{r} 217 | df <- tibble(x = c(1, NA, 3)) 218 | filter(df, x > 1) 219 | filter(df, is.na(x) | x > 1) 220 | ``` 221 | 222 | ## Exercises 223 | 224 | ### Exercise 1 225 | 226 | Use the code chunks below to find all flights that 227 | 228 | 1. Had an arrival delay of two or more hours 229 | 230 | ```{r filterex1, exercise = TRUE} 231 | 232 | ``` 233 | ```{r filterex1-solution} 234 | filter(flights, arr_delay >= 120) 235 | ``` 236 | 237 | 1.
Flew to Houston (`IAH` or `HOU`) 238 | 239 | ```{r filterex2, exercise = TRUE} 240 | 241 | ``` 242 | ```{r filterex2-solution} 243 | filter(flights, dest %in% c("IAH", "HOU")) 244 | ``` 245 | 246 |
247 | **Hint:** This is a good case for the `%in%` operator. 248 |
249 | 250 | 1. Were operated by United (`UA`), American (`AA`), or Delta (`DL`) 251 | 252 | ```{r filterex3, exercise = TRUE} 253 | 254 | ``` 255 | ```{r filterex3-solution} 256 | filter(flights, carrier %in% c("UA", "AA", "DL")) 257 | ``` 258 | 259 |
260 | **Hint:** The `carrier` variable lists the airline that operated each flight. This is another good case for the `%in%` operator. 261 |
262 | 263 | 1. Departed in summer (July, August, and September) 264 | 265 | ```{r filterex4, exercise = TRUE} 266 | 267 | ``` 268 | ```{r filterex4-solution} 269 | filter(flights, 6 < month, month < 10) 270 | ``` 271 | 272 |
273 | **Hint:** When converted to numbers, July, August, and September become 7, 8, and 9. 274 |
275 | 276 | 1. Arrived more than two hours late, but didn't leave late 277 | 278 | ```{r filterex5, exercise = TRUE} 279 | 280 | ``` 281 | ```{r filterex5-solution} 282 | filter(flights, arr_delay > 120, dep_delay <= 0) 283 | ``` 284 | 285 |
286 | **Hint:** Remember that departure and arrival delays are recorded in _minutes_. 287 |
288 | 289 | 1. Were delayed by at least an hour, but made up over 30 minutes in flight 290 | 291 | ```{r filterex6, exercise = TRUE} 292 | 293 | ``` 294 | ```{r filterex6-solution} 295 | filter(flights, dep_delay >= 60, (dep_delay - arr_delay) > 30) 296 | ``` 297 | 298 |
299 | **Hint:** The time a plane makes up is `dep_delay - arr_delay`. 300 |
301 | 302 | 1. Departed between midnight and 6am (inclusive) 303 | 304 | ```{r filterex7, exercise = TRUE} 305 | 306 | ``` 307 | ```{r filterex7-solution} 308 | filter(flights, dep_time <= 600 | dep_time == 2400) 309 | ``` 310 | 311 |
312 | **Hint:** Don't forget flights that left at exactly midnight (`2400`). This is a good case for an "or" operator. 313 |
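
The kinds of tests mentioned in the hints above can be tried on small vectors before applying them to `flights`. The following is a minimal base R sketch; the vectors `month` and `dep_time` hold made-up values (they are assumptions for illustration, not rows taken from `flights`), and the names `in_summer`, `in_range`, and `early_or_midnight` are hypothetical.

```r
# Toy data (hypothetical values, not taken from `flights`)
month    <- c(1, 7, 8, 9, 12)
dep_time <- c(530, 600, 601, 2400, NA)

# `%in%` tests membership in a set of values
in_summer <- month %in% c(7, 8, 9)

# Two complete comparisons joined with `&` express a range
in_range <- month > 6 & month < 10

# `|` is TRUE when either test is TRUE; the missing value stays NA
early_or_midnight <- dep_time <= 600 | dep_time == 2400

in_summer
in_range
early_or_midnight
```

Inside `filter()`, a row whose test evaluates to `NA` is dropped, so the last element here would not be returned.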
314 | 315 | ### Exercise 2 316 | 317 | Another useful dplyr filtering helper is `between()`. What does it do? Can you use `between()` to simplify the code needed to answer the previous challenges? 318 | 319 | ```{r filterex8, exercise = TRUE} 320 | ?between 321 | ``` 322 | 323 | ### Exercise 3 324 | 325 | How many flights have a missing `dep_time`? What other variables are missing? What might these rows represent? 326 | 327 | ```{r filterex9, exercise = TRUE} 328 | 329 | ``` 330 | ```{r filterex9-solution} 331 | filter(flights, is.na(dep_time)) 332 | ``` 333 | 334 |
335 | **Hint:** This is a good case for `is.na()`. 336 |
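
As a rough illustration of how `is.na()` can be used to count missing values, here is a minimal base R sketch. The vector `dep_time` below holds made-up values (an assumption for illustration, not data from `flights`), and the names `missing` and `n_missing` are hypothetical.

```r
# Toy vector (hypothetical values, not taken from `flights`)
dep_time <- c(517, NA, 533, NA, 2400)

# is.na() returns TRUE for each missing element
missing <- is.na(dep_time)

# TRUE counts as 1 when summed, so this counts the missing values
n_missing <- sum(missing)
n_missing
```

The same idea applies column by column to a data frame, which is one way to check which other variables are missing in the rows that `filter()` returns.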
337 | 338 | ```{r filterex9-check} 339 | "Good job! These look like they might be cancelled flights." 340 | ``` 341 | 342 | ### Exercise 4 343 | 344 | Why is `NA ^ 0` not missing? Why is `NA | TRUE` not missing? 345 | Why is `FALSE & NA` not missing? Can you figure out the general 346 | rule? (`NA * 0` is a tricky counterexample!) 347 | 348 | ```{r filterex10, exercise = TRUE} 349 | 350 | ``` 351 | 352 | --------------------------------------------------------------------------------