├── .gitignore
├── README.md
├── ashiny
    └── README.md
├── bayesopt
    └── README.md
├── bshiny
    ├── 0_app.R
    ├── 1_1_app.R
    ├── 1_2_app.R
    ├── 1_3_app.R
    ├── 2_1_app.R
    ├── 2_2_app.R
    ├── 2_3_app.R
    ├── 2_4_app.R
    ├── 2_5_app.R
    ├── 3_1_app.R
    ├── 3_2_app.R
    ├── 4_1_app.R
    ├── 4_2_app.R
    ├── 4_3_app.R
    ├── README.md
    ├── materials
    │   ├── bd-dec19-births-by-mothers-age.csv
    │   ├── bd-dec19-births-deaths-natural-increase.csv
    │   ├── bd-dec19-deaths-by-sex-and-age.csv
    │   ├── info.txt
    │   ├── info2.md
    │   └── sources.txt
    └── www
    │   ├── new-zealand-flag.jpg
    │   └── new-zealand-map.jpg
├── casual
    ├── README.html
    └── README.md
├── drake
    ├── .gitignore
    └── README.md
├── ipbox
    └── README.md
├── legal
    └── README.md
├── openmp
    └── README.md
├── rcpp
    └── README.md
├── satellite
    ├── README.md
    ├── slides.pdf
    └── whyr_satellite.R
├── travis
    └── README.md
├── workshops.Rproj
└── workshops.jpg


/.gitignore:
--------------------------------------------------------------------------------
1 | .Rproj.user
2 | .Rhistory
3 | .RData
4 | .Ruserdata
5 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Why R? 2020 remote Workshops [2020.whyr.pl/register/](http://2020.whyr.pl/register/)
2 |
3 | A part of [2020.whyr.pl](http://2020.whyr.pl).
4 |
5 | Registration for workshops starts 2020-09-01. Workshops take place 2020-09-25.
They are divided into two sessions:
6 |
7 | - 09:00 - 12:30 / CEST / GMT+2 - **Morning (M)**
8 | - 14:00 - 17:30 / CEST / GMT+2 - **Afternoon (A)**
9 |
10 | During the break, [McKinsey & Company](https://www.mckinsey.com/pl/careers/careers-in-poland) will host a **Recruiting Panel**:
11 |
12 | - [Lessons learned from 500+ interviews for data science jobs](https://github.com/WhyR2020/abstracts/tree/master/panel) (13:00 - 14:00 / CEST / GMT+2 - **Panel (P)**)
13 |
14 | | Title | Tutors | Seats | Session |
15 | |-------------------|-----------------|-------|--|
16 | | **Invited** [First steps with Continuous Integration](https://github.com/WhyR2020/workshops/tree/master/travis) | [Colin Gillespie](https://twitter.com/csgillespie) & [Rhian Davis](https://twitter.com/trianglegirl) from [Jumping Rivers](https://www.jumpingrivers.com/about/) | 25 | (M) |
17 | | **Invited** [Bayesian Optimization with mlr3mbo](https://github.com/WhyR2020/workshops/tree/master/bayesopt) | [Jakob Richter](https://twitter.com/jak0br) | 25 | (M) |
18 | | [Basics of Shiny](https://github.com/WhyR2020/workshops/tree/master/bshiny) | [Weronika Puchała](https://www.linkedin.com/in/weronika-pucha%C5%82a-831698a4/), [Krystyna Grzesiak](https://www.linkedin.com/in/krystyna-grzesiak-7bb290189/), [Katarzyna Sidorczuk](https://www.linkedin.com/in/katarzyna-sidorczuk/) | 30 | (M) |
19 | | [How to make your code fast - R and C++ integration using Rcpp](https://github.com/WhyR2020/workshops/tree/master/rcpp) | [Jadwiga Słowik](https://github.com/slowikj), [Dominik Rafacz](https://www.linkedin.com/in/dominik-rafacz-4592b8164/), [Mateusz Bąkała](https://www.facebook.com/matibakala) | 30 | (M) |
20 | | [Reproducible data analysis with `drake`](https://github.com/WhyR2020/workshops/tree/master/drake) | [Jakub Kwiecień](https://www.linkedin.com/in/jakub-kwiecien-097797120/) | 30 | (M) |
21 | | | | | |
22 | | Recruiting Panel [Lessons learned from 500+ interviews for data science
jobs](https://github.com/WhyR2020/abstracts/tree/master/panel) | [McKinsey & Company](https://www.mckinsey.com/pl/careers/careers-in-poland) | 1000 | (P) | 23 | | | | | | 24 | | **Invited** [Innovation Box (IP Box) in Poland](https://github.com/WhyR2020/workshops/tree/master/ipbox) | Natalia Wojciechowska-Chałupińska & Grzegorz Leśniewski from [Leśniewski Borkiewicz & Partners](https://lbplegal.com/) | 50 | (A - early) | 25 | | **Invited** [Legal basics for data scientists](https://github.com/WhyR2020/workshops/tree/master/legal) | Urszula Ilnicka-Karaban & Grzegorz Leśniewski from [Leśniewski Borkiewicz & Partners](https://lbplegal.com/) | 50 | (A - late) | 26 | | **Invited** [Advanced Shiny](https://github.com/WhyR2020/workshops/tree/master/ashiny) | [Colin Fay](https://colinfay.me/) from [ThinkR](https://thinkr.fr/) | 30 | (A) | 27 | | **Highlighted** [Causal machine learning in practice](https://github.com/WhyR2020/workshops/tree/master/casual) | Mateusz Zawisza [McKinsey & Company](https://www.mckinsey.com/pl/careers/careers-in-poland) | 40 | (A) | 28 | | [Creating R Subroutines with Fortran and OpenMP Tools](https://github.com/WhyR2020/workshops/tree/master/openmp) | [Erin Hodgess](https://www.researchgate.net/profile/Erin_Hodgess) | 30 | (A) | 29 | | [Satellite imagery analysis in R](https://github.com/WhyR2020/workshops/tree/master/satellite) | [Ewa Grabska](https://denali.geo.uj.edu.pl/project/rs4for/index.php/pl/ewa-grabska-2/) | 40 | (A) | 30 | 31 | 32 | 33 | 34 | -------------------------------------------------------------------------------- /ashiny/README.md: -------------------------------------------------------------------------------- 1 | # Good Practice for {shiny} in production with {golem} 2 | 3 | ## Abstract 4 | 5 | Organizing large `{shiny}` applications is a complex task, and can quickly become a daunting enterprise if the whole project wasn't built on solid grounds. 
6 | But what if we could create robust `{shiny}` applications from the very beginning?
7 | Search no more, that is what the `{golem}` package has been designed for: building robust, production-grade applications from the very beginning.
8 |
9 | During this workshop, Colin will give an overview of the best practices for creating `{shiny}` applications using `{golem}`: how to organize your project, and how to use the various tools made available by `{golem}` to make sure your application will be successful in the long run.
10 |
11 | ## Intended audience
12 |
13 | `{shiny}` developers with little to no experience with `{golem}` and with basic knowledge of package development.
14 |
--------------------------------------------------------------------------------
/bayesopt/README.md:
--------------------------------------------------------------------------------
1 | # Bayesian Optimization for Hyperparameter Tuning in Machine Learning
2 |
3 | Authors: Jakob Richter
4 |
5 | ## Description
6 |
7 | In this workshop you will learn how to use Bayesian optimization (also known as model-based optimization) to tune the performance of your machine learning model.
8 | We will cover basic aspects of hyperparameter tuning such as nested resampling as well as the theoretical foundations of Bayesian optimization.
9 | This tutorial will also cover basic concepts of parallelization and handling of complex search spaces within the Bayesian optimization framework.
10 | The examples in this workshop will use the R packages `mlr3`, `mlr3tuning` and `mlr3mbo`.
11 | Note that `mlr3mbo` is currently under development, so the concrete code examples in this workshop will likely change after the workshop.
12 |
13 |
14 | ## Requirements
15 |
16 | * basic knowledge of R and machine learning
17 | * installed RStudio and packages `mlr3`, `mlr3tuning`, `mlr3learners`.
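The nested resampling idea mentioned in the description can be sketched in base R. This is only a minimal illustration and not the `mlr3` or `mlr3mbo` API: the toy data, the helper names (`make_folds`, `mse`, `tune_degree`) and the choice of polynomial degree as the hyperparameter are all assumptions made for this example, and a plain grid search stands in for Bayesian optimization in the inner loop.

```r
# Nested resampling sketch in base R (illustrative; not the mlr3 API).
set.seed(1)

# Toy regression data (assumed): y follows a cubic trend in x plus noise.
n <- 120
x <- runif(n, -2, 2)
toy <- data.frame(x = x, y = x^3 - x + rnorm(n, sd = 0.5))

# Hyperparameter to tune: the polynomial degree of a linear model.
degrees <- 1:6

# Mean squared error of a degree-`d` polynomial fit on `train`, scored on `test`.
mse <- function(d, train, test) {
  fit <- lm(y ~ poly(x, d), data = train)
  mean((test$y - predict(fit, newdata = test))^2)
}

# Assign each row to one of `k` folds at random.
make_folds <- function(n_rows, k) sample(rep(seq_len(k), length.out = n_rows))

# Inner loop: pick the degree with the lowest cross-validated error on `train`.
# (Grid search here; a Bayesian optimizer would propose degrees sequentially.)
tune_degree <- function(train, k = 3) {
  folds <- make_folds(nrow(train), k)
  cv_err <- sapply(degrees, function(d) {
    mean(sapply(seq_len(k), function(i) {
      mse(d, train[folds != i, ], train[folds == i, ])
    }))
  })
  degrees[which.min(cv_err)]
}

# Outer loop: tune on the outer training part only, then score on the held-out fold.
outer_k <- 4
outer_folds <- make_folds(nrow(toy), outer_k)
outer_results <- do.call(rbind, lapply(seq_len(outer_k), function(i) {
  train <- toy[outer_folds != i, ]
  test <- toy[outer_folds == i, ]
  best <- tune_degree(train)
  data.frame(fold = i, degree = best, mse = mse(best, train, test))
}))

print(outer_results)
```

Because tuning only ever sees the outer training part, the `mse` column estimates the error of the whole tune-then-fit procedure rather than of one degree chosen with knowledge of the test fold.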
-------------------------------------------------------------------------------- /bshiny/0_app.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | library(ggplot2) 3 | 4 | ui <- fluidPage(titlePanel("Hello"), 5 | sidebarPanel(h3("We can put things here!")), 6 | mainPanel(plotOutput("example_plot"))) 7 | 8 | server <- function(input, output){ 9 | 10 | output[["example_plot"]] <- renderPlot({ 11 | 12 | ggplot(data.frame(x = 1:10, y = 1:10), aes(x, y)) + 13 | geom_point() + 14 | labs(title = "our example plot") 15 | 16 | 17 | }) 18 | } 19 | 20 | shinyApp(ui = ui, server = server) -------------------------------------------------------------------------------- /bshiny/1_1_app.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | library(ggplot2) 3 | 4 | ui <- fluidPage( 5 | br(), 6 | sidebarPanel(splitLayout(h1("New Zealand"), 7 | img(src = "new-zealand-flag.jpg", height = 100)), 8 | h3("New Zealand (Māori: Aotearoa [aɔˈtɛaɾɔa]) is an island country in the southwestern Pacific Ocean. It consists of two main landmasses—the North Island (Te Ika-a-Māui) and the South Island (Te Waipounamu)—and around 600 smaller islands, covering a total area of 268,021 square kilometres (103,500 sq mi). New Zealand is about 2,000 kilometres (1,200 mi) east of Australia across the Tasman Sea and 1,000 kilometres (600 mi) south of the islands of New Caledonia, Fiji, and Tonga. The country's varied topography and sharp mountain peaks, including the Southern Alps, owe much to tectonic uplift and volcanic eruptions. New Zealand's capital city is Wellington, and its most populous city is Auckland. 
9 | "), 10 | h5("Source: https://en.wikipedia.org/wiki/New_Zealand")), 11 | mainPanel(plotOutput("nz_increase_plot"), 12 | h5("Data source: https://www.stats.govt.nz/large-datasets/csv-files-for-download/")) 13 | ) 14 | 15 | server <- function(input, output){ 16 | 17 | dat <- read.csv("materials/bd-dec19-births-deaths-natural-increase.csv") 18 | colnames(dat) <- c("Year", "Type", "Count") 19 | 20 | output[["nz_increase_plot"]] <- renderPlot({ 21 | 22 | ggplot(dat, aes(x = Year, y = Count, color = Type)) + 23 | geom_point() + 24 | labs(title = "The natural increase in New Zealand", 25 | x = "Year", 26 | y = "Count") 27 | }) 28 | 29 | } 30 | 31 | shinyApp(ui = ui, server = server) -------------------------------------------------------------------------------- /bshiny/1_2_app.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | library(ggplot2) 3 | library(rmarkdown) 4 | 5 | ui <- fluidPage( 6 | tabsetPanel( 7 | tabPanel("Start", 8 | includeMarkdown("materials/info2.md") 9 | ), 10 | tabPanel("Info", 11 | br(), 12 | sidebarPanel(splitLayout(h1("New Zealand"), 13 | img(src = "new-zealand-flag.jpg", height = 100)), 14 | h3("New Zealand (Māori: Aotearoa [aɔˈtɛaɾɔa]) is an island country in the southwestern Pacific Ocean. It consists of two main landmasses—the North Island (Te Ika-a-Māui) and the South Island (Te Waipounamu)—and around 600 smaller islands, covering a total area of 268,021 square kilometres (103,500 sq mi). New Zealand is about 2,000 kilometres (1,200 mi) east of Australia across the Tasman Sea and 1,000 kilometres (600 mi) south of the islands of New Caledonia, Fiji, and Tonga. The country's varied topography and sharp mountain peaks, including the Southern Alps, owe much to tectonic uplift and volcanic eruptions. New Zealand's capital city is Wellington, and its most populous city is Auckland. 
15 | "), 16 | h5("Source: https://en.wikipedia.org/wiki/New_Zealand")), 17 | mainPanel(plotOutput("nz_increase_plot"), 18 | h5("Data source: https://www.stats.govt.nz/large-datasets/csv-files-for-download/")) 19 | 20 | ) 21 | 22 | ) 23 | 24 | ) 25 | 26 | server <- function(input, output){ 27 | 28 | dat <- read.csv("materials/bd-dec19-births-deaths-natural-increase.csv") 29 | colnames(dat) <- c("Year", "Type", "Count") 30 | 31 | output[["nz_increase_plot"]] <- renderPlot({ 32 | 33 | ggplot(dat, aes(x = Year, y = Count, color = Type)) + 34 | geom_point() + 35 | labs(title = "The natural increase in New Zealand", 36 | x = "Year", 37 | y = "Count") 38 | }) 39 | 40 | } 41 | 42 | shinyApp(ui = ui, server = server) 43 | -------------------------------------------------------------------------------- /bshiny/1_3_app.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | library(ggplot2) 3 | library(DT) 4 | library(rmarkdown) 5 | 6 | ui <- fluidPage( 7 | tabsetPanel( 8 | tabPanel("Start", 9 | includeMarkdown("materials/info2.md") 10 | ), 11 | tabPanel("Info", 12 | br(), 13 | sidebarPanel(splitLayout(h1("New Zealand"), 14 | img(src = "new-zealand-flag.jpg", height = 100)), 15 | h3("New Zealand (Māori: Aotearoa [aɔˈtɛaɾɔa]) is an island country in the southwestern Pacific Ocean. It consists of two main landmasses—the North Island (Te Ika-a-Māui) and the South Island (Te Waipounamu)—and around 600 smaller islands, covering a total area of 268,021 square kilometres (103,500 sq mi). New Zealand is about 2,000 kilometres (1,200 mi) east of Australia across the Tasman Sea and 1,000 kilometres (600 mi) south of the islands of New Caledonia, Fiji, and Tonga. The country's varied topography and sharp mountain peaks, including the Southern Alps, owe much to tectonic uplift and volcanic eruptions. New Zealand's capital city is Wellington, and its most populous city is Auckland. 
16 | "),
17 |                             h5("Source: https://en.wikipedia.org/wiki/New_Zealand")),
18 |                mainPanel(plotOutput("nz_increase_plot"),
19 |                          h5("Data source: https://www.stats.govt.nz/large-datasets/csv-files-for-download/"))
20 |
21 |     ),
22 |     tabPanel("Data",
23 |              DT::dataTableOutput("table_with_data"))
24 |
25 |   )
26 |
27 | )
28 |
29 | server <- function(input, output){
30 |
31 |   dat <- read.csv("materials/bd-dec19-births-deaths-natural-increase.csv")
32 |   colnames(dat) <- c("Year", "Type", "Count")
33 |
34 |   output[["nz_increase_plot"]] <- renderPlot({
35 |
36 |     ggplot(dat, aes(x = Year, y = Count, color = Type)) +
37 |       geom_point() +
38 |       labs(title = "The natural increase in New Zealand",
39 |            x = "Year",
40 |            y = "Count")
41 |   })
42 |
43 |   output[["table_with_data"]] <- DT::renderDataTable({
44 |
45 |     dat
46 |
47 |   })
48 |
49 | }
50 |
51 | shinyApp(ui = ui, server = server)
52 |
--------------------------------------------------------------------------------
/bshiny/2_1_app.R:
--------------------------------------------------------------------------------
1 | library(shiny)
2 | library(ggplot2)
3 |
4 | ui <- fluidPage(
5 |   br(),
6 |   sidebarPanel("Let's see the plot!"),
7 |   mainPanel(plotOutput("nz_increase_plot"))
8 | )
9 |
10 | server <- function(input, output){
11 |
12 |   dat <- read.csv("materials/bd-dec19-births-deaths-natural-increase.csv")
13 |   colnames(dat) <- c("Year", "Type", "Count")
14 |
15 |   output[["nz_increase_plot"]] <- renderPlot({
16 |
17 |     ggplot(dat, aes(x = Year, y = Count, color = Type)) +
18 |       geom_point() +
19 |       labs(title = "The natural increase in New Zealand",
20 |            x = "Year",
21 |            y = "Count")
22 |
23 |   })
24 |
25 | }
26 |
27 | shinyApp(ui = ui, server = server)
--------------------------------------------------------------------------------
/bshiny/2_2_app.R:
--------------------------------------------------------------------------------
1 | library(shiny)
2 | library(ggplot2)
3 |
4 | ui <- fluidPage(
5 |   br(),
6 |   sidebarPanel("Let's see our plot!",
7 |                sliderInput(inputId = "x_axis_range",
8 |                            label = "Adjust time range",
9 |                            min = 2000, max = 2019,
10 |                            step = 1, value = c(2000, 2019))),
11 |   mainPanel(plotOutput("nz_increase_plot"))
12 | )
13 |
14 | server <- function(input, output){
15 |
16 |   dat <- read.csv("materials/bd-dec19-births-deaths-natural-increase.csv")
17 |   colnames(dat) <- c("Year", "Type", "Count")
18 |
19 |   output[["nz_increase_plot"]] <- renderPlot({
20 |
21 |     ggplot(dat, aes(x = Year, y = Count, color = Type)) +
22 |       geom_point() +
23 |       labs(title = "The natural increase in New Zealand",
24 |            x = "Year",
25 |            y = "Count") +
26 |       coord_cartesian(xlim = c(input[["x_axis_range"]][[1]], input[["x_axis_range"]][[2]]))
27 |
28 |   })
29 |
30 | }
31 |
32 | shinyApp(ui = ui, server = server)
33 |
--------------------------------------------------------------------------------
/bshiny/2_3_app.R:
--------------------------------------------------------------------------------
1 | library(shiny)
2 | library(ggplot2)
3 |
4 | ui <- fluidPage(
5 |   br(),
6 |   sidebarPanel("Let's see our plot!",
7 |                checkboxGroupInput(inputId = "type_choices",
8 |                                   label = "Choices:",
9 |                                   choices = c("Births", "Deaths", "Natural_Increase"),
10 |                                   selected = c("Births", "Deaths", "Natural_Increase")),
11 |                sliderInput(inputId = "x_axis_range",
12 |                            label = "Adjust time range",
13 |                            min = 2000, max = 2019,
14 |                            step = 1, value = c(2000, 2019))),
15 |   mainPanel(plotOutput("nz_increase_plot"))
16 | )
17 |
18 | server <- function(input, output){
19 |
20 |   dat <- read.csv("materials/bd-dec19-births-deaths-natural-increase.csv")
21 |   colnames(dat) <- c("Year", "Type", "Count")
22 |
23 |   output[["nz_increase_plot"]] <- renderPlot({
24 |
25 |     validate(need(input[["type_choices"]], "Please, select something on the left"))
26 |
27 |     dat_tmp <- dat[dat[["Type"]] %in% input[["type_choices"]] , ]
28 |
29 |     ggplot(dat_tmp, aes(x = Year, y = Count, color = Type)) +
30 |       geom_point() +
31 |       labs(title = "The natural increase in New Zealand",
32 | x = "Year", 33 | y = "Count") + 34 | coord_cartesian(xlim = c(input[["x_axis_range"]][[1]], input[["x_axis_range"]][[2]])) 35 | 36 | }) 37 | 38 | } 39 | 40 | shinyApp(ui = ui, server = server) 41 | -------------------------------------------------------------------------------- /bshiny/2_4_app.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | library(ggplot2) 3 | 4 | ui <- fluidPage( 5 | br(), 6 | sidebarPanel("Let's see the plot!", 7 | checkboxGroupInput(inputId = "type_choices", 8 | label = "Choices:", 9 | choices = c("Births", "Deaths", "Natural_Increase"), 10 | selected = c("Births", "Deaths", "Natural_Increase")), 11 | sliderInput(inputId = "x_axis_range", 12 | label = "Adjust time range", 13 | min = 2000, max = 2019, 14 | step = 1, value = c(2000, 2019))), 15 | mainPanel(plotOutput("nz_increase_plot")) 16 | ) 17 | 18 | server <- function(input, output){ 19 | 20 | dat <- reactive({ 21 | 22 | dat_tmp <- read.csv("materials/bd-dec19-births-deaths-natural-increase.csv") 23 | colnames(dat_tmp) <- c("Year", "Type", "Count") 24 | dat_tmp 25 | 26 | }) 27 | 28 | dat_tmp <- reactive({ 29 | 30 | validate(need(input[["type_choices"]], "Please select something on the left!")) 31 | 32 | dat_tmp <- dat()[dat()[["Type"]] %in% input[["type_choices"]], ] 33 | 34 | }) 35 | 36 | 37 | output[["nz_increase_plot"]] <- renderPlot({ 38 | 39 | ggplot(dat_tmp(), aes(x = Year, y = Count, color = Type)) + 40 | geom_point() + 41 | labs(title = "The natural increase in New Zealand", 42 | x = "Year", 43 | y = "Count") + 44 | coord_cartesian(xlim = c(input[["x_axis_range"]][[1]], input[["x_axis_range"]][[2]] )) 45 | 46 | }) 47 | 48 | } 49 | 50 | shinyApp(ui = ui, server = server) -------------------------------------------------------------------------------- /bshiny/2_5_app.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | library(ggplot2) 3 | library(shinythemes) 4 | 5 | 
ui <- fluidPage(theme = shinytheme("darkly"), 6 | br(), 7 | sidebarPanel("Let's see the plot!", 8 | checkboxGroupInput(inputId = "type_choices", 9 | label = "Choices:", 10 | choices = c("Births", "Deaths", "Natural_Increase"), 11 | selected = c("Births", "Deaths", "Natural_Increase")), 12 | actionButton(inputId = "apply_changes", 13 | label = "Filter!"), 14 | sliderInput(inputId = "x_axis_range", 15 | label = "Adjust time range", 16 | min = 2000, max = 2019, 17 | step = 1, value = c(2000, 2019)), 18 | sliderInput(inputId = "y_axis_range", 19 | label = "Adjust value range", 20 | min = 24795, max = 64341, 21 | value = c(24795, 64341))), 22 | mainPanel(plotOutput("nz_increase_plot")) 23 | ) 24 | 25 | server <- function(input, output, session){ 26 | 27 | dat <- reactive({ 28 | 29 | dat_tmp <- read.csv("materials/bd-dec19-births-deaths-natural-increase.csv") 30 | colnames(dat_tmp) <- c("Year", "Type", "Count") 31 | dat_tmp 32 | 33 | }) 34 | 35 | dat_tmp <- eventReactive(input[["apply_changes"]], { 36 | 37 | validate(need(input[["type_choices"]], "Please select something on the left!")) 38 | 39 | dat_tmp <- dat()[dat()[["Type"]] %in% input[["type_choices"]], ] 40 | 41 | }) 42 | 43 | observe({ 44 | 45 | updateSliderInput(session, 46 | inputId = "y_axis_range", 47 | min = min(dat_tmp()[["Count"]]), 48 | max = max(dat_tmp()[["Count"]]), 49 | value = c(min(dat_tmp()[["Count"]]), max(dat_tmp()[["Count"]]))) 50 | 51 | }) 52 | 53 | output[["nz_increase_plot"]] <- renderPlot({ 54 | 55 | ggplot(dat_tmp(), aes(x = Year, y = Count, color = Type)) + 56 | geom_point() + 57 | labs(title = "The natural increase in New Zealand", 58 | x = "Year", 59 | y = "Count") + 60 | coord_cartesian(xlim = c(input[["x_axis_range"]][[1]], input[["x_axis_range"]][[2]] ), 61 | ylim = c(input[["y_axis_range"]][[1]], input[["y_axis_range"]][[2]] )) 62 | 63 | }) 64 | 65 | } 66 | 67 | shinyApp(ui = ui, server = server) -------------------------------------------------------------------------------- 
/bshiny/3_1_app.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | library(ggplot2) 3 | 4 | ui <- fluidPage( 5 | sidebarPanel(h4("Welcome! I'll take care of your file."), 6 | fileInput(inputId = "user_file", 7 | label = "Upload .csv file", 8 | accept = c(".csv"))), 9 | mainPanel(tabsetPanel(tabPanel("Data", 10 | tableOutput("file_data")), 11 | tabPanel("Plot"))) 12 | ) 13 | 14 | server <- function(input, output){ 15 | 16 | output[["file_data"]] <- renderTable({ 17 | 18 | validate(need(input[["user_file"]], "Please upload a file!")) 19 | 20 | read.csv(input[["user_file"]][["datapath"]]) 21 | 22 | }) 23 | } 24 | 25 | shinyApp(ui, server) -------------------------------------------------------------------------------- /bshiny/3_2_app.R: -------------------------------------------------------------------------------- 1 | # 1. Load a file and show it 2 | # 2. Add some options for the plot 3 | 4 | library(shiny) 5 | library(ggplot2) 6 | 7 | ui <- fluidPage( 8 | sidebarPanel(h4("Welcome! 
I'll take care of your file."), 9 | fileInput(inputId = "user_file", 10 | label = "Upload .csv file", 11 | accept = c(".csv")), 12 | h4("Plot settings"), 13 | selectInput(inputId = "data_x", 14 | label = "Choose x:", 15 | choices = c("x", "y")), 16 | selectInput(inputId = "data_y", 17 | label = "Choose y:", 18 | choices = c("y", "x")), 19 | actionButton(inputId = "generate_plot", 20 | label = "Plot it!") 21 | ), 22 | mainPanel(tabsetPanel(tabPanel("Plot", 23 | plotOutput("file_data_plot")), 24 | tabPanel("Data", 25 | tableOutput("file_data")) 26 | )) 27 | ) 28 | 29 | server <- function(input, output, session){ 30 | 31 | dat <- reactive({ 32 | 33 | validate(need(input[["user_file"]], "Please upload a file!")) 34 | 35 | read.csv(input[["user_file"]][["datapath"]]) 36 | 37 | 38 | }) 39 | 40 | output[["file_data"]] <- renderTable({ 41 | 42 | dat() 43 | 44 | }) 45 | 46 | observe({ 47 | 48 | dat_cols <- names(dat()) 49 | 50 | updateSelectInput(session, 51 | inputId = "data_x", 52 | choices = dat_cols, 53 | selected = dat_cols[[1]]) 54 | 55 | updateSelectInput(session, 56 | inputId = "data_y", 57 | choices = dat_cols, 58 | selected = dat_cols[[2]]) 59 | }) 60 | 61 | plot_out <- eventReactive(input[["generate_plot"]], { 62 | 63 | ggplot(dat(), aes(x = dat()[[input[["data_x"]]]], y = dat()[[input[["data_y"]]]])) + 64 | geom_point() + 65 | labs(x = input[["data_x"]], 66 | y = input[["data_y"]]) 67 | 68 | }) 69 | 70 | output[["file_data_plot"]] <- renderPlot({ 71 | 72 | plot_out() 73 | 74 | }) 75 | } 76 | 77 | shinyApp(ui, server) -------------------------------------------------------------------------------- /bshiny/4_1_app.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | library(ggplot2) 3 | 4 | ui <- fluidPage( 5 | br(), 6 | sidebarPanel(verbatimTextOutput("nz_info")), 7 | mainPanel(plotOutput("nz_increase_plot", click = "nz_plot_click")) 8 | ) 9 | 10 | server <- function(input, output){ 11 | 12 | dat <- 
read.csv("materials/bd-dec19-births-deaths-natural-increase.csv")
13 |   colnames(dat) <- c("Year", "Type", "Count")
14 |
15 |   output[["nz_increase_plot"]] <- renderPlot(
16 |
17 |     ggplot(dat, aes(x = Year, y = Count, color = Type)) +
18 |       geom_point() +
19 |       labs(title = "The natural increase in New Zealand",
20 |            x = "Year",
21 |            y = "Count")
22 |
23 |   )
24 |
25 |   output[["nz_info"]] <- renderPrint({
26 |
27 |     input[["nz_plot_click"]]
28 |
29 |   })
30 |
31 | }
32 |
33 | shinyApp(ui = ui, server = server)
34 |
--------------------------------------------------------------------------------
/bshiny/4_2_app.R:
--------------------------------------------------------------------------------
1 | library(shiny)
2 | library(ggplot2)
3 |
4 | ui <- fluidPage(
5 |   br(),
6 |   sidebarPanel(verbatimTextOutput("nz_info")),
7 |   mainPanel(plotOutput("nz_increase_plot", click = "nz_plot_click"))
8 | )
9 |
10 | server <- function(input, output){
11 |
12 |   dat <- read.csv("materials/bd-dec19-births-deaths-natural-increase.csv")
13 |   colnames(dat) <- c("Year", "Type", "Count")
14 |
15 |   output[["nz_increase_plot"]] <- renderPlot(
16 |
17 |     ggplot(dat, aes(x = Year, y = Count, color = Type)) +
18 |       geom_point() +
19 |       labs(title = "The natural increase in New Zealand",
20 |            x = "Year",
21 |            y = "Count")
22 |
23 |   )
24 |
25 |   rv <- reactiveValues(clicked_point = c())
26 |
27 |   observeEvent(input[["nz_plot_click"]], {
28 |
29 |     rv[["clicked_point"]] <- nearPoints(df = dat,
30 |                                         coordinfo = input[["nz_plot_click"]],
31 |                                         maxpoints = 1,
32 |                                         threshold = 10,
33 |                                         allRows = TRUE)
34 |
35 |   })
36 |
37 |   output[["nz_info"]] <- renderPrint({
38 |
39 |     rv[["clicked_point"]]
40 |
41 |   })
42 |
43 | }
44 |
45 | shinyApp(ui = ui, server = server)
46 |
--------------------------------------------------------------------------------
/bshiny/4_3_app.R:
--------------------------------------------------------------------------------
1 | library(shiny)
2 | library(ggplot2)
3 |
4 | ui <- fluidPage(
5 |   br(),
6 |   sidebarPanel(verbatimTextOutput("nz_info")),
7 |   mainPanel(plotOutput("nz_increase_plot", click = "nz_plot_click"))
8 | )
9 |
10 | server <- function(input, output){
11 |
12 |
13 |   dat <- reactive({
14 |
15 |     tmp <- read.csv("materials/bd-dec19-births-deaths-natural-increase.csv")
16 |     colnames(tmp) <- c("Year", "Type", "Count")
17 |     tmp
18 |
19 |   })
20 |
21 |   output[["nz_increase_plot"]] <- renderPlot({
22 |
23 |     dat_tmp <- dat()
24 |     dat_tmp[["selected"]] <- FALSE
25 |     if(length(rv[["clicked_points"]]) > 0)
26 |       dat_tmp[which(rv[["clicked_points"]][["selected_"]]), "selected"] <- TRUE
27 |
28 |     ggplot(dat_tmp, aes(x = Year, y = Count, color = Type)) +
29 |       geom_point(aes(size = selected)) +
30 |       labs(title = "The natural increase in New Zealand",
31 |            x = "Year",
32 |            y = "Count")
33 |
34 |   })
35 |
36 |   output[["nz_info"]] <- renderPrint({
37 |
38 |     tmp <- rv[["clicked_points"]][ rv[["clicked_points"]][["selected_"]] , ]
39 |
40 |     if(is.null(tmp) || nrow(tmp) == 0){  # nothing clicked yet, or no point near the click
41 |       "Please, click on some point!"
42 |     } else {
43 |       paste0("In year ", tmp[["Year"]], " there were ", tmp[["Count"]], " cases of ", tmp[["Type"]], " in New Zealand.")
44 |     }
45 |
46 |
47 |   })
48 |
49 |   rv <- reactiveValues(clicked_points = c())
50 |
51 |   observeEvent(input[["nz_plot_click"]], {
52 |
53 |     rv[["clicked_points"]] <- nearPoints(dat(),
54 |                                          coordinfo = input[["nz_plot_click"]],
55 |                                          maxpoints = 1,
56 |                                          threshold = 10,
57 |                                          allRows = TRUE)
58 |
59 |   })
60 |
61 | }
62 |
63 |
64 | shinyApp(ui = ui, server = server)
65 |
--------------------------------------------------------------------------------
/bshiny/README.md:
--------------------------------------------------------------------------------
1 | # Basics of Shiny
2 |
3 | Authors: Weronika Puchała, Krystyna Grzesiak, Katarzyna Sidorczuk
4 |
5 | ## About the workshop
6 |
7 | The presentation and visualization of results is a critical aspect of data science. No matter how advanced the analysis is, it has little value if one cannot see what the data tells.
An effective way of presenting a data analysis is a reactive Shiny application, where one can manipulate the parameters and see how they affect the result. If you do not know how to do that but would like to learn, this is the workshop for you!
8 |
9 | In this workshop, you will learn how to build simple applications using the shiny package. We will start by composing a static application and gradually add some aspects of reactivity, for both tabular and graphical data. By the end, you will have a basic knowledge of the shiny package and be ready to explore on your own.
10 |
11 | ## About the participants
12 |
13 | * basic knowledge of R
14 | * no prior knowledge of Shiny
15 |
16 | ## Before the workshop
17 |
18 | * make sure your RStudio is working
19 | * install packages:
20 |   * crucial: `shiny`, `ggplot2`
21 |   * additional: `markdown`, `rmarkdown`, `DT`, `shinythemes`
22 | * download the materials from this repository (/materials and /www)
--------------------------------------------------------------------------------
/bshiny/materials/bd-dec19-births-by-mothers-age.csv:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhyR2020/workshops/5531a8b5180bce4a8c4adeda8b34f6c4554401ef/bshiny/materials/bd-dec19-births-by-mothers-age.csv
--------------------------------------------------------------------------------
/bshiny/materials/bd-dec19-births-deaths-natural-increase.csv:
--------------------------------------------------------------------------------
1 | Period,Births_Deaths_or_Natural_Increase,Count
2 | 2000,Births,56604
3 | 2001,Births,55800
4 | 2002,Births,54021
5 | 2003,Births,56136
6 | 2004,Births,58074
7 | 2005,Births,57744
8 | 2006,Births,59193
9 | 2007,Births,64044
10 | 2008,Births,64341
11 | 2009,Births,62541
12 | 2010,Births,63897
13 | 2011,Births,61404
14 | 2012,Births,61179
15 | 2013,Births,58719
16 | 2014,Births,57243
17 | 2015,Births,61038
18 | 2016,Births,59430
19 | 2017,Births,59610
20 | 2018,Births,58020
21 |
2019,Births,59637 22 | 2000,Deaths,26658 23 | 2001,Deaths,27825 24 | 2002,Deaths,28065 25 | 2003,Deaths,28011 26 | 2004,Deaths,28419 27 | 2005,Deaths,27033 28 | 2006,Deaths,28245 29 | 2007,Deaths,28521 30 | 2008,Deaths,29187 31 | 2009,Deaths,28965 32 | 2010,Deaths,28437 33 | 2011,Deaths,30081 34 | 2012,Deaths,30099 35 | 2013,Deaths,29568 36 | 2014,Deaths,31062 37 | 2015,Deaths,31608 38 | 2016,Deaths,31179 39 | 2017,Deaths,33339 40 | 2018,Deaths,33225 41 | 2019,Deaths,34260 42 | 2000,Natural_Increase,29943 43 | 2001,Natural_Increase,27972 44 | 2002,Natural_Increase,25956 45 | 2003,Natural_Increase,28125 46 | 2004,Natural_Increase,29655 47 | 2005,Natural_Increase,30711 48 | 2006,Natural_Increase,30948 49 | 2007,Natural_Increase,35520 50 | 2008,Natural_Increase,35154 51 | 2009,Natural_Increase,33579 52 | 2010,Natural_Increase,35457 53 | 2011,Natural_Increase,31320 54 | 2012,Natural_Increase,31080 55 | 2013,Natural_Increase,29148 56 | 2014,Natural_Increase,26181 57 | 2015,Natural_Increase,29430 58 | 2016,Natural_Increase,28251 59 | 2017,Natural_Increase,26268 60 | 2018,Natural_Increase,24795 61 | 2019,Natural_Increase,25377 62 | -------------------------------------------------------------------------------- /bshiny/materials/bd-dec19-deaths-by-sex-and-age.csv: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WhyR2020/workshops/5531a8b5180bce4a8c4adeda8b34f6c4554401ef/bshiny/materials/bd-dec19-deaths-by-sex-and-age.csv -------------------------------------------------------------------------------- /bshiny/materials/info.txt: -------------------------------------------------------------------------------- 1 | New Zealand (Māori: Aotearoa [aɔˈtɛaɾɔa]) is an island country in the southwestern Pacific Ocean. 
It consists of two main landmasses—the North Island (Te Ika-a-Māui) and the South Island (Te Waipounamu)—and around 600 smaller islands, covering a total area of 268,021 square kilometres (103,500 sq mi). New Zealand is about 2,000 kilometres (1,200 mi) east of Australia across the Tasman Sea and 1,000 kilometres (600 mi) south of the islands of New Caledonia, Fiji, and Tonga. The country's varied topography and sharp mountain peaks, including the Southern Alps, owe much to tectonic uplift and volcanic eruptions. New Zealand's capital city is Wellington, and its most populous city is Auckland. 2 | 3 | Owing to their remoteness, the islands of New Zealand were the last large habitable lands to be settled by humans. Between about 1280 and 1350, Polynesians began to settle in the islands, and then developed a distinctive Māori culture. In 1642, Dutch explorer Abel Tasman became the first European to sight New Zealand. In 1840, representatives of the United Kingdom and Māori chiefs signed the Treaty of Waitangi, which declared British sovereignty over the islands. In 1841, New Zealand became a colony within the British Empire and in 1907 it became a dominion; it gained full statutory independence in 1947 and the British monarch remained the head of state. Today, the majority of New Zealand's population of 5 million is of European descent; the indigenous Māori are the largest minority, followed by Asians and Pacific Islanders. Reflecting this, New Zealand's culture is mainly derived from Māori and early British settlers, with recent broadening arising from increased immigration. The official languages are English, Māori, and New Zealand Sign Language, with English being very dominant. 4 | 5 | A developed country, New Zealand ranks highly in international comparisons of national performance, such as quality of life, education, protection of civil liberties, government transparency, and economic freedom. 
New Zealand underwent major economic changes during the 1980s, which transformed it from a protectionist to a liberalised free-trade economy. The service sector dominates the national economy, followed by the industrial sector, and agriculture; international tourism is a significant source of revenue. Nationally, legislative authority is vested in an elected, unicameral Parliament, while executive political power is exercised by the Cabinet, led by the prime minister, currently Jacinda Ardern. Queen Elizabeth II is the country's monarch and is represented by a governor-general, currently Dame Patsy Reddy. In addition, New Zealand is organised into 11 regional councils and 67 territorial authorities for local government purposes. The Realm of New Zealand also includes Tokelau (a dependent territory); the Cook Islands and Niue (self-governing states in free association with New Zealand); and the Ross Dependency, which is New Zealand's territorial claim in Antarctica. 6 | 7 | New Zealand is a member of the United Nations, Commonwealth of Nations, ANZUS, Organisation for Economic Co-operation and Development, ASEAN Plus Six, Asia-Pacific Economic Cooperation, the Pacific Community and the Pacific Islands Forum. -------------------------------------------------------------------------------- /bshiny/materials/info2.md: -------------------------------------------------------------------------------- 1 | # New Zealand 2 | 3 | ![Flag of New Zealand](../www/new-zealand-flag.jpg) 4 | 5 | New Zealand (Māori: Aotearoa [aɔˈtɛaɾɔa]) is an island country in the southwestern Pacific Ocean. It consists of two main landmasses—the North Island (Te Ika-a-Māui) and the South Island (Te Waipounamu)—and around 600 smaller islands, covering a total area of 268,021 square kilometres (103,500 sq mi). New Zealand is about 2,000 kilometres (1,200 mi) east of Australia across the Tasman Sea and 1,000 kilometres (600 mi) south of the islands of New Caledonia, Fiji, and Tonga. 
The country's varied topography and sharp mountain peaks, including the Southern Alps, owe much to tectonic uplift and volcanic eruptions. New Zealand's capital city is Wellington, and its most populous city is Auckland. 6 | 7 | Owing to their remoteness, the islands of New Zealand were the last large habitable lands to be settled by humans. Between about 1280 and 1350, Polynesians began to settle in the islands, and then developed a distinctive Māori culture. In 1642, Dutch explorer Abel Tasman became the first European to sight New Zealand. In 1840, representatives of the United Kingdom and Māori chiefs signed the Treaty of Waitangi, which declared British sovereignty over the islands. In 1841, New Zealand became a colony within the British Empire and in 1907 it became a dominion; it gained full statutory independence in 1947 and the British monarch remained the head of state. Today, the majority of New Zealand's population of 5 million is of European descent; the indigenous Māori are the largest minority, followed by Asians and Pacific Islanders. Reflecting this, New Zealand's culture is mainly derived from Māori and early British settlers, with recent broadening arising from increased immigration. The official languages are English, Māori, and New Zealand Sign Language, with English being very dominant. 8 | 9 | A developed country, New Zealand ranks highly in international comparisons of national performance, such as quality of life, education, protection of civil liberties, government transparency, and economic freedom. New Zealand underwent major economic changes during the 1980s, which transformed it from a protectionist to a liberalised free-trade economy. The service sector dominates the national economy, followed by the industrial sector, and agriculture; international tourism is a significant source of revenue. 
Nationally, legislative authority is vested in an elected, unicameral Parliament, while executive political power is exercised by the Cabinet, led by the prime minister, currently Jacinda Ardern. Queen Elizabeth II is the country's monarch and is represented by a governor-general, currently Dame Patsy Reddy. In addition, New Zealand is organised into 11 regional councils and 67 territorial authorities for local government purposes. The Realm of New Zealand also includes Tokelau (a dependent territory); the Cook Islands and Niue (self-governing states in free association with New Zealand); and the Ross Dependency, which is New Zealand's territorial claim in Antarctica. 10 | 11 | New Zealand is a member of the United Nations, Commonwealth of Nations, ANZUS, Organisation for Economic Co-operation and Development, ASEAN Plus Six, Asia-Pacific Economic Cooperation, the Pacific Community and the Pacific Islands Forum. 12 | 13 | ### Government and politics 14 | 15 | New Zealand is a constitutional monarchy with a parliamentary democracy,[69] although its constitution is not codified.[70] Elizabeth II is the queen of New Zealand[71] and thus the head of state.[72] The queen is represented by the governor-general, whom she appoints on the advice of the prime minister.[73] The governor-general can exercise the Crown's prerogative powers, such as reviewing cases of injustice and making appointments of ministers, ambassadors and other key public officials,[74] and in rare situations, the reserve powers (e.g. 
the power to dissolve parliament or refuse the royal assent of a bill into law).[75] The powers of the monarch and the governor-general are limited by constitutional constraints and they cannot normally be exercised without the advice of ministers.[75] 16 | 17 | The New Zealand Parliament holds legislative power and consists of the queen and the House of Representatives.[76] It also included an upper house, the Legislative Council, until this was abolished in 1950.[76] The supremacy of parliament over the Crown and other government institutions was established in England by the Bill of Rights 1689 and has been ratified as law in New Zealand.[76] The House of Representatives is democratically elected and a government is formed from the party or coalition with the majority of seats. If no majority is formed, a minority government can be formed if support from other parties during confidence and supply votes is assured.[76] The governor-general appoints ministers under advice from the prime minister, who is by convention the parliamentary leader of the governing party or coalition.[77] Cabinet, formed by ministers and led by the prime minister, is the highest policy-making body in government and responsible for deciding significant government actions.[78] Members of Cabinet make major decisions collectively, and are therefore collectively responsible for the consequences of these decisions.[79] 18 | 19 | A parliamentary general election must be called no later than three years after the previous election.[80] Almost all general elections between 1853 and 1993 were held under the first-past-the-post voting system.[81] Since the 1996 election, a form of proportional representation called mixed-member proportional (MMP) has been used.[70] Under the MMP system, each person has two votes; one is for a candidate standing in the voter's electorate and the other is for a party. 
Since the 2014 election, there have been 71 electorates (which include seven Māori electorates in which only Māori can optionally vote),[82] and the remaining 49 of the 120 seats are assigned so that representation in parliament reflects the party vote, with the threshold that a party must win at least one electorate or 5% of the total party vote before it is eligible for a seat.[83] 20 | 21 | A block of buildings fronted by a large statue. 22 | A statue of Richard Seddon, the "Beehive" (Executive Wing), and Parliament House (right), in Parliament Grounds, Wellington. 23 | Elections since the 1930s have been dominated by two political parties, National and Labour.[81] Between March 2005 and August 2006, New Zealand became the first country in the world in which all the highest offices in the land—head of state, governor-general, prime minister, speaker and chief justice—were occupied simultaneously by women.[84] The current prime minister is Jacinda Ardern, who has been in office since 26 October 2017.[85] She is the country's third female prime minister.[86] 24 | 25 | New Zealand's judiciary, headed by the chief justice,[87] includes the Supreme Court, Court of Appeal, the High Court, and subordinate courts.[88] Judges and judicial officers are appointed non-politically and under strict rules regarding tenure to help maintain judicial independence.[70] This theoretically allows the judiciary to interpret the law based solely on the legislation enacted by Parliament without other influences on their decisions.[89] 26 | 27 | New Zealand is identified as one of the world's most stable and well-governed states.[90] As at 2017, the country was ranked fourth in the strength of its democratic institutions,[91] and first in government transparency and lack of corruption.[92] A 2017 Human Rights Report by the U.S. 
Department of State noted that the government generally respected the rights of individuals, but voiced concerns regarding the social status of the Māori population.[93] New Zealand ranks highly for civic participation in the political process, with 80% voter turnout during recent elections, compared to an OECD average of 68%.[94] 28 | 29 | #### Foreign relations and military 30 | 31 | Early colonial New Zealand allowed the British Government to determine external trade and be responsible for foreign policy.[95] The 1923 and 1926 Imperial Conferences decided that New Zealand should be allowed to negotiate its own political treaties and the first commercial treaty was ratified in 1928 with Japan. On 3 September 1939 New Zealand allied itself with Britain and declared war on Germany with Prime Minister Michael Joseph Savage proclaiming, "Where she goes, we go; where she stands, we stand."[96] 32 | 33 | In 1951 the United Kingdom became increasingly focused on its European interests,[97] while New Zealand joined Australia and the United States in the ANZUS security treaty.[98] The influence of the United States on New Zealand weakened following protests over the Vietnam War,[99] the refusal of the United States to admonish France after the sinking of the Rainbow Warrior,[100] disagreements over environmental and agricultural trade issues and New Zealand's nuclear-free policy.[101][102] Despite the United States' suspension of ANZUS obligations the treaty remained in effect between New Zealand and Australia, whose foreign policy has followed a similar historical trend.[103] Close political contact is maintained between the two countries, with free trade agreements and travel arrangements that allow citizens to visit, live and work in both countries without restrictions.[104] In 2013 there were about 650,000 New Zealand citizens living in Australia, which is equivalent to 15% of the resident population of New Zealand.[105] 34 | 35 | A soldier in a green army uniform 
faces forwards 36 | Anzac Day service at the National War Memorial 37 | New Zealand has a strong presence among the Pacific Island countries. A large proportion of New Zealand's aid goes to these countries and many Pacific people migrate to New Zealand for employment.[106] Permanent migration is regulated under the 1970 Samoan Quota Scheme and the 2002 Pacific Access Category, which allow up to 1,100 Samoan nationals and up to 750 other Pacific Islanders respectively to become permanent New Zealand residents each year. A seasonal workers scheme for temporary migration was introduced in 2007 and in 2009 about 8,000 Pacific Islanders were employed under it.[107] New Zealand is involved in the Pacific Islands Forum, the Pacific Community, Asia-Pacific Economic Cooperation and the Association of Southeast Asian Nations Regional Forum (including the East Asia Summit).[104] New Zealand has been described as an emerging power.[108][109] The country is a member of the United Nations,[110] the Commonwealth of Nations[111] and the Organisation for Economic Co-operation and Development (OECD),[112] and participates in the Five Power Defence Arrangements.[113] 38 | 39 | New Zealand's military services—the Defence Force—comprise the New Zealand Army, the Royal New Zealand Air Force and the Royal New Zealand Navy.[114] New Zealand's national defence needs are modest, since a direct attack is unlikely.[115] However, its military has had a global presence. The country fought in both world wars, with notable campaigns in Gallipoli, Crete,[116] El Alamein[117] and Cassino.[118] The Gallipoli campaign played an important part in fostering New Zealand's national identity[119][120] and strengthened the ANZAC tradition it shares with Australia.[121] 40 | 41 | In addition to Vietnam and the two world wars, New Zealand fought in the Second Boer War,[122] the Korean War,[123] the Malayan Emergency,[124] the Gulf War and the Afghanistan War. 
It has contributed forces to several regional and global peacekeeping missions, such as those in Cyprus, Somalia, Bosnia and Herzegovina, the Sinai, Angola, Cambodia, the Iran–Iraq border, Bougainville, East Timor, and the Solomon Islands.[125] 42 | 43 | ### Source 44 | 45 | See more on [Wikipedia](https://en.wikipedia.org/wiki/New_Zealand) -------------------------------------------------------------------------------- /bshiny/materials/sources.txt: -------------------------------------------------------------------------------- 1 | Sources of the files: 2 | 3 | map of New Zealand: 4 | https://www.nationsonline.org/oneworld/map/new-zealand-map.htm 5 | 6 | flag of New Zealand: 7 | https://www.nationsonline.org/oneworld/new_zealand.htm 8 | 9 | info.txt: 10 | https://en.wikipedia.org/wiki/New_Zealand 11 | 12 | data: 13 | https://www.stats.govt.nz/large-datasets/csv-files-for-download/ -------------------------------------------------------------------------------- /bshiny/www/new-zealand-flag.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WhyR2020/workshops/5531a8b5180bce4a8c4adeda8b34f6c4554401ef/bshiny/www/new-zealand-flag.jpg -------------------------------------------------------------------------------- /bshiny/www/new-zealand-map.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WhyR2020/workshops/5531a8b5180bce4a8c4adeda8b34f6c4554401ef/bshiny/www/new-zealand-map.jpg -------------------------------------------------------------------------------- /casual/README.md: -------------------------------------------------------------------------------- 1 | # Causal machine learning in practice 2 | 3 | Authors: Mateusz Zawisza [McKinsey & Company](https://www.mckinsey.com/pl/careers/careers-in-poland) 4 | 5 | ### Description 6 | At McKinsey we strongly believe that machine learning and artificial intelligence (AI) aren't just 
about predicting the future: They're about shaping the future by making positive changes. 7 | 8 | To date, machine learning and AI tools have been used to predict diverse business and social phenomena – and proved themselves excellent at the job. Their predictive capability is highly valued as it allows us to prepare for future events, for example, to make inventory decisions addressing future demand. 9 | 10 | But many decision-based problems relating to issues that we face on a daily basis cannot be resolved purely by prediction. For instance, predicting the high probability of client churn doesn't mean unambiguously that a telco operator should target this client with a retention marketing campaign. Indeed, that may not be the right way to approach the clients at all. What we need is not so much a powerful predictive engine as the ability to assess the impact of our decisions. 11 | 12 | 13 | Today's data scientists need the necessary knowledge and skills to assess the impact of their business and policy decisions. This gives them full control of decision-making processes and enables them to shape the future. In the past, such methods were studied mostly within econometric and graphical models. Recently, these methods have been extended to machine-learning models that are proving to be more robust and efficient than their predecessors. 14 | 15 | The workshop introduces what is known as "causal machine learning". We then apply this approach in a team competition based on a real scenario. We end with a summary of lessons learned from the competition and key takeaways for participants. 16 | 17 | 18 | ### Agenda (3 hours) 19 | 1. Introduction to decision problems and methodology (60 minutes) 20 | 2. Team decision competition (90 minutes) 21 | 3. Summary: Scoreboard and lessons learned (30 minutes) 22 | 23 | ### Requirements 24 | 25 | - Basic understanding of machine learning and models at an undergraduate level; see, for example, James et al.
(2013) 26 | - Basic knowledge of the R programming language 27 | 28 | ### Optional extra reading 29 | - Angrist, J. D., & Pischke, J. S. (2008). Mostly harmless econometrics: An empiricist's companion. Princeton university press. 30 | - Ascarza, E. (2018). Retention futility: Targeting high-risk customers might be ineffective. Journal of Marketing Research, 55(1), 80-98. 31 | - Athey, S., Tibshirani, J., & Wager, S. (2019). Generalized random forests. The Annals of Statistics, 47(2), 1148-1178. 32 | - James, G., Hastie, T., Witten, D., & Tibshirani, R. (2013). An introduction to statistical learning: with applications in R. 33 | - Pearl, J., & Mackenzie, D. (2018). The book of why: the new science of cause and effect. Basic Books. 34 | - Taddy, M. (2019). Business data science: Combining machine learning and economics to optimize, automate, and accelerate business decisions. McGraw Hill Professional. 35 | -------------------------------------------------------------------------------- /drake/.gitignore: -------------------------------------------------------------------------------- 1 | README.html 2 | -------------------------------------------------------------------------------- /drake/README.md: -------------------------------------------------------------------------------- 1 | # Reproducible data analysis with `drake` 2 | 3 | Authors: Jakub Kwiecień 4 | 5 | ### Description 6 | 7 | One of the challenges in complex analytical projects is managing the analysis pipeline. After a few iterations of running the code, checking results and improving the scripts, everything gets messy: the codebase has swollen and the number of artifacts has grown, as has the number of (intermediate) results. On top of that, rerunning the whole pipeline takes hours or maybe even weeks, and you are not sure which parts you can skip after changing a strict inequality to a non-strict one in a config file. Fortunately, `drake` takes these problems away.
It keeps track of all the inputs, analysis steps, outputs and relations between them, and utilizes this information to recompute only the steps that have become outdated since the last run. This leads to enhanced reproducibility, maintainability and development speed and, of course, a happy coder. During the workshop you will learn how to set up a drake workflow, how to maintain it, and a few tips for working with drake. -------------------------------------------------------------------------------- /ipbox/README.md: -------------------------------------------------------------------------------- 1 | # Innovation Box (IP Box) in Poland – how to use preferential 5% income tax rate 2 | 3 | Authors: Natalia Wojciechowska (r.pr. / attorney); Grzegorz Leśniewski (adw. / attorney) 4 | 5 | ### Schedule 6 | 7 | 14:00 - 15:30 / CEST / GMT+2 8 | 9 | ### Description 10 | 11 | Data scientists, programmers and other professionals in Poland may lower their income tax to a 5% rate on qualified income derived from qualified IP rights. This applies both to natural persons and to companies. IP box is also available to Polish non-residents who receive income from qualified IP rights through their permanent establishment located in Poland. 12 | 13 | These workshops will explain which income the lowered tax rate can be applied to, how to apply it and how to create the necessary documentation. 14 | 15 | These workshops will be held in Polish (unless English-speaking participants sign up by Wednesday), whereas the presentation will be in English. 16 | -------------------------------------------------------------------------------- /legal/README.md: -------------------------------------------------------------------------------- 1 | # Legal basics for data scientists 2 | 3 | Authors: Urszula Ilnicka - Karaban (r.pr. / attorney) ; Grzegorz Leśniewski (adw.
/ attorney) 4 | ### Schedule 5 | 6 | 15:30 - 17:00 / CEST / GMT+2 7 | 8 | ### Description 9 | 10 | There are a number of potential ways to conduct business activities in Poland. These workshops will explain: 11 | 12 | 1. What legal forms are best for business activities of data scientists? 13 | 2. What liability may you incur? 14 | 3. How and on what legal basis to employ your staff? 15 | 4. How to secure Intellectual Property rights, company’s secrets and know-how? 16 | 5. How to deal with personal data protection, if you analyze personal data? 17 | 6. Getting investors – practical comments. 18 | 19 | These workshops will be held in English. 20 | -------------------------------------------------------------------------------- /openmp/README.md: -------------------------------------------------------------------------------- 1 | # Creating R Subroutines with Fortran and OpenMP Tools 2 | 3 | Authors: [Erin Hodgess](https://www.researchgate.net/profile/Erin_Hodgess) 4 | 5 | ### Description 6 | 7 | We will work with high performance computing using Fortran with OpenMP. 8 | We will produce subroutines which can be easily compiled on Windows, Mac, and 9 | Ubuntu operating systems. We will demonstrate the excellent speed ups which 10 | can be obtained by using Fortran with the OpenMP directives. -------------------------------------------------------------------------------- /rcpp/README.md: -------------------------------------------------------------------------------- 1 | # How to make your code fast - R and C++ integration using Rcpp 2 | 3 | Authors: Jadwiga Słowik, Dominik Rafacz, Mateusz Bąkała 4 | 5 | ### Description 6 | 7 | Rcpp is an R package that facilitates performing more efficient computations by using C++ and provides wrappers for R internals such as vectors and lists.
Thanks to the seamless integration of the C++ and R APIs, Rcpp makes it convenient to export C++ code to R and thus to combine the high performance of C++ with an idiomatic, high-level R interface. 8 | 9 | During the workshop, a participant will gain the knowledge of C++ basics that is necessary for Rcpp, become familiar with the usage of Rcpp wrappers (for example IntegerVector, StringVector, NumericVector, List) and learn how to create Rcpp functions in order to invoke them from R code. Finally, a few extensions are going to be showcased, such as RcppArmadillo, RcppParallel and RcppModules, which can make code even more efficient. 10 | 11 | A participant of the workshop is required to have basic R programming skills such as vector manipulation, function implementation and package installation, and familiarity with the RStudio environment. Some knowledge of C++ helps but is not essential. 12 | 13 | ### Requirements 14 | 15 | Additionally, a working RStudio environment should be prepared before the workshop. In particular, the following packages will be needed: Rcpp, RcppArmadillo, RcppParallel. -------------------------------------------------------------------------------- /satellite/README.md: -------------------------------------------------------------------------------- 1 | # Satellite imagery analysis in R 2 | 3 | Authors: [Ewa Grabska](https://www.researchgate.net/profile/Ewa_Grabska2) 4 | 5 | ### Description 6 | 7 | Satellite imagery, such as the freely available data from the Sentinel-2 mission, enables us to monitor the Earth's surface frequently (every 5 days) and at a high spatial resolution (10-20 meters). Furthermore, Sentinel-2 sensors, which cover 13 spectral bands in the visible and infrared wavelengths, provide valuable information that can be used to automatically perform tasks such as classifying crop types, assessing forest changes, or monitoring built-up area development.
This is particularly important now, in the era of rapid changes in the environment related to climate change. In R, there are plenty of tools and packages which can be used to pre-process, analyze, and visualize satellite images in a simple and efficient way. A variety of methods, such as machine learning algorithms, is also available in R and can be applied in the analysis of satellite imagery. I would like to show a framework for acquiring, pre-processing and preliminary analysis of the Sentinel-2 time series in R. It includes the calculation of spectral indices, the use of machine learning algorithms in the classification of land cover, and the analysis of time series of imagery, i.e. determining the changes in the environment based on the spectral trajectories of pixels. 8 | 9 | ### Before the workshop 10 | 11 | * #### Please create an account on the ESA sci-hub website: https://scihub.copernicus.eu/dhus/#/home 12 | 13 | * #### Required packages: 14 | 15 | * `sen2r` 16 | 17 | * `getSpatialData` (hosted on GitHub, so you also need the `devtools` package; after that use: `devtools::install_github("16EAGLE/getSpatialData")`) 18 | 19 | * `raster` 20 | 21 | * `RStoolbox` 22 | 23 | * `dplyr` 24 | 25 | * `tidyr` 26 | 27 | * `rlang` 28 | 29 | * `caret` 30 | 31 | * `ggplot2` 32 | 33 | * `RColorBrewer` 34 | 35 | * `viridis` 36 | 37 | 38 | * #### Download materials: 39 | 40 | * https://www.dropbox.com/s/pss5sto3wb3z4ny/whyr_satellite.zip?dl=0 41 | 42 | 43 | -------------------------------------------------------------------------------- /satellite/slides.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/WhyR2020/workshops/5531a8b5180bce4a8c4adeda8b34f6c4554401ef/satellite/slides.pdf -------------------------------------------------------------------------------- /satellite/whyr_satellite.R: -------------------------------------------------------------------------------- 1 |
#Satellite imagery analysis in R @ Why R? 2020 2 | 3 | #install.packages("devtools") 4 | #library(devtools) 5 | #devtools::install_github("16EAGLE/getSpatialData") 6 | #install.packages("sen2r") 7 | #two more packages to install (with our ML methods for classification): randomForest and kernlab 8 | 9 | library(getSpatialData) 10 | library(sen2r) 11 | library(raster) 12 | library(RStoolbox) 13 | library(tidyr) 14 | library(dplyr) 15 | library(rlang) 16 | library(ggplot2) 17 | library(RColorBrewer) 18 | library(viridis) 19 | library(caret) 20 | 21 | #PART1 - DOWNLOADING & PRE-PROCESSING DATA ------------------------------------------------------------------------------------ 22 | 23 | #getSpatialData package 24 | 25 | get_products() #list of available products 26 | 27 | set_aoi() #set and view the area of interest 28 | view_aoi() 29 | get_aoi() 30 | 31 | time_range = c("2020-08-30", "2020-09-30") #set time range 32 | platform = "Sentinel-2" #choose platform 33 | login_CopHub(username = "ewagrabska") #log in to ESA SciHub 34 | 35 | 36 | #get a Sentinel-2 query with specified parameters 37 | ?getSentinel_records 38 | query = getSentinel_records(time_range, "Sentinel-2") 39 | 40 | str(query) 41 | 42 | #examine query dataframe 43 | query$cloudcov 44 | query$tile_id # 1 tile = 100 x 100 km 45 | 46 | #you can also specify the level of processing - Level-1C is before atmospheric correction, Level-2A after correction 47 | query10 = query[query$cloudcov < 10 & query$tile_id == "T34UCA" & query$level == "Level-1C",] 48 | query10$record_id 49 | 50 | 51 | #set archive, view previews and download data 52 | set_archive("C:/04_R") 53 | plot_records(query10) 54 | records = get_previews(query10) 55 | view_previews(records[2,]) 56 | getSentinel_data(records) 57 | 58 | #similarly you can search and download Landsat imagery (then you need to sign in at https://earthexplorer.usgs.gov/) 59 | #getLandsat_records() 60 | #getLandsat_data(time_range, platform) 61 | #login_USGS(username = "egrabska") 62 | 63 | #some
interesting tools from the sen2r package - a very good package for processing Sentinel-2 images; it also has tools for downloading data 64 | ?s2_translate #creates one stack (a single multi-band GeoTIFF image) 65 | ?s2_calcindices #many, many indices 66 | ?s2_mask #masking clouds 67 | 68 | 69 | #PART2: READING AND VISUALIZATION USING RASTER PACKAGE ------------------------------------------------------------------------ 70 | 71 | #Sentinel-2 acquires data in 13 spectral bands, however, we will not use all of them, 72 | #as not all of them are designed for analysis of land areas. There are 10 bands for land applications - 73 | #3 visible bands, 3 red-edge bands (located at the edge between red light and infrared), two near-infrared (NIR) bands, 74 | #and two short-wave infrared (SWIR) bands. 75 | 76 | setwd("C:/04_R/preliminary_analysis") 77 | list.files() 78 | 79 | #I already prepared one .tif file which is a multi-band image; now you can read bands separately or in one stack 80 | 81 | r1 = raster("C:/04_R/preliminary_analysis/20190825_crop.tif") #only a single band will be read (the first one) 82 | band4 = raster("20190825_crop.tif", band = 4) #here you can specify which band you want to read 83 | s1 = stack("20190825_crop.tif") #this is an image composed of 6 bands in order: visible blue, visible green, visible red, NIR, SWIR1, SWIR2 84 | 85 | #print information about rasters 86 | r1 87 | band4 88 | s1 89 | 90 | names(s1) = c("blue", "green", "red", "nir", "swir1", "swir2") #change band names in a raster stack 91 | 92 | plot(r1) 93 | plot(s1) #all bands plotted separately 94 | 95 | #extracting one element (band) from the raster stack 96 | s1[[1]] 97 | s1$blue 98 | blue_band = s1[[1]] 99 | 100 | plot(s1[[1]]) #and plot a single band from the stack 101 | 102 | #compositions - plot bands in a color composition 103 | plotRGB(s1, stretch = "lin") #default one (1,2,3) 104 | plotRGB(s1, r = 3, g = 2, b = 1, stretch = "lin") 105 | 106 | 107 | #compare different compositions 108 |
par(mfrow = c(1,4)) #plot 4 compositions at once 109 | plotRGB(s1, r = 3, g = 2, b = 1, stretch = "lin") #true-color composition 110 | plotRGB(s1, r = 4, g = 3, b = 2, stretch = "lin") #false-color composition 111 | plotRGB(s1, r = 5, g = 4, b = 3, stretch = "lin") #SWIR false-color 112 | plotRGB(s1, r = 6, g = 5, b = 4, stretch = "lin") #two SWIR false-color 113 | 114 | dev.off() #remove all plots 115 | 116 | #another option to plot images 117 | image(s1) 118 | 119 | 120 | #PART3: PROCESSING DATA ----------------------------------------------------------- 121 | 122 | #cropping image to smaller extent 123 | extent(s1) 124 | e = extent(360000, 380000, 7800000, 7810000) 125 | s1_crop = crop(s1, e) 126 | 127 | plot(s1_crop) 128 | plotRGB(s1_crop, r=6, g=4, b =2, stretch = "lin") 129 | 130 | #writing data 131 | writeRaster(s1_crop, "20190805_crop2.tif") 132 | writeFormats() 133 | 134 | #checking the distribution of values - histograms, scatterplots and correlations 135 | dev.off() 136 | hist(s1_crop) 137 | pairs(s1_crop, maxpixels = 5000) #remember to set maxpixels, because using all of the pixel values takes a long time 138 | pairs(s1_crop[[c(4,6)]], maxpixels = 10000) #note the high SWIR values of the burning areas 139 | 140 | 141 | 142 | #mathematical operations 143 | s2 = stack("20190805_crop.tif") 144 | e = extent(360000, 380000, 7800000, 7810000) 145 | s2_crop = crop(s2, e) 146 | plotRGB(s2_crop, r = 4, g= 3, b = 2, stretch = "lin") 147 | 148 | 149 | ndvi = (s2_crop[[4]] - s2_crop[[3]])/(s2_crop[[4]] + s2_crop[[3]]) #normalized difference vegetation index - it uses NIR and visible red bands 150 | plot(ndvi) 151 | plot(ndvi, col=brewer.pal(n = 6, name = "PiYG")) 152 | 153 | mndwi = (s2_crop[[2]] - s2_crop[[5]])/(s2_crop[[2]] + s2_crop[[5]]) #modified normalized difference water index - uses visible green and swir 154 | plot(mndwi, col=brewer.pal(n = 10, name = "RdBu")) 155 | 156 | ind_stack = stack(mndwi, ndvi) #you can create a
stack of indices, you can also create a stack of bands and indices 157 | plot(ind_stack) 158 | pairs(ind_stack, maxpixels = 1000) 159 | 160 | 161 | #exercise - pre-fire and post-fire NBR (Normalized Burn Ratio) 162 | 163 | s3 = stack("20190830_crop.tif") #image from after fire 164 | s3_crop = crop(s3, e) 165 | plotRGB(s3_crop, r = 3, g = 2, b = 1, stretch = "lin") 166 | 167 | #Typically, Tto estimate the severity of burnt areas, delta NBR is calculated – the difference between pre-fire and post-fire NBR. 168 | nbr_pre = (s2_crop[[4]] - s2_crop[[6]])/(s2_crop[[4]] + s2_crop[[6]]) 169 | nbr_post = (s3_crop[[4]] - s3_crop[[6]])/(s3_crop[[4]] + s3_crop[[6]]) 170 | 171 | 172 | dev.off() 173 | delta_nbr = nbr_pre - nbr_post 174 | hist(delta_nbr, col = "red") 175 | plot(delta_nbr, col=brewer.pal(n = 6, name = "YlOrRd")) 176 | 177 | #determine areas with high severity burn - e.g. areas with values of delta NBR > 0.66 are high severity burnt areas 178 | burnt = reclassify(delta_nbr, c(-1, 0.1, 0, 0.1, 0.27, 1, 0.27, 0.44, 2, 0.44, 0.66, 3, 0.66, 1, 4)) 179 | plot(burnt, col=brewer.pal(n = 5, name = "YlOrRd")) 180 | 181 | 182 | #PART 4 - classification -------------------------------------------------------------------------------- 183 | 184 | warsaw = stack("C:/04_R/classification/warsaw.tif") 185 | plotRGB(warsaw, r =3, g = 2, b=1, stretch = "lin") 186 | warsaw #note that this image is composed of all 10 bands (not only 6 as in Amazon case) 187 | 188 | 189 | #we will take the "full" spectrum of Sentinel-2 to get more information for automatic classification 190 | names(warsaw) = c("blue", "green", "red", "re1", "re2", "re3", "nir1", "nir2", "swir1", "swir2") #change the names again 191 | pairs(warsaw, maxpixels = 1000) 192 | 193 | setwd("C:/04_R/classification") 194 | ref = shapefile("reference_utm.shp") #reading file with reference (training) samples 195 | 196 | unique(ref$class) #how many land cover classes it represents 197 | plotRGB(warsaw, r =3, g = 2, b=1, stretch = 
"lin") #visualize rgb composition again and... 198 | plot(ref, add =TRUE, col = "red") #the location of the reference samples 199 | 200 | #before classification, in order to analyze spectral properties of land cover classes, 201 | #we firstly extract values from the image to sample polygons (mean values for each polygon) 202 | 203 | #you can use extract function from raster package but it's extremely slow 204 | ref_values = raster::extract(warsaw, ref, fun = "mean") %>% as.data.frame() 205 | ref_values$class = ref$class #add class attribute to a dataframe 206 | 207 | #better choice for larger datasets is: 208 | #library(exactextractr) 209 | #and the function called exact_extract :) 210 | #the thing here is that exact_extract needs a sf object as an input so you have to read shapefile with st_read() function from sf package instead of shapefile() function 211 | 212 | #some visualization with ggplot2 package - scatterplots: 213 | ggplot(ref_values, aes(green, re2, color = class))+ 214 | geom_point(size = 2)+ 215 | stat_ellipse() 216 | 217 | 218 | #we need to prepare the data to create spectral curves - there are many tools in r which can be used, 219 | #e.g. melt and dcast functions from reshape2, functions from dplyr, 220 | #tidyr; aggregate function etc. 
we can try this: 221 | 222 | mean_spectra = group_by(ref_values, class) %>% #we group ref_values by class 223 | summarise_all(mean) %>% #calculate mean value for each class 224 | gather(key, value, -class) #transform the df to "long" format 225 | 226 | #we also need to specify order of bands: (if not they will be plotted in alphabetical order) 227 | mean_spectra$key = factor(mean_spectra$key, levels=c("blue", "green", "red", "re1", "re2", "re3", "nir1", "nir2", "swir1", "swir2")) 228 | 229 | #and plot sepctral curves: 230 | ggplot(mean_spectra, aes(key, value, color = class, group = class))+ 231 | geom_point()+ 232 | geom_line(size = 1.8, alpha = 0.6) 233 | 234 | #Image classification - there are 2 types of classification – unsupervised and supervised. 235 | #in unsupervised classification we don’t use reference data (training data), all pixels are grouped into clusters using for example k-means algorithm. 236 | #in supervised classification we use reference, training samples. For these training samples the land cover class and exact location is known. 237 | #we will use classification tools form RStoolbox package 238 | 239 | #unsupervised classification 240 | class1 = unsuperClass(warsaw, nSamples = 100, nclasses = 5) 241 | class1$map 242 | plot(class1$map, col = rainbow(5)) 243 | 244 | #supervised classification 245 | ?superClass 246 | 247 | #additional calculation of two indices: 248 | warsaw_mndwi = (warsaw[[2]] - warsaw[[9]])/(warsaw[[2]] + warsaw[[9]]) 249 | warsaw_ndvi = (warsaw[[7]] - warsaw[[3]])/(warsaw[[7]] + warsaw[[3]]) 250 | warsaw_all = stack(warsaw, warsaw_ndvi, warsaw_mndwi) #you can classify one stack with 10 bands and 2 indices and then check the variable importance! 251 | 252 | 253 | 254 | #The function called superClass train the model and then validate it (we have to provide both training and validation datasets). 
#We will use the reference polygons and split them into training and validation samples: 70% for training, 30% for validation.
#We can split the samples inside the superClass function using trainPartition. Calling set.seed() right before superClass() gives the same random partition every time.
#Remember that to perform a reliable classification you have to follow some rules when collecting training and validation data (e.g. the samples should not be close to each other, to avoid spatial autocorrelation)


set.seed(5)
classification_rf = superClass(warsaw_all, ref, trainPartition = 0.7, responseCol = "class", #random forest classification
                               model = "rf", mode = "classification", tuneLength = 5, kfold = 10)

set.seed(5)
classification_svm = superClass(warsaw_all, ref, trainPartition = 0.7, responseCol = "class", #support vector machines classification
                                model = "svmLinear", mode = "classification", tuneLength = 5, kfold = 10)

classification_svm #the result is a list; it includes classification_svm$map, which you can plot and save to disk using writeRaster()
#the accuracy assessment is also available - if you print the classification result object, the first element is the validation;
#the two most important measures of the classification are the confusion matrix between reference and prediction, and the overall accuracy.
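
#A minimal sketch of how the two measures relate, using made-up labels (not output of the models above):
#the confusion matrix cross-tabulates reference and predicted classes, and the overall accuracy
#is the share of samples on its diagonal. All toy_* objects are hypothetical and exist only for this illustration.
toy_reference = factor(c("water", "forest", "forest", "urban", "urban", "water"))
toy_prediction = factor(c("water", "forest", "urban", "urban", "urban", "water"), levels = levels(toy_reference))
toy_confusion = table(reference = toy_reference, prediction = toy_prediction) #confusion matrix
toy_accuracy = sum(diag(toy_confusion)) / sum(toy_confusion) #overall accuracy = correctly classified samples / all samples
toy_confusion
toy_accuracy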

#check the importance of particular bands
varImp_rf = varImp(classification_rf$model)
varImp_svm = varImp(classification_svm$model)

#and print/plot the variable importance (VI) results
plot(varImp_rf)
varImp_rf
varImp_svm

#use only the important bands as input - for example:
set.seed(5)
classification_rf = superClass(warsaw_all[[c(9,10,11, 7)]], ref, trainPartition = 0.7, responseCol = "class",
                               model = "rf", mode = "classification", tuneLength = 5, kfold = 10)

classification_rf

#VI using RFE (Recursive Feature Elimination) - with ref_values again (of course, to do this "properly" you would need another dataset)
#RFE is a simple backward selection: it searches for the optimal subset of variables by repeatedly fitting the model and dropping the least important variables

ref_values$class = as.factor(ref_values$class) #we need the class variable as a factor
control = rfeControl(functions=rfFuncs, method="cv", number=10) #create the control object
results = rfe(ref_values[,1:10], ref_values[,11], sizes=c(1:10), rfeControl=control) #run the RFE algorithm

#print/plot results
results
plot(results, type = "l") #line plot. As you can see, we do not necessarily need all of the bands to achieve high accuracy; as seen in the scatterplots,
#the correlation between some of the bands is very high and therefore they are redundant
predictors(results) #the most important predictors


#Another way of avoiding redundancy is to reduce the feature space - for example using the very popular PCA (Principal Component Analysis)

#there is a tool called rasterPCA in the RStoolbox package:
?rasterPCA
warsaw_pca = rasterPCA(warsaw, nComp = 3) #usually the first 2-3 components carry most of the information

#look at the results (we have a list again)
warsaw_pca$map
plot(warsaw_pca$map)
plotRGB(warsaw_pca$map, r=3, g=2, b = 1, stretch = "lin") #we can plot it as a color composite as well

#and perform classification on the reduced space:
set.seed(5)
class_PCA = superClass(warsaw_pca$map, ref, trainPartition = 0.7, responseCol = "class",
                       model = "rf", mode = "classification", tuneLength = 5, kfold = 10)

class_PCA

#visualisation of the classified map
plot(classification_rf$map, col=c("darkgreen", "brown3","chartreuse4", "chartreuse", "yellow", "cadetblue3"))

#PART 5: MULTI-TEMPORAL ANALYSIS----------------------------------------------------------------------------

#In the last part we will analyze multi-temporal imagery, i.e. dense time series of images from the same year.
#Dense time series are used particularly in vegetation monitoring, for example in mapping small forest disturbances, or in crop monitoring.
#In this part we will also analyze vegetation - how the reflectance of different species/types of vegetation changes during the growing season.
#Again, there are some already prepared reference data and 17 cropped images from Sentinel-2.
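
#A minimal sketch of handling acquisition dates in such a time series, assuming each file name starts with
#the date written as YYYYMMDD (as 20190825_crop.tif does in Part 2). The toy_files names below are made up
#for illustration and are not the 17 workshop images.
toy_files = c("20190401_crop.tif", "20190506_crop.tif", "20190825_crop.tif")
toy_dates = as.Date(substr(toy_files, 1, 8), format = "%Y%m%d") #first 8 characters hold the date
toy_gaps = diff(toy_dates) #time between consecutive acquisitions, in days
toy_dates
toy_gaps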


setwd("C:/04_R/multi_temporal")
stacklist = lapply(list.files(pattern = "\\.tif$"), stack) #use lapply() to read all of the images at once, i.e. all files matching the given pattern (.tif format)
#the result is a list of 17 stacks

ref = shapefile("ref_crops.shp")

#extract data again; use lapply
ref_values = lapply(stacklist, raster::extract, ref, fun = mean) %>% as.data.frame() #it takes some time...
ref_values$class = ref$class

colnames(ref_values) = sub("X", "", colnames(ref_values)) #remove the unnecessary "X" from the column names
band = select(ref_values, ends_with(".7")) #select the band (e.g. 7 - NIR1)
band$class = ref$class #and add the class column again

means = band %>% #similarly to the previous part, we calculate mean values for each class (here per class and date)
  gather(key, value, -class) %>% #transform the df to "long" format
  group_by(class, key) %>%
  summarise(value = mean(value)) %>%
  as.data.frame()

means$key = as.Date(means$key, format = "%Y%m%d") #convert the key (the variable holding the date) to Date format

#and plot it:
ggplot(means, aes(key, value, color = class, group = class))+
  geom_point()+
  geom_line(size = 2, alpha = 0.6)

#Some simple conclusions from the NIR time series analysis are:
#In the NIR region, healthy vegetation has very high values (NIR is sensitive to scattering surfaces such as leaves - cf. Leaf Area Index).
#Crops typically have the highest NIR values
#Conifers have lower values than broad-leaved forests, and they are relatively stable, as most conifers are evergreen species,
#but there are also some seasonal variations
#RE1 region - lower values = more chlorophyll
#Rapeseed is a distinctive crop, as it blooms intensively; here we can see that it starts to grow in April, while the intensive bloom starts at the beginning of May,
#and there is a peak in RE1 in May

#Similarly, you can analyze other bands or calculate indices and analyze their trajectories during the growing season.

#THANK YOU!!! :)


--------------------------------------------------------------------------------
/travis/README.md:
--------------------------------------------------------------------------------
# First steps with Continuous Integration

Authors: Colin Gillespie & Rhian Davies, [Jumping Rivers](https://www.jumpingrivers.com)

## Goals

This course is aimed at participants who want to start using continuous integration
in their git workflow. By the end of the tutorial participants should:

- Appreciate that a commit to Git can launch numerous other services
- Be able to use travis to automatically check code and run unit tests
- Create and securely store personal authentication tokens
- Create a package website using __pkgdown__
- Test their R package against multiple versions of R
- Create and manipulate a travis configuration file

Target audience: The course is suitable for R users of all levels, but participants should be familiar with basic git commands.

## Assumed Knowledge

Running courses online means we have to be extra careful about participants
meeting the prerequisites. All participants should be familiar with:

- writing basic R functions
- basic git with GitHub, e.g. push, pull, commit. The initial task will be to
  - fork a repo
  - clone the repo onto your machine
- writing a simple R package

As part of registering for this course, we'll ask for a link to your GitHub repository.

## Before the course

- Please create an account on travis: https://travis-ci.com
- Make sure you have updated/working versions of R/RStudio; Git/GitHub (optional: GitHub Desktop)
- Follow instructions at the course GitHub [gist](https://gist.github.com/csgillespie/447e4ebed711199a320c97a65f71da84)

--------------------------------------------------------------------------------
/workshops.Rproj:
--------------------------------------------------------------------------------
Version: 1.0

RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: knitr
LaTeX: XeLaTeX

--------------------------------------------------------------------------------
/workshops.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WhyR2020/workshops/5531a8b5180bce4a8c4adeda8b34f6c4554401ef/workshops.jpg
--------------------------------------------------------------------------------