├── .github
│   ├── .gitignore
│   └── workflows
│       ├── deploy_bookdown.yml
│       ├── pr_check.yml
│       └── pr_check_readme.yml
├── preamble.tex
├── data
│   ├── .gitignore
│   ├── roster.xlsx
│   ├── students.rds
│   ├── survey.xlsx
│   ├── bake-sale.xlsx
│   ├── penguins.xlsx
│   ├── students.xlsx
│   ├── students.parquet
│   ├── 03-sales.csv
│   ├── 02-sales.csv
│   ├── 01-sales.csv
│   ├── students-2.csv
│   └── students.csv
├── images
│   ├── sql.png
│   ├── 19_lt.png
│   ├── 28-bib.png
│   ├── cases.png
│   ├── ge_aes.png
│   ├── ge_all.png
│   ├── hadley.jpg
│   ├── layers.png
│   ├── merge.png
│   ├── shape.png
│   ├── 19_anti.png
│   ├── 19_cross.png
│   ├── 19_full.png
│   ├── 19_inner.png
│   ├── 19_left.png
│   ├── 19_right.png
│   ├── 19_semi.png
│   ├── 28-fig28.png
│   ├── 6-tidy-1.png
│   ├── 28_just-1.png
│   ├── 28_themes.png
│   ├── 5_diagram_1.odg
│   ├── 5_diagram_1.png
│   ├── 5_diagram_2.odg
│   ├── 5_diagram_2.png
│   ├── 5_diagram_3.odg
│   ├── 5_diagram_3.png
│   ├── 5_diagram_4.odg
│   ├── 5_diagram_4.png
│   ├── 6-Projects.png
│   ├── duplicates.png
│   ├── duplicates2.png
│   ├── files_pane.png
│   ├── ge_themes.png
│   ├── script_pane.png
│   ├── transform.png
│   ├── 19_relational.png
│   ├── 22-resampling.png
│   ├── 28_book_cairo.png
│   ├── console_pane.png
│   ├── data-science.png
│   ├── ggplot2_logo.png
│   ├── string_stuck.png
│   ├── 14_venn_diagrams.png
│   ├── 19_many-to-many.png
│   ├── 19_many-to-one.png
│   ├── 19_one-to-many.png
│   ├── 28-chunk-label.png
│   ├── 28-chunk-options.png
│   ├── 28-execute-yaml.png
│   ├── 28-knitr-options.png
│   ├── 6-column-names.png
│   ├── 6-multiple-names.png
│   ├── 6-panes_layout.png
│   ├── environment_pane.png
│   ├── horst-spelling.png
│   ├── quarto-chunk-nav.png
│   ├── quarto-dark-bg.jpeg
│   ├── test_functions.png
│   ├── 17_datetime_codes.png
│   ├── 19_equality_match.png
│   ├── 6-names-and-values.png
│   ├── transform-logical.png
│   ├── 22-data-science-model.png
│   ├── data-science-explore.png
│   ├── seperate_wider_delim.png
│   ├── stringr-autocomplete.png
│   ├── 28-quarto-visual-editor.png
│   ├── seperate_longer_delim1.png
│   ├── seperate_wider_position.png
│   ├── visualization-stat-bar.png
│   ├── 17-lord_howe_stick_insect.jpg
│   ├── data-structures-overview.png
│   ├── seperate_longer_position.png
│   ├── 15_search_google_sheets_regex.png
│   ├── special_missing_values_doubles.png
│   └── visualization-coordinate-systems.png
├── penguin-plot.png
├── .Rbuildignore
├── _bookdown.yml
├── .gitignore
├── book.bib
├── bookclub-r4ds.Rproj
├── _output.yml
├── test.qmd
├── quarto
│   └── markdown.qmd
├── references.bib
├── style.css
├── DESCRIPTION
├── index.Rmd
├── README.md
├── 24-web_scraping.Rmd
├── 08-workflow_getting_help.Rmd
├── 21-databases.Rmd
├── 99-24-model_building.Rmd
├── 17-dates_and_times.Rmd
├── 02-workflow_basics.Rmd
├── 22-arrow.Rmd
├── 29-quarto_formats.Rmd
├── 26-iteration.Rmd
├── 10-exploratory_data_analysis.Rmd
├── 27-base_r.Rmd
├── 11-communication.Rmd
├── 00-introduction.Rmd
├── 04-workflow_code_style.Rmd
├── 20-spreadsheets.Rmd
├── 16-factors.Rmd
├── 99-23-model_basics.Rmd
├── 23-hierarchical_data.Rmd
└── 12-logical_vectors.Rmd
/.github/.gitignore:
--------------------------------------------------------------------------------
1 | *.html
2 |
--------------------------------------------------------------------------------
/preamble.tex:
--------------------------------------------------------------------------------
1 | \usepackage{booktabs}
2 |
--------------------------------------------------------------------------------
/data/.gitignore:
--------------------------------------------------------------------------------
1 | seattle-library-checkouts
2 | seattle-library-checkouts.csv
3 |
--------------------------------------------------------------------------------
/images/sql.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/sql.png
--------------------------------------------------------------------------------
/data/roster.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/data/roster.xlsx
--------------------------------------------------------------------------------
/data/students.rds:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/data/students.rds
--------------------------------------------------------------------------------
/data/survey.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/data/survey.xlsx
--------------------------------------------------------------------------------
/images/19_lt.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_lt.png
--------------------------------------------------------------------------------
/images/28-bib.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/28-bib.png
--------------------------------------------------------------------------------
/images/cases.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/cases.png
--------------------------------------------------------------------------------
/images/ge_aes.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/ge_aes.png
--------------------------------------------------------------------------------
/images/ge_all.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/ge_all.png
--------------------------------------------------------------------------------
/images/hadley.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/hadley.jpg
--------------------------------------------------------------------------------
/images/layers.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/layers.png
--------------------------------------------------------------------------------
/images/merge.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/merge.png
--------------------------------------------------------------------------------
/images/shape.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/shape.png
--------------------------------------------------------------------------------
/penguin-plot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/penguin-plot.png
--------------------------------------------------------------------------------
/data/bake-sale.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/data/bake-sale.xlsx
--------------------------------------------------------------------------------
/data/penguins.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/data/penguins.xlsx
--------------------------------------------------------------------------------
/data/students.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/data/students.xlsx
--------------------------------------------------------------------------------
/images/19_anti.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_anti.png
--------------------------------------------------------------------------------
/images/19_cross.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_cross.png
--------------------------------------------------------------------------------
/images/19_full.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_full.png
--------------------------------------------------------------------------------
/images/19_inner.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_inner.png
--------------------------------------------------------------------------------
/images/19_left.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_left.png
--------------------------------------------------------------------------------
/images/19_right.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_right.png
--------------------------------------------------------------------------------
/images/19_semi.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_semi.png
--------------------------------------------------------------------------------
/images/28-fig28.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/28-fig28.png
--------------------------------------------------------------------------------
/images/6-tidy-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/6-tidy-1.png
--------------------------------------------------------------------------------
/.Rbuildignore:
--------------------------------------------------------------------------------
1 | ^renv$
2 | ^renv\.lock$
3 | ^\.github$
4 | ^.*\.Rproj$
5 | ^\.Rproj\.user$
6 |
--------------------------------------------------------------------------------
/data/students.parquet:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/data/students.parquet
--------------------------------------------------------------------------------
/images/28_just-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/28_just-1.png
--------------------------------------------------------------------------------
/images/28_themes.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/28_themes.png
--------------------------------------------------------------------------------
/images/5_diagram_1.odg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/5_diagram_1.odg
--------------------------------------------------------------------------------
/images/5_diagram_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/5_diagram_1.png
--------------------------------------------------------------------------------
/images/5_diagram_2.odg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/5_diagram_2.odg
--------------------------------------------------------------------------------
/images/5_diagram_2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/5_diagram_2.png
--------------------------------------------------------------------------------
/images/5_diagram_3.odg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/5_diagram_3.odg
--------------------------------------------------------------------------------
/images/5_diagram_3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/5_diagram_3.png
--------------------------------------------------------------------------------
/images/5_diagram_4.odg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/5_diagram_4.odg
--------------------------------------------------------------------------------
/images/5_diagram_4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/5_diagram_4.png
--------------------------------------------------------------------------------
/images/6-Projects.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/6-Projects.png
--------------------------------------------------------------------------------
/images/duplicates.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/duplicates.png
--------------------------------------------------------------------------------
/images/duplicates2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/duplicates2.png
--------------------------------------------------------------------------------
/images/files_pane.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/files_pane.png
--------------------------------------------------------------------------------
/images/ge_themes.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/ge_themes.png
--------------------------------------------------------------------------------
/images/script_pane.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/script_pane.png
--------------------------------------------------------------------------------
/images/transform.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/transform.png
--------------------------------------------------------------------------------
/images/19_relational.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_relational.png
--------------------------------------------------------------------------------
/images/22-resampling.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/22-resampling.png
--------------------------------------------------------------------------------
/images/28_book_cairo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/28_book_cairo.png
--------------------------------------------------------------------------------
/images/console_pane.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/console_pane.png
--------------------------------------------------------------------------------
/images/data-science.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/data-science.png
--------------------------------------------------------------------------------
/images/ggplot2_logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/ggplot2_logo.png
--------------------------------------------------------------------------------
/images/string_stuck.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/string_stuck.png
--------------------------------------------------------------------------------
/images/14_venn_diagrams.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/14_venn_diagrams.png
--------------------------------------------------------------------------------
/images/19_many-to-many.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_many-to-many.png
--------------------------------------------------------------------------------
/images/19_many-to-one.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_many-to-one.png
--------------------------------------------------------------------------------
/images/19_one-to-many.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_one-to-many.png
--------------------------------------------------------------------------------
/images/28-chunk-label.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/28-chunk-label.png
--------------------------------------------------------------------------------
/images/28-chunk-options.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/28-chunk-options.png
--------------------------------------------------------------------------------
/images/28-execute-yaml.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/28-execute-yaml.png
--------------------------------------------------------------------------------
/images/28-knitr-options.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/28-knitr-options.png
--------------------------------------------------------------------------------
/images/6-column-names.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/6-column-names.png
--------------------------------------------------------------------------------
/images/6-multiple-names.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/6-multiple-names.png
--------------------------------------------------------------------------------
/images/6-panes_layout.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/6-panes_layout.png
--------------------------------------------------------------------------------
/images/environment_pane.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/environment_pane.png
--------------------------------------------------------------------------------
/images/horst-spelling.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/horst-spelling.png
--------------------------------------------------------------------------------
/images/quarto-chunk-nav.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/quarto-chunk-nav.png
--------------------------------------------------------------------------------
/images/quarto-dark-bg.jpeg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/quarto-dark-bg.jpeg
--------------------------------------------------------------------------------
/images/test_functions.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/test_functions.png
--------------------------------------------------------------------------------
/images/17_datetime_codes.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/17_datetime_codes.png
--------------------------------------------------------------------------------
/images/19_equality_match.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/19_equality_match.png
--------------------------------------------------------------------------------
/images/6-names-and-values.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/6-names-and-values.png
--------------------------------------------------------------------------------
/images/transform-logical.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/transform-logical.png
--------------------------------------------------------------------------------
/images/22-data-science-model.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/22-data-science-model.png
--------------------------------------------------------------------------------
/images/data-science-explore.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/data-science-explore.png
--------------------------------------------------------------------------------
/images/seperate_wider_delim.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/seperate_wider_delim.png
--------------------------------------------------------------------------------
/images/stringr-autocomplete.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/stringr-autocomplete.png
--------------------------------------------------------------------------------
/images/28-quarto-visual-editor.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/28-quarto-visual-editor.png
--------------------------------------------------------------------------------
/images/seperate_longer_delim1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/seperate_longer_delim1.png
--------------------------------------------------------------------------------
/images/seperate_wider_position.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/seperate_wider_position.png
--------------------------------------------------------------------------------
/images/visualization-stat-bar.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/visualization-stat-bar.png
--------------------------------------------------------------------------------
/images/17-lord_howe_stick_insect.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/17-lord_howe_stick_insect.jpg
--------------------------------------------------------------------------------
/images/data-structures-overview.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/data-structures-overview.png
--------------------------------------------------------------------------------
/images/seperate_longer_position.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/seperate_longer_position.png
--------------------------------------------------------------------------------
/images/15_search_google_sheets_regex.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/15_search_google_sheets_regex.png
--------------------------------------------------------------------------------
/images/special_missing_values_doubles.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/special_missing_values_doubles.png
--------------------------------------------------------------------------------
/images/visualization-coordinate-systems.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/r4ds/bookclub-r4ds/HEAD/images/visualization-coordinate-systems.png
--------------------------------------------------------------------------------
/data/03-sales.csv:
--------------------------------------------------------------------------------
1 | month,year,brand,item,n
2 | March,2019,1,1234,3
3 | March,2019,1,3627,1
4 | March,2019,1,8820,3
5 | March,2019,2,7253,1
6 | March,2019,2,8766,3
7 | March,2019,2,8288,6
8 |
--------------------------------------------------------------------------------
/_bookdown.yml:
--------------------------------------------------------------------------------
1 | book_filename: "bookclub-r4ds"
2 | repo: https://github.com/r4ds/bookclub-r4ds
3 | edit: "https://github.com/r4ds/bookclub-r4ds/edit/main/%s"
4 | output_dir: "_book"
5 | delete_merged_file: true
6 |
--------------------------------------------------------------------------------
/data/02-sales.csv:
--------------------------------------------------------------------------------
1 | month,year,brand,item,n
2 | February,2019,1,1234,8
3 | February,2019,1,8721,2
4 | February,2019,1,1822,3
5 | February,2019,2,3333,1
6 | February,2019,2,2156,3
7 | February,2019,2,3987,6
8 |
--------------------------------------------------------------------------------
/data/01-sales.csv:
--------------------------------------------------------------------------------
1 | month,year,brand,item,n
2 | January,2019,1,1234,3
3 | January,2019,1,8721,9
4 | January,2019,1,1822,2
5 | January,2019,2,3333,1
6 | January,2019,2,2156,9
7 | January,2019,2,3987,6
8 | January,2019,2,3827,6
--------------------------------------------------------------------------------
/.github/workflows/deploy_bookdown.yml:
--------------------------------------------------------------------------------
1 | on:
2 | push:
3 | branches: main
4 | paths-ignore:
5 | - 'README.md'
6 | workflow_dispatch:
7 |
8 | jobs:
9 | bookdown:
10 | uses: r4ds/r4dsactions/.github/workflows/render_pages.yml@main
11 |
--------------------------------------------------------------------------------
/.github/workflows/pr_check.yml:
--------------------------------------------------------------------------------
1 | on:
2 | pull_request:
3 | branches: main
4 | paths-ignore:
5 | - 'README.md'
6 | workflow_dispatch:
7 |
8 | jobs:
9 | pr_check:
10 | uses: r4ds/r4dsactions/.github/workflows/render_check.yml@main
11 |
--------------------------------------------------------------------------------
/.github/workflows/pr_check_readme.yml:
--------------------------------------------------------------------------------
1 | on:
2 | pull_request:
3 | branches: main
4 | paths:
5 | - 'README.md'
6 | workflow_dispatch:
7 |
8 | jobs:
9 | pr_check:
10 | uses: r4ds/r4dsactions/.github/workflows/render_check_readme.yml@main
11 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | .Rproj.user
2 | .Rhistory
3 | .Rdata
4 | .Renviron
5 | .Rprofile
6 | .httr-oauth
7 | .DS_Store
8 | _book
9 | _bookdown_files
10 | bookclub-r4ds.Rmd
11 | bookclub-r4ds_files
12 | *.html
13 | libs
14 | renv
15 | bookclub-r4ds.knit.md
16 |
--------------------------------------------------------------------------------
/book.bib:
--------------------------------------------------------------------------------
1 | @Book{xie2015,
2 | title = {Dynamic Documents with {R} and knitr},
3 | author = {Yihui Xie},
4 | publisher = {Chapman and Hall/CRC},
5 | address = {Boca Raton, Florida},
6 | year = {2015},
7 | edition = {2nd},
8 | note = {ISBN 978-1498716963},
9 | url = {http://yihui.org/knitr/},
10 | }
11 |
--------------------------------------------------------------------------------
/data/students-2.csv:
--------------------------------------------------------------------------------
1 | student_id,full_name,favourite_food,meal_plan,age
2 | 1,Sunil Huffmann,Strawberry yoghurt,Lunch only,4
3 | 2,Barclay Lynn,French fries,Lunch only,5
4 | 3,Jayendra Lyne,NA,Breakfast and lunch,7
5 | 4,Leon Rossini,Anchovies,Lunch only,NA
6 | 5,Chidiegwu Dunkel,Pizza,Breakfast and lunch,5
7 | 6,Güvenç Attila,Ice cream,Lunch only,6
8 |
--------------------------------------------------------------------------------
/data/students.csv:
--------------------------------------------------------------------------------
1 | Student ID,Full Name,favourite.food,mealPlan,AGE
2 | 1,Sunil Huffmann,Strawberry yoghurt,Lunch only,4
3 | 2,Barclay Lynn,French fries,Lunch only,5
4 | 3,Jayendra Lyne,N/A,Breakfast and lunch,7
5 | 4,Leon Rossini,Anchovies,Lunch only,
6 | 5,Chidiegwu Dunkel,Pizza,Breakfast and lunch,five
7 | 6,Güvenç Attila,Ice cream,Lunch only,6
--------------------------------------------------------------------------------
/bookclub-r4ds.Rproj:
--------------------------------------------------------------------------------
1 | Version: 1.0
2 | ProjectId: 0e4ac423-e010-4dc1-ba28-4aa8506b6d4c
3 |
4 | RestoreWorkspace: Default
5 | SaveWorkspace: Default
6 | AlwaysSaveHistory: Default
7 |
8 | EnableCodeIndexing: Yes
9 | UseSpacesForTab: Yes
10 | NumSpacesForTab: 2
11 | Encoding: UTF-8
12 |
13 | RnwWeave: Sweave
14 | LaTeX: pdfLaTeX
15 |
16 | BuildType: Website
17 |
--------------------------------------------------------------------------------
/_output.yml:
--------------------------------------------------------------------------------
1 | bookdown::gitbook:
2 | css: style.css
3 | split_by: section
4 | config:
5 | toc:
6 | collapse: section
7 | before: |
8 | R for Data Science Book Club
9 | after: |
10 | Published with bookdown
11 | edit:
12 | link: https://github.com/r4ds/bookclub-r4ds/edit/main/%s
13 | text: "Edit"
14 | sharing:
15 | github: yes
16 | facebook: no
17 | twitter: no
18 |
--------------------------------------------------------------------------------
/test.qmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Untitled"
3 | format: html
4 | editor: visual
5 | bibliography: references.bib
6 | ---
7 |
8 | ###### Quarto
9 |
10 | Quarto enables you to weave together content and executable code into a finished document. To learn more about Quarto see <https://quarto.org>.
11 |
12 | ## Running Code
13 |
14 | When you click the **Render** button a document will be generated that includes both content and the output of embedded code. You can embed code like this:
15 |
16 | ```{r}
17 | 1 + 1
18 | ```
19 |
20 | You can add options to executable code like this [@abrahms2016]
21 |
22 | ```{r}
23 | #| echo: false
24 | 2 * 2
25 | ```
26 |
27 | The `echo: false` option disables the printing of code (only output is displayed).
28 |
29 | ```{r}
30 | 2+2
31 |
32 | i = 1
33 | print (i)
34 | ```
35 |
--------------------------------------------------------------------------------
/quarto/markdown.qmd:
--------------------------------------------------------------------------------
1 | ## Text formatting
2 |
3 | *italic* **bold** ~~strikeout~~ `code`
4 |
5 | superscript^2^ subscript~2~
6 |
7 | [underline]{.underline} [small caps]{.smallcaps}
8 |
9 | ## Headings
10 |
11 | # 1st Level Header
12 |
13 | ## 2nd Level Header
14 |
15 | ### 3rd Level Header
16 |
17 | ## Lists
18 |
19 | - Bulleted list item 1
20 |
21 | - Item 2
22 |
23 | - Item 2a
24 |
25 | - Item 2b
26 |
27 | 1. Numbered list item 1
28 |
29 | 2. Item 2.
30 | The numbers are incremented automatically in the output.
31 |
32 | ## Links and images
33 |
34 |
35 |
36 | [linked phrase](http://example.com)
37 |
38 | {fig-alt="Quarto logo and the word quarto spelled in small case letters"}
39 |
40 | ## Tables
41 |
42 | | First Header | Second Header |
43 | |--------------|---------------|
44 | | Content Cell | Content Cell |
45 | | Content Cell | Content Cell |
46 |
--------------------------------------------------------------------------------
/references.bib:
--------------------------------------------------------------------------------
1 |
2 | @article{abrahms2016,
3 | title = {Lessons from integrating behaviour and resource selection: activity-specific responses of African wild dogs to roads},
4 | author = {Abrahms, B and Jordan, NR and Golabek, KA and McNutt, JW and Wilson, AM and Brashares, JS},
5 | year = {2016},
6 | date = {2016},
7 | journal = {Animal Conservation},
8 | pages = {247{\textendash}255},
9 | volume = {19},
10 | number = {3},
11 | note = {Publisher: Wiley Online Library}
12 | }
13 |
--------------------------------------------------------------------------------
/style.css:
--------------------------------------------------------------------------------
1 | .page-inner {
2 | max-width: 1000px !important;
3 | }
4 |
5 | .book.font-size-0 .book-body .page-inner section {
6 | font-size: 1em !important;
7 | }
8 | .book.font-size-1 .book-body .page-inner section {
9 | font-size: 1.5em !important;
10 | }
11 | .book.font-size-2 .book-body .page-inner section {
12 | font-size: 2em !important;
13 | }
14 | .book.font-size-3 .book-body .page-inner section {
15 | font-size: 2.5em !important;
16 | }
17 | .book.font-size-4 .book-body .page-inner section {
18 | font-size: 3em !important;
19 | }
20 |
21 | /* Styles below here were customized before standardization.
22 | Try to get rid of these! */
23 |
24 | p.caption {
25 | color: #777;
26 | margin-top: 10px;
27 | }
28 | p code {
29 | white-space: inherit;
30 | }
31 | pre {
32 | word-break: normal;
33 | word-wrap: normal;
34 | }
35 | pre code {
36 | white-space: inherit;
37 | }
38 |
39 | .book .book-body .page-wrapper .page-inner section.normal img.robot {
40 | float: left;
41 | height: 75px;
42 | margin-right: 20px;
43 | }
44 |
--------------------------------------------------------------------------------
/DESCRIPTION:
--------------------------------------------------------------------------------
1 | Package: bookclub-r4ds
2 | Title: R for Data Science Book Club
3 | Version: 0.0.9.9000
4 | Authors@R:
5 | person("Data Science Learning Community", role = c("aut", "cre", "cph"))
6 | URL: https://r4ds.github.io/bookclub-r4ds,
7 | https://github.com/r4ds/bookclub-r4ds
8 | Depends:
9 | R (>= 3.1.0)
10 | Imports:
11 | arrow,
12 | babynames,
13 | bookdown,
14 | curl,
15 | DBI,
16 | dbplyr,
17 | details,
18 | duckdb,
19 | emo,
20 | gapminder,
21 | ggplot2,
22 | ggthemes,
23 | googlesheets4,
24 | here,
25 | hexbin,
26 | hrbrthemes,
27 | htmlwidgets,
28 | janitor,
29 | jsonlite,
30 | Lahman,
31 | lubridate,
32 | manipulate,
33 | maps,
34 | microbenchmark,
35 | nycflights13,
36 | palmerpenguins,
37 | patchwork,
38 | reactable,
39 | reactablefmtr,
40 | readxl,
41 | repurrrsive,
42 | rvest,
43 | styler,
44 | tidyverse,
45 | tufte,
46 | tvthemes,
47 | viridis,
48 | writexl
49 | Remotes:
50 | hadley/emo
51 | Encoding: UTF-8
52 |
--------------------------------------------------------------------------------
/index.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "R for Data Science Book Club"
3 | date: "`r Sys.Date()`"
4 | site: bookdown::bookdown_site
5 | documentclass: book
6 | bibliography: book.bib
7 | biblio-style: apalike
8 | link-citations: yes
9 | github-repo: r4ds/bookclub-r4ds
10 | description: "This is the product of the Data Science Learning Community's Book Club."
11 | ---
12 |
13 | # Welcome {-}
14 |
15 | This is a companion for the book [R for Data Science](https://r4ds.hadley.nz/) by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund.
16 | This companion is available at [dslc.io/r4ds](https://dslc.io/r4ds).
17 |
18 | This website is being developed by the [Data Science Learning Community](https://dslc.io). Follow along, and [join the community](https://dslc.io/join) to participate.
19 |
20 | This companion follows the [Data Science Learning Community Code of Conduct](https://dslc.io/conduct).
21 |
22 | ## Book club meetings {-}
23 |
24 | - Volunteer leads discussion of a chapter
25 | - **This is the best way to learn the material.**
26 | - Presentations:
27 | - Review of material
28 | - Questions you have
29 | - Maybe live demo
30 | - More info about editing: [this github repo](https://github.com/r4ds/bookclub-r4ds).
31 | - Recorded, available on the [Data Science Learning Community YouTube Channel](https://dslc.io/youtube).
32 |
33 | ## Pace {-}
34 |
35 | - **Goal:** 1 chapter/week
36 | - Ok to split overwhelming chapters
37 | - Ok to combine short chapters
38 | - Meet ***every*** week except holidays, etc
39 | - We'll discuss even if presenter unavailable
40 |
41 | ## Learning objectives {-}
42 |
43 | - Students who study with LOs in mind ***retain more.***
44 | - **Tips:**
45 | - "After today's session, you will be able to..."
46 | - *Very* roughly **1 per section.**
47 | - Likely need to be refined
48 |
49 | ## Today's learning objectives {-}
50 |
51 | After today's session, you will be able to...
52 |
53 | - Explain how our weekly meetings work.
54 | - Sign up to lead a discussion.
55 | - Edit notes on GitHub.
56 | - (More LOs coming in Chapter 1)
57 |
58 | ## GitHub {-}
59 |
60 | - Even *tech bros* can figure it out, ***you'll be fine!***
61 | - See README for setup instructions
62 | - [Cohort 9 Week 1](https://youtu.be/9ar16FGFgT0) included a walk-through
63 | - Ok to edit directly in browser!
64 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DSLC R for Data Science Book Club
2 |
3 | Welcome to the DSLC R for Data Science Book Club!
4 |
5 | We are working together to read [R for Data Science](https://r4ds.hadley.nz/) by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund.
6 | Join the #book_club-r_for_data_science channel on the [DSLC Slack](https://dslc.io/join) to participate.
7 | As we read, we are producing [notes about the book](https://r4ds.github.io/bookclub-r4ds/).
8 |
9 | ## Meeting Schedule
10 |
11 | If you would like to present, please see the sign-up sheet for your cohort (linked below, and pinned in the [#book_club-r4ds](https://dslcio.slack.com/archives/C012VLJ0KRB) channel on Slack)!
12 |
13 | - Cohort 1 (started 2020-07-31, ended 2020-10-12): [meeting videos](https://youtube.com/playlist?list=PL3x6DOfs2NGgUOBkwtRJQW0hDWCwdzboM)
14 | - Cohort 2 (started 2020-08-03, ended 2021-03-29): [meeting videos](https://www.youtube.com/playlist?list=PL3x6DOfs2NGglHEO3WBEaxiEZ0_ZiwZJi)
15 | - Cohort 3 (started 2020-12-08, ended 2021-11-09): [meeting videos](https://www.youtube.com/playlist?list=PL3x6DOfs2NGiiKcrDqW4m9qhlpbiQ7HCt)
16 | - Cohort 4 (started 2020-12-16, ended 2021-06-23): [meeting videos](https://www.youtube.com/playlist?list=PL3x6DOfs2NGjtn1_4BSX99R5wrLjK7XvY)
17 | - Cohort 5 (started 2021-07-24, ended 2022-04-23): [meeting videos](https://www.youtube.com/playlist?list=PL3x6DOfs2NGjk1sPsrn2CazGiel0yZrhc)
18 | - Cohort 6 (started 2021-10-15, ended 2022-11-17): [meeting videos](https://www.youtube.com/playlist?list=PL3x6DOfs2NGiYnQdq8mgMBeob3YONUWRM)
19 | - Cohort 7 (started 2022-08-29, ended 2023-07-31): [meeting videos](https://youtube.com/playlist?list=PL3x6DOfs2NGi3qrPu8xxURdUoYAJpko5G)
20 | - Cohort 8 (started 2022-09-24, ended 2023-08-19): [meeting videos](https://www.youtube.com/playlist?list=PL3x6DOfs2NGjeq_14X43I3OHYxuE2mO4I)
21 | - Cohort 9 (started 2023-07-30, ended 2024-04-28): [meeting videos](https://www.youtube.com/playlist?list=PL3x6DOfs2NGjVMs1NtbWu4s_ZgGhGKnrN)
22 | - Cohort 10 (started 2023-10-06, ended 2024-07-19): [meeting videos](https://www.youtube.com/playlist?list=PL3x6DOfs2NGj_fqbuP0xWjm5pD9hz6G5Z)
23 | - Cohort 11 (started 2024-08-22, ended 2025-05-01): [meeting videos](https://www.youtube.com/playlist?list=PL3x6DOfs2NGhcXLwZHIEnDLv2HhmhD4ma)
24 |
25 |
26 | ## How to Present
27 |
28 | This repository is structured as a [{bookdown}](https://CRAN.R-project.org/package=bookdown) site.
29 | To present, follow these instructions:
30 |
31 | Do these steps once:
32 |
33 | 1. [Set up Git and GitHub to work with RStudio](https://github.com/r4ds/bookclub-setup) (click through for detailed, step-by-step instructions; I recommend checking this out even if you're pretty sure you're all set).
34 | 2. `usethis::create_from_github("r4ds/bookclub-r4ds")` (cleanly creates your own copy of this repository).
35 |
36 | Do these steps each time you present another chapter:
37 |
38 | 1. Open your project for this book.
39 | 2. `usethis::pr_init("my-chapter")` (creates a branch for your work and makes sure that you have the latest changes from other contributors; replace `my-chapter` with a descriptive name).
40 | 3. `devtools::install_dev_deps()` (installs any packages used by the book that you don't already have installed).
41 | 4. Edit the appropriate chapter file, if necessary. Use `##` to indicate new slides (new sections).
42 | 5. If you use any packages that are not already in the `DESCRIPTION`, add them. You can use `usethis::use_package("myCoolPackage")` to add them quickly!
43 | 6. Build the book! ctrl-shift-b (or command-shift-b) renders the full book; ctrl-shift-k (or command-shift-k) renders just your slide. Please do this to make sure it works before you push your changes up to the main repo!
44 | 7. Commit your changes (either through the command line or using RStudio's Git tab).
45 | 8. `usethis::pr_push()` (pushes the changes up to GitHub, and opens a "pull request" (PR) to let us know your work is ready).
46 | 9. (If we request changes, make them)
47 | 10. When your PR has been accepted ("merged"), `usethis::pr_finish()` to close out your branch and prepare your local repository for future work.
48 | 11. Now that your local copy is up-to-date with the main repo, you need to update your remote fork. Run `gert::git_push("origin")` or click the `Push` button on the `Git` tab of RStudio.
49 |
50 | When your PR is checked into the main branch, the bookdown site will rebuild, adding your slides to [this site](https://dslc.io/r4ds).
51 |
--------------------------------------------------------------------------------
/24-web_scraping.Rmd:
--------------------------------------------------------------------------------
1 | # Web scraping
2 |
3 | **Learning objectives**
4 |
5 | - Decide whether to scrape data from a web page.
6 | - Recognize enough HTML to find your way around a web page.
7 | - Extract tables from web pages.
8 | - Extract other data from web pages.
9 |
10 | ```{r web_scraping-packages, eval=TRUE, message=FALSE, warning=FALSE}
11 | library(rvest)
12 | library(tidyverse)
13 | ```
14 |
15 | ## Ethics & Legalities {-}
16 |
17 | > [If the data isn’t public, non-personal, or factual or you’re scraping the data specifically to make money with it, you’ll need to talk to a lawyer.](https://r4ds.hadley.nz/webscraping#scraping-ethics-and-legalities)
18 |
19 | - Be polite (and {[polite](https://dmi3kno.github.io/polite/)})
20 | - Check Terms of Service
21 | - Beware PII
22 | - Facts usually aren't copyrightable
23 |
24 | ## Typical HTML structure {-}
25 |
26 | HTML = **H**yper**T**ext **M**arkup **L**anguage
27 |
28 | - Hierarchical structure
29 | - Element = `<tag attribute="a" other="b">content</tag>`
30 | - Start tag: `<tag>`
31 | - Attributes: `attribute="a" other="b"`
32 | - Content: `content`
33 | - End tag: `</tag>`
34 | - Elements nest inside elements (as content)
35 | - Nested elements = "children"
36 |
37 | ## Use {rvest} to scrape web pages {-}
38 |
39 | [{rvest}](https://rvest.tidyverse.org/) ("harvest") = tidyverse web-scraping package
40 |
41 | - Load html to scrape: `read_html()`
42 | - Shortcut for tables: `html_table()`
43 |
44 | ## Example: Table {-}
45 |
46 | [Wikipedia List of world expositions](https://en.wikipedia.org/wiki/List_of_world_expositions)
47 |
48 | ```{r web_scraping-tables, eval=TRUE}
49 | url <- "https://en.wikipedia.org/wiki/List_of_world_expositions"
50 | html <- read_html(url)
51 | html |>
52 | html_table()
53 | ```
54 |
55 | ## Select a specific element {-}
56 |
57 | `html_element()` returns same # outputs as inputs (1 thing in, 1 thing out)
58 |
59 | - `"thing"` = `<thing>` tag
60 | - `".thing"` = something with attribute `class="thing"`
61 | - `"#thing"` = something with attribute `id="thing"`
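A minimal sketch of the three selector forms above, assuming {rvest}'s `minimal_html()` helper and a made-up HTML fragment (the tags, class name, and id are hypothetical and exist only for illustration):

```r
library(rvest)

# Hypothetical fragment built in memory, so nothing is downloaded.
page <- minimal_html('
  <h1 id="title">Expo list</h1>
  <p>Plain paragraph</p>
  <p class="note">Highlighted note</p>
')

by_tag   <- page |> html_element("p") |> html_text2()       # first <p> tag
by_class <- page |> html_element(".note") |> html_text2()   # class="note"
by_id    <- page |> html_element("#title") |> html_text2()  # id="title"
```

Because `html_element()` returns a single match, `"p"` picks up only the first paragraph even though the fragment contains two.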
62 |
63 | ## Example: One specific table {-}
64 |
65 | ```{r web_scraping-html_element}
66 | html |>
67 | html_element("table.wikitable") |>
68 | html_table()
69 | ```
70 |
71 | ## Select finer-grained elements {-}
72 |
73 | `html_elements()` finds *all* matches
74 |
75 | 👍 Rule of thumb:
76 |
77 | - `html_elements()` to get observations (rows)
78 | - `html_element()` to get variables for each observation (columns)
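The rule of thumb can be sketched on a small in-memory page, assuming {rvest}'s `minimal_html()` helper. The list items and the `weight` class are hypothetical; the second item deliberately has no weight so the missing value is visible.

```r
library(rvest)

# Hypothetical list: the second item has no "weight" element.
page <- minimal_html('
  <ul>
    <li><b>Alpha</b> weighs <span class="weight">10 kg</span></li>
    <li><b>Beta</b> has no recorded weight</li>
  </ul>
')

# One node per observation (row).
rows <- page |> html_elements("li")

# One result per row for each variable (column); a missing match becomes NA.
item_names   <- rows |> html_element("b") |> html_text2()
item_weights <- rows |> html_element(".weight") |> html_text2()
```

Using `html_element()` for the columns keeps every vector the same length as `rows`, which is what lets the results line up in a tibble.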
79 |
80 | ## Extract data {-}
81 |
82 | - `html_text()` for raw text (you probably don't want this)
83 | - `html_text2()` for clean text
84 | - `html_attr()` for an attribute value (e.g., the URL in `href`)
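As a small illustration of text versus attribute extraction, here is a sketch that assumes {rvest}'s `minimal_html()` helper and a hypothetical link to `example.com`:

```r
library(rvest)

# Hypothetical fragment containing a single link.
link <- minimal_html(
  '<p>See <a href="https://example.com/page">the example page</a></p>'
) |>
  html_element("a")

link_text <- html_text2(link)         # visible text of the element
link_url  <- html_attr(link, "href")  # value of the href attribute
```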
85 |
86 | ## Example: Star Wars Rows {-}
87 |
88 | [Star Wars films (1-7)](https://rvest.tidyverse.org/articles/starwars.html)
89 |
90 | ```{r web_scraping-star_wars-section}
91 | url <- "https://rvest.tidyverse.org/articles/starwars.html"
92 | html <- read_html(url)
93 |
94 | section <- html |> html_elements("section")
95 | section
96 | ```
97 |
98 | ## Example: Star Wars Directors {-}
99 |
100 | ```{r web_scraping-star_wars-directors}
101 | section |> html_element(".director") |> html_text2()
102 | ```
103 |
104 | ## Example: Star Wars Tibble {-}
105 |
106 | ```{r web_scraping-star_wars-tibble}
107 | tibble(
108 | title = section |>
109 | html_element("h2") |>
110 | html_text2(),
111 | released = section |>
112 | html_element("p") |>
113 | html_text2() |>
114 | stringr::str_remove("Released: ") |>
115 | readr::parse_date(),
116 | director = section |>
117 | html_element(".director") |>
118 | html_text2(),
119 | intro = section |>
120 | html_element(".crawl") |>
121 | html_text2()
122 | )
123 | ```
124 |
125 | ## Learn more {-}
126 |
127 | - [SelectorGadget](https://rvest.tidyverse.org/articles/selectorgadget.html)
128 | - [CSS Diner](https://flukeout.github.io/)
129 | - [MDN CSS selectors](https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors)
130 | - [*Web APIs with R* book club](https://DSLC.io/wapir)
131 |
132 | ## Meeting Videos {-}
133 |
134 | ### Cohort 7 {-}
135 |
136 | `r knitr::include_url("https://www.youtube.com/embed/G5_pr9HxbT4")`
137 |
138 |
139 | Meeting chat log
140 | ```
141 | 00:04:47 Oluwafemi Oyedele: Hi Tim!!!
142 | 00:05:27 Tim Newby: Hi Oluwafemi - can you hear me?
143 | 00:05:37 Oluwafemi Oyedele: Yes
144 | 00:11:53 Oluwafemi Oyedele: start
145 | 00:33:49 Oluwafemi Oyedele: https://rvest.tidyverse.org/articles/selectorgadget.html
146 | 00:40:24 Oluwafemi Oyedele: stop
147 | ```
148 |
149 |
150 | `r knitr::include_url("https://www.youtube.com/embed/HnJ3ZY1seY4")`
151 |
152 | ### Cohort 8 {-}
153 |
154 | `r knitr::include_url("https://www.youtube.com/embed/dOVWSSqUvt0")`
155 |
156 | ### Cohort 9 {-}
157 |
158 | `r knitr::include_url("https://www.youtube.com/embed/Hs928CH-_E4")`
159 |
--------------------------------------------------------------------------------
/08-workflow_getting_help.Rmd:
--------------------------------------------------------------------------------
1 | # Workflow: getting help
2 |
3 | **Learning objectives:**
4 |
5 | - Describe a few tips, beyond those in the book, for getting help and continuing to learn.
6 |
7 | ## Google
8 |
9 | - If you get an R error message and you have no idea what it means, chances are that someone else has been confused by it in the past, and there will be help somewhere on the web.
10 |
11 | - Typically adding "R" to a Google query is enough to restrict it to relevant results, but if the search isn't useful, try adding package names like "tidyverse" or "ggplot2" to narrow down the results.
12 |
13 | - e.g., "how to make a boxplot in R" vs. "how to make a boxplot in R with ggplot2".
14 |
15 | - If the error message isn't in English, run `Sys.setenv(LANGUAGE = "en")` and re-run the code as you're more likely to find help for English error messages.
16 |
17 | - If Google doesn't help, try spending a little time searching [Stack Overflow](https://stackoverflow.com/) for an existing answer. Include `[R]` in the search to restrict it to questions and answers that use R.
18 |
19 | ## Reprex
20 |
21 | - If your googling doesn't find anything useful, it's a really good idea to prepare a **reprex**, short for minimal **repr**oducible **ex**ample.
22 |
23 | - A good reprex makes it easier for other people to help you, and often you'll figure out the problem yourself in the course of making it.
24 |
25 | - There are two parts to creating a reprex:
26 |
27 | - *Make your code reproducible*: Capture everything, i.e. include any `library()` calls and create all necessary objects.
28 | - *Make your code minimal*. Strip away everything that is not directly related to your problem by creating a much smaller and simpler R object than the one you're facing in real life or even using built-in data.
29 |
30 | - Creating a reprex may sound like a lot of work, but it has a great payoff:
31 |
32 | - Creating an excellent reprex often reveals the source of your problem and may allow you to answer your own question.
33 |
34 | - You'll capture the essence of your problem in a way that is easy for others to play with, which improves your chances of getting help.
35 |
36 | - The easiest way to avoid accidentally missing something when creating a reprex by hand is to use the [`reprex`](https://reprex.tidyverse.org/) package.
37 |
38 | ## Making reprexes reproducible
39 |
40 | - There are three things you need to include to make your example reproducible: required packages, data, and code.
41 |
42 | - Packages should be loaded at the top of the script so it's easy to see which ones the example needs.
43 |
44 | - Check that you're using the latest version of each package: you may have discovered a bug that's been fixed since you installed or last updated the package. For packages in the tidyverse, the easiest way to check is to run `tidyverse_update()`.
45 |
46 | - The easiest way to include data is to use `dput()` to generate the R code needed to recreate it. For example, to recreate the mtcars dataset in R, perform the following steps:
47 |
48 | - Run `dput(mtcars)` in R.
49 | - Copy the output.
50 | - In your reprex, type `mtcars <-`, then paste.
51 | - Alternatively, click Addins, then Render reprex.
52 |
53 | - Spend a little bit of time ensuring that your code is easy for others to read:
54 |
55 | - Make sure you've used spaces and your variable names are concise yet informative.
56 |
57 | - Use comments to indicate where your problem lies.
58 |
59 | - Do your best to remove everything that is not related to the problem because the shorter your code is, the easier it is to understand and the easier it is to fix.
60 |
61 | - Try to use the smallest subset of your data that still reveals the problem, and finish by checking that you have actually made a reproducible example by starting a fresh R session and copying and pasting your script.
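The `dput()` step can be sketched in base R as follows. The data frame `df` is a hypothetical stand-in for your own data, and the `structure(...)` call is what `dput(df)` is expected to print for it on a recent version of R.

```r
# Hypothetical small data frame standing in for "your" data.
df <- data.frame(
  x = c(1, 2, NA),
  g = c("a", "b", "b"),
  stringsAsFactors = FALSE
)

# Prints R code that recreates df.
dput(df)

# Paste that printed code into the reprex and assign it to a name.
df_copy <- structure(
  list(x = c(1, 2, NA), g = c("a", "b", "b")),
  class = "data.frame",
  row.names = c(NA, -3L)
)

# Check that the recreated object matches the original.
identical(df, df_copy)
```

Anyone who copies the `structure(...)` call gets the same object without needing access to your files.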
62 |
63 | ## Investing in yourself
64 |
65 | - It will take some practice to learn to create good, truly minimal reprexes. However, learning to ask questions that include the code, and investing the time to make it reproducible, will continue to pay off as you learn and master R.
66 |
67 | - Also, prepare yourself to solve problems before they occur: investing a little time in learning R each day will pay off handsomely in the long run.
68 |
69 | - One way is to follow what the tidyverse team is doing on the [tidyverse blog](https://www.tidyverse.org/blog/).
70 | - To keep up with the R community more broadly, we recommend reading [R Weekly](https://rweekly.org/), a community effort to aggregate the most interesting news in the R community each week.
71 |
72 | ## Meeting Videos
73 |
74 | ### Cohort 7
75 |
76 | `r knitr::include_url("https://www.youtube.com/embed/kmc54BI9GTg")`
77 |
78 |
79 |
80 | Meeting chat log
81 |
82 | ```
83 | 00:27:53 Oluwafemi Oyedele: https://www.youtube.com/watch?v=5gqksthQ0cM
84 | 00:43:48 Oluwafemi Oyedele: #TidyTuesday
85 | ```
86 |
87 |
88 |
89 | ### Cohort 8
90 |
91 | `r knitr::include_url("https://www.youtube.com/embed/rbYO0oVkJC4")`
92 |
--------------------------------------------------------------------------------
/21-databases.Rmd:
--------------------------------------------------------------------------------
1 | # Databases
2 |
3 | **Learning objectives:**
4 |
5 | - Use {DBI} to connect to a database and retrieve data.
6 | - Use {dbplyr} to translate dplyr code to SQL.
7 |
8 | ```{r 21-packages-used, message=FALSE, warning=FALSE}
9 | library(DBI)
10 | library(dbplyr)
11 | library(tidyverse)
12 | ```
13 |
14 | ## Database basics {-}
15 |
16 | 
17 |
18 | - database (db) = collection of data frames (dfs)
19 | - each df = "table"
20 | - named columns where every value is the same type
21 | - db tables vs dfs:
22 | - db tables on disk (can be huge), dfs in memory (limited)
23 | - db tables have indexes, dfs don't
24 | - dbs row-oriented for fast data collection, dfs column-oriented for fast analysis
25 |
26 | ## Connecting to a database {-}
27 |
28 | - {DBI} = generic SQL interface
29 | - Specific package for your DBMS ({RPostgres}, {RMariaDB}, {duckdb}, etc)
30 | - {odbc} if no specific package available
31 |
32 | ```{r}
33 | con <- DBI::dbConnect(duckdb::duckdb())
34 | ```
35 |
36 | - When using duckdb in a project
37 | ```{r,eval=FALSE,warning=FALSE,message=FALSE}
38 | con <- DBI::dbConnect(duckdb::duckdb(), dbdir = "duckdb")
39 | ```
40 |
41 | ## Load some data {-}
42 |
43 | ```{r}
44 | dbWriteTable(con, "mpg", ggplot2::mpg)
45 | dbWriteTable(con, "diamonds", ggplot2::diamonds)
46 | ```
47 |
48 | ## DBI basics {-}
49 | ```{r}
50 | dbListTables(con)
51 |
52 |
53 | con |>
54 | dbReadTable("diamonds") |>
55 | as_tibble()
56 | ```
57 |
58 | - SQL Syntax
59 |
60 | ```{r}
61 | sql <- "
62 | SELECT carat, cut, clarity, color, price
63 | FROM diamonds
64 | WHERE price > 15000
65 | "
66 | ```
67 |
68 | ```{r}
69 | as_tibble(dbGetQuery(con, sql))
70 | ```
71 |
72 | ## dbplyr basics {-}
73 |
74 | ```{r}
75 | diamonds_db <- tbl(con, "diamonds")
76 |
77 | diamonds_db
78 | ```
79 |
80 | ```{r}
81 | big_diamonds_db <- diamonds_db |>
82 | filter(price > 15000) |>
83 | select(carat:clarity, price)
84 |
85 | big_diamonds_db
86 | ```
87 |
88 | ```{r}
89 | big_diamonds_db |>
90 | show_query()
91 | ```
92 |
93 | - `collect()` moves data into R
94 |
95 | ```{r}
96 | big_diamonds <- big_diamonds_db |>
97 | collect()
98 | big_diamonds
99 | ```
100 |
101 | ## SQL {-}
102 |
103 | ```{r}
104 | dbplyr::copy_nycflights13(con)
105 |
106 |
107 | flights <- tbl(con, "flights")
108 | planes <- tbl(con, "planes")
109 | ```
110 |
111 | ## SQL basics {-}
112 |
113 | - *statements* = top level
114 | - `CREATE` = new tables
115 | - `INSERT` = add data
116 | - `SELECT` = retrieve data
117 | - aka "queries"
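To make the three statement types concrete, here is a minimal sketch that assumes {DBI} with an in-memory {duckdb} database. The connection `demo_con`, the `pets` table, and its contents are hypothetical, kept separate from the `con` used in the rest of these notes.

```r
library(DBI)

# Separate, throwaway in-memory database for this sketch.
demo_con <- dbConnect(duckdb::duckdb())

# CREATE defines a new table.
dbExecute(demo_con, "CREATE TABLE pets (name VARCHAR, age INTEGER)")

# INSERT adds rows to it.
dbExecute(demo_con, "INSERT INTO pets VALUES ('Rex', 3), ('Tom', 5)")

# SELECT (a query) retrieves data as a data frame.
old_pets <- dbGetQuery(demo_con, "SELECT name FROM pets WHERE age > 4")

dbDisconnect(demo_con, shutdown = TRUE)
```

`dbExecute()` is used for statements that change the database, while `dbGetQuery()` is used for queries that return rows.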
118 |
119 | ```{r}
120 | flights |> show_query()
121 |
122 | planes |> show_query()
123 |
124 | ```
125 |
126 | - `WHERE` = `filter()`
127 | - `ORDER BY` = `arrange()`
128 |
129 | ```{r}
130 | flights |>
131 | filter(dest == "IAH") |>
132 | arrange(dep_delay) |>
133 | show_query()
134 | ```
135 |
136 | ## SELECT {-}
137 |
138 | `SELECT` = tons of things!
139 |
140 | - `select()`, `rename()`, and `relocate()`
141 |
142 | ```{r}
143 | planes |>
144 | select(tailnum, type, manufacturer, model, year) |>
145 | show_query()
146 |
147 |
148 | planes |>
149 | select(tailnum, type, manufacturer, model, year) |> rename(year_built = year) |>
150 | show_query()
151 |
152 |
153 | planes |>
154 | select(tailnum, type, manufacturer, model, year) |>
155 | relocate(manufacturer, model, .before = type) |>
156 | show_query()
157 | ```
158 |
159 | Not shown: `mutate()`, `summarize()` are also `SELECT`
160 |
161 | ## Subqueries {-}
162 |
163 | Sometimes {dbplyr} uses subqueries to translate {dplyr} code
164 |
165 | - **subquery** = query used in `FROM` in place of a table
166 |
167 | ```{r}
168 | flights |>
169 | mutate(
170 | year1 = year + 1,
171 | year2 = year1 + 1
172 | ) |>
173 | show_query()
174 | ```
175 |
176 | ## Joins {-}
177 |
178 | SQL joins similar to {dplyr} joins
179 |
180 | ```{r}
181 | flights |>
182 | left_join(planes |> rename(year_built = year), by = "tailnum") |>
183 | show_query()
184 | ```
185 |
186 | ## Other verbs {-}
187 |
188 | - `distinct()`
189 | - `slice_*()`
190 | - `intersect()`
191 | - `tidyr::pivot_longer()`
192 | - `tidyr::pivot_wider()`
193 | - Full list on [dbplyr website](https://dbplyr.tidyverse.org/reference/)
194 |
195 | ## Function translations {-}
196 |
197 | How does {dbplyr} deal with `mean()` vs `median()`?
198 |
199 | ```{r}
200 | summarize_query <- function(df, ...) {
201 | df |>
202 | summarize(...) |>
203 | show_query()
204 | }
205 | mutate_query <- function(df, ...) {
206 | df |>
207 | mutate(..., .keep = "none") |>
208 | show_query()
209 | }
210 | ```
211 |
212 | ```{r}
213 | flights |>
214 | group_by(year, month, day) |>
215 | summarize_query(
216 | mean = mean(arr_delay, na.rm = TRUE),
217 | median = median(arr_delay, na.rm = TRUE)
218 | )
219 | ```
220 |
221 | ```{r}
222 | flights |>
223 | group_by(year, month, day) |>
224 | mutate_query(
225 | mean = mean(arr_delay, na.rm = TRUE),
226 | )
227 | ```
228 |
229 | ```{r}
230 | flights |>
231 | group_by(dest) |>
232 | arrange(time_hour) |>
233 | mutate_query(
234 | lead = lead(arr_delay),
235 | lag = lag(arr_delay)
236 | )
237 | ```
238 |
239 |
240 | ## Clean up {-}
241 |
242 | ```{r clean-up}
243 | dbDisconnect(con, shutdown = TRUE)
244 | ```
245 |
246 | ## Meeting Videos {-}
247 |
248 | ### Cohort 7 {-}
249 |
250 | `r knitr::include_url("https://www.youtube.com/embed/0AWywckm3W4")`
251 |
252 |
253 | Meeting chat log
254 | ```
255 | 00:09:36 Oluwafemi Oyedele: Hi Tim, Good Evening!!!
256 | 00:10:59 Tim Newby: Hi Oluwafemi :-)
257 | 00:14:10 Oluwafemi Oyedele: start
258 | 00:48:43 Oluwafemi Oyedele: https://dbplyr.tidyverse.org/reference/
259 | 00:48:58 Oluwafemi Oyedele: https://dbplyr.tidyverse.org/articles/dbplyr.html
260 | 00:56:01 Oluwafemi Oyedele: https://sqlfordatascientists.com/
261 | 00:56:09 Oluwafemi Oyedele: https://www.practicalsql.com/
262 | 00:57:28 Oluwafemi Oyedele: stop
263 | ```
264 |
265 |
266 |
267 | ### Cohort 8 {-}
268 |
269 | `r knitr::include_url("https://www.youtube.com/embed/ylTfwbQq1v0")`
270 |
271 | `r knitr::include_url("https://www.youtube.com/embed/HnJ3ZY1seY4")`
272 |
--------------------------------------------------------------------------------
/99-24-model_building.Rmd:
--------------------------------------------------------------------------------
1 | # Model building {-}
2 |
3 | **Learning objectives:**
4 |
5 | - Build a **linear model** to explain trends in data.
6 | - Examine the **residuals** of a model to identify remaining trends in data.
7 | - Perform **feature engineering** to explain trends in data.
8 | - Recognize some resources to **learn more about modeling.**
9 |
10 | ## EDA vs Prediction
11 |
12 | **Reminder:** This book focuses on exploratory data analysis, not prediction.
13 |
14 | 
15 |
16 | ## Build a Linear Model
17 |
18 | ```{r 99-24-setup, include = FALSE}
19 | # By this point these are probably already libraried, but I want to be sure.
20 | library(tidyverse)
21 | library(modelr)
22 | library(nycflights13)
23 | library(lubridate)
24 | ```
25 |
26 | ```{r 99-24-lm}
27 | diamonds2 <- diamonds %>%
28 | filter(carat <= 2.5) %>%
29 | mutate(log_price = log2(price), log_carat = log2(carat))
30 |
31 | mod_diamond <- lm(log_price ~ log_carat, data = diamonds2)
32 |
33 | grid <- diamonds2 %>%
34 | data_grid(carat = seq_range(carat, 20)) %>%
35 | mutate(log_carat = log2(carat)) %>%
36 | add_predictions(mod_diamond, "log_price") %>%
37 | mutate(price = 2 ^ log_price)
38 |
39 | ggplot(diamonds2) +
40 | aes(carat, price) +
41 | geom_hex(bins = 50) +
42 | geom_line(data = grid, color = "red", linewidth = 1)
43 | ```
44 |
45 | ## Examine Residuals
46 |
47 | ```{r 99-24-residuals}
48 | diamonds2 <- diamonds2 %>%
49 | add_residuals(mod_diamond, "log_resid")
50 |
51 | ggplot(diamonds2) +
52 | aes(log_carat, log_resid) +
53 | geom_hex(bins = 50)
54 | ```
55 |
56 | ```{r 99-24-residuals-plots}
57 | base_plot <- ggplot(diamonds2) +
58 | aes(y = log_resid) +
59 | geom_boxplot()
60 |
61 | base_plot +
62 | aes(cut)
63 |
64 | base_plot +
65 | aes(color)
66 |
67 | base_plot +
68 | aes(clarity)
69 | ```
70 |
71 | ## Another Diamonds Model
72 |
73 | ```{r 99-24-lm2}
74 | mod_diamond2 <- lm(
75 | log_price ~ log_carat + color + cut + clarity,
76 | data = diamonds2
77 | )
78 |
79 | plot_mod2 <- function(parameter) {
80 | grid <- diamonds2 %>%
81 | data_grid({{parameter}}, .model = mod_diamond2) %>%
82 | add_predictions(mod_diamond2)
83 |
84 | ggplot(grid) +
85 | aes(x = {{parameter}}, y = pred) +
86 | geom_point()
87 | }
88 |
89 | plot_mod2(cut)
90 | plot_mod2(color)
91 | plot_mod2(clarity)
92 | ```
93 |
94 | ```{r 99-24-diamond-leftovers}
95 | diamonds2 <- diamonds2 %>%
96 |   add_residuals(mod_diamond2, "log_resid2")
97 |
98 | ggplot(diamonds2) +
99 |   aes(log_carat, log_resid2) +
100 |   geom_hex(bins = 50)
101 | ```
102 |
103 | ## Feature Engineering
104 |
105 | ```{r 99-24-flights}
106 | daily <- flights %>%
107 |   mutate(date = make_date(year, month, day)) %>%
108 |   group_by(date) %>%
109 |   summarise(n = n())
110 |
111 | ggplot(daily) +
112 |   aes(date, n) +
113 |   geom_line()
114 | ```
115 |
116 | **Feature engineering** = deriving new variables (features) from existing data for use in models.
117 |
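A minimal base-R sketch of that idea, assuming a hypothetical `toy` data frame with made-up counts (not the `flights` data). It derives two new features, a weekday number and a weekend flag, from a single date column:

```r
# Hypothetical toy data: three consecutive dates with invented flight counts.
toy <- data.frame(
  date = as.Date(c("2013-01-04", "2013-01-05", "2013-01-06")),
  n = c(900, 700, 800)
)

# as.POSIXlt()$wday does not depend on the locale: 0 = Sunday, ..., 6 = Saturday.
toy$wday_num <- as.POSIXlt(toy$date)$wday

# A second derived feature: TRUE when the date falls on a Saturday or Sunday.
toy$is_weekend <- toy$wday_num %in% c(0, 6)

toy
```

The weekday chunk that follows applies the same idea to `daily`, using `lubridate::wday()` to get a labelled factor instead of a number.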
118 | ```{r 99-24-wday}
119 | daily <- daily %>%
120 |   mutate(wday = wday(date, label = TRUE, week_start = 1))
121 | ggplot(daily) +
122 |   aes(wday, n) +
123 |   geom_boxplot()
124 | ```
125 |
126 | ```{r 99-24-wday-mod}
127 | mod <- lm(n ~ wday, data = daily)
128 |
129 | grid <- daily %>%
130 |   data_grid(wday) %>%
131 |   add_predictions(mod, "n")
132 |
133 | ggplot(daily) +
134 |   aes(wday, n) +
135 |   geom_boxplot() +
136 |   geom_point(data = grid, colour = "red", size = 4)
137 | ```
138 |
139 | ```{r 99-24-wday-residuals}
140 | daily <- daily %>%
141 |   add_residuals(mod)
142 |
143 | base_plot <- ggplot(daily) +
144 |   aes(date, resid) +
145 |   geom_ref_line(h = 0) +
146 |   geom_line()
147 |
148 | base_plot
149 |
150 | base_plot +
151 |   aes(color = wday)
152 |
153 | base_plot +
154 |   geom_smooth(se = FALSE, span = 0.20)
155 | ```
156 |
157 | ```{r 99-24-wday-low}
158 | daily %>%
159 |   filter(resid < -100) %>%
160 |   pull(date, wday)
161 | ```
162 |
163 | ```{r 99-24-seasonal}
164 | term <- function(date) {
165 |   cut(date,
166 |     breaks = ymd(20130101, 20130605, 20130825, 20140101),
167 |     labels = c("spring", "summer", "fall")
168 |   )
169 | }
170 |
171 | daily <- daily %>%
172 |   mutate(term = term(date))
173 |
174 | mod2 <- MASS::rlm(n ~ wday * term, data = daily)
175 |
176 | daily %>%
177 |   add_residuals(mod2, "resid") %>%
178 |   ggplot() +
179 |   aes(date, resid) +
180 |   geom_hline(yintercept = 0, linewidth = 2, colour = "white") +
181 |   geom_line()
182 | ```
183 |
184 | ## Learning More
185 |
186 | - An Introduction to Statistical Learning (with Applications in R) ([statlearning.com](https://www.statlearning.com/) / #book_club-islr): Statistical explanations of various machine learning methods, with guidance on applying them in R. A good introduction to all of the types of models and why they work (or don't work) the way they do.
187 | - Tidy Modeling with R ([tmwr.org](https://www.tmwr.org/) / #book_club-tmwr): An opinionated introduction to using the tidymodels family of packages to build predictive models. Very hands-on and useful, but I think I might want to read it again after ISLR.
188 | - Feature Engineering and Selection: A Practical Approach for Predictive Models ([feat.engineering](http://www.feat.engineering/) / #book_club-feat_eng): Techniques for manipulating data to get better results out of models.
189 | - Applied Predictive Modeling ([github.com/topepo/tidy-apm](https://github.com/topepo/tidy-apm) / #project-tidy_apm): There isn't a free online version of this book yet, but it's at least theoretically in the works. This was published about 10 years ago by the leader of the tidymodels team, and he has started to update it to tidymodels code. I'd recommend *not* reading this one until/unless he takes that project back up (very possibly with the help of the DSLC community).
190 |
191 | ## Meeting Videos
192 |
193 | ### Cohort 5
194 |
195 | `r knitr::include_url("https://www.youtube.com/embed/jZmSbkkJIzQ")`
196 |
197 |
198 | Meeting chat log
199 | ```
200 | 00:18:47 Njoki Njuki Lucy: yes
201 | 00:56:00 Ryan Metcalf: @Sandra, here is a LARGE section to answer your question. I’m banking that Federica will provide a more specific code snippet….https://ggplot2-book.org/scales-guides.html#scales-guides
202 | 00:56:09 Federica Gazzelloni: https://ggplot2.tidyverse.org/reference/guide_colourbar.html
203 | 00:57:03 Federica Gazzelloni: ggplot()+geom_…()+guides()
204 | 00:58:35 Federica Gazzelloni: guides(color=guide_colourbar())
205 | ```
206 |
207 |
208 | ### Cohort 6
209 |
210 | `r knitr::include_url("https://www.youtube.com/embed/FXR0WWyqDf8")`
211 |
212 | `r knitr::include_url("https://www.youtube.com/embed/jMXyhgS4AVg")`
213 |
--------------------------------------------------------------------------------
/17-dates_and_times.Rmd:
--------------------------------------------------------------------------------
1 | # Dates and times
2 |
3 | **Learning objectives:**
4 |
5 | - **Create date** and **datetime** objects.
6 | - Work with **datetime components.**
7 | - Perform **arithmetic** on **timespans.**
8 | - Recognize ways to deal with **timezones** in R.
9 |
10 | ```{r dates-and-times-libraries, warning=FALSE, message=FALSE}
11 | library(tidyverse)
12 | library(nycflights13)
13 | ```
14 |
15 | ## Date/time objects {-}
16 |
17 | 3 types of date/time objects:
18 |
19 | - **date** = `<date>`
20 | - **time** = `