├── products
│   ├── poster
│   │   ├── readme.md
│   │   ├── media
│   │   │   ├── logo.png
│   │   │   ├── resulttable2.rds
│   │   │   ├── summarytable.rds
│   │   │   ├── height-weight.png
│   │   │   └── height-weight-stratified.png
│   │   ├── _extensions
│   │   │   └── quarto-ext
│   │   │       └── poster
│   │   │           ├── _extension.yml
│   │   │           ├── typst-show.typ
│   │   │           └── typst-template.typ
│   │   └── poster.qmd
│   ├── presentation
│   │   ├── media
│   │   │   ├── height-weight.png
│   │   │   ├── resulttable2.rds
│   │   │   ├── summarytable.rds
│   │   │   └── my-presentation-styling.css
│   │   ├── readme.md
│   │   └── presentation.qmd
│   ├── manuscript
│   │   ├── readme.md
│   │   ├── supplement
│   │   │   └── Supplementary-Material.qmd
│   │   └── Manuscript.qmd
│   └── README.md
├── assets
│   ├── placeholder.png
│   ├── antigen-recognition.png
│   ├── references
│   │   ├── 2020-mckay-ofid.pdf
│   │   └── 2020-mckay-prsb.pdf
│   ├── README.md
│   ├── dataanalysis-references.bib
│   ├── american-journal-of-epidemiology.csl
│   └── vancouver-author-date.csl
├── data
│   ├── raw-data
│   │   ├── exampledata.xlsx
│   │   └── README.md
│   ├── processed-data
│   │   ├── processeddata.rds
│   │   └── readme.md
│   └── README.md
├── results
│   ├── tables
│   │   ├── resulttable1.rds
│   │   ├── resulttable2.rds
│   │   ├── summarytable.rds
│   │   └── README.md
│   ├── figures
│   │   ├── height-weight.png
│   │   ├── README.md
│   │   ├── height-distribution.png
│   │   ├── weight-distribution.png
│   │   └── height-weight-stratified.png
│   ├── output
│   │   └── README.md
│   ├── README.md
│   └── large-files
│       └── README.md
├── code
│   ├── analysis-code
│   │   ├── README.md
│   │   └── statistical-analysis.R
│   ├── processing-code
│   │   ├── processingfile-v2_files
│   │   │   ├── figure-html
│   │   │   │   └── cleandata1-1.png
│   │   │   └── libs
│   │   │       ├── bootstrap
│   │   │       │   └── bootstrap-icons.woff
│   │   │       ├── quarto-html
│   │   │       │   ├── tippy.css
│   │   │       │   ├── quarto-syntax-highlighting.css
│   │   │       │   ├── anchor.min.js
│   │   │       │   └── popper.min.js
│   │   │       └── clipboard
│   │   │           └── clipboard.min.js
│   │   ├── processingfile-v1_files
│   │   │   ├── figure-html
│   │   │   │   └── unnamed-chunk-5-1.png
│   │   │   └── libs
│   │   │       ├── bootstrap
│   │   │       │   └── bootstrap-icons.woff
│   │   │       ├── quarto-html
│   │   │       │   ├── tippy.css
│   │   │       │   ├── quarto-syntax-highlighting.css
│   │   │       │   ├── anchor.min.js
│   │   │       │   ├── popper.min.js
│   │   │       │   └── tippy.umd.min.js
│   │   │       └── clipboard
│   │   │           └── clipboard.min.js
│   │   ├── readme.md
│   │   ├── processingfile-v2.qmd
│   │   ├── processingfile-v1.qmd
│   │   └── processingcode.R
│   ├── eda-code
│   │   ├── readme.md
│   │   ├── edacode.R
│   │   ├── eda-v2.qmd
│   │   └── eda.qmd
│   └── README.md
├── data-analysis-template.Rproj
├── .gitignore
└── README.md
/products/poster/readme.md:
--------------------------------------------------------------------------------
1 | This folder contains an example poster using Quarto and typst.
2 |
3 |
--------------------------------------------------------------------------------
/assets/placeholder.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/assets/placeholder.png
--------------------------------------------------------------------------------
/assets/antigen-recognition.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/assets/antigen-recognition.png
--------------------------------------------------------------------------------
/data/raw-data/exampledata.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/data/raw-data/exampledata.xlsx
--------------------------------------------------------------------------------
/products/poster/media/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/products/poster/media/logo.png
--------------------------------------------------------------------------------
/results/tables/resulttable1.rds:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/results/tables/resulttable1.rds
--------------------------------------------------------------------------------
/results/tables/resulttable2.rds:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/results/tables/resulttable2.rds
--------------------------------------------------------------------------------
/results/tables/summarytable.rds:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/results/tables/summarytable.rds
--------------------------------------------------------------------------------
/results/figures/height-weight.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/results/figures/height-weight.png
--------------------------------------------------------------------------------
/results/figures/README.md:
--------------------------------------------------------------------------------
1 | # figures
2 |
3 | Folder for all figures.
4 |
5 | You can create further sub-folders if that makes sense.
6 |
--------------------------------------------------------------------------------
/assets/references/2020-mckay-ofid.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/assets/references/2020-mckay-ofid.pdf
--------------------------------------------------------------------------------
/assets/references/2020-mckay-prsb.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/assets/references/2020-mckay-prsb.pdf
--------------------------------------------------------------------------------
/data/processed-data/processeddata.rds:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/data/processed-data/processeddata.rds
--------------------------------------------------------------------------------
/products/poster/media/resulttable2.rds:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/products/poster/media/resulttable2.rds
--------------------------------------------------------------------------------
/products/poster/media/summarytable.rds:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/products/poster/media/summarytable.rds
--------------------------------------------------------------------------------
/products/poster/media/height-weight.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/products/poster/media/height-weight.png
--------------------------------------------------------------------------------
/results/figures/height-distribution.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/results/figures/height-distribution.png
--------------------------------------------------------------------------------
/results/figures/weight-distribution.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/results/figures/weight-distribution.png
--------------------------------------------------------------------------------
/products/presentation/media/height-weight.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/products/presentation/media/height-weight.png
--------------------------------------------------------------------------------
/products/presentation/media/resulttable2.rds:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/products/presentation/media/resulttable2.rds
--------------------------------------------------------------------------------
/products/presentation/media/summarytable.rds:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/products/presentation/media/summarytable.rds
--------------------------------------------------------------------------------
/results/figures/height-weight-stratified.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/results/figures/height-weight-stratified.png
--------------------------------------------------------------------------------
/products/poster/media/height-weight-stratified.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/products/poster/media/height-weight-stratified.png
--------------------------------------------------------------------------------
/results/tables/README.md:
--------------------------------------------------------------------------------
1 | # tables
2 |
3 | Folder for all tables (if you use R, often stored as Rds files)
4 |
5 | You can create further sub-folders if that makes sense.
6 |
--------------------------------------------------------------------------------
/code/analysis-code/README.md:
--------------------------------------------------------------------------------
1 | # analysis-code
2 |
3 | This folder contains an R script with a bit of a statistical analysis. This is only implemented as an R script, no Quarto version.
4 |
--------------------------------------------------------------------------------
/code/processing-code/processingfile-v2_files/figure-html/cleandata1-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/code/processing-code/processingfile-v2_files/figure-html/cleandata1-1.png
--------------------------------------------------------------------------------
/code/processing-code/processingfile-v1_files/figure-html/unnamed-chunk-5-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/code/processing-code/processingfile-v1_files/figure-html/unnamed-chunk-5-1.png
--------------------------------------------------------------------------------
/results/output/README.md:
--------------------------------------------------------------------------------
1 | # output
2 |
3 | Folder for output files from models or other analyses. These need to be
4 | further processed into figures or tables for presentation.
5 |
6 | There's currently no example present.
--------------------------------------------------------------------------------
/code/processing-code/processingfile-v1_files/libs/bootstrap/bootstrap-icons.woff:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/code/processing-code/processingfile-v1_files/libs/bootstrap/bootstrap-icons.woff
--------------------------------------------------------------------------------
/code/processing-code/processingfile-v2_files/libs/bootstrap/bootstrap-icons.woff:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ahgroup/data-analysis-template/HEAD/code/processing-code/processingfile-v2_files/libs/bootstrap/bootstrap-icons.woff
--------------------------------------------------------------------------------
/data-analysis-template.Rproj:
--------------------------------------------------------------------------------
1 | Version: 1.0
2 |
3 | RestoreWorkspace: Default
4 | SaveWorkspace: Default
5 | AlwaysSaveHistory: Default
6 |
7 | EnableCodeIndexing: Yes
8 | UseSpacesForTab: Yes
9 | NumSpacesForTab: 2
10 | Encoding: UTF-8
11 |
12 | RnwWeave: Sweave
13 | LaTeX: pdfLaTeX
14 |
--------------------------------------------------------------------------------
/products/poster/_extensions/quarto-ext/poster/_extension.yml:
--------------------------------------------------------------------------------
1 | title: Poster
2 | author: Carlos Scheidegger
3 | version: 1.0.0
4 | quarto-required: ">=1.4.415"
5 | contributes:
6 | formats:
7 | typst:
8 | template-partials:
9 | - typst-template.typ
10 | - typst-show.typ
11 |
12 |
--------------------------------------------------------------------------------
/code/eda-code/readme.md:
--------------------------------------------------------------------------------
1 | # eda-code
2 |
3 | This folder contains code to do a simple exploratory data analysis (EDA) on the processed/cleaned data.
4 | The code produces a few tables and figures, which are saved in the appropriate `results` sub-folder.
5 |
6 | It's the same code done 3 times. For explanations on the 3 different ways, see the readme file in the `processing-code` folder.
7 |
8 |
--------------------------------------------------------------------------------
/data/processed-data/readme.md:
--------------------------------------------------------------------------------
1 | # processed-data
2 |
3 | This folder contains data that has been processed and cleaned by code.
4 |
5 | Any files located in here are based on the raw data and can be re-created running the various processing/cleaning code scripts in the `code` folder.
6 |
7 | You could add a codebook here, but you could also just provide enough comments in the code that produces the content in this folder for users to understand what is saved in this location.
--------------------------------------------------------------------------------
/results/README.md:
--------------------------------------------------------------------------------
1 | # results
2 |
3 | This folder and subfolders contain results produced by the code, such as figures and tables, and other files.
4 |
5 | A special folder for large files exists. This folder is set in .gitignore to be ignored when pushing/pulling. See the readme in that folder for details.
6 |
7 | Structure the folders inside `results` such that they make sense for your specific analysis. Provide enough documentation that someone can understand what you are doing and what goes where. `readme.md` files inside each folder are a good idea.
8 |
--------------------------------------------------------------------------------
/products/presentation/readme.md:
--------------------------------------------------------------------------------
1 | This folder contains an example of a slide presentation using Quarto and the `revealjs` output format.
2 |
3 | The general suggestion is to place figures/tables/Rds files etc. that are used in the presentation in the `media` folder.
4 |
5 | You could also pull them from the main project `results` folder, but copying them into `media` keeps everything related to your presentation in one place. If the content/figures inside `results` continue to be changed/updated, the copies in `media` preserve the versions you actually used in your presentation.
6 |
7 |
--------------------------------------------------------------------------------
/products/manuscript/readme.md:
--------------------------------------------------------------------------------
1 | This folder contains a template for an academic manuscript. The content of the template is structured as a report for a class, but you can easily replace it with whatever structure you need.
2 |
3 | Most manuscripts these days have supplementary material; place it in the `supplement` folder. (You can have the supplement inside the `manuscript` folder or next to it, whichever is better for your setup.)
4 |
5 | Figures/tables/etc. should be pulled from their respective locations by code, as shown in the example.
6 |
7 | Based on what most journals want, it is generally best to have the main manuscript render to a Word file, and the supplement to a pdf. Deviations might be necessary, based on specific circumstances.
8 |
--------------------------------------------------------------------------------
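A minimal sketch of what "pulled from their respective locations by code" in the manuscript readme above can look like for a table stored as an `.rds` file. It is written to run on its own, so a temporary folder stands in for the project root (in the template, `here::here()` would build the path instead), and the table values are made-up placeholders, not results from the analysis.

```r
library(knitr)

# Hypothetical stand-in for the project root (assumption for this sketch)
project_root <- tempfile("project-")
tables_dir <- file.path(project_root, "results", "tables")
dir.create(tables_dir, recursive = TRUE)

# Stand-in for a table saved earlier by the analysis code (placeholder values)
placeholder_table <- data.frame(
  term = c("(Intercept)", "Weight"),
  estimate = c(1.5, 0.2)
)
saveRDS(placeholder_table, file.path(tables_dir, "resulttable1.rds"))

# What a manuscript chunk would do: load the saved table and render it
resulttable <- readRDS(file.path(tables_dir, "resulttable1.rds"))
knitr::kable(resulttable)
```

Because the manuscript only reads the saved file, re-running the analysis script and re-rendering the manuscript is enough to refresh the table.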
/data/raw-data/README.md:
--------------------------------------------------------------------------------
1 | # raw-data
2 |
3 | This folder should contain all raw data. As needed add sub-folders.
4 |
5 | Currently, as an example, it contains a simple made-up data-set in an Excel file.
6 |
7 | The dataset contains the variables `Height`, `Weight` and `Gender` of a few imaginary individuals.
8 |
9 | The dataset purposefully contains some faulty entries that need to be cleaned.
10 |
11 | Generally, any dataset should contain some meta-data explaining what each variable in the dataset is. (This is often called a **Codebook**.) For this simple example, the codebook is given as a second sheet in the Excel file.
12 |
13 | This raw data-set should generally not be edited by hand. It should instead be loaded and processed/cleaned using code.
14 |
15 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # History files
2 | .Rhistory
3 | .Rapp.history
4 |
5 | # Session Data files
6 | .RData
7 |
8 | # Example code in package build process
9 | *-Ex.R
10 |
11 | # Output files from R CMD build
12 | /*.tar.gz
13 |
14 | # Output files from R CMD check
15 | /*.Rcheck/
16 |
17 | # RStudio files
18 | .Rproj.user/
19 |
20 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
21 | .httr-oauth
22 |
23 | # knitr and R markdown default cache directories
24 | /*_cache/
25 | /cache/
26 |
27 | # Temporary files created by R markdown
28 | *.utf8.md
29 | *.knit.md
30 |
31 | # MacOS specific .DS_Store files
32 | .DS_Store
33 |
34 | # Folder for large files
35 | results/large-files/*
36 |
37 | # But we want to commit the folder and the readme!
38 | !results/large-files/README.md
39 |
--------------------------------------------------------------------------------
/results/large-files/README.md:
--------------------------------------------------------------------------------
1 | # large-files
2 |
3 | This is where you should store the results of computations that produce
4 | large files, such as posterior samples from Bayesian models.
5 |
6 | Folder for any large files that are too big to be tracked with GitHub.
7 | That's generally anything above around 20MB.
8 |
9 | This folder is set in `.gitignore` to be ignored when pushing/pulling. (The folder itself and the readme will be synced, but all other files will be ignored.)
10 |
11 | This allows large files to be part of the project. If you want to collaborate with someone or work on multiple computers, you need to manually share/transfer everything in this folder, e.g. by Dropbox/OneDrive/etc.
12 |
13 | If you use a cloud service, you should add the link to the storage location and instructions for obtaining access below.
14 |
15 | Link to large files: LINK-GOES-HERE
16 | Instructions for obtaining large files: email/discord message/etc. INFO HERE.
17 |
--------------------------------------------------------------------------------
/code/README.md:
--------------------------------------------------------------------------------
1 | # code
2 |
3 | This folder and sub-folders should contain all your code. This can be R or Quarto files (or files for other programming languages).
4 |
5 | Place your files in the appropriate sub-folders. You can structure the folders as appropriate.
6 |
7 | You can either have fewer large scripts, or multiple scripts that do only specific actions. Those can be R or Quarto files (or some other language/format). In either case, document the scripts and what goes on in them so well that someone else (including future you) can easily figure out what is happening.
8 |
9 | The scripts should load the appropriate data (e.g. raw or processed), perform actions, and save results (e.g. processed data, figures, computed values) in the appropriate folders. Document somewhere what inputs each script takes and where output is placed.
10 |
11 | If scripts need to be run in a specific order, document this. Either as comments in the script, or in a separate text file such as this readme file. Ideally of course in both locations.
12 |
13 |
--------------------------------------------------------------------------------
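One way to document a run order, as the code readme above recommends, is a small driver script. The sketch below is an assumption, not part of the template: `run_in_order()` is a hypothetical helper, and the processing, EDA, analysis order is inferred from the fact that the EDA and analysis scripts read `processeddata.rds`.

```r
# Hypothetical helper: source scripts in the given order.
# Returns a named logical vector showing which scripts were found and run.
run_in_order <- function(scripts) {
  vapply(scripts, function(path) {
    if (!file.exists(path)) {
      message("Skipping missing script: ", path)
      return(FALSE)
    }
    message("Running: ", path)
    # Each script runs in its own environment; an error stops the pipeline
    source(path, local = new.env())
    TRUE
  }, logical(1))
}

# Assumed order, with paths relative to the project root
pipeline <- c(
  file.path("code", "processing-code", "processingcode.R"),
  file.path("code", "eda-code", "edacode.R"),
  file.path("code", "analysis-code", "statistical-analysis.R")
)

status <- run_in_order(pipeline)
print(status)
```

Keeping the order in one vector means the documentation and the code that enforces it cannot drift apart.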
/assets/README.md:
--------------------------------------------------------------------------------
1 | # Assets
2 |
3 | This folder should contain all static content from outside sources which is
4 | neither code nor generated by code. This includes, but is not limited to,
5 | schematics generated from biorender, other images taken from outside sources,
6 | `csl` files, `bib` files, pdf files of references, etc.
7 |
8 | As needed, this can be organized further. For instance one could have separate folders for references or figures (again, not figures generated by code, only manually created figures, e.g. conceptual/schematic drawings).
9 |
10 | The `csl` files are referenced in your Quarto docs and influence the style of the references both in the text and in the reference list at the end.
11 | Journals require specific formats. While writing, I recommend you use either the more explicit (Author, year) format (as e.g. implemented in `vancouver-author-date.csl`) or the more concise [#] format that just shows numbers (as e.g. implemented in `american-journal-of-epidemiology.csl`). You can download many more reference style files from here:
12 | https://www.zotero.org/styles
13 |
14 |
--------------------------------------------------------------------------------
/code/processing-code/readme.md:
--------------------------------------------------------------------------------
1 | # processing-code
2 |
3 | This folder contains code for processing data.
4 |
5 | It currently contains 3 example files, showing the same processing steps done using slightly different setups with R and Quarto.
6 |
7 | * First, there is an R script that you can run which does all the cleaning.
8 | * Second, there is a Quarto file which contains exactly the same code as the R script, with some comments. Everything lives inside the Quarto file.
9 | * Third, my current favorite, is a Quarto file with an approach where the code is pulled in from the R script and run.
10 |
11 | The last version has the advantage of having code in one place for easy writing/debugging, and then being able to pull the code into the Quarto file for a nice combination of text/commentary and code.
12 |
13 | Each way of doing this is a reasonable approach; pick whichever one you prefer or makes the most sense for your setup. You can also mix and match. For instance, for an EDA task, it might make sense to produce a Quarto file. Then I would use the 2nd or 3rd approach. If you do a main analysis, then you might just want to have an R script that does the data analysis and saves the results to a file, for later use/processing. You might not need or want a Quarto file for that.
14 |
15 | Whichever approach you choose, add ample documentation/commentary so you and others can easily understand what's going on and what is done.
--------------------------------------------------------------------------------
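The third approach in the readme above (a Quarto file that pulls its code from an R script) is commonly done with `knitr::read_chunk()`, and the `## ---- label --------` markers in `edacode.R` fit that mechanism. The template's own Quarto files are not shown here, so treat this as a minimal sketch: the script contents and the chunk labels `setup` and `summarize` are made up for illustration.

```r
library(knitr)

# Hypothetical stand-in for an R script with labeled sections,
# using the same "## ---- label --------" marker style as edacode.R
script <- tempfile(fileext = ".R")
writeLines(c(
  "## ---- setup --------",
  "x <- c(1, 2, 3)",
  "",
  "## ---- summarize --------",
  "mean(x)"
), script)

# Register each labeled section of the script as a named knitr chunk
knitr::read_chunk(script)

# Inspect which chunk labels knitr now knows about
chunk_names <- names(knitr::knit_code$get())
print(chunk_names)
```

In the Quarto file, the `read_chunk()` call goes in an early chunk; an empty chunk whose label matches a section (for example `setup`) then runs that section's code when the document is rendered.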
/code/processing-code/processingfile-v1_files/libs/quarto-html/tippy.css:
--------------------------------------------------------------------------------
1 | .tippy-box[data-animation=fade][data-state=hidden]{opacity:0}[data-tippy-root]{max-width:calc(100vw - 10px)}.tippy-box{position:relative;background-color:#333;color:#fff;border-radius:4px;font-size:14px;line-height:1.4;white-space:normal;outline:0;transition-property:transform,visibility,opacity}.tippy-box[data-placement^=top]>.tippy-arrow{bottom:0}.tippy-box[data-placement^=top]>.tippy-arrow:before{bottom:-7px;left:0;border-width:8px 8px 0;border-top-color:initial;transform-origin:center top}.tippy-box[data-placement^=bottom]>.tippy-arrow{top:0}.tippy-box[data-placement^=bottom]>.tippy-arrow:before{top:-7px;left:0;border-width:0 8px 8px;border-bottom-color:initial;transform-origin:center bottom}.tippy-box[data-placement^=left]>.tippy-arrow{right:0}.tippy-box[data-placement^=left]>.tippy-arrow:before{border-width:8px 0 8px 8px;border-left-color:initial;right:-7px;transform-origin:center left}.tippy-box[data-placement^=right]>.tippy-arrow{left:0}.tippy-box[data-placement^=right]>.tippy-arrow:before{left:-7px;border-width:8px 8px 8px 0;border-right-color:initial;transform-origin:center right}.tippy-box[data-inertia][data-state=visible]{transition-timing-function:cubic-bezier(.54,1.5,.38,1.11)}.tippy-arrow{width:16px;height:16px;color:#333}.tippy-arrow:before{content:"";position:absolute;border-color:transparent;border-style:solid}.tippy-content{position:relative;padding:5px 9px;z-index:1}
--------------------------------------------------------------------------------
/code/processing-code/processingfile-v2_files/libs/quarto-html/tippy.css:
--------------------------------------------------------------------------------
1 | .tippy-box[data-animation=fade][data-state=hidden]{opacity:0}[data-tippy-root]{max-width:calc(100vw - 10px)}.tippy-box{position:relative;background-color:#333;color:#fff;border-radius:4px;font-size:14px;line-height:1.4;white-space:normal;outline:0;transition-property:transform,visibility,opacity}.tippy-box[data-placement^=top]>.tippy-arrow{bottom:0}.tippy-box[data-placement^=top]>.tippy-arrow:before{bottom:-7px;left:0;border-width:8px 8px 0;border-top-color:initial;transform-origin:center top}.tippy-box[data-placement^=bottom]>.tippy-arrow{top:0}.tippy-box[data-placement^=bottom]>.tippy-arrow:before{top:-7px;left:0;border-width:0 8px 8px;border-bottom-color:initial;transform-origin:center bottom}.tippy-box[data-placement^=left]>.tippy-arrow{right:0}.tippy-box[data-placement^=left]>.tippy-arrow:before{border-width:8px 0 8px 8px;border-left-color:initial;right:-7px;transform-origin:center left}.tippy-box[data-placement^=right]>.tippy-arrow{left:0}.tippy-box[data-placement^=right]>.tippy-arrow:before{left:-7px;border-width:8px 8px 8px 0;border-right-color:initial;transform-origin:center right}.tippy-box[data-inertia][data-state=visible]{transition-timing-function:cubic-bezier(.54,1.5,.38,1.11)}.tippy-arrow{width:16px;height:16px;color:#333}.tippy-arrow:before{content:"";position:absolute;border-color:transparent;border-style:solid}.tippy-content{position:relative;padding:5px 9px;z-index:1}
--------------------------------------------------------------------------------
/code/eda-code/edacode.R:
--------------------------------------------------------------------------------
1 | ## ---- packages --------
2 | #load needed packages. make sure they are installed.
3 | library(here) #for data loading/saving
4 | library(dplyr)
5 | library(skimr)
6 | library(ggplot2)
7 |
8 | ## ---- loaddata --------
9 | #Path to data. Note the use of the here() package and not absolute paths
10 | data_location <- here::here("data","processed-data","processeddata.rds")
11 | #load data
12 | mydata <- readRDS(data_location)
13 |
14 | ## ---- table1 --------
15 | summary_df = skimr::skim(mydata)
16 | print(summary_df)
17 | # save to file
18 | summarytable_file = here("results","tables", "summarytable.rds")
19 | saveRDS(summary_df, file = summarytable_file)
20 |
21 | ## ---- height --------
22 | p1 <- mydata %>% ggplot(aes(x=Height)) + geom_histogram()
23 | plot(p1)
24 | figure_file = here("results", "figures", "height-distribution.png")
25 | ggsave(filename = figure_file, plot=p1)
26 |
27 | ## ---- weight --------
28 | p2 <- mydata %>% ggplot(aes(x=Weight)) + geom_histogram()
29 | plot(p2)
30 | figure_file = here("results", "figures", "weight-distribution.png")
31 | ggsave(filename = figure_file, plot=p2)
32 |
33 | ## ---- fitfig1 --------
34 | p3 <- mydata %>% ggplot(aes(x=Height, y=Weight)) + geom_point() + geom_smooth(method='lm')
35 | plot(p3)
36 | figure_file = here("results","figures", "height-weight.png")
37 | ggsave(filename = figure_file, plot=p3)
38 | ## ---- fitfig2 --------
39 | p4 <- mydata %>% ggplot(aes(x=Height, y=Weight, color = Gender)) + geom_point() + geom_smooth(method='lm')
40 | plot(p4)
41 | figure_file = here("results","figures", "height-weight-stratified.png")
42 | ggsave(filename = figure_file, plot=p4)
43 |
44 |
45 |
--------------------------------------------------------------------------------
/products/README.md:
--------------------------------------------------------------------------------
1 | # products
2 |
3 | The folders inside this folder should contain all the products of your project.
4 |
5 | For a classical academic project, this will be a peer-reviewed manuscript. Often, you will also give presentations and/or make posters based on your work.
6 |
7 | The `manuscript` folder contains a template for an academic manuscript. The content of the template is structured as a report for a class, but you can easily replace it with whatever structure you need.
8 |
9 | Most manuscripts these days have supplementary material; place it in the `supplement` folder. (You can have the supplement inside the `manuscript` folder or next to it, whichever is better for your setup.)
10 |
11 | Often, you might make/give a presentation on your work and make slides for that. An example is the `presentation` folder.
12 |
13 | Similarly, it is common to make posters to present at conferences. An example using Quarto is in the `poster` folder.
14 |
15 | Often you need a library of references in bibtex format, as well as a CSL style file that determines reference formatting. Those files might be used by several of the products. They should be placed into the `assets` folder, or for presentations and posters, separately into their respective `media` folders.
16 |
17 | You can add further folders. For instance, if you have multiple presentations or posters, you might want to create subfolders for each.
18 | Or you could have a `blog-post` folder if you plan to write a blog post. It's up to you how to structure/organize, as long as it is somewhat logical and you document it. Ideally, put a readme file in each folder to orient others/your future self on what is going on.
19 |
20 |
--------------------------------------------------------------------------------
/code/analysis-code/statistical-analysis.R:
--------------------------------------------------------------------------------
1 | ###############################
2 | # analysis script
3 | #
4 | #this script loads the processed, cleaned data, does a simple analysis
5 | #and saves the results to the results folder
6 |
7 | #load needed packages. make sure they are installed.
8 | library(ggplot2) #for plotting
9 | library(broom) #for cleaning up output from lm()
10 | library(here) #for data loading/saving
11 |
12 | #path to data
13 | #note the use of the here() package and not absolute paths
14 | data_location <- here::here("data","processed-data","processeddata.rds")
15 |
16 | #load data.
17 | mydata <- readRDS(data_location)
18 |
19 |
20 | ######################################
21 | #Data fitting/statistical analysis
22 | ######################################
23 |
24 | ############################
25 | #### First model fit
26 | # fit linear model using height as outcome, weight as predictor
27 |
28 | lmfit1 <- lm(Height ~ Weight, mydata)
29 |
30 | # place results from fit into a data frame with the tidy function
31 | lmtable1 <- broom::tidy(lmfit1)
32 |
33 | #look at fit results
34 | print(lmtable1)
35 |
36 | # save fit results table
37 | table_file1 = here("results", "tables", "resulttable1.rds")
38 | saveRDS(lmtable1, file = table_file1)
39 |
40 | ############################
41 | #### Second model fit
42 | # fit linear model using height as outcome, weight and gender as predictors
43 |
44 | lmfit2 <- lm(Height ~ Weight + Gender, mydata)
45 |
46 | # place results from fit into a data frame with the tidy function
47 | lmtable2 <- broom::tidy(lmfit2)
48 |
49 | #look at fit results
50 | print(lmtable2)
51 |
52 | # save fit results table
53 | table_file2 = here("results", "tables", "resulttable2.rds")
54 | saveRDS(lmtable2, file = table_file2)
55 |
56 |
--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
1 | # data
2 |
3 | The folders inside this folder should contain all data at various stages.
4 |
5 | This data is being loaded/manipulated/changed/saved with code from the `code` folders.
6 |
7 | You should place the raw data in the `raw-data` folder and not edit it. Ever!
8 |
9 | Ideally, load the raw data into R and do all changes there with code, so everything is automatically reproducible and documented.
10 |
11 | Sometimes, you need to edit the files in the format you got. For instance, Excel files are sometimes so poorly formatted that it's close to impossible to read them into R, or the people you got the data from used color to code some information, which of course won't import into R. In those cases, you might have to make modifications in software other than R. If you need to make edits in whatever format you got the data (e.g. Excel), make a copy, place the copy in a separate folder, AND ONLY EDIT THE COPY. Also, write down somewhere the edits you made.
12 |
13 | Add as many sub-folders as suitable. If you only have a single processing step, one sub-folder for processed data is enough. If you have multiple stages of cleaning and processing, additional sub-folders might be useful. Adjust based on the complexity of your project.
14 |
15 | I suggest you save your processed and cleaned data as RDS or RDA/Rdata files. This preserves variable types such as factor, character, and numeric. If you save as CSV, that information is lost.
16 | However, CSV is better for sharing with others since it's plain text. If you use CSV, you might want to write down somewhere what each variable is.
17 |
18 | See here for some suggestions on how to store your processed data:
19 |
20 | http://www.sthda.com/english/wiki/saving-data-into-r-data-format-rds-and-rdata
21 |
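22 | As a minimal sketch of the difference (the data frame below is made up for illustration and is not the example data of this project), the following round trip shows that an RDS file keeps a factor and its levels, while a CSV file only keeps the text values:
23 |
24 | ```r
25 | # made-up example data with a factor column that has an unused level "O"
26 | df <- data.frame(
27 |   Height = c(180, 165),
28 |   Gender = factor(c("M", "F"), levels = c("M", "F", "O"))
29 | )
30 |
31 | # temporary files so nothing is written into the project folders
32 | rds_file <- tempfile(fileext = ".rds")
33 | csv_file <- tempfile(fileext = ".csv")
34 |
35 | # save the same data in both formats
36 | saveRDS(df, rds_file)
37 | write.csv(df, csv_file, row.names = FALSE)
38 |
39 | # read both files back in
40 | from_rds <- readRDS(rds_file)
41 | from_csv <- read.csv(csv_file)
42 |
43 | # the RDS copy is still a factor with all three levels
44 | str(from_rds$Gender)
45 | # the CSV copy no longer carries the factor definition
46 | str(from_csv$Gender)
47 | ```
48 |
49 | If you share a CSV file, a short codebook listing each variable and its allowed values makes up for the lost type information.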
--------------------------------------------------------------------------------
/products/presentation/presentation.qmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: Example Quarto slides
3 | subtitle: "with some thoughts on how to set up things"
4 | date: 2025-01-01
5 | author: "NAME"
6 | format:
7 | revealjs:
8 | theme: default
9 | css: "./media/my-presentation-styling.css"
10 | transition: none
11 | incremental: false
12 | cap-location: bottom
13 | self-contained: true
14 | slide-number: true
15 | show-slide-number: all
16 | auto-stretch: true
17 | smaller: false
18 | bibliography: ../../assets/dataanalysis-references.bib
19 | csl: ../../assets/american-journal-of-epidemiology.csl
20 | ---
21 |
22 | ## Overview
23 |
24 | - A few simple slides using the [`revealjs` (html) format](https://quarto.org/docs/presentations/revealjs/).
25 | - For other formats (e.g. Powerpoint, or Beamer/pdf), see [here](https://quarto.org/docs/guide/).
26 |
27 |
28 | ## Figures and tables
29 |
30 | The suggestion is to place figures/tables/Rds files etc. in the `media` folder.
31 |
32 | You could also pull them from the `results` folder, but the advantage of copying them into `media` is that you have everything related to your presentation in one place. If the content inside `results` continues to be changed/updated, copying it into `media` ensures you keep the version you used in your presentation.
33 |
34 |
35 |
36 | ## Example slide
37 |
38 | This shows the summary table. It is pulled in from an Rds file and rendered in an R chunk with `kable()` from the `knitr` package. In general, we suggest more powerful/flexible table packages, such as `gt` or `flextable`, but for this example, `kable` is good enough.
39 |
40 | Note that we could have loaded the data with `here()`, but then, if we ever wanted to copy this presentation to a folder outside the current project, we would have to adjust the path.
41 |
42 | ```{r}
43 | #| label: tbl-summarytable
44 | #| tbl-cap: "Data summary table."
45 | #| echo: FALSE
46 | resulttable=readRDS("./media/summarytable.rds")
47 | knitr::kable(resulttable)
48 | ```
49 |
50 | ## Example slide
51 |
52 | This shows a figure created by the analysis script. It is inserted using Quarto/Markdown syntax (not knitr code, but that would be possible too).
53 |
54 | ![](./media/height-weight.png){fig-align="center" width="420"}
55 |
56 | ## Example slide
57 |
58 | This shows the model fitting results as a table.
59 |
60 | ```{r}
61 | #| label: tbl-resulttable2
62 | #| tbl-cap: "Linear model fit table."
63 | #| echo: FALSE
64 | resulttable2 = readRDS("./media/resulttable2.rds")
65 | knitr::kable(resulttable2)
66 | ```
67 |
68 | ## Example slide with reference
69 |
70 | This paper [@leek2015] discusses types of analyses.
71 |
72 | ## Further Resources
73 |
74 | * [Quarto Presentation Documentation](https://quarto.org/docs/presentations/)
75 | * [Slidecraft 101](https://emilhvitfeldt.com/project/slidecraft-101/) is a nice blog post showing some more advanced things one can do with Quarto slides.
76 |
77 | ## References
78 |
--------------------------------------------------------------------------------
/products/presentation/media/my-presentation-styling.css:
--------------------------------------------------------------------------------
1 | /* Handel slide template
2 | * Last Modified: 2020-07-06
3 | -------------------------------------------------------------------------------- */
4 |
5 |
6 |
7 | #mytextbox
8 | {
9 | border: 10px solid #cfd7e0;
10 | color: black;
11 | font-weight: normal;
12 | }
13 |
14 | #bigfont
15 | {
16 | font-size: 300%;
17 | font-weight: bold;
18 | }
19 |
20 | #myimage
21 | {
22 | max-height: 10%;
23 | }
24 |
25 | #verysmall
26 | {
27 | font-size: 50%;
28 | }
29 |
30 | .verysmall
31 | {
32 | font-size: 50%;
33 | }
34 |
35 |
36 | .small { font-size: 70% }
37 |
38 |
39 | .smallfont pre {
40 | font-size: 50%;
41 | }
42 |
43 |
44 |
45 | /* Remove slide numbering if using xaringan */
46 | .remark-slide-number {
47 | display: none;
48 | }
49 |
50 |
51 | html {
52 | /* Box-model */
53 | margin: 0;
54 | padding: 0;
55 | border: 0;
56 |
57 | /* Visual */
58 | background-color: #fff; /* white */
59 | color: black; /* black */
60 | }
61 |
62 |
63 | /* Image
64 | ---------------*/
65 | img {
66 | max-width: 100%;
67 | max-height: 80%;
68 | padding: 3px;
69 |
70 | }
71 | .image {
72 | display: inline-block;
73 | margin-left: 0;
74 | margin-right: 0;
75 | padding: 0;
76 | text-align: center;
77 |
78 | /* Visual */
79 | border-radius: 1px;
80 | border: 0 solid #ccc; /* light grey */
81 | }
82 | .figure {
83 | margin-left: 0;
84 | margin-right: 0;
85 | padding: 0;
86 | text-align: center;
87 |
88 | /* Visual */
89 | border-radius: 1px;
90 | border: 0 solid #ccc; /* light grey */
91 | }
92 | .caption {
93 | padding: 0;
94 | margin: auto;
95 | margin-bottom: 3px;
96 | }
97 |
98 |
99 | /* Change color of regular text and h2
100 | also change spacing between heading and main text
101 | -------------------- */
102 | slides > slide {
103 | color: black;
104 | }
105 |
106 | h2 {
107 | color: #123c66; /* dark blue */
108 | margin-bottom: -30px;
109 | }
110 |
111 |
112 | /* Turn off page count in footer */
113 | slides > slide:not(.nobackground):after {
114 | content: '';
115 | }
116 |
117 |
118 | /* Keep text off left side of screen
119 | -------------------- */
120 | p {
121 | margin-right: 0;
122 | }
123 |
124 |
125 | /* My additions
126 | -------------------- */
127 |
128 | #classname
129 | {
130 | color: #123c66;
131 | font-weight: bold;
132 | text-align: center;
133 | font-size: 120%;
134 | }
135 |
136 | #classauthor
137 | {
138 | color: black;
139 | font-weight: normal;
140 | text-align: center;
141 |
142 | }
143 |
144 | #mylicense
145 | {
146 | text-align: center;
147 | font-size: 70%;
148 |
149 | }
150 |
151 | iframe {
152 | display: block;
153 | margin-left: auto;
154 | margin-right: auto;
155 | }
156 |
157 |
158 | /* Change vertical spacing for lists */
159 | li:not(:last-child) {
160 | margin-bottom: 5px;
161 | }
162 |
163 |
164 |
--------------------------------------------------------------------------------
/products/poster/_extensions/quarto-ext/poster/typst-show.typ:
--------------------------------------------------------------------------------
1 | // Typst custom formats typically consist of a 'typst-template.typ' (which is
2 | // the source code for a typst template) and a 'typst-show.typ' which calls the
3 | // template's function (forwarding Pandoc metadata values as required)
4 | //
5 | // This is an example 'typst-show.typ' file (based on the default template
6 | // that ships with Quarto). It calls the typst function named 'poster' which
7 | // is defined in the 'typst-template.typ' file.
8 | //
9 | // If you are creating or packaging a custom typst template you will likely
10 | // want to replace this file and 'typst-template.typ' entirely. You can find
11 | // documentation on creating typst templates here and some examples here:
12 | // - https://typst.app/docs/tutorial/making-a-template/
13 | // - https://github.com/typst/templates
14 |
15 | #show: doc => poster(
16 | $if(title)$ title: [$title$], $endif$
17 | // TODO: use Quarto's normalized metadata.
18 | $if(poster-authors)$ authors: [$poster-authors$], $endif$
19 | $if(departments)$ departments: [$departments$], $endif$
20 | $if(size)$ size: "$size$", $endif$
21 |
22 | // Institution logo.
23 | $if(institution-logo)$ univ_logo: "$institution-logo$", $endif$
24 |
25 | // Footer text.
26 | // For instance, Name of Conference, Date, Location.
27 | // or Course Name, Date, Instructor.
28 | $if(footer-text)$ footer_text: [$footer-text$], $endif$
29 |
30 | // Any URL, like a link to the conference website.
31 | $if(footer-url)$ footer_url: [$footer-url$], $endif$
32 |
33 | // Emails of the authors.
34 | $if(footer-emails)$ footer_email_ids: [$footer-emails$], $endif$
35 |
36 | // Color of the footer.
37 | $if(footer-color)$ footer_color: "$footer-color$", $endif$
38 |
39 | // DEFAULTS
40 | // ========
41 | // For 3-column posters, these are generally good defaults.
42 | // Tested on 36in x 24in, 48in x 36in, and 36in x 48in posters.
43 | // For 2-column posters, you may need to tweak these values.
44 | // See ./examples/example_2_column_18_24.typ for an example.
45 |
46 | // Any keywords or index terms that you want to highlight at the beginning.
47 | $if(keywords)$ keywords: ($for(keywords)$"$it$"$sep$, $endfor$), $endif$
48 |
49 | // Number of columns in the poster.
50 | $if(num-columns)$ num_columns: $num-columns$, $endif$
51 |
52 | // University logo's scale (in %).
53 | $if(univ-logo-scale)$ univ_logo_scale: $univ-logo-scale$, $endif$
54 |
55 | // University logo's column size (in in).
56 | $if(univ-logo-column-size)$ univ_logo_column_size: $univ-logo-column-size$, $endif$
57 |
58 | // Title and authors' column size (in in).
59 | $if(title-column-size)$ title_column_size: $title-column-size$, $endif$
60 |
61 | // Poster title's font size (in pt).
62 | $if(title-font-size)$ title_font_size: $title-font-size$, $endif$
63 |
64 | // Authors' font size (in pt).
65 | $if(authors-font-size)$ authors_font_size: $authors-font-size$, $endif$
66 |
67 | // Footer's URL and email font size (in pt).
68 | $if(footer-url-font-size)$ footer_url_font_size: $footer-url-font-size$, $endif$
69 |
70 | // Footer's text font size (in pt).
71 | $if(footer-text-font-size)$ footer_text_font_size: [$footer-text-font-size$], $endif$
72 |
73 | doc,
74 | )
75 |
--------------------------------------------------------------------------------
/code/eda-code/eda-v2.qmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "An example exploratory analysis script with code pulled in"
3 | date: "2024-02-07"
4 | format: html
5 | ---
6 |
7 |
8 |
9 | This Quarto file loads the cleaned data and does some exploring.
10 |
11 | This is essentially the same as the other `exploratory_analysis` Quarto file, but now the code is not inside this file. Instead, it is pulled in from the R script `edacode.R` using the code chunk labels.
12 |
13 | Also note that while here I split cleaning and exploring, this is iterative. You saw that as part of the processing, we already had to explore the data somewhat to understand how to clean it. In general, as you explore, you'll find things that need cleaning. As you clean, you can explore more. Therefore, at times it might make more sense to combine the cleaning and exploring code parts into a single R or Quarto file. Or split things in any other logical way.
14 |
15 | As part of the exploratory analysis, you should produce plots or tables or other summary quantities for the most interesting/important quantities in your data. Depending on the total number of variables in your dataset, explore all or some of the others. Figures produced here might be histograms or density plots, correlation plots, etc. Tables might summarize your data.
16 |
17 | Start by exploring one variable at a time. Then continue by creating plots or tables of the outcome(s) of interest and the predictor/exposure/input variables you are most interested in. If your dataset is small, you can do that for all variables.
18 |
19 | Plots produced here can be scatterplots, boxplots, violinplots, etc. Tables can be simple 2x2 tables or larger ones.
20 |
21 |
22 |
23 | # Setup
24 | Load the code chunks from the R script.
25 |
26 | ```{r, include=FALSE, cache=FALSE}
27 | knitr::read_chunk('edacode.R')
28 | ```
29 |
30 | Load the packages.
31 | ```{r,packages, echo=FALSE,message=FALSE}
32 | ```
33 |
34 |
35 | Load the data.
36 |
37 | ```{r,loaddata}
38 | ```
39 |
40 |
41 |
42 |
43 |
44 | # Data exploration through tables
45 |
46 | Showing a bit of code to produce and save a summary table.
47 |
48 |
49 | ```{r,table1}
50 | ```
51 |
52 | We are saving the results to the `results` folder. Depending on how many tables/figures you have, it might make sense to have separate folders for each. And/or you could have separate folders for exploratory tables/figures and for final tables/figures. Just choose a setup that makes sense for your project and works for you, and provide enough documentation that someone can understand what you are doing.
53 |
54 |
55 | # Data exploration through figures
56 |
57 | Histogram plots for the continuous outcomes.
58 |
59 | Height first.
60 |
61 | ```{r,height}
62 | ```
63 |
64 | Now weight.
65 |
66 | ```{r,weight}
67 | ```
68 |
69 | Now height as a function of weight.
70 |
71 | ```{r,fitfig1}
72 | ```
73 |
74 | Once more height as a function of weight, stratified by gender. Note that there is so little data that it's a bit silly. But we'll plot it anyway.
75 |
76 | ```{r,fitfig2}
77 | ```
78 |
79 |
80 |
81 | # Notes
82 |
83 | For your own explorations, tables and figures can be "quick and dirty". As long as you can see what's going on, there is no need to polish them. That's in contrast to figures you'll produce for your final products (paper, report, presentation, website, etc.). Those should look as nice, polished and easy to understand as possible.
84 |
85 |
86 |
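87 | As a rough sketch of how the chunk labels from the Setup section work: `knitr::read_chunk()` looks for comment lines of the form `## ---- label ----` in the R script and makes the code below each one available to an empty chunk with the same label. The script content below is a hypothetical stand-in written to a temporary file, not the actual `edacode.R`.
88 |
89 | ```r
90 | library(knitr)
91 |
92 | # hypothetical stand-in for a labeled script such as edacode.R
93 | script <- tempfile(fileext = ".R")
94 | writeLines(c(
95 |   "## ---- packages --------",
96 |   "library(ggplot2)",
97 |   "",
98 |   "## ---- loaddata --------",
99 |   "mydata <- data.frame(Height = c(180, 165), Weight = c(80, 60))"
100 | ), script)
101 |
102 | # register the labeled chunks with knitr
103 | knitr::read_chunk(script)
104 |
105 | # list the chunk labels that knitr now knows about
106 | names(knitr::knit_code$get())
107 | ```
108 |
109 | In the Quarto file, an empty chunk such as `{r,loaddata}` is then filled with the code that follows the matching label in the script.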
--------------------------------------------------------------------------------
/products/poster/poster.qmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: This is an academic poster with typst and quarto!
3 | format:
4 | poster-typst:
5 | size: "36x24"
6 | poster-authors: "A. Smith, B. Jones, C. Brown"
7 | departments: "Department of Something"
8 | institution-logo: "./media/logo.png"
9 | footer-text: "Some Conference"
10 | footer-url: ""
11 | footer-emails: "abc@example.com"
12 | footer-color: "ebcfb2"
13 | keywords: ["Poster", "Typst", "Quarto"]
14 | ---
15 |
16 |
17 | # Background
18 |
19 | * This is a poster created with Quarto using the Typst system.
20 | * It is based on this template and extension: https://github.com/quarto-ext/typst-templates/tree/main/poster
21 | * Typst can't run code, therefore any tables one wants to include need to be generated and saved as figures outside of this document.
22 | * Unfortunately, there is currently no robust way to make posters with Quarto. If you have suggestions for better Quarto-based alternatives, please let us know!
23 |
24 | # Abstract
25 |
26 | Abstract of your project.
27 |
28 | # Methods
29 |
30 | One can do equations.
31 |
32 | $$
33 | \sum_{k=1}^n k = \frac{n(n+1)}{2} = \frac{n^2 + n}{2}
34 | $$
35 |
36 |
37 | # Result
38 |
39 | This shows a figure created by the analysis script. It is inserted using Quarto/Markdown syntax (not knitr code, but that would be possible too).
40 |
41 | {fig-align="center" width="420"}
42 |
43 | And here is a table.
44 |
45 | ```{r}
46 | #| label: tbl-summarytable
47 | #| tbl-cap: "Data summary table."
48 | #| echo: FALSE
49 | resulttable=readRDS("./media/summarytable.rds")
50 | knitr::kable(resulttable)
51 | ```
52 |
53 |
54 | Here is another figure, now using a code chunk.
55 |
56 |
57 | ```{r}
58 | #| label: fig-2
59 | #| echo: FALSE
60 | #| fig-cap: "Caption for this figure."
61 | knitr::include_graphics("./media/height-weight-stratified.png")
62 | ```
63 |
64 | ## Another subsection with results
65 |
66 | This shows the model fitting results as a table.
67 |
68 | ```{r}
69 | #| label: tbl-resulttable2
70 | #| tbl-cap: "Linear model fit table."
71 | #| echo: FALSE
72 | resulttable2 = readRDS("./media/resulttable2.rds")
73 | knitr::kable(resulttable2)
74 | ```
75 |
76 |
77 |
78 |
79 | # Discussion
80 |
81 | We did some analysis.
82 |
83 |
84 |
85 | # Acknowledgements
86 |
87 | This project is partially supported by NIH contract 75N93019C00060.
88 |