├── .github
│   └── ISSUE_TEMPLATE
│       └── code-review-template.md
├── .gitignore
├── README.md
└── r-code-review-checklist.Rproj


/.github/ISSUE_TEMPLATE/code-review-template.md:
--------------------------------------------------------------------------------
 1 | ---
 2 | name: Code review template
 3 | about: This template provides a checklist for code review of data wrangling/analysis
 4 |   projects in R
 5 | title: "[REVIEW]"
 6 | labels: review
 7 | assignees: ''
 8 | 
 9 | ---
10 | 
11 | R code review checklist
12 | ===
13 | 
14 | Summary
15 | ---
16 | 
17 | This checklist is designed to serve as an [issue template](https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository) to assist in the code review process for data wrangling/analysis projects developed in R. The focus of the checklist _is not_ R package development and review; rather, it is aimed at teams of data scientists and/or data analysts who write scripts to generate tables, listings, figures, or any other analytic output.
18 | 
19 | The checklist follows principles set forth in [the tidyverse style guide](https://style.tidyverse.org/) and [Good enough practices in scientific computing](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510), and adheres to deodorizing strategies from [Code smells and feels](https://github.com/jennybc/code-smells-and-feels).
20 | 
21 | General
22 | ---
23 | 
24 | - [ ] The code is readable / easy to understand
25 | - [ ] The project executes / produces the intended deliverable on an independent machine (i.e. not the developer's computer) using only the code and resources embedded in the repository
26 | - [ ] All code is necessary; e.g. functions are not written which accomplish tasks already covered by other approved code bases
27 | - [ ] Run-time is not prohibitively long / is appropriate given the nature of the project
28 | 
29 | Documentation and organization
30 | ---
31 | 
32 | - [ ] The repository contains a `readme` which includes, at minimum, the purpose of the project (e.g. deliverables) and a set of instructions for executing the project from the code in the repository
33 | - [ ] File names succinctly describe the purpose of the file
34 | - [ ] File names do not contain special characters other than `-` or `_`, ideally with `-` used to separate words and `_` used to separate categories
35 | - [ ] Files that are to be run sequentially are prefixed with numbers, and single digits are padded with leading `0`s
36 | - [ ] Within files, logically separated chunks of code are divided by commented lines of `-` and/or `=` with a chunk descriptor
37 | - [ ] Comments within chunks begin with `#` and a single space, and are in sentence case ending in `.` only when they contain at least 2 sentences
38 | - [ ] All package dependencies required for a given script are loaded at the beginning of the file
39 | - [ ] Programs are appropriately decomposed into files/functions; ideally, each file is no more than 1 page long (approximately 60 lines)
40 | 
41 | Data management
42 | ---
43 | 
44 | - [ ] External locations of raw data (if the data are too large or are not permitted to be stored directly in the repository) are clearly annotated with written verification that the locations and raw data will not be modified
45 | - [ ] All steps used to process data are clearly identifiable (e.g. through good file naming and organization) and reproducible
46 | - [ ] Data are not hard-coded/manually written into any script
47 | - [ ] Appropriate policy for data storage and access, particularly PHI, is followed
48 | 
49 | Syntax
50 | ---
51 | 
52 | - [ ] Variable and function names use only lowercase letters, numbers, and `_` (to separate words within a name)
53 | - [ ] Variable names are nouns and function names are verbs
54 | - [ ] Commas always have a space after, and not before
55 | - [ ] A space is used before and after `()` with `if`, `for`, or `while` (and no spaces appear on either side of `(` or `)` with regular function calls)
56 | - [ ] Infix operators `==`, `+`, `-`, `<-`, and `=` are surrounded by spaces
57 | - [ ] `{` is the last character on its line, `}` is the first character on its line, and nested contents are indented by two spaces
58 | - [ ] The first `%>%` in a series is followed by a new line which is indented two spaces
59 | - [ ] Styling for `+` in ggplot layers follows the styling format of `%>%`
60 | - [ ] Code is limited to 80 characters per line, and multiple commands are not combined into one line with `;`
61 | - [ ] If arguments to a function don't all fit on one line, each argument is on its own indented line
62 | - [ ] Assignment is made with `<-` not `=`
63 | - [ ] Text is quoted with `"` not `'`, unless text already contains `"`
64 | - [ ] `TRUE` and `FALSE` are used, not `T` and `F`
65 | 
66 | Change control
67 | ---
68 | 
69 | - [ ] Iterative development in the project is annotated through descriptive commits
70 | - [ ] Major, minor, and patch modifications to the project are captured in release notes with corresponding release numbering (`major.minor.patch`)
71 | 
72 | Pitfalls
73 | ---
74 | 
75 | - [ ] No redundant code blocks or comments
76 | - [ ] Extraneous data are not stored in the repository or loaded in the project
77 | - [ ] No assignment in function calls
78 | - [ ] `else` is used sparingly, if at all (i.e. for readability, it is avoided in favor of `if` with a guard clause or `case_when` when the logic is needed in a `mutate`)
79 | - [ ] No instances of `attach()` or `setwd()`
80 | - [ ] No reliance on a system-specific startup file, such as `.Rprofile`
81 | - [ ] No use of magrittr shortcuts, such as `%<>%` or the omission of `()` on functions that don't have arguments
82 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | .Rproj.user
2 | .Rhistory
3 | .RData
4 | .httr-oauth
5 | .DS_Store
6 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | R code review checklist
 2 | ===
 3 | 
 4 | Summary
 5 | ---
 6 | 
 7 | This checklist is designed to serve as an [issue template](https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository) to assist in the code review process for data wrangling/analysis projects developed in R. The focus of the checklist _is not_ R package development and review; rather, it is aimed at teams of data scientists and/or data analysts who write scripts to generate tables, listings, figures, or any other analytic output.
 8 | 
 9 | The checklist follows principles set forth in [the tidyverse style guide](https://style.tidyverse.org/) and [Good enough practices in scientific computing](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510), and adheres to deodorizing strategies from [Code smells and feels](https://github.com/jennybc/code-smells-and-feels).
10 | 
11 | General
12 | ---
13 | 
14 | - [ ] The code is readable / easy to understand
15 | - [ ] The project executes / produces the intended deliverable on an independent machine (i.e. not the developer's computer) using only the code and resources embedded in the repository
16 | - [ ] All code is necessary; e.g. functions are not written which accomplish tasks already covered by other approved code bases
17 | - [ ] Run-time is not prohibitively long / is appropriate given the nature of the project
18 | 
19 | Documentation and organization
20 | ---
21 | 
22 | - [ ] The repository contains a `readme` which includes, at minimum, the purpose of the project (e.g. deliverables) and a set of instructions for executing the project from the code in the repository
23 | - [ ] File names succinctly describe the purpose of the file
24 | - [ ] File names do not contain special characters other than `-` or `_`, ideally with `-` used to separate words and `_` used to separate categories
25 | - [ ] Files that are to be run sequentially are prefixed with numbers, and single digits are padded with leading `0`s
26 | - [ ] Within files, logically separated chunks of code are divided by commented lines of `-` and/or `=` with a chunk descriptor
27 | - [ ] Comments within chunks begin with `#` and a single space, and are in sentence case ending in `.` only when they contain at least 2 sentences
28 | - [ ] All package dependencies required for a given script are loaded at the beginning of the file
29 | - [ ] Programs are appropriately decomposed into files/functions; ideally, each file is no more than 1 page long (approximately 60 lines)
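
As a rough illustration of the layout items above, a script might be organized along the lines of the sketch below. The file name `01_import-raw-data.R`, the object names, and the use of the built-in `mtcars` data as a stand-in for a raw file are assumptions made for this example only; they are not part of the checklist.

```r
# Hypothetical file: 01_import-raw-data.R

# Load packages ------------------------------------------------

library(datasets)
library(utils)

# Import data --------------------------------------------------

# Built-in data stand in for a raw file in this sketch
raw_cars <- head(mtcars, n = 10)

# Summarize data ===============================================

# Fuel economy is summarized across all imported rows
mean_mpg <- mean(raw_cars$mpg)
```

The numeric prefix keeps sequential scripts in run order when sorted by name, and the divider comments make each chunk easy to find when scanning the file.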
30 | 
31 | Data management
32 | ---
33 | 
34 | - [ ] External locations of raw data (if the data are too large or are not permitted to be stored directly in the repository) are clearly annotated with written verification that the locations and raw data will not be modified
35 | - [ ] All steps used to process data are clearly identifiable (e.g. through good file naming and organization) and reproducible
36 | - [ ] Data are not hard-coded/manually written into any script
37 | - [ ] Appropriate policy for data storage and access, particularly PHI, is followed
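
One possible way to keep raw data untouched and processing steps reproducible is sketched below. The directory names `data-raw` and `data-derived`, the file names, and the use of a temporary directory with the built-in `airquality` data are all assumptions for illustration; a real project would point at its own documented locations.

```r
# Hypothetical locations; a temporary directory stands in for the project
raw_dir <- file.path(tempdir(), "data-raw")
derived_dir <- file.path(tempdir(), "data-derived")
dir.create(raw_dir, showWarnings = FALSE)
dir.create(derived_dir, showWarnings = FALSE)

# Stand-in for a raw file that would normally be delivered externally
raw_path <- file.path(raw_dir, "air-quality.csv")
utils::write.csv(datasets::airquality, raw_path, row.names = FALSE)

# Raw data are read from file, not typed into the script
raw_air <- utils::read.csv(raw_path)

# Processed output goes to a separate location so the raw file is never
# overwritten
clean_air <- raw_air[!is.na(raw_air$Ozone), ]
utils::write.csv(
  clean_air,
  file.path(derived_dir, "air-quality_clean.csv"),
  row.names = FALSE
)
```

Keeping reads and writes in separate directories means every derived file can be deleted and regenerated from the raw data by rerunning the scripts.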
38 | 
39 | Syntax
40 | ---
41 | 
42 | - [ ] Variable and function names use only lowercase letters, numbers, and `_` (to separate words within a name)
43 | - [ ] Variable names are nouns and function names are verbs
44 | - [ ] Commas always have a space after, and not before
45 | - [ ] A space is used before and after `()` with `if`, `for`, or `while` (and no spaces appear on either side of `(` or `)` with regular function calls)
46 | - [ ] Infix operators `==`, `+`, `-`, `<-`, and `=` are surrounded by spaces
47 | - [ ] `{` is the last character on its line, `}` is the first character on its line, and nested contents are indented by two spaces
48 | - [ ] The first `%>%` in a series is followed by a new line which is indented two spaces
49 | - [ ] Styling for `+` in ggplot layers follows the styling format of `%>%`
50 | - [ ] Code is limited to 80 characters per line, and multiple commands are not combined into one line with `;`
51 | - [ ] If arguments to a function don't all fit on one line, each argument is on its own indented line
52 | - [ ] Assignment is made with `<-` not `=`
53 | - [ ] Text is quoted with `"` not `'`, unless text already contains `"`
54 | - [ ] `TRUE` and `FALSE` are used, not `T` and `F`
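
Several of the syntax items above can be seen together in a short base R sketch. The function `summarize_weights()` and its arguments are hypothetical names chosen for the example. The pipe and ggplot items are not shown because they depend on packages outside base R.

```r
# Function names are verbs and variable names are nouns
summarize_weights <- function(weights, remove_missing = TRUE) {
  if (length(weights) == 0) {
    return("No weights supplied")
  }

  mean_weight <- mean(weights, na.rm = remove_missing)

  # Each argument sits on its own indented line when a call is split
  paste(
    "Mean weight:",
    format(mean_weight, nsmall = 1),
    "kg"
  )
}

weight_summary <- summarize_weights(c(60.5, 72.25, NA))
```

The sketch uses `<-` for assignment, double quotes for text, `TRUE` rather than `T`, spaces around infix operators, a space between `if` and `(`, and two-space indentation inside braces.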
55 | 
56 | Change control
57 | ---
58 | 
59 | - [ ] Iterative development in the project is annotated through descriptive commits
60 | - [ ] Major, minor, and patch modifications to the project are captured in release notes with corresponding release numbering (`major.minor.patch`)
61 | 
62 | Pitfalls
63 | ---
64 | 
65 | - [ ] No redundant code blocks or comments
66 | - [ ] Extraneous data are not stored in the repository or loaded in the project
67 | - [ ] No assignment in function calls
68 | - [ ] `else` is used sparingly, if at all (i.e. for readability, it is avoided in favor of `if` with a guard clause or `case_when` when the logic is needed in a `mutate`)
69 | - [ ] No instances of `attach()` or `setwd()`
70 | - [ ] No reliance on a system-specific startup file, such as `.Rprofile`
71 | - [ ] No use of magrittr shortcuts, such as `%<>%` or the omission of `()` on functions that don't have arguments
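
The guard clause and assignment items above might look like the following in practice. The function `classify_run_time()` and its thresholds are hypothetical and exist only to show the pattern.

```r
# Hypothetical helper: guard clauses return early instead of chaining else
classify_run_time <- function(seconds) {
  if (is.na(seconds)) {
    return(NA_character_)
  }
  if (seconds < 60) {
    return("fast")
  }
  if (seconds < 600) {
    return("acceptable")
  }
  "slow"
}

# Values are assigned first and then passed to the call, rather than
# being assigned inside the function call itself
run_times <- c(12, 240, 3600, NA)
run_classes <- vapply(run_times, classify_run_time, character(1))
```

Each condition is handled and exited on its own, so a reader never has to track which `else` branch belongs to which `if`.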
72 | 


--------------------------------------------------------------------------------
/r-code-review-checklist.Rproj:
--------------------------------------------------------------------------------
 1 | Version: 1.0
 2 | 
 3 | RestoreWorkspace: Default
 4 | SaveWorkspace: Default
 5 | AlwaysSaveHistory: Default
 6 | 
 7 | EnableCodeIndexing: Yes
 8 | UseSpacesForTab: Yes
 9 | NumSpacesForTab: 2
10 | Encoding: UTF-8
11 | 
12 | RnwWeave: Sweave
13 | LaTeX: pdfLaTeX
14 | 
15 | AutoAppendNewline: Yes
16 | StripTrailingWhitespace: Yes
17 | 


--------------------------------------------------------------------------------