├── .github
    └── ISSUE_TEMPLATE
    │   └── code-review-template.md
├── .gitignore
├── README.md
└── r-code-review-checklist.Rproj


/.github/ISSUE_TEMPLATE/code-review-template.md:
--------------------------------------------------------------------------------
---
name: Code review template
about: This template provides a checklist for code review of data wrangling/analysis
  projects in R
title: "[REVIEW]"
labels: review
assignees: ''

---

R code review checklist
===

Summary
---

This checklist is designed to serve as an [issue template](https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository) to assist in the code review process for data wrangling/analysis projects developed in R. The focus of the checklist _is not_ R package development and review; rather, it is aimed at teams of data scientists and/or data analysts who write scripts to generate tables, listings, figures, or any other analytic output.

The checklist follows principles set forth in [the tidyverse style guide](https://style.tidyverse.org/) and [Good enough practices in scientific computing](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510), and adheres to deodorizing strategies from [Code smells and feels](https://github.com/jennybc/code-smells-and-feels).

General
---

- [ ] The code is readable / easy to understand
- [ ] The project executes / produces the intended deliverable on an independent machine (i.e. not the developer's computer) using only the code and resources embedded in the repository
- [ ] All code is necessary; e.g. no functions are written to accomplish tasks already covered by other approved code bases
- [ ] Run-time is not prohibitively long / is appropriate given the nature of the project

Documentation and organization
---

- [ ] The repository contains a `readme` which includes, at minimum, the purpose of the project (e.g. deliverables) and a set of instructions for executing the project from the code in the repository
- [ ] File names succinctly describe the purpose of the file
- [ ] File names do not contain special characters other than `-` or `_`, ideally with `-` used to separate words and `_` used to separate categories
- [ ] Files that are to be run sequentially are prefixed with numbers, and single digits are padded with leading `0`s
- [ ] Within files, logically separated chunks of code are divided by commented lines of `-` and/or `=` with a chunk descriptor
- [ ] Comments within chunks begin with `#` and a single space, and are in sentence case, ending in `.` only when they contain at least 2 sentences
- [ ] All package dependencies required for a given script are loaded at the beginning of the file
- [ ] Programs are appropriately decomposed into files/functions; ideally, each file is no more than 1 page long (approximately 60 lines)

Data management
---

- [ ] External locations of raw data (if the data are too large or are not permitted to be stored directly in the repository) are clearly annotated with written verification that the locations and raw data will not be modified
- [ ] All steps used to process data are clearly identifiable (e.g. through good file naming and organization) and reproducible
- [ ] Data are not hard-coded/manually written into any script
- [ ] Appropriate policy for data storage and access, particularly PHI, is followed

Syntax
---

- [ ] Variable and function names use only lowercase letters, numbers, and `_` (to separate words within a name)
- [ ] Variable names are nouns and function names are verbs
- [ ] Commas are always followed by a space and never preceded by one
- [ ] A space is used before and after `()` with `if`, `for`, or `while` (and no spaces appear on either side of `(` or `)` with regular function calls)
- [ ] Infix operators `==`, `+`, `-`, `<-`, and `=` are surrounded by spaces
- [ ] `{` is the last character on its line, `}` is the first character on its line, and nested contents are indented by two spaces
- [ ] The first `%>%` in a series is followed by a new line which is indented two spaces
- [ ] Styling for `+` in ggplot layers follows the styling format of `%>%`
- [ ] Code is limited to 80 characters per line, and multiple commands are not combined into one line with `;`
- [ ] If arguments to a function don't all fit on one line, each argument is on its own indented line
- [ ] Assignment is made with `<-` not `=`
- [ ] Text is quoted with `"` not `'`, unless the text already contains `"`
- [ ] `TRUE` and `FALSE` are used, not `T` and `F`

Change control
---

- [ ] Iterative development in the project is annotated through descriptive commits
- [ ] Major, minor, and patch modifications to the project are captured in release notes with corresponding release numbering (`major.minor.patch`)

Pitfalls
---

- [ ] No redundant code blocks or comments
- [ ] Extraneous data are not stored in the repository or loaded in the project
- [ ] No assignment in function calls
- [ ] `else` is used sparingly, if at all (i.e. for readability, it is avoided in favor of `if` with a guard clause or `case_when` when the logic is needed in a `mutate`)
- [ ] No instances of `attach()` or `setwd()`
- [ ] No reliance on a system-specific startup file, such as `.Rprofile`
- [ ] No use of the magrittr shortcut `%<>%`, and no omission of `()` on functions that don’t have arguments

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
.Rproj.user
.Rhistory
.Rdata
.httr-oauth
.DS_Store

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
R code review checklist
===

Summary
---

This checklist is designed to serve as an [issue template](https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository) to assist in the code review process for data wrangling/analysis projects developed in R. The focus of the checklist _is not_ R package development and review; rather, it is aimed at teams of data scientists and/or data analysts who write scripts to generate tables, listings, figures, or any other analytic output.

The checklist follows principles set forth in [the tidyverse style guide](https://style.tidyverse.org/) and [Good enough practices in scientific computing](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005510), and adheres to deodorizing strategies from [Code smells and feels](https://github.com/jennybc/code-smells-and-feels).

General
---

- [ ] The code is readable / easy to understand
- [ ] The project executes / produces the intended deliverable on an independent machine (i.e. not the developer's computer) using only the code and resources embedded in the repository
- [ ] All code is necessary; e.g. no functions are written to accomplish tasks already covered by other approved code bases
- [ ] Run-time is not prohibitively long / is appropriate given the nature of the project

Documentation and organization
---

- [ ] The repository contains a `readme` which includes, at minimum, the purpose of the project (e.g. deliverables) and a set of instructions for executing the project from the code in the repository
- [ ] File names succinctly describe the purpose of the file
- [ ] File names do not contain special characters other than `-` or `_`, ideally with `-` used to separate words and `_` used to separate categories
- [ ] Files that are to be run sequentially are prefixed with numbers, and single digits are padded with leading `0`s
- [ ] Within files, logically separated chunks of code are divided by commented lines of `-` and/or `=` with a chunk descriptor
- [ ] Comments within chunks begin with `#` and a single space, and are in sentence case, ending in `.` only when they contain at least 2 sentences
- [ ] All package dependencies required for a given script are loaded at the beginning of the file
- [ ] Programs are appropriately decomposed into files/functions; ideally, each file is no more than 1 page long (approximately 60 lines)

Data management
---

- [ ] External locations of raw data (if the data are too large or are not permitted to be stored directly in the repository) are clearly annotated with written verification that the locations and raw data will not be modified
- [ ] All steps used to process data are clearly identifiable (e.g. through good file naming and organization) and reproducible
- [ ] Data are not hard-coded/manually written into any script
- [ ] Appropriate policy for data storage and access, particularly PHI, is followed

Syntax
---

- [ ] Variable and function names use only lowercase letters, numbers, and `_` (to separate words within a name)
- [ ] Variable names are nouns and function names are verbs
- [ ] Commas are always followed by a space and never preceded by one
- [ ] A space is used before and after `()` with `if`, `for`, or `while` (and no spaces appear on either side of `(` or `)` with regular function calls)
- [ ] Infix operators `==`, `+`, `-`, `<-`, and `=` are surrounded by spaces
- [ ] `{` is the last character on its line, `}` is the first character on its line, and nested contents are indented by two spaces
- [ ] The first `%>%` in a series is followed by a new line which is indented two spaces
- [ ] Styling for `+` in ggplot layers follows the styling format of `%>%`
- [ ] Code is limited to 80 characters per line, and multiple commands are not combined into one line with `;`
- [ ] If arguments to a function don't all fit on one line, each argument is on its own indented line
- [ ] Assignment is made with `<-` not `=`
- [ ] Text is quoted with `"` not `'`, unless the text already contains `"`
- [ ] `TRUE` and `FALSE` are used, not `T` and `F`

Change control
---

- [ ] Iterative development in the project is annotated through descriptive commits
- [ ] Major, minor, and patch modifications to the project are captured in release notes with corresponding release numbering (`major.minor.patch`)

Pitfalls
---

- [ ] No redundant code blocks or comments
- [ ] Extraneous data are not stored in the repository or loaded in the project
- [ ] No assignment in function calls
- [ ] `else` is used sparingly, if at all (i.e. for readability, it is avoided in favor of `if` with a guard clause or `case_when` when the logic is needed in a `mutate`)
- [ ] No instances of `attach()` or `setwd()`
- [ ] No reliance on a system-specific startup file, such as `.Rprofile`
- [ ] No use of the magrittr shortcut `%<>%`, and no omission of `()` on functions that don’t have arguments

--------------------------------------------------------------------------------
/r-code-review-checklist.Rproj:
--------------------------------------------------------------------------------
Version: 1.0

RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: Sweave
LaTeX: pdfLaTeX

AutoAppendNewline: Yes
StripTrailingWhitespace: Yes

--------------------------------------------------------------------------------