├── .gitignore ├── images ├── long_network.png ├── wide_network.png ├── targets_network.png ├── target_definition.png ├── targets_visnetwork2.png └── targets_real_network.png ├── docs ├── images │ ├── long_network.png │ ├── wide_network.png │ ├── targets_network.png │ ├── target_definition.png │ ├── targets_visnetwork2.png │ └── targets_real_network.png ├── site_libs │ ├── bootstrap │ │ └── bootstrap-icons.woff │ ├── quarto-html │ │ ├── tippy.css │ │ ├── quarto-syntax-highlighting.css │ │ ├── anchor.min.js │ │ ├── popper.min.js │ │ └── tippy.umd.min.js │ ├── quarto-nav │ │ ├── headroom.min.js │ │ └── quarto-nav.js │ ├── clipboard │ │ └── clipboard.min.js │ └── quarto-search │ │ └── fuse.min.js └── end.html ├── _quarto.yml ├── end.qmd ├── more.qmd ├── getting_help.qmd ├── index.qmd ├── README.md ├── debugging.qmd ├── long_vs_wide.qmd ├── pure_functions.qmd ├── typical_R_projects.qmd ├── targets_plan.qmd └── branching.qmd /.gitignore: -------------------------------------------------------------------------------- 1 | /.quarto/ 2 | -------------------------------------------------------------------------------- /images/long_network.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/images/long_network.png -------------------------------------------------------------------------------- /images/wide_network.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/images/wide_network.png -------------------------------------------------------------------------------- /images/targets_network.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/images/targets_network.png -------------------------------------------------------------------------------- 
/docs/images/long_network.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/docs/images/long_network.png -------------------------------------------------------------------------------- /docs/images/wide_network.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/docs/images/wide_network.png -------------------------------------------------------------------------------- /images/target_definition.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/images/target_definition.png -------------------------------------------------------------------------------- /images/targets_visnetwork2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/images/targets_visnetwork2.png -------------------------------------------------------------------------------- /docs/images/targets_network.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/docs/images/targets_network.png -------------------------------------------------------------------------------- /images/targets_real_network.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/images/targets_real_network.png -------------------------------------------------------------------------------- /docs/images/target_definition.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/docs/images/target_definition.png -------------------------------------------------------------------------------- /docs/images/targets_visnetwork2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/docs/images/targets_visnetwork2.png -------------------------------------------------------------------------------- /docs/images/targets_real_network.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/docs/images/targets_real_network.png -------------------------------------------------------------------------------- /docs/site_libs/bootstrap/bootstrap-icons.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MilesMcBain/ssa_targets_workshop/HEAD/docs/site_libs/bootstrap/bootstrap-icons.woff -------------------------------------------------------------------------------- /_quarto.yml: -------------------------------------------------------------------------------- 1 | project: 2 | type: website 3 | output-dir: docs 4 | 5 | website: 6 | title: "Working Smarter with {targets}" 7 | sidebar: 8 | style: "docked" 9 | contents: 10 | - index.qmd 11 | - typical_R_projects.qmd 12 | - pure_functions.qmd 13 | - targets_plan.qmd 14 | - debugging.qmd 15 | - branching.qmd 16 | - long_vs_wide.qmd 17 | - getting_help.qmd 18 | - more.qmd 19 | - end.qmd 20 | 21 | -------------------------------------------------------------------------------- /end.qmd: -------------------------------------------------------------------------------- 1 | # The End 2 | 3 | Thanks for your participation. There's more to `{targets}` than we covered here, but I hope you feel you're on a solid enough foundation to take some steps with it on your own projects. 
4 | 5 | My general advice is: 6 | 7 | - Keep it simple! Avoid Dynamic Branching and other advanced techniques until you're either confident, or they're absolutely necessary. 8 | - Try to keep your `{targets}` plans (`_targets.R`) really clean and high level. Don't junk them up with a lot of implementation detail (code not in functions). 9 | - This retains their value as a review / communication tool. 10 | 11 | If you feel able, I would much appreciate your feedback via this [short form](https://forms.gle/eZNiu6v97xc5AQZHA). 12 | 13 | If you're working through the workshop content afterward and get stuck, or spot any problems, feel free to raise an issue on the [workshop GitHub repository](https://github.com/MilesMcBain/ssa_targets_workshop). 14 | 15 | 16 | -------------------------------------------------------------------------------- /more.qmd: -------------------------------------------------------------------------------- 1 | # More 2 | 3 | Some additional topics. No content here but happy to discuss if there's time: 4 | 5 | - Meta programming with `{targets}`. 6 | - You can create 'target factories': functions that generate more than one target in your plan. 7 | - These are a way to build domain specific abstractions into our plans. 8 | - See: [wlandau.github.io/targetopia/contributing.html](https://wlandau.github.io/targetopia/contributing.html) 9 | 10 | - For large projects `{targets}` supports having multiple plans. 11 | - I use this a fair bit, and it works well for projects that have separate phases. 12 | - E.g. maybe there's a phase where you're building a model, and then there's a later phase after it's been 'in production' where you analyse the performance. These could be separate plans in the one project. 13 | - Be careful about using it to break up a pipeline such that you revert to the classic 'script per pipeline stage' form. 
14 | - See: [books.ropensci.org/targets/projects.html#multiple-projects](https://books.ropensci.org/targets/projects.html#multiple-projects) 15 | -------------------------------------------------------------------------------- /docs/site_libs/quarto-html/tippy.css: -------------------------------------------------------------------------------- 1 | .tippy-box[data-animation=fade][data-state=hidden]{opacity:0}[data-tippy-root]{max-width:calc(100vw - 10px)}.tippy-box{position:relative;background-color:#333;color:#fff;border-radius:4px;font-size:14px;line-height:1.4;white-space:normal;outline:0;transition-property:transform,visibility,opacity}.tippy-box[data-placement^=top]>.tippy-arrow{bottom:0}.tippy-box[data-placement^=top]>.tippy-arrow:before{bottom:-7px;left:0;border-width:8px 8px 0;border-top-color:initial;transform-origin:center top}.tippy-box[data-placement^=bottom]>.tippy-arrow{top:0}.tippy-box[data-placement^=bottom]>.tippy-arrow:before{top:-7px;left:0;border-width:0 8px 8px;border-bottom-color:initial;transform-origin:center bottom}.tippy-box[data-placement^=left]>.tippy-arrow{right:0}.tippy-box[data-placement^=left]>.tippy-arrow:before{border-width:8px 0 8px 8px;border-left-color:initial;right:-7px;transform-origin:center left}.tippy-box[data-placement^=right]>.tippy-arrow{left:0}.tippy-box[data-placement^=right]>.tippy-arrow:before{left:-7px;border-width:8px 8px 8px 0;border-right-color:initial;transform-origin:center right}.tippy-box[data-inertia][data-state=visible]{transition-timing-function:cubic-bezier(.54,1.5,.38,1.11)}.tippy-arrow{width:16px;height:16px;color:#333}.tippy-arrow:before{content:"";position:absolute;border-color:transparent;border-style:solid}.tippy-content{position:relative;padding:5px 9px;z-index:1} -------------------------------------------------------------------------------- /getting_help.qmd: -------------------------------------------------------------------------------- 1 | # Getting Help 2 | 3 | ## Things that can go wrong 4 
| 5 | As you branch out and try to do more ambitious things with `{targets}` you will 6 | hit stumbling blocks where `{targets}` doesn't behave as you expect. 7 | 8 | Common themes among these issues are: 9 | 10 | - Things that defeat `{targets}`' static code analysis, so code changes are not detected. 11 | - For example `purrr::partial()`, `purrr::safely()`, `Vectorize()` always return the same function that captures the function you supply in a closure. `{targets}`' static code analysis looks at the body of the function for changes, but not the closure. 12 | - Objects that store data externally to R 13 | - e.g. objects that use external pointers to data created and managed by compiled C or C++ code. 14 | - like `data.table`, `stars`, `raster`, `terra` etc. 15 | - care needs to be taken to serialise these objects properly. In many cases the default Rds serialisation will fail to reproduce the object, since the loaded object will just have an invalid pointer. 16 | - Use the "format" arg of `tar_target()` to choose a better format. 17 | - You can author your own custom formats e.g. [`{geotargets}`](https://github.com/njtierney/geotargets) 18 | 19 | There are things that you likely want to do that aren't supported in `{targets}`. It pays to check `{tarchetypes}` and other ['targetopia'](https://wlandau.github.io/targetopia/) packages. 20 | 21 | - A recurrent request is to have targets that become stale after a certain amount 22 | of time passes. E.g. you want to make a new API call if the stored target is more 23 | than X days old. This feature does not exist explicitly in `{targets}` but is 24 | supported in `tarchetypes::tar_age()`. 25 | 26 | ## Where to find help 27 | 28 | - The [discussions section of the {targets} GitHub repository](https://github.com/ropensci/targets/discussions) is a good place to ask questions about how to achieve something with `{targets}` or why it is not behaving as you expect. 29 | - Please avoid raising these as issues! 
30 | - The rOpenSci Slack has a channel dedicated to `{targets}`. 31 | - The #rstats hashtag on Fosstodon / Mastodon is watched by a few `{targets}` enthusiasts. 32 | 33 | ## Read The Fancy Manual 34 | 35 | [books.ropensci.org/targets/](https://books.ropensci.org/targets/) 36 | 37 | It's probably the only software manual I have read start to finish.[^1] 38 | 39 | - It's updated frequently. 40 | - It's written for humans. 41 | - Not overly dry 42 | - Well curated. Doesn't cover EVERYTHING. 43 | 44 | [^1]: Apart from those thick glossy concept art drenched jobs that shipped with 90s video game CD-ROMS. 45 | 46 | 47 | -------------------------------------------------------------------------------- /index.qmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Working Smarter With {targets}" 3 | author: Miles McBain 4 | --- 5 | 6 | # Introduction 7 | 8 | The objective of this workshop is to change the way you work with R. Rather than 9 | sacrificing the fluidity and immediacy of R's REPL-based programming in the name 10 | of 'reproducibility', we will find there is a middle way that lets us have both. 11 | That way is the {targets} way. 12 | 13 | We start by setting some context: 14 | 15 | - What brought you to this workshop? 16 | - Have you had any experience with {targets}? 17 | - What problems do you have that you hope {targets} can solve? 18 | - Do you anticipate any barriers to moving forward with {targets} in your workplace? 19 | 20 | # What is {targets}? 21 | 22 | {targets} is a framework created by [Will Landau](https://github.com/wlandau) for 23 | building data science pipelines with R. It is part of a growing niche of 'data 24 | orchestration' tools that conceive of data processing pipelines as graph 25 | structures. What sets {targets} apart from the rest is its incredible ergonomics 26 | and extensibility facilitated by features of the R programming language. 
27 | 28 | You can view {targets} as a replacement for the classic `make` tool, which is a 29 | famed computation time saver in software development. Make's most important 30 | feature is that it can detect outputs (or 'targets') of a software build whose 31 | inputs have not changed since the previous build, and so can re-use them. This greatly 32 | accelerates the development / test cycle by minimising compilation time. 33 | 34 | {targets} shares this feature, but it goes far beyond this, giving the user the 35 | ability to shape the computational structure of the pipeline so that it can be run 36 | optimally within the bounds of resource constraints. Importantly for data 37 | science, its features also defeat classes of bugs that affect project 38 | reproducibility. 39 | 40 | In practice, the value teams see from {targets} centres on: 41 | 42 | - Increasing the speed of iteration on data science methodology. 43 | - Inducing a structure which makes projects more comprehensible, and more easily peer-reviewed. 44 | 45 | The {targets} R package has cleared the high bar set by the [rOpenSci peer review process](https://github.com/ropensci/software-review/issues/401), and has been accepted on CRAN. 46 | 47 | # Overview of the workshop 48 | 49 | - Trying to be foundational or like a 'gentle introduction'. The knowledge you need to get value from `{targets}` is surprisingly small. 50 | - We'll spend time up front understanding the core problems {targets} solves. This will help us articulate the value to our teams. 51 | - Over the course of the workshop we'll progressively refactor an existing R project, written in a classic style, into a modern {targets} pipeline. This will let us see the benefits accumulate, as we deploy more advanced techniques. 52 | - We may run out of time, so sections are in priority order. There should be enough instructions to work through the stuff we don't get to as homework. 
53 | 54 | # Notation 55 | 56 | In this workshop material `{targets}` is used to refer to the R package, while target or targets (no braces) refers to a node in the pipeline graph. We say targets are 'built' to refer to executing the code associated with a target to generate its value. The value of a target can be any R object. This is a notable difference to `make`, where targets are files. 57 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Working smarter with Targets 2 | 3 | A half-day introduction to the {targets} framework for R projects. 4 | 5 | ## Summary 6 | 7 | This workshop is for useRs interested in smarter, faster, and more reproducible 8 | data analysis project workflows. You will learn about the R package `{targets}` 9 | and why it is one of the most important tools for 'getting stuff done' with R 10 | since the `{tidyverse}`. The objective of this session is to jump start your 11 | `{targets}` knowledge, and walk through the process of refactoring an existing 12 | project to take advantage of `{targets}`. 13 | 14 | 15 | ## Target audience 16 | 17 | Participants should have prior experience working through at least one 18 | challenging data analysis project using R. 19 | 20 | ## Presenter 21 | 22 | Miles McBain is a Data Scientist and R package developer who has been using `{targets}` since release for large data analysis projects in the Public and Not-for-profit sectors. 
23 | 24 | # Structure of workshop 25 | 26 | |time | topic | 27 | |-----------|-----------------------------------------------------------------------------------------| 28 | | TBA | Motivating Targets: Strengths and weaknesses of typical R project workflows | 29 | | TBA | Pure functions and their benefits as units of work | 30 | | TBA | The {targets} plan and the two kinds of reproducibility | 31 | | TBA | New debugging access panels | 32 | | TBA | Divide and conquer with branching | 33 | | TBA | Long vs Wide processes | 34 | | TBA | Things that may go wrong and where to get help | 35 | | TBA | Advanced topics: Meta-programming, Tarchetypes, Multi-plan projects, Cloud computing | 36 | 37 | As we step through each topic we'll refactor our starter project using our new knowledge. 38 | 39 | # Getting started 40 | 41 | 1. You should have a reasonably up to date version of R (e.g. 4.3+), and a text editor setup you feel comfortable being productive with (e.g. RStudio, VSCode, ESS + Emacs, Vim + NvimR). It's going to be less typing if you can use the `{rstudioapi}` via either RStudio or VSCode. 42 | 43 | 2. Make sure you have these packages installed: 44 | 45 | ``` 46 | install.packages('pak') # works better / faster than install.packages, especially on Windows. 47 | pak::pkg_install(c( 48 | "conflicted", 49 | "dplyr", 50 | "galah", 51 | "ggplot2", 52 | "h3jsr", 53 | "lubridate", 54 | "pROC", 55 | "purrr", 56 | "randomForest", 57 | "readr", 58 | "rmarkdown", 59 | "rsample", 60 | "sf", 61 | "tibble", 62 | "tidyr", 63 | "targets", 64 | "tarchetypes", 65 | "crew" 66 | )) 67 | ``` 68 | 69 | If you're a Linux user `sf` might give you some challenges (but you're used to that, right?). Be sure to study their README. 70 | - likewise for `V8` dep of `h3jsr`, see static lib option for linux described in README. 71 | 72 | 3. 
Our example project is going to pull data from the _Atlas of Living Australia_, so create an account with a valid email address, here: 73 | https://auth.ala.org.au/userdetails/registration/createAccount 74 | 75 | 4. Ahead of the workshop any time you can spend reviewing the example project will be worthwhile, so the project content itself can be less distracting. See: 76 | - https://github.com/milesmcbain/classic_r_project 77 | 78 | 5. I'll be using keyboard shortcuts for a couple of RStudio Addins provided by {targets}, in particular: 79 | - 'Run a targets pipeline in the foreground' 80 | - 'Load target at cursor' 81 | 82 | You may also enjoy [creating keyboard shortcuts](https://docs.posit.co/ide/user/ide/guide/productivity/add-ins.html#keyboard-shortcuts) for these. 83 | 84 | # Content 85 | 86 | https://milesmcbain.github.io/ssa_targets_workshop/ 87 | -------------------------------------------------------------------------------- /docs/site_libs/quarto-html/quarto-syntax-highlighting.css: -------------------------------------------------------------------------------- 1 | /* quarto syntax highlight colors */ 2 | :root { 3 | --quarto-hl-ot-color: #003B4F; 4 | --quarto-hl-at-color: #657422; 5 | --quarto-hl-ss-color: #20794D; 6 | --quarto-hl-an-color: #5E5E5E; 7 | --quarto-hl-fu-color: #4758AB; 8 | --quarto-hl-st-color: #20794D; 9 | --quarto-hl-cf-color: #003B4F; 10 | --quarto-hl-op-color: #5E5E5E; 11 | --quarto-hl-er-color: #AD0000; 12 | --quarto-hl-bn-color: #AD0000; 13 | --quarto-hl-al-color: #AD0000; 14 | --quarto-hl-va-color: #111111; 15 | --quarto-hl-bu-color: inherit; 16 | --quarto-hl-ex-color: inherit; 17 | --quarto-hl-pp-color: #AD0000; 18 | --quarto-hl-in-color: #5E5E5E; 19 | --quarto-hl-vs-color: #20794D; 20 | --quarto-hl-wa-color: #5E5E5E; 21 | --quarto-hl-do-color: #5E5E5E; 22 | --quarto-hl-im-color: #00769E; 23 | --quarto-hl-ch-color: #20794D; 24 | --quarto-hl-dt-color: #AD0000; 25 | --quarto-hl-fl-color: #AD0000; 26 | --quarto-hl-co-color: #5E5E5E; 27 
| --quarto-hl-cv-color: #5E5E5E; 28 | --quarto-hl-cn-color: #8f5902; 29 | --quarto-hl-sc-color: #5E5E5E; 30 | --quarto-hl-dv-color: #AD0000; 31 | --quarto-hl-kw-color: #003B4F; 32 | } 33 | 34 | /* other quarto variables */ 35 | :root { 36 | --quarto-font-monospace: SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace; 37 | } 38 | 39 | pre > code.sourceCode > span { 40 | color: #003B4F; 41 | } 42 | 43 | code span { 44 | color: #003B4F; 45 | } 46 | 47 | code.sourceCode > span { 48 | color: #003B4F; 49 | } 50 | 51 | div.sourceCode, 52 | div.sourceCode pre.sourceCode { 53 | color: #003B4F; 54 | } 55 | 56 | code span.ot { 57 | color: #003B4F; 58 | font-style: inherit; 59 | } 60 | 61 | code span.at { 62 | color: #657422; 63 | font-style: inherit; 64 | } 65 | 66 | code span.ss { 67 | color: #20794D; 68 | font-style: inherit; 69 | } 70 | 71 | code span.an { 72 | color: #5E5E5E; 73 | font-style: inherit; 74 | } 75 | 76 | code span.fu { 77 | color: #4758AB; 78 | font-style: inherit; 79 | } 80 | 81 | code span.st { 82 | color: #20794D; 83 | font-style: inherit; 84 | } 85 | 86 | code span.cf { 87 | color: #003B4F; 88 | font-style: inherit; 89 | } 90 | 91 | code span.op { 92 | color: #5E5E5E; 93 | font-style: inherit; 94 | } 95 | 96 | code span.er { 97 | color: #AD0000; 98 | font-style: inherit; 99 | } 100 | 101 | code span.bn { 102 | color: #AD0000; 103 | font-style: inherit; 104 | } 105 | 106 | code span.al { 107 | color: #AD0000; 108 | font-style: inherit; 109 | } 110 | 111 | code span.va { 112 | color: #111111; 113 | font-style: inherit; 114 | } 115 | 116 | code span.bu { 117 | font-style: inherit; 118 | } 119 | 120 | code span.ex { 121 | font-style: inherit; 122 | } 123 | 124 | code span.pp { 125 | color: #AD0000; 126 | font-style: inherit; 127 | } 128 | 129 | code span.in { 130 | color: #5E5E5E; 131 | font-style: inherit; 132 | } 133 | 134 | code span.vs { 135 | color: #20794D; 136 | font-style: inherit; 137 | } 138 | 139 | code span.wa { 
140 | color: #5E5E5E; 141 | font-style: italic; 142 | } 143 | 144 | code span.do { 145 | color: #5E5E5E; 146 | font-style: italic; 147 | } 148 | 149 | code span.im { 150 | color: #00769E; 151 | font-style: inherit; 152 | } 153 | 154 | code span.ch { 155 | color: #20794D; 156 | font-style: inherit; 157 | } 158 | 159 | code span.dt { 160 | color: #AD0000; 161 | font-style: inherit; 162 | } 163 | 164 | code span.fl { 165 | color: #AD0000; 166 | font-style: inherit; 167 | } 168 | 169 | code span.co { 170 | color: #5E5E5E; 171 | font-style: inherit; 172 | } 173 | 174 | code span.cv { 175 | color: #5E5E5E; 176 | font-style: italic; 177 | } 178 | 179 | code span.cn { 180 | color: #8f5902; 181 | font-style: inherit; 182 | } 183 | 184 | code span.sc { 185 | color: #5E5E5E; 186 | font-style: inherit; 187 | } 188 | 189 | code span.dv { 190 | color: #AD0000; 191 | font-style: inherit; 192 | } 193 | 194 | code span.kw { 195 | color: #003B4F; 196 | font-style: inherit; 197 | } 198 | 199 | .prevent-inlining { 200 | content: "s.tolerance[a.direction],e(a),l=t,i=!1}function h(){i||(i=!0,n=requestAnimationFrame(c))}var u=!!o&&{passive:!0,capture:!1};return t.addEventListener("scroll",h,u),c(),{destroy:function(){cancelAnimationFrame(n),t.removeEventListener("scroll",h,u)}}}function o(t){return t===Object(t)?t:{down:t,up:t}}function s(t,n){n=n||{},Object.assign(this,s.options,n),this.classes=Object.assign({},s.options.classes,n.classes),this.elem=t,this.tolerance=o(this.tolerance),this.offset=o(this.offset),this.initialised=!1,this.frozen=!1}return s.prototype={constructor:s,init:function(){return 
s.cutsTheMustard&&!this.initialised&&(this.addClass("initial"),this.initialised=!0,setTimeout(function(t){t.scrollTracker=n(t.scroller,{offset:t.offset,tolerance:t.tolerance},t.update.bind(t))},100,this)),this},destroy:function(){this.initialised=!1,Object.keys(this.classes).forEach(this.removeClass,this),this.scrollTracker.destroy()},unpin:function(){!this.hasClass("pinned")&&this.hasClass("unpinned")||(this.addClass("unpinned"),this.removeClass("pinned"),this.onUnpin&&this.onUnpin.call(this))},pin:function(){this.hasClass("unpinned")&&(this.addClass("pinned"),this.removeClass("unpinned"),this.onPin&&this.onPin.call(this))},freeze:function(){this.frozen=!0,this.addClass("frozen")},unfreeze:function(){this.frozen=!1,this.removeClass("frozen")},top:function(){this.hasClass("top")||(this.addClass("top"),this.removeClass("notTop"),this.onTop&&this.onTop.call(this))},notTop:function(){this.hasClass("notTop")||(this.addClass("notTop"),this.removeClass("top"),this.onNotTop&&this.onNotTop.call(this))},bottom:function(){this.hasClass("bottom")||(this.addClass("bottom"),this.removeClass("notBottom"),this.onBottom&&this.onBottom.call(this))},notBottom:function(){this.hasClass("notBottom")||(this.addClass("notBottom"),this.removeClass("bottom"),this.onNotBottom&&this.onNotBottom.call(this))},shouldUnpin:function(t){return"down"===t.direction&&!t.top&&t.toleranceExceeded},shouldPin:function(t){return"up"===t.direction&&t.toleranceExceeded||t.top},addClass:function(t){this.elem.classList.add.apply(this.elem.classList,this.classes[t].split(" "))},removeClass:function(t){this.elem.classList.remove.apply(this.elem.classList,this.classes[t].split(" "))},hasClass:function(t){return this.classes[t].split(" ").every(function(t){return 
this.classList.contains(t)},this.elem)},update:function(t){t.isOutOfBounds||!0!==this.frozen&&(t.top?this.top():this.notTop(),t.bottom?this.bottom():this.notBottom(),this.shouldUnpin(t)?this.unpin():this.shouldPin(t)&&this.pin())}},s.options={tolerance:{up:0,down:0},offset:0,scroller:t()?window:null,classes:{frozen:"headroom--frozen",pinned:"headroom--pinned",unpinned:"headroom--unpinned",top:"headroom--top",notTop:"headroom--not-top",bottom:"headroom--bottom",notBottom:"headroom--not-bottom",initial:"headroom"}},s.cutsTheMustard=!!(t()&&function(){}.bind&&"classList"in document.documentElement&&Object.assign&&Object.keys&&requestAnimationFrame),s}); 8 | -------------------------------------------------------------------------------- /docs/site_libs/quarto-html/anchor.min.js: -------------------------------------------------------------------------------- 1 | // @license magnet:?xt=urn:btih:d3d9a9a6595521f9666a5e94cc830dab83b65699&dn=expat.txt Expat 2 | // 3 | // AnchorJS - v5.0.0 - 2023-01-18 4 | // https://www.bryanbraun.com/anchorjs/ 5 | // Copyright (c) 2023 Bryan Braun; Licensed MIT 6 | // 7 | // @license magnet:?xt=urn:btih:d3d9a9a6595521f9666a5e94cc830dab83b65699&dn=expat.txt Expat 8 | !function(A,e){"use strict";"function"==typeof define&&define.amd?define([],e):"object"==typeof module&&module.exports?module.exports=e():(A.AnchorJS=e(),A.anchors=new A.AnchorJS)}(globalThis,function(){"use strict";return function(A){function 
u(A){A.icon=Object.prototype.hasOwnProperty.call(A,"icon")?A.icon:"",A.visible=Object.prototype.hasOwnProperty.call(A,"visible")?A.visible:"hover",A.placement=Object.prototype.hasOwnProperty.call(A,"placement")?A.placement:"right",A.ariaLabel=Object.prototype.hasOwnProperty.call(A,"ariaLabel")?A.ariaLabel:"Anchor",A.class=Object.prototype.hasOwnProperty.call(A,"class")?A.class:"",A.base=Object.prototype.hasOwnProperty.call(A,"base")?A.base:"",A.truncate=Object.prototype.hasOwnProperty.call(A,"truncate")?Math.floor(A.truncate):64,A.titleText=Object.prototype.hasOwnProperty.call(A,"titleText")?A.titleText:""}function d(A){var e;if("string"==typeof A||A instanceof String)e=[].slice.call(document.querySelectorAll(A));else{if(!(Array.isArray(A)||A instanceof NodeList))throw new TypeError("The selector provided to AnchorJS was invalid.");e=[].slice.call(A)}return e}this.options=A||{},this.elements=[],u(this.options),this.add=function(A){var e,t,o,i,n,s,a,r,l,c,h,p=[];if(u(this.options),0!==(e=d(A=A||"h2, h3, h4, h5, h6")).length){for(null===document.head.querySelector("style.anchorjs")&&((A=document.createElement("style")).className="anchorjs",A.appendChild(document.createTextNode("")),void 
0===(h=document.head.querySelector('[rel="stylesheet"],style'))?document.head.appendChild(A):document.head.insertBefore(A,h),A.sheet.insertRule(".anchorjs-link{opacity:0;text-decoration:none;-webkit-font-smoothing:antialiased;-moz-osx-font-smoothing:grayscale}",A.sheet.cssRules.length),A.sheet.insertRule(":hover>.anchorjs-link,.anchorjs-link:focus{opacity:1}",A.sheet.cssRules.length),A.sheet.insertRule("[data-anchorjs-icon]::after{content:attr(data-anchorjs-icon)}",A.sheet.cssRules.length),A.sheet.insertRule('@font-face{font-family:anchorjs-icons;src:url(data:n/a;base64,AAEAAAALAIAAAwAwT1MvMg8yG2cAAAE4AAAAYGNtYXDp3gC3AAABpAAAAExnYXNwAAAAEAAAA9wAAAAIZ2x5ZlQCcfwAAAH4AAABCGhlYWQHFvHyAAAAvAAAADZoaGVhBnACFwAAAPQAAAAkaG10eASAADEAAAGYAAAADGxvY2EACACEAAAB8AAAAAhtYXhwAAYAVwAAARgAAAAgbmFtZQGOH9cAAAMAAAAAunBvc3QAAwAAAAADvAAAACAAAQAAAAEAAHzE2p9fDzz1AAkEAAAAAADRecUWAAAAANQA6R8AAAAAAoACwAAAAAgAAgAAAAAAAAABAAADwP/AAAACgAAA/9MCrQABAAAAAAAAAAAAAAAAAAAAAwABAAAAAwBVAAIAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAAMCQAGQAAUAAAKZAswAAACPApkCzAAAAesAMwEJAAAAAAAAAAAAAAAAAAAAARAAAAAAAAAAAAAAAAAAAAAAQAAg//0DwP/AAEADwABAAAAAAQAAAAAAAAAAAAAAIAAAAAAAAAIAAAACgAAxAAAAAwAAAAMAAAAcAAEAAwAAABwAAwABAAAAHAAEADAAAAAIAAgAAgAAACDpy//9//8AAAAg6cv//f///+EWNwADAAEAAAAAAAAAAAAAAAAACACEAAEAAAAAAAAAAAAAAAAxAAACAAQARAKAAsAAKwBUAAABIiYnJjQ3NzY2MzIWFxYUBwcGIicmNDc3NjQnJiYjIgYHBwYUFxYUBwYGIwciJicmNDc3NjIXFhQHBwYUFxYWMzI2Nzc2NCcmNDc2MhcWFAcHBgYjARQGDAUtLXoWOR8fORYtLTgKGwoKCjgaGg0gEhIgDXoaGgkJBQwHdR85Fi0tOAobCgoKOBoaDSASEiANehoaCQkKGwotLXoWOR8BMwUFLYEuehYXFxYugC44CQkKGwo4GkoaDQ0NDXoaShoKGwoFBe8XFi6ALjgJCQobCjgaShoNDQ0NehpKGgobCgoKLYEuehYXAAAADACWAAEAAAAAAAEACAAAAAEAAAAAAAIAAwAIAAEAAAAAAAMACAAAAAEAAAAAAAQACAAAAAEAAAAAAAUAAQALAAEAAAAAAAYACAAAAAMAAQQJAAEAEAAMAAMAAQQJAAIABgAcAAMAAQQJAAMAEAAMAAMAAQQJAAQAEAAMAAMAAQQJAAUAAgAiAAMAAQQJAAYAEAAMYW5jaG9yanM0MDBAAGEAbgBjAGgAbwByAGoAcwA0ADAAMABAAAAAAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABAAH//wAP) 
format("truetype")}',A.sheet.cssRules.length)),h=document.querySelectorAll("[id]"),t=[].map.call(h,function(A){return A.id}),i=0;i\]./()*\\\n\t\b\v\u00A0]/g,"-").replace(/-{2,}/g,"-").substring(0,this.options.truncate).replace(/^-+|-+$/gm,"").toLowerCase()},this.hasAnchorJSLink=function(A){var e=A.firstChild&&-1<(" "+A.firstChild.className+" ").indexOf(" anchorjs-link "),A=A.lastChild&&-1<(" "+A.lastChild.className+" ").indexOf(" anchorjs-link ");return e||A||!1}}}); 9 | // @license-end -------------------------------------------------------------------------------- /debugging.qmd: -------------------------------------------------------------------------------- 1 | # Debugging {targets} with new access panels 2 | 3 | In her highly recommended talk [**Object of type 'closure' is not subsettable**](https://www.youtube.com/watch?v=vgYS-F8opgE), Jenny Bryan discusses 4 | leaving yourself 'access panels', like options or arguments that turn on features that help your future self in debugging endeavours. As we shall see `{targets}` has powerful new debugging access panels. 5 | 6 | First we look at how problems present and then we review a spectrum of increasingly powerful debugging techniques made available by `{targets}` 7 | 8 | ## What it looks like when things go bad 9 | 10 | ### Errors 11 | 12 | By default, when an error occurs in targets the pipeline stops. It should be pretty clear from `{targets}`' output which target has thrown the error: 13 | 14 | ``` 15 | ▶ dispatched target species_classification_model 16 | ✖ errored target species_classification_model 17 | ✖ errored pipeline [0.184 seconds] 18 | Error: 19 | ! 
Error running targets::tar_make() 20 | Error messages: targets::tar_meta(fields = error, complete_only = TRUE) 21 | Debugging guide: https://books.ropensci.org/targets/debugging.html 22 | How to ask for help: https://books.ropensci.org/targets/help.html 23 | Last error message: 24 | I'm broken 25 | Last error traceback: 26 | fit_final_species_classification_model(training_data = training(test_tra... 27 | stop("I'm broken") 28 | .handleSimpleError(function (condition) { state$error <- build_mess... 29 | h(simpleError(msg, call)) 30 | ``` 31 | 32 | ### Warnings 33 | 34 | If your problem results in warnings appearing they won't stop the pipeline. Instead you'll see something like: 35 | 36 | ``` 37 | ▶ dispatched target species_classification_model 38 | ● completed target species_classification_model [3.249 seconds] 39 | ✔ skipped target species_model_validation_data 40 | ✔ skipped target base_plot_model_roc_object 41 | ✔ skipped target gg_species_class_accuracy_hexes 42 | ✔ skipped target report 43 | ▶ ended pipeline [5.439 seconds] 44 | Warning messages: 45 | 1: I'm warning you 46 | 2: 1 targets produced warnings. Run targets::tar_meta(fields = warnings, complete_only = TRUE) for the messages. 47 | NULL 48 | ``` 49 | 50 | So although we don't immediately see which target threw the warnings, `{targets}` does tell us how to find that out. If we run the suggested code: 51 | 52 | ```{r} 53 | #| eval: false 54 | targets::tar_meta(fields = warnings, complete_only = TRUE) 55 | ``` 56 | 57 | we get precisely the metadata we need: 58 | 59 | ``` 60 | # A tibble: 1 × 2 61 | name warnings 62 | 63 | 1 species_classification_model Im warning you 64 | ``` 65 | 66 | ### If we get neither 67 | 68 | If we just got some nonsense results we might have to work a bit harder to 69 | figure out where to start looking for the problem. 
The process we described for 70 | peer reviewing the pipeline in the 'targets plan' section is similar to how we could 71 | approach finding the logic problem efficiently. 72 | 73 | ## The debugging arsenal 74 | 75 | ### Call `tar_load()` and tinker 76 | 77 | You'll very quickly be able to populate all of a target's inputs in your global environment by using `tar_load()`. 78 | 79 | - This is why having functions that use the same argument names as the targets they take as arguments is quite beneficial. 80 | - If this is not the case you might enjoy loading all the input targets and then calling `debugonce()`, before manually running the problematic target's expression interactively. 81 | 82 | ### Use browser() 83 | 84 | R's classic can be brought to bear! 85 | - Just one obstacle: the targets are typically built in a separate session that we don't have interactive access to! 86 | - We can actually run the pipeline in the current interactive R session. 87 | - Just make sure the session is pretty 'fresh' or you may create more problems than you solve. 88 | 89 | By way of example: 90 | 91 | 1. Put `browser()` on the first line of `fit_final_species_classification_model()` 92 | 93 | 2. To build the pipeline run `tar_make(callr_function = NULL)` 94 | - We're saying "Don't use {callr}", which is the method of creating child sessions for our pipeline execution. 
95 | 96 | End up interactively debugging the target: 97 | 98 | ``` 99 | ✔ skipped target gg_species_distribution_hexes 100 | ✔ skipped target gg_species_distribution_months 101 | ✔ skipped target test_train_split 102 | ✔ skipped target species_classification_model_training_summary 103 | ▶ dispatched target species_classification_model 104 | Called from: fit_final_species_classification_model(training_data = training(test_train_split), 105 | species_classification_model_training_summary) 106 | Browse[1]> 107 | ``` 108 | 109 | ### Use the 'debug' option 110 | 111 | This behaves like using `browser()` above, but is a bit better since you don't have to make a change to your code that you could forget to undo! 112 | 113 | - Does anyone else commit `browser()` to repos embarrassingly frequently? 114 | 115 | If you add to `tar_option_set()` in `_targets.R` 116 | 117 | ```{r} 118 | #| eval: false 119 | tar_option_set( 120 | seed = 2048, 121 | debug = "species_classification_model" 122 | ) 123 | 124 | ``` 125 | 126 | Then you can call `tar_make()` and the pipeline will pause for interactive debugging when `species_classification_model` is reached. 127 | 128 | If you'd like to speed things up by skipping processing any other targets you can do: 129 | 130 | ```{r} 131 | #| eval: false 132 | tar_make(species_classification_model, callr_function = NULL, shortcut = TRUE) 133 | ``` 134 | 135 | And `{targets}` will immediately begin debugging this target.[^1] 136 | 137 | [^1]: It may be tempting to use `shortcut` more frequently to speed things up, but using `shortcut` is equivalent to running a numbered pipeline stage script without running the prior scripts in the 'classic R project' we started with. Do it too often and you'll have reproducibility debt that needs to be paid down in bulk. 138 | 139 | Being able to name a target to debug increases in usefulness once we understand a more advanced concept called 'branching'. 
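The `debug` option still lives in `_targets.R`, so it too can be left switched on by accident. One possible guard, sketched here as an assumption rather than anything `{targets}` provides, is to read the target names from an environment variable. Both the helper `debug_targets_from_env()` and the variable name `PIPELINE_DEBUG_TARGETS` are hypothetical; the only `{targets}` behaviour relied on is that `debug` takes a character vector of target names.

```r
# Hypothetical helper (not part of {targets}): returns the target names listed
# in an environment variable, or character(0) when it is unset or empty.
debug_targets_from_env <- function(var = "PIPELINE_DEBUG_TARGETS") {
  value <- Sys.getenv(var, unset = "")
  if (!nzchar(value)) {
    return(character(0))
  }
  # Allow several comma-separated names and ignore stray whitespace
  targets <- trimws(strsplit(value, ",", fixed = TRUE)[[1]])
  targets[nzchar(targets)]
}

# In _targets.R the helper would stand in for the hard-coded name, e.g.:
# tar_option_set(
#   seed = 2048,
#   debug = debug_targets_from_env()
# )
```

Under this sketch, running `Sys.setenv(PIPELINE_DEBUG_TARGETS = "species_classification_model")` in the interactive session before `tar_make(callr_function = NULL)` would switch debugging on for that session only, with nothing to undo in the code afterwards.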
140 | 141 | ### Use the `workspace` option 142 | 143 | This is my personal go-to when things just aren't making sense. A 'workspace' is 144 | the set of all of a target's inputs. Since targets should be pure functions, 145 | this should be all the state we need to investigate, reproduce, and fix bugs 146 | occurring in that target. 147 | 148 | The first way to use workspaces is to set an option that automatically saves them on error: 149 | 150 | ```{r} 151 | #| eval: false 152 | tar_option_set( 153 | seed = 2048, 154 | workspace_on_error = TRUE 155 | ) 156 | ``` 157 | 158 | When an error occurs we will get a slightly different output: 159 | 160 | ``` 161 | ✔ skipped target test_train_split 162 | ✔ skipped target species_classification_model_training_summary 163 | ▶ dispatched target species_classification_model 164 | ▶ recorded workspace species_classification_model 165 | ✖ errored target species_classification_model 166 | ✖ errored pipeline [0.215 seconds] 167 | ``` 168 | 169 | If we call `tar_workspace(species_classification_model)`, all of the dependencies of `species_classification_model` 170 | will be loaded into the global environment. These are: 171 | 172 | - `test_train_split` 173 | - `species_classification_model_training_summary` 174 | 175 | But isn't this just the same as calling `tar_load`? 176 | 177 | - Hopefully / Mostly yes! 178 | - But occasionally through contrived circumstances you may not be `tar_load`ing what you think you are. In this case there's no way for that mistake to happen. 179 | - There are also circumstances where you might not know the names of a specific target's inputs, and so cannot `tar_load` them at all. 180 | - More on this when we talk about 'branching' 181 | 182 | There's also another way to use workspaces, for when you might not be getting an error but you want to record a workspace to check on suspicious behaviour. 
We can instead do: 183 | 184 | ```{r} 185 | #| eval: false 186 | tar_option_set( 187 | seed = 2048, 188 | workspaces = c("species_classification_model", "occurrences_weather_hexes") 189 | ) 190 | ``` 191 | 192 | And workspaces for these targets will be recorded, whether they error or not. 193 | 194 | # In practice 195 | 196 | In my personal experience > 90% of targets bugs can be quickly dispatched by the 197 | 'Call `tar_load()` and tinker' approach. 198 | 199 | If that fails I reach straight for workspaces. When I am using this mysterious 200 | 'branching' thing I keep referring to I'll rely on workspaces more frequently. 201 | 202 | So if you take one thing from this section it should be: 203 | 204 | - There's this 'workspaces' concept that will probably help if you're having a hard time debugging something. 205 | -------------------------------------------------------------------------------- /long_vs_wide.qmd: -------------------------------------------------------------------------------- 1 | # Long vs Wide processes 2 | 3 | Within the example project there's a bothersome little wrinkle. We call a helper 4 | function `compute_h3_indices_at_resolutions()` twice. This function is creating 5 | a set of spatial indices for our data. It's a potentially expensive process on 6 | larger data, and ideally one we'd only perform once. 7 | 8 | We call it: 9 | 10 | - Once in `wrangle_and_join_weather()` as part of the creation of our 'clean' dataset 11 | - Once in `plot_species_class_accuracy_hexes()` to create a plot of classifier accuracy by hexagon. 12 | - We're using the testing data for the model with validation metrics at that point and we dropped the spatial index when we created the training data. 13 | 14 | What we could perhaps do instead is: 15 | 16 | 1. Compute the spatial indices for our occurrences in a separate dataset 17 | 2. Join them only when needed e.g. for the hex-binned plots. 18 | 19 | But why did this wrinkle appear anyway? 
20 | 21 | In a classic staged script workflow datasets go through a very linear path. 22 | 23 | - There's this kind of unspoken quest to build the perfect-one-true-clean dataset from which all analysis can flow. 24 | - Columns get added and added, rarely removed. 25 | - Datasets get quite wide. 26 | - Often the binding name is reused each time: 27 | 28 | ```{r} 29 | #| eval: false 30 | the_dataset <- 31 | the_dataset |> 32 | mutate( # or join, summarise, rbind, cbind etc. 33 | # ... 34 | ) 35 | ``` 36 | 37 | - This style is what I am going to call a 'long' process. 38 | 39 | 40 | With `{targets}` long processes probably spend more CPU cycles than necessary. Why? 41 | -
42 | - Small changes to a target can trigger a large number of other targets to be rebuilt, since the target you changed has a huge chain of targets hanging off it. 43 | 44 |
45 | 46 | Initially when users get started with `{targets}` there can be a tendency to 47 | continue to pursue this pattern of long chains of data transformations which are 48 | represented as linear sequences of targets. 49 | 50 | With `{targets}` we have the choice between: 51 | 52 | - Minimising end to end running time of the plan 53 | - As is usually the aim in the classic workflow 54 | - Using 'long' processes 55 | - Minimising total amount of running time of the plan ever 56 | - By minimising dependencies between targets 57 | - Using 'wide' processes 58 | 59 | We'll make our plan a little wider now by refactoring out the spatial index into a separate target. 60 | 61 | # Refactoring Steps 62 | 63 | 1. Remove this code from `wrangle_and_join_weather()`: 64 | 65 | ```{r} 66 | #| eval: false 67 | occurrences_weather_hexes <- 68 | st_as_sf( 69 | occurrences_weather, 70 | coords = c("decimalLongitude", "decimalLatitude"), 71 | remove = FALSE, 72 | crs = first(occurrences$geodeticDatum) 73 | ) |> 74 | mutate( 75 | compute_h3_indices_at_resolutions(h3_hex_resolutions, geometry) 76 | ) 77 | 78 | occurrences_weather_hexes 79 | ``` 80 | 81 | and place it in a function that creates a dataset containing the spatial indices. We'll need to refactor it a bit further in a minute. 82 | 83 | ```{r} 84 | #| eval: false 85 | tar_target( 86 | occurrences_hexes, 87 | create_h3_indexes( 88 | occurrences_weather, 89 | h3_hex_resolutions 90 | ) 91 | ) 92 | ``` 93 | 94 | 2. Change the name of `occurrences_weather_hexes` to `occurrences_weather`, since it now has nothing to do with hexes. 95 | - Remove the `h3_hex_resolutions` argument from `wrangle_and_join_weather()` 96 | - Change the name also where this dataset is input to other targets 97 | 98 | 3. 
Add an `id` column to `occurrences_weather` in `wrangle_and_join_weather()` like: 99 | 100 | ```{r} 101 | #| eval: false 102 | occurrences_weather |> 103 | mutate(id = seq_len(n())) # seq_len() stays empty for zero rows, unlike seq(n()) 104 | ``` 105 | 106 | - This will be a key for us to join to. 107 | 108 | 4. Retain the `id` column in the `select()` in `create_training_data()` 109 | 110 | 5. Change the model formulas in `fit_fold_calc_results()` and `fit_final_species_classification_model()` from `scientificName ~ .` to `scientificName ~ . - id` to exclude our `id`. 111 | 112 | - Actually, changing the model formula in two places highlights that we should probably break it out into its own target! Consider that an exercise left to the reader. 113 | 114 | 6. Refactor `create_h3_indexes` further to just return `id` and the h3 indexes: 115 | 116 | ```{r} 117 | #| eval: false 118 | occurrences_hexes <- 119 | st_as_sf( 120 | occurrences_weather, 121 | coords = c("decimalLongitude", "decimalLatitude"), 122 | remove = FALSE, 123 | crs = first(occurrences_weather$geodeticDatum) 124 | ) |> 125 | mutate( 126 | compute_h3_indices_at_resolutions(h3_hex_resolutions, geometry) 127 | ) |> st_drop_geometry() |> 128 | select( 129 | id, 130 | starts_with("h3") 131 | ) 132 | 133 | occurrences_hexes 134 | ``` 135 | 136 | 7. To all the targets that start with `gg_` and end with `hexes` pass in our `occurrences_hexes` target and join to the main dataset before plotting. 137 | 138 | - E.g. 
in `plot_species_distribution_hexes()` do this: 139 | 140 | ```{r} 141 | #| eval: false 142 | hex_occurrences <- 143 | occurrences_weather |> 144 | left_join(occurrences_hexes, by = "id") |> # the new bit 145 | st_drop_geometry() |> 146 | select(scientificName, h3_hex_8) |> 147 | summarise( 148 | count = n(), 149 | .by = c("scientificName", "h3_hex_8") 150 | ) |> 151 | mutate( 152 | geometry = cell_to_polygon(h3_hex_8) 153 | ) |> 154 | st_as_sf() 155 | 156 | # plot stuff follows 157 | ``` 158 | 159 | - In `gg_species_class_accuracy_hexes()` remove the `h3_hex_resolutions` argument and replace with `occurrences_hexes`. 160 | 161 | Replace this code: 162 | 163 | ```{r} 164 | #| eval: false 165 | model_validation_predictions_hex <- 166 | species_model_validation_data |> 167 | st_as_sf( 168 | coords = c("decimalLongitude", "decimalLatitude"), 169 | remove = FALSE, 170 | crs = 4326 171 | ) |> 172 | mutate( 173 | compute_h3_indices_at_resolutions(h3_hex_resolutions, geometry) 174 | ) |> 175 | st_drop_geometry() 176 | 177 | ``` 178 | 179 | With this: 180 | 181 | ```{r} 182 | #| eval: false 183 | model_validation_predictions_hex <- 184 | species_model_validation_data |> 185 | left_join(occurrences_hexes, by = "id") 186 | ``` 187 | 188 | 8. In `plot_species_distributions_points()` we no longer have a spatial dataset. So have to make our data spatial for plotting: 189 | 190 | ```{r} 191 | #| eval: false 192 | occurrences_weather_points <- 193 | occurrences_weather |> 194 | st_as_sf( 195 | coords = c("decimalLongitude", "decimalLatitude"), 196 | remove = FALSE, 197 | crs = first(occurrences_weather$geodeticDatum) 198 | ) 199 | 200 | p <- 201 | ggplot() + 202 | geom_sf( 203 | data = brisbane_river 204 | ) + 205 | geom_sf( 206 | data = occurrences_weather_points 207 | ) + 208 | facet_wrap(~scientificName) + 209 | theme_light() + 210 | theme() 211 | 212 | p 213 | ``` 214 | 215 | - We're re-converting to point geometry here. 
216 | - We could make a similar argument to hexes for another target `occurrences_points` to be calculated and joined on as needed. 217 | - Where we calculate hexes 218 | - Where we plot points 219 | - Another exercise for the reader! 220 | 221 | The completed refactor is available on [this branch of the example project](https://github.com/MilesMcBain/classic_r_project/tree/refactor4) 222 | 223 | # A wider angle 224 | 225 | If we compare the network graph before this refactor: 226 | 227 | ![Everything depending on occurrences_weather_hexes](images/long_network.png) 228 | 229 | With the one after this refactor: 230 | 231 | ![Some targets do not need the hex information](images/wide_network.png) 232 | 233 | We can see `occurrences_weather_hexes` is less of a chokepoint, and that the modeling branch of the pipeline no longer depends on the spatial indices. 234 | 235 | # Review 236 | 237 | When working with `{targets}` you have a new criterion to optimise for: minimise the lengths of dependency chains. 238 | 239 | - 'Widening' your process will lower overall total running time, since you can make best re-use of work saved in the store. 240 | - Using this strategy some things that were annoyingly slow in a linear pipeline are less important to optimise. 241 | - Code might be slow, but it hardly ever runs! 242 | - For `{tidyverse}` users `{targets}` enables you to 'have your cake and eat it too'. 243 | - E.g. Why bother with faster packages with more difficult syntax for code that hardly ever runs? 244 | -------------------------------------------------------------------------------- /docs/site_libs/clipboard/clipboard.min.js: -------------------------------------------------------------------------------- 1 | /*! 
2 | * clipboard.js v2.0.11 3 | * https://clipboardjs.com/ 4 | * 5 | * Licensed MIT © Zeno Rocha 6 | */ 7 | !function(t,e){"object"==typeof exports&&"object"==typeof module?module.exports=e():"function"==typeof define&&define.amd?define([],e):"object"==typeof exports?exports.ClipboardJS=e():t.ClipboardJS=e()}(this,function(){return n={686:function(t,e,n){"use strict";n.d(e,{default:function(){return b}});var e=n(279),i=n.n(e),e=n(370),u=n.n(e),e=n(817),r=n.n(e);function c(t){try{return document.execCommand(t)}catch(t){return}}var a=function(t){t=r()(t);return c("cut"),t};function o(t,e){var n,o,t=(n=t,o="rtl"===document.documentElement.getAttribute("dir"),(t=document.createElement("textarea")).style.fontSize="12pt",t.style.border="0",t.style.padding="0",t.style.margin="0",t.style.position="absolute",t.style[o?"right":"left"]="-9999px",o=window.pageYOffset||document.documentElement.scrollTop,t.style.top="".concat(o,"px"),t.setAttribute("readonly",""),t.value=n,t);return e.container.appendChild(t),e=r()(t),c("copy"),t.remove(),e}var f=function(t){var e=1 { 17 | btn.style.display = "none"; 18 | }; 19 | const showBackToTop = () => { 20 | btn.style.display = "inline-block"; 21 | }; 22 | if (btn) { 23 | window.document.addEventListener( 24 | "scroll", 25 | function () { 26 | const currentScrollTop = 27 | window.pageYOffset || document.documentElement.scrollTop; 28 | 29 | // Shows and hides the button 'intelligently' as the user scrolls 30 | if (currentScrollTop - scrollDownBuffer > lastScrollTop) { 31 | hideBackToTop(); 32 | lastScrollTop = currentScrollTop <= 0 ? 0 : currentScrollTop; 33 | } else if (currentScrollTop < lastScrollTop - scrollUpBuffer) { 34 | showBackToTop(); 35 | lastScrollTop = currentScrollTop <= 0 ? 
0 : currentScrollTop; 36 | } 37 | 38 | // Show the button at the bottom, hides it at the top 39 | if (currentScrollTop <= 0) { 40 | hideBackToTop(); 41 | } else if ( 42 | window.innerHeight + currentScrollTop >= 43 | document.body.offsetHeight 44 | ) { 45 | showBackToTop(); 46 | } 47 | }, 48 | false 49 | ); 50 | } 51 | 52 | function throttle(func, wait) { 53 | var timeout; 54 | return function () { 55 | const context = this; 56 | const args = arguments; 57 | const later = function () { 58 | clearTimeout(timeout); 59 | timeout = null; 60 | func.apply(context, args); 61 | }; 62 | 63 | if (!timeout) { 64 | timeout = setTimeout(later, wait); 65 | } 66 | }; 67 | } 68 | 69 | function headerOffset() { 70 | // Set an offset if there is are fixed top navbar 71 | const headerEl = window.document.querySelector("header.fixed-top"); 72 | if (headerEl) { 73 | return headerEl.clientHeight; 74 | } else { 75 | return 0; 76 | } 77 | } 78 | 79 | function footerOffset() { 80 | const footerEl = window.document.querySelector("footer.footer"); 81 | if (footerEl) { 82 | return footerEl.clientHeight; 83 | } else { 84 | return 0; 85 | } 86 | } 87 | 88 | function dashboardOffset() { 89 | const dashboardNavEl = window.document.getElementById( 90 | "quarto-dashboard-header" 91 | ); 92 | if (dashboardNavEl !== null) { 93 | return dashboardNavEl.clientHeight; 94 | } else { 95 | return 0; 96 | } 97 | } 98 | 99 | function updateDocumentOffsetWithoutAnimation() { 100 | updateDocumentOffset(false); 101 | } 102 | 103 | function updateDocumentOffset(animated) { 104 | // set body offset 105 | const topOffset = headerOffset(); 106 | const bodyOffset = topOffset + footerOffset() + dashboardOffset(); 107 | const bodyEl = window.document.body; 108 | bodyEl.setAttribute("data-bs-offset", topOffset); 109 | bodyEl.style.paddingTop = topOffset + "px"; 110 | 111 | // deal with sidebar offsets 112 | const sidebars = window.document.querySelectorAll( 113 | ".sidebar, .headroom-target" 114 | ); 115 | 
sidebars.forEach((sidebar) => { 116 | if (!animated) { 117 | sidebar.classList.add("notransition"); 118 | // Remove the no transition class after the animation has time to complete 119 | setTimeout(function () { 120 | sidebar.classList.remove("notransition"); 121 | }, 201); 122 | } 123 | 124 | if (window.Headroom && sidebar.classList.contains("sidebar-unpinned")) { 125 | sidebar.style.top = "0"; 126 | sidebar.style.maxHeight = "100vh"; 127 | } else { 128 | sidebar.style.top = topOffset + "px"; 129 | sidebar.style.maxHeight = "calc(100vh - " + topOffset + "px)"; 130 | } 131 | }); 132 | 133 | // allow space for footer 134 | const mainContainer = window.document.querySelector(".quarto-container"); 135 | if (mainContainer) { 136 | mainContainer.style.minHeight = "calc(100vh - " + bodyOffset + "px)"; 137 | } 138 | 139 | // link offset 140 | let linkStyle = window.document.querySelector("#quarto-target-style"); 141 | if (!linkStyle) { 142 | linkStyle = window.document.createElement("style"); 143 | linkStyle.setAttribute("id", "quarto-target-style"); 144 | window.document.head.appendChild(linkStyle); 145 | } 146 | while (linkStyle.firstChild) { 147 | linkStyle.removeChild(linkStyle.firstChild); 148 | } 149 | if (topOffset > 0) { 150 | linkStyle.appendChild( 151 | window.document.createTextNode(` 152 | section:target::before { 153 | content: ""; 154 | display: block; 155 | height: ${topOffset}px; 156 | margin: -${topOffset}px 0 0; 157 | }`) 158 | ); 159 | } 160 | if (init) { 161 | window.dispatchEvent(headroomChanged); 162 | } 163 | init = true; 164 | } 165 | 166 | // initialize headroom 167 | var header = window.document.querySelector("#quarto-header"); 168 | if (header && window.Headroom) { 169 | const headroom = new window.Headroom(header, { 170 | tolerance: 5, 171 | onPin: function () { 172 | const sidebars = window.document.querySelectorAll( 173 | ".sidebar, .headroom-target" 174 | ); 175 | sidebars.forEach((sidebar) => { 176 | 
sidebar.classList.remove("sidebar-unpinned"); 177 | }); 178 | updateDocumentOffset(); 179 | }, 180 | onUnpin: function () { 181 | const sidebars = window.document.querySelectorAll( 182 | ".sidebar, .headroom-target" 183 | ); 184 | sidebars.forEach((sidebar) => { 185 | sidebar.classList.add("sidebar-unpinned"); 186 | }); 187 | updateDocumentOffset(); 188 | }, 189 | }); 190 | headroom.init(); 191 | 192 | let frozen = false; 193 | window.quartoToggleHeadroom = function () { 194 | if (frozen) { 195 | headroom.unfreeze(); 196 | frozen = false; 197 | } else { 198 | headroom.freeze(); 199 | frozen = true; 200 | } 201 | }; 202 | } 203 | 204 | window.addEventListener( 205 | "hashchange", 206 | function (e) { 207 | if ( 208 | getComputedStyle(document.documentElement).scrollBehavior !== "smooth" 209 | ) { 210 | window.scrollTo(0, window.pageYOffset - headerOffset()); 211 | } 212 | }, 213 | false 214 | ); 215 | 216 | // Observe size changed for the header 217 | const headerEl = window.document.querySelector("header.fixed-top"); 218 | if (headerEl && window.ResizeObserver) { 219 | const observer = new window.ResizeObserver(() => { 220 | setTimeout(updateDocumentOffsetWithoutAnimation, 0); 221 | }); 222 | observer.observe(headerEl, { 223 | attributes: true, 224 | childList: true, 225 | characterData: true, 226 | }); 227 | } else { 228 | window.addEventListener( 229 | "resize", 230 | throttle(updateDocumentOffsetWithoutAnimation, 50) 231 | ); 232 | } 233 | setTimeout(updateDocumentOffsetWithoutAnimation, 250); 234 | 235 | // fixup index.html links if we aren't on the filesystem 236 | if (window.location.protocol !== "file:") { 237 | const links = window.document.querySelectorAll("a"); 238 | for (let i = 0; i < links.length; i++) { 239 | if (links[i].href) { 240 | links[i].dataset.originalHref = links[i].href; 241 | links[i].href = links[i].href.replace(/\/index\.html/, "/"); 242 | } 243 | } 244 | 245 | // Fixup any sharing links that require urls 246 | // Append url to any 
sharing urls 247 | const sharingLinks = window.document.querySelectorAll( 248 | "a.sidebar-tools-main-item, a.quarto-navigation-tool, a.quarto-navbar-tools, a.quarto-navbar-tools-item" 249 | ); 250 | for (let i = 0; i < sharingLinks.length; i++) { 251 | const sharingLink = sharingLinks[i]; 252 | const href = sharingLink.getAttribute("href"); 253 | if (href) { 254 | sharingLink.setAttribute( 255 | "href", 256 | href.replace("|url|", window.location.href) 257 | ); 258 | } 259 | } 260 | 261 | // Scroll the active navigation item into view, if necessary 262 | const navSidebar = window.document.querySelector("nav#quarto-sidebar"); 263 | if (navSidebar) { 264 | // Find the active item 265 | const activeItem = navSidebar.querySelector("li.sidebar-item a.active"); 266 | if (activeItem) { 267 | // Wait for the scroll height and height to resolve by observing size changes on the 268 | // nav element that is scrollable 269 | const resizeObserver = new ResizeObserver((_entries) => { 270 | // The bottom of the element 271 | const elBottom = activeItem.offsetTop; 272 | const viewBottom = navSidebar.scrollTop + navSidebar.clientHeight; 273 | 274 | // The element height and scroll height are the same, then we are still loading 275 | if (viewBottom !== navSidebar.scrollHeight) { 276 | // Determine if the item isn't visible and scroll to it 277 | if (elBottom >= viewBottom) { 278 | navSidebar.scrollTop = elBottom; 279 | } 280 | 281 | // stop observing now since we've completed the scroll 282 | resizeObserver.unobserve(navSidebar); 283 | } 284 | }); 285 | resizeObserver.observe(navSidebar); 286 | } 287 | } 288 | } 289 | }); 290 | -------------------------------------------------------------------------------- /pure_functions.qmd: -------------------------------------------------------------------------------- 1 | # Pure Functions as units of work 2 | 3 | Functions are fun! 
4 | ```{r} 5 | #| eval: false 6 | operate_on_a_b <- function(input_a = 1, input_b = 1, operation = `+`) { 7 | operation(input_a, input_b) 8 | } 9 | 10 | # Predict the output 11 | operate_on_a_b(2, 2) 12 | 13 | operate_on_a_b(2, 2, c) 14 | 15 | operate_on_a_b(2, 2, rnorm) 16 | 17 | operate_on_a_b(2, 2, operate_on_a_b) 18 | 19 | operate_on_a_b(2, 2, function(x, y) { paste0(x, y) |> as.numeric() }) 20 | 21 | operate_on_a_b(2, 2, \(x, y) union(x, y)) 22 | ``` 23 | 24 | - And R really shoves them in your face e.g. `lapply`, e.g. `labels =` in `{ggplot2}` 25 | - Can be intimidating at first. 26 | 27 | - Functions underpin {targets} 28 | - Take home point: USE MORE FUNCTIONS. 29 | - Even if you don't use {targets} your workflows can probably benefit from using more functions 30 | - Question: What is your level of comfort with writing a function? 31 | - How often do you do it? Daily, weekly, monthly? 32 | - Question: How do we recognise a good time to create a function? 33 | -
When to create and use a function 34 | - The classic dogma is: "When you've copy-pasted the same code three times". Connected to the DRY approach "Don't Repeat Yourself". 35 | - When you need to make a passage of code less complex. Functions allow us to create new forms of expression that can express solutions in terms that better match our domain. Write more 'elegant', readable, and maintainable code. 36 |
37 | 38 | ## Functions for the win* 39 | 40 | - *win = more maintainable, more easily debuggable code 41 | - Question: Think about the worst code you've ever had to debug. What were its features? What made debugging it hard? 42 | -
Debug-resistant code 43 | - Large amount of environment 'state' that is time consuming to set up to recreate bug 44 | - Large number of lines of code where bug can hide 45 | - Stepping through it all is time consuming 46 | - Large number of objects or functions involved 47 | - Understanding them all is a high cognitive load 48 |
49 | - Is anyone familiar with how bushfires are fought? 50 | - Firefighters create fire breaks (containment lines) to break up fuel (the bushland) into containment zones. The idea is to keep the fire burning within a contained zone until it consumes all the fuel and burns itself out, or weather conditions become less favourable. 51 | - Functions can be 'containment zones' for bugs. 52 | - State to recreate the bug is limited to the function's inputs 53 | - Places where the bug can hide are limited to within the function's code (sometimes) 54 | - Could actually be somewhere else, but you've narrowed it down 55 | - Functions are communication tools. 56 | - Naming a procedure after what it does can be as effective as a comment 57 | - They provide a navigable hyperlinked structure 58 | - Example: "classic_r_project_/R/compute_h3_indices_at_resolutions.R" 59 | 60 | # Pure functions and the target graph 61 | 62 | - A 'Pure' function is a specific flavour of function that is important in the context of {targets}. 63 | - For pure functions these two properties hold: 64 | 1. Given the same arguments the function will always return the same output ('Deterministic'). 65 | - By extension this means the function cannot depend on or be affected by anything that is not an argument. 66 | 2. Functions have 'no side effects', that is, they cannot affect state outside the function scope in any way, other than with the output they return. No writing files, no submitting data to APIs, no setting options or environment variables etc. 67 | 68 | There's a bunch of cool algebra that arises from pure functions, which you may have heard referred to as the [Lambda Calculus](https://www.youtube.com/watch?v=3VQ382QG-y4). It shows you can calculate anything calculable in a system comprised of ONLY pure functions. 69 | 70 | Question: How do these properties relate to reproducibility? 71 | 72 | For our {targets} use case 2. is more important than 1. 
We can use non-deterministic functions with {targets}, and in fact sometimes it's hard to get around this because so much data work involves sampling random numbers. 73 | - You're encouraged to set a 'seed' to make random functions deterministic and reproducible. They now act pure within your context. 74 | 75 | Property 2. facilitates 'static code analysis'. That's the process by which {targets} turns your pipeline into a graph. Formally, a Directed Acyclic Graph (DAG): 76 | 77 | ![A graph of an example project from the {targets} manual](images/targets_network.png) 78 | 79 | This is a simple example from the {targets} manual. Here's some code from a more realistic targets plan: 80 | 81 | ![A target definition](images/target_definition.png) 82 | 83 | And here is the corresponding section of the {targets} graph: 84 | 85 | ![The target definition as a node in a graph](images/targets_real_network.png) 86 | 87 | {targets} builds graphs like this by analysing your code. It assumes that each target depends only on its inputs, and returns a single output, although that output can be a collection of things that is iterated over. More on that later. 88 | 89 | By connecting target nodes via their input and output edges {targets} can determine some interesting things using this graph: 90 | 91 | - For a given change in data or code, the set of all downstream targets that depend on that data or code. When the pipeline is re-run only the targets that depend on things that changed are built. 92 | - Provided your functions are pure, the results will be as if the entire pipeline had been re-run from start to finish. 93 | - It can determine nodes in the graph that do not share common dependencies. These targets can be computed in parallel to speed up the overall plan execution. 
94 | 95 | 96 | # Refactoring the classic workflow to pure functions 97 | 98 | The first step in refactoring our classic workflow into {targets} is to refactor `run.R` into a series of function calls to pure functions. 
We'll then convert this into a {targets} 'plan'. 99 | 100 | - There's actually surprisingly little work to get a {targets} pipeline running! 101 | - Initially our functions will be too big and do too much. 102 | - Most of the work will be refactoring to make better use of {targets} features. 103 | 104 | ## But how do we design functions? 105 | 106 | - How to size a function as a unit of work for a {targets} plan? 107 | - Code size: Not that much more than a screenful of code 108 | - Complexity: It's like a paragraph or a subheading. One main idea. 109 | - Ideally one kind of output 110 | - A list of things of the same class is quite normal. 111 | - Multiple distinct results are possible e.g. with a list. 112 | - Potentially a code smell that your function is doing too much. 113 | - Sometimes a result can be opportunistically and efficiently calculated as part of something else... fair enough. 114 | 115 | ## Let's do this 116 | 117 | Follow my lead and we'll do our first refactor of [our project](https://github.com/milesmcbain/classic_r_project) in preparation for {targets}. 118 | 119 | Don't worry, we won't do it all now. Just enough to give you the idea. I have a "Here's one I prepared earlier" for each refactor step. 120 | 121 | - It would be a good exercise to attempt this complete task on your own 122 | 123 | ### Refactoring Steps 124 | 125 | 1. Move config and library calls into run.R 126 | - All the subsequently created functions are going to be called and wired together in run.R 127 | 2. Wrap up 01_ into a function that fetches data 128 | 129 | ```{r} 130 | #| eval: false 131 | occurrences <- fetch_data( 132 | study_species, 133 | study_date, 134 | study_area_file = inner_brisbane_boundary 135 | ) 136 | ``` 137 | 138 | 3. 
Wrap up 02_ into a function that wrangles data 139 | 140 | ```{r} 141 | #| eval: false 142 | occurrences_weather_hexes <- 143 | wrangle_and_join_weather( 144 | species_data = occurrences, 145 | study_species, 146 | data_start_date, 147 | weather_data_path, 148 | h3_hex_resolutions 149 | ) 150 | ``` 151 | 152 | 4. Split up 03_ into a function for each plot 153 | 154 | ```{r} 155 | #| eval: false 156 | brisbane_river <- st_read(brisbane_river_file) 157 | gg_species_distribution_points <- 158 | plot_species_distribution_points( 159 | occurrences_weather_hexes, 160 | brisbane_river 161 | ) 162 | 163 | gg_species_distribution_hexes <- 164 | plot_species_distribution_hexes( 165 | occurrences_weather_hexes, 166 | brisbane_river 167 | ) 168 | 169 | gg_species_distribution_months <- 170 | plot_species_distribution_months( 171 | occurrences_weather_hexes 172 | ) 173 | 174 | 175 | ``` 176 | 177 | 5. Split up 04_ into: 178 | - function that creates the training data 179 | - function that creates test train splits 180 | - function that does the model parameter grid search 181 | - function that fits final model 182 | - function that creates validation data from final model and test set 183 | 184 | ```{r} 185 | #| eval: false 186 | occurrences_training_data <- 187 | create_trainging_data(occurrences_weather_hexes) 188 | 189 | test_train_split <- 190 | initial_split(occurrences_training_data) 191 | 192 | species_classification_model_training_summary <- 193 | species_classification_model_grid_search( 194 | training_data = training(test_train_split), 195 | n_cv_folds = 5, 196 | mtry_candidates = c(1, 2, 3), 197 | num_trees_candidates = c(200, 500, 100) 198 | ) 199 | 200 | species_classification_model <- 201 | fit_final_species_classification_model( 202 | training_data = training(test_train_split), 203 | species_classification_model_training_summary 204 | ) 205 | 206 | species_model_validation_data <- 207 | create_validation_data( 208 | species_classification_model, 209 | test_data = 
testing(test_train_split) 210 | ) 211 | ``` 212 | 213 | 214 | 6. Split 05_ into one function for each plot 215 | 216 | ```{r} 217 | #| eval: false 218 | gg_species_class_accuracy_hexes <- 219 | plot_species_class_accuracy_hexes( 220 | species_model_validation_data, 221 | brisbane_river, 222 | h3_hex_resolutions 223 | ) 224 | 225 | # need to call plot on this one in our Rmd 226 | base_plot_model_roc_object <- 227 | get_species_classifier_roc( 228 | species_model_validation_data 229 | ) 230 | 231 | ``` 232 | 233 | 7. Swap out all the images used in the Rmd for plot objects. 234 | - These are already in the global environment and so can be seen by knitr / rmarkdown during render. 235 | 236 | The completed refactor is on the [`refactor1` branch of our project](https://github.com/MilesMcBain/classic_r_project/tree/refactor1) 237 | 238 | In particular pay attention to `run.R`. 239 | - Notice how it is far easier to get a handle on what information our project depends on and where that is used? 240 | 241 | ### Workflow tips 242 | 243 | - [`{fnmate}`](https://github.com/milesmcbain/fnmate) for creating a function definition from an example call. 244 | - 'Jump to definition' for jumping to the body of a function from a call site. 245 | -------------------------------------------------------------------------------- /typical_R_projects.qmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Strengths and weaknesses of typical R project workflows" 3 | format: 4 | html: 5 | code-fold: true 6 | --- 7 | 8 | # Concept: Data Analysis Pipelines 9 | 10 | - Data goes in, answers, insights, all the magic comes out. 11 | - 'Pipeline' implies a process which is a kind of linear progression from inputs to outputs. 12 | - Contrast this with a process that looks more like a continuous loop, where the aim is to receive input data, react to it, and then rest waiting for the next piece of data. 13 | - E.g. 
a software Application 14 | - The linear aspect is often reflected in how we structure our data analysis projects. 15 | 16 | # Concept: Reproducible Data Analysis 17 | 18 | - The 'reproducibility' that is connected to data analysis pipeline tools is not the 'replicability' from the replication crisis in science. But the two are connected. 19 | - We can 'reproduce': 20 | - The same conclusion given the same input data, and following the same analysis process. 21 | - e.g. your colleague's work on your computer 22 | - A valid analysis given different input data, and following the same analysis process. 23 | - Conclusion might be different 24 | - What are the reasons we might want to do this? 25 |
26 | Benefits of reproducibility? 27 | - In order to answer questions about, or make extensions to an analysis in the future 28 | - 'Boomerang effect' 29 | - To be able to make realistic predictions about how long a data analysis will take 30 | - To ensure consistent conclusions are reached to related questions 31 | - Need consistent definitions for inputs and key metrics 32 | - In a nutshell: Reliability, Consistency 33 | - Without these you don't have a viable data analysis capability 34 |
35 | 36 | # Reproducibility and Code 37 | 38 | - Code works in favour of reproducibility. 39 | - It's not guaranteed, but well written code can produce a deterministic procedure for data analysis. Use the same dataset with the same code and you should reproduce the same answer. 40 | - In an ideal world every data analysis could be a single succinct script of beautifully aesthetic code, easily understood by humans and machines alike. 41 | - In practice this is rarely possible due to certain forces. What are those forces? 42 |
43 | Forces pulling apart that perfect script 44 | - Domain mismatch: need to write a lot of code 45 | - External systems: need to tread lightly on them 46 | - Expensive computations: repeatedly performing them is infeasible 47 | - Division of labour 48 | - (?) 49 |
50 | 51 | # Classic approaches to R projects 52 | 53 | ## Script per pipeline 'stage' 54 | 55 | The most common approach to balancing reproducibility versus other concerns is 56 | to break the pipeline up into discrete scripts that map to stages in the linear 57 | pipeline. These stages might be conceived of as something like: 58 | 59 | 1. Acquire data 60 | 2. Wrangle data 61 | 3. Visualise data 62 | 4. Model data 63 | 5. Present findings 64 | 65 | With variations as required by context. 66 | 67 | A typical folder structure might look something like: 68 | 69 | ``` 70 | . 71 | ├── data 72 | │   ├── processed_data.Rds 73 | │   └── raw_data.csv 74 | ├── doc 75 | │   ├── exploratory_analysis.Rmd 76 | │   └── report.Rmd 77 | ├── output 78 | │   ├── insightful_plot.png 79 | │   └── final_model.Rds 80 | ├── R 81 | │   └── helpers.R 82 | ├── README.md 83 | ├── run.sh 84 | └── scripts 85 | ├── 01_load_data.R 86 | ├── 02_wrangle_data.R 87 | ├── 03_visualise_data.R 88 | ├── 04_model_data.R 89 | └── 05_render_report.R 90 | 91 | ``` 92 | 93 | There are many variations on this idea. There might be multiple scripts per 94 | phase here, e.g. one per plot (figure) instead of a single `03_visualise_data.R`, or one per model. 95 | Using more folders seems popular. 96 | 97 | The key element is that the data analysis is broken down into a series of 98 | stages, each of which is captured by a single script file. Quite often these 99 | script files are numbered, with the implicit understanding that the 100 | correct way to run the pipeline is to run the scripts in numerical order[^1]. 101 | 102 | [^1]: Occasionally this presents refactoring challenges, where a new stage needs to 103 | be added late in development and the author might roll with a `02b_` to save 104 | updating too many paths. 
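
The "run the numbered scripts in order" convention can be sketched as a small entry-point function. This is only an illustration: the name `run_numbered_scripts()` and the default `scripts` folder are assumptions, not something taken from the example project.

```r
# Hypothetical entry-point: run every numbered script in a folder, in order.
# `run_numbered_scripts` and the default "scripts" folder are assumptions
# made for illustration only.
run_numbered_scripts <- function(script_dir = "scripts") {
  scripts <- list.files(
    script_dir,
    pattern = "^[0-9]+.*\\.R$", # e.g. 01_load_data.R, 02b_extra_step.R
    full.names = TRUE
  )
  # Radix sort uses C-locale ordering, so the run order does not depend on
  # the locale of the machine running the pipeline.
  scripts <- sort(scripts, method = "radix")
  for (script in scripts) {
    message("Running ", basename(script))
    # source() evaluates in the global environment by default, so each
    # script sees whatever state the earlier scripts left behind.
    source(script)
  }
  invisible(basename(scripts))
}
```

Note that a function like this always re-runs every stage, which is exactly the cost that tends to push authors toward running scripts piecemeal.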
105 | 106 | If the author is diligent the `README.md` will contain information about how to 107 | run the pipeline, and may provide some kind of `run.R` or `run.sh` script which 108 | acts as the 'entry-point' for kicking off pipeline execution. This is intended 109 | to be the thing that you run to reproduce the author's results. 110 | 111 | This can be something of a trap because the author likely does not actually use the 112 | `run.sh` script as part of their workflow. 113 | - Why would this be so? 114 |
115 | Reasons for not using the 'run everything' entry-point. 116 | - Author likely taking advantage of R's REPL for interactive development. Relies on operating on incomplete pipeline state. 117 | - Running whole pipeline is too slow. Author can't iterate fast enough if they have to re-run all earlier stages just to make small tweaks to a later stage. E.g. playing with plot presentation. 118 | - Running whole pipeline would have undesirable side-effects like pulling a large amount of data from an API. 119 |
120 | 121 | So the workflow used in practice tends to be some combination of: 122 | - Interactively run numbered scripts up to the one you want to work on, then manually step through code to create the prerequisite state in the global environment. 123 | - Shortcut the early stages of the pipeline by having them write intermediate output files that are read as inputs to later stages. Stages can be worked on independently. 124 | 125 | ### When things go stale 126 | 127 | It's the real, yet informal, workflow that creates problems. 128 | 129 | If we used this project structure, and always ran the pipeline from start to 130 | finish, we would always know whether our code was in a state consistent with 131 | valid outputs[^2]. 132 | 133 | If we work interactively, as we inevitably will[^3], we create opportunities for 134 | the pipeline's code and outputs (be they intermediate or final) to be in conflict. 135 | Here are a few examples: 136 | 137 | 1. Changes are made to `02_wrangle_data.R` to support better modeling in 138 | `04_model_data.R`. We were so keen to write about the improved results, we 139 | forgot to re-run `03_visualise_data.R` which also outputs some image files 140 | that are used in the `report.Rmd`. When we render `report.Rmd` it contains 141 | images with data that was dropped before running the new model, and our boss 142 | is confused as to why we didn't remove them yet. We look silly. 143 | 144 | 2. In the midst of running `04_model_data.R` interactively we forget that 145 | creating a `|>` chain of `data.table` transformations can modify the head 146 | dataset in-place. We tweak a chain before running it again, leading to some 147 | columns being transformed twice. When we run the modeling code to completion 148 | we get some unexpectedly good results, and save that model for later use in 149 | `report.Rmd`. 
We write the report out around these results, only to have 150 | everything sour at the last minute when we try to run the entire pipeline as 151 | one with `run.sh` and a completely different set of results appears. 152 | 153 | Both of these examples are different aspects of the same core problem. Working 154 | interactively with code that can accumulate data (i.e. files on disk, data.frames 155 | in the global environment, rows in a database, etc.) creates the opportunity for 156 | the code and the accumulated data to be in an inconsistent state. Sometimes this 157 | accumulated data is itself referred to as 'global state' or simply 'state', and 158 | people might say our problem was caused by 'stale state', that is: we are 159 | working with data that is no longer representative of what our program would 160 | output, if we ran it from scratch. 161 | 162 | ### Cycles vs Lines 163 | 164 | - The project structure is strongly linear: Each script is assumed to be fully dependent on those prior, just as each line of code is on the one before. 165 | - Our work pattern is strongly cyclic as we iteratively refine our reasoning, statistical methods, data visualisations etc. This involves making smaller targeted changes to code at all stages of the pipeline. 166 | 167 | The forces we have discussed exert pressure to avoid running the pipeline in a linear fashion. This creates space for issues: 168 | 169 | - The pipeline either fails to run, or gives unexpected results when finally run in a complete pass. Reproducibility fail. 170 | - Concepts drift between script files as they are worked on piecemeal. E.g. the same dataset is loaded in multiple script files but referred to by different names. Coherency fail. 171 | 172 | As we will see, `{targets}` will remove the pressure to run the entire pipeline end to end, and allow us to work iteratively without the risk of these problems, perhaps faster than ever before. 173 | 174 | [^2]: Or would we? 
How do we decide what valid outputs are? More on this later. 175 | 176 | [^3]:Probably one of the reasons you're using R is the gloriously fluent 177 | conversations you can have with your data via the REPL. Working with rapid 178 | feedback when you need to simultaneously learn about data and program around it 179 | just feels way too good compared to the alternative where you have to wait for a 180 | heavy process to spool up and to run each time you have a simple question. 181 | 182 | ## Rmd / Quarto Monolith 183 | 184 | The idea of keeping code and output synchronised is often introduced to motivate 185 | the use of _literate programming_ tools like Rmarkdown or Quarto. They 186 | definitely have something to contribute here, these tools work very well for 187 | educational material (like this workshop!), but the format does not scale well to large and complex data science projects. 188 | 189 | There are two main difficulties: 190 | 191 | 1. Fundamentally the format is geared toward producing a single output which is 192 | a text of some kind. Complex data science projects often have a myriad of other 193 | outputs including models, datasets, and other documents. Possibly having your 194 | model run binned because `pandoc` balked at your markdown syntax is not 195 | sensible. 196 | 197 | Rmarkdown and Quarto offer a caching feature to try to mitigate this but it 198 | involves manual cache management, and does not give you control over 199 | serialisation formats which mean certain objects will be unable to be 200 | restored from cache correctly. It's up to you to discover which. 201 | 202 | 2. In projects that involve complex data wrangling or modeling a tension can 203 | develop between the text and the code, where the code needs to be complex, but 204 | the text is pitched at a different (usually higher) conceptual level. The two fight for the narrative thread, and make for a disjointed / confusing reading experience. I call this _illiterate programming_. 
205 | 206 | My advice is definitely do use Rmarkdown or Quarto, but avoid shoehorning an 207 | entire pipeline into the document. Have a pipeline produce the intermediate outputs 208 | separately which are then read into the document generation pipeline and given superficial coding treatment e.g. light wrangling into presentation layer plots or tables. 209 | 210 | # Introducing our project V1: 'Classic R Project' 211 | 212 | In this workshop we're going to refactor [this project](https://github.com/MilesMcBain/classic_r_project) into a {targets} pipeline. 213 | 214 | Nominally the project is about fetching some species distribution data from an API, merging that with some weather data, training a species classification model, and producing a report. 215 | 216 | - Take a few minutes to poke around 217 | - Read script 01 and 02. 218 | - Is it clear why things in 02 are the way they are? 219 | - Is it clear why we do that H3 hex index thing? 220 | - Has anyone looked at real estate recently. Which gives you more context: clicking through all the beautiful images, or one look at the floor plan? 221 | - Or imagine reading a text book that had no table of contents? 222 | 223 | 224 | -------------------------------------------------------------------------------- /targets_plan.qmd: -------------------------------------------------------------------------------- 1 | # The {targets} plan 2 | 3 | As discussed in the last section, refactoring our project into a collection of functions is actually most of the work in converting a pipeline to use {targets}. The `run.R` we built already looks a lot like a {targets} 'plan'. 4 | 5 | It's time we actually defined what a {targets} plan is. This can be a little confusing because there are two things that users might refer to as the 'plan': 6 | 7 | 1. A script file, by convention called `_targets.R`, that sits in the root folder of the project. 8 | 9 | - Sets up environment for the project's targets. 
Loads packages, sources script files containing functions to be called. Sets global state: options, environment variables etc. 10 | - Returns, as its last object, a list of target objects. 11 | 12 | 2. The list of target objects itself. This data structure is what is analysed to determine the dependency structure of the pipeline graph. 13 | 14 | So in classic R fashion, the definition of the computational graph is itself a 15 | data structure which can be manipulated to great metaprogramming effect. For 16 | example we can have targets that appear in the `_targets.R` as a single 17 | computational node, but are actually expanded out into several targets in the 18 | final returned object.[^1] 19 | 20 | [^1]: More concretely: We could have a 'model fit' target that decomposes into two separate targets for the model and performance statistics. We could use an argument to change the assessment criteria without rebuilding the model. Don't worry if that didn't make sense yet. 21 | 22 | In this workshop we'll refer to the `_targets.R` file as the plan. 23 | 24 | # Let's create a plan 25 | 26 | Let's jump into refactoring our `run.R` into a {targets} plan. 27 | 28 | ## Refactoring steps 29 | 30 | 1. Rename `run.R` to `_targets.R` 31 | 2. Add `library(targets)` and `library(tarchetypes)` to libraries 32 | 33 | - `{tarchetypes}` is discussed in the next section 34 | 35 | 3. Replace `set.seed(2048)` with `tar_option_set(seed = 2048)` 36 | 4. Refactor path bindings of data files like: 37 | 38 | ```{r} 39 | #| eval: false 40 | weather_data_path <- "data/brisbane_weather.csv" 41 | ``` 42 | 43 | ```{r} 44 | #| eval: false 45 | tar_file(weather_data_path, "data/brisbane_weather.csv") 46 | ``` 47 | 48 | 5. 
Refactor each object binding like: 49 | 50 | ```{r} 51 | #| eval: false 52 | object_name <- function_call(arg1, arg2, arg3) 53 | ``` 54 | 55 | to: 56 | 57 | ```{r} 58 | #| eval: false 59 | tar_target( 60 | object_name, 61 | function_call( 62 | arg1, 63 | arg2, 64 | arg3 65 | ) 66 | ), 67 | ``` 68 | 69 | 6. Wrap all the output bindings and their function calls in a `list()` 70 | 71 | - Make sure there's a comma separating each 72 | 73 | 7. Replace `render("docs/report.Rmd")` with 74 | 75 | ```{r} 76 | #| eval: false 77 | tar_render(report, "docs/report.Rmd") 78 | ``` 79 | 80 | 8. Wrap each pipeline output in the report in `tar_read()`, e.g.: 81 | ```{r} 82 | #| eval: false 83 | tar_read(gg_species_distribution_points) 84 | ``` 85 | 86 | 9. Replace the code to source all files in the R folder with `tar_source()` 87 | 88 | Now we have a {targets} plan! 89 | 90 | The completed refactor is available on the ['refactor2' branch of the R project](https://github.com/MilesMcBain/classic_r_project/tree/refactor2). 91 | 92 | # {tarchetypes} and the Targetopia 93 | 94 | [`{tarchetypes}`](https://github.com/ropensci/tarchetypes) is an addon package to 95 | `{targets}` also authored by Will Landau. It contains helpful additions that 96 | make expressing {targets} plans simpler and cleaner. It is part of the 97 | ['Targetopia'](https://wlandau.github.io/targetopia/packages.html) family of extensions 98 | for targets. 99 | 100 | {targets} has been designed to allow users to create their own domain specific extensions, for example see [`{geotargets}`](https://github.com/njtierney/geotargets). 101 | 102 | In this refactor we use several `{tarchetypes}` functions: 103 | 104 | - `tar_file()` is a compact way to declare both input and output file targets. 105 | - These files will trigger targets that depend on them to be rebuilt if they change. 106 | - `tar_render()` is a way to declare an RMarkdown report target (Yes, there is a Quarto variant). 
107 | - The R code in the report is analysed to discover dependencies on other targets. 108 | 109 | # 'Making a plan' 110 | 111 | Assuming: 112 | 113 | - Our R session working directory is set to the project root, where `_targets.R` is 114 | - We have the `{targets}` package loaded in our R session 115 | 116 | We can build all targets in the plan by calling `tar_make()`. 117 | 118 | If we do that now we should see a stream of messages appear in our R terminal like: 119 | 120 | ``` 121 | ... 122 | ● completed target species_classification_model_training_summary [53.591 seconds] 123 | ▶ dispatched target species_classification_model 124 | ● completed target species_classification_model [3.06 seconds] 125 | ▶ dispatched target species_model_validation_data 126 | ● completed target species_model_validation_data [0.253 seconds] 127 | ▶ dispatched target base_plot_model_roc_object 128 | Setting levels: control = FALSE, case = TRUE 129 | Setting direction: controls < cases 130 | ● completed target base_plot_model_roc_object [0.005 seconds] 131 | ▶ dispatched target gg_species_class_accuracy_hexes 132 | ● completed target gg_species_class_accuracy_hexes [0.136 seconds] 133 | ▶ dispatched target report 134 | ● completed target report [4.148 seconds] 135 | ▶ ended pipeline [1.095 minutes] 136 | ``` 137 | 138 | Immediately calling `tar_make()` again should show something different: 139 | 140 | ``` 141 | ✔ skipped target species_classification_model_training_summary 142 | ✔ skipped target species_classification_model 143 | ✔ skipped target species_model_validation_data 144 | ✔ skipped target base_plot_model_roc_object 145 | ✔ skipped target gg_species_class_accuracy_hexes 146 | ✔ skipped target report 147 | ✔ skipped pipeline [0.173 seconds] 148 | ``` 149 | 150 | By default the plan is built in a separate R session! So only state created by running `_targets.R` is used. 
This avoids a whole class of bugs that arise due to running code interactively against stale state in the global environment. 151 | 152 | # The Store 153 | 154 | When we ran the plan for the second time all our targets were 'skipped' because 155 | the {targets} framework determined it already had the output for them, 156 | as nothing had changed since the first time we ran the plan. 157 | 158 | The built version of every target in the plan is stored in a place referred to 159 | as the store. By default the store lives inside the `_targets` folder which 160 | {targets} creates the first time we call `tar_make()`, and then refers to on every 161 | subsequent run. 162 | 163 | For our project, that folder looks like this now: 164 | 165 | ``` 166 | _targets 167 | ├── meta 168 | │   ├── meta 169 | │   ├── process 170 | │   └── progress 171 | ├── objects 172 | │   ├── base_plot_model_roc_object 173 | │   ├── brisbane_river 174 | │   ├── gg_species_class_accuracy_hexes 175 | │   ├── gg_species_distribution_hexes 176 | │   ├── gg_species_distribution_months 177 | │   ├── gg_species_distribution_points 178 | │   ├── occurrences 179 | │   ├── occurrences_training_data 180 | │   ├── occurrences_weather_hexes 181 | │   ├── species_classification_model 182 | │   ├── species_classification_model_training_summary 183 | │   ├── species_model_validation_data 184 | │   ├── study_species 185 | │   └── test_train_split 186 | └── user 187 | ``` 188 | 189 | `_targets/objects` contains the R objects returned from the functions that ran for each target. Each object is serialised to a file in Rds format, and labeled with the associated target's name. The file format is configurable with the `format` option of `tar_target()`. There are also format helpers available like `tarchetypes::tar_parquet()`, which defines a target that will be serialised to parquet format. 190 | 191 | There are some targets that are not present. File targets are stored only in the metadata. 
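
As a rough sketch of the `format` option mentioned above, the fragment below declares the same value with three storage choices. The target names and the `make_summary()` helper are made up for illustration, and the parquet variants assume the `{arrow}` package is installed.

```r
library(targets)
library(tarchetypes)

# Hypothetical helper standing in for a real wrangling function.
make_summary <- function() {
  data.frame(month = month.abb[1:3], n_sightings = c(12L, 7L, 30L))
}

# Illustrative target names; in a real project a list like this would be the
# last object returned by `_targets.R`.
storage_examples <- list(
  # Default storage: an Rds file at _targets/objects/summary_rds
  tar_target(summary_rds, make_summary()),
  # Same value, serialised as parquet via the `format` argument
  tar_target(summary_parquet, make_summary(), format = "parquet"),
  # Shorthand from {tarchetypes} for a parquet target
  tar_parquet(summary_parquet_short, make_summary())
)
storage_examples
```

Choosing a format per target is mostly a trade-off between convenience (Rds handles almost any R object) and read/write speed or interoperability for large data frames.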
192 | 193 | We can get some interesting information from the metadata with `tar_meta()`: 194 | 195 | ```{r} 196 | #| eval: false 197 | tar_meta() |> 198 | arrange(-seconds) |> 199 | select(name, format, seconds, bytes, warnings, error) |> 200 | head() |> 201 | knitr::kable() 202 | ``` 203 | 204 | |name |format | seconds| bytes|warnings |error | 205 | |:---------------------------------------------|:------|-------:|-------:|:--------|:-----| 206 | |species_classification_model_training_summary |rds | 54.934| 395|NA |NA | 207 | |occurrences |rds | 9.832| 326259|NA |NA | 208 | |report |file | 4.148| 1950959|NA |NA | 209 | |species_classification_model |rds | 2.967| 7969385|NA |NA | 210 | |occurrences_weather_hexes |rds | 0.488| 511859|NA |NA | 211 | |study_species |rds | 0.347| 555|NA |NA | 212 | 213 | Targets can be read from the store like: 214 | 215 | ```{r} 216 | #| eval: false 217 | species_classification_model_training_summary <- 218 | tar_read(species_classification_model_training_summary) # returns value 219 | tar_load(species_classification_model_training_summary) # adds to global environment 220 | ``` 221 | - This is very useful for inspecting intermediate targets, doing further development with them, or debugging the plan. 222 | 223 | # Cleaning the store 224 | 225 | - Running `tar_invalidate(occurrences)` means the next time the plan is run `occurrences` will be rebuilt, which MAY trigger downstream targets to be rebuilt, but not necessarily if targets identifies the value hasn't changed. The stored value is retained; only the target's metadata record is removed. 226 | - `tar_delete()` removes targets from the store. It supports 'tidyselect' style name matching. 227 | - `tar_delete(everything())` wipes the whole store. 228 | - `tar_prune()` removes historical targets from the store that are no longer in the plan. 229 | - Useful if you accidentally get a typo'd version of a target in your store and keep loading it by accident. 
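
The store-cleaning functions above can be tried safely on a throwaway pipeline. The sketch below is an illustration only: the two toy targets (`raw_numbers`, `total`) are made up, and `tar_dir()` is used so everything happens in a temporary directory rather than in a real project.

```r
library(targets)

# tar_dir() runs the code in a temporary directory, so no real store is touched.
tar_dir({
  # Write a throwaway `_targets.R` with two toy targets (names are made up).
  tar_script({
    list(
      tar_target(raw_numbers, c(1, 2, 3)),
      tar_target(total, sum(raw_numbers))
    )
  }, ask = FALSE)
  tar_make(reporter = "silent")

  # Mark `raw_numbers` as needing a rebuild on the next tar_make().
  tar_invalidate(raw_numbers)
  tar_make(reporter = "silent")

  # Remove the stored value of `total` from _targets/objects.
  tar_delete(total)

  # Drop `total` from the plan, then prune what no longer belongs to it.
  tar_script({
    list(tar_target(raw_numbers, c(1, 2, 3)))
  }, ask = FALSE)
  tar_prune()

  # Record which targets still have a stored value.
  remaining_objects <- tar_objects()
})
remaining_objects
```

In day-to-day use you would call these functions directly from the project root; the temporary directory is only there to keep the demonstration from touching a real store.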
230 | 231 | # Plan visualisation 232 | 233 | The graph of targets defined in the plan can be visualised with either: 234 | - `tar_visnetwork()` 235 | - `tar_glimpse()` - doesn't show information about which targets are up to date 236 | 237 | e.g. 238 | 239 | ![The result of tar_visnetwork() on our project.](images/targets_visnetwork2.png) 240 | 241 | These get messy fast, but one useful aspect is that targets that align vertically are 242 | not dependent on each other, so it can help you understand the scope for 243 | parallelisation to speed up running the plan. 244 | 245 | # Finer points 246 | 247 | ## Packages 248 | 249 | The global environment established by the `_targets.R` is used to evaluate all targets. This includes loaded packages. In the case of parallelism, as we will see later, the environment is replicated. 250 | 251 | Aside from the traditional `library()` method, there is another way to declare 252 | packages, per target, using the `packages` argument. In this case the packages 253 | are loaded right before the target is built, or its built object is used in a 254 | downstream target. This may be situationally convenient, but because targets 255 | that do not share a dependency relationship can be run in any order, this means 256 | that packages can be loaded in any order, which could have unexpected 257 | consequences. I would suggest avoiding this. 258 | 259 | Many examples you see will set packages globally in `_targets.R` using 260 | `tar_option_set(packages = )`. This conveys no advantage over the `library()` 261 | method, and has one significant drawback in that `renv` will not detect these 262 | packages automatically as project dependencies. 263 | 264 | My recommendation is to stick with `library()`. 265 | 266 | There is one other option related to packages which is important. 267 | `tar_option_set(imports = )` defines a set of packages whose functions and objects are to be 268 | watched for changes. 
If the package is updated it will cause dependent targets 269 | that use updated functions or objects to be rebuilt. 270 | 271 | You probably don't want to put every package in here 272 | - scanning through large amounts of code slows things down 273 | - packages that are updated regularly could trigger too much churn in your pipeline 274 | - your internal packages or quite unstable packages are probably good candidates. 275 | 276 | ## Constants 277 | 278 | At the moment we have objects declared outside the plan. E.g. `study_date <- ymd("2024-05-08")`. 279 | These are still watched for changes, but they are not 280 | targets. Their value is not available from the store. My advice is that it makes 281 | for a slightly better interactive development and debugging workflow to have 282 | them in the list of target objects, so they become fully fledged targets, and 283 | available from the store. 284 | 285 | `tarchetypes::tar_plan()` is a replacement for `list()` that allows targets to be written in the list like: 286 | 287 | ```{r} 288 | #| eval: false 289 | tar_plan( 290 | study_date = ymd("2024-05-08"), 291 | tar_target( 292 | study_species, 293 | search_taxa(c("Threskiornis molucca", "Threskiornis spinicollis")) 294 | ) 295 | ) 296 | ``` 297 | 298 | ## File targets 299 | 300 | It can be confusing that both input files and output files are declared the same way. 301 | 302 | - If you have a file input, and you need targets that depend on the file to be rebuilt if the file changes, use `tar_file()` in your plan. 303 | - If you have a target that writes a file, and you want that target to be rebuilt (and the file rewritten) if the file is removed, use `tar_file()` in your plan. 304 | 305 | Just remember `tar_file()` for files! 306 | 307 | # Interactive development workflow 308 | 309 | Here I'll give you a quick demo of how you proceed with development. 310 | 311 | The main elements are: 312 | 313 | 1. 
Load the targets you want to work with interactively from the targets store with `tar_load()` 314 | - There is an RStudio Addin that ships with targets that can load the target under the cursor from the store. It is highly recommended that you make a keyboard shortcut for this. See [creating keyboard shortcuts for RStudio addins](https://docs.posit.co/ide/user/ide/guide/productivity/add-ins.html#keyboard-shortcuts) 315 | 2. Declare a new target, and create a new function to be home to your work. 316 | - Don't forget about `{fnmate}`! 317 | 3. Once your function seems like it will work, immediately run `tar_make()` - I also highly advise making a keyboard shortcut for this. 318 | - It's also convenient to run `tar_make(target_name)` to run the pipeline up to, but not past, the target you built, if there are other unrelated targets you don't want to run right now. 319 | 4. Repeat. (e.g. `tar_load()` the target you just defined and built to do further work) 320 | 321 | Sometimes people ask about where to put experimental code they're not sure should be in the plan yet. At the bottom of a file I'm working with I sometimes keep a 'scratchpad' of related code inside a block like: 322 | 323 | ```{r} 324 | #| eval: false 325 | function(){ 326 | # experimental code goes here 327 | } 328 | ``` 329 | 330 | The code is not run when running the plan and since the function has no name, it 331 | cannot be accidentally called. This tip came from Gábor Csárdi. 332 | 333 | # Review 334 | 335 | What does each of these functions do? 336 | 337 | - `tar_make()` 338 | - `tar_load()` 339 | - `tar_meta()` 340 | - `tar_invalidate()` 341 | - `tar_delete()` 342 | - `tar_target()` 343 | - `tar_file()` 344 | - `tar_render()` 345 | 346 | What must be the last object returned in the `_targets.R`? 347 | 348 | If you can answer those, you now have enough tools to use targets productively on 349 | your projects. The cake is baked. It's all icing from here. 
350 | 351 | # The two kinds of reproducibility 352 | 353 | Hopefully you can see how the plan we have now defeats classes of reproducibility issues typically experienced with classic R projects. Explicitly, we get a more reproducible workflow since: 354 | 355 | - Our targets are built in a separate session, so cannot be affected by the state of our interactive development environment. 356 | - Since `{targets}` intelligently caches our work, we can frequently run the pipeline with the same reproducibility guarantees as if we were running it from scratch, but in a fraction of the time. 357 | - This avoids the situation where the pipeline is rarely run end to end after small changes. 358 | 359 | There is a second kind of reproducibility supported by this workflow though. 360 | I'll argue that with `{targets}` we can more easily reproduce an understanding of 361 | what all this code is doing in someone else's head. 362 | 363 | Imagine you are tasked with peer-reviewing a pipeline. If that pipeline is engineered with `{targets}` your workflow looks like: 364 | 365 | 1. Run the full pipeline with `tar_make()`. 366 | - Barring issues with the package dependencies, the code just works! It has frequently been tested end-to-end. 367 | 2. You start your review from the `_targets.R` which gives you a nice overview of what the important bits of information the pipeline depends on are. 368 | - You might even run `tar_visnetwork()` to get a feel for the most important chains of targets. 369 | 3. When you decide you need to inspect the code for a target to understand it better, you can immediately: 370 | - `tar_load()` the input targets from the store, and execute a 'jump to definition' to go directly to the code where those targets are used. You can interactively play with the code until it makes sense. 371 | 4. You can context switch away and come back, maybe days later, and do exactly the same thing. 
All the pipeline's targets are still sitting in the store ready for immediate interactive use. 372 | 373 | For similar reasons as described here, `{targets}` works very well to support teams that have a high degree of context switching. 374 | 375 | -------------------------------------------------------------------------------- /docs/site_libs/quarto-html/popper.min.js: -------------------------------------------------------------------------------- 1 | /** 2 | * @popperjs/core v2.11.7 - MIT License 3 | */ 4 | 5 | !function(e,t){"object"==typeof exports&&"undefined"!=typeof module?t(exports):"function"==typeof define&&define.amd?define(["exports"],t):t((e="undefined"!=typeof globalThis?globalThis:e||self).Popper={})}(this,(function(e){"use strict";function t(e){if(null==e)return window;if("[object Window]"!==e.toString()){var t=e.ownerDocument;return t&&t.defaultView||window}return e}function n(e){return e instanceof t(e).Element||e instanceof Element}function r(e){return e instanceof t(e).HTMLElement||e instanceof HTMLElement}function o(e){return"undefined"!=typeof ShadowRoot&&(e instanceof t(e).ShadowRoot||e instanceof ShadowRoot)}var i=Math.max,a=Math.min,s=Math.round;function f(){var e=navigator.userAgentData;return null!=e&&e.brands&&Array.isArray(e.brands)?e.brands.map((function(e){return e.brand+"/"+e.version})).join(" "):navigator.userAgent}function c(){return!/^((?!chrome|android).)*safari/i.test(f())}function p(e,o,i){void 0===o&&(o=!1),void 0===i&&(i=!1);var a=e.getBoundingClientRect(),f=1,p=1;o&&r(e)&&(f=e.offsetWidth>0&&s(a.width)/e.offsetWidth||1,p=e.offsetHeight>0&&s(a.height)/e.offsetHeight||1);var u=(n(e)?t(e):window).visualViewport,l=!c()&&i,d=(a.left+(l&&u?u.offsetLeft:0))/f,h=(a.top+(l&&u?u.offsetTop:0))/p,m=a.width/f,v=a.height/p;return{width:m,height:v,top:h,right:d+m,bottom:h+v,left:d,x:d,y:h}}function u(e){var n=t(e);return{scrollLeft:n.pageXOffset,scrollTop:n.pageYOffset}}function l(e){return e?(e.nodeName||"").toLowerCase():null}function 
d(e){return((n(e)?e.ownerDocument:e.document)||window.document).documentElement}function h(e){return p(d(e)).left+u(e).scrollLeft}function m(e){return t(e).getComputedStyle(e)}function v(e){var t=m(e),n=t.overflow,r=t.overflowX,o=t.overflowY;return/auto|scroll|overlay|hidden/.test(n+o+r)}function y(e,n,o){void 0===o&&(o=!1);var i,a,f=r(n),c=r(n)&&function(e){var t=e.getBoundingClientRect(),n=s(t.width)/e.offsetWidth||1,r=s(t.height)/e.offsetHeight||1;return 1!==n||1!==r}(n),m=d(n),y=p(e,c,o),g={scrollLeft:0,scrollTop:0},b={x:0,y:0};return(f||!f&&!o)&&(("body"!==l(n)||v(m))&&(g=(i=n)!==t(i)&&r(i)?{scrollLeft:(a=i).scrollLeft,scrollTop:a.scrollTop}:u(i)),r(n)?((b=p(n,!0)).x+=n.clientLeft,b.y+=n.clientTop):m&&(b.x=h(m))),{x:y.left+g.scrollLeft-b.x,y:y.top+g.scrollTop-b.y,width:y.width,height:y.height}}function g(e){var t=p(e),n=e.offsetWidth,r=e.offsetHeight;return Math.abs(t.width-n)<=1&&(n=t.width),Math.abs(t.height-r)<=1&&(r=t.height),{x:e.offsetLeft,y:e.offsetTop,width:n,height:r}}function b(e){return"html"===l(e)?e:e.assignedSlot||e.parentNode||(o(e)?e.host:null)||d(e)}function x(e){return["html","body","#document"].indexOf(l(e))>=0?e.ownerDocument.body:r(e)&&v(e)?e:x(b(e))}function w(e,n){var r;void 0===n&&(n=[]);var o=x(e),i=o===(null==(r=e.ownerDocument)?void 0:r.body),a=t(o),s=i?[a].concat(a.visualViewport||[],v(o)?o:[]):o,f=n.concat(s);return i?f:f.concat(w(b(s)))}function O(e){return["table","td","th"].indexOf(l(e))>=0}function j(e){return r(e)&&"fixed"!==m(e).position?e.offsetParent:null}function E(e){for(var n=t(e),i=j(e);i&&O(i)&&"static"===m(i).position;)i=j(i);return i&&("html"===l(i)||"body"===l(i)&&"static"===m(i).position)?n:i||function(e){var t=/firefox/i.test(f());if(/Trident/i.test(f())&&r(e)&&"fixed"===m(e).position)return null;var n=b(e);for(o(n)&&(n=n.host);r(n)&&["html","body"].indexOf(l(n))<0;){var 
i=m(n);if("none"!==i.transform||"none"!==i.perspective||"paint"===i.contain||-1!==["transform","perspective"].indexOf(i.willChange)||t&&"filter"===i.willChange||t&&i.filter&&"none"!==i.filter)return n;n=n.parentNode}return null}(e)||n}var D="top",A="bottom",L="right",P="left",M="auto",k=[D,A,L,P],W="start",B="end",H="viewport",T="popper",R=k.reduce((function(e,t){return e.concat([t+"-"+W,t+"-"+B])}),[]),S=[].concat(k,[M]).reduce((function(e,t){return e.concat([t,t+"-"+W,t+"-"+B])}),[]),V=["beforeRead","read","afterRead","beforeMain","main","afterMain","beforeWrite","write","afterWrite"];function q(e){var t=new Map,n=new Set,r=[];function o(e){n.add(e.name),[].concat(e.requires||[],e.requiresIfExists||[]).forEach((function(e){if(!n.has(e)){var r=t.get(e);r&&o(r)}})),r.push(e)}return e.forEach((function(e){t.set(e.name,e)})),e.forEach((function(e){n.has(e.name)||o(e)})),r}function C(e){return e.split("-")[0]}function N(e,t){var n=t.getRootNode&&t.getRootNode();if(e.contains(t))return!0;if(n&&o(n)){var r=t;do{if(r&&e.isSameNode(r))return!0;r=r.parentNode||r.host}while(r)}return!1}function I(e){return Object.assign({},e,{left:e.x,top:e.y,right:e.x+e.width,bottom:e.y+e.height})}function _(e,r,o){return r===H?I(function(e,n){var r=t(e),o=d(e),i=r.visualViewport,a=o.clientWidth,s=o.clientHeight,f=0,p=0;if(i){a=i.width,s=i.height;var u=c();(u||!u&&"fixed"===n)&&(f=i.offsetLeft,p=i.offsetTop)}return{width:a,height:s,x:f+h(e),y:p}}(e,o)):n(r)?function(e,t){var n=p(e,!1,"fixed"===t);return n.top=n.top+e.clientTop,n.left=n.left+e.clientLeft,n.bottom=n.top+e.clientHeight,n.right=n.left+e.clientWidth,n.width=e.clientWidth,n.height=e.clientHeight,n.x=n.left,n.y=n.top,n}(r,o):I(function(e){var t,n=d(e),r=u(e),o=null==(t=e.ownerDocument)?void 
0:t.body,a=i(n.scrollWidth,n.clientWidth,o?o.scrollWidth:0,o?o.clientWidth:0),s=i(n.scrollHeight,n.clientHeight,o?o.scrollHeight:0,o?o.clientHeight:0),f=-r.scrollLeft+h(e),c=-r.scrollTop;return"rtl"===m(o||n).direction&&(f+=i(n.clientWidth,o?o.clientWidth:0)-a),{width:a,height:s,x:f,y:c}}(d(e)))}function F(e,t,o,s){var f="clippingParents"===t?function(e){var t=w(b(e)),o=["absolute","fixed"].indexOf(m(e).position)>=0&&r(e)?E(e):e;return n(o)?t.filter((function(e){return n(e)&&N(e,o)&&"body"!==l(e)})):[]}(e):[].concat(t),c=[].concat(f,[o]),p=c[0],u=c.reduce((function(t,n){var r=_(e,n,s);return t.top=i(r.top,t.top),t.right=a(r.right,t.right),t.bottom=a(r.bottom,t.bottom),t.left=i(r.left,t.left),t}),_(e,p,s));return u.width=u.right-u.left,u.height=u.bottom-u.top,u.x=u.left,u.y=u.top,u}function U(e){return e.split("-")[1]}function z(e){return["top","bottom"].indexOf(e)>=0?"x":"y"}function X(e){var t,n=e.reference,r=e.element,o=e.placement,i=o?C(o):null,a=o?U(o):null,s=n.x+n.width/2-r.width/2,f=n.y+n.height/2-r.height/2;switch(i){case D:t={x:s,y:n.y-r.height};break;case A:t={x:s,y:n.y+n.height};break;case L:t={x:n.x+n.width,y:f};break;case P:t={x:n.x-r.width,y:f};break;default:t={x:n.x,y:n.y}}var c=i?z(i):null;if(null!=c){var p="y"===c?"height":"width";switch(a){case W:t[c]=t[c]-(n[p]/2-r[p]/2);break;case B:t[c]=t[c]+(n[p]/2-r[p]/2)}}return t}function Y(e){return Object.assign({},{top:0,right:0,bottom:0,left:0},e)}function G(e,t){return t.reduce((function(t,n){return t[n]=e,t}),{})}function J(e,t){void 0===t&&(t={});var r=t,o=r.placement,i=void 0===o?e.placement:o,a=r.strategy,s=void 0===a?e.strategy:a,f=r.boundary,c=void 0===f?"clippingParents":f,u=r.rootBoundary,l=void 0===u?H:u,h=r.elementContext,m=void 0===h?T:h,v=r.altBoundary,y=void 0!==v&&v,g=r.padding,b=void 0===g?0:g,x=Y("number"!=typeof 
b?b:G(b,k)),w=m===T?"reference":T,O=e.rects.popper,j=e.elements[y?w:m],E=F(n(j)?j:j.contextElement||d(e.elements.popper),c,l,s),P=p(e.elements.reference),M=X({reference:P,element:O,strategy:"absolute",placement:i}),W=I(Object.assign({},O,M)),B=m===T?W:P,R={top:E.top-B.top+x.top,bottom:B.bottom-E.bottom+x.bottom,left:E.left-B.left+x.left,right:B.right-E.right+x.right},S=e.modifiersData.offset;if(m===T&&S){var V=S[i];Object.keys(R).forEach((function(e){var t=[L,A].indexOf(e)>=0?1:-1,n=[D,A].indexOf(e)>=0?"y":"x";R[e]+=V[n]*t}))}return R}var K={placement:"bottom",modifiers:[],strategy:"absolute"};function Q(){for(var e=arguments.length,t=new Array(e),n=0;n=0?-1:1,i="function"==typeof n?n(Object.assign({},t,{placement:e})):n,a=i[0],s=i[1];return a=a||0,s=(s||0)*o,[P,L].indexOf(r)>=0?{x:s,y:a}:{x:a,y:s}}(n,t.rects,i),e}),{}),s=a[t.placement],f=s.x,c=s.y;null!=t.modifiersData.popperOffsets&&(t.modifiersData.popperOffsets.x+=f,t.modifiersData.popperOffsets.y+=c),t.modifiersData[r]=a}},se={left:"right",right:"left",bottom:"top",top:"bottom"};function fe(e){return e.replace(/left|right|bottom|top/g,(function(e){return se[e]}))}var ce={start:"end",end:"start"};function pe(e){return e.replace(/start|end/g,(function(e){return ce[e]}))}function ue(e,t){void 0===t&&(t={});var n=t,r=n.placement,o=n.boundary,i=n.rootBoundary,a=n.padding,s=n.flipVariations,f=n.allowedAutoPlacements,c=void 0===f?S:f,p=U(r),u=p?s?R:R.filter((function(e){return U(e)===p})):k,l=u.filter((function(e){return c.indexOf(e)>=0}));0===l.length&&(l=u);var d=l.reduce((function(t,n){return t[n]=J(e,{placement:n,boundary:o,rootBoundary:i,padding:a})[C(n)],t}),{});return Object.keys(d).sort((function(e,t){return d[e]-d[t]}))}var le={name:"flip",enabled:!0,phase:"main",fn:function(e){var t=e.state,n=e.options,r=e.name;if(!t.modifiersData[r]._skip){for(var o=n.mainAxis,i=void 0===o||o,a=n.altAxis,s=void 
0===a||a,f=n.fallbackPlacements,c=n.padding,p=n.boundary,u=n.rootBoundary,l=n.altBoundary,d=n.flipVariations,h=void 0===d||d,m=n.allowedAutoPlacements,v=t.options.placement,y=C(v),g=f||(y===v||!h?[fe(v)]:function(e){if(C(e)===M)return[];var t=fe(e);return[pe(e),t,pe(t)]}(v)),b=[v].concat(g).reduce((function(e,n){return e.concat(C(n)===M?ue(t,{placement:n,boundary:p,rootBoundary:u,padding:c,flipVariations:h,allowedAutoPlacements:m}):n)}),[]),x=t.rects.reference,w=t.rects.popper,O=new Map,j=!0,E=b[0],k=0;k=0,S=R?"width":"height",V=J(t,{placement:B,boundary:p,rootBoundary:u,altBoundary:l,padding:c}),q=R?T?L:P:T?A:D;x[S]>w[S]&&(q=fe(q));var N=fe(q),I=[];if(i&&I.push(V[H]<=0),s&&I.push(V[q]<=0,V[N]<=0),I.every((function(e){return e}))){E=B,j=!1;break}O.set(B,I)}if(j)for(var _=function(e){var t=b.find((function(t){var n=O.get(t);if(n)return n.slice(0,e).every((function(e){return e}))}));if(t)return E=t,"break"},F=h?3:1;F>0;F--){if("break"===_(F))break}t.placement!==E&&(t.modifiersData[r]._skip=!0,t.placement=E,t.reset=!0)}},requiresIfExists:["offset"],data:{_skip:!1}};function de(e,t,n){return i(e,a(t,n))}var he={name:"preventOverflow",enabled:!0,phase:"main",fn:function(e){var t=e.state,n=e.options,r=e.name,o=n.mainAxis,s=void 0===o||o,f=n.altAxis,c=void 0!==f&&f,p=n.boundary,u=n.rootBoundary,l=n.altBoundary,d=n.padding,h=n.tether,m=void 0===h||h,v=n.tetherOffset,y=void 0===v?0:v,b=J(t,{boundary:p,rootBoundary:u,padding:d,altBoundary:l}),x=C(t.placement),w=U(t.placement),O=!w,j=z(x),M="x"===j?"y":"x",k=t.modifiersData.popperOffsets,B=t.rects.reference,H=t.rects.popper,T="function"==typeof y?y(Object.assign({},t.rects,{placement:t.placement})):y,R="number"==typeof T?{mainAxis:T,altAxis:T}:Object.assign({mainAxis:0,altAxis:0},T),S=t.modifiersData.offset?t.modifiersData.offset[t.placement]:null,V={x:0,y:0};if(k){if(s){var 
q,N="y"===j?D:P,I="y"===j?A:L,_="y"===j?"height":"width",F=k[j],X=F+b[N],Y=F-b[I],G=m?-H[_]/2:0,K=w===W?B[_]:H[_],Q=w===W?-H[_]:-B[_],Z=t.elements.arrow,$=m&&Z?g(Z):{width:0,height:0},ee=t.modifiersData["arrow#persistent"]?t.modifiersData["arrow#persistent"].padding:{top:0,right:0,bottom:0,left:0},te=ee[N],ne=ee[I],re=de(0,B[_],$[_]),oe=O?B[_]/2-G-re-te-R.mainAxis:K-re-te-R.mainAxis,ie=O?-B[_]/2+G+re+ne+R.mainAxis:Q+re+ne+R.mainAxis,ae=t.elements.arrow&&E(t.elements.arrow),se=ae?"y"===j?ae.clientTop||0:ae.clientLeft||0:0,fe=null!=(q=null==S?void 0:S[j])?q:0,ce=F+ie-fe,pe=de(m?a(X,F+oe-fe-se):X,F,m?i(Y,ce):Y);k[j]=pe,V[j]=pe-F}if(c){var ue,le="x"===j?D:P,he="x"===j?A:L,me=k[M],ve="y"===M?"height":"width",ye=me+b[le],ge=me-b[he],be=-1!==[D,P].indexOf(x),xe=null!=(ue=null==S?void 0:S[M])?ue:0,we=be?ye:me-B[ve]-H[ve]-xe+R.altAxis,Oe=be?me+B[ve]+H[ve]-xe-R.altAxis:ge,je=m&&be?function(e,t,n){var r=de(e,t,n);return r>n?n:r}(we,me,Oe):de(m?we:ye,me,m?Oe:ge);k[M]=je,V[M]=je-me}t.modifiersData[r]=V}},requiresIfExists:["offset"]};var me={name:"arrow",enabled:!0,phase:"main",fn:function(e){var t,n=e.state,r=e.name,o=e.options,i=n.elements.arrow,a=n.modifiersData.popperOffsets,s=C(n.placement),f=z(s),c=[P,L].indexOf(s)>=0?"height":"width";if(i&&a){var p=function(e,t){return Y("number"!=typeof(e="function"==typeof e?e(Object.assign({},t.rects,{placement:t.placement})):e)?e:G(e,k))}(o.padding,n),u=g(i),l="y"===f?D:P,d="y"===f?A:L,h=n.rects.reference[c]+n.rects.reference[f]-a[f]-n.rects.popper[c],m=a[f]-n.rects.reference[f],v=E(i),y=v?"y"===f?v.clientHeight||0:v.clientWidth||0:0,b=h/2-m/2,x=p[l],w=y-u[c]-p[d],O=y/2-u[c]/2+b,j=de(x,O,w),M=f;n.modifiersData[r]=((t={})[M]=j,t.centerOffset=j-O,t)}},effect:function(e){var t=e.state,n=e.options.element,r=void 0===n?"[data-popper-arrow]":n;null!=r&&("string"!=typeof r||(r=t.elements.popper.querySelector(r)))&&N(t.elements.popper,r)&&(t.elements.arrow=r)},requires:["popperOffsets"],requiresIfExists:["preventOverflow"]};function 
ve(e,t,n){return void 0===n&&(n={x:0,y:0}),{top:e.top-t.height-n.y,right:e.right-t.width+n.x,bottom:e.bottom-t.height+n.y,left:e.left-t.width-n.x}}function ye(e){return[D,L,A,P].some((function(t){return e[t]>=0}))}var ge={name:"hide",enabled:!0,phase:"main",requiresIfExists:["preventOverflow"],fn:function(e){var t=e.state,n=e.name,r=t.rects.reference,o=t.rects.popper,i=t.modifiersData.preventOverflow,a=J(t,{elementContext:"reference"}),s=J(t,{altBoundary:!0}),f=ve(a,r),c=ve(s,o,i),p=ye(f),u=ye(c);t.modifiersData[n]={referenceClippingOffsets:f,popperEscapeOffsets:c,isReferenceHidden:p,hasPopperEscaped:u},t.attributes.popper=Object.assign({},t.attributes.popper,{"data-popper-reference-hidden":p,"data-popper-escaped":u})}},be=Z({defaultModifiers:[ee,te,oe,ie]}),xe=[ee,te,oe,ie,ae,le,he,me,ge],we=Z({defaultModifiers:xe});e.applyStyles=ie,e.arrow=me,e.computeStyles=oe,e.createPopper=we,e.createPopperLite=be,e.defaultModifiers=xe,e.detectOverflow=J,e.eventListeners=ee,e.flip=le,e.hide=ge,e.offset=ae,e.popperGenerator=Z,e.popperOffsets=te,e.preventOverflow=he,Object.defineProperty(e,"__esModule",{value:!0})})); 6 | 7 | -------------------------------------------------------------------------------- /branching.qmd: -------------------------------------------------------------------------------- 1 | # Divide and conquer with branching 2 | 3 | The largest chunk of work in our project is our model fitting and the associated 4 | grid search. When scaling up these kinds of processes to larger data and larger grids you 5 | typically hit some stumbling blocks. 6 | 7 | For example: 8 | 9 | - You 'add more cores', or utilise more parallel threads, 10 | but then you unexpectedly run out of memory. 11 | - You get iterations that have model convergence problems due to bad combinations of hyperparameters. 12 | - A colleague suggests you need to expand your hyperparameter search. 
13 | 14 | These types of things are frustrating because the problem might not appear until 15 | hours into a very long running process, and the intermediate result of all of those hours is 16 | immediately dumped. 17 | 18 | With `{targets}` we can use a technique called 'dynamic branching' to promote every iteration in a set to its own target. 19 | 20 | - Each result is individually cached, which means large iterative processes are now resumable. 21 | - We can also add or remove iterations by changing input data, while reusing the results from previous iterations. 22 | - We can take advantage of `{targets}` parallelism features to run the iterations in parallel. 23 | 24 | We'll refactor the model grid search in our project to use this approach. After 25 | we see how it works it's going to be a little easier to explain why it is called 26 | 'dynamic branching'. 27 | 28 | # Dynamic branching refactor steps: 29 | 30 | In the process of this refactor we're going to remove our training grid. If you recall it was a dataframe with one row per combination of training fold and model hyper parameters: 31 | 32 | ```{r} 33 | #| eval: false 34 | occurrence_cv_splits <- 35 | vfold_cv(training_data, v = 5, repeats = 1) 36 | 37 | training_grid <- 38 | expand.grid( 39 | fold_id = occurrence_cv_splits$id, 40 | mtry = mtry_candidates, 41 | num_trees = num_trees_candidates 42 | ) |> 43 | as_tibble() |> 44 | left_join( 45 | occurrence_cv_splits, 46 | by = c(fold_id = "id") 47 | ) 48 | 49 | training_grid 50 | ``` 51 | 52 | This refactor will give `{targets}` the job of materialising the grid, with one target per combination of parameters and data (row). 53 | 54 | 1. Create a new target that summarises the grid training results: 55 | - This code remains unchanged. 
56 | 57 | i.e. put this code: 58 | 59 | ```{r} 60 | #| eval: false 61 | summarised_training_results <- 62 | training_results |> 63 | summarise( 64 | mean_auc = mean(auc), 65 | sd_auc = sd(auc), 66 | mean_accuracy = mean(accuracy), 67 | sd_accuracy = sd(accuracy), 68 | .by = c(mtry, num_trees) 69 | ) |> 70 | arrange(-mean_auc) 71 | 72 | summarised_training_results 73 | 74 | ``` 75 | 76 | inside a function called by a new target: 77 | 78 | ```{r} 79 | #| eval: false 80 | tar_target( 81 | species_classification_model_training_summary, 82 | summarise_species_model_training_results( 83 | species_classification_model_training_results 84 | ) 85 | ), 86 | ``` 87 | 88 | 2. Promote `mtry_candidates` and `num_trees_candidates` to plan targets: 89 | 90 | ```{r} 91 | #| eval: false 92 | tar_target( 93 | mtry_candidates, 94 | c(1, 2, 3) 95 | ), 96 | tar_target( 97 | num_trees_candidates, 98 | c(200, 500, 100) 99 | ), 100 | ``` 101 | 102 | 3. Make the cross validation fold dataset into a plan target: 103 | 104 | ```{r} 105 | #| eval: false 106 | tar_target( 107 | training_cross_validation_folds, 108 | vfold_cv(training(test_train_split), v = 5, repeats = 1) 109 | ), 110 | 111 | ``` 112 | 113 | 4. What's left is to refactor the middle bit, actually fitting the models, i.e. 
this code: 114 | 115 | ```{r} 116 | #| eval: false 117 | training_results <- 118 | training_grid |> 119 | mutate( 120 | pmap( 121 | .l = list( 122 | training_grid$splits, 123 | training_grid$num_trees, 124 | training_grid$mtry 125 | ), 126 | .f = fit_fold_calc_results 127 | ) |> 128 | bind_rows() 129 | # by returning a dataframe inside mutate, the resulting columns are appended to training_grid 130 | ) 131 | ``` 132 | 133 | We change that into a new target that looks like this: 134 | 135 | ```{r} 136 | #| eval: false 137 | tar_target( 138 | species_classification_model_training_results, 139 | fit_fold_calc_results( 140 | training_cross_validation_folds$splits[[1]], 141 | num_trees_candidates, 142 | mtry_candidates 143 | ), 144 | pattern = cross(training_cross_validation_folds, mtry_candidates, num_trees_candidates) 145 | ) 146 | ``` 147 | 148 | We're introducing a bit of magic here: 149 | `pattern = cross(training_cross_validation_folds, mtry_candidates, num_trees_candidates)`. 150 | 151 | This says to `{targets}`: We're declaring a group of targets here, that you're 152 | going to create for us. That group is defined by evaluating this target's expression on a set of inputs, in this case a cross product of input targets. 
If we call `tar_make()` at this point we get: 153 | 154 | ``` 155 | ▶ dispatched branch species_classification_model_training_results_d19c364718f81c21 156 | Setting levels: control = 0, case = 1 157 | Setting direction: controls < cases 158 | ● completed branch species_classification_model_training_results_d19c364718f81c21 [0.051 seconds] 159 | ▶ dispatched branch species_classification_model_training_results_bde0689544bd8cc7 160 | Setting levels: control = 0, case = 1 161 | Setting direction: controls < cases 162 | ● completed branch species_classification_model_training_results_bde0689544bd8cc7 [0.058 seconds] 163 | ● completed pattern species_classification_model_training_results 164 | ▶ dispatched target species_classification_model_training_summary 165 | ✖ errored target species_classification_model_training_summary 166 | ``` 167 | 168 | If we `tar_load(species_classification_model_training_results)` and interactively run: 169 | 170 | ```{r} 171 | #| eval: false 172 | summarise_species_model_training_results( 173 | species_classification_model_training_results 174 | ) 175 | ``` 176 | 177 | we can see the problem more clearly: 178 | 179 | ``` 180 | Error in `summarise()` at R/summarise_species_model_training_results.R:12:3: 181 | ! Can't select columns that don't exist. 182 | ✖ Column `mtry` doesn't exist. 183 | Run `rlang::last_trace()` to see where the error occurred. 184 | ``` 185 | 186 | This is happening because `species_classification_model_training_results` no 187 | longer has columns `mtry` and `num_trees`. These were being included in the data 188 | because of the way we were calling `mutate()` in our earlier grid search code. 
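To see why the columns went missing, here is a minimal sketch of the `mutate()` behaviour the old grid search code relied on. The tiny `grid` and `fake_results` tables are made up for illustration and are not part of the workshop project.

```{r}
#| eval: false
library(dplyr)

# Hypothetical stand-ins for the training grid and the per-row fit results.
grid <- tibble(mtry = c(1, 2), num_trees = c(200, 500))
fake_results <- tibble(auc = c(0.91, 0.88))

# An unnamed data frame passed to mutate() has its columns appended,
# so the grid columns travel along with the results.
with_grid_columns <- grid |> mutate(fake_results)
names(with_grid_columns)

# Returned on its own, as each branch now does, the result has no grid columns.
names(fake_results)
```

In the old code the grid columns came along for free; in the branched version each branch only returns what the fitting function itself produces.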
189 | 190 | To fix this we can modify the object returned by `fit_fold_calc_results()`: 191 | 192 | ```{r} 193 | #| eval: false 194 | 195 | # use auc and accuracy as our summary statistics 196 | data.frame( 197 | auc = auc(roc_object) |> as.numeric(), 198 | accuracy = sum(test_set$is_moluccus > 0.5 & test_set$is_moluccus == 1) / nrow(test_set), 199 | mtry = mtry, 200 | num_trees = num_trees 201 | ) 202 | ``` 203 | 204 | And now `tar_make()` should succeed! 205 | 206 | 5. Put `fit_fold_calc_results()` in `R/fit_fold_calc_results.R` 207 | 208 | - The name of the file it currently lives in no longer reflects what's in there. 209 | - You can delete the old file. 210 | 211 | We successfully refactored to 'dynamic branching'. We shall see in a moment all that we have bought with that. But first... 212 | 213 | ### Why it's called 'Dynamic Branching'... 214 | 215 | 'Branching' is a reference to the tree-like appearance of pipeline graphs. 216 | 217 | `{targets}` has a number of ways to add targets to the graph programmatically. In 218 | our case we instructed `{targets}` to add a training target to our graph for each combination of model 219 | parameters and training data. After being computed those targets are immediately 220 | consolidated into the dataset 221 | `species_classification_model_training_results` which we then summarised. 222 | 223 | We do not need to immediately consolidate the dynamically generated targets 224 | though. We could for example create a new target from each of our dynamically generated 225 | targets, which if you recall are one-row dataframes. See `fit_fold_calc_results()`. 
226 | 227 | This would look like: 228 | 229 | ```{r} 230 | #| eval: false 231 | tar_target( 232 | new_dynamic_target, 233 | a_function(species_classification_model_training_results), 234 | pattern = map(species_classification_model_training_results) 235 | ) 236 | ``` 237 | 238 | We again use `pattern` to express this, but this time with `map` we're expressing a 1:1 239 | transformation of the input targets, instead of a cross product. 240 | 241 | Our pipeline graph is having 'branches' extended, so that you can imagine it looks like: 242 | 243 | ``` 244 | species_classification_model_training_results_a - new_dynamic_target_a \ 245 | / 246 | - species_classification_model_training_results_b - new_dynamic_target_b - final_summary 247 | \ 248 | species_classification_model_training_results_c - new_dynamic_target_c / 249 | ``` 250 | 251 | These branches can be chains of targets that continue on, perhaps 252 | even splitting into even more branches themselves before being finally 253 | consolidated into a collection like a list, dataframe, or vector. 254 | 255 | The analogy of 'branches' fits this idea of splitting and potentially growing 256 | and splitting further. 257 | 258 | 'Dynamic' comes from the fact that there are actually two ways to do branching 259 | in `{targets}`. 260 | 261 | - 'Dynamic branching' where `{targets}` generates branches 262 | for you at run time, when all the target's dependencies are computed. The reason 263 | this is important is that it may not be known until run time how many items a list/vector/dataframe target contains, and thus how many branches `{targets}` would need to create. 264 | - 'Static branching' where the exact number of branches that need to be 265 | created is known based on fixed inputs in the plan. For example we might have 266 | been able to use this to create a branch for each statically known combination 267 | of parameters in our grid search. 
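For contrast, static branching might be sketched with `tarchetypes::tar_map()`, which stamps out one target per row of a table that is fixed when the plan is defined. This is only a sketch: `static_grid`, `static_fit`, and `fit_one_combination()` are hypothetical names, not part of the workshop project.

```{r}
#| eval: false
library(targets)
library(tarchetypes)

# Hypothetical grid, known in full before the pipeline runs.
static_grid <- expand.grid(
  mtry = c(1, 2, 3),
  num_trees = c(200, 500, 100)
)

# tar_map() creates one copy of the target per row of `values`,
# substituting that row's mtry and num_trees into the command.
# fit_one_combination() is a placeholder and is not called here.
static_branches <- tar_map(
  values = static_grid,
  unlist = TRUE, # return a flat list of target objects
  tar_target(static_fit, fit_one_combination(mtry, num_trees))
)

length(static_branches) # one target object per grid row
```

Because the grid is expanded when `_targets.R` is read, each of these targets shows up by name in the plan before anything has been run.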
268 | 269 | Static Branching was developed first, and is largely superseded by Dynamic Branching. There is little reason to prefer the static mode. 270 | 271 | # The Proof in the pudding 272 | 273 | ## Reusing existing grid search points 274 | 275 | Now here's where, if you do a bit of modeling, `{targets}` should get really exciting. 276 | 277 | First let's explore what happens if we expand the grid search, e.g. by trying a model version with 1000 trees: 278 | 279 | ```{r} 280 | #| eval: false 281 | tar_target( 282 | num_trees_candidates, 283 | c(200, 500, 100, 1000) 284 | ), 285 | ``` 286 | 287 | Running `tar_make()` gives: 288 | 289 | ``` 290 | ... 291 | ✔ skipped branch species_classification_model_training_results_d15e5fdbcf21af3b 292 | ✔ skipped branch species_classification_model_training_results_d19c364718f81c21 293 | ✔ skipped branch species_classification_model_training_results_bde0689544bd8cc7 294 | ▶ dispatched branch species_classification_model_training_results_e8827b8265f7d539 295 | Setting levels: control = 0, case = 1 296 | Setting direction: controls < cases 297 | ● completed branch species_classification_model_training_results_e8827b8265f7d539 [0.043 seconds] 298 | ● completed pattern species_classification_model_training_results 299 | ▶ dispatched target species_classification_model_training_summary 300 | ● completed target species_classification_model_training_summary [0.009 seconds] 301 | ▶ dispatched target species_classification_model 302 | ● completed target species_classification_model [0.049 seconds] 303 | ▶ dispatched target species_model_validation_data 304 | ● completed target species_model_validation_data [0.014 seconds] 305 | ✔ skipped target base_plot_model_roc_object 306 | ✔ skipped target gg_species_class_accuracy_hexes 307 | ▶ dispatched target report 308 | ● completed target report [3.984 seconds] 309 | ▶ ended pipeline [5.811 seconds] 310 | ``` 311 | 312 | We can see that: 313 | 314 | - We skipped a lot of branches in calculating 
`species_classification_model_training_results` 315 | - We only calculated new combinations in our training grid with `num_trees = 1000` 316 | - 3 x 5 x 1 = 15 of these (`mtry` candidates x folds x new `num_trees` value) 317 | - `species_classification_model_training_summary` changed and since it is an input to 318 | `species_classification_model` the model was refit. 319 | - BUT it turned out that the best model remained the same. So we did not rebuild: 320 | - `base_plot_model_roc_object` 321 | - `gg_species_class_accuracy_hexes` 322 | - Question: How could we refactor the plan if we wanted to make it so that if the best model didn't change we would not refit the final model? 323 | - 
Refactoring ideas 324 | - We could make a separate target which is just the first row of `species_classification_model_training_summary`, which represents the best model. 325 | - The final model would only be refit if this changes. 326 |
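One way the first idea might look in the plan. This is a sketch only: `species_classification_best_hyperparameters` is a hypothetical target name, and the final model target would need to be changed to take it as input instead of the full summary.

```{r}
#| eval: false
tar_target(
  # Hypothetical target name, not part of the workshop project.
  species_classification_best_hyperparameters,
  # The summary is sorted by descending mean_auc, so row 1 is the best model.
  # Keeping only the hyperparameter columns means an unchanged winner
  # produces an identical value.
  species_classification_model_training_summary[1, c("mtry", "num_trees")]
),
```

If the winning combination is unchanged, this target's value is unchanged, so `{targets}` can skip the targets downstream of it even though the summary itself was rebuilt.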
327 | 328 | ### Remember workspaces? 329 | 330 | Initially when I made this refactor I made this mistake: 331 | 332 | ```{r} 333 | #| eval: false 334 | tar_target( 335 | species_classification_model_training_results, 336 | fit_fold_calc_results( 337 | training_cross_validation_folds$splits, 338 | mtry_candidates, 339 | num_trees_candidates 340 | ), 341 | pattern = cross(training_cross_validation_folds, mtry_candidates, num_trees_candidates) 342 | ) 343 | ``` 344 | 345 | I forgot that `splits` is a list column, so the dataset I want has 346 | an extra layer of list wrapping that needs to be stripped off. 347 | 348 | The error this generated is hard to debug: 349 | 350 | ``` 351 | ▶ dispatched target training_cross_validation_folds 352 | ● completed target training_cross_validation_folds [0.007 seconds] 353 | ▶ dispatched branch species_classification_model_training_results_2f9ab41c1360f0ce 354 | ✖ errored branch species_classification_model_training_results_2f9ab41c1360f0ce 355 | ✖ errored pipeline [9.093 seconds] 356 | Error: 357 | ! Error running targets::tar_make() 358 | Error messages: targets::tar_meta(fields = error, complete_only = TRUE) 359 | Debugging guide: https://books.ropensci.org/targets/debugging.html 360 | How to ask for help: https://books.ropensci.org/targets/help.html 361 | Last error message: 362 | No method for objects of class: list 363 | Last error trace back: 364 | fit_fold_calc_results(training_cross_validation_folds$splits, mtry_... 365 | ``` 366 | 367 | This is partly because the inputs to 368 | `species_classification_model_training_results_2f9ab41c1360f0ce` are not known 369 | exactly. They could be any combination of elements from 370 | `training_cross_validation_folds`, `mtry_candidates`, and 371 | `num_trees_candidates`. So what would we `tar_load()` to test the problem interactively? 372 | 373 | This is the situation we discussed earlier in the context of workspaces. 
To debug we set: 374 | 375 | ```{r} 376 | #| eval: false 377 | tar_option_set( 378 | seed = 2048, 379 | workspace_on_error = TRUE 380 | ) 381 | ``` 382 | 383 | - It's actually not a bad idea to turn this on defensively when working with dynamic branches. 384 | 385 | And run `tar_make()`: 386 | 387 | ``` 388 | ▶ dispatched branch species_classification_model_training_results_2f9ab41c1360f0ce 389 | ▶ recorded workspace species_classification_model_training_results_2f9ab41c1360f0ce 390 | ✖ errored branch species_classification_model_training_results_2f9ab41c1360f0ce 391 | ✖ errored pipeline [0.349 seconds] 392 | ``` 393 | 394 | and then `tar_workspace(species_classification_model_training_results_2f9ab41c1360f0ce)`. 395 | 396 | We can now observe that the data object we pass to the fitting function for this branch is: 397 | 398 | ``` 399 | > training_cross_validation_folds$splits 400 | [[1]] 401 | 402 | <5910/1478/7388> 403 | ``` 404 | 405 | It is inside a length 1 list. So when we try to run `training()` on it inside 406 | `fit_fold_calc_results()` we get this error: 407 | 408 | ``` 409 | > training(training_cross_validation_folds$splits) 410 | Error in `training()`: 411 | ! No method for objects of class: list 412 | Run `rlang::last_trace()` to see where the error occurred. 413 | ``` 414 | 415 | So the quick fix is the `[[1]]` I added. 416 | 417 | 418 | ## Converting to parallel 419 | 420 | BUT WAIT THERE'S MORE: 421 | 422 | Things run pretty fast now. But what if we wanted to speed things up by making more cores available to run model fits in parallel? 423 | 424 | We add `library(crew)` to our packages and then set our options like: 425 | 426 | ```{r} 427 | #| eval: false 428 | tar_option_set( 429 | seed = 2048, 430 | controller = crew_controller_local(workers = 2) 431 | ) 432 | ``` 433 | 434 | Let's blow away our targets store with `tar_destroy()`, and then run `tar_make()` to see: 435 | 436 | An error! 
437 | 438 | ``` 439 | ✖ errored target occurrences 440 | ✖ errored pipeline [3.665 seconds] 441 | Error: 442 | ! Error running targets::tar_make() 443 | Error messages: targets::tar_meta(fields = error, complete_only = TRUE) 444 | Debugging guide: https://books.ropensci.org/targets/debugging.html 445 | How to ask for help: https://books.ropensci.org/targets/help.html 446 | Last error message: 447 | [conflicted] filter found in 2 packages. 448 | Either pick the one you want with `::`: 449 | • dplyr::filter 450 | • stats::filter 451 | Or declare a preference with `conflicts_prefer()`: 452 | • `conflicts_prefer(dplyr::filter)` 453 | • `conflicts_prefer(stats::filter)` 454 | ``` 455 | 456 | What gives? We called `conflicts_prefer()` at the start of our `_targets.R`. 457 | 458 | In many cases your `{targets}` plan can be made parallel with just that one config change. Unfortunately in our case there is a small issue: 459 | 460 | - The environment we create in `_targets.R` is copied to the worker processes that will run targets in parallel 461 | - `{targets}` doesn't reach into package namespaces and copy their internal state. That's a can of worms! 462 | - So any package that uses internal state in its own namespace for its functionality could have problems when that state is not replicated to workers. 463 | - This is also a problem of calling impure functions! 464 | 465 | We have two packages that utilise internal state in their namespaces: 466 | 467 | - `{conflicted}` for the conflict resolution data 468 | - `{galah}` for its credentials 469 | 470 | ### Hooks to the rescue 471 | 472 | Luckily this gives us a really good motivating case for something the `{targets}` ecosystem calls 'hooks' (the `tar_hook_*()` functions are provided by `{tarchetypes}`). 473 | Hooks are ways to modify the target definitions in our plan after we have defined them.
We have at our disposal: 474 | 475 | - `tar_hook_before()`: code to evaluate before a target is built 476 | - `tar_hook_inner()`: code to wrap around a target any time it appears as a dependency for another target (i.e. in input position) 477 | - `tar_hook_outer()`: code to wrap around a target after it is built, but before it is saved to the store. 478 | 479 | In our case we can append `tar_hook_before()` to the end of our `list()` of target definitions: 480 | 481 | ```{r} 482 | #| eval: false 483 | 484 | list( 485 | # inside list of targets 486 | ) |> 487 | tar_hook_before( 488 | hook = { 489 | conflicts_prefer( 490 | dplyr::filter, 491 | ) 492 | galah_config( 493 | atlas = "ALA", 494 | email = Sys.getenv("ALA_EMAIL") # You can replace this with your email to run. But you might not want to commit it to a public repository! 495 | ) 496 | } 497 | ) 498 | ``` 499 | 500 | Hooks can be targeted toward specific targets using the `names` argument. A classic use 501 | is to strip back `{ggplot2}` objects before they are saved to the store, since 502 | they can hold onto a reference to a large dataset. E.g. 503 | 504 | ```{r} 505 | #| eval: false 506 | 507 | list( 508 | # inside list of targets 509 | ) |> 510 | tar_hook_outer( 511 | hook = lighten_ggplot(.x), # .x is a placeholder we use for the target output 512 | names = starts_with("gg") 513 | ) 514 | ``` 515 | 516 | This will postprocess any targets whose names start with "gg". 517 | 518 | ### Parallel, finally 519 | 520 | With the hook in place, we can now build our pipeline in parallel. Exactly how 521 | that is done we leave to `{targets}` and the parallel backend in `{crew}`. 522 | Targets that are not dependent on each other are fair game to run in parallel. 523 | 524 | `{targets}` also supports other parallel backends from the packages `{clustermq}` and `{future}`. 525 | 526 | One thing that commonly crops up when running in parallel is that the increased 527 | memory pressure can cause out-of-memory errors.
Luckily with `{targets}`, we don't 528 | lose work. If you were working on AWS you could change your instance config for 529 | more RAM and resume processing. 530 | 531 | There are helpful options for dealing with resource usage, for example: 532 | 533 | - `tar_option_set(memory = "transient")` can force targets to be dropped from memory after they're stored. Under the defaults, targets can stick around in workers' memory. It's slower to use this but it lowers peak memory usage. 534 | - Also see the `storage = "worker"` option, which controls whether the result needs to be copied back to the main process or not. 535 | - We can also use the `deployment` option to specify that only certain targets go to workers, and the rest get processed in the main process that runs our plan. 536 | 537 | You don't need to remember these. 538 | 539 | - You can find them in the help for `tar_option_set()`. 540 | - They can all be set globally or at an individual target level 541 | - These and other options give you valuable control over the 'shape' of your pipeline's processing. 542 | - Just remember they exist if you run into resource usage problems. 543 | 544 | # Dynamic Branching refactor 545 | 546 | The completed refactor for this section is available on [this branch of the example project](https://github.com/MilesMcBain/classic_r_project/tree/refactor3). 547 | 548 | # Review 549 | 550 | - Dynamic branching lets us dynamically create targets 551 | - These targets can represent iterations over other targets 552 | - Makes iterations resumable 553 | - Makes iterations parallelisable 554 | - Can add iterations as inputs are updated, but keep prior work 555 | - There are two important arguments for this. 556 | - We discussed `pattern` (with `map` and `cross`). There are other patterns available. 557 | - The `iteration` argument is also useful, but not covered here. 558 | - Hooks are useful for changing selections of targets after we have defined them 559 | - We can apply pre / post processing steps or do setup work.
560 | - `{targets}` has options for controlling resource usage. 561 | -------------------------------------------------------------------------------- /docs/site_libs/quarto-search/fuse.min.js: -------------------------------------------------------------------------------- 1 | /** 2 | * Fuse.js v6.6.2 - Lightweight fuzzy-search (http://fusejs.io) 3 | * 4 | * Copyright (c) 2022 Kiro Risk (http://kiro.me) 5 | * All Rights Reserved. Apache Software License 2.0 6 | * 7 | * http://www.apache.org/licenses/LICENSE-2.0 8 | */ 9 | var e,t;e=this,t=function(){"use strict";function e(e,t){var n=Object.keys(e);if(Object.getOwnPropertySymbols){var r=Object.getOwnPropertySymbols(e);t&&(r=r.filter((function(t){return Object.getOwnPropertyDescriptor(e,t).enumerable}))),n.push.apply(n,r)}return n}function t(t){for(var n=1;ne.length)&&(t=e.length);for(var n=0,r=new Array(t);n0&&void 0!==arguments[0]?arguments[0]:1,t=arguments.length>1&&void 0!==arguments[1]?arguments[1]:3,n=new Map,r=Math.pow(10,t);return{get:function(t){var i=t.match(C).length;if(n.has(i))return n.get(i);var o=1/Math.pow(i,.5*e),c=parseFloat(Math.round(o*r)/r);return n.set(i,c),c},clear:function(){n.clear()}}}var $=function(){function e(){var t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:{},n=t.getFn,i=void 0===n?I.getFn:n,o=t.fieldNormWeight,c=void 0===o?I.fieldNormWeight:o;r(this,e),this.norm=E(c,3),this.getFn=i,this.isCreated=!1,this.setIndexRecords()}return o(e,[{key:"setSources",value:function(){var e=arguments.length>0&&void 0!==arguments[0]?arguments[0]:[];this.docs=e}},{key:"setIndexRecords",value:function(){var e=arguments.length>0&&void 0!==arguments[0]?arguments[0]:[];this.records=e}},{key:"setKeys",value:function(){var e=this,t=arguments.length>0&&void 0!==arguments[0]?arguments[0]:[];this.keys=t,this._keysMap={},t.forEach((function(t,n){e._keysMap[t.id]=n}))}},{key:"create",value:function(){var 
e=this;!this.isCreated&&this.docs.length&&(this.isCreated=!0,g(this.docs[0])?this.docs.forEach((function(t,n){e._addString(t,n)})):this.docs.forEach((function(t,n){e._addObject(t,n)})),this.norm.clear())}},{key:"add",value:function(e){var t=this.size();g(e)?this._addString(e,t):this._addObject(e,t)}},{key:"removeAt",value:function(e){this.records.splice(e,1);for(var t=e,n=this.size();t2&&void 0!==arguments[2]?arguments[2]:{},r=n.getFn,i=void 0===r?I.getFn:r,o=n.fieldNormWeight,c=void 0===o?I.fieldNormWeight:o,a=new $({getFn:i,fieldNormWeight:c});return a.setKeys(e.map(_)),a.setSources(t),a.create(),a}function R(e){var t=arguments.length>1&&void 0!==arguments[1]?arguments[1]:{},n=t.errors,r=void 0===n?0:n,i=t.currentLocation,o=void 0===i?0:i,c=t.expectedLocation,a=void 0===c?0:c,s=t.distance,u=void 0===s?I.distance:s,h=t.ignoreLocation,l=void 0===h?I.ignoreLocation:h,f=r/e.length;if(l)return f;var d=Math.abs(a-o);return u?f+d/u:d?1:f}function N(){for(var e=arguments.length>0&&void 0!==arguments[0]?arguments[0]:[],t=arguments.length>1&&void 0!==arguments[1]?arguments[1]:I.minMatchCharLength,n=[],r=-1,i=-1,o=0,c=e.length;o=t&&n.push([r,i]),r=-1)}return e[o-1]&&o-r>=t&&n.push([r,o-1]),n}var P=32;function W(e){for(var t={},n=0,r=e.length;n1&&void 0!==arguments[1]?arguments[1]:{},o=i.location,c=void 0===o?I.location:o,a=i.threshold,s=void 0===a?I.threshold:a,u=i.distance,h=void 0===u?I.distance:u,l=i.includeMatches,f=void 0===l?I.includeMatches:l,d=i.findAllMatches,v=void 0===d?I.findAllMatches:d,g=i.minMatchCharLength,y=void 0===g?I.minMatchCharLength:g,p=i.isCaseSensitive,m=void 0===p?I.isCaseSensitive:p,k=i.ignoreLocation,M=void 0===k?I.ignoreLocation:k;if(r(this,e),this.options={location:c,threshold:s,distance:h,includeMatches:f,findAllMatches:v,minMatchCharLength:y,isCaseSensitive:m,ignoreLocation:M},this.pattern=m?t:t.toLowerCase(),this.chunks=[],this.pattern.length){var 
b=function(e,t){n.chunks.push({pattern:e,alphabet:W(e),startIndex:t})},x=this.pattern.length;if(x>P){for(var w=0,L=x%P,S=x-L;w3&&void 0!==arguments[3]?arguments[3]:{},i=r.location,o=void 0===i?I.location:i,c=r.distance,a=void 0===c?I.distance:c,s=r.threshold,u=void 0===s?I.threshold:s,h=r.findAllMatches,l=void 0===h?I.findAllMatches:h,f=r.minMatchCharLength,d=void 0===f?I.minMatchCharLength:f,v=r.includeMatches,g=void 0===v?I.includeMatches:v,y=r.ignoreLocation,p=void 0===y?I.ignoreLocation:y;if(t.length>P)throw new Error(w(P));for(var m,k=t.length,M=e.length,b=Math.max(0,Math.min(o,M)),x=u,L=b,S=d>1||g,_=S?Array(M):[];(m=e.indexOf(t,L))>-1;){var O=R(t,{currentLocation:m,expectedLocation:b,distance:a,ignoreLocation:p});if(x=Math.min(O,x),L=m+k,S)for(var j=0;j=z;q-=1){var B=q-1,J=n[e.charAt(B)];if(S&&(_[B]=+!!J),K[q]=(K[q+1]<<1|1)&J,F&&(K[q]|=(A[q+1]|A[q])<<1|1|A[q+1]),K[q]&$&&(C=R(t,{errors:F,currentLocation:B,expectedLocation:b,distance:a,ignoreLocation:p}))<=x){if(x=C,(L=B)<=b)break;z=Math.max(1,2*b-L)}}if(R(t,{errors:F+1,currentLocation:b,expectedLocation:b,distance:a,ignoreLocation:p})>x)break;A=K}var U={isMatch:L>=0,score:Math.max(.001,C)};if(S){var V=N(_,d);V.length?g&&(U.indices=V):U.isMatch=!1}return U}(e,n,i,{location:c+o,distance:a,threshold:s,findAllMatches:u,minMatchCharLength:h,includeMatches:r,ignoreLocation:l}),p=y.isMatch,m=y.score,k=y.indices;p&&(g=!0),v+=m,p&&k&&(d=[].concat(f(d),f(k)))}));var y={isMatch:g,score:g?v/this.chunks.length:1};return g&&r&&(y.indices=d),y}}]),e}(),z=function(){function e(t){r(this,e),this.pattern=t}return o(e,[{key:"search",value:function(){}}],[{key:"isMultiMatch",value:function(e){return D(e,this.multiRegex)}},{key:"isSingleMatch",value:function(e){return D(e,this.singleRegex)}}]),e}();function D(e,t){var n=e.match(t);return n?n[1]:null}var K=function(e){a(n,e);var t=l(n);function n(e){return r(this,n),t.call(this,e)}return o(n,[{key:"search",value:function(e){var 
t=e===this.pattern;return{isMatch:t,score:t?0:1,indices:[0,this.pattern.length-1]}}}],[{key:"type",get:function(){return"exact"}},{key:"multiRegex",get:function(){return/^="(.*)"$/}},{key:"singleRegex",get:function(){return/^=(.*)$/}}]),n}(z),q=function(e){a(n,e);var t=l(n);function n(e){return r(this,n),t.call(this,e)}return o(n,[{key:"search",value:function(e){var t=-1===e.indexOf(this.pattern);return{isMatch:t,score:t?0:1,indices:[0,e.length-1]}}}],[{key:"type",get:function(){return"inverse-exact"}},{key:"multiRegex",get:function(){return/^!"(.*)"$/}},{key:"singleRegex",get:function(){return/^!(.*)$/}}]),n}(z),B=function(e){a(n,e);var t=l(n);function n(e){return r(this,n),t.call(this,e)}return o(n,[{key:"search",value:function(e){var t=e.startsWith(this.pattern);return{isMatch:t,score:t?0:1,indices:[0,this.pattern.length-1]}}}],[{key:"type",get:function(){return"prefix-exact"}},{key:"multiRegex",get:function(){return/^\^"(.*)"$/}},{key:"singleRegex",get:function(){return/^\^(.*)$/}}]),n}(z),J=function(e){a(n,e);var t=l(n);function n(e){return r(this,n),t.call(this,e)}return o(n,[{key:"search",value:function(e){var t=!e.startsWith(this.pattern);return{isMatch:t,score:t?0:1,indices:[0,e.length-1]}}}],[{key:"type",get:function(){return"inverse-prefix-exact"}},{key:"multiRegex",get:function(){return/^!\^"(.*)"$/}},{key:"singleRegex",get:function(){return/^!\^(.*)$/}}]),n}(z),U=function(e){a(n,e);var t=l(n);function n(e){return r(this,n),t.call(this,e)}return o(n,[{key:"search",value:function(e){var t=e.endsWith(this.pattern);return{isMatch:t,score:t?0:1,indices:[e.length-this.pattern.length,e.length-1]}}}],[{key:"type",get:function(){return"suffix-exact"}},{key:"multiRegex",get:function(){return/^"(.*)"\$$/}},{key:"singleRegex",get:function(){return/^(.*)\$$/}}]),n}(z),V=function(e){a(n,e);var t=l(n);function n(e){return r(this,n),t.call(this,e)}return o(n,[{key:"search",value:function(e){var 
t=!e.endsWith(this.pattern);return{isMatch:t,score:t?0:1,indices:[0,e.length-1]}}}],[{key:"type",get:function(){return"inverse-suffix-exact"}},{key:"multiRegex",get:function(){return/^!"(.*)"\$$/}},{key:"singleRegex",get:function(){return/^!(.*)\$$/}}]),n}(z),G=function(e){a(n,e);var t=l(n);function n(e){var i,o=arguments.length>1&&void 0!==arguments[1]?arguments[1]:{},c=o.location,a=void 0===c?I.location:c,s=o.threshold,u=void 0===s?I.threshold:s,h=o.distance,l=void 0===h?I.distance:h,f=o.includeMatches,d=void 0===f?I.includeMatches:f,v=o.findAllMatches,g=void 0===v?I.findAllMatches:v,y=o.minMatchCharLength,p=void 0===y?I.minMatchCharLength:y,m=o.isCaseSensitive,k=void 0===m?I.isCaseSensitive:m,M=o.ignoreLocation,b=void 0===M?I.ignoreLocation:M;return r(this,n),(i=t.call(this,e))._bitapSearch=new T(e,{location:a,threshold:u,distance:l,includeMatches:d,findAllMatches:g,minMatchCharLength:p,isCaseSensitive:k,ignoreLocation:b}),i}return o(n,[{key:"search",value:function(e){return this._bitapSearch.searchIn(e)}}],[{key:"type",get:function(){return"fuzzy"}},{key:"multiRegex",get:function(){return/^"(.*)"$/}},{key:"singleRegex",get:function(){return/^(.*)$/}}]),n}(z),H=function(e){a(n,e);var t=l(n);function n(e){return r(this,n),t.call(this,e)}return o(n,[{key:"search",value:function(e){for(var t,n=0,r=[],i=this.pattern.length;(t=e.indexOf(this.pattern,n))>-1;)n=t+i,r.push([t,n-1]);var o=!!r.length;return{isMatch:o,score:o?0:1,indices:r}}}],[{key:"type",get:function(){return"include"}},{key:"multiRegex",get:function(){return/^'"(.*)"$/}},{key:"singleRegex",get:function(){return/^'(.*)$/}}]),n}(z),Q=[K,H,B,J,V,U,q,G],X=Q.length,Y=/ +(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)/;function Z(e){var t=arguments.length>1&&void 0!==arguments[1]?arguments[1]:{};return e.split("|").map((function(e){for(var n=e.trim().split(Y).filter((function(e){return e&&!!e.trim()})),r=[],i=0,o=n.length;i1&&void 0!==arguments[1]?arguments[1]:{},i=n.isCaseSensitive,o=void 
0===i?I.isCaseSensitive:i,c=n.includeMatches,a=void 0===c?I.includeMatches:c,s=n.minMatchCharLength,u=void 0===s?I.minMatchCharLength:s,h=n.ignoreLocation,l=void 0===h?I.ignoreLocation:h,f=n.findAllMatches,d=void 0===f?I.findAllMatches:f,v=n.location,g=void 0===v?I.location:v,y=n.threshold,p=void 0===y?I.threshold:y,m=n.distance,k=void 0===m?I.distance:m;r(this,e),this.query=null,this.options={isCaseSensitive:o,includeMatches:a,minMatchCharLength:u,findAllMatches:d,ignoreLocation:l,location:g,threshold:p,distance:k},this.pattern=o?t:t.toLowerCase(),this.query=Z(this.pattern,this.options)}return o(e,[{key:"searchIn",value:function(e){var t=this.query;if(!t)return{isMatch:!1,score:1};var n=this.options,r=n.includeMatches;e=n.isCaseSensitive?e:e.toLowerCase();for(var i=0,o=[],c=0,a=0,s=t.length;a-1&&(n.refIndex=e.idx),t.matches.push(n)}}))}function ve(e,t){t.score=e.score}function ge(e,t){var n=arguments.length>2&&void 0!==arguments[2]?arguments[2]:{},r=n.includeMatches,i=void 0===r?I.includeMatches:r,o=n.includeScore,c=void 0===o?I.includeScore:o,a=[];return i&&a.push(de),c&&a.push(ve),e.map((function(e){var n=e.idx,r={item:t[n],refIndex:n};return a.length&&a.forEach((function(t){t(e,r)})),r}))}var ye=function(){function e(n){var i=arguments.length>1&&void 0!==arguments[1]?arguments[1]:{},o=arguments.length>2?arguments[2]:void 0;r(this,e),this.options=t(t({},I),i),this.options.useExtendedSearch,this._keyStore=new S(this.options.keys),this.setCollection(n,o)}return o(e,[{key:"setCollection",value:function(e,t){if(this._docs=e,t&&!(t instanceof $))throw new Error("Incorrect 'index' type");this._myIndex=t||F(this.options.keys,this._docs,{getFn:this.options.getFn,fieldNormWeight:this.options.fieldNormWeight})}},{key:"add",value:function(e){k(e)&&(this._docs.push(e),this._myIndex.add(e))}},{key:"remove",value:function(){for(var e=arguments.length>0&&void 0!==arguments[0]?arguments[0]:function(){return!1},t=[],n=0,r=this._docs.length;n1&&void 
0!==arguments[1]?arguments[1]:{},n=t.limit,r=void 0===n?-1:n,i=this.options,o=i.includeMatches,c=i.includeScore,a=i.shouldSort,s=i.sortFn,u=i.ignoreFieldNorm,h=g(e)?g(this._docs[0])?this._searchStringList(e):this._searchObjectList(e):this._searchLogical(e);return fe(h,{ignoreFieldNorm:u}),a&&h.sort(s),y(r)&&r>-1&&(h=h.slice(0,r)),ge(h,this._docs,{includeMatches:o,includeScore:c})}},{key:"_searchStringList",value:function(e){var t=re(e,this.options),n=this._myIndex.records,r=[];return n.forEach((function(e){var n=e.v,i=e.i,o=e.n;if(k(n)){var c=t.searchIn(n),a=c.isMatch,s=c.score,u=c.indices;a&&r.push({item:n,idx:i,matches:[{score:s,value:n,norm:o,indices:u}]})}})),r}},{key:"_searchLogical",value:function(e){var t=this,n=function(e,t){var n=(arguments.length>2&&void 0!==arguments[2]?arguments[2]:{}).auto,r=void 0===n||n,i=function e(n){var i=Object.keys(n),o=ue(n);if(!o&&i.length>1&&!se(n))return e(le(n));if(he(n)){var c=o?n[ce]:i[0],a=o?n[ae]:n[c];if(!g(a))throw new Error(x(c));var s={keyId:j(c),pattern:a};return r&&(s.searcher=re(a,t)),s}var u={children:[],operator:i[0]};return i.forEach((function(t){var r=n[t];v(r)&&r.forEach((function(t){u.children.push(e(t))}))})),u};return se(e)||(e=le(e)),i(e)}(e,this.options),r=function e(n,r,i){if(!n.children){var o=n.keyId,c=n.searcher,a=t._findMatches({key:t._keyStore.get(o),value:t._myIndex.getValueForItemAtKeyId(r,o),searcher:c});return a&&a.length?[{idx:i,item:r,matches:a}]:[]}for(var s=[],u=0,h=n.children.length;u1&&void 0!==arguments[1]?arguments[1]:{},n=t.getFn,r=void 0===n?I.getFn:n,i=t.fieldNormWeight,o=void 0===i?I.fieldNormWeight:i,c=e.keys,a=e.records,s=new $({getFn:r,fieldNormWeight:o});return s.setKeys(c),s.setIndexRecords(a),s},ye.config=I,function(){ne.push.apply(ne,arguments)}(te),ye},"object"==typeof exports&&"undefined"!=typeof module?module.exports=t():"function"==typeof define&&define.amd?define(t):(e="undefined"!=typeof globalThis?globalThis:e||self).Fuse=t(); 
-------------------------------------------------------------------------------- /docs/site_libs/quarto-html/tippy.umd.min.js: -------------------------------------------------------------------------------- 1 | !function(e,t){"object"==typeof exports&&"undefined"!=typeof module?module.exports=t(require("@popperjs/core")):"function"==typeof define&&define.amd?define(["@popperjs/core"],t):(e=e||self).tippy=t(e.Popper)}(this,(function(e){"use strict";var t={passive:!0,capture:!0},n=function(){return document.body};function r(e,t,n){if(Array.isArray(e)){var r=e[t];return null==r?Array.isArray(n)?n[t]:n:r}return e}function o(e,t){var n={}.toString.call(e);return 0===n.indexOf("[object")&&n.indexOf(t+"]")>-1}function i(e,t){return"function"==typeof e?e.apply(void 0,t):e}function a(e,t){return 0===t?e:function(r){clearTimeout(n),n=setTimeout((function(){e(r)}),t)};var n}function s(e,t){var n=Object.assign({},e);return t.forEach((function(e){delete n[e]})),n}function u(e){return[].concat(e)}function c(e,t){-1===e.indexOf(t)&&e.push(t)}function p(e){return e.split("-")[0]}function f(e){return[].slice.call(e)}function l(e){return Object.keys(e).reduce((function(t,n){return void 0!==e[n]&&(t[n]=e[n]),t}),{})}function d(){return document.createElement("div")}function v(e){return["Element","Fragment"].some((function(t){return o(e,t)}))}function m(e){return o(e,"MouseEvent")}function g(e){return!(!e||!e._tippy||e._tippy.reference!==e)}function h(e){return v(e)?[e]:function(e){return o(e,"NodeList")}(e)?f(e):Array.isArray(e)?e:f(document.querySelectorAll(e))}function b(e,t){e.forEach((function(e){e&&(e.style.transitionDuration=t+"ms")}))}function y(e,t){e.forEach((function(e){e&&e.setAttribute("data-state",t)}))}function w(e){var t,n=u(e)[0];return null!=n&&null!=(t=n.ownerDocument)&&t.body?n.ownerDocument:document}function E(e,t,n){var r=t+"EventListener";["transitionend","webkitTransitionEnd"].forEach((function(t){e[r](t,n)}))}function O(e,t){for(var n=t;n;){var 
r;if(e.contains(n))return!0;n=null==n.getRootNode||null==(r=n.getRootNode())?void 0:r.host}return!1}var x={isTouch:!1},C=0;function T(){x.isTouch||(x.isTouch=!0,window.performance&&document.addEventListener("mousemove",A))}function A(){var e=performance.now();e-C<20&&(x.isTouch=!1,document.removeEventListener("mousemove",A)),C=e}function L(){var e=document.activeElement;if(g(e)){var t=e._tippy;e.blur&&!t.state.isVisible&&e.blur()}}var D=!!("undefined"!=typeof window&&"undefined"!=typeof document)&&!!window.msCrypto,R=Object.assign({appendTo:n,aria:{content:"auto",expanded:"auto"},delay:0,duration:[300,250],getReferenceClientRect:null,hideOnClick:!0,ignoreAttributes:!1,interactive:!1,interactiveBorder:2,interactiveDebounce:0,moveTransition:"",offset:[0,10],onAfterUpdate:function(){},onBeforeUpdate:function(){},onCreate:function(){},onDestroy:function(){},onHidden:function(){},onHide:function(){},onMount:function(){},onShow:function(){},onShown:function(){},onTrigger:function(){},onUntrigger:function(){},onClickOutside:function(){},placement:"top",plugins:[],popperOptions:{},render:null,showOnCreate:!1,touch:!0,trigger:"mouseenter focus",triggerTarget:null},{animateFill:!1,followCursor:!1,inlinePositioning:!1,sticky:!1},{allowHTML:!1,animation:"fade",arrow:!0,content:"",inertia:!1,maxWidth:350,role:"tooltip",theme:"",zIndex:9999}),k=Object.keys(R);function P(e){var t=(e.plugins||[]).reduce((function(t,n){var r,o=n.name,i=n.defaultValue;o&&(t[o]=void 0!==e[o]?e[o]:null!=(r=R[o])?r:i);return t}),{});return Object.assign({},e,t)}function j(e,t){var n=Object.assign({},t,{content:i(t.content,[e])},t.ignoreAttributes?{}:function(e,t){return(t?Object.keys(P(Object.assign({},R,{plugins:t}))):k).reduce((function(t,n){var r=(e.getAttribute("data-tippy-"+n)||"").trim();if(!r)return t;if("content"===n)t[n]=r;else try{t[n]=JSON.parse(r)}catch(e){t[n]=r}return t}),{})}(e,t.plugins));return 
n.aria=Object.assign({},R.aria,n.aria),n.aria={expanded:"auto"===n.aria.expanded?t.interactive:n.aria.expanded,content:"auto"===n.aria.content?t.interactive?null:"describedby":n.aria.content},n}function M(e,t){e.innerHTML=t}function V(e){var t=d();return!0===e?t.className="tippy-arrow":(t.className="tippy-svg-arrow",v(e)?t.appendChild(e):M(t,e)),t}function I(e,t){v(t.content)?(M(e,""),e.appendChild(t.content)):"function"!=typeof t.content&&(t.allowHTML?M(e,t.content):e.textContent=t.content)}function S(e){var t=e.firstElementChild,n=f(t.children);return{box:t,content:n.find((function(e){return e.classList.contains("tippy-content")})),arrow:n.find((function(e){return e.classList.contains("tippy-arrow")||e.classList.contains("tippy-svg-arrow")})),backdrop:n.find((function(e){return e.classList.contains("tippy-backdrop")}))}}function N(e){var t=d(),n=d();n.className="tippy-box",n.setAttribute("data-state","hidden"),n.setAttribute("tabindex","-1");var r=d();function o(n,r){var o=S(t),i=o.box,a=o.content,s=o.arrow;r.theme?i.setAttribute("data-theme",r.theme):i.removeAttribute("data-theme"),"string"==typeof r.animation?i.setAttribute("data-animation",r.animation):i.removeAttribute("data-animation"),r.inertia?i.setAttribute("data-inertia",""):i.removeAttribute("data-inertia"),i.style.maxWidth="number"==typeof r.maxWidth?r.maxWidth+"px":r.maxWidth,r.role?i.setAttribute("role",r.role):i.removeAttribute("role"),n.content===r.content&&n.allowHTML===r.allowHTML||I(a,e.props),r.arrow?s?n.arrow!==r.arrow&&(i.removeChild(s),i.appendChild(V(r.arrow))):i.appendChild(V(r.arrow)):s&&i.removeChild(s)}return r.className="tippy-content",r.setAttribute("data-state","hidden"),I(r,e.props),t.appendChild(n),n.appendChild(r),o(e.props,e.props),{popper:t,onUpdate:o}}N.$$tippy=!0;var B=1,H=[],U=[];function _(o,s){var v,g,h,C,T,A,L,k,M=j(o,Object.assign({},R,P(l(s)))),V=!1,I=!1,N=!1,_=!1,F=[],W=a(we,M.interactiveDebounce),X=B++,Y=(k=M.plugins).filter((function(e,t){return 
k.indexOf(e)===t})),$={id:X,reference:o,popper:d(),popperInstance:null,props:M,state:{isEnabled:!0,isVisible:!1,isDestroyed:!1,isMounted:!1,isShown:!1},plugins:Y,clearDelayTimeouts:function(){clearTimeout(v),clearTimeout(g),cancelAnimationFrame(h)},setProps:function(e){if($.state.isDestroyed)return;ae("onBeforeUpdate",[$,e]),be();var t=$.props,n=j(o,Object.assign({},t,l(e),{ignoreAttributes:!0}));$.props=n,he(),t.interactiveDebounce!==n.interactiveDebounce&&(ce(),W=a(we,n.interactiveDebounce));t.triggerTarget&&!n.triggerTarget?u(t.triggerTarget).forEach((function(e){e.removeAttribute("aria-expanded")})):n.triggerTarget&&o.removeAttribute("aria-expanded");ue(),ie(),J&&J(t,n);$.popperInstance&&(Ce(),Ae().forEach((function(e){requestAnimationFrame(e._tippy.popperInstance.forceUpdate)})));ae("onAfterUpdate",[$,e])},setContent:function(e){$.setProps({content:e})},show:function(){var e=$.state.isVisible,t=$.state.isDestroyed,o=!$.state.isEnabled,a=x.isTouch&&!$.props.touch,s=r($.props.duration,0,R.duration);if(e||t||o||a)return;if(te().hasAttribute("disabled"))return;if(ae("onShow",[$],!1),!1===$.props.onShow($))return;$.state.isVisible=!0,ee()&&(z.style.visibility="visible");ie(),de(),$.state.isMounted||(z.style.transition="none");if(ee()){var u=re(),p=u.box,f=u.content;b([p,f],0)}A=function(){var e;if($.state.isVisible&&!_){if(_=!0,z.offsetHeight,z.style.transition=$.props.moveTransition,ee()&&$.props.animation){var t=re(),n=t.box,r=t.content;b([n,r],s),y([n,r],"visible")}se(),ue(),c(U,$),null==(e=$.popperInstance)||e.forceUpdate(),ae("onMount",[$]),$.props.animation&&ee()&&function(e,t){me(e,t)}(s,(function(){$.state.isShown=!0,ae("onShown",[$])}))}},function(){var e,t=$.props.appendTo,r=te();e=$.props.interactive&&t===n||"parent"===t?r.parentNode:i(t,[r]);e.contains(z)||e.appendChild(z);$.state.isMounted=!0,Ce()}()},hide:function(){var 
e=!$.state.isVisible,t=$.state.isDestroyed,n=!$.state.isEnabled,o=r($.props.duration,1,R.duration);if(e||t||n)return;if(ae("onHide",[$],!1),!1===$.props.onHide($))return;$.state.isVisible=!1,$.state.isShown=!1,_=!1,V=!1,ee()&&(z.style.visibility="hidden");if(ce(),ve(),ie(!0),ee()){var i=re(),a=i.box,s=i.content;$.props.animation&&(b([a,s],o),y([a,s],"hidden"))}se(),ue(),$.props.animation?ee()&&function(e,t){me(e,(function(){!$.state.isVisible&&z.parentNode&&z.parentNode.contains(z)&&t()}))}(o,$.unmount):$.unmount()},hideWithInteractivity:function(e){ne().addEventListener("mousemove",W),c(H,W),W(e)},enable:function(){$.state.isEnabled=!0},disable:function(){$.hide(),$.state.isEnabled=!1},unmount:function(){$.state.isVisible&&$.hide();if(!$.state.isMounted)return;Te(),Ae().forEach((function(e){e._tippy.unmount()})),z.parentNode&&z.parentNode.removeChild(z);U=U.filter((function(e){return e!==$})),$.state.isMounted=!1,ae("onHidden",[$])},destroy:function(){if($.state.isDestroyed)return;$.clearDelayTimeouts(),$.unmount(),be(),delete o._tippy,$.state.isDestroyed=!0,ae("onDestroy",[$])}};if(!M.render)return $;var q=M.render($),z=q.popper,J=q.onUpdate;z.setAttribute("data-tippy-root",""),z.id="tippy-"+$.id,$.popper=z,o._tippy=$,z._tippy=$;var G=Y.map((function(e){return e.fn($)})),K=o.hasAttribute("aria-expanded");return he(),ue(),ie(),ae("onCreate",[$]),M.showOnCreate&&Le(),z.addEventListener("mouseenter",(function(){$.props.interactive&&$.state.isVisible&&$.clearDelayTimeouts()})),z.addEventListener("mouseleave",(function(){$.props.interactive&&$.props.trigger.indexOf("mouseenter")>=0&&ne().addEventListener("mousemove",W)})),$;function Q(){var e=$.props.touch;return Array.isArray(e)?e:[e,0]}function Z(){return"hold"===Q()[0]}function ee(){var e;return!(null==(e=$.props.render)||!e.$$tippy)}function te(){return L||o}function ne(){var e=te().parentNode;return e?w(e):document}function re(){return S(z)}function oe(e){return 
$.state.isMounted&&!$.state.isVisible||x.isTouch||C&&"focus"===C.type?0:r($.props.delay,e?0:1,R.delay)}function ie(e){void 0===e&&(e=!1),z.style.pointerEvents=$.props.interactive&&!e?"":"none",z.style.zIndex=""+$.props.zIndex}function ae(e,t,n){var r;(void 0===n&&(n=!0),G.forEach((function(n){n[e]&&n[e].apply(n,t)})),n)&&(r=$.props)[e].apply(r,t)}function se(){var e=$.props.aria;if(e.content){var t="aria-"+e.content,n=z.id;u($.props.triggerTarget||o).forEach((function(e){var r=e.getAttribute(t);if($.state.isVisible)e.setAttribute(t,r?r+" "+n:n);else{var o=r&&r.replace(n,"").trim();o?e.setAttribute(t,o):e.removeAttribute(t)}}))}}function ue(){!K&&$.props.aria.expanded&&u($.props.triggerTarget||o).forEach((function(e){$.props.interactive?e.setAttribute("aria-expanded",$.state.isVisible&&e===te()?"true":"false"):e.removeAttribute("aria-expanded")}))}function ce(){ne().removeEventListener("mousemove",W),H=H.filter((function(e){return e!==W}))}function pe(e){if(!x.isTouch||!N&&"mousedown"!==e.type){var t=e.composedPath&&e.composedPath()[0]||e.target;if(!$.props.interactive||!O(z,t)){if(u($.props.triggerTarget||o).some((function(e){return O(e,t)}))){if(x.isTouch)return;if($.state.isVisible&&$.props.trigger.indexOf("click")>=0)return}else ae("onClickOutside",[$,e]);!0===$.props.hideOnClick&&($.clearDelayTimeouts(),$.hide(),I=!0,setTimeout((function(){I=!1})),$.state.isMounted||ve())}}}function fe(){N=!0}function le(){N=!1}function de(){var e=ne();e.addEventListener("mousedown",pe,!0),e.addEventListener("touchend",pe,t),e.addEventListener("touchstart",le,t),e.addEventListener("touchmove",fe,t)}function ve(){var e=ne();e.removeEventListener("mousedown",pe,!0),e.removeEventListener("touchend",pe,t),e.removeEventListener("touchstart",le,t),e.removeEventListener("touchmove",fe,t)}function me(e,t){var n=re().box;function r(e){e.target===n&&(E(n,"remove",r),t())}if(0===e)return t();E(n,"remove",T),E(n,"add",r),T=r}function ge(e,t,n){void 
0===n&&(n=!1),u($.props.triggerTarget||o).forEach((function(r){r.addEventListener(e,t,n),F.push({node:r,eventType:e,handler:t,options:n})}))}function he(){var e;Z()&&(ge("touchstart",ye,{passive:!0}),ge("touchend",Ee,{passive:!0})),(e=$.props.trigger,e.split(/\s+/).filter(Boolean)).forEach((function(e){if("manual"!==e)switch(ge(e,ye),e){case"mouseenter":ge("mouseleave",Ee);break;case"focus":ge(D?"focusout":"blur",Oe);break;case"focusin":ge("focusout",Oe)}}))}function be(){F.forEach((function(e){var t=e.node,n=e.eventType,r=e.handler,o=e.options;t.removeEventListener(n,r,o)})),F=[]}function ye(e){var t,n=!1;if($.state.isEnabled&&!xe(e)&&!I){var r="focus"===(null==(t=C)?void 0:t.type);C=e,L=e.currentTarget,ue(),!$.state.isVisible&&m(e)&&H.forEach((function(t){return t(e)})),"click"===e.type&&($.props.trigger.indexOf("mouseenter")<0||V)&&!1!==$.props.hideOnClick&&$.state.isVisible?n=!0:Le(e),"click"===e.type&&(V=!n),n&&!r&&De(e)}}function we(e){var t=e.target,n=te().contains(t)||z.contains(t);"mousemove"===e.type&&n||function(e,t){var n=t.clientX,r=t.clientY;return e.every((function(e){var t=e.popperRect,o=e.popperState,i=e.props.interactiveBorder,a=p(o.placement),s=o.modifiersData.offset;if(!s)return!0;var u="bottom"===a?s.top.y:0,c="top"===a?s.bottom.y:0,f="right"===a?s.left.x:0,l="left"===a?s.right.x:0,d=t.top-r+u>i,v=r-t.bottom-c>i,m=t.left-n+f>i,g=n-t.right-l>i;return d||v||m||g}))}(Ae().concat(z).map((function(e){var t,n=null==(t=e._tippy.popperInstance)?void 0:t.state;return n?{popperRect:e.getBoundingClientRect(),popperState:n,props:M}:null})).filter(Boolean),e)&&(ce(),De(e))}function Ee(e){xe(e)||$.props.trigger.indexOf("click")>=0&&V||($.props.interactive?$.hideWithInteractivity(e):De(e))}function Oe(e){$.props.trigger.indexOf("focusin")<0&&e.target!==te()||$.props.interactive&&e.relatedTarget&&z.contains(e.relatedTarget)||De(e)}function xe(e){return!!x.isTouch&&Z()!==e.type.indexOf("touch")>=0}function Ce(){Te();var 
t=$.props,n=t.popperOptions,r=t.placement,i=t.offset,a=t.getReferenceClientRect,s=t.moveTransition,u=ee()?S(z).arrow:null,c=a?{getBoundingClientRect:a,contextElement:a.contextElement||te()}:o,p=[{name:"offset",options:{offset:i}},{name:"preventOverflow",options:{padding:{top:2,bottom:2,left:5,right:5}}},{name:"flip",options:{padding:5}},{name:"computeStyles",options:{adaptive:!s}},{name:"$$tippy",enabled:!0,phase:"beforeWrite",requires:["computeStyles"],fn:function(e){var t=e.state;if(ee()){var n=re().box;["placement","reference-hidden","escaped"].forEach((function(e){"placement"===e?n.setAttribute("data-placement",t.placement):t.attributes.popper["data-popper-"+e]?n.setAttribute("data-"+e,""):n.removeAttribute("data-"+e)})),t.attributes.popper={}}}}];ee()&&u&&p.push({name:"arrow",options:{element:u,padding:3}}),p.push.apply(p,(null==n?void 0:n.modifiers)||[]),$.popperInstance=e.createPopper(c,z,Object.assign({},n,{placement:r,onFirstUpdate:A,modifiers:p}))}function Te(){$.popperInstance&&($.popperInstance.destroy(),$.popperInstance=null)}function Ae(){return f(z.querySelectorAll("[data-tippy-root]"))}function Le(e){$.clearDelayTimeouts(),e&&ae("onTrigger",[$,e]),de();var t=oe(!0),n=Q(),r=n[0],o=n[1];x.isTouch&&"hold"===r&&o&&(t=o),t?v=setTimeout((function(){$.show()}),t):$.show()}function De(e){if($.clearDelayTimeouts(),ae("onUntrigger",[$,e]),$.state.isVisible){if(!($.props.trigger.indexOf("mouseenter")>=0&&$.props.trigger.indexOf("click")>=0&&["mouseleave","mousemove"].indexOf(e.type)>=0&&V)){var t=oe(!1);t?g=setTimeout((function(){$.state.isVisible&&$.hide()}),t):h=requestAnimationFrame((function(){$.hide()}))}}else ve()}}function F(e,n){void 0===n&&(n={});var r=R.plugins.concat(n.plugins||[]);document.addEventListener("touchstart",T,t),window.addEventListener("blur",L);var o=Object.assign({},n,{plugins:r}),i=h(e).reduce((function(e,t){var n=t&&_(t,o);return n&&e.push(n),e}),[]);return 
v(e)?i[0]:i}F.defaultProps=R,F.setDefaultProps=function(e){Object.keys(e).forEach((function(t){R[t]=e[t]}))},F.currentInput=x;var W=Object.assign({},e.applyStyles,{effect:function(e){var t=e.state,n={popper:{position:t.options.strategy,left:"0",top:"0",margin:"0"},arrow:{position:"absolute"},reference:{}};Object.assign(t.elements.popper.style,n.popper),t.styles=n,t.elements.arrow&&Object.assign(t.elements.arrow.style,n.arrow)}}),X={mouseover:"mouseenter",focusin:"focus",click:"click"};var Y={name:"animateFill",defaultValue:!1,fn:function(e){var t;if(null==(t=e.props.render)||!t.$$tippy)return{};var n=S(e.popper),r=n.box,o=n.content,i=e.props.animateFill?function(){var e=d();return e.className="tippy-backdrop",y([e],"hidden"),e}():null;return{onCreate:function(){i&&(r.insertBefore(i,r.firstElementChild),r.setAttribute("data-animatefill",""),r.style.overflow="hidden",e.setProps({arrow:!1,animation:"shift-away"}))},onMount:function(){if(i){var e=r.style.transitionDuration,t=Number(e.replace("ms",""));o.style.transitionDelay=Math.round(t/10)+"ms",i.style.transitionDuration=e,y([i],"visible")}},onShow:function(){i&&(i.style.transitionDuration="0ms")},onHide:function(){i&&y([i],"hidden")}}}};var $={clientX:0,clientY:0},q=[];function z(e){var t=e.clientX,n=e.clientY;$={clientX:t,clientY:n}}var J={name:"followCursor",defaultValue:!1,fn:function(e){var t=e.reference,n=w(e.props.triggerTarget||t),r=!1,o=!1,i=!0,a=e.props;function s(){return"initial"===e.props.followCursor&&e.state.isVisible}function u(){n.addEventListener("mousemove",f)}function c(){n.removeEventListener("mousemove",f)}function p(){r=!0,e.setProps({getReferenceClientRect:null}),r=!1}function f(n){var r=!n.target||t.contains(n.target),o=e.props.followCursor,i=n.clientX,a=n.clientY,s=t.getBoundingClientRect(),u=i-s.left,c=a-s.top;!r&&e.props.interactive||e.setProps({getReferenceClientRect:function(){var e=t.getBoundingClientRect(),n=i,r=a;"initial"===o&&(n=e.left+u,r=e.top+c);var 
s="horizontal"===o?e.top:r,p="vertical"===o?e.right:n,f="horizontal"===o?e.bottom:r,l="vertical"===o?e.left:n;return{width:p-l,height:f-s,top:s,right:p,bottom:f,left:l}}})}function l(){e.props.followCursor&&(q.push({instance:e,doc:n}),function(e){e.addEventListener("mousemove",z)}(n))}function d(){0===(q=q.filter((function(t){return t.instance!==e}))).filter((function(e){return e.doc===n})).length&&function(e){e.removeEventListener("mousemove",z)}(n)}return{onCreate:l,onDestroy:d,onBeforeUpdate:function(){a=e.props},onAfterUpdate:function(t,n){var i=n.followCursor;r||void 0!==i&&a.followCursor!==i&&(d(),i?(l(),!e.state.isMounted||o||s()||u()):(c(),p()))},onMount:function(){e.props.followCursor&&!o&&(i&&(f($),i=!1),s()||u())},onTrigger:function(e,t){m(t)&&($={clientX:t.clientX,clientY:t.clientY}),o="focus"===t.type},onHidden:function(){e.props.followCursor&&(p(),c(),i=!0)}}}};var G={name:"inlinePositioning",defaultValue:!1,fn:function(e){var t,n=e.reference;var r=-1,o=!1,i=[],a={name:"tippyInlinePositioning",enabled:!0,phase:"afterWrite",fn:function(o){var a=o.state;e.props.inlinePositioning&&(-1!==i.indexOf(a.placement)&&(i=[]),t!==a.placement&&-1===i.indexOf(a.placement)&&(i.push(a.placement),e.setProps({getReferenceClientRect:function(){return function(e){return function(e,t,n,r){if(n.length<2||null===e)return t;if(2===n.length&&r>=0&&n[0].left>n[1].right)return n[r]||t;switch(e){case"top":case"bottom":var o=n[0],i=n[n.length-1],a="top"===e,s=o.top,u=i.bottom,c=a?o.left:i.left,p=a?o.right:i.right;return{top:s,bottom:u,left:c,right:p,width:p-c,height:u-s};case"left":case"right":var f=Math.min.apply(Math,n.map((function(e){return e.left}))),l=Math.max.apply(Math,n.map((function(e){return e.right}))),d=n.filter((function(t){return"left"===e?t.left===f:t.right===l})),v=d[0].top,m=d[d.length-1].bottom;return{top:v,bottom:m,left:f,right:l,width:l-f,height:m-v};default:return 
t}}(p(e),n.getBoundingClientRect(),f(n.getClientRects()),r)}(a.placement)}})),t=a.placement)}};function s(){var t;o||(t=function(e,t){var n;return{popperOptions:Object.assign({},e.popperOptions,{modifiers:[].concat(((null==(n=e.popperOptions)?void 0:n.modifiers)||[]).filter((function(e){return e.name!==t.name})),[t])})}}(e.props,a),o=!0,e.setProps(t),o=!1)}return{onCreate:s,onAfterUpdate:s,onTrigger:function(t,n){if(m(n)){var o=f(e.reference.getClientRects()),i=o.find((function(e){return e.left-2<=n.clientX&&e.right+2>=n.clientX&&e.top-2<=n.clientY&&e.bottom+2>=n.clientY})),a=o.indexOf(i);r=a>-1?a:r}},onHidden:function(){r=-1}}}};var K={name:"sticky",defaultValue:!1,fn:function(e){var t=e.reference,n=e.popper;function r(t){return!0===e.props.sticky||e.props.sticky===t}var o=null,i=null;function a(){var s=r("reference")?(e.popperInstance?e.popperInstance.state.elements.reference:t).getBoundingClientRect():null,u=r("popper")?n.getBoundingClientRect():null;(s&&Q(o,s)||u&&Q(i,u))&&e.popperInstance&&e.popperInstance.update(),o=s,i=u,e.state.isMounted&&requestAnimationFrame(a)}return{onMount:function(){e.props.sticky&&a()}}}};function Q(e,t){return!e||!t||(e.top!==t.top||e.right!==t.right||e.bottom!==t.bottom||e.left!==t.left)}return F.setDefaultProps({plugins:[Y,J,G,K],render:N}),F.createSingleton=function(e,t){var n;void 0===t&&(t={});var r,o=e,i=[],a=[],c=t.overrides,p=[],f=!1;function l(){a=o.map((function(e){return u(e.props.triggerTarget||e.reference)})).reduce((function(e,t){return e.concat(t)}),[])}function v(){i=o.map((function(e){return e.reference}))}function m(e){o.forEach((function(t){e?t.enable():t.disable()}))}function g(e){return o.map((function(t){var n=t.setProps;return t.setProps=function(o){n(o),t.reference===r&&e.setProps(o)},function(){t.setProps=n}}))}function h(e,t){var n=a.indexOf(t);if(t!==r){r=t;var s=(c||[]).concat("content").reduce((function(e,t){return 
e[t]=o[n].props[t],e}),{});e.setProps(Object.assign({},s,{getReferenceClientRect:"function"==typeof s.getReferenceClientRect?s.getReferenceClientRect:function(){var e;return null==(e=i[n])?void 0:e.getBoundingClientRect()}}))}}m(!1),v(),l();var b={fn:function(){return{onDestroy:function(){m(!0)},onHidden:function(){r=null},onClickOutside:function(e){e.props.showOnCreate&&!f&&(f=!0,r=null)},onShow:function(e){e.props.showOnCreate&&!f&&(f=!0,h(e,i[0]))},onTrigger:function(e,t){h(e,t.currentTarget)}}}},y=F(d(),Object.assign({},s(t,["overrides"]),{plugins:[b].concat(t.plugins||[]),triggerTarget:a,popperOptions:Object.assign({},t.popperOptions,{modifiers:[].concat((null==(n=t.popperOptions)?void 0:n.modifiers)||[],[W])})})),w=y.show;y.show=function(e){if(w(),!r&&null==e)return h(y,i[0]);if(!r||null!=e){if("number"==typeof e)return i[e]&&h(y,i[e]);if(o.indexOf(e)>=0){var t=e.reference;return h(y,t)}return i.indexOf(e)>=0?h(y,e):void 0}},y.showNext=function(){var e=i[0];if(!r)return y.show(0);var t=i.indexOf(r);y.show(i[t+1]||e)},y.showPrevious=function(){var e=i[i.length-1];if(!r)return y.show(e);var t=i.indexOf(r),n=i[t-1]||e;y.show(n)};var E=y.setProps;return y.setProps=function(e){c=e.overrides||c,E(e)},y.setInstances=function(e){m(!0),p.forEach((function(e){return e()})),o=e,m(!1),v(),l(),p=g(y),y.setProps({triggerTarget:a})},p=g(y),y},F.delegate=function(e,n){var r=[],o=[],i=!1,a=n.target,c=s(n,["target"]),p=Object.assign({},c,{trigger:"manual",touch:!1}),f=Object.assign({touch:R.touch},c,{showOnCreate:!0}),l=F(e,p);function d(e){if(e.target&&!i){var t=e.target.closest(a);if(t){var r=t.getAttribute("data-tippy-trigger")||n.trigger||R.trigger;if(!t._tippy&&!("touchstart"===e.type&&"boolean"==typeof f.touch||"touchstart"!==e.type&&r.indexOf(X[e.type])<0)){var s=F(t,f);s&&(o=o.concat(s))}}}}function v(e,t,n,o){void 0===o&&(o=!1),e.addEventListener(t,n,o),r.push({node:e,eventType:t,handler:n,options:o})}return u(l).forEach((function(e){var 
n=e.destroy,a=e.enable,s=e.disable;e.destroy=function(e){void 0===e&&(e=!0),e&&o.forEach((function(e){e.destroy()})),o=[],r.forEach((function(e){var t=e.node,n=e.eventType,r=e.handler,o=e.options;t.removeEventListener(n,r,o)})),r=[],n()},e.enable=function(){a(),o.forEach((function(e){return e.enable()})),i=!1},e.disable=function(){s(),o.forEach((function(e){return e.disable()})),i=!0},function(e){var n=e.reference;v(n,"touchstart",d,t),v(n,"mouseover",d),v(n,"focusin",d),v(n,"click",d)}(e)})),l},F.hideAll=function(e){var t=void 0===e?{}:e,n=t.exclude,r=t.duration;U.forEach((function(e){var t=!1;if(n&&(t=g(n)?e.reference===n:e.popper===n.popper),!t){var o=e.props.duration;e.setProps({duration:r}),e.hide(),e.state.isDestroyed||e.setProps({duration:o})}}))},F.roundArrow='',F})); 2 | 3 | -------------------------------------------------------------------------------- /docs/end.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Working Smarter with {targets} 11 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 69 | 70 | 71 | 72 | 73 | 74 | 75 |
76 |
77 | 90 |
91 | 92 |
93 | 94 | 170 |
171 | 172 | 173 | 174 |
175 | 176 | 177 | 178 | 179 |
180 |

The End

181 |

Thanks for your participation. There’s more to {targets} than we covered here, but I hope you feel you have a solid enough foundation to take some steps with it on your own projects.

182 |

My general advice is:

183 |
    184 |
  • Keep it simple! Avoid Dynamic Branching and other advanced techniques until you’re either confident with them, or they’re absolutely necessary.
  • 185 |
  • Try to keep your {targets} plans (_targets.R) really clean and high level. Don’t junk them up with a lot of implementation detail (code not in functions). 186 |
      187 |
    • This retains their value as a review / communication tool.
    • 188 |
  • 189 |
190 |

If you feel able, I would much appreciate your feedback via this short form.

191 |

If you’re working through the workshop content afterward and get stuck, or spot any problems, feel free to raise an issue on the workshop GitHub repository.

192 | 193 | 194 |
195 | 196 |
197 | 606 |
607 | 608 | 609 | 610 | 611 | --------------------------------------------------------------------------------