├── .Rbuildignore ├── .covrignore ├── .github ├── .gitignore └── workflows │ └── check-standard.yaml ├── .gitignore ├── DESCRIPTION ├── NAMESPACE ├── NEWS.md ├── R ├── comparison.R ├── data.R ├── families.R ├── get_formula.R ├── get_jagscode.R ├── get_prior.R ├── get_segment_table.R ├── is_assert.R ├── mcp-package.R ├── mcp.R ├── mcpfit_methods.R ├── misc.R ├── plot.R └── run_jags.R ├── README.md ├── cran-comments.md ├── data-raw └── ex_fit.R ├── data └── demo_fit.rda ├── inst └── CITATION ├── man ├── bernoulli.Rd ├── check_terms_in_data.Rd ├── criterion.Rd ├── cumpaste.Rd ├── demo_fit.Rd ├── exponential.Rd ├── figures │ ├── logo.png │ └── logo_200px.png ├── fitted.mcpfit.Rd ├── format_code.Rd ├── geom_cp_density.Rd ├── geom_quantiles.Rd ├── get_all_formulas.Rd ├── get_ar_code.Rd ├── get_arma_order.Rd ├── get_density.Rd ├── get_eval_at.Rd ├── get_formula_str.Rd ├── get_jags_data.Rd ├── get_jagscode.Rd ├── get_ppc_plot.Rd ├── get_prior.Rd ├── get_prior_str.Rd ├── get_quantiles.Rd ├── get_segment_table.Rd ├── get_simulate.Rd ├── get_summary.Rd ├── get_term_content.Rd ├── hypothesis.Rd ├── ilogit.Rd ├── is.mcpfit.Rd ├── logit.Rd ├── mcmclist_samples.Rd ├── mcp-package.Rd ├── mcp.Rd ├── mcp_example.Rd ├── mcpfamily.Rd ├── mcpfit-class.Rd ├── negbinomial.Rd ├── phi.Rd ├── plot.mcpfit.Rd ├── plot_pars.Rd ├── pp_check.Rd ├── pp_eval.Rd ├── predict.mcpfit.Rd ├── print.mcplist.Rd ├── print.mcptext.Rd ├── probit.Rd ├── recover_levels.Rd ├── remove_terms.Rd ├── residuals.mcpfit.Rd ├── run_jags.Rd ├── sd_to_prec.Rd ├── summary.mcpfit.Rd ├── tidy_samples.Rd ├── tidy_to_matrix.Rd ├── to_formula.Rd ├── unpack_arma.Rd ├── unpack_cp.Rd ├── unpack_int.Rd ├── unpack_rhs.Rd ├── unpack_slope.Rd ├── unpack_slope_term.Rd ├── unpack_tildes.Rd ├── unpack_varying.Rd ├── unpack_varying_term.Rd ├── unpack_y.Rd └── with_loo.Rd ├── mcp.Rproj ├── pkgdown ├── _pkgdown.yml ├── extra.css └── favicon │ ├── apple-touch-icon-120x120.png │ ├── apple-touch-icon-152x152.png │ ├── 
apple-touch-icon-180x180.png │ ├── apple-touch-icon-60x60.png │ ├── apple-touch-icon-76x76.png │ ├── apple-touch-icon.png │ ├── favicon-16x16.png │ ├── favicon-32x32.png │ └── favicon.ico ├── tests ├── testthat.R └── testthat │ ├── helper-fits.R │ ├── helper-runs-data.R │ ├── helper-runs.R │ ├── test-fits-arma.R │ ├── test-fits-gauss.R │ ├── test-fits-sigma.R │ ├── test-runs-bernoulli-binomial.R │ ├── test-runs-formulas-gauss.R │ ├── test-runs-poisson.R │ ├── test-runs-prior.R │ └── test-runs-sigma-arma.R └── vignettes ├── _figures ├── ex_ar.png ├── ex_binomial.png ├── ex_demo.png ├── ex_demo_combo.png ├── ex_fix_rel.png ├── ex_plateaus.png ├── ex_quadratic.png ├── ex_slopes.png ├── ex_trig.png ├── ex_variance.png ├── ex_varying.png ├── make_README_plots.R ├── mcp_glm_status.png ├── mcp_glm_status.xlsx ├── packages_table1.png ├── packages_table2.png ├── packages_table3.png ├── packages_tables.pdf └── packages_tables.xlsx ├── arma.Rmd ├── binomial.Rmd ├── comparison.Rmd ├── families.Rmd ├── formulas.Rmd ├── packages.Rmd ├── poisson.Rmd ├── predict.Rmd ├── priors.Rmd ├── tips.Rmd ├── variance.Rmd └── varying.Rmd /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^renv$ 2 | ^renv\.lock$ 3 | # Hidden stuff 4 | ^mcp\.Rproj$ 5 | ^\.Rproj\.user$ 6 | .covrignore 7 | 8 | # Folders 9 | docs 10 | ^man/figures 11 | ^vignettes$ 12 | ^pkgdown$ 13 | ^data-raw$ 14 | ^revdep$ 15 | 16 | # Files 17 | cran-comments.md 18 | logo.png 19 | ^CRAN-RELEASE$ 20 | fix_twittercard.R 21 | ^\.github$ 22 | -------------------------------------------------------------------------------- /.covrignore: -------------------------------------------------------------------------------- 1 | vignettes 2 | docs 3 | data 4 | data-raw 5 | inst 6 | man 7 | pkgdown 8 | tests 9 | vignettes 10 | fix_twittercard.R 11 | -------------------------------------------------------------------------------- /.github/.gitignore: 
-------------------------------------------------------------------------------- 1 | *.html 2 | -------------------------------------------------------------------------------- /.github/workflows/check-standard.yaml: -------------------------------------------------------------------------------- 1 | # Workflow derived from https://github.com/r-lib/actions/blob/v2-branch/examples/check-standard.yaml 2 | 3 | on: 4 | push: 5 | branches: [main, master] 6 | pull_request: 7 | branches: [main, master] 8 | workflow_dispatch: 9 | 10 | name: R-CMD-check 11 | 12 | jobs: 13 | R-CMD-check: 14 | runs-on: ${{ matrix.config.os }} 15 | 16 | name: ${{ matrix.config.os }} (${{ matrix.config.r }}) 17 | 18 | strategy: 19 | fail-fast: false 20 | matrix: 21 | config: 22 | - {os: macos-latest, r: 'release'} 23 | - {os: windows-latest, r: 'release'} 24 | - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'} 25 | - {os: ubuntu-latest, r: 'release'} 26 | - {os: ubuntu-latest, r: 'oldrel-1'} 27 | 28 | env: 29 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 30 | R_KEEP_PKG_SOURCE: yes 31 | 32 | steps: 33 | - uses: actions/checkout@v4 34 | 35 | - uses: r-lib/actions/setup-pandoc@v2 36 | 37 | # JAGS install. Inspired by https://github.com/boydorr/SpARKjags/blob/master/.github/workflows/test-build.yaml 38 | # but updated with ignoring certificates and redirects. 39 | # JAGS is automatically installed on linux via the rjags package. 
40 | - name: Install JAGS (if Windows) 41 | if: runner.os == 'Windows' 42 | run: | 43 | curl.exe -o wjags.exe -L0 -k --url https://downloads.sourceforge.net/project/mcmc-jags/JAGS/4.x/Windows/JAGS-4.3.1.exe 44 | wjags.exe /S 45 | del wjags.exe 46 | shell: cmd 47 | 48 | - name: Install JAGS (if macOS) 49 | if: runner.os == 'macOS' 50 | # This worked before R 4.3: brew install jags 51 | run: | 52 | curl -o wjags.pkg -L0 -k --url https://downloads.sourceforge.net/project/mcmc-jags/JAGS/4.x/Mac%20OS%20X/JAGS-4.3.2.pkg 53 | sudo installer -pkg wjags.pkg -target / 54 | rm wjags.pkg 55 | 56 | # R 57 | - uses: r-lib/actions/setup-r@v2 58 | with: 59 | r-version: ${{ matrix.config.r }} 60 | http-user-agent: ${{ matrix.config.http-user-agent }} 61 | use-public-rspm: true 62 | 63 | - uses: r-lib/actions/setup-r-dependencies@v2 64 | with: 65 | extra-packages: any::rcmdcheck, any::covr 66 | needs: check, coverage 67 | 68 | - uses: r-lib/actions/check-r-package@v2 69 | with: 70 | upload-snapshots: true 71 | 72 | # Run test coverage on the fastest-to-run job 73 | - name: Test coverage 74 | if: matrix.config.os == 'macos-latest' && matrix.config.r == 'release' 75 | run: | 76 | covr::codecov( 77 | quiet = FALSE, 78 | clean = FALSE, 79 | install_path = file.path(Sys.getenv("RUNNER_TEMP"), "package") 80 | ) 81 | shell: Rscript {0} 82 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .Rprofile 4 | fix_twittercard.R 5 | vignettes/arma_cache 6 | vignettes/binomial_cache 7 | vignettes/comparison_cache 8 | vignettes/families_cache 9 | vignettes/formulas_cache 10 | vignettes/packages_cache 11 | vignettes/poisson_cache 12 | vignettes/predict_cache 13 | vignettes/priors_cache 14 | vignettes/tips_cache 15 | vignettes/variance_cache 16 | vignettes/varying_cache 17 | 
-------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: mcp 2 | Title: Regression with Multiple Change Points 3 | Version: 0.3.4 4 | Date: 2024-03-14 5 | URL: https://lindeloev.github.io/mcp/ 6 | BugReports: https://github.com/lindeloev/mcp/issues 7 | Authors@R: 8 | person(given = "Jonas Kristoffer", 9 | family = "Lindeløv", 10 | role = c("aut", "cre"), 11 | email = "jonas@lindeloev.dk", 12 | comment = c(ORCID = "0000-0003-4565-0595")) 13 | Description: Flexible and informed regression with Multiple Change Points. 'mcp' can infer change points in means, variances, autocorrelation structure, and any combination of these, as well as the parameters of the segments in between. All parameters are estimated with uncertainty and prediction intervals are supported - also near the change points. 'mcp' supports hypothesis testing via Savage-Dickey density ratio, posterior contrasts, and cross-validation. 'mcp' is described in Lindeløv (submitted) and generalizes the approach described in Carlin, Gelfand, & Smith (1992) and Stephens (1994) . 
14 | License: GPL-2 15 | Encoding: UTF-8 16 | Language: en-US 17 | LazyData: true 18 | RoxygenNote: 7.3.1 19 | Roxygen: list(markdown = TRUE) 20 | Depends: R (>= 3.5.0) 21 | Imports: 22 | parallel, 23 | future (>= 1.16), 24 | future.apply (>= 1.4), 25 | rjags (>= 4.9), 26 | coda (>= 0.19.3), 27 | loo (>= 2.1.0), 28 | bayesplot (>= 1.7.0), 29 | tidybayes (>= 3.0.0), 30 | dplyr (>= 1.1.1), 31 | magrittr (>= 1.5), 32 | tidyr (>= 1.0.0), 33 | tidyselect (>= 0.2.5), 34 | tibble (>= 2.1.3), 35 | stringr (>= 1.4.0), 36 | ggplot2 (>= 3.2.1), 37 | patchwork (>= 1.0.0), 38 | stats, 39 | rlang (>= 0.4.1) 40 | Suggests: 41 | hexbin, 42 | testthat (>= 3.1.0), 43 | purrr (>= 0.3.4), 44 | knitr, 45 | rmarkdown 46 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | S3method(fitted,mcpfit) 4 | S3method(loo,mcpfit) 5 | S3method(plot,mcpfit) 6 | S3method(predict,mcpfit) 7 | S3method(print,mcpfit) 8 | S3method(print,mcplist) 9 | S3method(print,mcptext) 10 | S3method(residuals,mcpfit) 11 | S3method(summary,mcpfit) 12 | S3method(waic,mcpfit) 13 | export(bernoulli) 14 | export(criterion) 15 | export(exponential) 16 | export(fixef) 17 | export(get_segment_table) 18 | export(hypothesis) 19 | export(ilogit) 20 | export(is.mcpfit) 21 | export(logit) 22 | export(loo) 23 | export(mcp) 24 | export(mcp_example) 25 | export(negbinomial) 26 | export(phi) 27 | export(plot_pars) 28 | export(pp_check) 29 | export(probit) 30 | export(ranef) 31 | export(sd_to_prec) 32 | export(waic) 33 | import(patchwork) 34 | importFrom(dplyr,.data) 35 | importFrom(ggplot2,aes) 36 | importFrom(ggplot2,facet_wrap) 37 | importFrom(ggplot2,geom_line) 38 | importFrom(ggplot2,geom_point) 39 | importFrom(ggplot2,ggplot) 40 | importFrom(loo,loo) 41 | importFrom(loo,waic) 42 | importFrom(magrittr,"%>%") 43 | importFrom(rlang,"!!") 44 | 
importFrom(rlang,":=") 45 | importFrom(stats,binomial) 46 | importFrom(stats,gaussian) 47 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | # mcp 0.3.4 2 | 3 | This is a bug fix release. 4 | 5 | ## Bug fixes 6 | 7 | * Now respects the `cores` argument to `mcp()`. 8 | * Document all function arguments and remove documentation for removed arguments. 9 | 10 | 11 | 12 | # mcp 0.3.3 13 | 14 | This is a bug fix release. 15 | 16 | ## Bug fixes 17 | 18 | * Support `ggplot2 >= 3.4.0`, `tidyselect >= 1.2.0`, and newer `future` by replacing deprecated functions. 19 | * Accept `mcp(..., cores = "all")`. 20 | * Fix documentation of `iter` argument to `mcp()`. 21 | * Other small fixes to deployment and documentation. 22 | 23 | 24 | 25 | # mcp 0.3.2 26 | 27 | This release contains no user-facing changes. The test suite is now compatible with dplyr 1.0.8, which had caused it to fail. This, in turn, would have triggered the removal of mcp from CRAN. 28 | 29 | 30 | 31 | # mcp 0.3.1 32 | 33 | This is mostly a bug fix release. 34 | 35 | ## New features: 36 | 37 | * `ex = mcp_example("demo", with_fit = TRUE)` is the new interface that replaces the `ex_*` datasets in prior versions. This reduces clutter of the namespace/documentation and the size of the package. It also gives the user richer details on the simulation and analyses. For "demo", the `ex_demo` dataset is now `ex$data` and the `ex_fit` is `ex$fit`. 38 | 39 | * Nicer printing of lists and texts all over. E.g., try `print(demo_fit$jags_code)` and `print(demo_fit$pars)`. 40 | 41 | 42 | ## Bug fixes 43 | 44 | * Support breaking changes in `tidybayes >= 3.0.0` and `dplyr >= 1.0.6` 45 | 46 | 47 | 48 | 49 | # mcp 0.3.0 50 | 51 | ## New features: 52 | 53 | * Get fits and predictions for in-sample and out-of-sample data. 
[Read more in the article on these functions](https://lindeloev.github.io/mcp/articles/predict.html). 54 | - Use `predict(fit)` to get predicted values and quantiles. 55 | - Use `fitted(fit)` to get estimated values and quantiles. 56 | - Use `residuals(fit)` to get residuals and quantiles. 57 | 58 | All of the above functions include many arguments that align with (and extend) the options already in `plot.mcpfit()`, including getting fits/predictions for sigma (`which_y = "sigma"`), for the prior (`prior = TRUE`), and arbitrary quantiles (`probs = c(0.1, 0.5, 0.999)`). Use the `newdata` argument to get out-of-sample fitted/predicted values. Set `summary = FALSE` to get per-draw values. 59 | 60 | * Added support for weighted regression for gaussian families: `model = list(y | weights(weight_column) ~ 1 + x)`. Weights are visualized as dot sizes in `plot(fit)`. 61 | 62 | * Support for more link functions across families (e.g., `family = gaussian(link = "log")`): 63 | - `gaussian`: "identity", "log" 64 | - `binomial`: "logit", "probit", "identity" 65 | - `bernoulli`: "logit", "probit", "identity" 66 | - `poisson`: "log", "identity" 67 | 68 | * New argument `scale` in `fitted()`, `plot()`, and `fit$simulate()`. When `scale = "response"` (default), they return fits on the observed scale. When `scale = "linear"`, they return fits on the parameter scale where the linear trends are. Useful for model understanding and debugging. 69 | 70 | * Use `pp_check(fit)` to do prior/posterior predictive checking. See `pp_check(fit, type = "x")` for a list of plot types. `pp_check(fit, facet_by = "varying_column")` facets by a data column. 71 | 72 | * Improvements to `plot()`: 73 | - Change point densities are now computed on a per-panel basis in `plot(fit, facet_by = "varying_column")`. Previous releases only displayed population-level change points. 74 | - You can now plot varying effects with `rate = FALSE` for binomial models. 
75 | - Change point densities in `plot(fit)` are now located directly on the x-axis. They were "floating" 5% above the x-axis in the previous releases. 76 | 77 | * New argument `nsamples` reduces the number of samples used in most functions to speed up processing. `nsamples = NULL` uses all samples for maximum accuracy. 78 | 79 | * New argument `arma` in many functions toggles whether autoregressive effects should be modelled. 80 | 81 | * Although the API is still in alpha, feel free to try extracting samples using `mcp:::tidy_samples(fit)`. This is useful for further processing using `tidybayes`, `bayesplot`, etc. and is used extensively internally in `mcp`. One useful feature is computing absolute values for varying change points: `mcp:::tidy_samples(fit, population = FALSE, absolute = TRUE)`. Feedback is appreciated before `tidy_samples` becomes part of the `mcp` API in a future release. 82 | 83 | 84 | ## Other changes 85 | 86 | * Change point densities in `plot(fit)` are now scaled to 20% of the plot for each chain x changepoint combo. This addresses a common problem where a wide posterior was almost invisibly low when a narrow posterior was present. This means that heights should only be compared *within* each chain x changepoint combo - not across. 87 | * Removed the implicit ceiling of 1000 lines and samples in `plot.mcpfit()`. 88 | * Rownames are removed from `ranef()` and `fixef()` returns. 89 | * A major effort has been put into making `mcp` robust and agile to develop. `mcp` now uses defensive programming with helpful error messages. The test suite includes 3600+ tests. 90 | * `plot()`, `predict()`, etc. are now considerably faster for AR(N) due to vectorization of the underlying code. 91 | 92 | 93 | ## Bug fixes 94 | 95 | * Sigma is now forced to stay positive via a floor at 0. 96 | * Fixed: support and require dplyr 1.0.0. Now also requires tidybayes 2.0.3. 97 | * Fixed: Parallel sampling sometimes produced identical chains. 
98 | * Fixed several small bugs 99 | 100 | 101 | # mcp 0.2.0 102 | The API and internal structure should be stable now. v0.2.0 will be released on CRAN. 103 | 104 | ## New features: 105 | 106 | * Model quadratic and other terms using `I(x^2)`, `I(x^3.24)`, `sin(x)`, `sqrt(x)`, etc. 107 | * Model variance for `family = gaussian()` using `~ sigma([formula here])`. 108 | * Model Nth order autoregressive models using `~ ar(order, formula)`, typically like `y ~ 1 + x + ar(2)` for AR(2). Simulate AR(N) models from scratch or given known data with `fit$simulate()`. The [article on AR(N)](https://lindeloev.github.io/mcp/articles/arma.html) has more details and examples. AR(N) models are popular to detect changes in time-series. 109 | * Many updates to `plot()`. 110 | - Includes the posterior densities of the change point(s). Disable using `plot(fit, cp_dens = FALSE)`. 111 | - Supports AR(N) models (see above). 112 | - Plot posterior parameter intervals using `plot(fit, q_fit = TRUE)`. `plot(fit, q_fit = c(0.025, 0.5, 0.975))` plots 95% HDI and the median. 113 | - Plot prediction intervals using `plot(fit, q_predict = TRUE)`. 114 | - Choose data geom. Currently takes "point" (default) and "line" (`plot(fit, geom_data = "line")`). The latter is useful for time series. Disable using `geom_data = FALSE`. 115 | * Use `options(mc.cores = 3)` for considerable speed gains for the rest of the session. All vignettes/articles have been updated to recommend this as a default, though serial sampling is still the technical default. `mcp(..., cores = 3)` does the same thing on a call-by-call basis. 116 | * `fit$simulate()` adds the simulation parameters as an attribute (`attr(y, "simulate")`) to the predicted variable. `summary()` recognizes this and adds the simulated values to the results table (columns `sim` and `match`) so that one can inspect whether the values were recovered. 117 | * Use `plot(fit, which_y = "sigma")` to plot the residual standard deviation on the y-axis. 
It works for AR(N) as well, e.g., `which_y = "ar1"`, `which_y = "ar2"`, etc. This is useful to visualize change points in variance and autocorrelation. The vignettes on variance and autocorrelations have been updated with worked examples. 118 | * Much love for the priors: 119 | - Set a Dirichlet prior on the change points using `prior = list(cp_1 = "dirichlet(1)", cp_2 = ...)`. [Read pros and cons here](https://lindeloev.github.io/mcp/articles/priors.html). 120 | - The default prior has been changed from "truncated-uniforms" to a "t-tail" prior to be more uninformative while still sampling effectively. [Read more here](https://lindeloev.github.io/mcp/articles/priors.html). 121 | - You can now sample the prior using `mcp(..., sample = "prior")` or `mcp(..., sample = "both")` and most methods can now take the prior: `plot(fit, prior = TRUE)`, `plot_pars(fit, prior = TRUE)`, `summary(fit, prior = TRUE)`, `ranef(fit, prior = TRUE)`. 122 | * `mcp` can now be cited! Call `citation("mcp")` or see the pre-print here: [https://osf.io/fzqxv](https://osf.io/fzqxv). 123 | 124 | ## Other changes: 125 | * Some renaming: "segments" --> "model". `fit$func_y()` --> `fit$simulate()`. 126 | * `plot()` now only visualizes the total fit while `plot_pars()` only visualizes individual parameters. These functions were mixed in `plot()` previously. 127 | * The argument `update` has been discarded from `mcp()` (it's all on `adapt` now) and `inits` has been added. 128 | * Many internal changes to make `mcp` more future proof. The biggest internal change is that `rjags` and `future` replace the `dclone` package. Among other things, this gives faster and cleaner installations. 129 | * Many more informative error messages to help you quickly understand and solve errors. 130 | * Updated documentation and website. 131 | 132 | 133 | # mcp 0.1.0 134 | First public release. 135 | 136 | * Varying change points 137 | * Basic GLM: Gaussian, binomial, Bernoulli, and Poisson, and associated vignettes. 
138 | * summary(fit), fixef(fit), and ranef(fit) 139 | * plot(fit, "segments") and plot(fit, "bayesplot-name-here") with some options 140 | * 1000+ basic unit tests to ensure non-breaking code for a wide variety of models. 141 | * Testing and model comparison using `loo` and `hypothesis` 142 | -------------------------------------------------------------------------------- /R/data.R: -------------------------------------------------------------------------------- 1 | #' Example `mcpfit` for examples 2 | #' 3 | #' This was generated using `mcp_example("demo", sample = TRUE)`. 4 | #' 5 | #' @format An \code{\link{mcpfit}} object. 6 | #' 7 | "demo_fit" 8 | -------------------------------------------------------------------------------- /R/families.R: -------------------------------------------------------------------------------- 1 | #' Bernoulli family for mcp 2 | #' 3 | #' @aliases bernoulli 4 | #' @param link Link function. 5 | #' @export 6 | #' 7 | bernoulli = function(link = "logit") { 8 | assert_value(link, allowed = c("identity", "logit", "probit")) 9 | 10 | # Just copy binomial() 11 | family = binomial(link = link) 12 | family$family = "bernoulli" 13 | mcpfamily(family) 14 | } 15 | 16 | #' Exponential family for mcp 17 | #' 18 | #' @aliases exponential 19 | #' @param link Link function (Character). 20 | #' @export 21 | #' 22 | exponential = function(link = "identity") { 23 | assert_value(link, allowed = c("identity")) 24 | 25 | family = list( 26 | family = "exponential", 27 | link = link # on lambda 28 | ) 29 | class(family) = "family" 30 | family = mcpfamily(family) 31 | } 32 | 33 | 34 | #' Negative binomial for mcp 35 | #' 36 | #' Parameterized as `mu` (mean; poisson lambda) and `size` (a shape parameter), 37 | #' so you can do `rnbinom(10, mu = 10, size = 1)`. Read more in the doc for `rnbinom`. 38 | #' 39 | #' @aliases negbinomial 40 | #' @param link Link function (Character). 
41 | #' @export 42 | negbinomial = function(link = "log") { 43 | assert_value(link, allowed = c("log", "identity")) 44 | 45 | family = list( 46 | family = "negbinomial", 47 | link = link # on lambda 48 | ) 49 | class(family) = "family" 50 | family = mcpfamily(family) 51 | } 52 | 53 | 54 | #' Add A family object to store link functions between R and JAGS. 55 | #' 56 | #' This will make more sense once more link functions / families are added. 57 | #' 58 | #' @aliases mcpfamily 59 | #' @keywords internal 60 | #' @param family A family object, e.g., `binomial(link = "identity")`. 61 | mcpfamily = function(family) { 62 | # Set linkfun_str 63 | if (family$link == "identity") { 64 | family$linkfun_str = "" 65 | } else { 66 | family$linkfun_str = family$link 67 | } 68 | 69 | # Set linkinv_str 70 | family$linkinv_str = switch( 71 | family$link, 72 | logit = "ilogit", 73 | probit = "phi", 74 | log = "exp", 75 | identity = "" 76 | ) 77 | 78 | if (rlang::has_name(family, "linkfun") == FALSE) 79 | family$linkfun = eval(parse(text = family$link)) 80 | if (rlang::has_name(family, "linkinv") == FALSE) 81 | family$linkinv = eval(parse(text = family$linkinv_str)) 82 | 83 | return(family) 84 | } 85 | 86 | 87 | 88 | #' Logit function 89 | #' 90 | #' @aliases logit 91 | #' @param mu A vector of probabilities (0.0 to 1.0) 92 | #' @return A vector with same length as `mu` 93 | #' @export 94 | logit = stats::binomial(link = "logit")$linkfun 95 | 96 | #' Inverse logit function 97 | #' 98 | #' @aliases ilogit 99 | #' @param eta A vector of logits 100 | #' @return A vector with same length as `eta` 101 | #' @export 102 | ilogit = stats::binomial(link = "logit")$linkinv 103 | 104 | 105 | #' Probit function 106 | #' 107 | #' @aliases probit 108 | #' @param mu A vector of probabilities (0.0 to 1.0) 109 | #' @return A vector with same length as `mu` 110 | #' @export 111 | probit = stats::binomial(link = "probit")$linkfun 112 | 113 | 114 | #' Inverse probit function 115 | #' 116 | #' @aliases 
phi 117 | #' @param eta A vector of probits 118 | #' @return A vector with same length as `eta` 119 | #' @export 120 | phi = stats::binomial(link = "probit")$linkinv 121 | -------------------------------------------------------------------------------- /R/is_assert.R: -------------------------------------------------------------------------------- 1 | # ABOUT: These functions are used internally for defensive programming. 2 | # ----------------- 3 | 4 | # Synonym so that assert_types(x, "tibble", "formula") works 5 | is.tibble = tibble::is_tibble 6 | is.formula = rlang::is_formula 7 | 8 | # Asserts whether x is an `mcpfit` 9 | assert_mcpfit = function(x) { 10 | if (!is.mcpfit(x)) 11 | stop("Expected `mcpfit` but got: ", class(x)) 12 | } 13 | 14 | # Asserts whether x contains non-numeric, decimal, or less-than-lower 15 | assert_integer = function(x, name = NULL, lower = -Inf) { 16 | # Recode 17 | if (is.null(name)) 18 | name = substitute(x) 19 | x = stats::na.omit(x) 20 | 21 | # Do checks 22 | greater_than = ifelse(lower == -Inf, " ", paste0(" >= ", lower, " ")) 23 | if (!is.numeric(x)) 24 | stop("Only integers", greater_than, "allowed for '", name, "'. Got ", x) 25 | 26 | if (!all(x == floor(x)) || !all(x >= lower)) 27 | stop("Only integers", greater_than, "allowed for '", name, "'. Got ", x) 28 | 29 | TRUE 30 | } 31 | 32 | # Asserts whether x is logical 33 | assert_logical = function(x, max_length = 1) { 34 | if (!is.logical(x)) 35 | stop("`", substitute(x), "` must be logical (TRUE or FALSE). Got ", x) 36 | 37 | if (length(x) > max_length) 38 | stop("`", substitute(x), "` must be at most length ", max_length, ". 
Got length ", length(x)) 39 | } 40 | 41 | # Asserts whether x is one of a set of allowed values 42 | assert_value = function(x, allowed = c()) { 43 | if (!(x %in% allowed)) { 44 | allowed[is.character(allowed)] = paste0("'", allowed[is.character(allowed)], "'") # Add quotes for character values 45 | stop("`", substitute(x), "` must be one of ", paste0(allowed, collapse = ", "), ". Got ", x) 46 | } 47 | } 48 | 49 | 50 | # Asserts whether x is one of a set of allowed types. 51 | # e.g., `assert_types(vec, "numeric", "character", "foo")` 52 | assert_types = function(x, ...) { 53 | types = list(...) 54 | 55 | # Test each function on x 56 | passed = logical(length(types)) 57 | for (i in seq_along(types)) { 58 | is.type = eval(parse(text = paste0("is.", types[[i]]))) # From character to is.foo() function 59 | passed[i] = is.type(x) 60 | } 61 | 62 | # Return helpful error 63 | if (!any(passed == TRUE)) 64 | stop("`", substitute(x), "` must be ", paste0(types, collapse = " or "), ". Got ", class(x)) 65 | } 66 | 67 | # Asserts whether x is numeric in range 68 | assert_numeric = function(x, lower = -Inf, upper = Inf) { 69 | if (!is.numeric(x)) 70 | stop("`", substitute(x), "` must be numeric. Got ", class(x)) 71 | if (any(x < lower) || any(x > upper)) 72 | stop("`", substitute(x), "` contained value(s) outside the interval (", lower, ", ", upper, ").") 73 | } 74 | 75 | # Asserts ellipsis. 
`ellipsis` is a list and `allowed` is a character vector 76 | assert_ellipsis = function(..., allowed = NULL) { 77 | assert_types(allowed, "null", "character") 78 | illegal_names = dplyr::setdiff(names(list(...)), allowed) 79 | if (length(illegal_names) > 0) 80 | stop("The following arguments are not accepted for this function: '", paste0(illegal_names, collapse = "', '"), "'") 81 | } 82 | -------------------------------------------------------------------------------- /R/mcp-package.R: -------------------------------------------------------------------------------- 1 | #' @keywords internal 2 | "_PACKAGE" 3 | -------------------------------------------------------------------------------- /R/run_jags.R: -------------------------------------------------------------------------------- 1 | #' Run parallel MCMC sampling using JAGS. 2 | #' 3 | #' 4 | #' @aliases run_jags 5 | #' @keywords internal 6 | #' @inheritParams mcp 7 | #' @inheritParams rjags::jags.model 8 | #' @inheritParams rjags::coda.samples 9 | #' @param jags_code A string. JAGS model, usually returned by `get_jagscode()`. 10 | #' @param pars Character vector of parameters to save/monitor. 11 | #' @param ST A segment table (tibble), returned by `get_segment_table`. 12 | #' Only really used when the model contains varying effects. 13 | #' @return `mcmc.list` 14 | #' @encoding UTF-8 15 | #' @author Jonas Kristoffer Lindeløv \email{jonas@@lindeloev.dk} 16 | #' 17 | 18 | run_jags = function(data, 19 | jags_code, 20 | pars, 21 | ST, 22 | cores, 23 | sample, 24 | n.chains, 25 | n.iter, 26 | n.adapt, 27 | inits 28 | ) { 29 | 30 | # Prevent failure of all mcp methods when length(pars) <= 2 (one parameter + 31 | # loglik_). This always happens when there is only one parameter, so we just 32 | # save samples from the dummy change points. 33 | if (length(pars) <= 2) 34 | pars = c(pars, "cp_0", "cp_1") 35 | 36 | # Set number of cores from "all" or mc.cores if `cores` is not specified. 37 | # Max at 2 for CRAN etc. 
38 | opts_cores = options()$mc.cores 39 | if (is.numeric(opts_cores) && cores == 1) 40 | cores = opts_cores 41 | if (cores == "all") { 42 | cores = parallel::detectCores() - 1 43 | n.chains = cores 44 | } 45 | if (Sys.getenv("_R_CHECK_LIMIT_CORES_", "") == "TRUE") { 46 | if (cores > 1) 47 | cores = 2 48 | } 49 | 50 | # Get data ready... 51 | jags_data = get_jags_data(data, ST, jags_code, sample) 52 | 53 | # Start timer 54 | timer = proc.time() 55 | 56 | # Define the sampling function in this environment. 57 | # Can be used sequentially or in parallel. 58 | do_sampling = function(seed, n.chains, quiet) { 59 | # Optionally seed JAGS. Typically for parallel processing to avoid risk of identical seeds. 60 | if (!is.null(seed)) 61 | inits = c(inits, list(.RNG.name = "base::Wichmann-Hill", .RNG.seed = seed)) 62 | 63 | # Compile model 64 | jm = rjags::jags.model( 65 | file = textConnection(jags_code), 66 | data = jags_data, 67 | inits = inits, 68 | n.chains = n.chains, 69 | n.adapt = n.adapt, 70 | quiet = quiet 71 | ) 72 | 73 | # Sample and return 74 | rjags::coda.samples( 75 | model = jm, 76 | variable.names = pars, 77 | n.iter = n.iter, 78 | quiet = quiet 79 | ) 80 | } 81 | 82 | # Time for sampling! 83 | if (cores == 1) { 84 | # # SERIAL 85 | samples = try(do_sampling( 86 | seed = NULL, 87 | n.chains = n.chains, 88 | quiet = FALSE 89 | )) 90 | 91 | } else if (cores == "all" || cores > 1) { 92 | # PARALLEL using the future package and one chain per worker 93 | message("Parallel sampling in progress...") 94 | future::plan(future::multisession, workers = cores, .skip = TRUE) 95 | samples = future.apply::future_lapply( 96 | sample(1:1000, n.chains), # Random seed to JAGS 97 | n.chains = 1, 98 | quiet = TRUE, 99 | FUN = do_sampling, 100 | future.seed = TRUE 101 | ) 102 | 103 | # Get result as mcmc.list 104 | samples = unlist(samples, recursive = FALSE) 105 | class(samples) = "mcmc.list" 106 | } 107 | 108 | # Sampling finished. 
# Recover the levels of varying effects if it succeeded 109 | if (coda::is.mcmc.list(samples)) { 110 | for (i in seq_len(nrow(ST))) { 111 | S = ST[i, ] 112 | if (!is.na(S$cp_group_col)) { 113 | samples = recover_levels(samples, data, S$cp_group, S$cp_group_col) 114 | } 115 | } 116 | 117 | # Return 118 | passed = proc.time() - timer 119 | message("Finished sampling in ", round(passed["elapsed"], 1), " seconds\n") 120 | return(samples) 121 | 122 | } else { 123 | # If it didn't succeed, quit gracefully. 124 | warning("--------------\nJAGS failed with the above error. Returning an `mcpfit` without samples. Inspect fit$prior and fit$jags_code to identify the problem. Read about typical problems and fixes here: https://lindeloev.github.io/mcp/articles/tips.html.") 125 | return(NULL) 126 | } 127 | } 128 | 129 | 130 | #' Adds helper variables for use in `run_jags` 131 | #' 132 | #' Returns the relevant data columns as a list and adds elements with unique 133 | #' varying group levels. 134 | #' 135 | #' @aliases get_jags_data 136 | #' @keywords internal 137 | #' @inheritParams run_jags 138 | #' @param data A tibble 139 | #' @param ST A segment table (tibble), returned by `get_segment_table`. 140 | 141 | get_jags_data = function(data, ST, jags_code, sample) { 142 | cols_varying = unique(stats::na.omit(ST$cp_group_col)) 143 | 144 | # Start with "raw" data 145 | cols_data = unique(stats::na.omit(c(ST$y, ST$x, ST$trials, ST$weights))) 146 | jags_data = as.list(data[, c(cols_varying, cols_data)]) 147 | 148 | for (col in cols_varying) { 149 | # Add meta-data (how many varying group levels) 150 | tmp = paste0("n_unique_", col) 151 | jags_data[[tmp]] = length(unique(data[, col])) 152 | 153 | # Make varying columns numeric in order of appearance 154 | # They will be recovered using recover_levels() 155 | jags_data[[col]] = as.numeric(factor(jags_data[[col]], levels = unique(jags_data[[col]]))) 156 | } 157 | 158 | 159 | # Add e.g. 
MINX = min(data$x) for all variables where they are used. 160 | # Searches whether it is in jags_code. If yes, add to jags_data 161 | # TO DO: there must be a prettier way to do this. 162 | funcs = c("min", "max", "sd", "mean") 163 | xy_vars = c("x", "y") 164 | for (func in funcs) { 165 | for (xy_var in xy_vars) { 166 | constant_name = toupper(paste0(func, xy_var)) 167 | if (stringr::str_detect(jags_code, constant_name)) { 168 | func_eval = eval(parse(text = func)) # as real function 169 | column = ST[, xy_var][[1]][1] 170 | jags_data[[constant_name]] = func_eval(data[, column], na.rm = TRUE) 171 | } 172 | } 173 | } 174 | 175 | # For default prior 176 | if (stringr::str_detect(jags_code, "N_CP")) 177 | jags_data$N_CP = nrow(ST) - 1 178 | 179 | # Set response = NA if we only sample prior 180 | if (sample == "prior") 181 | jags_data[[ST$y[1]]] = rep(NA, nrow(data)) 182 | 183 | # Return 184 | jags_data 185 | } 186 | 187 | 188 | 189 | #' Recover the levels of varying effects in mcmc.list 190 | #' 191 | #' Jags uses 1, 2, 3, ..., etc. for indexing of varying effects. 192 | #' This function adds back the original levels, whether numeric or string 193 | #' 194 | #' @aliases recover_levels 195 | #' @keywords internal 196 | #' @param samples An mcmc.list with varying columns starting in `mcmc_col`. 197 | #' @param data A tibble or data.frame with the cols in `data_col`. 198 | #' @param mcmc_col A vector of strings. 199 | #' @param data_col A vector of strings. 
Has to be the same length as `mcmc_col`. 200 | #' 201 | recover_levels = function(samples, data, mcmc_col, data_col) { 202 | # Get vectors of old ("from") and replacement column names in samples 203 | from = colnames(samples[[1]])[stringr::str_starts(colnames(samples[[1]]), paste0(mcmc_col, '\\['))] # Current column names 204 | to = sprintf(paste0(mcmc_col, '[%s]'), unique(data[, data_col])) # Desired column names 205 | 206 | # Recode column names on each list (chain) using lapply 207 | names(to) = from 208 | lapply(samples, function(x) { 209 | colnames(x) = dplyr::recode(colnames(x), !!!to) 210 | x 211 | }) 212 | } 213 | -------------------------------------------------------------------------------- /cran-comments.md: -------------------------------------------------------------------------------- 1 | # mcp 0.3.4 2 | 3 | ## Notes for the reviewer 4 | 5 | * This fixes NOTEs about arguments with missing documentation. 6 | 7 | ## R CMD check results 8 | There were no ERRORs or WARNINGs except the usual artificial ones (see "Expected 9 | NOTEs and ERRORs" at the bottom of this file). 10 | 11 | ## Test environments 12 | Github Actions: oldrel, release, and devel on Ubuntu. 13 | Github Actions: release on MacOS and Windows. 14 | 15 | ## Downstream dependencies 16 | We checked 1 reverse dependency, comparing R CMD check results across CRAN and dev versions of this package. 17 | 18 | * We saw 0 new problems 19 | * We failed to check 0 packages 20 | 21 | 22 | 23 | 24 | # mcp 0.3.3 25 | 26 | ## Notes for the reviewer 27 | 28 | * This is a bug fix release, mostly for compatibility with breaking changes in 29 | dependencies, including `ggplot2 3.4.0` as you kindly notified in a GitHub 30 | issue. 31 | 32 | ## R CMD check results 33 | There were no ERRORs or WARNINGs except the usual artificial ones (see "Expected 34 | NOTEs and ERRORs" at the bottom of this file). 35 | 36 | ## Test environments 37 | Github Actions: oldrel, release, and devel on Ubuntu.
38 | Github Actions: release on MacOS and Windows. 39 | 40 | ## Downstream dependencies 41 | `mcp` has no downstream dependencies. 42 | 43 | 44 | 45 | # mcp 0.3.2 46 | 47 | ## Notes for the reviewer 48 | * This release makes the test suite compatible with dplyr 1.0.8. There are no user-facing changes. 49 | 50 | ## R CMD check results 51 | There were no ERRORs or WARNINGs except the usual artificial ones (see "Expected NOTEs and ERRORs"). 52 | 53 | ## Test environments 54 | rhub 55 | oldrel, release, and devel on macOS, Linux, and Windows 56 | 57 | ## Downstream dependencies 58 | `mcp` has no downstream dependencies. 59 | 60 | 61 | # Resubmission 62 | This is a resubmission. I have corrected the invalid URLs in the README. 63 | 64 | 65 | # mcp 0.3.1 66 | 67 | ## Notes for the reviewer 68 | * This is a patch release that fixes breaking changes in dependencies. 69 | 70 | * Please see the unpreventable "Expected NOTEs and ERRORs" at the bottom of this file. 71 | 72 | ## R CMD check results 73 | There were no ERRORs or WARNINGs. 74 | 75 | The DESCRIPTION elicits a few NOTEs on rhub: 76 | * An incorrect NOTE about misspelled words (correctly spelled family names). 77 | * The DOIs work, but Rhub is unable to verify them. 78 | 79 | ## Test environments 80 | rhub 81 | oldrel, release, and devel on macOS, Linux, and Windows 82 | 83 | 84 | ## Downstream dependencies 85 | `mcp` has no downstream dependencies. 86 | 87 | 88 | 89 | # mcp 0.3.0 90 | 91 | ## Notes for the reviewer 92 | * This release adds support for `dplyr` 1.0+ and other newer packages which caused the prior `mcp` to be taken down from CRAN. Sorry it took so long. 93 | 94 | * `rhub` currently has issues with utf8 resulting in the error `Error in loadNamespace(name) : there is no package called 'utf8'`. See https://github.com/r-hub/rhub/issues/374. This has nothing to do with `mcp`. 95 | 96 | * Please see the unpreventable "Expected NOTEs and ERRORs" at the bottom of this file.
97 | 98 | ## Test environments 99 | * local Windows 10, R 3.6.1 100 | * Ubuntu 18.04 (on travis-ci): devel and release 101 | * Mac OS X 10.13.6 (on travis-ci): release 102 | * Windows Server 2008 R2 SP1 (on rhub): devel 103 | * win-builder: devel and release 104 | 105 | ## R CMD check results 106 | There were no ERRORs, WARNINGs, or NOTEs. 107 | 108 | ## Downstream dependencies 109 | `mcp` has no downstream dependencies. 110 | 111 | 112 | # Resubmission 3 113 | 114 | * Deleted call to `options(mc.cores = 3)`. 115 | * See the section "Expected NOTEs and ERRORs" below for anticipated ERRORs and NOTEs. 116 | 117 | 118 | 119 | # Resubmission 2 120 | 121 | * Fixed grammatical error in DESCRIPTION. 122 | * mcp now spawns at most 2 cores on CRAN. 123 | 124 | 125 | 126 | # Resubmission 127 | This is a resubmission. I believe I have solved all the points raised in the initial review. All tests pass. In this version I have: 128 | 129 | * Added single quotes around 'mcp' in DESCRIPTION. 130 | * Added literature to DESCRIPTION with the theoretical foundation for the computations done in mcp. 131 | * mcp no longer copies code from other packages so no attribution/ctb is required. 132 | * `print()` and `cat()` now only reside within `print()` and `summary()` functions. 133 | * All functions have a \value specified now. This has led me to do many other improvements in the documentation too. 134 | * All examples run now. Some have been enclosed in \donttest() to reduce runtimes. 135 | * I have taken the liberty to add a few API-breaking updates to `mcp` in this resubmission, so that the API is as stable as possible from the initial CRAN release. These are: (1) changed plotting of time-series, (2) the function name to simulate data, and (3) changed the `summary()` output for simulated data. 136 | 137 | 138 | 139 | # mcp 0.2.0 140 | 141 | ## Notes for the reviewer 142 | * This is the first submission of the `mcp` package and my first CRAN submission personally. 
I have done my best to adhere to all standards. Extensive documentation for `mcp` is available at https://lindeloev.github.io/mcp/. 143 | * The package `patchwork` is a dependency which just arrived on CRAN a few days ago. It seems it has not rolled out to all test servers. Travis and devtools::check_rhub() install it just fine. 144 | * My email address is long-term. I have had it for 15 years and I co-own the domain. 145 | * `mcp` uses the GPL-2 license. The only code copied (verbatim) from other packages is in R/lme4_utils.R, is GPL (>=2), and has been given proper attribution via the @authors Roxygen tag. 146 | * `mcp` does not make any external changes (files, options, communication, etc.). 147 | 148 | ## Test environments 149 | * local Windows 10, R 3.6.1 150 | * Ubuntu 16.04.6 LTS (on travis-ci): release 151 | * Ubuntu 16.04.6 LTS (on travis-ci): devel 152 | * Mac OS X 10.13.3 (on travis-ci): release 153 | 154 | ## R CMD check results 155 | There were no ERRORs or WARNINGs. 156 | 157 | ## Downstream dependencies 158 | This is the first submission so there are no downstream dependencies. 159 | 160 | 161 | 162 | # Expected NOTEs and ERRORs 163 | 164 | * INSTALL ERROR or PREPERROR: `mcp` uses JAGS (an external binary) for sampling through the `rjags` package. rjags will fail to install without JAGS on the system. This happens when I run `devtools::check_win_release()`, `devtools::check_win_devel()` and `rhub::check_for_cran()`. Github Actions install JAGS prior to installing packages, and all tests pass on Windows, MacOS, and Linux. Binaries for JAGS are here: https://sourceforge.net/projects/mcmc-jags/files/JAGS/4.x/ 165 | 166 | * DESCRIPTION NOTE: rhub says that the DESCRIPTION DOIs return an HTTP 403 error (forbidden). But the DOI works just fine, e.g., http://doi.org/10.2307/2986119. 167 | 168 | * DESCRIPTION NOTE: rhub says that "Lindeløv" (my family name) and "Gelfand" (a researcher's family name) are misspelled.
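
A minimal sketch of how the JAGS dependency described in the INSTALL ERROR note could be checked from R before running the checks. The helper name `has_rjags()` is hypothetical and not part of `mcp`; it only tests whether the `rjags` namespace can be loaded, which is a hint (not proof) that JAGS is present on the system.

```r
# Hypothetical helper (not part of mcp): TRUE if the rjags namespace can be loaded.
# requireNamespace() returns FALSE instead of raising an error when loading fails.
has_rjags = function() {
  requireNamespace("rjags", quietly = TRUE)
}

if (!has_rjags()) {
  message("rjags is not available. Install JAGS first, then run install.packages(\"rjags\").")
}
```

Returning a logical rather than stopping lets a check script decide for itself whether to skip the sampling tests or abort.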
169 | -------------------------------------------------------------------------------- /data-raw/ex_fit.R: -------------------------------------------------------------------------------- 1 | ex = mcp_example("demo") 2 | demo_fit = mcp(ex$model, data = ex$data, adapt = 3000, iter = 1000, sample = "both") 3 | demo_fit$mcmc_loglik = NULL # Make the object small 4 | 5 | # Save to mcp 6 | usethis::use_data(demo_fit, overwrite = TRUE) 7 | -------------------------------------------------------------------------------- /data/demo_fit.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/data/demo_fit.rda -------------------------------------------------------------------------------- /inst/CITATION: -------------------------------------------------------------------------------- 1 | citHeader("To cite mcp in publications use:") 2 | 3 | bibentry( 4 | bibtype = "Article", 5 | title = "mcp: An R Package for Regression With Multiple Change Points", 6 | author = person(given = "Jonas Kristoffer", family = "Lindeløv"), 7 | journal = "OSF Preprints", 8 | year = "2020", 9 | doi = "10.31219/osf.io/fzqxv", 10 | encoding = "UTF-8" 11 | ) 12 | -------------------------------------------------------------------------------- /man/bernoulli.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/families.R 3 | \name{bernoulli} 4 | \alias{bernoulli} 5 | \title{Bernoulli family for mcp} 6 | \usage{ 7 | bernoulli(link = "logit") 8 | } 9 | \arguments{ 10 | \item{link}{Link function.} 11 | } 12 | \description{ 13 | Bernoulli family for mcp 14 | } 15 | -------------------------------------------------------------------------------- /man/check_terms_in_data.Rd: -------------------------------------------------------------------------------- 1 | % Generated by 
roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{check_terms_in_data} 5 | \alias{check_terms_in_data} 6 | \title{Checks if all terms are in the data} 7 | \usage{ 8 | check_terms_in_data(form, data, i, n_terms = NULL) 9 | } 10 | \arguments{ 11 | \item{form}{Formula or character (tilde will be prefixed if it isn't already)} 12 | 13 | \item{data}{A data.frame or tibble} 14 | 15 | \item{i}{The segment number (integer)} 16 | 17 | \item{n_terms}{Int >= 1. Number of expected terms. Will raise error if it doesn't match.} 18 | } 19 | \description{ 20 | Checks if all terms are in the data 21 | } 22 | \author{ 23 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 24 | } 25 | \keyword{internal} 26 | -------------------------------------------------------------------------------- /man/criterion.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/comparison.R 3 | \encoding{UTF-8} 4 | \name{criterion} 5 | \alias{criterion} 6 | \alias{loo.mcpfit} 7 | \alias{loo} 8 | \alias{LOO} 9 | \alias{waic.mcpfit} 10 | \alias{waic} 11 | \alias{WAIC} 12 | \title{Compute information criteria for model comparison} 13 | \usage{ 14 | criterion(fit, criterion = "loo", ...) 15 | 16 | \method{loo}{mcpfit}(x, ...) 17 | 18 | \method{waic}{mcpfit}(x, ...) 19 | } 20 | \arguments{ 21 | \item{fit}{An \code{\link{mcpfit}} object.} 22 | 23 | \item{criterion}{One of \code{"loo"} (calls \code{\link[loo]{loo}}) or \code{"waic"} (calls \code{\link[loo]{waic}}).} 24 | 25 | \item{...}{Currently ignored} 26 | 27 | \item{x}{An \code{\link{mcpfit}} object.} 28 | } 29 | \value{ 30 | a \code{loo} or \code{psis_loo} object. 31 | } 32 | \description{ 33 | Takes an \code{\link{mcpfit}} as input and computes information criteria using loo or 34 | WAIC. 
Compare models using \code{\link[loo]{loo_compare}} and \code{\link[loo]{loo_model_weights}}. 35 | Read more in \code{\link[loo]{loo}}. 36 | } 37 | \section{Functions}{ 38 | \itemize{ 39 | \item \code{loo(mcpfit)}: Computes loo on mcpfit objects 40 | 41 | \item \code{waic(mcpfit)}: Computes WAIC on mcpfit objects 42 | 43 | }} 44 | \examples{ 45 | \donttest{ 46 | # Define two models and sample them 47 | # options(mc.cores = 3) # Speed up sampling 48 | ex = mcp_example("intercepts") # Get some simulated data. 49 | model1 = list(y ~ 1 + x, ~ 1) 50 | model2 = list(y ~ 1 + x) # Without a change point 51 | fit1 = mcp(model1, ex$data) 52 | fit2 = mcp(model2, ex$data) 53 | 54 | # Compute LOO for each and compare (works for waic(fit) too) 55 | fit1$loo = loo(fit1) 56 | fit2$loo = loo(fit2) 57 | loo::loo_compare(fit1$loo, fit2$loo) 58 | } 59 | 60 | } 61 | \seealso{ 62 | \code{\link{criterion}} 63 | 64 | \code{\link{criterion}} 65 | } 66 | \author{ 67 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 68 | } 69 | -------------------------------------------------------------------------------- /man/cumpaste.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{cumpaste} 5 | \alias{cumpaste} 6 | \title{Cumulative pasting of character columns} 7 | \usage{ 8 | cumpaste(x, .sep = " ") 9 | } 10 | \arguments{ 11 | \item{x}{A column} 12 | 13 | \item{.sep}{A character to append between pastes} 14 | } 15 | \value{ 16 | string.
17 | } 18 | \description{ 19 | Cumulative pasting of character columns 20 | } 21 | \author{ 22 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} but inspired by 23 | https://stackoverflow.com/questions/24862046/cumulatively-paste-concatenate-values-grouped-by-another-variable 24 | } 25 | \keyword{internal} 26 | -------------------------------------------------------------------------------- /man/demo_fit.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/data.R 3 | \docType{data} 4 | \name{demo_fit} 5 | \alias{demo_fit} 6 | \title{Example \code{mcpfit} for examples} 7 | \format{ 8 | An \code{\link{mcpfit}} object. 9 | } 10 | \usage{ 11 | demo_fit 12 | } 13 | \description{ 14 | This was generated using \code{mcp_example("demo", sample = TRUE)}. 15 | } 16 | \keyword{datasets} 17 | -------------------------------------------------------------------------------- /man/exponential.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/families.R 3 | \name{exponential} 4 | \alias{exponential} 5 | \title{Exponential family for mcp} 6 | \usage{ 7 | exponential(link = "identity") 8 | } 9 | \arguments{ 10 | \item{link}{Link function (Character).} 11 | } 12 | \description{ 13 | Exponential family for mcp 14 | } 15 | -------------------------------------------------------------------------------- /man/figures/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/man/figures/logo.png -------------------------------------------------------------------------------- /man/figures/logo_200px.png: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/man/figures/logo_200px.png -------------------------------------------------------------------------------- /man/fitted.mcpfit.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \encoding{UTF-8} 4 | \name{fitted.mcpfit} 5 | \alias{fitted.mcpfit} 6 | \alias{fitted} 7 | \title{Expected Values from the Posterior Predictive Distribution} 8 | \usage{ 9 | \method{fitted}{mcpfit}( 10 | object, 11 | newdata = NULL, 12 | summary = TRUE, 13 | probs = TRUE, 14 | rate = TRUE, 15 | prior = FALSE, 16 | which_y = "ct", 17 | varying = TRUE, 18 | arma = TRUE, 19 | nsamples = NULL, 20 | samples_format = "tidy", 21 | scale = "response", 22 | ... 23 | ) 24 | } 25 | \arguments{ 26 | \item{object}{An \code{mcpfit} object.} 27 | 28 | \item{newdata}{A \code{tibble} or a \code{data.frame} containing predictors in the model. If \code{NULL} (default), 29 | the original data is used.} 30 | 31 | \item{summary}{Summarise at each x-value} 32 | 33 | \item{probs}{Vector of quantiles. Only in effect when \code{summary == TRUE}.} 34 | 35 | \item{rate}{Boolean. For binomial models, plot on raw data (\code{rate = FALSE}) or 36 | response divided by number of trials (\code{rate = TRUE}). If FALSE, linear 37 | interpolation on trial number is used to infer trials at a particular x.} 38 | 39 | \item{prior}{TRUE/FALSE. Plot using prior samples? Useful for \code{mcp(..., sample = "both")}} 40 | 41 | \item{which_y}{What to plot on the y-axis. One of 42 | \itemize{ 43 | \item \code{"ct"}: The central tendency which is often the mean after applying the 44 | link function. 45 | \item \code{"sigma"}: The variance 46 | \item \code{"ar1"}, \code{"ar2"}, etc. depending on which order of the autoregressive 47 | effects you want to plot. 
48 | }} 49 | 50 | \item{varying}{One of: 51 | \itemize{ 52 | \item \code{TRUE} All varying effects (\code{fit$pars$varying}). 53 | \item \code{FALSE} No varying effects (\code{c()}). 54 | \item Character vector: Only include specified varying parameters - see \code{fit$pars$varying}. 55 | }} 56 | 57 | \item{arma}{Whether to include autoregressive effects. 58 | \itemize{ 59 | \item \code{TRUE} Compute autoregressive residuals. Requires the response variable in \code{newdata}. 60 | \item \code{FALSE} Disregard the autoregressive effects. For \code{family = gaussian()}, \code{predict()} just uses \code{sigma} for residuals. 61 | }} 62 | 63 | \item{nsamples}{Integer or \code{NULL}. Number of samples to return/summarise. 64 | If there are varying effects, this is the number of samples from each varying group. 65 | \code{NULL} means "all". Ignored if both are \code{FALSE}. More samples trade speed for accuracy.} 66 | 67 | \item{samples_format}{One of "tidy" or "matrix". Controls the output format when \code{summary == FALSE}. 68 | See more under "value".} 69 | 70 | \item{scale}{One of 71 | \itemize{ 72 | \item "response": return on the observed scale, i.e., after applying the inverse link function. 73 | \item "linear": return on the parameter scale (where the linear trends are modelled). 74 | }} 75 | 76 | \item{...}{Currently unused} 77 | } 78 | \value{ 79 | \itemize{ 80 | \item If \code{summary = TRUE}: A \code{tibble} with the posterior mean for each row in \code{newdata}. 81 | If \code{newdata} is \code{NULL}, the data in \code{fit$data} is used. 82 | \item If \code{summary = FALSE} and \code{samples_format = "tidy"}: A \code{tidybayes} \code{tibble} with all the posterior 83 | samples (\code{Ns}) evaluated at each row in \code{newdata} (\code{Nn}), i.e., with \verb{Ns x Nn} rows. If there are 84 | varying effects, the returned data is expanded with the relevant levels for each row.
85 | 86 | The return columns are: 87 | \itemize{ 88 | \item Predictors from \code{newdata}. 89 | \item Sample descriptors: ".chain", ".iter", ".draw" (see the \code{tidybayes} package for more), and "data_row" (\code{newdata} rownumber) 90 | \item Sample values: one column for each parameter in the model. 91 | \item The estimate. Either "predict" or "fitted", i.e., the name of the \code{type} argument. 92 | } 93 | \item If \code{summary = FALSE} and \code{samples_format = "matrix"}: An \code{N_draws} X \code{nrows(newdata)} matrix with fitted/predicted 94 | values (depending on \code{type}). This format is used by \code{brms} and it's useful as \code{yrep} in 95 | \verb{bayesplot::ppc_*} functions. 96 | } 97 | } 98 | \description{ 99 | Expected Values from the Posterior Predictive Distribution 100 | } 101 | \examples{ 102 | \donttest{ 103 | fitted(demo_fit) 104 | fitted(demo_fit, probs = c(0.1, 0.5, 0.9)) # With median and 80\% credible interval. 105 | fitted(demo_fit, summary = FALSE) # Samples instead of summary. 
106 | fitted(demo_fit, 107 | newdata = data.frame(time = c(-5, 20, 300)), # New data 108 | probs = c(0.025, 0.5, 0.975)) 109 | } 110 | 111 | } 112 | \seealso{ 113 | \code{\link{pp_eval}} \code{\link{predict.mcpfit}} \code{\link{residuals.mcpfit}} 114 | } 115 | \author{ 116 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 117 | } 118 | -------------------------------------------------------------------------------- /man/format_code.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{format_code} 5 | \alias{format_code} 6 | \title{Format code with one or multiple terms} 7 | \usage{ 8 | format_code(col, na_col) 9 | } 10 | \arguments{ 11 | \item{col}{A column} 12 | 13 | \item{na_col}{If this column is NA, return NA} 14 | } 15 | \value{ 16 | string 17 | } 18 | \description{ 19 | Take a value like "a + b" and 20 | (1) Replace it with NA if na_col == NA. 21 | (2) Change to "(a + b)" if there is a "+". 22 | (3) Return itself otherwise, e.g., "a" --> "a". 23 | } 24 | \author{ 25 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 26 | } 27 | \keyword{internal} 28 | -------------------------------------------------------------------------------- /man/geom_cp_density.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot.R 3 | \name{geom_cp_density} 4 | \alias{geom_cp_density} 5 | \title{Density geom for \code{plot.mcpfit()}} 6 | \usage{ 7 | geom_cp_density(fit, facet_by, limits_y) 8 | } 9 | \arguments{ 10 | \item{fit}{An \code{mcpfit} object} 11 | 12 | \item{facet_by}{\code{NULL} or a string, like \verb{plot.mcpfit(..., facet_by = "id")}.} 13 | } 14 | \value{ 15 | A \code{ggplot2::stat_density} geom representing the change point densities.
16 | } 17 | \description{ 18 | Density geom for \code{plot.mcpfit()} 19 | } 20 | \keyword{internal} 21 | -------------------------------------------------------------------------------- /man/geom_quantiles.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot.R 3 | \encoding{UTF-8} 4 | \name{geom_quantiles} 5 | \alias{geom_quantiles} 6 | \title{Return a geom_line representing the quantiles} 7 | \usage{ 8 | geom_quantiles(samples, quantiles, xvar, yvar, facet_by, ...) 9 | } 10 | \arguments{ 11 | \item{samples}{A tidybayes tibble} 12 | 13 | \item{quantiles}{Vector of quantiles (0.0 to 1.0)} 14 | 15 | \item{xvar}{An rlang::sym() with the name of the x-col in \code{samples}} 16 | 17 | \item{yvar}{An rlang::sym() with the name of the response col in \code{samples}} 18 | 19 | \item{facet_by}{String. Name of a varying group.} 20 | 21 | \item{...}{Arguments passed to geom_line} 22 | } 23 | \value{ 24 | A \code{ggplot2::geom_line} object. 25 | } 26 | \description{ 27 | Called by \code{plot.mcpfit}. 28 | } 29 | \author{ 30 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 31 | } 32 | \keyword{internal} 33 | -------------------------------------------------------------------------------- /man/get_all_formulas.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_formula.R 3 | \encoding{UTF-8} 4 | \name{get_all_formulas} 5 | \alias{get_all_formulas} 6 | \title{Call \code{get_formula_str} for multiple ytypes and paste strings} 7 | \usage{ 8 | get_all_formulas(ST, prior, par_x, ytypes = c("ct", "sigma", "arma")) 9 | } 10 | \arguments{ 11 | \item{ST}{Tibble. Returned by \code{get_segment_table}.} 12 | 13 | \item{prior}{Named list. 
Names are parameter names (\code{cp_i}, \code{int_i}, \code{xvar_i}, 14 | \code{sigma}) and the values are either 15 | \itemize{ 16 | \item A JAGS distribution (e.g., \code{int_1 = "dnorm(0, 1) T(0,)"}) indicating a 17 | conventional prior distribution. Uninformative priors based on data 18 | properties are used where priors are not specified. This ensures good 19 | parameter estimations, but it is questionable for hypothesis testing. 20 | \code{mcp} uses SD (not precision) for dnorm, dt, dlogis, etc. See 21 | details. Change points are forced to be ordered through the priors using 22 | truncation, except for uniform priors where the lower bound should be 23 | greater than the previous change point, \code{dunif(cp_1, MAXX)}. 24 | \item A numerical value (e.g., \code{int_1 = -2.1}) indicating a fixed value. 25 | \item A model parameter name (e.g., \code{int_2 = "int_1"}), indicating that this parameter is shared - 26 | typically between segments. If two varying effects are shared this way, 27 | they will need to have the same grouping variable. 28 | \item A scaled Dirichlet prior is supported for change points if they are all set to 29 | \verb{cp_i = "dirichlet(N)"} where \code{N} is the alpha for this change point and 30 | \code{N = 1} is most often used. This prior is less informative about the 31 | location of the change points than the default uniform prior, but it 32 | samples less efficiently, so you will often need to set \code{iter} higher. 33 | It is recommended for hypothesis testing and for the estimation of more 34 | than 5 change points. \href{https://lindeloev.github.io/mcp/articles/priors.html}{Read more}. 35 | }} 36 | 37 | \item{par_x}{String (default: NULL). Only relevant if no segment contains 38 | a slope (no hint at what x is). Set this, e.g., par_x = "time".} 39 | 40 | \item{ytypes}{A character vector of ytypes to include in model building} 41 | } 42 | \value{ 43 | A string with JAGS code.
44 | } 45 | \description{ 46 | Currently used to differentiate between the JAGS model (use all) and the 47 | fit$simulate model (do not include arma). 48 | } 49 | \author{ 50 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 51 | } 52 | \keyword{internal} 53 | -------------------------------------------------------------------------------- /man/get_ar_code.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_formula.R 3 | \encoding{UTF-8} 4 | \name{get_ar_code} 5 | \alias{get_ar_code} 6 | \title{Gets code for ARMA terms, resulting in a "resid_"} 7 | \usage{ 8 | get_ar_code(ar_order, family, is_R, xvar, yvar = NA) 9 | } 10 | \arguments{ 11 | \item{ar_order}{Positive integer. The order of ARMA} 12 | 13 | \item{family}{An mcpfamily object} 14 | 15 | \item{is_R}{Bool. Is this R code (TRUE) or JAGS code (FALSE)?} 16 | } 17 | \value{ 18 | String with JAGS code for AR. 19 | } 20 | \description{ 21 | Developer note: Ensuring that this can be used in both simulate() and JAGS 22 | got quite messy with a lot of if-statements. It works but some refactoring 23 | may be good in the future. 24 | } 25 | \author{ 26 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 27 | } 28 | \keyword{internal} 29 | -------------------------------------------------------------------------------- /man/get_arma_order.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/misc.R 3 | \name{get_arma_order} 4 | \alias{get_arma_order} 5 | \title{Extracts the order from ARMA parameter name(s)} 6 | \usage{ 7 | get_arma_order(pars_arma) 8 | } 9 | \arguments{ 10 | \item{pars_arma}{Character vector} 11 | } 12 | \value{ 13 | integer 14 | } 15 | \description{ 16 | If several names are provided (vector), it returns the maximum. 
If \code{pars_arma} 17 | is an empty string, it returns \code{0}. 18 | } 19 | \keyword{internal} 20 | -------------------------------------------------------------------------------- /man/get_density.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/comparison.R 3 | \encoding{UTF-8} 4 | \name{get_density} 5 | \alias{get_density} 6 | \title{Compute the density at a specific point.} 7 | \usage{ 8 | get_density(samples, LHS, value) 9 | } 10 | \arguments{ 11 | \item{samples}{An mcmc.list} 12 | 13 | \item{LHS}{Expression to compute posterior} 14 | 15 | \item{value}{What value to evaluate the density at} 16 | } 17 | \value{ 18 | A float 19 | } 20 | \description{ 21 | Used in \link{hypothesis} 22 | } 23 | \author{ 24 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 25 | } 26 | \keyword{internal} 27 | -------------------------------------------------------------------------------- /man/get_eval_at.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot.R 3 | \name{get_eval_at} 4 | \alias{get_eval_at} 5 | \title{Get a list of x-coordinates to evaluate fit$simulate at} 6 | \usage{ 7 | get_eval_at(fit, facet_by) 8 | } 9 | \arguments{ 10 | \item{fit}{An mcpfit object.} 11 | 12 | \item{facet_by}{String. Name of a varying group.} 13 | } 14 | \value{ 15 | A vector of x-values to evaluate at. 16 | } 17 | \description{ 18 | Solves two problems: if setting the number of points too high, the 19 | function becomes slow. If setting it too low, the posterior at large intercept- 20 | changes at change points look discrete, because they are evaluated at very 21 | few x in that interval. 22 | } 23 | \details{ 24 | This function makes a vector of x-values with large spacing in general, 25 | but finer resolution at change points. 
26 | } 27 | \keyword{internal} 28 | -------------------------------------------------------------------------------- /man/get_formula_str.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_formula.R 3 | \encoding{UTF-8} 4 | \name{get_formula_str} 5 | \alias{get_formula_str} 6 | \title{Build an R formula (as string) given a segment table (ST)} 7 | \usage{ 8 | get_formula_str(ST, par_x, ytype = "ct", init = FALSE) 9 | } 10 | \arguments{ 11 | \item{ST}{Tibble. Returned by \code{get_segment_table}.} 12 | 13 | \item{par_x}{String (default: NULL). Only relevant if no segment contains 14 | a slope (no hint at what x is). Set this, e.g., par_x = "time".} 15 | 16 | \item{ytype}{One of "ct" (central tendency), "sigma", "ar1" (or another order), or "ma1" (or another order)} 17 | 18 | \item{init}{TRUE/FALSE. Set to TRUE for the first call. Adds segment-relative 19 | X-codings and verbose commenting of one formula} 20 | } 21 | \value{ 22 | A string with JAGS code. 23 | } 24 | \description{ 25 | You will need to replace PAR_X with whatever your x-axis observation column 26 | is called. In JAGS typically \code{x[i_]}. In R just \code{x}.
27 | } 28 | \author{ 29 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 30 | } 31 | \keyword{internal} 32 | -------------------------------------------------------------------------------- /man/get_jags_data.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/run_jags.R 3 | \name{get_jags_data} 4 | \alias{get_jags_data} 5 | \title{Adds helper variables for use in \code{run_jags}} 6 | \usage{ 7 | get_jags_data(data, ST, jags_code, sample) 8 | } 9 | \arguments{ 10 | \item{data}{A tibble} 11 | 12 | \item{ST}{A segment table (tibble), returned by \code{get_segment_table}.} 13 | 14 | \item{jags_code}{A string. JAGS model, usually returned by \code{make_jagscode()}.} 15 | 16 | \item{sample}{One of 17 | \itemize{ 18 | \item \code{"post"}: Sample the posterior. 19 | \item \code{"prior"}: Sample only the prior. Plots, summaries, etc. will 20 | use the prior. This is useful for prior predictive checks. 21 | \item \code{"both"}: Sample both prior and posterior. Plots, summaries, etc. 22 | will default to using the posterior. The prior only has effect when doing 23 | Savage-Dickey density ratios in \code{\link{hypothesis}}. 24 | \item \code{"none"} or \code{FALSE}: Do not sample. Returns an mcpfit 25 | object without samples. This is useful if you only want to check 26 | prior strings (fit$prior), the JAGS model (fit$jags_code), etc. 27 | }} 28 | } 29 | \description{ 30 | Returns the relevant data columns as a list and adds elements with unique 31 | varying group levels.
32 | } 33 | \keyword{internal} 34 | -------------------------------------------------------------------------------- /man/get_jagscode.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_jagscode.R 3 | \encoding{UTF-8} 4 | \name{get_jagscode} 5 | \alias{get_jagscode} 6 | \title{Make JAGS code for Multiple Change Point model} 7 | \usage{ 8 | get_jagscode(prior, ST, formula_str, arma_order, family, sample) 9 | } 10 | \arguments{ 11 | \item{prior}{Named list. Names are parameter names (\code{cp_i}, \code{int_i}, \code{xvar_i}, 12 | \code{sigma}) and the values are either 13 | \itemize{ 14 | \item A JAGS distribution (e.g., \code{int_1 = "dnorm(0, 1) T(0,)"}) indicating a 15 | conventional prior distribution. Uninformative priors based on data 16 | properties are used where priors are not specified. This ensures good 17 | parameter estimation, but it is questionable for hypothesis testing. 18 | \code{mcp} uses SD (not precision) for dnorm, dt, dlogis, etc. See 19 | details. Change points are forced to be ordered through the priors using 20 | truncation, except for uniform priors where the lower bound should be 21 | greater than the previous change point, \code{dunif(cp_1, MAXX)}. 22 | \item A numerical value (e.g., \code{int_1 = -2.1}) indicating a fixed value. 23 | \item A model parameter name (e.g., \code{int_2 = "int_1"}), indicating that this parameter is shared - 24 | typically between segments. If two varying effects are shared this way, 25 | they will need to have the same grouping variable. 26 | \item A scaled Dirichlet prior is supported for change points if they are all set to 27 | \verb{cp_i = "dirichlet(N)"} where \code{N} is the alpha for this change point and 28 | \code{N = 1} is most often used.
This prior is less informative about the 29 | location of the change points than the default uniform prior, but it 30 | samples less efficiently, so you will often need to set \code{iter} higher. 31 | It is recommended for hypothesis testing and for the estimation of more 32 | than 5 change points. \href{https://lindeloev.github.io/mcp/articles/priors.html}{Read more}. 33 | }} 34 | 35 | \item{ST}{Segment table. Returned by \code{get_segment_table()}.} 36 | 37 | \item{formula_str}{String. The formula string returned by \code{build_formula_str}.} 38 | 39 | \item{arma_order}{Positive integer. The autoregressive order.} 40 | 41 | \item{family}{One of \code{gaussian()}, \code{binomial()}, \code{bernoulli()}, or \code{poisson()}. 42 | Only default link functions are currently supported.} 43 | 44 | \item{sample}{One of 45 | \itemize{ 46 | \item \code{"post"}: Sample the posterior. 47 | \item \code{"prior"}: Sample only the prior. Plots, summaries, etc. will 48 | use the prior. This is useful for prior predictive checks. 49 | \item \code{"both"}: Sample both prior and posterior. Plots, summaries, etc. 50 | will default to using the posterior. The prior only has effect when doing 51 | Savage-Dickey density ratios in \code{\link{hypothesis}}. 52 | \item \code{"none"} or \code{FALSE}: Do not sample. Returns an mcpfit 53 | object without samples. This is useful if you only want to check 54 | prior strings (fit$prior), the JAGS model (fit$jags_code), etc. 55 | }} 56 | } 57 | \value{ 58 | String. A JAGS model.
59 | } 60 | \description{ 61 | Make JAGS code for Multiple Change Point model 62 | } 63 | \author{ 64 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 65 | } 66 | \keyword{internal} 67 | -------------------------------------------------------------------------------- /man/get_ppc_plot.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot.R 3 | \encoding{UTF-8} 4 | \name{get_ppc_plot} 5 | \alias{get_ppc_plot} 6 | \alias{get_loo_plot_call} 7 | \title{pp_check for loo statistics} 8 | \usage{ 9 | get_ppc_plot(fit, type, y, yrep, nsamples, draws = NULL, ...) 10 | } 11 | \arguments{ 12 | \item{type}{One of \code{bayesplot::available_ppc("grouped", invert = TRUE) \%>\% stringr::str_remove("ppc_")}} 13 | 14 | \item{y}{Response vector} 15 | 16 | \item{yrep}{S X N matrix of predicted responses} 17 | 18 | \item{nsamples}{Number of draws. Note that you may want to use all data for summary geoms, 19 | e.g., \code{pp_check(fit, type = "ribbon", nsamples = NULL)}.} 20 | 21 | \item{draws}{(required for loo-type plots) Indices of draws to use.} 22 | 23 | \item{...}{Arguments passed to \code{bayesplot::ppc_type(y, yrep, ...)}} 24 | } 25 | \value{ 26 | A \code{ggplot2} object returned by \verb{bayesplot::ppc_*(y, yrep, ...)}.
27 | 28 | A string 29 | } 30 | \description{ 31 | pp_check for loo statistics 32 | } 33 | \author{ 34 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 35 | } 36 | \keyword{internal} 37 | -------------------------------------------------------------------------------- /man/get_prior.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_prior.R 3 | \encoding{UTF-8} 4 | \name{get_prior} 5 | \alias{get_prior} 6 | \title{Get priors for all parameters in a segment table.} 7 | \usage{ 8 | get_prior(ST, family, prior = list()) 9 | } 10 | \arguments{ 11 | \item{ST}{Tibble. A segment table returned by \code{get_segment_table}.} 12 | 13 | \item{family}{One of \code{gaussian()}, \code{binomial()}, \code{bernoulli()}, or \code{poisson()}. 14 | Only default link functions are currently supported.} 15 | 16 | \item{prior}{A list of user-defined priors. Will overwrite the relevant 17 | default priors.} 18 | } 19 | \value{ 20 | A named list of strings. The names correspond to the parameter names 21 | and the strings are the JAGS code for the prior (before converting SD to 22 | precision). 23 | } 24 | \description{ 25 | Starts by finding all default priors, then replaces them with user priors. 26 | User priors for change points are truncated appropriately using 27 | \code{truncate_prior_cp}, if not done manually by the user already.
28 | } 29 | \author{ 30 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 31 | } 32 | \keyword{internal} 33 | -------------------------------------------------------------------------------- /man/get_prior_str.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_jagscode.R 3 | \encoding{UTF-8} 4 | \name{get_prior_str} 5 | \alias{get_prior_str} 6 | \title{Get JAGS code for a prior} 7 | \usage{ 8 | get_prior_str(prior, i, varying_group = NULL) 9 | } 10 | \arguments{ 11 | \item{prior}{Named list. Names are parameter names (\code{cp_i}, \code{int_i}, \code{xvar_i}, 12 | \code{sigma}) and the values are either 13 | \itemize{ 14 | \item A JAGS distribution (e.g., \code{int_1 = "dnorm(0, 1) T(0,)"}) indicating a 15 | conventional prior distribution. Uninformative priors based on data 16 | properties are used where priors are not specified. This ensures good 17 | parameter estimation, but it is questionable for hypothesis testing. 18 | \code{mcp} uses SD (not precision) for dnorm, dt, dlogis, etc. See 19 | details. Change points are forced to be ordered through the priors using 20 | truncation, except for uniform priors where the lower bound should be 21 | greater than the previous change point, \code{dunif(cp_1, MAXX)}. 22 | \item A numerical value (e.g., \code{int_1 = -2.1}) indicating a fixed value. 23 | \item A model parameter name (e.g., \code{int_2 = "int_1"}), indicating that this parameter is shared - 24 | typically between segments. If two varying effects are shared this way, 25 | they will need to have the same grouping variable. 26 | \item A scaled Dirichlet prior is supported for change points if they are all set to 27 | \verb{cp_i = "dirichlet(N)"} where \code{N} is the alpha for this change point and 28 | \code{N = 1} is most often used.
This prior is less informative about the 29 | location of the change points than the default uniform prior, but it 30 | samples less efficiently, so you will often need to set \code{iter} higher. 31 | It is recommended for hypothesis testing and for the estimation of more 32 | than 5 change points. \href{https://lindeloev.github.io/mcp/articles/priors.html}{Read more}. 33 | }} 34 | 35 | \item{i}{The index in \code{prior} to get code for} 36 | 37 | \item{varying_group}{String or NULL. Null indicates a population- 38 | level prior. String indicates a varying-effects prior (one for each group 39 | level).} 40 | } 41 | \value{ 42 | A string 43 | } 44 | \description{ 45 | Get JAGS code for a prior 46 | } 47 | \author{ 48 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 49 | } 50 | \keyword{internal} 51 | -------------------------------------------------------------------------------- /man/get_quantiles.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/misc.R 3 | \encoding{UTF-8} 4 | \name{get_quantiles} 5 | \alias{get_quantiles} 6 | \title{Expand samples with quantiles} 7 | \usage{ 8 | get_quantiles(samples, quantiles, xvar, yvar, facet_by = NULL) 9 | } 10 | \arguments{ 11 | \item{samples}{A tidybayes tibble} 12 | 13 | \item{quantiles}{Vector of quantiles (0.0 to 1.0)} 14 | 15 | \item{xvar}{An rlang::sym() with the name of the x-col in \code{samples}} 16 | 17 | \item{yvar}{An rlang::sym() with the name of the response col in \code{samples}} 18 | 19 | \item{facet_by}{String. Name of a varying group.} 20 | } 21 | \value{ 22 | A tidybayes long format tibble with the column "quantile" 23 | } 24 | \description{ 25 | TO DO: implement using \code{fitted()} and \code{predict()} but avoid double-computing the samples? 
E.g.: 26 | \verb{get_quantiles2 = function(fit, quantiles, facet_by = NULL) \{} 27 | \verb{fitted(fit, probs = c(0.1, 0.5, 0.9), newdata = data.frame(x = c(11, 50, 100))) \%>\%} 28 | \verb{tidyr::pivot_longer(tidyselect::starts_with("Q")) \%>\%} 29 | \code{dplyr::mutate(quantile = stringr::str_remove(name, "Q") \%>\% as.numeric() / 100)} 30 | \verb{\}} 31 | } 32 | \author{ 33 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 34 | } 35 | \keyword{internal} 36 | -------------------------------------------------------------------------------- /man/get_segment_table.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{get_segment_table} 5 | \alias{get_segment_table} 6 | \title{Build a table describing a list of segments} 7 | \usage{ 8 | get_segment_table(model, data = NULL, family = gaussian(), par_x = NULL) 9 | } 10 | \arguments{ 11 | \item{model}{A list of formulas - one for each segment. The first formula 12 | has the format \code{response ~ predictors} while the following formulas 13 | have the format \code{response ~ changepoint ~ predictors}. The response 14 | and change points can be omitted (\code{changepoint ~ predictors} assumes the same 15 | response. \code{~ predictors} assumes an intercept-only change point). The 16 | following can be modeled: 17 | \itemize{ 18 | \item \emph{Regular formulas:} e.g., \code{~ 1 + x}. \href{https://lindeloev.github.io/mcp/articles/formulas.html}{Read more}. 19 | \item \emph{Extended formulas:} e.g., \code{~ I(x^2) + exp(x) + sin(x)}. \href{https://lindeloev.github.io/mcp/articles/formulas.html}{Read more}. 20 | \item \emph{Variance:} e.g., \code{~sigma(1)} for a simple variance change or 21 | \code{~sigma(rel(1) + I(x^2))} for more advanced variance structures.
\href{https://lindeloev.github.io/mcp/articles/variance.html}{Read more} 22 | \item \emph{Autoregression:} e.g., \code{~ar(1)} for a simple onset/change in AR(1) or 23 | \code{~ar(2, 0 + x)} for an AR(2) increasing by \code{x}. \href{https://lindeloev.github.io/mcp/articles/arma.html}{Read more} 24 | }} 25 | 26 | \item{data}{Data.frame or tibble in long format.} 27 | 28 | \item{family}{One of \code{gaussian()}, \code{binomial()}, \code{bernoulli()}, or \code{poisson()}. 29 | Only default link functions are currently supported.} 30 | 31 | \item{par_x}{String (default: NULL). Only relevant if no segment contains 32 | a slope (no hint at what x is). Set this, e.g., par_x = "time".} 33 | } 34 | \value{ 35 | A tibble with one row describing each segment and the corresponding code. 36 | } 37 | \description{ 38 | Used internally for most mcp functions. 39 | } 40 | \examples{ 41 | model = list( 42 | y ~ 1 + x, 43 | 1 + (1|id) ~ 1 44 | ) 45 | get_segment_table(model) 46 | } 47 | \author{ 48 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 49 | } 50 | \keyword{internal} 51 | -------------------------------------------------------------------------------- /man/get_simulate.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_formula.R 3 | \encoding{UTF-8} 4 | \name{get_simulate} 5 | \alias{get_simulate} 6 | \title{Turn formula_str into a proper R function} 7 | \usage{ 8 | get_simulate(formula_str, pars, nsegments, family) 9 | } 10 | \arguments{ 11 | \item{formula_str}{string. Returned by \code{get_formula}.} 12 | 13 | \item{pars}{List of user-provided parameters, in the format of fit$pars.} 14 | 15 | \item{nsegments}{Positive integer. Number of segments, typically \code{nrow(ST)}.} 16 | 17 | \item{family}{One of \code{gaussian()}, \code{binomial()}, \code{bernoulli()}, or \code{poisson()}.
18 | Only default link functions are currently supported.} 19 | } 20 | \value{ 21 | A string with R code for the fit$simulate() function corresponding to the model. 22 | } 23 | \description{ 24 | Turn formula_str into a proper R function 25 | } 26 | \author{ 27 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 28 | } 29 | \keyword{internal} 30 | -------------------------------------------------------------------------------- /man/get_summary.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \encoding{UTF-8} 4 | \name{get_summary} 5 | \alias{get_summary} 6 | \alias{get_summary.mcpfit} 7 | \title{Internal function for summary.mcpfit, fixef.mcpfit, and ranef.mcpfit} 8 | \usage{ 9 | get_summary(fit, width, varying = FALSE, prior = FALSE) 10 | } 11 | \arguments{ 12 | \item{fit}{An \code{\link{mcpfit}} object.} 13 | 14 | \item{width}{Float. The width of the highest posterior density interval 15 | (between 0 and 1).} 16 | 17 | \item{varying}{Boolean. Get results for varying (TRUE) or population (FALSE)?} 18 | 19 | \item{prior}{TRUE/FALSE. Summarise prior instead of posterior?} 20 | } 21 | \value{ 22 | A data.frame with summaries for each model parameter.
23 | } 24 | \description{ 25 | Internal function for summary.mcpfit, fixef.mcpfit, and ranef.mcpfit 26 | } 27 | \author{ 28 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 29 | } 30 | \keyword{internal} 31 | -------------------------------------------------------------------------------- /man/get_term_content.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{get_term_content} 5 | \alias{get_term_content} 6 | \title{Get formula inside a wrapper} 7 | \usage{ 8 | get_term_content(term) 9 | } 10 | \arguments{ 11 | \item{term}{E.g., "ct(1 + x)", "sigma(0 + rel(x) + I(x^2))", etc.} 12 | } 13 | \value{ 14 | char formula with the content inside the brackets. 15 | } 16 | \description{ 17 | Get formula inside a wrapper 18 | } 19 | \author{ 20 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 21 | } 22 | \keyword{internal} 23 | -------------------------------------------------------------------------------- /man/hypothesis.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/comparison.R 3 | \encoding{UTF-8} 4 | \name{hypothesis} 5 | \alias{hypothesis} 6 | \alias{hypothesis.mcpfit} 7 | \title{Test hypotheses on mcp objects.} 8 | \usage{ 9 | hypothesis(fit, hypotheses, width = 0.95, digits = 3) 10 | } 11 | \arguments{ 12 | \item{fit}{An \code{\link{mcpfit}} object.} 13 | 14 | \item{hypotheses}{String representation of a logical test involving model parameters. 15 | Takes R code that evaluates to TRUE or FALSE in a vectorized way. 16 | 17 | Directional hypotheses are specified using <, >, <=, or >=. \code{hypothesis} 18 | returns the posterior probability and odds in favor of the stated hypothesis. 19 | The odds can be interpreted as a Bayes Factor. 
For example: 20 | \itemize{ 21 | \item \code{"cp_1 > 30"}: the first change point is above 30. 22 | \item \code{"int_1 > int_2"}: the intercept is greater in segment 1 than 2. 23 | \item \code{"x_2 - x_1 <= 3"}: the difference between slope 1 and 2 is less 24 | than or equal to 3. 25 | \item \code{"int_1 > -2 & int_1 < 2"}: int_1 is between -2 and 2 (an interval hypothesis). This can be useful as a Region Of Practical Equivalence test (ROPE). 26 | \item \code{"cp_1^2 < 30 | (log(x_1) + log(x_2)) > 5"}: be creative. 27 | \item \code{"`cp_1_id[1]` > `cp_1_id[2]`"}: id1 is greater than id2, as estimated 28 | through the varying-by-"id" change point in segment 1. Note that backticks (\code{``}) 29 | are required for varying effects. 30 | } 31 | 32 | Hypotheses can also test equality using the equal sign (=). This runs a 33 | Savage-Dickey test, i.e., the proportion by which the probability density 34 | has increased from the prior to the posterior at a given value. Therefore, 35 | it requires \code{mcp(sample = "both")}. There are two requirements: 36 | First, there can only be one equal sign, so don't use \code{&} (and) or \code{|} (or). 37 | Second, the point to test has to be on the right, and the variables on the left. 38 | \itemize{ 39 | \item \code{"cp_1 = 30"}: is the first change point at 30? Or to be more precise: 40 | by what factor has the credence in cp_1 = 30 risen/fallen when 41 | conditioning on the data, relative to the prior credence? 42 | \item \code{"int_1 + int_2 = 0"}: Is the sum of two intercepts zero? 43 | \item \code{"`cp_1_id[John]`/`cp_1_id[Erin]` = 2"}: is the varying change 44 | point for John (which is relative to \code{cp_1}) double that of Erin? 45 | }} 46 | 47 | \item{width}{Float. The width of the highest posterior density interval 48 | (between 0 and 1).} 49 | 50 | \item{digits}{a non-null value for digits specifies the minimum number of 51 | significant digits to be printed in values. The default, NULL, uses 52 | getOption("digits").
(For the interpretation for complex numbers see signif.) 53 | Non-integer values will be rounded down, and only values greater than or 54 | equal to 1 and no greater than 22 are accepted.} 55 | } 56 | \value{ 57 | A data.frame with a row per hypothesis and the following columns: 58 | \itemize{ 59 | \item \code{hypothesis} is the hypothesis; often re-arranged to test against zero. 60 | \item \code{mean} is the posterior mean of the left-hand side of the hypothesis. 61 | \item \code{lower} is the lower bound of the (two-sided) highest-density interval of width \code{width}. 62 | \item \code{upper} is the upper bound of ditto. 63 | \item \code{p} Posterior probability. 64 | For "=" (Savage-Dickey), it is the BF converted to p. 65 | For directional hypotheses, it is the proportion of samples that returns TRUE. 66 | \item \code{BF} Bayes Factor in favor of the hypothesis. 67 | For "=" it is the Savage-Dickey density ratio. 68 | For directional hypotheses, it is p converted to odds. 69 | } 70 | } 71 | \description{ 72 | Returns posterior probabilities and Bayes Factors for flexible hypotheses involving 73 | model parameters. The documentation for the argument \code{hypotheses} below 74 | shows examples of how to specify hypotheses; see also the \href{https://lindeloev.github.io/mcp/articles/comparison.html}{worked examples on the mcp website}. 75 | For directional hypotheses, \code{hypothesis} executes the hypothesis string in a \code{tidybayes} environment and summarises the proportion of samples where 76 | the expression evaluates to TRUE. For equality hypotheses, a Savage-Dickey 77 | ratio is computed. Savage-Dickey requires a prior too, so remember 78 | \code{mcp(..., sample = "both")}. This function is heavily inspired by the 79 | \code{hypothesis} function from the \code{brms} package.
80 | } 81 | \author{ 82 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 83 | } 84 | -------------------------------------------------------------------------------- /man/ilogit.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/families.R 3 | \name{ilogit} 4 | \alias{ilogit} 5 | \title{Inverse logit function} 6 | \usage{ 7 | ilogit(eta) 8 | } 9 | \arguments{ 10 | \item{eta}{A vector of logits} 11 | } 12 | \value{ 13 | A vector with same length as \code{eta} 14 | } 15 | \description{ 16 | Inverse logit function 17 | } 18 | -------------------------------------------------------------------------------- /man/is.mcpfit.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \name{is.mcpfit} 4 | \alias{is.mcpfit} 5 | \title{Checks if argument is an \code{mcpfit} object} 6 | \usage{ 7 | is.mcpfit(x) 8 | } 9 | \arguments{ 10 | \item{x}{An \code{R} object.} 11 | } 12 | \description{ 13 | Checks if argument is an \code{mcpfit} object 14 | } 15 | -------------------------------------------------------------------------------- /man/logit.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/families.R 3 | \name{logit} 4 | \alias{logit} 5 | \title{Logit function} 6 | \usage{ 7 | logit(mu) 8 | } 9 | \arguments{ 10 | \item{mu}{A vector of probabilities (0.0 to 1.0)} 11 | } 12 | \value{ 13 | A vector with same length as \code{mu} 14 | } 15 | \description{ 16 | Logit function 17 | } 18 | -------------------------------------------------------------------------------- /man/mcmclist_samples.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not 
edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \name{mcmclist_samples} 4 | \alias{mcmclist_samples} 5 | \alias{mcmclist_samples.mcpfit} 6 | \title{Internal function to get samples.} 7 | \usage{ 8 | mcmclist_samples(fit, prior = FALSE, message = TRUE, error = TRUE) 9 | } 10 | \arguments{ 11 | \item{fit}{An \code{\link{mcpfit}} object} 12 | 13 | \item{prior}{TRUE/FALSE. Summarise prior instead of posterior?} 14 | 15 | \item{message}{TRUE: gives a message if returning prior samples. FALSE: no message} 16 | 17 | \item{error}{TRUE: raise an error if there are no samples. FALSE: return NULL} 18 | } 19 | \description{ 20 | Returns posterior samples, if available. If not, returns prior samples. If neither 21 | is available, throws an informative error. This is useful for summary and plotting, which 22 | work on both. 23 | } 24 | \keyword{internal} 25 | -------------------------------------------------------------------------------- /man/mcp-package.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcp-package.R 3 | \docType{package} 4 | \name{mcp-package} 5 | \alias{mcp-package} 6 | \title{mcp: Regression with Multiple Change Points} 7 | \description{ 8 | \if{html}{\figure{logo.png}{options: style='float: right' alt='logo' width='120'}} 9 | 10 | Flexible and informed regression with Multiple Change Points. 'mcp' can infer change points in means, variances, autocorrelation structure, and any combination of these, as well as the parameters of the segments in between. All parameters are estimated with uncertainty and prediction intervals are supported - also near the change points. 'mcp' supports hypothesis testing via Savage-Dickey density ratio, posterior contrasts, and cross-validation.
'mcp' is described in Lindeløv (submitted) \doi{10.31219/osf.io/fzqxv} and generalizes the approach described in Carlin, Gelfand, & Smith (1992) \doi{10.2307/2347570} and Stephens (1994) \doi{10.2307/2986119}. 11 | } 12 | \seealso{ 13 | Useful links: 14 | \itemize{ 15 | \item \url{https://lindeloev.github.io/mcp/} 16 | \item Report bugs at \url{https://github.com/lindeloev/mcp/issues} 17 | } 18 | 19 | } 20 | \author{ 21 | \strong{Maintainer}: Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} (\href{https://orcid.org/0000-0003-4565-0595}{ORCID}) 22 | 23 | } 24 | \keyword{internal} 25 | -------------------------------------------------------------------------------- /man/mcp.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcp.R 3 | \encoding{UTF-8} 4 | \name{mcp} 5 | \alias{mcp} 6 | \title{Fit Multiple Linear Segments And Their Change Points} 7 | \usage{ 8 | mcp( 9 | model, 10 | data = NULL, 11 | prior = list(), 12 | family = gaussian(), 13 | par_x = NULL, 14 | sample = "post", 15 | cores = 1, 16 | chains = 3, 17 | iter = 3000, 18 | adapt = 1500, 19 | inits = NULL, 20 | jags_code = NULL 21 | ) 22 | } 23 | \arguments{ 24 | \item{model}{A list of formulas - one for each segment. The first formula 25 | has the format \code{response ~ predictors} while the following formulas 26 | have the format \code{response ~ changepoint ~ predictors}. The response 27 | and change points can be omitted (\code{changepoint ~ predictors} assumes the same 28 | response. \code{~ predictors} assumes an intercept-only change point). The 29 | following can be modeled: 30 | \itemize{ 31 | \item \emph{Regular formulas:} e.g., \code{~ 1 + x}. \href{https://lindeloev.github.io/mcp/articles/formulas.html}{Read more}. 32 | \item \emph{Extended formulas:} e.g., \code{~ I(x^2) + exp(x) + sin(x)}. \href{https://lindeloev.github.io/mcp/articles/formulas.html}{Read more}.
33 | \item \emph{Variance:} e.g., \code{~sigma(1)} for a simple variance change or 34 | \code{~sigma(rel(1) + I(x^2))} for more advanced variance structures. \href{https://lindeloev.github.io/mcp/articles/variance.html}{Read more} 35 | \item \emph{Autoregression:} e.g., \code{~ar(1)} for a simple onset/change in AR(1) or 36 | \code{~ar(2, 0 + x)} for an AR(2) increasing by \code{x}. \href{https://lindeloev.github.io/mcp/articles/arma.html}{Read more} 37 | }} 38 | 39 | \item{data}{Data.frame or tibble in long format.} 40 | 41 | \item{prior}{Named list. Names are parameter names (\code{cp_i}, \code{int_i}, \code{xvar_i}, 42 | \code{sigma}) and the values are either 43 | \itemize{ 44 | \item A JAGS distribution (e.g., \code{int_1 = "dnorm(0, 1) T(0,)"}) indicating a 45 | conventional prior distribution. Uninformative priors based on data 46 | properties are used where priors are not specified. This ensures good 47 | parameter estimation, but it is questionable for hypothesis testing. 48 | \code{mcp} uses SD (not precision) for dnorm, dt, dlogis, etc. See 49 | details. Change points are forced to be ordered through the priors using 50 | truncation, except for uniform priors where the lower bound should be 51 | greater than the previous change point, \code{dunif(cp_1, MAXX)}. 52 | \item A numerical value (e.g., \code{int_1 = -2.1}) indicating a fixed value. 53 | \item A model parameter name (e.g., \code{int_2 = "int_1"}), indicating that this parameter is shared - 54 | typically between segments. If two varying effects are shared this way, 55 | they will need to have the same grouping variable. 56 | \item A scaled Dirichlet prior is supported for change points if they are all set to 57 | \verb{cp_i = "dirichlet(N)"} where \code{N} is the alpha for this change point and 58 | \code{N = 1} is most often used.
This prior is less informative about the 59 | location of the change points than the default uniform prior, but it 60 | samples less efficiently, so you will often need to set \code{iter} higher. 61 | It is recommended for hypothesis testing and for the estimation of more 62 | than 5 change points. \href{https://lindeloev.github.io/mcp/articles/priors.html}{Read more}. 63 | }} 64 | 65 | \item{family}{One of \code{gaussian()}, \code{binomial()}, \code{bernoulli()}, or \code{poisson()}. 66 | Only default link functions are currently supported.} 67 | 68 | \item{par_x}{String (default: NULL). Only relevant if no segment contains 69 | a slope (no hint at what x is). Set this, e.g., par_x = "time".} 70 | 71 | \item{sample}{One of 72 | \itemize{ 73 | \item \code{"post"}: Sample the posterior. 74 | \item \code{"prior"}: Sample only the prior. Plots, summaries, etc. will 75 | use the prior. This is useful for prior predictive checks. 76 | \item \code{"both"}: Sample both prior and posterior. Plots, summaries, etc. 77 | will default to using the posterior. The prior only has effect when doing 78 | Savage-Dickey density ratios in \code{\link{hypothesis}}. 79 | \item \code{"none"} or \code{FALSE}: Do not sample. Returns an mcpfit 80 | object without samples. This is useful if you only want to check 81 | prior strings (fit$prior), the JAGS model (fit$jags_code), etc. 82 | }} 83 | 84 | \item{cores}{Positive integer or "all". Number of cores. 85 | \itemize{ 86 | \item \code{1}: serial sampling. \code{options(mc.cores = 3)} will dominate \code{cores = 1} 87 | but not larger values of \code{cores}. 88 | \item \verb{>1}: parallel sampling on this number of cores. Ideally set \code{chains} 89 | to the same value. Note: \code{cores > 1} takes a few extra seconds the first 90 | time it's called but subsequent calls will start sampling immediately. 91 | \item \code{"all"}: use all cores but one and sets \code{chains} to the same value.
This is 92 | a convenient way to maximally use your computer's power. 93 | }} 94 | 95 | \item{chains}{Positive integer. Number of chains to run.} 96 | 97 | \item{iter}{Positive integer. Number of post-warmup draws from each chain. 98 | The total number of draws is \code{iter * chains}.} 99 | 100 | \item{adapt}{Positive integer. Also sometimes called "burnin", this is the 101 | number of samples used to reach convergence. Set lower for greater speed. 102 | Set higher if the chains haven't converged yet, or look at \href{https://lindeloev.github.io/mcp/articles/tips.html}{tips, tricks, and debugging}.} 103 | 104 | \item{inits}{A list of initial values for the parameters. This can be useful 105 | if a model fails to converge. Read more in \code{\link[rjags]{jags.model}}. 106 | Defaults to \code{NULL}, i.e., no inits.} 107 | 108 | \item{jags_code}{String. Pass JAGS code to \code{mcp} to use directly. This is useful if 109 | you want to tweak the code in \code{fit$jags_code} and run it within the \code{mcp} 110 | framework.} 111 | } 112 | \value{ 113 | An \code{\link{mcpfit}} object. 114 | } 115 | \description{ 116 | Given a model (a list of segment formulas), \code{mcp} infers the posterior 117 | distributions of the parameters of each segment as well as the change points 118 | between segments. \href{https://lindeloev.github.io/mcp/}{See more details and worked examples on the mcp website}. 119 | All segments must regress on the same x-variable. Change 120 | points are forced to be ordered using truncation of the priors. You can run 121 | \code{fit = mcp(model, sample = FALSE)} to avoid sampling and the need for 122 | data if you just want to get the priors (\code{fit$prior}), the JAGS code 123 | (\code{fit$jags_code}), or the R function to simulate data (\code{fit$simulate}). 
124 | } 125 | \details{ 126 | Notes on priors: 127 | \itemize{ 128 | \item Order restriction is automatically applied to cp_\* parameters using 129 | truncation (e.g., \code{T(cp_1, )}) so that they are in the correct order on the 130 | x-axis UNLESS you do it yourself. The one exception is for dunif 131 | distributions where you have to set the lower bound yourself, e.g., \code{dunif(cp_1, MAXX)}. 132 | \item In addition to the model parameters, \code{MINX} (minimum x-value), \code{MAXX} 133 | (maximum x-value), \code{SDX} (standard deviation of x), \code{MINY}, \code{MAXY}, and \code{SDY} 134 | are also available when you set priors. They are used to set uninformative 135 | default priors. 136 | \item Use SD when you specify priors for dt, dlogis, etc. JAGS uses precision 137 | but \code{mcp} converts to precision under the hood via the \code{sd_to_prec()} 138 | function. So you will see SDs in \code{fit$prior} but precision (1/SD^2) 139 | in \code{fit$jags_code}. 140 | } 141 | } 142 | \examples{ 143 | \donttest{ 144 | # Define the segments using formulas. A change point is estimated between each formula. 145 | model = list( 146 | response ~ 1, # Plateau in the first segment (int_1) 147 | ~ 0 + time, # Joined slope (time_2) at cp_1 148 | ~ 1 + time # Disjoined slope (int_3, time_3) at cp_2 149 | ) 150 | 151 | # Fit it and sample the prior too. 
152 | # options(mc.cores = 3) # Uncomment to speed up sampling 153 | ex = mcp_example("demo") # Simulated data example 154 | demo_fit = mcp(model, data = ex$data, sample = "both") 155 | 156 | # See parameter estimates 157 | summary(demo_fit) 158 | 159 | # Visual inspection of the results 160 | plot(demo_fit) # Visualization of model fit/predictions 161 | plot_pars(demo_fit) # Parameter distributions 162 | pp_check(demo_fit) # Prior/Posterior predictive checks 163 | 164 | # Test a hypothesis 165 | hypothesis(demo_fit, "cp_1 > 10") 166 | 167 | # Make predictions 168 | fitted(demo_fit) 169 | predict(demo_fit) 170 | predict(demo_fit, newdata = data.frame(time = c(55.545, 80, 132))) 171 | 172 | # Compare to a one-intercept-only model (no change points) with default prior 173 | model_null = list(response ~ 1) 174 | fit_null = mcp(model_null, data = ex$data, par_x = "time") # fit another model here 175 | demo_fit$loo = loo(demo_fit) 176 | fit_null$loo = loo(fit_null) 177 | loo::loo_compare(demo_fit$loo, fit_null$loo) 178 | 179 | # Inspect the prior. Useful for prior predictive checks. 180 | summary(demo_fit, prior = TRUE) 181 | plot(demo_fit, prior = TRUE) 182 | 183 | # Show all priors. Default priors are added where you don't provide any 184 | print(demo_fit$prior) 185 | 186 | # Set priors and re-run 187 | prior = list( 188 | int_1 = 15, 189 | time_2 = "dt(0, 2, 1) T(0, )", # t-dist slope. Truncated to positive. 190 | cp_2 = "dunif(cp_1, 80)", # change point to segment 2 > cp_1 and < 80. 
191 | int_3 = "int_1" # Shared intercept between segments 1 and 3 192 | ) 193 | 194 | fit3 = mcp(model, data = ex$data, prior = prior) 195 | 196 | # Show the JAGS model 197 | demo_fit$jags_code 198 | } 199 | 200 | } 201 | \seealso{ 202 | \code{\link{get_segment_table}} 203 | } 204 | \author{ 205 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 206 | } 207 | -------------------------------------------------------------------------------- /man/mcp_example.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/misc.R 3 | \encoding{UTF-8} 4 | \name{mcp_example} 5 | \alias{mcp_example} 6 | \title{Get example models and data} 7 | \usage{ 8 | mcp_example(name, sample = FALSE) 9 | } 10 | \arguments{ 11 | \item{name}{Name of the example. One of: 12 | \itemize{ 13 | \item \code{"demo"}: Two change points between intercepts and joined/disjoined slopes. 14 | \item \code{"ar"}: One change point in autoregressive residuals. 15 | \item \code{"binomial"}: Binomial with two change points. Much like \code{"demo"} on a logit scale. 16 | \item \code{"intercepts"}: An intercept-only change point. 17 | \item \code{"rel_prior"}: Relative parameterization and informative priors. 18 | \item \code{"quadratic"}: A change point to a quadratic segment. 19 | \item \code{"trigonometric"}: Trigonometric/seasonal data and model. 20 | \item \code{"varying"}: Varying / hierarchical change points. 21 | \item \code{"variance"}: A change in variance, including a variance slope. 22 | }} 23 | 24 | \item{sample}{TRUE (run \code{fit = mcp(model, data, ...)}) or FALSE.} 25 | } 26 | \value{ 27 | List with 28 | \itemize{ 29 | \item \code{model}: A list of formulas. 30 | \item \code{data}: The simulated data. 31 | \item \code{simulated}: The parameters used for simulating the data. 32 | \item \code{fit}: An \code{mcpfit} if \code{sample = TRUE}. 33 | \item \code{call}: The code to run the above. 
34 | } 35 | } 36 | \description{ 37 | Get example models and data 38 | } 39 | \examples{ 40 | \donttest{ 41 | ex = mcp_example("demo") 42 | plot(ex$data) # Plot data 43 | print(ex$simulated) # See true parameters used to simulate 44 | print(ex$call) # See how the data was simulated 45 | 46 | # Fit the model. Either... 47 | fit = mcp(ex$model, ex$data) 48 | plot(fit) 49 | 50 | ex_with_fit = mcp_example("demo", sample = TRUE) 51 | plot(ex_with_fit$fit) 52 | } 53 | } 54 | \author{ 55 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 56 | } 57 | -------------------------------------------------------------------------------- /man/mcpfamily.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/families.R 3 | \name{mcpfamily} 4 | \alias{mcpfamily} 5 | \title{Add A family object to store link functions between R and JAGS.} 6 | \usage{ 7 | mcpfamily(family) 8 | } 9 | \arguments{ 10 | \item{family}{A family object, e.g., \code{binomial(link = "identity")}.} 11 | } 12 | \description{ 13 | This will make more sense once more link functions / families are added. 14 | } 15 | \keyword{internal} 16 | -------------------------------------------------------------------------------- /man/mcpfit-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \docType{class} 4 | \name{mcpfit-class} 5 | \alias{mcpfit-class} 6 | \alias{mcpfit} 7 | \title{Class \code{mcpfit} of models fitted with the \pkg{mcp} package} 8 | \description{ 9 | Models fitted with the \code{\link[mcp:mcp]{mcp}} function are represented as 10 | an \code{mcpfit} object which contains the user input (model, data, family), 11 | derived model characteristics (prior, parameter names, and jags code), and 12 | the fit (prior and/or posterior mcmc samples). 
13 | } 14 | \details{ 15 | See \code{methods(class = "mcpfit")} for an overview of available methods. 16 | 17 | User-provided information (see \code{\link{mcp}} for more details): 18 | } 19 | \section{Slots}{ 20 | 21 | \describe{ 22 | \item{\code{model}}{A list of formulas, making up the model. 23 | Provided by user. See \code{\link{mcp}} for more details.} 24 | 25 | \item{\code{data}}{A data frame. 26 | Provided by user. See \code{\link{mcp}} for more details.} 27 | 28 | \item{\code{family}}{An \code{mcpfamily} object. 29 | Provided by user. See \code{\link{mcp}} for more details.} 30 | 31 | \item{\code{prior}}{A named list. 32 | Provided by user. See \code{\link{mcp}} for more details.} 33 | 34 | \item{\code{mcmc_post}}{An \code{\link[coda]{mcmc.list}} object with posterior samples.} 35 | 36 | \item{\code{mcmc_prior}}{An \code{\link[coda]{mcmc.list}} object with prior samples.} 37 | 38 | \item{\code{mcmc_loglik}}{An \code{\link[coda]{mcmc.list}} object with samples of log-likelihood.} 39 | 40 | \item{\code{pars}}{A list of character vectors of model parameter names.} 41 | 42 | \item{\code{jags_code}}{A string with jags code. 
Use \code{cat(fit$jags_code)} to show it.} 43 | 44 | \item{\code{simulate}}{A method to simulate and predict data.} 45 | 46 | \item{\code{.other}}{Information that is used internally by mcp.} 47 | }} 48 | 49 | -------------------------------------------------------------------------------- /man/negbinomial.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/families.R 3 | \name{negbinomial} 4 | \alias{negbinomial} 5 | \title{Negative binomial for mcp} 6 | \usage{ 7 | negbinomial(link = "log") 8 | } 9 | \arguments{ 10 | \item{link}{Link function (Character).} 11 | } 12 | \description{ 13 | Parameterized as \code{mu} (mean; Poisson lambda) and \code{size} (a shape parameter), 14 | so you can do \code{rnbinom(10, mu = 10, size = 1)}. Read more in the documentation for \code{rnbinom}. 15 | } 16 | -------------------------------------------------------------------------------- /man/phi.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/families.R 3 | \name{phi} 4 | \alias{phi} 5 | \title{Inverse probit function} 6 | \usage{ 7 | phi(eta) 8 | } 9 | \arguments{ 10 | \item{eta}{A vector of probits} 11 | } 12 | \value{ 13 | A vector with the same length as \code{eta} 14 | } 15 | \description{ 16 | Inverse probit function 17 | } 18 | -------------------------------------------------------------------------------- /man/plot.mcpfit.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot.R 3 | \encoding{UTF-8} 4 | \name{plot.mcpfit} 5 | \alias{plot.mcpfit} 6 | \alias{plot} 7 | \title{Plot full fits} 8 | \usage{ 9 | \method{plot}{mcpfit}( 10 | x, 11 | facet_by = NULL, 12 | lines = 25, 13 | geom_data = "point", 14 | cp_dens = TRUE, 15 | q_fit 
= FALSE, 16 | q_predict = FALSE, 17 | rate = TRUE, 18 | prior = FALSE, 19 | which_y = "ct", 20 | arma = TRUE, 21 | nsamples = 2000, 22 | scale = "response", 23 | ... 24 | ) 25 | } 26 | \arguments{ 27 | \item{x}{An \code{\link{mcpfit}} object} 28 | 29 | \item{facet_by}{String. Name of a varying group.} 30 | 31 | \item{lines}{Positive integer or \code{FALSE}. Number of lines (posterior 32 | draws). FALSE or \code{lines = 0} plots no lines. Note that lines always plot 33 | fitted values - not predicted. For prediction intervals, see the \code{q_predict} argument.} 34 | 35 | \item{geom_data}{String. One of "point", "line" (good for time-series), 36 | or FALSE (do not plot).} 37 | 38 | \item{cp_dens}{TRUE/FALSE. Plot posterior densities of the change point(s)? 39 | Currently does not respect \code{facet_by}. This will be added in the future.} 40 | 41 | \item{q_fit}{Whether to plot quantiles of the posterior (fitted value). 42 | \itemize{ 43 | \item \code{TRUE} Add 2.5\% and 97.5\% quantiles. Corresponds to 44 | \code{q_fit = c(0.025, 0.975)}. 45 | \item \code{FALSE} No quantiles 46 | \item A vector of quantiles. For example, \code{q_fit = 0.5} 47 | plots the median and \code{q_fit = c(0.2, 0.8)} plots the 20\% and 80\% 48 | quantiles. 49 | }} 50 | 51 | \item{q_predict}{Same as \code{q_fit}, but for the prediction interval.} 52 | 53 | \item{rate}{Boolean. For binomial models, plot on raw data (\code{rate = FALSE}) or 54 | response divided by number of trials (\code{rate = TRUE}). If FALSE, linear 55 | interpolation on trial number is used to infer trials at a particular x.} 56 | 57 | \item{prior}{TRUE/FALSE. Plot using prior samples? Useful for \code{mcp(..., sample = "both")}} 58 | 59 | \item{which_y}{What to plot on the y-axis. One of 60 | \itemize{ 61 | \item \code{"ct"}: The central tendency which is often the mean after applying the 62 | link function. 63 | \item \code{"sigma"}: The variance 64 | \item \code{"ar1"}, \code{"ar2"}, etc. 
depending on which order of the autoregressive 65 | effects you want to plot. 66 | }} 67 | 68 | \item{arma}{Whether to include autoregressive effects. 69 | \itemize{ 70 | \item \code{TRUE} Compute autoregressive residuals. Requires the response variable in \code{newdata}. 71 | \item \code{FALSE} Disregard the autoregressive effects. For \code{family = gaussian()}, \code{predict()} just uses \code{sigma} for residuals. 72 | }} 73 | 74 | \item{nsamples}{Integer or \code{NULL}. Number of samples to return/summarise. 75 | If there are varying effects, this is the number of samples from each varying group. 76 | \code{NULL} means "all". Ignored if both are \code{FALSE}. More samples trade speed for accuracy.} 77 | 78 | \item{scale}{One of 79 | \itemize{ 80 | \item "response": return on the observed scale, i.e., after applying the inverse link function. 81 | \item "linear": return on the parameter scale (where the linear trends are modelled). 82 | }} 83 | 84 | \item{...}{Currently ignored.} 85 | } 86 | \value{ 87 | A \pkg{ggplot2} object. 88 | } 89 | \description{ 90 | Plot prior or posterior model draws on top of data. Use \code{plot_pars} to 91 | plot individual parameter estimates. 92 | } 93 | \details{ 94 | \code{plot()} uses \code{fit$simulate()} on posterior samples. These represent the 95 | (joint) posterior distribution. 96 | } 97 | \examples{ 98 | # Typical usage. demo_fit is an mcpfit object. 
99 | plot(demo_fit) 100 | \donttest{ 101 | plot(demo_fit, prior = TRUE) # The prior 102 | 103 | plot(demo_fit, lines = 0, q_fit = TRUE) # 95\% HDI without lines 104 | plot(demo_fit, q_predict = c(0.1, 0.9)) # 80\% prediction interval 105 | plot(demo_fit, which_y = "sigma", lines = 100) # The variance parameter on y 106 | 107 | # Show a panel for each varying effect 108 | # plot(fit, facet_by = "my_column") 109 | 110 | # Customize plots using regular ggplot2 111 | library(ggplot2) 112 | plot(demo_fit) + theme_bw(15) + ggtitle("Great plot!") 113 | } 114 | 115 | } 116 | \author{ 117 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 118 | } 119 | -------------------------------------------------------------------------------- /man/plot_pars.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot.R 3 | \encoding{UTF-8} 4 | \name{plot_pars} 5 | \alias{plot_pars} 6 | \title{Plot individual parameters} 7 | \usage{ 8 | plot_pars( 9 | fit, 10 | pars = "population", 11 | regex_pars = character(0), 12 | type = "combo", 13 | ncol = 1, 14 | prior = FALSE 15 | ) 16 | } 17 | \arguments{ 18 | \item{fit}{An \code{\link{mcpfit}} object.} 19 | 20 | \item{pars}{Character vector. One of: 21 | \itemize{ 22 | \item Vector of parameter names. 23 | \item \code{"population"} plots all population parameters. 24 | \item \code{"varying"} plots all varying effects. To plot a particular varying 25 | effect, use \code{regex_pars = "^name"}. 26 | }} 27 | 28 | \item{regex_pars}{Vector of regular expressions. This will typically just be 29 | the beginning of the parameter name(s), i.e., "^cp_" plots all change 30 | points, "^my_varying" plots all levels of a particular varying effect, and 31 | "^cp_|^my_varying" plots both.} 32 | 33 | \item{type}{String or vector of strings. Calls \verb{bayesplot::mcmc_>>type<<()}. 34 | Common calls are "combo", "trace", and "dens_overlay". 
Current options include 35 | 'acf', 'acf_bar', 'areas', 'areas_ridges', 'combo', 'dens', 'dens_chains', 36 | 'dens_overlay', 'hist', 'intervals', 'rank_hist', 'rank_overlay', 'trace', 37 | 'trace_highlight', and 'violin'.} 38 | 39 | \item{ncol}{Number of columns in plot. This is useful when you have many 40 | parameters and only one plot \code{type}.} 41 | 42 | \item{prior}{TRUE/FALSE. Plot using prior samples? Useful for \code{mcp(..., sample = "both")}} 43 | } 44 | \value{ 45 | A \pkg{ggplot2} object. 46 | } 47 | \description{ 48 | Draw many types of plots of parameter estimates. See examples for typical use 49 | cases. 50 | } 51 | \details{ 52 | For other \code{type}, it calls \code{bayesplot::mcmc_type()}. Use these 53 | directly on \code{fit$mcmc_post} or \code{fit$mcmc_prior} if you want finer 54 | control of plotting, e.g., \code{bayesplot::mcmc_dens(fit$mcmc_post)}. There 55 | are also a number of useful plots in the \pkg{coda} package, e.g., 56 | \code{coda::gelman.plot(fit$mcmc_post)} and \code{coda::crosscorr.plot(fit$mcmc_post)}. 57 | 58 | In any case, if you see a few erratic lines or parameter estimates, this is 59 | a sign that you may want to increase the \code{adapt} and \code{iter} arguments in \code{\link{mcp}}. 60 | } 61 | \examples{ 62 | # Typical usage. demo_fit is an mcpfit object. 63 | plot_pars(demo_fit) 64 | 65 | \dontrun{ 66 | # More options 67 | plot_pars(demo_fit, regex_pars = "^cp_") # Plot only change points 68 | plot_pars(demo_fit, pars = c("int_3", "time_3")) # Plot these parameters 69 | plot_pars(demo_fit, type = c("trace", "violin")) # Combine plots 70 | # Some plots only take pairs. 
hex is good to assess identifiability 71 | plot_pars(demo_fit, type = "hex", pars = c("cp_1", "time_2")) 72 | 73 | # Visualize the priors: 74 | plot_pars(demo_fit, prior = TRUE) 75 | 76 | # Useful for varying effects: 77 | # plot_pars(my_fit, pars = "varying", ncol = 3) # plot all varying effects 78 | # plot_pars(my_fit, regex_pars = "my_varying", ncol = 3) # plot all levels of a particular varying 79 | 80 | # Customize multi-column ggplots using "*" instead of "+" (patchwork) 81 | library(ggplot2) 82 | plot_pars(demo_fit, type = c("trace", "dens_overlay")) * theme_bw(10) 83 | } 84 | } 85 | \author{ 86 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 87 | } 88 | -------------------------------------------------------------------------------- /man/pp_check.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot.R 3 | \encoding{UTF-8} 4 | \name{pp_check} 5 | \alias{pp_check} 6 | \alias{pp_check.mcpfit} 7 | \title{Posterior Predictive Checks For Mcpfit Objects} 8 | \usage{ 9 | pp_check( 10 | object, 11 | type = "dens_overlay", 12 | facet_by = NULL, 13 | newdata = NULL, 14 | prior = FALSE, 15 | varying = TRUE, 16 | arma = TRUE, 17 | nsamples = 100, 18 | ... 19 | ) 20 | } 21 | \arguments{ 22 | \item{object}{An \code{mcpfit} object.} 23 | 24 | \item{type}{One of \code{bayesplot::available_ppc("grouped", invert = TRUE) \%>\% stringr::str_remove("ppc_")}} 25 | 26 | \item{facet_by}{Name of a column in data modeled as varying effect(s).} 27 | 28 | \item{newdata}{A \code{tibble} or a \code{data.frame} containing predictors in the model. If \code{NULL} (default), 29 | the original data is used.} 30 | 31 | \item{prior}{TRUE/FALSE. Plot using prior samples? Useful for \code{mcp(..., sample = "both")}} 32 | 33 | \item{varying}{One of: 34 | \itemize{ 35 | \item \code{TRUE} All varying effects (\code{fit$pars$varying}). 
36 | \item \code{FALSE} No varying effects (\code{c()}). 37 | \item Character vector: Only include specified varying parameters - see \code{fit$pars$varying}. 38 | }} 39 | 40 | \item{arma}{Whether to include autoregressive effects. 41 | \itemize{ 42 | \item \code{TRUE} Compute autoregressive residuals. Requires the response variable in \code{newdata}. 43 | \item \code{FALSE} Disregard the autoregressive effects. For \code{family = gaussian()}, \code{predict()} just uses \code{sigma} for residuals. 44 | }} 45 | 46 | \item{nsamples}{Number of draws. Note that you may want to use all samples for summary geoms, 47 | e.g., \code{pp_check(fit, type = "ribbon", nsamples = NULL)}.} 48 | 49 | \item{...}{Further arguments passed to \code{bayesplot::ppc_type(y, yrep, ...)}} 50 | } 51 | \value{ 52 | A \code{ggplot2} object for single plots. Enriched by \code{patchwork} for faceted plots. 53 | } 54 | \description{ 55 | Plot posterior (default) or prior (\code{prior = TRUE}) predictive checks. This is a convenience wrapper 56 | around the \verb{bayesplot::ppc_*()} methods. 
57 | } 58 | \examples{ 59 | \donttest{ 60 | pp_check(demo_fit) 61 | pp_check(demo_fit, type = "ecdf_overlay") 62 | #pp_check(some_varying_fit, type = "loo_intervals", facet_by = "id") 63 | } 64 | 65 | } 66 | \seealso{ 67 | \code{\link{plot.mcpfit}} \code{\link{pp_eval}} 68 | } 69 | \author{ 70 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 71 | } 72 | -------------------------------------------------------------------------------- /man/pp_eval.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \encoding{UTF-8} 4 | \name{pp_eval} 5 | \alias{pp_eval} 6 | \alias{pp_eval.mcpfit} 7 | \title{Fits and predictions from samples and newdata} 8 | \usage{ 9 | pp_eval( 10 | object, 11 | newdata = NULL, 12 | summary = TRUE, 13 | type = "fitted", 14 | probs = TRUE, 15 | rate = TRUE, 16 | prior = FALSE, 17 | which_y = "ct", 18 | varying = TRUE, 19 | arma = TRUE, 20 | nsamples = NULL, 21 | samples_format = "tidy", 22 | scale = "response", 23 | ... 24 | ) 25 | } 26 | \arguments{ 27 | \item{object}{An \code{mcpfit} object.} 28 | 29 | \item{newdata}{A \code{tibble} or a \code{data.frame} containing predictors in the model. If \code{NULL} (default), 30 | the original data is used.} 31 | 32 | \item{summary}{Summarise at each x-value} 33 | 34 | \item{type}{One of: 35 | \itemize{ 36 | \item "fitted": return fitted values. See also \code{fitted()} 37 | \item "predict": return predicted values, using random dispersion around the central tendency 38 | (e.g., \code{y_predict = rnorm(N, y_fitted, sigma_fitted)} for \code{family = gaussian()}). 39 | See also \code{predict()}. 40 | \item "residuals": same as "predict" but the observed y-values are subtracted. See also \code{residuals()} 41 | }} 42 | 43 | \item{probs}{Vector of quantiles. Only in effect when \code{summary == TRUE}.} 44 | 45 | \item{rate}{Boolean. 
For binomial models, plot on raw data (\code{rate = FALSE}) or 46 | response divided by number of trials (\code{rate = TRUE}). If FALSE, linear 47 | interpolation on trial number is used to infer trials at a particular x.} 48 | 49 | \item{prior}{TRUE/FALSE. Plot using prior samples? Useful for \code{mcp(..., sample = "both")}} 50 | 51 | \item{which_y}{What to plot on the y-axis. One of 52 | \itemize{ 53 | \item \code{"ct"}: The central tendency which is often the mean after applying the 54 | link function. 55 | \item \code{"sigma"}: The variance 56 | \item \code{"ar1"}, \code{"ar2"}, etc. depending on which order of the autoregressive 57 | effects you want to plot. 58 | }} 59 | 60 | \item{varying}{One of: 61 | \itemize{ 62 | \item \code{TRUE} All varying effects (\code{fit$pars$varying}). 63 | \item \code{FALSE} No varying effects (\code{c()}). 64 | \item Character vector: Only include specified varying parameters - see \code{fit$pars$varying}. 65 | }} 66 | 67 | \item{arma}{Whether to include autoregressive effects. 68 | \itemize{ 69 | \item \code{TRUE} Compute autoregressive residuals. Requires the response variable in \code{newdata}. 70 | \item \code{FALSE} Disregard the autoregressive effects. For \code{family = gaussian()}, \code{predict()} just uses \code{sigma} for residuals. 71 | }} 72 | 73 | \item{nsamples}{Integer or \code{NULL}. Number of samples to return/summarise. 74 | If there are varying effects, this is the number of samples from each varying group. 75 | \code{NULL} means "all". Ignored if both are \code{FALSE}. More samples trade speed for accuracy.} 76 | 77 | \item{samples_format}{One of "tidy" or "matrix". Controls the output format when \code{summary == FALSE}. 78 | See more under "value"} 79 | 80 | \item{scale}{One of 81 | \itemize{ 82 | \item "response": return on the observed scale, i.e., after applying the inverse link function. 83 | \item "linear": return on the parameter scale (where the linear trends are modelled). 
84 | }} 85 | 86 | \item{...}{Currently unused} 87 | } 88 | \value{ 89 | \itemize{ 90 | \item If \code{summary = TRUE}: A \code{tibble} with the posterior mean for each row in \code{newdata}, 91 | If \code{newdata} is \code{NULL}, the data in \code{fit$data} is used. 92 | \item If \code{summary = FALSE} and \code{samples_format = "tidy"}: A \code{tidybayes} \code{tibble} with all the posterior 93 | samples (\code{Ns}) evaluated at each row in \code{newdata} (\code{Nn}), i.e., with \verb{Ns x Nn} rows. If there are 94 | varying effects, the returned data is expanded with the relevant levels for each row. 95 | 96 | The return columns are: 97 | \itemize{ 98 | \item Predictors from \code{newdata}. 99 | \item Sample descriptors: ".chain", ".iter", ".draw" (see the \code{tidybayes} package for more), and "data_row" (\code{newdata} rownumber) 100 | \item Sample values: one column for each parameter in the model. 101 | \item The estimate. Either "predict" or "fitted", i.e., the name of the \code{type} argument. 102 | } 103 | \item If \code{summary = FALSE} and \code{samples_format = "matrix"}: An \code{N_draws} X \code{nrows(newdata)} matrix with fitted/predicted 104 | values (depending on \code{type}). This format is used by \code{brms} and it's useful as \code{yrep} in 105 | \verb{bayesplot::ppc_*} functions. 
106 | } 107 | } 108 | \description{ 109 | Fits and predictions from samples and newdata 110 | } 111 | \seealso{ 112 | \code{\link{fitted.mcpfit}} \code{\link{predict.mcpfit}} \code{\link{residuals.mcpfit}} 113 | } 114 | \author{ 115 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 116 | } 117 | \keyword{internal} 118 | -------------------------------------------------------------------------------- /man/predict.mcpfit.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \encoding{UTF-8} 4 | \name{predict.mcpfit} 5 | \alias{predict.mcpfit} 6 | \alias{predict} 7 | \title{Samples from the Posterior Predictive Distribution} 8 | \usage{ 9 | \method{predict}{mcpfit}( 10 | object, 11 | newdata = NULL, 12 | summary = TRUE, 13 | probs = TRUE, 14 | rate = TRUE, 15 | prior = FALSE, 16 | which_y = "ct", 17 | varying = TRUE, 18 | arma = TRUE, 19 | nsamples = NULL, 20 | samples_format = "tidy", 21 | ... 22 | ) 23 | } 24 | \arguments{ 25 | \item{object}{An \code{mcpfit} object.} 26 | 27 | \item{newdata}{A \code{tibble} or a \code{data.frame} containing predictors in the model. If \code{NULL} (default), 28 | the original data is used.} 29 | 30 | \item{summary}{Summarise at each x-value} 31 | 32 | \item{probs}{Vector of quantiles. Only in effect when \code{summary == TRUE}.} 33 | 34 | \item{rate}{Boolean. For binomial models, plot on raw data (\code{rate = FALSE}) or 35 | response divided by number of trials (\code{rate = TRUE}). If FALSE, linear 36 | interpolation on trial number is used to infer trials at a particular x.} 37 | 38 | \item{prior}{TRUE/FALSE. Plot using prior samples? Useful for \code{mcp(..., sample = "both")}} 39 | 40 | \item{which_y}{What to plot on the y-axis. One of 41 | \itemize{ 42 | \item \code{"ct"}: The central tendency which is often the mean after applying the 43 | link function. 
44 | \item \code{"sigma"}: The variance 45 | \item \code{"ar1"}, \code{"ar2"}, etc. depending on which order of the autoregressive 46 | effects you want to plot. 47 | }} 48 | 49 | \item{varying}{One of: 50 | \itemize{ 51 | \item \code{TRUE} All varying effects (\code{fit$pars$varying}). 52 | \item \code{FALSE} No varying effects (\code{c()}). 53 | \item Character vector: Only include specified varying parameters - see \code{fit$pars$varying}. 54 | }} 55 | 56 | \item{arma}{Whether to include autoregressive effects. 57 | \itemize{ 58 | \item \code{TRUE} Compute autoregressive residuals. Requires the response variable in \code{newdata}. 59 | \item \code{FALSE} Disregard the autoregressive effects. For \code{family = gaussian()}, \code{predict()} just uses \code{sigma} for residuals. 60 | }} 61 | 62 | \item{nsamples}{Integer or \code{NULL}. Number of samples to return/summarise. 63 | If there are varying effects, this is the number of samples from each varying group. 64 | \code{NULL} means "all". Ignored if both are \code{FALSE}. More samples trade speed for accuracy.} 65 | 66 | \item{samples_format}{One of "tidy" or "matrix". Controls the output format when \code{summary == FALSE}. 67 | See more under "value"} 68 | 69 | \item{...}{Currently unused} 70 | } 71 | \value{ 72 | \itemize{ 73 | \item If \code{summary = TRUE}: A \code{tibble} with the posterior mean for each row in \code{newdata}. 74 | If \code{newdata} is \code{NULL}, the data in \code{fit$data} is used. 75 | \item If \code{summary = FALSE} and \code{samples_format = "tidy"}: A \code{tidybayes} \code{tibble} with all the posterior 76 | samples (\code{Ns}) evaluated at each row in \code{newdata} (\code{Nn}), i.e., with \verb{Ns x Nn} rows. If there are 77 | varying effects, the returned data is expanded with the relevant levels for each row. 78 | 79 | The return columns are: 80 | \itemize{ 81 | \item Predictors from \code{newdata}. 
82 | \item Sample descriptors: ".chain", ".iter", ".draw" (see the \code{tidybayes} package for more), and "data_row" (\code{newdata} rownumber) 83 | \item Sample values: one column for each parameter in the model. 84 | \item The estimate. Either "predict" or "fitted", i.e., the name of the \code{type} argument. 85 | } 86 | \item If \code{summary = FALSE} and \code{samples_format = "matrix"}: An \code{N_draws} X \code{nrows(newdata)} matrix with fitted/predicted 87 | values (depending on \code{type}). This format is used by \code{brms} and it's useful as \code{yrep} in 88 | \verb{bayesplot::ppc_*} functions. 89 | } 90 | } 91 | \description{ 92 | Samples from the Posterior Predictive Distribution 93 | } 94 | \examples{ 95 | \donttest{ 96 | predict(demo_fit) # Evaluate at each demo_fit$data 97 | predict(demo_fit, probs = c(0.1, 0.5, 0.9)) # With median and 80\% credible interval. 98 | predict(demo_fit, summary = FALSE) # Samples instead of summary. 99 | predict( 100 | demo_fit, 101 | newdata = data.frame(time = c(-5, 20, 300)), # Evaluate 102 | probs = c(0.025, 0.5, 0.975) 103 | ) 104 | } 105 | 106 | } 107 | \seealso{ 108 | \code{\link{pp_eval}} \code{\link{fitted.mcpfit}} \code{\link{residuals.mcpfit}} 109 | } 110 | \author{ 111 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 112 | } 113 | -------------------------------------------------------------------------------- /man/print.mcplist.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/misc.R 3 | \encoding{UTF-8} 4 | \name{print.mcplist} 5 | \alias{print.mcplist} 6 | \title{Print mcplist} 7 | \usage{ 8 | \method{print}{mcplist}(x, ...) 9 | } 10 | \arguments{ 11 | \item{x}{An \code{\link{mcpfit}} object.} 12 | 13 | \item{...}{Currently ignored} 14 | } 15 | \description{ 16 | Shows a list in a more condensed format using \code{str(list)}. 
17 | } 18 | \author{ 19 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 20 | } 21 | -------------------------------------------------------------------------------- /man/print.mcptext.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/misc.R 3 | \encoding{UTF-8} 4 | \name{print.mcptext} 5 | \alias{print.mcptext} 6 | \title{Nice printing texts} 7 | \usage{ 8 | \method{print}{mcptext}(x, ...) 9 | } 10 | \arguments{ 11 | \item{x}{Character, often with newlines.} 12 | 13 | \item{...}{Currently ignored.} 14 | } 15 | \description{ 16 | Useful for \code{print(fit$jags_code)}, \code{print(mcp_demo$call)}, etc. 17 | } 18 | \examples{ 19 | mytext = "line1 = 2\n line2 = 'horse'" 20 | class(mytext) = "mcptext" 21 | print(mytext) 22 | } 23 | \author{ 24 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 25 | } 26 | -------------------------------------------------------------------------------- /man/probit.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/families.R 3 | \name{probit} 4 | \alias{probit} 5 | \title{Probit function} 6 | \usage{ 7 | probit(mu) 8 | } 9 | \arguments{ 10 | \item{mu}{A vector of probabilities (0.0 to 1.0)} 11 | } 12 | \value{ 13 | A vector with same length as \code{mu} 14 | } 15 | \description{ 16 | Probit function 17 | } 18 | -------------------------------------------------------------------------------- /man/recover_levels.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/run_jags.R 3 | \name{recover_levels} 4 | \alias{recover_levels} 5 | \title{Recover the levels of varying effects in mcmc.list} 6 | \usage{ 7 | recover_levels(samples, data, mcmc_col, data_col) 8 | } 9 | \arguments{ 
10 | \item{samples}{An mcmc.list with varying columns starting in \code{mcmc_col}.} 11 | 12 | \item{data}{A tibble or data.frame with the cols in \code{data_col}.} 13 | 14 | \item{mcmc_col}{A vector of strings.} 15 | 16 | \item{data_col}{A vector of strings. Has to be the same length as \code{mcmc_col}.} 17 | } 18 | \description{ 19 | JAGS uses 1, 2, 3, etc. for indexing of varying effects. 20 | This function adds back the original levels, whether numeric or string. 21 | } 22 | \keyword{internal} 23 | -------------------------------------------------------------------------------- /man/remove_terms.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/misc.R 3 | \encoding{UTF-8} 4 | \name{remove_terms} 5 | \alias{remove_terms} 6 | \title{Remove varying or population terms from a formula} 7 | \usage{ 8 | remove_terms(form, remove) 9 | } 10 | \arguments{ 11 | \item{form}{A formula} 12 | 13 | \item{remove}{Either "varying" or "population".
These are removed.} 14 | } 15 | \value{ 16 | A formula 17 | } 18 | \description{ 19 | WARNING: removes response side from the formula 20 | } 21 | \author{ 22 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 23 | } 24 | \keyword{internal} 25 | -------------------------------------------------------------------------------- /man/residuals.mcpfit.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \encoding{UTF-8} 4 | \name{residuals.mcpfit} 5 | \alias{residuals.mcpfit} 6 | \alias{residuals} 7 | \alias{resid} 8 | \alias{resid.mcpfit} 9 | \title{Compute Residuals From Mcpfit Objects} 10 | \usage{ 11 | \method{residuals}{mcpfit}( 12 | object, 13 | newdata = NULL, 14 | summary = TRUE, 15 | probs = TRUE, 16 | prior = FALSE, 17 | varying = TRUE, 18 | arma = TRUE, 19 | nsamples = NULL, 20 | ... 21 | ) 22 | } 23 | \arguments{ 24 | \item{object}{An \code{mcpfit} object.} 25 | 26 | \item{newdata}{A \code{tibble} or a \code{data.frame} containing predictors in the model. If \code{NULL} (default), 27 | the original data is used.} 28 | 29 | \item{summary}{Summarise at each x-value} 30 | 31 | \item{probs}{Vector of quantiles. Only in effect when \code{summary == TRUE}.} 32 | 33 | \item{prior}{TRUE/FALSE. Plot using prior samples? Useful for \code{mcp(..., sample = "both")}} 34 | 35 | \item{varying}{One of: 36 | \itemize{ 37 | \item \code{TRUE} All varying effects (\code{fit$pars$varying}). 38 | \item \code{FALSE} No varying effects (\code{c()}). 39 | \item Character vector: Only include specified varying parameters - see \code{fit$pars$varying}. 40 | }} 41 | 42 | \item{arma}{Whether to include autoregressive effects. 43 | \itemize{ 44 | \item \code{TRUE} Compute autoregressive residuals. Requires the response variable in \code{newdata}. 45 | \item \code{FALSE} Disregard the autoregressive effects. 
For \code{family = gaussian()}, \code{predict()} just uses \code{sigma} for residuals. 46 | }} 47 | 48 | \item{nsamples}{Integer or \code{NULL}. Number of samples to return/summarise. 49 | If there are varying effects, this is the number of samples from each varying group. 50 | \code{NULL} means "all". Ignored if both are \code{FALSE}. More samples trade speed for accuracy.} 51 | 52 | \item{...}{Currently unused} 53 | } 54 | \description{ 55 | Equivalent to \code{fitted(fit, ...) - fit$data[, fit$data$yvar]} (or \code{fitted(fit, ...) - newdata[, fit$data$yvar]}), 56 | but with fixed arguments for \code{fitted}: \verb{rate = FALSE, which_y = 'ct', samples_format = 'tidy'}. 57 | } 58 | \examples{ 59 | \donttest{ 60 | residuals(demo_fit) 61 | residuals(demo_fit, probs = c(0.1, 0.5, 0.9)) # With median and 80\% credible interval. 62 | residuals(demo_fit, summary = FALSE) # Samples instead of summary. 63 | } 64 | 65 | } 66 | \seealso{ 67 | \code{\link{pp_eval}} \code{\link{fitted.mcpfit}} \code{\link{predict.mcpfit}} 68 | } 69 | \author{ 70 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 71 | } 72 | -------------------------------------------------------------------------------- /man/run_jags.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/run_jags.R 3 | \encoding{UTF-8} 4 | \name{run_jags} 5 | \alias{run_jags} 6 | \title{Run parallel MCMC sampling using JAGS.} 7 | \usage{ 8 | run_jags( 9 | data, 10 | jags_code, 11 | pars, 12 | ST, 13 | cores, 14 | sample, 15 | n.chains, 16 | n.iter, 17 | n.adapt, 18 | inits 19 | ) 20 | } 21 | \arguments{ 22 | \item{data}{Data.frame or tibble in long format.} 23 | 24 | \item{jags_code}{A string. JAGS model, usually returned by \code{make_jagscode()}.} 25 | 26 | \item{pars}{Character vector of parameters to save/monitor.} 27 | 28 | \item{ST}{A segment table (tibble), returned by \code{get_segment_table}.
29 | Only really used when the model contains varying effects.} 30 | 31 | \item{cores}{Positive integer or "all". Number of cores. 32 | \itemize{ 33 | \item \code{1}: serial sampling. \code{options(mc.cores = 3)} will dominate \code{cores = 1} 34 | but not larger values of \code{cores}. 35 | \item \verb{>1}: parallel sampling on this number of cores. Ideally set \code{chains} 36 | to the same value. Note: \code{cores > 1} takes a few extra seconds the first 37 | time it's called but subsequent calls will start sampling immediately. 38 | \item \code{"all"}: uses all cores but one and sets \code{chains} to the same value. This is 39 | a convenient way to maximally use your computer's power. 40 | }} 41 | 42 | \item{sample}{One of 43 | \itemize{ 44 | \item \code{"post"}: Sample the posterior. 45 | \item \code{"prior"}: Sample only the prior. Plots, summaries, etc. will 46 | use the prior. This is useful for prior predictive checks. 47 | \item \code{"both"}: Sample both prior and posterior. Plots, summaries, etc. 48 | will default to using the posterior. The prior only has effect when doing 49 | Savage-Dickey density ratios in \code{\link{hypothesis}}. 50 | \item \code{"none"} or \code{FALSE}: Do not sample. Returns an mcpfit 51 | object without samples. This is useful if you only want to check 52 | prior strings (fit$prior), the JAGS model (fit$jags_code), etc. 53 | }} 54 | 55 | \item{n.chains}{the number of parallel chains for the model} 56 | 57 | \item{n.iter}{number of iterations to monitor} 58 | 59 | \item{n.adapt}{the number of iterations for adaptation. See 60 | \code{\link[rjags]{adapt}} for details. If \code{n.adapt = 0} then no 61 | adaptation takes place.} 62 | 63 | \item{inits}{A list of initial values for the parameters. This can be useful 64 | if a model fails to converge. Read more in \code{\link[rjags]{jags.model}}.
65 | Defaults to \code{NULL}, i.e., no inits.} 66 | } 67 | \value{ 68 | An \code{mcmc.list}. 69 | } 70 | \description{ 71 | Run parallel MCMC sampling using JAGS. 72 | } 73 | \author{ 74 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 75 | } 76 | \keyword{internal} 77 | -------------------------------------------------------------------------------- /man/sd_to_prec.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_jagscode.R 3 | \encoding{UTF-8} 4 | \name{sd_to_prec} 5 | \alias{sd_to_prec} 6 | \title{Transform a prior from SD to precision.} 7 | \usage{ 8 | sd_to_prec(prior_str) 9 | } 10 | \arguments{ 11 | \item{prior_str}{String. A JAGS prior. Can be truncated, e.g. 12 | \verb{dt(3, 2, 1) T(my_var, )}.} 13 | } 14 | \value{ 15 | A string 16 | } 17 | \description{ 18 | JAGS uses precision rather than SD. This function converts 19 | \code{dnorm(4.2, 1.3)} into \code{dnorm(4.2, 1/1.3^2)}. This lets users specify 20 | priors using SD, which is then transformed to precision for the JAGS code. It works for the 21 | following distributions: dnorm, dt, dcauchy, ddexp, dlogis, and dlnorm. In all of 22 | these, 23 | tau/sd is the second parameter.
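As a rough illustration of the conversion described above, the snippet below builds the SD form and the precision form of a normal prior by hand. It is a sketch in plain R, not the code that sd_to_prec() actually runs, and the variable names (mu, sd_value, prior_sd, prior_prec, tau) are made up for this example.

```r
# Illustration only: hypothetical variable names, not part of mcp.
mu = 4.2
sd_value = 1.3

# Prior as the user writes it, with the SD as the second parameter.
prior_sd = paste0("dnorm(", mu, ", ", sd_value, ")")

# The same prior as JAGS expects it, with precision tau = 1/sd^2.
prior_prec = paste0("dnorm(", mu, ", 1/", sd_value, "^2)")

# Converting the numeric precision back recovers the SD.
tau = 1 / sd_value^2
sd_roundtrip = 1 / sqrt(tau)

print(prior_sd)
print(prior_prec)
```

For a truncated prior, presumably only the second parameter changes and the truncation part stays as written.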
24 | } 25 | \author{ 26 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 27 | } 28 | -------------------------------------------------------------------------------- /man/summary.mcpfit.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \encoding{UTF-8} 4 | \name{summary.mcpfit} 5 | \alias{summary.mcpfit} 6 | \alias{summary} 7 | \alias{fixef} 8 | \alias{fixef.mcpfit} 9 | \alias{fixed.effects} 10 | \alias{ranef} 11 | \alias{ranef.mcpfit} 12 | \alias{random.effects} 13 | \alias{print.mcpfit} 14 | \alias{print} 15 | \title{Summarise mcpfit objects} 16 | \usage{ 17 | \method{summary}{mcpfit}(object, width = 0.95, digits = 2, prior = FALSE, ...) 18 | 19 | fixef(object, width = 0.95, prior = FALSE, ...) 20 | 21 | ranef(object, width = 0.95, prior = FALSE, ...) 22 | 23 | \method{print}{mcpfit}(x, ...) 24 | } 25 | \arguments{ 26 | \item{object}{An \code{\link{mcpfit}} object.} 27 | 28 | \item{width}{Float. The width of the highest posterior density interval 29 | (between 0 and 1).} 30 | 31 | \item{digits}{a non-null value for digits specifies the minimum number of 32 | significant digits to be printed in values. The default, NULL, uses 33 | getOption("digits"). (For the interpretation for complex numbers see signif.) 34 | Non-integer values will be rounded down, and only values greater than or 35 | equal to 1 and no greater than 22 are accepted.} 36 | 37 | \item{prior}{TRUE/FALSE. Summarise prior instead of posterior?} 38 | 39 | \item{...}{Currently ignored} 40 | 41 | \item{x}{An \code{\link{mcpfit}} object.} 42 | } 43 | \value{ 44 | A data frame with parameter estimates and MCMC diagnostics. 45 | Note: The change point distributions are often not unimodal and symmetric, so 46 | the intervals can be deceiving. Plot them using \code{plot_pars(fit)}.
47 | \itemize{ 48 | \item \code{mean} is the posterior mean 49 | \item \code{lower} is the lower quantile of the highest-density interval (HDI) given in \code{width}. 50 | \item \code{upper} is the upper quantile. 51 | \item \code{Rhat} is the Gelman-Rubin convergence diagnostic which is often taken to 52 | be acceptable if < 1.1. It is computed using \code{\link[coda]{gelman.diag}}. 53 | \item \code{n.eff} is the effective sample size computed using \code{\link[coda]{effectiveSize}}. 54 | Low effective sample sizes are also obvious as poor mixing in trace plots 55 | (see \code{plot_pars(fit)}). Read how to deal with such problems \href{https://lindeloev.github.io/mcp/articles/tips.html}{here} 56 | \item \code{ts_err} is the time-series error, taking autoregressive correlation 57 | into account. It is computed using \code{\link[coda]{spectrum0.ar}}. 58 | } 59 | 60 | For simulated data, the summary contains two additional columns so that it 61 | is easy to inspect whether the model can recover the parameters. Run 62 | simulation and summary multiple times to get a sense of the robustness. 63 | \itemize{ 64 | \item \code{sim} is the value used to generate the data. 65 | \item \code{match} is \code{"OK"} if \code{sim} is contained in the HDI interval (\code{lower} to 66 | \code{upper}). 67 | } 68 | } 69 | \description{ 70 | Summarise parameter estimates and model diagnostics. 71 | } 72 | \section{Functions}{ 73 | \itemize{ 74 | \item \code{fixef()}: Get population-level ("fixed") effects of an \code{\link{mcpfit}} object. 75 | 76 | \item \code{ranef()}: Get varying ("random") effects of an \code{\link{mcpfit}} object. 77 | 78 | \item \code{print(mcpfit)}: Print the posterior summary of an \code{\link{mcpfit}} object. 
79 | 80 | }} 81 | \examples{ 82 | # Typical usage 83 | summary(demo_fit) 84 | summary(demo_fit, width = 0.8, digits = 4) # Set HDI width 85 | 86 | # Get the results as a data frame 87 | results = summary(demo_fit) 88 | 89 | # Varying (random) effects 90 | # ranef(my_fit) 91 | 92 | # Summarise prior 93 | summary(demo_fit, prior = TRUE) 94 | 95 | } 96 | \author{ 97 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 98 | } 99 | -------------------------------------------------------------------------------- /man/tidy_samples.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \encoding{UTF-8} 4 | \name{tidy_samples} 5 | \alias{tidy_samples} 6 | \alias{tidy_samples.mcpfit} 7 | \title{Get tidy samples with or without varying effects} 8 | \usage{ 9 | tidy_samples( 10 | fit, 11 | population = TRUE, 12 | varying = TRUE, 13 | absolute = FALSE, 14 | prior = FALSE, 15 | nsamples = NULL 16 | ) 17 | } 18 | \arguments{ 19 | \item{fit}{An \code{\link{mcpfit}} object} 20 | 21 | \item{population}{\itemize{ 22 | \item \code{TRUE} All population effects. Same as \code{fit$pars$population}. 23 | \itemize{ 24 | \item \code{FALSE} No population effects. Same as \code{c()}. 25 | \item Character vector: Only include specified population parameters - see \code{fit$pars$population}. 26 | } 27 | }} 28 | 29 | \item{varying}{One of: 30 | \itemize{ 31 | \item \code{TRUE} All varying effects (\code{fit$pars$varying}). 32 | \item \code{FALSE} No varying effects (\code{c()}). 33 | \item Character vector: Only include specified varying parameters - see \code{fit$pars$varying}. 34 | }} 35 | 36 | \item{absolute}{\itemize{ 37 | \item \code{TRUE} Returns the absolute location of all varying change points. 38 | \itemize{ 39 | \item \code{FALSE} Just returns the varying effects. 
40 | \item Character vector: Only do absolute transform for these varying parameters - see \code{fit$pars$varying}. 41 | } 42 | 43 | OBS: This currently only applies to varying change points. It is not implemented for \code{rel()} regressors yet. 44 | }} 45 | 46 | \item{prior}{TRUE/FALSE. Summarise prior instead of posterior?} 47 | 48 | \item{nsamples}{Integer or \code{NULL}. Number of samples to return/summarise. 49 | If there are varying effects, this is the number of samples from each varying group. 50 | \code{NULL} means "all". Ignored if both are \code{FALSE}. More samples trade speed for accuracy.} 51 | } 52 | \value{ 53 | \code{tibble} of posterior draws in \code{tidybayes} format. 54 | } 55 | \description{ 56 | Returns in a format useful for \code{fit$simulate()} with population parameters in wide format 57 | and varying effects in long format (the number of rows will be \code{nsamples * n_levels_in_varying}). 58 | } 59 | \author{ 60 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 61 | } 62 | \keyword{internal} 63 | -------------------------------------------------------------------------------- /man/tidy_to_matrix.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \encoding{UTF-8} 4 | \name{tidy_to_matrix} 5 | \alias{tidy_to_matrix} 6 | \title{Convert from tidy to matrix} 7 | \usage{ 8 | tidy_to_matrix(samples, returnvar) 9 | } 10 | \arguments{ 11 | \item{samples}{Samples in tidy format} 12 | 13 | \item{returnvar}{An \code{rlang::sym()} object.} 14 | } 15 | \value{ 16 | An \code{N_draws} X \code{nrows(newdata)} matrix. 17 | } 18 | \description{ 19 | Converts from the output of \code{tidy_samples()} or \code{pp_eval(fit, samples_format = "tidy")} 20 | to an \code{N_draws} X \code{nrows(newdata)} matrix with fitted/predicted values. 
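The reshaping can be sketched in base R as follows. This is a toy illustration, not the function's actual code: the data frame is made up, and the column names .draw, data_row, and predict are assumed to match the tidy output described for predict().

```r
# Toy tidy draws: 3 posterior draws evaluated at 2 data rows (made-up values).
tidy = data.frame(
  .draw = rep(1:3, each = 2),
  data_row = rep(1:2, times = 3),
  predict = c(1.1, 2.1, 1.2, 2.2, 1.3, 2.3)
)

# Sort by draw, then by data row, so values fill the matrix row by row.
ordered_values = tidy$predict[order(tidy$.draw, tidy$data_row)]

# One row per draw, one column per data row.
mat = matrix(ordered_values, nrow = length(unique(tidy$.draw)), byrow = TRUE)

print(dim(mat))
```
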
This format is 21 | used by \code{brms} and it's useful as \code{yrep} in \verb{bayesplot::ppc_*} functions. 22 | } 23 | \author{ 24 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 25 | } 26 | \keyword{internal} 27 | -------------------------------------------------------------------------------- /man/to_formula.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/misc.R 3 | \encoding{UTF-8} 4 | \name{to_formula} 5 | \alias{to_formula} 6 | \title{Takes any formula-like input and returns a formula} 7 | \usage{ 8 | to_formula(form) 9 | } 10 | \arguments{ 11 | \item{form}{Formula or character (with or without initial tilde/"~")} 12 | } 13 | \value{ 14 | A formula 15 | } 16 | \description{ 17 | Takes any formula-like input and returns a formula 18 | } 19 | \author{ 20 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 21 | } 22 | \keyword{internal} 23 | -------------------------------------------------------------------------------- /man/unpack_arma.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{unpack_arma} 5 | \alias{unpack_arma} 6 | \title{Unpack arma order and formula} 7 | \usage{ 8 | unpack_arma(form_str_in) 9 | } 10 | \arguments{ 11 | \item{form_str_in}{A character. These are allowed: "ar(number)" or "ar(number, formula)"} 12 | } 13 | \value{ 14 | A list with $order and $form_str (e.g., "ar(formula)").
The formula is ar(1) or ma(1) if no formula is given. 15 | } 16 | \description{ 17 | Unpack arma order and formula 18 | } 19 | \author{ 20 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 21 | } 22 | \keyword{internal} 23 | -------------------------------------------------------------------------------- /man/unpack_cp.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{unpack_cp} 5 | \alias{unpack_cp} 6 | \title{Takes a cp formula (as a string) and returns its properties} 7 | \usage{ 8 | unpack_cp(form_cp, i) 9 | } 10 | \arguments{ 11 | \item{form_cp}{Segment formula as string.} 12 | 13 | \item{i}{The segment number (integer)} 14 | } 15 | \value{ 16 | A one-row tibble with columns: 17 | \itemize{ 18 | \item \code{cp_int}: bool. Whether there is an intercept change in the change point. 19 | \item \code{cp_in_rel}: bool. Is this intercept change relative? 20 | \item \code{cp_ran_int}: bool or NA. Is there a random intercept on the change point? 21 | \item \code{cp_group_col}: char or NA. Which column in data defines the random intercept?
22 | } 23 | } 24 | \description{ 25 | Takes a cp formula (as a string) and returns its properties 26 | } 27 | \author{ 28 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 29 | } 30 | \keyword{internal} 31 | -------------------------------------------------------------------------------- /man/unpack_int.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{unpack_int} 5 | \alias{unpack_int} 6 | \title{Get the intercept of a formula} 7 | \usage{ 8 | unpack_int(form, i, ttype) 9 | } 10 | \arguments{ 11 | \item{form}{A formula} 12 | 13 | \item{i}{Segment number (integer)} 14 | 15 | \item{ttype}{The term type. One of "ct" (central tendency), "sigma" (variance), 16 | or "ar" (autoregressive)} 17 | } 18 | \value{ 19 | A one-row tibble describing the intercept. 20 | } 21 | \description{ 22 | Get the intercept of a formula 23 | } 24 | \author{ 25 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 26 | } 27 | \keyword{internal} 28 | -------------------------------------------------------------------------------- /man/unpack_rhs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{unpack_rhs} 5 | \alias{unpack_rhs} 6 | \title{Unpack right-hand side} 7 | \usage{ 8 | unpack_rhs(form_rhs, i, family, data, last_segment) 9 | } 10 | \arguments{ 11 | \item{form_rhs}{A character representation of a formula} 12 | 13 | \item{i}{The segment number (integer)} 14 | 15 | \item{family}{An mcpfamily object returned by \code{mcpfamily()}.} 16 | 17 | \item{data}{A data.frame or tibble} 18 | 19 | \item{last_segment}{The last row in the segment table, made in \code{get_segment_table()}} 20 | } 21 | \value{ 22 | A one-row tibble with three columns for each 
of \code{ct}, \code{sigma}, \code{ar}, and \code{ma}: 23 | \itemize{ 24 | \item \verb{_int}: NA or a one-row tibble describing the intercept. 25 | \item \verb{_slope}: NA or a tibble with a row for each slope term. 26 | \item \verb{_code}: NA or a char with the JAGS/R code to implement the slope. 27 | } 28 | } 29 | \description{ 30 | This is a pretty big function. It includes unpacking sigma(), ar(), etc. 31 | } 32 | \author{ 33 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 34 | } 35 | \keyword{internal} 36 | -------------------------------------------------------------------------------- /man/unpack_slope.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{unpack_slope} 5 | \alias{unpack_slope} 6 | \title{Unpack the slope of a formula} 7 | \usage{ 8 | unpack_slope(form, i, ttype, last_slope) 9 | } 10 | \arguments{ 11 | \item{form}{A formula} 12 | 13 | \item{i}{Segment number (integer)} 14 | 15 | \item{ttype}{The term type. One of "ct" (central tendency), "sigma" (variance), 16 | or "ar" (autoregressive)} 17 | 18 | \item{last_slope}{The element in the slope column for this ttype in the previous 19 | segment. I.e., probably what this function returned last time it was called!} 20 | } 21 | \value{ 22 | A "one-row" list with code (char) and a tibble of slopes. 23 | } 24 | \description{ 25 | Makes a list of terms and applies unpack_slope_term() to each of them.
Then builds the code for this segment's slope 26 | } 27 | \author{ 28 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 29 | } 30 | \keyword{internal} 31 | -------------------------------------------------------------------------------- /man/unpack_slope_term.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{unpack_slope_term} 5 | \alias{unpack_slope_term} 6 | \title{Unpacks a single term} 7 | \usage{ 8 | unpack_slope_term(term, i, last_slope, ttype = "") 9 | } 10 | \arguments{ 11 | \item{term}{A character, e.g., "x", "I(x^2)", or "log(x)".} 12 | 13 | \item{i}{Segment number (integer)} 14 | 15 | \item{last_slope}{The element in the slope column for this ttype in the previous 16 | segment. I.e., probably what this function returned last time it was called!} 17 | 18 | \item{ttype}{The term type. One of "ct" (central tendency), "sigma" (variance), 19 | or "ar" (autoregressive)} 20 | } 21 | \value{ 22 | A "one-row" list describing a slope term. 23 | } 24 | \description{ 25 | Returns a row for \code{unpack_slope()}. 
26 | } 27 | \author{ 28 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 29 | } 30 | \keyword{internal} 31 | -------------------------------------------------------------------------------- /man/unpack_tildes.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{unpack_tildes} 5 | \alias{unpack_tildes} 6 | \title{Takes a formula and returns a string representation of y, cp, and rhs} 7 | \usage{ 8 | unpack_tildes(segment, i) 9 | } 10 | \arguments{ 11 | \item{segment}{A formula} 12 | 13 | \item{i}{The segment number (integer)} 14 | } 15 | \value{ 16 | A one-row tibble with columns: 17 | \itemize{ 18 | \item \code{form}: String. The full formula for this segment. 19 | \item \code{form_y}: String. The expression for y (without tilde) 20 | \item \code{form_cp}: String. The formula for the change point 21 | \item \code{form_rhs}: String. The formula for the right-hand side 22 | } 23 | } 24 | \description{ 25 | Takes a formula and returns a string representation of y, cp, and rhs 26 | } 27 | \author{ 28 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 29 | } 30 | \keyword{internal} 31 | -------------------------------------------------------------------------------- /man/unpack_varying.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \name{unpack_varying} 4 | \alias{unpack_varying} 5 | \alias{unpack_varying.mcpfit} 6 | \title{Get relevant info about varying parameters} 7 | \usage{ 8 | unpack_varying(fit, pars = NULL, cols = NULL) 9 | } 10 | \arguments{ 11 | \item{pars}{\code{NULL}/\code{FALSE} for nothing. \code{TRUE} for all. A vector of varying parameter names for specifics.} 12 | 13 | \item{cols}{\code{NULL}/\code{FALSE} for nothing. 
\code{TRUE} for all. A vector of varying column names for specifics. Usually provided via "facet_by" argument in other functions.} 14 | } 15 | \value{ 16 | A list. See details. 17 | } 18 | \description{ 19 | Returns parameters, data columns, and implicated segments given parameter name(s) or column(s). 20 | } 21 | \details{ 22 | Returns a list with 23 | } 24 | \section{Slots}{ 25 | 26 | \describe{ 27 | \item{\code{pars}}{Character vector of parameter names. \code{NULL} if empty.} 28 | 29 | \item{\code{cols}}{Character vector of data column names. \code{NULL} if empty.} 30 | 31 | \item{\code{indices}}{Logical vector of segments in the segment table that contains the varying effect} 32 | }} 33 | 34 | \keyword{internal} 35 | -------------------------------------------------------------------------------- /man/unpack_varying_term.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{unpack_varying_term} 5 | \alias{unpack_varying_term} 6 | \title{Unpack varying effects} 7 | \usage{ 8 | unpack_varying_term(term, i) 9 | } 10 | \arguments{ 11 | \item{term}{A character, e.g., "x", "I(x^2)", or "log(x)".} 12 | 13 | \item{i}{Segment number (integer)} 14 | } 15 | \value{ 16 | A "one-row" list describing a varying intercept. 
17 | } 18 | \description{ 19 | Unpack varying effects 20 | } 21 | \author{ 22 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 23 | } 24 | \keyword{internal} 25 | -------------------------------------------------------------------------------- /man/unpack_y.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get_segment_table.R 3 | \encoding{UTF-8} 4 | \name{unpack_y} 5 | \alias{unpack_y} 6 | \title{Unpacks y variable name} 7 | \usage{ 8 | unpack_y(form_y, i, family) 9 | } 10 | \arguments{ 11 | \item{form_y}{Character representation of formula} 12 | 13 | \item{i}{The segment number (integer)} 14 | 15 | \item{family}{An mcpfamily object returned by \code{mcpfamily()}.} 16 | } 17 | \value{ 18 | A one-row tibble with the columns 19 | \itemize{ 20 | \item \code{y}: string. The y variable name. 21 | \item \code{trials}: string. The trials variable name. 22 | \item \code{weights}: string. The weights variable name. 23 | } 24 | } 25 | \description{ 26 | Unpacks y variable name 27 | } 28 | \author{ 29 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 30 | } 31 | \keyword{internal} 32 | -------------------------------------------------------------------------------- /man/with_loo.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mcpfit_methods.R 3 | \encoding{UTF-8} 4 | \name{with_loo} 5 | \alias{with_loo} 6 | \title{Add loo if not already present} 7 | \usage{ 8 | with_loo(fit, save_psis = FALSE, info = NULL) 9 | } 10 | \arguments{ 11 | \item{fit}{An mcpfit object} 12 | 13 | \item{save_psis}{Logical. See documentation of loo::loo} 14 | 15 | \item{info}{Optional message if adding loo} 16 | } 17 | \value{ 18 | An mcpfit object with loo. 
19 | } 20 | \description{ 21 | Add loo if not already present 22 | } 23 | \author{ 24 | Jonas Kristoffer Lindeløv \email{jonas@lindeloev.dk} 25 | } 26 | \keyword{internal} 27 | -------------------------------------------------------------------------------- /mcp.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: No 4 | SaveWorkspace: No 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Package 19 | PackageUseDevtools: Yes 20 | PackageInstallArgs: --no-multiarch --with-keep.source 21 | PackageRoxygenize: rd,collate,namespace 22 | -------------------------------------------------------------------------------- /pkgdown/_pkgdown.yml: -------------------------------------------------------------------------------- 1 | template: 2 | params: 3 | bootswatch: flatly # Theme 4 | ganalytics: UA-1026978-3 # Google Analytics 5 | docsearch: 6 | api_key: 702ab134d40a7310606c29f341fc5014 7 | index_name: lindeloev_mcp 8 | opengraph: 9 | image: 10 | src: https://github.com/lindeloev/mcp/raw/docs/man/figures/logo.png # For Twitter card 11 | twitter: 12 | creator: "@jonaslindeloev" 13 | card: summary 14 | 15 | 16 | # Writes sitemap etc. 17 | url: https://lindeloev.github.io/mcp 18 | 19 | authors: 20 | Jonas Kristoffer Lindeløv: 21 | href: http://lindeloev.net 22 | 23 | # Structure of the reference page 24 | reference: 25 | - title: Using mcp 26 | desc: Functions for everyday use of mcp. 
27 | contents: 28 | - mcp 29 | - plot.mcpfit 30 | - plot_pars 31 | - pp_check 32 | - summary.mcpfit 33 | - ranef 34 | - fixef 35 | - fitted 36 | - predict 37 | - residuals 38 | - loo 39 | - waic 40 | - hypothesis 41 | - mcp-package 42 | 43 | - title: Auxiliary functions 44 | desc: These are used internally by mcp, but are exposed here since they may be useful for other purposes. Most other useful internal functions deliver the result already in `mcp(segments, sample = FALSE)`, so `mcp()` will be their API. 45 | contents: 46 | - sd_to_prec 47 | - logit 48 | - ilogit 49 | - probit 50 | - phi 51 | - is.mcpfit 52 | 53 | - title: Families 54 | desc: Distributional families that are not available in base R. 55 | contents: 56 | - bernoulli 57 | - negbinomial 58 | - exponential 59 | 60 | - title: Help and demos 61 | desc: These datasets were simulated with mcp. There are links to the simulation scripts in the documentation for each of them. The simulation values will also show up if you fit a model to one of these datasets and call `summary(fit)`. Analyses of most of these are demonstrated on the [front page](https://lindeloev.github.io/mcp). 62 | contents: 63 | - mcp_example 64 | - demo_fit 65 | 66 | - title: Miscellaneous 67 | desc: Stuff you would not usually consult directly. 
68 | contents: 69 | - mcpfit-class 70 | - print.mcplist 71 | - print.mcptext 72 | 73 | navbar: 74 | left: 75 | - text: General usage 76 | menu: 77 | - text: Formulas 78 | href: articles/formulas.html 79 | - text: Priors 80 | href: articles/priors.html 81 | - text: Hypotheses and model comparison 82 | href: articles/comparison.html 83 | - text: Random/varying effects 84 | href: articles/varying.html 85 | - text: Modeling variance 86 | href: articles/variance.html 87 | - text: Time series and autocorrelation 88 | href: articles/arma.html 89 | - text: Fits and predictions 90 | href: articles/predict.html 91 | - text: GLM 92 | menu: 93 | - text: Supported families and link functions 94 | href: articles/families.html 95 | - text: Poisson 96 | href: articles/poisson.html 97 | - text: Binomial and Bernoulli 98 | href: articles/binomial.html 99 | - text: Other 100 | menu: 101 | - text: Comparison to other packages 102 | href: articles/packages.html 103 | - text: Tips, tricks, and debugging 104 | href: articles/tips.html 105 | - text: News 106 | href: news/index.html 107 | - icon: fa-twitter 108 | href: https://twitter.com/jonaslindeloev 109 | - icon: fa-github 110 | href: https://github.com/lindeloev/mcp 111 | -------------------------------------------------------------------------------- /pkgdown/extra.css: -------------------------------------------------------------------------------- 1 | h1 { 2 | font-size: 28px 3 | } 4 | h2 { 5 | font-size: 22px; 6 | } 7 | 8 | li { 9 | margin-bottom: 5px; 10 | } -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-120x120.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/pkgdown/favicon/apple-touch-icon-120x120.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-152x152.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/pkgdown/favicon/apple-touch-icon-152x152.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-180x180.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/pkgdown/favicon/apple-touch-icon-180x180.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-60x60.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/pkgdown/favicon/apple-touch-icon-60x60.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-76x76.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/pkgdown/favicon/apple-touch-icon-76x76.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/pkgdown/favicon/apple-touch-icon.png -------------------------------------------------------------------------------- /pkgdown/favicon/favicon-16x16.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/pkgdown/favicon/favicon-16x16.png -------------------------------------------------------------------------------- 
/pkgdown/favicon/favicon-32x32.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/pkgdown/favicon/favicon-32x32.png -------------------------------------------------------------------------------- /pkgdown/favicon/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/pkgdown/favicon/favicon.ico -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(mcp) 3 | 4 | test_check("mcp") 5 | -------------------------------------------------------------------------------- /tests/testthat/helper-fits.R: -------------------------------------------------------------------------------- 1 | `%>%` = magrittr::`%>%` # instead of importing dplyr 2 | 3 | #' Test a list of segments and simulation values 4 | #' 5 | #' * Simulates data from model and values 6 | #' * Fits model to data 7 | #' * Checks that values are recovered 8 | #' 9 | #' @aliases test_fit 10 | #' @keywords internal 11 | #' @param model A list of (unnamed) formulas 12 | #' @param simulated Parameter values to be used for simulation. 13 | test_fit = function(model, simulated) { 14 | testthat::skip_if(is.null(getOption("test_mcp_fits")), 15 | "This time-consuming test is only run locally before release. 
Set options(test_mcp_fits = TRUE) to run.") 16 | 17 | # Simulate 18 | empty = mcp(model, sample = FALSE, par_x = "x") 19 | x = seq(1, 200, length.out = 400) 20 | data = data.frame( 21 | x = x, # Needs to be reasonably high to get a correct estimate 22 | y = do.call(empty$simulate, c(list(x = x), simulated)) 23 | ) 24 | 25 | # Fit 26 | options(mc.cores = NULL) # Respect `cores` 27 | quiet_out = purrr::quietly(mcp)(model, data, par_x = "x", chains = 5, cores = 5, adapt = 10000, iter = 3000) # Ensure convergence 28 | fit <<- quiet_out$result 29 | 30 | # Results table 31 | results_table = purrr::quietly(fixef)(fit, width = 0.98)$result 32 | recovered = all(results_table$match == "OK") # Parameter recovery 33 | effective = all(results_table$n.eff > 100) # Effective samples 34 | 35 | # Show table if the tests failed. Cannot be after tests for some reason... 36 | if (recovered == FALSE | effective == FALSE) 37 | print(results_table) 38 | 39 | # Tests 40 | testthat::expect_true(recovered, model) 41 | testthat::expect_true(effective, model) 42 | } 43 | 44 | 45 | #' Apply `test_fit` to each element of this list 46 | #' 47 | #' @aliases apply_test_fit 48 | #' @keywords internal 49 | #' @param all_models A list of lists. Each sub-list is an unnamed list of 50 | #' formulas with one named entry called "simulated" with parameter values to 51 | #' be used for simulation. 52 | apply_test_fit = function(desc, all_models) { 53 | for (this in all_models) { 54 | # Split into formulas and simulation values 55 | simulated = this[names(this) == "simulated"][[1]] 56 | model = this[names(this) == ""] 57 | 58 | # Test! 59 | testthat::test_that(desc, test_fit(model, simulated)) 60 | } 61 | } 62 | -------------------------------------------------------------------------------- /tests/testthat/helper-runs-data.R: -------------------------------------------------------------------------------- 1 | # Samples and checks data structure. 
2 | # Meant to be used with testthat::expect_true() 3 | data_gauss = data.frame( 4 | # y should be continuous 5 | y = 1:5, 6 | ok_y = rnorm(5), # test underscore and decimals 7 | bad_y_char = c("a", "b", "c", "d", "e"), 8 | bad_y_factor = factor(1:5), 9 | 10 | # x should be continuous 11 | x = -1:3, 12 | ok_x = rnorm(5), # test underscore and decimals 13 | bad_x_char = c("a", "b", "c", "d", "e"), 14 | bad_x_factor = factor(1:5), 15 | 16 | # varying effects should be categorical-ish 17 | id = c("a", "b", "c", "d", "e"), 18 | ok_id_factor = factor(c(-3, 0, 5, 9, 1.233243)), # It's a factor, so decimals are OK 19 | ok_id_integer = -2:2, # interval 20 | bad_id = rnorm(5), # decimal numbers 21 | 22 | weights_ok = c(0.1, 1, 2, 1, 1), 23 | weights_bad = c(-0.1, 1, 2, 1, 1) # With negative 24 | ) 25 | 26 | # Only needs to test binomial-specific stuff 27 | data_binomial = data.frame( 28 | # y should be a non-negative integer (0 <= y <= N) 29 | y = c(1, 0, 100, 3, 5), 30 | y_bad_numeric = c(-1, 5.1, 10, 3, 5), # negative, decimal 31 | 32 | y_bern = c(0, 1, 0, 1, 1), 33 | 34 | # trials should be a natural number with N >= y 35 | N = c(1, 1, 100, 6, 10), 36 | N_bad_numeric = c(-1, 1.1, 99, 6, 10), # negative, decimal, smaller than y 37 | N_bad_factor = factor(c(1, 0, 50, 6, 10)), 38 | N_bad_char = c("1", "1", "100", "6", "10"), 39 | 40 | # x 41 | x = -1:3, 42 | 43 | # Varying effects 44 | id = c("a", "b", "c", "d", "e"), 45 | 46 | weights_ok = c(0.1, 1, 2, 1, 1) # Actually not OK since it's not implemented yet 47 | ) 48 | -------------------------------------------------------------------------------- /tests/testthat/test-fits-arma.R: -------------------------------------------------------------------------------- 1 | models_arma = list( 2 | # Simple AR 3 | list(y ~ 1 + ar(1), 4 | simulated = list( 5 | int_1 = 30, 6 | ar1_1 = 0.7, 7 | sigma_1 = 10 8 | )), 9 | 10 | # Larger AR 11 | list(y ~ 1 + ar(2), 12 | ~ 0 + x + ar(1, 1 + x), 13 | ~ 0, 14 | simulated = list( 15 | cp_1 = 80, 16 | cp_2 = 
140, 17 | int_1 = -20, 18 | sigma_1 = 5, 19 | ar1_1 = 0.7, 20 | ar2_1 = -0.4, 21 | x_2 = 0.5, 22 | ar1_2 = 0.5, 23 | ar1_x_2 = -0.02 24 | )) 25 | ) 26 | 27 | apply_test_fit("ARMA (gauss) fit", models_arma) 28 | -------------------------------------------------------------------------------- /tests/testthat/test-fits-gauss.R: -------------------------------------------------------------------------------- 1 | models_gauss = list( 2 | # Simple 3 | list(y ~ 1, 4 | ~ 1, 5 | simulated = list( 6 | int_1 = 10, 7 | int_2 = 20, 8 | sigma_1 = 5, 9 | cp_1 = 100)), 10 | 11 | # A lot of terms 12 | list(y ~ 1 + x, 13 | ~ 0 + x, 14 | rel(1) ~ 1 + sigma(1), 15 | simulated = list( 16 | cp_1 = 70, 17 | cp_2 = 70, 18 | int_1 = 10, 19 | int_3 = 0, 20 | x_1 = 0.5, 21 | x_2 = -1, 22 | sigma_1 = 3, 23 | sigma_3 = 6)) 24 | ) 25 | 26 | apply_test_fit("Gaussian fit", models_gauss) 27 | -------------------------------------------------------------------------------- /tests/testthat/test-fits-sigma.R: -------------------------------------------------------------------------------- 1 | models_sigma = list( 2 | # Simple sigma 3 | list(y ~ 1 + sigma(1 + x), 4 | simulated = list( 5 | int_1 = 30, 6 | sigma_1 = 5, 7 | sigma_x_1 = 0.1 8 | )), 9 | 10 | # Larger sigma 11 | list(y ~ 1 + sigma(1), 12 | ~ 0 + x + sigma(1 + x), 13 | ~ 0, 14 | simulated = list( 15 | cp_1 = 80, 16 | cp_2 = 140, 17 | int_1 = -20, 18 | sigma_1 = 2, 19 | x_2 = 0.5, 20 | sigma_2 = 5, 21 | sigma_x_2 = 0.2 22 | )) 23 | ) 24 | 25 | apply_test_fit("Sigma (gauss) fit", models_sigma) 26 | -------------------------------------------------------------------------------- /tests/testthat/test-runs-bernoulli-binomial.R: -------------------------------------------------------------------------------- 1 | ################# 2 | # TEST BINOMIAL # 3 | ################# 4 | 5 | bad_binomial = list( 6 | # Misspecification of y and trials 7 | list(y ~ 1), # no trials 8 | list(y | N ~ 1), # wrong format 9 | list(trials(N) | y ~ 1), # Wrong order 
10 | list(y | trials() ~ 1), # trials missing 11 | list(trials(N) ~ 1), # no y 12 | list(y | trials(N) ~ 1 + x, 13 | y | N ~ 1 ~ 1), # misspecification in later segment 14 | 15 | # Bad data 16 | list(y_bad_numeric | trials(N) ~ 1), 17 | list(y | trials(N_bad_numeric) ~ 1), 18 | list(y | trials(N_bad_factor) ~ 1), 19 | list(y | trials(N_bad_char) ~ 1), 20 | 21 | # Does not work with sigma 22 | list(y | trials(N) ~ 1 + sigma(1)), 23 | 24 | # Weights not implemented yet 25 | list(y | trials(N) + weights(weights_ok) ~ 1) 26 | ) 27 | 28 | test_bad(bad_binomial, 29 | data = data_binomial, 30 | family = binomial()) 31 | 32 | 33 | good_binomial_essential = list( 34 | list(y | trials(N) ~ 1, # With varying 35 | 1 + (1|id) ~ 1), 36 | list(y | trials(N) ~ 1 + ar(1)) # Simple AR(1) 37 | #list(y | trials(N) ~ 1, 38 | # 1 ~ N) # N can be both trials and slope. TO DO: Fails in this test because par_x = "x" 39 | ) 40 | good_binomial_extensive = list( 41 | list(y | trials(N) ~ 1), # one segment 42 | list(y | trials(N) ~ 1 + x, # specified multiple times and with rel() 43 | y | trials(N) ~ 1 ~ rel(1) + rel(x), 44 | rel(1) ~ 0) 45 | ) 46 | 47 | test_good(good_binomial_essential, 48 | good_binomial_extensive, 49 | data = data_binomial, 50 | family = binomial()) 51 | 52 | 53 | 54 | 55 | ################## 56 | # TEST BERNOULLI # 57 | ################## 58 | # This is rather short since most is tested via binomial 59 | bad_bernoulli = list( 60 | # Misspecification of y and trials 61 | list(y_bern | trials(N) ~ 1), # trials 62 | list(y_bern ~ 1 + x, 63 | y_bern | trials(N) ~ 1 ~ 1), # misspecification in later segment 64 | 65 | # Bad data 66 | list(y_bad_numeric ~ 1), 67 | list(y ~ 1), # binomial response 68 | 69 | # Does not work with sigma 70 | list(y_bern ~ 1 + sigma(1)), 71 | 72 | # Weights not implemented yet 73 | list(y | trials(N) + weights(weights_ok) ~ 1) 74 | ) 75 | 76 | test_bad(bad_bernoulli, 77 | data = data_binomial, 78 | family = bernoulli()) 79 | 80 | 81 | 
good_bernoulli_essential = list( 82 | list(y_bern ~ 1, # With varying 83 | 1 + (1|id) ~ 1) 84 | ) 85 | good_bernoulli_extensive = list( 86 | list(y_bern ~ 1), # one segment 87 | list(y_bern ~ 1 + x, # specified multiple times and with rel() 88 | y_bern ~ 1 ~ rel(1) + rel(x), 89 | rel(1) ~ 0) 90 | ) 91 | 92 | test_good(good_bernoulli_essential, 93 | good_bernoulli_extensive, 94 | data = data_binomial, 95 | family = bernoulli()) 96 | 97 | -------------------------------------------------------------------------------- /tests/testthat/test-runs-formulas-gauss.R: -------------------------------------------------------------------------------- 1 | ################# 2 | # TEST RESPONSE # 3 | ################# 4 | bad_y = list( 5 | list( ~ 1), # No y 6 | list((1|id) ~ 1), # y cannot be varying 7 | list(1 ~ 1), # 1 is not y 8 | list(y ~ 1, # Two y 9 | a ~ 1 ~ 1), 10 | list(y ~ 1, # Intercept y 11 | 1 ~ 1 ~ 1), 12 | list(bad_y_char ~ 1), # Character y 13 | list(bad_y_factor ~ 1) # Factor y 14 | ) 15 | 16 | test_bad(bad_y) 17 | 18 | 19 | good_y_essential = list( 20 | list(y ~ 1, # Explicit and implicit y and cp 21 | y ~ 1 ~ 1, 22 | rel(1) + (1|id) ~ rel(1) + x, 23 | ~ 1) 24 | ) 25 | good_y_extensive = list( 26 | list(y ~ 1), # Regular 27 | list(ok_y ~ 1) # decimal y 28 | ) 29 | 30 | test_good(good_y_essential, good_y_extensive) 31 | 32 | 33 | 34 | ################ 35 | # TEST WEIGHTS # 36 | ################ 37 | bad_weights = list( 38 | list(y + weights(weights_ok) ~ 1), # weights added 39 | list(weights(y) ~ 1), # just wrong :-) 40 | list(y | weights_ok ~ 1), # Has to be weights(weights_ok) 41 | list(y | weights(weights_bad) ~ 1), # Bad weights 42 | list(y | weights(weights_ok) ~ 1, 43 | y | weights(weights_bad) ~ 1 ~ 1) # Different weights 44 | ) 45 | 46 | test_bad(bad_weights) 47 | 48 | good_weights = list( 49 | list(y | weights(weights_ok) ~ 1), # Regular 50 | list(y | weights(weights_ok) ~ 1, 51 | ~ 1 + x + I(x^2), 52 | 1 + (1|id) ~ rel(1)) # With multiple segments and 
functions and varying 53 | ) 54 | 55 | test_good(good_weights) 56 | 57 | 58 | ################### 59 | # TEST INTERCEPTS # 60 | ################### 61 | bad_intercepts = list( 62 | list(y ~ rel(0)), # rel(0) not supported 63 | list(y ~ rel(1)), # Nothing to be relative to here 64 | list(y ~ 2), # 2 not supported 65 | list(y ~ 1, 66 | ~ rel(0)) # rel(0) not supported 67 | ) 68 | 69 | test_bad(bad_intercepts) 70 | 71 | 72 | good_intercepts = list( 73 | #list(y ~ 0), # would be nice if it worked, but mcmc.list does not behave well with just one variable 74 | list(ok_y ~ 1), # y can be called whatever 75 | list(y ~ 0, # Multiple segments 76 | ~ 1, 77 | ~ 0, 78 | ~ 1), 79 | list(y ~ 1, # Chained relative intercepts 80 | ~ rel(1), 81 | ~ rel(1)) 82 | ) 83 | 84 | test_good(good_intercepts) 85 | 86 | 87 | ############### 88 | # TEST SLOPES # 89 | ############### 90 | bad_slopes = list( 91 | list(y ~ rel(x)), # Nothing to be relative to 92 | list(y ~ x + y), # Two slopes 93 | list(y ~ x, # Two slopes 94 | ~ y), 95 | list(y ~ 1, # Relative slope after no slope 96 | ~ rel(x)), 97 | list(y ~ bad_x_char), # not numeric x 98 | list(y ~ bad_x_factor), # not numeric x 99 | list(y ~ 1, 100 | ~ log(x)), # should fail explicitly because negative x 101 | list(y ~ 1, 102 | ~ sqrt(x)) # should fail explicitly because negative x 103 | ) 104 | 105 | test_bad(bad_slopes) 106 | 107 | 108 | 109 | good_slopes_essential = list( 110 | list(y ~ 0 + x, # Multiple on/off 111 | ~ 0, 112 | ~ 1 + x), 113 | list(y ~ 0 + x + I(x^2) + I(x^3), # Test "non-linear" x 114 | ~ 0 + exp(x) + abs(x), 115 | ~ 0 + sin(x) + cos(x) + tan(x)) 116 | ) 117 | good_slopes_extensive = list( 118 | list(y ~ 0 + x), # Regular 119 | list(y ~ x, # Chained relative slopes 120 | ~ 0 + rel(x), 121 | ~ rel(x)), 122 | list(y ~ ok_x) # alternative x 123 | ) 124 | 125 | test_good(good_slopes_essential, good_slopes_extensive, par_x = NULL) 126 | 127 | 128 | 129 | ###################### 130 | # TEST CHANGE POINTS # 131 | 
###################### 132 | bad_cps = list( 133 | list(y ~ x, 134 | 0 ~ 1), # Needs changepoint stuff 135 | list(y ~ x, 136 | q ~ 1), # Slope not allowed for changepoint 137 | list(y ~ 1, 138 | (goat|id) ~ 1), # No varying slope allowed 139 | list(y ~ 1, 140 | y ~ ~ 1), # Needs to be explicit if y is defined 141 | list(y ~ 1, 142 | rel(1) ~ 1), # Nothing to be relative to yet 143 | list(y ~ 1, 144 | 1 + (1|bad_id) ~ 1) # decimal group 145 | ) 146 | 147 | test_bad(bad_cps) 148 | 149 | 150 | good_cps_essential = list( 151 | list(y ~ 1, # Implicit cp 152 | ~ 1, 153 | ~ 0), 154 | list(y ~ 0, # Varying 155 | 1 + (1|id) ~ 1), 156 | 157 | list(y ~ 1, 158 | 1 + (1|id) ~ 1, 159 | 1 + (1|ok_id_integer) ~ 1, # multiple groups and alternative data 160 | 1 + (1|ok_id_factor) ~ 1) # alternative group data 161 | ) 162 | good_cps_extensive = list( 163 | list(y ~ 0 + x, # Regular cp 164 | 1 ~ 1), 165 | list(y ~ 0, # Chained varying and relative cp 166 | y ~ 1 ~ 1, 167 | rel(1) + (1|id) ~ 0, 168 | rel(1) + (1|id) ~ 0, 169 | ~ x), 170 | list(y ~ 1, 171 | (1|id) ~ 0) # Intercept is implicit. I don't like it, but OK. 
172 | ) 173 | 174 | test_good(good_cps_essential, good_cps_extensive) 175 | -------------------------------------------------------------------------------- /tests/testthat/test-runs-poisson.R: -------------------------------------------------------------------------------- 1 | bad_poisson = list( 2 | # Misspecification of y and trials 3 | list(y | trials(N) ~ 1), # bad response format 4 | list(y ~ 1 + x, 5 | y | trials(N) ~ 1 ~ 1), # misspecification in later segment 6 | 7 | # Bad data 8 | list(y_bad_numeric ~ 1), 9 | 10 | # Does not work with sigma 11 | list(y ~ 1 + sigma(1)), 12 | 13 | # Does not work with weights 14 | list(y | weights(weights_ok) ~ 1) 15 | ) 16 | 17 | test_bad(bad_poisson, 18 | data = data_binomial, 19 | family = poisson()) 20 | 21 | 22 | good_poisson_essential = list( 23 | list(y ~ 1, # With varying 24 | 1 + (1|id) ~ 1), 25 | list(y ~ 1 + ar(1), 26 | ~ 1 + x + ar(2, 1 + x + I(x^3))) 27 | ) 28 | 29 | good_poisson_extensive = list( 30 | list(y ~ 1), # one segment 31 | list(y ~ 1 + x, # specified multiple times and with rel() 32 | y ~ 1 ~ rel(1) + rel(x), 33 | rel(1) ~ 0) 34 | ) 35 | 36 | test_good(good_poisson_essential, 37 | good_poisson_extensive, 38 | data = data_binomial, 39 | family = poisson()) 40 | -------------------------------------------------------------------------------- /tests/testthat/test-runs-prior.R: -------------------------------------------------------------------------------- 1 | ############### 2 | # TEST PRIORS # 3 | ############### 4 | prior_model = list( 5 | y ~ 1 + x, 6 | 1 + (1|id) ~ rel(1) + rel(x), 7 | rel(1) ~ 0 8 | ) 9 | 10 | bad_prior = list( 11 | list( 12 | cp_1 = "dirichlet(1)", # Has to be all-dirichlet 13 | cp_2 = "dnorm(3, 10)" 14 | ), 15 | list( 16 | cp_1 = "dirichlet(1)", 17 | cp_2 = "dirichlet(0)" # alpha has to be > 0 18 | ) 19 | ) 20 | 21 | for (prior in bad_prior) { 22 | test_name = paste0("Bad priors: ", paste0(prior, collapse=", ")) 23 | testthat::test_that(test_name, { 24 | 
testthat::expect_error(test_runs(prior_model, sample = FALSE, prior = prior)) 25 | }) 26 | } 27 | 28 | 29 | good_prior_essential = list( 30 | list( # Fixed values and non-default change point 31 | int_2 = "int_1", 32 | cp_1 = "dnorm(3, 10)", 33 | x_2 = "-0.5" 34 | ), 35 | 36 | list( 37 | cp_1 = "dirichlet(1)", # Dirichlet prior on change points 38 | cp_2 = "dirichlet(1)" 39 | ) 40 | ) 41 | good_prior_extensive = list( 42 | list( # Changepoint outside of the observed range is allowed 43 | cp_1 = "dunif(-100, -90)", 44 | cp_2 = "dnorm(100, 20) T(100, 110)" 45 | ), 46 | 47 | list( 48 | cp_1 = "dirichlet(3)", # Dirichlet prior on change points 49 | cp_2 = "dirichlet(2)" 50 | ) 51 | ) 52 | 53 | for (prior in good_prior_essential) { 54 | test_name = paste0("Good priors (essential): ", paste0(prior, collapse=", ")) 55 | testthat::test_that(test_name, { 56 | test_runs(prior_model, prior = prior) 57 | }) 58 | } 59 | 60 | if (is.null(getOption("test_mcp_allmodels")) == FALSE) { 61 | for (prior in good_prior_extensive) { 62 | test_name = paste0("Good priors (extensive): ", paste0(prior, collapse=", ")) 63 | testthat::test_that(test_name, { 64 | test_runs(prior_model, prior = prior) 65 | }) 66 | } 67 | } 68 | -------------------------------------------------------------------------------- /tests/testthat/test-runs-sigma-arma.R: -------------------------------------------------------------------------------- 1 | ################# 2 | # TEST VARIANCE # 3 | ################# 4 | bad_variance = list( 5 | list(y ~ 1 + sigma(rel(1))), # no sigma to be relative to 6 | list(y ~ 1, 7 | y ~ 1 + sigma(rel(x))), # no sigma slope to be relative to 8 | list(y ~ 1 + sigma(q)) # variable does not exist 9 | ) 10 | 11 | test_bad(bad_variance) 12 | 13 | 14 | good_variance_essential = list( 15 | list(y ~ 1 + sigma(x + I(x^2))), 16 | list(y ~ 1, 17 | 1 + (1|id) ~ rel(1) + I(x^2) + sigma(rel(1) + x)), # Test with varying change point and more mcp stuff 18 | list(y | weights(weights_ok) ~ 1 + 
sigma(1 + x), # With weights 19 | ~ 0 + sigma(1 + rel(x))) 20 | ) 21 | 22 | good_variance_extensive = list( 23 | list(y ~ 1 + sigma(1)), 24 | list(y ~ 1 + sigma(1 + sin(x))), 25 | list(y ~ 1, 26 | ~ 0 + sigma(rel(1)), # test relative intercept 27 | ~ x + sigma(x), 28 | ~ 0 + sigma(rel(x))) # test relative slope 29 | ) 30 | 31 | test_good(good_variance_essential, good_variance_extensive) 32 | 33 | 34 | ############# 35 | # TEST ARMA # 36 | ############# 37 | # We can assume that it will fail for the same mis-specifications on the formula 38 | # ar(order, [formula]), since the formula runs through the exact same code as 39 | # sigma and ct. 40 | bad_arma = list( 41 | list(y ~ ar(0)), # currently not implemented 42 | list(y ~ ar(-1)), # must be positive 43 | list(y ~ ar(1.5)), # Cannot be decimal 44 | list(y ~ ar(1) + ar(2)), # Only one per segment 45 | list(y ~ ar("1")), # Should not take strings 46 | list(y ~ ar(1 + x)), # must have order 47 | list(y ~ ar(x)) # must have order 48 | ) 49 | 50 | test_bad(bad_arma) 51 | 52 | 53 | good_arma_essential = list( 54 | list(y ~ ar(1), 55 | ~ ar(2, 0 + x)), # change in ar 56 | list(y ~ 1, 57 | ~ 0 + ar(2)), # onset of AR 58 | list(y ~ ar(1) + sigma(1 + x), 59 | ~ ar(2, 1 + I(x^2)) + sigma(1)) # With sigma 60 | ) 61 | 62 | good_arma_extensive = list( 63 | list(y ~ ar(1)), # simple 64 | list(y ~ ar(5)), # higher order 65 | list(y ~ ar(1, 1 + x + I(x^2) + exp(x))), # complicated regression 66 | list(y ~ ar(1), 67 | ~ ar(2, rel(1))), # Relative to no variance. Perhaps alter this behavior so it becomes illegal? 
68 | list(y ~ 1, 69 | 1 + (1|id) ~ rel(1) + I(x^2) + ar(2, rel(1) + x)), # varying change point 70 | list(y | weights(weights_ok) ~ 1 + ar(1), # With weights 71 | ~ 0 + ar(2, 1 + x)) 72 | ) 73 | 74 | test_good(good_arma_essential, good_arma_extensive) 75 | -------------------------------------------------------------------------------- /vignettes/_figures/ex_ar.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_ar.png -------------------------------------------------------------------------------- /vignettes/_figures/ex_binomial.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_binomial.png -------------------------------------------------------------------------------- /vignettes/_figures/ex_demo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_demo.png -------------------------------------------------------------------------------- /vignettes/_figures/ex_demo_combo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_demo_combo.png -------------------------------------------------------------------------------- /vignettes/_figures/ex_fix_rel.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_fix_rel.png -------------------------------------------------------------------------------- /vignettes/_figures/ex_plateaus.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_plateaus.png -------------------------------------------------------------------------------- /vignettes/_figures/ex_quadratic.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_quadratic.png -------------------------------------------------------------------------------- /vignettes/_figures/ex_slopes.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_slopes.png -------------------------------------------------------------------------------- /vignettes/_figures/ex_trig.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_trig.png -------------------------------------------------------------------------------- /vignettes/_figures/ex_variance.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_variance.png -------------------------------------------------------------------------------- /vignettes/_figures/ex_varying.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/ex_varying.png -------------------------------------------------------------------------------- /vignettes/_figures/make_README_plots.R: 
-------------------------------------------------------------------------------- 1 | `%>%` = magrittr::`%>%` 2 | 3 | theme_it = function(x, title) { 4 | x + 5 | ggplot2::ggtitle(title) + 6 | ggplot2::theme_gray(13) + 7 | ggplot2::theme(legend.position = "none") # + 8 | # theme(axis.title = element_blank(), 9 | # axis.text = element_blank(), 10 | # axis.ticks = element_blank()) 11 | } 12 | 13 | save_it = function(filename) { 14 | ggplot2::ggsave(paste0("./vignettes/_figures/", filename), width=6, height=3, dpi = 100, type = "cairo") 15 | } 16 | 17 | 18 | 19 | ############# 20 | # Example 1 # 21 | ############# 22 | library(mcp) 23 | options(mc.cores = 3) # Run in parallel 24 | 25 | ex_demo = mcp_example("demo") 26 | fit_demo = mcp(ex_demo$model, data = ex_demo$data, adapt = 3000) # dataset included in mcp 27 | theme_it(plot(fit_demo), "") 28 | save_it("ex_demo.png") 29 | 30 | plot_pars(fit_demo, regex_pars = "cp_") 31 | save_it("ex_demo_combo.png") 32 | 33 | # LOO 34 | # Fit the model 35 | model_null = list( 36 | response ~ 1 + time, 37 | ~ 1 + time 38 | ) 39 | fit_null = mcp(model_null, ex_demo$data, adapt = 3000) 40 | 41 | # Compare loos: 42 | fit_demo$loo = loo(fit_demo) 43 | fit_null$loo = loo(fit_null) 44 | loo::loo_compare(fit_demo$loo, fit_null$loo) 45 | 46 | 47 | ################ 48 | # Two plateaus # 49 | ################ 50 | ex_intercepts = mcp_example("intercepts") 51 | fit_intercepts = mcp(ex_intercepts$model, ex_intercepts$data, par_x = "x", adapt = 3000) 52 | theme_it(plot(fit_intercepts, lines = 25), "Two plateaus") 53 | save_it("ex_plateaus.png") 54 | 55 | 56 | 57 | ######################## 58 | # VARYING SLOPE CHANGE # 59 | ######################## 60 | ex_varying = mcp_example("varying") 61 | fit_varying = mcp(ex_varying$model, ex_varying$data, adapt = 3000) 62 | theme_it(plot(fit_varying, facet_by = "id"), "Varying slope change") 63 | save_it("ex_varying.png") 64 | 65 | 66 | ############ 67 | # BINOMIAL # 68 | ############ 69 | ex_binomial = 
mcp_example("binomial") 70 | fit_binomial = mcp(ex_binomial$model, ex_binomial$data, family = binomial(), adapt = 5000) 71 | theme_it(plot(fit_binomial, q_fit = TRUE), "Binomial") 72 | save_it("ex_binomial.png") 73 | 74 | 75 | ########################## 76 | # FIXED, RELATIVE, PRIOR # 77 | ########################## 78 | ex_rel = mcp_example("rel_prior") 79 | prior_rel = list( 80 | int_1 = 10, # fixed value 81 | x_3 = "x_1", # shared slope in segment 1 and 3 82 | int_2 = "dnorm(0, 20)", 83 | cp_1 = "dunif(20, 50)" # has to occur in this interval 84 | ) 85 | 86 | fit_rel = mcp(ex_rel$model, ex_rel$data, prior_rel, iter = 10000) 87 | theme_it(plot(fit_rel, cp_dens = FALSE), "rel() and prior") 88 | save_it("ex_fix_rel.png") 89 | 90 | 91 | ############# 92 | # QUADRATIC # 93 | ############# 94 | ex_quadratic = mcp_example("quadratic") 95 | fit_quadratic = mcp(ex_quadratic$model, ex_quadratic$data, adapt = 3000) 96 | theme_it(plot(fit_quadratic), "Quadratic and other exponentiations") 97 | save_it("ex_quadratic.png") 98 | 99 | 100 | 101 | ################# 102 | # TRIGONOMETRIC # 103 | ################# 104 | ex_trig = mcp_example("trigonometric") 105 | fit_trig = mcp(ex_trig$model, ex_trig$data, adapt = 3000) 106 | theme_it(plot(fit_trig), "Trigonometric for periodic trends") 107 | save_it("ex_trig.png") 108 | 109 | 110 | 111 | ############ 112 | # VARIANCE # 113 | ############ 114 | ex_variance = mcp_example("variance") 115 | fit_variance = mcp(ex_variance$model, ex_variance$data, iter = 10000, adapt = 10000) 116 | theme_it(plot(fit_variance, q_predict = TRUE), "Variance and prediction intervals") 117 | save_it("ex_variance.png") 118 | 119 | 120 | 121 | ######### 122 | # AR(N) # 123 | ######### 124 | ex_ar = mcp_example("ar") 125 | fit_ar = mcp(ex_ar$model, ex_ar$data, adapt = 3000) 126 | theme_it(plot(fit_ar), "Time series with autoregressive residuals") 127 | save_it("ex_ar.png") 128 | -------------------------------------------------------------------------------- 
/vignettes/_figures/mcp_glm_status.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/mcp_glm_status.png -------------------------------------------------------------------------------- /vignettes/_figures/mcp_glm_status.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/mcp_glm_status.xlsx -------------------------------------------------------------------------------- /vignettes/_figures/packages_table1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/packages_table1.png -------------------------------------------------------------------------------- /vignettes/_figures/packages_table2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/packages_table2.png -------------------------------------------------------------------------------- /vignettes/_figures/packages_table3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/packages_table3.png -------------------------------------------------------------------------------- /vignettes/_figures/packages_tables.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/packages_tables.pdf -------------------------------------------------------------------------------- 
/vignettes/_figures/packages_tables.xlsx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lindeloev/mcp/e5b1370879d5be8b45240d86276d9f99bcff4918/vignettes/_figures/packages_tables.xlsx -------------------------------------------------------------------------------- /vignettes/binomial.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Binomial change point analysis" 3 | output: rmarkdown::html_vignette 4 | vignette: > 5 | %\VignetteIndexEntry{Binomial change point analysis} 6 | %\VignetteEngine{knitr::rmarkdown} 7 | %\VignetteEncoding{UTF-8} 8 | --- 9 | 10 | `mcp` aims to implement Generalized Linear Models in a way that closely mimics that of [brms::brm](https://github.com/paul-buerkner/brms). You can set the family and link functions using the `family` argument. 11 | 12 | First, let us specify a toy model with three segments: 13 | 14 | ```{r} 15 | model = list( 16 | y | trials(N) ~ 1, # constant rate 17 | ~ 0 + year, # joined changing rate 18 | ~ 1 + year # disjoined changing rate 19 | ) 20 | ``` 21 | 22 | 23 | # Simulate data 24 | If you already have data, you can safely skip this section. 25 | 26 | We run `mcp` with `sample = FALSE` to get what we need to simulate data. 27 | 28 | ```{r} 29 | library(mcp) 30 | options(mc.cores = 3) # Speed up sampling 31 | set.seed(42) # Make the script deterministic 32 | 33 | empty = mcp(model, family = binomial(), sample = FALSE) 34 | ``` 35 | 36 | Now we can simulate. First, let us see the model parameters. 37 | 38 | ```{r} 39 | empty$pars 40 | ``` 41 | 42 | * It takes two intercepts (`int_*`), for segments 1 and 3. 43 | * It takes two slopes (`year_*`), for segment 2 and 3. 44 | * It takes two change points (`cp_*`) - one between each segment. 45 | 46 | `empty$simulate` is now a function that can predict data given these parameters. 
If you are in a reasonable R editor, type `empty$simulate(` and press TAB to see the required arguments. I came up with some values below, including change points at $year = 1925$ and $year = 1975$. Notice that because `binomial()` defaults to the link function `link = "logit"`, the intercept and slopes are on a [logit scale](https://en.wikipedia.org/wiki/Logit). Briefly, this extends the narrow range of binomial rates (0-1) to an infinite logit scale from minus infinity to plus infinity. This will be important later when we set priors. 47 | 48 | ```{r} 49 | df = data.frame( 50 | year = 1901:2020, # evaluate for each of these 51 | N = sample(10:20, size = 120, replace = TRUE) # number of trials 52 | ) 53 | df$y = empty$simulate( 54 | df$year, df$N, 55 | cp_1 = 1925, cp_2 = 1975, 56 | int_1 = 2, int_3 = -1, 57 | year_2 = -0.1, year_3 = 0.1) 58 | 59 | head(df) 60 | ``` 61 | 62 | Visually: 63 | 64 | ```{r} 65 | plot(df$year, df$y) 66 | ``` 67 | 68 | 69 | # Check parameter recovery 70 | The next sections go into more detail, but let us quickly see if we can recover the parameters used to simulate the data. 71 | 72 | ```{r, cache = TRUE, results=FALSE, message=FALSE, warning=FALSE} 73 | fit = mcp(model, data = df, family = binomial(), adapt = 5000) 74 | ``` 75 | 76 | 77 | We can use `summary` to see that it recovered the parameters with pretty good precision. Again, recall that intercepts and slopes are on a `logit` scale. 78 | 79 | ```{r} 80 | summary(fit) 81 | ``` 82 | 83 | `summary` uses 95% highest density intervals (HDI) by default, but you can change it using `summary(fit, width = 0.80)`. If you have [varying effects](../articles/varying.html), use `ranef(fit)` to see them. 84 | 85 | Plotting the fit confirms a good fit to the data, and we see the discontinuities at the two change points: 86 | 87 | ```{r} 88 | plot(fit) 89 | ``` 90 | 91 | These lines are just `fit$simulate` applied to a random draw of the posterior samples.
In other words, they represent the joint distribution of the parameters. You can change the number of draws (lines) using `plot(fit, lines = 50)`. 92 | 93 | Notice that for binomial models, `plot` defaults to showing the *rate* (`y / N`) as a function of x. The reason why is obvious when we plot on "raw" data by toggling `rate`: 94 | 95 | ```{r} 96 | plot(fit, rate = FALSE) 97 | ``` 98 | 99 | These lines are jagged because `N` varies from year to year. Although there is close to a 100% success rate in the years 1900 - 1920, the number of trials varies, as you can see in the raw data. However, using `rate = FALSE` will be great when the number of trials is constant for extended periods of time, as `y` is more interpretable then. 100 | 101 | Speaking of alternative visualizations, you can also plot this on the logit scale, where the linear trends are modeled: 102 | 103 | ```{r, warning=FALSE, message=FALSE} 104 | plot(fit, scale = "linear") 105 | ``` 106 | 107 | 108 | Of course, these plots work with [varying effects](../articles/varying.html) as well. 109 | 110 | 111 | # Model diagnostics and sampling options 112 | Already in the default `plot` as used above, it will be obvious if there was poor convergence. A more direct assessment is to look at the posterior distributions and trace plots: 113 | 114 | ```{r, fig.height = 10, fig.width = 6} 115 | plot_pars(fit) 116 | ``` 117 | 118 | Convergence is perfect here as evidenced by the overlapping trace plots that look like fat caterpillars (Bayesians love fat caterpillars). Notice that the posterior distribution of change points can be quite non-normal and sometimes even bimodal. Therefore, one should be careful not to interpret the HDI as if it were normal. 119 | 120 | `plot()` and `plot_pars()` can do a lot more than this, so check out their documentation. 121 | 122 | 123 | # Priors for binomial models 124 | `mcp` uses priors to achieve a lot of its functionality.
See [how to set priors](../articles/priors.html), including how to share parameters between segments and how to fix values. Here, I post a few notes about the binomial-specific default priors. 125 | 126 | The default priors in `mcp` are set so that they are reasonably broad to cover most scenarios, though also specific enough to sample effectively. They are not "default" as in "canonical". Rather, they are "default" as in "what happens if you do nothing else". All priors are stored in `fit$prior` (also `empty$prior`). We did not specify `prior` above, so it ran with default priors: 127 | 128 | ```{r} 129 | cbind(fit$prior) 130 | ``` 131 | 132 | The priors on change points are discussed extensively in the prior vignette. The priors on slopes and intercepts are normals with a standard deviation of 3 logits. This corresponds to quite extreme binomial probabilities, yet not so extreme as to be totally flat. Here is a visualization of the priors `dnorm(0, 1)` (red), `dnorm(0, 3)` (black, mcp default), and `dnorm(0, 5)` (blue), together with the correspondence between logits and probabilities: 133 | 134 | ```{r} 135 | inverse_logit = function(x) exp(x) / (1 + exp(x)) 136 | 137 | # Start the plot 138 | library(ggplot2) 139 | ggplot(data.frame(logits = 0), aes(x = logits)) + 140 | 141 | # Plot normal prior. Set parameters in "args" 142 | stat_function(fun=dnorm, args = list(mean=0, sd = 1), lwd=2, col="red") + 143 | stat_function(fun=dnorm, args = list(mean=0, sd = 3), lwd=2, col="black") + 144 | stat_function(fun=dnorm, args = list(mean=0, sd = 5), lwd=2, col="blue") + 145 | 146 | # Set the secondary axis 147 | scale_x_continuous(breaks = -7:7,limits = c(-7, 7), sec.axis = sec_axis(~ inverse_logit(.), name = "Probability", breaks = round(inverse_logit(seq(-7, 7, by = 2)), 3))) 148 | ``` 149 | 150 | Please keep in mind that when these priors combine through the model, the joint probability may be quite different.
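
To get a feeling for how priors combine, here is a minimal prior-predictive sketch in base R. It is not how `mcp` works internally. It assumes two independent normal priors with an SD of 3 logits (for example, an intercept plus the total contribution of a slope over the observed x) and compares the implied probabilities with those from a single prior. The names `sd_logit`, `single`, `combined`, and `extreme` are made up for this illustration.

```r
# Illustrative sketch only; not mcp internals.
set.seed(1)
n_draws = 10000
sd_logit = 3  # assumed SD (in logits) of each normal prior

# A single prior draw vs. the sum of two independent prior draws
single = rnorm(n_draws, mean = 0, sd = sd_logit)
combined = rnorm(n_draws, 0, sd_logit) + rnorm(n_draws, 0, sd_logit)

# Map logits to probabilities with the inverse logit (stats::plogis)
p_single = plogis(single)
p_combined = plogis(combined)

# Share of prior mass at extreme probabilities (< 5% or > 95%)
extreme = function(p) mean(p < 0.05 | p > 0.95)
c(single = extreme(p_single), combined = extreme(p_combined))
```

The combined prior puts more mass on extreme probabilities than either prior alone, because the variances add on the logit scale.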
151 | 152 | Returning to the priors, the `3 / (MAXX - MINX)` on slopes means that this change in probability occurs over the course of the observed X. 153 | 154 | 155 | # JAGS code 156 | Here is the JAGS code for the model used in this article. 157 | 158 | ```{r} 159 | fit$jags_code 160 | ``` 161 | 162 | -------------------------------------------------------------------------------- /vignettes/families.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Families and link functions" 3 | output: rmarkdown::html_vignette 4 | vignette: > 5 | %\VignetteIndexEntry{Families and link functions} 6 | %\VignetteEngine{knitr::rmarkdown} 7 | %\VignetteEncoding{UTF-8} 8 | --- 9 | 10 | `mcp 0.3` added support for more response families and link functions. For example, you can now do 11 | 12 | ```{r, eval=FALSE} 13 | fit = mcp(model, data = df, family = gaussian("log")) 14 | fit = mcp(model, data = df, family = binomial("identity")) 15 | ``` 16 | 17 | 18 | This is an ongoing effort and more will be added. This table shows the current status: 19 | 20 | ![](https://github.com/lindeloev/mcp/raw/docs/vignettes/_figures/mcp_glm_status.png) 21 | 22 | * **Green:** supported and the default priors are unlikely to change. 23 | * **Yellow:** supported but the default priors may change. 24 | * **White:** not currently supported. [Raise a GitHub issue](https://github.com/lindeloev/mcp/issues) if you need it. 25 | * **Red:** impossible or will not be supported. 26 | 27 | See the "GLM" menu above for more details on using GLM with `mcp`. 28 | 29 | 30 | # General remarks 31 | Some link functions are *default* in GLM for good reasons: they have proven computationally convenient and interpretable. When using a non-default link function, you risk predicting impossible values, at which point `mcp` will error (as it should) - hopefully with informative error messages.
For example, a `bernoulli("identity")` family with model `prob ~ 1 + x` (i.e., a slope directly on the probability of success) can easily exceed 1 or 0 and there is, of course, no such thing as a probability below 0% or above 100%. One way to ameliorate such problems is by setting informative [priors](../articles/priors.html) (e.g., via truncation) to prevent the sampler from visiting illegal combinations of such values. 32 | 33 | In short: think carefully and proceed at your own risk. 34 | 35 | 36 | # An example 37 | 38 | ```{r} 39 | library(mcp) 40 | options(mc.cores = 3) # Speed up sampling 41 | set.seed(42) # Make the script deterministic 42 | ``` 43 | 44 | 45 | Reviving the example from the article on [binomial models in mcp](../articles/binomial.html)... 46 | 47 | ```{r} 48 | model = list( 49 | y | trials(N) ~ 1 + x, # Intercept and slope on P(success) 50 | ~ 1 + x # Disjoined slope on P(success) 51 | ) 52 | ``` 53 | 54 | we can model it using an `identity` link function: 55 | 56 | ```{r, echo = FALSE} 57 | ex = mcp_example("binomial") 58 | ``` 59 | 60 | ```{r, eval = FALSE} 61 | ex = mcp_example("binomial") 62 | fit = mcp(model, data = ex$data, family = binomial(link = "identity")) 63 | ``` 64 | 65 | ```{r, echo = FALSE} 66 | message("Parallel sampling in progress... 67 | 68 | Error in update.jags(model, n.iter, ...) : Error in node loglik_[63] 69 | Invalid parent values") 70 | ``` 71 | 72 | 73 | Oops, the sampler visited impossible values! Likely `P(success) < 0%` or `P(success) > 100%`. Let's help it along with some more informative priors. For this data and model, the main problem is that the slope of the second segment has too great a posterior probability of surpassing 100% with the default `mcp` priors.
So let's set some more informative priors that render a long (early `cp_1`) and steep (high `x_2`) slope unlikely: 74 | 75 | ```{r, results=FALSE, message=FALSE, warning=FALSE} 76 | prior = list( 77 | x_2 = "dnorm(0, 0.002)", # Slope is unlikely to be steep 78 | cp_1 = "dnorm(30, 10) T(20, )" # Slope starts not-too-early 79 | ) 80 | 81 | fit = mcp(model, data = ex$data, family = binomial(link = "identity"), prior = prior) 82 | ``` 83 | 84 | ```{r, fig.width=6, fig.height=4} 85 | plot(fit) 86 | ``` 87 | 88 | Sampling succeeded! This is a bad model of this data. But it illustrates the necessary considerations and steps to ameliorate problems when going beyond default link functions. 89 | 90 | Because of the identity-link, the regression coefficients are interpretable as intercepts and slopes on `P(success)` in contrast to the "usual" log-odds fitted when `family = binomial(link = "logit")`. For example, `int_1` is the inferred probability of success at `x = 0` and likewise for `int_2` at `x = cp_1`. 91 | 92 | ```{r} 93 | summary(fit) 94 | ``` 95 | 96 | 97 | # Specific remarks 98 | 99 | * The mean ($1 / {inverse\_link}(\lambda)$) is plotted for `family = exponential()` and output by `fit$simulate(..., type = "fitted")`. Multiply by `log(2)` to get the median. 100 | 101 | -------------------------------------------------------------------------------- /vignettes/formulas.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "About mcp formulas and models" 3 | output: rmarkdown::html_vignette 4 | vignette: > 5 | %\VignetteIndexEntry{About mcp formulas and models} 6 | %\VignetteEngine{knitr::rmarkdown} 7 | %\VignetteEncoding{UTF-8} 8 | --- 9 | 10 | `mcp` takes a list of formulas, and defines the change point as the point on the x-axis where the data shifts from being generated by one formula to the next. So with `N` formulas, you have `N - 1` change points.
A list with just one formula thus corresponds to normal regression with 0 change points. 11 | 12 | The formulas are called "segments" because they divide the (observed) x-axis into `N` segments. 13 | 14 | # Formula format for segments 15 | The format of each segment is generally `response ~ changepoint ~ predictors + sigma(variance_predictors)`, except for the first segment where there is no change point (`response ~ predictors + sigma(variance_predictors)`). Here is a simple and a complex formula 16 | 17 | 18 | * **The response** is just the name of a column in your data. 19 | * **The change point** can be `1` for a population-level change point or `1 + (1|group)` for [varying change points](articles/varying.html) sampled around the population-level change point. 20 | * **The predictors** can be `0` (no change in intercept), `1` (change in intercept), and any column in your data which you want to model a slope on. 21 | * **The variance predictors** have [their own article](../articles/variance.html), but they follow the same rules as the **predictors**. 22 | 23 | For convenience, you can omit the response and the change point in segment 2+, in which case the former response and an intercept-only (as opposed to random/varying) change point is assumed. When you call `summary(fit)`, it will show the explicit representation. Let us see this in action for this model where we predict `score` as a function of `time` in three segments, i.e., with two change points: 24 | 25 | ```{r} 26 | library(mcp) 27 | model = list( 28 | score ~ 1, # intercept 29 | score ~ 1 ~ 0 + time, # joined slope 30 | ~ time # disjoined slope. "score ~ 1 ~ 1 + time" is implicit here. 31 | ) 32 | 33 | # Interpret, but do not sample. 34 | fit = mcp(model, sample = FALSE) 35 | summary(fit) 36 | ``` 37 | 38 | Notice how it added the response and the change point to the last segment?
39 | 40 | `mcp` is heavily inspired by `brms` which again is inspired by [`lme4::lmer`](https://cran.r-project.org/web/packages/lme4/index.html). [Here is a bit of history](https://twitter.com/jonaslindeloev/status/1117760777249853440) on that. 41 | 42 | 43 | # Parameter names 44 | `mcp` automatically assigns names to the parameters in the format `type_i` where `i` is the segment number. Specifically: 45 | 46 | * `int_i` is the intercept in the ith segment. 47 | * `year_i` is the slope on the data column `year` in the ith segment. `x_i` is the slope on the data column `x` in the ith segment. The slope is named after the data it is regressed on. 48 | * `cp_i` is the ith change point. Notice that `cp_i` is specified in segment `i + 1`. `cp_1` occurs when there are two segments, and `cp_2` when there are three segments, etc. *OBS: future versions may start at cp_2.* 49 | * `cp_i_group` is the varying *deviations* from `cp_i`. See [varying change points in mcp](articles/varying.html). 50 | * `cp_i_sd` is the population-level standard deviation of the varying effects. 51 | * `sigma_*` are variance parameters about which you can [read more here](../articles/variance.html). Note that for `family = gaussian()` and other families with an SD residual, there will always be a `sigma_1`, i.e., the common sigma initiated in the first segment. 52 | * `arj_i` are autocorrelation coefficients of order `j` for segment `i` ([read more here](../articles/arma.html)). 53 | 54 | 55 | These parameter names are saved in `fit$pars`. Let us specify a somewhat complex model to show off some parameter names: 56 | 57 | ```{r} 58 | model = list( 59 | # int_1 60 | score ~ 1, 61 | 62 | # cp_1, cp_1_sd, cp_1_id, year_2 63 | 1 + (1|id) ~ 0 + year, 64 | 65 | # cp_2, cp_2_sd, cp_2_condition, int_3, year_3 66 | 1 + (1|condition) ~ 1 + rel(year) 67 | ) 68 | 69 | # Interpret, but do not sample.
70 | fit = mcp(model, sample = FALSE) 71 | str(fit$pars, vec.len = 99) # Compact display 72 | ``` 73 | 74 | 75 | 76 | # Modeling intercept change points 77 | A change point is simply like an `ifelse` statement or multiplying with indicators (0s and 1s): 78 | 79 | ```{r} 80 | # Model parameters 81 | x = 1:20 82 | cp_1 = 12 83 | int_1 = 5 84 | int_2 = 10 85 | 86 | # Ifelse version 87 | y_ifelse = ifelse(x <= cp_1, yes = int_1, no = int_2) 88 | 89 | # Indicator equivalent using dummy helpers 90 | cp_0 = -Inf 91 | cp_2 = Inf 92 | y_indicator = (x > cp_0) * (x <= cp_1) * int_1 + # Between cp_0 and cp_1 93 | (x > cp_1) * (x <= cp_2) * int_2 # Between cp_1 and cp_2 94 | 95 | # Show it 96 | par(mfrow = c(1,2)) 97 | plot(x, y_ifelse, main = "ifelse(x <= cp_1)") 98 | plot(x, y_indicator, main = "(x > cp_1) * int_2") 99 | ``` 100 | 101 | The magic of (Bayesian) MCMC sampling is that it can actually infer the change point from this simple formulation. We let `mcp` write the JAGS code for this simple two-plateaus model and see how it uses the indicator formulation of change points: 102 | 103 | ```{r} 104 | model = list(y ~ 1, ~ 1) 105 | fit = mcp(model, sample = FALSE, par_x = "x") 106 | fit$jags_code 107 | ``` 108 | 109 | Look at the section called `# Fitted value` which is the (automatically generated) model that was discussed above. Some unnecessary stuff is added to segment 1 just because it makes the code easier to generate. (`x[i_] >= cp_0` when `cp_0` is the smallest value of `x` is, of course, always true). 110 | 111 | 112 | # Modeling slope change points 113 | We can use the same principle to model change points on slopes. However, we have to "take off" where the previous slope left us on the y-axis. That is, we have to regard whatever y-value the previous segment ended with as a kind of intercept-at-x=0 in the frame of the new segment. The intercept of segment 2 is `cp_1 * slope_1` and the slope term in segment 2 is `slope_2 * (x - cp_1)`.
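
As a worked equation, the joined two-slope model can be sketched as follows, using the same symbols as the code in this section (the piecewise notation is my own shorthand, not something `mcp` prints):

$$
y(x) =
\begin{cases}
slope_1 \cdot x & \text{if } x \le cp_1 \\
cp_1 \cdot slope_1 + slope_2 \cdot (x - cp_1) & \text{if } x > cp_1
\end{cases}
$$

With $cp_1 = 12$, $slope_1 = 2$, and $slope_2 = -1$, the line reaches $y(12) = 2 \cdot 12 = 24$ at the change point, and three units later it is at $y(15) = 12 \cdot 2 + (-1) \cdot (15 - 12) = 21$. Both branches give 24 at $x = cp_1$, which is what makes the slope "joined".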
114 | 115 | ```{r} 116 | # Model parameters 117 | x = 1:20 118 | cp_1 = 12 119 | slope_1 = 2 120 | slope_2 = -1 121 | 122 | # Ifelse version 123 | y_ifelse = ifelse(x <= cp_1, 124 | yes = slope_1 * x, 125 | no = cp_1 * slope_1 + slope_2 * (x - cp_1)) 126 | 127 | # Indicator version. pmin() is a vectorized min() 128 | cp_0 = -Inf 129 | y_indicator = (x > cp_0) * slope_1 * pmin(x, cp_1) + 130 | (x > cp_1) * slope_2 * (x - cp_1) 131 | 132 | # Show it 133 | par(mfrow = c(1,2)) 134 | plot(x, y_ifelse, main = "ifelse(x <= cp_1)") 135 | plot(x, y_indicator, main = "(x > cp_1) * int_2") 136 | ``` 137 | 138 | Let us see this in action: 139 | 140 | ```{r} 141 | model = list(y ~ 0 + x, 142 | ~ 0 + x) 143 | fit = mcp(model, sample = FALSE) 144 | fit$jags_code 145 | ``` 146 | 147 | Again, look at the `#Fitted value` to see the indicator-version in action. And again, `mcp` adds something about `cp_0 = -Inf` and `cp_2 = Inf`, just for internal convenience. 148 | 149 | You will find the exact same formula for `y_ = ...` if you do `print(fit$simulate)`, though this function contains a whole lot of other stuff too. 150 | 151 | 152 | # Modeling relative slopes and intercepts 153 | `mcp` allows for specifying relative intercepts and change points through `rel()`. Relative slopes are easy: just replace `x_2` with `x_1 + x_2`. You could do the same if all segments are intercept-only. However, if the previous segment had a slope, we want the intercept to be relative to where that "ended". The `mcp` solution is to only "turn off" the "hanging intercept" from that slope's ending (`pmin(x, cp_i)`) when the model encounters an absolute intercept. An indicator does this. 154 | 155 | ```{r} 156 | # Model parameters 157 | x = 1:20 158 | cp_1 = 12 159 | int_1 = 5 160 | int_2 = 3 # let's model this as relative 161 | 162 | # Indicator version. 
163 | cp_0 = -Inf 164 | y_indicator = (x > cp_0) * int_1 + # Note: no (x < cp_1) 165 | (x > cp_1) * (int_2) 166 | 167 | # Plot it 168 | plot(x, y_indicator, main = "Relative intercept") 169 | ``` 170 | 171 | You can look at `fit$jags_code` and `fit$simulate()` to see this in action. 172 | -------------------------------------------------------------------------------- /vignettes/poisson.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Poisson change point analysis" 3 | output: rmarkdown::html_vignette 4 | vignette: > 5 | %\VignetteIndexEntry{Poisson change point analysis} 6 | %\VignetteEngine{knitr::rmarkdown} 7 | %\VignetteEncoding{UTF-8} 8 | --- 9 | 10 | The Poisson distribution models the number of events within similar-sized time frames. 11 | 12 | 13 | # Coal mining disasters 14 | A dataset on coal mining disasters has grown very popular in the change point literature (available in `boot::coal`). It contains a timestamp of each coal mining disaster from 1851 to 1962. By binning the number of events within each year (fixed time frame), we have something very Poisson-friendly: 15 | 16 | ```{r, message=FALSE, warning=FALSE} 17 | # Number of disasters by year 18 | library(dplyr, warn.conflicts = FALSE) 19 | df = round(boot::coal) %>% 20 | group_by(date) %>% 21 | count() 22 | 23 | # See it 24 | head(df) 25 | ``` 26 | 27 | 28 | The number of events (`n`) as a function of year (`date`) is typically modeled as a change between two intercepts. 
This is very simple to do in `mcp`: 29 | 30 | ```{r} 31 | library(mcp) 32 | options(mc.cores = 3) # Speed up sampling 33 | set.seed(42) # Make the script deterministic 34 | ``` 35 | 36 | ```{r, cache = TRUE, results= FALSE, message=FALSE, warning=FALSE} 37 | model = list( 38 | n ~ 1, # intercept-only 39 | ~ 1 # intercept-only 40 | ) 41 | 42 | fit = mcp(model, data = df, family = poisson(), par_x = "date") 43 | ``` 44 | 45 | Let us see the two intercepts (lambda in log-units) and the change point (in years): 46 | 47 | ```{r} 48 | result = summary(fit) 49 | ``` 50 | 51 | We can see that the model ran well with good convergence and a large number of effective samples. At a first glance, the change point is estimated to lie between the years 1880 and 1895 (approximately). 52 | 53 | Let us take a more direct look, using the default `mcp` plot: 54 | 55 | ```{r} 56 | plot(fit) 57 | ``` 58 | 59 | It seems to fit the data well, but we can see that the change point probability "lumps" around particular data points. Years with a very low number of disasters abruptly increase the probability that the change to a lower disaster rate has taken place. The posterior distributions of change points regularly take these "weird" forms, i.e., not well-described by our toolbox of parameterized distributions. 60 | 61 | We can see this more clearly if plotting the posteriors. We include a traceplot too, just to check convergence visually. 62 | 63 | ```{r} 64 | plot_pars(fit) 65 | ``` 66 | 67 | 68 | # Priors 69 | `poisson()` defaults to `link = 'log'`, meaning that we have to exponentiate the estimates to get the "raw" Poisson parameter $\lambda$. $\lambda$ has the nice property of being the mean number of events. So we see that the mean number of events in segment 1 is `exp(result$mean[2])` (`r exp(result$mean[2])`) and it is `exp(result$mean[3])` (`r exp(result$mean[3])`) for segment 2. 70 | 71 | Default priors were used. They are normals with a standard deviation of 10. I.e. 
with 68% probability mass between `exp(10) = 22026` and `exp(-10) = 1 / 22026`: 72 | 73 | ```{r} 74 | cbind(fit$prior) 75 | ``` 76 | 77 | As always, the prior on the change point forces it to occur in the observed range. These priors are very vague, so update with more informed priors for your particular case, e.g.: 78 | 79 | ```{r, cache = TRUE, eval=FALSE} 80 | prior = list( 81 | cp_1 = "dnorm(1900, 30) T(MINX, 1925)" 82 | ) 83 | fit_with_prior = mcp(model, data = df, prior, poisson(), par_x = "date") 84 | ``` 85 | 86 | 87 | # Model comparison 88 | 89 | Despite the popularity of this dataset, a question rarely asked is what the evidence is that there is a change point at all. Let us fit two no-changepoint models and use approximate leave-one-out cross-validation to see how their predictive performance compares with that of the change point model. 90 | 91 | A flat model and a one-decay model: 92 | 93 | ```{r, cache = TRUE, results= FALSE, message=FALSE, warning=FALSE} 94 | # Fit an intercept-only model 95 | fit_flat = mcp(list(n ~ 1), data = df, family=poisson(), par_x = "date") 96 | fit_decay = mcp(list(n ~ 1 + date), data = df, family = poisson()) 97 | 98 | 99 | plot(fit_flat) + plot(fit_decay) 100 | ``` 101 | 102 | 103 | Now we compute and compare the LOO ELPDs: 104 | 105 | ```{r, cache = TRUE, results=FALSE, warning=FALSE, message=FALSE} 106 | fit$loo = loo(fit) 107 | fit_flat$loo = loo(fit_flat) 108 | fit_decay$loo = loo(fit_decay) 109 | ``` 110 | ```{r} 111 | loo::loo_compare(fit$loo, fit_flat$loo, fit_decay$loo) 112 | ``` 113 | 114 | 115 | The change point model seems to be preferred with a ratio of around 1.7 over the decay model and 2.5 over the flat model.
Another approach is to look at the model weights: 116 | 117 | ```{r} 118 | loo_list = list(fit$loo, fit_flat$loo, fit_decay$loo) 119 | loo::loo_model_weights(loo_list, method="pseudobma") 120 | ``` 121 | 122 | Again, unsurprisingly, the change point model is preferred and they show the same ranking as implied by `loo_compare`. 123 | 124 | 125 | # JAGS code 126 | Here is the JAGS code for the full model above. 127 | 128 | ```{r} 129 | fit$jags_code 130 | ``` 131 | 132 | -------------------------------------------------------------------------------- /vignettes/tips.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Diagnosing and fixing problems" 3 | output: rmarkdown::html_vignette 4 | vignette: > 5 | %\VignetteIndexEntry{Diagnosing and fixing problems} 6 | %\VignetteEngine{knitr::rmarkdown} 7 | %\VignetteEncoding{UTF-8} 8 | --- 9 | 10 | 11 | # Convergence 12 | A common problem when using MCMC is a lack of convergence between chains. This will show up as large `rhat` values (> 1.1 is a common criterion) and non-converging lines in `plot_pars(fit)`. 13 | 14 | * The first thing to try is always to make the model warm up longer to see if it reaches convergence later: `mcp(model, data, adapt = 10000)`. 15 | 16 | * It can be a sign of a deeper non-identifiability in the model. This will show up as strong correlations in the joint distribution of any pair of implicated parameters: `plot_pars(fit, pars = c("int_1", "int_2"), type = "hex")`. This may give you ideas how to change the model. 17 | 18 | * You can set the initial values for the JAGS sampler using, e.g., `mcp(..., inits = list(cp_1 = 20, int_2 = -20, etc.))`. This will be passed to `jags.fit` and you can see more documentation there. 19 | 20 | 21 | # Speed 22 | A lot of data and complicated models will slow down fitting. 23 | 24 | * Run the chains in parallel using, e.g., `mcp(..., chains=4, cores=4)`.
The only reason this is not enabled by default is that parallel sampling fails on some systems. Turn it on for the whole session using `options(mc.cores = 3)` which will override `cores` (which defaults to 1). 25 | 26 | * More data usually means better identifiability and faster convergence. Lower the adaptation period using, e.g., `mcp(..., adapt = 300)`. This is also sometimes called "burnin". 27 | 28 | 29 | 30 | # Errors or won't run 31 | Most of these problems should stem from inappropriate priors and such problems may be exacerbated by fragile link functions (e.g., `binomial(link = "identity")`). The article on [priors in mcp](https://lindeloev.github.io/mcp/articles/priors.html) may be helpful, but in particular: 32 | 33 | * Errors on "directed cycle" usually stem from using parameters in priors. For example, if you set `prior = list(int_1 = "dnorm(int_2, 1)", int_2 = "dnorm(int_1, 1)")` this is an infinite regress. 34 | 35 | * Errors on "incompatible with parent nodes" usually stem from impossible values. For example, if you set `prior = list(sigma = "dnorm(0, 1)")`, this allows for a negative standard deviation, which is impossible. Think about changing the prior distributions and perhaps truncate them using `T(lower, upper)`. 36 | 37 | 38 | If you encounter these or other problems, don't hesitate to [raise a GitHub issue](https://github.com/lindeloev/mcp/issues), asking for help or filing a bug report. 39 | -------------------------------------------------------------------------------- /vignettes/variance.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Modeling variance and variance changes" 3 | output: rmarkdown::html_vignette 4 | vignette: > 5 | %\VignetteIndexEntry{Modeling variance and variance changes} 6 | %\VignetteEngine{knitr::rmarkdown} 7 | %\VignetteEncoding{UTF-8} 8 | --- 9 | 10 | For GLM families with a variance parameter (`sigma`), you can model this explicitly.
For example, if you want a flat mean (a plateau) with increasing variance, you can do `y ~ 1 + sigma(1 + x)`. In general, all formula syntax that is allowed outside of `sigma()` (where it applies to the mean) also works inside it (applying to the variance). For example, if you go all-out, you can do `~ 1 + sigma(rel(1) + sin(x) + I(x^2))`. Read more about mcp formulas [here](../articles/formulas.html). 11 | 12 | ```{r} 13 | library(mcp) 14 | options(mc.cores = 3) # Speed up sampling 15 | set.seed(42) # Make the script deterministic 16 | ``` 17 | 18 | 19 | # Simple example: a change in variance 20 | Let us model a simple change in variance on a plateau model. First, we specify the model: 21 | 22 | ```{r} 23 | model = list( 24 | y ~ 1, # sigma(1) is implicit in the first segment 25 | ~ 0 + sigma(1) # a new intercept on sigma, but not on the mean 26 | ) 27 | ``` 28 | 29 | We can simulate some data, starting with low variance and an abrupt change to a high variance at $x = 50$: 30 | 31 | ```{r} 32 | empty = mcp(model, sample = FALSE, par_x = "x") 33 | set.seed(40) 34 | df = data.frame(x = 1:100) 35 | df$y = empty$simulate( 36 | df$x, 37 | cp_1 = 50, int_1 = 20, 38 | sigma_1 = 5, sigma_2 = 20) 39 | 40 | head(df) 41 | ``` 42 | 43 | 44 | Now we fit the model to the simulated data. 45 | 46 | ```{r, cache = TRUE, results=FALSE, warning=FALSE, message=FALSE} 47 | fit = mcp(model, data = df, par_x = "x") 48 | ``` 49 | We plot the results with the prediction interval to show the effect of the variance, since it won't be immediately obvious on the default plot of the fitted mean predictions: 50 | 51 | ```{r} 52 | plot(fit, q_predict = TRUE) 53 | ``` 54 | 55 | 56 | We can see all parameters are well recovered (compare `sim` to `mean`). Like the other parameters, the `sigma`s are named after the segment where they were instantiated. There will always be a `sigma_1`. 
57 | 58 | ```{r} 59 | summary(fit) 60 | ``` 61 | 62 | 63 | # Advanced example 64 | We can model changes in `sigma` alone or in combination with changes in the mean. In the following, I define a needlessly complex model, just to demonstrate the flexibility of modeling variance: 65 | 66 | ```{r} 67 | model = list( 68 | # Increasing variance. 69 | y ~ 1 + sigma(1 + x), 70 | 71 | # Abrupt change in mean and variance. 72 | ~ 1 + sigma(1), 73 | 74 | # Joined slope on mean. Variance changes as 2nd order poly. 75 | ~ 0 + x + sigma(0 + x + I(x^2)), 76 | 77 | # Continue slope on mean, but plateau variance (no sigma() term). 78 | ~ 0 + x 79 | ) 80 | 81 | # The slope in segment 4 is just a continuation of 82 | # the slope in segment 3, as if there was no change point. 83 | prior = list( 84 | x_4 = "x_3" 85 | ) 86 | ``` 87 | 88 | Notice a few things here: 89 | 90 | * Segments 3 and 4: I changed the variance on a continuous slope. You can do this using priors to define that the slope is shared between segments 3 and 4, effectively canceling the change point on the mean (more about using priors in mcp [here](../articles/priors.html)). 91 | * Segment 4: By not specifying `sigma()`, segment 4 (and later segments) just inherits the variance as it was at the end of the previous segment. 92 | 93 | In general, the variance parameters are named `sigma_[normalname]`, where "normalname" is the usual parameter name in mcp (see more [here](../articles/formulas.html)). For example, the variance slope on `x` in segment 3 is `sigma_x_3`. However, `sigma_int_i` is just too verbose, so variance intercepts are simply called `sigma_i`, where i is the segment number. 94 | 95 | 96 | 97 | ## Simulate data 98 | We simulate some data from this model, setting all parameters. As always, we can fit an empty model to get `fit$simulate`, which is useful for simulation and predictions from this model.
99 | 100 | ```{r} 101 | empty = mcp(model, sample = FALSE) 102 | set.seed(40) 103 | df = data.frame(x = 1:200) 104 | df$y = empty$simulate( 105 | df$x, 106 | cp_1 = 50, cp_2 = 100, cp_3 = 150, 107 | int_1 = -20, int_2 = 0, 108 | sigma_1 = 3, sigma_x_1 = 0.5, 109 | sigma_2 = 10, 110 | sigma_x_3 = -0.5, 111 | sigma_x_3_E2 = 0.02, 112 | x_3 = 1, x_4 = 1) 113 | ``` 114 | 115 | 116 | ## Fit it and inspect results 117 | Fit it in parallel, to speed things up: 118 | 119 | ```{r, cache = TRUE, results = FALSE, warning=FALSE, message=FALSE} 120 | fit = mcp(model, data = df, prior = prior) 121 | ``` 122 | 123 | Plotting the prediction interval is an intuitive way to see how the variance is estimated: 124 | 125 | ```{r} 126 | plot(fit, q_predict = TRUE) 127 | ``` 128 | 129 | 130 | We can also plot the `sigma_` parameters directly. Now the y-axis is `sigma`: 131 | 132 | ```{r} 133 | plot(fit, which_y = "sigma", q_fit = TRUE) 134 | ``` 135 | 136 | 137 | `summary()` shows that the parameters are well recovered (compare `sim` to `mean`). The last change point is estimated with greater uncertainty than the others. This is expected, given that the only "signal" of this change point is a stop in variance growth. 138 | 139 | ```{r} 140 | summary(fit) 141 | ``` 142 | 143 | The effective sample size (`n.eff`) is fairly low, indicating poor mixing for these parameters. `Rhat` is acceptable at < 1.1, indicating good convergence between chains. Let us verify this by taking a look at the posteriors and traces. For now, we just look at the sigmas: 144 | 145 | ```{r, fig.height=7, fig.width = 6} 146 | plot_pars(fit, regex_pars = "sigma_") 147 | ``` 148 | 149 | This confirms the impression from `Rhat` and `n.eff`. Setting `mcp(..., iter = 10000)` would be advisable to increase the effective sample size. Read more about [tips, tricks, and debugging](../articles/tips.html).
150 | 151 | 152 | 153 | 154 | # Varying change points and variance 155 | The variance model applies to varying change points as well. For example, here we do a spin on the example in [the article on varying change points](../articles/varying.html), and add a by-person change in `sigma`. We model two joined slopes, varying by `id`. The second slope is also characterized by a different variance. This means that the model has more information about when the change point occurs, so it should be easier to estimate (require fewer data). 156 | 157 | ```{r} 158 | model = list( 159 | # intercept + slope 160 | y ~ 1 + x, 161 | 162 | # joined slope and increase in variance, varying by id. 163 | 1 + (1|id) ~ 0 + x + sigma(1) 164 | ) 165 | ``` 166 | 167 | Simulate data: 168 | ```{r} 169 | empty = mcp(model, sample = FALSE) 170 | set.seed(40) 171 | df = data.frame( 172 | x = 1:180, 173 | id = rep(1:6, times = 30) 174 | ) 175 | df$y = empty$simulate( 176 | df$x, 177 | cp_1 = 70, cp_1_id = 15 * (df$id - mean(df$id)), 178 | int_1 = 20, x_1 = 1, x_2 = -0.5, 179 | sigma_1 = 10, sigma_2 = 25) 180 | ``` 181 | 182 | Fit it: 183 | ```{r, cache = TRUE, results=FALSE, warning=FALSE, message=FALSE} 184 | fit = mcp(model, data = df) 185 | ``` 186 | 187 | Plot it: 188 | ```{r} 189 | plot(fit, facet_by = "id") 190 | ``` 191 | 192 | 193 | As usual, we can get the individual change points: 194 | ```{r} 195 | ranef(fit) 196 | ``` 197 | 198 | -------------------------------------------------------------------------------- /vignettes/varying.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Varying change points" 3 | output: rmarkdown::html_vignette 4 | vignette: > 5 | %\VignetteIndexEntry{Varying change points} 6 | %\VignetteEngine{knitr::rmarkdown} 7 | %\VignetteEncoding{UTF-8} 8 | --- 9 | 10 | A unique feature of `mcp` is modeling change points as varying effects (sometimes called "random effects"). 
This has the advantage that you can let the change point vary by a factor while keeping other parameters common across varying factor levels. 11 | 12 | This article in brief: 13 | 14 | * How to simulate varying change points 15 | * Get posteriors using `ranef(fit)` 16 | * Plot using `plot(fit, facet_by="my_group")` and `plot_pars(fit, pars = "varying", type = "dens_overlay", ncol = 3)`. 17 | * The default priors restrict varying change points to lie between the two adjacent change points. 18 | * The article on modeling variance via `sigma()` contains [an example on varying change points](../articles/variance.html) as well. 19 | 20 | ```{r} 21 | library(mcp) 22 | options(mc.cores = 3) # Speed up sampling 23 | set.seed(42) # Make the script deterministic 24 | ``` 25 | 26 | 27 | # Specifying varying change points 28 | 29 | You specify varying effects using the classical [`lmer`](https://www.rdocumentation.org/packages/lme4/versions/1.1-21/topics/lmer) syntax `(1|group)`. Currently (v. 0.1) `mcp` only supports varying intercepts. For example, here we model a varying change point between a plateau and a joined slope: 30 | 31 | ```{r} 32 | model = list( 33 | y ~ 1, # int_1 34 | 1 + (1|id) ~ 0 + x # cp_1, cp_1_sd, cp_1_id[i] 35 | ) 36 | ``` 37 | 38 | You can have multiple varying change points with multiple groupings: 39 | 40 | ```{r} 41 | model = list( 42 | y ~ 1, # int_1 43 | 1 + (1|id) ~ 0 + x, # cp_1, cp_1_sd, cp_1_id[i] 44 | 1 + (1|species) ~ 0, # cp_2, cp_2_sd, cp_2_species[i] 45 | (1|id) ~ 1 # cp_3 (implicit), cp_3_sd, cp_3_id[i] 46 | ) 47 | ``` 48 | 49 | Here are some properties of the change point varying effects: 50 | 51 | **Zero-centered:** The varying effects are zero-centered around the associated group-level change point. In other words, the sum of all varying effects is exactly zero. This constraint is necessary for the parameters to be identifiable.
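To make the zero-centering concrete, here is a minimal base-R sketch. It does not call `mcp`; the six-level `id` and the values `cp_1 = 50` and a spacing of 10 are assumptions chosen purely for illustration, mirroring the idiom this article uses when simulating data.

```r
# Hypothetical grouping factor with six levels, coded 1..6.
id = 1:6

# Hypothetical population-level change point.
cp_1 = 50

# Zero-centered offsets, spaced 10 apart on the x-axis.
cp_1_id = 10 * (id - mean(id))

# The offsets sum to zero ...
sum(cp_1_id)

# ... so the per-level change points average to cp_1.
cp_individual = cp_1 + cp_1_id
mean(cp_individual)
```

Because the offsets sum to zero, shifting every level by the same amount can only be expressed through `cp_1` itself, which is what keeps the group-level change point and the offsets separately identifiable.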
52 | 53 | **Hierarchical:** Consider the first change point, `cp_1`, and its associated varying effects, `cp_1_id`. By default, the varying effects are modeled as sampled from (nested within) the group-level change point, `cp_1`, with a spread, `cp_1_sd`. 54 | 55 | **Constraints:** The varying effects are constrained to lie (1) in the observed range of the x-axis, and/or (2) between the two adjacent change points. That is, all `cp_1_id` are between `min(x)` and `cp_2`. All `cp_2_species` are between `cp_1` and `cp_3` and all `cp_3_id` are between `cp_2` and `max(x)`. These constraints are enforced through truncation of the default prior (`fit$prior`) and you can override them by specifying a manual prior (see `vignette("priors")`). 56 | 57 | 58 | 59 | # Simulating varying effects 60 | Let us do a worked example, simulating the varying change point between a plateau and a slope: 61 | 62 | ```{r} 63 | model = list( 64 | y ~ 1, # int_1 65 | 1 + (1|id) ~ 0 + x # cp_1, cp_1_sd, cp_1_id[i] 66 | ) 67 | ``` 68 | 69 | It is quite similar to simulating non-varying data, except that we need to simulate some varying offsets before passing all parameters to `empty$simulate`: 70 | 71 | ```{r, message=FALSE, warning=FALSE} 72 | empty = mcp(model, sample = FALSE) 73 | 74 | library(dplyr, warn.conflicts = FALSE) 75 | varying = c("Clark", "Louis", "Batman", "Batgirl", "Spiderman", "Jane") 76 | df = data.frame( 77 | x = runif(length(varying) * 30, 0, 100), # 30 data points for each 78 | id = rep(varying, each = 30) # the group names 79 | ) 80 | df$id_numeric = as.numeric(as.factor(df$id)) # to positive integers 81 | df$y = empty$simulate(df$x, 82 | # Population-level: 83 | int_1 = 20, x_2 = 0.5, cp_1 = 50, sigma = 2, 84 | 85 | # Varying: zero-centered and 10 between each level 86 | cp_1_id = 10 * (df$id_numeric - mean(df$id_numeric))) 87 | 88 | head(df) 89 | ``` 90 | 91 | Here, we "translated" the `id` to an offset on the x-axis by multiplying by 10.
We subtracted the mean to make the varying effects zero-centered around `cp_1`. The result: 92 | 93 | ```{r} 94 | library(ggplot2) 95 | ggplot(df, aes(x=x, y=y)) + 96 | geom_point() + 97 | facet_wrap(~id) 98 | ``` 99 | 100 | 101 | # Summarise and plot varying effects 102 | Fitting the model is simple: 103 | 104 | ```{r, cache = TRUE, message = FALSE, warning=FALSE, results=FALSE} 105 | fit = mcp(model, data = df) 106 | ``` 107 | 108 | If we just used `plot(fit)`, we would see all points in one plot. We want to facet by `id`, so: 109 | 110 | ```{r} 111 | plot(fit, facet_by = "id") 112 | ``` 113 | 114 | It seems that `mcp` did a good job of recovering the change points. There is a lot of information in this data, since the intercept and the slope on each side of the (varying) change point are shared between participants here. 115 | 116 | If you use `summary(fit)` (or `fixef(fit)`) you will get the posteriors for the population-level effects. To get the random effects, do: 117 | 118 | ```{r} 119 | ranef(fit) 120 | ``` 121 | 122 | Inspecting the `sim` and `match` columns, we see that the model recovered the simulation parameters well. 123 | 124 | Good convergence is not always as obvious as in this example. While `plot_pars(fit)` shows population-level parameters only, you can do this to get varying effects only: 125 | 126 | ```{r} 127 | plot_pars(fit, pars = "varying", type = "trace", ncol = 3) 128 | ``` 129 | 130 | Notice the use of the `ncol` argument to set the number of columns. You will often have *many* levels on your varying effect, so this is useful to get a good view of all of them. Naturally, you can do this for almost all kinds of plots. 131 | 132 | Using `pars = "varying"` will plot all varying effects. This may be too much if you have multiple varying effects. To select just one, use a regular expression in `regex_pars`. Two very handy operators are "^" (begins with) and "$" (ends with).
Just to show that this "faceting" works for almost all of the many plot types, we now do two columns of `"dens_overlay"`: 133 | 134 | ```{r} 135 | plot_pars(fit, regex_pars = "^cp_1_id", type = "dens_overlay", ncol = 2) 136 | ``` 137 | 138 | You can also do posterior predictive checking with facets. I think that for the relatively univariate models supported as of `mcp` 0.3, this does not add much new information over and above `plot(fit, facet_by = "id")`, but it's a standard assessment that many will be acquainted with: 139 | 140 | ```{r} 141 | pp_check(fit, facet_by = "id") 142 | ``` 143 | 144 | 145 | 146 | 147 | # Priors for varying effects 148 | You can see the priors of the model like this: 149 | 150 | ```{r} 151 | cbind(fit$prior) 152 | ``` 153 | 154 | The prior `cp_1_sd` is the population-level standard deviation of `cp_1_id`, the latter of which is applied to all levels of `id`. This is also apparent if you inspect the JAGS code for this model. The truncation of the varying effects looks quite contrived, but it simply keeps them between the two adjacent (population-level) change points. 155 | 156 | 157 | # JAGS code 158 | Here is the JAGS code for the model used in this article: 159 | 160 | ```{r} 161 | fit$jags_code 162 | ``` 163 | 164 | --------------------------------------------------------------------------------