├── .Rbuildignore ├── .Rprofile ├── .gitignore ├── .travis.yml ├── DESCRIPTION ├── NAMESPACE ├── NEWS ├── R ├── .Rapp.history ├── Qm.R ├── af.R ├── bglmnet.R ├── glmfence.R ├── lmfence.R ├── mplot-package.R ├── mplot.R ├── qrange.R ├── sigMM.R ├── sstab.R ├── utils-pipe.R └── vis.R ├── README.md ├── _pkgdown.yml ├── data ├── .Rapp.history ├── artificialeg.rda ├── bodyfat.rda ├── diabetes.rda ├── fev.rda └── wallabies.rda ├── docs ├── .nojekyll ├── 404.html ├── articles │ ├── af.html │ ├── af_files │ │ └── header-attrs-2.9 │ │ │ └── header-attrs.js │ ├── artificial.html │ ├── artificial_files │ │ ├── figure-html │ │ │ └── pairsplot-1.png │ │ └── header-attrs-2.9 │ │ │ └── header-attrs.js │ ├── background.html │ ├── background_files │ │ └── header-attrs-2.9 │ │ │ └── header-attrs.js │ ├── birthweight.html │ ├── birthweight_files │ │ └── header-attrs-2.9 │ │ │ └── header-attrs.js │ ├── diabetes.html │ ├── diabetes_files │ │ ├── crosstalk-1.0.0 │ │ │ ├── css │ │ │ │ └── crosstalk.css │ │ │ └── js │ │ │ │ ├── crosstalk.js │ │ │ │ ├── crosstalk.js.map │ │ │ │ ├── crosstalk.min.js │ │ │ │ └── crosstalk.min.js.map │ │ ├── crosstalk-1.1.1 │ │ │ ├── css │ │ │ │ └── crosstalk.css │ │ │ └── js │ │ │ │ ├── crosstalk.js │ │ │ │ ├── crosstalk.js.map │ │ │ │ ├── crosstalk.min.js │ │ │ │ └── crosstalk.min.js.map │ │ ├── datatables-binding-0.12 │ │ │ └── datatables.js │ │ ├── datatables-binding-0.18 │ │ │ └── datatables.js │ │ ├── datatables-css-0.0.0 │ │ │ └── datatables-crosstalk.css │ │ ├── dt-core-1.10.20 │ │ │ ├── css │ │ │ │ ├── jquery.dataTables.extra.css │ │ │ │ └── jquery.dataTables.min.css │ │ │ └── js │ │ │ │ └── jquery.dataTables.min.js │ │ ├── header-attrs-2.9 │ │ │ └── header-attrs.js │ │ ├── htmlwidgets-1.5.1 │ │ │ └── htmlwidgets.js │ │ ├── htmlwidgets-1.5.3 │ │ │ └── htmlwidgets.js │ │ ├── jquery-1.12.4 │ │ │ ├── LICENSE.txt │ │ │ └── jquery.min.js │ │ └── jquery-3.5.1 │ │ │ ├── jquery-AUTHORS.txt │ │ │ ├── jquery.js │ │ │ ├── jquery.min.js │ │ │ └── jquery.min.map 
│ ├── images │ │ ├── artafboTF.png │ │ ├── figure4a.png │ │ ├── figure4b.png │ │ ├── figure4c.png │ │ ├── figure5a.png │ │ ├── figure5b.png │ │ ├── figure5c.png │ │ ├── figure5d.png │ │ ├── nature.png │ │ ├── oncology.png │ │ ├── plotvis.png │ │ └── thyroid.png │ ├── index.html │ ├── interactive.html │ ├── interactive_files │ │ └── header-attrs-2.9 │ │ │ └── header-attrs.js │ ├── msp.html │ ├── msp_files │ │ └── header-attrs-2.9 │ │ │ └── header-attrs.js │ ├── people.html │ ├── people_files │ │ └── header-attrs-2.9 │ │ │ └── header-attrs.js │ ├── publications.html │ ├── publications_files │ │ └── header-attrs-2.9 │ │ │ └── header-attrs.js │ ├── timing.html │ ├── timing_files │ │ └── header-attrs-2.9 │ │ │ └── header-attrs.js │ ├── vip.html │ └── vip_files │ │ └── header-attrs-2.9 │ │ └── header-attrs.js ├── authors.html ├── bootstrap-toc.css ├── bootstrap-toc.js ├── docsearch.css ├── docsearch.js ├── index.html ├── link.svg ├── pkgdown.css ├── pkgdown.js ├── pkgdown.yml ├── reference │ ├── Rplot001.png │ ├── Rplot002.png │ ├── Rplot003.png │ ├── af-1.png │ ├── af.html │ ├── artificialeg.html │ ├── bglmnet-1.png │ ├── bglmnet.html │ ├── bodyfat.html │ ├── diabetes.html │ ├── fev.html │ ├── glmfence.html │ ├── index.html │ ├── lmfence.html │ ├── mplot-package.html │ ├── mplot.html │ ├── pipe.html │ ├── plot.af.html │ ├── plot.bglmnet.html │ ├── plot.vis-1.png │ ├── plot.vis-2.png │ ├── plot.vis-3.png │ ├── plot.vis.html │ ├── print.af.html │ ├── print.vis.html │ ├── process.fn.html │ ├── summary.af.html │ ├── txt.fn.html │ ├── vis-1.png │ ├── vis-2.png │ ├── vis-3.png │ ├── vis.html │ └── wallabies.html └── sitemap.xml ├── inst └── CITATION ├── man ├── .Rapp.history ├── af.Rd ├── artificialeg.Rd ├── bglmnet.Rd ├── bodyfat.Rd ├── diabetes.Rd ├── fev.Rd ├── glmfence.Rd ├── lmfence.Rd ├── mplot-package.Rd ├── mplot.Rd ├── pipe.Rd ├── plot.af.Rd ├── plot.bglmnet.Rd ├── plot.vis.Rd ├── print.af.Rd ├── print.vis.Rd ├── process.fn.Rd ├── summary.af.Rd ├── txt.fn.Rd ├── 
vis.Rd └── wallabies.Rd ├── mplot.Rproj └── vignettes ├── af.Rmd ├── apa-old-doi-prefix.csl ├── artificial.Rmd ├── artificial_cache └── html │ ├── __packages │ ├── pairsplot_6c89e27e0924f8fe4f44e1cca65b063f.RData │ ├── pairsplot_6c89e27e0924f8fe4f44e1cca65b063f.rdb │ └── pairsplot_6c89e27e0924f8fe4f44e1cca65b063f.rdx ├── background.Rmd ├── birthweight.Rmd ├── bw_lm.RData ├── bw_main.RData ├── diabetes.Rmd ├── diabetes_int.RData ├── diabetes_main.RData ├── images ├── .DS_Store ├── artafboTF.png ├── favicon.ico ├── figure4a.png ├── figure4b.png ├── figure4c.png ├── figure5a.png ├── figure5b.png ├── figure5c.png ├── figure5d.png ├── nature.png ├── oncology.png ├── plotvis.png └── thyroid.png ├── interactive.Rmd ├── jss.bib ├── msp.Rmd ├── people.Rmd ├── publications.Rmd ├── publications.md ├── timing.Rmd └── vip.Rmd /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^.*\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^\.travis\.yml$ 4 | ^docs$ 5 | ^_pkgdown\.yml$ 6 | vignettes/ 7 | -------------------------------------------------------------------------------- /.Rprofile: -------------------------------------------------------------------------------- 1 | options(repos = c(CRAN="http://cran.rstudio.com")) -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | inst/doc 5 | .DS_Store 6 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | # Sample .travis.yml for R projects 2 | 3 | language: r 4 | warnings_are_errors: true 5 | sudo: required 6 | 7 | env: 8 | global: 9 | - CRAN: http://cran.rstudio.com 10 | 11 | notifications: 12 | email: 13 | on_success: change 14 | on_failure: change 15 | 
--------------------------------------------------------------------------------
/DESCRIPTION:
--------------------------------------------------------------------------------
1 | Package: mplot
2 | Type: Package
3 | Title: Graphical Model Stability and Variable Selection Procedures
4 | Version: 1.0.6
5 | Date: 2021-07-10
6 | Authors@R: c(person("Garth", "Tarr", role = c("aut", "cre"), email = "garth.tarr@gmail.com", comment = c(ORCID = "0000-0002-6605-7478")), person("Samuel", "Mueller", role = "aut", email = "samuel.mueller@sydney.edu.au", comment = c(ORCID = "0000-0002-3087-8127")), person("Alan H", "Welsh", role = "aut", email = "alan.welsh@anu.edu.au", comment = c(ORCID = "0000-0002-3165-9559")))
7 | Description: Model stability and variable inclusion plots [Mueller and Welsh
8 |     (2010, <doi:10.1111/j.1751-5823.2010.00108.x>); Murray, Heritier and Mueller
9 |     (2013, <doi:10.1002/sim.5855>)] as well as the adaptive fence [Jiang et al.
10 |     (2008, <doi:10.1214/07-AOS517>); Jiang et al.
11 |     (2009, <doi:10.1016/j.spl.2008.10.014>)] for linear and generalised linear models.
12 | License: GPL (>= 2)
13 | Suggests:
14 |     knitr,
15 |     mvoutlier,
16 |     glmulti,
17 |     rmarkdown,
18 |     DT,
19 |     MASS
20 | Imports:
21 |     leaps,
22 |     foreach,
23 |     parallel,
24 |     bestglm,
25 |     doParallel,
26 |     doRNG,
27 |     plyr,
28 |     shinydashboard,
29 |     shiny,
30 |     glmnet,
31 |     graphics,
32 |     stats,
33 |     googleVis,
34 |     ggplot2,
35 |     reshape2,
36 |     scales,
37 |     dplyr,
38 |     tidyr,
39 |     magrittr
40 | URL: https://garthtarr.github.io/mplot/, https://github.com/garthtarr/mplot
41 | Roxygen: list(markdown = TRUE)
42 | LazyData: TRUE
43 | RoxygenNote: 7.1.1
44 | Encoding: UTF-8
45 |
--------------------------------------------------------------------------------
/NAMESPACE:
--------------------------------------------------------------------------------
1 | # Generated by roxygen2: do not edit by hand
2 |
3 | S3method(plot,af)
4 | S3method(plot,bglmnet)
5 | S3method(plot,vis)
6 | S3method(print,af)
7 | S3method(print,vis)
8 | S3method(summary,af)
9 | export("%>%")
10 | export(af)
11 | export(bglmnet)
12 |
export(glmfence)
13 | export(lmfence)
14 | export(mplot)
15 | export(process.fn)
16 | export(vis)
17 | import(foreach)
18 | import(parallel)
19 | import(shiny)
20 | import(shinydashboard)
21 | importFrom(doRNG,"%dorng%")
22 | importFrom(dplyr,n)
23 | importFrom(magrittr,"%>%")
24 |
--------------------------------------------------------------------------------
/NEWS:
--------------------------------------------------------------------------------
1 | mplot 1.0.5 [2019-03-02]
2 | ------------------------
3 |
4 | * Reduce reliance on bestglm as it appears to be no longer maintained
5 |
6 | mplot 1.0.3 [2019-02-13]
7 | ------------------------
8 |
9 | * Fix for plot.bglmnet with min.prob in mplot
10 | * Resolve warning messages around exporting objects
11 |
12 |
13 | mplot 1.0.2 [2019-01-22]
14 | ------------------------
15 |
16 | * Compatibility update with a new version of dplyr.
17 |
18 |
19 | mplot 1.0.0 [2018-02-13]
20 | ------------------------
21 |
22 | * Version 1.0.0 release to coincide with appearance of
23 |   Journal of Statistical Software article.
24 | * Documentation updates.
25 | * Examples simplified so that run time of each is <5 seconds
26 |
27 |
28 | mplot 0.8.2 [2017-11-26]
29 | ------------------------
30 |
31 | * The bglmnet function has been re-written
32 | * Added boot_size plot option for bglmnet objects
33 | * Vignettes removed, replaced with pkgdown website
34 |
35 |
36 | mplot 0.8.1 [2017-11-18]
37 | ------------------------
38 |
39 | * More sensible nbest parameter behaviour in the vis function.
40 |   Now nbest = "all" reverts back to nbest = 1 for models with
41 |   more than 15 parameters. This can be overridden by specifying
42 |   an integer, rather than specifying "all". For large model sizes,
43 |   if the number specified is too large, it will lead to memory
44 |   overruns and failures with dependent packages.
45 | * Fixed vignette/pkgdown issue where there were multiple
46 |   vignettes created with the same name.
47 | 48 | 49 | mplot 0.8.0 [2017-11-03] 50 | ------------------------ 51 | 52 | * Random seed parameter for reproducible results 53 | * New website 54 | * Small fixes for JSS article (e.g. googleVis axis padding) 55 | 56 | 57 | mplot 0.7.8 [2016-08-17] 58 | ------------------------ 59 | 60 | * reimplemented classic plots in ggplot2 61 | * classic plots are now default (reviewer feedback) 62 | * improved documentation 63 | - more detail about procedures 64 | - more detail about plotting methods 65 | * fixed the passing of weights for weighted models 66 | * can now obtain full loss v dimension which="lvk" 67 | plots for glms 68 | * vis() default is now nbest="all". 69 | * experimental support for glmulti as a backend 70 | instead of bestglm for glms for the vis function. 71 | This enables users to enforce marginality constraints. 72 | 73 | mplot 0.7.6 [2016-05-20] 74 | ------------------------ 75 | 76 | * maintenance, larger update coming soon 77 | 78 | 79 | mplot 0.7.5 [2015-11-05] 80 | ------------------------ 81 | 82 | * added fev and wallabies dataset 83 | 84 | 85 | mplot 0.7.4 [2015-10-10] 86 | ------------------------ 87 | 88 | * added tag option to googleVis plots to facilitate easier 89 | plotting in rmarkdown documents. 
90 | * citation updated to reference arXiv article
91 | * bootstrapping glmnet (bglmnet) function refined and
92 |   added to the mplot() shiny interface
93 |
94 | mplot 0.7.1 [2015-09-09]
95 | ------------------------
96 |
97 | * Fix for undefined globals (CRAN submission)
98 |
99 | mplot 0.7.0 [2015-09-08]
100 | ------------------------
101 |
102 | * Release to coincide with JSS article
103 | * Implements "all" for vis function
104 | * Improved bglmnet plots
105 | * Consistency with interactive plotting methods
106 | * Improved documentation
107 |
108 | mplot 0.6.5 [2015-07-11]
109 | ------------------------
110 |
111 | * Numerous refinements including consistency of
112 |   style between the classic plots
113 | * ylim argument for classic boot and lvk plots
114 | * legend.position argument for classic af plots
115 |
116 | mplot 0.6.0 [2015-06-10]
117 | ------------------------
118 |
119 | * First CRAN release
120 | * Reimplemented vis function to
121 |   avoid massive memory use for moderate model
122 |   sizes. Now runs much faster and leaner.
123 |
124 |
125 | mplot 0.5.5 [2015-04-10]
126 | ------------------------
127 |
128 | * mplot interface now uses shinydashboard
129 | * the scatterplot matrix from the mplot shiny
130 |   interface has now been spun off into its
131 |   own package: pairsD3
132 | * issue with zooming on transparent backgrounds reported to
133 |   GoogleCharts - the workaround is to not use
134 |   backgroundColor = 'transparent' until it is fixed
135 |   at the source
136 |
137 |
138 | mplot 0.5.0 [2015-02-01]
139 | ------------------------
140 |
141 | * Limited robustness via screening.
142 | * Weights now get passed through in the adaptive fence.
143 |
144 |
145 | mplot 0.4.9 [2014-12-05]
146 | ------------------------
147 |
148 | * Changed parallel backend for af() from doMC
149 |   to doParallel which should work for both
150 |   Unix-like systems and Windows.
151 | * Added redundant variable to vis().
152 | * Fixed issue with deparse(model.formula) 153 | when the model.formula was too long for 154 | deparse to cope with. 155 | 156 | 157 | mplot 0.4.7 [2014-11-10] 158 | ------------------------ 159 | 160 | * New data sets: diabetes and artificial example 161 | 162 | mplot 0.4.6 [2014-10-15] 163 | ------------------------ 164 | 165 | * First public version 166 | 167 | 168 | -------------------------------------------------------------------------------- /R/.Rapp.history: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/R/.Rapp.history -------------------------------------------------------------------------------- /R/Qm.R: -------------------------------------------------------------------------------- 1 | #' A measure of lack of fit 2 | #' 3 | #' This function calculates a lack-of-fit measure for 4 | #' a model, \eqn{\hat{Q}_M}. It currently simply 5 | #' returns the negative log-likelihood value of an 6 | #' estimated model object. 7 | #' 8 | #' @param object Any object from which a log-likelihood value can be extracted. 9 | #' @param method method the model selection method to be used. Currently 10 | #' only \code{method = "ML"} is supported (perhaps in the future 11 | #' \code{method = "MVC"} will be implemented). 12 | #' @noRd 13 | Qm = function(object,method){ 14 | if(method=="ML"){ 15 | return(-as.numeric(stats::logLik(object))) 16 | } 17 | } 18 | -------------------------------------------------------------------------------- /R/glmfence.R: -------------------------------------------------------------------------------- 1 | #' The fence procedure for generalised linear models 2 | #' 3 | #' This function implements the fence procedure to 4 | #' find the best generalised linear model. 5 | #' 6 | #' @param mf an object of class \code{\link[stats]{glm}} 7 | #' specifying the full model. 
8 | #' @param cstar the boundary of the fence, typically found
9 | #' through bootstrapping.
10 | #' @param nvmax the maximum number of variables that will
11 | #' be considered in the model.
12 | #' @param adaptive logical. If \code{TRUE} the boundary of the fence is
13 | #' given by cstar. Otherwise, the original (non-adaptive) fence
14 | #' is performed where the boundary is cstar times \eqn{\hat{\sigma}_{M,\tilde{M}}}.
15 | #' @param trace logical. If \code{TRUE} the function prints out its
16 | #' progress as it iterates up through the dimensions.
17 | #' @param ... further arguments (currently unused)
18 | #' @seealso \code{\link{af}}, \code{\link{lmfence}}
19 | #' @references Jiming Jiang, Thuan Nguyen, J. Sunil Rao,
20 | #' A simplified adaptive fence procedure, Statistics &
21 | #' Probability Letters, Volume 79, Issue 5, 1 March 2009,
22 | #' Pages 625-629, http://dx.doi.org/10.1016/j.spl.2008.10.014.
23 | #' @export
24 | #' @keywords Internal
25 | #' @family fence
26 |
27 | glmfence = function(mf,
28 |                     cstar,
29 |                     nvmax,
30 |                     adaptive=TRUE,
31 |                     trace=TRUE,...){
32 |   method="ML"
33 |   if(!inherits(mf,"glm")){
34 |     stop("The argument to mf needs to be a glm object.")
35 |   }
36 |   if(attr(mf$terms,"intercept")==0){
37 |     stop("Please allow for an intercept in your model.")
38 |   }
39 |   m = mextract(mf)
40 |   kf = m$k
41 |   fixed = m$fixed
42 |   family = m$family
43 |   yname = m$yname
44 |   Xy = m$X
45 |   n = m$n
46 |   wts = m$wts
47 |   if(missing(nvmax)) nvmax=kf
48 |   null.ff = stats::as.formula(paste(yname,"~1")) # null formula
49 |   m0 = stats::glm(null.ff, data = Xy, family=family, weights = wts) # null model
50 |   Qmf = Qm(mf, method=method) # Qm for the full model
51 |   Qm0 = Qm(m0, method=method) # Qm for the null model
52 |   ret = met = list()
53 |
54 |   # Null model
55 |   if(trace) cat(paste("Null model "))
56 |   UB = Qmf + cstar*sigMM(k.mod=1, method, k.full=kf,adaptive=adaptive)
57 |   if(Qm0<=UB){
58 |     if(trace) txt.fn(Qm0,UB,m0)
59 |     ret[[1]] = null.ff
60 |
return(ret) 61 | } else if(trace) cat("(Not a candidate model) \n") 62 | 63 | if(cstar<5){ # avoids having to add variables to get the full model 64 | nvmax = kf 65 | prev.nvmax = nvmax 66 | } else if(nvmax<5){ 67 | prev.nvmax = nvmax 68 | nvmax = nvmax+5 69 | } else prev.nvmax = nvmax 70 | # look around for the best model at each model size 71 | while(prev.nvmax<=kf){ 72 | prev.nvmax = nvmax 73 | bg = bestglm::bestglm(Xy=Xy, family=family, 74 | IC = "BIC", 75 | TopModels = 5*kf, 76 | nvmax = nvmax, weights=wts) 77 | lc = bg$Subsets[,1:kf]+0 # 'leaps' candidates 78 | for(i in 2:nvmax){ 79 | if(trace) cat(paste("Model size:",i,"")) 80 | UB = Qmf + cstar*sigMM(k.mod=i, method, k.full=kf, adaptive=adaptive) 81 | mnames = colnames(lc)[which(lc[i,]==1)] 82 | ff = stats::as.formula(paste(yname," ~ ",paste(mnames[-1],collapse="+"),sep="")) 83 | em = stats::glm(formula=ff, data=Xy, family=family, weights=wts) 84 | hatQm = Qm(em,method=method) 85 | if(hatQm<=UB){ 86 | if(trace){ 87 | cat("\n") 88 | cat("Candidate model found via bestglm. \n") 89 | cat("Exploring other options at this model size. 
") 90 | txt.fn(hatQm,UB,em) 91 | } 92 | pos = 1 93 | environment(ff) = globalenv() 94 | ret[[pos]] = ff # record the result 95 | met[[pos]] = hatQm #record its score 96 | # look for others at this model size: 97 | lfm = bg$BestModels[,1:(kf-1)]+0 98 | lfm.sum = apply(lfm,1,sum) 99 | lfm = lfm[lfm.sum==i-1,] 100 | # remove already estimated model from lfm: 101 | check.fn = function(x) !all(x==lc[i,-1]) 102 | lfm = lfm[apply(lfm,1,check.fn),] 103 | if(dim(lfm)[1]>0){ 104 | for(j in 1:dim(lfm)[1]){ 105 | mnames = colnames(lfm)[which(lfm[j,]==1)] 106 | ff = stats::as.formula(paste(yname," ~ ", 107 | paste(mnames,collapse="+"), 108 | sep="")) 109 | em = stats::glm(ff, data=Xy, family=family, weights=wts) 110 | hatQm = Qm(em,method=method) 111 | if(hatQm<=UB){ 112 | if(trace) txt.fn(hatQm,UB,em) 113 | pos = pos+1 114 | environment(ff) = globalenv() 115 | ret[[pos]] = ff 116 | met[[pos]] = hatQm 117 | } 118 | } 119 | return(ret[rank(unlist(met),ties.method="random")]) 120 | } else return(ret) 121 | } 122 | if(trace) cat("(No candidate models found) \n") 123 | } 124 | if(trace) cat(" (No candidate models found: increasing nvmax) \n", cstar) 125 | nvmax = nvmax+5 126 | } 127 | } 128 | -------------------------------------------------------------------------------- /R/lmfence.R: -------------------------------------------------------------------------------- 1 | #' The fence procedure for linear models 2 | #' 3 | #' This function implements the fence procedure to 4 | #' find the best linear model. 5 | #' 6 | #' @param mf an object of class \code{\link[stats]{lm}} 7 | #' specifying the full model. 8 | #' @param cstar the boundary of the fence, typically found 9 | #' through bootstrapping. 10 | #' @param nvmax the maximum number of variables that will be 11 | #' be considered in the model. 12 | #' @param adaptive logical. If \code{TRUE} the boundary of the fence is 13 | #' given by cstar. 
Otherwise, the original (non-adaptive) fence
14 | #' is performed where the boundary is cstar times \eqn{\hat{\sigma}_{M,\tilde{M}}}.
15 | #' @param trace logical. If \code{TRUE} the function prints out its
16 | #' progress as it iterates up through the dimensions.
17 | #' @param force.in the names of variables that should be forced
18 | #' into all estimated models.
19 | #' @param ... further arguments (currently unused)
20 | #' @seealso \code{\link{af}}, \code{\link{glmfence}}
21 | #' @references Jiming Jiang, Thuan Nguyen, J. Sunil Rao,
22 | #' A simplified adaptive fence procedure, Statistics &
23 | #' Probability Letters, Volume 79, Issue 5, 1 March 2009,
24 | #' Pages 625-629, http://dx.doi.org/10.1016/j.spl.2008.10.014.
25 | #' @export
26 | #' @keywords Internal
27 | #' @family fence
28 | #' @examples
29 | #' n = 40 # sample size
30 | #' beta = c(1,2,3,0,0)
31 | #' K=length(beta)
32 | #' set.seed(198)
33 | #' X = cbind(1,matrix(rnorm(n*(K-1)),ncol=K-1))
34 | #' e = rnorm(n)
35 | #' y = X%*%beta + e
36 | #' dat = data.frame(y,X[,-1])
37 | #' # Non-adaptive approach (not recommended)
38 | #' lm1 = lm(y~.,data=dat)
39 | #' lmfence(lm1,cstar=log(n),adaptive=FALSE)
40 |
41 | lmfence = function(mf, cstar,
42 |                    nvmax,
43 |                    adaptive=TRUE,
44 |                    trace=TRUE,
45 |                    force.in=NULL,...){
46 |   method="ML"
47 |   if(!inherits(mf,"lm") || inherits(mf,"glm")){ # glm objects also inherit from "lm"
48 |     stop("The argument to mf needs to be a lm object.")
49 |   }
50 |   if(attr(mf$terms,"intercept")==0){
51 |     stop("Please allow for an intercept in your model.")
52 |   }
53 |   m = mextract(mf)
54 |   kf = m$k
55 |   fixed = m$fixed
56 |   yname = m$yname
57 |   data = m$X
58 |   n = m$n
59 |   wts = m$wts
60 |   if(missing(nvmax)) nvmax=kf
61 |   null.ff = stats::as.formula(paste(yname,"~1"))
62 |   m0 = stats::lm(null.ff, data = data, weights=m$wts) # null model
63 |   Qmf = Qm(mf, method=method) # Qm for the full model
64 |   Qm0 = Qm(m0, method=method) # Qm for the null model
65 |   ret = met = list()
66 |   # Null model
67 |   if(trace) cat(paste("Null model "))
68 |   UB = Qmf +
cstar*sigMM(k.mod = 1, method = method, 69 | k.full = kf, adaptive = adaptive) 70 | if(Qm0<=UB){ 71 | if(trace) txt.fn(Qm0,UB,m0) 72 | ret[[1]] = null.ff # record the result 73 | return(ret) 74 | } else if(trace) cat("(Not a candidate model) \n") 75 | 76 | if(cstar<5){ # avoids having to add variables to get the full model 77 | nvmax = kf 78 | prev.nvmax = nvmax 79 | } else if(nvmax<5){ 80 | prev.nvmax = nvmax 81 | nvmax = nvmax+5 82 | } else prev.nvmax = nvmax 83 | prev.nvmax = min(prev.nvmax,kf) 84 | # look around for the best model at each model size 85 | while(prev.nvmax<=kf){ 86 | prev.nvmax = nvmax 87 | # finds the best candidate for each model size 88 | rss = do.call(leaps::regsubsets,list(x=fixed, 89 | data=data, 90 | nbest = 5+kf, 91 | nvmax = nvmax, 92 | intercept=TRUE, 93 | force.in=force.in, 94 | really.big=TRUE, 95 | weights = m$wts)) 96 | rs = summary(rss) 97 | rs.which = data.frame(rs$which+0,row.names = NULL) 98 | rs.k = apply(rs.which,1,sum) 99 | rs.bic = split(rs$bic,f = rs.k) 100 | leaps.cands = lapply(split(rs.which,rs.k),FUN = function(x) x[1,]) 101 | # best model of each size to test if it passes the fence 102 | leaps.cands = do.call(rbind,leaps.cands) 103 | lc.k = apply(leaps.cands,1,sum) 104 | # next best models of each size to test if also pass the fence 105 | other.cands = lapply(split(rs.which,rs.k),FUN = function(x) x[-1,]) 106 | start = lc.k[1] #2+length(force.in) 107 | for(i in start:nvmax){ 108 | if(trace) cat(paste("Model size:",i,"")) 109 | UB = Qmf + cstar*sigMM(k.mod = i, method = method, 110 | k.full = kf, adaptive = adaptive) 111 | mnames = colnames(leaps.cands)[which(leaps.cands[lc.k==i,]==1)] 112 | ff = stats::as.formula(paste(yname," ~ ", 113 | paste(mnames[-1],collapse="+"),sep="")) 114 | em = stats::lm(formula=ff, data=data, weights=m$wts) 115 | hatQm = Qm(em,method=method) 116 | if(hatQm<=UB){ 117 | if(trace){ 118 | cat("\n Candidate model found via leaps. \n") 119 | cat("Exploring other options at this model size. 
") 120 | txt.fn(hatQm,UB,em) 121 | } 122 | pos=1 123 | environment(ff) = globalenv() 124 | ret[[pos]] = ff # record the result 125 | met[[pos]] = hatQm #record its score 126 | lmf = other.cands[[paste(i)]] 127 | if(dim(lmf)[1]>0){ 128 | for(j in 1:dim(lmf)[1]){ 129 | mnames = colnames(lmf)[which(lmf[j,]==1)] 130 | ff = stats::as.formula(paste(yname," ~ ", 131 | paste(mnames[-1],collapse="+"), 132 | sep="")) 133 | em = stats::lm(ff, data = data, weights=m$wts) 134 | hatQm = Qm(em,method=method) 135 | if(hatQm<=UB){ 136 | if(trace) txt.fn(hatQm,UB,em) 137 | pos = pos+1 138 | environment(ff) = globalenv() 139 | ret[[pos]] = ff 140 | met[[pos]] = hatQm 141 | } else break 142 | } 143 | return(ret[rank(unlist(met),ties.method="random")]) 144 | } else return(ret) 145 | } 146 | if(trace) cat("(No candidate models found) \n") 147 | } 148 | if(trace) cat(" (No candidate models found: increasing nvmax) \n",cstar) 149 | nvmax = nvmax+5 150 | } 151 | } 152 | -------------------------------------------------------------------------------- /R/qrange.R: -------------------------------------------------------------------------------- 1 | #' Identify an appropriate rance of values over which to bootstrap 2 | #' 3 | #' This function takes an upper and lower dimension size (obtained by 4 | #' forwards and backwards model selection and then adding and subtracting 5 | #' 2 from each of the extremes to encompas a broader range of models). 6 | #' For both the small and large model size, the "best" model is identified 7 | #' using the \code{leaps} package and the corresponding lack of fit measure 8 | #' is calculated. 
9 | #' 10 | #' @param k.range list with dimension elements k.max and k.min 11 | #' @param yname name of dependent variable 12 | #' @param fixed the full model formula 13 | #' @param data full data table 14 | #' @param method method used in Qm 15 | #' @param force.in which variables to force into the model 16 | #' @param model.type currently only lm or glm 17 | #' @param family for glms. 18 | #' @noRd 19 | qrange = function(k.range,yname,fixed, 20 | data,method,force.in, 21 | model.type,family){ 22 | kf = k.range$k.max 23 | if(model.type=="lm"){ 24 | cand.models = summary(leaps::regsubsets(x = fixed, 25 | data = data, 26 | nbest = 1, 27 | nvmax = kf, 28 | force.in=force.in))$which+0 29 | } else if(model.type=="glm"){ 30 | cand.models = bestglm::bestglm(Xy=data, 31 | family=family, 32 | IC = "BIC", 33 | TopModels = 1, 34 | nvmax = )$Subsets[,1:kf]+0 35 | } 36 | small.row = which(rowSums(cand.models)==k.range$k.min) 37 | small.names = colnames(cand.models)[which(cand.models[small.row,]==1)] 38 | if(length(small.names)>1){ 39 | small.ff = paste(yname," ~ ",paste(small.names[-1],collapse="+"),sep="") 40 | } else small.ff = paste(yname,"~1") 41 | small.ff = stats::as.formula(small.ff) 42 | 43 | big.row = which(rowSums(cand.models)==kf) 44 | big.names = colnames(cand.models)[which(cand.models[big.row,]==1)] 45 | if(length(big.names)>1){ 46 | big.ff = paste(yname," ~ ",paste(big.names[-1],collapse="+"),sep="") 47 | } else big.ff = paste(yname,"~1") 48 | big.ff = stats::as.formula(big.ff) 49 | if(model.type=="lm"){ 50 | small.em = stats::lm(small.ff, data=data) 51 | big.em = stats::lm(big.ff, data=data) 52 | } else if(model.type=="glm"){ 53 | small.em = stats::glm(small.ff, data=data, family=family) 54 | big.em = stats::glm(big.ff, data=data, family=family) 55 | } 56 | Q.min = Qm(big.em,method=method) 57 | Q.max = Qm(small.em,method=method) 58 | return(list(Q.min=Q.min,Q.max=Q.max)) 59 | } 60 | -------------------------------------------------------------------------------- 
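As a rough illustration of what qrange() returns, the sketch below computes the same two quantities (Q.min from the larger model, Q.max from the smaller one) using base R only. The helper name neg_loglik, the simulated data and the two hand-picked models are assumptions made for this example; they stand in for the package's internal Qm() and for the "best" models that leaps or bestglm would identify.

```r
# Lack-of-fit measure mirroring Qm() with method = "ML":
# the negative log-likelihood of a fitted model.
neg_loglik = function(object) -as.numeric(stats::logLik(object))

# Simulated data (illustrative only): y depends on x1 but not on x2 or x3.
set.seed(1)
n = 50
dat = data.frame(x1 = stats::rnorm(n),
                 x2 = stats::rnorm(n),
                 x3 = stats::rnorm(n))
dat$y = 1 + 2 * dat$x1 + stats::rnorm(n)

# Hypothetical stand-ins for the best small and best large models.
small.em = stats::lm(y ~ 1, data = dat)
big.em = stats::lm(y ~ x1 + x2 + x3, data = dat)

# The larger nested model fits at least as well, so it gives the lower bound.
q.range = list(Q.min = neg_loglik(big.em), Q.max = neg_loglik(small.em))
print(q.range)
```

Because the two models are nested, Q.min can never exceed Q.max, which is what makes the pair usable as the end points of a range of lack-of-fit values.
--------------------------------------------------------------------------------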
/R/sigMM.R:
--------------------------------------------------------------------------------
1 | #' Standard deviation of the difference between \eqn{\hat{Q}_M} and \eqn{\hat{Q}_{\tilde{M}}}
2 | #'
3 | #' This function calculates \eqn{\hat{\sigma}_{M,\tilde{M}}},
4 | #' the standard deviation of
5 | #' the difference between the two lack-of-fit measures
6 | #' \eqn{\hat{Q}_M} and \eqn{\hat{Q}_{\tilde{M}}} as
7 | #' described in Jiang et al. (2008). When using the
8 | #' adaptive fence procedure, this quantity no longer needs
9 | #' to be calculated and the function simply returns a value of 1.
10 | #'
11 | #' @param k.mod number of parameters in the estimated model
12 | #' @param method the model selection method to be used. Currently
13 | #' only \code{method = "ML"} is supported (perhaps in the future
14 | #' \code{method = "MVC"} will be implemented).
15 | #' @param k.full number of parameters in the full model
16 | #' @param adaptive logical. If \code{TRUE} the boundary of the fence is
17 | #' given by cstar. Otherwise, the original (non-adaptive) fence
18 | #' is performed where the boundary is cstar times \eqn{\hat{\sigma}_{M,\tilde{M}}}.
19 | #' @noRd
20 | sigMM = function(k.mod,method,k.full,adaptive){
21 |   if(method=="ML" && !adaptive){
22 |     return(sqrt((k.full-k.mod)/2))
23 |   } else if (adaptive){
24 |     return(1)
25 |   }
26 | }
--------------------------------------------------------------------------------
/R/sstab.R:
--------------------------------------------------------------------------------
1 | # data("diabetes",package="lars")
2 | # x = diabetes$x
3 | # y = diabetes$y
4 | # df = data.frame(scale(cbind(y,x)))
5 | # lm1 = lm(y ~ ., data = df)
6 | #
7 | # sstab = function(mf, B = 100){
8 | #   full_coeff = coefficients(mf)
9 | #   kf = length(full_coeff)
10 | #   coef.res = matrix(ncol = kf, nrow = B)
11 | #   colnames(coef.res) = names(full_coeff)
12 | #   n.obs = length(resid(mf))
13 | #   for(i in 1:B){
14 | #     mod = stats::lm(stats::formula(mf),
15 | #                     data = model.frame(mf),
16 | #                     weights = stats::rexp(n = n.obs, rate = 1))
17 | #     coef.res[i,] = coefficients(mod)
18 | #   }
19 | #   return(coef.res)
20 | # }
21 | #
22 | # sj = sstab(lm1)
23 | # sj_ranks = apply(sj, 1, rank)
24 | # sj_rank_mean = sort(apply(sj_ranks, 1, mean), decreasing = TRUE)
25 | # sj_rank_sd = sort(apply(sj_ranks, 1, sd), decreasing = TRUE)
26 |
--------------------------------------------------------------------------------
/R/utils-pipe.R:
--------------------------------------------------------------------------------
1 | #' Pipe operator
2 | #'
3 | #' See \code{magrittr::\link[magrittr:pipe]{\%>\%}} for details.
4 | #'
5 | #' @name %>%
6 | #' @rdname pipe
7 | #' @keywords internal
8 | #' @export
9 | #' @importFrom magrittr %>%
10 | #' @usage lhs \%>\% rhs
11 | #' @param lhs A value or the magrittr placeholder.
12 | #' @param rhs A function call using the magrittr semantics.
13 | #' @return The result of calling `rhs(lhs)`.
14 | NULL
15 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # mplot: graphical model stability and variable selection procedures
2 |
3 | [![Travis-CI Build Status](https://travis-ci.org/garthtarr/mplot.svg?branch=master)](https://travis-ci.org/garthtarr/mplot) [![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/mplot)](https://cran.r-project.org/package=mplot) [![](http://cranlogs.r-pkg.org/badges/mplot)](https://cran.r-project.org/package=mplot) [![DL_Total](http://cranlogs.r-pkg.org/badges/grand-total/mplot?color=blue)](https://cran.r-project.org/package=mplot)
4 |
5 | The `mplot` package provides a collection of functions designed for exploratory model selection.
6 |
7 | We implement model stability and variable inclusion plots ([Mueller and Welsh (2010)](https://doi.org/10.1111/j.1751-5823.2010.00108.x); [Murray, Heritier and Mueller (2013)](https://doi.org/10.1002/sim.5855)) as well as the adaptive fence ([Jiang et al. (2008)](https://doi.org/10.1214/07-AOS517); [Jiang et al. (2009)](https://doi.org/10.1016/j.spl.2008.10.014)) for linear and generalised linear models. We address many practical implementation issues with sensible defaults and interactive graphics to highlight model selection stability. The speed of the implementation comes from the leaps package and multicore support for bootstrapping.
8 |
9 | The `mplot` package currently supports only linear and generalised linear models; however, work is progressing to incorporate survival models and mixed models.
10 |
11 | You can see an example of the output [here](https://garthtarr.com/apps/mplot/).
12 | 13 | ## Installation 14 | 15 | First, update your installed R packages to their most recent versions: 16 | 17 | ```s 18 | update.packages() 19 | ``` 20 | 21 | ### Stable release on CRAN 22 | 23 | The **mplot** package has been on [CRAN](https://cran.r-project.org/package=mplot) since June 2015. You can install it from CRAN in the usual way: 24 | 25 | ```s 26 | install.packages("mplot") 27 | library("mplot") 28 | ``` 29 | 30 | ### Development version on GitHub 31 | 32 | You can use the **devtools** package to install the development version of **mplot** from [GitHub](https://github.com/garthtarr/mplot): 33 | 34 | ```s 35 | # install.packages("devtools") 36 | devtools::install_github("garthtarr/mplot") 37 | library(mplot) 38 | ``` 39 | 40 | ## Usage 41 | 42 | A reference manual is available at [garthtarr.github.io/mplot](https://garthtarr.github.io/mplot/). 43 | 44 | ## Citation 45 | 46 | If you use this package to inform your model selection choices, please use the following citation: 47 | 48 | - Tarr G, Müller S and Welsh AH (2018). "mplot: An R Package for Graphical Model Stability and Variable Selection Procedures." _Journal of Statistical Software_, **83**(9), pp. 1–28. doi: 10.18637/jss.v083.i09. 
49 | 50 | From R you can use: 51 | 52 | ```s 53 | citation("mplot") 54 | toBibtex(citation("mplot")) 55 | ``` 56 | 57 | -------------------------------------------------------------------------------- /_pkgdown.yml: -------------------------------------------------------------------------------- 1 | template: 2 | params: 3 | bootswatch: lumen 4 | toc_depth: 3 5 | 6 | reference: 7 | - title: "Core functions" 8 | contents: 9 | - af 10 | - vis 11 | - bglmnet 12 | - mplot 13 | - title: "Data sets" 14 | contents: 15 | - bodyfat 16 | - diabetes 17 | - fev 18 | - wallabies 19 | - artificialeg 20 | - title: "Generic plot, print and summary methods" 21 | contents: 22 | - plot.af 23 | - plot.bglmnet 24 | - plot.vis 25 | - print.af 26 | - print.vis 27 | - summary.af 28 | - exclude: 29 | contents: 30 | - mplot-package 31 | 32 | 33 | navbar: 34 | title: "mplot" 35 | type: default 36 | left: 37 | - icon: fa-home fa-lg 38 | href: index.html 39 | - text: "Reference" 40 | href: reference/index.html 41 | - text: "Theory" 42 | menu: 43 | - text: "Background and philosophy" 44 | href: articles/background.html 45 | - text: "Variable inclusion plots" 46 | href: articles/vip.html 47 | - text: "Model stability plots" 48 | href: articles/msp.html 49 | - text: "Adaptive fence" 50 | href: articles/af.html 51 | - text: "Applications" 52 | menu: 53 | - text: "Publications" 54 | href: articles/publications.html 55 | - text: "Diabetes" 56 | href: articles/diabetes.html 57 | - text: "Birth weight" 58 | href: articles/birthweight.html 59 | - text: "Artificial example" 60 | href: articles/artificial.html 61 | - text: "Tips" 62 | menu: 63 | - text: "Interactive plots" 64 | href: articles/interactive.html 65 | - text: "Timing" 66 | href: articles/timing.html 67 | right: 68 | - icon: fa-github fa-lg 69 | href: https://github.com/garthtarr/mplot 70 | - icon: fa-info-circle fa-lg 71 | href: articles/people.html 72 | 73 | -------------------------------------------------------------------------------- 
/data/.Rapp.history: -------------------------------------------------------------------------------- 1 | load("/Users/garthtarr/Dropbox/Packages/mplot/data/artificialeg.rda") 2 | ls() 3 | ~/Dropbox/Packages/mplot/data/bodyfat.rda 4 | load("/Users/garthtarr/Dropbox/Packages/mplot/data/bodyfat.rda") 5 | ls() 6 | load("/Users/gt415/Dropbox/Packages/mplot/data/fev.rda") 7 | str(fev) 8 | -------------------------------------------------------------------------------- /data/artificialeg.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/data/artificialeg.rda -------------------------------------------------------------------------------- /data/bodyfat.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/data/bodyfat.rda -------------------------------------------------------------------------------- /data/diabetes.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/data/diabetes.rda -------------------------------------------------------------------------------- /data/fev.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/data/fev.rda -------------------------------------------------------------------------------- /data/wallabies.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/data/wallabies.rda -------------------------------------------------------------------------------- /docs/.nojekyll: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/.nojekyll -------------------------------------------------------------------------------- /docs/404.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Page not found (404) • mplot 9 | 10 | 11 | 12 | 13 | 14 | 15 | 19 | 20 | 21 | 22 | 23 |
Content not found. Please use links in the navbar.

Site built with pkgdown 1.6.1.9001.
162 | 163 | 164 | 165 | 166 | 167 | 168 | 169 | 170 | -------------------------------------------------------------------------------- /docs/articles/af_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/artificial_files/figure-html/pairsplot-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/artificial_files/figure-html/pairsplot-1.png -------------------------------------------------------------------------------- /docs/articles/artificial_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 
3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/background_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/birthweight_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 
3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/diabetes_files/crosstalk-1.0.0/css/crosstalk.css: -------------------------------------------------------------------------------- 1 | /* Adjust margins outwards, so column contents line up with the edges of the 2 | parent of container-fluid. */ 3 | .container-fluid.crosstalk-bscols { 4 | margin-left: -30px; 5 | margin-right: -30px; 6 | white-space: normal; 7 | } 8 | 9 | /* But don't adjust the margins outwards if we're directly under the body, 10 | i.e. we were the top-level of something at the console. */ 11 | body > .container-fluid.crosstalk-bscols { 12 | margin-left: auto; 13 | margin-right: auto; 14 | } 15 | 16 | .crosstalk-input-checkboxgroup .crosstalk-options-group .crosstalk-options-column { 17 | display: inline-block; 18 | padding-right: 12px; 19 | vertical-align: top; 20 | } 21 | 22 | @media only screen and (max-width:480px) { 23 | .crosstalk-input-checkboxgroup .crosstalk-options-group .crosstalk-options-column { 24 | display: block; 25 | padding-right: inherit; 26 | } 27 | } 28 | -------------------------------------------------------------------------------- /docs/articles/diabetes_files/crosstalk-1.1.1/css/crosstalk.css: -------------------------------------------------------------------------------- 1 | /* Adjust margins outwards, so column contents line up with the edges of the 2 | parent of container-fluid. 
*/ 3 | .container-fluid.crosstalk-bscols { 4 | margin-left: -30px; 5 | margin-right: -30px; 6 | white-space: normal; 7 | } 8 | 9 | /* But don't adjust the margins outwards if we're directly under the body, 10 | i.e. we were the top-level of something at the console. */ 11 | body > .container-fluid.crosstalk-bscols { 12 | margin-left: auto; 13 | margin-right: auto; 14 | } 15 | 16 | .crosstalk-input-checkboxgroup .crosstalk-options-group .crosstalk-options-column { 17 | display: inline-block; 18 | padding-right: 12px; 19 | vertical-align: top; 20 | } 21 | 22 | @media only screen and (max-width:480px) { 23 | .crosstalk-input-checkboxgroup .crosstalk-options-group .crosstalk-options-column { 24 | display: block; 25 | padding-right: inherit; 26 | } 27 | } 28 | -------------------------------------------------------------------------------- /docs/articles/diabetes_files/datatables-css-0.0.0/datatables-crosstalk.css: -------------------------------------------------------------------------------- 1 | .dt-crosstalk-fade { 2 | opacity: 0.2; 3 | } 4 | 5 | html body div.DTS div.dataTables_scrollBody { 6 | background: none; 7 | } 8 | 9 | 10 | /* 11 | Fix https://github.com/rstudio/DT/issues/563 12 | If the `table.display` is set to "block" (e.g., pkgdown), the browser will display 13 | datatable objects strangely. The search panel and the page buttons will still be 14 | in full-width but the table body will be "compact" and shorter. 15 | In theory, having these attributes will affect `dom="t"` 16 | users with `display: block`. But in reality, there should be none. 17 | We may remove the lines below in the future if upstream agrees to include this. 
18 | See https://github.com/DataTables/DataTablesSrc/issues/160 19 | */ 20 | 21 | table.dataTable { 22 | display: table; 23 | } 24 | -------------------------------------------------------------------------------- /docs/articles/diabetes_files/dt-core-1.10.20/css/jquery.dataTables.extra.css: -------------------------------------------------------------------------------- 1 | /* Selected rows/cells */ 2 | table.dataTable tr.selected td, table.dataTable td.selected { 3 | background-color: #b0bed9 !important; 4 | } 5 | /* In case of scrollX/Y or FixedHeader */ 6 | .dataTables_scrollBody .dataTables_sizing { 7 | visibility: hidden; 8 | } 9 | 10 | /* The datatables' theme CSS file doesn't define 11 | the color but with white background. It leads to an issue that 12 | when the HTML's body color is set to 'white', the user can't 13 | see the text since the background is white. One case happens in the 14 | RStudio's IDE when inline viewing the DT table inside an Rmd file, 15 | if the IDE theme is set to "Cobalt". 16 | 17 | See https://github.com/rstudio/DT/issues/447 for more info 18 | 19 | This fixes should have little side-effects because all the other elements 20 | of the default theme use the #333 font color. 21 | 22 | TODO: The upstream may use relative colors for both the table background 23 | and the color. It means the table can display well without this patch 24 | then. At that time, we need to remove the below CSS attributes. 25 | */ 26 | div.datatables { 27 | color: #333; 28 | } 29 | -------------------------------------------------------------------------------- /docs/articles/diabetes_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 
3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/diabetes_files/jquery-1.12.4/LICENSE.txt: -------------------------------------------------------------------------------- 1 | Copyright 2005, 2014 jQuery Foundation and other contributors, 2 | https://jquery.org/ 3 | 4 | This software consists of voluntary contributions made by many 5 | individuals. For exact contribution history, see the revision history 6 | available at https://github.com/jquery/jquery 7 | 8 | The following license applies to all parts of this software except as 9 | documented below: 10 | 11 | ==== 12 | 13 | Permission is hereby granted, free of charge, to any person obtaining 14 | a copy of this software and associated documentation files (the 15 | "Software"), to deal in the Software without restriction, including 16 | without limitation the rights to use, copy, modify, merge, publish, 17 | distribute, sublicense, and/or sell copies of the Software, and to 18 | permit persons to whom the Software is furnished to do so, subject to 19 | the following conditions: 20 | 21 | The above copyright notice and this permission notice shall be 22 | included in all copies or substantial portions of the Software. 23 | 24 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 | NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 28 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 29 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 30 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 31 | 32 | ==== 33 | 34 | All files located in the node_modules and external directories are 35 | externally maintained libraries used by this software which have their 36 | own licenses; we recommend you read them, as their terms may differ from 37 | the terms above. 38 | -------------------------------------------------------------------------------- /docs/articles/images/artafboTF.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/artafboTF.png -------------------------------------------------------------------------------- /docs/articles/images/figure4a.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/figure4a.png -------------------------------------------------------------------------------- /docs/articles/images/figure4b.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/figure4b.png -------------------------------------------------------------------------------- /docs/articles/images/figure4c.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/figure4c.png -------------------------------------------------------------------------------- /docs/articles/images/figure5a.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/figure5a.png -------------------------------------------------------------------------------- /docs/articles/images/figure5b.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/figure5b.png -------------------------------------------------------------------------------- /docs/articles/images/figure5c.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/figure5c.png -------------------------------------------------------------------------------- /docs/articles/images/figure5d.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/figure5d.png -------------------------------------------------------------------------------- /docs/articles/images/nature.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/nature.png -------------------------------------------------------------------------------- /docs/articles/images/oncology.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/oncology.png -------------------------------------------------------------------------------- /docs/articles/images/plotvis.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/plotvis.png -------------------------------------------------------------------------------- /docs/articles/images/thyroid.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/articles/images/thyroid.png -------------------------------------------------------------------------------- /docs/articles/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Articles • mplot 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 57 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 |
All vignettes

Simplified adaptive fence
Artificial example
mplot philosophy
Birth weight example
Diabetes example
Interactive graphics
Model stability plots
mplot contributors
mplot in publications
Timing considerations
Variable inclusion plots

Site built with pkgdown 1.6.1.9001.
228 | 229 | 230 | 231 | 232 | 233 | 234 | 235 | 236 | 237 | 238 | -------------------------------------------------------------------------------- /docs/articles/interactive_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/msp_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/people_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 
3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/publications_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/timing_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 
3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/vip_files/header-attrs-2.9/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.css: -------------------------------------------------------------------------------- 1 | /*! 
2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | 6 | /* modified from https://github.com/twbs/bootstrap/blob/94b4076dd2efba9af71f0b18d4ee4b163aa9e0dd/docs/assets/css/src/docs.css#L548-L601 */ 7 | 8 | /* All levels of nav */ 9 | nav[data-toggle='toc'] .nav > li > a { 10 | display: block; 11 | padding: 4px 20px; 12 | font-size: 13px; 13 | font-weight: 500; 14 | color: #767676; 15 | } 16 | nav[data-toggle='toc'] .nav > li > a:hover, 17 | nav[data-toggle='toc'] .nav > li > a:focus { 18 | padding-left: 19px; 19 | color: #563d7c; 20 | text-decoration: none; 21 | background-color: transparent; 22 | border-left: 1px solid #563d7c; 23 | } 24 | nav[data-toggle='toc'] .nav > .active > a, 25 | nav[data-toggle='toc'] .nav > .active:hover > a, 26 | nav[data-toggle='toc'] .nav > .active:focus > a { 27 | padding-left: 18px; 28 | font-weight: bold; 29 | color: #563d7c; 30 | background-color: transparent; 31 | border-left: 2px solid #563d7c; 32 | } 33 | 34 | /* Nav: second level (shown on .active) */ 35 | nav[data-toggle='toc'] .nav .nav { 36 | display: none; /* Hide by default, but at >768px, show it */ 37 | padding-bottom: 10px; 38 | } 39 | nav[data-toggle='toc'] .nav .nav > li > a { 40 | padding-top: 1px; 41 | padding-bottom: 1px; 42 | padding-left: 30px; 43 | font-size: 12px; 44 | font-weight: normal; 45 | } 46 | nav[data-toggle='toc'] .nav .nav > li > a:hover, 47 | nav[data-toggle='toc'] .nav .nav > li > a:focus { 48 | padding-left: 29px; 49 | } 50 | nav[data-toggle='toc'] .nav .nav > .active > a, 51 | nav[data-toggle='toc'] .nav .nav > .active:hover > a, 52 | nav[data-toggle='toc'] .nav .nav > .active:focus > a { 53 | padding-left: 28px; 54 | font-weight: 500; 55 | } 56 | 57 | /* from https://github.com/twbs/bootstrap/blob/e38f066d8c203c3e032da0ff23cd2d6098ee2dd6/docs/assets/css/src/docs.css#L631-L634 */ 58 | 
nav[data-toggle='toc'] .nav > .active > ul { 59 | display: block; 60 | } 61 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.js: -------------------------------------------------------------------------------- 1 | /*! 2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | (function() { 6 | 'use strict'; 7 | 8 | window.Toc = { 9 | helpers: { 10 | // return all matching elements in the set, or their descendants 11 | findOrFilter: function($el, selector) { 12 | // http://danielnouri.org/notes/2011/03/14/a-jquery-find-that-also-finds-the-root-element/ 13 | // http://stackoverflow.com/a/12731439/358804 14 | var $descendants = $el.find(selector); 15 | return $el.filter(selector).add($descendants).filter(':not([data-toc-skip])'); 16 | }, 17 | 18 | generateUniqueIdBase: function(el) { 19 | var text = $(el).text(); 20 | var anchor = text.trim().toLowerCase().replace(/[^A-Za-z0-9]+/g, '-'); 21 | return anchor || el.tagName.toLowerCase(); 22 | }, 23 | 24 | generateUniqueId: function(el) { 25 | var anchorBase = this.generateUniqueIdBase(el); 26 | for (var i = 0; ; i++) { 27 | var anchor = anchorBase; 28 | if (i > 0) { 29 | // add suffix 30 | anchor += '-' + i; 31 | } 32 | // check if ID already exists 33 | if (!document.getElementById(anchor)) { 34 | return anchor; 35 | } 36 | } 37 | }, 38 | 39 | generateAnchor: function(el) { 40 | if (el.id) { 41 | return el.id; 42 | } else { 43 | var anchor = this.generateUniqueId(el); 44 | el.id = anchor; 45 | return anchor; 46 | } 47 | }, 48 | 49 | createNavList: function() { 50 | return $('<ul class="nav"></ul>'); 51 | }, 52 | 53 | createChildNavList: function($parent) { 54 | var $childList = this.createNavList(); 55 | $parent.append($childList); 56 | return $childList; 57 | }, 58 | 59 | generateNavEl: function(anchor, text) { 60 | var $a = $('<a></a>'); 61 | $a.attr('href', '#' + anchor); 62 | $a.text(text); 63 | var $li = $('<li></li>'); 64 | $li.append($a); 65 | return $li; 66 | }, 67 | 68 | generateNavItem: function(headingEl) { 69 | var anchor = this.generateAnchor(headingEl); 70 | var $heading = $(headingEl); 71 | var text = $heading.data('toc-text') || $heading.text(); 72 | return this.generateNavEl(anchor, text); 73 | }, 74 | 75 | // Find the first heading level (`<h1>`, then `<h2>`, etc.) that has more than one element. Defaults to 1 (for `<h1>`). 76 | getTopLevel: function($scope) { 77 | for (var i = 1; i <= 6; i++) { 78 | var $headings = this.findOrFilter($scope, 'h' + i); 79 | if ($headings.length > 1) { 80 | return i; 81 | } 82 | } 83 | 84 | return 1; 85 | }, 86 | 87 | // returns the elements for the top level, and the next below it 88 | getHeadings: function($scope, topLevel) { 89 | var topSelector = 'h' + topLevel; 90 | 91 | var secondaryLevel = topLevel + 1; 92 | var secondarySelector = 'h' + secondaryLevel; 93 | 94 | return this.findOrFilter($scope, topSelector + ',' + secondarySelector); 95 | }, 96 | 97 | getNavLevel: function(el) { 98 | return parseInt(el.tagName.charAt(1), 10); 99 | }, 100 | 101 | populateNav: function($topContext, topLevel, $headings) { 102 | var $context = $topContext; 103 | var $prevNav; 104 | 105 | var helpers = this; 106 | $headings.each(function(i, el) { 107 | var $newNav = helpers.generateNavItem(el); 108 | var navLevel = helpers.getNavLevel(el); 109 | 110 | // determine the proper $context 111 | if (navLevel === topLevel) { 112 | // use top level 113 | $context = $topContext; 114 | } else if ($prevNav && $context === $topContext) { 115 | // create a new level of the tree and switch to it 116 | $context = helpers.createChildNavList($prevNav); 117 | } // else use the current $context 118 | 119 | $context.append($newNav); 120 | 121 | $prevNav = $newNav; 122 | }); 123 | }, 124 | 125 | parseOps: function(arg) { 126 | var opts; 127 | if (arg.jquery) { 128 | opts = { 129 | $nav: arg 130 | }; 131 | } else { 132 | opts = arg; 133 | } 134 | opts.$scope = opts.$scope || $(document.body); 135 | return opts; 136 | } 137 | }, 138 | 139 | // accepts a jQuery object, or an options object 140 | init: function(opts) { 141 | opts = this.helpers.parseOps(opts); 142 | 143 | // ensure that the data attribute is in place for styling 144 | opts.$nav.attr('data-toggle', 'toc'); 145 | 146 | var $topContext = this.helpers.createChildNavList(opts.$nav); 147 | var topLevel = 
this.helpers.getTopLevel(opts.$scope); 148 | var $headings = this.helpers.getHeadings(opts.$scope, topLevel); 149 | this.helpers.populateNav($topContext, topLevel, $headings); 150 | } 151 | }; 152 | 153 | $(function() { 154 | $('nav[data-toggle="toc"]').each(function(i, el) { 155 | var $nav = $(el); 156 | Toc.init($nav); 157 | }); 158 | }); 159 | })(); 160 | -------------------------------------------------------------------------------- /docs/docsearch.js: -------------------------------------------------------------------------------- 1 | $(function() { 2 | 3 | // register a handler to move the focus to the search bar 4 | // upon pressing shift + "/" (i.e. "?") 5 | $(document).on('keydown', function(e) { 6 | if (e.shiftKey && e.keyCode == 191) { 7 | e.preventDefault(); 8 | $("#search-input").focus(); 9 | } 10 | }); 11 | 12 | $(document).ready(function() { 13 | // do keyword highlighting 14 | /* modified from https://jsfiddle.net/julmot/bL6bb5oo/ */ 15 | var mark = function() { 16 | 17 | var referrer = document.URL ; 18 | var paramKey = "q" ; 19 | 20 | if (referrer.indexOf("?") !== -1) { 21 | var qs = referrer.substr(referrer.indexOf('?') + 1); 22 | var qs_noanchor = qs.split('#')[0]; 23 | var qsa = qs_noanchor.split('&'); 24 | var keyword = ""; 25 | 26 | for (var i = 0; i < qsa.length; i++) { 27 | var currentParam = qsa[i].split('='); 28 | 29 | if (currentParam.length !== 2) { 30 | continue; 31 | } 32 | 33 | if (currentParam[0] == paramKey) { 34 | keyword = decodeURIComponent(currentParam[1].replace(/\+/g, "%20")); 35 | } 36 | } 37 | 38 | if (keyword !== "") { 39 | $(".contents").unmark({ 40 | done: function() { 41 | $(".contents").mark(keyword); 42 | } 43 | }); 44 | } 45 | } 46 | }; 47 | 48 | mark(); 49 | }); 50 | }); 51 | 52 | /* Search term highlighting ------------------------------*/ 53 | 54 | function matchedWords(hit) { 55 | var words = []; 56 | 57 | var hierarchy = hit._highlightResult.hierarchy; 58 | // loop to fetch from lvl0, lvl1, etc. 
59 | for (var idx in hierarchy) { 60 | words = words.concat(hierarchy[idx].matchedWords); 61 | } 62 | 63 | var content = hit._highlightResult.content; 64 | if (content) { 65 | words = words.concat(content.matchedWords); 66 | } 67 | 68 | // return unique words 69 | var words_uniq = [...new Set(words)]; 70 | return words_uniq; 71 | } 72 | 73 | function updateHitURL(hit) { 74 | 75 | var words = matchedWords(hit); 76 | var url = ""; 77 | 78 | if (hit.anchor) { 79 | url = hit.url_without_anchor + '?q=' + escape(words.join(" ")) + '#' + hit.anchor; 80 | } else { 81 | url = hit.url + '?q=' + escape(words.join(" ")); 82 | } 83 | 84 | return url; 85 | } 86 | -------------------------------------------------------------------------------- /docs/link.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 8 | 12 | 13 | -------------------------------------------------------------------------------- /docs/pkgdown.css: -------------------------------------------------------------------------------- 1 | /* Sticky footer */ 2 | 3 | /** 4 | * Basic idea: https://philipwalton.github.io/solved-by-flexbox/demos/sticky-footer/ 5 | * Details: https://github.com/philipwalton/solved-by-flexbox/blob/master/assets/css/components/site.css 6 | * 7 | * .Site -> body > .container 8 | * .Site-content -> body > .container .row 9 | * .footer -> footer 10 | * 11 | * Key idea seems to be to ensure that .container and __all its parents__ 12 | * have height set to 100% 13 | * 14 | */ 15 | 16 | html, body { 17 | height: 100%; 18 | } 19 | 20 | body { 21 | position: relative; 22 | } 23 | 24 | body > .container { 25 | display: flex; 26 | height: 100%; 27 | flex-direction: column; 28 | } 29 | 30 | body > .container .row { 31 | flex: 1 0 auto; 32 | } 33 | 34 | footer { 35 | margin-top: 45px; 36 | padding: 35px 0 36px; 37 | border-top: 1px solid #e5e5e5; 38 | color: #666; 39 | display: flex; 40 | flex-shrink: 0; 41 | } 42 | footer p { 43 | margin-bottom: 0; 44 | 
} 45 | footer div { 46 | flex: 1; 47 | } 48 | footer .pkgdown { 49 | text-align: right; 50 | } 51 | footer p { 52 | margin-bottom: 0; 53 | } 54 | 55 | img.icon { 56 | float: right; 57 | } 58 | 59 | img { 60 | max-width: 100%; 61 | } 62 | 63 | /* Fix bug in bootstrap (only seen in firefox) */ 64 | summary { 65 | display: list-item; 66 | } 67 | 68 | /* Typographic tweaking ---------------------------------*/ 69 | 70 | .contents .page-header { 71 | margin-top: calc(-60px + 1em); 72 | } 73 | 74 | dd { 75 | margin-left: 3em; 76 | } 77 | 78 | /* Section anchors ---------------------------------*/ 79 | 80 | a.anchor { 81 | margin-left: -30px; 82 | display:inline-block; 83 | width: 30px; 84 | height: 30px; 85 | visibility: hidden; 86 | 87 | background-image: url(./link.svg); 88 | background-repeat: no-repeat; 89 | background-size: 20px 20px; 90 | background-position: center center; 91 | } 92 | 93 | .hasAnchor:hover a.anchor { 94 | visibility: visible; 95 | } 96 | 97 | @media (max-width: 767px) { 98 | .hasAnchor:hover a.anchor { 99 | visibility: hidden; 100 | } 101 | } 102 | 103 | 104 | /* Fixes for fixed navbar --------------------------*/ 105 | 106 | .contents h1, .contents h2, .contents h3, .contents h4 { 107 | padding-top: 60px; 108 | margin-top: -40px; 109 | } 110 | 111 | /* Navbar submenu --------------------------*/ 112 | 113 | .dropdown-submenu { 114 | position: relative; 115 | } 116 | 117 | .dropdown-submenu>.dropdown-menu { 118 | top: 0; 119 | left: 100%; 120 | margin-top: -6px; 121 | margin-left: -1px; 122 | border-radius: 0 6px 6px 6px; 123 | } 124 | 125 | .dropdown-submenu:hover>.dropdown-menu { 126 | display: block; 127 | } 128 | 129 | .dropdown-submenu>a:after { 130 | display: block; 131 | content: " "; 132 | float: right; 133 | width: 0; 134 | height: 0; 135 | border-color: transparent; 136 | border-style: solid; 137 | border-width: 5px 0 5px 5px; 138 | border-left-color: #cccccc; 139 | margin-top: 5px; 140 | margin-right: -10px; 141 | } 142 | 143 | 
.dropdown-submenu:hover>a:after { 144 | border-left-color: #ffffff; 145 | } 146 | 147 | .dropdown-submenu.pull-left { 148 | float: none; 149 | } 150 | 151 | .dropdown-submenu.pull-left>.dropdown-menu { 152 | left: -100%; 153 | margin-left: 10px; 154 | border-radius: 6px 0 6px 6px; 155 | } 156 | 157 | /* Sidebar --------------------------*/ 158 | 159 | #pkgdown-sidebar { 160 | margin-top: 30px; 161 | position: -webkit-sticky; 162 | position: sticky; 163 | top: 70px; 164 | } 165 | 166 | #pkgdown-sidebar h2 { 167 | font-size: 1.5em; 168 | margin-top: 1em; 169 | } 170 | 171 | #pkgdown-sidebar h2:first-child { 172 | margin-top: 0; 173 | } 174 | 175 | #pkgdown-sidebar .list-unstyled li { 176 | margin-bottom: 0.5em; 177 | } 178 | 179 | /* bootstrap-toc tweaks ------------------------------------------------------*/ 180 | 181 | /* All levels of nav */ 182 | 183 | nav[data-toggle='toc'] .nav > li > a { 184 | padding: 4px 20px 4px 6px; 185 | font-size: 1.5rem; 186 | font-weight: 400; 187 | color: inherit; 188 | } 189 | 190 | nav[data-toggle='toc'] .nav > li > a:hover, 191 | nav[data-toggle='toc'] .nav > li > a:focus { 192 | padding-left: 5px; 193 | color: inherit; 194 | border-left: 1px solid #878787; 195 | } 196 | 197 | nav[data-toggle='toc'] .nav > .active > a, 198 | nav[data-toggle='toc'] .nav > .active:hover > a, 199 | nav[data-toggle='toc'] .nav > .active:focus > a { 200 | padding-left: 5px; 201 | font-size: 1.5rem; 202 | font-weight: 400; 203 | color: inherit; 204 | border-left: 2px solid #878787; 205 | } 206 | 207 | /* Nav: second level (shown on .active) */ 208 | 209 | nav[data-toggle='toc'] .nav .nav { 210 | display: none; /* Hide by default, but at >768px, show it */ 211 | padding-bottom: 10px; 212 | } 213 | 214 | nav[data-toggle='toc'] .nav .nav > li > a { 215 | padding-left: 16px; 216 | font-size: 1.35rem; 217 | } 218 | 219 | nav[data-toggle='toc'] .nav .nav > li > a:hover, 220 | nav[data-toggle='toc'] .nav .nav > li > a:focus { 221 | padding-left: 15px; 222 | } 
223 | 224 | nav[data-toggle='toc'] .nav .nav > .active > a, 225 | nav[data-toggle='toc'] .nav .nav > .active:hover > a, 226 | nav[data-toggle='toc'] .nav .nav > .active:focus > a { 227 | padding-left: 15px; 228 | font-weight: 500; 229 | font-size: 1.35rem; 230 | } 231 | 232 | /* orcid ------------------------------------------------------------------- */ 233 | 234 | .orcid { 235 | font-size: 16px; 236 | color: #A6CE39; 237 | /* margins are required by official ORCID trademark and display guidelines */ 238 | margin-left:4px; 239 | margin-right:4px; 240 | vertical-align: middle; 241 | } 242 | 243 | /* Reference index & topics ----------------------------------------------- */ 244 | 245 | .ref-index th {font-weight: normal;} 246 | 247 | .ref-index td {vertical-align: top; min-width: 100px} 248 | .ref-index .icon {width: 40px;} 249 | .ref-index .alias {width: 40%;} 250 | .ref-index-icons .alias {width: calc(40% - 40px);} 251 | .ref-index .title {width: 60%;} 252 | 253 | .ref-arguments th {text-align: right; padding-right: 10px;} 254 | .ref-arguments th, .ref-arguments td {vertical-align: top; min-width: 100px} 255 | .ref-arguments .name {width: 20%;} 256 | .ref-arguments .desc {width: 80%;} 257 | 258 | /* Nice scrolling for wide elements --------------------------------------- */ 259 | 260 | table { 261 | display: block; 262 | overflow: auto; 263 | } 264 | 265 | /* Syntax highlighting ---------------------------------------------------- */ 266 | 267 | pre, pre code { 268 | background-color: #f8f8f8; 269 | color: #333; 270 | white-space: pre-wrap; 271 | word-break: break-all; 272 | overflow-wrap: break-word; 273 | } 274 | 275 | pre { 276 | border: 1px solid #eee; 277 | } 278 | 279 | pre .img { 280 | margin: 5px 0; 281 | } 282 | 283 | pre .img img { 284 | background-color: #fff; 285 | display: block; 286 | height: auto; 287 | } 288 | 289 | code a, pre a { 290 | color: #375f84; 291 | } 292 | 293 | a.sourceLine:hover { 294 | text-decoration: none; 295 | } 296 | 297 | .fl 
{color: #1514b5;} 298 | .fu {color: #000000;} /* function */ 299 | .ch,.st {color: #036a07;} /* string */ 300 | .kw {color: #264D66;} /* keyword */ 301 | .co {color: #888888;} /* comment */ 302 | 303 | .error {font-weight: bolder;} 304 | .warning {font-weight: bolder;} 305 | 306 | /* Clipboard --------------------------*/ 307 | 308 | .hasCopyButton { 309 | position: relative; 310 | } 311 | 312 | .btn-copy-ex { 313 | position: absolute; 314 | right: 0; 315 | top: 0; 316 | visibility: hidden; 317 | } 318 | 319 | .hasCopyButton:hover button.btn-copy-ex { 320 | visibility: visible; 321 | } 322 | 323 | /* headroom.js ------------------------ */ 324 | 325 | .headroom { 326 | will-change: transform; 327 | transition: transform 200ms linear; 328 | } 329 | .headroom--pinned { 330 | transform: translateY(0%); 331 | } 332 | .headroom--unpinned { 333 | transform: translateY(-100%); 334 | } 335 | 336 | /* mark.js ----------------------------*/ 337 | 338 | mark { 339 | background-color: rgba(255, 255, 51, 0.5); 340 | border-bottom: 2px solid rgba(255, 153, 51, 0.3); 341 | padding: 1px; 342 | } 343 | 344 | /* vertical spacing after htmlwidgets */ 345 | .html-widget { 346 | margin-bottom: 10px; 347 | } 348 | 349 | /* fontawesome ------------------------ */ 350 | 351 | .fab { 352 | font-family: "Font Awesome 5 Brands" !important; 353 | } 354 | 355 | /* don't display links in code chunks when printing */ 356 | /* source: https://stackoverflow.com/a/10781533 */ 357 | @media print { 358 | code a:link:after, code a:visited:after { 359 | content: ""; 360 | } 361 | } 362 | -------------------------------------------------------------------------------- /docs/pkgdown.js: -------------------------------------------------------------------------------- 1 | /* http://gregfranko.com/blog/jquery-best-practices/ */ 2 | (function($) { 3 | $(function() { 4 | 5 | $('.navbar-fixed-top').headroom(); 6 | 7 | $('body').css('padding-top', $('.navbar').height() + 10); 8 | $(window).resize(function(){ 9 
| $('body').css('padding-top', $('.navbar').height() + 10); 10 | }); 11 | 12 | $('[data-toggle="tooltip"]').tooltip(); 13 | 14 | var cur_path = paths(location.pathname); 15 | var links = $("#navbar ul li a"); 16 | var max_length = -1; 17 | var pos = -1; 18 | for (var i = 0; i < links.length; i++) { 19 | if (links[i].getAttribute("href") === "#") 20 | continue; 21 | // Ignore external links 22 | if (links[i].host !== location.host) 23 | continue; 24 | 25 | var nav_path = paths(links[i].pathname); 26 | 27 | var length = prefix_length(nav_path, cur_path); 28 | if (length > max_length) { 29 | max_length = length; 30 | pos = i; 31 | } 32 | } 33 | 34 | // Add class to parent
<li>, and enclosing <li> if in dropdown 35 | if (pos >= 0) { 36 | var menu_anchor = $(links[pos]); 37 | menu_anchor.parent().addClass("active"); 38 | menu_anchor.closest("li.dropdown").addClass("active"); 39 | } 40 | }); 41 | 42 | function paths(pathname) { 43 | var pieces = pathname.split("/"); 44 | pieces.shift(); // always starts with / 45 | 46 | var end = pieces[pieces.length - 1]; 47 | if (end === "index.html" || end === "") 48 | pieces.pop(); 49 | return(pieces); 50 | } 51 | 52 | // Returns -1 if not found 53 | function prefix_length(needle, haystack) { 54 | if (needle.length > haystack.length) 55 | return(-1); 56 | 57 | // Special case for length-0 haystack, since for loop won't run 58 | if (haystack.length === 0) { 59 | return(needle.length === 0 ? 0 : -1); 60 | } 61 | 62 | for (var i = 0; i < haystack.length; i++) { 63 | if (needle[i] != haystack[i]) 64 | return(i); 65 | } 66 | 67 | return(haystack.length); 68 | } 69 | 70 | /* Clipboard --------------------------*/ 71 | 72 | function changeTooltipMessage(element, msg) { 73 | var tooltipOriginalTitle=element.getAttribute('data-original-title'); 74 | element.setAttribute('data-original-title', msg); 75 | $(element).tooltip('show'); 76 | element.setAttribute('data-original-title', tooltipOriginalTitle); 77 | } 78 | 79 | if(ClipboardJS.isSupported()) { 80 | $(document).ready(function() { 81 | var copyButton = ""; 82 | 83 | $("div.sourceCode").addClass("hasCopyButton"); 84 | 85 | // Insert copy buttons: 86 | $(copyButton).prependTo(".hasCopyButton"); 87 | 88 | // Initialize tooltips: 89 | $('.btn-copy-ex').tooltip({container: 'body'}); 90 | 91 | // Initialize clipboard: 92 | var clipboardBtnCopies = new ClipboardJS('[data-clipboard-copy]', { 93 | text: function(trigger) { 94 | return trigger.parentNode.textContent; 95 | } 96 | }); 97 | 98 | clipboardBtnCopies.on('success', function(e) { 99 | changeTooltipMessage(e.trigger, 'Copied!'); 100 | e.clearSelection(); 101 | }); 102 | 103 | // the handler must accept the event object, since its body reads e.trigger 103 | clipboardBtnCopies.on('error', function(e) { 104 |
changeTooltipMessage(e.trigger,'Press Ctrl+C or Command+C to copy'); 105 | }); 106 | }); 107 | } 108 | })(window.jQuery || window.$) 109 | -------------------------------------------------------------------------------- /docs/pkgdown.yml: -------------------------------------------------------------------------------- 1 | pandoc: 2.11.4 2 | pkgdown: 1.6.1.9001 3 | pkgdown_sha: ce9781a15c7ea07df9fb17a11295ba4abec0b54b 4 | articles: 5 | af: af.html 6 | artificial: artificial.html 7 | background: background.html 8 | birthweight: birthweight.html 9 | diabetes: diabetes.html 10 | interactive: interactive.html 11 | msp: msp.html 12 | people: people.html 13 | publications: publications.html 14 | timing: timing.html 15 | vip: vip.html 16 | last_built: 2021-07-10T10:35Z 17 | 18 | -------------------------------------------------------------------------------- /docs/reference/Rplot001.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/Rplot001.png -------------------------------------------------------------------------------- /docs/reference/Rplot002.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/Rplot002.png -------------------------------------------------------------------------------- /docs/reference/Rplot003.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/Rplot003.png -------------------------------------------------------------------------------- /docs/reference/af-1.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/af-1.png -------------------------------------------------------------------------------- /docs/reference/bglmnet-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/bglmnet-1.png -------------------------------------------------------------------------------- /docs/reference/mplot-package.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Graphical model stability and model selection procedures — mplot-package • mplot 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 |
    Graphical model stability and model selection procedures

    References

    Tarr G, Mueller S and Welsh AH (2018). mplot: An R Package for Graphical Model Stability and Variable Selection Procedures. Journal of Statistical Software, 83(9), pp. 1-28. doi: 10.18637/jss.v083.i09

    Site built with pkgdown 1.6.1.9001.
    219 | 220 | 221 | 222 | 223 | 224 | 225 | 226 | 227 | 228 | 229 | -------------------------------------------------------------------------------- /docs/reference/pipe.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Pipe operator — %>% • mplot 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 58 | 59 | 60 | 61 | 62 | 63 | 64 | 65 | 66 | 67 | 68 | 69 | 70 |
    See magrittr::%>% for details.

    lhs %>% rhs

    Arguments

    lhs
    A value or the magrittr placeholder.

    rhs
    A function call using the magrittr semantics.

    Value

    The result of calling rhs(lhs).

    Site built with pkgdown 1.6.1.9001.
    230 | 231 | 232 | 233 | 234 | 235 | 236 | 237 | 238 | 239 | 240 | -------------------------------------------------------------------------------- /docs/reference/plot.vis-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/plot.vis-1.png -------------------------------------------------------------------------------- /docs/reference/plot.vis-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/plot.vis-2.png -------------------------------------------------------------------------------- /docs/reference/plot.vis-3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/plot.vis-3.png -------------------------------------------------------------------------------- /docs/reference/vis-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/vis-1.png -------------------------------------------------------------------------------- /docs/reference/vis-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/vis-2.png -------------------------------------------------------------------------------- /docs/reference/vis-3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/docs/reference/vis-3.png 
-------------------------------------------------------------------------------- /docs/sitemap.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | /404.html 5 | 6 | 7 | /articles/af.html 8 | 9 | 10 | /articles/artificial.html 11 | 12 | 13 | /articles/background.html 14 | 15 | 16 | /articles/birthweight.html 17 | 18 | 19 | /articles/diabetes.html 20 | 21 | 22 | /articles/index.html 23 | 24 | 25 | /articles/interactive.html 26 | 27 | 28 | /articles/msp.html 29 | 30 | 31 | /articles/people.html 32 | 33 | 34 | /articles/publications.html 35 | 36 | 37 | /articles/timing.html 38 | 39 | 40 | /articles/vip.html 41 | 42 | 43 | /authors.html 44 | 45 | 46 | /index.html 47 | 48 | 49 | /reference/af.html 50 | 51 | 52 | /reference/artificialeg.html 53 | 54 | 55 | /reference/bglmnet.html 56 | 57 | 58 | /reference/bodyfat.html 59 | 60 | 61 | /reference/diabetes.html 62 | 63 | 64 | /reference/fev.html 65 | 66 | 67 | /reference/glmfence.html 68 | 69 | 70 | /reference/index.html 71 | 72 | 73 | /reference/lmfence.html 74 | 75 | 76 | /reference/mplot-package.html 77 | 78 | 79 | /reference/mplot.html 80 | 81 | 82 | /reference/pipe.html 83 | 84 | 85 | /reference/plot.af.html 86 | 87 | 88 | /reference/plot.bglmnet.html 89 | 90 | 91 | /reference/plot.vis.html 92 | 93 | 94 | /reference/print.af.html 95 | 96 | 97 | /reference/print.vis.html 98 | 99 | 100 | /reference/process.fn.html 101 | 102 | 103 | /reference/summary.af.html 104 | 105 | 106 | /reference/txt.fn.html 107 | 108 | 109 | /reference/vis.html 110 | 111 | 112 | /reference/wallabies.html 113 | 114 | 115 | -------------------------------------------------------------------------------- /inst/CITATION: -------------------------------------------------------------------------------- 1 | bibentry(bibtype = "Article", 2 | title = "{mplot}: An {R} Package for Graphical Model Stability and Variable Selection Procedures", 3 | author = c(person(given = "Garth", 4 | family = "Tarr", 5 | email 
= "garth.tarr@sydney.edu.au"), 6 | person(given = "Samuel", 7 | family = "M{\\\"u}ller", 8 | email = "samuel.mueller@sydney.edu.au"), 9 | person(given = c("Alan", "H."), 10 | family = "Welsh", 11 | email = "alan.welsh@anu.edu.au")), 12 | journal = "Journal of Statistical Software", 13 | year = "2018", 14 | volume = "83", 15 | number = "9", 16 | pages = "1--28", 17 | doi = "10.18637/jss.v083.i09", 18 | 19 | header = "To cite mplot in publications use:" 20 | ) 21 | -------------------------------------------------------------------------------- /man/.Rapp.history: -------------------------------------------------------------------------------- 1 | require(devtools) 2 | install_github("garthtarr/mplot",quick=TRUE) 3 | mplot 4 | require(mplot) 5 | ?diabetes 6 | install_github("garthtarr/mplot") 7 | require(mplot) 8 | ?diabetes 9 | -------------------------------------------------------------------------------- /man/af.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/af.R 3 | \name{af} 4 | \alias{af} 5 | \title{The adaptive fence procedure} 6 | \usage{ 7 | af( 8 | mf, 9 | B = 60, 10 | n.c = 20, 11 | initial.stepwise = FALSE, 12 | force.in = NULL, 13 | cores, 14 | nvmax, 15 | c.max, 16 | screen = FALSE, 17 | seed = NULL, 18 | ... 19 | ) 20 | } 21 | \arguments{ 22 | \item{mf}{a fitted 'full' model, the result of a call 23 | to lm or glm (and in the future lme or lmer).} 24 | 25 | \item{B}{number of bootstrap replications at each fence 26 | boundary value} 27 | 28 | \item{n.c}{number of boundary values to be considered} 29 | 30 | \item{initial.stepwise}{logical. Performs an initial stepwise 31 | procedure to look for the range of model sizes where attention 32 | should be focussed. 
See details for implementation.} 33 | 34 | \item{force.in}{the names of variables that should be forced 35 | into all estimated models} 36 | 37 | \item{cores}{number of cores to be used when parallel 38 | processing the bootstrap} 39 | 40 | \item{nvmax}{size of the largest model that can still be 41 | considered as a viable candidate. Included for performance 42 | reasons but if it is an active constraint it could lead to 43 | misleading results.} 44 | 45 | \item{c.max}{manually specify the upper boundary limit. 46 | Only applies when \code{initial.stepwise=FALSE}.} 47 | 48 | \item{screen}{logical, whether or not to perform an initial 49 | screen for outliers. Highly experimental, use at own risk. 50 | Default = FALSE.} 51 | 52 | \item{seed}{random seed for reproducible results} 53 | 54 | \item{...}{further arguments (currently unused)} 55 | } 56 | \description{ 57 | This function implements the adaptive fence procedure to 58 | first find the optimal cstar value and then find the 59 | corresponding best model as described in Jiang et al. 60 | (2009) with some practical modifications. 61 | } 62 | \details{ 63 | The initial stepwise procedure performs forward stepwise model 64 | selection using the AIC and backward stepwise model selection 65 | using BIC. In general the backward selection via the more 66 | conservative BIC will tend to select a smaller model than that 67 | of the forward selection AIC approach. The size of these two 68 | models is found, and we go two dimensions smaller and larger 69 | to estimate a sensible range of \code{c} values over which to 70 | perform a parametric bootstrap. 71 | 72 | This procedure can take some time. It is recommended that you start 73 | with a relatively small number of bootstrap samples (\code{B}) 74 | and grid of boundary values (\code{n.c}) and increase both as 75 | required. 
76 | 77 | If you use \code{initial.stepwise=TRUE} then in general you will 78 | need a smaller grid of boundary values than if you select 79 | \code{initial.stepwise=FALSE}. 80 | It can be useful to check \code{initial.stepwise=FALSE} with a 81 | small number of bootstrap replications over a sparse grid to ensure 82 | that the \code{initial.stepwise=TRUE} has landed you in a reasonable 83 | region. 84 | 85 | The \code{best.only=FALSE} option when plotting the results of the 86 | adaptive fence is a modification to the adaptive fence procedure 87 | which considers all models at a particular size that pass the fence 88 | hurdle when calculating the p* values. In particular, 89 | for each value of c and at each bootstrap replication, 90 | if a candidate model is found that passes the fence, then we look to see 91 | if there are any other models of the same size that also pass the fence. 92 | If no other models of the same size pass the fence, then that model is 93 | allocated a weight of 1. If there are two models that pass the fence, then 94 | the best model is allocated a weight of 1/2. If three models pass the fence, 95 | the best model gets a weight of 1/3, and so on. After \code{B} bootstrap 96 | replications, we aggregate the weights by summing over the various models. 97 | The p* value is the maximum aggregated weight divided by the number of bootstrap 98 | replications. 99 | This correction penalises the probability associated with the best model if 100 | there were other models of the same size that also passed the fence hurdle. 101 | The rationale being that if a model has no redundant variables 102 | then it will be the only model at that size that passes the fence over a 103 | range of values of c. 104 | The result is more pronounced peaks which can help to determine 105 | the location of the correct peak and identify the optimal c*. 106 | 107 | See \code{?plot.af} or \code{help("plot.af")} for details of the 108 | plot method associated with the result. 
109 | } 110 | \examples{ 111 | n = 100 112 | set.seed(11) 113 | e = rnorm(n) 114 | x1 = rnorm(n) 115 | x2 = rnorm(n) 116 | x3 = x1^2 117 | x4 = x2^2 118 | x5 = x1*x2 119 | y = 1 + x1 + x2 + e 120 | dat = data.frame(y,x1,x2,x3,x4,x5) 121 | lm1 = lm(y ~ ., data = dat) 122 | \dontshow{ 123 | af1 = af(lm1, cores = 1, B = 5, n.c = 5, seed = 1) 124 | summary(af1) 125 | plot(af1) 126 | } 127 | \dontrun{ 128 | af1 = af(lm1, initial.stepwise = TRUE, seed = 1) 129 | summary(af1) 130 | plot(af1) 131 | } 132 | } 133 | \references{ 134 | Jiang J., Nguyen T., Sunil Rao J. (2009), 135 | A simplified adaptive fence procedure, Statistics & 136 | Probability Letters, 79(5):625-629. doi: 10.1016/j.spl.2008.10.014 137 | 138 | Jiang J., Sunil Rao J., Gu Z., Nguyen T. (2008), 139 | Fence methods for mixed model selection, Annals of Statistics, 140 | 36(4):1669-1692. doi: 10.1214/07-AOS517 141 | } 142 | \seealso{ 143 | \code{\link{plot.af}} 144 | 145 | Other fence: 146 | \code{\link{glmfence}()}, 147 | \code{\link{lmfence}()} 148 | } 149 | \concept{fence} 150 | -------------------------------------------------------------------------------- /man/artificialeg.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mplot-package.R 3 | \docType{data} 4 | \name{artificialeg} 5 | \alias{artificialeg} 6 | \title{Artificial example} 7 | \format{ 8 | A data frame with 50 observations on 10 variables. 9 | } 10 | \usage{ 11 | data(artificialeg) 12 | } 13 | \description{ 14 | An artificial data set which causes stepwise regression 15 | procedures to select a non-parsimonious model. 16 | The true model is a simple linear regression of 17 | y against x8. 18 | } 19 | \details{ 20 | Inspired by the pathoeg data set in the MPV package. 
21 | } 22 | \examples{ 23 | data(artificialeg) 24 | full.mod = lm(y~.,data=artificialeg) 25 | step(full.mod) 26 | # generating model 27 | n=50 28 | set.seed(8) # a seed of 2 also works 29 | x1 = rnorm(n,0.22,2) 30 | x7 = 0.5*x1 + rnorm(n,0,sd=2) 31 | x6 = -0.75*x1 + rnorm(n,0,3) 32 | x3 = -0.5-0.5*x6 + rnorm(n,0,2) 33 | x9 = rnorm(n,0.6,3.5) 34 | x4 = 0.5*x9 + rnorm(n,0,sd=3) 35 | x2 = -0.5 + 0.5*x9 + rnorm(n,0,sd=2) 36 | x5 = -0.5*x2+0.5*x3+0.5*x6-0.5*x9+rnorm(n,0,1.5) 37 | x8 = x1 + x2 -2*x3 - 0.3*x4 + x5 - 1.6*x6 - 1*x7 + x9 +rnorm(n,0,0.5) 38 | y = 0.6*x8 + rnorm(n,0,2) 39 | artificialeg = round(data.frame(x1,x2,x3,x4,x5,x6,x7,x8,x9,y),1) 40 | } 41 | \keyword{datasets} 42 | -------------------------------------------------------------------------------- /man/bglmnet.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/bglmnet.R 3 | \name{bglmnet} 4 | \alias{bglmnet} 5 | \title{Model stability and variable importance plots for glmnet} 6 | \usage{ 7 | bglmnet( 8 | mf, 9 | nlambda = 100, 10 | lambda = NULL, 11 | B = 100, 12 | penalty.factor, 13 | screen = FALSE, 14 | redundant = TRUE, 15 | cores = NULL, 16 | force.in = NULL, 17 | seed = NULL 18 | ) 19 | } 20 | \arguments{ 21 | \item{mf}{a fitted 'full' model, the result of a call 22 | to lm or glm.} 23 | 24 | \item{nlambda}{how many penalty values to consider. Default = 100.} 25 | 26 | \item{lambda}{manually specify the penalty values (optional).} 27 | 28 | \item{B}{number of bootstrap replications} 29 | 30 | \item{penalty.factor}{Separate penalty factors can be applied to each 31 | coefficient. This is a number that multiplies lambda to allow 32 | differential shrinkage. Can be 0 for some variables, which implies 33 | no shrinkage, and that variable is always included in the model. 34 | Default is 1 for all variables (and implicitly infinity for variables 35 | listed in exclude). 
Note: the penalty factors are internally rescaled 36 | to sum to nvars, and the lambda sequence will reflect this change.} 37 | 38 | \item{screen}{logical, whether or not to perform an initial 39 | screen for outliers. Highly experimental, use at own risk. 40 | Default = FALSE.} 41 | 42 | \item{redundant}{logical, whether or not to add a redundant 43 | variable. Default = \code{TRUE}.} 44 | 45 | \item{cores}{number of cores to be used when parallel 46 | processing the bootstrap (Not yet implemented.)} 47 | 48 | \item{force.in}{the names of variables that should be forced 49 | into all estimated models. (Not yet implemented.)} 50 | 51 | \item{seed}{random seed for reproducible results} 52 | } 53 | \description{ 54 | Model stability and variable importance plots for glmnet 55 | } 56 | \details{ 57 | The result of this function is essentially just a 58 | list. The supplied plot method provides a way to visualise the 59 | results. 60 | } 61 | \examples{ 62 | n = 100 63 | set.seed(11) 64 | e = rnorm(n) 65 | x1 = rnorm(n) 66 | x2 = rnorm(n) 67 | x3 = x1^2 68 | x4 = x2^2 69 | x5 = x1*x2 70 | y = 1 + x1 + x2 + e 71 | dat = data.frame(y, x1, x2, x3, x4, x5) 72 | lm1 = lm(y ~ ., data = dat) 73 | \dontshow{ 74 | bg1 = bglmnet(lm1, seed = 1, B=10) 75 | plot(bg1) 76 | } 77 | \dontrun{ 78 | bg1 = bglmnet(lm1, seed = 1) 79 | # plot(bg1, which = "boot_size", interactive = TRUE) 80 | plot(bg1, which = "boot_size", interactive = FALSE) 81 | # plot(bg1, which = "vip", interactive = TRUE) 82 | plot(bg1, which = "vip", interactive = FALSE) 83 | } 84 | } 85 | \seealso{ 86 | \code{\link{plot.bglmnet}} 87 | } 88 | -------------------------------------------------------------------------------- /man/bodyfat.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mplot-package.R 3 | \docType{data} 4 | \name{bodyfat} 5 | \alias{bodyfat} 6 | \title{Body fat data set} 7 | \format{ 
8 | A data frame with 128 observations on 15 variables. 9 | \describe{ 10 | \item{Id}{Identifier} 11 | \item{Bodyfat}{Bodyfat percentage} 12 | \item{Age}{Age (years)} 13 | \item{Weight}{Weight (kg)} 14 | \item{Height}{Height (inches)} 15 | \item{Neck}{Neck circumference (cm)} 16 | \item{Chest}{Chest circumference (cm)} 17 | \item{Abdo}{Abdomen circumference (cm) "at the umbilicus 18 | and level with the iliac crest"} 19 | \item{Hip}{Hip circumference (cm)} 20 | \item{Thigh}{Thigh circumference (cm)} 21 | \item{Knee}{Knee circumference (cm)} 22 | \item{Ankle}{Ankle circumference (cm)} 23 | \item{Bic}{Extended biceps circumference (cm)} 24 | \item{Fore}{Forearm circumference (cm)} 25 | \item{Wrist}{Wrist circumference (cm) "distal to the 26 | styloid processes"} 27 | } 28 | } 29 | \usage{ 30 | data(bodyfat) 31 | } 32 | \description{ 33 | A data frame with 128 observations on 15 variables. 34 | } 35 | \details{ 36 | A subset of the 252 observations available in the \code{mfp} package. 37 | The selected observations avoid known high leverage points and 38 | outliers. The unused points from the data set could be used to validate 39 | selected models. 40 | } 41 | \examples{ 42 | data(bodyfat) 43 | full.mod = lm(Bodyfat~.,data=subset(bodyfat,select=-Id)) 44 | } 45 | \references{ 46 | Johnson W (1996, Vol 4). Fitting percentage of 47 | body fat to simple body measurements. Journal of Statistics 48 | Education. Bodyfat data retrieved from 49 | http://www.amstat.org/publications/jse/v4n1/datasets.johnson.html 50 | An expanded version is included in the \code{mfp} R package. 
51 | } 52 | \keyword{datasets} 53 | -------------------------------------------------------------------------------- /man/diabetes.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mplot-package.R 3 | \docType{data} 4 | \name{diabetes} 5 | \alias{diabetes} 6 | \title{Blood and other measurements in diabetics} 7 | \format{ 8 | A data frame with 442 observations on 11 variables. 9 | \describe{ 10 | \item{age}{Age} 11 | \item{sex}{Gender} 12 | \item{bmi}{Body mass index} 13 | \item{map}{Mean arterial pressure (average blood pressure)} 14 | \item{tc}{Total cholesterol (mg/dL)? Desirable range: below 200 mg/dL} 15 | \item{ldl}{Low-density lipoprotein ("bad" cholesterol)? 16 | Desirable range: below 130 mg/dL } 17 | \item{hdl}{High-density lipoprotein ("good" cholesterol)? 18 | Desirable range: above 40 mg/dL} 19 | \item{tch}{Blood serum measurement} 20 | \item{ltg}{Blood serum measurement} 21 | \item{glu}{Blood serum measurement (glucose?)} 22 | \item{y}{A quantitative measure of disease progression 23 | one year after baseline} 24 | } 25 | } 26 | \usage{ 27 | data(diabetes) 28 | } 29 | \description{ 30 | The diabetes data frame has 442 rows and 11 columns. 31 | These are the data used in Efron et al. (2004). 32 | } 33 | \details{ 34 | Data sourced from http://web.stanford.edu/~hastie/Papers/LARS 35 | } 36 | \examples{ 37 | data(diabetes) 38 | full.mod = lm(y~.,data=diabetes) 39 | } 40 | \references{ 41 | Efron, B., Hastie, T., Johnstone, I., Tibshirani, R., (2004). 42 | Least angle regression. The Annals of Statistics 32(2) 407-499. 
43 | DOI: 10.1214/009053604000000067 44 | } 45 | \keyword{datasets} 46 | -------------------------------------------------------------------------------- /man/fev.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mplot-package.R 3 | \docType{data} 4 | \name{fev} 5 | \alias{fev} 6 | \title{Forced Expiratory Volume} 7 | \format{ 8 | A data frame with 654 observations on 5 variables. 9 | \describe{ 10 | \item{age}{Age (years)} 11 | \item{fev}{Forced expiratory volume (liters). Roughly the amount 12 | of air an individual can exhale in the first second of 13 | a forceful breath.} 14 | \item{height}{Height (inches).} 15 | \item{sex}{Female is 0. Male is 1.} 16 | \item{smoke}{A binary variable indicating whether or not the 17 | youth smokes. Nonsmoker is 0. Smoker is 1.} 18 | } 19 | } 20 | \usage{ 21 | data(fev) 22 | } 23 | \description{ 24 | This data set consists of 654 observations on youths aged 3 to 19 from 25 | East Boston recorded during the middle to late 1970s. 26 | Forced expiratory volume (FEV), a measure of lung capacity, is the 27 | variable of interest. Age and height are two continuous predictors. 28 | Sex and smoke are two categorical predictors. 29 | } 30 | \details{ 31 | Copies of this data set can also be found in the 32 | \code{coneproj} and \code{tmle} packages. 33 | } 34 | \examples{ 35 | data(fev) 36 | full.mod = lm(fev~.,data=fev) 37 | step(full.mod) 38 | } 39 | \references{ 40 | Tager, I. B., Weiss, S. T., Rosner, B., and Speizer, F. E. (1979). 41 | Effect of parental cigarette smoking on pulmonary function in children. 42 | \emph{American Journal of Epidemiology}, \bold{110}, 15-26. 43 | 44 | Rosner, B. (1999). 45 | \emph{Fundamentals of Biostatistics}, 5th Ed., Pacific Grove, CA: Duxbury. 46 | 47 | Kahn, M.J. (2005). An Exhalent Problem for Teaching Statistics. 48 | \emph{Journal of Statistics Education}, \bold{13}(2). 
49 | http://www.amstat.org/publications/jse/v13n2/datasets.kahn.html 50 | } 51 | \keyword{datasets} 52 | -------------------------------------------------------------------------------- /man/glmfence.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/glmfence.R 3 | \name{glmfence} 4 | \alias{glmfence} 5 | \title{The fence procedure for generalised linear models} 6 | \usage{ 7 | glmfence(mf, cstar, nvmax, adaptive = TRUE, trace = TRUE, ...) 8 | } 9 | \arguments{ 10 | \item{mf}{an object of class \code{\link[stats]{glm}} 11 | specifying the full model.} 12 | 13 | \item{cstar}{the boundary of the fence, typically found 14 | through bootstrapping.} 15 | 16 | \item{nvmax}{the maximum number of variables that will be 17 | considered in the model.} 18 | 19 | \item{adaptive}{logical. If \code{TRUE} the boundary of the fence is 20 | given by cstar. Otherwise, the original (non-adaptive) fence 21 | is performed, where the boundary is cstar*hat(sigma)_{M,tildeM}.} 22 | 23 | \item{trace}{logical. If \code{TRUE} the function prints out its 24 | progress as it iterates up through the dimensions.} 25 | 26 | \item{...}{further arguments (currently unused)} 27 | } 28 | \description{ 29 | This function implements the fence procedure to 30 | find the best generalised linear model. 31 | } 32 | \references{ 33 | Jiming Jiang, Thuan Nguyen, J. Sunil Rao, 34 | A simplified adaptive fence procedure, Statistics & 35 | Probability Letters, Volume 79, Issue 5, 1 March 2009, 36 | Pages 625-629, http://dx.doi.org/10.1016/j.spl.2008.10.014. 
37 | } 38 | \seealso{ 39 | \code{\link{af}}, \code{\link{lmfence}} 40 | 41 | Other fence: 42 | \code{\link{af}()}, 43 | \code{\link{lmfence}()} 44 | } 45 | \concept{fence} 46 | \keyword{Internal} 47 | -------------------------------------------------------------------------------- /man/lmfence.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/lmfence.R 3 | \name{lmfence} 4 | \alias{lmfence} 5 | \title{The fence procedure for linear models} 6 | \usage{ 7 | lmfence(mf, cstar, nvmax, adaptive = TRUE, trace = TRUE, force.in = NULL, ...) 8 | } 9 | \arguments{ 10 | \item{mf}{an object of class \code{\link[stats]{lm}} 11 | specifying the full model.} 12 | 13 | \item{cstar}{the boundary of the fence, typically found 14 | through bootstrapping.} 15 | 16 | \item{nvmax}{the maximum number of variables that will be 17 | considered in the model.} 18 | 19 | \item{adaptive}{logical. If \code{TRUE} the boundary of the fence is 20 | given by cstar. Otherwise, the original (non-adaptive) fence 21 | is performed, where the boundary is cstar*hat(sigma)_{M,tildeM}.} 22 | 23 | \item{trace}{logical. If \code{TRUE} the function prints out its 24 | progress as it iterates up through the dimensions.} 25 | 26 | \item{force.in}{the names of variables that should be forced 27 | into all estimated models.} 28 | 29 | \item{...}{further arguments (currently unused)} 30 | } 31 | \description{ 32 | This function implements the fence procedure to 33 | find the best linear model. 
34 | } 35 | \examples{ 36 | n = 40 # sample size 37 | beta = c(1,2,3,0,0) 38 | K=length(beta) 39 | set.seed(198) 40 | X = cbind(1,matrix(rnorm(n*(K-1)),ncol=K-1)) 41 | e = rnorm(n) 42 | y = X\%*\%beta + e 43 | dat = data.frame(y,X[,-1]) 44 | # Non-adaptive approach (not recommended) 45 | lm1 = lm(y~.,data=dat) 46 | lmfence(lm1,cstar=log(n),adaptive=FALSE) 47 | } 48 | \references{ 49 | Jiming Jiang, Thuan Nguyen, J. Sunil Rao, 50 | A simplified adaptive fence procedure, Statistics & 51 | Probability Letters, Volume 79, Issue 5, 1 March 2009, 52 | Pages 625-629, http://dx.doi.org/10.1016/j.spl.2008.10.014. 53 | } 54 | \seealso{ 55 | \code{\link{af}}, \code{\link{glmfence}} 56 | 57 | Other fence: 58 | \code{\link{af}()}, 59 | \code{\link{glmfence}()} 60 | } 61 | \concept{fence} 62 | \keyword{Internal} 63 | -------------------------------------------------------------------------------- /man/mplot-package.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mplot-package.R 3 | \docType{package} 4 | \name{mplot-package} 5 | \alias{mplot-package} 6 | \title{Graphical model stability and model selection procedures} 7 | \description{ 8 | Graphical model stability and model selection procedures 9 | } 10 | \references{ 11 | Tarr G, Mueller S and Welsh AH (2018). mplot: An R Package for 12 | Graphical Model Stability and Variable Selection Procedures. 13 | Journal of Statistical Software, 83(9), pp. 1-28. doi: 10.18637/jss.v083.i09 14 | } 15 | \keyword{package} 16 | -------------------------------------------------------------------------------- /man/mplot.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mplot.R 3 | \name{mplot} 4 | \alias{mplot} 5 | \title{Model selection and stability curves} 6 | \usage{ 7 | mplot(mf, ...) 
8 | } 9 | \arguments{ 10 | \item{mf}{a fitted model.} 11 | 12 | \item{...}{objects of type \code{vis} or \code{af} or \code{bglmnet}.} 13 | } 14 | \description{ 15 | Opens a shiny GUI to investigate a range of model selection 16 | and stability issues 17 | } 18 | \examples{ 19 | n = 100 20 | set.seed(11) 21 | e = rnorm(n) 22 | x1 = rnorm(n) 23 | x2 = rnorm(n) 24 | x3 = x1^2 25 | x4 = x2^2 26 | x5 = x1*x2 27 | y = 1 + x1 + x2 + e 28 | dat = round(data.frame(y,x1,x2,x3,x4,x5),2) 29 | lm1 = lm(y ~ ., data = dat) 30 | \dontrun{ 31 | v1 = vis(lm1) 32 | af1 = af(lm1) 33 | bg1 = bglmnet(lm1) 34 | mplot(lm1, v1, af1, bg1) 35 | } 36 | 37 | } 38 | \references{ 39 | Tarr G, Mueller S and Welsh AH (2018). mplot: An R Package for 40 | Graphical Model Stability and Variable Selection Procedures. 41 | Journal of Statistical Software, 83(9), pp. 1-28. doi: 10.18637/jss.v083.i09 42 | } 43 | -------------------------------------------------------------------------------- /man/pipe.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utils-pipe.R 3 | \name{\%>\%} 4 | \alias{\%>\%} 5 | \title{Pipe operator} 6 | \usage{ 7 | lhs \%>\% rhs 8 | } 9 | \arguments{ 10 | \item{lhs}{A value or the magrittr placeholder.} 11 | 12 | \item{rhs}{A function call using the magrittr semantics.} 13 | } 14 | \value{ 15 | The result of calling \code{rhs(lhs)}. 16 | } 17 | \description{ 18 | See \code{magrittr::\link[magrittr:pipe]{\%>\%}} for details. 
19 | } 20 | \keyword{internal} 21 | -------------------------------------------------------------------------------- /man/plot.af.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/af.R 3 | \name{plot.af} 4 | \alias{plot.af} 5 | \title{Plot diagnostics for an af object} 6 | \usage{ 7 | \method{plot}{af}( 8 | x, 9 | pch, 10 | interactive = FALSE, 11 | classic = NULL, 12 | tag = NULL, 13 | shiny = FALSE, 14 | best.only = FALSE, 15 | width = 800, 16 | height = 400, 17 | fontSize = 12, 18 | left = 50, 19 | top = 30, 20 | chartWidth = "60\%", 21 | chartHeight = "80\%", 22 | backgroundColor = "transparent", 23 | legend.position = "top", 24 | model.wrap = NULL, 25 | legend.space = NULL, 26 | options = NULL, 27 | ... 28 | ) 29 | } 30 | \arguments{ 31 | \item{x}{\code{af} object, the result of \code{\link{af}}} 32 | 33 | \item{pch}{plotting character, i.e., symbol to use} 34 | 35 | \item{interactive}{logical. If \code{interactive=TRUE} a 36 | googleVis plot is provided instead of the base graphics plot. 37 | Default is \code{interactive=FALSE}.} 38 | 39 | \item{classic}{logical. Deprecated. If \code{classic=TRUE} a 40 | base graphics plot is provided instead of a googleVis plot. 41 | For now specifying \code{classic} will overwrite the 42 | default \code{interactive} behaviour, though this is 43 | likely to be removed in the future.} 44 | 45 | \item{tag}{Default NULL. Name tag of the objects to be extracted 46 | from a gvis (googleVis) object. 47 | 48 | The default tag is NULL, which will 49 | result in R opening a browser window. Setting \code{tag='chart'} 50 | or setting \code{options(gvis.plot.tag='chart')} is useful when 51 | googleVis is used in scripts, like knitr or rmarkdown.} 52 | 53 | \item{shiny}{Default FALSE. 
Set to TRUE when using in a shiny interface.} 54 | 55 | \item{best.only}{logical determining whether the output used the 56 | standard fence approach of only considering the best models 57 | that pass the fence (\code{TRUE}) or if it should take into 58 | account all models that pass the fence at each boundary 59 | value (\code{FALSE}).} 60 | 61 | \item{width}{Width of the googleVis chart canvas area, in pixels. 62 | Default: 800.} 63 | 64 | \item{height}{Height of the googleVis chart canvas area, in pixels. 65 | Default: 400.} 66 | 67 | \item{fontSize}{font size used in googleVis chart. Default: 12.} 68 | 69 | \item{left}{space at left of chart (pixels?). Default: "50".} 70 | 71 | \item{top}{space at top of chart (pixels?). Default: "30".} 72 | 73 | \item{chartWidth}{googleVis chart area width. 74 | A simple number is a value in pixels; 75 | a string containing a number followed by \code{\%} is a percentage. 76 | Default: \code{"60\%"}} 77 | 78 | \item{chartHeight}{googleVis chart area height. 79 | A simple number is a value in pixels; 80 | a string containing a number followed by \code{\%} is a percentage. 81 | Default: \code{"80\%"}} 82 | 83 | \item{backgroundColor}{The background colour for the main area 84 | of the chart. A simple HTML color string, 85 | for example: 'red' or '#00cc00'. Default: 'transparent'} 86 | 87 | \item{legend.position}{legend position, e.g. \code{"topleft"} 88 | or \code{"bottomright"}} 89 | 90 | \item{model.wrap}{Optional parameter to split the legend names 91 | if they are too long for classic plots. 
\code{model.wrap=2} 92 | means that there will be two variables per line, \code{model.wrap=3} 93 | gives three variables per line and \code{model.wrap=4} gives 4 94 | variables per line.} 95 | 96 | \item{legend.space}{Optional parameter to add additional space 97 | between the legend items for the classic plot.} 98 | 99 | \item{options}{If you want to specify the full set of googleVis 100 | options.} 101 | 102 | \item{...}{further arguments (currently unused)} 103 | } 104 | \description{ 105 | Summary plot of the bootstrap results of an af object. 106 | } 107 | \details{ 108 | For each value of \eqn{c}{c} a parametric 109 | bootstrap is performed under the full model. 110 | For each bootstrap 111 | sample we identify the smallest model inside the fence, 112 | \eqn{\hat{\alpha}(c)}{hat{alpha}(c)}. We calculate the empirical probability of selecting 113 | model \eqn{\alpha}{alpha} for a given value of \eqn{c}{c} as 114 | \deqn{p^*(c,\alpha)=P^*\{\hat{\alpha}(c)=\alpha\}.}{p*(c,alpha)=P*{hat{alpha}(c)=alpha}.} 115 | Hence, if \eqn{B}{B} bootstrap replications are performed, 116 | \eqn{p^*(c,\alpha)}{p^*(c,alpha)} is the 117 | proportion of times that model \eqn{\alpha}{alpha} is selected. Finally, 118 | define an overall selection probability, 119 | \deqn{p^*(c)=\max_{\alpha\in\mathcal{A}}p^*(c,\alpha)}{p*(c)=max_{alpha in mathcal{A}}p*(c,alpha)} and we plot 120 | \eqn{p^*(c)}{p*(c)} against \eqn{c}{c}. The points on the scatter plot are 121 | colour coded by the model that yielded the highest inclusion probability. 
122 | } 123 | -------------------------------------------------------------------------------- /man/plot.bglmnet.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/bglmnet.R 3 | \name{plot.bglmnet} 4 | \alias{plot.bglmnet} 5 | \title{Plot diagnostics for a bglmnet object} 6 | \usage{ 7 | \method{plot}{bglmnet}( 8 | x, 9 | highlight, 10 | interactive = FALSE, 11 | classic = NULL, 12 | tag = NULL, 13 | shiny = FALSE, 14 | which = c("vip", "boot", "boot_size"), 15 | width = 800, 16 | height = 400, 17 | fontSize = 12, 18 | left = 50, 19 | top = 30, 20 | chartWidth = "60\%", 21 | chartHeight = "80\%", 22 | axisTitlesPosition = "out", 23 | dataOpacity = 0.5, 24 | options = NULL, 25 | hAxis.logScale = TRUE, 26 | ylim, 27 | text = FALSE, 28 | backgroundColor = "transparent", 29 | legend.position = "right", 30 | jitterk = 0.1, 31 | srt = 45, 32 | max.circle = 15, 33 | min.prob = 0.1, 34 | ... 35 | ) 36 | } 37 | \arguments{ 38 | \item{x}{\code{bglmnet} object, the result of \code{\link{bglmnet}}} 39 | 40 | \item{highlight}{the name of a variable that will be highlighted.} 41 | 42 | \item{interactive}{logical. If \code{interactive=TRUE} a 43 | googleVis plot is provided instead of the base graphics plot. 44 | Default is \code{interactive=FALSE}.} 45 | 46 | \item{classic}{logical. Deprecated. If \code{classic=TRUE} a 47 | base graphics plot is provided instead of a googleVis plot. 48 | For now specifying \code{classic} will overwrite the 49 | default \code{interactive} behaviour, though this is 50 | likely to be removed in the future.} 51 | 52 | \item{tag}{Default NULL. Name tag of the objects to be extracted 53 | from a gvis (googleVis) object. 54 | 55 | The default tag is NULL, which will 56 | result in R opening a browser window. 
Setting \code{tag='chart'} 57 | or setting \code{options(gvis.plot.tag='chart')} is useful when 58 | googleVis is used in scripts, like knitr or rmarkdown.} 59 | 60 | \item{shiny}{Default FALSE. Set to TRUE when using in a shiny interface.} 61 | 62 | \item{which}{a vector specifying the plots to be output. Variable 63 | inclusion type plots \code{which = "vip"} or plots where the size 64 | of the point representing each model is proportional to selection 65 | probabilities by model size \code{which = "boot_size"} 66 | or by penalty parameter \code{which = "boot"}.} 67 | 68 | \item{width}{Width of the googleVis chart canvas area, in pixels. 69 | Default: 800.} 70 | 71 | \item{height}{Height of the googleVis chart canvas area, in pixels. 72 | Default: 400.} 73 | 74 | \item{fontSize}{font size used in googleVis chart. Default: 12.} 75 | 76 | \item{left}{space at left of chart (pixels?). Default: "50".} 77 | 78 | \item{top}{space at top of chart (pixels?). Default: "30".} 79 | 80 | \item{chartWidth}{googleVis chart area width. 81 | A simple number is a value in pixels; 82 | a string containing a number followed by \code{\%} is a percentage. 83 | Default: \code{"60\%"}} 84 | 85 | \item{chartHeight}{googleVis chart area height. 86 | A simple number is a value in pixels; 87 | a string containing a number followed by \code{\%} is a percentage. 88 | Default: \code{"80\%"}} 89 | 90 | \item{axisTitlesPosition}{Where to place the googleVis axis titles, 91 | compared to the chart area. Supported values: 92 | "in" - Draw the axis titles inside the chart area. 93 | "out" - Draw the axis titles outside the chart area. 94 | "none" - Omit the axis titles.} 95 | 96 | \item{dataOpacity}{The transparency of googleVis data points, 97 | with 1.0 being completely opaque and 0.0 fully transparent.} 98 | 99 | \item{options}{a list to be passed to the googleVis function giving 100 | complete control over the output. 
Specifying a value for 101 | \code{options} overwrites all other plotting variables.} 102 | 103 | \item{hAxis.logScale}{logical, whether or not to use a log scale on 104 | the horizontal axis. Default = TRUE.} 105 | 106 | \item{ylim}{the y limits of the \code{which="boot"} plots.} 107 | 108 | \item{text}{logical, whether or not to add text labels to classic 109 | boot plot. Default = \code{FALSE}.} 110 | 111 | \item{backgroundColor}{The background colour for the main area 112 | of the chart. A simple HTML color string, 113 | for example: 'red' or '#00cc00'. Default: 'transparent'} 114 | 115 | \item{legend.position}{the position of the legend for classic plots. 116 | Default \code{legend.position="right"}; alternatives include 117 | \code{legend.position="top"} and \code{legend.position="bottom"}} 118 | 119 | \item{jitterk}{amount of jittering of the model size in the lvk and boot plots. 120 | Default = 0.1.} 121 | 122 | \item{srt}{when \code{text=TRUE}, the angle of rotation for the text labels. 123 | Default = 45.} 124 | 125 | \item{max.circle}{determines the maximum circle size. 126 | Default = 15.} 127 | 128 | \item{min.prob}{lower bound on the probability of a model being selected. If 129 | a model has a selection probability lower than \code{min.prob} it will not be 130 | plotted.} 131 | 132 | \item{...}{further arguments (currently unused)} 133 | } 134 | \description{ 135 | A plot method to visualise the results of a \code{bglmnet} object. 
136 | } 137 | \seealso{ 138 | \code{\link{bglmnet}} 139 | } 140 | -------------------------------------------------------------------------------- /man/plot.vis.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/vis.R 3 | \name{plot.vis} 4 | \alias{plot.vis} 5 | \title{Plot diagnostics for a vis object} 6 | \usage{ 7 | \method{plot}{vis}( 8 | x, 9 | highlight, 10 | interactive = FALSE, 11 | classic = NULL, 12 | tag = NULL, 13 | shiny = FALSE, 14 | nbest = "all", 15 | which = c("vip", "lvk", "boot"), 16 | width = 800, 17 | height = 400, 18 | fontSize = 12, 19 | left = 50, 20 | top = 30, 21 | chartWidth = "60\%", 22 | chartHeight = "80\%", 23 | axisTitlesPosition = "out", 24 | dataOpacity = 0.5, 25 | options = NULL, 26 | ylim, 27 | legend.position = "right", 28 | backgroundColor = "transparent", 29 | text = FALSE, 30 | min.prob = 0.4, 31 | srt = 45, 32 | max.circle = 15, 33 | print.full.model = FALSE, 34 | jitterk = 0.1, 35 | seed = NULL, 36 | ... 37 | ) 38 | } 39 | \arguments{ 40 | \item{x}{\code{vis} object, the result of \code{\link{vis}}} 41 | 42 | \item{highlight}{the name of a variable that will be highlighted} 43 | 44 | \item{interactive}{logical. If \code{interactive=TRUE} a 45 | googleVis plot is provided instead of the base graphics plot. 46 | Default is \code{interactive=FALSE}.} 47 | 48 | \item{classic}{logical. Deprecated. If \code{classic=TRUE} a 49 | base graphics plot is provided instead of a googleVis plot. 50 | For now specifying \code{classic} will overwrite the 51 | default \code{interactive} behaviour, though this is 52 | likely to be removed in the future.} 53 | 54 | \item{tag}{Default NULL. Name tag of the objects to be extracted 55 | from a gvis (googleVis) object. 56 | 57 | The default tag is NULL, which will 58 | result in R opening a browser window. 
Setting \code{tag='chart'} 59 | or setting \code{options(gvis.plot.tag='chart')} is useful when 60 | googleVis is used in scripts, like knitr or rmarkdown.} 61 | 62 | \item{shiny}{Default FALSE. Set to TRUE when using in a shiny interface.} 63 | 64 | \item{nbest}{maximum number of models at each model size 65 | that will be considered for the lvk plot. Can also take 66 | a value of \code{"all"} which displays all models (default).} 67 | 68 | \item{which}{a vector specifying the plots to be output. Variable 69 | inclusion plots \code{which="vip"}; description loss against model 70 | size \code{which="lvk"}; bootstrapped description loss against 71 | model size \code{which="boot"}.} 72 | 73 | \item{width}{Width of the googleVis chart canvas area, in pixels. 74 | Default: 800.} 75 | 76 | \item{height}{Height of the googleVis chart canvas area, in pixels. 77 | Default: 400.} 78 | 79 | \item{fontSize}{font size used in googleVis chart. Default: 12.} 80 | 81 | \item{left}{space at left of chart (pixels?). Default: "50".} 82 | 83 | \item{top}{space at top of chart (pixels?). Default: "30".} 84 | 85 | \item{chartWidth}{googleVis chart area width. 86 | A simple number is a value in pixels; 87 | a string containing a number followed by \code{\%} is a percentage. 88 | Default: \code{"60\%"}} 89 | 90 | \item{chartHeight}{googleVis chart area height. 91 | A simple number is a value in pixels; 92 | a string containing a number followed by \code{\%} is a percentage. 93 | Default: \code{"80\%"}} 94 | 95 | \item{axisTitlesPosition}{Where to place the googleVis axis titles, 96 | compared to the chart area. Supported values: 97 | "in" - Draw the axis titles inside the chart area. 98 | "out" - Draw the axis titles outside the chart area. 
99 | "none" - Omit the axis titles.} 100 | 101 | \item{dataOpacity}{The transparency of googleVis data points, 102 | with 1.0 being completely opaque and 0.0 fully transparent.} 103 | 104 | \item{options}{a list to be passed to the googleVis function giving 105 | complete control over the output. Specifying a value for 106 | \code{options} overwrites all other plotting variables.} 107 | 108 | \item{ylim}{the y limits of the lvk and boot plots.} 109 | 110 | \item{legend.position}{the position of the legend for classic plots. 111 | Default \code{legend.position="right"}; alternatives include 112 | \code{legend.position="top"} and \code{legend.position="bottom"}} 113 | 114 | \item{backgroundColor}{The background colour for the main area 115 | of the chart. A simple HTML color string, 116 | for example: 'red' or '#00cc00'. Default: 'null' (there is an 117 | issue with GoogleCharts when setting 'transparent' related to the 118 | zoom window sticking - once that's sorted out, the default 119 | will change back to 'transparent')} 120 | 121 | \item{text}{logical, whether or not to add text labels to classic 122 | boot plot. Default = \code{FALSE}.} 123 | 124 | \item{min.prob}{when \code{text=TRUE}, a lower bound on the probability of 125 | selection before a text label is shown.} 126 | 127 | \item{srt}{when \code{text=TRUE}, the angle of rotation for the text labels. 128 | Default = 45.} 129 | 130 | \item{max.circle}{determines the maximum circle size. 131 | Default = 15.} 132 | 133 | \item{print.full.model}{logical, when \code{text=TRUE} this determines if the full 134 | model gets a label or not. Default=\code{FALSE}.} 135 | 136 | \item{jitterk}{amount of jittering of the model size in the lvk and boot plots. 137 | Default = 0.1.} 138 | 139 | \item{seed}{random seed for reproducible results} 140 | 141 | \item{...}{further arguments (currently unused)} 142 | } 143 | \description{ 144 | A plot method to visualise the results of a \code{vis} object. 
145 | } 146 | \details{ 147 | Specifying \code{which = "lvk"} generates a scatter plot where 148 | description loss is plotted against model size 149 | for each model considered. The \code{highlight} argument is 150 | used to differentiate models that contain a particular variable 151 | from those that do not. 152 | 153 | Specifying \code{which = "boot"} generates a scatter plot where 154 | each circle represents a model with a non-zero bootstrap probability, 155 | that is, each model that was selected as the best model of a 156 | particular dimension in at least one bootstrap replication. 157 | The area of each circle is proportional to the 158 | corresponding model's bootstrapped selection probability. 159 | } 160 | \examples{ 161 | n = 100 162 | set.seed(11) 163 | e = rnorm(n) 164 | x1 = rnorm(n) 165 | x2 = rnorm(n) 166 | x3 = x1^2 167 | x4 = x2^2 168 | x5 = x1*x2 169 | y = 1 + x1 + x2 + e 170 | dat = data.frame(y,x1,x2,x3,x4,x5) 171 | lm1 = lm(y~.,data=dat) 172 | \dontshow{ 173 | v1 = vis(lm1, B = 5, cores = 1, seed = 1) 174 | plot(v1, highlight = "x1", which = "lvk") 175 | plot(v1, which = "boot") 176 | plot(v1, which = "vip") 177 | } 178 | \dontrun{ 179 | v1 = vis(lm1, seed = 1) 180 | plot(v1, highlight = "x1", which = "lvk") 181 | plot(v1, which = "boot") 182 | plot(v1, which = "vip") 183 | } 184 | } 185 | \references{ 186 | Mueller, S. and Welsh, A. H. (2010), On model 187 | selection curves. International Statistical Review, 78:240-256. 188 | doi: 10.1111/j.1751-5823.2010.00108.x 189 | 190 | Murray, K., Heritier, S. and Mueller, S. (2013), Graphical 191 | tools for model selection in generalized linear models. 192 | Statistics in Medicine, 32:4438-4451. doi: 10.1002/sim.5855 193 | 194 | Tarr G, Mueller S and Welsh AH (2018). mplot: An R Package for 195 | Graphical Model Stability and Variable Selection Procedures. 196 | Journal of Statistical Software, 83(9), pp. 1-28. 
doi: 10.18637/jss.v083.i09 197 | } 198 | \seealso{ 199 | \code{\link{vis}} 200 | } 201 | -------------------------------------------------------------------------------- /man/print.af.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/af.R 3 | \name{print.af} 4 | \alias{print.af} 5 | \title{Print method for an af object} 6 | \usage{ 7 | \method{print}{af}(x, best.only = TRUE, ...) 8 | } 9 | \arguments{ 10 | \item{x}{an \code{af} object, the result of \code{\link{af}}} 11 | 12 | \item{best.only}{logical determining whether the output used the 13 | standard fence approach of only considering the best models 14 | that pass the fence (\code{TRUE}) or if it should take into 15 | account all models that pass the fence at each boundary 16 | value (\code{FALSE}).} 17 | 18 | \item{...}{further arguments (currently unused)} 19 | } 20 | \description{ 21 | Prints basic output of the bootstrap results of an 22 | af object. 23 | } 24 | -------------------------------------------------------------------------------- /man/print.vis.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/vis.R 3 | \name{print.vis} 4 | \alias{print.vis} 5 | \title{Print method for a vis object} 6 | \usage{ 7 | \method{print}{vis}(x, min.prob = 0.3, print.full.model = FALSE, ...) 8 | } 9 | \arguments{ 10 | \item{x}{a \code{vis} object, the result of \code{\link{vis}}} 11 | 12 | \item{min.prob}{a lower bound on the probability of 13 | selection before the result is printed} 14 | 15 | \item{print.full.model}{logical, determines if the full 16 | model gets printed or not. Default=\code{FALSE}.} 17 | 18 | \item{...}{further arguments (currently unused)} 19 | } 20 | \description{ 21 | Prints basic output of the bootstrap results of a 22 | vis object. 
23 | } 24 | -------------------------------------------------------------------------------- /man/process.fn.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mplot-package.R 3 | \name{process.fn} 4 | \alias{process.fn} 5 | \title{Process results within af function} 6 | \usage{ 7 | process.fn(fence.mod, fence.rank) 8 | } 9 | \arguments{ 10 | \item{fence.mod}{set of fence models} 11 | 12 | \item{fence.rank}{set of fence model ranks} 13 | } 14 | \description{ 15 | This function is used by the af function to process 16 | the results when iterating over different boundary values. 17 | } 18 | \keyword{internal} 19 | -------------------------------------------------------------------------------- /man/summary.af.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/af.R 3 | \name{summary.af} 4 | \alias{summary.af} 5 | \title{Summary method for an af object} 6 | \usage{ 7 | \method{summary}{af}(object, best.only = TRUE, ...) 8 | } 9 | \arguments{ 10 | \item{object}{\code{af} object, the result of \code{\link{af}}} 11 | 12 | \item{best.only}{logical determining whether the output used the 13 | standard fence approach of only considering the best models 14 | that pass the fence (\code{TRUE}) or if it should take into 15 | account all models that pass the fence at each boundary 16 | value (\code{FALSE}).} 17 | 18 | \item{...}{further arguments (currently unused)} 19 | } 20 | \description{ 21 | Provides comprehensive output of the bootstrap results of an 22 | af object. 
23 | } 24 | -------------------------------------------------------------------------------- /man/txt.fn.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mplot-package.R 3 | \name{txt.fn} 4 | \alias{txt.fn} 5 | \title{Print text for fence methods} 6 | \usage{ 7 | txt.fn(score, UB, obj) 8 | } 9 | \arguments{ 10 | \item{score}{realised value} 11 | 12 | \item{UB}{upper bound} 13 | 14 | \item{obj}{fitted model object} 15 | } 16 | \description{ 17 | This function provides the text for the case when trace=TRUE 18 | when using lmfence and glmfence functions. 19 | } 20 | \keyword{internal} 21 | -------------------------------------------------------------------------------- /man/vis.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/vis.R 3 | \name{vis} 4 | \alias{vis} 5 | \title{Model stability and variable inclusion plots} 6 | \usage{ 7 | vis( 8 | mf, 9 | nvmax, 10 | B = 100, 11 | lambda.max, 12 | nbest = "all", 13 | use.glmulti = FALSE, 14 | cores, 15 | force.in = NULL, 16 | screen = FALSE, 17 | redundant = TRUE, 18 | seed = NULL, 19 | ... 20 | ) 21 | } 22 | \arguments{ 23 | \item{mf}{a fitted 'full' model, the result of a call 24 | to lm or glm (and in the future lme or lmer)} 25 | 26 | \item{nvmax}{size of the largest model that can still be 27 | considered as a viable candidate} 28 | 29 | \item{B}{number of bootstrap replications} 30 | 31 | \item{lambda.max}{maximum penalty value for the vip plot, 32 | defaults to 2*log(n)} 33 | 34 | \item{nbest}{maximum number of models at each model size 35 | that will be considered for the lvk plot. Can also take 36 | a value of \code{"all"} which displays all models.} 37 | 38 | \item{use.glmulti}{logical. Whether to use the glmulti package 39 | instead of bestglm. 
Default \code{use.glmulti=FALSE}.} 40 | 41 | \item{cores}{number of cores to be used when parallel 42 | processing the bootstrap} 43 | 44 | \item{force.in}{the names of variables that should be forced 45 | into all estimated models. (Not yet implemented.)} 46 | 47 | \item{screen}{logical, whether or not to perform an initial 48 | screen for outliers. Highly experimental, use at own risk. 49 | Default = \code{FALSE}.} 50 | 51 | \item{redundant}{logical, whether or not to add a redundant 52 | variable. Default = \code{TRUE}.} 53 | 54 | \item{seed}{random seed for reproducible results} 55 | 56 | \item{...}{further arguments (currently unused)} 57 | } 58 | \description{ 59 | Calculates and provides the plot methods for standard 60 | and bootstrap enhanced model stability plots (\code{lvk} and 61 | \code{boot}) as well as variable inclusion plots (\code{vip}). 62 | } 63 | \details{ 64 | The result of this function is essentially just a 65 | list. The supplied plot method provides a way to visualise the 66 | results. 67 | 68 | See \code{?plot.vis} or \code{help("plot.vis")} for details of the 69 | plot method associated with the result. 70 | } 71 | \examples{ 72 | n = 100 73 | set.seed(11) 74 | e = rnorm(n) 75 | x1 = rnorm(n) 76 | x2 = rnorm(n) 77 | x3 = x1^2 78 | x4 = x2^2 79 | x5 = x1*x2 80 | y = 1 + x1 + x2 + e 81 | dat = data.frame(y, x1, x2, x3, x4, x5) 82 | lm1 = lm(y ~ ., data = dat) 83 | \dontshow{ 84 | v1 = vis(lm1, B = 5, cores = 1, seed = 1) 85 | plot(v1, highlight = "x1", which = "lvk") 86 | plot(v1, which = "boot") 87 | plot(v1, which = "vip") 88 | } 89 | \dontrun{ 90 | v1 = vis(lm1, seed = 1) 91 | plot(v1, highlight = "x1", which = "lvk") 92 | plot(v1, which = "boot") 93 | plot(v1, which = "vip") 94 | } 95 | } 96 | \references{ 97 | Mueller, S. and Welsh, A. H. (2010), On model 98 | selection curves. International Statistical Review, 78:240-256. 99 | doi: 10.1111/j.1751-5823.2010.00108.x 100 | 101 | Murray, K., Heritier, S. and Mueller, S. 
(2013), Graphical 102 | tools for model selection in generalized linear models. 103 | Statistics in Medicine, 32:4438-4451. doi: 10.1002/sim.5855 104 | 105 | Tarr G, Mueller S and Welsh AH (2018). mplot: An R Package for 106 | Graphical Model Stability and Variable Selection Procedures. 107 | Journal of Statistical Software, 83(9), pp. 1-28. doi: 10.18637/jss.v083.i09 108 | } 109 | \seealso{ 110 | \code{\link{plot.vis}} 111 | } 112 | -------------------------------------------------------------------------------- /man/wallabies.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mplot-package.R 3 | \docType{data} 4 | \name{wallabies} 5 | \alias{wallabies} 6 | \title{Rock-wallabies data set} 7 | \format{ 8 | A data frame with 200 observations on 9 variables. 9 | \describe{ 10 | \item{rw}{Presence of rock-wallaby scat} 11 | \item{edible}{Percentage cover of edible vegetation} 12 | \item{inedible}{Percentage cover of inedible vegetation} 13 | \item{canopy}{Percentage canopy cover} 14 | \item{distance}{Distance from diurnal refuge} 15 | \item{shelter}{Whether or not a plot occurred within a shelter point (large 16 | rock or boulder pile)} 17 | \item{lat}{Latitude of the plot location} 18 | \item{long}{Longitude of the plot location} 19 | } 20 | } 21 | \usage{ 22 | data(wallabies) 23 | } 24 | \description{ 25 | On Chalkers Top in the Warrumbungles (NSW, Australia) 200 evenly distributed 26 | one metre squared plots were surveyed. Plots were placed at a density 27 | of 7-13 per hectare. The presence or absence of fresh 28 | (<1 month old) scats of rock-wallabies was recorded for each plot 29 | along with location and a selection of predictor variables. 30 | } 31 | \details{ 32 | Macropods defaecate randomly as they forage and scat 33 | (faecal pellet) surveys are a reliable method for detecting the 34 | presence of rock-wallabies and other macropods. 
35 | Scats are used as an indication of spatial foraging patterns 36 | of rock-wallabies and sympatric macropods. Scats deposited while 37 | foraging were not confused with scats deposited while 38 | resting because the daytime refuge areas of rock-wallabies 39 | were known in detail for each colony and no samples were 40 | taken from those areas. Each of the 200 sites was 41 | examined separately to 42 | account for the different levels of predation risk and the 43 | abundance of rock-wallabies. 44 | } 45 | \examples{ 46 | data(wallabies) 47 | wdat = data.frame(subset(wallabies,select=-c(lat,long)), 48 | EaD = wallabies$edible*wallabies$distance, 49 | EaS = wallabies$edible*wallabies$shelter, 50 | DaS = wallabies$distance*wallabies$shelter) 51 | M1 = glm(rw~., family = binomial(link = "logit"), data = wdat) 52 | } 53 | \references{ 54 | Tuft KD, Crowther MS, Connell K, Mueller S and McArthur C (2011), 55 | Predation risk and competitive interactions affect foraging of 56 | an endangered refuge-dependent herbivore. Animal Conservation, 57 | 14: 447-457. 
doi: 10.1111/j.1469-1795.2011.00446.x 58 | } 59 | \keyword{datasets} 60 | -------------------------------------------------------------------------------- /mplot.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: knitr 13 | LaTeX: pdfLaTeX 14 | 15 | BuildType: Package 16 | PackageUseDevtools: Yes 17 | PackageInstallArgs: --no-multiarch --with-keep.source 18 | PackageCheckArgs: --as-cran 19 | PackageRoxygenize: rd,collate,namespace 20 | -------------------------------------------------------------------------------- /vignettes/af.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Simplified adaptive fence 3 | keywords: "af, adaptive fence, linear models, generalised linear models" 4 | series: "Adaptive fence" 5 | bibliography: jss.bib 6 | csl: apa-old-doi-prefix.csl 7 | output: 8 | github_document: 9 | toc_depth: 1 10 | --- 11 | 12 | > Overview of the simplified adaptive fence procedure. 13 | 14 | The fence, first introduced by @Jiang:2008, is built around the inequality 15 | $$\hat{Q}(\alpha) - \hat{Q}(\alpha_{f}) \leq c,$$ 16 | where $\hat Q$ is an empirical measure of description loss, $\alpha$ is a candidate model and $\alpha_{f}$ is the baseline, _full_ model. The procedure attempts to isolate a set of _correct models_ that satisfy the inequality. A model $\alpha^*$ is described as _within the fence_ if $\hat{Q}(\alpha^*) - \hat{Q}(\alpha_{f}) \leq c$. From the set of models within the fence, the one with minimum dimension is considered optimal. If there are multiple models within the fence at the minimum dimension, then the model with the smallest $\hat{Q}(\alpha)$ is selected. 
For a recent review of the fence and related methods, see @Jiang:2014. 17 | 18 | The implementation we provide in the **mplot** package is inspired by the simplified adaptive fence proposed by @Jiang:2009, which represents a significant advance over the original fence method proposed by @Jiang:2008. The key difference is that the parameter $c$ is not fixed at a certain value, but is instead adaptively chosen. Simulation results have shown that the adaptive method improves the finite sample performance of the fence. 19 | 20 | The adaptive fence procedure entails bootstrapping over a range of values of the parameter $c$. For each value of $c$ a parametric bootstrap is performed under $\alpha_f$. For each bootstrap sample we identify the smallest model inside the fence, $\hat{\alpha}(c)$. @Jiang:2009 suggest that if there is more than one model, choose the one with the smallest $\hat{Q}(\alpha)$. Define the empirical probability of selecting model $\alpha$ for a given value of $c$ as $p^*(c,\alpha)=P^*\{\hat{\alpha}(c)=\alpha\}$. Hence, if $B$ bootstrap replications are performed, $p^*(c,\alpha)$ is the proportion of times that model $\alpha$ is selected. Finally, define an overall selection probability, $p^*(c)=\max_{\alpha\in\mathcal{A}}p^*(c,\alpha)$ and plot $p^*(c)$ against $c$ to find the first peak. The value of $c$ at the first peak, $c^*$, is then used with the standard fence procedure on the original data. 21 | 22 | Our implementation is provided through the `af()` function and associated plot methods. An example with the artificial data set is generated using the following code. 23 | 24 | ```s 25 | af.art = af(lm.art, B = 150, n.c = 50) 26 | plot(af.art, interactive = FALSE, best.only = TRUE) 27 | ``` 28 | 29 | The arguments indicate that we perform $B = 150$ bootstrap resamples, over a grid of $50$ values of the parameter $c$. In this example, there is only one peak, and the choice of $c^*=21.1$ is clear. 30 | 31 | 32 |
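The definition of $p^*(c)$ given earlier can be illustrated with a small, made-up table of bootstrap selections. This is only a sketch of the definition, not the internals of `af()`: the objects `c.grid`, `sel`, `p.c.alpha`, `p.star` and `top.model` are hypothetical names, and the selected models are invented for illustration.

```r
# Toy illustration of p*(c): NOT the internals of af(), just the definition.
# Rows are B = 5 bootstrap replications, columns are 3 values of c.
# Each entry is the (made-up) label of the model selected by the fence.
c.grid <- c(5, 15, 25)
sel <- matrix(c("x1+x8", "x8", "x8",
                "x1+x8", "x8", "x8",
                "x8",    "x8", "1",
                "x1+x8", "x8", "x8",
                "x2+x8", "x1", "x8"),
              nrow = 5, byrow = TRUE,
              dimnames = list(NULL, paste0("c=", c.grid)))

# p*(c, alpha): proportion of replications selecting each model, for each c
p.c.alpha <- lapply(seq_along(c.grid), function(j) prop.table(table(sel[, j])))

# p*(c): the largest of those proportions, and the model that attains it
p.star <- sapply(p.c.alpha, max)
top.model <- sapply(p.c.alpha, function(p) names(p)[which.max(p)])

data.frame(c = c.grid, p.star = p.star, top.model = top.model)
```

Plotting `p.star` against `c.grid` would give the curve whose first peak defines $c^*$, and `top.model` is the kind of information the plot legend conveys.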
    33 | 34 | 35 |
    36 | *Result of a call to `plot(af.art, interactive = FALSE)` with additional arguments `best.only = TRUE` on the left and `best.only = FALSE` on the right. The more rapid decay after the $x_8$ model is typical of using `best.only = FALSE` where the troughs between candidate/dominant models are more pronounced.* 37 | 38 | One might expect that there should be a peak corresponding to the full model at $c=0$, but this is avoided by the inclusion of at least one redundant variable. Any model that includes the redundant variable is known to not be a _true_ model and hence is not included in the calculation of $p^*(c)$. This issue was first identified and addressed by @Jiang:2009. 39 | 40 | There are a number of key differences between our implementation and the method proposed by @Jiang:2009. Perhaps the most fundamental difference is in the philosophy underlying our implementation. Our approach is more closely aligned with the concept of model stability than with trying to pick a single _best_ model. This can be seen through the plot methods we provide. Instead of simply using the plots to identify the first peak, we add a legend that highlights which models were the most frequently selected for each parameter value, that is, for each $c$ value we identify which model gave rise to the $p^*(c)$ value. In this way, researchers can ascertain if there are regions of stability for various models. In the example given above, there is no need to even define a $c^*$ value: it is obvious from the plot that there is only one viable candidate model, a regression of $y$ on $x_8$. 41 | 42 | Our approach not only considers the best model of a given model size, but also allows users to view a plot that takes into account the possibility that more than one model of a given model size is within the fence. 
The `best.only = FALSE` option when plotting the results of the adaptive fence is a modification of the adaptive fence procedure which considers all models of a particular size that are within the fence when calculating the $p^*(c)$ values. In particular, for each value of $c$ and for each bootstrap replication, if a candidate model is found inside the fence, then we look to see if there are any other models of the same size that are also within the fence. If no other models of the same size are inside the fence, then that model is allocated a weight of 1. If there are two models inside the fence, then the best model is allocated a weight of 1/2. If three models are inside the fence, the best model gets a weight of 1/3, and so on. After $B$ bootstrap replications, we aggregate the weights by summing over the various models. The $p^*(c)$ value is the maximum aggregated weight divided by the number of bootstrap replications. This correction penalises the probability associated with the best model if there were other models of the same size inside the fence. The rationale is that if a model has no redundant variables then it will be the only model of that size inside the fence over a range of values of $c$. The result is more pronounced peaks which can help to determine the location of the correct peak and identify the optimal $c^*$ value or more clearly differentiate regions of model stability. This can be seen in the right hand panel of the figure above. 43 | 44 | Another key difference is that our implementation is designed for linear and generalised linear models, rather than mixed models. As far as we are aware, this is the first time fence methods have been applied to such models. There is potential to add mixed model capabilities to future versions of the **mplot** package, but computational speed is a major hurdle that needs to be overcome. 
The current implementation is made computationally feasible through the use of the **leaps** and **bestglm** packages and the use of parallel processing [@Lumley:2009; @McLeod:2014]. 45 | 46 | We have also provided an optional initial stepwise screening method that can help limit the range of $c$ values over which to perform the adaptive fence procedure. The initial stepwise procedure performs forward and backward stepwise model selection using both the AIC and BIC. From the four candidate models, we extract the sizes of the smallest and largest models, $k_L$ and $k_U$ respectively. To obtain a sensible range of $c$ values we consider the set of models with dimension between $k_L-2$ and $k_U+2$. Due to the inherent limitations of stepwise procedures, it can be useful to run with `initial.stepwise = FALSE`, using a small number of bootstrap replications over a sparse grid of $c$ values, to check that `initial.stepwise = TRUE` has produced a reasonable region. 47 | 48 | 49 | #### References -------------------------------------------------------------------------------- /vignettes/artificial.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Artificial example 3 | tags: 4 | - VIP 5 | - AF 6 | - MSP 7 | keywords: "VIP, variable inclusion plots, adaptive fence, model stability plots" 8 | bibliography: jss.bib 9 | csl: apa-old-doi-prefix.csl 10 | output: 11 | github_document: 12 | toc_depth: 1 13 | --- 14 | 15 | 16 | The artificially generated data set was originally designed to emphasise statistical deficiencies in stepwise procedures, but here it will be used to highlight the utility of the various procedures and plots provided by **mplot**. The data set and details of how it was generated are provided with the **mplot** package. 
17 | 18 | ```{r, eval = FALSE} 19 | # install.packages("mplot") 20 | data("artificialeg", package = "mplot") 21 | help("artificialeg", package = "mplot") 22 | ``` 23 | 24 | A scatterplot matrix of the data and the estimated pairwise correlations are given in the [pairs plot](#fig:pairsplot) below. There are no outliers and we have not positioned the observations in a subspace of the artificially generated data set. All variables, while related, originate from a Gaussian distribution. Fitting the full model yields no individually significant variables. 25 | 26 | ```{r} 27 | library("mplot") 28 | data("artificialeg") 29 | full.model = lm(y ~ ., data = artificialeg) 30 | round(summary(full.model)$coef, 2) 31 | ``` 32 | 33 | Performing default stepwise variable selection yields a model with all explanatory variables except $x_8$. As an aside, the dramatic changes in the p-values indicate that there is substantial interdependence between the explanatory variables even though none of the pairwise correlations in the [pairs plot](#fig:pairsplot) are particularly extreme. 34 | 35 | ```{r pairsplot, cache=TRUE} 36 | par(mar=c(0,0,0,0), mgp=c(1,0.5,0), tcl=-0.3, bg = "transparent") 37 | panel.cor <- function(x, y, digits=1, prefix="", cex.cor, ...) 38 | { 39 | usr <- par("usr"); on.exit(par(usr)) 40 | par(usr = c(0, 1, 0, 1)) 41 | r <- abs(cor(x, y)) 42 | txt <- format(c(r, 0.123456789), digits=digits)[1] 43 | txt <- paste("", txt, sep="") 44 | text(0.5, 0.5, txt, cex = 1) 45 | } 46 | pairs(artificialeg, upper.panel = panel.cor, 47 | pch = 19, col = '#22558866', oma = c(1.5,1.5,1.5,1.5), 48 | cex.labels = 1.25, gap = 0.2) 49 | ``` 50 | 51 |
    52 | 53 |
    54 | *Figure: scatterplot matrix of the artificially generated data set with estimated correlations in the upper right triangle. The true data generating process for the dependent variable is $y=0.6\, x_8 + \varepsilon$ where $\varepsilon\sim\mathcal{N}(0,2^2)$.* 55 | 56 | 57 | ```{r} 58 | step.model = step(full.model, trace = 0) 59 | round(summary(step.model)$coef, 2) 60 | ``` 61 | 62 | The true data generating process is $y = 0.6\,x_{8} + \varepsilon$, where $\varepsilon\sim\mathcal{N}(0,2^2)$. The bivariate regression of $y$ on $x_{8}$ is the more desirable model, not just because it is the true model representing the data generating process, but also because it is more parsimonious, with essentially the same residual variance as the larger model chosen by the stepwise procedure. This example illustrates a key statistical failing of stepwise model selection procedures: they only explore a subset of the model space and so are inherently susceptible to local minima in the information criterion [@Harrell:2001]. 63 | 64 | Perhaps the real problem with stepwise methods is that they allow researchers to transfer all responsibility for model selection to a computer and not put any real thought into the model selection process. This issue is also shared, to a certain extent, with more recent model selection procedures based on regularisation, such as the lasso and least angle regression [@Tibshirani:1996; @Tibshirani:2004], where attention focusses only on those models that are identified by the path taken through the model space. In the lasso, as the tuning parameter $\lambda$ is varied from zero to $\infty$, different regression parameters remain non-zero, thus generating a path through the set of possible regression models, from the largest _optimal_ model when $\lambda=0$ to the smallest possible model when $\lambda=\infty$, typically the null model because the intercept is not penalised. 
The lasso then selects a single model on this path, at the $\lambda$ value that minimises one of many possible criteria, such as 5-fold cross-validation error, another estimate of prediction error, or an information criterion (for example BIC). 65 | 66 | An alternative to stepwise or regularisation procedures is to perform exhaustive searches of the model space. While exhaustive searches avoid the issue of local minima, they are computationally expensive, growing exponentially in the number of variables $p$, with more than a thousand models when $p=10$ and a million when $p=20$. The methods provided in the **mplot** package and described in the remainder of the article go beyond stepwise procedures by incorporating exhaustive searches where feasible and using resampling techniques to provide an indication of the stability of the selected model. The **mplot** package can feasibly handle up to 50 variables in linear regression models and a similar number for logistic regression models when an appropriate transformation (described in the [birth weight example](birthweight.html)) is implemented. 
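To make the exponential growth concrete: if each of $p$ explanatory variables is either included or excluded (with the intercept always present), there are $2^p$ candidate models. A quick check in R, where `n.models` is just an illustrative helper and not part of **mplot**:

```r
# Number of submodels when each of p variables is either included or excluded
n.models <- function(p) 2^p

p <- c(5, 10, 20, 50)
data.frame(p = p,
           models = format(n.models(p), big.mark = ",", scientific = FALSE))
```

At $p = 50$ the count exceeds $10^{15}$, so fitting every model one by one is out of the question; searches at that scale presumably rely on more efficient algorithms, such as the branch-and-bound search implemented in the **leaps** package.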
67 | 68 | #### References 69 | -------------------------------------------------------------------------------- /vignettes/artificial_cache/html/__packages: -------------------------------------------------------------------------------- 1 | base 2 | mplot 3 | -------------------------------------------------------------------------------- /vignettes/artificial_cache/html/pairsplot_6c89e27e0924f8fe4f44e1cca65b063f.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/artificial_cache/html/pairsplot_6c89e27e0924f8fe4f44e1cca65b063f.RData -------------------------------------------------------------------------------- /vignettes/artificial_cache/html/pairsplot_6c89e27e0924f8fe4f44e1cca65b063f.rdb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/artificial_cache/html/pairsplot_6c89e27e0924f8fe4f44e1cca65b063f.rdb -------------------------------------------------------------------------------- /vignettes/artificial_cache/html/pairsplot_6c89e27e0924f8fe4f44e1cca65b063f.rdx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/artificial_cache/html/pairsplot_6c89e27e0924f8fe4f44e1cca65b063f.rdx -------------------------------------------------------------------------------- /vignettes/birthweight.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Birth weight example 3 | keywords: "example, AF, VIP, MSP, linear model" 4 | bibliography: jss.bib 5 | csl: apa-old-doi-prefix.csl 6 | output: 7 | github_document: 8 | toc_depth: 1 9 | --- 10 | 11 | The `birthwt` dataset from the **MASS** package has data on 189 births at the Baystate 
Medical Centre, Springfield, Massachusetts during 1986 [@Venables:2002]. The main variable of interest is low birth weight, a binary response variable `low` [@Hosmer:1989book]. We have taken the same approach to modelling the full model as in @Venables:2002, where `ptl` is reduced to a binary indicator of past history and `ftv` is reduced to a factor with three levels. 12 | 13 | ```{r, message = FALSE} 14 | data("birthwt", package = "MASS") 15 | bwt <- with(birthwt, { 16 | race <- factor(race, labels = c("white", "black", "other")) 17 | ptd <- factor(ptl > 0) 18 | ftv <- factor(ftv) 19 | levels(ftv)[-(1:2)] <- "2+" 20 | data.frame(low = factor(low), age, lwt, race, smoke = (smoke > 0), ptd, ht = (ht > 0), ui = (ui > 0), ftv) 21 | }) 22 | options(contrasts = c("contr.treatment", "contr.poly")) 23 | bw.glm <- glm(low ~ ., family = binomial, data = bwt) 24 | round(summary(bw.glm)$coef, 2) 25 | ``` 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | The `vis` and `af` objects are generated using the fitted full model object as an argument to the `vis()` and `af()` functions. Because this is very computationally intensive, the results have been saved in an RData file which can be reloaded later for plotting. 37 | 38 | ```{r, eval = FALSE} 39 | af.bw = af(bw.glm, B = 150, c.max = 20, n.c = 40) 40 | vis.bw = vis(bw.glm, B = 150) 41 | save(bw.glm, af.bw, vis.bw, file = "bw_main.RData") 42 | ``` 43 | 44 | The results are shown below in the interactive plots. Note that they display the larger set of variables more clearly than the static plot methods (where the legend might overwhelm the plot). An interactive plot is obtained using the `interactive = TRUE` parameter. As this is being rendered in an rmarkdown document, we use the rmarkdown chunk option `results = "asis"` along with the additional `tag = "chart"` parameter in the `plot()` function. 
45 | 46 | ```{r, results = "asis", message=FALSE} 47 | load("bw_main.RData") 48 | plot(vis.bw, which = "vip", interactive = TRUE, tag = "chart") 49 | ``` 50 | 51 | In this example, it is far less clear which is the best model, or if indeed a _best model_ exists. It is possible to infer an ordering of variable importance from the variable inclusion plots, but there is no clear cutoff as to which variables should be included and which should be excluded. The `ptdTRUE` variable is clearly important. It's less obvious that the `htTRUE` and `lwt` variables are important as they are in the vicinity of the redundant variable curve (`RV`). The other variables lie below the redundant variable curve with `ftv2+` clearly the least important variable. 52 | 53 | ```{r, results = "asis", message=FALSE} 54 | plot(vis.bw, which = "boot", highlight = "htTRUE", interactive = TRUE, tag = "chart", seed = 1) 55 | ``` 56 | 57 | In the model stability plot (above) the only dominant non-trivial model (that is, neither the full model nor the null model) is the single covariate logistic regression with `ptdTRUE` as the explanatory variable. In models of size 3, there is a range of competing models that perform similarly well. 58 | 59 | ```{r, results = "asis", message=FALSE} 60 | plot(af.bw, interactive = TRUE, tag = "chart") 61 | ``` 62 | 63 | In the adaptive fence plot, the only model more complex than a single covariate regression model that shows up with some regularity is the model with `lwt`, `ptd` and `ht`, though at such low levels, it can hardly be considered as a region of stability. This model also stands out slightly in the model stability plot, where it is selected in 6% of bootstrap resamples and has a slightly lower description loss than other models of the same dimension. It is worth recalling that the bootstrap resamples generated for the adaptive fence are separate from those generated for the model stability plots. 
Indeed the adaptive fence procedure relies on a parametric bootstrap, whereas the model stability plots rely on an exponential weighted bootstrap. Thus, to find some agreement between these methods is reassuring. 64 | 65 | Stepwise approaches using AIC or BIC yield conflicting models, depending on whether the search starts with the full model or the null model. As expected, the BIC stepwise approach returns smaller models than AIC, selecting the single covariate logistic regression, `low ~ ptd`, in the forward direction and the larger model, `low ~ lwt + ptd + ht`, when stepping backwards from the full model. Forward selection from the null model with the AIC yielded `low ~ ptd + age + ht + lwt + ui` whereas backward selection yielded the slightly larger model, `low ~ lwt + race + smoke + ptd + ht + ui`. Some of these models appear as features in the model stability plots, most notably the dominant single covariate logistic regression and the model with `lwt`, `ptd` and `ht` identified as a possible region of stability in the adaptive fence plot. The larger models identified by the AIC are reflective of the variable inclusion plot in that they show there may still be important information contained in a number of other variables not identified by the BIC approach. 66 | 67 | @Calcagno:2010 also consider this data set, but they allow for the possibility of interaction terms. Using their approach, they identify _two_ best models 68 | 69 | ``` 70 | low ~ smoke + ptd + ht + ui + ftv + age + lwt + ui:smoke + ftv:age 71 | low ~ smoke + ptd + ht + ui + ftv + age + lwt + ui:smoke + ui:ht + ftv:age 72 | ``` 73 | 74 | As a general rule, we would warn against the `.*.` approach, where all possible interaction terms are considered, as it does not consider whether or not the interaction terms actually make practical sense. 
@Calcagno:2010 conclude that "Having two best models and not one is an extreme case where taking model selection uncertainty into account rather than looking for a single best model is certainly recommended!" The issue here is that the software did not highlight that these models are identical as the `ui:ht` interaction variable is simply a vector of ones, and as such, is ignored by the GLM fitting routine. 75 | 76 | As computation time can be an issue for GLMs, it is useful to approximate the results using weighted least squares [@Hosmer:1989]. In practice this can be done by fitting the logistic regression and extracting the estimated logistic probabilities, $\hat{\pi}_{i}$. A new dependent variable is then constructed, 77 | 78 | $$z_{i} = \log\left(\frac{\hat{\pi}_{i}}{1-\hat{\pi}_{i}}\right) + \frac{y_{i}-\hat{\pi}_{i}}{\hat{\pi}_{i}(1-\hat{\pi}_{i})},$$ 79 | 80 | along with observation weights $v_{i}=\hat{\pi}_{i}(1-\hat{\pi}_{i})$. For any submodel $\alpha$ this approach produces the approximate coefficient estimates of @Lawless:1978 and enables us to use the **leaps** package to perform the computations for best subsets logistic regression as follows. 
81 | 82 | ```{r} 83 | pihat = bw.glm$fitted.values 84 | r = bw.glm$residuals 85 | z = log(pihat/(1 - pihat)) + r 86 | v = pihat*(1 - pihat) 87 | nbwt = bwt 88 | nbwt$z = z 89 | nbwt$low = NULL 90 | bw.lm = lm(z ~ ., data = nbwt, weights = v) 91 | ``` 92 | 93 | ```{r, eval = FALSE} 94 | bw.lm.vis = vis(bw.lm, B = 150) 95 | bw.lm.af = af(bw.lm, B = 150, c.max = 20, n.c = 40) 96 | save(bw.lm, bw.lm.vis, bw.lm.af, file = "bw_lm.RData") 97 | ``` 98 | 99 | ```{r, results = "asis", message = FALSE} 100 | load("bw_lm.RData") 101 | plot(bw.lm.vis, which = "vip", interactive = TRUE, tag = "chart") 102 | ``` 103 | 104 | ```{r, results = "asis", message=FALSE} 105 | plot(bw.lm.vis, which = "boot", highlight = "htTRUE", interactive = TRUE, tag = "chart") 106 | ``` 107 | 108 | ```{r, results = "asis", message=FALSE} 109 | plot(bw.lm.af, interactive = TRUE, tag = "chart") 110 | ``` 111 | 112 | The coefficients from `bw.lm` are identical to those from `bw.glm`. This approximation provides similar results, shown in the [figures above](#fig:bwtapprox), in a fraction of the time. 
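The agreement between the full-model coefficients can be checked directly. The sketch below uses simulated data rather than the birth weight data, so that it runs with base R alone; the data generating values and the object names (`dat`, `fit.glm`, `fit.lm`) are assumptions made purely for illustration.

```r
# Simulated data (an assumption for illustration; not the birth weight data)
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + x1 - 0.5 * x2))
dat <- data.frame(y, x1, x2)

# Full logistic regression
fit.glm <- glm(y ~ x1 + x2, family = binomial, data = dat)

# Working response z and weights v, following the formulas given earlier
pihat <- fitted(fit.glm)
z <- log(pihat / (1 - pihat)) + (dat$y - pihat) / (pihat * (1 - pihat))
v <- pihat * (1 - pihat)

# Weighted least squares regression of the working response
fit.lm <- lm(z ~ x1 + x2, data = dat, weights = v)

# The two sets of coefficients agree up to the glm convergence tolerance
cbind(glm = coef(fit.glm), wls = coef(fit.lm))
all.equal(coef(fit.glm), coef(fit.lm), tolerance = 1e-4)
```

The match holds for the full model because the weighted least squares fit amounts to one further iteration of the algorithm used to fit the GLM. For submodels, `z` and `v` stay fixed at their full-model values, so the resulting coefficients are only approximations.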
113 | 114 | 115 | 116 | 117 | 118 | 119 | 120 | 121 | 122 | #### References 123 | -------------------------------------------------------------------------------- /vignettes/bw_lm.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/bw_lm.RData -------------------------------------------------------------------------------- /vignettes/bw_main.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/bw_main.RData -------------------------------------------------------------------------------- /vignettes/diabetes_int.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/diabetes_int.RData -------------------------------------------------------------------------------- /vignettes/diabetes_main.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/diabetes_main.RData -------------------------------------------------------------------------------- /vignettes/images/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/.DS_Store -------------------------------------------------------------------------------- /vignettes/images/artafboTF.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/artafboTF.png 
-------------------------------------------------------------------------------- /vignettes/images/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/favicon.ico -------------------------------------------------------------------------------- /vignettes/images/figure4a.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/figure4a.png -------------------------------------------------------------------------------- /vignettes/images/figure4b.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/figure4b.png -------------------------------------------------------------------------------- /vignettes/images/figure4c.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/figure4c.png -------------------------------------------------------------------------------- /vignettes/images/figure5a.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/figure5a.png -------------------------------------------------------------------------------- /vignettes/images/figure5b.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/figure5b.png -------------------------------------------------------------------------------- 
/vignettes/images/figure5c.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/figure5c.png -------------------------------------------------------------------------------- /vignettes/images/figure5d.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/figure5d.png -------------------------------------------------------------------------------- /vignettes/images/nature.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/nature.png -------------------------------------------------------------------------------- /vignettes/images/oncology.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/oncology.png -------------------------------------------------------------------------------- /vignettes/images/plotvis.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/plotvis.png -------------------------------------------------------------------------------- /vignettes/images/thyroid.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/garthtarr/mplot/3d6072a8249ca2607f2bfe5eb464abfd3cdec526/vignettes/images/thyroid.png -------------------------------------------------------------------------------- /vignettes/interactive.Rmd: -------------------------------------------------------------------------------- 1 | --- 
2 | title: Interactive graphics 3 | keywords: "interactive graphics, plots, graphics, visualisations" 4 | bibliography: jss.bib 5 | csl: apa-old-doi-prefix.csl 6 | output: 7 | github_document: 8 | toc_dept: 1 9 | --- 10 | 11 | > Overview of the interactive graphics provided by mplot. 12 | 13 | To help researchers gain more value from the static plots (see [here](msp#fig:plotvis) and [here](af#fig:af.plot)) and interact with the model selection problem more closely, we have provided a set of interactive graphics based on the **googleVis** package and wrapped them in a **shiny** user interface. It is still quite novel for a package to provide a shiny interface for its methods, but there is precedent; see, for example, @McMurdie:2013 or @Gabry:2015. 14 | 15 | Among the most important contributions of these interactive methods are: the provision of tooltips to identify the models and/or variables; pagination of the legend for the variable inclusion plots; and a way to quickly select which variable to highlight in the model stability plots. These interactive plots can be generated when the `plot()` function is run on an `af` or `vis` object by specifying `interactive=TRUE`. 16 | 17 | The **mplot** package takes interactivity a step further, embedding these plots within a shiny web interface. This is done through a call to the `mplot()` function, which requires the full fitted model as the first argument and then a `vis` object and/or `af` object (in any order). 18 | 19 | ```{r eval=FALSE} 20 | mplot(lm.art, vis.art, af.art) 21 | ``` 22 | 23 | Note that the `vis()` and `af()` functions need to be run and the results stored prior to calling the `mplot()` function. The result of a call to this function is a webpage built using the **shiny** package with **shinydashboard** stylings [@Chang:2015a; @Chang:2015b]. 
Figure \ref{fig:shiny} shows a series of screen shots for the artificial example, equivalent to Figures \ref{plot.vis} and \ref{plot.af}, resulting from the above call to `mplot()`. 24 | 25 | 26 |
    27 | 28 | 29 | 30 |
    31 | 32 | _Screenshots from the web interface generated using `mplot()`._ 33 | 34 | The top panel of the [figure above](#fig:shiny) shows a model stability plot where the full model that does not contain $x_8$ has been selected and a tooltip has been displayed. It gives details about the model specification, the log-likelihood and the bootstrap selection probability within models of size 10. The tooltip makes it easier for users to identify which variables are included in dominant models than the static plot equivalent. On the left hand side of the shiny interface, a drop down menu allows users to select the variable to be highlighted. This is passed through the `highlight` argument. Models with the highlighted variable are displayed as red circles whereas models without the highlighted variable are displayed as blue circles. The ability for researchers to quickly and easily see which models in the stability plot contain certain variables enhances their understanding of the relative importance of different components in the model. Selecting `No` at the `Bootstrap?` radio buttons yields the plot of description loss against dimension. 35 | 36 | The middle panel of the [figure above](#fig:shiny) is a screen shot of an interactive variable inclusion plot. When the mouse hovers over a line, the tooltip gives information about the bootstrap inclusion probability and which variable the line represents. Note that in comparison to the bottom panel of [this figure](msp.html#fig:plotvis), the legend is now positioned outside of the main plot area. When the user clicks a variable in the legend, the corresponding line in the plot is highlighted. This can be seen in the [figure above](#fig:shiny), where the $x_8$ variable in the legend has been clicked and the corresponding $x_8$ line in the variable inclusion plot has been highlighted. The highlighting is particularly useful with the redundant variable, so it can easily be identified. 
If the number of predictor variables is such that they no longer fit neatly down the right hand side of the plot, they simply paginate, that is an arrow appears allowing users to toggle through to the next page of variables. This makes the interface cleaner and easier to interpret than the static plots. Note also the vertical lines corresponding to traditional AIC and BIC penalty values. 37 | 38 | The bottom panel of the [figure above](#fig:shiny) is an interactive adaptive fence plot. The tooltip for a particular point gives information about the explanatory variable(s) in the model, the $\alpha^*=\arg\max_{\alpha\in\mathcal{A}}p^*(c,\alpha)$ value and the $(c,p^*(c))$ pair that has been plotted. Hovering or clicking on a model in the legend highlights all the points in the plot corresponding to that model. In this example, the $x_8$ legend has been clicked on and an additional circle has been added around all points representing the regression with $x_8$ as the sole explanatory variable. The shiny interface on the left allows users to toggle between `best.only = TRUE` and `best.only = FALSE`. 39 | 40 | The interactive graphics and shiny interface are most useful in the exploratory stage of model selection. Once the researcher has found the most informative plot through interactive analysis, the more traditional static plots may be used in a formal write up of the problem. 41 | 42 | #### References 43 | 44 | -------------------------------------------------------------------------------- /vignettes/msp.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Model stability plots 3 | keywords: "MSP, model stability plots" 4 | bibliography: jss.bib 5 | csl: apa-old-doi-prefix.csl 6 | output: 7 | github_document: 8 | toc_dept: 1 9 | --- 10 | 11 | > Overview of model stability plots. 
12 | 13 | In order to generate model stability and variable inclusion plots, the first step is to generate a `vis` object using the `vis()` function. To generate a `vis` object for the artificial data example, the fitted full model object, along with some optional arguments, is passed to the `vis()` function. 14 | 15 | ```s 16 | lm.art = lm(y ~ ., data = artificialeg) 17 | vis.art = vis(lm.art, B = 150, redundant = TRUE, nbest = "all") 18 | ``` 19 | 20 |
    21 | 22 | 23 |
    24 | _Figure: results of calls to `plot(vis.art, interactive = FALSE)` with additional arguments `which = "lvk"` in the top left, `which = "boot"` in the top right and `which = "vip"` down the bottom._ 25 | 26 | The `B = 150` argument provided to the `vis()` function requests 150 bootstrap replications. See @Murray:2013 for more detail on the use of exponential weights in bootstrap model selection. Specifying `redundant = TRUE` is unnecessary, as it is the default option; it ensures that an extra variable, randomly generated from a standard normal distribution and hence completely unrelated to the true data generating process, is added to the full model. This extra redundant variable can be used as a baseline comparison in the variable inclusion plots. Finally, the `nbest` argument controls how many models with the smallest $\hat{Q}(\alpha)$ for each model size $k=1,\ldots,p$ are recorded. It can take an integer value, or `nbest = "all"`, which ensures that all possible models are displayed when the plot method is called, as shown in the top left panel of the [figure above](#fig:plotvis). Typically researchers do not need to visualise the entire model space, and in problems with larger numbers of candidate variables it is impractical to store and plot results for all models. The default behaviour of the `vis()` function is to set `nbest = 5`, essentially highlighting the maximum enveloping lower convex curve of @Murray:2013. 27 | 28 | The simplest visualisation of the model space is to plot a measure of description loss against model complexity for all possible models, a special implementation of which is the Mallows $C_p$ plot [@Mallows:2000]. This is done by passing the argument `which = "lvk"` to the plot function applied to a `vis` object. The string `"lvk"` is short for loss versus $k$, the dimension of the model. 
29 | 30 | ```s 31 | plot(vis.art, interactive = FALSE, highlight = "x8", which = "lvk") 32 | ``` 33 | 34 | The result of this function can be found in the top left panel of the [figure above](#fig:plotvis). The `highlight` argument is used to differentiate models that contain a particular variable from those that do not. This is an implementation of the _enriched scatter plot_ of @Murray:2013. There is a clear separation between models that contain $x_8$ and those that do not, that is, all triangles are clustered towards the bottom with the circles above in a separate cluster. There is no similar separation for the other explanatory variables (not shown). These results strongly suggest that $x_8$ is the single most important variable. For clarity the points have been jittered slightly along the horizontal axis, though the model sizes remain clearly differentiated. 35 | 36 | Rather than performing a single pass over the model space and plotting the description loss against model size, a more nuanced and discerning approach is to use an (exponential weighted) bootstrap to determine how often various models achieve the minimal loss for each model size. The advantage of the bootstrap approach is that it gives a measure of model stability for each model size as promoted by @Meinshausen:2010, @Mueller:2010 and @Murray:2013. 37 | 38 | The weighted bootstrap has two key benefits over the residual or nonparametric bootstrap. First, the weighted bootstrap always yields observable responses, which is particularly relevant when these observable values are restricted to be integers (as in many generalized linear models) or when $y$ values are naturally bounded, say to be observed on the interval 0 to 1. Second, the weighted bootstrap does not suffer from separation issues that regularly occur in logistic and other models. 
The pairs bootstrap also yields observable responses and can be thought of as a special (boundary) case of the weighted bootstrap where some weights are allowed to be exactly zero, which can create a separation issue in logistic models. Therefore, we have chosen to implement the weighted bootstrap because it is a simple, elegant method that appears to work well. Specifically, we utilise the exponential weighted bootstrap where the observations are reweighted with weights drawn from an exponential distribution with mean 1 [see @Murray:2013]. 39 | 40 | To visualise the results of the exponential weighted bootstrap, the `which = "boot"` argument needs to be passed to the plot call on a `vis` object. The `highlight` argument can again be used to distinguish between models with and without a particular variable. Each circle represents a model with a non-zero bootstrap probability, that is, each model that was selected as the best model of a particular dimension in at least one bootstrap replication. Furthermore, the area of each circle is proportional to the corresponding model's bootstrapped selection probability. 41 | 42 | The top right panel of the [figure above](#fig:plotvis) is an example of a model stability plot for the artificial data set. The null model, the full model and the simple linear regression of $y$ on $x_8$ all have bootstrap probabilities equal to one. While there are alternatives to the null and full model their inclusion in the plot serves two main purposes. Firstly, to gauge the potential range in description loss and secondly to provide a baseline against which to compare other circles to see if any approach a similar size, which would indicate that those are dominant models of a given model dimension. In the [figure above](#fig:plotvis), for model dimensions of between three and ten, there are no clearly dominant models, that is, within each model size there are no models that are selected much more commonly than the alternatives. 
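The exponential weighted bootstrap described above can be sketched in base R. This is an illustration only and not how `vis()` is implemented: the simulated data, the variable names (`dat`, `preds`, `best`, `sel.prob`), the exhaustive loop over all subsets and the use of the weighted residual sum of squares as the description loss are all assumptions made for the example.

```r
# Simulated data in which only x1 is related to y (illustrative, not artificialeg)
set.seed(2)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)

# Enumerate every non-empty subset of the candidate predictors
preds <- c("x1", "x2", "x3")
subsets <- unlist(
  lapply(seq_along(preds), function(k) combn(preds, k, simplify = FALSE)),
  recursive = FALSE
)
formulas <- vapply(subsets,
                   function(s) paste("y ~", paste(s, collapse = " + ")),
                   character(1))
n.pred <- lengths(subsets)  # number of predictors in each candidate model

B <- 50  # number of bootstrap replications
best <- matrix(NA_character_, nrow = B, ncol = length(preds))
for (b in seq_len(B)) {
  # Reweight the observations with exponential weights of mean 1
  dat$w <- rexp(n, rate = 1)
  # Description loss: weighted residual sum of squares of each candidate model
  loss <- vapply(formulas, function(f) {
    fit <- lm(as.formula(f), data = dat, weights = w)
    sum(dat$w * residuals(fit)^2)
  }, numeric(1))
  # Record the model with the smallest loss within each model size
  for (k in seq_along(preds)) {
    idx <- which(n.pred == k)
    best[b, k] <- formulas[idx][which.min(loss[idx])]
  }
}

# Bootstrap selection probabilities within each model size
sel.prob <- lapply(seq_along(preds), function(k) table(best[, k]) / B)
names(sel.prob) <- paste("predictors:", seq_along(preds))
print(sel.prob)
```

The proportions in `sel.prob` play the role of the circle areas in a model stability plot: a model that dominates its size class has a proportion close to one, while several models with similar proportions indicate that no model of that size is clearly preferred.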
43 | 44 | A print method is available for `vis` objects which prints the model formula, log-likelihood and proportion of times that a given model was selected as the _best_ model within each model size. The default minimum probability of a model being selected before it gets printed is 0.3, though this can be customised by passing a `min.prob` argument to the `print` function. 45 | 46 | ```s 47 | print(vis.art, min.prob = 0.25) 48 | ``` 49 | 50 | ``` 51 | name prob logLikelihood 52 | y~1 1.00 -135.33 53 | y~x8 1.00 -105.72 54 | y~x4+x8 0.40 -103.63 55 | y~x1+x8 0.27 -104.47 56 | y~x1+x2+x3+x4+x5+x6+x7+x9 0.26 -100.63 57 | y~x1+x2+x3+x4+x5+x6+x7+x9+RV 0.33 -100.51 58 | ``` 59 | 60 | The output above reinforces what we know from the top right panel of the [figure above](#fig:plotvis). The null model is always selected and in models of size two a regression of $y$ on $x_8$ is always selected. In models of size three the two most commonly selected models are `y~x4+x8`, which was selected 40% of the time, and `y~x1+x8`, which was selected in 27% of bootstrap replications. Interestingly, in models of size nine and ten, the most commonly selected models do not contain $x_8$; these are shown as blue circles in the plot. We will see in the next section that this phenomenon is related to the failure of stepwise variable selection with this data set. 61 | 62 | #### References 63 | -------------------------------------------------------------------------------- /vignettes/people.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: mplot contributors 3 | keywords: authors, mplot, contributors 4 | bibliography: jss.bib 5 | csl: apa-old-doi-prefix.csl 6 | output: 7 | github_document: 8 | toc_dept: 1 9 | --- 10 | 11 | > Details of those who have contributed to the mplot project. 
12 | 13 | Core development team: 14 | 15 | - Dr Garth Tarr (University of Sydney) 16 | - Prof Samuel Mueller (University of Sydney) 17 | - Prof Alan Welsh (Australian National University) 18 | 19 | Suggestions and testing: 20 | 21 | - Dr Kevin Murray (University of Western Australia) 22 | 23 | This research was undertaken with the assistance of resources from the National Computational Infrastructure (NCI), which is supported by the Australian Government. Samuel Mueller and Alan Welsh were supported by the Australian Research Council (DP140101259). We also gratefully acknowledge two anonymous referees for their helpful comments and suggestions for the paper and package. 24 | 25 | #### Citation 26 | 27 | If you use this package to inform your model selection choices, please use the following citation: 28 | 29 | ```{r, comment=NA} 30 | citation("mplot") 31 | ``` 32 | 33 | 34 | 35 | -------------------------------------------------------------------------------- /vignettes/publications.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: mplot in publications 3 | bibliography: jss.bib 4 | csl: apa-old-doi-prefix.csl 5 | output: 6 | github_document: 7 | toc_dept: 1 8 | --- 9 | 10 | ### Abundance and richness of key Antarctic seafloor fauna correlates with modelled food availability 11 | 12 | 13 | Most seafloor communities at depths below the photosynthesis zone rely on food that sinks through the water column. However, the nature and strength of this pelagic–benthic coupling and its influence on the structure and diversity of seafloor communities is unclear, especially around Antarctica where ecological data are sparse. Here we show that the strength of pelagic–benthic coupling along the East Antarctic shelf depends on both physical processes and the types of benthic organisms considered. 
In an approach based on modelling food availability, we combine remotely sensed sea-surface chlorophyll-a, a regional ocean model and diatom abundances from sediment grabs with particle tracking and show that fluctuating seabed currents are crucial in the redistribution of surface productivity at the seafloor. The estimated availability of suspended food near the seafloor correlates strongly with the abundance of benthic suspension feeders, while the deposition of food particles correlates with decreasing suspension feeder richness and more abundant deposit feeders. The modelling framework, which can be modified for other regions, has broad applications in conservation and management, as it enables spatial predictions of key components of seafloor biodiversity over vast regions around Antarctica. 14 | 15 | Jan Jansen, Nicole A. Hill, Piers K. Dunstan, John McKinlay, Michael D. Sumner, Alexandra L. Post, Marc P. Eléaume, Leanne K. Armand, Jonathan P. Warnock, Benjamin K. Galton-Fenzi & Craig R. Johnson (2017). Abundance and richness of key Antarctic seafloor fauna correlates with modelled food availability, _Nature Ecology & Evolution_ **2**, 71-80. [doi:10.1038/s41559-017-0392-3](http://dx.doi.org/10.1038/s41559-017-0392-3) 16 | 17 |
    18 | 19 | ### Image-based data mining to probe dosimetric correlates of radiation-induced trismus 20 | 21 | 22 | 23 | **Purpose** 24 | 25 | To identify imaged regions in which dose is associated with radiation-induced trismus after head and neck cancer radiation therapy (HNRT) using a novel image-based data mining (IBDM) framework. 26 | 27 | **Conclusions** 28 | 29 | IBDM bypasses the common assumption that dose patterns within structures are unimportant. Our novel IBDM approach for continuous outcome variables successfully identified a cluster of voxels that are highly associated with trismus, overlapping partially with the ipsilateral masseter. Tests on an external validation cohort showed an even stronger correlation with trismus. These results support use of the region in HNRT treatment planning to potentially reduce trismus. 30 | 31 | William Beasley, Maria Thor, Alan McWilliam, Andrew Green, Ranald Mackay, Nick Slevin, Caroline Olsson, Niclas Pettersson, Caterina Finizia, Cherry Estilo, Nadeem Riaz, Nancy Y. Lee, Joseph O. Deasy & Marcel van Herk (2018). Image-based data mining to probe dosimetric correlates of radiation-induced trismus, 32 | _International Journal of Radiation Oncology, Biology, Physics_, **102**(4), 1330-1338. [doi:10.1016/j.ijrobp.2018.05.054](https://doi.org/10.1016/j.ijrobp.2018.05.054) 33 | 34 | 35 |
    36 | 37 | ### How can the occurrence of delayed elevation of thyroid stimulating hormone in preterm infants born between 35 and 36 weeks gestation be predicted? 38 | 39 | 40 | **Objective** 41 | 42 | We evaluated frequency and risk factors of delayed TSH elevation (dTSH) and investigated follow-up outcomes in the dTSH group with venous TSH (v-TSH) levels of 6–20 mU/L according to whether late preterm infants born at gestational age (GA) 35–36 weeks had risk factors. 43 | 44 | **Conclusions** 45 | 46 | dTSH was detected in 9.0% and levothyroxine was indicated in 1.5% of infants born at GA 35–36 weeks, particularly those with a LBW, a congenital anomaly, or history of ICM exposure. Either levothyroxine or retesting is indicated for late preterm neonates with TSH levels ≥10 mU/L regardless of risk factors. If healthy preterm neonates show v-TSH levels of 6–10 mU/L, a second repeat test may not be necessary; however, further studies are required to set a threshold for retesting. 47 | 48 | Heo YJ, Lee YA, Lee B, Lee YJ, Lim YH, Chung HR, et al. (2019) How can the occurrence of delayed elevation of thyroid stimulating hormone in preterm infants born between 35 and 36 weeks gestation be predicted? _PLoS ONE_ **14**(8): e0220240. [doi:10.1371/journal.pone.0220240](https://doi.org/10.1371/journal.pone.0220240) -------------------------------------------------------------------------------- /vignettes/publications.md: -------------------------------------------------------------------------------- 1 | mplot in publications 2 | ================ 3 | 4 | ### Abundance and richness of key Antarctic seafloor fauna correlates with modelled food availability 5 | 6 | 7 | 8 | **Abstract**: Most seafloor communities at depths below the 9 | photosynthesis zone rely on food that sinks through the water column. 
10 | However, the nature and strength of this pelagic–benthic coupling and 11 | its influence on the structure and diversity of seafloor communities is 12 | unclear, especially around Antarctica where ecological data are sparse. 13 | Here we show that the strength of pelagic–benthic coupling along the 14 | East Antarctic shelf depends on both physical processes and the types of 15 | benthic organisms considered. In an approach based on modelling food 16 | availability, we combine remotely sensed sea-surface chlorophyll-a, a 17 | regional ocean model and diatom abundances from sediment grabs with 18 | particle tracking and show that fluctuating seabed currents are crucial 19 | in the redistribution of surface productivity at the seafloor. The 20 | estimated availability of suspended food near the seafloor correlates 21 | strongly with the abundance of benthic suspension feeders, while the 22 | deposition of food particles correlates with decreasing suspension 23 | feeder richness and more abundant deposit feeders. The modelling 24 | framework, which can be modified for other regions, has broad 25 | applications in conservation and management, as it enables spatial 26 | predictions of key components of seafloor biodiversity over vast regions 27 | around Antarctica. 28 | 29 | Jan Jansen, Nicole A. Hill, Piers K. Dunstan, John McKinlay, Michael D. 30 | Sumner, Alexandra L. Post, Marc P. Eléaume, Leanne K. Armand, Jonathan 31 | P. Warnock, Benjamin K. Galton-Fenzi & Craig R. Johnson (2017). 32 | Abundance and richness of key Antarctic seafloor fauna correlates with 33 | modelled food availability, *Nature Ecology & Evolution* **2**, 71-80. 
34 | [doi:10.1038/s41559-017-0392-3](http://dx.doi.org/10.1038/s41559-017-0392-3) 35 | 36 | ### Image-based Data Mining to Probe Dosimetric Correlates of Radiation-induced Trismus 37 | 38 | 39 | 40 | \***Abstract** 41 | 42 | #### Purpose 43 | 44 | To identify imaged regions in which dose is associated with 45 | radiation-induced trismus after head and neck cancer radiation therapy 46 | (HNRT) using a novel image-based data mining (IBDM) framework. 47 | 48 | #### Methods and Materials 49 | 50 | A cohort of 86 HNRT patients were analyzed for region identification. 51 | Trismus was characterized as a continuous variable by the maximum 52 | incisor-to-incisor opening distance (MID) at 6 months after radiation 53 | therapy. Patient anatomies and dose distributions were spatially 54 | normalized to a common frame of reference using deformable image 55 | registration. IBDM was used to identify clusters of voxels associated 56 | with MID (P ≤ .05 based on permutation testing). The result was 57 | externally tested on a cohort of 35 patients with head and neck cancer. 58 | Internally, we also performed a dose-volume histogram–based analysis by 59 | comparing the magnitude of the correlation between MID and the mean dose 60 | for the IBDM-identified cluster in comparison with 5 delineated 61 | masticatory structures. 62 | 63 | #### Results 64 | 65 | A single cluster was identified with the IBDM approach (P \< .01), 66 | partially overlapping with the ipsilateral masseter. The dose-volume 67 | histogram–based analysis confirmed that the IBDM cluster had the 68 | strongest association with MID, followed by the ipsilateral masseter and 69 | the ipsilateral medial pterygoid (Spearman’s rank correlation 70 | coefficients: Rs = −0.36, –0.35, −0.32; P = .001, .001, .002, 71 | respectively). External validation confirmed an association between mean 72 | dose to the IBDM cluster and MID (Rs = −0.45; P = .007). 
73 | 74 | #### Conclusions 75 | 76 | IBDM bypasses the common assumption that dose patterns within structures 77 | are unimportant. Our novel IBDM approach for continuous outcome 78 | variables successfully identified a cluster of voxels that are highly 79 | associated with trismus, overlapping partially with the ipsilateral 80 | masseter. Tests on an external validation cohort showed an even stronger 81 | correlation with trismus. These results support use of the region in 82 | HNRT treatment planning to potentially reduce trismus. 83 | 84 | William Beasley, Maria Thor, Alan McWilliam, Andrew Green, Ranald 85 | Mackay, Nick Slevin, Caroline Olsson, Niclas Pettersson, Caterina 86 | Finizia, Cherry Estilo, Nadeem Riaz, Nancy Y. Lee, Joseph O. Deasy & 87 | Marcel van Herk (2018). Image-based Data Mining to Probe Dosimetric 88 | Correlates of Radiation-induced Trismus, *International Journal of 89 | Radiation Oncology*Biology*Physics*, 102(4), 1330-1338. 90 | [doi:10.1016/j.ijrobp.2018.05.054](https://doi.org/10.1016/j.ijrobp.2018.05.054) 91 | -------------------------------------------------------------------------------- /vignettes/timing.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Timing considerations 3 | keywords: mplot, getting_started 4 | bibliography: jss.bib 5 | csl: apa-old-doi-prefix.csl 6 | output: 7 | github_document: 8 | toc_dept: 1 9 | --- 10 | 11 | 23 | 24 | Any bootstrap model selection procedure is time consuming. However, for linear models, we have leveraged the efficiency of the branch-and-bound algorithm provided by **leaps** [@Lumley:2009; @Miller:2002]. The **bestglm** package is used for GLMs; but in the absence of a comparably efficient algorithm the computational burden is much greater [@McLeod:2014]. 
25 | 26 | Furthermore, we have taken advantage of the embarrassingly parallel nature of bootstrapping, utilising the [**doParallel**](https://cran.r-project.org/package=doParallel) and [**foreach**](https://cran.r-project.org/package=foreach) packages to provide cross platform multicore support, available through the `cores` argument. By default it will detect the number of cores available on your computer and leave one free. 27 | 28 | Figure \ref{fig:time} shows the timing results of simulations run for standard use scenarios with 4, 8 or 16 cores used in parallel. Each observation plotted is the average of four runs of a given model size. The simulated models had a sample size of $n=100$ with $5,10,\ldots,50$ candidate variables, of which 30% were active in the true model. 29 | 30 | The results show that both the `vis()` and `af()` functions are quite feasible on standard desktop hardware with 4 cores, even for moderate dimensions of up to 40 candidate variables. The adaptive fence takes longer than the `vis()` function, though this is to be expected as the effective number of bootstrap replications is `B`$\times$`n.c`, where `n.c` is the number of divisions in the grid of the parameter $c$. 31 | 32 | The results for GLMs are far less impressive, even when the maximum dimension of a candidate solution is set to `nvmax = 10`. In its current implementation, the adaptive fence is only really feasible for models of around 10 predictors and the `vis()` function for 15. Future improvements could use approximations of the type outlined by @Hosmer:1989 to bring the power of the linear model branch-and-bound algorithm to GLMs. An example of how this works in practice is given in Section \ref{sec:bw}. 33 | 34 | An alternative approach for high dimensional models would be to consider subset selection with convex relaxations as in @Shen:2012 or combine bootstrap model selection with regularisation. 
In particular, we have implemented variable inclusion plots and model stability plots for **glmnet** [@Shen:2012]. In general, this is very fast for models of moderate dimension, but it does not consider the full model space. Restrictions within the **glmnet** package mean it is only applicable to linear models, binomial logistic regression, and Poisson regression with the log link function. The **glmnet** package also allows for `"multinomial"`, `"cox"`, and `"mgaussian"` families, though we have not yet incorporated these into the **mplot** package. 35 | 36 |
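The parallel bootstrap idea discussed earlier in this vignette can be illustrated with a minimal sketch. It uses only the **parallel** package that ships with R rather than **doParallel** and **foreach**, and it is not the code behind the `cores` argument in **mplot**. The simulated data, the object names (`dat`, `boot_coefs`, `boot_mat`) and the cap of two workers are assumptions made purely for illustration.

```r
library(parallel)  # ships with base R

set.seed(2024)
n <- 100
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)   # toy data: only x1 is active

# Detect the available cores and leave one free; fall back to 1 if detection fails
cores <- max(1, detectCores() - 1, na.rm = TRUE)

B <- 20                              # number of bootstrap replications
cl <- makeCluster(min(cores, 2))     # capped at 2 workers to keep the sketch light
clusterSetRNGStream(cl, 123)         # independent random number streams per worker

# Each replication is independent of the others, so the work splits cleanly
boot_coefs <- parLapply(cl, seq_len(B), function(b, dat) {
  dat$w <- rexp(nrow(dat))           # iid exponential bootstrap weights
  coef(lm(y ~ x1 + x2, data = dat, weights = w))
}, dat = dat)
stopCluster(cl)

boot_mat <- do.call(rbind, boot_coefs)  # one row of coefficients per replication
```

Because no replication depends on another, the wall-clock time should fall roughly in proportion to the number of workers, up to the overhead of starting the cluster and copying the data to each worker.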
    37 | 38 | 39 |
    40 | 41 | 42 | 43 | 44 | _Average time required to run the `af()` and `vis()` functions when $n=100$. A binomial regression was used for the GLM example._ 45 | 46 | #### References 47 | 48 | -------------------------------------------------------------------------------- /vignettes/vip.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: Variable inclusion plots 3 | keywords: "VIP, variable inclusion plots" 4 | bibliography: jss.bib 5 | csl: apa-old-doi-prefix.csl 6 | output: 7 | github_document: 8 | toc_depth: 1 9 | --- 10 | 11 | > Overview of variable inclusion plots. 12 | 13 | Rather than visualising a loss measure against model size, it can be instructive to consider which variables are present in the overall _best_ model over a set of bootstrap replications. To facilitate comparison between models of different sizes we use the generalised information criterion, 14 | 15 | $$\textrm{GIC}(\alpha,\lambda) = \hat{Q}(\alpha) + \lambda p_{\alpha}.$$ 16 | 17 | The $\hat{Q}(\alpha)$ component is a measure of _description loss_ or _lack of fit_, a function that describes how well a model fits the data, for example, the residual sum of squares or $-2~\times~\text{log-likelihood}$. The number of independent regression model parameters, $p_{\alpha}$, is a measure of _model complexity_. The penalty multiplier, $\lambda$, determines the properties of the model selection criterion [@Mueller:2010; @Mueller:2013]. Special cases, when $\hat{Q}(\alpha)=-2\times\text{log-likelihood}(\alpha)$, include the AIC with $\lambda=2$, BIC with $\lambda=\log(n)$ and more generally the generalised information criterion (GIC) with $\lambda\in\mathbb{R}$ [@Konishi:1996]. 18 | 19 | Using the same exponential weighted bootstrap replications as in the model selection plots, we have a set of $B$ bootstrap replications and for each model size we know which model has the smallest description loss. 
This information is used to determine which model minimises the GIC over a range of values of the penalty parameter, $\lambda$, in each bootstrap sample. For each value of $\lambda$, we extract the variables present in the _best_ models over the $B$ bootstrap replications and calculate the corresponding bootstrap probabilities that a given variable is present. These calculations are visualised in a variable inclusion plot (VIP) as introduced by @Mueller:2010 and @Murray:2013. The VIP shows empirical inclusion probabilities as a function of the penalty multiplier $\lambda$. The probabilities are calculated by observing how often each variable is retained in $B$ exponential weighted bootstrap replications. Specifically, for each bootstrap sample $b=1,\ldots,B$ and each penalty multiplier $\lambda$, the chosen model, $\hat{\alpha}_{\lambda}^{b}\in \mathcal{A}$, is that which achieves the smallest $\textrm{GIC}(\alpha,\lambda;\mathbf{w}_b) = \hat{Q}^b(\alpha)+\lambda p_{\alpha}$, where $\mathbf{w}_b$ is the $n$-vector of independent and identically distributed exponential weights. The inclusion probability for variable $x_{j}$ is estimated by $B^{-1}\sum_{b=1}^{B}\mathbb{I}\{j\in \hat{\alpha}_{\lambda}^{b}\}$, where $\mathbb{I}\{j\in \hat{\alpha}_{\lambda}^{b}\}$ is one if $x_{j}$ is in the final model and zero otherwise. Following @Murray:2013, the default range of $\lambda$ values is $\lambda\in[0,2\log(n)]$ as this includes most standard values used for the penalty parameter. 20 | 21 | The example shown in the bottom panel of [this figure](msp#fig:plotvis) is obtained using the `which = "vip"` argument to the plot function. As expected, when the penalty parameter is equal to zero, all variables are included in the model; the full model achieves the lowest description loss, and hence minimises the GIC when there is no penalisation. 
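The inclusion probability calculation described above can be sketched in base R. This is an illustrative toy implementation under stated assumptions, not the code used by `vis()`: it enumerates all subsets of three simulated candidate variables, takes $-2\times\text{log-likelihood}$ of a weighted `lm()` fit as the description loss, and counts every regression coefficient (including the intercept) in $p_{\alpha}$. All object names (`models`, `incl`, `vip`) are hypothetical.

```r
set.seed(1)
n <- 100
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat <- cbind(y = 1 + 2 * X$x1 + rnorm(n), X)   # only x1 is active in this toy example

vars <- names(X)
# All 2^3 candidate models, encoded as logical inclusion indicators
models <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), length(vars))))
colnames(models) <- vars

B <- 50
lambdas <- seq(0, 2 * log(n), length.out = 20)  # default range [0, 2 log(n)]
incl <- matrix(0, nrow = length(lambdas), ncol = length(vars),
               dimnames = list(NULL, vars))

for (b in seq_len(B)) {
  dat$w <- rexp(n)                              # iid exponential weights w_b
  Q <- p <- numeric(nrow(models))
  for (m in seq_len(nrow(models))) {
    rhs <- if (any(models[m, ])) paste(vars[models[m, ]], collapse = " + ") else "1"
    fit <- lm(as.formula(paste("y ~", rhs)), data = dat, weights = w)
    Q[m] <- as.numeric(-2 * logLik(fit))        # description loss Q^b(alpha)
    p[m] <- length(coef(fit))                   # model complexity p_alpha
  }
  for (l in seq_along(lambdas)) {
    best <- which.min(Q + lambdas[l] * p)       # model minimising the GIC
    incl[l, ] <- incl[l, ] + models[best, ]     # record which variables it contains
  }
}

vip <- incl / B   # rows index lambda, columns index variables
```

Plotting each column of `vip` against `lambdas` (for example with `matplot(lambdas, vip, type = "l")`) gives a rough analogue of a VIP. Exhaustive enumeration is only practical for a handful of variables, which is why an efficient search over the model space matters for realistic problems.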
As the penalty parameter increases, the inclusion probabilities for individual variables typically decrease as more parsimonious models are preferred. In the present example, the inclusion probabilities for the $x_8$ variable exhibit a sharp decrease at low levels of the penalty parameter, but then increase steadily as a more parsimonious model is sought. This pattern helps to explain why stepwise model selection chose the larger model with all the variables except $x_8$ -- there exists a local minimum. Hence, for large models the inclusion of $x_8$ adds no additional value over having all the other explanatory variables in the model. 22 | 23 | It is often instructive to visualise how the inclusion probabilities change over the range of penalty parameters. The ordering of the variables in the legend corresponds to their average inclusion probability over the whole range of penalty values. We have also added an independent standard Gaussian random variable to the model matrix as a redundant variable (`RV`). This provides a baseline to help determine which inclusion probabilities are _significant_ in the sense that they exhibit a different behaviour to the `RV` curve. Variables with inclusion probabilities near or below the `RV` curve can be considered to have been included by chance. 24 | 25 | To summarise, VIPs continue the model stability theme. Rather than simply using a single penalty parameter associated with a particular information criterion, for example the AIC with $\lambda=2$, our implementation of VIPs adds considerable value by allowing us to learn from a range of penalty parameters. Furthermore, we are able to see which variables are most often included over a number of bootstrap samples. 26 | 27 | #### References 28 | --------------------------------------------------------------------------------