├── .Rbuildignore ├── .gitignore ├── DESCRIPTION ├── NAMESPACE ├── NEWS.md ├── OLDNEWS.txt ├── R ├── 1.1-classes.R ├── 1.2-methods.R ├── 1.3-exprso.R ├── 2-conjoin.R ├── 3-mod.R ├── 4-split.R ├── 5.1-fs.R ├── 5.2-build.R ├── 6.1-predict.R ├── 6.2-calc.R ├── 7.1-plCV.R ├── 7.2-plGrid.R ├── 7.3-plMonteCarlo.R ├── 7.4-plNested.R ├── 8.1-pipe.R ├── 8.2-ens.R ├── 9-deprecated.R └── 9-global.R ├── README.Rmd ├── README.md ├── TODO.md ├── cran-comments.md ├── data-raw ├── data.R ├── makenew-build.R └── r&d.R ├── data ├── array.rda └── arrayMulti.rda ├── exprso.Rproj ├── inst └── CITATION ├── man ├── ExprsArray-class.Rd ├── ExprsBinary-class.Rd ├── ExprsEnsemble-class.Rd ├── ExprsMachine-class.Rd ├── ExprsModel-class.Rd ├── ExprsModule-class.Rd ├── ExprsMulti-class.Rd ├── ExprsPipeline-class.Rd ├── ExprsPredict-class.Rd ├── GSE2eSet.Rd ├── MultiPredict-class.Rd ├── RegrsArray-class.Rd ├── RegrsModel-class.Rd ├── RegrsPredict-class.Rd ├── array.Rd ├── arrayExprs.Rd ├── arrayMulti.Rd ├── build..Rd ├── build.Rd ├── buildANN.Rd ├── buildDNN.Rd ├── buildDT.Rd ├── buildEnsemble.Rd ├── buildFRB.Rd ├── buildGLM.Rd ├── buildLASSO.Rd ├── buildLDA.Rd ├── buildLM.Rd ├── buildLR.Rd ├── buildNB.Rd ├── buildRF.Rd ├── buildSVM.Rd ├── calcMonteCarlo.Rd ├── calcNested.Rd ├── calcStats.Rd ├── check.ctrlGS.Rd ├── classCheck.Rd ├── compare.Rd ├── conjoin.Rd ├── ctrlFeatureSelect.Rd ├── ctrlGridSearch.Rd ├── ctrlModSet.Rd ├── ctrlSplitSet.Rd ├── defaultArg.Rd ├── exprso-predict.Rd ├── exprso.Rd ├── forceArg.Rd ├── fs..Rd ├── fs.Rd ├── fsANOVA.Rd ├── fsAmalgam.Rd ├── fsAnnot.Rd ├── fsBalance.Rd ├── fsCor.Rd ├── fsEbayes.Rd ├── fsEdger.Rd ├── fsInclude.Rd ├── fsMrmre.Rd ├── fsNULL.Rd ├── fsPCA.Rd ├── fsPRA.Rd ├── fsPrcomp.Rd ├── fsRDA.Rd ├── fsRankProd.Rd ├── fsSample.Rd ├── fsStats.Rd ├── getArgs.Rd ├── getFeatures.Rd ├── getWeights.Rd ├── lequal.Rd ├── makeGridFromArgs.Rd ├── mod.Rd ├── modAcomp.Rd ├── modCLR.Rd ├── modCluster.Rd ├── modFilter.Rd ├── modHistory.Rd ├── modInclude.Rd ├── 
modNormalize.Rd ├── modPermute.Rd ├── modRatios.Rd ├── modSample.Rd ├── modScale.Rd ├── modSkew.Rd ├── modSubset.Rd ├── modSwap.Rd ├── modTMM.Rd ├── modTransform.Rd ├── nfeats.Rd ├── nsamps.Rd ├── packageCheck.Rd ├── pipe.Rd ├── pipeFilter.Rd ├── pipeUnboot.Rd ├── pl.Rd ├── plCV.Rd ├── plGrid.Rd ├── plMonteCarlo.Rd ├── plNested.Rd ├── progress.Rd ├── split.Rd ├── splitBalanced.Rd ├── splitBoost.Rd ├── splitBy.Rd ├── splitSample.Rd ├── splitStratify.Rd ├── trainingSet.Rd └── validationSet.Rd ├── tests ├── testthat.R └── testthat │ ├── data.RData │ ├── test-1.1-classes.R │ ├── test-2-conjoin.R │ ├── test-3-modHistory.R │ ├── test-5.1-fs.R │ ├── test-5.2-build.R │ ├── test-8.2-ens.R │ ├── test-fsRDA.R │ ├── test-mod.R │ ├── test-pl-cv.R │ ├── test-pl-gs.R │ ├── test-regrs.R │ └── test-split.R └── vignettes ├── a_introduction-vignette.Rmd ├── b_advanced-vignette.Rmd ├── c_readme.Rmd ├── exprso-diagram.cmap └── exprso-diagram.jpg /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^.*\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^README\.Rmd$ 4 | ^README_.*\.png$ 5 | ^OLDNEWS\.txt$ 6 | ^cran-comments\.md$ 7 | ^TODO\.md$ 8 | ^data-raw$ 9 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | inst/doc 5 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: exprso 2 | Title: Rapid Deployment of Machine Learning Algorithms 3 | Version: 0.6.4 4 | URL: http://github.com/tpq/exprso 5 | BugReports: http://github.com/tpq/exprso/issues 6 | Authors@R: c( 7 | person("Thomas", "Quinn", email = "contacttomquinn@gmail.com", role = c("aut", "cre")), 8 | person("Daniel", "Tylee", email = "dantylee@gmail.com", role = "ctb"), 9 | person("Samuel", "Lee", 
email = "samleenz@me.com", role = "ctb") 10 | ) 11 | Description: Supervised machine learning has an increasingly important role in data 12 | analysis. This package introduces a framework for rapidly building and deploying 13 | supervised machine learning in a high-throughput manner. This package provides a 14 | user-friendly interface that empowers investigators to execute state-of-the-art 15 | binary and multi-class classification, as well as regression, with minimal 16 | programming experience necessary. 17 | License: GPL-2 18 | LazyData: TRUE 19 | VignetteBuilder: knitr 20 | RoxygenNote: 6.1.1 21 | Encoding: UTF-8 22 | Imports: 23 | cluster, 24 | MASS, 25 | e1071, 26 | glmnet, 27 | frbs, 28 | lattice, 29 | methods, 30 | nnet, 31 | plyr, 32 | randomForest, 33 | ROCR, 34 | rpart, 35 | sampling, 36 | stats 37 | Depends: 38 | R (>= 3.2.2), 39 | kernlab 40 | Suggests: 41 | amalgam, 42 | balance, 43 | Biobase, 44 | edgeR, 45 | GEOquery, 46 | h2o, 47 | knitr, 48 | limma, 49 | magrittr, 50 | mRMRe, 51 | propr, 52 | RankProd, 53 | rmarkdown, 54 | testthat, 55 | vegan 56 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | export(GSE2eSet) 4 | export(arrayExprs) 5 | export(build.) 6 | export(buildANN) 7 | export(buildDNN) 8 | export(buildDT) 9 | export(buildEnsemble) 10 | export(buildFRB) 11 | export(buildGLM) 12 | export(buildLASSO) 13 | export(buildLDA) 14 | export(buildLM) 15 | export(buildLR) 16 | export(buildNB) 17 | export(buildRF) 18 | export(buildSVM) 19 | export(calcMonteCarlo) 20 | export(calcNested) 21 | export(calcStats) 22 | export(compare) 23 | export(conjoin) 24 | export(ctrlFeatureSelect) 25 | export(ctrlGridSearch) 26 | export(ctrlModSet) 27 | export(ctrlSplitSet) 28 | export(exprso) 29 | export(fs.) 
30 | export(fsANOVA) 31 | export(fsAmalgam) 32 | export(fsAnnot) 33 | export(fsBalance) 34 | export(fsCor) 35 | export(fsEbayes) 36 | export(fsEdger) 37 | export(fsInclude) 38 | export(fsMrmre) 39 | export(fsNULL) 40 | export(fsPCA) 41 | export(fsPRA) 42 | export(fsPrcomp) 43 | export(fsRDA) 44 | export(fsRankProd) 45 | export(fsSample) 46 | export(fsStats) 47 | export(getFeatures) 48 | export(getWeights) 49 | export(lequal) 50 | export(modAcomp) 51 | export(modCLR) 52 | export(modCluster) 53 | export(modFilter) 54 | export(modHistory) 55 | export(modInclude) 56 | export(modNormalize) 57 | export(modPermute) 58 | export(modRatios) 59 | export(modSample) 60 | export(modScale) 61 | export(modSkew) 62 | export(modSubset) 63 | export(modSwap) 64 | export(modTMM) 65 | export(modTransform) 66 | export(nfeats) 67 | export(nsamps) 68 | export(pipeFilter) 69 | export(pipeSubset) 70 | export(pipeUnboot) 71 | export(plCV) 72 | export(plGrid) 73 | export(plMonteCarlo) 74 | export(plNested) 75 | export(splitBalanced) 76 | export(splitBoost) 77 | export(splitBy) 78 | export(splitSample) 79 | export(splitStratify) 80 | export(testSet) 81 | export(trainingSet) 82 | export(validationSet) 83 | exportClasses(ExprsArray) 84 | exportClasses(ExprsBinary) 85 | exportClasses(ExprsEnsemble) 86 | exportClasses(ExprsMachine) 87 | exportClasses(ExprsModel) 88 | exportClasses(ExprsModule) 89 | exportClasses(ExprsMulti) 90 | exportClasses(ExprsPipeline) 91 | exportClasses(ExprsPredict) 92 | exportClasses(MultiPredict) 93 | exportClasses(RegrsArray) 94 | exportClasses(RegrsModel) 95 | exportClasses(RegrsPredict) 96 | exportMethods("$") 97 | exportMethods("[") 98 | exportMethods(buildEnsemble) 99 | exportMethods(calcStats) 100 | exportMethods(compare) 101 | exportMethods(conjoin) 102 | exportMethods(getFeatures) 103 | exportMethods(getWeights) 104 | exportMethods(modCluster) 105 | exportMethods(modSwap) 106 | exportMethods(plot) 107 | exportMethods(predict) 108 | exportMethods(show) 109 | 
exportMethods(subset) 110 | exportMethods(summary) 111 | importFrom(cluster,agnes) 112 | importFrom(cluster,clara) 113 | importFrom(cluster,diana) 114 | importFrom(cluster,fanny) 115 | importFrom(cluster,pam) 116 | importFrom(grDevices,rainbow) 117 | importFrom(graphics,boxplot) 118 | importFrom(graphics,layout) 119 | importFrom(graphics,plot) 120 | importFrom(lattice,cloud) 121 | importFrom(methods,as) 122 | importFrom(methods,new) 123 | importFrom(methods,show) 124 | importFrom(plyr,rbind.fill) 125 | importFrom(stats,anova) 126 | importFrom(stats,aov) 127 | importFrom(stats,as.formula) 128 | importFrom(stats,chisq.test) 129 | importFrom(stats,cor) 130 | importFrom(stats,cutree) 131 | importFrom(stats,density) 132 | importFrom(stats,dist) 133 | importFrom(stats,hclust) 134 | importFrom(stats,kmeans) 135 | importFrom(stats,ks.test) 136 | importFrom(stats,lm) 137 | importFrom(stats,median) 138 | importFrom(stats,na.omit) 139 | importFrom(stats,prcomp) 140 | importFrom(stats,qqnorm) 141 | importFrom(stats,quantile) 142 | importFrom(stats,rnorm) 143 | importFrom(stats,sd) 144 | importFrom(stats,t.test) 145 | importFrom(stats,var.test) 146 | importFrom(stats,wilcox.test) 147 | importFrom(utils,head) 148 | importFrom(utils,read.delim) 149 | importFrom(utils,tail) 150 | importFrom(utils,write.csv) 151 | importMethodsFrom(ROCR,plot) 152 | importMethodsFrom(kernlab,predict) 153 | -------------------------------------------------------------------------------- /R/1.1-classes.R: -------------------------------------------------------------------------------- 1 | #' An S4 class to store feature and annotation data 2 | #' 3 | #' @slot exprs A matrix. Stores the feature data. 4 | #' @slot annot A data.frame. Stores the annotation data. 5 | #' @slot preFilter Typically a list. Stores feature selection history. 6 | #' @slot reductionModel Typically a list. Stores dimension reduction history. 
7 | #' 8 | #' @seealso 9 | #' \code{\link{ExprsArray-class}}\cr 10 | #' \code{\link{ExprsModel-class}}\cr 11 | #' \code{\link{ExprsPipeline-class}}\cr 12 | #' \code{\link{ExprsEnsemble-class}}\cr 13 | #' \code{\link{ExprsPredict-class}}\cr 14 | #' \code{\link{MultiPredict-class}}\cr 15 | #' \code{\link{RegrsPredict-class}} 16 | #' @export 17 | setClass("ExprsArray", 18 | slots = c( 19 | exprs = "matrix", 20 | annot = "data.frame", 21 | preFilter = "ANY", 22 | reductionModel = "ANY" 23 | ) 24 | ) 25 | 26 | #' An S4 class to store feature and annotation data 27 | #' 28 | #' An \code{ExprsArray} sub-class for data with binary class outcomes. 29 | #' 30 | #' @export 31 | setClass("ExprsBinary", 32 | contains = "ExprsArray" 33 | ) 34 | 35 | #' An S4 class to store feature and annotation data 36 | #' 37 | #' An \code{ExprsArray} sub-class for data with multiple class outcomes. 38 | #' 39 | #' @export 40 | setClass("ExprsMulti", 41 | contains = "ExprsArray" 42 | ) 43 | 44 | #' An S4 class to store feature and annotation data 45 | #' 46 | #' An \code{ExprsArray} sub-class for data with continuous outcomes. 47 | #' 48 | #' @export 49 | setClass("RegrsArray", 50 | contains = "ExprsArray" 51 | ) 52 | 53 | #' An S4 class to store the model 54 | #' 55 | #' @slot preFilter Typically a list. Stores feature selection history. 56 | #' @slot reductionModel Typically a list. Stores dimension reduction history. 57 | #' @slot mach Typically an S4 class. Stores the model. 
58 | #' 59 | #' @seealso 60 | #' \code{\link{ExprsArray-class}}\cr 61 | #' \code{\link{ExprsModel-class}}\cr 62 | #' \code{\link{ExprsPipeline-class}}\cr 63 | #' \code{\link{ExprsEnsemble-class}}\cr 64 | #' \code{\link{ExprsPredict-class}}\cr 65 | #' \code{\link{MultiPredict-class}}\cr 66 | #' \code{\link{RegrsPredict-class}} 67 | #' @export 68 | setClass("ExprsModel", 69 | slots = c( 70 | preFilter = "ANY", 71 | reductionModel = "ANY", 72 | mach = "ANY" 73 | ) 74 | ) 75 | 76 | #' An S4 class to store the model 77 | #' 78 | #' An \code{ExprsModel} sub-class for binary classification models. 79 | #' 80 | #' @export 81 | setClass("ExprsMachine", 82 | contains = "ExprsModel" 83 | ) 84 | 85 | #' An S4 class to store the model 86 | #' 87 | #' An \code{ExprsModel} sub-class for multi-class classification models. 88 | #' 89 | #' @export 90 | setClass("ExprsModule", 91 | contains = "ExprsModel" 92 | ) 93 | 94 | #' An S4 class to store the model 95 | #' 96 | #' An \code{ExprsModel} sub-class for continuous outcome models. 97 | #' 98 | #' @export 99 | setClass("RegrsModel", 100 | contains = "ExprsModel" 101 | ) 102 | 103 | #' An S4 class to store models built during high-throughput learning 104 | #' 105 | #' @slot summary Typically a data.frame. Stores the parameters and 106 | #' performances for the models. 107 | #' @slot machs Typically a list. Stores the models 108 | #' referenced in \code{summary} slot. 109 | #' 110 | #' @seealso 111 | #' \code{\link{ExprsArray-class}}\cr 112 | #' \code{\link{ExprsModel-class}}\cr 113 | #' \code{\link{ExprsPipeline-class}}\cr 114 | #' \code{\link{ExprsEnsemble-class}}\cr 115 | #' \code{\link{ExprsPredict-class}}\cr 116 | #' \code{\link{MultiPredict-class}}\cr 117 | #' \code{\link{RegrsPredict-class}} 118 | #' @export 119 | setClass("ExprsPipeline", 120 | slots = c( 121 | summary = "ANY", 122 | machs = "ANY") 123 | ) 124 | 125 | #' An S4 class to store multiple models 126 | #' 127 | #' @slot machs Typically a list. Stores the models. 
128 | #' 129 | #' @seealso 130 | #' \code{\link{ExprsArray-class}}\cr 131 | #' \code{\link{ExprsModel-class}}\cr 132 | #' \code{\link{ExprsPipeline-class}}\cr 133 | #' \code{\link{ExprsEnsemble-class}}\cr 134 | #' \code{\link{ExprsPredict-class}}\cr 135 | #' \code{\link{MultiPredict-class}}\cr 136 | #' \code{\link{RegrsPredict-class}} 137 | #' @export 138 | setClass("ExprsEnsemble", 139 | slots = c( 140 | machs = "ANY" 141 | ) 142 | ) 143 | 144 | #' An S4 class to store model predictions 145 | #' 146 | #' @slot pred A factor. Stores class predictions as an unambiguous 147 | #' class assignment. 148 | #' @slot decision.values Typically a matrix. Stores class predictions 149 | #' as a decision value. 150 | #' @slot probability Typically a matrix. Stores class predictions 151 | #' as a probability. 152 | #' @slot actual Typically a factor. Stores known class labels. 153 | #' Used by \code{\link{calcStats}}. 154 | #' 155 | #' @seealso 156 | #' \code{\link{ExprsArray-class}}\cr 157 | #' \code{\link{ExprsModel-class}}\cr 158 | #' \code{\link{ExprsPipeline-class}}\cr 159 | #' \code{\link{ExprsEnsemble-class}}\cr 160 | #' \code{\link{ExprsPredict-class}}\cr 161 | #' \code{\link{MultiPredict-class}}\cr 162 | #' \code{\link{RegrsPredict-class}} 163 | #' @export 164 | setClass("ExprsPredict", 165 | slots = c( 166 | pred = "factor", 167 | decision.values = "ANY", 168 | probability = "ANY", 169 | actual = "ANY" 170 | ) 171 | ) 172 | 173 | #' An S4 class to store model predictions 174 | #' 175 | #' @slot pred Any. Stores predicted outcome. 176 | #' @slot actual Any. Stores actual outcome. 177 | #' Used by \code{\link{calcStats}}. 
178 | #' 179 | #' @seealso 180 | #' \code{\link{ExprsArray-class}}\cr 181 | #' \code{\link{ExprsModel-class}}\cr 182 | #' \code{\link{ExprsPipeline-class}}\cr 183 | #' \code{\link{ExprsEnsemble-class}}\cr 184 | #' \code{\link{ExprsPredict-class}}\cr 185 | #' \code{\link{MultiPredict-class}}\cr 186 | #' \code{\link{RegrsPredict-class}} 187 | #' @export 188 | setClass("MultiPredict", 189 | slots = c( 190 | pred = "ANY", 191 | actual = "ANY" 192 | ) 193 | ) 194 | 195 | #' An S4 class to store model predictions 196 | #' 197 | #' @slot pred Any. Stores predicted outcome. 198 | #' @slot actual Any. Stores actual outcome. 199 | #' Used by \code{\link{calcStats}}. 200 | #' 201 | #' @seealso 202 | #' \code{\link{ExprsArray-class}}\cr 203 | #' \code{\link{ExprsModel-class}}\cr 204 | #' \code{\link{ExprsPipeline-class}}\cr 205 | #' \code{\link{ExprsEnsemble-class}}\cr 206 | #' \code{\link{ExprsPredict-class}}\cr 207 | #' \code{\link{MultiPredict-class}}\cr 208 | #' \code{\link{RegrsPredict-class}} 209 | #' @export 210 | setClass("RegrsPredict", 211 | slots = c( 212 | pred = "ANY", 213 | actual = "ANY" 214 | ) 215 | ) 216 | -------------------------------------------------------------------------------- /R/2-conjoin.R: -------------------------------------------------------------------------------- 1 | #' Combine \code{exprso} Objects 2 | #' 3 | #' \code{conjoin} combines two or more \code{exprso} objects based on their class. 4 | #' 5 | #' When joining \code{ExprsArray} objects, this function returns one 6 | #' \code{ExprsArray} object as output. This only works on \code{ExprsArray} objects 7 | #' that have not undergone feature selection. Any missing annotations in \code{@annot} 8 | #' will get replaced with \code{NA} values. 9 | #' 10 | #' When joining \code{ExprsModel} or \code{ExprsEnsemble} objects, 11 | #' this function returns an ensemble. 12 | #' 13 | #' When joining \code{ExprsPipeline} objects, this function returns one 14 | #' \code{ExprsPipeline} object as output. 
To track which \code{ExprsPipeline} 15 | #' objects contributed to the resultant object, the source gets flagged 16 | #' with a \code{boot} column. If a pipeline already has a \code{boot} column, 17 | #' the original boot tracker will receive an offset (and the old \code{boot} 18 | #' column will get renamed to \code{unboot}). This system ensures that all 19 | #' models deriving from the same training set will get handled as a 20 | #' "pseudo-bootstrap" by downstream \code{\link{pipe}} functions. 21 | #' 22 | #' @param object Any \code{exprso} object. 23 | #' @param ... More objects of the same class. 24 | #' @return See Details. 25 | #' @export 26 | setGeneric("conjoin", 27 | function(object, ...) standardGeneric("conjoin") 28 | ) 29 | 30 | #' @describeIn conjoin Method to join \code{ExprsArray} objects. 31 | #' @export 32 | setMethod("conjoin", "ExprsArray", 33 | function(object, ...){ 34 | 35 | # Prepare list of objects 36 | args <- list(...) 37 | args <- append(list(object), args) 38 | 39 | if(!lequal(lapply(args, class))){ 40 | stop("All provided objects must have the same class.") 41 | } 42 | 43 | if(!lequal(lapply(args, function(o) rownames(o@exprs)))){ 44 | stop("All provided objects must have the same features (row names of @exprs).") 45 | } 46 | 47 | if(any(!sapply(args, function(e) is.null(e@preFilter) | is.null(e@reductionModel)))){ 48 | stop("All provided objects must not have undergone feature selection.") 49 | } 50 | 51 | # Prepare single matrix for @exprs and @annot each 52 | exprs <- as.matrix(do.call(cbind, lapply(args, function(a) a@exprs))) 53 | annot <- do.call(plyr::rbind.fill, lapply(args, function(a) a@annot)) 54 | rownames(annot) <- unlist(lapply(args, function(a) rownames(a@annot))) 55 | 56 | # Return single ExprsArray object 57 | new(class(object), exprs = exprs, annot = annot, 58 | preFilter = NULL, reductionModel = NULL) 59 | } 60 | ) 61 | 62 | #' @describeIn conjoin Method to join \code{ExprsModel} objects. 
63 | #' @export 64 | setMethod("conjoin", "ExprsModel", 65 | function(object, ...){ 66 | 67 | # Prepare list of objects 68 | args <- list(...) 69 | args <- append(list(object), args) 70 | 71 | if(!lequal(lapply(args, class))){ 72 | stop("All provided objects must have the same class.") 73 | } 74 | 75 | # Return single ExprsEnsemble object 76 | new("ExprsEnsemble", machs = args) 77 | } 78 | ) 79 | 80 | #' @describeIn conjoin Method to join \code{ExprsPipeline} objects. 81 | #' @export 82 | setMethod("conjoin", "ExprsPipeline", 83 | function(object, ...){ 84 | 85 | # Prepare list of objects 86 | args <- list(...) 87 | args <- append(list(object), args) 88 | 89 | if(!lequal(lapply(args, class))){ 90 | stop("All provided objects must have the same class.") 91 | } 92 | 93 | # Get @summary and @machs from list 94 | args.summary <- lapply(args, function(pl) pl@summary) 95 | args.machs <- lapply(args, function(pl) pl@machs) 96 | 97 | # When joining pl objects, treat each input as a new "boot" 98 | b <- 1 # <-- the conjoin boot counter 99 | pls <- lapply(args.summary, 100 | function(pl){ 101 | 102 | pl <- cbind("join" = 0, pl) # <-- set "join" as first column 103 | if(!"boot" %in% colnames(pl)){ 104 | 105 | # Add conjoin boot counter 106 | pl$join <- b 107 | b <<- b + 1 108 | 109 | }else{ 110 | 111 | # For each boot in $boot 112 | for(i in 1:length(unique(pl$boot))){ 113 | 114 | # Change each unique boot to conjoin boot counter 115 | pl$join[pl$boot == i] <- b 116 | b <<- b + 1 117 | } 118 | 119 | # Rename $boot to $unboot 120 | colnames(pl)[colnames(pl) == "boot"] <- "unboot" 121 | } 122 | 123 | # Rename $join to $boot 124 | colnames(pl)[colnames(pl) == "join"] <- "boot" 125 | 126 | return(pl) 127 | } 128 | ) 129 | 130 | # Return single ExprsPipeline object 131 | new("ExprsPipeline", summary = do.call(plyr::rbind.fill, pls), machs = unlist(args.machs)) 132 | } 133 | ) 134 | 135 | #' @describeIn conjoin Method to join \code{ExprsEnsemble} objects. 
136 | #' @export 137 | setMethod("conjoin", "ExprsEnsemble", 138 | function(object, ...){ 139 | 140 | # Prepare list of objects 141 | args <- list(...) 142 | args <- append(list(object), args) 143 | 144 | if(!lequal(lapply(args, class))){ 145 | stop("All provided objects must have the same class.") 146 | } 147 | 148 | # Return single ExprsEnsemble object 149 | machs <- unlist(lapply(args, function(pl) pl@machs), recursive = TRUE) 150 | new("ExprsEnsemble", machs = as.list(machs)) 151 | } 152 | ) 153 | -------------------------------------------------------------------------------- /R/6.2-calc.R: -------------------------------------------------------------------------------- 1 | #' Calculate Model Performance 2 | #' 3 | #' \code{calcStats} calculates the performance of a deployed model. 4 | #' 5 | #' For classification, if the argument \code{aucSkip = FALSE} AND the \code{ExprsArray} 6 | #' object was an \code{ExprsBinary} object with at least one case and one control AND 7 | #' \code{ExprsPredict} contains a coherent \code{@@probability} slot, \code{calcStats} 8 | #' will calculate classifier performance using the area under the receiver operating 9 | #' characteristic (ROC) curve via the \code{ROCR} package. Otherwise, \code{calcStats} 10 | #' will calculate classifier performance traditionally using a confusion matrix. 11 | #' Note that accuracies calculated using \code{ROCR} may differ from those calculated 12 | #' using a confusion matrix because \code{ROCR} adjusts the discrimination threshold to 13 | #' optimize sensitivity and specificity. This threshold is automatically chosen as the 14 | #' point along the ROC curve that minimizes the Euclidean distance from the top-left corner (0, 1). 15 | #' 16 | #' For regression, accuracy is defined as the R-squared of the fitted regression. This 17 | #' ranges from 0 to 1 for use with \code{\link{pl}} and \code{\link{pipe}}. Note that 18 | #' the \code{aucSkip} and \code{plotSkip} arguments are ignored for regression. 
19 | #' 20 | #' @param object An \code{ExprsPredict} or \code{RegrsPredict} object. 21 | #' @param aucSkip A logical scalar. Toggles whether to calculate area under the 22 | #' receiver operating characteristic curve. See Details. 23 | #' @param plotSkip A logical scalar. Toggles whether to plot the receiver 24 | #' operating characteristic curve. See Details. 25 | #' @param verbose A logical scalar. Toggles whether to print the results 26 | #' of model performance to console. 27 | #' 28 | #' @return Returns a \code{data.frame} of performance metrics. 29 | #' 30 | #' @export 31 | setGeneric("calcStats", 32 | function(object, aucSkip = FALSE, plotSkip = FALSE, verbose = TRUE) standardGeneric("calcStats") 33 | ) 34 | 35 | #' @describeIn calcStats Method to calculate performance for classification models. 36 | #' @export 37 | setMethod("calcStats", "ExprsPredict", 38 | function(object, aucSkip, plotSkip, verbose){ 39 | 40 | if(all(c("Case", "Control") %in% object@actual) & !is.null(object@probability) & !aucSkip){ 41 | 42 | if(verbose) cat("Calculating accuracy based on optimal AUC cutoff...\n") 43 | 44 | # Find optimal cutoff based on distance from top-left corner 45 | p <- ROCR::prediction(object@probability[, "Case"], as.numeric(object@actual == "Case")) 46 | perf <- ROCR::performance(p, measure = "tpr", x.measure = "fpr") 47 | index <- which.min(sqrt((1 - perf@y.values[[1]])^2 + (0 - perf@x.values[[1]])^2)) 48 | if(!plotSkip) plot(perf, col = rainbow(10)) 49 | if(!plotSkip) graphics::points(perf@x.values[[1]][index], perf@y.values[[1]][index], col = "blue") 50 | 51 | # Get performance for optimal cutoff 52 | acc <- ROCR::performance(p, "acc")@y.values[[1]][index] 53 | sens <- ROCR::performance(p, "sens")@y.values[[1]][index] 54 | spec <- ROCR::performance(p, "spec")@y.values[[1]][index] 55 | prec <- ROCR::performance(p, "prec")@y.values[[1]][index] 56 | f1 <- 2 * (prec * sens) / (prec + sens) 57 | auc <- ROCR::performance(p, "auc")@y.values[[1]] 58 | 59 | df <- 
data.frame(acc, sens, spec, prec, f1, auc) 60 | df[is.na(df)] <- 0 61 | return(df) 62 | 63 | }else{ 64 | 65 | if(verbose) cat("Calculating accuracy without AUC support...\n") 66 | 67 | # Build confusion table from factor 68 | object@actual <- factor(object@actual, levels = c("Control", "Case")) 69 | table <- table("predicted" = object@pred, "actual" = object@actual) 70 | if(verbose){ 71 | cat("Classification confusion table:\n"); print(table) 72 | } 73 | 74 | tn <- table[1,1] 75 | fp <- table[2,1] 76 | fn <- table[1,2] 77 | tp <- table[2,2] 78 | 79 | acc <- (tp + tn) / (tp + tn + fp + fn) 80 | sens <- tp / (tp + fn) 81 | spec <- tn / (fp + tn) 82 | prec <- tp / (tp + fp) 83 | f1 <- 2 * (prec * sens) / (prec + sens) 84 | 85 | df <- data.frame(acc, sens, spec, prec, f1) 86 | df[is.na(df)] <- 0 87 | 88 | if(verbose){ 89 | # Confusion table was already printed above; print only the metrics here 90 | cat("Classifier model performance:\n"); print(df) 91 | } 92 | 93 | return(df) 94 | } 95 | } 96 | ) 97 | 98 | 99 | #' @describeIn calcStats Method to calculate performance for multi-class models. 100 | #' @export 101 | setMethod("calcStats", "MultiPredict", 102 | function(object, verbose){ 103 | 104 | mat <- table("predicted" = object@pred, "actual" = object@actual) # predicted as rows 105 | acc <- sum(diag(mat)) / sum(mat) 106 | prec <- diag(mat) / rowSums(mat) 107 | sens <- diag(mat) / colSums(mat) 108 | df <- data.frame(acc, "sens" = t(sens), "prec" = t(prec)) # ensures 1 row 109 | df[is.na(df)] <- 0 110 | 111 | if(verbose){ 112 | cat("Classification confusion table:\n"); print(mat) 113 | cat("Classifier model performance:\n"); print(df) 114 | } 115 | 116 | return(df) 117 | } 118 | ) 119 | 120 | #' @describeIn calcStats Method to calculate performance for continuous outcome models. 
121 | #' @export 122 | setMethod("calcStats", "RegrsPredict", 123 | function(object, verbose){ 124 | 125 | mse <- mean((object@pred - object@actual)^2) 126 | rmse <- sqrt(mse) 127 | mae <- mean(abs(object@pred - object@actual)) 128 | cor <- stats::cor(object@pred, object@actual, method = "pearson") 129 | R2 <- cor^2 130 | acc <- R2 131 | df <- data.frame(acc, mse, rmse, mae, cor, R2) 132 | df[is.na(df)] <- 0 133 | 134 | if(verbose){ 135 | cat("Regression model performance:\n"); print(df) 136 | } 137 | 138 | return(df) 139 | } 140 | ) 141 | -------------------------------------------------------------------------------- /R/7.1-plCV.R: -------------------------------------------------------------------------------- 1 | #' Perform Simple Cross-Validation 2 | #' 3 | #' Calculates v-fold or leave-one-out cross-validation without selecting a new 4 | #' set of features with each fold. See Details. 5 | #' 6 | #' \code{plCV} performs v-fold or leave-one-out cross-validation. The argument 7 | #' \code{fold} specifies the number of v-folds to use during cross-validation. 8 | #' Set \code{fold = 0} to perform leave-one-out cross-validation. 9 | #' 10 | #' This type of cross-validation is most appropriate if the data 11 | #' has not undergone any prior feature selection. However, it is also useful 12 | #' as an unbiased guide to parameter selection within another 13 | #' \code{\link{pl}} workflow. 14 | #' 15 | #' Users should never need to call this function directly. Instead, they 16 | #' should use \code{\link{plMonteCarlo}} or \code{\link{plNested}}. 17 | #' There, \code{plCV} handles inner-fold cross-validation. 18 | #' 19 | #' @param array Specifies the \code{ExprsArray} object to undergo cross-validation. 20 | #' @inheritParams plGrid 21 | #' @return The average inner-fold cross-validation accuracy. 22 | #' @export 23 | plCV <- function(array, top, how, fold, aucSkip, plCV.acc, ...){ 24 | 25 | args.how <- getArgs(...) 
26 | 27 | # Perform LOOCV if 0 fold 28 | if(fold == 0) fold <- nsamps(array) 29 | if(fold > nsamps(array)){ 30 | 31 | warning("Insufficient subjects for plCV v-fold cross-validation. Performing LOOCV instead.") 32 | fold <- nsamps(array) 33 | } 34 | 35 | # Add the ith subject ID to the vth fold 36 | ids <- sample(rownames(array@annot)) 37 | splits <- suppressWarnings(split(ids, 1:fold)) # warns that some splits are bigger than others 38 | 39 | # Build a machine against the vth fold 40 | accs <- vector("numeric", fold) 41 | for(v in 1:length(splits)){ 42 | 43 | holdout <- colnames(array@exprs) %in% splits[[v]] 44 | array.train <- array[!holdout,] 45 | array.valid <- array[holdout,] 46 | 47 | # Build machine and deploy 48 | args.v <- append(list("object" = array.train, "top" = top), args.how) 49 | mach.v <- do.call(what = how, args = args.v) 50 | pred.v <- predict(mach.v, array.valid, verbose = FALSE) 51 | perfs <- calcStats(pred.v, aucSkip = aucSkip, plotSkip = TRUE, verbose = FALSE) 52 | if(plCV.acc %in% colnames(perfs)){ 53 | accs[v] <- perfs[, plCV.acc] 54 | }else{ 55 | warning("plCV.acc not available: using plCV.acc = 'acc' instead.") 56 | accs[v] <- perfs[, "acc"] 57 | } 58 | } 59 | 60 | acc <- mean(accs) 61 | 62 | return(acc) 63 | } 64 | -------------------------------------------------------------------------------- /R/7.2-plGrid.R: -------------------------------------------------------------------------------- 1 | #' Perform High-Throughput Machine Learning 2 | #' 3 | #' Trains and deploys models across a vast parameter search space. 4 | #' 5 | #' \code{plGrid} will \code{\link{build}} and \code{\link{exprso-predict}} for 6 | #' each combination of parameters provided as additional arguments (\code{...}). 7 | #' When using \code{plGrid}, supplying a numeric vector as the \code{top} 8 | #' argument will train and deploy a model of each mentioned size for 9 | #' each combination of parameters provided. 
10 | #' 11 | #' To skip validation set prediction, use \code{array.valid = NULL}. 12 | #' Either way, this function returns an \code{\link{ExprsPipeline-class}} 13 | #' object which contains a summary of the build parameters and the models 14 | #' themselves. The argument \code{fold} controls inner-fold 15 | #' cross-validation via \code{\link{plCV}}. Use this to 16 | #' select the best model unbiasedly. 17 | #' 18 | #' @param array.train The \code{ExprsArray} object to use as training set. 19 | #' @param array.valid The \code{ExprsArray} object to use as validation set. 20 | #' @param how A character string. The \code{\link{build}} method to iterate. 21 | #' @param top A numeric scalar or character vector. A numeric scalar indicates 22 | #' the number of top features that should undergo feature selection. A character vector 23 | #' indicates specifically which features by name should undergo feature selection. 24 | #' Set \code{top = 0} to include all features. Note that providing a numeric vector 25 | #' for the \code{top} argument will have \code{plGrid} search across multiple 26 | #' top features. However, by providing a list of numeric vectors as the \code{top} 27 | #' argument, the user can force the default handling of numeric vectors. 28 | #' @param fold A numeric scalar. The number of folds for cross-validation. 29 | #' Set \code{fold = 0} to perform leave-one-out cross-validation. Argument passed 30 | #' to \code{\link{plCV}}. Set \code{fold = NULL} to skip cross-validation altogether. 31 | #' @param aucSkip A logical scalar. Argument passed to \code{\link{calcStats}}. 32 | #' @param plCV.acc A string. The performance metric to use. For example, 33 | #' choose from "acc", "sens", "spec", "prec", "f1", "auc", or any of the 34 | #' regression specific measures. Argument passed to \code{\link{plCV}}. 35 | #' @param verbose A logical scalar. Toggles whether to print to console. 36 | #' @param ... Arguments passed to the \code{how} method. 
Unlike the \code{build} method, 37 | #' \code{plGrid} allows multiple parameters for each argument, supplied as a vector. 38 | #' See Details. 39 | #' @return An \code{\link{ExprsPipeline-class}} object. 40 | #' @export 41 | plGrid <- function(array.train, array.valid = NULL, how, top = 0, fold = 10, 42 | aucSkip = FALSE, plCV.acc = "acc", 43 | verbose = FALSE, ...){ 44 | 45 | if(missing(how)){ 46 | 47 | stop("Uh oh! You must provide a valid build method for the 'how' argument.") 48 | } 49 | 50 | # For each gridpoint in grid 51 | grid <- makeGridFromArgs(array.train = array.train, top = top, how = how, ...) 52 | grid <- grid[, !colnames(grid) %in% "plotSkip", drop = FALSE] 53 | statistics <- vector("list", nrow(grid)) 54 | models <- vector("list", nrow(grid)) 55 | for(i in 1:nrow(grid)){ 56 | 57 | if(verbose){ 58 | cat("Now building machine at gridpoint:\n") 59 | print(grid[i, , drop = FALSE]) 60 | } 61 | 62 | # Format gridpoint args to pass along to build do.call 63 | args <- append(list("object" = array.train), as.list(grid[i, , drop = FALSE])) 64 | 65 | # Build and save model 66 | args <- lapply(args, unlist) 67 | model <- do.call(what = how, args = args[!is.na(args)]) 68 | models[[i]] <- model 69 | 70 | # Predict class labels using the provided training set and calculate accuracy 71 | pred.train <- predict(model, array.train, verbose = verbose) 72 | stats <- calcStats(pred.train, aucSkip = aucSkip, plotSkip = TRUE, verbose = FALSE) 73 | colnames(stats) <- paste0("train.", colnames(stats)) 74 | acc <- stats 75 | 76 | # If a validation set is provided 77 | if(!is.null(array.valid)){ 78 | 79 | # Predict class labels using the provided validation set and calculate accuracy 80 | pred.valid <- predict(model, array.valid, verbose = verbose) 81 | stats <- calcStats(pred.valid, aucSkip = aucSkip, plotSkip = TRUE, verbose = FALSE) 82 | colnames(stats) <- paste0("valid.", colnames(stats)) 83 | acc <- data.frame(acc, stats) 84 | } 85 | 86 | # If 'fold' argument is provided 
87 | if(!is.null(fold)){ 88 | 89 | # Perform leave-one-out or v-fold cross-validation 90 | args <- append(list("how" = how, "fold" = fold, "aucSkip" = aucSkip, "plCV.acc" = plCV.acc), args) 91 | names(args)[names(args) == "object"] <- "array" 92 | cv <- do.call(what = plCV, args = args[!is.na(args)]) 93 | acc <- data.frame("fold" = fold, "train.plCV" = cv, acc) 94 | } 95 | 96 | # Save summary statistics 97 | statistics[[i]] <- data.frame("build" = how, grid[i, , drop = FALSE], acc) 98 | } 99 | 100 | pl <- new("ExprsPipeline", 101 | summary = do.call(rbind, statistics), 102 | machs = models 103 | ) 104 | 105 | return(pl) 106 | } 107 | -------------------------------------------------------------------------------- /R/8.2-ens.R: -------------------------------------------------------------------------------- 1 | #' Build Ensemble 2 | #' 3 | #' \code{buildEnsemble} builds an ensemble from \code{ExprsModel} or 4 | #' \code{ExprsPipeline} objects. See Details. 5 | #' 6 | #' This function can combine any number of model objects into an ensemble. 7 | #' These models do not necessarily have to derive from the same \code{build} 8 | #' method. In this way, it works like \code{\link{conjoin}}. 9 | #' 10 | #' This function can also build an ensemble from pipeline objects. It does 11 | #' this by calling \code{\link{pipeFilter}}, then joining the remaining models 12 | #' into an ensemble. As an adjunct to this method, consider first combining 13 | #' multiple pipeline objects with \code{\link{conjoin}}. 14 | #' 15 | #' @inheritParams pipeFilter 16 | #' @param ... Additional \code{ExprsModel} objects to use in the ensemble. 17 | #' Argument applies to the \code{\link{ExprsModel-class}} method only. 18 | #' @return An \code{\link{ExprsEnsemble-class}} object. 19 | #' @export 20 | setGeneric("buildEnsemble", 21 | function(object, ...) standardGeneric("buildEnsemble") 22 | ) 23 | 24 | #' @describeIn buildEnsemble Method to build ensemble from \code{ExprsModel} objects. 
25 | #' @export 26 | setMethod("buildEnsemble", "ExprsModel", 27 | function(object, ...){ # args to include additional ExprsMachine objects 28 | 29 | conjoin(object, ...) 30 | } 31 | ) 32 | 33 | #' @describeIn buildEnsemble Method to build ensemble from \code{ExprsPipeline} objects. 34 | #' @export 35 | setMethod("buildEnsemble", "ExprsPipeline", 36 | function(object, colBy = 0, how = 0, gate = 0, top = 0){ 37 | 38 | object <- pipeFilter(object, colBy = colBy, how = how, gate = gate, top = top) 39 | new("ExprsEnsemble", machs = unlist(object@machs)) 40 | } 41 | ) 42 | 43 | #' @rdname exprso-predict 44 | #' 45 | #' @details 46 | #' For regression ensembles, the average outcome is reported. For multi-class 47 | #' classifier ensembles, the majority vote is reported. For binary classifier 48 | #' ensembles, the majority vote or probability-weighted vote is reported. 49 | #' For probability-weighted voting, the average "Case" probability is 50 | #' compared against a 0.5 threshold. All ties are broken randomly. 51 | #' 52 | #' @param how A string. Describes how the ensemble decides. By default, it 53 | #' uses "majority" voting. However, the user can select "probability" voting 54 | #' for binary classifier ensembles.
55 | #' 56 | #' @export 57 | setMethod("predict", "ExprsEnsemble", 58 | function(object, array, how = "majority", verbose = TRUE){ 59 | 60 | # Deploy each machine in @machs on the provided ExprsArray 61 | results <- lapply(object@machs, function(mach) predict(mach, array, verbose = verbose)) 62 | 63 | if(class(array) == "ExprsBinary"){ 64 | 65 | if(how == "majority"){ 66 | 67 | # Majority vote on best outcome 68 | votes <- data.frame(lapply(results, function(result) result@pred)) 69 | pred <- apply(votes, 1, function(x) names(table(x))[nnet::which.is.max(table(x))]) 70 | 71 | # Clean up pred 72 | pred <- factor(pred, levels = c("Control", "Case")) 73 | px <- NULL 74 | dv <- NULL 75 | 76 | }else if(how == "probability"){ 77 | 78 | # Get average probabilities 79 | pxs <- lapply(results, function(result) result@probability) 80 | Case <- rowMeans(data.frame(lapply(pxs, function(px) px[, "Case"]))) 81 | px <- data.frame("Control" = 1 - Case, "Case" = Case) 82 | 83 | # Get 'decision.values' from 'probabilities' using inverse Platt scaling 84 | dv <- as.matrix(log(1 / (1 - px[, "Case"]) - 1)) 85 | colnames(dv) <- "Case/Control" 86 | 87 | # Vote on best outcome based on average probability; draw once per exact tie 88 | pred <- ifelse(px$Case > .5, "Case", "Control") 89 | pred[px$Case == .5] <- sample(c("Case", "Control"), size = sum(px$Case == .5), replace = TRUE) 90 | 91 | # Clean up pred 92 | pred <- factor(pred, levels = c("Control", "Case")) 93 | 94 | }else{ 95 | 96 | stop("Provided 'how' not recognized.") 97 | } 98 | 99 | final <- new("ExprsPredict", 100 | pred = pred, decision.values = dv, probability = px, 101 | actual = array$defineCase) 102 | 103 | }else if(class(array) == "ExprsMulti"){ 104 | 105 | # Majority vote on best outcome 106 | votes <- data.frame(lapply(results, function(result) result@pred)) 107 | pred <- apply(votes, 1, function(x) names(table(x))[nnet::which.is.max(table(x))]) 108 | 109 | # Clean up pred 110 | pred <- factor(pred, levels = levels(array$defineCase)) 111 | final <- new("MultiPredict", pred =
pred, actual = array$defineCase) 112 | 113 | }else if(class(array) == "RegrsArray"){ 114 | 115 | # Average the predictions 116 | pred <- colMeans(do.call("rbind", lapply(results, function(x) x@pred))) 117 | final <- new("RegrsPredict", pred = pred, actual = array$defineCase) 118 | } 119 | 120 | if(verbose){ 121 | cat("Ensemble classifier performance:\n") 122 | print(calcStats(final, aucSkip = TRUE, plotSkip = TRUE)) 123 | } 124 | 125 | return(final) 126 | } 127 | ) 128 | -------------------------------------------------------------------------------- /README.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | output: 3 | md_document: 4 | variant: markdown_github 5 | --- 6 | 7 | 8 | 9 | ## Quick start 10 | 11 | Welcome to the `exprso` GitHub page! Let's get started. 12 | 13 | ```{r, eval = FALSE} 14 | library(devtools) 15 | devtools::install_github("tpq/exprso") 16 | library(exprso) 17 | ``` 18 | 19 | ```{r, echo = FALSE, message = FALSE} 20 | library(exprso) 21 | set.seed(1) 22 | ``` 23 | 24 | ## Importing data 25 | 26 | To import data, we use the `exprso` function. This function has two arguments. 27 | 28 | ```{r} 29 | data(iris) 30 | array <- exprso(iris[1:80, 1:4], iris[1:80, 5]) 31 | ``` 32 | 33 | ## Pre-processing data 34 | 35 | Functions with a `mod` prefix pre-process the data. 36 | 37 | ```{r} 38 | array <- modTransform(array) 39 | array <- modNormalize(array, c(1, 2)) 40 | ``` 41 | 42 | ## Split data 43 | 44 | Functions with a `split` prefix split the data into training and test sets. 45 | 46 | ```{r} 47 | arrays <- splitSample(array, percent.include = 67) 48 | array.train <- arrays$array.train 49 | array.test <- arrays$array.valid 50 | ``` 51 | 52 | ## Select features 53 | 54 | Functions with a `fs` prefix select features. 55 | 56 | ```{r} 57 | array.train <- fsStats(array.train, top = 0, how = "t.test") 58 | ``` 59 | 60 | ## Build models 61 | 62 | Functions with a `build` prefix build models. 
63 | 64 | ```{r} 65 | mach <- buildSVM(array.train, 66 | top = 50, 67 | kernel = "linear", 68 | cost = 1) 69 | pred <- predict(mach, array.train) 70 | pred <- predict(mach, array.test) 71 | ``` 72 | 73 | ```{r, eval = FALSE} 74 | calcStats(pred) 75 | ``` 76 | 77 | ## Deploy pipelines 78 | 79 | Functions with a `pl` prefix deploy high-throughput learning pipelines. 80 | 81 | ```{r, results = "hide"} 82 | pl <- plGrid(array.train, 83 | array.test, 84 | how = "buildSVM", 85 | top = c(2, 4), 86 | kernel = "linear", 87 | cost = 10^(-3:3), 88 | fold = NULL) 89 | ``` 90 | 91 | ```{r} 92 | pl 93 | ``` 94 | 95 | Read the exprso vignettes for more details. 96 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | Quick start 3 | ----------- 4 | 5 | Welcome to the `exprso` GitHub page! Let's get started. 6 | 7 | ``` r 8 | library(devtools) 9 | devtools::install_github("tpq/exprso") 10 | library(exprso) 11 | ``` 12 | 13 | Importing data 14 | -------------- 15 | 16 | To import data, we use the `exprso` function. This function has two arguments. 17 | 18 | ``` r 19 | data(iris) 20 | array <- exprso(iris[1:80, 1:4], iris[1:80, 5]) 21 | ``` 22 | 23 | ## [1] "Preparing data for binary classification." 24 | 25 | Pre-processing data 26 | ------------------- 27 | 28 | Functions with a `mod` prefix pre-process the data. 29 | 30 | ``` r 31 | array <- modTransform(array) 32 | array <- modNormalize(array, c(1, 2)) 33 | ``` 34 | 35 | Split data 36 | ---------- 37 | 38 | Functions with a `split` prefix split the data into training and test sets. 39 | 40 | ``` r 41 | arrays <- splitSample(array, percent.include = 67) 42 | array.train <- arrays$array.train 43 | array.test <- arrays$array.valid 44 | ``` 45 | 46 | Select features 47 | --------------- 48 | 49 | Functions with a `fs` prefix select features. 
50 | 51 | ``` r 52 | array.train <- fsStats(array.train, top = 0, how = "t.test") 53 | ``` 54 | 55 | Build models 56 | ------------ 57 | 58 | Functions with a `build` prefix build models. 59 | 60 | ``` r 61 | mach <- buildSVM(array.train, 62 | top = 50, 63 | kernel = "linear", 64 | cost = 1) 65 | ``` 66 | 67 | ## Setting probability to TRUE (forced behavior, cannot override)... 68 | ## Setting cross to 0 (forced behavior, cannot override)... 69 | 70 | ``` r 71 | pred <- predict(mach, array.train) 72 | ``` 73 | 74 | ## Individual classifier performance: 75 | ## Arguments not provided in an ROCR AUC format. Calculating accuracy outside of ROCR... 76 | ## Classification confusion table: 77 | ## actual 78 | ## predicted Control Case 79 | ## Control 29 0 80 | ## Case 0 25 81 | ## acc sens spec 82 | ## 1 1 1 1 83 | 84 | ``` r 85 | pred <- predict(mach, array.test) 86 | ``` 87 | 88 | ## Individual classifier performance: 89 | ## Arguments not provided in an ROCR AUC format. Calculating accuracy outside of ROCR... 90 | ## Classification confusion table: 91 | ## actual 92 | ## predicted Control Case 93 | ## Control 21 0 94 | ## Case 0 5 95 | ## acc sens spec 96 | ## 1 1 1 1 97 | 98 | ``` r 99 | calcStats(pred) 100 | ``` 101 | 102 | Deploy pipelines 103 | ---------------- 104 | 105 | Functions with a `pl` prefix deploy high-throughput learning pipelines. 
106 | 107 | ``` r 108 | pl <- plGrid(array.train, 109 | array.test, 110 | how = "buildSVM", 111 | top = c(2, 4), 112 | kernel = "linear", 113 | cost = 10^(-3:3), 114 | fold = NULL) 115 | ``` 116 | 117 | ``` r 118 | pl 119 | ``` 120 | 121 | ## Accuracy summary (complete summary stored in @summary slot): 122 | ## 123 | ## build top kernel cost train.acc train.sens train.spec train.auc 124 | ## 1 buildSVM 2 linear 0.001 0.537037 0 1 0 125 | ## 2 buildSVM 4 linear 0.001 0.537037 0 1 0 126 | ## 3 buildSVM 2 linear 0.010 1.000000 1 1 1 127 | ## 4 buildSVM 4 linear 0.010 1.000000 1 1 1 128 | ## valid.acc valid.sens valid.spec valid.auc 129 | ## 1 0.8076923 0 1 0 130 | ## 2 0.8076923 0 1 0 131 | ## 3 1.0000000 1 1 1 132 | ## 4 1.0000000 1 1 1 133 | ## ... 134 | ## build top kernel cost train.acc train.sens train.spec train.auc 135 | ## 11 buildSVM 2 linear 100 1 1 1 1 136 | ## 12 buildSVM 4 linear 100 1 1 1 1 137 | ## 13 buildSVM 2 linear 1000 1 1 1 1 138 | ## 14 buildSVM 4 linear 1000 1 1 1 1 139 | ## valid.acc valid.sens valid.spec valid.auc 140 | ## 11 1 1 1 1 141 | ## 12 1 1 1 1 142 | ## 13 1 1 1 1 143 | ## 14 1 1 1 1 144 | ## 145 | ## Machine summary (all machines stored in @machs slot): 146 | ## 147 | ## ##Number of classes: 2 148 | ## @preFilter summary: 4 2 149 | ## @reductionModel summary: logical logical 150 | ## @mach class: svm.formula svm 151 | ## ... 152 | ## ##Number of classes: 2 153 | ## @preFilter summary: 4 4 154 | ## @reductionModel summary: logical logical 155 | ## @mach class: svm.formula svm 156 | 157 | Read the exprso vignettes for more details. 
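
The pipeline summary above has 14 rows because `plGrid` crosses each value of `top` with each value of `cost`. As a rough illustration, the same layout can be rebuilt in base R. This sketch assumes `expand.grid` as a stand-in and is not the package's own `makeGridFromArgs` code; the object name `grid` is hypothetical.

```r
# Illustrative sketch only: rebuild the parameter grid from the plGrid call
# above using base R. expand.grid is an assumption chosen for illustration,
# not the package's internal implementation.
grid <- expand.grid(
  top = c(2, 4),
  kernel = "linear",
  cost = 10^(-3:3),
  stringsAsFactors = FALSE
)

# 2 values of top x 1 kernel x 7 values of cost
nrow(grid)

# The first argument varies fastest, matching the row order in the summary
head(grid, 4)
```

Each row of such a grid corresponds to one model built and evaluated by the pipeline, so the grid size grows multiplicatively with every additional vector argument.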
158 | -------------------------------------------------------------------------------- /TODO.md: -------------------------------------------------------------------------------- 1 | ## Works in progress 2 | --------------------- 3 | * `regrso` expansion: 4 | * [] Add `RegrsArray` class 5 | * [] Add `RegrsModel` class 6 | * [] Watch out for `x[,,drop=TRUE]` errors 7 | * [] Make ExprsBinary $defineCase a factor? 8 | * This change alone crashes downstream code 9 | * [x] Make `calcStats` tidy 10 | * [x] Add 2D plot 11 | * [x] Rename `getProbeSet` function. 12 | * [x] Rename `probes` as `top` 13 | * [x] Rename `top.N` as `top` 14 | 15 | * `ExprsMulti` expansion: 16 | * [x] Fix `splitStratify` for multi-class 17 | * [x] Fix `compare` for multi-class 18 | * [x] Implement all build_ ExprsMulti methods 19 | * Uses `doMulti` for 1-vs-all classification 20 | * Add `ExprsModule` predict method 21 | * Add multi-class `calcStats` 22 | 23 | * `pl` expansion: 24 | * [x] Consider "random Plains" wrapper 25 | * [] Consider `plRFE` with embedded RFE 26 | * [x] `plGridMulti` 27 | * Performs fs before each 1-vs-all build 28 | * [x] Disable ROC plotting during high-throughput `pl` 29 | * [x] `plGridPlus` with "better" plCV? 30 | * Achieved with plNested(plNested) method 31 | 32 | * `fs` expansion: 33 | * [x] Remove `doMulti` fs methods? 
34 | * [] F-test 35 | * [x] ANOVA 36 | * Add as a true multi-class method 37 | 38 | * `build` expansion: 39 | * [] Logistic regression 40 | * [] Democratic SVM 41 | * [] KNN 42 | -------------------------------------------------------------------------------- /cran-comments.md: -------------------------------------------------------------------------------- 1 | ## Test environments 2 | * local ubuntu 16.04, R 3.3.1 3 | * local ubuntu 18.04, R 3.3.1 4 | * win-builder (devel and release) 5 | 6 | ## R CMD check results 7 | 8 | 0 errors | 0 warnings | 1 note 9 | 10 | * Possibly mis-spelled words in DESCRIPTION: 11 | 12 | I have reviewed the DESCRIPTION and attest it does not contain any mis-spellings. 13 | 14 | ## Reverse dependencies 15 | 16 | No reverse dependencies. 17 | 27 | -------------------------------------------------------------------------------- /data-raw/data.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | set.seed(1235) # changing seed may break tests! 
3 | 4 | df.a <- data.frame( 5 | "id" = 1:10, 6 | "class" = rep("a", 10), 7 | "sex" = c(rep("M", 5), rep("F", 5)), 8 | "feat1" = rnorm(10, mean = 10, sd = 1), 9 | "feat2" = rnorm(10, mean = 20, sd = 5), 10 | "feat3" = rnorm(10, mean = 5, sd = 1), 11 | "feat4" = rnorm(10, mean = 40, sd = 1) 12 | ) 13 | 14 | df.b <- data.frame( 15 | "id" = 11:30, 16 | "class" = rep("b", 20), 17 | "sex" = c(rep("M", 10), rep("F", 10)), 18 | "feat1" = rnorm(20, mean = 20, sd = 5), 19 | "feat2" = rnorm(20, mean = 10, sd = 1), 20 | "feat3" = rnorm(20, mean = 5, sd = 1), 21 | "feat4" = rnorm(20, mean = 20, sd = 1) 22 | ) 23 | 24 | df.c <- data.frame( 25 | "id" = 31:40, 26 | "class" = rep("c", 10), 27 | "sex" = c(rep("M", 3), rep("F", 7)), 28 | "feat1" = rnorm(10, mean = 15, sd = 3), 29 | "feat2" = rnorm(10, mean = 15, sd = 3), 30 | "feat3" = rnorm(10, mean = 5, sd = 1), 31 | "feat4" = rnorm(10, mean = 30, sd = 1) 32 | ) 33 | 34 | df <- do.call(rbind, list(df.a, df.b, df.c)) 35 | 36 | tempFile <- tempfile() 37 | write.table(df, file = tempFile, sep = "\t") 38 | 39 | array <- 40 | arrayExprs(tempFile, begin = 4, colID = "id", colBy = "class", 41 | include = list("a", "b")) 42 | 43 | arrayMulti <- 44 | arrayExprs(tempFile, begin = 4, colID = "id", colBy = "class", 45 | include = list("a", "b", "c")) 46 | 47 | devtools::use_data(array, arrayMulti) 48 | test.file <- paste0(getwd(), "/tests/testthat/data.RData") 49 | save(array, arrayMulti, file = test.file) 50 | -------------------------------------------------------------------------------- /data-raw/makenew-build.R: -------------------------------------------------------------------------------- 1 | ### 2 | # Set up one data set each with binary, multi-class, and continuous outcomes 3 | ### 4 | 5 | library(exprso) 6 | data(iris) 7 | e1 <- exprso(iris[1:100, 1:4], iris[1:100, 5]) 8 | e1 9 | e2 <- e <- exprso(iris[, 1:4], iris[, 5]) 10 | e2 11 | e3 <- exprso(iris[, 1:3], iris[, 4]) 12 | e3 13 | 14 | ### 15 | # Bring workhorses into active environment
16 | ### 17 | 18 | fs. <- exprso:::fs. 19 | build. <- exprso:::build. 20 | getArgs <- exprso:::getArgs 21 | defaultArg <- exprso:::defaultArg 22 | classCheck <- exprso:::classCheck 23 | forceArg <- exprso:::forceArg 24 | 25 | ### 26 | # Create new module here 27 | ### 28 | 29 | #' Build Logistic Regression Model 30 | #' 31 | #' \code{buildLR} builds a model using the \code{glm} function. 32 | #' 33 | #' @inheritParams build. 34 | #' @return Returns an \code{ExprsModel} object. 35 | #' @export 36 | buildLR <- function(object, top = 0, ...){ # args to glm 37 | 38 | classCheck(object, c("ExprsBinary", "ExprsMulti"), 39 | "This build method only works for classification tasks.") 40 | 41 | build.(object, top, 42 | uniqueFx = function(data, labels, ...){ 43 | 44 | # Perform GLM via ~ method 45 | args <- getArgs(...) 46 | args <- forceArg("family", "binomial", args) 47 | df <- data.frame(data, "defineCase" = as.numeric(labels) - 1) 48 | args <- append(list("formula" = defineCase ~ ., "data" = df), args) 49 | do.call(stats::glm, args) 50 | }, ...) 
51 | } 52 | 53 | ### 54 | # Test new module here (pre-implementation) 55 | ### 56 | 57 | m1 <- buildLR(e1) 58 | m2 <- buildLR(e2) 59 | m3 <- buildLR(e3) 60 | 61 | predict(m1@mach, as.data.frame(t(e1@exprs))) 62 | predict(m3@mach, as.data.frame(t(e3@exprs))) 63 | 64 | class(m1@mach) 65 | class(m3@mach) 66 | 67 | ### 68 | # Test new module here (post-implementation) 69 | ### 70 | 71 | library(exprso) 72 | m1 <- buildLR(e1) 73 | m2 <- buildLR(e2) 74 | m3 <- buildLR(e3) 75 | predict(m1, e1) 76 | predict(m2, e2) 77 | predict(m3, e3) 78 | -------------------------------------------------------------------------------- /data-raw/r&d.R: -------------------------------------------------------------------------------- 1 | # Use this script to help make new fs and build methods 2 | data(array) 3 | data(arrayMulti) 4 | 5 | array2data <- function(object, top){ 6 | 7 | if(class(top) == "numeric"){ 8 | 9 | if(length(top) == 1){ 10 | 11 | if(top > nrow(object@exprs)) top <- 0 12 | if(top == 0) top <- nrow(object@exprs) 13 | top <- rownames(object@exprs[1:top, ]) 14 | 15 | }else{ 16 | 17 | top <- rownames(object@exprs[top, ]) 18 | } 19 | } 20 | 21 | t(object@exprs[top, ]) 22 | } 23 | 24 | data <- array2data(array, top = 0) 25 | labels <- factor(array@annot[rownames(data), "defineCase"], levels = c("Control", "Case")) 26 | dataMulti <- array2data(arrayMulti, top = 0) 27 | 28 | # Use for new build method: 29 | uniqueFx <- function(data, labels, ...){ 30 | 31 | 32 | } 33 | 34 | # Use for new fs method: 35 | uniqueFx <- function(data, top, ...){ 36 | 37 | 38 | } 39 | -------------------------------------------------------------------------------- /data/array.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tpq/exprso/c4a0eb6412833abe216b61c6ca53737bc8f53c5b/data/array.rda -------------------------------------------------------------------------------- /data/arrayMulti.rda: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/tpq/exprso/c4a0eb6412833abe216b61c6ca53737bc8f53c5b/data/arrayMulti.rda -------------------------------------------------------------------------------- /exprso.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Package 19 | PackageUseDevtools: Yes 20 | PackageInstallArgs: --no-multiarch --with-keep.source 21 | PackageRoxygenize: rd,collate,namespace 22 | -------------------------------------------------------------------------------- /inst/CITATION: -------------------------------------------------------------------------------- 1 | citHeader("To cite exprso in publications use:") 2 | 3 | citEntry(entry = "Article", 4 | title = "exprso: an R-package for the rapid implementation of machine learning algorithms", 5 | author = personList(as.person("Thomas Quinn"), 6 | as.person("Daniel Tylee"), 7 | as.person("Stephen Glatt")), 8 | journal = "F1000Research", 9 | year = "2016", 10 | volume = "5", 11 | number = "2588", 12 | url = "http://f1000research.com/articles/5-2588/", 13 | 14 | textVersion = 15 | paste("Quinn T, Tylee D and Glatt S. 2016. 
exprso: an R-package for", 16 | "the rapid implementation of machine learning algorithms.", 17 | "F1000Research, 5:2588.", 18 | "URL http://f1000research.com/articles/5-2588/.") 19 | ) 20 | -------------------------------------------------------------------------------- /man/ExprsArray-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R, R/1.2-methods.R 3 | \docType{class} 4 | \name{ExprsArray-class} 5 | \alias{ExprsArray-class} 6 | \alias{show,ExprsArray-method} 7 | \alias{[,ExprsArray,ANY,ANY,ANY-method} 8 | \alias{[,ExprsArray-method} 9 | \alias{$,ExprsArray-method} 10 | \alias{subset,ExprsArray-method} 11 | \alias{plot,ExprsArray,missing-method} 12 | \alias{summary,ExprsArray-method} 13 | \alias{getFeatures,ExprsArray-method} 14 | \title{An S4 class to store feature and annotation data} 15 | \usage{ 16 | \S4method{show}{ExprsArray}(object) 17 | 18 | \S4method{[}{ExprsArray,ANY,ANY,ANY}(x, i, j) 19 | 20 | \S4method{$}{ExprsArray}(x, name) 21 | 22 | \S4method{subset}{ExprsArray}(x, subset, select) 23 | 24 | \S4method{plot}{ExprsArray,missing}(x, y, a = 1, b = 2, c = 3, ...) 25 | 26 | \S4method{summary}{ExprsArray}(object) 27 | 28 | \S4method{getFeatures}{ExprsArray}(object) 29 | } 30 | \arguments{ 31 | \item{object, x}{An object of class \code{ExprsArray}.} 32 | 33 | \item{i, j}{Subsets entire \code{ExprsArray} object via 34 | \code{object@annot[i, j]}. Returns \code{object@annot[, j]} if 35 | argument \code{i} is missing.} 36 | 37 | \item{name}{Returns \code{object@annot[, name]}.} 38 | 39 | \item{subset}{Subsets entire \code{ExprsArray} object via 40 | \code{object@annot[subset, ]}. Can be used to rearrange subject order.} 41 | 42 | \item{select}{Subsets entire \code{ExprsArray} object via 43 | \code{object@annot[, select]}. Can be used to rearrange label order.} 44 | 45 | \item{y}{Leave missing. 
Argument exists because of \code{\link{plot}} generic definition.} 46 | 47 | \item{a, b, c}{A numeric scalar. Indexes the first, second, and third dimensions to plot. 48 | Set \code{c = 0} to plot two dimensions.} 49 | 50 | \item{...}{Additional arguments passed to \code{plot} or \code{lattice::cloud}.} 51 | } 52 | \description{ 53 | An S4 class to store feature and annotation data 54 | } 55 | \section{Methods (by generic)}{ 56 | \itemize{ 57 | \item \code{show}: Method to show \code{ExprsArray} object. 58 | 59 | \item \code{[}: Method to subset \code{ExprsArray} object. 60 | 61 | \item \code{$}: Method to subset \code{ExprsArray} object. 62 | 63 | \item \code{subset}: Method to subset \code{ExprsArray} object. 64 | 65 | \item \code{plot}: Method to plot two or three dimensions of data. 66 | 67 | \item \code{summary}: Method to plot summary graphs for a sub-sample of feature data. 68 | 69 | \item \code{getFeatures}: Method to return features within an \code{ExprsArray} object. 70 | }} 71 | 72 | \section{Slots}{ 73 | 74 | \describe{ 75 | \item{\code{exprs}}{A matrix. Stores the feature data.} 76 | 77 | \item{\code{annot}}{A data.frame. Stores the annotation data.} 78 | 79 | \item{\code{preFilter}}{Typically a list. Stores feature selection history.} 80 | 81 | \item{\code{reductionModel}}{Typically a list.
Stores dimension reduction history.} 82 | }} 83 | 84 | \seealso{ 85 | \code{\link{ExprsArray-class}}\cr 86 | \code{\link{ExprsModel-class}}\cr 87 | \code{\link{ExprsPipeline-class}}\cr 88 | \code{\link{ExprsEnsemble-class}}\cr 89 | \code{\link{ExprsPredict-class}}\cr 90 | \code{\link{MultiPredict-class}}\cr 91 | \code{\link{RegrsPredict-class}} 92 | } 93 | -------------------------------------------------------------------------------- /man/ExprsBinary-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R 3 | \docType{class} 4 | \name{ExprsBinary-class} 5 | \alias{ExprsBinary-class} 6 | \title{An S4 class to store feature and annotation data} 7 | \description{ 8 | An \code{ExprsArray} sub-class for data with binary class outcomes. 9 | } 10 | -------------------------------------------------------------------------------- /man/ExprsEnsemble-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R, R/1.2-methods.R 3 | \docType{class} 4 | \name{ExprsEnsemble-class} 5 | \alias{ExprsEnsemble-class} 6 | \alias{show,ExprsEnsemble-method} 7 | \alias{getFeatures,ExprsEnsemble-method} 8 | \alias{getWeights,ExprsEnsemble-method} 9 | \title{An S4 class to store multiple models} 10 | \usage{ 11 | \S4method{show}{ExprsEnsemble}(object) 12 | 13 | \S4method{getFeatures}{ExprsEnsemble}(object, index) 14 | 15 | \S4method{getWeights}{ExprsEnsemble}(object, index, ...) 16 | } 17 | \arguments{ 18 | \item{object}{An \code{ExprsArray}, \code{ExprsModel}, \code{ExprsPipeline}, 19 | or \code{ExprsEnsemble} object.} 20 | 21 | \item{index}{A numeric scalar. The i-th model from which to retrieve features or weights. 
22 | If missing, function will tabulate features or weights across all models.} 23 | 24 | \item{...}{For \code{getWeights}, optional arguments passed to 25 | \code{glmnet::coef.cv.glmnet}.} 26 | } 27 | \description{ 28 | An S4 class to store multiple models 29 | } 30 | \section{Methods (by generic)}{ 31 | \itemize{ 32 | \item \code{show}: Method to show \code{ExprsEnsemble} object. 33 | 34 | \item \code{getFeatures}: Method to return features within an \code{ExprsEnsemble} model. 35 | 36 | \item \code{getWeights}: Method to return LASSO weights. 37 | }} 38 | 39 | \section{Slots}{ 40 | 41 | \describe{ 42 | \item{\code{machs}}{Typically a list. Stores the models.} 43 | }} 44 | 45 | \seealso{ 46 | \code{\link{ExprsArray-class}}\cr 47 | \code{\link{ExprsModel-class}}\cr 48 | \code{\link{ExprsPipeline-class}}\cr 49 | \code{\link{ExprsEnsemble-class}}\cr 50 | \code{\link{ExprsPredict-class}}\cr 51 | \code{\link{MultiPredict-class}}\cr 52 | \code{\link{RegrsPredict-class}} 53 | } 54 | -------------------------------------------------------------------------------- /man/ExprsMachine-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R 3 | \docType{class} 4 | \name{ExprsMachine-class} 5 | \alias{ExprsMachine-class} 6 | \title{An S4 class to store the model} 7 | \description{ 8 | An \code{ExprsModel} sub-class for binary classification models. 
9 | } 10 | -------------------------------------------------------------------------------- /man/ExprsModel-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R, R/1.2-methods.R 3 | \docType{class} 4 | \name{ExprsModel-class} 5 | \alias{ExprsModel-class} 6 | \alias{show,ExprsModel-method} 7 | \alias{getFeatures,ExprsModel-method} 8 | \alias{getWeights,ExprsModel-method} 9 | \title{An S4 class to store the model} 10 | \usage{ 11 | \S4method{show}{ExprsModel}(object) 12 | 13 | \S4method{getFeatures}{ExprsModel}(object) 14 | 15 | \S4method{getWeights}{ExprsModel}(object, ...) 16 | } 17 | \arguments{ 18 | \item{object}{An object of class \code{ExprsModel}.} 19 | 20 | \item{...}{For \code{getWeights}, optional arguments passed to 21 | \code{glmnet::coef.cv.glmnet}.} 22 | } 23 | \description{ 24 | An S4 class to store the model 25 | } 26 | \section{Methods (by generic)}{ 27 | \itemize{ 28 | \item \code{show}: Method to show \code{ExprsModel} object. 29 | 30 | \item \code{getFeatures}: Method to return features within an \code{ExprsModel} object. 31 | 32 | \item \code{getWeights}: Method to return LASSO weights. 33 | }} 34 | 35 | \section{Slots}{ 36 | 37 | \describe{ 38 | \item{\code{preFilter}}{Typically a list. Stores feature selection history.} 39 | 40 | \item{\code{reductionModel}}{Typically a list. Stores dimension reduction history.} 41 | 42 | \item{\code{mach}}{Typically an S4 class. 
Stores the model.} 43 | }} 44 | 45 | \seealso{ 46 | \code{\link{ExprsArray-class}}\cr 47 | \code{\link{ExprsModel-class}}\cr 48 | \code{\link{ExprsPipeline-class}}\cr 49 | \code{\link{ExprsEnsemble-class}}\cr 50 | \code{\link{ExprsPredict-class}}\cr 51 | \code{\link{MultiPredict-class}}\cr 52 | \code{\link{RegrsPredict-class}} 53 | } 54 | -------------------------------------------------------------------------------- /man/ExprsModule-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R 3 | \docType{class} 4 | \name{ExprsModule-class} 5 | \alias{ExprsModule-class} 6 | \title{An S4 class to store the model} 7 | \description{ 8 | An \code{ExprsModel} sub-class for multi-class classification models. 9 | } 10 | -------------------------------------------------------------------------------- /man/ExprsMulti-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R 3 | \docType{class} 4 | \name{ExprsMulti-class} 5 | \alias{ExprsMulti-class} 6 | \title{An S4 class to store feature and annotation data} 7 | \description{ 8 | An \code{ExprsArray} sub-class for data with multiple class outcomes. 
9 | } 10 | -------------------------------------------------------------------------------- /man/ExprsPipeline-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R, R/1.2-methods.R 3 | \docType{class} 4 | \name{ExprsPipeline-class} 5 | \alias{ExprsPipeline-class} 6 | \alias{show,ExprsPipeline-method} 7 | \alias{[,ExprsPipeline,ANY,ANY,ANY-method} 8 | \alias{[,ExprsPipeline-method} 9 | \alias{$,ExprsPipeline-method} 10 | \alias{subset,ExprsPipeline-method} 11 | \alias{summary,ExprsPipeline-method} 12 | \alias{getFeatures,ExprsPipeline-method} 13 | \alias{getWeights,ExprsPipeline-method} 14 | \title{An S4 class to store models built during high-throughput learning} 15 | \usage{ 16 | \S4method{show}{ExprsPipeline}(object) 17 | 18 | \S4method{[}{ExprsPipeline,ANY,ANY,ANY}(x, i, j) 19 | 20 | \S4method{$}{ExprsPipeline}(x, name) 21 | 22 | \S4method{subset}{ExprsPipeline}(x, subset, select) 23 | 24 | \S4method{summary}{ExprsPipeline}(object) 25 | 26 | \S4method{getFeatures}{ExprsPipeline}(object, index) 27 | 28 | \S4method{getWeights}{ExprsPipeline}(object, index, ...) 29 | } 30 | \arguments{ 31 | \item{object, x}{An object of class \code{ExprsPipeline}.} 32 | 33 | \item{i, j}{Subsets entire \code{ExprsPipeline} object via 34 | \code{object@summary[i, j]}. Returns \code{object@summary[, j]} if 35 | argument \code{i} is missing.} 36 | 37 | \item{name}{Returns \code{object@summary[, name]}.} 38 | 39 | \item{subset}{Subsets entire \code{ExprsPipeline} object via 40 | \code{object@summary[subset, ]}. Can be used to rearrange summary table.} 41 | 42 | \item{select}{Subsets entire \code{ExprsPipeline} object via 43 | \code{object@summary[, select]}. Can be used to rearrange summary table.} 44 | 45 | \item{index}{A numeric scalar. The i-th model from which to retrieve features or weights. 
46 | If missing, function will tabulate features or weights across all models.} 47 | 48 | \item{...}{For \code{getWeights}, optional arguments passed to 49 | \code{glmnet::coef.cv.glmnet}.} 50 | } 51 | \description{ 52 | An S4 class to store models built during high-throughput learning 53 | } 54 | \section{Methods (by generic)}{ 55 | \itemize{ 56 | \item \code{show}: Method to show \code{ExprsPipeline} object. 57 | 58 | \item \code{[}: Method to subset \code{ExprsPipeline} object. 59 | 60 | \item \code{$}: Method to subset \code{ExprsPipeline} object. 61 | 62 | \item \code{subset}: Method to subset \code{ExprsPipeline} object. 63 | 64 | \item \code{summary}: Method to summarize \code{ExprsPipeline} results. 65 | 66 | \item \code{getFeatures}: Method to return features within an \code{ExprsPipeline} model. 67 | 68 | \item \code{getWeights}: Method to return LASSO weights. 69 | }} 70 | 71 | \section{Slots}{ 72 | 73 | \describe{ 74 | \item{\code{summary}}{Typically a data.frame. Stores the parameters and 75 | performances for the models.} 76 | 77 | \item{\code{machs}}{Typically a list. 
Stores the models 78 | referenced in \code{summary} slot.} 79 | }} 80 | 81 | \seealso{ 82 | \code{\link{ExprsArray-class}}\cr 83 | \code{\link{ExprsModel-class}}\cr 84 | \code{\link{ExprsPipeline-class}}\cr 85 | \code{\link{ExprsEnsemble-class}}\cr 86 | \code{\link{ExprsPredict-class}}\cr 87 | \code{\link{MultiPredict-class}}\cr 88 | \code{\link{RegrsPredict-class}} 89 | } 90 | -------------------------------------------------------------------------------- /man/ExprsPredict-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R, R/1.2-methods.R 3 | \docType{class} 4 | \name{ExprsPredict-class} 5 | \alias{ExprsPredict-class} 6 | \alias{show,ExprsPredict-method} 7 | \title{An S4 class to store model predictions} 8 | \usage{ 9 | \S4method{show}{ExprsPredict}(object) 10 | } 11 | \arguments{ 12 | \item{object}{An object of class \code{ExprsPredict}.} 13 | } 14 | \description{ 15 | An S4 class to store model predictions 16 | } 17 | \section{Methods (by generic)}{ 18 | \itemize{ 19 | \item \code{show}: Method to show \code{ExprsPredict} object. 20 | }} 21 | 22 | \section{Slots}{ 23 | 24 | \describe{ 25 | \item{\code{pred}}{A factor. Stores class predictions as an unambiguous 26 | class assignment.} 27 | 28 | \item{\code{decision.values}}{Typically a matrix. Stores class predictions 29 | as a decision value.} 30 | 31 | \item{\code{probability}}{Typically a matrix. Stores class predictions 32 | as a probability.} 33 | 34 | \item{\code{actual}}{Typically a factor. Stores known class labels. 
35 | Used by \code{\link{calcStats}}.} 36 | }} 37 | 38 | \seealso{ 39 | \code{\link{ExprsArray-class}}\cr 40 | \code{\link{ExprsModel-class}}\cr 41 | \code{\link{ExprsPipeline-class}}\cr 42 | \code{\link{ExprsEnsemble-class}}\cr 43 | \code{\link{ExprsPredict-class}}\cr 44 | \code{\link{MultiPredict-class}}\cr 45 | \code{\link{RegrsPredict-class}} 46 | } 47 | -------------------------------------------------------------------------------- /man/GSE2eSet.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-deprecated.R 3 | \name{GSE2eSet} 4 | \alias{GSE2eSet} 5 | \title{Convert GSE to eSet} 6 | \usage{ 7 | GSE2eSet(gse, colBy, colID) 8 | } 9 | \arguments{ 10 | \item{gse}{A GSE data object retrieved using GEOquery.} 11 | 12 | \item{colBy}{A character string. The GSE column name that contains the feature value. 13 | If missing, function will prompt user for a column name after previewing options.} 14 | 15 | \item{colID}{A character string. The GSE column name that contains the feature identity. 16 | If missing, function will prompt user for a column name after previewing options.} 17 | } 18 | \value{ 19 | An \code{ExpressionSet} object. 20 | } 21 | \description{ 22 | A convenience function that builds an \code{eSet} object from a GSE data source. 23 | } 24 | \details{ 25 | The NCBI GEO hosts files in GSE or GDS format, the latter of which exists as a curated version 26 | of the former. These GDS data files easily convert to an \code{ExpressionSet} (abbreviated 27 | \code{eSet}) object using the \code{GDS2eSet} function available from the GEOquery package. 28 | However, not all GSE data files have a corresponding GDS data file available. To convert GSE 29 | data files into \code{eSet} objects, \code{exprso} provides this convenience function. 30 | 31 | However, the user should note that GSE data files do not always get stored in an easy-to-parse format. 
32 | Although this function has worked successfully with some GSE data files, we cannot make any 33 | guarantee that it will work for all GSE data files. 34 | 35 | To acquire GSE data files, use the function \code{getGEO} from the GEOquery package (e.g., 36 | \code{getGEO("GSExxxxx", GSEMatrix = FALSE)}). For more information, see the GEOquery package. 37 | } 38 | \seealso{ 39 | \code{\link{ExprsArray-class}}, \code{\link{arrayExprs}} 40 | } 41 | -------------------------------------------------------------------------------- /man/MultiPredict-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R, R/1.2-methods.R 3 | \docType{class} 4 | \name{MultiPredict-class} 5 | \alias{MultiPredict-class} 6 | \alias{show,MultiPredict-method} 7 | \title{An S4 class to store model predictions} 8 | \usage{ 9 | \S4method{show}{MultiPredict}(object) 10 | } 11 | \arguments{ 12 | \item{object}{An object of class \code{MultiPredict}.} 13 | } 14 | \description{ 15 | An S4 class to store model predictions 16 | } 17 | \section{Methods (by generic)}{ 18 | \itemize{ 19 | \item \code{show}: Method to show \code{MultiPredict} object. 20 | }} 21 | 22 | \section{Slots}{ 23 | 24 | \describe{ 25 | \item{\code{pred}}{Any. Stores predicted outcome.} 26 | 27 | \item{\code{actual}}{Any. Stores actual outcome. 
28 | Used by \code{\link{calcStats}}.} 29 | }} 30 | 31 | \seealso{ 32 | \code{\link{ExprsArray-class}}\cr 33 | \code{\link{ExprsModel-class}}\cr 34 | \code{\link{ExprsPipeline-class}}\cr 35 | \code{\link{ExprsEnsemble-class}}\cr 36 | \code{\link{ExprsPredict-class}}\cr 37 | \code{\link{MultiPredict-class}}\cr 38 | \code{\link{RegrsPredict-class}} 39 | } 40 | -------------------------------------------------------------------------------- /man/RegrsArray-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R 3 | \docType{class} 4 | \name{RegrsArray-class} 5 | \alias{RegrsArray-class} 6 | \title{An S4 class to store feature and annotation data} 7 | \description{ 8 | An \code{ExprsArray} sub-class for data with continuous outcomes. 9 | } 10 | -------------------------------------------------------------------------------- /man/RegrsModel-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R 3 | \docType{class} 4 | \name{RegrsModel-class} 5 | \alias{RegrsModel-class} 6 | \title{An S4 class to store the model} 7 | \description{ 8 | An \code{ExprsModel} sub-class for continuous outcome models. 
9 | } 10 | -------------------------------------------------------------------------------- /man/RegrsPredict-class.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.1-classes.R, R/1.2-methods.R 3 | \docType{class} 4 | \name{RegrsPredict-class} 5 | \alias{RegrsPredict-class} 6 | \alias{show,RegrsPredict-method} 7 | \title{An S4 class to store model predictions} 8 | \usage{ 9 | \S4method{show}{RegrsPredict}(object) 10 | } 11 | \arguments{ 12 | \item{object}{An object of class \code{RegrsPredict}.} 13 | } 14 | \description{ 15 | An S4 class to store model predictions 16 | } 17 | \section{Methods (by generic)}{ 18 | \itemize{ 19 | \item \code{show}: Method to show \code{RegrsPredict} object. 20 | }} 21 | 22 | \section{Slots}{ 23 | 24 | \describe{ 25 | \item{\code{pred}}{Any. Stores predicted outcome.} 26 | 27 | \item{\code{actual}}{Any. Stores actual outcome. 28 | Used by \code{\link{calcStats}}.} 29 | }} 30 | 31 | \seealso{ 32 | \code{\link{ExprsArray-class}}\cr 33 | \code{\link{ExprsModel-class}}\cr 34 | \code{\link{ExprsPipeline-class}}\cr 35 | \code{\link{ExprsEnsemble-class}}\cr 36 | \code{\link{ExprsPredict-class}}\cr 37 | \code{\link{MultiPredict-class}}\cr 38 | \code{\link{RegrsPredict-class}} 39 | } 40 | -------------------------------------------------------------------------------- /man/array.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \docType{data} 4 | \name{array} 5 | \alias{array} 6 | \title{Sample ExprsBinary Data} 7 | \format{An object of class \code{ExprsBinary} of length 1.} 8 | \usage{ 9 | data(array) 10 | } 11 | \description{ 12 | Sample ExprsBinary Data 13 | } 14 | \keyword{datasets} 15 | -------------------------------------------------------------------------------- 
/man/arrayExprs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-deprecated.R 3 | \name{arrayExprs} 4 | \alias{arrayExprs} 5 | \title{Import Data as ExprsArray} 6 | \usage{ 7 | arrayExprs(object, colBy, include, colID, begin, ...) 8 | } 9 | \arguments{ 10 | \item{object}{What to import as an \code{ExprsArray} object. See Details.} 11 | 12 | \item{colBy}{A numeric or character index. The column that contains group annotations.} 13 | 14 | \item{include}{A list of character vectors. Specifies which annotations in \code{colBy} 15 | to include in which groups. Each element of the list specifies a unique group while 16 | each element of the character vector specifies which annotations define that group. For 17 | binary classification, the first list element defines the negative, or control, group.} 18 | 19 | \item{colID}{A numeric or character index. The column used to name subjects. 20 | For \code{data.frame} or file import only.} 21 | 22 | \item{begin}{A numeric scalar. The j-th column at which feature data starts. 23 | For \code{data.frame} or file import only.} 24 | 25 | \item{...}{Additional arguments passed along to \code{read.delim}. 26 | For file import only.} 27 | } 28 | \value{ 29 | An \code{ExprsArray} object. 30 | } 31 | \description{ 32 | A convenience function that builds an \code{ExprsArray} object. 33 | This function is no longer supported. Please use \code{\link{exprso}} instead. 34 | } 35 | \details{ 36 | Importing a \code{data.frame} object: 37 | 38 | This function expects that the imported \code{data.frame} has the following format: 39 | rows indicate subject entries while columns indicate measured variables. 40 | The first several columns should contain annotation information (e.g., age, sex, diagnosis). 41 | The remaining columns should contain feature data (e.g., expression values). 
42 | The argument \code{begin} defines the j-th column at which the feature 43 | data starts. This function automatically removes any features with \code{NA} values. 44 | Take care to remove any \code{factor} columns before importing. 45 | 46 | Importing an \code{ExpressionSet} object: 47 | 48 | The package Biobase maintains a popular class object called \code{ExpressionSet} that 49 | often gets used to store expression data. This function converts this \code{eSet} 50 | object into an \code{ExprsArray} object. This function automatically removes any 51 | features with \code{NA} values. 52 | 53 | Importing a \code{file}: 54 | 55 | \code{arrayExprs} can also build an \code{ExprsArray} object from a tab-delimited 56 | data file, passing along the \code{file} and \code{...} argument(s) to 57 | \code{\link{read.delim}}. All rules for \code{data.frame} import also apply here. 58 | By default, \code{arrayExprs} forces \code{stringsAsFactors = FALSE}. 59 | } 60 | \seealso{ 61 | \code{\link{ExprsArray-class}}, \code{\link{GSE2eSet}} 62 | } 63 | -------------------------------------------------------------------------------- /man/arrayMulti.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \docType{data} 4 | \name{arrayMulti} 5 | \alias{arrayMulti} 6 | \title{Sample ExprsMulti Data} 7 | \format{An object of class \code{ExprsMulti} of length 1.} 8 | \usage{ 9 | data(arrayMulti) 10 | } 11 | \description{ 12 | Sample ExprsMulti Data 13 | } 14 | \keyword{datasets} 15 | -------------------------------------------------------------------------------- /man/build..Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{build.} 4 | \alias{build.} 5 | \title{Workhorse for build Methods} 6 | \usage{ 7 | build.(object, 
top, uniqueFx, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{uniqueFx}{A function call unique to the method. See Details.} 19 | 20 | \item{...}{Arguments passed to the detailed function.} 21 | } 22 | \value{ 23 | Returns an \code{ExprsModel} object. 24 | } 25 | \description{ 26 | Used as a back-end wrapper for creating new build methods. 27 | } 28 | -------------------------------------------------------------------------------- /man/build.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.3-exprso.R 3 | \name{build} 4 | \alias{build} 5 | \title{Build Models} 6 | \description{ 7 | The \code{exprso} package includes these build modules: 8 | 9 | - \code{\link{buildNB}} 10 | 11 | - \code{\link{buildLDA}} 12 | 13 | - \code{\link{buildSVM}} 14 | 15 | - \code{\link{buildLM}} 16 | 17 | - \code{\link{buildGLM}} 18 | 19 | - \code{\link{buildLR}} 20 | 21 | - \code{\link{buildLASSO}} 22 | 23 | - \code{\link{buildANN}} 24 | 25 | - \code{\link{buildDT}} 26 | 27 | - \code{\link{buildRF}} 28 | 29 | - \code{\link{buildFRB}} 30 | 31 | - \code{\link{buildDNN}} 32 | } 33 | \details{ 34 | Build a binary classifier, multi-class classifier, or regression model. 35 | Like \code{\link{fs}} methods, \code{build} methods have a \code{top} argument 36 | which allows the user to specify which features to feed INTO the model 37 | build. 
This effectively provides the user with one last opportunity to subset 38 | the feature space based on prior feature selection or dimension reduction. 39 | For all build methods, \code{@preFilter} and \code{@reductionModel} will 40 | get passed along to the resultant \code{ExprsModel} object, again ensuring 41 | that any test or validation sets will undergo the same feature selection and 42 | dimension reduction in the appropriate steps when deploying the model. 43 | Set \code{top = 0} to pass all features through a \code{build} method. 44 | } 45 | -------------------------------------------------------------------------------- /man/buildANN.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildANN} 4 | \alias{buildANN} 5 | \title{Build Artificial Neural Network Model} 6 | \usage{ 7 | buildANN(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildANN} builds a model using the \code{nnet} function 25 | from the \code{nnet} package. 
26 | } 27 | -------------------------------------------------------------------------------- /man/buildDNN.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildDNN} 4 | \alias{buildDNN} 5 | \title{Build Deep Neural Network Model} 6 | \usage{ 7 | buildDNN(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildDNN} builds a model using the \code{h2o.deeplearning} function 25 | from the \code{h2o} package. 26 | } 27 | -------------------------------------------------------------------------------- /man/buildDT.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildDT} 4 | \alias{buildDT} 5 | \title{Build Decision Tree Model} 6 | \usage{ 7 | buildDT(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 
15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildDT} builds a model using the \code{rpart} function 25 | from the \code{rpart} package. 26 | } 27 | \details{ 28 | Provide \code{cp} as a numeric scalar to trim the \code{rpart} decision tree. 29 | If provided, this argument is passed to the \code{rpart::prune} function. 30 | Set \code{cp = 0} to skip pruning (default behavior). 31 | } 32 | -------------------------------------------------------------------------------- /man/buildEnsemble.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/8.2-ens.R 3 | \docType{methods} 4 | \name{buildEnsemble} 5 | \alias{buildEnsemble} 6 | \alias{buildEnsemble,ExprsModel-method} 7 | \alias{buildEnsemble,ExprsPipeline-method} 8 | \title{Build Ensemble} 9 | \usage{ 10 | buildEnsemble(object, ...) 11 | 12 | \S4method{buildEnsemble}{ExprsModel}(object, ...) 13 | 14 | \S4method{buildEnsemble}{ExprsPipeline}(object, colBy = 0, how = 0, 15 | gate = 0, top = 0) 16 | } 17 | \arguments{ 18 | \item{object}{An \code{\link{ExprsPipeline-class}} object.} 19 | 20 | \item{...}{Additional \code{ExprsModel} objects to use in the ensemble. 21 | Argument applies to the \code{\link{ExprsModel-class}} method only.} 22 | 23 | \item{colBy}{A character vector or string. Specifies column(s) to use when 24 | filtering by model performance. Listing multiple columns will result 25 | in a filter based on the product of all listed columns.} 26 | 27 | \item{how}{A numeric scalar. Arguments between 0 and 1 will impose 28 | a threshold or ceiling filter, respectively, based on the raw value of 29 | \code{colBy}. 
Arguments between 1 and 100 will impose a filter based on 30 | the percentile of \code{colBy}. The user may also provide "midrange", 31 | "median", or "mean" as an argument for these filters.} 32 | 33 | \item{gate}{A numeric scalar. Arguments between 0 and 1 will impose 34 | a threshold or ceiling filter, respectively, based on the raw value of 35 | \code{colBy}. Arguments between 1 and 100 will impose a filter based on 36 | the percentile of \code{colBy}. The user may also provide "midrange", 37 | "median", or "mean" as an argument for these filters.} 38 | 39 | \item{top}{A numeric scalar. Determines the top N models based on 40 | \code{colBy} to include after the threshold and ceiling filters. 41 | In the case that the \code{@summary} slot contains the column "boot", 42 | this selects the top N models for each unique bootstrap.} 43 | } 44 | \value{ 45 | An \code{\link{ExprsEnsemble-class}} object. 46 | } 47 | \description{ 48 | \code{buildEnsemble} builds an ensemble from \code{ExprsModel} or 49 | \code{ExprsPipeline} objects. See Details. 50 | } 51 | \details{ 52 | This function can combine any number of model objects into an ensemble. 53 | These models do not necessarily have to derive from the same \code{build} 54 | method. In this way, it works like \code{\link{conjoin}}. 55 | 56 | This function can also build an ensemble from pipeline objects. It does 57 | this by calling \code{\link{pipeFilter}}, then joining the remaining models 58 | into an ensemble. As an adjunct to this method, consider first combining 59 | multiple pipeline objects with \code{\link{conjoin}}. 60 | } 61 | \section{Methods (by class)}{ 62 | \itemize{ 63 | \item \code{ExprsModel}: Method to build ensemble from \code{ExprsModel} objects. 64 | 65 | \item \code{ExprsPipeline}: Method to build ensemble from \code{ExprsPipeline} objects. 
66 | }} 67 | 68 | -------------------------------------------------------------------------------- /man/buildFRB.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildFRB} 4 | \alias{buildFRB} 5 | \title{Build Fuzzy Rule Based Model} 6 | \usage{ 7 | buildFRB(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildFRB} builds a model using the \code{frbs} function 25 | from the \code{frbs} package. 26 | } 27 | -------------------------------------------------------------------------------- /man/buildGLM.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildGLM} 4 | \alias{buildGLM} 5 | \title{Build Generalized Linear Model} 6 | \usage{ 7 | buildGLM(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 
15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildGLM} builds a model using the \code{glm} function. 25 | } 26 | -------------------------------------------------------------------------------- /man/buildLASSO.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildLASSO} 4 | \alias{buildLASSO} 5 | \title{Build LASSO or Ridge Model} 6 | \usage{ 7 | buildLASSO(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildLASSO} builds a model using the \code{cv.glmnet} function 25 | from the \code{glmnet} package. 
26 | } 27 | -------------------------------------------------------------------------------- /man/buildLDA.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildLDA} 4 | \alias{buildLDA} 5 | \title{Build Linear Discriminant Analysis Model} 6 | \usage{ 7 | buildLDA(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildLDA} builds a model using the \code{lda} function 25 | from the \code{MASS} package. 26 | } 27 | -------------------------------------------------------------------------------- /man/buildLM.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildLM} 4 | \alias{buildLM} 5 | \title{Build Linear Model} 6 | \usage{ 7 | buildLM(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 
15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildLM} builds a model using the \code{lm} function. 25 | } 26 | -------------------------------------------------------------------------------- /man/buildLR.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildLR} 4 | \alias{buildLR} 5 | \title{Build Logistic Regression Model} 6 | \usage{ 7 | buildLR(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildLR} builds a model using the \code{glm} function. 25 | } 26 | -------------------------------------------------------------------------------- /man/buildNB.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildNB} 4 | \alias{buildNB} 5 | \title{Build Naive Bayes Model} 6 | \usage{ 7 | buildNB(object, top = 0, ...) 
8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildNB} builds a model using the \code{naiveBayes} function 25 | from the \code{e1071} package. 26 | } 27 | -------------------------------------------------------------------------------- /man/buildRF.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildRF} 4 | \alias{buildRF} 5 | \title{Build Random Forest Model} 6 | \usage{ 7 | buildRF(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildRF} builds a model using the \code{randomForest} function 25 | from the \code{randomForest} package. 
26 | } 27 | -------------------------------------------------------------------------------- /man/buildSVM.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.2-build.R 3 | \name{buildSVM} 4 | \alias{buildSVM} 5 | \title{Build Support Vector Machine Model} 6 | \usage{ 7 | buildSVM(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The training set.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsModel} object. 22 | } 23 | \description{ 24 | \code{buildSVM} builds a model using the \code{svm} function 25 | from the \code{e1071} package. 26 | } 27 | -------------------------------------------------------------------------------- /man/calcMonteCarlo.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/7.3-plMonteCarlo.R 3 | \name{calcMonteCarlo} 4 | \alias{calcMonteCarlo} 5 | \title{Calculate \code{plMonteCarlo} Performance} 6 | \usage{ 7 | calcMonteCarlo(pl, colBy = "valid.acc") 8 | } 9 | \arguments{ 10 | \item{pl}{Specifies the \code{ExprsPipeline} object returned by \code{plMonteCarlo}.} 11 | 12 | \item{colBy}{A character vector or string. Specifies column(s) to use when 13 | summarizing model performance. 
Listing multiple columns will calculate 14 | performance as a product of those listed performances.} 15 | } 16 | \value{ 17 | A numeric scalar. The cross-validation accuracy. 18 | } 19 | \description{ 20 | \code{calcMonteCarlo} calculates a single performance measure for the 21 | results of a \code{plMonteCarlo} function call. 22 | } 23 | \details{ 24 | For each dataset split (i.e., bootstrap), \code{calcMonteCarlo} averages 25 | the validation set performance for the "best" model (where "best" is 26 | defined as the model with the maximum "internal" cross-validation 27 | accuracy, \code{max($train.plCV)}). The validation set performance 28 | ultimately averaged depends on the supplied \code{colBy} argument. 29 | } 30 | -------------------------------------------------------------------------------- /man/calcNested.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/7.4-plNested.R 3 | \name{calcNested} 4 | \alias{calcNested} 5 | \title{Calculate \code{plNested} Performance} 6 | \usage{ 7 | calcNested(pl, colBy = "valid.acc") 8 | } 9 | \arguments{ 10 | \item{pl}{Specifies the \code{ExprsPipeline} object returned by \code{plNested}.} 11 | 12 | \item{colBy}{A character vector or string. Specifies column(s) to use when 13 | summarizing model performance. Listing multiple columns will calculate 14 | performance as a product of those listed performances.} 15 | } 16 | \value{ 17 | A numeric scalar. The cross-validation accuracy. 18 | } 19 | \description{ 20 | \code{calcNested} calculates a single performance measure for the 21 | results of a \code{plNested} function call. 22 | } 23 | \details{ 24 | For each dataset split (i.e., bootstrap), \code{calcNested} averages 25 | the validation set performance for the "best" model (where "best" is 26 | defined as the model with the maximum "internal" cross-validation 27 | accuracy, \code{max($train.plCV)}). 
The validation set performance 28 | ultimately averaged depends on the supplied \code{colBy} argument. 29 | } 30 | -------------------------------------------------------------------------------- /man/calcStats.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/6.2-calc.R 3 | \docType{methods} 4 | \name{calcStats} 5 | \alias{calcStats} 6 | \alias{calcStats,ExprsPredict-method} 7 | \alias{calcStats,MultiPredict-method} 8 | \alias{calcStats,RegrsPredict-method} 9 | \title{Calculate Model Performance} 10 | \usage{ 11 | calcStats(object, aucSkip = FALSE, plotSkip = FALSE, verbose = TRUE) 12 | 13 | \S4method{calcStats}{ExprsPredict}(object, aucSkip = FALSE, 14 | plotSkip = FALSE, verbose = TRUE) 15 | 16 | \S4method{calcStats}{MultiPredict}(object, verbose) 17 | 18 | \S4method{calcStats}{RegrsPredict}(object, verbose) 19 | } 20 | \arguments{ 21 | \item{object}{An \code{ExprsPredict} or \code{RegrsPredict} object.} 22 | 23 | \item{aucSkip}{A logical scalar. Toggles whether to calculate area under the 24 | receiver operating characteristic curve. See Details.} 25 | 26 | \item{plotSkip}{A logical scalar. Toggles whether to plot the receiver 27 | operating characteristic curve. See Details.} 28 | 29 | \item{verbose}{A logical scalar. Toggles whether to print the results 30 | of model performance to console.} 31 | } 32 | \value{ 33 | Returns a \code{data.frame} of performance metrics. 34 | } 35 | \description{ 36 | \code{calcStats} calculates the performance of a deployed model. 
37 | } 38 | \details{ 39 | For classification, if the argument \code{aucSkip = FALSE} AND the \code{ExprsArray} 40 | object was an \code{ExprsBinary} object with at least one case and one control AND 41 | \code{ExprsPredict} contains a coherent \code{@probability} slot, \code{calcStats} 42 | will calculate classifier performance using the area under the receiver operating 43 | characteristic (ROC) curve via the \code{ROCR} package. Otherwise, \code{calcStats} 44 | will calculate classifier performance traditionally using a confusion matrix. 45 | Note that accuracies calculated using \code{ROCR} may differ from those calculated 46 | using a confusion matrix because \code{ROCR} adjusts the discrimination threshold to 47 | optimize sensitivity and specificity. This threshold is automatically chosen as the 48 | point along the ROC which minimizes the Euclidean distance from (0, 1). 49 | 50 | For regression, accuracy is defined as the R-squared of the fitted regression. This 51 | ranges from 0 to 1 for use with \code{\link{pl}} and \code{\link{pipe}}. Note that 52 | the \code{aucSkip} and \code{plotSkip} arguments are ignored for regression. 53 | } 54 | \section{Methods (by class)}{ 55 | \itemize{ 56 | \item \code{ExprsPredict}: Method to calculate performance for classification models. 57 | 58 | \item \code{MultiPredict}: Method to calculate performance for multi-class models. 59 | 60 | \item \code{RegrsPredict}: Method to calculate performance for continuous outcome models.
61 | }} 62 | 63 | -------------------------------------------------------------------------------- /man/check.ctrlGS.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{check.ctrlGS} 4 | \alias{check.ctrlGS} 5 | \title{Check \code{ctrlGS} Arguments} 6 | \usage{ 7 | check.ctrlGS(args) 8 | } 9 | \arguments{ 10 | \item{args}{A list of arguments to check.} 11 | } 12 | \description{ 13 | This function ensures that the list of arguments for \code{ctrlGS} meets 14 | the criteria required by the \code{\link{plNested}} function. This 15 | function forces \code{aucSkip = TRUE} and \code{plotSkip = TRUE}. 16 | } 17 | -------------------------------------------------------------------------------- /man/classCheck.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{classCheck} 4 | \alias{classCheck} 5 | \title{Class Check} 6 | \usage{ 7 | classCheck(x, what, msg) 8 | } 9 | \arguments{ 10 | \item{x}{An object.} 11 | 12 | \item{what}{A character vector. The classes, at least one of which \code{x} should have.} 13 | 14 | \item{msg}{A string. The error message shown if \code{x} has none of the classes in \code{what}.} 15 | } 16 | \description{ 17 | Checks whether an object belongs to at least one of the specified classes. 18 | For back-end use only.
19 | } 20 | -------------------------------------------------------------------------------- /man/compare.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-deprecated.R 3 | \docType{methods} 4 | \name{compare} 5 | \alias{compare} 6 | \alias{compare,ExprsArray-method} 7 | \title{Compare \code{ExprsArray} Objects} 8 | \usage{ 9 | compare(object, array.valid = NULL, colBy = "defineCase", 10 | cutoff = 0.05) 11 | 12 | \S4method{compare}{ExprsArray}(object, array.valid = NULL, 13 | colBy = "defineCase", cutoff = 0.05) 14 | } 15 | \arguments{ 16 | \item{object}{The \code{ExprsArray} object used when comparing annotations.} 17 | 18 | \item{array.valid}{A second \code{ExprsArray} object used when comparing 19 | annotations. Optional. Exclude with \code{array.valid = NULL}.} 20 | 21 | \item{colBy}{A character string. The annotation column against which to 22 | compare all other annotation terms (i.e., to test as the independent 23 | variable).} 24 | 25 | \item{cutoff}{A numeric scalar. The p-value cutoff that determines when 26 | the annotation test returns a \code{TRUE} result.} 27 | } 28 | \value{ 29 | A list of three logical vectors. The first and second elements 30 | of the list correspond to "internal" comparisons for the two provided 31 | \code{ExprsArray} objects, respectively. The third element of the list 32 | corresponds to comparisons made between the provided objects. 33 | } 34 | \description{ 35 | This method compares the values of all \code{ExprsArray} annotations across a 36 | specified annotation term for up to two \code{ExprsArray} objects. 37 | Depending on the composition of each annotation, \code{compare} 38 | will perform either a chi-squared test or an ANOVA test. 39 | } 40 | \details{ 41 | This method performs two kinds of comparisons.
First, it tests all 42 | annotation variables against the annotation supplied by the \code{colBy} 43 | argument for each provided \code{ExprsArray} object. In other words, 44 | the \code{colBy} argument determines which annotation to use as the 45 | independent variable for "internal" comparisons. Second, it tests 46 | all annotation variables between the provided \code{ExprsArray} objects. 47 | Providing \code{array.valid = NULL} will skip the between comparisons. 48 | 49 | This method will test annotations using either a chi-squared test or an 50 | ANOVA test depending on the class of the values stored by the tested column. 51 | The presence of a "character" or "factor" in the tested column will trigger 52 | a chi-squared test. As such, this method requires the user to select 53 | a \code{colBy} annotation that contains categorical data (i.e., to use as 54 | the independent variable). 55 | 56 | We anticipate that this method will serve as a useful adjunct to 57 | \code{\link{modCluster}}. However, it may also help in quickly determining 58 | whether the data \code{\link{split}} has yielded comparable training and 59 | test sets in terms of the annotations included in \code{@annot}. 60 | } 61 | \section{Methods (by class)}{ 62 | \itemize{ 63 | \item \code{ExprsArray}: Method to compare \code{ExprsArray} objects. 64 | }} 65 | 66 | -------------------------------------------------------------------------------- /man/conjoin.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/2-conjoin.R 3 | \docType{methods} 4 | \name{conjoin} 5 | \alias{conjoin} 6 | \alias{conjoin,ExprsArray-method} 7 | \alias{conjoin,ExprsModel-method} 8 | \alias{conjoin,ExprsPipeline-method} 9 | \alias{conjoin,ExprsEnsemble-method} 10 | \title{Combine \code{exprso} Objects} 11 | \usage{ 12 | conjoin(object, ...) 13 | 14 | \S4method{conjoin}{ExprsArray}(object, ...) 
15 | 16 | \S4method{conjoin}{ExprsModel}(object, ...) 17 | 18 | \S4method{conjoin}{ExprsPipeline}(object, ...) 19 | 20 | \S4method{conjoin}{ExprsEnsemble}(object, ...) 21 | } 22 | \arguments{ 23 | \item{object}{Any \code{exprso} object.} 24 | 25 | \item{...}{More objects of the same class.} 26 | } 27 | \value{ 28 | See Details. 29 | } 30 | \description{ 31 | \code{conjoin} combines two or more \code{exprso} objects based on their class. 32 | } 33 | \details{ 34 | When joining \code{ExprsArray} objects, this function returns one 35 | \code{ExprsArray} object as output. This only works on \code{ExprsArray} objects 36 | that have not undergone feature selection. Any missing annotations in \code{@annot} 37 | will get replaced with \code{NA} values. 38 | 39 | When joining \code{ExprsModel} or \code{ExprsEnsemble} objects, 40 | this function returns an ensemble. 41 | 42 | When joining \code{ExprsPipeline} objects, this function returns one 43 | \code{ExprsPipeline} object as output. To track which \code{ExprsPipeline} 44 | objects contributed to the resultant object, the source gets flagged 45 | with a \code{boot} column. If a pipeline already has a \code{boot} column, 46 | the original boot tracker will receive an offset (and the old \code{boot} 47 | column will get renamed to \code{unboot}). This system ensures that all 48 | models deriving from the same training set will get handled as a 49 | "pseudo-bootstrap" by downstream \code{\link{pipe}} functions. 50 | } 51 | \section{Methods (by class)}{ 52 | \itemize{ 53 | \item \code{ExprsArray}: Method to join \code{ExprsArray} objects. 54 | 55 | \item \code{ExprsModel}: Method to join \code{ExprsModel} objects. 56 | 57 | \item \code{ExprsPipeline}: Method to join \code{ExprsPipeline} objects. 58 | 59 | \item \code{ExprsEnsemble}: Method to join \code{ExprsEnsemble} objects. 
60 | }} 61 | 62 | -------------------------------------------------------------------------------- /man/ctrlFeatureSelect.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{ctrlFeatureSelect} 4 | \alias{ctrlFeatureSelect} 5 | \title{Manage \code{fs} Arguments} 6 | \usage{ 7 | ctrlFeatureSelect(func, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{func}{A character string. The \code{fs} function to call.} 11 | 12 | \item{top}{Argument passed to the \code{fs} function.} 13 | 14 | \item{...}{Additional arguments passed to the \code{fs} function.} 15 | } 16 | \value{ 17 | A list of arguments. 18 | } 19 | \description{ 20 | This function organizes \code{fs} arguments passed to \code{pl} functions. 21 | } 22 | -------------------------------------------------------------------------------- /man/ctrlGridSearch.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{ctrlGridSearch} 4 | \alias{ctrlGridSearch} 5 | \title{Manage \code{plGrid} Arguments} 6 | \usage{ 7 | ctrlGridSearch(func, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{func}{A character string. The \code{pl} function to call.} 11 | 12 | \item{top}{Argument passed to the \code{pl} function. Leave missing 13 | when handling \code{plMonteCarlo} or \code{plNested} arguments.} 14 | 15 | \item{...}{Additional arguments passed to the \code{pl} function.} 16 | } 17 | \value{ 18 | A list of arguments. 19 | } 20 | \description{ 21 | This function organizes \code{plGrid} arguments passed to \code{pl} functions. 
22 | } 23 | -------------------------------------------------------------------------------- /man/ctrlModSet.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{ctrlModSet} 4 | \alias{ctrlModSet} 5 | \title{Manage \code{mod} Arguments} 6 | \usage{ 7 | ctrlModSet(func, ...) 8 | } 9 | \arguments{ 10 | \item{func}{A character string. The \code{mod} function to call.} 11 | 12 | \item{...}{Additional arguments passed to the \code{mod} function.} 13 | } 14 | \value{ 15 | A list of arguments. 16 | } 17 | \description{ 18 | This function organizes \code{mod} arguments passed to \code{pl} functions. 19 | } 20 | -------------------------------------------------------------------------------- /man/ctrlSplitSet.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{ctrlSplitSet} 4 | \alias{ctrlSplitSet} 5 | \title{Manage \code{split} Arguments} 6 | \usage{ 7 | ctrlSplitSet(func, percent.include = 67, ...) 8 | } 9 | \arguments{ 10 | \item{func}{A character string. The \code{split} function to call.} 11 | 12 | \item{percent.include}{Argument passed to the \code{split} function.} 13 | 14 | \item{...}{Additional arguments passed to the \code{split} function.} 15 | } 16 | \value{ 17 | A list of arguments. 18 | } 19 | \description{ 20 | This function organizes \code{split} arguments passed to \code{pl} functions. 
21 | } 22 | -------------------------------------------------------------------------------- /man/defaultArg.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{defaultArg} 4 | \alias{defaultArg} 5 | \title{Set an args List Element to Default Value} 6 | \usage{ 7 | defaultArg(what, as, args, verbose = TRUE) 8 | } 9 | \arguments{ 10 | \item{what}{The name of the argument.} 11 | 12 | \item{as}{The value to set it as.} 13 | 14 | \item{args}{An args list. The result of \code{\link{getArgs}}.} 15 | 16 | \item{verbose}{A boolean. Toggles whether to alert 17 | the user that an argument is set.} 18 | } 19 | \description{ 20 | Set an args List Element to Default Value 21 | } 22 | -------------------------------------------------------------------------------- /man/exprso-predict.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/6.1-predict.R, R/8.2-ens.R 3 | \docType{methods} 4 | \name{exprso-predict} 5 | \alias{exprso-predict} 6 | \alias{predict,ExprsMachine-method} 7 | \alias{predict,ExprsModule-method} 8 | \alias{predict,RegrsModel-method} 9 | \alias{predict,ExprsEnsemble-method} 10 | \title{Deploy Model} 11 | \usage{ 12 | \S4method{predict}{ExprsMachine}(object, array, verbose = TRUE) 13 | 14 | \S4method{predict}{ExprsModule}(object, array, verbose = TRUE) 15 | 16 | \S4method{predict}{RegrsModel}(object, array, verbose = TRUE) 17 | 18 | \S4method{predict}{ExprsEnsemble}(object, array, how = "majority", 19 | verbose = TRUE) 20 | } 21 | \arguments{ 22 | \item{object}{An \code{exprso} model.} 23 | 24 | \item{array}{An \code{exprso} object. The test data.} 25 | 26 | \item{verbose}{A boolean. Argument passed to \code{calcStats}.} 27 | 28 | \item{how}{A string. Describes how the ensemble decides. 
By default, it 29 | uses "majority" voting. However, the user can select "probability" voting 30 | for binary classifier ensembles.} 31 | } 32 | \value{ 33 | Returns an \code{exprso} prediction object. 34 | } 35 | \description{ 36 | Deploy a model to predict outcomes from the data. 37 | } 38 | \details{ 39 | Models can only get deployed on an object of the type used to build 40 | the model. This method now supports binary classification, 41 | multi-class classification, and regression. 42 | 43 | For regression ensembles, the average outcome is reported. For multi-class 44 | classifier ensembles, the majority vote is reported. For binary classifier 45 | ensembles, the majority vote or probability-weighted vote is reported. 46 | For probability-weighted voting, the average "Case" probability 47 | is reported. All ties are broken randomly. 48 | } 49 | -------------------------------------------------------------------------------- /man/exprso.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.3-exprso.R 3 | \name{exprso} 4 | \alias{exprso} 5 | \title{The \code{exprso} Package} 6 | \usage{ 7 | exprso(x, y, label = 1, switch = FALSE) 8 | } 9 | \arguments{ 10 | \item{x}{A matrix of feature data for all samples. Rows should 11 | contain samples and columns should contain features.} 12 | 13 | \item{y}{A vector of outcomes for all samples. If 14 | \code{class(y) == "character"} or \code{class(y) == "factor"}, 15 | \code{exprso} prepares data for binary or multi-class classification. 16 | Else, \code{exprso} prepares data for regression. If \code{y} is a 17 | matrix, the program uses the outcome in \code{label}.} 18 | 19 | \item{label}{A numeric scalar or character string. The column to 20 | use as the label if \code{y} is a matrix.} 21 | 22 | \item{switch}{A logical scalar.
Toggles which class label is 23 | called Control in binary classification.} 24 | } 25 | \value{ 26 | An \code{ExprsArray} object. 27 | } 28 | \description{ 29 | Welcome to the \code{exprso} package! 30 | 31 | The \code{exprso} function imports data into the learning environment. 32 | 33 | See \code{\link{mod}} to process the data. 34 | 35 | See \code{\link{split}} to split off a test set. 36 | 37 | See \code{\link{fs}} to select features. 38 | 39 | See \code{\link{build}} to build models. 40 | 41 | See \code{\link{pl}} to build models high-throughput. 42 | 43 | See \code{\link{pipe}} to process pipelines. 44 | 45 | See \code{\link{buildEnsemble}} to build ensembles. 46 | 47 | See \code{\link{exprso-predict}} to deploy models. 48 | 49 | See \code{\link{conjoin}} to merge objects. 50 | } 51 | \examples{ 52 | library(exprso) 53 | data(iris) 54 | array <- exprso(iris[,1:4], iris[,5]) 55 | arrays <- splitSample(array, percent.include = 67) 56 | train <- trainingSet(arrays) 57 | test <- testSet(arrays) 58 | train <- fsANOVA(train, top = 0) 59 | train <- fsPrcomp(train, top = 3) 60 | mach <- buildSVM(train, top = 5, kernel = "linear", cost = 1) 61 | pred <- predict(mach, test) 62 | calcStats(pred) 63 | } 64 | -------------------------------------------------------------------------------- /man/forceArg.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{forceArg} 4 | \alias{forceArg} 5 | \title{Force an args List Element to Value} 6 | \usage{ 7 | forceArg(what, as, args, verbose = TRUE) 8 | } 9 | \arguments{ 10 | \item{what}{The name of the argument.} 11 | 12 | \item{as}{The value to set it as.} 13 | 14 | \item{args}{An args list. The result of \code{\link{getArgs}}.} 15 | 16 | \item{verbose}{A boolean. 
Toggles whether to alert 17 | the user that an argument is set.} 18 | } 19 | \description{ 20 | Force an args List Element to Value 21 | } 22 | -------------------------------------------------------------------------------- /man/fs..Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fs.} 4 | \alias{fs.} 5 | \title{Workhorse for fs Methods} 6 | \usage{ 7 | fs.(object, top, uniqueFx, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{uniqueFx}{A function call unique to the method. See Details.} 19 | 20 | \item{...}{Arguments passed to the detailed function.} 21 | } 22 | \value{ 23 | Returns an \code{ExprsArray} object. 24 | } 25 | \description{ 26 | Used as a back-end wrapper for creating new fs methods. 27 | } 28 | \details{ 29 | If the uniqueFx returns a character vector, it is assumed 30 | that the fs method is for feature selection only. If the 31 | uniqueFx returns a list, it is assumed that the fs method 32 | is a reduction model method only. 
33 | } 34 | -------------------------------------------------------------------------------- /man/fs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.3-exprso.R 3 | \name{fs} 4 | \alias{fs} 5 | \title{Select Features} 6 | \description{ 7 | The \code{exprso} package includes these feature selection modules: 8 | 9 | - \code{\link{fsSample}} 10 | 11 | - \code{\link{fsNULL}} 12 | 13 | - \code{\link{fsANOVA}} 14 | 15 | - \code{\link{fsInclude}} 16 | 17 | - \code{\link{fsStats}} 18 | 19 | - \code{\link{fsCor}} 20 | 21 | - \code{\link{fsPrcomp}} 22 | 23 | - \code{\link{fsPCA}} 24 | 25 | - \code{\link{fsRDA}} 26 | 27 | - \code{\link{fsEbayes}} 28 | 29 | - \code{\link{fsEdger}} 30 | 31 | - \code{\link{fsMrmre}} 32 | 33 | - \code{\link{fsRankProd}} 34 | 35 | - \code{\link{fsBalance}} 36 | 37 | - \code{\link{fsAmalgam}} 38 | 39 | - \code{\link{fsAnnot}} 40 | } 41 | \details{ 42 | Considering the high-dimensionality of many datasets, it is prudent and 43 | often necessary to prioritize which features to include during model 44 | construction. This package provides functions for some of the most frequently 45 | used feature selection methods. Each function works as a self-contained wrapper 46 | that (1) pre-processes the \code{ExprsArray} input, (2) performs the feature 47 | selection, and (3) returns an \code{ExprsArray} output with an updated feature 48 | selection history. These histories get passed along at every step of the way 49 | until they eventually get used to pre-process an unlabeled dataset during 50 | model deployment (i.e., prediction). 51 | 52 | The argument \code{top} specifies either the names or the number of features 53 | to supply TO the feature selection method, not what the user intends to 54 | retrieve FROM the feature selection method. 
When calling the first feature 55 | selection method (or the first build method, if skipping feature selection), 56 | a numeric \code{top} argument will select a "top ranked" feature set according 57 | to their default order in the \code{ExprsArray} input. 58 | } 59 | -------------------------------------------------------------------------------- /man/fsANOVA.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsANOVA} 4 | \alias{fsANOVA} 5 | \title{Select Features by ANOVA} 6 | \usage{ 7 | fsANOVA(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsANOVA} selects features using the \code{aov} function. 25 | Note that the ANOVA assumes equal variances, so will differ from 26 | the \code{fsStats} t-test in the two-group setting. 27 | } 28 | -------------------------------------------------------------------------------- /man/fsAmalgam.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsAmalgam} 4 | \alias{fsAmalgam} 5 | \title{Reduce Dimensions by Amalgamation} 6 | \usage{ 7 | fsAmalgam(object, top = 0, ...) 
8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsAmalgam} finds a set of explanatory "amalgams" using 25 | the \code{amalgam::amalgam} function. This function expects 26 | a compositional data set that can be reduced by amalgamation. 27 | The resultant "amalgams" are clr- or slr-transformed. 28 | The amalgamation rule is saved and deployed automatically by the 29 | \code{predict} method during test set validation. 30 | } 31 | -------------------------------------------------------------------------------- /man/fsAnnot.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsAnnot} 4 | \alias{fsAnnot} 5 | \title{Use Annotations as Features} 6 | \usage{ 7 | fsAnnot(object, top = 0, colBy) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. 
A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{colBy}{A character vector. The names of annotations to rank above all others.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsAnnot} moves annotations named by the \code{colBy} argument 25 | to the top of the ranked features list. Otherwise, the relative order 26 | of the features does not change. 27 | } 28 | -------------------------------------------------------------------------------- /man/fsBalance.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsBalance} 4 | \alias{fsBalance} 5 | \title{Convert Features into Balances} 6 | \usage{ 7 | fsBalance(object, top = 0, sbp.how = "sbp.fromPBA", ternary = FALSE, 8 | ratios = FALSE, ...) 9 | } 10 | \arguments{ 11 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 12 | 13 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 14 | the number of top features that should undergo feature selection. A character vector 15 | indicates specifically which features by name should undergo feature selection. 16 | Set \code{top = 0} to include all features. A numeric vector can also be used 17 | to indicate specific features by location, similar to a character vector.} 18 | 19 | \item{sbp.how}{A character string. The method used to build 20 | the serial binary partition matrix of balances. Any 21 | \code{balance::sbp.from*} function will work.} 22 | 23 | \item{ternary}{A boolean. Toggles whether to return balances 24 | representing three components. Argument passed to 25 | \code{balance::sbp.subset}. Set \code{ternary = FALSE} and 26 | \code{ratios = FALSE} to skip subset.} 27 | 28 | \item{ratios}{A boolean. 
Toggles whether to return balances 29 | representing two components. Argument passed to 30 | \code{balance::sbp.subset}. Set \code{ternary = FALSE} and 31 | \code{ratios = FALSE} to skip subset.} 32 | 33 | \item{...}{Arguments passed to the detailed function.} 34 | } 35 | \value{ 36 | Returns an \code{ExprsArray} object. 37 | } 38 | \description{ 39 | \code{fsBalance} converts features into balances. 40 | The balance rule is saved and deployed automatically by the 41 | \code{predict} method during test set validation. 42 | } 43 | -------------------------------------------------------------------------------- /man/fsCor.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsCor} 4 | \alias{fsCor} 5 | \title{Select Features by Correlation} 6 | \usage{ 7 | fsCor(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsCor} selects features using the \code{cor} function. 25 | Ranks features by absolute value of correlation. 
26 | } 27 | -------------------------------------------------------------------------------- /man/fsEbayes.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsEbayes} 4 | \alias{fsEbayes} 5 | \title{Select Features by Moderated t-test} 6 | \usage{ 7 | fsEbayes(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsEbayes} selects features using the \code{lmFit} and 25 | \code{eBayes} functions from the \code{limma} package. Features 26 | ranked by the \code{topTableF} function. 27 | } 28 | -------------------------------------------------------------------------------- /man/fsEdger.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsEdger} 4 | \alias{fsEdger} 5 | \title{Selects Features by Exact Test} 6 | \usage{ 7 | fsEdger(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. 
A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsEdger} selects features using the \code{exactTest} function 25 | from the \code{edgeR} package. This function does not normalize the data, 26 | but does estimate dispersion using the \code{estimateCommonDisp} 27 | and \code{estimateTagwiseDisp} functions. 28 | } 29 | \details{ 30 | The user can normalize the data before feature selection using the 31 | \code{modTMM} function. Note that applying \code{edgeR} to already normalized 32 | counts differs slightly from applying \code{edgeR} with normalization. 33 | } 34 | -------------------------------------------------------------------------------- /man/fsInclude.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsInclude} 4 | \alias{fsInclude} 5 | \title{Select Features by Explicit Reference} 6 | \usage{ 7 | fsInclude(object, top = 0, include) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{include}{A character vector. 
The names of features to rank above all others.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsInclude} moves features named by the \code{include} argument 25 | to the top of the ranked features list. Otherwise, the relative order 26 | of the features does not change. 27 | } 28 | -------------------------------------------------------------------------------- /man/fsMrmre.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsMrmre} 4 | \alias{fsMrmre} 5 | \title{Select Features by mRMR} 6 | \usage{ 7 | fsMrmre(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsMrmre} selects features using the \code{mRMR.classic} function 25 | from the \code{mRMRe} package. 26 | } 27 | \details{ 28 | Note that \code{fsMrmre} crashes when supplied a very large 29 | \code{feature_count} owing to its \code{mRMRe} implementation. 
30 | } 31 | -------------------------------------------------------------------------------- /man/fsNULL.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsNULL} 4 | \alias{fsNULL} 5 | \title{Null Feature Selection} 6 | \usage{ 7 | fsNULL(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsNULL} does not select features. However, it will handle the 25 | \code{top} and \code{keep} arguments if provided. 26 | } 27 | -------------------------------------------------------------------------------- /man/fsPCA.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsPCA} 4 | \alias{fsPCA} 5 | \title{Reduce Dimensions by PCA} 6 | \usage{ 7 | fsPCA(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 
15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsPCA} runs a PCA using the \code{prcomp} function. 25 | The PCA model is saved and deployed automatically by the 26 | \code{predict} method during test set validation. 27 | } 28 | -------------------------------------------------------------------------------- /man/fsPRA.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsPRA} 4 | \alias{fsPRA} 5 | \title{Reduce Dimensions by Log-Ratio Selection} 6 | \usage{ 7 | fsPRA(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsPRA} finds the most explanatory pairwise log-ratios 25 | using the variable selection method proposed by Michael Greenacre 26 | in "Variable Selection in Compositional Data Analysis Using 27 | Pairwise Logratios", modified to run faster.
28 | } 29 | -------------------------------------------------------------------------------- /man/fsPrcomp.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsPrcomp} 4 | \alias{fsPrcomp} 5 | \title{Reduce Dimensions by PCA} 6 | \usage{ 7 | fsPrcomp(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsPrcomp} runs a PCA using the \code{prcomp} function. 25 | The PCA model is saved and deployed automatically by the 26 | \code{predict} method during test set validation. 27 | } 28 | -------------------------------------------------------------------------------- /man/fsRDA.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsRDA} 4 | \alias{fsRDA} 5 | \title{Reduce Dimensions by RDA} 6 | \usage{ 7 | fsRDA(object, top = 0, colBy = NULL) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. 
A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{colBy}{A character vector. Lists the columns in \code{@annot} 19 | to use as the constraining matrix. Passed to \code{vegan::rda}. 20 | Optional argument. Skip with \code{colBy = NULL}.} 21 | } 22 | \value{ 23 | Returns an \code{ExprsArray} object. 24 | } 25 | \description{ 26 | \code{fsRDA} runs an RDA using the \code{rda} function 27 | from the \code{vegan} package, partialling out \code{colBy}. 28 | The RDA model is saved and deployed automatically by the 29 | \code{predict} method during test set validation. 30 | } 31 | \details{ 32 | When \code{colBy} is provided, it serves as the constraining matrix. 33 | However, \code{fsRDA} always returns the unconstrained scores. 34 | As such, \code{fsRDA} effectively partials out the contribution 35 | of \code{colBy} to the training set, then uses this rule 36 | to partial out the contribution to the test set too. 37 | } 38 | -------------------------------------------------------------------------------- /man/fsRankProd.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsRankProd} 4 | \alias{fsRankProd} 5 | \title{Select Features by Rank Product Analysis} 6 | \usage{ 7 | fsRankProd(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. 
A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsRankProd} selects features using the \code{RankProducts} function 25 | from the \code{RankProd} package. 26 | } 27 | -------------------------------------------------------------------------------- /man/fsSample.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsSample} 4 | \alias{fsSample} 5 | \title{Select Features by Random Sampling} 6 | \usage{ 7 | fsSample(object, top = 0, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. A numeric vector can also be used 16 | to indicate specific features by location, similar to a character vector.} 17 | 18 | \item{...}{Arguments passed to the detailed function.} 19 | } 20 | \value{ 21 | Returns an \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{fsSample} selects features using the \code{sample} function. 
25 | } 26 | -------------------------------------------------------------------------------- /man/fsStats.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/5.1-fs.R 3 | \name{fsStats} 4 | \alias{fsStats} 5 | \title{Select Features by Statistical Testing} 6 | \usage{ 7 | fsStats(object, top = 0, how = c("t.test", "ks.test", "wilcox.test", 8 | "var.test"), ...) 9 | } 10 | \arguments{ 11 | \item{object}{An \code{ExprsArray} object to undergo feature selection.} 12 | 13 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 14 | the number of top features that should undergo feature selection. A character vector 15 | indicates specifically which features by name should undergo feature selection. 16 | Set \code{top = 0} to include all features. A numeric vector can also be used 17 | to indicate specific features by location, similar to a character vector.} 18 | 19 | \item{how}{A character string. Toggles between the "t.test", "ks.test", 20 | "wilcox.test", and "var.test" methods.} 21 | 22 | \item{...}{Arguments passed to the detailed function.} 23 | } 24 | \value{ 25 | Returns an \code{ExprsArray} object. 26 | } 27 | \description{ 28 | \code{fsStats} selects features using a base R statistics 29 | function (toggled by the \code{how} argument). 30 | } 31 | -------------------------------------------------------------------------------- /man/getArgs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{getArgs} 4 | \alias{getArgs} 5 | \title{Build an args List} 6 | \usage{ 7 | getArgs(...) 
8 | } 9 | \arguments{ 10 | \item{...}{Arguments passed down from a calling function.} 11 | } 12 | \description{ 13 | Build an args List 14 | } 15 | -------------------------------------------------------------------------------- /man/getFeatures.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.2-methods.R 3 | \name{getFeatures} 4 | \alias{getFeatures} 5 | \title{Retrieve Feature Set} 6 | \usage{ 7 | getFeatures(object, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray}, \code{ExprsModel}, \code{ExprsPipeline}, 11 | or \code{ExprsEnsemble} object.} 12 | 13 | \item{...}{See \code{\link{ExprsPipeline-class}} or 14 | \code{\link{ExprsEnsemble-class}}.} 15 | } 16 | \description{ 17 | See the object class for method details. 18 | } 19 | -------------------------------------------------------------------------------- /man/getWeights.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.2-methods.R 3 | \name{getWeights} 4 | \alias{getWeights} 5 | \title{Retrieve LASSO Weights} 6 | \usage{ 7 | getWeights(object, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsModel}, \code{ExprsPipeline}, 11 | or \code{ExprsEnsemble} object.} 12 | 13 | \item{...}{Arguments passed to \code{glmnet::coef.cv.glmnet}.} 14 | } 15 | \description{ 16 | See the respective S4 class for method details. 
17 | } 18 | -------------------------------------------------------------------------------- /man/lequal.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{lequal} 4 | \alias{lequal} 5 | \title{Test All Equal Within List} 6 | \usage{ 7 | lequal(list) 8 | } 9 | \arguments{ 10 | \item{list}{A list.} 11 | } 12 | \description{ 13 | This function tests whether all elements in a list are identical. 14 | Works like an iterative \code{all.equal}. 15 | } 16 | -------------------------------------------------------------------------------- /man/makeGridFromArgs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{makeGridFromArgs} 4 | \alias{makeGridFromArgs} 5 | \title{Build Argument Grid} 6 | \usage{ 7 | makeGridFromArgs(array.train, top, how, ...) 8 | } 9 | \arguments{ 10 | \item{array.train}{The \code{array.train} argument as fed to \code{plGrid}.} 11 | 12 | \item{top}{The \code{top} argument as fed to \code{plGrid}.} 13 | 14 | \item{how}{The \code{how} argument as fed to \code{plGrid}.} 15 | 16 | \item{...}{Additional arguments as fed to \code{plGrid}.} 17 | } 18 | \description{ 19 | This function builds an argument grid from any number of arguments. 20 | Used to prepare a grid-search for the \code{plGrid} and 21 | \code{plGridMulti} functions. 
22 | } 23 | -------------------------------------------------------------------------------- /man/mod.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.3-exprso.R 3 | \name{mod} 4 | \alias{mod} 5 | \title{Process Data} 6 | \description{ 7 | The \code{exprso} package includes these data process modules: 8 | 9 | - \code{\link{modHistory}} 10 | 11 | - \code{\link{modSubset}} 12 | 13 | - \code{\link{modFilter}} 14 | 15 | - \code{\link{modTransform}} 16 | 17 | - \code{\link{modSample}} 18 | 19 | - \code{\link{modPermute}} 20 | 21 | - \code{\link{modInclude}} 22 | 23 | - \code{\link{modNormalize}} 24 | 25 | - \code{\link{modTMM}} 26 | 27 | - \code{\link{modAcomp}} 28 | 29 | - \code{\link{modCLR}} 30 | 31 | - \code{\link{modRatios}} 32 | 33 | - \code{\link{modScale}} 34 | 35 | - \code{\link{modSkew}} 36 | } 37 | -------------------------------------------------------------------------------- /man/modAcomp.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modAcomp} 4 | \alias{modAcomp} 5 | \title{Compositionally Constrain Data} 6 | \usage{ 7 | modAcomp(object) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo pre-processing.} 11 | } 12 | \value{ 13 | An \code{ExprsArray} object. 14 | } 15 | \description{ 16 | \code{modAcomp} makes it so that all sample vectors have the same total sum. 
17 | } 18 | -------------------------------------------------------------------------------- /man/modCLR.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modCLR} 4 | \alias{modCLR} 5 | \title{Log-ratio Transform Data} 6 | \usage{ 7 | modCLR(object) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo pre-processing.} 11 | } 12 | \value{ 13 | An \code{ExprsArray} object. 14 | } 15 | \description{ 16 | \code{modCLR} applies a centered log-ratio transformation to the data. 17 | } 18 | -------------------------------------------------------------------------------- /man/modCluster.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-deprecated.R 3 | \docType{methods} 4 | \name{modCluster} 5 | \alias{modCluster} 6 | \alias{modCluster,ExprsArray-method} 7 | \title{Cluster Subjects} 8 | \usage{ 9 | modCluster(object, top = 0, how = "hclust", onlyCluster = FALSE, ...) 10 | 11 | \S4method{modCluster}{ExprsArray}(object, top = 0, how = "hclust", 12 | onlyCluster = FALSE, ...) 13 | } 14 | \arguments{ 15 | \item{object}{An \code{ExprsArray} object. The object containing the subject 16 | data to cluster.} 17 | 18 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 19 | the number of top features that should undergo feature selection. A character vector 20 | indicates specifically which features by name should undergo feature selection. 21 | Set \code{top = 0} to include all features. A numeric vector can also be used 22 | to indicate specific features by location, similar to a character vector.} 23 | 24 | \item{how}{A character string. The name of the function used to cluster. 
25 | Select from "hclust", "kmeans", "agnes", "clara", "diana", "fanny", or 26 | "pam".} 27 | 28 | \item{onlyCluster}{A logical scalar. Toggles whether to return a processed 29 | cluster object or an updated \code{ExprsArray} object.} 30 | 31 | \item{...}{Additional arguments to the cluster function and/or 32 | other functions used for clustering (e.g., \code{dist} and 33 | \code{cutree}).} 34 | } 35 | \value{ 36 | Typically an \code{ExprsArray} object with subject cluster assignments 37 | added to the \code{$cluster} column of the \code{@annot} slot. 38 | } 39 | \description{ 40 | This method clusters subjects based on feature data using any one of 41 | seven available clustering algorithms. See Arguments below. 42 | } 43 | \details{ 44 | Note that this function will expect the argument \code{k} to define the returned 45 | number of clusters, except when \code{how = "kmeans"} in which case this 46 | function will expect the argument \code{centers} instead. 47 | } 48 | \section{Methods (by class)}{ 49 | \itemize{ 50 | \item \code{ExprsArray}: Method to cluster \code{ExprsArray} objects. 51 | }} 52 | 53 | -------------------------------------------------------------------------------- /man/modFilter.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modFilter} 4 | \alias{modFilter} 5 | \title{Hard Filter Data} 6 | \usage{ 7 | modFilter(object, threshold, maximum, beta1, beta2) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo pre-processing.} 11 | 12 | \item{threshold}{A numeric scalar. The floor: any value below \code{threshold} is set to \code{threshold}.} 13 | 14 | \item{maximum}{A numeric scalar. The ceiling: any value above \code{maximum} is set to \code{maximum}.} 15 | 16 | \item{beta1}{A numeric scalar. The \code{max - min} range above which to 17 | include the feature. Inclusive with \code{beta2}.} 18 | 19 | \item{beta2}{A numeric scalar.
The \code{max / min} ratio above which to 20 | include the feature. Inclusive with \code{beta1}.} 21 | } 22 | \value{ 23 | An \code{ExprsArray} object. 24 | } 25 | \description{ 26 | \code{modFilter} imposes a hard filter for (gene expression) feature data. 27 | } 28 | \details{ 29 | This method reproduces the hard filter described by Deb and Reddy (2003) 30 | for pre-processing the hallmark Golub ALL/AML dataset. This filter 31 | first sets all values less than \code{threshold} to \code{threshold} 32 | and all values greater than \code{maximum} to \code{maximum}. 33 | 34 | Next, this method includes only those features with (a) a range greater 35 | than \code{beta1}, and also (b) a ratio of maximum feature expression to 36 | minimum feature expression greater than \code{beta2}. 37 | } 38 | -------------------------------------------------------------------------------- /man/modHistory.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modHistory} 4 | \alias{modHistory} 5 | \title{Replicate Data Process History} 6 | \usage{ 7 | modHistory(object, reference) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The object that should undergo a 11 | replication of the feature selection history.} 12 | 13 | \item{reference}{An \code{ExprsArray} or \code{ExprsModel} object. The object 14 | containing the reference history.} 15 | } 16 | \value{ 17 | An \code{ExprsArray} object. 18 | } 19 | \description{ 20 | \code{modHistory} replicates the feature selection history of a reference. 21 | Used by \code{predict} to prepare test set for model deployment. 
22 | } 23 | -------------------------------------------------------------------------------- /man/modInclude.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modInclude} 4 | \alias{modInclude} 5 | \title{Select Features from Data} 6 | \usage{ 7 | modInclude(object, include = rownames(object@exprs)) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo pre-processing.} 11 | 12 | \item{include}{A character vector. The names of features to include.} 13 | } 14 | \value{ 15 | An \code{ExprsArray} object. 16 | } 17 | \description{ 18 | \code{modInclude} selects specific features from a data set. Unlike 19 | \code{fsInclude}, this function does not update \code{@preFilter} 20 | and returns only those features stated by \code{include}. 21 | } 22 | -------------------------------------------------------------------------------- /man/modNormalize.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modNormalize} 4 | \alias{modNormalize} 5 | \title{Normalize Data} 6 | \usage{ 7 | modNormalize(object, MARGIN = c(1, 2)) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo pre-processing.} 11 | 12 | \item{MARGIN}{A numeric vector. The margin by which to normalize. 13 | Provide \code{MARGIN = 1} to normalize the feature vector. 14 | Provide \code{MARGIN = 2} to normalize the subject vector. 15 | Provide \code{MARGIN = c(1, 2)} to normalize by the subject vector 16 | and then by the feature vector.} 17 | } 18 | \value{ 19 | An \code{ExprsArray} object. 20 | } 21 | \description{ 22 | \code{modNormalize} normalizes feature data.
23 | } 24 | \details{ 25 | This method normalizes subject and/or feature vectors according to the 26 | formula \code{y = (x - mean(x)) / sd(x)}. 27 | } 28 | -------------------------------------------------------------------------------- /man/modPermute.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modPermute} 4 | \alias{modPermute} 5 | \title{Permute Features in Data} 6 | \usage{ 7 | modPermute(object) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo pre-processing.} 11 | } 12 | \value{ 13 | An \code{ExprsArray} object. 14 | } 15 | \description{ 16 | \code{modPermute} randomly shuffles each feature across 17 | all samples in the data. Using permuted data can establish 18 | a null model for testing the significance of prediction 19 | error estimates. This approach preserves the univariate 20 | distributions, but will change the multivariate 21 | distribution. 22 | } 23 | -------------------------------------------------------------------------------- /man/modRatios.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modRatios} 4 | \alias{modRatios} 5 | \title{Recast Data as Feature (Log-)Ratios} 6 | \usage{ 7 | modRatios(object, alpha = NA) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo 11 | pre-processing.} 12 | 13 | \item{alpha}{A numeric scalar. This argument guides 14 | a Box-Cox transformation to approximate log-ratios in the 15 | presence of zeros. Skip with \code{NA}.} 16 | } 17 | \value{ 18 | An \code{ExprsArray} object.
19 | } 20 | \description{ 21 | \code{modRatios} recasts a data set with N feature columns as a new 22 | data set with N * (N - 1) / 2 feature (log-)ratio columns. 23 | } 24 | -------------------------------------------------------------------------------- /man/modSample.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modSample} 4 | \alias{modSample} 5 | \title{Sample Features from Data} 6 | \usage{ 7 | modSample(object, size = 0) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo pre-processing.} 11 | 12 | \item{size}{A numeric scalar. The number of randomly sampled features 13 | to include in the result.} 14 | } 15 | \value{ 16 | An \code{ExprsArray} object. 17 | } 18 | \description{ 19 | \code{modSample} samples features from a data set randomly without 20 | replacement. When \code{size = 0}, this is equivalent to 21 | \code{fsSample, top = 0}, but probably quicker. 22 | } 23 | -------------------------------------------------------------------------------- /man/modScale.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modScale} 4 | \alias{modScale} 5 | \title{Scale Data by Factor Range} 6 | \usage{ 7 | modScale(object, alpha = 0, uniform = TRUE) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The object that should undergo a 11 | replication of the feature selection history.} 12 | 13 | \item{alpha}{An integer. The maximum range of scale factors used 14 | for scaling if \code{uniform = TRUE}. The standard deviation 15 | of the scale factors if \code{uniform = FALSE}. See Details.} 16 | 17 | \item{uniform}{A boolean. 
Toggles whether to draw scale factors 18 | from a uniform distribution or a normal distribution.} 19 | } 20 | \value{ 21 | An \code{ExprsArray} object. 22 | } 23 | \description{ 24 | \code{modScale} scales a data set by making all sample vectors 25 | have the same total sum, then multiplying each sample vector by 26 | a scale factor. 27 | } 28 | \details{ 29 | If \code{uniform = TRUE}, scale factors are randomly sampled from 30 | the uniform distribution \code{(0, alpha) + 1}. Otherwise, scale 31 | factors are randomly sampled from the normal distribution with 32 | a mean of 0 and standard deviation of \code{alpha}. When using 33 | the normal distribution, these scale factors are transformed by 34 | taking the absolute value then adding one. For this reason, 35 | data are always unscaled when \code{alpha = 0}. 36 | } 37 | -------------------------------------------------------------------------------- /man/modSkew.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modSkew} 4 | \alias{modSkew} 5 | \title{Skew Data by Factor Range} 6 | \usage{ 7 | modSkew(object, alpha = 0, uniform = TRUE) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object. The object that should undergo a 11 | replication of the feature selection history.} 12 | 13 | \item{alpha}{An integer. The maximum range of skew factors used 14 | for skewing if \code{uniform = TRUE}. The standard deviation 15 | of the skew factors if \code{uniform = FALSE}. See Details.} 16 | 17 | \item{uniform}{A boolean. Toggles whether to draw skew factors 18 | from a uniform distribution or a normal distribution.} 19 | } 20 | \value{ 21 | An \code{ExprsArray} object. 
22 | } 23 | \description{ 24 | \code{modSkew} skews a data set by making all sample vectors 25 | have the same total sum, introducing a new feature, and then 26 | making all sample vectors again have the same total sum. 27 | } 28 | \details{ 29 | If \code{uniform = TRUE}, skew factors are randomly sampled from 30 | the uniform distribution \code{(0, alpha) + 1}. Otherwise, skew 31 | factors are randomly sampled from the normal distribution with 32 | a mean of 0 and standard deviation of \code{alpha}. When using 33 | the normal distribution, these skew factors are transformed by 34 | taking the absolute value then adding one. For this reason, 35 | data are always unskewed when \code{alpha = 0}. 36 | } 37 | -------------------------------------------------------------------------------- /man/modSubset.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.2-methods.R 3 | \name{modSubset} 4 | \alias{modSubset} 5 | \alias{pipeSubset} 6 | \title{Tidy Subset Wrapper} 7 | \usage{ 8 | modSubset(object, colBy, include) 9 | 10 | pipeSubset(object, colBy, include) 11 | } 12 | \arguments{ 13 | \item{object}{An \code{ExprsArray} or \code{ExprsPipeline} object to subset.} 14 | 15 | \item{colBy}{A numeric or character index. The column that contains group annotations.} 16 | 17 | \item{include}{A character vector. Specifies which annotations in \code{colBy} 18 | to include in the subset.} 19 | } 20 | \value{ 21 | An \code{ExprsArray} or \code{ExprsPipeline} object. 22 | } 23 | \description{ 24 | \code{modSubset} function provides a tidy wrapper for the \code{ExprsArray} 25 | \code{subset} method. \code{pipeSubset} provides a tidy wrapper for the 26 | \code{ExprsPipeline} \code{subset} method. 27 | } 28 | \section{Functions}{ 29 | \itemize{ 30 | \item \code{pipeSubset}: A variant of \code{modSubset}. 
31 | }} 32 | 33 | -------------------------------------------------------------------------------- /man/modSwap.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-deprecated.R 3 | \docType{methods} 4 | \name{modSwap} 5 | \alias{modSwap} 6 | \alias{modSwap,ExprsBinary-method} 7 | \title{Swap Case Subjects} 8 | \usage{ 9 | modSwap(object, how = "fp", percent = 10, theta = 1) 10 | 11 | \S4method{modSwap}{ExprsBinary}(object, how = "fp", percent = 10, 12 | theta = 1) 13 | } 14 | \arguments{ 15 | \item{object}{An \code{ExprsBinary} object to mutate.} 16 | 17 | \item{how}{A character string. The method used to mutate case subjects. Select from 18 | "rp.1", "rp.2", "fp", "ng", or "tg". Alternatively, another \code{ExprsBinary} 19 | object. See Details.} 20 | 21 | \item{percent}{A numeric scalar. The percentage of subjects to mutate.} 22 | 23 | \item{theta}{A numeric scalar. Applies a weight to the distribution of means when 24 | mutating subjects via the "ng" or "tg" method.} 25 | } 26 | \value{ 27 | An \code{ExprsBinary} object containing mutated subjects with an index 28 | appended to the \code{$mutated} column of the \code{@annot} slot. 29 | } 30 | \description{ 31 | This experimental function mutates a percentage of case subjects 32 | into noisy positives, false positives, or defined out-groups. 33 | } 34 | \details{ 35 | This function includes several methods for distorting the features of \code{ExprsBinary} 36 | subjects. The "rp.1" method randomizes subject vectors to create "subject noise". 37 | The "rp.2" method creates a new subject vector by randomly sampling feature values 38 | from the respective feature vector. The "fp" method creates a new subject vector 39 | by randomly sampling feature values from the respective control feature vector. 40 | 41 | The "ng" and "tg" methods create out-groups by defining new means for each feature. 
42 | These methods yield fixed distributions around new feature means such that 43 | the mean of all new feature means remains constant. The argument \code{theta} 44 | dictates how much the new feature mean might differ from the original feature mean 45 | (where larger \code{theta} values lead to more similar new feature means). For 46 | the "ng" method, the mean of new feature means equals that of the original features 47 | for case subjects only. On the other hand, for the "tg" method, the mean of new 48 | feature means equals that of the original features for all subjects. 49 | 50 | Alternatively, by providing another \code{ExprsBinary} object as the \code{how} 51 | argument, this function will swap a percentage of case subjects from the main dataset 52 | with control subjects from the second dataset. 53 | } 54 | \section{Methods (by class)}{ 55 | \itemize{ 56 | \item \code{ExprsBinary}: A method to mutate \code{ExprsBinary} objects. 57 | }} 58 | 59 | -------------------------------------------------------------------------------- /man/modTMM.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modTMM} 4 | \alias{modTMM} 5 | \title{Normalize Data} 6 | \usage{ 7 | modTMM(object, method = "TMM") 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo pre-processing.} 11 | 12 | \item{method}{A character string. The method used by \code{calcNormFactors}. 13 | Defaults to the "TMM" method.} 14 | } 15 | \value{ 16 | An \code{ExprsArray} object. 17 | } 18 | \description{ 19 | \code{modTMM} normalizes feature data. 20 | } 21 | \details{ 22 | This method normalizes data using the \code{calcNormFactors} function 23 | from the \code{edgeR} package. It returns the original counts 24 | multiplied by the effective library size factors. 
25 | } 26 | -------------------------------------------------------------------------------- /man/modTransform.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/3-mod.R 3 | \name{modTransform} 4 | \alias{modTransform} 5 | \title{Log Transform Data} 6 | \usage{ 7 | modTransform(object, base = exp(1)) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to undergo pre-processing.} 11 | 12 | \item{base}{A numeric scalar. The base of the logarithm.} 13 | } 14 | \value{ 15 | An \code{ExprsArray} object. 16 | } 17 | \description{ 18 | \code{modTransform} log transforms feature data. 19 | } 20 | -------------------------------------------------------------------------------- /man/nfeats.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.2-methods.R 3 | \name{nfeats} 4 | \alias{nfeats} 5 | \title{Get Number of Features} 6 | \usage{ 7 | nfeats(object) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object.} 11 | } 12 | \description{ 13 | This function returns the number of features in an \code{ExprsArray} object. 14 | } 15 | -------------------------------------------------------------------------------- /man/nsamps.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.2-methods.R 3 | \name{nsamps} 4 | \alias{nsamps} 5 | \title{Get Number of Samples} 6 | \usage{ 7 | nsamps(object) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object.} 11 | } 12 | \description{ 13 | This function returns the number of samples in an \code{ExprsArray} object. 
14 | } 15 | -------------------------------------------------------------------------------- /man/packageCheck.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{packageCheck} 4 | \alias{packageCheck} 5 | \title{Package Check} 6 | \usage{ 7 | packageCheck(package) 8 | } 9 | \arguments{ 10 | \item{package}{A character string. An R package.} 11 | } 12 | \description{ 13 | Checks whether the user has the required package installed. 14 | For back-end use only. 15 | } 16 | -------------------------------------------------------------------------------- /man/pipe.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.3-exprso.R 3 | \name{pipe} 4 | \alias{pipe} 5 | \title{Process Pipelines} 6 | \description{ 7 | The \code{exprso} package includes these pipeline process modules: 8 | 9 | - \code{\link{pipeSubset}} 10 | 11 | - \code{\link{pipeFilter}} 12 | 13 | - \code{\link{pipeUnboot}} 14 | } 15 | -------------------------------------------------------------------------------- /man/pipeFilter.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/8.1-pipe.R 3 | \name{pipeFilter} 4 | \alias{pipeFilter} 5 | \title{Filter \code{ExprsPipeline} Object} 6 | \usage{ 7 | pipeFilter(object, colBy, how = 0, gate = 0, top = 0) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{\link{ExprsPipeline-class}} object.} 11 | 12 | \item{colBy}{A character vector or string. Specifies column(s) to use when 13 | filtering by model performance. Listing multiple columns will result 14 | in a filter based on the product of all listed columns.} 15 | 16 | \item{how, gate}{A numeric scalar.
Arguments between 0 and 1 will impose 17 | a threshold or ceiling filter, respectively, based on the raw value of 18 | \code{colBy}. Arguments between 1 and 100 will impose a filter based on 19 | the percentile of \code{colBy}. The user may also provide "midrange", 20 | "median", or "mean" as an argument for these filters.} 21 | 22 | \item{top}{A numeric scalar. Determines the top N models based on 23 | \code{colBy} to include after the threshold and ceiling filters. 24 | In the case that the \code{@summary} slot contains the column "boot", 25 | this selects the top N models for each unique bootstrap.} 26 | } 27 | \value{ 28 | An \code{\link{ExprsPipeline-class}} object. 29 | } 30 | \description{ 31 | \code{pipeFilter} subsets an \code{ExprsPipeline} object. 32 | } 33 | \details{ 34 | The filter process occurs in three steps. However, the user may skip 35 | any one of these steps by setting the respective argument to \code{0}. 36 | First, a threshold filter gets imposed. Any model with a performance 37 | less than the threshold filter, \code{how}, gets excluded. Second, 38 | a ceiling filter gets imposed. Any model with a performance greater 39 | than the ceiling filter, \code{gate}, gets excluded. Third, an 40 | arbitrary subset occurs. The top N models in the \code{ExprsPipeline} 41 | object get selected based on the argument \code{top}. However, 42 | in the case that the \code{@summary} slot contains the column "boot", 43 | \code{pipeFilter} selects the top N models per bootstrap. 44 | 45 | \code{pipeFilter} will apply this filter based on the performance 46 | metrics listed in the \code{colBy} argument. Listing multiple columns 47 | will result in a filter based on the product of all listed columns. 48 | To weigh one metric over another, list that column more times. 
49 | } 50 | -------------------------------------------------------------------------------- /man/pipeUnboot.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/8.1-pipe.R 3 | \name{pipeUnboot} 4 | \alias{pipeUnboot} 5 | \title{Rename "boot" Column} 6 | \usage{ 7 | pipeUnboot(object) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{\link{ExprsPipeline-class}} object.} 11 | } 12 | \value{ 13 | An \code{\link{ExprsPipeline-class}} object. 14 | } 15 | \description{ 16 | \code{pipeUnboot} renames the "boot" column summary to "unboot". 17 | } 18 | \details{ 19 | This method provides a convenient adjunct to \code{\link{pipeFilter}} owing to 20 | how \code{exprso} handles \code{ExprsPipeline} objects with a "boot" column. 21 | } 22 | -------------------------------------------------------------------------------- /man/pl.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.3-exprso.R 3 | \name{pl} 4 | \alias{pl} 5 | \title{Deploy Pipeline} 6 | \description{ 7 | The \code{exprso} package includes these automated pipeline modules: 8 | 9 | - \code{\link{plCV}} 10 | 11 | - \code{\link{plGrid}} 12 | 13 | - \code{\link{plMonteCarlo}} 14 | 15 | - \code{\link{plNested}} 16 | } 17 | -------------------------------------------------------------------------------- /man/plCV.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/7.1-plCV.R 3 | \name{plCV} 4 | \alias{plCV} 5 | \title{Perform Simple Cross-Validation} 6 | \usage{ 7 | plCV(array, top, how, fold, aucSkip, plCV.acc, ...) 
8 | } 9 | \arguments{ 10 | \item{array}{Specifies the \code{ExprsArray} object to undergo cross-validation.} 11 | 12 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 13 | the number of top features that should undergo feature selection. A character vector 14 | indicates specifically which features by name should undergo feature selection. 15 | Set \code{top = 0} to include all features. Note that providing a numeric vector 16 | for the \code{top} argument will have \code{plGrid} search across multiple 17 | top features. However, by providing a list of numeric vectors as the \code{top} 18 | argument, the user can force the default handling of numeric vectors.} 19 | 20 | \item{how}{A character string. The \code{\link{build}} method to iterate.} 21 | 22 | \item{fold}{A numeric scalar. The number of folds for cross-validation. 23 | Set \code{fold = 0} to perform leave-one-out cross-validation. Argument passed 24 | to \code{\link{plCV}}. Set \code{fold = NULL} to skip cross-validation altogether.} 25 | 26 | \item{aucSkip}{A logical scalar. Argument passed to \code{\link{calcStats}}.} 27 | 28 | \item{plCV.acc}{A string. The performance metric to use. For example, 29 | choose from "acc", "sens", "spec", "prec", "f1", "auc", or any of the 30 | regression specific measures. Argument passed to \code{\link{plCV}}.} 31 | 32 | \item{...}{Arguments passed to the \code{how} method. Unlike the \code{build} method, 33 | \code{plGrid} allows multiple parameters for each argument, supplied as a vector. 34 | See Details.} 35 | } 36 | \value{ 37 | The average inner-fold cross-validation accuracy. 38 | } 39 | \description{ 40 | Calculates v-fold or leave-one-out cross-validation without selecting a new 41 | set of features with each fold. See Details. 42 | } 43 | \details{ 44 | \code{plCV} performs v-fold or leave-one-out cross-validation. The argument 45 | \code{fold} specifies the number of v-folds to use during cross-validation. 
46 | Set \code{fold = 0} to perform leave-one-out cross-validation. 47 | 48 | This type of cross-validation is most appropriate if the data 49 | has not undergone any prior feature selection. However, it is also useful 50 | as an unbiased guide to parameter selection within another 51 | \code{\link{pl}} workflow. 52 | 53 | Users should never need to call this function directly. Instead, they 54 | should use \code{\link{plMonteCarlo}} or \code{\link{plNested}}. 55 | There, \code{plCV} handles inner-fold cross-validation. 56 | } 57 | -------------------------------------------------------------------------------- /man/plGrid.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/7.2-plGrid.R 3 | \name{plGrid} 4 | \alias{plGrid} 5 | \title{Perform High-Throughput Machine Learning} 6 | \usage{ 7 | plGrid(array.train, array.valid = NULL, how, top = 0, fold = 10, 8 | aucSkip = FALSE, plCV.acc = "acc", verbose = FALSE, ...) 9 | } 10 | \arguments{ 11 | \item{array.train}{The \code{ExprsArray} object to use as training set.} 12 | 13 | \item{array.valid}{The \code{ExprsArray} object to use as validation set.} 14 | 15 | \item{how}{A character string. The \code{\link{build}} method to iterate.} 16 | 17 | \item{top}{A numeric scalar or character vector. A numeric scalar indicates 18 | the number of top features that should undergo feature selection. A character vector 19 | indicates specifically which features by name should undergo feature selection. 20 | Set \code{top = 0} to include all features. Note that providing a numeric vector 21 | for the \code{top} argument will have \code{plGrid} search across multiple 22 | top features. However, by providing a list of numeric vectors as the \code{top} 23 | argument, the user can force the default handling of numeric vectors.} 24 | 25 | \item{fold}{A numeric scalar. The number of folds for cross-validation. 
26 | Set \code{fold = 0} to perform leave-one-out cross-validation. Argument passed 27 | to \code{\link{plCV}}. Set \code{fold = NULL} to skip cross-validation altogether.} 28 | 29 | \item{aucSkip}{A logical scalar. Argument passed to \code{\link{calcStats}}.} 30 | 31 | \item{plCV.acc}{A string. The performance metric to use. For example, 32 | choose from "acc", "sens", "spec", "prec", "f1", "auc", or any of the 33 | regression specific measures. Argument passed to \code{\link{plCV}}.} 34 | 35 | \item{verbose}{A logical scalar. Toggles whether to print to console.} 36 | 37 | \item{...}{Arguments passed to the \code{how} method. Unlike the \code{build} method, 38 | \code{plGrid} allows multiple parameters for each argument, supplied as a vector. 39 | See Details.} 40 | } 41 | \value{ 42 | An \code{\link{ExprsPipeline-class}} object. 43 | } 44 | \description{ 45 | Trains and deploys models across a vast parameter search space. 46 | } 47 | \details{ 48 | \code{plGrid} will \code{\link{build}} and \code{\link{exprso-predict}} for 49 | each combination of parameters provided as additional arguments (\code{...}). 50 | When using \code{plGrid}, supplying a numeric vector as the \code{top} 51 | argument will train and deploy a model of each mentioned size for 52 | each combination of parameters provided. 53 | 54 | To skip validation set prediction, use \code{array.valid = NULL}. 55 | Either way, this function returns an \code{\link{ExprsPipeline-class}} 56 | object which contains a summary of the build parameters and the models 57 | themselves. The argument \code{fold} controls inner-fold 58 | cross-validation via \code{\link{plCV}}. Use this to 59 | select the best model unbiasedly. 
60 | } 61 | -------------------------------------------------------------------------------- /man/plMonteCarlo.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/7.3-plMonteCarlo.R 3 | \name{plMonteCarlo} 4 | \alias{plMonteCarlo} 5 | \title{Monte Carlo Cross-Validation} 6 | \usage{ 7 | plMonteCarlo(array, B = 10, ctrlSS, ctrlFS = NULL, ctrlGS, 8 | ctrlMS = NULL, save = FALSE) 9 | } 10 | \arguments{ 11 | \item{array}{Specifies the \code{ExprsArray} object to undergo cross-validation.} 12 | 13 | \item{B}{A numeric scalar. The number of times to \code{split} the data.} 14 | 15 | \item{ctrlSS}{Arguments handled by \code{\link{ctrlSplitSet}}.} 16 | 17 | \item{ctrlFS}{A list of arguments handled by \code{\link{ctrlFeatureSelect}}.} 18 | 19 | \item{ctrlGS}{Arguments handled by \code{\link{ctrlGridSearch}}.} 20 | 21 | \item{ctrlMS}{Arguments handled by \code{\link{ctrlModSet}}. Optional.} 22 | 23 | \item{save}{A logical scalar. Toggles whether to save randomly split 24 | training and validation sets.} 25 | } 26 | \value{ 27 | An \code{\link{ExprsPipeline-class}} object. 28 | } 29 | \description{ 30 | Perform Monte Carlo style cross-validation. 31 | } 32 | \details{ 33 | Analogous to how \code{\link{plGrid}} manages multiple \code{build} and 34 | \code{predict} tasks, one can think of \code{plMonteCarlo} as managing 35 | multiple \code{pl} tasks. 36 | 37 | Specifically, \code{plMonteCarlo} will call the provided \code{split} 38 | function (via \code{ctrlSS}) some \code{B} times, perform all 39 | feature selection tasks (listed via \code{ctrlFS}) on each split of 40 | the data, and execute the \code{pl} function (via \code{ctrlGS}) 41 | using the bootstrapped set. 42 | 43 | To perform multiple feature selection tasks, supply a list of multiple 44 | \code{\link{ctrlFeatureSelect}} argument wrappers to \code{ctrlFS}. 
45 | To reduce the results of \code{plMonteCarlo} to a single performance metric, 46 | you can feed the returned \code{ExprsPipeline} object through the helper 47 | function \code{\link{calcMonteCarlo}}. 48 | 49 | When embedding another \code{plMonteCarlo} or \code{plNested} call within 50 | this function (i.e., via \code{ctrlGS}), outer-fold model performance 51 | will force \code{aucSkip = TRUE} and \code{plotSkip = TRUE}. 52 | } 53 | \examples{ 54 | \dontrun{ 55 | require(golubEsets) 56 | data(Golub_Merge) 57 | array <- arrayEset(Golub_Merge, colBy = "ALL.AML", include = list("ALL", "AML")) 58 | array <- modFilter(array, 20, 16000, 500, 5) # pre-filter Golub ala Deb 2003 59 | array <- modTransform(array) # lg transform 60 | array <- modNormalize(array, c(1, 2)) # normalize gene and subject vectors 61 | ss <- ctrlSplitSet(func = "splitStratify", percent.include = 67, colBy = NULL) 62 | fs <- list(ctrlFeatureSelect(func = "fsStats", top = 0, how = "t.test"), 63 | ctrlFeatureSelect(func = "fsPrcomp", top = 50)) 64 | gs <- ctrlGridSearch(func = "plGrid", how = "buildSVM", top = c(2, 3, 4), fold = 10, 65 | kernel = c("linear", "radial"), cost = 10^(-3:3), gamma = 10^(-3:3)) 66 | boot <- plMonteCarlo(array, B = 3, ctrlSS = ss, ctrlFS = fs, ctrlGS = gs) 67 | } 68 | } 69 | -------------------------------------------------------------------------------- /man/plNested.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/7.4-plNested.R 3 | \name{plNested} 4 | \alias{plNested} 5 | \title{Nested Cross-Validation} 6 | \usage{ 7 | plNested(array, fold = 10, ctrlFS = NULL, ctrlGS, save = FALSE) 8 | } 9 | \arguments{ 10 | \item{array}{Specifies the \code{ExprsArray} object to undergo cross-validation.} 11 | 12 | \item{fold}{A numeric scalar. Specifies the number of folds for cross-validation. 
13 | Set \code{fold = 0} to perform leave-one-out cross-validation.} 14 | 15 | \item{ctrlFS}{A list of arguments handled by \code{\link{ctrlFeatureSelect}}.} 16 | 17 | \item{ctrlGS}{Arguments handled by \code{\link{ctrlGridSearch}}.} 18 | 19 | \item{save}{A logical scalar. Toggles whether to save each fold.} 20 | } 21 | \value{ 22 | An \code{\link{ExprsPipeline-class}} object. 23 | } 24 | \description{ 25 | Perform nested cross-validation. 26 | } 27 | \details{ 28 | Analogous to how \code{\link{plGrid}} manages multiple \code{build} and 29 | \code{predict} tasks, one can think of \code{plNested} as managing 30 | multiple \code{pl} tasks. 31 | 32 | Specifically, \code{plNested} segregates the data into v-folds, 33 | treating each fold as a validation set and the subjects not in that fold 34 | as a training set. Then, some \code{fold} times, it performs all 35 | feature selection tasks (listed via \code{ctrlFS}) on each split 36 | of the data, and executes the \code{pl} function (via \code{ctrlGS}) 37 | using the training set. 38 | 39 | To perform multiple feature selection tasks, supply a list of multiple 40 | \code{\link{ctrlFeatureSelect}} argument wrappers to \code{ctrlFS}. 41 | To reduce the results of \code{plNested} to a single performance metric, 42 | you can feed the returned \code{ExprsPipeline} object through the helper 43 | function \code{\link{calcNested}}. 44 | 45 | When calculating model performance with \code{\link{calcStats}}, this 46 | function forces \code{aucSkip = TRUE} and \code{plotSkip = TRUE}. 47 | When embedding another \code{plMonteCarlo} or \code{plNested} call within 48 | this function (i.e., via \code{ctrlGS}), outer-fold model performance 49 | will force \code{aucSkip = TRUE} and \code{plotSkip = TRUE}. 
50 | } 51 | \examples{ 52 | \dontrun{ 53 | require(golubEsets) 54 | data(Golub_Merge) 55 | array <- arrayEset(Golub_Merge, colBy = "ALL.AML", include = list("ALL", "AML")) 56 | array <- modFilter(array, 20, 16000, 500, 5) # pre-filter Golub as in Deb and Reddy (2003) 57 | array <- modTransform(array) # log transform 58 | array <- modNormalize(array, c(1, 2)) # normalize gene and subject vectors 59 | fs <- ctrlFeatureSelect(func = "fsEbayes", top = 0) 60 | gs <- ctrlGridSearch(func = "plGrid", how = "buildANN", top = c(10, 20, 30), 61 | size = 1:3, decay = c(0, .5, 1), fold = 0) 62 | nest <- plNested(array, fold = 10, ctrlFS = fs, ctrlGS = gs, save = FALSE) 63 | } 64 | } 65 | -------------------------------------------------------------------------------- /man/progress.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/9-global.R 3 | \name{progress} 4 | \alias{progress} 5 | \title{Make Progress Bar} 6 | \usage{ 7 | progress(i, k, numTicks) 8 | } 9 | \arguments{ 10 | \item{i}{The current iteration.} 11 | 12 | \item{k}{Total iterations.} 13 | 14 | \item{numTicks}{The result of \code{progress}.} 15 | } 16 | \value{ 17 | The next \code{numTicks} argument.
18 | } 19 | \description{ 20 | Make Progress Bar 21 | } 22 | -------------------------------------------------------------------------------- /man/split.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.3-exprso.R 3 | \name{split} 4 | \alias{split} 5 | \title{Split Data} 6 | \description{ 7 | The \code{exprso} package includes these split modules: 8 | 9 | - \code{\link{splitSample}} 10 | 11 | - \code{\link{splitStratify}} 12 | 13 | - \code{\link{splitBalanced}} 14 | 15 | - \code{\link{splitBoost}} 16 | 17 | - \code{\link{splitBy}} 18 | } 19 | -------------------------------------------------------------------------------- /man/splitBalanced.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/4-split.R 3 | \name{splitBalanced} 4 | \alias{splitBalanced} 5 | \title{Split by Balanced Sampling} 6 | \usage{ 7 | splitBalanced(object, percent.include = 67, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to split.} 11 | 12 | \item{percent.include}{Specifies the percent of the total number 13 | of subjects to include in the training set.} 14 | 15 | \item{...}{Arguments passed to both \code{splitStratify} calls.} 16 | } 17 | \value{ 18 | Returns a list of two \code{ExprsArray} objects. 19 | } 20 | \description{ 21 | \code{splitBalanced} is a wrapper that calls \code{splitStratify} 22 | twice. In the first call, \code{splitStratify} is used to create a 23 | balanced training set from the total data. In the second call, 24 | \code{splitStratify} is used to create a balanced validation set 25 | from the leftover data. This function ensures that there is always 26 | an equal number of samples from each class in the split.
27 | } 28 | -------------------------------------------------------------------------------- /man/splitBoost.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/4-split.R 3 | \name{splitBoost} 4 | \alias{splitBoost} 5 | \title{Sample by Boosting} 6 | \usage{ 7 | splitBoost(object, percent.include = 67) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to split.} 11 | 12 | \item{percent.include}{Specifies the percent of the total number 13 | of subjects to include in the training set (i.e., based on the larger group). 14 | Subjects from the smaller group are up-sampled to match this number.} 15 | } 16 | \description{ 17 | \code{splitBoost} builds a training and validation set by randomly up-sampling 18 | (with replacement) the smaller of two classes. This results in an equal 19 | representation of each class in the training set. For example, given 30 cases and 20 | 3 controls, a 2/3 split would place 20 cases and 20 controls in the training set. 21 | Of these 20 controls, only 2 are unique. The test set is not boosted. In this 22 | example, the test set would contain 10 cases and 1 control. 23 | } 24 | -------------------------------------------------------------------------------- /man/splitBy.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/4-split.R 3 | \name{splitBy} 4 | \alias{splitBy} 5 | \title{Split by User-defined Group} 6 | \usage{ 7 | splitBy(object, colBy, include) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to split.} 11 | 12 | \item{colBy}{A character string. Specifies the column used to split the data.} 13 | 14 | \item{include}{A character vector. 
Specifies which annotations in \code{colBy} 15 | to include in the training set.} 16 | } 17 | \value{ 18 | Returns a list of two \code{ExprsArray} objects. 19 | } 20 | \description{ 21 | \code{splitBy} builds a training set and validation set by placing 22 | all samples that have the \code{include} annotation in the specified 23 | \code{colBy} column in the training set. The remaining samples get 24 | placed in the validation set. This \code{split} is not random. 25 | } 26 | -------------------------------------------------------------------------------- /man/splitSample.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/4-split.R 3 | \name{splitSample} 4 | \alias{splitSample} 5 | \title{Split by Random Sampling} 6 | \usage{ 7 | splitSample(object, percent.include = 67, ...) 8 | } 9 | \arguments{ 10 | \item{object}{An \code{ExprsArray} object to split.} 11 | 12 | \item{percent.include}{Specifies the percent of the total number 13 | of subjects to include in the training set.} 14 | 15 | \item{...}{For \code{splitSample}: additional arguments passed 16 | along to \code{\link{sample}}. For \code{splitStratify}: additional 17 | arguments passed along to \code{\link{cut}}.} 18 | } 19 | \value{ 20 | Returns a list of two \code{ExprsArray} objects. 21 | } 22 | \description{ 23 | \code{splitSample} builds a training and validation set by randomly sampling 24 | the subjects found within the \code{ExprsArray} object. Note that this method 25 | is not truly random. Instead, \code{splitSample} iterates through the random sampling 26 | process until it settles on a solution such that both the training and validation set 27 | contain at least one subject for each class label. If this method finds no solution 28 | after 10 iterations, the function will post an error. 
Set \code{percent.include = 100} 29 | to skip random sampling and return a \code{NULL} validation set. Additional arguments 30 | (e.g., \code{replace = TRUE}) are passed along to \code{\link{sample}}. 31 | } 32 | -------------------------------------------------------------------------------- /man/splitStratify.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/4-split.R 3 | \name{splitStratify} 4 | \alias{splitStratify} 5 | \title{Split by Stratified Sampling} 6 | \usage{ 7 | splitStratify(object, percent.include = 67, colBy = NULL, 8 | bin = rep(FALSE, length(colBy)), breaks = rep(list(NA), 9 | length(colBy)), ...) 10 | } 11 | \arguments{ 12 | \item{object}{An \code{ExprsArray} object to split.} 13 | 14 | \item{percent.include}{Specifies the percent of the total number 15 | of subjects to include in the training set.} 16 | 17 | \item{colBy}{Specifies a vector of column names by which to stratify in 18 | addition to the class label annotation. If \code{colBy = NULL}, random 19 | sampling will occur across the class label annotation only. 20 | For \code{splitStratify} only.} 21 | 22 | \item{bin}{A logical vector indicating whether to bin the respective 23 | \code{colBy} column using \code{cut} (e.g., \code{bin = c(FALSE, TRUE)}). 24 | For \code{splitStratify} only.} 25 | 26 | \item{breaks}{A list. Each element of the list should correspond to a 27 | \code{breaks} argument passed to \code{cut} for the respective 28 | \code{colBy} column. Set an element to \code{NA} when not binning 29 | that \code{colBy}. For \code{splitStratify} only.} 30 | 31 | \item{...}{For \code{splitSample}: additional arguments passed 32 | along to \code{\link{sample}}. For \code{splitStratify}: additional 33 | arguments passed along to \code{\link{cut}}.} 34 | } 35 | \value{ 36 | Returns a list of two \code{ExprsArray} objects. 
37 | } 38 | \description{ 39 | \code{splitStratify} builds a training and validation set through a stratified 40 | random sampling process. This function utilizes the \code{strata} function from the 41 | sampling package as well as the \code{cut} function from the base package. The latter 42 | function provides a means by which to bin continuous data prior to stratified random 43 | sampling. We refer the user to the parameter descriptions to learn the specifics of 44 | how to apply binning, although the user might find it easier to instead bin 45 | annotations beforehand. When applied to an \code{ExprsMulti} object, this function 46 | stratifies subjects across all classes found in that dataset. 47 | } 48 | -------------------------------------------------------------------------------- /man/trainingSet.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.2-methods.R 3 | \name{trainingSet} 4 | \alias{trainingSet} 5 | \title{Extract Training Set} 6 | \usage{ 7 | trainingSet(splitSets) 8 | } 9 | \arguments{ 10 | \item{splitSets}{A two-item list. The result of a \code{split} method call.} 11 | } 12 | \value{ 13 | An \code{ExprsArray} object. 14 | } 15 | \description{ 16 | This function extracts the training set from the result of a 17 | \code{split} method call such as \code{splitSample} or \code{splitStratify}. 18 | } 19 | -------------------------------------------------------------------------------- /man/validationSet.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/1.2-methods.R 3 | \name{validationSet} 4 | \alias{validationSet} 5 | \alias{testSet} 6 | \title{Extract Validation Set} 7 | \usage{ 8 | validationSet(splitSets) 9 | 10 | testSet(splitSets) 11 | } 12 | \arguments{ 13 | \item{splitSets}{A two-item list. 
The result of a \code{split} method call.} 14 | } 15 | \value{ 16 | An \code{ExprsArray} object. 17 | } 18 | \description{ 19 | This function extracts the validation set from the result of a 20 | \code{split} method call such as \code{splitSample} or \code{splitStratify}. 21 | } 22 | \section{Functions}{ 23 | \itemize{ 24 | \item \code{testSet}: A variant of \code{validationSet}. 25 | }} 26 | 27 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(exprso) 3 | 4 | test_check("exprso") 5 | -------------------------------------------------------------------------------- /tests/testthat/data.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tpq/exprso/c4a0eb6412833abe216b61c6ca53737bc8f53c5b/tests/testthat/data.RData -------------------------------------------------------------------------------- /tests/testthat/test-1.1-classes.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | 3 | binary <- exprso(iris[1:100,1:4], iris[1:100,5]) 4 | multi <- exprso(iris[,1:4], iris[,5]) 5 | cont <- exprso(iris[,1:3], iris[,4]) 6 | 7 | pureclass <- function(data){ 8 | expect_true(attr(class(data), "package") == "exprso") 9 | class(data)[1] 10 | } 11 | 12 | test_that("exprso objects work", { 13 | 14 | expect_equal( 15 | pureclass(binary), 16 | "ExprsBinary" 17 | ) 18 | 19 | expect_equal( 20 | pureclass(multi), 21 | "ExprsMulti" 22 | ) 23 | 24 | expect_equal( 25 | pureclass(cont), 26 | "RegrsArray" 27 | ) 28 | 29 | expect_true( 30 | inherits(binary, "ExprsArray") 31 | ) 32 | 33 | expect_true( 34 | inherits(multi, "ExprsArray") 35 | ) 36 | 37 | expect_true( 38 | inherits(cont, "ExprsArray") 39 | ) 40 | 41 | expect_equal( 42 | pureclass(buildRF(binary)), 43 | "ExprsMachine" 44 | ) 45 | 46 | 
expect_equal( 47 | pureclass(buildRF(multi)), 48 | "ExprsModule" 49 | ) 50 | 51 | expect_equal( 52 | pureclass(buildRF(cont)), 53 | "RegrsModel" 54 | ) 55 | 56 | expect_true( 57 | inherits(buildRF(binary), "ExprsModel") 58 | ) 59 | 60 | expect_true( 61 | inherits(buildRF(multi), "ExprsModel") 62 | ) 63 | 64 | expect_true( 65 | inherits(buildRF(cont), "ExprsModel") 66 | ) 67 | 68 | expect_equal( 69 | pureclass(predict(buildRF(binary), binary)), 70 | "ExprsPredict" 71 | ) 72 | 73 | expect_equal( 74 | pureclass(predict(buildRF(multi), multi)), 75 | "MultiPredict" 76 | ) 77 | 78 | expect_equal( 79 | pureclass(predict(buildRF(cont), cont)), 80 | "RegrsPredict" 81 | ) 82 | }) 83 | -------------------------------------------------------------------------------- /tests/testthat/test-2-conjoin.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | 3 | binary <- exprso(iris[1:100,1:4], iris[1:100,5]) 4 | multi <- exprso(iris[,1:4], iris[,5]) 5 | cont <- exprso(iris[,1:3], iris[,4]) 6 | 7 | build.binary <- plGrid(binary, how = "buildSVM", cost = 1:7, top = 0) 8 | build.multi <- plGrid(binary, how = "buildSVM", cost = 1:7, top = 0) 9 | build.cont <- plGrid(binary, how = "buildSVM", cost = 1:7, top = 0) 10 | 11 | checkConjoin <- function(object){ 12 | 13 | a <- object[1:5,] 14 | b <- object[6:7,] 15 | 16 | ab <- conjoin(a, b) 17 | ba <- conjoin(b, a) 18 | if(class(object) == "ExprsPipeline"){ 19 | ab@summary <- ab@summary[,-1] # get rid of boot column 20 | ba@summary <- ba@summary[,-1] 21 | } 22 | 23 | expect_equal( 24 | ab, 25 | object[1:7,] 26 | ) 27 | 28 | expect_equal( 29 | ba, 30 | object[c(6:7, 1:5),] 31 | ) 32 | } 33 | 34 | test_that("conjoin method works correctly", { 35 | 36 | checkConjoin(binary) 37 | checkConjoin(multi) 38 | checkConjoin(cont) 39 | 40 | checkConjoin(build.binary) 41 | checkConjoin(build.multi) 42 | checkConjoin(build.cont) 43 | 44 | set.seed(1) 45 | c.mac <- conjoin(buildSVM(binary), 
buildLASSO(binary), 46 | buildRF(binary), buildDT(binary)) 47 | set.seed(1) 48 | c.ens <- conjoin(conjoin(buildSVM(binary), buildLASSO(binary)), 49 | conjoin(buildRF(binary), buildDT(binary))) 50 | expect_equal( 51 | c.mac, 52 | c.ens 53 | ) 54 | }) 55 | -------------------------------------------------------------------------------- /tests/testthat/test-3-modHistory.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | data(iris) 3 | e <- exprso(iris[1:100,1:4], iris[1:100,5]) 4 | 5 | test_that("modHistory reconstructs reduced dimensions", { 6 | 7 | expect_equal( 8 | fsPrcomp(e), 9 | fsPCA(e) 10 | ) 11 | 12 | A <- fsPCA(e) 13 | expect_equal( 14 | A@exprs, 15 | modHistory(e, A)@exprs 16 | ) 17 | 18 | B <- fsRDA(e) 19 | expect_equal( 20 | B@exprs, 21 | modHistory(e, B)@exprs 22 | ) 23 | 24 | C <- fsBalance(e) 25 | expect_equal( 26 | C@exprs, 27 | modHistory(e, C)@exprs 28 | ) 29 | 30 | D <- fsAnnot(e, colBy = "y") 31 | expect_equal( 32 | D@exprs, 33 | modHistory(e, D)@exprs 34 | ) 35 | 36 | if(requireNamespace("amalgam", quietly = TRUE)){ 37 | 38 | E <- fsAmalgam(e) 39 | expect_equal( 40 | E@exprs, 41 | modHistory(e, E)@exprs 42 | ) 43 | 44 | F <- fsAmalgam(e, n.amalgams = 4, asSLR = TRUE) 45 | expect_equal( 46 | F@exprs, 47 | modHistory(e, F)@exprs 48 | ) 49 | } 50 | 51 | G <- fsPRA(e, nclust = 3) 52 | expect_equal( 53 | G@exprs, 54 | modHistory(e, G)@exprs 55 | ) 56 | }) 57 | -------------------------------------------------------------------------------- /tests/testthat/test-5.1-fs.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | 3 | set.seed(1) 4 | fakeiris <- iris 5 | fakeiris[,1] <- sample(fakeiris[,1]) 6 | colnames(fakeiris) <- c('bad', 'better', 'good') 7 | 8 | binary <- exprso(fakeiris[1:100,1:3], fakeiris[1:100,5]) 9 | multi <- exprso(fakeiris[,1:3], fakeiris[,5]) 10 | cont <- exprso(fakeiris[,1:3], fakeiris[,4]) 11 | 12 | checkFS <- 
function(input, method, should, ...){ 13 | 14 | args <- as.list(substitute(list(...)))[-1] 15 | if(identical(should, "error")){ 16 | print(should) 17 | expect_error( 18 | do.call(method, append(args, list("object" = input))) 19 | ) 20 | }else{ 21 | print(should) 22 | e <- do.call(method, append(args, list("object" = input))) 23 | expect_equal( 24 | rownames(e@exprs), 25 | should 26 | ) 27 | } 28 | } 29 | 30 | test_that("@preFilter order matches the @exprs order", { 31 | 32 | f <- fsStats(binary) 33 | expect_equal( 34 | rownames(f@exprs), 35 | f@preFilter[[1]] 36 | ) 37 | 38 | # but now for reduction models... 39 | f <- fsPrcomp(binary) 40 | expect_false( 41 | identical( 42 | rownames(f@exprs), 43 | f@preFilter[[1]] 44 | ) 45 | ) 46 | }) 47 | 48 | test_that("build modules work for each data type", { 49 | 50 | # set.seed(1);checkFS(binary, fsSample, should = c("bad", "good", "better")) 51 | # set.seed(1);checkFS(multi, fsSample, should = c("bad", "good", "better")) 52 | # set.seed(1);checkFS(cont, fsSample, should = c("bad", "good", "better")) 53 | 54 | checkFS(binary, fsNULL, should = c("bad", "better", "good")) 55 | checkFS(multi, fsNULL, should = c("bad", "better", "good")) 56 | checkFS(cont, fsNULL, should = c("bad", "better", "good")) 57 | 58 | checkFS(binary, fsANOVA, should = c("good", "better", "bad")) 59 | checkFS(multi, fsANOVA, should = c("good", "better", "bad")) 60 | checkFS(cont, fsANOVA, should = "error") 61 | 62 | checkFS(binary, fsInclude, should = c("bad", "good", "better"), include = c("bad", "good")) 63 | checkFS(multi, fsInclude, should = c("bad", "good", "better"), include = c("bad", "good")) 64 | checkFS(cont, fsInclude, should = c("bad", "good", "better"), include = c("bad", "good")) 65 | 66 | checkFS(binary, fsStats, should = c("good", "better", "bad")) 67 | checkFS(binary, fsStats, should = c("good", "better", "bad"), how = "ks.test") 68 | checkFS(binary, fsStats, should = c("good", "better", "bad"), how = "wilcox.test") 69 | checkFS(binary, 
fsStats, should = c("good", "better", "bad"), how = "var.test") 70 | checkFS(binary, fsStats, should = "error", how = "fake.test") 71 | checkFS(multi, fsStats, should = "error") 72 | checkFS(cont, fsStats, should = "error") 73 | 74 | checkFS(binary, fsCor, should = "error") 75 | checkFS(multi, fsCor, should = "error") 76 | checkFS(cont, fsCor, should = c("good", "better", "bad")) 77 | 78 | checkFS(binary, fsPrcomp, should = c("PC1", "PC2", "PC3")) 79 | checkFS(multi, fsPrcomp, should = c("PC1", "PC2", "PC3")) 80 | checkFS(cont, fsPrcomp, should = c("PC1", "PC2", "PC3")) 81 | 82 | if(requireNamespace("vegan", quietly = TRUE)){ 83 | checkFS(binary, fsRDA, should = c("PC1", "PC2", "PC3")) 84 | checkFS(multi, fsRDA, should = c("PC1", "PC2", "PC3")) 85 | checkFS(cont, fsRDA, should = c("PC1", "PC2", "PC3")) 86 | } 87 | 88 | if(requireNamespace("limma", quietly = TRUE)){ 89 | checkFS(binary, fsEbayes, should = c("good", "better", "bad")) 90 | checkFS(multi, fsEbayes, should = c("good", "better", "bad")) 91 | checkFS(cont, fsEbayes, should = "error") 92 | } 93 | 94 | if(requireNamespace("edgeR", quietly = TRUE)){ 95 | checkFS(binary, fsEdger, should = c("good", "better", "bad")) 96 | checkFS(multi, fsEdger, should = "error") 97 | checkFS(cont, fsEbayes, should = "error") 98 | } 99 | 100 | if(requireNamespace("mRMRe", quietly = TRUE)){ 101 | checkFS(binary, fsMrmre, should = c("good", "better", "bad")) 102 | checkFS(multi, fsMrmre, should = "error") 103 | checkFS(cont, fsMrmre, should = "error") 104 | } 105 | 106 | # if(requireNamespace("RankProd", quietly = TRUE)){ 107 | # checkFS(binary, fsRankProd, should = c("good", "better", "bad")) 108 | # checkFS(multi, fsRankProd, should = "error") 109 | # checkFS(cont, fsRankProd, should = "error") 110 | # } 111 | 112 | if(requireNamespace("balance", quietly = TRUE)){ 113 | checkFS(binary, fsBalance, should = c("z1", "z2")) 114 | checkFS(multi, fsBalance, should = c("z1", "z2")) 115 | checkFS(cont, fsBalance, should = c("z1", 
"z2")) 116 | } 117 | 118 | checkFS(binary, fsAnnot, should = c("y", "bad", "better", "good"), colBy = "y") 119 | checkFS(multi, fsAnnot, should = c("y", "bad", "better", "good"), colBy = "y") 120 | checkFS(cont, fsAnnot, should = c("y", "bad", "better", "good"), colBy = "y") 121 | }) 122 | -------------------------------------------------------------------------------- /tests/testthat/test-5.2-build.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | 3 | binary <- exprso(iris[1:100,1:4], iris[1:100,5]) 4 | multi <- exprso(iris[,1:4], iris[,5]) 5 | cont <- exprso(iris[,1:3], iris[,4]) 6 | 7 | checkBuild <- function(input, classifier, should){ 8 | 9 | if(should == "error"){ 10 | print(should) 11 | expect_error( 12 | do.call(classifier, list("object" = input)) 13 | ) 14 | }else{ 15 | print(should) 16 | m <- do.call(classifier, list("object" = input)) 17 | expect_equal( 18 | round(calcStats(predict(m, input))$acc, 3), 19 | round(should, 3) 20 | ) 21 | } 22 | } 23 | 24 | test_that("build modules work for each data type", { 25 | 26 | # NAIVE BAYES 27 | set.seed(1) 28 | checkBuild(binary, buildNB, should = 1) 29 | checkBuild(multi, buildNB, should = .96) 30 | checkBuild(cont, buildNB, should = "error") 31 | 32 | # LINEAR DISCRIMINANT ANALYSIS 33 | set.seed(1) 34 | checkBuild(binary, buildLDA, should = 1) 35 | checkBuild(multi, buildLDA, should = .98) 36 | checkBuild(cont, buildLDA, should = "error") 37 | 38 | # SUPPORT VECTOR MACHINE 39 | set.seed(1) 40 | checkBuild(binary, buildSVM, should = 1) 41 | checkBuild(multi, buildSVM, should = .9667) 42 | checkBuild(cont, buildSVM, should = .9378) 43 | 44 | # LM / GLM / LR 45 | set.seed(1) 46 | checkBuild(binary, buildLM, should = "error") 47 | checkBuild(multi, buildLM, should = "error") 48 | checkBuild(cont, buildLM, should = .9379) 49 | set.seed(1) 50 | checkBuild(binary, buildGLM, should = 1) 51 | checkBuild(multi, buildGLM, should = "error") 52 | checkBuild(cont, 
buildGLM, should = .9379) 53 | 54 | # LASSO 55 | set.seed(1) 56 | checkBuild(binary, buildLASSO, should = 1) 57 | checkBuild(multi, buildLASSO, should = .953) 58 | checkBuild(cont, buildLASSO, should = .6690) 59 | 60 | # NEURAL NETS 61 | set.seed(1) 62 | checkBuild(binary, buildANN, should = 1) 63 | checkBuild(multi, buildANN, should = .6667) 64 | checkBuild(cont, buildANN, should = 0) 65 | 66 | # DECISION TREES 67 | set.seed(1) 68 | checkBuild(binary, buildDT, should = 1) 69 | checkBuild(multi, buildDT, should = .96) 70 | checkBuild(cont, buildDT, should = .9336) 71 | 72 | # RANDOM FORESTS 73 | set.seed(1) 74 | checkBuild(binary, buildRF, should = 1) 75 | checkBuild(multi, buildRF, should = 1) 76 | checkBuild(cont, buildRF, should = .9778) 77 | 78 | # FRB 79 | set.seed(1) 80 | checkBuild(binary, buildFRB, should = 1) 81 | checkBuild(multi, buildFRB, should = .953) 82 | checkBuild(cont, buildFRB, should = .932) 83 | }) 84 | -------------------------------------------------------------------------------- /tests/testthat/test-8.2-ens.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | 3 | binary <- exprso(iris[1:100,1:4], iris[1:100,5]) 4 | multi <- exprso(iris[,1:4], iris[,5]) 5 | cont <- exprso(iris[,1:3], iris[,4]) 6 | 7 | checkEnsemble <- function(ens, data, should){ 8 | 9 | print(should) 10 | expect_equal( 11 | round(calcStats(predict(ens, data))$acc, 3), 12 | round(should, 3) 13 | ) 14 | } 15 | 16 | test_that("buildEnsemble is actually the same as conjoin", { 17 | 18 | set.seed(1); b <- buildEnsemble(buildSVM(binary), buildLASSO(binary), buildRF(binary)) 19 | set.seed(1); j <- conjoin(buildSVM(binary), buildLASSO(binary), buildRF(binary)) 20 | expect_equal(b, j) 21 | }) 22 | 23 | test_that("buildEnsemble from argument works for each data type", { 24 | 25 | set.seed(1) 26 | ens.binary <- buildEnsemble(buildSVM(binary), buildLASSO(binary), buildRF(binary)) 27 | checkEnsemble(ens.binary, binary, should = 1) 
28 | 29 | set.seed(1) 30 | ens.multi <- buildEnsemble(buildSVM(multi), buildLASSO(multi), buildRF(multi)) 31 | checkEnsemble(ens.multi, multi, should = .9733) 32 | 33 | set.seed(1) 34 | ens.cont <- buildEnsemble(buildSVM(cont), buildLASSO(cont), buildRF(cont)) 35 | checkEnsemble(ens.cont, cont, should = .9649) 36 | }) 37 | 38 | test_that("buildEnsemble from pl works for each data type", { 39 | 40 | set.seed(1) 41 | ens.binary <- buildEnsemble(plGrid(binary, how = "buildSVM", top = c(1, 2, 3))) 42 | checkEnsemble(ens.binary, binary, should = 1) 43 | 44 | set.seed(1) 45 | ens.multi <- buildEnsemble(plGrid(multi, how = "buildSVM", top = c(1, 2, 3))) 46 | checkEnsemble(ens.multi, multi, should = .8333) 47 | 48 | set.seed(1) 49 | ens.cont <- buildEnsemble(plGrid(cont, how = "buildSVM", top = c(1, 2, 3))) 50 | checkEnsemble(ens.cont, cont, should = .8472) 51 | }) 52 | -------------------------------------------------------------------------------- /tests/testthat/test-fsRDA.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | data(iris) 3 | x <- iris[1:100,1:4] 4 | y <- iris[1:100,5] 5 | e <- exprso(x, y) 6 | 7 | r1 <- fsRDA(e) 8 | plot(r1) 9 | m1 <- buildLR(r1, top = 1) 10 | acc1 <- calcStats(predict(m1, e))$acc 11 | 12 | r2 <- fsRDA(e, colBy = "defineCase") 13 | plot(r2) 14 | m2 <- buildLR(r2, top = 1) 15 | acc2 <- calcStats(predict(m2, e))$acc 16 | 17 | test_that("fsRDA will partial out colBy correctly", { 18 | 19 | expect_equal( 20 | acc1 > acc2, 21 | TRUE 22 | ) 23 | }) 24 | -------------------------------------------------------------------------------- /tests/testthat/test-mod.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | library(magrittr) 3 | suppressWarnings(RNGversion("3.5.0")) 4 | 5 | ########################################################### 6 | ### Check modSwap, modSubset, and modCluster 7 | 8 | load(file.path("data.RData")) 9 | 10 | arrays 
<- splitStratify(array, percent.include = 50, colBy = "sex") 11 | array.train <- arrays[[1]] 12 | spill <- splitStratify(arrays[[2]], percent.include = 67, colBy = "sex") 13 | array.test <- spill[[1]] 14 | spillover <- spill[[2]] 15 | 16 | test_that("modSwap and subset work without any error", { 17 | 18 | set.seed(1) 19 | expect_equal( 20 | array.train %>% modSwap(percent = 50) %>% subset(select = "mutated") %>% '$'("mutated"), 21 | c(0, 1, 0, 1, 0, 0, 0, 0) 22 | ) 23 | 24 | set.seed(1) 25 | expect_equal( 26 | 27 | array.train %>% modSwap(how = spillover, percent = 50) %>% '$'("mutated"), 28 | c(0, 1, 0, 1, 0, 0, 0, 0) 29 | ) 30 | }) 31 | 32 | test_that("modCluster and modSubset work without any error", { 33 | 34 | set.seed(1) 35 | expect_equal( 36 | 37 | array.train %>% modCluster(how = "hclust") %>% modSubset(colBy = "cluster", include = 1), 38 | array.train %>% modCluster(how = "kmeans") %>% modSubset(colBy = "cluster", include = 1) 39 | ) 40 | 41 | expect_equal( 42 | 43 | array.train %>% modCluster(how = "diana"), 44 | array.train %>% modCluster(how = "fanny") 45 | ) 46 | 47 | expect_equal( 48 | 49 | array.train %>% modCluster(how = "hclust"), 50 | array.train %>% modCluster(how = "pam") 51 | ) 52 | }) 53 | -------------------------------------------------------------------------------- /tests/testthat/test-pl-cv.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | suppressWarnings(RNGversion("3.5.0")) 3 | 4 | ########################################################### 5 | ### Check plMonteCarlo 6 | 7 | load(file.path("data.RData")) 8 | 9 | array@annot$defineCase[1:2] <- "Case" 10 | array@annot$defineCase[25:30] <- "Control" 11 | 12 | # Perform bootstrapping with plMonteCarlo 13 | set.seed(12345) 14 | ss <- ctrlSplitSet(func = "splitStratify", percent.include = 50, colBy = NULL) 15 | fs <- ctrlFeatureSelect(func = "fsStats", top = 0, how = "t.test") 16 | gs <- ctrlGridSearch(func = "plGrid", how = 
"buildLDA", top = 2, method = "mle") 17 | boot <- plMonteCarlo(array, B = 20, ctrlSS = ss, ctrlFS = fs, ctrlGS = gs) 18 | 19 | # Repeat bootstrapping manually 20 | set.seed(12345) 21 | aucs <- vector("numeric", 20) 22 | for(b in 1:20){ 23 | 24 | arrays.b <- splitStratify(array, percent.include = 50, colBy = NULL) 25 | array.b.train <- fsStats(arrays.b[[1]], top = 0, how = "t.test") 26 | array.b.test <- arrays.b[[2]] 27 | pl.b <- plGrid(array.b.train, array.b.test, how = "buildLDA", top = 2, method = "mle") 28 | aucs[b] <- pl.b$valid.auc 29 | } 30 | 31 | test_that("plMonteCarlo is grossly intact", { 32 | 33 | expect_equal( 34 | calcMonteCarlo(boot, colBy = "valid.auc"), 35 | mean(aucs) 36 | ) 37 | }) 38 | 39 | # Check calcMonteCarlo with contrived example 40 | set.seed(12345) 41 | ss <- ctrlSplitSet(func = "splitStratify", percent.include = 50, colBy = NULL) 42 | fs <- ctrlFeatureSelect(func = "fsStats", top = 0, how = "t.test") 43 | gs <- ctrlGridSearch(func = "plGrid", how = "buildLDA", top = c(4, 3, 2), method = "mle", fold = 0) 44 | boot <- plMonteCarlo(array, B = 1, ctrlSS = ss, ctrlFS = fs, ctrlGS = gs) 45 | 46 | test_that("plMonteCarlo returns correctly sized @machs", { 47 | 48 | expect_equal( 49 | length(boot@machs[[1]]@preFilter[[2]]), 50 | 4 51 | ) 52 | 53 | expect_equal( 54 | length(boot@machs[[2]]@preFilter[[2]]), 55 | 3 56 | ) 57 | 58 | expect_equal( 59 | length(boot@machs[[3]]@preFilter[[2]]), 60 | 2 61 | ) 62 | }) 63 | 64 | test_that("calcMonteCarlo picks best CV", { 65 | 66 | expect_equal( 67 | round(calcMonteCarlo(boot, colBy = "valid.auc"), 7), 68 | 0.7619048 69 | ) 70 | }) 71 | 72 | ########################################################### 73 | ### Check plNested 74 | 75 | array@annot$defineCase[1:2] <- "Case" 76 | array@annot$defineCase[25:30] <- "Control" 77 | 78 | # Perform cross-validation with plNested 79 | set.seed(12345) 80 | fs <- ctrlFeatureSelect(func = "fsStats", top = 0) 81 | gs <- ctrlGridSearch(func = "plGrid", how = "buildLDA", 
top = 0, fold = NULL, method = "mle") 82 | nest <- plNested(array, fold = 10, ctrlFS = fs, ctrlGS = gs) 83 | 84 | # Perform cross-validation with plCV 85 | set.seed(12345) 86 | cv <- plCV(array, top = 0, fold = 10, aucSkip = TRUE, plCV.acc = "acc", how = "buildLDA", method = "mle") 87 | 88 | test_that("plNested without fs matches plCV", { 89 | 90 | expect_equal( 91 | mean(nest$valid.acc), 92 | cv 93 | ) 94 | }) 95 | 96 | # Check calcNested with contrived example 97 | set.seed(12345) 98 | fs <- ctrlFeatureSelect(func = "fsStats", top = 0) 99 | gs <- ctrlGridSearch(func = "plGrid", how = "buildSVM", top = 2, fold = 10, 100 | kernel = "linear", cost = 10^(c(-10, 1))) 101 | nest <- plNested(array, fold = 10, ctrlFS = fs, ctrlGS = gs) 102 | 103 | test_that("calcNested picks best CV", { 104 | 105 | expect_equal( 106 | calcNested(nest, colBy = "valid.f1"), 107 | calcNested(pipeFilter(nest, colBy = "train.plCV"), colBy = "valid.f1") 108 | ) 109 | }) 110 | -------------------------------------------------------------------------------- /tests/testthat/test-pl-gs.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | suppressWarnings(RNGversion("3.5.0")) 3 | 4 | ########################################################### 5 | ### Check plGrid 6 | 7 | load(file.path("data.RData")) 8 | 9 | arrays <- splitStratify(array, percent.include = 50, colBy = NULL) 10 | array.train <- arrays[[1]] 11 | array.test <- arrays[[2]] 12 | array.test@annot$defineCase[1] <- "Case" 13 | 14 | pl <- plGrid(array.train, 15 | array.test, 16 | top = 0, 17 | how = "buildLDA", 18 | fold = NULL, 19 | method = c("mle", 20 | "mve") 21 | ) 22 | 23 | test_that("individual @machs match expectations", { 24 | 25 | mach <- buildLDA(array.train, top = 0, method = "mle") 26 | expect_equal( 27 | predict(pl@machs[[1]], array.test), 28 | predict(mach, array.test) 29 | ) 30 | 31 | mach <- buildLDA(array.train, top = 0, method = "mve") 32 | expect_equal( 33 | 
predict(pl@machs[[2]], array.test), 34 | predict(mach, array.test) 35 | ) 36 | }) 37 | 38 | test_that("@summary match expectations", { 39 | 40 | mach <- buildLDA(array.train, top = 0, method = "mle") 41 | expect_equal( 42 | matrix(pl@summary[1, c("valid.acc", "valid.sens", "valid.spec", "valid.prec", "valid.f1", "valid.auc")]), 43 | matrix(calcStats(predict(mach, array.test))) 44 | ) 45 | 46 | mach <- buildLDA(array.train, top = 0, method = "mve") 47 | expect_equal( 48 | matrix(pl@summary[2, c("valid.acc", "valid.sens", "valid.spec", "valid.prec", "valid.f1", "valid.auc")]), 49 | matrix(calcStats(predict(mach, array.test))) 50 | ) 51 | }) 52 | 53 | ########################################################### 54 | ### Check plCV 55 | 56 | set.seed(12345) 57 | 58 | arrays <- splitStratify(array, percent.include = 100, colBy = NULL) 59 | array.train <- arrays[[1]] 60 | array.test <- arrays[[2]] 61 | 62 | acc <- plCV(array, top = 0, fold = 10, aucSkip = TRUE, plCV.acc = "acc", how = "buildLDA", method = "mle") 63 | 64 | array.train@annot$defineCase[c(1:2)] <- "Case" 65 | array.train@annot$defineCase[c(19:20)] <- "Control" 66 | 67 | acc.off <- plCV(array.train, top = 0, fold = 0, aucSkip = TRUE, plCV.acc = "acc", how = "buildLDA", method = "mle") 68 | pl <- plGrid(array.train, 69 | array.test, 70 | top = 0, 71 | how = "buildLDA", 72 | fold = 0, 73 | method = "mle" 74 | ) 75 | 76 | test_that("plCV is grossly intact", { 77 | 78 | expect_equal( 79 | acc, 80 | 1 81 | ) 82 | 83 | expect_equal( 84 | acc.off, 85 | .8 86 | ) 87 | 88 | expect_equal( 89 | pl@summary$train.plCV, 90 | .8 91 | ) 92 | }) 93 | -------------------------------------------------------------------------------- /tests/testthat/test-regrs.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | suppressWarnings(RNGversion("3.5.0")) 3 | 4 | o <- exprso(iris[,1:3], iris[,4]) 5 | set.seed(1) 6 | arrays <- splitSample(o) 7 | a <- buildANN(arrays[[1]]) 8 | b <- 
buildRF(arrays[[1]]) 9 | c <- buildSVM(arrays[[1]]) 10 | 11 | test_that("Continuous outcome models work", { 12 | 13 | expect_error( 14 | buildLDA(o) 15 | ) 16 | 17 | expect_error( 18 | buildNB(o) 19 | ) 20 | 21 | expect_equal( 22 | predict(a, arrays[[2]])@actual, 23 | predict(b, arrays[[2]])@actual 24 | ) 25 | 26 | expect_equal( 27 | predict(b, arrays[[2]])@actual, 28 | predict(c, arrays[[2]])@actual 29 | ) 30 | }) 31 | 32 | # ss <- ctrlSplitSet(func = "splitSample", percent.include = 67) 33 | # fs <- ctrlFeatureSelect(func = "fsNULL", top = 0) 34 | # gs <- ctrlGridSearch(func = "plGrid", how = "buildSVM", top = c(2, 3, 0), 35 | # kernel = c("linear", "radial")) 36 | # 37 | # test_that("Continuous outcome pl modules work", { 38 | # 39 | # set.seed(1) 40 | # expect_equal( 41 | # round(plCV(o, top = 0, how = "buildRF", fold = 2), 4), 42 | # .9489 43 | # ) 44 | # 45 | # set.seed(1) 46 | # expect_equal( 47 | # round(plGrid(arrays[[1]], arrays[[2]], top = c(0, 2, 3), how = "buildRF", 48 | # fold = 2)@summary$valid.acc, 4), 49 | # c(.9645, .9244, .9643) 50 | # ) 51 | # 52 | # set.seed(1) 53 | # boot <- plMonteCarlo(o, B = 3, ctrlSS = ss, ctrlFS = fs, ctrlGS = gs) 54 | # expect_equal( 55 | # round(calcMonteCarlo(boot), 4), 56 | # .9619 57 | # ) 58 | # 59 | # set.seed(1) 60 | # boot <- plNested(o, fold = 2, ctrlFS = fs, ctrlGS = gs, save = FALSE) 61 | # expect_equal( 62 | # round(calcNested(boot), 4), 63 | # .9555 64 | # ) 65 | # 66 | # ens <- buildEnsemble(a, b, c) 67 | # pred <- predict(ens, o) 68 | # expect_equal( 69 | # round(calcStats(pred)$acc, 4), 70 | # .9286 71 | # ) 72 | # 73 | # ens <- buildEnsemble(boot) 74 | # pred <- predict(ens, o) 75 | # expect_equal( 76 | # round(calcStats(pred)$acc, 4), 77 | # .9536 78 | # ) 79 | # }) 80 | -------------------------------------------------------------------------------- /tests/testthat/test-split.R: -------------------------------------------------------------------------------- 1 | library(exprso) 2 | 
suppressWarnings(RNGversion("3.5.0")) 3 | 4 | ########################################################### 5 | ### Test ExprsBinary and ExprsMulti imports 6 | 7 | set.seed(1235) 8 | 9 | df.a <- data.frame( 10 | "id" = 1:10, 11 | "class" = rep("a", 10), 12 | "sex" = c(rep("M", 5), rep("F", 5)), 13 | "feat1" = rnorm(10, mean = 10, sd = 1), 14 | "feat2" = rnorm(10, mean = 20, sd = 5), 15 | "feat3" = rnorm(10, mean = 5, sd = 1) 16 | ) 17 | 18 | df.b <- data.frame( 19 | "id" = 11:30, 20 | "class" = rep("b", 20), 21 | "sex" = c(rep("M", 10), rep("F", 10)), 22 | "feat1" = rnorm(20, mean = 20, sd = 5), 23 | "feat2" = rnorm(20, mean = 10, sd = 1), 24 | "feat3" = rnorm(20, mean = 5, sd = 1) 25 | ) 26 | 27 | df.c <- data.frame( 28 | "id" = 31:40, 29 | "class" = rep("c", 10), 30 | "sex" = c(rep("M", 3), rep("F", 7)), 31 | "feat1" = rnorm(10, mean = 15, sd = 3), 32 | "feat2" = rnorm(10, mean = 15, sd = 3), 33 | "feat3" = rnorm(10, mean = 5, sd = 1) 34 | ) 35 | 36 | df <- do.call(rbind, list(df.a, df.b, df.c)) 37 | 38 | tempFile <- tempfile() 39 | write.table(df, file = tempFile, sep = "\t") 40 | 41 | array <- 42 | arrayExprs(tempFile, begin = 4, colID = "id", colBy = "class", 43 | include = list("a", "b")) 44 | 45 | arrayMulti <- 46 | arrayExprs(tempFile, begin = 4, colID = "id", colBy = "class", 47 | include = list("a", "b", "c")) 48 | 49 | test_that("ExprsBinary imports correctly", { 50 | 51 | expect_equal( 52 | as.character(class(array)), 53 | "ExprsBinary" 54 | ) 55 | 56 | expect_equal( 57 | unique(array[array$defineCase == "Control", "class"]), 58 | "a" 59 | ) 60 | 61 | expect_equal( 62 | unique(array[array$defineCase == "Case", "class"]), 63 | "b" 64 | ) 65 | }) 66 | 67 | test_that("ExprsMulti imports correctly", { 68 | 69 | expect_equal( 70 | as.character(class(arrayMulti)), 71 | "ExprsMulti" 72 | ) 73 | 74 | expect_equal( 75 | unique(arrayMulti[arrayMulti$defineCase == 1, "class"]), 76 | "a" 77 | ) 78 | 79 | expect_equal( 80 | unique(arrayMulti[arrayMulti$defineCase == 
2, "class"]), 81 | "b" 82 | ) 83 | 84 | expect_equal( 85 | unique(arrayMulti[arrayMulti$defineCase == 3, "class"]), 86 | "c" 87 | ) 88 | }) 89 | 90 | ########################################################### 91 | ### Test splitStratify without colBy 92 | 93 | arrays <- 94 | splitStratify(array, percent.include = 50, colBy = NULL) 95 | 96 | arraysMulti <- 97 | splitStratify(arrayMulti, percent.include = 50, colBy = NULL) 98 | 99 | test_that("splitStratify correctly splits ExprsBinary objects", { 100 | 101 | expect_equal( 102 | nrow(arrays[[1]]@annot), 103 | 10 104 | ) 105 | 106 | expect_equal( 107 | sum(arrays[[2]]$defineCase == "Control"), 108 | 5 109 | ) 110 | 111 | expect_equal( 112 | sum(arrays[[2]]$defineCase == "Case"), 113 | 15 114 | ) 115 | }) 116 | 117 | test_that("splitStratify correctly splits ExprsMulti objects", { 118 | 119 | expect_equal( 120 | sum(arraysMulti[[1]]$defineCase == 1), 121 | 5 122 | ) 123 | 124 | expect_equal( 125 | sum(arraysMulti[[2]]$defineCase == 2), 126 | 15 127 | ) 128 | 129 | expect_equal( 130 | sum(arraysMulti[[2]]$defineCase == 3), 131 | 5 132 | ) 133 | }) 134 | 135 | ########################################################### 136 | ### Test splitStratify with colBy 137 | 138 | arrays <- 139 | splitStratify(array, percent.include = 100, colBy = c("sex"), bin = c(FALSE)) 140 | 141 | arraysMulti <- 142 | splitStratify(arrayMulti, percent.include = 100, colBy = c("sex"), bin = c(FALSE)) 143 | 144 | test_that("splitStratify correctly splits ExprsBinary objects with colBy", { 145 | 146 | expect_equal( 147 | nrow(arrays[[1]]@annot), 148 | 20 149 | ) 150 | 151 | expect_equal( 152 | nrow(arrays[[2]]@annot), 153 | 10 154 | ) 155 | 156 | expect_equal( 157 | sum(arrays[[1]]$defineCase == "Control" & arrays[[1]]$sex == "M"), 158 | sum(arrays[[1]]$defineCase == "Case" & arrays[[1]]$sex == "M") 159 | ) 160 | 161 | expect_equal( 162 | sum(arrays[[1]]$defineCase == "Control" & arrays[[1]]$sex == "F"), 163 | sum(arrays[[1]]$defineCase == "Case" 
& arrays[[1]]$sex == "F") 164 | ) 165 | }) 166 | 167 | test_that("splitStratify correctly splits ExprsMulti objects with colBy", { 168 | 169 | expect_equal( 170 | nrow(arraysMulti[[1]]@annot), 171 | 24 172 | ) 173 | 174 | expect_equal( 175 | nrow(arraysMulti[[2]]@annot), 176 | 16 177 | ) 178 | 179 | expect_equal( 180 | sum(arraysMulti[[1]]$defineCase == 1 & arraysMulti[[1]]$sex == "M"), 181 | 3 182 | ) 183 | 184 | expect_equal( 185 | sum(arraysMulti[[1]]$defineCase == 2 & arraysMulti[[1]]$sex == "M"), 186 | 3 187 | ) 188 | 189 | expect_equal( 190 | sum(arraysMulti[[1]]$defineCase == 3 & arraysMulti[[1]]$sex == "M"), 191 | 3 192 | ) 193 | 194 | expect_equal( 195 | sum(arraysMulti[[1]]$defineCase == 1 & arraysMulti[[1]]$sex == "F"), 196 | 5 197 | ) 198 | 199 | expect_equal( 200 | sum(arraysMulti[[1]]$defineCase == 2 & arraysMulti[[1]]$sex == "F"), 201 | 5 202 | ) 203 | 204 | expect_equal( 205 | sum(arraysMulti[[1]]$defineCase == 3 & arraysMulti[[1]]$sex == "F"), 206 | 5 207 | ) 208 | }) 209 | -------------------------------------------------------------------------------- /vignettes/c_readme.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Frequently Asked Questions" 3 | author: "Thomas Quinn" 4 | date: "`r Sys.Date()`" 5 | output: rmarkdown::html_vignette 6 | vignette: > 7 | %\VignetteIndexEntry{Frequently Asked Questions} 8 | %\VignetteEngine{knitr::rmarkdown} 9 | %\VignetteEncoding{UTF-8} 10 | --- 11 | 12 | ## Warnings against improper use 13 | 14 | * **plGrid, plMonteCarlo, plNested:** For a high-throughput classification pipeline, if you supply an $x$ number of top features to the `top` argument greater than the total number of features available in a training set, exprso will automatically use all features instead. 
15 | * **pipeFilter, buildEnsemble:** For an `ExprsPipeline` model extraction, if you supply an $x$ number of top models to the `top` argument greater than the total number of models available in a filtered cut of models, exprso will automatically use all models instead. If you are concerned about this default behavior, call `pipeFilter` first, then call `buildEnsemble` on the `pipeFilter` results after inspecting them manually. 16 | * **plCV:** This function calculates a simple metric of cross-validation during high-throughput classification. When the function receives data that have already undergone feature selection, **`plCV` provides an overly optimistic metric of classifier performance that should never get published**. However, the results of `plCV` do have *relative* validity, so it is fine to use them to choose parameters. 17 | * **splitSample:** The `splitSample` method builds the training and validation sets by randomly sampling all subjects in an `ExprsArray` object. However, **`splitSample` is not truly random; it iteratively samples until at least one of every class appears in the test set**. This rule makes it easier to run analyses and interpret results, but requires caution when articulating in a report how you chose the test set. 18 | 19 | ## Known issues 20 | 21 | * **fsMrmre:** This feature selection method will crash with too many (> 46340) features. 22 | * **buildDNN:** This classification method will exhaust RAM unless you manually clear old models. 23 | * **buildRF:** This classification method sometimes crashes when working with very small or unbalanced datasets within a large high-throughput classification pipeline. 
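
The `splitSample` warning above describes a resample-until-covered rule. The following base R sketch illustrates that rule only; it is not the exprso implementation, and the helper name `split_until_covered` and its `max.tries` argument are hypothetical names introduced for this example.

```r
# Hypothetical helper for illustration; not part of exprso.
# Draws a random training set and redraws until every class
# also appears among the held-out (test) subjects.
split_until_covered <- function(labels, percent.include = 67, max.tries = 1000) {
  n <- length(labels)
  n.train <- round(n * percent.include / 100)
  classes <- unique(labels)
  for (i in seq_len(max.tries)) {
    train <- sample.int(n, n.train)      # random draw of training rows
    test <- setdiff(seq_len(n), train)   # remaining rows form the test set
    # Accept the draw only if every class appears in the test set
    if (all(classes %in% labels[test])) {
      return(list(train = train, test = test, tries = i))
    }
  }
  stop("no split found with every class in the test set")
}

set.seed(1)
labels <- c(rep("Control", 10), rep("Case", 20))
idx <- split_until_covered(labels, percent.include = 67)
table(labels[idx$test])  # class counts in the accepted test set
```

Because draws that leave a class out of the test set are rejected, the accepted split is a conditional sample rather than a simple random one, which is the point to state when reporting how the test set was chosen.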
24 | -------------------------------------------------------------------------------- /vignettes/exprso-diagram.cmap: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tpq/exprso/c4a0eb6412833abe216b61c6ca53737bc8f53c5b/vignettes/exprso-diagram.cmap -------------------------------------------------------------------------------- /vignettes/exprso-diagram.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tpq/exprso/c4a0eb6412833abe216b61c6ca53737bc8f53c5b/vignettes/exprso-diagram.jpg --------------------------------------------------------------------------------