├── .Rbuildignore ├── .gitattributes ├── .gitignore ├── DESCRIPTION ├── LICENSE ├── NAMESPACE ├── R ├── R.Rproj ├── batchDesign_utils.R ├── mlOSP_utils.R ├── ospPiecewiseBW.R ├── ospProbDesign.R ├── ospSeqBatchDesign.R ├── ospSeqBatchDesignSimplified.R ├── ospSeqDesign.R ├── payoffs.R └── simFuncs.R ├── README.md ├── data ├── int300_3d.RData └── int_2d.RData ├── demo ├── 00Index ├── BenchmarkedSolvers.R ├── desktop.ini └── ml-osp-jcf.R ├── inst └── doc │ ├── Bermudan_demo.R │ ├── Bermudan_demo.Rmd │ ├── Bermudan_demo.html │ └── arxivPreprint.pdf ├── man ├── CalcOverhead.Rd ├── batch.adsa.Rd ├── batch.ddsa.Rd ├── batch.mlb.Rd ├── batch.rb.Rd ├── call.payoff.Rd ├── capexp.impulse.Rd ├── cf.absur.Rd ├── cf.csur.Rd ├── cf.el.Rd ├── cf.mcu.Rd ├── cf.smcu.Rd ├── cf.sur.Rd ├── cf.tMSE.Rd ├── digital.put.payoff.Rd ├── forest.impulse.Rd ├── forward.impulse.policy.Rd ├── forward.sim.policy.Rd ├── geom.put.payoff.Rd ├── int300_3d.Rd ├── int_2d.Rd ├── lin.impulse.Rd ├── maxi.call.payoff.Rd ├── mini.put.payoff.Rd ├── osp.fixed.design.Rd ├── osp.impulse.control.Rd ├── osp.prob.design.Rd ├── osp.probDesign.piecewisebw.Rd ├── osp.seq.batch.design.Rd ├── osp.seq.batch.design.simplified.Rd ├── osp.seq.design.Rd ├── osp.tvr.Rd ├── ospPredict.Rd ├── pegging.alg.Rd ├── plot_style.Rd ├── plt.2d.surf.Rd ├── plt.2d.surf.batch.Rd ├── plt.2d.surf.with.batch.Rd ├── put.payoff.Rd ├── r.reallocate.Rd ├── sim.bm.Rd ├── sim.expOU.sv.Rd ├── sim.gbm.Rd ├── sim.gbm.asian.Rd ├── sim.gbm.cor.Rd ├── sim.gbm.matrix.Rd ├── sim.gbm.moving.ave.Rd ├── sim.logOU_Discrete.Rd ├── sim.ouExp.Rd ├── sim.price.and.capacity.Rd ├── sv.put.payoff.Rd ├── swing.fixed.design.Rd ├── swing.policy.Rd ├── treeDivide.BW.1d.Rd └── treeDivide.BW.Rd ├── seqOSP.Rproj └── vignettes ├── mlosp-demo.Rmd ├── mlosp-demo.html └── seqDesignADSA-demo.Rmd /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^.*\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^Meta$ 4 | ^doc$ 5 | -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | # Auto detect text files and perform LF normalization 2 | * text=auto 3 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | *.rdb 5 | *.rdx 6 | *.RData 7 | Meta 8 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: mlOSP 2 | Title: Machine Learning and Regression Monte Carlo Algorithms for Optimal Stopping 3 | Version: 1.0 4 | Authors@R: person("Mike", "Ludkovski", email = "ludkovski@pstat.ucsb.edu", role = c("aut", "cre")) 5 | Maintainer: Mike Ludkovski 6 | Description: A suite of regression Monte Carlo algorithms. Includes both static and sequential designs. We implement the original Longstaff-Schwartz and Tsitsiklis-van Roy algorithms, as well as machine learning approaches that explicitly capture the underlying experimental designs. The mlOSP template then allows to mix and match the choice of the regression method, the experimental design and the stochastic simulator. The library directly accepts function hooks for the option payoff and the path generation. Key functions are osp.prob.design (original LSM), osp.fixed.design (a variety of space-filling or user-specified designs, generally assumed to be batched), osp.seq.design (sequential designs using a collection of pre-specified Expected Improvement Criteria). Also implements the Bouchard-Warin adaptive partitioning with linear regression (osp.design.piecewisebw). Work partially supported by NSF-1521743. 7 | Depends: R (>= 3.6.0) 8 | License: LGPL 9 | LazyData: true 10 | Suggests: 11 | earth, 12 | DiceKriging, 13 | tgp, 14 | ggplot2, 15 | RColorBrewer, 16 | fields, 17 | scales, 18 | randomForest, 19 | hetGP, 20 | mvtnorm, 21 | kernlab, 22 | ks, 23 | laGP, 24 | np, 25 | knitr, rmarkdown, 26 | randtoolbox, 27 | nnet, 28 | dynaTree, 29 | deepnet, 30 | pander 31 | VignetteBuilder: knitr, rmarkdown 32 | RoxygenNote: 7.1.2 33 | Date/Publication: 2020-11-30 18:10:03 UTC 34 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | export(batch.adsa) 4 | export(batch.ddsa) 5 | export(batch.mlb) 6 | export(batch.rb) 7 | export(call.payoff) 8 | export(capexp.impulse) 9 | export(cf.absur) 10 | export(cf.csur) 11 | export(cf.el) 12 | export(cf.mcu) 13 | export(cf.smcu) 14 | export(cf.sur) 15 | export(cf.tMSE) 16 | export(digital.put.payoff) 17 | export(forest.impulse) 18 | export(forward.impulse.policy) 19 | export(forward.sim.policy) 20 | export(geom.put.payoff) 21 | export(lin.impulse) 22 | export(maxi.call.payoff) 23 | export(mini.put.payoff) 24 | export(osp.fixed.design) 25 | export(osp.impulse.control) 26 | export(osp.prob.design) 27 | export(osp.probDesign.piecewisebw) 28 | export(osp.seq.batch.design) 29 | export(osp.seq.design) 30 | export(osp.tvr) 31 | export(ospPredict) 32 | export(plt.2d.surf) 33 | export(plt.2d.surf.batch) 34 | export(plt.2d.surf.with.batch) 35 | export(put.payoff) 36 | export(sim.bm) 37 | export(sim.expOU.sv) 38 | export(sim.gbm) 39 | export(sim.gbm.asian) 40 | export(sim.gbm.cor) 41 | export(sim.gbm.matrix) 42 | export(sim.gbm.moving.ave) 43 | export(sim.logOU_Discrete) 44 | export(sim.ouExp) 45 | export(sim.price.and.capacity) 46 | export(sv.put.payoff) 47 | export(swing.fixed.design) 48 | export(swing.policy) 49 | export(treeDivide.BW) 50 | export(treeDivide.BW.1d) 51 | -------------------------------------------------------------------------------- /R/R.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | -------------------------------------------------------------------------------- /R/batchDesign_utils.R: -------------------------------------------------------------------------------- 1 | ##################### 2 | #' ABSUR overhead 3 | #' 4 | #' @title Calculates c_over in ABSUR 5 | #' @param theta0,theta1,theta2 parameters in linear regression 6 | #' @param n current design size 7 | #' @details compute the estimated overhead in ABSUR 8 | #' 9 | CalcOverhead <- function(theta0, theta1, theta2, n) { 10 | overhead = theta0 + theta1*n + theta2*n^2; 11 | } 12 | 13 | ##################### 14 | #' ABSUR for Adaptive Batching 15 | #' 16 | #' @title Calculates weights for location and batch size in ABSUR 17 | #' @param objMean predicted mean response 18 | #' @param objSd posterior standard deviation of the response 19 | #' @param nugget the noise variance to compute the ALC factor 20 | #' @param r_cand candidate batch size vector 21 | #' @param overhead estimated computation overhead in GP 22 | #' @param t0 overhead for individual simulation 23 | #' @references 24 | #' M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 25 | #' in Noisy Level Set Estimation, 26 | #' @author Xiong Lyu 27 | #' @seealso \code{\link[mlOSP]{osp.seq.batch.design}} 28 | #' @export 29 | cf.absur <- function(objMean, objSd, nugget, r_cand, overhead, t0) { 30 | # expand mean and sd vectors to matrix of size len(x_cand) * len(r_cand) 31 | r_len <- length(r_cand) 32 | x_len <- length(objMean) 33 | objMean_matrix <- matrix(objMean, nrow=x_len, ncol=r_len) 34 | objSd_matrix <- matrix(objSd, nrow=x_len, ncol=r_len) 35 | r_matrix <- matrix(r_cand, nrow = x_len, ncol=r_len, byrow=TRUE) 36 | 37 | # EI at cross combination of candidate input and batch size 38 | nugget_matrix <- nugget / sqrt(r_matrix) 39 | a = pnorm(-abs(objMean_matrix)/objSd_matrix) # normalized distance to zero contour 40 | new_objSd2 <- nugget_matrix * objSd_matrix / sqrt(nugget_matrix ^ 2 + objSd_matrix ^ 2) # new posterior variance 41 | a_new <- pnorm(-abs(objMean_matrix)/new_objSd2) # new distance to zero-contour 42 | 43 | # difference between next-step ZC and current ZC weighted by the overhead 44 | return( (a-a_new) / (r_matrix * t0 + overhead) ) 45 | } 46 | 47 | ##################### 48 | #' Ratchet Batching heuristic 49 | #' 50 | #' @title Calculates weights for batch size in RB 51 | #' @param objSd_at_x_optim posterior standard deviation of the response at the selected new input 52 | #' @param r_cand candidate batch size vector 53 | #' @param nugget the noise variance to compute the ALC factor 54 | #' @param last_r the last batch size 55 | #' @param gamma threshold compared with sd 56 | #' @seealso \code{\link[mlOSP]{osp.seq.batch.design}} 57 | #' @references 58 | #' M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 59 | #' in Noisy Level Set Estimation, http://arxiv.org/abs/2003.08579 60 | #' @export 61 | batch.rb <- function(objSd_at_x_optim, r_cand, last_r, nugget, gamma) { 62 | eta = 0.8 63 | rindex = which(r_cand == last_r)[1] 64 | 65 | # if r reaches the upper bound, just keep it at the same level 66 | if (rindex == length(r_cand)) { 67 | roptim = r_cand[rindex] 68 | } else { 69 | 70 | # compare whether to stay at the current r or move to the next level 71 | rcand_two_levels = c(r_cand[rindex], r_cand[rindex + 1]) 72 | nugget2_est_by_level = nugget ^ 2 / rcand_two_levels 73 | post_var = objSd_at_x_optim^2 * nugget2_est_by_level / (nugget2_est_by_level + objSd_at_x_optim^2) 74 | post_sd = sqrt(post_var) 75 | 76 | # if no solution, update value of gamma 77 | if (post_sd[1] < gamma) { 78 | k = ceiling(log(post_sd[1] / gamma) / log(eta)) 79 | gamma = gamma * (eta ^ k) 80 | } 81 | 82 | # choose whether to stay depending on comparison of sd with gamma 83 | r_index = rindex - 1 + sum(post_sd > gamma) 84 | roptim = r_cand[r_index] 85 | } 86 | return(list(roptim = roptim, gamma = gamma)) 87 | } 88 | 89 | ##################### 90 | #' Multi-Level Batching Heuristic 91 | #' 92 | #' @description Calculates weights for batch size in MLB when called from \link{osp.seq.batch.design} 93 | #' @param objSd_at_x_optim posterior standard deviation of the response at the selected new input 94 | #' @param r_cand candidate batch size vector 95 | #' @param nugget the noise variance to compute the ALC factor 96 | #' @param gamma threshold compared with sd 97 | #' @return list containing: 98 | #' \itemize{ 99 | #' \item \code{roptim}: new replication count 100 | #' \item \code{gamma}: new gamma variable 101 | #' } 102 | #' @references 103 | #' M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 104 | #' in Noisy Level Set Estimation, http://arxiv.org/abs/2003.08579 105 | #' 106 | #' @seealso \code{\link[mlOSP]{osp.seq.batch.design}} 107 | #' @export 108 | batch.mlb <- function(objSd_at_x_optim, r_cand, nugget, gamma) { 109 | eta = 0.5; 110 | 111 | # calculate the new posterior variance at the new site location 112 | nugget2_est_by_level = nugget ^ 2 / r_cand 113 | post_var = objSd_at_x_optim^2 * nugget2_est_by_level / (nugget2_est_by_level + objSd_at_x_optim^2) 114 | post_sd = sqrt(post_var) 115 | 116 | # if no solution, update value of gamma 117 | if (post_sd[1] < gamma) { 118 | k = ceiling(log(post_sd[1] / gamma) / log(eta)) 119 | gamma = gamma * (eta ^ k) 120 | } 121 | 122 | r_index = sum(post_sd > gamma) 123 | roptim = r_cand[r_index] 124 | return(list(roptim = roptim, gamma = gamma)) 125 | } 126 | 127 | ##################### 128 | #' ADSA for Adaptive Batching 129 | #' 130 | #' @title Calculates reallocated batch size or new input location for Adaptive Design with Sequential Allocation 131 | #' @param fit GP/TP fit 132 | #' @param r_seq batch size vector for existing inputs 133 | #' @param xtest testing points to compare reallocation and adding a new inputs 134 | #' @param xt_dens density of xtest 135 | #' @param x_new new input location selected by the EI criteria 136 | #' @param r0 total number of new simulations 137 | #' @param nugget the noise variance to compute the ALC factor 138 | #' @param method \code{km} or \code{trainkm} or \code{hetgp} or \code{homtp} 139 | #' @return a list containing 140 | #' \itemize{ 141 | #' \item \code{xoptim}: new design input (NULL if re-allocation is chosen) 142 | #' \item \code{roptim}: added replications (a scalar r0 if new input chosen, 143 | #' a vector containing the re-allocation amounts otherwise 144 | #' } 145 | #' @references 146 | #' M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 147 | #' in Noisy Level Set Estimation, 148 | #' @author Xiong Lyu 149 | #' @seealso \code{\link[mlOSP]{osp.seq.batch.design}} 150 | #' @export 151 | batch.adsa <- function(fit, r_seq, xtest, xt_dens, x_new, r0, nugget, method) { 152 | x_optim = NULL 153 | ddsa.res = batch.ddsa(fit, r_seq, xtest, xt_dens, r0, method) 154 | r_new = ddsa.res$r_new 155 | K = ddsa.res$K 156 | L = ddsa.res$L 157 | # determine between allocating in existing samples or moving to a new location 158 | 159 | # i1 - reallocation 160 | Delta_R = 1/r_seq - 1/r_new 161 | Delta_R = diag(Delta_R) 162 | 163 | v = solve(t(L)) %*% (solve(L) %*% K) 164 | I1 = diag(t(v) %*% Delta_R %*% v) 165 | 166 | # i2 - adding a new input 167 | if (method == "km" | method == "trainkm") { 168 | x_all = rbind(xtest, fit@X, x_new) 169 | C = covMatrix(fit@covariance, x_all)$C 170 | k = C[1:nrow(xtest), (nrow(xtest) + nrow(fit@X) + 1)] 171 | k_new = C[(nrow(xtest) + 1):(nrow(xtest) + nrow(fit@X)), (nrow(xtest) + nrow(fit@X) + 1)] 172 | } else { 173 | x_all = rbind(xtest, fit$X0, x_new) 174 | 175 | if (method == "homtp") { 176 | # Lower triangle matrix of covariance matrix 177 | C <- hetGP::cov_gen(x_all, theta=fit$theta, type=fit$covtype) * fit$sigma2 178 | } else { 179 | C <- hetGP::cov_gen(x_all, theta=fit$theta, type=fit$covtype) * fit$nu_hat 180 | } 181 | k = C[1:nrow(xtest), (nrow(xtest) + nrow(fit$X0) + 1)] 182 | k_new = C[(nrow(xtest) + 1):(nrow(xtest) + nrow(fit$X0)), (nrow(xtest) + nrow(fit$X0) + 1)] 183 | } 184 | 185 | if (method == "km" | method == "trainkm") { 186 | ss = fit@covariance@sd2 187 | } else { 188 | if (method == 'homtp') { 189 | ss = fit$sigma2 190 | } else { 191 | ss = fit$nu_hat 192 | } 193 | } 194 | 195 | a = solve(t(L)) %*% (solve(L) %*% k_new) 196 | var_new = ss - t(k_new) %*% a 197 | cov = k - t(K) %*% a 198 | I2 = vec(cov) ^ 2 / (nugget^2 / r0 + var_new[1, 1]) 199 | 200 | if (sum(I1 * xt_dens) > sum(I2 * xt_dens)) { 201 | # allocate on existing samples 202 | r_optim = r_new 203 | } else { 204 | # choose a new location 205 | r_optim = r0 206 | x_optim = x_new 207 | } 208 | return(list(x_optim = x_optim, r_optim = r_optim)) 209 | } 210 | 211 | ##################### 212 | #' DDSA for Adaptive Batching 213 | #' 214 | #' @title Calculates reallocated batch size for DDSA 215 | #' @param fit gp/tp fit 216 | #' @param r_seq batch size vector for existing inputs 217 | #' @param xtest testing points to compare reallocation and adding a new inputs 218 | #' @param xt_dens density of xtest 219 | #' @param r0 total number of new simulations to add 220 | #' @param method "km" or "trainkm" or "hetgp" or "homtp" 221 | #' @references 222 | #' M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 223 | #' in Noisy Level Set Estimation, 224 | #' @author Xiong Lyu 225 | #' @seealso \code{\link[mlOSP]{osp.seq.batch.design}} 226 | #' @export 227 | batch.ddsa <- function(fit, r_seq, xtest, xt_dens, r0, method) { 228 | if (method == "km" | method == "trainkm") { 229 | pred.test <- predict(fit, data.frame(x=xtest), type="UK") 230 | 231 | # Covariance of xtest and x 232 | K <- pred.test$c 233 | # Lower triangle matrix of covariance matrix 234 | L <- t(fit@T) 235 | } else { # hetGP/homTP 236 | pred.test <- predict(x=xtest, object=fit, xprime = fit$X0) 237 | 238 | # Covariance of xtest and x 239 | K <- t(pred.test$cov) 240 | 241 | if (method == "homtp") { 242 | # Lower triangle matrix of covariance matrix 243 | C <- hetGP::cov_gen(fit$X0, theta=fit$theta, type=fit$covtype) * fit$sigma2 + 244 | diag(rep(fit$g, dim(fit$X0)[1])) 245 | } 246 | if (method == "hetgp") { 247 | C <- fit$nu_hat * (hetGP::cov_gen(fit$X0, theta=fit$theta, type=fit$covtype) + 248 | fit$Lambda * diag(1/fit$mult)) 249 | } 250 | if (method == "homgp") { 251 | C <- fit$nu_hat * (hetGP::cov_gen(fit$X0, theta=fit$theta, type=fit$covtype) + 252 | fit$g * diag(1/fit$mult)) 253 | } 254 | L <- t(chol(C)) 255 | } 256 | 257 | 258 | # allocate between existing samples 259 | r_new = r.reallocate(L, K, xt_dens, r_seq, r0) 260 | return(list(r_new = r_new, K = K, L = L)) 261 | } 262 | 263 | ##################### 264 | #' New batch size calculator 265 | #' 266 | #' @title Calculates reallocated batch size 267 | #' @param L lower triangle of cholesky decomposition of covariance matrix 268 | #' @param K covariance matrix 269 | #' @param xt_dens density of xtest 270 | #' @param r_seq batch size vector for existing inputs 271 | #' @param r0 total number of new simulations 272 | r.reallocate <- function(L, K, xt_dens, r_seq, r0) { 273 | U <- solve(t(L)) %*% (solve(L) %*% K) %*% xt_dens 274 | r_seq_new <- pegging.alg(r0 + sum(r_seq), U, r_seq) 275 | return(r_seq_new) 276 | } 277 | 278 | ##################### 279 | #' Pegging algorithm for ADSA/DDSA 280 | #' 281 | #' @title Calculates reallocated batch size for ADSA 282 | #' @param r total number of simulations 283 | #' @param U weighted matrix for pegging algorithm 284 | #' @param r_seq batch size vector for existing inputs 285 | #' @references 286 | #' M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 287 | #' in Noisy Level Set Estimation, http://arxiv.org/abs/2003.08579 288 | #' @author Xiong Lyu 289 | pegging.alg <- function(r, U, r_seq) { 290 | is_end = FALSE 291 | indexes = seq(1, length(r_seq)) 292 | r_new = r_seq 293 | r_total = r 294 | 295 | while(!is_end) { 296 | r_new[indexes] = r_total * U[indexes] / sum(U[indexes]) 297 | 298 | if(sum(r_new[indexes] >= r_seq[indexes]) == length(indexes)) { 299 | is_end = TRUE 300 | } else { 301 | idx = which(r_new <= r_seq) 302 | unchanged = intersect(idx, indexes) 303 | r_new[unchanged] = r_seq[unchanged] 304 | indexes = which(r_new > r_seq) 305 | r_total = r - sum(r_new[idx]) 306 | } 307 | } 308 | r_new_int = round(r_new) 309 | if (sum(r_new_int == r_seq) == length(r_seq)) { 310 | idx_max = which(r_new == max(r_new[r_new != r_new_int])) 311 | r_new_int[idx_max] = r_new_int[idx_max] + 1 312 | } 313 | return(r_new_int) 314 | } 315 | 316 | 317 | ###### 318 | #' two-dimensional image of contour + site + batch plot for two fits 319 | #' @title Visualize and compare 2D emulator + stopping region for two fits 320 | #' 321 | #' @param fit1,fit2 can be any of the types supported by \code{\link{forward.sim.policy}} 322 | #' @param r1,r2 batch vectors for the two fits 323 | #' @param x,y locations to use for the \code{predict()} functions. Default is a 200x200 fine grid. 324 | #' Passed to \code{expand.grid} 325 | #' @param batch1,batch2 batch heristics for two fits; Passed to \code{ggplot()} 326 | #' This only works for \code{km} and \code{het/homGP} objects 327 | #' @export 328 | plt.2d.surf.batch <- function( fit1, fit2, r1, r2, batch1, batch2, x=seq(31,43,len=201), y = seq(31,43,len=201)) 329 | { 330 | gr <- expand.grid(x=x,y=y) 331 | r <- c(r1, r2) 332 | batch.samples <- c(rep(batch1, length(r1)), rep(batch2, length(r2))) 333 | batch.fitted <- c(rep(batch1, length(x) ^ 2), rep(batch2, length(x) ^ 2)) 334 | 335 | # calculate posterior mean and standard deviation for grid 336 | if (class(fit1)=="km") { 337 | x <- rbind(fit1@X, fit2@X) 338 | m <- c(predict(fit1,data.frame(x=cbind(gr$x,gr$y)), type="UK")$mean, 339 | predict(fit2,data.frame(x=cbind(gr$x,gr$y)), type="UK")$mean) 340 | sd<- c(predict(fit1,data.frame(x=cbind(gr$x,gr$y)), type="UK")$sd, 341 | predict(fit2,data.frame(x=cbind(gr$x,gr$y)), type="UK")$sd) 342 | } 343 | if( (class(fit1)=="homGP" | class(fit1) == "hetGP")) { 344 | x <- rbind(fit1$X0, fit2$X0) 345 | m <- c(predict(x=cbind(gr$x,gr$y), object=fit1)$mean, 346 | predict(x=cbind(gr$x,gr$y), object=fit2)$mean) 347 | sd <- sqrt(c(predict(x=cbind(gr$x,gr$y), object=fit1)$sd2, 348 | predict(x=cbind(gr$x,gr$y), object=fit2)$sd2)) 349 | } 350 | 351 | # credible interval 352 | lb <- m - 1.96 * sd 353 | ub <- m + 1.96 * sd 354 | 355 | fitted.data.2d <- data.frame(x1 = gr$x, x2 = gr$y, m = m, lbound = lb, ubound = ub, batch = batch.fitted) 356 | samples <- data.frame(x1 = x[,1], x2 = x[,2], r = r, batch = batch.samples) 357 | 358 | p <- ggplot(data = samples, aes(x = x1, y = x2)) + 359 | geom_raster(data = fitted.data.2d, aes(x = x1, y = x2, fill = m)) + 360 | scale_fill_gradient2(low = "black", high = "yellow", mid = "red") + 361 | geom_point(aes(color = r), size = 2) + 362 | scale_color_gradient(low = "cyan", high = "purple") + 363 | facet_grid(col = vars(batch)) + labs(x=expression(X[t]^1), y= expression(X[t]^2)) + 364 | scale_x_continuous(name = expression(X[t]^1), expand = c(0, 0)) + 365 | scale_y_continuous(name = expression(X[t]^2), expand = c(0, 0)) 366 | 367 | p + stat_contour(data = fitted.data.2d, aes(x = x1, y = x2, z = m), breaks = 0, color = "black", inherit.aes=FALSE, size = 1.4) + 368 | facet_grid(col = vars(batch)) + 369 | stat_contour(data = fitted.data.2d, aes(x = x1, y = x2, z = lbound), color = "black", linetype = "longdash", breaks = 0, size = 0.8) + 370 | stat_contour(data = fitted.data.2d, aes(x = x1, y = x2, z = ubound), color = "black",linetype = "longdash", breaks = 0, size = 0.8) + 371 | plot_style() 372 | } 373 | 374 | ########################## 375 | #' GGplot style 376 | #' @param base_size is the font size 377 | #' @param base_family is the font style 378 | #' @param ... is for the parameters to specialize the ggplot style 379 | #' @noMd 380 | plot_style <- function(base_size = 14, base_family = "Helvetica",...) { 381 | theme_bw(base_size = base_size, base_family = base_family) %+replace% 382 | theme(panel.border = element_blank(), 383 | panel.grid.major = element_blank(), 384 | panel.grid.minor = element_blank(), 385 | strip.text = element_blank(), 386 | axis.line = element_line(colour = "black",size = 0.5), 387 | strip.background = element_rect(fill = "white",colour = "white",size = 1), 388 | legend.direction = "vertical", 389 | legend.position = "right", 390 | legend.box = "vertical", 391 | legend.text.align = 0, 392 | legend.text = element_text(size=8, margin = margin(r = 5, unit = "pt")), 393 | legend.box.just = "left", 394 | legend.key = element_blank(), 395 | #legend.title = element_blank(), 396 | legend.background = element_blank(), 397 | legend.margin = margin(t = 2, r = 2, b = 0, l = 0), 398 | legend.key.width = unit(0.35,"cm"), 399 | legend.key.height = unit(0.6,"cm"), 400 | axis.text.x = element_text( colour = 'black', size = 10, hjust = 0.5, vjust = 0.5), 401 | axis.title.x = element_text(size = 10, hjust = 0.5, vjust = 0.2), 402 | axis.text.y = element_text(colour = 'black', size = 10), 403 | axis.title.y = element_text(size = 10, angle = 90, hjust = 0.5, vjust = 0.2), 404 | strip.text.y = element_text(size = 10, hjust = 0.5, vjust = 0.5, face = 'bold'), 405 | strip.text.x = element_text(size = 10, hjust = 0.5, vjust = 0.5, face = 'bold'), 406 | ...) 407 | } 408 | 409 | ###### 410 | #' two-dimensional image+contour plot with replication counts for an \code{osp.seq.batch.design} fit 411 | #' @title Visualize 2D emulator + stopping region + batch amounts 412 | #' 413 | #' @param x,y locations to use for the \code{predict()} functions. Default is a 200x200 fine grid. 414 | #' Passed to \code{expand.grid} 415 | #' @param fit An emulator object. Can be any of the types supported by \code{\link[mlOSP]{forward.sim.policy}} 416 | #' @param contour.col (default is "red") -- color of the zero contour 417 | #' @param rep.limits (default is c(20,100)) -- range of the legend for the replication counts 418 | #' @param batch_size array of replication counts for each input generated by \code{\link[mlOSP]{osp.seq.batch.design}} 419 | #' 420 | #' @author Xiong Lyu 421 | #' @export 422 | #' 423 | #' @examples 424 | #' sob30 <- randtoolbox::sobol(55, d=2) # construct a space-filling initial design 425 | #' sob30 <- sob30[ which( sob30[,1] + sob30[,2] <= 1) ,] 426 | #' sob30 <- 25+30*sob30 427 | #' model2d <- list(x0 = rep(40,2),K=40,sigma=rep(0.2,2),r=0.06, 428 | #' div=0,T=1,dt=0.04,dim=2,sim.func=sim.gbm, 429 | #' payoff.func=put.payoff, look.ahead=1, pilot.nsims=1000, 430 | #' cand.len=1000,max.lengthscale=c(40,40),min.lengthscale=c(3,3), 431 | #' seq.design.size=50,batch.nrep=25,total.budget=1000,init.size=30, 432 | #' init.grid=sob30, kernel.family="gauss",update.freq=5, 433 | #' r.cand=c(20, 30,40,50,60, 80, 120, 160)) 434 | #' set.seed(11) 435 | #' require(tgp) 436 | #' require(DiceKriging) 437 | #' require(laGP) 438 | #' require(ks) 439 | #' require(RColorBrewer) 440 | #' require(scales) 441 | #' model2d$batch.heuristic <- 'adsa' 442 | #' model2d$ei.func <- 'amcu' 443 | #' oos.obj.adsa <- osp.seq.batch.design(model2d,method="trainkm") 444 | #' plt.2d.surf.with.batch(oos.obj.adsa$fit[[15]], 445 | #' oos.obj.adsa$batches[1:oos.obj.adsa$ndesigns[15] - 1, 15]) 446 | plt.2d.surf.with.batch <- function( fit, batch_size, x=seq(25,50,len=201),y = seq(25,50,len=201), 447 | contour.col="red", rep.limits=c(20,100)) 448 | { 449 | gr <- expand.grid(x=x,y=y) 450 | 451 | # calculate posterior mean and standard deviation for grid 452 | if (class(fit)=="km") { 453 | m <- predict(fit,data.frame(x=cbind(gr$x,gr$y)), type="UK")$mean 454 | sd<- predict(fit,data.frame(x=cbind(gr$x,gr$y)), type="UK")$sd 455 | samples <- data.frame(x1 = fit@X[,1], x2 = fit@X[,2], r = batch_size) 456 | } 457 | if( (class(fit)=="homGP" | class(fit) == "hetGP")) { 458 | m <- predict(x=cbind(gr$x,gr$y), object=fit)$mean 459 | sd <- sqrt(predict(x=cbind(gr$x,gr$y), object=fit)$sd2) 460 | samples <- data.frame(x1 = fit$X0[,1], x2 = fit$X0[,2], r = batch_size) 461 | } 462 | 463 | fitted.data.2d <- data.frame(x1 = gr$x, x2 = gr$y, m = m, 464 | lbound = m - 1.96 * sd , ubound = m + 1.96 * sd) 465 | 466 | cols <- RColorBrewer::brewer.pal(n = 9, name = "PuBuGn") 467 | 468 | p <- ggplot(fitted.data.2d) + 469 | geom_raster(aes(x = x1, y = x2, fill = m)) + 470 | #scale_fill_gradient2(low = "black", high = "yellow", mid = "red") + 471 | scale_fill_gradientn(colours = fields::tim.colors(64)) + 472 | geom_point(data=samples, aes(x = x1, y = x2, color = r), size = 4) + 473 | # scale_color_gradient(low = "cyan", high = "blue", limits = c(0, 200)) + 474 | scale_colour_gradientn(colours = cols, #values = scales::rescale(c(seq(60, 180, 8), 218)), 475 | guide = "colorbar", limits=rep.limits) + 476 | scale_x_continuous(name = expression(X[t]^1), expand = c(0, 0),limits=range(x)) + 477 | scale_y_continuous(name = expression(X[t]^2), expand = c(0, 0),limits=range(y)) + 478 | labs(color = "Reps", fill=expression(hat(T)(t,x))) 479 | 480 | p + stat_contour(data = fitted.data.2d, aes(x = x1, y = x2, z = m), breaks = 0, color = contour.col, inherit.aes=FALSE, size = 1.4) + 481 | stat_contour(data = fitted.data.2d, aes(x = x1, y = x2, z = lbound), color = "black", linetype = "longdash", breaks = 0, size = 0.8) + 482 | stat_contour(data = fitted.data.2d, aes(x = x1, y = x2, z = ubound), color = "black",linetype = "longdash", breaks = 0, size = 0.8) + 483 | plot_style() 484 | 485 | } 486 | -------------------------------------------------------------------------------- /R/ospPiecewiseBW.R: -------------------------------------------------------------------------------- 1 | ########################################## 2 | #' @title Longstaff Schwartz Algorithm using the Bouchard-Warin method 3 | #' 4 | #' Uses the Bouchard-Warin recursive partitioning to create N-d trees 5 | #' for local linear regression fits. Each tree node contains N/model$nBins^model$dim inputs. 6 | #' @details Calls \link{treeDivide.BW} to create the equi-probable partitions. 7 | #' Must have N/model$nBins^model$dim as an integer. 8 | #' 9 | #' @param N the number of forward training paths 10 | #' @param model a list defining all model parameters. Must contain the following fields: 11 | #' \cr \code{T, dt, dim, nBins}, 12 | #' \code{sim.func, x0, r, payoff.func} 13 | #' @param verb if specified, produces plots of the 1-dim fit every \code{verb} time-steps 14 | #' [default is zero, no plotting] 15 | #' @param tst.paths (optional) a list containing out-of-sample paths to obtain a price estimate 16 | #' 17 | #' @return a list with the following fields: 18 | #' \itemize{ 19 | #' \item \code{price} is the scalar optimal reward; 20 | #' \item \code{tau} is a vector of stopping times over in-sample paths; 21 | #' \item \code{test} is a vector of out-of-sample pathwise rewards; 22 | #' \item \code{val} is a vector of in-sample pathwise rewards 23 | #' \item \code{timeElapsed} total running time based on \code{Sys.time} 24 | #' } 25 | #' 26 | #' @references 27 | #' Bruno Bouchard and Xavier Warin. Monte-Carlo valorisation of American options: facts and new 28 | #' algorithms to improve existing methods. In R. Carmona, P. Del Moral, P. Hu, and N. Oudjane, editors, 29 | #' Numerical Methods in Finance, volume 12 of Springer Proceedings in Mathematics. Springer, 2011. 30 | #' 31 | #' @examples 32 | #' set.seed(1) 33 | #' modelSV5 <- list(K=100,x0=c(90, log(0.35)),r=0.0225,div=0,sigma=1, 34 | #' T=50/252,dt=1/252,svAlpha=0.015,svEpsY=1,svVol=3,svRho=-0.03,svMean=2.95, 35 | #' eulerDt=1/2520, dim=2,sim.func=sim.expOU.sv,nBins=10,payoff.func=sv.put.payoff) 36 | #' putPr <- osp.probDesign.piecewisebw(20000,modelSV5) 37 | #' putPr$price 38 | #' # get [1] 17.30111 39 | #' @export 40 | ########################################### 41 | osp.probDesign.piecewisebw <- function(N,model,tst.paths=NULL, verb=0) 42 | { 43 | t.start <- Sys.time() 44 | M <- as.integer(round(model$T/model$dt)) 45 | grids <- list() 46 | preds <- list() 47 | 48 | if (is.null(model$nBins) ) 49 | stop("Missing model parameters: must specify nBins (number of partitions per dimension)") 50 | 51 | 52 | # in 1-d save all the models to analyze the fits 53 | if (model$dim == 1) { 54 | all.models <- array(list(NULL), dim=c(M,model$nBins)) 55 | all.bounds <- array(0, dim=c(M,model$nBins)) 56 | bnd <- array(0, dim=c(M,3)) 57 | # used for 1-D case, saves the stopping boundary for each time step 58 | bnd[M,] <- c(model$K,model$K,N) 59 | } 60 | 61 | # Build the grids from a global simulation of X_{1:T} 62 | grids[[1]] <- model$sim.func( matrix(rep(model$x0, N), nrow=N,byrow=T), model, model$dt) 63 | for (i in 2:M) 64 | grids[[i]] <- model$sim.func( grids[[i-1]], model, model$dt) 65 | 66 | # make sure to have something for out-of-sample 67 | if (is.null(tst.paths)) { 68 | test.paths <- list() 69 | for (i in 1:M) 70 | test.paths[[i]] <- grids[[i]][1:min(N,100),,drop=F] 71 | } 72 | else 73 | test.paths <- tst.paths 74 | 75 | # initialize the continuation values/stopping times 76 | contValue <- exp(-model$r*model$dt)*model$payoff.func( grids[[M]], model) 77 | tau <- rep(model$T, N) 78 | test.value <- exp(-model$r*model$dt)*model$payoff.func(test.paths[[M]], model) 79 | 80 | ###### Main loop: Backward step in time 81 | # Estimate T(t,x) 82 | for (i in (M-1):1) 83 | { 84 | # forward predict 85 | if (model$dim == 1) { 86 | preds[[i]] <- treeDivide.BW.1d( data.frame(cont=contValue,grid=grids[[i]]), 2, 87 | model, test=test.paths[[i]]) 88 | all.models[i,] <- preds[[i]]$lm.model 89 | all.bounds[i,] <- preds[[i]]$boundaries 90 | #preds[[i]]$out.sample <- predict( all.models[i,], test.paths[[i]]) 91 | } 92 | else 93 | preds[[i]] <- treeDivide.BW( data.frame(cont=contValue,grid=grids[[i]]), 2, 94 | model, test=test.paths[[i]]) 95 | 96 | # compute T(t,x) = C(t,x) - h(x) 97 | immPayoff <- model$payoff.func(grids[[i]],model) 98 | timingValue <- preds[[i]]$in.sample - immPayoff 99 | 100 | # same on the out-of-sample paths 101 | test.payoff <- model$payoff.func(test.paths[[i]],model) 102 | test.tv <- preds[[i]]$out.sample - test.payoff 103 | 104 | # figure out the Put boundary in 1d 105 | if (model$dim == 1) { 106 | stop.ndx <- which( timingValue < 0 & model$payoff.func(grids[[i]], model) > 0) 107 | bnd[i,] <- c(max( grids[[i]][stop.ndx,1]),quantile( grids[[i]][stop.ndx,1], 0.98), length(stop.ndx)/N ) 108 | if (verb > 0 & i > 1) 109 | if (i %% verb == 1) { 110 | plot(grids[[i]][,1] ,timingValue,cex=0.5,col="red",xlim=c(model$K*0.6,model$K*1.6),ylim=c(-0.5,2), 111 | xlab="S_t", ylab="Timing Value",cex.lab=1.1,main=i) 112 | abline(h=0, lty=2, lwd=2) 113 | rug( grids[[i]][1:1000,1], col="blue") 114 | } 115 | } 116 | 117 | # paths on which stop 118 | stopNdx <- which( timingValue <= 0 & immPayoff > 0) 119 | contValue[stopNdx] <- immPayoff[stopNdx] 120 | contValue <- exp(-model$r*model$dt)*contValue 121 | tau[stopNdx] <- i*model$dt 122 | 123 | # paths on which stop out-of-sample 124 | stop.test <- which( test.tv <= 0 & test.payoff > 0) 125 | test.value[stop.test] <- test.payoff[ stop.test] 126 | test.value <- exp(-model$r*model$dt)*test.value 127 | } 128 | 129 | # final answer 130 | price <- mean(contValue) 131 | out.message <- paste("In-sample estimated price: ", round(price, digits=3)) 132 | if (is.null(tst.paths) == FALSE) 133 | out.message <- paste(out.message, " and out-of-sample ", round(mean(test.value), digits=3 )) 134 | cat(out.message) 135 | if (verb >0) { 136 | plot(seq(model$dt,model$T,by=model$dt),bnd[,2], lwd=2, type="l", ylim=c(33,41), 137 | xlab="Time t", ylab="Stopping Boundary",col="red", cex.lab=1.2) 138 | points(0,40, cex=2, pch= 18,col='black') 139 | text(0.5,model$K-2, 'Continue',cex=1.7) 140 | text(0.75,model$K-6, 'Stop', cex=1.7) 141 | 142 | } 143 | 144 | # Return a list with the following fields: 145 | # price is the scalar optimal reward 146 | # tau is a vector of stopping times over in-sample paths 147 | # test is a vector of out-of-sample pathwise rewards 148 | # val is a vector of in-sample pathwise rewards 149 | # time elapsed (for benchmarking) 150 | return( list(price=price,tau=tau,test=test.value,val=contValue, timeElapsed=Sys.time()-t.start)) 151 | } 152 | -------------------------------------------------------------------------------- /R/ospSeqBatchDesign.R: -------------------------------------------------------------------------------- 1 | ################# 2 | #' @title Adaptive Batching designs for optimal stopping 3 | #' 4 | #' @description Sequential experimental design for optimal stopping problems with several 5 | #' adaptive batching heuristics based on Lyu & Ludkovski (2020+) 6 | #' 7 | #' @details Implements the adaptive batching strategy defined in \code{mode$batch.heuristic}. 8 | #' Calls \code{lhs} from library \pkg{tgp}. Possible batch heuristics are: 9 | #' \itemize{ 10 | #' \item \code{fb}: [Default] fixed batch amounts (essentially same as \link{osp.seq.design}) 11 | #' \item \code{mlb}: Multi-level batching; relies on \code{model$r.cand} 12 | #' \item \code{rb}: Ratchet batching; relies on \code{model$r.cand} 13 | #' \item \code{absur}: Adaptive batching with Stepwise Uncertainty Reduction; relies on \code{model$t0} 14 | #' \item \code{adsa}: Adaptive Design with Sequential Allocation 15 | #' \item \code{ddsa}: Deterministic ADSA that alternates between adding a new input site and allocating 16 | #' to existing sites 17 | #' } 18 | #' 19 | #' All heuristics also require specifying the acquisition function for expected improvement criterion 20 | #' via \code{model$ei.func}, see \link{osp.seq.design} 21 | #' 22 | #' @param model a list containing all the model parameters. 23 | #' 24 | #' @param method A GP emulator to apply. Must be one of \code{km}, \code{trainkm} 25 | #' \code{homgp}, \code{homtp} or \code{hetgp} 26 | #' @param t0 parameter \code{t0} for the \code{ABSUR} heuristic [Default value is 0.01] 27 | #' @param is.gbm flag to indicate whether the underlying simulator is independent log-normals (used 28 | #' as part of density computation for integrated EI criteria) [Default FALSE] 29 | #' @export 30 | #' @return a list containing: 31 | #' \itemize{ 32 | #' \item \code{fit} a list of fitted response surfaces 33 | #' \item \code{timeElapsed} vector of time costs for each round 34 | #' \item \code{nsims} total number of 1-step \code{model$sim.func} calls 35 | #' \item \code{empLoss} vector of empirical losses 36 | #' \item \code{ndesigns}: number of unique designs k_T 37 | #' \item \code{batches}: matrix of replications r_i, indexed by time-steps and by sequential rounds 38 | #' } 39 | #' @references 40 | #' M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 41 | #' in Noisy Level Set Estimation, 42 | #' 43 | #' @seealso [mlOSP::osp.seq.design] 44 | #' 45 | #' @examples 46 | #' sob30 <- randtoolbox::sobol(55, d=2) # construct a space-filling initial design 47 | #' sob30 <- sob30[ which( sob30[,1] + sob30[,2] <= 1) ,] 48 | #' sob30 <- 25+30*sob30 49 | #' model2d <- list(x0 = rep(40,2),K=40,sigma=rep(0.2,2),r=0.06, 50 | #' div=0,T=1,dt=0.04,dim=2,sim.func=sim.gbm, 51 | #' payoff.func=put.payoff, look.ahead=1, pilot.nsims=1000, 52 | #' cand.len=1000,max.lengthscale=c(40,40),min.lengthscale=c(3,3), 53 | #' seq.design.size=50,batch.nrep=25,total.budget=2000,init.size=30, 54 | #' init.grid=sob30, kernel.family="gauss",update.freq=5, 55 | #' r.cand=c(20, 30,40,50,60, 80, 120, 160)) 56 | #' set.seed(11) 57 | #' require(tgp) 58 | #' require(DiceKriging) 59 | #' require(laGP) 60 | #' require(ks) 61 | #' model2d$batch.heuristic <- 'adsa' 62 | #' model2d$ei.func <- 'amcu' 63 | #' oos.obj.adsa <- osp.seq.batch.design(model2d,method="trainkm") 64 | #' plt.2d.surf.with.batch(oos.obj.adsa$fit[[15]], 65 | #' oos.obj.adsa$batches[1:oos.obj.adsa$ndesigns[15] - 1, 15]) 66 | osp.seq.batch.design <- function(model, method="km", t0 = 0.01, is.gbm=FALSE) 67 | { 68 | M <- model$T/model$dt 69 | if (method %in% c('km','trainkm','hetgp','homgp','homtp') == FALSE) 70 | stop("Regression `method` must case-match one of implemented choices.") 71 | 72 | t.start <- Sys.time() 73 | cur.sim <- 0 74 | if (is.null(model$ei.func)) { 75 | model$ei.func <- "csur" 76 | } 77 | if (is.null(model$update.freq)) { 78 | model$update.freq <- 10 79 | } 80 | if (is.null(model$batch.heuristic)) { 81 | model$batch.heuristic <- 'fb' 82 | } 83 | if (is.null(model$cand.len)) 84 | model$cand.len <- 500*model$dim 85 | 86 | if (is.null(model$pilot.nsims)) 87 | model$pilot.nsims <- 5*model$init.size 88 | 89 | # parameters in absur 90 | if (is.null(model$total.budget)) { 91 | model$total.budget = model$seq.design.size * model$batch.nrep 92 | } 93 | if (is.null(model$c.batch)) # parameter for new replicates in adsa and ddsa 94 | model$c.batch = 20 / model$dim 95 | if (is.null(model$r.cand)) { 96 | model$r.cand = c(model$batch.nrep, model$batch.nrep) 97 | } 98 | r_lower = model$r.cand[1] 99 | r_upper = min(model$r.cand[length(model$r.cand)], 0.1 * model$total.budget) 100 | r_interval = seq(r_lower, r_upper, length = 1000) 101 | theta_for_optim = c(0.1371, 0.000815, 1.9871E-6) # c_{oh} in eq. (13) 102 | batch_matrix <- matrix(rep(0, M*model$seq.design.size), ncol=M) 103 | 104 | 105 | fits <- list() # list of emulator objects at each step 106 | pilot.paths <- list() 107 | emp.loss <- array(0, dim=c(M,model$seq.design.size-model$init.size)) 108 | update.kernel.iters <- seq(0,model$seq.design.size,by=model$update.freq) # when to refit the whole GP 109 | 110 | # set-up a skeleton to understand the distribution of X 111 | pilot.paths[[1]] <- model$sim.func( matrix(rep(model$x0[1:model$dim], model$pilot.nsims), 112 | nrow=model$pilot.nsims, byrow=T), model, model$dt) 113 | for (i in 2:(M-1)) { 114 | pilot.paths[[i]] <- model$sim.func( pilot.paths[[i-1]], model, model$dt) 115 | } 116 | pilot.paths[[1]] <- pilot.paths[[3]] 117 | init.grid <- pilot.paths[[M-1]] 118 | budget.used <- rep(0,M-1) 119 | theta.fit <- array(0, dim=c(M,model$seq.design.size-model$init.size+1,model$dim)) 120 | 121 | ############ step back in time 122 | for (i in (M-1):1) 123 | { 124 | all.X <- matrix( rep(0, (model$dim+2)*model$seq.design.size), ncol=model$dim+2) 125 | 126 | # construct the input domain where candidates will be looked for 127 | if (is.null(model$lhs.rect)) { 128 | model$lhs.rect <- 0.02 129 | } 130 | if (length(model$lhs.rect) == 1) { 131 | lhs.rect <- matrix( rep(0, 2*model$dim), ncol=2) 132 | # create a box using empirical quantiles of the init.grid cloud 133 | for (jj in 1:model$dim) 134 | lhs.rect[jj,] <- quantile( init.grid[,jj], c(model$lhs.rect, 1-model$lhs.rect) ) 135 | } else { # already specified 136 | lhs.rect <- model$lhs.rect 137 | } 138 | 139 | if (is.null(model$min.lengthscale)) { 140 | model$min.lengthscale <- rep(0.1, model$dim) 141 | } 142 | 143 | # Candidate grid of potential NEW sites to add. Will be ranked using the EI acquisition function 144 | # only keep in-the-money sites 145 | ei.cands <- tgp::lhs( model$cand.len, lhs.rect ) # from tgp package 146 | ei.cands <- ei.cands[ model$payoff.func( ei.cands,model) > 0,,drop=F] 147 | 148 | 149 | # initial design 150 | if (is.null(model$init.grid)) { 151 | init.grid <- tgp::lhs(model$init.size, lhs.rect) 152 | } else { 153 | init.grid <- model$init.grid 154 | } 155 | if (model$dim > 1) { 156 | K0 <- dim(init.grid)[1] 157 | 158 | # initial conditions for all the forward paths: replicated design with batch.nrep 159 | big.grid <- init.grid[ rep(1:K0, model$batch.nrep),] 160 | } else { 161 | K0 <- length(init.grid) 162 | big.grid <- as.matrix(init.grid[ rep(1:K0, model$batch.nrep)]) 163 | } 164 | 165 | fsim <- forward.sim.policy( big.grid, M-i,fits[i:M],model, compact=T, offset=0) 166 | cur.sim <- cur.sim + fsim$nsims 167 | 168 | # payoff at t 169 | immPayoff <- model$payoff.func( init.grid, model) 170 | 171 | # batched mean and variance 172 | for (jj in 1:K0) { 173 | all.X[jj,model$dim+1] <- mean( fsim$payoff[ jj + seq(from=0,len=model$batch.nrep,by=K0)]) - immPayoff[ jj] 174 | all.X[jj,model$dim+2] <- var( fsim$payoff[ jj + seq(from=0,len=model$batch.nrep,by=K0)]) 175 | } 176 | all.X[1:K0,1:model$dim] <- init.grid # use first dim+1 columns for batched GP regression 177 | k <- K0 178 | batch_matrix[1:K0, i] <- model$batch.nrep 179 | 180 | # create the km object 181 | if (method == "km") { 182 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=init.grid), response=data.frame(y=all.X[1:k,model$dim+1]), 183 | noise.var=all.X[1:k,model$dim+2]/model$batch.nrep, 184 | control=list(trace=F), lower=model$min.lengthscale, covtype=model$kernel.family, 185 | #nugget.estim= TRUE, 186 | coef.trend=0, coef.cov=model$km.cov, coef.var=model$km.var) 187 | } 188 | if (method == "trainkm") { 189 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=init.grid), 190 | response=data.frame(y=all.X[1:k,model$dim+1]), 191 | nugget.estim=TRUE, # 192 | #noise.var=all.X[1:k,model$dim+2]/model$batch.nrep, 193 | covtype=model$kernel.family, 194 | control=list(trace=F), lower=model$min.lengthscale, upper=model$max.lengthscale) 195 | #if (coef(fits[[i]])$sd2 < 1e-10) 196 | # fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=init.grid), 197 | # response=data.frame(y=all.X[1:k,model$dim+1]), 198 | # nugget.estim=TRUE, # 199 | # #noise.var=all.X[1:k,model$dim+2]/model$batch.nrep, 200 | # covtype=model$kernel.family, 201 | # control=list(trace=F), lower=model$min.lengthscale, upper=model$max.lengthscale) 202 | 203 | theta.fit[i,k-model$init.size+1,] <- coef(fits[[i]])$range 204 | } 205 | if (method == "homtp") { 206 | fits[[i]] <- hetGP::mleHomTP(X=all.X[1:k,1:model$dim], Z=all.X[1:k,model$dim+1], 207 | lower=model$min.lengthscale,upper=model$max.lengthscale, 208 | covtype=model$kernel.family, noiseControl=list(nu_bounds=c(2.01,5))) 209 | theta.fit[i,k-model$init.size+1,] <- fits[[i]]$theta 210 | } 211 | if (method == "homgp") { 212 | big.payoff <- model$payoff.func(big.grid,model) 213 | hetData <- hetGP::find_reps(big.grid, fsim$payoff-big.payoff) 214 | fits[[i]] <- hetGP::mleHomGP(X = list(X0=hetData$X0, Z0=hetData$Z0,mult=hetData$mult), Z= hetData$Z, 215 | lower = model$min.lengthscale, upper = model$max.lengthscale, 216 | covtype=model$kernel.family) 217 | 218 | theta.fit[i,k-model$init.size+1,] <- fits[[i]]$theta 219 | } 220 | if (method== "hetgp") { 221 | big.payoff <- model$payoff.func(big.grid,model) 222 | hetData <- hetGP::find_reps(big.grid, fsim$payoff-big.payoff) 223 | fits[[i]] <- hetGP::mleHetGP(X = list(X0=hetData$X0, Z0=hetData$Z0,mult=hetData$mult), Z= hetData$Z, 224 | lower = model$min.lengthscale, upper = model$max.lengthscale, 225 | covtype=model$kernel.family) 226 | theta.fit[i,k-model$init.size+1,] <- fits[[i]]$theta 227 | } 228 | 229 | # initialize gamma for RB and MLB 230 | gamma <- sqrt(mean(all.X[1:K0,model$dim+2])/model$batch.nrep) / 2 231 | r_batch = model$batch.nrep 232 | 233 | # active learning loop 234 | add.more.sites <- TRUE 235 | k <- K0 + 1 236 | running_budget = model$batch.nrep*K0 237 | n <- K0 + 1 238 | 239 | if (i == 43) 240 | browser() 241 | 242 | # active learning loop: 243 | # to do it longer for later points to minimize error build-up use: *(1 + i/M) 244 | while(add.more.sites) 245 | { 246 | # predict on the candidate grid. Need predictive GP mean, posterior GP variance and similation Stdev 247 | if (method == "km" | method == "trainkm") { 248 | pred.cands <- predict(fits[[i]],data.frame(x=ei.cands), type="UK") 249 | cand.mean <- pred.cands$mean 250 | cand.sd <- pred.cands$sd 251 | nug <- sqrt(coef(fits[[i]])$nug) 252 | #nug <- sqrt(mean(all.X[1:(k-1), model$dim+2])) 253 | nug <- rep(nug, length(cand.mean)) # noise.var=all.X[1:n,model$dim+2]/batch_matrix[1:n,i] 254 | } else { # hetGP/homTP/homGP 255 | pred.cands <- predict(x=ei.cands, object=fits[[i]]) 256 | cand.mean <- pred.cands$mean 257 | cand.sd <- sqrt(pred.cands$sd2) 258 | nug <- sqrt(pred.cands$nugs) 259 | } 260 | losses <- cf.el(cand.mean, cand.sd) 261 | 262 | # multiply by weights based on the distribution of pilot paths 263 | dX_1 <- pilot.paths[[i]]; dX_2 <- ei.cands 264 | for (dd in 1:model$dim) { 265 | dX_1[,dd] <- dX_1[,dd]/(lhs.rect[dd,2]-lhs.rect[dd,1]) 266 | dX_2[,dd] <- dX_2[,dd]/(lhs.rect[dd,2]-lhs.rect[dd,1]) 267 | } 268 | # from package lagp 269 | ddx <- laGP::distance( dX_1, dX_2) 270 | x.dens <- apply( exp(-ddx*dim(ei.cands)[1]*0.01), 2, sum) 271 | emp.loss[i,k-model$init.size] <- sum(losses*x.dens)/sum(x.dens) 272 | #if (is.null(model$el.thresh) == FALSE) { # stop if below the Emp Loss threshold 273 | # if (emp.loss[i,k-model$init.size] < model$el.thresh) { 274 | # add.more.sites <- FALSE 275 | # } 276 | #} 277 | 278 | # use active learning measure to select new sites and associated replication 279 | if (model$ei.func == 'absur') { 280 | overhead = 3 * model$dim * CalcOverhead(theta_for_optim[1], theta_for_optim[2], theta_for_optim[3], k + 1) 281 | al.weights <- cf.absur(cand.mean, cand.sd, nug, r_interval, overhead, t0) 282 | 283 | # select site and replication with highest EI score 284 | x.dens.matrix <- matrix(x.dens, nrow=length(x.dens), ncol=length(r_interval)) 285 | ei.weights <- x.dens.matrix * al.weights 286 | 287 | # select next input location 288 | max_index <- which.max(ei.weights) 289 | x_index <- max_index %% length(x.dens) 290 | x_index <- ifelse(x_index, x_index, length(x.dens)) 291 | add.grid <- ei.cands[x_index,,drop=F] 292 | 293 | # select associated batch size 294 | r_index <- (max_index - 1) / model$cand.len + 1 295 | r_batch <- min(round(r_interval[r_index]), model$total.budget - running_budget) 296 | } else { 297 | # use active learning measure to select new sites 298 | if (model$ei.func == 'sur') 299 | al.weights <- cf.sur(cand.mean, cand.sd, nugget = nug / sqrt(model$batch.nrep)) 300 | if (model$ei.func == 'tmse') 301 | al.weights <- cf.tMSE(cand.mean, cand.sd, seps = model$tmse.eps) 302 | if (model$ei.func == 'mcu') 303 | al.weights <- cf.mcu(cand.mean, cand.sd) 304 | if (model$ei.func == 'smcu') 305 | al.weights <- cf.smcu(cand.mean, cand.sd, model$ucb.gamma) 306 | if (model$ei.func == 'amcu') { # adaptive gamma from paper with Xiong 307 | adaptive.gamma <- (quantile(cand.mean, 0.75, na.rm = T) - quantile(cand.mean, 0.25, na.rm = T))/mean(cand.sd) 308 | al.weights <- cf.smcu(cand.mean, cand.sd, adaptive.gamma) 309 | } 310 | if (model$ei.func == 'csur') 311 | al.weights <- cf.csur(cand.mean, cand.sd,nugget=nug / sqrt(model$batch.nrep)) 312 | if (model$ei.func == 'icu' || model$batch.heuristic == 'adsa' || model$batch.heuristic == 'ddsa') { 313 | # Integrated contour uncertainty with weights based on *Hard-coded* log-normal density and create testing points 314 | if (is.gbm) { 315 | x.dens2 <- dlnorm( ei.cands[,1], meanlog=log(model$x0[1])+(model$r-model$div - 0.5*model$sigma[1]^2)*i*model$dt, 316 | sdlog = model$sigma[1]*sqrt(i*model$dt) ) 317 | jdim <- 2 318 | while (jdim <= model$dim) { 319 | x.dens2 <- x.dens2*dlnorm( ei.cands[,jdim], 320 | meanlog=log(model$x0[jdim])+(model$r-model$div-0.5*model$sigma[jdim]^2)*i*model$dt, 321 | sdlog = model$sigma[jdim]*sqrt(i*model$dt) ) 322 | jdim <- jdim+1 323 | } 324 | } else { 325 | x.dens2 <- x.dens } 326 | 327 | if (model$ei.func == 'icu' & method == "hetgp") { # only works with hetGP 328 | kxprime <- cov_gen(X1 = fits[[i]]$X0, X2 = ei.cands, theta = fits[[i]]$theta, type = fits[[i]]$covtype) 329 | al.weights <- apply(ei.cands,1, crit_ICU, model=fits[[i]], thres = 0, Xref=ei.cands, 330 | w = x.dens2, preds = pred.cands, kxprime = kxprime) 331 | } 332 | 333 | } 334 | 335 | # select site with highest EI score 336 | if (model$ei.func != 'icu') { 337 | ei.weights <- x.dens*al.weights 338 | } else { 339 | ei.weights<- al.weights # those are negative for ICU 340 | } 341 | 342 | x_index <- which.max(ei.weights) 343 | add.grid <- ei.cands[x_index,,drop=F] 344 | } 345 | 346 | # use active batching algorithms to select batch size 347 | if (model$batch.heuristic == 'fb') { 348 | r_batch = model$batch.nrep 349 | } else { 350 | r0 = min(model$total.budget - running_budget, round(model$c.batch*sqrt(k))) 351 | if (model$batch.heuristic == 'rb') { 352 | rb_batch <- batch.rb(cand.sd[x_index], model$r.cand, r_batch, nug[x_index], gamma) 353 | r_batch <- min(rb_batch$roptim, model$total.budget - running_budget) 354 | gamma <- rb_batch$gamma 355 | } 356 | if (model$batch.heuristic == 'mlb') { 357 | mlb_batch <- batch.mlb(cand.sd[x_index], model$r.cand, nug[x_index], gamma) 358 | r_batch <- min(mlb_batch$roptim, model$total.budget - running_budget) 359 | gamma <- mlb_batch$gamma 360 | } 361 | if (model$batch.heuristic == 'adsa') { 362 | adsa_batch <- batch.adsa(fits[[i]], batch_matrix[1:(n - 1), i], ei.cands, x.dens2, add.grid, r0, nug, method) 363 | add.grid <- adsa_batch$x_optim 364 | r_batch <- adsa_batch$r_optim 365 | #browser() 366 | } 367 | if (model$batch.heuristic == 'ddsa') { 368 | if (k%%2) { 369 | r_batch <- batch.ddsa(fits[[i]], batch_matrix[1:(n - 1), i], ei.cands, x.dens2, r0, method)$r_new 370 | } else { 371 | r_batch <- r0 372 | } 373 | } 374 | } 375 | 376 | # if using up all budget, move on to next time step 377 | if (length(r_batch) == 1 && is.numeric(r_batch) && r_batch == 0) { 378 | add.more.sites <- FALSE 379 | next 380 | } 381 | 382 | if(is.null(add.grid) || model$batch.heuristic == 'ddsa' && k%%2) { # Reallocation 383 | 384 | # Indexes for inputs which receives further allocation 385 | idx_diff = which(r_batch != batch_matrix[1:(n - 1), i]) 386 | r_seq_diff = r_batch[idx_diff] - batch_matrix[idx_diff, i] 387 | 388 | ids <- seq(1, length(r_seq_diff)) 389 | ids_rep <- unlist(mapply(rep, ids, r_seq_diff)) 390 | 391 | newX <- all.X[idx_diff, 1:model$dim] 392 | newX <- matrix(unlist(mapply(rep, newX, r_seq_diff)), ncol = model$dim) 393 | 394 | # compute corresponding y-values 395 | fsim <- forward.sim.policy(newX, M-i, fits[i:M], model, compact=T, offset=0) 396 | cur.sim <- cur.sim + fsim$nsims 397 | 398 | # payoff at t 399 | immPayoff <- model$payoff.func(newX, model) 400 | newY <- fsim$payoff - immPayoff 401 | 402 | add.mean <- tapply(newY, ids_rep, mean) 403 | add.var <- tapply(newY, ids_rep, var) 404 | add.var[is.na(add.var)] <- 0.0000001 405 | 406 | y_new = (all.X[idx_diff, model$dim+1] * batch_matrix[idx_diff, i] + add.mean * r_seq_diff) / r_batch[idx_diff] 407 | var_new = (all.X[idx_diff, model$dim+2] * (batch_matrix[idx_diff, i] - 1) + 408 | batch_matrix[idx_diff, i] * all.X[idx_diff, model$dim+1] ^ 2 + 409 | add.var * (r_seq_diff - 1) + r_seq_diff * add.mean ^ 2) / (batch_matrix[idx_diff, i] - 1) 410 | all.X[idx_diff, model$dim + 1] = y_new 411 | all.X[idx_diff, model$dim + 2] = var_new 412 | batch_matrix[1:(n - 1), i] = r_batch 413 | 414 | running_budget = running_budget + sum(r_seq_diff) 415 | 416 | if (k %in% update.kernel.iters) { 417 | if (method == "km") 418 | fits[[i]] <- km(y~0, design=data.frame(x=all.X[1:n - 1, 1:model$dim]), 419 | response=data.frame(y=all.X[1:n - 1, model$dim+1]), 420 | noise.var=all.X[1:n - 1,model$dim+2]/batch_matrix[1:(n - 1), i], 421 | covtype=model$kernel.family, coef.trend=0, coef.cov=model$km.cov, 422 | coef.var=model$km.var, control=list(trace=F), 423 | lower=model$min.lengthscale,upper=model$max.lengthscale) 424 | if (method == "trainkm") { 425 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=all.X[1:n - 1, 1:model$dim]), 426 | response=data.frame(y=all.X[1:n - 1, model$dim+1]), 427 | nugget.estim=TRUE, 428 | #noise.var=all.X[1:n - 1,model$dim+2]/batch_matrix[1:(n - 1), i], 429 | covtype=model$kernel.family, control=list(trace=F), 430 | lower=model$min.lengthscale, upper=model$max.lengthscale) 431 | theta.fit[i,n-model$init.size+1,] <- coef(fits[[i]])$range 432 | } 433 | if (method == "hetgp" | method == "homgp") { 434 | fits[[i]] <- update(object=fits[[i]], Xnew=newX, Znew=newY, method="mixed", 435 | lower = model$min.lengthscale, upper=model$max.lengthscale) 436 | theta.fit[i,n-model$init.size+1,] <- fits[[i]]$theta 437 | } 438 | if (method == "homtp") { 439 | fits[[i]] <- hetGP::mleHomTP(X=all.X[1:n - 1, 1:model$dim], Z=all.X[1:n - 1, model$dim+1], 440 | noiseControl=list(nu_bounds=c(2.01,5)), 441 | lower=model$min.lengthscale,upper=model$max.lengthscale, covtype=model$kernel.family) 442 | theta.fit[i,n-model$init.size+1,] <- fits[[i]]$theta 443 | } 444 | } else { 445 | if (method == "km" | method == "trainkm") 446 | fits[[i]] <- update(fits[[i]], newX=all.X[idx_diff, 1:model$dim, drop = F], newy=y_new, 447 | #newnoise=var_new / r_batch, 448 | newX.alreadyExist=T, cov.re=F) 449 | if (method == "hetgp" | method == "homgp") 450 | fits[[i]] <- update(object=fits[[i]], Xnew=newX, Znew=newY, maxit = 0, method = "quick") 451 | if (method == "homtp" ) 452 | fits[[i]] <- update(object=fits[[i]], Xnew=all.X[idx_diff, 1:model$dim, drop = F], Znew=y_new, maxit = 0) 453 | } 454 | } else { # add a new input 455 | add.grid <- matrix(rep(add.grid[1, ,drop=F], r_batch), nrow = r_batch, byrow=T) #batch 456 | 457 | # compute corresponding y-values 458 | fsim <- forward.sim.policy( add.grid,M-i,fits[i:M],model,offset=0) 459 | cur.sim <- cur.sim + fsim$nsims 460 | 461 | immPayoff <- model$payoff.func(add.grid, model) 462 | add.mean <- mean(fsim$payoff - immPayoff) 463 | if (r_batch == 1) { 464 | add.var <- 0.00001 465 | } else { 466 | add.var <- var(fsim$payoff - immPayoff) 467 | } 468 | all.X[n,] <- c(add.grid[1,],add.mean,add.var) 469 | batch_matrix[n, i] <- r_batch 470 | 471 | if (k %in% update.kernel.iters) { 472 | if (method == "km") 473 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=all.X[1:n,1:model$dim]), 474 | response=data.frame(y=all.X[1:n,model$dim+1]), 475 | noise.var=all.X[1:n,model$dim+2]/batch_matrix[1:n,i], covtype=model$kernel.family, 476 | coef.trend=0, coef.cov=model$km.cov, 477 | coef.var=model$km.var, control=list(trace=F), 478 | lower=model$min.lengthscale,upper=model$max.lengthscale) 479 | if (method == "trainkm") { 480 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=all.X[1:n,1:model$dim]), 481 | response=data.frame(y=all.X[1:n,model$dim+1]), 482 | nugget.estim=TRUE, 483 | #noise.var=all.X[1:n,model$dim+2]/batch_matrix[1:n,i], 484 | covtype=model$kernel.family, control=list(trace=F), 485 | lower=model$min.lengthscale, upper=model$max.lengthscale) 486 | theta.fit[i,n-model$init.size+1,] <- coef(fits[[i]])$range 487 | } 488 | if (method == "hetgp" | method == "homgp") { 489 | fits[[i]] <- update(object=fits[[i]], Xnew=matrix(rep(add.grid[1,,drop=F], r_batch), nrow = r_batch, byrow = T), 490 | Znew=fsim$payoff-immPayoff,method="mixed", 491 | lower = model$min.lengthscale, upper=model$max.lengthscale) 492 | theta.fit[i,n-model$init.size+1,] <- fits[[i]]$theta 493 | } 494 | if (method == "homtp") { 495 | fits[[i]] <- hetGP::mleHomTP(X=all.X[1:n,1:model$dim], Z=all.X[1:n,model$dim+1], 496 | noiseControl=list(nu_bounds=c(2.01,5)), 497 | lower=model$min.lengthscale,upper=model$max.lengthscale, covtype=model$kernel.family) 498 | theta.fit[i,n-model$init.size+1,] <- fits[[i]]$theta 499 | } 500 | 501 | } else { 502 | if (method == "km" | method == "trainkm") 503 | fits[[i]] <- update(fits[[i]], newX=add.grid[1,,drop=F], newy=add.mean, 504 | #newnoise=add.var / r_batch, 505 | cov.re=F) 506 | if (method == "hetgp" | method == "homgp") 507 | fits[[i]] <- update(object=fits[[i]], Xnew=matrix(rep(add.grid[1,,drop=F], r_batch), nrow = r_batch, byrow = T), 508 | Znew=fsim$payoff-immPayoff, maxit = 0, method = "quick") 509 | if (method == "homtp" ) 510 | fits[[i]] <- update(object=fits[[i]], Xnew=add.grid[1,,drop=F], Znew=add.mean, maxit = 0) 511 | } 512 | 513 | running_budget = running_budget + sum(r_batch) 514 | n = n + 1 515 | } 516 | 517 | # resample the candidate set 518 | ei.cands <- lhs(model$cand.len, lhs.rect) 519 | ei.cands <- ei.cands[ model$payoff.func( ei.cands,model) > 0,,drop=F] 520 | if (n >= model$seq.design.size || running_budget >= model$total.budget) { 521 | add.more.sites <- FALSE 522 | } 523 | k <- k+1 524 | } 525 | budget.used[i] <- n 526 | } 527 | 528 | return (list(fit=fits,timeElapsed=Sys.time()-t.start,nsims=cur.sim,empLoss=emp.loss, 529 | ndesigns=budget.used, theta=theta.fit, batches = batch_matrix)) 530 | } 531 | -------------------------------------------------------------------------------- /R/ospSeqBatchDesignSimplified.R: -------------------------------------------------------------------------------- 1 | ################# 2 | #' @title Adaptive Batch design for optimal stopping (simplified version) 3 | #' 4 | #' @details Implements the adaptive batching strategy defined in batch.heuristic with model defined in method. 5 | #' @param method \code{km} to select the GP emulator to apply 6 | #' @param model a list containing all model parameters 7 | #' @return a list containing: 8 | #' \itemize{ 9 | #' \item \code{fit} a list of fitted response surfaces 10 | #' \item \code{ndesigns}: number of design size k_T 11 | #' \item \code{batches}: matrix of replications r_i 12 | #' } 13 | osp.seq.batch.design.simplified <- function(model, method="km") 14 | { 15 | M <- model$T/model$dt 16 | 17 | # replication range in absur 18 | r_lower <- model$r.cand[1] 19 | r_upper <- min(model$r.cand[length(model$r.cand)], 0.1 * model$total.budget) 20 | r_interval <- seq(r_lower, r_upper, length = 1000) 21 | theta_for_optim <- c(0.1371, 0.000815, 1.9871E-6) # c_{ovh} in eq. (13) 22 | batch_matrix <- matrix(rep(0, M*model$seq.design.size), ncol=M) 23 | 24 | # parameter in adsa and ddsa 25 | c_batch <- 20 / model$dim 26 | 27 | fits <- list() # list of emulator objects at each step 28 | pilot.paths <- list() 29 | update.kernel.iters <- seq(0,model$seq.design.size,by=model$update.freq) # when to refit the whole GP 30 | 31 | # set-up a skeleton to understand the distribution of X 32 | pilot.paths[[1]] <- model$sim.func( matrix(rep(model$x0[1:model$dim], 5*model$init.size), 33 | nrow=5*model$init.size, byrow=T), model, model$dt) 34 | for (i in 2:(M-1)) { 35 | pilot.paths[[i]] <- model$sim.func( pilot.paths[[i-1]], model, model$dt) 36 | } 37 | pilot.paths[[1]] <- pilot.paths[[3]] 38 | init.grid <- pilot.paths[[M-1]] 39 | budget.used <- rep(0,M-1) 40 | 41 | ############ step back in time 42 | for (i in (M-1):1) 43 | { 44 | all.X <- matrix( rep(0, (model$dim+2)*model$seq.design.size), ncol=model$dim+2) 45 | 46 | # construct the input domain where candidates will be looked for 47 | if (is.null(model$lhs.rect)) { 48 | model$lhs.rect <- 0.02 49 | } 50 | if (length(model$lhs.rect) == 1) { 51 | lhs.rect <- matrix(rep(0, 2*model$dim), ncol=2) 52 | # create a box using empirical quantiles of the init.grid cloud 53 | for (jj in 1:model$dim) 54 | lhs.rect[jj,] <- quantile( init.grid[,jj], c(model$lhs.rect, 1-model$lhs.rect) ) 55 | } else { # already specified 56 | lhs.rect <- model$lhs.rect 57 | } 58 | 59 | # Candidate grid of potential NEW sites to add. Will be ranked using the EI acquisition function 60 | # only keep in-the-money sites 61 | ei.cands <- lhs( model$cand.len, lhs.rect ) # from tgp package 62 | ei.cands <- ei.cands[ model$payoff.func( ei.cands,model) > 0,,drop=F] 63 | 64 | # initial design 65 | init.grid <- model$init.grid 66 | K0 <- dim(init.grid)[1] 67 | # initial conditions for all the forward paths: replicated design with batch.nrep 68 | big.grid <- init.grid[ rep(1:K0, model$batch.nrep),] 69 | fsim <- forward.sim.policy( big.grid, M-i,fits[i:M],model, compact=T, offset=0) 70 | 71 | # payoff at t 72 | immPayoff <- model$payoff.func( init.grid, model) 73 | 74 | # batched mean and variance 75 | for (jj in 1:K0) { 76 | all.X[jj,model$dim+1] <- mean( fsim$payoff[ jj + seq(from=0,len=model$batch.nrep,by=K0)]) - immPayoff[ jj] 77 | all.X[jj,model$dim+2] <- var( fsim$payoff[ jj + seq(from=0,len=model$batch.nrep,by=K0)]) 78 | } 79 | all.X[1:K0,1:model$dim] <- init.grid # use first dim+1 columns for batched GP regression 80 | k <- K0 81 | batch_matrix[1:K0, i] <- model$batch.nrep 82 | 83 | # create the km object 84 | fits[[i]] <- km(y~0, design=data.frame(x=init.grid), response=data.frame(y=all.X[1:k,model$dim+1]), 85 | noise.var=all.X[1:k,model$dim+2]/model$batch.nrep, covtype=model$kernel.family, 86 | control=list(trace=F), lower=model$min.lengthscale, upper=model$max.lengthscale) 87 | 88 | # initialize gamma for RB and MLB 89 | gamma <- sqrt(mean(all.X[1:K0,model$dim+2])/model$batch.nrep) / 2 90 | r_batch = model$batch.nrep 91 | 92 | # active learning loop 93 | add.more.sites <- TRUE 94 | k <- K0 + 1 95 | running_budget = model$batch.nrep*K0 96 | n <- K0 + 1 97 | 98 | # active learning loop: 99 | # to do it longer for later points to minimize error build-up use: *(1 + i/M) 100 | while(add.more.sites) 101 | { 102 | # predict on the candidate grid. Need predictive GP mean, posterior GP variance and similation Stdev 103 | pred.cands <- predict(fits[[i]],data.frame(x=ei.cands), type="UK") 104 | cand.mean <- pred.cands$mean 105 | cand.sd <- pred.cands$sd 106 | nug <- sqrt(mean(all.X[1:(k-1), model$dim+2])) 107 | nug <- rep(nug, length(cand.mean)) 108 | 109 | losses <- cf.el(cand.mean, cand.sd) 110 | 111 | # multiply by weights based on the distribution of pilot paths 112 | dX_1 <- pilot.paths[[i]]; dX_2 <- ei.cands 113 | for (dd in 1:model$dim) { 114 | dX_1[,dd] <- dX_1[,dd]/(lhs.rect[dd,2]-lhs.rect[dd,1]) 115 | dX_2[,dd] <- dX_2[,dd]/(lhs.rect[dd,2]-lhs.rect[dd,1]) 116 | } 117 | # from package lagp 118 | ddx <- distance( dX_1, dX_2) 119 | x.dens <- apply( exp(-ddx*dim(ei.cands)[1]*0.01), 2, sum) 120 | 121 | # use active learning measure to select new sites and associated replication 122 | if (model$ei.func == 'absur') { 123 | overhead = 3 * model$dim * CalcOverhead(theta_for_optim[1], theta_for_optim[2], theta_for_optim[3], k + 1) 124 | al.weights <- cf.absur(cand.mean, cand.sd, nug, r_interval, overhead, model$t0) 125 | 126 | # select site and replication with highest EI score 127 | x.dens.matrix <- matrix(x.dens, nrow=length(x.dens), ncol=length(r_interval)) 128 | ei.weights <- x.dens.matrix * al.weights 129 | 130 | # select next input location 131 | max_index <- which.max(ei.weights) 132 | x_index <- max_index %% length(x.dens) 133 | x_index <- ifelse(x_index, x_index, length(x.dens)) 134 | add.grid <- ei.cands[x_index,,drop=F] 135 | 136 | # select associated batch size 137 | r_index <- (max_index - 1) / model$cand.len + 1 138 | r_batch <- min(round(r_interval[r_index]), model$total.budget - running_budget) 139 | } else { 140 | # use active learning measure to select new sites 141 | adaptive.gamma <- (quantile(cand.mean, 0.75, na.rm = T) - quantile(cand.mean, 0.25, na.rm = T))/mean(cand.sd) 142 | al.weights <- cf.smcu(cand.mean, cand.sd, adaptive.gamma) 143 | x.dens2 <- dlnorm( ei.cands[,1], meanlog=log(model$x0[1])+(model$r-model$div - 0.5*model$sigma[1]^2)*i*model$dt, 144 | sdlog = model$sigma[1]*sqrt(i*model$dt) ) 145 | x.dens2 <- x.dens2*dlnorm( ei.cands[,2], meanlog=log(model$x0[2])+(model$r-model$div-0.5*model$sigma[2]^2)*i*model$dt, 146 | sdlog = model$sigma[2]*sqrt(i*model$dt) ) 147 | 148 | # select site with highest EI score 149 | if (model$ei.func != 'icu') { 150 | ei.weights <- x.dens*al.weights 151 | } else { 152 | ei.weights<- al.weights # those are negative for ICU 153 | } 154 | 155 | x_index <- which.max(ei.weights) 156 | add.grid <- ei.cands[x_index,,drop=F] 157 | } 158 | 159 | # use active batching algorithms to select batch size 160 | if (model$batch.heuristic == 'fb') { 161 | r_batch = model$batch.nrep 162 | } else { 163 | r0 = min(model$total.budget - running_budget, round(c_batch*sqrt(k))) 164 | if (model$batch.heuristic == 'adsa') { 165 | adsa_batch <- batch.adsa(fits[[i]], batch_matrix[1:(n - 1), i], ei.cands, x.dens2, add.grid, r0, nug, method) 166 | add.grid <- adsa_batch$x_optim 167 | r_batch <- adsa_batch$r_optim 168 | } 169 | } 170 | 171 | # if using up all budget, move on to next time step 172 | if (is.numeric(r_batch) && r_batch == 0) { 173 | add.more.sites <- FALSE 174 | next 175 | } 176 | 177 | if(is.null(add.grid) || model$batch.heuristic == 'ddsa' && k%%2) { # Reallocation 178 | 179 | # Indexes for inputs which receives further allocation 180 | idx_diff = which(r_batch != batch_matrix[1:(n - 1), i]) 181 | r_seq_diff = r_batch[idx_diff] - batch_matrix[idx_diff, i] 182 | 183 | ids <- seq(1, length(r_seq_diff)) 184 | ids_rep <- unlist(mapply(rep, ids, r_seq_diff)) 185 | 186 | newX <- all.X[idx_diff, 1:model$dim] 187 | newX <- matrix(unlist(mapply(rep, newX, r_seq_diff)), ncol = model$dim) 188 | 189 | # compute corresponding y-values 190 | fsim <- forward.sim.policy(newX, M-i, fits[i:M], model, compact=T, offset=0) 191 | 192 | # payoff at t 193 | immPayoff <- model$payoff.func(newX, model) 194 | newY <- fsim$payoff - immPayoff 195 | 196 | add.mean <- tapply(newY, ids_rep, mean) 197 | add.var <- tapply(newY, ids_rep, var) 198 | add.var[is.na(add.var)] <- 0.0000001 199 | 200 | y_new = (all.X[idx_diff, model$dim+1] * batch_matrix[idx_diff, i] + add.mean * r_seq_diff) / r_batch[idx_diff] 201 | var_new = (all.X[idx_diff, model$dim+2] * (batch_matrix[idx_diff, i] - 1) + batch_matrix[idx_diff, i] * all.X[idx_diff, model$dim+1] ^ 2 + add.var * (r_seq_diff - 1) + r_seq_diff * add.mean ^ 2) / (batch_matrix[idx_diff, i] - 1) 202 | all.X[idx_diff, model$dim + 1] = y_new 203 | all.X[idx_diff, model$dim + 2] = var_new 204 | batch_matrix[1:(n - 1), i] = r_batch 205 | 206 | running_budget = running_budget + sum(r_seq_diff) 207 | 208 | if (k %in% update.kernel.iters) { 209 | fits[[i]] <- km(y~0, design=data.frame(x=all.X[1:n - 1, 1:model$dim]), response=data.frame(y=all.X[1:n - 1, model$dim+1]), 210 | noise.var=all.X[1:n - 1,model$dim+2]/r_batch, covtype=model$kernel.family, control=list(trace=F), 211 | lower=model$min.lengthscale, upper=model$max.lengthscale) 212 | } else { 213 | fits[[i]] <- update(fits[[i]], newX=all.X[idx_diff, 1:model$dim, drop = F], newy=y_new, 214 | newnoise=var_new / r_batch, newX.alreadyExist=T, cov.re=F) 215 | } 216 | } else { # add a new input 217 | add.grid <- matrix(rep(add.grid[1, ,drop=F], r_batch), nrow = r_batch, byrow=T) #batch 218 | 219 | # compute corresponding y-values 220 | fsim <- forward.sim.policy( add.grid,M-i,fits[i:M],model,offset=0) 221 | 222 | immPayoff <- model$payoff.func(add.grid, model) 223 | add.mean <- mean(fsim$payoff - immPayoff) 224 | if (r_batch == 1) { 225 | add.var <- 0.00001 226 | } else { 227 | add.var <- var(fsim$payoff - immPayoff) 228 | } 229 | all.X[n,] <- c(add.grid[1,],add.mean,add.var) 230 | batch_matrix[n, i] <- r_batch 231 | 232 | if (k %in% update.kernel.iters) { 233 | fits[[i]] <- km(y~0, design=data.frame(x=all.X[1:n,1:model$dim]), response=data.frame(y=all.X[1:n,model$dim+1]), 234 | noise.var=all.X[1:n,model$dim+2]/r_batch, covtype=model$kernel.family, control=list(trace=F), 235 | lower=model$min.lengthscale, upper=model$max.lengthscale) 236 | } else { 237 | fits[[i]] <- update(fits[[i]], newX=add.grid[1,,drop=F], newy=add.mean, 238 | newnoise=add.var / r_batch, cov.re=F) 239 | } 240 | 241 | running_budget = running_budget + sum(r_batch) 242 | n = n + 1 243 | } 244 | 245 | # resample the candidate set 246 | ei.cands <- lhs(model$cand.len, lhs.rect) 247 | ei.cands <- ei.cands[ model$payoff.func( ei.cands,model) > 0,,drop=F] 248 | if (n >= model$seq.design.size || running_budget >= model$total.budget) { 249 | add.more.sites <- FALSE 250 | } 251 | k <- k+1 252 | } 253 | budget.used[i] <- n 254 | } 255 | 256 | return (list(fit=fits, ndesigns=budget.used, batches = batch_matrix)) 257 | } 258 | -------------------------------------------------------------------------------- /R/ospSeqDesign.R: -------------------------------------------------------------------------------- 1 | ################# 2 | #' Regression Monte Carlo via sequential experimental design. The experimental design is augmented 3 | #' one input at a time, using an Expected Improvement (EI) acquisition function. This is repeated at 4 | #' each time step. The method is likely to be somewhat slow, but highly efficient in its use of underlying 5 | #' simulations. See Gramacy & Ludkovski (2013), Ludkovski (2018) for details. 6 | #' 7 | #' EI criteria are based on posterior and/or predictive variance and therefore require the use of a 8 | #' Gaussian-process based surrogate (currently from \pkg{DiceKriging} or \pkg{hetGP}). 9 | #' 10 | #' @title Sequential design for optimal stopping 11 | #' 12 | #' @details Implements the EI strategy defined in \code{model$ei.func}. Calls \code{lhs} from library \pkg{tgp}. 13 | #' Empirical losses are computed using \code{cf.el} function. The acquisition function is specified via 14 | #' \code{ei.func} which can be \code{csur} (Default), \code{sur}, \code{smcu}, \code{amcu}, 15 | #' \code{tmse} and \code{icu}. 16 | #' 17 | #' The experimental design is initialized via \code{init.size}/\code{init.grid} parameters and then is grown 18 | #' one input-at-a-time until it is of size \code{model$seq.design.size}. Thus, there are a total of 19 | #' seq.design.size-init.size sequential iterations. 20 | #' 21 | #' @param model a list containing all the model parameters. 22 | #' 23 | #' The following model parameters are used: 24 | #' \itemize{ 25 | #' \item \code{init.size}: size of starting grid (will be generated via lhs sampling if \code{init.grid} is not given) 26 | #' \item \code{pilot.nsims}: number of pilot simulations to create the search space where new inputs will be added 27 | #' (Default is 5*\code{model$init.size}) 28 | #' \item \code{cand.len}: number of candidate new inputs to be proposed. The next input is chosen greedily as 29 | #' the candidate that maximizes the EI criterion. Candidate inputs are selected via \code{tgp::lhs} 30 | #' (Default is 500*\code{model$dim}) 31 | #' \item \code{lhs.rect}: specification of the bounding hyper-rectangle where search is conducted 32 | #' (Default: construct based on 0.02/0.98 quantiles of the pilot paths in each dimension) 33 | #' \item \code{update.freq}: how often to re-fit the entire GP surrogate as new inputs are added (Default is 10) 34 | #' \item \code{batch.nrep} (REQUIRED): number of replicates at each unique input 35 | #' \item \code{min.lengthscale}: vector with minimum lengthscales of the surrogate (Default: 1% of lhs.rect dimensions) 36 | #' \item \code{max.lengthscale}: vector with maximum lengthscale of the surrogate (Default: 10x each of lhs.rect dimensions) 37 | #' \item \code{ei.func}: acquisition function (cSUR by Default) 38 | #' } 39 | #' 40 | #' @param method one of \code{km}, \code{trainkm}, \code{homtp} or \code{hetgp} to select the GP emulator to apply 41 | #' @export 42 | #' @return a list containing: 43 | #' \itemize{ 44 | #' \item \code{fit} a list of fitted response surfaces. 45 | #' \item \code{timeElapsed}, 46 | #' \item \code{nsims} total number of 1-step sim.func calls 47 | #' \item \code{budget} -- number of sequential iterations per time-step 48 | #' \item \code{empLoss} --matrix of empirical losses (rows for time-steps, columns for iterations) 49 | #' \item \code{theta.fit} -- 3d array of estimated lengthscales (sorted by time-steps,iterations,dimensions-of-x) 50 | #' } 51 | #' 52 | #' @references 53 | #' Mike Ludkovski, Kriging Metamodels and Experimental Design for Bermudan Option Pricing 54 | #' Journal of Computational Finance, 22(1), 37-77, 2018 55 | #' @author Mike Ludkovski 56 | osp.seq.design <- function(model,method="km") 57 | { 58 | M <- model$T/model$dt 59 | t.start <- Sys.time() 60 | cur.sim <- 0 61 | if (is.null(model$ei.func)) 62 | model$ei.func <- "csur" 63 | if (is.null(model$update.freq)) 64 | model$update.freq <- 10 65 | if (is.null(model$pilot.nsims)) 66 | model$pilot.nsims <- 5*model$init.size 67 | if (is.null(model$cand.len)) 68 | model$cand.len <- 500*model$dim 69 | if(is.null(model$ucb.gamma)) 70 | model$ucb.gamma <- 1.96 71 | 72 | fits <- list() # list of emulator objects at each step 73 | pilot.paths <- list() 74 | emp.loss <- array(0, dim=c(M,model$seq.design.size-model$init.size)) 75 | update.kernel.iters <- seq(0,model$seq.design.size,by=model$update.freq) # when to refit the whole GP 76 | 77 | # set-up a skeleton to understand the distribution of X 78 | pilot.paths[[1]] <- model$sim.func( matrix(rep(model$x0[1:model$dim], model$pilot.nsims), 79 | nrow=model$pilot.nsims, byrow=T), model, model$dt) 80 | for (i in 2:(M-1)) 81 | pilot.paths[[i]] <- model$sim.func( pilot.paths[[i-1]], model, model$dt) 82 | pilot.paths[[1]] <- pilot.paths[[3]] 83 | budget.used <- rep(0,M-1) 84 | theta.fit <- array(0, dim=c(M,model$seq.design.size-model$init.size+1,model$dim)) 85 | 86 | #------- step back in time 87 | for (i in (M-1):1) 88 | { 89 | all.X <- matrix( rep(0, (model$dim+2)*model$seq.design.size), ncol=model$dim+2) 90 | 91 | # construct the input domain where candidates will be looked for 92 | if (is.null(model$lhs.rect)) 93 | model$lhs.rect <- 0.02 94 | if (length(model$lhs.rect) == 1) 95 | { 96 | lhs.rect <- matrix( rep(0, 2*model$dim), ncol=2) 97 | # create a box using empirical quantiles of the pilot.paths cloud 98 | for (jj in 1:model$dim) 99 | lhs.rect[jj,] <- quantile( pilot.paths[[i]][,jj], c(model$lhs.rect, 1-model$lhs.rect) ) 100 | } 101 | else # already specified 102 | lhs.rect <- model$lhs.rect 103 | 104 | 105 | 106 | # Candidate grid of potential NEW sites to add. Will be ranked using the EI acquisition function 107 | # only keep in-the-money sites 108 | ei.cands <- tgp::lhs( model$cand.len, lhs.rect ) # from tgp package 109 | ei.cands <- ei.cands[ model$payoff.func( ei.cands,model) > 0,,drop=F] 110 | 111 | if (is.null(model$min.lengthscale)) 112 | model$min.lengthscale <- 0.01*diff(t(lhs.rect)) 113 | if (is.null(model$max.lengthscale)) 114 | model$max.lengthscale <- 10*diff(t(lhs.rect)) 115 | 116 | # initial design 117 | if (is.null(model$init.grid)) 118 | init.grid <- tgp::lhs( model$init.size, lhs.rect) 119 | else 120 | init.grid <- model$init.grid 121 | if (model$dim > 1) { 122 | K0 <- dim(init.grid)[1] 123 | 124 | # initial conditions for all the forward paths: replicated design with batch.nrep 125 | big.grid <- init.grid[ rep(1:K0, model$batch.nrep),] 126 | } 127 | else { 128 | K0 <- length(init.grid) 129 | big.grid <- as.matrix(init.grid[ rep(1:K0, model$batch.nrep)]) 130 | } 131 | 132 | fsim <- forward.sim.policy( big.grid, M-i,fits[i:M],model, compact=T,offset=0) 133 | cur.sim <- cur.sim + fsim$nsims 134 | 135 | # payoff at t 136 | immPayoff <- model$payoff.func( init.grid, model) 137 | 138 | # batched mean and variance 139 | for (jj in 1:K0) { 140 | all.X[jj,model$dim+1] <- mean( fsim$payoff[ jj + seq(from=0,len=model$batch.nrep,by=K0)]) - immPayoff[ jj] 141 | all.X[jj,model$dim+2] <- var( fsim$payoff[ jj + seq(from=0,len=model$batch.nrep,by=K0)]) 142 | } 143 | all.X[1:K0,1:model$dim] <- init.grid # use first dim+1 columns for batched GP regression 144 | k <- K0 145 | 146 | # create the km object 147 | if (method == "km") 148 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=init.grid), response=data.frame(y=all.X[1:k,model$dim+1]), 149 | noise.var=all.X[1:k,model$dim+2]/model$batch.nrep, 150 | control=list(trace=F), lower=model$min.lengthscale, covtype=model$kernel.family, 151 | coef.trend=0, coef.cov=model$km.cov, coef.var=model$km.var) 152 | if (method == "trainkm") { 153 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=init.grid), response=data.frame(y=all.X[1:k,model$dim+1]), 154 | noise.var=all.X[1:k,model$dim+2]/model$batch.nrep, covtype=model$kernel.family, 155 | control=list(trace=F), lower=model$min.lengthscale, upper=model$max.lengthscale) 156 | theta.fit[i,k-model$init.size+1,] <- coef(fits[[i]])$range 157 | } 158 | if (method == "homtp") { 159 | fits[[i]] <- hetGP::mleHomTP(X=init.grid, Z=all.X[1:k,model$dim+1], 160 | lower=model$min.lengthscale,upper=model$max.lengthscale, 161 | covtype=model$kernel.family, noiseControl=list(nu_bounds=c(2.01,10),sigma2_bounds = c(5e-2, 4))) 162 | theta.fit[i,k-model$init.size+1,] <- fits[[i]]$theta 163 | } 164 | 165 | if (method== "hetgp") { 166 | big.payoff <- model$payoff.func(big.grid,model) 167 | hetData <- find_reps(big.grid, fsim$payoff-big.payoff) 168 | fits[[i]] <- hetGP::mleHetGP(X = list(X0=hetData$X0, Z0=hetData$Z0,mult=hetData$mult), Z= hetData$Z, 169 | lower = model$min.lengthscale, upper = model$max.lengthscale, 170 | covtype=model$kernel.family) 171 | theta.fit[i,k-model$init.size+1,] <- fits[[i]]$theta 172 | } 173 | 174 | #----- active learning loop: 175 | add.more.sites <- TRUE; k <- K0 + 1 176 | # to do it longer for later points to minimize error build-up use: *(1 + i/M) 177 | while (add.more.sites == TRUE) 178 | { 179 | # predict on the candidate grid. Need predictive GP mean, posterior GP variance and similation Stdev 180 | if (method == "km" | method == "trainkm") { 181 | pred.cands <- predict(fits[[i]],data.frame(x=ei.cands), type="UK") 182 | cand.mean <- pred.cands$mean 183 | cand.sd <- pred.cands$sd 184 | nug <- sqrt(mean(all.X[1:(k-1),model$dim+2])/model$batch.nrep) 185 | } 186 | else { # hetGP/homTP 187 | pred.cands <- predict(x=ei.cands, object=fits[[i]]) 188 | cand.mean <- pred.cands$mean 189 | cand.sd <- sqrt(pred.cands$sd2) 190 | nug <- sqrt(pred.cands$nugs/model$batch.nrep) 191 | } 192 | losses <- cf.el(cand.mean, cand.sd) 193 | 194 | 195 | 196 | #---------- use active learning measure to select new sites 197 | #if (model$ei.func == 'ucb') 198 | # al.weights <- qEI.ucb(pred.cands, model$ucb.gamma*sqrt(log(k))) 199 | if (model$ei.func == 'sur') 200 | al.weights <- cf.sur(cand.mean, cand.sd, nugget = nug) 201 | if (model$ei.func == 'tmse') 202 | al.weights <- cf.tMSE(cand.mean, cand.sd, seps = model$tmse.eps) 203 | if (model$ei.func == 'mcu') 204 | al.weights <- cf.mcu(cand.mean, cand.sd) 205 | if (model$ei.func == 'smcu') 206 | al.weights <- cf.smcu(cand.mean, cand.sd, model$ucb.gamma) 207 | if (model$ei.func == 'amcu') { # adaptive gamma from paper with Xiong 208 | adaptive.gamma <- (quantile(cand.mean, 0.75) - quantile(cand.mean, 0.25))/mean(cand.sd) 209 | al.weights <- cf.smcu(cand.mean, cand.sd, adaptive.gamma) 210 | } 211 | if (model$ei.func == 'csur') 212 | al.weights <- cf.csur(cand.mean, cand.sd,nugget=nug) 213 | if (model$ei.func == 'icu') { 214 | # Integrated contour uncertainty with weights based on *Hard-coded* log-normal density 215 | x.dens2 <- dlnorm( ei.cands[,1], meanlog=log(model$x0[1])+(model$r-model$div - 0.5*model$sigma[1]^2)*i*model$dt, 216 | sdlog = model$sigma[1]*sqrt(i*model$dt) ) 217 | jdim <- 2 218 | while (jdim <= model$dim ) { 219 | 220 | x.dens2 <- x.dens2*dlnorm( ei.cands[,jdim], meanlog=log(model$x0[jdim])+(model$r-model$div-0.5*model$sigma[jdim]^2)*i*model$dt, 221 | sdlog = model$sigma[jdim]*sqrt(i*model$dt) ) 222 | jdim <- jdim+1 223 | } 224 | #plot(ei.cands[,1], ei.cands[,2], cex=cf.mcu(cand.mean, cand.sd)*x.dens2*4000,pch=19) 225 | 226 | kxprime <- cov_gen(X1 = fits[[i]]$X0, X2 = ei.cands, theta = fits[[i]]$theta, type = fits[[i]]$covtype) 227 | al.weights <- apply(ei.cands,1, crit_ICU, model=fits[[i]], thres = 0, Xref=ei.cands, 228 | w = x.dens2, preds = pred.cands, kxprime = kxprime) 229 | 230 | } 231 | # multiply by weights based on the distribution of pilot paths 232 | dX_1 <- pilot.paths[[i]]; dX_2 <- ei.cands 233 | for (dd in 1:model$dim) { 234 | dX_1[,dd] <- dX_1[,dd]/(lhs.rect[dd,2]-lhs.rect[dd,1]) 235 | dX_2[,dd] <- dX_2[,dd]/(lhs.rect[dd,2]-lhs.rect[dd,1]) 236 | } 237 | # from package lagp 238 | ddx <- laGP::distance( dX_1, dX_2) #ddx <- distance( pilot.paths[[i]], ei.cands) 239 | #x.dens <- apply( exp(-ddx/2/(lhs.rect[1,2]-lhs.rect[1,1])), 2, sum) 240 | x.dens <- apply( exp(-ddx*dim(ei.cands)[1]*0.01), 2, sum) 241 | 242 | 243 | if (model$ei.func != 'icu') 244 | ei.weights <- x.dens*al.weights 245 | else 246 | ei.weights<- al.weights # those are negative for ICU 247 | emp.loss[i,k-model$init.size] <- sum(losses*x.dens)/sum(x.dens) 248 | if (is.null(model$el.thresh) == FALSE) # stop if below the Emp Loss threshold 249 | if (emp.loss[i,k-model$init.size] < model$el.thresh) 250 | add.more.sites <- FALSE 251 | 252 | 253 | #add.grid <- ei.cands[sample(dim(ei.cands)[1],model$n.propose,replace=T, prob=ei.weights),,drop=F] 254 | # select site with highest EI score 255 | add.grid <- ei.cands[ which.max(ei.weights),,drop=F] 256 | add.grid <- matrix(rep(add.grid[1,,drop=F], model$batch.nrep), nrow=model$batch.nrep, byrow=T) #batch 257 | 258 | # compute corresponding y-values 259 | fsim <- forward.sim.policy( add.grid,M-i,fits[i:M],model,offset=0) 260 | cur.sim <- cur.sim + fsim$nsims 261 | 262 | immPayoff <- model$payoff.func(add.grid, model) 263 | add.mean <- mean(fsim$payoff - immPayoff) 264 | add.var <- var(fsim$payoff - immPayoff) 265 | all.X[k,] <- c(add.grid[1,],add.mean,add.var) 266 | 267 | if (k %in% update.kernel.iters) { 268 | if (method == "km") 269 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=all.X[1:k,1:model$dim]), response=data.frame(y=all.X[1:k,model$dim+1]), 270 | noise.var=all.X[1:k,model$dim+2]/model$batch.nrep, coef.trend=0, coef.cov=model$km.cov, 271 | coef.var=model$km.var, control=list(trace=F), 272 | lower=model$min.lengthscale,upper=model$max.lengthscale) 273 | if (method == "trainkm") { 274 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=all.X[1:k,1:model$dim]), response=data.frame(y=all.X[1:k,model$dim+1]), 275 | noise.var=all.X[1:k,model$dim+2]/model$batch.nrep, control=list(trace=F), 276 | lower=model$min.lengthscale, upper=model$max.lengthscale) 277 | theta.fit[i,k-model$init.size+1,] <- coef(fits[[i]])$range 278 | } 279 | if (method == "hetgp") { 280 | fits[[i]] <- update(object=fits[[i]], Xnew=add.grid, Znew=fsim$payoff-immPayoff,method="mixed", 281 | lower = model$min.lengthscale, upper=model$max.lengthscale) 282 | theta.fit[i,k-model$init.size+1,] <- fits[[i]]$theta 283 | } 284 | if (method == "homtp") { 285 | fits[[i]] <- hetGP::mleHomTP(X=all.X[1:k,1:model$dim], Z=all.X[1:k,model$dim+1],noiseControl=list(nu_bounds=c(2.01,10),sigma2_bounds=c(5e-2,4)), 286 | lower=model$min.lengthscale,upper=model$max.lengthscale, covtype=model$kernel.family) 287 | theta.fit[i,k-model$init.size+1,] <- fits[[i]]$theta 288 | } 289 | } 290 | else { 291 | if (method == "km" | method == "trainkm") 292 | fits[[i]] <- update(fits[[i]],newX=add.grid[1,,drop=F],newy=add.mean, 293 | newnoise=add.var/model$batch.nrep, cov.re=F) 294 | if (method == "hetgp" | method == "homtp") 295 | fits[[i]] <- update(object=fits[[i]], Xnew=add.grid, Znew=fsim$payoff-immPayoff, maxit = 0) 296 | } 297 | 298 | # resample the candidate set 299 | ei.cands <- tgp::lhs( model$cand.len, lhs.rect ) 300 | ei.cands <- ei.cands[ model$payoff.func( ei.cands,model) > 0,,drop=F] 301 | if (k >= model$seq.design.size) 302 | add.more.sites <- FALSE 303 | k <- k+1 304 | } 305 | budget.used[i] <- k 306 | } 307 | 308 | return (list(fit=fits,timeElapsed=Sys.time()-t.start,nsims=cur.sim,empLoss=emp.loss, 309 | budget=budget.used,theta=theta.fit)) 310 | } 311 | 312 | 313 | ######################################## 314 | #### Adaptive batching 315 | osp.batch.design <- function(model,input.domain=NULL, method ="km",inTheMoney.thresh = 0) 316 | { 317 | M <- model$T/model$dt 318 | t.start <- Sys.time() 319 | 320 | fits <- list() # list of fits at each step 321 | grids <- list() 322 | 323 | cur.sim <- 0 324 | 325 | # set-up pilot design using a forward simulation of X 326 | if (model$pilot.nsims > 0) { 327 | grids[[1]] <- model$sim.func( matrix(rep(model$x0[1:model$dim], model$pilot.nsims), 328 | nrow=model$pilot.nsims, byrow=T), model, model$dt) 329 | for (i in 2:(M-1)) 330 | grids[[i]] <- model$sim.func( grids[[i-1]], model, model$dt) 331 | grids[[1]] <- grids[[3]] 332 | cur.sim <- model$pilot.nsims 333 | } 334 | 335 | ############ step back in time 336 | design.size <- rep(0,M) 337 | 338 | for (i in (M-1):1) { 339 | # figure out design size -- this is the number of unique sites, before restricting to in-the-money 340 | if (length(model$N) == 1) 341 | design.size[i] <- model$N 342 | else 343 | design.size[i] <- model$N[i] 344 | 345 | #figure out batch size 346 | if (is.null(model$batch.nrep)) 347 | n.reps <- 1 348 | else if (length(model$batch.nrep) == 1) 349 | n.reps <- model$batch.nrep 350 | else 351 | n.reps <- model$batch.nrep[i] 352 | 353 | if (is.null(input.domain)) { # empirical design using simulated pilot paths 354 | init.grid <- grids[[i]] 355 | 356 | init.grid <- init.grid[sample(1:min(design.size[i],dim(init.grid)[1]), design.size[i], replace=F),,drop=F] 357 | } 358 | else if (length(input.domain)==2*model$dim | length(input.domain)==1) { 359 | # space-filling design over a rectangle 360 | if (length(input.domain) == 1){ 361 | # specifies the box as empirical quantiles, should be 0.01. If zero then use the full range 362 | my.domain <- matrix( rep(0, 2*model$dim), ncol=2) 363 | if (input.domain > 0) { 364 | for (jj in 1:model$dim) 365 | my.domain[jj,] <- quantile( grids[[i]][,jj], c(input.domain, 1-input.domain)) 366 | } 367 | else { 368 | for (jj in 1:model$dim) 369 | my.domain[jj,] <- range( grids[[i]][,jj] ) 370 | } 371 | } 372 | else my.domain <- input.domain # user-specified box 373 | 374 | # now choose how to space-fill 375 | if (is.null(model$qmc.method)) { 376 | init.grid <- tgp::lhs( design.size[i], my.domain) 377 | } 378 | else { 379 | init.grid <- model$qmc.method( design.size[i], dim=model$dim) 380 | # rescale to the correct rectangle 381 | for (jj in 1:model$dim) 382 | init.grid[,jj] <- my.domain[jj,1] + (my.domain[jj,2]-my.domain[jj,1])*init.grid[,jj] 383 | 384 | } 385 | } 386 | else { # fixed pre-specified design 387 | init.grid <- matrix(input.domain,nrow=length(input.domain)/model$dim) 388 | design.size[i] <- nrow(init.grid) 389 | } 390 | 391 | init.grid <- init.grid[ model$payoff.func(init.grid, model) > inTheMoney.thresh,,drop=F] 392 | 393 | design.size[i] <- dim(init.grid)[1] 394 | all.X <- matrix( rep(0, (model$dim+2)*design.size[i]), ncol=model$dim+2) 395 | 396 | # construct replicated design 397 | if (i == (M-1)) 398 | my.reps <- rep(n.reps,design.size[i]) 399 | else { 400 | pred <- predict(fits[[i+1]], x=init.grid) 401 | my.reps <- n.reps*model$batch.func( pred$mean) 402 | my.reps <- ceiling( model$N*my.reps/sum(my.reps)) 403 | print(sum(my.reps)) 404 | } 405 | 406 | big.grid <- matrix(nrow=sum(my.reps), ncol=model$dim) 407 | for (jj in 1:model$dim) 408 | big.grid[,jj] <- rep(init.grid[,jj], my.reps) 409 | #big.grid <- matrix(rep(t(init.grid), my.reps), ncol = ncol(init.grid), byrow = TRUE) 410 | 411 | fsim <- forward.sim.policy( big.grid, M-i,fits[i:M],model,offset=0) 412 | immPayoff <- model$payoff.func( init.grid, model) 413 | cur.sim <- cur.sim + fsim$nsims 414 | 415 | # pre-averaged mean/variance 416 | #for (jj in 1:design.size[i]) { 417 | # all.X[jj,model$dim+1] <- mean( fsim$payoff[ jj + seq(from=0,len=my.reps[jj],by=design.size[i])]) - immPayoff[ jj] 418 | # all.X[jj,model$dim+2] <- var( fsim$payoff[ jj + seq(from=0,len=n.reps,by=design.size[i])]) 419 | #} 420 | 421 | all.X[,1:model$dim] <- init.grid # use the first dim+1 columns for the batched GP regression. 422 | 423 | # create the km object 424 | if (n.reps > 10 & method == "km") 425 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=init.grid), response=data.frame(y=all.X[,model$dim+1]), 426 | noise.var=all.X[,model$dim+2]/n.reps, control=list(trace=F), 427 | lower=model$min.lengthscale, upper=model$max.lengthscale, 428 | coef.trend=0,coef.cov=model$km.cov, coef.var=model$km.var, covtype=model$kernel.family) 429 | else if (method == "km") # manually estimate the nugget for small batches 430 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=init.grid), response=data.frame(y=all.X[,model$dim+1]), 431 | control=list(trace=F), lower=model$min.lengthscale, upper=model$max.lengthscale, 432 | coef.trend=0, coef.cov=model$km.cov, coef.var=model$km.var, 433 | nugget.estim=TRUE, nugget=sqrt(mean(all.X[,model$dim+2])), covtype=model$kernel.family) 434 | else if (method =="trainkm") 435 | fits[[i]] <- DiceKriging::km(y~0, design=data.frame(x=init.grid), response=data.frame(y=all.X[,model$dim+1]), 436 | control=list(trace=F), lower=rep(0.1,model$dim), upper=model$max.lengthscale, 437 | noise.var=all.X[,model$dim+2]/n.reps, covtype=model$kernel.family) 438 | 439 | else if (n.reps < 10 & method == "lagp") # laGP library implementation 440 | fits[[i]] <- laGP::newGP(X=init.grid, Z=all.X[,model$dim+1], 441 | d=list(mle=FALSE, start=model$km.cov), g=list(start=1, mle=TRUE)) 442 | else if(method =="hetgp") { 443 | big.payoff <- model$payoff.func(big.grid,model) 444 | hetData <- find_reps(big.grid, fsim$payoff-big.payoff) 445 | fits[[i]] <- hetGP::mleHetGP(X = list(X0=hetData$X0, Z0=hetData$Z0,mult=hetData$mult), Z= hetData$Z, 446 | lower = model$min.lengthscale, upper = model$max.lengthscale, covtype=model$kernel.family) 447 | #ehtPred <- predict(x=check.x, object=hetModel) 448 | } 449 | else if (method =="homgp") { 450 | big.payoff <- model$payoff.func(big.grid,model) 451 | hetData <- hetGP::find_reps(big.grid, fsim$payoff-big.payoff) 452 | fits[[i]] <- hetGP::mleHomGP(X = list(X0=hetData$X0, Z0=hetData$Z0,mult=hetData$mult), Z= hetData$Z, 453 | lower = model$min.lengthscale, upper = model$max.lengthscale, covtype=model$kernel.family) 454 | } 455 | else if (model$dim == 1 & method=="spline") # only possible in 1D 456 | fits[[i]] <- smooth.spline(x=init.grid,y=all.X[,2],nknots=model$nk) 457 | else if (method == "rvm") { 458 | if (is.null(model$rvm.kernel)) 459 | rvmk <- "rbfdot" 460 | else 461 | rvmk <- model$rvm.kernel 462 | fits[[i]] <- rvm(x=init.grid, y=all.X[,model$dim+1],kernel=rvmk) 463 | } 464 | else if (method == "npreg") { 465 | if (is.null(model$np.kertype)) 466 | kertype = "gaussian" 467 | else 468 | kertype <- model$np.kertype 469 | if (is.null(model$np.regtype)) 470 | regtype <- "lc" 471 | else 472 | regtype <- model$np.regtype 473 | if (is.null(model$np.kerorder)) 474 | kerorder <- 2 475 | else 476 | kerorder <- model$np.kerorder 477 | 478 | fits[[i]] <- npreg(txdat = init.grid, tydat = all.X[,model$dim+1], 479 | regtype=regtype, ckertype=kertype,ckerorder=kerorder) 480 | } 481 | 482 | } # end of loop over time-steps 483 | 484 | return (list(fit=fits,timeElapsed=Sys.time()-t.start,nsims=cur.sim)) 485 | } 486 | -------------------------------------------------------------------------------- /R/payoffs.R: -------------------------------------------------------------------------------- 1 | ########################################## 2 | #' Arithmetic basket Put for d-dim x 3 | #' @title American Put payoff 4 | #' 5 | #' @param model list containing model params. Uses \code{model$K} as the Put strike. 6 | #' @param x is a matrix of asset prices of dimension N * \code{model$dim} 7 | #' @details in more than 1D, the prices are averaged and maxed with zero. 8 | #' @export 9 | put.payoff <- function(x,model) 10 | { 11 | return( pmax(model$K-apply(x,1,mean),0)) 12 | } 13 | 14 | ######################## 15 | #' Basket Call option on average asset price 16 | #' @title Arithmetic average Call payoff \eqn{(mean(x)-K)_+} 17 | #' @details arithmetic basket for d-dim x) 18 | #' @export 19 | #' @inheritParams put.payoff 20 | call.payoff <- function(x,model) 21 | { 22 | return( pmax(apply(x,1,mean)-model$K,0)) 23 | } 24 | 25 | 26 | ######################## 27 | #' Digital Put 28 | #' @title geometric digital Put payoff 29 | #' 30 | #' @export 31 | #' @inheritParams put.payoff 32 | digital.put.payoff <- function(x,model) 33 | { 34 | return ( as.numeric(model$K > apply(x,1,prod))) 35 | } 36 | 37 | ###################### 38 | #' Geometric Put 39 | #' @title geometric basket Put 40 | #' @export 41 | #' @inheritParams put.payoff 42 | geom.put.payoff <- function(x,model) 43 | { 44 | return( pmax(model$K-apply(x,1,prod),0)) 45 | } 46 | 47 | 48 | ###################### 49 | #' Multivariate min Put 50 | #' @title Min Put payoff 51 | #' 52 | #' @param x Matrix of asset prices with \code{model$dim} columns and N rows 53 | #' @export 54 | #' @inheritParams put.payoff 55 | mini.put.payoff <- function(x,model) 56 | { 57 | return( pmax(model$K-apply(x,1,min),0)) 58 | } 59 | 60 | ###################### 61 | #' Multivariate max call 62 | #' @title Max Call payoff 63 | #' 64 | #' @param x Matrix of asset prices with \code{model$dim} columns and N rows 65 | #' @export 66 | #' @inheritParams put.payoff 67 | maxi.call.payoff <- function(x, model) 68 | { 69 | return( pmax(apply(x,1,max)-model$K,0)) 70 | } 71 | 72 | 73 | ###################### 74 | #' Put payoff for a stoch vol model 75 | #' @title Basket Put with stochastic volatility 76 | #' @param x First coordinate is the asset price, the rest are vol factors 77 | #' @export 78 | #' @inheritParams put.payoff 79 | #' @details Uses K-x[,1], ignores other coordinates 80 | sv.put.payoff <- function(x, model) 81 | { 82 | return (pmax(model$K-x[,1],0)) 83 | } 84 | -------------------------------------------------------------------------------- /R/simFuncs.R: -------------------------------------------------------------------------------- 1 | 2 | #################################### 3 | #' Simulate paths of Geometric Brownian Motion with constant parameters 4 | #' 5 | #' @details Simulate from \eqn{p(X_t|X_{t-1})}. 6 | #' Use log-normal transition density specified by the \code{model} parameters 7 | #' @param x0 is the starting values (matrix of size N x model$dim) 8 | #' @param dt is the step size in time. Defaults to \code{model$dt} 9 | #' @param model a list containing all the other parameters, including volatility \code{model$sigma}, 10 | #' interest rate \code{model$r} and continuous dividend yield \code{model$div}. 11 | #' @return a vector of same dimensions as x0 12 | #' @export 13 | #' 14 | sim.gbm <- function( x0, model, dt=model$dt) 15 | { 16 | len <- nrow(x0) 17 | 18 | newX <- x0 19 | for (j in 1:model$dim) 20 | newX[,j] <- x0[,j,drop=F]*exp( rnorm(len)*model$sigma[j]*sqrt(dt) + 21 | (model$r- model$div- model$sigma[j]^2/2)*dt) 22 | 23 | return (newX) 24 | } 25 | 26 | #################################### 27 | #' Simulate from exponential Ornstein Uhlenbeck process 28 | #' @inheritParams sim.gbm 29 | #' @details Uses \code{model$alpha} for the mean-reversion strength, 30 | #' \code{model$meanrev} for the mean-reversion level, and \code{model$sigma} for the volatility. 31 | #' @export 32 | sim.ouExp <- function( x0, model, dt=model$dt) 33 | { 34 | len <- nrow(x0) 35 | newX <- log(x0) 36 | for (j in 1:model$dim) { 37 | newX[,j] <- newX[,j]*exp( - model$alpha*dt) + (1-exp(-model$alpha*dt))*(log(model$meanRev) - 38 | model$sigma^2/2/model$alpha) 39 | newX[,j] <- exp( newX[,j] + rnorm(len)*model$sigma[j]*sqrt( (1-exp(-2*model$alpha*dt))/2/model$alpha) 40 | ) 41 | } 42 | return (newX) 43 | } 44 | 45 | #################################### 46 | #' Simulate from 1-D discretized exponential Ornstein Uhlenbeck process 47 | #' See Bender (SIFIN 2011). Only works in one dimension 48 | #' @inheritParams sim.gbm 49 | #' @export 50 | #' @details Requires 51 | #' \itemize{ 52 | #' \item \code{model$rho} -- similar to mean-reversion rate, should be close to 1 53 | #' \item \code{model$mu} -- mean-reversion level 54 | #' \item \code{model$sigma} -- volatility 55 | #' } 56 | sim.logOU_Discrete <- function( x0, model, dt=model$dt) 57 | { 58 | len <- nrow(x0) 59 | newX <- log(x0) 60 | newX <- (1-model$rho)*(newX - model$mu) + model$mu + model$sigma*rnorm(len) 61 | newX <- exp(newX) 62 | 63 | return (newX) 64 | } 65 | 66 | 67 | 68 | 69 | ######################################## 70 | #' Simulate an exp-OU stoch volatility model with 1 or 2 vol factors 71 | #' @inheritParams sim.gbm 72 | #' @export 73 | #' @param x0 should have 2 or 3 columns 74 | #' @param useEuler flag, whether to use the exact transition for the StochVol factor, or its 75 | #' Euler approximation 76 | #' @details: Need the following fields in model: \code{svMean} (mean-reversion level), 77 | #' \code{svAlpha} (mean-reversion strength), \code{svEpsY} (fast scaling parameter), 78 | #' \code{svVol} (volatility of volatility), \code{svRho} (correlation with asset S). 79 | #' For 2-factor also need: \code{svMeanZ} (slow scale mean-reversion level), \code{svAlphaZ} 80 | #' (mean-reversion strength), \code{svDeltaZ} (slow scaling parameter), 81 | #' \code{svVolZ} (Z volatility), \code{svRhoZ} (correlation between Z and S), \code{svRhoYZ} 82 | #' ( correlation between the fast and slow SV factors) 83 | sim.expOU.sv <- function(x0, model, dt=model$dt,useEuler=FALSE) 84 | { 85 | len <- nrow(x0) 86 | newX <- x0 87 | rhoBar <- sqrt(1- model$svRho^2) 88 | 89 | t.seq <- seq(0,dt,by=model$eulerDt) 90 | if (max(t.seq) < dt) 91 | t.seq <- c(t.seq,dt) 92 | step.seq <- diff(t.seq) 93 | nSteps <- length(step.seq) 94 | 95 | #if (ncol(x0) == 2) 96 | # epsY <- 1 97 | epsY <- model$svEpsY 98 | alphaY <- model$svAlpha/epsY 99 | cur.vol <- exp(newX[,2]) 100 | 101 | if (ncol(x0) == 3) { 102 | alphaZ <- model$svAlphaZ/model$svDeltaZ 103 | rhoBarZ <- sqrt( 1-model$svRhoZ^2 - model$svRhoYZ^2) 104 | cur.vol <- exp(newX[,2] + newX[,3]) 105 | } 106 | 107 | for (j in 1:nSteps) { 108 | w1 <- rnorm(len) 109 | w2 <- rnorm(len) 110 | newX[,1] <- newX[,1]*exp( (model$r - cur.vol^2/2)*step.seq[j] + cur.vol*sqrt(step.seq[j])*w1 ) 111 | 112 | # new SV factor 113 | if (useEuler == T) 114 | newX[,2] <- newX[,2] + model$svAlpha/epsY*(model$svMean - newX[,2])*step.seq[j] + 115 | model$svVol/sqrt(epsY)*(model$svRho*w1 + rhoBar*w2)*sqrt(step.seq[j]) 116 | else 117 | newX[,2] <- model$svMean + exp(-alphaY*step.seq[j])*(newX[,2]-model$svMean) + 118 | model$svVol/sqrt(epsY)*sqrt( 1-exp(-2*alphaY*step.seq[j]))/sqrt(2*alphaY)*(model$svRho*w1 + 119 | rhoBar*w2) 120 | # new asset price 121 | cur.vol <- exp(newX[,2]) 122 | 123 | if (ncol(newX) == 3) { 124 | w3 <- rnorm(len) 125 | newX[,3] <- model$svMeanZ + exp(-alphaZ*step.seq[j])*(newX[,3]-model$svMeanZ) + 126 | model$svVolZ/sqrt(model$svDeltaZ)*sqrt( 127 | 1-exp(-2*alphaZ*step.seq[j]))/sqrt(2*alphaZ)*(model$svRhoZ*w1 + model$svRhoYZ*w2 + rhoBarZ*w3) 128 | 129 | cur.vol <- exp( newX[,2] + newX[,3]) 130 | 131 | } 132 | 133 | 134 | 135 | 136 | } 137 | return (newX) 138 | } 139 | 140 | #################################### 141 | #' Simulate paths of correlated GBM with a constant correlation 142 | #' 143 | #' @details Simulate correlated multivariate Geometric Brownian motion 144 | #' with a given \code{model$rho} and **identical** \code{model$sigma}'s. 145 | #' Calls \code{rmvnorm} from \pkg{mvtnorm} 146 | #' 147 | #' 148 | #' 149 | #' @param x0 is the starting values (vector) 150 | #' @param dt is the step size 151 | #' @param model contains all the other parameters. 152 | #' In particular, need \code{model$r, model$rho, model$sigma, model$div, model$dim} 153 | #' Note that \code{model$sigma} is the **volatility** parameter (scalar) 154 | #' @export 155 | #' @md 156 | sim.gbm.cor <- function( x0, model, dt=model$dt) 157 | { 158 | len <- nrow(x0) 159 | sig <- rep(model$sigma[1], model$dim) 160 | sigm <- diag(sig^2) + kronecker(sig,t(sig))*model$rho*(1- diag(model$dim)) 161 | 162 | newX <- x0*exp( rmvnorm(len, sig=sigm)*sqrt(dt) + 163 | (model$r- model$div- model$sigma[1]^2/2)*dt) 164 | 165 | return (newX) 166 | } 167 | 168 | #################################### 169 | #' Simulate paths of correlated GBM 170 | #' 171 | #' @details Simulate correlated multivariate Geometric Brownian motion 172 | #' with a given \code{model$rho} and arbitrary \code{model$sigma}'s. Calls \code{rmvnorm} 173 | #' from \pkg{mvtnorm} 174 | #' 175 | #' 176 | #' @param x0 is the starting values. Should be a matrix of size N x \code{model$dim}) 177 | #' @param dt is the step size (Defaults to \code{model$dt}) 178 | #' @param model contains all the other parameters. 179 | #' In particular, need \code{model$r, model$rho, model$sigma, model$div, model$dim} 180 | #' Note that \code{model$sigma} is the **volatility vector** 181 | #' @return a vector of the new states (same dimension as x0) 182 | #' @export 183 | #' @md 184 | sim.gbm.matrix <- function( x0, model, dt=model$dt) 185 | { 186 | len <- nrow(x0) 187 | 188 | newX <- x0*exp( mvtnorm::rmvnorm(len, sig=model$sigma)*sqrt(dt) + 189 | (model$r- model$div- diag(model$sigma)/2)*dt) 190 | 191 | return (newX) 192 | } 193 | 194 | #################################### 195 | #' Simulate 1D Brownian Motion for Asian Options 196 | #' @details first column is t, second column is S_t 197 | #' third column is A_t (arithmetic average) 198 | #' fourth column is tilde{A}_t (geometric average) 199 | #' @inheritParams sim.gbm 200 | #' @export 201 | sim.gbm.asian <- function( x0, model, dt=model$dt) 202 | { 203 | len <- nrow(x0) 204 | newX <- x0 205 | 206 | newX[,1] <- x0[,1]*exp( (model$r - 0.5*model$sigma^2)*dt + model$sigma*sqrt(dt)*rnorm(len)) 207 | newX[,2] <- x0[,2] + dt # time 208 | newX[,3] <- x0[,3]*x0[,2]/newX[,2] + newX[,1]/newX[,2] # (t-1)/t*A_{t-1} + S_t/t 209 | newX[,4] <- exp( (x0[,2]/newX[,2]*log(x0[,4])) + log(newX[,1])/newX[,2]) 210 | 211 | return (newX) 212 | } 213 | 214 | #################################### 215 | #' Simulate 1D Brownian Motion for Moving Average Asian Options 216 | #' @details first column is S_t, other columns are lagged S_t's 217 | #' the lags are in terms of dt 218 | #' @inheritParams sim.gbm 219 | #' @export 220 | sim.gbm.moving.ave <- function( x0, model, dt=model$dt) 221 | { 222 | len <- nrow(x0) 223 | newX <- x0 224 | 225 | for (j in 2:ncol(x0)) 226 | newX[,j] <- x0[,j-1] 227 | 228 | newX[,1] <- x0[,1]*exp( (model$r - 0.5*model$sigma^2)*dt + model$sigma*sqrt(dt)*rnorm(len)) 229 | 230 | return (newX) 231 | } 232 | 233 | #################################### 234 | #' Simulate 1D arithmetic Brownian Motion 235 | #' @inheritParams sim.gbm 236 | #' @param model$drift for the drift term 237 | #' @export 238 | sim.bm <- function( x0, model, dt=model$dt) 239 | { 240 | len <- nrow(x0) 241 | 242 | newX <- x0 243 | for (j in 1:model$dim) 244 | newX[,j] <- x0[,j,drop=F] + rnorm(len)*model$sigma[j]*sqrt(dt) + 245 | model$drift*dt 246 | 247 | return (newX) 248 | } 249 | 250 | #################################### 251 | #' Simulate 2D process for price and capacity 252 | #' @details first column is P_t price, second column is C_t capacity 253 | #' @inheritParams sim.gbm 254 | #' @param model$capSpoil models decay of existing capacity (default is 0) 255 | #' @export 256 | sim.price.and.capacity <- function( x0, model, dt=model$dt) 257 | { 258 | len <- nrow(x0) 259 | newX <- x0 260 | if (is.null(model$capSpoil)) 261 | decay <- 0 262 | else 263 | decay <- model$capSpoil 264 | 265 | newX[,2] <- x0[,2]*exp(-decay*dt) 266 | 267 | newX[,1] <- x0[,1]*exp( (model$mu - 0.5*model$sigma^2)*dt + model$sigma*sqrt(dt)*rnorm(len)) 268 | 269 | return (newX) 270 | } 271 | 272 | 273 | 274 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # mlOSP 2 | Regression Monte Carlo for Optimal Stopping 3 | 4 | 5 | Description: A suite of regression Monte Carlo algorithms utilizing *Machine Learning for Optimal Stopping Problems* (mlOSP). 6 | 7 | Includes both static and 8 | sequential experimental designs. We implement the original Longstaff-Schwartz 9 | and Tsitsiklis-van Roy algorithms, as well as machine learning 10 | approaches that explicitly specify the underlying experimental 11 | designs. The mlOSP template then allows to mix and match the choice 12 | of the regression method, the experimental design, and the 13 | stochastic simulator. Key solver functions are 14 | **osp.prob.design** (original LSM), **osp.fixed.design** (a variety of 15 | space-filling or user-specified designs, generally assumed to be 16 | batched), **osp.seq.design** (sequential designs using a collection of 17 | pre-specified Expected Improvement Criteria), **osp.seq.batch.design** ( 18 | sequential design with adaptive batching) and **osp.tvr** (TvR method). 19 | Also implements the Bouchard-Warin hierarchical adaptive partitioning with 20 | linear regression (**osp.design.piecewisebw**). The library currently works 21 | with 10+ regression emulators, see documentation. 22 | 23 | The Bermudan_demo *vignette* provides a short illustration with a 2D Bermudan 24 | basket Put. The two demo R files provide the source code for the 25 | http://arxiv.org/abs/2012.00729 article and the respective benchmarked solvers. 26 | 27 | Work partially supported by NSF-1521743 and NSF-1821240. 28 | -------------------------------------------------------------------------------- /data/int300_3d.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mludkov/mlOSP/f4c46edd90b627e3800003a7d3d8ff2be9fcb288/data/int300_3d.RData -------------------------------------------------------------------------------- /data/int_2d.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mludkov/mlOSP/f4c46edd90b627e3800003a7d3d8ff2be9fcb288/data/int_2d.RData -------------------------------------------------------------------------------- /demo/00Index: -------------------------------------------------------------------------------- 1 | ml-osp-jcf R source for the preprint http://arxiv.org/abs/2012.00729 2 | BenchmarkedSolvers R source file that defines the benchmarked solvers S1-S10 for the M1-M10 problem instances 3 | 4 | 5 | -------------------------------------------------------------------------------- /demo/BenchmarkedSolvers.R: -------------------------------------------------------------------------------- 1 | ## ----results="hide",warning=FALSE,message=FALSE,error=FALSE,echo=FALSE------------------------------------- 2 | library(ks) 3 | library(fields) # for plotting purposes, use quilt.plot in 2D 4 | library(mlOSP) 5 | library(DiceKriging) 6 | library(tgp) # use lhs from there 7 | library(randtoolbox) # use sobol and halton QMC sequences 8 | library(randomForest) 9 | library(earth) 10 | library(hetGP) 11 | library(kernlab) 12 | library(ggplot2) 13 | library(kableExtra) 14 | library(gridExtra) 15 | library(mvtnorm) 16 | library(nnet) 17 | library(np) 18 | library(laGP) # for distance function 19 | 20 | #' 21 | #' # Test Sets 22 | #' 23 | #' Generate test sets for the 10 benchmarked models. Models are defined in the arxiv paper appendix. 24 | ## ----create-tests------------------------------------------------------------------------------------------ 25 | set.seed(44) 26 | all.tests <- list() 27 | nSims <- rep(10000,6) 28 | nSteps <- rep(20,6) 29 | 30 | for (j in 1:10) { 31 | nSims[j] <- 25000 32 | nSteps[j] <- BModel[[j]]$T/BModel[[j]]$dt 33 | test.j <- list() 34 | test.j[[1]] <- BModel[[j]]$sim.func( matrix(rep(BModel[[j]]$x0, nSims[j]), nrow=nSims[j], byrow=T), BModel[[j]]) 35 | for (i in 2:nSteps[j]) 36 | test.j[[i]] <- BModel[[j]]$sim.func( test.j[[i-1]], BModel[[j]]) 37 | # European option price 38 | print(mean( exp(-BModel[[j]]$r*BModel[[j]]$T)*BModel[[j]]$payoff.func(test.j[[nSteps[j]]],BModel[[j]])) ) 39 | all.tests[[j]] <- test.j 40 | } 41 | save(all.tests,file="allTests-mlosp.RData") 42 | 43 | #' 44 | #' # Solvers 45 | #' 46 | #' Iterate through the 10 solvers, one by one. 47 | #' 48 | #' ## LM solver 49 | #' 50 | #' Define the bases: 3rd degree in 1d and 2d, else quadratic only 51 | ## ----lm-bases---------------------------------------------------------------------------------------------- 52 | poly_base<-function(x,dim){ 53 | if (dim==1){ 54 | base=cbind(x[,1],x[,1]^2, x[,1]^3) 55 | } 56 | if (dim==2){ 57 | base=cbind(x[,1],x[,1]^2,x[,2],x[,2]^2,x[,1]*x[,2],x[,1]^3, x[,2]^3, x[,1]^2*x[,2], x[,1]*x[,2]^2) 58 | } 59 | if (dim==3){ 60 | base=cbind(x[,1],x[,1]^2,x[,2],x[,2]^2,x[,1]*x[,2],x[,3],x[,3]^2,x[,1]*x[,3], x[,2]*x[,3]) 61 | } 62 | if (dim==5){ 63 | base=cbind(x[,1],x[,1]^2,x[,2],x[,2]^2,x[,1]*x[,2],x[,3],x[,3]^2,x[,1]*x[,3], x[,2]*x[,3], 64 | x[,4],x[,4]^2,x[,5],x[,5]^2,x[,1]*x[,4],x[,1]*x[,5], x[,2]*x[,4], x[,2]*x[,5], 65 | x[,3]*x[,4], x[,3]*x[,5], x[,4]*x[,5]) 66 | } 67 | 68 | return(base) 69 | } 70 | 71 | #' 72 | #' 73 | #' Run the Linear model solver 74 | ## ----lm-solver--------------------------------------------------------------------------------------------- 75 | 76 | lmPrice <-array(0,dim=c(10,3)) 77 | for (j in 1:9){ 78 | set.seed(1) 79 | lmModel <- BModel[[j]] 80 | lmModel$bases <- function(x) return(cbind( poly_base(x, lmModel$dim), lmModel$payoff.func(x,lmModel))) 81 | if (lmModel$dim < 3) { 82 | nPaths <- 41000 83 | } else { 84 | nPaths <- 101000 85 | 86 | } 87 | lmSolve <- osp.prob.design(N=nPaths,lmModel, subset=1:1000,method="lm") 88 | oos.lm <- forward.sim.policy( all.tests[[j]], lmModel$T/lmModel$dt, lmSolve$fit, lmModel) 89 | units(lmSolve$timeElapsed) <- "secs" 90 | lmPrice[j,] <- c(lmSolve$p[1],mean(oos.lm$payoff), lmSolve$timeElapsed ) 91 | 92 | } 93 | 94 | 95 | #' 96 | #' ## S2: MARS: Multivariate Adaptive Regression Splines solver 97 | ## ----mars-solver------------------------------------------------------------------------------------------- 98 | print('Now working on MARS') 99 | library(earth) 100 | 101 | mrPrice <-array(0,dim=c(10,3)) 102 | for (j in 1:10){ 103 | set.seed(2) 104 | mrModel <- BModel[[j]] 105 | mrModel$earth.deg = 2; # earth parameters 106 | mrModel$earth.nk = 100; 107 | mrModel$earth.thresh = 1e-8 108 | 109 | if (mrModel$dim < 3) { 110 | nPaths <- 41000 111 | } else { 112 | nPaths <- 101000 113 | 114 | } 115 | mrSolve <- osp.prob.design(N=nPaths,mrModel, subset=1:1000,method="earth") 116 | oos.mr <- forward.sim.policy( all.tests[[j]], mrModel$T/mrModel$dt, mrSolve$fit, mrModel) 117 | units(mrSolve$timeElapsed) <- "secs" 118 | mrPrice[j,] <- c(mrSolve$p[1],mean(oos.mr$payoff), mrSolve$timeElapsed ) 119 | 120 | } 121 | 122 | #' 123 | #' 124 | #' ## S3: Random Forest Solver 125 | ## ----random-forest-solver---------------------------------------------------------------------------------- 126 | print('Now working on Random Forest') 127 | library(randomForest) 128 | rfPrice <-array(0,dim=c(10,3)) 129 | for (j in 1:9){ 130 | set.seed(3) 131 | rfModel <- BModel[[j]] 132 | rfModel$rf.ntree = 200 # random forest parameters 133 | 134 | if (rfModel$dim < 3) { 135 | nPaths <- 41000 136 | rfModel$rf.maxnode=100 # number of nodes per tree 137 | 138 | } 139 | if (rfModel$dim == 3) { 140 | nPaths <- 101000 141 | rfModel$rf.maxnode=200 142 | } 143 | if (rfModel$dim == 5) { 144 | nPaths <- 201000 145 | rfModel$rf.maxnode=200 146 | } 147 | 148 | rfSolve <- osp.prob.design(N=nPaths,rfModel, subset=1:1000,method="randomforest") 149 | oos.rf <- forward.sim.policy( all.tests[[j]], rfModel$T/rfModel$dt, rfSolve$fit, rfModel) 150 | units(rfSolve$timeElapsed) <- "secs" 151 | rfPrice[j,] <- c(rfSolve$p[1],mean(oos.rf$payoff), rfSolve$timeElapsed ) 152 | 153 | } 154 | 155 | #' 156 | #' ## S4: Neural Net (single-layer) solver 157 | ## ----nnet-solver------------------------------------------------------------------------------------------- 158 | print('Now working on nnet') 159 | library(nnet) 160 | nnPrice <-array(0,dim=c(10,3)) 161 | 162 | nnNodes <- c(20,20,40,40,40,50,50,50,50) 163 | nn.paths <- c(rep(41000,5),101000,rep(101000,3)) 164 | for (j in 1:9){ 165 | set.seed(11) 166 | nnModel <- BModel[[j]] 167 | nnModel$nn.nodes = nnNodes[j] 168 | 169 | nnSolve <- osp.prob.design(N=nn.paths[j],nnModel, subset=1:1000,method="nnet") 170 | oos.nn <- forward.sim.policy( all.tests[[j]], nnModel$T/nnModel$dt, nnSolve$fit, nnModel) 171 | units(nnSolve$timeElapsed) <- "secs" 172 | nnPrice[j,] <- c(nnSolve$p[1],mean(oos.nn$payoff), nnSolve$timeElapsed ) 173 | 174 | } 175 | 176 | #' ## S5: TvR scheme with MARS emulator 177 | ## ----tvr-solver-------------------------------------------------------------------------------------------- 178 | library(earth) 179 | print('Now working on TvR') 180 | 181 | tvrPrice <-array(0,dim=c(10,3)) 182 | for (j in 1:9){ 183 | set.seed(10) 184 | tvrModel <- BModel[[j]] 185 | tvrModel$earth.deg = 2; # earth parameters 186 | tvrModel$earth.nk = 100; 187 | tvrModel$earth.thresh = 1e-8 188 | 189 | if (tvrModel$dim < 3) { 190 | nPaths <- 41000 191 | } else { 192 | nPaths <- 101000 193 | 194 | } 195 | tvrSolve <- osp.tvr(N=nPaths,tvrModel, subset=1:1000,method="earth") 196 | oos.tvr <- forward.sim.policy( all.tests[[j]], tvrModel$T/tvrModel$dt, tvrSolve$fit, tvrModel) 197 | units(tvrSolve$timeElapsed) <- "secs" 198 | tvrPrice[j,] <- c(tvrSolve$p[1],mean(oos.tvr$payoff), tvrSolve$timeElapsed ) 199 | print(j) 200 | } 201 | 202 | #' 203 | #' 204 | #' 205 | #' ## S6: Hierarchical adaptive partitioning solver 206 | ## ----bouchard-warin-solver--------------------------------------------------------------------------------- 207 | print('Now working on Bouchard-Warin') 208 | bwPrice <-array(0,dim=c(10,3)) 209 | for (j in 1:9){ 210 | set.seed(5) 211 | bwModel <- BModel[[j]] 212 | if (bwModel$dim < 3) { 213 | bwModel$nChildren <- 8 214 | nPaths <- 40000 215 | } 216 | if (bwModel$dim == 3) { 217 | bwModel$nChildren <- 5 218 | nPaths <- 100000 219 | } 220 | if (bwModel$dim == 5) { 221 | bwModel$nChildren= 4 222 | nPaths = 204800 223 | } 224 | bwSolve <- osp.probDesign.piecewisebw(nPaths,bwModel,test=all.tests[[j]]) 225 | units(bwSolve$timeElapsed) <- "secs" 226 | bwPrice[j,] <- c(bwSolve$price,mean(bwSolve$test),bwSolve$timeElapsed) 227 | 228 | } 229 | 230 | 231 | #' ## S7: GP train-km emulator with LHS space-filling design 232 | ## ----lhs-trainkm-solver------------------------------------------------------------------------------------ 233 | print('Now working on LHS km') 234 | library(DiceKriging) 235 | library(tgp) 236 | 237 | 238 | lhsPrice <-array(0,dim=c(10,2)) 239 | lhsDesSize <- list() 240 | lhs.inputs <- c(rep(400,5),800,1000,1000,2500) 241 | for (j in 1:9){ 242 | set.seed(4) 243 | lhsModel <- BModel[[j]] 244 | lhsModel$pilot.nsims <- 1000 245 | lhsModel$batch.nrep <- 100 246 | lhsModel$kernel.family="matern5_2" 247 | lhsModel$qmc.method <- NULL 248 | lhsModel$N <- lhs.inputs[j] 249 | 250 | lhsSolve <- osp.fixed.design(lhsModel, input.dom=0.02,method="trainkm") 251 | oos.lhs <- forward.sim.policy( all.tests[[j]], lhsModel$T/lhsModel$dt, lhsSolve$fit, lhsModel) 252 | units(lhsSolve$timeElapsed) <- "secs" 253 | lhsPrice[j,] <- c(mean(oos.lhs$payoff), lhsSolve$timeElapsed ) 254 | lhsDesSize[[j]] <- rep(0, length(lhsSolve$fit)) # save the design size at each step 255 | for (jj in 1:length(lhsSolve$fit)) 256 | lhsDesSize[[j]][jj] <- dim(lhsSolve$fit[[jj]]@X)[1] 257 | 258 | } 259 | 260 | 261 | #' 262 | #' ## S8: GP-km emulator with ADSA sequential batched design 263 | #' 264 | ## ----adaptive-batch-solver--------------------------------------------------------------------------------- 265 | print('Now Working on ADSA') 266 | adsaPrice <- array(0,dim=c(10,2)) 267 | 268 | adsaParams <- list( ucb.gamma=1.96, tmse.eps=0, kernel.family="matern5_2", 269 | look.ahead=1, cand.len=1000,pilot.nsims=1000, update.freq=5) 270 | 271 | gbmFlag <-c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE,TRUE,TRUE,FALSE) 272 | adsaC.batch <- c(20,20,10,10,10,10,10,10,10) 273 | adsa.len.max <- c(10, 10,20,40,40,40,40,40,40) 274 | adsa.len.min <- c(1,1,2,3,3,3,3,3,3) 275 | adsa.desSize <- c(200,200,200, 200,200,250,500,500,500) 276 | adsa.batch<- c(25,25,40,40,40,50,50,50,50) 277 | adsaInitSize <- c(20,20,20,20,20,40,80,80,80) 278 | adsaInitGrid <- list() 279 | adsaInitGrid[[1]] <- array(seq(25,42,len=20),dim=c(20,1)) 280 | adsaInitGrid[[2]] <- array(seq(25,42,len=20),dim=c(20,1)) 281 | adsaInitGrid[[3]] <- 25+30*sobol(20,2) 282 | adsaInitGrid[[4]] <- 80+70*sobol(20,2) 283 | adsaInitGrid[[5]] <- sobol(20,2) 284 | adsaInitGrid[[5]][,1] <- 75 + 27*sobol(20,2)[,1] 285 | adsaInitGrid[[5]][,2] <- -1.5 + 1*sobol(20,2)[,2] 286 | adsaInitGrid[[6]] <- 80+60*sobol(40,3) 287 | adsaInitGrid[[7]] <- 80+70*sobol(80,d=5) 288 | adsaInitGrid[[8]] <- 60+t(c(40,50,60,70,90)*t(sobol(80,d=5))) 289 | adsaInitGrid[[9]] <- 70+60*sobol(80,d=5) 290 | 291 | 292 | adsa.total.budget <- c(10000, 10000, 10000, 10000,10000, 20000,40000,40000,40000) 293 | adsaDesSize <- list() 294 | 295 | library(ks) 296 | 297 | for (j in 1:9){ 298 | set.seed(7) 299 | adsaModel <- c(BModel[[j]],adsaParams) 300 | adsaModel$batch.nrep <- adsa.batch[j] 301 | adsaModel$max.lengthscale <- rep( adsa.len.max[j], adsaModel$dim) 302 | adsaModel$min.lengthscale <- rep( adsa.len.min[j], adsaModel$dim) 303 | if (j == 5) { 304 | adsaModel$max.lengthscale <- c(40,2) 305 | adsaModel$min.lengthscale <- c(3,0.1) 306 | } 307 | 308 | adsaModel$batch.heuristic <- 'adsa' 309 | adsaModel$ei.func <- 'amcu' 310 | adsaModel$total.budget <- adsa.total.budget[j] 311 | adsaModel$seq.design.size <- adsa.desSize[j] 312 | adsaModel$init.size <- adsaInitSize[j] 313 | adsaModel$init.grid <- adsaInitGrid[[j]] 314 | adsaModel$c.batch <- adsaC.batch[j] 315 | 316 | adsaModel$r.cand <- c(20, 30,40,50,60, 80, 120, 160, 200, 250) 317 | 318 | adsaSolve <- osp.seq.batch.design(adsaModel, method="trainkm", is.gbm=gbmFlag[j]) 319 | oos.adsa <- forward.sim.policy( all.tests[[j]], adsaModel$T/adsaModel$dt, adsaSolve$fit, adsaModel) 320 | units(adsaSolve$timeElapsed) <- "secs" 321 | adsaDesSize[[j]] <- adsaSolve$ndesigns 322 | adsaPrice[j,] <- c(mean(oos.adsa$payoff), adsaSolve$timeElapsed) 323 | 324 | } 325 | #plt.2d.surf( adsaSolve$fit[[7]], x=seq(90,130, len=101), y=seq(90,130,len=101), ub=10) 326 | 327 | #' 328 | #' ## S10: Heteroskedastic GP model with a fixed simulation design (pseudo-regression): 329 | #' 330 | ## ----hetgp-fixed-sobol-solver------------------------------------------------------------------------------ 331 | print('Now working on hetGP') 332 | hetGP.params <- list(kernel.family="Gaussian",pilot.nsims=0) 333 | batch.sizes <- c(20,20,40,40,40,50,50,50,50) 334 | hgp.len.min <- c(1,1,2,2,2,3,3,3,3) 335 | hgp.len.max <- c(20,20,40,40,40,40,40,40,40) 336 | 337 | require(randtoolbox) 338 | fixed.designs <- list() 339 | fixed.designs[[1]] <- 25+16*sobol(100) 340 | fixed.designs[[2]] <- 25+16*sobol(100) 341 | fixed.designs[[3]] <- 25+30*sobol(276,d=2) 342 | fixed.designs[[3]] <- fixed.designs[[3]][ which( fixed.designs[[3]][,1] + fixed.designs[[3]][,2] <= 80) ,] # a lot are on the diagonal 343 | fixed.designs[[4]] <- 80+70*sobol(250,d=2) 344 | fixed.designs[[5]] <- 75+27*sobol(250,d=2) 345 | fixed.designs[[5]][,1] <- 75 + 27*sobol(250,d=2)[,1] 346 | fixed.designs[[5]][,2] <- -1.5 + 1*sobol(250,d=2)[,2] 347 | fixed.designs[[6]] <- 80+70*sobol(500,d=3) 348 | fixed.designs[[7]] <- 80+70*sobol(800,d=5) 349 | fixed.designs[[8]] <- 60+t(c(40,50,60,70,90)*t(sobol(800,d=5))) 350 | fixed.designs[[9]] <- 70+60*sobol(1000,d=5) # bigger because Put 351 | 352 | hgpDesSize <- list() 353 | hgpPrice <-array(0,dim=c(10,2)) 354 | for (j in 1:9){ 355 | set.seed(7) 356 | hgpModel <- c(BModel[[j]],hetGP.params) 357 | hgpModel$batch.nrep <- batch.sizes[j] 358 | hgpModel$max.lengthscale <- rep( hgp.len.max[j], hgpModel$dim) 359 | hgpModel$min.lengthscale <- rep( hgp.len.min[j], hgpModel$dim) 360 | if (j == 5) { 361 | hgpModel$max.lengthscale <- c(40,1) 362 | hgpModel$min.lengthscale <- c(3,0.1) 363 | } 364 | if (hgpModel$dim == 1) { 365 | hgpModel$N <- length(fixed.designs[[j]]) 366 | } else { 367 | hgpModel$N <- dim(fixed.designs[[j]])[1] 368 | } 369 | 370 | hgpSolve <- osp.fixed.design(hgpModel, input.dom=fixed.designs[[j]],method="hetgp") 371 | oos.hgp <- forward.sim.policy( all.tests[[j]], hgpModel$T/hgpModel$dt, hgpSolve$fit, hgpModel) 372 | units(hgpSolve$timeElapsed) <- "secs" 373 | hgpPrice[j,] <- c(mean(oos.hgp$payoff),hgpSolve$timeElapsed ) 374 | hgpDesSize[[j]] <- rep(0, length(hgpSolve$fit)) # save the design size at each step 375 | for (jj in 1:length(hgpSolve$fit)) 376 | hgpDesSize[[j]][jj] <- dim(hgpSolve$fit[[j]]$X0)[1] 377 | 378 | } 379 | 380 | #' 381 | #' ## S9: GP-km emulator with sequential SUR design 382 | #' 383 | ## ----seq-csur-km------------------------------------------------------------------------------------------- 384 | print('Now working on seq') 385 | des.size <- c(100, 100, 150, 150, 150, 250,300,320,200) 386 | init.n <- c(20, 20, 30, 30, 40, 50,100,100,100) 387 | batch.n <- c( 20, 20, 40, 40, 40, 100,125,125,125) 388 | seq.len.min <- c(1,1,2,2,2,3,3,3,3) 389 | seq.len.max <- c(20,20,40,40,40,40,40,40,40) 390 | 391 | init.grids <- list() 392 | init.grids[[1]] <- matrix(25+16*sobol(20),ncol=1) 393 | init.grids[[2]] <- matrix(25+16*sobol(20),ncol=1) 394 | init.grids[[3]] <- 25+30*sobol(30,d=2) 395 | init.grids[[4]] <- 80+70*sobol(30,d=2) 396 | init.grids[[5]] <- 75+27*sobol(30,d=2) 397 | init.grids[[5]][,1] <- 75 + 27*sobol(30,d=2)[,1] 398 | init.grids[[5]][,2] <- -1.5 + 1*sobol(30,d=2)[,2] 399 | init.grids[[6]] <- 80+70*sobol(50,d=3) 400 | init.grids[[7]] <- 80+70*sobol(100,d=5) 401 | init.grids[[8]] <- 40+70*sobol(100,d=5) 402 | init.grids[[9]] <- 70+60*sobol(100,d=5) 403 | seqPrice <- array(0, dim=c(10,2)) 404 | seq.candLen <- c(500,1000,500,500,1000,2000,2000,4000,2000) 405 | lhs.size <- c(rep(0.02,6),0.05, 0.02, 0.05) 406 | 407 | for (j in 1:9){ 408 | set.seed(9) 409 | seqGP.params <- list(kernel.family = "Matern5_2", ei.func="sur", 410 | batch.nrep =batch.n[j], cand.len=seq.candLen[j], pilot.nsims=1000, lhs.rect=lhs.size[j], 411 | init.size=init.n[j], init.grid = init.grids[[j]], seq.design.size=des.size[j]) 412 | seqModel <- c(BModel[[j]],seqGP.params) 413 | seqModel$max.lengthscale <- rep( seq.len.max[j], seqModel$dim) 414 | seqModel$min.lengthscale <- rep( seq.len.min[j], seqModel$dim) 415 | if (j == 5) { 416 | seqModel$max.lengthscale <- c(40,1) 417 | seqModel$min.lengthscale <- c(3,0.1) 418 | } 419 | seqSolve <- osp.seq.design(seqModel, method="hetgp") 420 | oos.seq <- forward.sim.policy( all.tests[[j]], seqModel$T/seqModel$dt, seqSolve$fit, seqModel) 421 | units(seqSolve$timeElapsed) <- "secs" 422 | seqPrice[j,] <- c(mean(oos.seq$payoff),seqSolve$timeElapsed ) 423 | print(j) 424 | } 425 | 426 | #' 427 | #' 428 | #' ## Aggregate all results 429 | #' 430 | #' 431 | ## ----solver-table------------------------------------------------------------------------------------------ 432 | all.solvers <- data.frame(models=c('M1', 'M2', 'M3', 'M4', 'M5', 'M6', 'M7', 'M8', 'M9', 'M10'), 433 | lm.poly=lmPrice[,2],rf=rfPrice[,2], mars=mrPrice[,2], 434 | tvr.mars=tvrPrice[,2],nnet=nnPrice[,2], 435 | bou.war=bwPrice[,2], lhs.trainkm = lhsPrice[,1], 436 | adsa=adsaPrice[,1], seq.sur=seqPrice[,1], hetgp.fix=hgpPrice[,1]) 437 | solvers.time <- data.frame(models=c('M1', 'M2', 'M3', 'M4', 'M5', 'M6', 'M7', 'M8', 'M9', 'M10'), 438 | lm.time = lmPrice[,3], rf.time=rfPrice[,3], mr.time =mrPrice[,3], 439 | tvr.time=tvrPrice[,3], nn.time=nnPrice[,3], 440 | bw.time = bwPrice[,3], lhs.time=lhsPrice[,2], 441 | adsa.time=adsaPrice[,2],seq.time=seqPrice[,2],het.time = hgpPrice[,2]) 442 | solvers.insample <- data.frame(models=c('M1', 'M2', 'M3', 'M4', 'M5', 'M6', 'M7', 'M8', 'M9', 'M10'), 443 | lm.in=lmPrice[,1],rf.in=rfPrice[,1], mars.in=mrPrice[,1], 444 | tvr.in=tvrPrice[,1],nnet.in=nnPrice[,1],bw.in=bwPrice[,1]) 445 | 446 | #save(bwSolve,lhsSolve,lmSolve,mrSolve,rfSolve,all.solvers, file="solvers-mlosp.RData") -- 200MB put in Downloads 447 | save(solvers.time, all.solvers,solvers.insample,file="solver-summary.RData" ) 448 | kable(all.solvers, caption="Benchmarked Bermudan Option Prices") %>% 449 | kable_styling(bootstrap_options="striped", full_width=F) 450 | kable(solvers.time, caption="Benchmarked Algorithm Running Times") %>% 451 | kable_styling(bootstrap_options="striped", full_width=F) 452 | 453 | #' 454 | #' 455 | #' # Swing Solver 456 | #' 457 | ## ---- swing-km, eval=FALSE-------------------------------------------------------------------------------- 458 | ## swingModel$kernel.family="matern5_2" 459 | ## swingModel$qmc.method <- NULL 460 | ## swingModel$N <- 120; 461 | ## km.swing2 <- swing.fixed.design(swingModel,input.domain=c(65,67,69,seq(70,101,len=117)), method ="trainkm") 462 | ## # next one is much smaller since only in-the-money is kept 463 | ## #km.swing <- swing.fixed.design(swingModel,input.domain=0.02, method ="trainkm") 464 | ## # Below are benchmarks for the corresponding Put 465 | ## #km.put <- osp.fixed.design(swingModel,input.domain=0.02,method="trainkm") 466 | ## #spl.put <- osp.prob.design(60000,swingModel,subset=1:20000,method="spline") 467 | ## #spl.tvr <- osp.tvr(60000,swingModel,subset=1:20000,method="spline") 468 | ## #earthParams <- c(earth.deg=2,earth.nk=200,earth.thresh=1E-8) 469 | 470 | -------------------------------------------------------------------------------- /demo/desktop.ini: -------------------------------------------------------------------------------- 1 | [LocalizedFileNames] 2 | 00Index=@00Index,0 3 | -------------------------------------------------------------------------------- /inst/doc/Bermudan_demo.R: -------------------------------------------------------------------------------- 1 | ## ----setup, include=FALSE----------------------------------------------------- 2 | knitr::opts_chunk$set(echo = TRUE) 3 | library(ks) 4 | library(fields) # for plotting purposes 5 | library(mlOSP) 6 | library(DiceKriging) 7 | library(tgp) # use lhs from there 8 | library(randtoolbox) # use sobol and halton QMC sequences 9 | library(hetGP) 10 | library(laGP) 11 | library(ggplot2) 12 | library(pander) 13 | data("int_2d") 14 | 15 | ## ----------------------------------------------------------------------------- 16 | model2d <- list(x0 = rep(40,2),K=40,sigma=rep(0.2,2),r=0.06,div=0,T=1,dt=0.04,dim=2,sim.func=sim.gbm, payoff.func=put.payoff) 17 | model2d$pilot.nsims <- 1000 18 | model2d$look.ahead <- 1 19 | model2d$cand.len <- 1000 # size of candidate set m_0 for acquisition function 20 | model2d$max.lengthscale <- c(50,50) 21 | model2d$min.lengthscale <- c(5,5) 22 | model2d$tmse.eps <- 0 23 | 24 | model2d$ucb.gamma <- 1.96 25 | 26 | model2d$seq.design.size <- 100 # budget N = r_0 * k = 2500 27 | model2d$batch.nrep <- 25 # initial replication r_0 28 | model2d$total.budget <- 2500 29 | 30 | model2d$init.size <- 20 # initial design size k_0 31 | model2d$init.grid <- int_2d 32 | 33 | model2d$tmse.eps <- 0 34 | model2d$kernel.family <- "Matern5_2" # kernel function for Gaussian Process 35 | model2d$ucb.gamma <- 1.96 36 | model2d$update.freq <- 10 # number of sequential design steps to update the GP surrogate 37 | model2d$r.cand <- c(20, 30,40,50,60, 80, 120, 160) # r_L 38 | 39 | ## ----testing-homtp2d, message=FALSE, warning=FALSE, fig.width=6--------------- 40 | ### GP + ADSA 41 | set.seed(110) 42 | model2d$batch.heuristic <- 'adsa' 43 | model2d$ei.func <- "amcu" 44 | oos.obj.adsa <- osp.seq.batch.design(model2d, method="hetgp") 45 | 46 | ### GP + ABSUR 47 | set.seed(122) 48 | model2d$batch.heuristic <- 'absur' 49 | model2d$ei.func <- 'absur' 50 | oos.obj.absur <- osp.seq.batch.design(model2d, method="hetgp") 51 | 52 | ### plot Figure 6 53 | plt.2d.surf.batch(oos.obj.adsa$fit[[15]], oos.obj.absur$fit[[15]], oos.obj.adsa$batches[1:oos.obj.adsa$ndesigns[15] - 1, 15], oos.obj.absur$batches[1:oos.obj.absur$ndesigns[15] - 1, 15], "ADSA", "ABSUR", x=seq(25,50,len=201),y = seq(25, 50,len=201)) 54 | 55 | ## ---- message=FALSE, warning=FALSE, fig.height=3.5, fig.width=6, fig.cap="Timing Value (background color) and Exercise Boundary (zero-contour) at $t=0.6$ using ADSA"---- 56 | oos.obj.adsa$ndesigns[15] # number of unique designs 57 | ### plot Figure 6 right panel - ADSA 58 | plt.2d.surf.with.batch(oos.obj.adsa$fit[[15]], 59 | oos.obj.adsa$batches[1:oos.obj.adsa$ndesigns[15] - 1, 15]) 60 | 61 | ## ---- message=FALSE, warning=FALSE,fig.height=3.5, fig.width=6, fig.cap="Timing Value and Exercise Boundary using ABSUR"---- 62 | oos.obj.absur$ndesigns[15] # number of unique designs 63 | ### plot Figure 6 left panel - ABSUR 64 | plt.2d.surf.with.batch(oos.obj.absur$fit[[15]], 65 | oos.obj.absur$batches[1:oos.obj.absur$ndesigns[15] - 1, 15]) 66 | 67 | -------------------------------------------------------------------------------- /inst/doc/Bermudan_demo.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "2D Bermudan Basket Put Demo" 3 | author: Xiong Lyu, Mike Ludkovski 4 | date: '`r Sys.Date()`' 5 | output: 6 | html_vignette: 7 | df_print: paged 8 | vignette: > 9 | %\VignetteEngine{knitr::rmarkdown} 10 | %\VignetteIndexEntry{2D Bermudan Put} 11 | %\VignetteEncoding{UTF-8} 12 | --- 13 | 14 | This is a demo file to generate Figure 6 in the paper "Adaptive Batching for Gaussian Process Surrogates with Application in Noisy Level Set Estimation". The plot shows the fitted exercise boundary with its 95\% credible interval (solid line and dashed line) obtained with Gaussian Process for two-dimensional basket put Bermudan option. Two batch heuristics, ABSUR and ADSA, are used to select the location of inputs and their replications, which are shown as the dots and their color/size. 15 | ```{r setup, include=FALSE} 16 | knitr::opts_chunk$set(echo = TRUE) 17 | library(ks) 18 | library(fields) # for plotting purposes 19 | library(mlOSP) 20 | library(DiceKriging) 21 | library(tgp) # use lhs from there 22 | library(randtoolbox) # use sobol and halton QMC sequences 23 | library(hetGP) 24 | library(laGP) 25 | library(ggplot2) 26 | library(pander) 27 | data("int_2d") 28 | ``` 29 | 30 | We consider adaptive batching for a Gaussian GP metamodel for a two-dimensional basket Put Bermudan option. The asset follows 31 | log-normal dynamics 32 | $$ 33 | {Z}_{t+\Delta t} = {Z}_{t} \cdot\exp \bigg((r-\frac{1}{2} diag{{\Xi}})\Delta t + \sqrt{\Delta t} \cdot {\Xi} (\Delta {W}_{t})\bigg) 34 | $$ 35 | where $\Delta W_t$ are independent Gaussians, and the payoff is $h_{Put}(t,{z}) = e^{-r t}( {\cal K} - z^1 - z^2)_+$. 36 | 37 | 38 | ## Set up the model for Two-dim basket put with parameters in Table 4 39 | 40 | The code below sets up the model for an arithmetic Basket Put with parameters in Table 4 of the article, including the total simulations $N_T$ stored in \textit{total.budget}, initial design size $k_0$ in \textit{init.size}, initial batch size $r_0$ in \textit{batch.nrep} and kernel function $K$ in \textit{kernel.family}. The parameters of the model are initialized as a list, which is later used as input in the main function \textbf{osp.seq.batch.design}. 41 | 42 | ```{r} 43 | model2d <- list(x0 = rep(40,2),K=40,sigma=rep(0.2,2),r=0.06,div=0,T=1,dt=0.04,dim=2,sim.func=sim.gbm, payoff.func=put.payoff) 44 | model2d$pilot.nsims <- 1000 45 | model2d$look.ahead <- 1 46 | model2d$cand.len <- 1000 # size of candidate set m_0 for acquisition function 47 | model2d$max.lengthscale <- c(50,50) 48 | model2d$min.lengthscale <- c(5,5) 49 | model2d$tmse.eps <- 0 50 | 51 | model2d$ucb.gamma <- 1.96 52 | 53 | model2d$seq.design.size <- 100 # budget N = r_0 * k = 2500 54 | model2d$batch.nrep <- 25 # initial replication r_0 55 | model2d$total.budget <- 2500 56 | 57 | model2d$init.size <- 20 # initial design size k_0 58 | model2d$init.grid <- int_2d 59 | 60 | model2d$tmse.eps <- 0 61 | model2d$kernel.family <- "Matern5_2" # kernel function for Gaussian Process 62 | model2d$ucb.gamma <- 1.96 63 | model2d$update.freq <- 10 # number of sequential design steps to update the GP surrogate 64 | model2d$r.cand <- c(20, 30,40,50,60, 80, 120, 160) # r_L 65 | ``` 66 | 67 | 68 | ## Compare results for GP with ABSUR and ADSA 69 | 70 | To implement adaptive batching we need to specify the batch heuristic via \textit{batch.heuristic} and the sequential design framework with \textit{ei.func}. The function \textbf{osp.seq.batch.design.simplified} then fits the GP simulator. Fitting is done through the \texttt{DiceKriging} library, namely the main \texttt{km} function there. Below we apply Adaptive Design with Sequential Allocation, ADSA (which relies on the MCU acquisition function) and Adaptive Batching with Stepwise Uncertainty Reduction, ABSUR. The method parameter \texttt{trainkm} means that the GP metamodels are trained using the default \texttt{DiceKriging::km} MLE optimizer. 71 | 72 | ```{r testing-homtp2d, message=FALSE, warning=FALSE, fig.width=6} 73 | ### GP + ADSA 74 | set.seed(110) 75 | model2d$batch.heuristic <- 'adsa' 76 | model2d$ei.func <- "amcu" 77 | oos.obj.adsa <- osp.seq.batch.design(model2d, method="hetgp") 78 | 79 | ### GP + ABSUR 80 | set.seed(122) 81 | model2d$batch.heuristic <- 'absur' 82 | model2d$ei.func <- 'absur' 83 | oos.obj.absur <- osp.seq.batch.design(model2d, method="hetgp") 84 | 85 | ### plot Figure 6 86 | plt.2d.surf.batch(oos.obj.adsa$fit[[15]], oos.obj.absur$fit[[15]], oos.obj.adsa$batches[1:oos.obj.adsa$ndesigns[15] - 1, 15], oos.obj.absur$batches[1:oos.obj.absur$ndesigns[15] - 1, 15], "ADSA", "ABSUR", x=seq(25,50,len=201),y = seq(25, 50,len=201)) 87 | ``` 88 | 89 | 90 | The objects \texttt{oos.obj.xxx} are lists that contain the GP metamodels for each time-step of the Bermudan option problem ($M=25$ in the example above). We visualize the fitted timing values at $t=0.6 = 15 \Delta t$. The respective zero contour is the \textit{exercise boundary}. 91 | 92 | The function \textbf{plt.2d.surf.with.batch} plots the fitted exercise boundary together with its 95\% credible interval (solid line and dashed curves, respectively). The inputs are shown as the dots and their color indicates the replication amounts $r_i$'s. The first argument is the fitted GP emulator (including the fitted model and the input sites), and the second argument is the batch sizes corresponding to the selected input sites. 93 | 94 | First we plot the figure for results obtained with ADSA. 95 | ```{r, message=FALSE, warning=FALSE, fig.height=3.5, fig.width=6, fig.cap="Timing Value (background color) and Exercise Boundary (zero-contour) at $t=0.6$ using ADSA"} 96 | oos.obj.adsa$ndesigns[15] # number of unique designs 97 | ### plot Figure 6 right panel - ADSA 98 | plt.2d.surf.with.batch(oos.obj.adsa$fit[[15]], 99 | oos.obj.adsa$batches[1:oos.obj.adsa$ndesigns[15] - 1, 15]) 100 | ``` 101 | 102 | The second one is for ABSUR 103 | ```{r, message=FALSE, warning=FALSE,fig.height=3.5, fig.width=6, fig.cap="Timing Value and Exercise Boundary using ABSUR"} 104 | oos.obj.absur$ndesigns[15] # number of unique designs 105 | ### plot Figure 6 left panel - ABSUR 106 | plt.2d.surf.with.batch(oos.obj.absur$fit[[15]], 107 | oos.obj.absur$batches[1:oos.obj.absur$ndesigns[15] - 1, 15]) 108 | ``` 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | -------------------------------------------------------------------------------- /inst/doc/arxivPreprint.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mludkov/mlOSP/f4c46edd90b627e3800003a7d3d8ff2be9fcb288/inst/doc/arxivPreprint.pdf -------------------------------------------------------------------------------- /man/CalcOverhead.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{CalcOverhead} 4 | \alias{CalcOverhead} 5 | \title{Calculates c_over in ABSUR} 6 | \usage{ 7 | CalcOverhead(theta0, theta1, theta2, n) 8 | } 9 | \arguments{ 10 | \item{theta0, theta1, theta2}{parameters in linear regression} 11 | 12 | \item{n}{current design size} 13 | } 14 | \description{ 15 | ABSUR overhead 16 | } 17 | \details{ 18 | compute the estimated overhead in ABSUR 19 | } 20 | -------------------------------------------------------------------------------- /man/batch.adsa.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{batch.adsa} 4 | \alias{batch.adsa} 5 | \title{Calculates reallocated batch size or new input location for Adaptive Design with Sequential Allocation} 6 | \usage{ 7 | batch.adsa(fit, r_seq, xtest, xt_dens, x_new, r0, nugget, method) 8 | } 9 | \arguments{ 10 | \item{fit}{GP/TP fit} 11 | 12 | \item{r_seq}{batch size vector for existing inputs} 13 | 14 | \item{xtest}{testing points to compare reallocation and adding a new inputs} 15 | 16 | \item{xt_dens}{density of xtest} 17 | 18 | \item{x_new}{new input location selected by the EI criteria} 19 | 20 | \item{r0}{total number of new simulations} 21 | 22 | \item{nugget}{the noise variance to compute the ALC factor} 23 | 24 | \item{method}{\code{km} or \code{trainkm} or \code{hetgp} or \code{homtp}} 25 | } 26 | \value{ 27 | a list containing 28 | \itemize{ 29 | \item \code{xoptim}: new design input (NULL if re-allocation is chosen) 30 | \item \code{roptim}: added replications (a scalar r0 if new input chosen, 31 | a vector containing the re-allocation amounts otherwise 32 | } 33 | } 34 | \description{ 35 | ADSA for Adaptive Batching 36 | } 37 | \references{ 38 | M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 39 | in Noisy Level Set Estimation, 40 | } 41 | \seealso{ 42 | \code{\link[mlOSP]{osp.seq.batch.design}} 43 | } 44 | \author{ 45 | Xiong Lyu 46 | } 47 | -------------------------------------------------------------------------------- /man/batch.ddsa.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{batch.ddsa} 4 | \alias{batch.ddsa} 5 | \title{Calculates reallocated batch size for DDSA} 6 | \usage{ 7 | batch.ddsa(fit, r_seq, xtest, xt_dens, r0, method) 8 | } 9 | \arguments{ 10 | \item{fit}{gp/tp fit} 11 | 12 | \item{r_seq}{batch size vector for existing inputs} 13 | 14 | \item{xtest}{testing points to compare reallocation and adding a new inputs} 15 | 16 | \item{xt_dens}{density of xtest} 17 | 18 | \item{r0}{total number of new simulations to add} 19 | 20 | \item{method}{"km" or "trainkm" or "hetgp" or "homtp"} 21 | } 22 | \description{ 23 | DDSA for Adaptive Batching 24 | } 25 | \references{ 26 | M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 27 | in Noisy Level Set Estimation, 28 | } 29 | \seealso{ 30 | \code{\link[mlOSP]{osp.seq.batch.design}} 31 | } 32 | \author{ 33 | Xiong Lyu 34 | } 35 | -------------------------------------------------------------------------------- /man/batch.mlb.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{batch.mlb} 4 | \alias{batch.mlb} 5 | \title{Multi-Level Batching Heuristic} 6 | \usage{ 7 | batch.mlb(objSd_at_x_optim, r_cand, nugget, gamma) 8 | } 9 | \arguments{ 10 | \item{objSd_at_x_optim}{posterior standard deviation of the response at the selected new input} 11 | 12 | \item{r_cand}{candidate batch size vector} 13 | 14 | \item{nugget}{the noise variance to compute the ALC factor} 15 | 16 | \item{gamma}{threshold compared with sd} 17 | } 18 | \value{ 19 | list containing: 20 | \itemize{ 21 | \item \code{roptim}: new replication count 22 | \item \code{gamma}: new gamma variable 23 | } 24 | } 25 | \description{ 26 | Calculates weights for batch size in MLB when called from \link{osp.seq.batch.design} 27 | } 28 | \references{ 29 | M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 30 | in Noisy Level Set Estimation, http://arxiv.org/abs/2003.08579 31 | } 32 | \seealso{ 33 | \code{\link[mlOSP]{osp.seq.batch.design}} 34 | } 35 | -------------------------------------------------------------------------------- /man/batch.rb.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{batch.rb} 4 | \alias{batch.rb} 5 | \title{Calculates weights for batch size in RB} 6 | \usage{ 7 | batch.rb(objSd_at_x_optim, r_cand, last_r, nugget, gamma) 8 | } 9 | \arguments{ 10 | \item{objSd_at_x_optim}{posterior standard deviation of the response at the selected new input} 11 | 12 | \item{r_cand}{candidate batch size vector} 13 | 14 | \item{last_r}{the last batch size} 15 | 16 | \item{nugget}{the noise variance to compute the ALC factor} 17 | 18 | \item{gamma}{threshold compared with sd} 19 | } 20 | \description{ 21 | Ratchet Batching heuristic 22 | } 23 | \references{ 24 | M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 25 | in Noisy Level Set Estimation, http://arxiv.org/abs/2003.08579 26 | } 27 | \seealso{ 28 | \code{\link[mlOSP]{osp.seq.batch.design}} 29 | } 30 | -------------------------------------------------------------------------------- /man/call.payoff.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/payoffs.R 3 | \name{call.payoff} 4 | \alias{call.payoff} 5 | \title{Arithmetic average Call payoff \eqn{(mean(x)-K)_+}} 6 | \usage{ 7 | call.payoff(x, model) 8 | } 9 | \arguments{ 10 | \item{x}{is a matrix of asset prices of dimension N * \code{model$dim}} 11 | 12 | \item{model}{list containing model params. Uses \code{model$K} as the Put strike.} 13 | } 14 | \description{ 15 | Basket Call option on average asset price 16 | } 17 | \details{ 18 | arithmetic basket for d-dim x) 19 | } 20 | -------------------------------------------------------------------------------- /man/capexp.impulse.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{capexp.impulse} 4 | \alias{capexp.impulse} 5 | \title{Compute intervention function for Capacity Expansion impulse problems} 6 | \usage{ 7 | capexp.impulse(cur_x, model, fit, ext = FALSE) 8 | } 9 | \arguments{ 10 | \item{cur_x}{Set of inputs where to compute the intervention function 11 | Should be a n x 2 matrix, with first column for prices and second column 12 | for capacities. Impulse affects second column only.} 13 | 14 | \item{model}{a list containing all model parameters. 15 | In particular must have \code{model$imp.cost.capexp} to compute cost of impulses} 16 | 17 | \item{fit}{Object containing the one-step-ahead functional approximator for V(k,x)} 18 | 19 | \item{ext}{logical flag (default is FALSE) whether to return extended information} 20 | } 21 | \description{ 22 | Compute intervention function for Capacity Expansion impulse problems 23 | } 24 | \details{ 25 | Calculates the intervention operator for a 2-D capacity 26 | expansion problem. This is done by running \code{optimize} on the 27 | cost-to-go based on \code{fit}. Calls \code{ospPredict} 28 | } 29 | -------------------------------------------------------------------------------- /man/cf.absur.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{cf.absur} 4 | \alias{cf.absur} 5 | \title{Calculates weights for location and batch size in ABSUR} 6 | \usage{ 7 | cf.absur(objMean, objSd, nugget, r_cand, overhead, t0) 8 | } 9 | \arguments{ 10 | \item{objMean}{predicted mean response} 11 | 12 | \item{objSd}{posterior standard deviation of the response} 13 | 14 | \item{nugget}{the noise variance to compute the ALC factor} 15 | 16 | \item{r_cand}{candidate batch size vector} 17 | 18 | \item{overhead}{estimated computation overhead in GP} 19 | 20 | \item{t0}{overhead for individual simulation} 21 | } 22 | \description{ 23 | ABSUR for Adaptive Batching 24 | } 25 | \references{ 26 | M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 27 | in Noisy Level Set Estimation, 28 | } 29 | \seealso{ 30 | \code{\link[mlOSP]{osp.seq.batch.design}} 31 | } 32 | \author{ 33 | Xiong Lyu 34 | } 35 | -------------------------------------------------------------------------------- /man/cf.csur.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{cf.csur} 4 | \alias{cf.csur} 5 | \title{Compute reduction in contour-distance} 6 | \usage{ 7 | cf.csur(objMean, objSd, nugget) 8 | } 9 | \arguments{ 10 | \item{objMean}{predicted mean response} 11 | 12 | \item{objSd}{posterior standard deviation of the response} 13 | 14 | \item{nugget}{the noise variance to compute the ALC factor} 15 | } 16 | \description{ 17 | cSUR for Contour Finding based on Ludkovski (2018) 18 | } 19 | \details{ 20 | compute the change in ZC = sd*(1-sqrt{(nugget)})/sqrt{(nugget + sd^2)} 21 | } 22 | \references{ 23 | Mike Ludkovski, Kriging Metamodels and Experimental Design for Bermudan Option Pricing 24 | Journal of Computational Finance, 22(1), 37-77, 2018 25 | } 26 | \seealso{ 27 | [osp.seq.design] 28 | } 29 | \author{ 30 | Mike Ludkovski 31 | } 32 | -------------------------------------------------------------------------------- /man/cf.el.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{cf.el} 4 | \alias{cf.el} 5 | \title{Compute expected loss using the optimal stopping loss function.} 6 | \usage{ 7 | cf.el(objMean, objSd) 8 | } 9 | \arguments{ 10 | \item{objMean}{predicted mean response} 11 | 12 | \item{objSd}{posterior standard deviation of the response} 13 | } 14 | \description{ 15 | Expected Loss for Contour Finding 16 | } 17 | \references{ 18 | Mike Ludkovski, Kriging Metamodels and Experimental Design for Bermudan Option Pricing 19 | Journal of Computational Finance, 22(1), 37-77, 2018 20 | } 21 | \author{ 22 | Mike Ludkovski 23 | } 24 | -------------------------------------------------------------------------------- /man/cf.mcu.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{cf.mcu} 4 | \alias{cf.mcu} 5 | \title{Maximum Contour Uncertainty criterion} 6 | \usage{ 7 | cf.mcu(objMean, objSd) 8 | } 9 | \arguments{ 10 | \item{objMean}{predicted mean response} 11 | 12 | \item{objSd}{posterior standard deviation of the response} 13 | } 14 | \description{ 15 | MCU for Contour Finding. DEPRECATED. 16 | } 17 | \details{ 18 | compute normalized distance to zero-contour |mu|/sd 19 | } 20 | \references{ 21 | Mike Ludkovski, Kriging Metamodels and Experimental Design for Bermudan Option Pricing 22 | Journal of Computational Finance, 22(1), 37-77, 2018 23 | } 24 | \seealso{ 25 | [osp.seq.design] 26 | } 27 | \author{ 28 | Mike Ludkovski 29 | } 30 | -------------------------------------------------------------------------------- /man/cf.smcu.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{cf.smcu} 4 | \alias{cf.smcu} 5 | \title{Straddle Maximum Contour Uncertainty criterion} 6 | \usage{ 7 | cf.smcu(objMean, objSd, gamma = 1.96) 8 | } 9 | \arguments{ 10 | \item{objMean}{predicted mean response} 11 | 12 | \item{objSd}{posterior standard deviation of the response} 13 | 14 | \item{gamma}{weight on the variance} 15 | } 16 | \description{ 17 | straddle MCU with a specified variance weight 18 | } 19 | \details{ 20 | compute the UCB criterion with constant weight: gamma*s(x) - |f(x)| 21 | } 22 | \references{ 23 | Mike Ludkovski, Kriging Metamodels and Experimental Design for Bermudan Option Pricing 24 | Journal of Computational Finance, 22(1), 37-77, 2018 25 | 26 | X.Lyu, M Binois, M. Ludkovski (2020+) Evaluating Gaussian Process Metamodels and Sequential Designs for 27 | Noisy Level Set Estimation 28 | } 29 | \seealso{ 30 | [osp.seq.design] 31 | } 32 | \author{ 33 | Mike Ludkovski 34 | } 35 | -------------------------------------------------------------------------------- /man/cf.sur.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{cf.sur} 4 | \alias{cf.sur} 5 | \title{Compute EI for Contour Finding using the ZC-SUR formula} 6 | \usage{ 7 | cf.sur(objMean, objSd, nugget) 8 | } 9 | \arguments{ 10 | \item{objMean}{predicted mean response} 11 | 12 | \item{objSd}{posterior standard deviation of the response} 13 | 14 | \item{nugget}{GP nugget parameter, see under details} 15 | } 16 | \description{ 17 | SUR (Stepwise Uncertainty Reduction) acquisition function for Contour Finding from Ludkovski (2018) 18 | } 19 | \details{ 20 | compute the change in ZC = sd*(1-sqrt{(nugget)})/sqrt{(nugget + sd^2)} 21 | } 22 | \references{ 23 | Mike Ludkovski, Kriging Metamodels and Experimental Design for Bermudan Option Pricing 24 | Journal of Computational Finance, 22(1), 37-77, 2018 25 | } 26 | \author{ 27 | Mike Ludkovski 28 | } 29 | -------------------------------------------------------------------------------- /man/cf.tMSE.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{cf.tMSE} 4 | \alias{cf.tMSE} 5 | \title{targeted Mean Squared Error criterion} 6 | \usage{ 7 | cf.tMSE(objMean, objSd, seps = 0) 8 | } 9 | \arguments{ 10 | \item{objMean}{predicted mean response} 11 | 12 | \item{objSd}{posterior standard deviation of the response} 13 | 14 | \item{seps}{epsilon in the tMSE formula. By default taken to be zero.} 15 | } 16 | \description{ 17 | tMSE for Contour Finding 18 | } 19 | \details{ 20 | compute predictive density at the contour, smoothed by seps 21 | } 22 | \references{ 23 | Mike Ludkovski, Kriging Metamodels and Experimental Design for Bermudan Option Pricing 24 | Journal of Computational Finance, 22(1), 37-77, 2018\cr 25 | 26 | X.Lyu, M Binois, M. Ludkovski (2020+) Evaluating Gaussian Process Metamodels and Sequential Designs for 27 | Noisy Level Set Estimation 28 | } 29 | \seealso{ 30 | [osp.seq.design] 31 | } 32 | \author{ 33 | Mike Ludkovski 34 | } 35 | -------------------------------------------------------------------------------- /man/digital.put.payoff.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/payoffs.R 3 | \name{digital.put.payoff} 4 | \alias{digital.put.payoff} 5 | \title{geometric digital Put payoff} 6 | \usage{ 7 | digital.put.payoff(x, model) 8 | } 9 | \arguments{ 10 | \item{x}{is a matrix of asset prices of dimension N * \code{model$dim}} 11 | 12 | \item{model}{list containing model params. Uses \code{model$K} as the Put strike.} 13 | } 14 | \description{ 15 | Digital Put 16 | } 17 | -------------------------------------------------------------------------------- /man/forest.impulse.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{forest.impulse} 4 | \alias{forest.impulse} 5 | \title{Compute intervention function for the Faustmann forest rotation problem} 6 | \usage{ 7 | forest.impulse(cur_x, model, fit, ext = FALSE) 8 | } 9 | \arguments{ 10 | \item{cur_x}{Set of inputs where to compute the intervention function. 11 | Should be a n x 1 vector} 12 | 13 | \item{model}{a list containing all model parameters, 14 | including \code{model$impulse.fixed.cost} for the constant cost of any impulse} 15 | 16 | \item{fit}{Object containing the one-step-ahead functional approximator for V(k,x)} 17 | 18 | \item{ext}{logical flag (default is FALSE) whether to return extended information} 19 | } 20 | \description{ 21 | Compute intervention function for the Faustmann forest rotation problem 22 | } 23 | \details{ 24 | Calculates the intervention operator for a 1-D impulse control problem 25 | arising in the Faustmann forest rotation setup. In that case, the impulse target level 26 | is fixed at zero (or \code{model$impulse.target}) and the impulse value is x-fixed.cost-target 27 | Calls \code{ospPredict} on \code{fit} to find that 28 | } 29 | -------------------------------------------------------------------------------- /man/forward.impulse.policy.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{forward.impulse.policy} 4 | \alias{forward.impulse.policy} 5 | \title{Simulate a payoff of an impulse strategy along a set of forward paths} 6 | \usage{ 7 | forward.impulse.policy(x, M, fit, model, mpc = FALSE) 8 | } 9 | \arguments{ 10 | \item{x}{is a matrix of starting values 11 | 12 | if input \code{x} is a list, then use the grids specified by x} 13 | 14 | \item{M}{number of time steps to forward simulate} 15 | 16 | \item{fit}{a list of M fitted emulators that determine the functional approximators of 17 | V(k,x). Supports km, spline, and hetGP objects (anything supported by \code{ospPredict})} 18 | 19 | \item{model}{a list containing all model parameters. In particular need 20 | \code{model$impulse.func} for computing the intervention operator (optimal impulse 21 | to consider), \code{model$sim.func} for simulating each step with time step 22 | \code{model$dt}.} 23 | } 24 | \value{ 25 | a list containing: 26 | \itemize{ 27 | \item \code{payoff} (vector) is the resulting payoff NPV from t=0 28 | \item \code{tau} (vector) number of times impulses were applied on each path 29 | \item \code{impulses} (matrix) impulse amounts matching tau 30 | \item \code{paths} ((d+2)-tensor) forward trajectories of the controlled state process 31 | \item \code{bnd} (vector) impulse target levels for the case of linear impulse costs 32 | } 33 | } 34 | \description{ 35 | Simulate a payoff of an impulse strategy along a set of forward paths 36 | } 37 | \details{ 38 | Should be used in conjunction with the \code{\link{osp.impulse.control}} function 39 | that builds the emulators and calls forward.impulse.policy internally. 40 | } 41 | -------------------------------------------------------------------------------- /man/forward.sim.policy.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{forward.sim.policy} 4 | \alias{forward.sim.policy} 5 | \title{Forward simulation based on a sequence of emulators} 6 | \usage{ 7 | forward.sim.policy( 8 | x, 9 | M, 10 | fit, 11 | model, 12 | offset = 1, 13 | compact = TRUE, 14 | use.qv = FALSE 15 | ) 16 | } 17 | \arguments{ 18 | \item{x}{is a matrix of starting values 19 | 20 | if input \code{x} is a list, then use the grids specified by x} 21 | 22 | \item{M}{number of time steps to forward simulate} 23 | 24 | \item{fit}{a list of fitted emulators that determine the stopping classifiers to be used} 25 | 26 | \item{model}{a list containing all model parameters} 27 | 28 | \item{offset}{(internal for debugging purposes)} 29 | 30 | \item{compact}{flag; if FALSE returns additional information about forward x-values.} 31 | 32 | \item{use.qv}{boolean to indicate whether to plug-in continuation value for allpaths 33 | still alive at the last time-step. Default is set to \code{FALSE}} 34 | } 35 | \value{ 36 | a list containing: 37 | \itemize{ 38 | \item \code{payoff} is the resulting payoff NPV from t=0 39 | \item \code{fvalue[i]} is a list of resulting payoffs (on paths still not stopped) NPV from t=i 40 | \item \code{tau} are the times when stopped 41 | \item \code{sims} is a list; \code{sims[[i]]} are the forward x-values of paths at t=i (those not stopped yet) 42 | \code{nsims} number of total 1-step simulations performed 43 | } 44 | } 45 | \description{ 46 | Simulate h(X_tau) using FIT (can be a dynaTree, smooth.spline, MARS or RandomForest or RVM or hetGP) 47 | } 48 | \details{ 49 | Should be used in conjuction with the \code{osp.xxx} functions that build the emulators. Also called 50 | internally from \code{\link{osp.fixed.design}} 51 | } 52 | -------------------------------------------------------------------------------- /man/geom.put.payoff.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/payoffs.R 3 | \name{geom.put.payoff} 4 | \alias{geom.put.payoff} 5 | \title{geometric basket Put} 6 | \usage{ 7 | geom.put.payoff(x, model) 8 | } 9 | \arguments{ 10 | \item{x}{is a matrix of asset prices of dimension N * \code{model$dim}} 11 | 12 | \item{model}{list containing model params. Uses \code{model$K} as the Put strike.} 13 | } 14 | \description{ 15 | Geometric Put 16 | } 17 | -------------------------------------------------------------------------------- /man/int300_3d.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \docType{data} 4 | \name{int300_3d} 5 | \alias{int300_3d} 6 | \title{Initial design for the 3D Bermudan max-Call example} 7 | \format{ 8 | An array with 300 rows and 3 columns: 9 | \describe{ 10 | \item{row}{unique designs} 11 | \item{columns}{values of corresponding x_1, x_2 and x_3 coordinates} 12 | } 13 | } 14 | \usage{ 15 | int300_3d 16 | } 17 | \description{ 18 | A dataset containing 300 initial locations to be used with \code{osp.seq.design}. 19 | The max Call has strike K=100 and i.i.d assets, so the experimental design is roughly symmetric. 20 | The design is space-filling a hyper-rectangular in-the-money region 21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /man/int_2d.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \docType{data} 4 | \name{int_2d} 5 | \alias{int_2d} 6 | \title{Initial design for the 2D Bermudan Put example} 7 | \format{ 8 | An array with 20 rows and 2 columns: 9 | \describe{ 10 | \item{row}{unique designs} 11 | \item{columns}{values of corresponding x_1 and x_2 coordinates} 12 | } 13 | } 14 | \usage{ 15 | int_2d 16 | } 17 | \description{ 18 | A dataset containing 20 initial locations to be used with \code{osp.seq.design}. 19 | The Put has strike K=40 and i.i.d assets, so the experimental design is roughly symmetric. 20 | The design is space-filling a triangular in-the-money region 21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /man/lin.impulse.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{lin.impulse} 4 | \alias{lin.impulse} 5 | \title{Compute intervention function for an impulse problem wth linear impulse costs} 6 | \usage{ 7 | lin.impulse(cur_x, model, fit, ext = FALSE) 8 | } 9 | \arguments{ 10 | \item{cur_x}{Set of inputs where to compute the intervention function 11 | Should be a n x 1 vector} 12 | 13 | \item{model}{a list containing all model parameters. 14 | In particular must have \code{model$impulse.fixed.cost} for the constant cost of any impulse} 15 | 16 | \item{fit}{Object containing the one-step-ahead functional approximator for V(k,x)} 17 | 18 | \item{ext}{logical flag (default is FALSE) whether to return extended information} 19 | } 20 | \description{ 21 | Compute intervention function for an impulse problem wth linear impulse costs 22 | } 23 | \details{ 24 | Calculates the intervention operator for a 1-D impulse control problem. 25 | Assumes linear impulse costs with slope=1. This means that the optimal impulse 26 | target level is independent of current state x and is characterized by the location 27 | where the gradient of fitted value function is equal to 1. 28 | Calls \code{ospPredict} on \code{fit} to find that 29 | } 30 | -------------------------------------------------------------------------------- /man/maxi.call.payoff.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/payoffs.R 3 | \name{maxi.call.payoff} 4 | \alias{maxi.call.payoff} 5 | \title{Max Call payoff} 6 | \usage{ 7 | maxi.call.payoff(x, model) 8 | } 9 | \arguments{ 10 | \item{x}{Matrix of asset prices with \code{model$dim} columns and N rows} 11 | 12 | \item{model}{list containing model params. Uses \code{model$K} as the Put strike.} 13 | } 14 | \description{ 15 | Multivariate max call 16 | } 17 | -------------------------------------------------------------------------------- /man/mini.put.payoff.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/payoffs.R 3 | \name{mini.put.payoff} 4 | \alias{mini.put.payoff} 5 | \title{Min Put payoff} 6 | \usage{ 7 | mini.put.payoff(x, model) 8 | } 9 | \arguments{ 10 | \item{x}{Matrix of asset prices with \code{model$dim} columns and N rows} 11 | 12 | \item{model}{list containing model params. Uses \code{model$K} as the Put strike.} 13 | } 14 | \description{ 15 | Multivariate min Put 16 | } 17 | -------------------------------------------------------------------------------- /man/osp.fixed.design.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ospProbDesign.R 3 | \name{osp.fixed.design} 4 | \alias{osp.fixed.design} 5 | \title{Generic dynamic emulation of OSP with a non-sequential design} 6 | \usage{ 7 | osp.fixed.design( 8 | model, 9 | input.domain = NULL, 10 | method = "km", 11 | inTheMoney.thresh = 0, 12 | stop.freq = model$dt 13 | ) 14 | } 15 | \arguments{ 16 | \item{model}{a list containing all the model parameters, see Details.} 17 | 18 | \item{input.domain}{the domain of the emulator. Several options are available. Default in \code{NULL} 19 | All the empirical domains rely on pilot paths generated using \code{pilot.nsims}>0 model parameter. 20 | \itemize{ 21 | \item NULL will use an empirical probabilistic design based on the pilot paths (default); 22 | \item if a vector of length 2*model$dim then specifies the bounding rectangle 23 | \item a single positive number, then build a bounding rectangle based on the 24 | \eqn{\alpha}-quantile of the pilot paths 25 | \item a single negative number, then build a bounding rectangle based on the full range of the pilot paths 26 | \item a vector specifies the precise design, used as-is (\emph{overrides design size}) 27 | }} 28 | 29 | \item{method}{regression method to use (defaults to \code{km}) 30 | \itemize{ 31 | \item km: Gaussian process with fixed hyperparams uses \pkg{DiceKriging} via \code{km} (default) 32 | Must provide \code{km.cov} (vector of lengthscales) and \code{km.var} (process variance) 33 | \item trainkm: GP w/trained hyperparams: uses \pkg{DiceKriging} via \code{km} 34 | \item mlegp Local approximate GP from the \pkg{laGP}. Requires 35 | \item homgp Homoskedastic GP: use \pkg{hetGP} with \code{mleHomGP} 36 | \item hetgp Heteroskedastic GP: use \pkg{hetGP} with \code{mleHetGP} 37 | \item spline: Smoothing Splines, use \code{smooth.spline} (only supported in 1D). Requires number of 38 | knots via \code{model$nk} 39 | \item cvspline: Cross-validated Smoothing Splines, use \code{smooth.spline} (only supported in 1D). 40 | Number of knots chosen automatically via cross-validation. 41 | \item loess: Local Regression: use \code{loess} with \code{lo.span} parameter (only in 1D or 2D) 42 | \item rvm: Relevance Vector Machine: uses \code{rvm} from \pkg{kernlab}. Can optionally provide 43 | \code{rvm.kernel} parameter (default is 'rbfdot') 44 | \item npreg: kernel regression using \pkg{np} package. Can optionally provide \code{np.kertype} 45 | (default is "gaussian"); \code{np.regtype} (default is Linear-constant "lc"); \code{np.kerorder} 46 | (default kernel order is 2) and \code{np.bwtype} (default bandwidth type is "fixed") 47 | \item lm: linear model from \pkg{stats} 48 | }} 49 | 50 | \item{inTheMoney.thresh}{which paths are kept, out-of-the-money is dropped. 51 | Defines threshold in terms of \code{model$payoff.func}} 52 | 53 | \item{stop.freq}{*experimental, currently disabled* frequency of stopping decisions (default is \code{model$dt}). 54 | Can be used to stop less frequently.} 55 | } 56 | \value{ 57 | a list containing: 58 | \itemize{ 59 | \item \code{fit} a list containing all the models generated at each time-step. \code{fit[[1]]} is the emulator 60 | at \eqn{t=\Delta t}, the last one is \code{fit[[M-1]]} which is emulator for \eqn{T-\Delta t}. 61 | \item \code{timeElapsed}: total running time based on \code{Sys.time} 62 | } 63 | } 64 | \description{ 65 | RMC based on a batched non-adaptive design with a variety of regression methods 66 | } 67 | \details{ 68 | The design can be replicated through \code{batch.nrep} model parameter. Replication allows to use 69 | nonparametric techniques which would be too expensive otherwise, in particular LOESS, GP and RVM. 70 | All designs are restricted to in-the-money region, see \code{inTheMoney.thresh} parameter (modify at your own risk) 71 | Thus, actual design size will be smaller than specified. By default, no forward evaluation is provided, i.e. the 72 | method only builds the emulators. Thus, to obtain an actual estimate of the option price 73 | combine with \code{\link{forward.sim.policy}}. 74 | } 75 | \examples{ 76 | set.seed(1) 77 | model2d <- list(K=40,x0=rep(40,2),sigma=rep(0.2,2),r=0.06,div=0, 78 | T=1,dt=0.04,dim=2, sim.func=sim.gbm, payoff.func=put.payoff,pilot.nsims=1000, 79 | batch.nrep=100,kernel.family="matern5_2",N=400) 80 | } 81 | \author{ 82 | Mike Ludkovski 83 | } 84 | -------------------------------------------------------------------------------- /man/osp.impulse.control.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ospProbDesign.R 3 | \name{osp.impulse.control} 4 | \alias{osp.impulse.control} 5 | \title{LS-flavor RMC algorithm with a variety of regression methods for stochastic impulse control} 6 | \usage{ 7 | osp.impulse.control( 8 | model, 9 | input.domain = NULL, 10 | method = "spline", 11 | verb = 101, 12 | mpc = FALSE 13 | ) 14 | } 15 | \arguments{ 16 | \item{model}{a list defining the simulator and reward model, with the two main model hooks being 17 | \code{impulse.func} (plus parameters) and \code{sim.func} (plus parameters).} 18 | 19 | \item{method}{a string specifying regression method to use 20 | \itemize{ 21 | \item spline [Default]: \code{smooth.spline} from \pkg{base} which only works \emph{in 1D} 22 | \item randomforest: (from \pkg{randomForest} package) requires \code{rf.maxnode} 23 | and \code{rf.ntree} (number of trees) model parameters 24 | \item loess: only works in \emph{1D or 2D}, requires \code{lo.span} model parameter 25 | \item deepnet: neural network using \pkg{deepnet}. Specify \code{nn.layers} as a vector 26 | to describe the number of nodes across hidden layers 27 | \item homgp Homoskedastic GP: use \pkg{hetGP} with \code{mleHomGP} 28 | \item hetgp Heteroskedastic GP: use \pkg{hetGP} with \code{mleHetGP} 29 | \item lm: linear global regression using \code{model$bases} (required) basis functions (+ constant) 30 | }} 31 | } 32 | \value{ 33 | a list containing 34 | \itemize{ 35 | \item \code{fit} a list containing all the models generated at each time-step. \code{fit[[1]]} is the emulator 36 | at \eqn{t=\Delta t}, the last one is \code{fit[[M-1]]} which is emulator for \eqn{T-\Delta t}. 37 | \item \code{timeElapsed} (based on \code{Sys.time}) 38 | } 39 | } 40 | \description{ 41 | RMC for impulse control. 42 | Training design specified explicitly by the user 43 | } 44 | \details{ 45 | Works with a design specified by the user 46 | 47 | Calls \code{model$impulse.func}, so the latter must be set prior to calling. 48 | Also needs \code{model$dt} and \code{model$r} for discounting. 49 | 50 | Calls \code{model$sim.func} to generate forward paths. Use in conjunction with 51 | \code{\link{forward.impulse.policy}} 52 | } 53 | \examples{ 54 | set.seed(1) 55 | require(DiceKriging) 56 | modelBelak <- list(dim=1, sim.func=sim.bm, r=0.5, drift=0, sigma=1, 57 | x0=1, impulse.fixed.cost = 1,impulse.target = 0,impulse.func = forest.impulse, 58 | imp.type = "forest",T=5, dt=0.05,pilot.nsims=0,batch.nrep = 10,nk = 30,N = 601) 59 | belSolve <- osp.impulse.control(modelBelak, input.domain = seq(-0.5,2.5,by=0.005),method="spline") 60 | } 61 | \author{ 62 | Mike Ludkovski 63 | } 64 | -------------------------------------------------------------------------------- /man/osp.prob.design.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ospProbDesign.R 3 | \name{osp.prob.design} 4 | \alias{osp.prob.design} 5 | \title{Longstaff-Schwartz RMC algorithm with a variety of regression methods.} 6 | \usage{ 7 | osp.prob.design(N, model, subset = 1:N, method = "lm") 8 | } 9 | \arguments{ 10 | \item{N}{is the number of paths} 11 | 12 | \item{model}{defines the simulator and reward model, with the two main model hooks being. 13 | Initial condition is \code{model$x0}. Can be either a vector of length \code{model$dim} 14 | or a vector of length \code{model$dim}*N 15 | option payoff \code{payoff.func} (plus parameters) and stochastic simulator \code{sim.func} (plus parameters)} 16 | 17 | \item{subset}{To have out-of-sample paths, specify \code{subset} (e.g 1:1000) to use for testing. 18 | By default everything is in-sample} 19 | 20 | \item{method}{a string specifying regression method to use 21 | \itemize{ 22 | \item spline: Smoothing splines \code{smooth.spline} from \pkg{base}. Only works \emph{in 1D}. 23 | Requires number of knots \code{nk}. If \code{nk} is omitted, does cross-validation. 24 | \item randomforest: (from \pkg{randomForest} package) requires \code{rf.maxnode} 25 | and \code{rf.ntree} (number of trees) model parameters 26 | \item loess: local polynomial regression. Only works in \emph{1D or 2D}, 27 | requires \code{lo.span} model parameter 28 | \item earth: multivariate adaptive regression splines (MARS) using \pkg{earth} package. 29 | Requires \code{earth.deg} (interaction degree), \code{earth.nk} (max number of terms to keep), 30 | \code{earth.thresh} params 31 | \item rvm: relevance vector machine from \pkg{kernlab} package. Optional \code{rvm.kernel} 32 | model parameter to decide which kernel family to utilize. Default kernel is rbfdot 33 | \item npreg: kernel regression using \pkg{np} package. Can optionally provide \code{np.kertype} 34 | (default is "gaussian"); \code{np.regtype} (default is "lc"); \code{np.kerorder} (default is 2) 35 | \item nnet: neural network using \pkg{nnet}. This is a single-layer neural net. Specify a scalar \code{nn.nodes} 36 | to describe the number of nodes at the hidden layer 37 | \item lagp: local approximate Gaussian Process regression using \pkg{lagp} package. Can 38 | optionally provide \code{lagp.type} (default is "alcray" which is fastest, other choices are "alc" 39 | and "mspe") that determines how the local design is constructed, and \code{lagp.end} which determines 40 | how many inputs are in the above local design. 41 | \item dynatree: dynamic trees using \pkg{dynaTree}. Requires \code{dt.type} ("constant" 42 | or "linear" fits at the leafs), \code{dt.Npart} (number of trees), \code{dt.minp} (minimum size 43 | of each partition) and \code{dt.ab} (the tree prior parameter) model parameters. 44 | \item lm [Default]: linear global regression using \code{model$bases} (required) basis functions 45 | (+ constant) which is a function pointer. 46 | }} 47 | } 48 | \value{ 49 | a list containing 50 | \itemize{ 51 | \item \code{fit} a list containing all the models generated at each time-step. \code{fit[[1]]} is the emulator 52 | at \eqn{t=\Delta t}, the last one is \code{fit[[M-1]]} which is emulator for \eqn{T-\Delta t}. 53 | \item \code{val}: the in-sample pathwise rewards 54 | \item \code{test}: the out-of-sample pathwise rewards 55 | \item \code{p}: the final price (2-vector for in/out-of-sample) 56 | \item \code{timeElapsed} total running time in seconds, based on \code{Sys.time} 57 | } 58 | } 59 | \description{ 60 | RMC using probabilistic design: backpropagation along fixed set of paths (a la Longstaff-Schwartz). 61 | All designs are kept in memory. By default produces only an in-sample estimate. Use in conjuction 62 | with \code{\link{forward.sim.policy}} to generate out-of-sample price estimates. 63 | } 64 | \details{ 65 | Works with a probabilistic design that requires storing all paths in memory. Specifying \code{subset} 66 | allows to compute in parallel with the original computation an out-of-sample estimate of the value function 67 | 68 | Calls \code{model$payoff.func}, so the latter must be set prior to calling. 69 | Also needs \code{model$dt}, \code{model$T} for simulation and \code{model$r} for discounting 70 | 71 | Calls \code{model$sim.func} to generate forward paths 72 | 73 | Emulator is trained only on paths where payoffs are strictly positive 74 | } 75 | \examples{ 76 | set.seed(1) 77 | model2d <- list(look.ahead=1,K=40,x0=rep(40,2),sigma=rep(0.2,2),r=0.06, 78 | div=0, T=1,dt=0.04,dim=2, sim.func=sim.gbm, payoff.func=put.payoff) 79 | bas22 <- function(x) return(cbind(x[,1],x[,1]^2,x[,2],x[,2]^2,x[,1]*x[,2])) 80 | model2d$bases <- bas22 81 | prob.lm <- osp.prob.design(30000,model2d,method="lm",subset=1:15000) 82 | prob.lm$p 83 | # yields [1] 1.440918 1.482422 84 | } 85 | -------------------------------------------------------------------------------- /man/osp.probDesign.piecewisebw.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ospPiecewiseBW.R 3 | \name{osp.probDesign.piecewisebw} 4 | \alias{osp.probDesign.piecewisebw} 5 | \title{Longstaff Schwartz Algorithm using the Bouchard-Warin method 6 | 7 | Uses the Bouchard-Warin recursive partitioning to create N-d trees 8 | for local linear regression fits. Each tree node contains N/model$nBins^model$dim inputs.} 9 | \usage{ 10 | osp.probDesign.piecewisebw(N, model, tst.paths = NULL, verb = 0) 11 | } 12 | \arguments{ 13 | \item{N}{the number of forward training paths} 14 | 15 | \item{model}{a list defining all model parameters. Must contain the following fields: 16 | \cr \code{T, dt, dim, nBins}, 17 | \code{sim.func, x0, r, payoff.func}} 18 | 19 | \item{tst.paths}{(optional) a list containing out-of-sample paths to obtain a price estimate} 20 | 21 | \item{verb}{if specified, produces plots of the 1-dim fit every \code{verb} time-steps 22 | [default is zero, no plotting]} 23 | } 24 | \value{ 25 | a list with the following fields: 26 | \itemize{ 27 | \item \code{price} is the scalar optimal reward; 28 | \item \code{tau} is a vector of stopping times over in-sample paths; 29 | \item \code{test} is a vector of out-of-sample pathwise rewards; 30 | \item \code{val} is a vector of in-sample pathwise rewards 31 | \item \code{timeElapsed} total running time based on \code{Sys.time} 32 | } 33 | } 34 | \description{ 35 | Longstaff Schwartz Algorithm using the Bouchard-Warin method 36 | 37 | Uses the Bouchard-Warin recursive partitioning to create N-d trees 38 | for local linear regression fits. Each tree node contains N/model$nBins^model$dim inputs. 39 | } 40 | \details{ 41 | Calls \link{treeDivide.BW} to create the equi-probable partitions. 42 | Must have N/model$nBins^model$dim as an integer. 43 | } 44 | \examples{ 45 | set.seed(1) 46 | modelSV5 <- list(K=100,x0=c(90, log(0.35)),r=0.0225,div=0,sigma=1, 47 | T=50/252,dt=1/252,svAlpha=0.015,svEpsY=1,svVol=3,svRho=-0.03,svMean=2.95, 48 | eulerDt=1/2520, dim=2,sim.func=sim.expOU.sv,nBins=10,payoff.func=sv.put.payoff) 49 | putPr <- osp.probDesign.piecewisebw(20000,modelSV5) 50 | putPr$price 51 | # get [1] 17.30111 52 | } 53 | \references{ 54 | Bruno Bouchard and Xavier Warin. Monte-Carlo valorisation of American options: facts and new 55 | algorithms to improve existing methods. In R. Carmona, P. Del Moral, P. Hu, and N. Oudjane, editors, 56 | Numerical Methods in Finance, volume 12 of Springer Proceedings in Mathematics. Springer, 2011. 57 | } 58 | -------------------------------------------------------------------------------- /man/osp.seq.batch.design.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ospSeqBatchDesign.R 3 | \name{osp.seq.batch.design} 4 | \alias{osp.seq.batch.design} 5 | \title{Adaptive Batching designs for optimal stopping} 6 | \usage{ 7 | osp.seq.batch.design(model, method = "km", t0 = 0.01, is.gbm = FALSE) 8 | } 9 | \arguments{ 10 | \item{model}{a list containing all the model parameters.} 11 | 12 | \item{method}{A GP emulator to apply. Must be one of \code{km}, \code{trainkm} 13 | \code{homgp}, \code{homtp} or \code{hetgp}} 14 | 15 | \item{t0}{parameter \code{t0} for the \code{ABSUR} heuristic [Default value is 0.01]} 16 | 17 | \item{is.gbm}{flag to indicate whether the underlying simulator is independent log-normals (used 18 | as part of density computation for integrated EI criteria) [Default FALSE]} 19 | } 20 | \value{ 21 | a list containing: 22 | \itemize{ 23 | \item \code{fit} a list of fitted response surfaces 24 | \item \code{timeElapsed} vector of time costs for each round 25 | \item \code{nsims} total number of 1-step \code{model$sim.func} calls 26 | \item \code{empLoss} vector of empirical losses 27 | \item \code{ndesigns}: number of unique designs k_T 28 | \item \code{batches}: matrix of replications r_i, indexed by time-steps and by sequential rounds 29 | } 30 | } 31 | \description{ 32 | Sequential experimental design for optimal stopping problems with several 33 | adaptive batching heuristics based on Lyu & Ludkovski (2020+) 34 | } 35 | \details{ 36 | Implements the adaptive batching strategy defined in \code{mode$batch.heuristic}. 37 | Calls \code{lhs} from library \pkg{tgp}. Possible batch heuristics are: 38 | \itemize{ 39 | \item \code{fb}: [Default] fixed batch amounts (essentially same as \link{osp.seq.design}) 40 | \item \code{mlb}: Multi-level batching; relies on \code{model$r.cand} 41 | \item \code{rb}: Ratchet batching; relies on \code{model$r.cand} 42 | \item \code{absur}: Adaptive batching with Stepwise Uncertainty Reduction; relies on \code{model$t0} 43 | \item \code{adsa}: Adaptive Design with Sequential Allocation 44 | \item \code{ddsa}: Deterministic ADSA that alternates between adding a new input site and allocating 45 | to existing sites 46 | } 47 | 48 | All heuristics also require specifying the acquisition function for expected improvement criterion 49 | via \code{model$ei.func}, see \link{osp.seq.design} 50 | } 51 | \examples{ 52 | sob30 <- randtoolbox::sobol(55, d=2) # construct a space-filling initial design 53 | sob30 <- sob30[ which( sob30[,1] + sob30[,2] <= 1) ,] 54 | sob30 <- 25+30*sob30 55 | model2d <- list(x0 = rep(40,2),K=40,sigma=rep(0.2,2),r=0.06, 56 | div=0,T=1,dt=0.04,dim=2,sim.func=sim.gbm, 57 | payoff.func=put.payoff, look.ahead=1, pilot.nsims=1000, 58 | cand.len=1000,max.lengthscale=c(40,40),min.lengthscale=c(3,3), 59 | seq.design.size=50,batch.nrep=25,total.budget=2000,init.size=30, 60 | init.grid=sob30, kernel.family="gauss",update.freq=5, 61 | r.cand=c(20, 30,40,50,60, 80, 120, 160)) 62 | set.seed(11) 63 | require(tgp) 64 | require(DiceKriging) 65 | require(laGP) 66 | require(ks) 67 | model2d$batch.heuristic <- 'adsa' 68 | model2d$ei.func <- 'amcu' 69 | oos.obj.adsa <- osp.seq.batch.design(model2d,method="trainkm") 70 | plt.2d.surf.with.batch(oos.obj.adsa$fit[[15]], 71 | oos.obj.adsa$batches[1:oos.obj.adsa$ndesigns[15] - 1, 15]) 72 | } 73 | \references{ 74 | M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 75 | in Noisy Level Set Estimation, 76 | } 77 | \seealso{ 78 | [mlOSP::osp.seq.design] 79 | } 80 | -------------------------------------------------------------------------------- /man/osp.seq.batch.design.simplified.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ospSeqBatchDesignSimplified.R 3 | \name{osp.seq.batch.design.simplified} 4 | \alias{osp.seq.batch.design.simplified} 5 | \title{Adaptive Batch design for optimal stopping (simplified version)} 6 | \usage{ 7 | osp.seq.batch.design.simplified(model, method = "km") 8 | } 9 | \arguments{ 10 | \item{model}{a list containing all model parameters} 11 | 12 | \item{method}{\code{km} to select the GP emulator to apply} 13 | } 14 | \value{ 15 | a list containing: 16 | \itemize{ 17 | \item \code{fit} a list of fitted response surfaces 18 | \item \code{ndesigns}: number of design size k_T 19 | \item \code{batches}: matrix of replications r_i 20 | } 21 | } 22 | \description{ 23 | Adaptive Batch design for optimal stopping (simplified version) 24 | } 25 | \details{ 26 | Implements the adaptive batching strategy defined in batch.heuristic with model defined in method. 27 | } 28 | -------------------------------------------------------------------------------- /man/osp.seq.design.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ospSeqDesign.R 3 | \name{osp.seq.design} 4 | \alias{osp.seq.design} 5 | \title{Sequential design for optimal stopping} 6 | \usage{ 7 | osp.seq.design(model, method = "km") 8 | } 9 | \arguments{ 10 | \item{model}{a list containing all the model parameters. 11 | 12 | The following model parameters are used: 13 | \itemize{ 14 | \item \code{init.size}: size of starting grid (will be generated via lhs sampling if \code{init.grid} is not given) 15 | \item \code{pilot.nsims}: number of pilot simulations to create the search space where new inputs will be added 16 | (Default is 5*\code{model$init.size}) 17 | \item \code{cand.len}: number of candidate new inputs to be proposed. The next input is chosen greedily as 18 | the candidate that maximizes the EI criterion. Candidate inputs are selected via \code{tgp::lhs} 19 | (Default is 500*\code{model$dim}) 20 | \item \code{lhs.rect}: specification of the bounding hyper-rectangle where search is conducted 21 | (Default: construct based on 0.02/0.98 quantiles of the pilot paths in each dimension) 22 | \item \code{update.freq}: how often to re-fit the entire GP surrogate as new inputs are added (Default is 10) 23 | \item \code{batch.nrep} (REQUIRED): number of replicates at each unique input 24 | \item \code{min.lengthscale}: vector with minimum lengthscales of the surrogate (Default: 1% of lhs.rect dimensions) 25 | \item \code{max.lengthscale}: vector with maximum lengthscale of the surrogate (Default: 10x each of lhs.rect dimensions) 26 | \item \code{ei.func}: acquisition function (cSUR by Default) 27 | }} 28 | 29 | \item{method}{one of \code{km}, \code{trainkm}, \code{homtp} or \code{hetgp} to select the GP emulator to apply} 30 | } 31 | \value{ 32 | a list containing: 33 | \itemize{ 34 | \item \code{fit} a list of fitted response surfaces. 35 | \item \code{timeElapsed}, 36 | \item \code{nsims} total number of 1-step sim.func calls 37 | \item \code{budget} -- number of sequential iterations per time-step 38 | \item \code{empLoss} --matrix of empirical losses (rows for time-steps, columns for iterations) 39 | \item \code{theta.fit} -- 3d array of estimated lengthscales (sorted by time-steps,iterations,dimensions-of-x) 40 | } 41 | } 42 | \description{ 43 | Regression Monte Carlo via sequential experimental design. The experimental design is augmented 44 | one input at a time, using an Expected Improvement (EI) acquisition function. This is repeated at 45 | each time step. The method is likely to be somewhat slow, but highly efficient in its use of underlying 46 | simulations. See Gramacy & Ludkovski (2013), Ludkovski (2018) for details. 47 | } 48 | \details{ 49 | EI criteria are based on posterior and/or predictive variance and therefore require the use of a 50 | Gaussian-process based surrogate (currently from \pkg{DiceKriging} or \pkg{hetGP}). 51 | 52 | Implements the EI strategy defined in \code{model$ei.func}. Calls \code{lhs} from library \pkg{tgp}. 53 | Empirical losses are computed using \code{cf.el} function. The acquisition function is specified via 54 | \code{ei.func} which can be \code{csur} (Default), \code{sur}, \code{smcu}, \code{amcu}, 55 | \code{tmse} and \code{icu}. 56 | 57 | The experimental design is initialized via \code{init.size}/\code{init.grid} parameters and then is grown 58 | one input-at-a-time until it is of size \code{model$seq.design.size}. Thus, there are a total of 59 | seq.design.size-init.size sequential iterations. 60 | } 61 | \references{ 62 | Mike Ludkovski, Kriging Metamodels and Experimental Design for Bermudan Option Pricing 63 | Journal of Computational Finance, 22(1), 37-77, 2018 64 | } 65 | \author{ 66 | Mike Ludkovski 67 | } 68 | -------------------------------------------------------------------------------- /man/osp.tvr.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ospProbDesign.R 3 | \name{osp.tvr} 4 | \alias{osp.tvr} 5 | \title{Tsitsiklis van Roy RMC algorithm with a variety of regression methods} 6 | \usage{ 7 | osp.tvr(N, model, subset = 1:N, method = "lm") 8 | } 9 | \arguments{ 10 | \item{N}{the number of forward paths to train on} 11 | 12 | \item{model}{a list defining the simulator and reward model, with the two main model hooks being 13 | \code{payoff.func} (plus parameters) and \code{sim.func} (plus parameters). 14 | 15 | Also \code{x0} 16 | is a required part of the \code{model}. Can be either a vector of length \code{model$dim} 17 | or a vector of length \code{model$dim}*N} 18 | 19 | \item{subset}{To reserve out-of-sample paths, specify \code{subset} (eg 1:1000) to use for testing. 20 | By default everything is in-sample.} 21 | 22 | \item{method}{a string specifying regression method to use 23 | \itemize{ 24 | \item spline: \code{smooth.spline} from \pkg{base} which only works \emph{in 1D} 25 | \item cvspline: \code{smooth.spline} from \pkg{base} with automatically chosen 26 | (via cross-validation) degrees of freedom/number of knots. Only works \emph{in 1D} 27 | \item randomforest: (from \pkg{randomForest} package) requires \code{rf.maxnode} 28 | and \code{rf.ntree} (number of trees) model parameters 29 | \item loess: only works in \emph{1D or 2D}, requires \code{lo.span} model parameter 30 | \item earth: multivariate regression splines (MARS) using \pkg{earth} package. 31 | requires \code{earth.deg} (interaction degree), \code{earth.nk} (max number of terms to keep), 32 | \code{earth.thresh} params 33 | \item rvm: relevance vector machine from \pkg{kernlab} package. Optional \code{rvm.kernel} 34 | model parameter to decide which kernel family to utilize. Default kernel is rbfdot 35 | \item deepnet: neural network using \pkg{deepnet}. Specify \code{nn.layers} as a vector 36 | to describe the number of nodes across hidden layers 37 | \item lm [Default]: linear global regression using \code{model$bases} (required) basis functions (+ constant) 38 | }} 39 | } 40 | \value{ 41 | a list containing 42 | \itemize{ 43 | \item \code{fit} a list containing all the models generated at each time-step. \code{fit[[1]]} is the emulator 44 | at \eqn{t=\Delta t}, the last one is \code{fit[[M-1]]} which is emulator for \eqn{T-\Delta t}. 45 | \item \code{val}: the in-sample pathwise rewards 46 | \item \code{test}: the out-of-sample pathwise rewards 47 | \item \code{p}: the final price (2-vector for in/out-of-sample) 48 | \item \code{timeElapsed} (based on \code{Sys.time}) 49 | } 50 | } 51 | \description{ 52 | RMC using TvR along a global set of paths. 53 | All designs are kept in memory 54 | } 55 | \details{ 56 | Works with a probabilistic design that requires storing all paths in memory. Specifying \code{subset} 57 | allows to compute in parallel with the original computation an out-of-sample estimate of the value function 58 | 59 | Calls \code{model$payoff.func}, so the latter must be set prior to calling. 60 | Also needs \code{model$dt} and \code{model$r} for discounting 61 | 62 | Calls \code{model$sim.func} to generate forward paths 63 | 64 | Emulator is trained on all paths, even those that are out-of-the-money 65 | } 66 | \examples{ 67 | set.seed(1) 68 | require(earth) 69 | model2d <- list(K=40,x0=rep(40,2),sigma=rep(0.2,2),r=0.06,div=0, 70 | T=1,dt=0.04,dim=2, sim.func=sim.gbm, payoff.func=put.payoff,pilot.nsims=1000, 71 | earth.deg=2,earth.nk=200,earth.thresh=1E-8) 72 | tvrSolve <- osp.tvr(N=41000,model2d, subset=1:1000,method="earth") 73 | # "in-sample v_0 1.224009; and out-of-sample: 1.233986" 74 | } 75 | -------------------------------------------------------------------------------- /man/ospPredict.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{ospPredict} 4 | \alias{ospPredict} 5 | \title{Forward simulation of a swing payoff based on a sequence of emulators} 6 | \usage{ 7 | ospPredict(myFit, myx, model) 8 | } 9 | \arguments{ 10 | \item{myFit}{fitted emulators of any type supported by osp.prob.design} 11 | 12 | \item{myx}{inputs to predict at} 13 | 14 | \item{model}{List containing all model parameters.} 15 | } 16 | \value{ 17 | prediction of myFit evaluated at myx 18 | } 19 | \description{ 20 | Evaluate the \code{fit} emulator at input \code{myx} 21 | } 22 | \details{ 23 | Generally internal to other mlOSP functions 24 | } 25 | -------------------------------------------------------------------------------- /man/pegging.alg.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{pegging.alg} 4 | \alias{pegging.alg} 5 | \title{Calculates reallocated batch size for ADSA} 6 | \usage{ 7 | pegging.alg(r, U, r_seq) 8 | } 9 | \arguments{ 10 | \item{r}{total number of simulations} 11 | 12 | \item{U}{weighted matrix for pegging algorithm} 13 | 14 | \item{r_seq}{batch size vector for existing inputs} 15 | } 16 | \description{ 17 | Pegging algorithm for ADSA/DDSA 18 | } 19 | \references{ 20 | M. Ludkovski, X. Lyu (2020+) Adaptive Batching for Gaussian Process Surrogates with Application 21 | in Noisy Level Set Estimation, http://arxiv.org/abs/2003.08579 22 | } 23 | \author{ 24 | Xiong Lyu 25 | } 26 | -------------------------------------------------------------------------------- /man/plot_style.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{plot_style} 4 | \alias{plot_style} 5 | \title{GGplot style} 6 | \usage{ 7 | plot_style(base_size = 14, base_family = "Helvetica", ...) 8 | } 9 | \arguments{ 10 | \item{base_size}{is the font size} 11 | 12 | \item{base_family}{is the font style} 13 | 14 | \item{...}{is for the parameters to specialize the ggplot style} 15 | } 16 | \description{ 17 | GGplot style 18 | } 19 | -------------------------------------------------------------------------------- /man/plt.2d.surf.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{plt.2d.surf} 4 | \alias{plt.2d.surf} 5 | \title{Visualize a 2D emulator + stopping region} 6 | \usage{ 7 | plt.2d.surf( 8 | fit, 9 | x = seq(31, 43, len = 201), 10 | y = seq(31, 43, len = 201), 11 | show.var = FALSE, 12 | only.contour = FALSE, 13 | contour.col = "red", 14 | bases = NULL, 15 | strike = 0 16 | ) 17 | } 18 | \arguments{ 19 | \item{fit}{a fitted emulator. can be any of the types supported by \code{\link{forward.sim.policy}}} 20 | 21 | \item{x, y}{locations to use for the \code{predict()} functions. Default is a 200x200 fine grid. 22 | Passed to \code{expand.grid}} 23 | 24 | \item{show.var}{if \code{TRUE} then plot posterior surrogate variance instead of surrogate mean [default = FALSE] 25 | This only works for \code{km} and \code{het/homGP/homTP} objects} 26 | 27 | \item{only.contour}{-- just the zero-contour, no raster plot} 28 | 29 | \item{contour.col}{(default is "red") -- color of the zero contour} 30 | 31 | \item{bases}{(only used for lm objects)} 32 | } 33 | \value{ 34 | a ggplot handle for the created plot. 35 | } 36 | \description{ 37 | Two-dimensional raster+contour+point plot of an mlOSP emulator at a single time step. 38 | } 39 | \details{ 40 | Uses the raster plot from \pkg{ggplot2}. For GP-based objects, also shows the unique design 41 | sites via geom_point. See \code{\link{plt.2d.surf.batch}} for a similar plot 42 | for \code{osp.seq.batch.design} emulators. 43 | } 44 | -------------------------------------------------------------------------------- /man/plt.2d.surf.batch.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{plt.2d.surf.batch} 4 | \alias{plt.2d.surf.batch} 5 | \title{Visualize and compare 2D emulator + stopping region for two fits} 6 | \usage{ 7 | plt.2d.surf.batch( 8 | fit1, 9 | fit2, 10 | r1, 11 | r2, 12 | batch1, 13 | batch2, 14 | x = seq(31, 43, len = 201), 15 | y = seq(31, 43, len = 201) 16 | ) 17 | } 18 | \arguments{ 19 | \item{fit1, fit2}{can be any of the types supported by \code{\link{forward.sim.policy}}} 20 | 21 | \item{r1, r2}{batch vectors for the two fits} 22 | 23 | \item{batch1, batch2}{batch heristics for two fits; Passed to \code{ggplot()} 24 | This only works for \code{km} and \code{het/homGP} objects} 25 | 26 | \item{x, y}{locations to use for the \code{predict()} functions. Default is a 200x200 fine grid. 27 | Passed to \code{expand.grid}} 28 | } 29 | \description{ 30 | two-dimensional image of contour + site + batch plot for two fits 31 | } 32 | -------------------------------------------------------------------------------- /man/plt.2d.surf.with.batch.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{plt.2d.surf.with.batch} 4 | \alias{plt.2d.surf.with.batch} 5 | \title{Visualize 2D emulator + stopping region + batch amounts} 6 | \usage{ 7 | plt.2d.surf.with.batch( 8 | fit, 9 | batch_size, 10 | x = seq(25, 50, len = 201), 11 | y = seq(25, 50, len = 201), 12 | contour.col = "red", 13 | rep.limits = c(20, 100) 14 | ) 15 | } 16 | \arguments{ 17 | \item{fit}{An emulator object. Can be any of the types supported by \code{\link[mlOSP]{forward.sim.policy}}} 18 | 19 | \item{batch_size}{array of replication counts for each input generated by \code{\link[mlOSP]{osp.seq.batch.design}}} 20 | 21 | \item{x, y}{locations to use for the \code{predict()} functions. Default is a 200x200 fine grid. 22 | Passed to \code{expand.grid}} 23 | 24 | \item{contour.col}{(default is "red") -- color of the zero contour} 25 | 26 | \item{rep.limits}{(default is c(20,100)) -- range of the legend for the replication counts} 27 | } 28 | \description{ 29 | two-dimensional image+contour plot with replication counts for an \code{osp.seq.batch.design} fit 30 | } 31 | \examples{ 32 | sob30 <- randtoolbox::sobol(55, d=2) # construct a space-filling initial design 33 | sob30 <- sob30[ which( sob30[,1] + sob30[,2] <= 1) ,] 34 | sob30 <- 25+30*sob30 35 | model2d <- list(x0 = rep(40,2),K=40,sigma=rep(0.2,2),r=0.06, 36 | div=0,T=1,dt=0.04,dim=2,sim.func=sim.gbm, 37 | payoff.func=put.payoff, look.ahead=1, pilot.nsims=1000, 38 | cand.len=1000,max.lengthscale=c(40,40),min.lengthscale=c(3,3), 39 | seq.design.size=50,batch.nrep=25,total.budget=1000,init.size=30, 40 | init.grid=sob30, kernel.family="gauss",update.freq=5, 41 | r.cand=c(20, 30,40,50,60, 80, 120, 160)) 42 | set.seed(11) 43 | require(tgp) 44 | require(DiceKriging) 45 | require(laGP) 46 | require(ks) 47 | require(RColorBrewer) 48 | require(scales) 49 | model2d$batch.heuristic <- 'adsa' 50 | model2d$ei.func <- 'amcu' 51 | oos.obj.adsa <- osp.seq.batch.design(model2d,method="trainkm") 52 | plt.2d.surf.with.batch(oos.obj.adsa$fit[[15]], 53 | oos.obj.adsa$batches[1:oos.obj.adsa$ndesigns[15] - 1, 15]) 54 | } 55 | \author{ 56 | Xiong Lyu 57 | } 58 | -------------------------------------------------------------------------------- /man/put.payoff.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/payoffs.R 3 | \name{put.payoff} 4 | \alias{put.payoff} 5 | \title{Arithmetic basket Put for d-dim x 6 | @title American Put payoff} 7 | \usage{ 8 | put.payoff(x, model) 9 | } 10 | \arguments{ 11 | \item{x}{is a matrix of asset prices of dimension N * \code{model$dim}} 12 | 13 | \item{model}{list containing model params. Uses \code{model$K} as the Put strike.} 14 | } 15 | \description{ 16 | Arithmetic basket Put for d-dim x 17 | @title American Put payoff 18 | } 19 | \details{ 20 | in more than 1D, the prices are averaged and maxed with zero. 21 | } 22 | -------------------------------------------------------------------------------- /man/r.reallocate.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/batchDesign_utils.R 3 | \name{r.reallocate} 4 | \alias{r.reallocate} 5 | \title{Calculates reallocated batch size} 6 | \usage{ 7 | r.reallocate(L, K, xt_dens, r_seq, r0) 8 | } 9 | \arguments{ 10 | \item{L}{lower triangle of cholesky decomposition of covariance matrix} 11 | 12 | \item{K}{covariance matrix} 13 | 14 | \item{xt_dens}{density of xtest} 15 | 16 | \item{r_seq}{batch size vector for existing inputs} 17 | 18 | \item{r0}{total number of new simulations} 19 | } 20 | \description{ 21 | New batch size calculator 22 | } 23 | -------------------------------------------------------------------------------- /man/sim.bm.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simFuncs.R 3 | \name{sim.bm} 4 | \alias{sim.bm} 5 | \title{Simulate 1D arithmetic Brownian Motion} 6 | \usage{ 7 | sim.bm(x0, model, dt = model$dt) 8 | } 9 | \arguments{ 10 | \item{x0}{is the starting values (matrix of size N x model$dim)} 11 | 12 | \item{model}{a list containing all the other parameters, including volatility \code{model$sigma}, 13 | interest rate \code{model$r} and continuous dividend yield \code{model$div}.} 14 | 15 | \item{dt}{is the step size in time. Defaults to \code{model$dt}} 16 | 17 | \item{model$drift}{for the drift term} 18 | } 19 | \description{ 20 | Simulate 1D arithmetic Brownian Motion 21 | } 22 | -------------------------------------------------------------------------------- /man/sim.expOU.sv.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simFuncs.R 3 | \name{sim.expOU.sv} 4 | \alias{sim.expOU.sv} 5 | \title{Simulate an exp-OU stoch volatility model with 1 or 2 vol factors} 6 | \usage{ 7 | sim.expOU.sv(x0, model, dt = model$dt, useEuler = FALSE) 8 | } 9 | \arguments{ 10 | \item{x0}{should have 2 or 3 columns} 11 | 12 | \item{model}{a list containing all the other parameters, including volatility \code{model$sigma}, 13 | interest rate \code{model$r} and continuous dividend yield \code{model$div}.} 14 | 15 | \item{dt}{is the step size in time. Defaults to \code{model$dt}} 16 | 17 | \item{useEuler}{flag, whether to use the exact transition for the StochVol factor, or its 18 | Euler approximation} 19 | } 20 | \description{ 21 | Simulate an exp-OU stoch volatility model with 1 or 2 vol factors 22 | } 23 | \details{ 24 | : Need the following fields in model: \code{svMean} (mean-reversion level), 25 | \code{svAlpha} (mean-reversion strength), \code{svEpsY} (fast scaling parameter), 26 | \code{svVol} (volatility of volatility), \code{svRho} (correlation with asset S). 27 | For 2-factor also need: \code{svMeanZ} (slow scale mean-reversion level), \code{svAlphaZ} 28 | (mean-reversion strength), \code{svDeltaZ} (slow scaling parameter), 29 | \code{svVolZ} (Z volatility), \code{svRhoZ} (correlation between Z and S), \code{svRhoYZ} 30 | ( correlation between the fast and slow SV factors) 31 | } 32 | -------------------------------------------------------------------------------- /man/sim.gbm.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simFuncs.R 3 | \name{sim.gbm} 4 | \alias{sim.gbm} 5 | \title{Simulate paths of Geometric Brownian Motion with constant parameters} 6 | \usage{ 7 | sim.gbm(x0, model, dt = model$dt) 8 | } 9 | \arguments{ 10 | \item{x0}{is the starting values (matrix of size N x model$dim)} 11 | 12 | \item{model}{a list containing all the other parameters, including volatility \code{model$sigma}, 13 | interest rate \code{model$r} and continuous dividend yield \code{model$div}.} 14 | 15 | \item{dt}{is the step size in time. Defaults to \code{model$dt}} 16 | } 17 | \value{ 18 | a vector of same dimensions as x0 19 | } 20 | \description{ 21 | Simulate paths of Geometric Brownian Motion with constant parameters 22 | } 23 | \details{ 24 | Simulate from \eqn{p(X_t|X_{t-1})}. 25 | Use log-normal transition density specified by the \code{model} parameters 26 | } 27 | -------------------------------------------------------------------------------- /man/sim.gbm.asian.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simFuncs.R 3 | \name{sim.gbm.asian} 4 | \alias{sim.gbm.asian} 5 | \title{Simulate 1D Brownian Motion for Asian Options} 6 | \usage{ 7 | sim.gbm.asian(x0, model, dt = model$dt) 8 | } 9 | \arguments{ 10 | \item{x0}{is the starting values (matrix of size N x model$dim)} 11 | 12 | \item{model}{a list containing all the other parameters, including volatility \code{model$sigma}, 13 | interest rate \code{model$r} and continuous dividend yield \code{model$div}.} 14 | 15 | \item{dt}{is the step size in time. Defaults to \code{model$dt}} 16 | } 17 | \description{ 18 | Simulate 1D Brownian Motion for Asian Options 19 | } 20 | \details{ 21 | first column is t, second column is S_t 22 | third column is A_t (arithmetic average) 23 | fourth column is tilde{A}_t (geometric average) 24 | } 25 | -------------------------------------------------------------------------------- /man/sim.gbm.cor.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simFuncs.R 3 | \name{sim.gbm.cor} 4 | \alias{sim.gbm.cor} 5 | \title{Simulate paths of correlated GBM with a constant correlation} 6 | \usage{ 7 | sim.gbm.cor(x0, model, dt = model$dt) 8 | } 9 | \arguments{ 10 | \item{x0}{is the starting values (vector)} 11 | 12 | \item{model}{contains all the other parameters. 13 | In particular, need \code{model$r, model$rho, model$sigma, model$div, model$dim} 14 | Note that \code{model$sigma} is the \strong{volatility} parameter (scalar)} 15 | 16 | \item{dt}{is the step size} 17 | } 18 | \description{ 19 | Simulate paths of correlated GBM with a constant correlation 20 | } 21 | \details{ 22 | Simulate correlated multivariate Geometric Brownian motion 23 | with a given \code{model$rho} and \strong{identical} \code{model$sigma}'s. 24 | Calls \code{rmvnorm} from \pkg{mvtnorm} 25 | } 26 | -------------------------------------------------------------------------------- /man/sim.gbm.matrix.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simFuncs.R 3 | \name{sim.gbm.matrix} 4 | \alias{sim.gbm.matrix} 5 | \title{Simulate paths of correlated GBM} 6 | \usage{ 7 | sim.gbm.matrix(x0, model, dt = model$dt) 8 | } 9 | \arguments{ 10 | \item{x0}{is the starting values. Should be a matrix of size N x \code{model$dim})} 11 | 12 | \item{model}{contains all the other parameters. 13 | In particular, need \code{model$r, model$rho, model$sigma, model$div, model$dim} 14 | Note that \code{model$sigma} is the \strong{volatility vector}} 15 | 16 | \item{dt}{is the step size (Defaults to \code{model$dt})} 17 | } 18 | \value{ 19 | a vector of the new states (same dimension as x0) 20 | } 21 | \description{ 22 | Simulate paths of correlated GBM 23 | } 24 | \details{ 25 | Simulate correlated multivariate Geometric Brownian motion 26 | with a given \code{model$rho} and arbitrary \code{model$sigma}'s. Calls \code{rmvnorm} 27 | from \pkg{mvtnorm} 28 | } 29 | -------------------------------------------------------------------------------- /man/sim.gbm.moving.ave.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simFuncs.R 3 | \name{sim.gbm.moving.ave} 4 | \alias{sim.gbm.moving.ave} 5 | \title{Simulate 1D Brownian Motion for Moving Average Asian Options} 6 | \usage{ 7 | sim.gbm.moving.ave(x0, model, dt = model$dt) 8 | } 9 | \arguments{ 10 | \item{x0}{is the starting values (matrix of size N x model$dim)} 11 | 12 | \item{model}{a list containing all the other parameters, including volatility \code{model$sigma}, 13 | interest rate \code{model$r} and continuous dividend yield \code{model$div}.} 14 | 15 | \item{dt}{is the step size in time. Defaults to \code{model$dt}} 16 | } 17 | \description{ 18 | Simulate 1D Brownian Motion for Moving Average Asian Options 19 | } 20 | \details{ 21 | first column is S_t, other columns are lagged S_t's 22 | the lags are in terms of dt 23 | } 24 | -------------------------------------------------------------------------------- /man/sim.logOU_Discrete.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simFuncs.R 3 | \name{sim.logOU_Discrete} 4 | \alias{sim.logOU_Discrete} 5 | \title{Simulate from 1-D discretized exponential Ornstein Uhlenbeck process 6 | See Bender (SIFIN 2011). Only works in one dimension} 7 | \usage{ 8 | sim.logOU_Discrete(x0, model, dt = model$dt) 9 | } 10 | \arguments{ 11 | \item{x0}{is the starting values (matrix of size N x model$dim)} 12 | 13 | \item{model}{a list containing all the other parameters, including volatility \code{model$sigma}, 14 | interest rate \code{model$r} and continuous dividend yield \code{model$div}.} 15 | 16 | \item{dt}{is the step size in time. Defaults to \code{model$dt}} 17 | } 18 | \description{ 19 | Simulate from 1-D discretized exponential Ornstein Uhlenbeck process 20 | See Bender (SIFIN 2011). Only works in one dimension 21 | } 22 | \details{ 23 | Requires 24 | \itemize{ 25 | \item \code{model$rho} -- similar to mean-reversion rate, should be close to 1 26 | \item \code{model$mu} -- mean-reversion level 27 | \item \code{model$sigma} -- volatility 28 | } 29 | } 30 | -------------------------------------------------------------------------------- /man/sim.ouExp.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simFuncs.R 3 | \name{sim.ouExp} 4 | \alias{sim.ouExp} 5 | \title{Simulate from exponential Ornstein Uhlenbeck process} 6 | \usage{ 7 | sim.ouExp(x0, model, dt = model$dt) 8 | } 9 | \arguments{ 10 | \item{x0}{is the starting values (matrix of size N x model$dim)} 11 | 12 | \item{model}{a list containing all the other parameters, including volatility \code{model$sigma}, 13 | interest rate \code{model$r} and continuous dividend yield \code{model$div}.} 14 | 15 | \item{dt}{is the step size in time. Defaults to \code{model$dt}} 16 | } 17 | \description{ 18 | Simulate from exponential Ornstein Uhlenbeck process 19 | } 20 | \details{ 21 | Uses \code{model$alpha} for the mean-reversion strength, 22 | \code{model$meanrev} for the mean-reversion level, and \code{model$sigma} for the volatility. 23 | } 24 | -------------------------------------------------------------------------------- /man/sim.price.and.capacity.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/simFuncs.R 3 | \name{sim.price.and.capacity} 4 | \alias{sim.price.and.capacity} 5 | \title{Simulate 2D process for price and capacity} 6 | \usage{ 7 | sim.price.and.capacity(x0, model, dt = model$dt) 8 | } 9 | \arguments{ 10 | \item{x0}{is the starting values (matrix of size N x model$dim)} 11 | 12 | \item{model}{a list containing all the other parameters, including volatility \code{model$sigma}, 13 | interest rate \code{model$r} and continuous dividend yield \code{model$div}.} 14 | 15 | \item{dt}{is the step size in time. Defaults to \code{model$dt}} 16 | 17 | \item{model$capSpoil}{models decay of existing capacity (default is 0)} 18 | } 19 | \description{ 20 | Simulate 2D process for price and capacity 21 | } 22 | \details{ 23 | first column is P_t price, second column is C_t capacity 24 | } 25 | -------------------------------------------------------------------------------- /man/sv.put.payoff.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/payoffs.R 3 | \name{sv.put.payoff} 4 | \alias{sv.put.payoff} 5 | \title{Basket Put with stochastic volatility} 6 | \usage{ 7 | sv.put.payoff(x, model) 8 | } 9 | \arguments{ 10 | \item{x}{First coordinate is the asset price, the rest are vol factors} 11 | 12 | \item{model}{list containing model params. Uses \code{model$K} as the Put strike.} 13 | } 14 | \description{ 15 | Put payoff for a stoch vol model 16 | } 17 | \details{ 18 | Uses K-x[,1], ignores other coordinates 19 | } 20 | -------------------------------------------------------------------------------- /man/swing.fixed.design.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/ospProbDesign.R 3 | \name{swing.fixed.design} 4 | \alias{swing.fixed.design} 5 | \title{Generic dynamic emulation of a multiple-stopping problem with a non-sequential design} 6 | \usage{ 7 | swing.fixed.design( 8 | model, 9 | input.domain = NULL, 10 | method = "km", 11 | inTheMoney.thresh = 0 12 | ) 13 | } 14 | \arguments{ 15 | \item{model}{a list defining all the model parameters} 16 | 17 | \item{input.domain}{the domain of the emulator. Several options are available. Default in \code{NULL} 18 | All the empirical domains rely on pilot paths generated using \code{pilot.nsims}>0 model parameter. 19 | \itemize{ 20 | \item NULL will use an empirical probabilistic design based on the pilot paths (default); 21 | \item if a vector of length 2*model$dim then specifies the bounding rectangle 22 | \item a single positive number, then build a bounding rectangle based on the \eqn{\alpha}-quantile of the pilot paths 23 | \item a single negative number, then build a bounding rectangle based on the full range of the pilot paths 24 | \item a vector specifies the precise design, used as-is (\emph{overrides design size}) 25 | }} 26 | 27 | \item{method}{regression method to use (defaults to \code{km}) 28 | \itemize{ 29 | \item km: [(default] Gaussian process with fixed hyperparams uses \pkg{DiceKriging} 30 | via \code{km}. Requires \code{km.cov} (vector of lengthscales) 31 | and \code{km.var} (scalar process variance) 32 | \item trainkm: GP w/trained hyperparams: use \pkg{DiceKriging} via \code{km}. 33 | Requires to specify kernel family via \code{kernel.family} 34 | \item mlegp Local GP from \pkg{laGP} (uses Gaussian squared exponential kernel) 35 | \item homgp Homoskedastic GP: use \pkg{hetGP} with \code{mleHomGP}. 36 | Requires to specify kernel family via \code{kernel.family} 37 | \item hetgp Heteroskedastic GP: use \pkg{hetGP} with \code{mleHetGP} 38 | Requires to specify kernel family via \code{kernel.family} 39 | \item spline: Smoothing Splines, use \code{smooth.spline} from \pkg{base} 40 | with the user-specified \code{nk} number of knots (1D only) 41 | \item cvspline: \code{smooth.spline} from \pkg{base} with automatically chosen 42 | (via cross-validation) degrees of freedom/number of knots. Only works \emph{in 1D} 43 | \item loess: Local polynomial regression: use \code{loess} with \code{lo.span} parameter 44 | \item rvm: Relevance Vector Machine: use \pkg{kernlab} with \code{rvm} 45 | \item lm: linear model from \pkg{stats} using \code{model$bases} 46 | }} 47 | 48 | \item{inTheMoney.thresh}{which paths are kept, out-of-the-money is dropped. 49 | Defines threshold in terms of \code{model$payoff.func}} 50 | } 51 | \value{ 52 | a list containing: 53 | \itemize{ 54 | \item \code{fit} a list containing all the models generated at each time-step. \code{fit[[1]]} is the emulator 55 | at \eqn{t=\Delta t}, the last one is \code{fit[[M-1]]} which is emulator for \eqn{T-\Delta t}. 56 | \item \code{val}: the in-sample pathwise rewards 57 | \item \code{test}: the out-of-sample pathwise rewards 58 | \item \code{p}: the final price (2-vector for in/out-of-sample) 59 | \item \code{timeElapsed} (based on \code{Sys.time}) 60 | } 61 | } 62 | \description{ 63 | Swing option solver based on a batched non-adaptive design with a variety of regression methods 64 | } 65 | \details{ 66 | Solves for a swing with \code{n.swing} exercise rights. The payoff function is 67 | saved in \code{swing.payoff}. Also assumes a refraction period of \code{refract} between consecutive 68 | exercises. The experimental design is based on \code{\link{osp.fixed.design}}. By default, no forward evaluation is provided, ie the 69 | method only builds the emulators. Thus, to obtain an actual estimate of the value 70 | combine with \code{\link{swing.policy}}. 71 | } 72 | \examples{ 73 | set.seed(1) 74 | swingModel <- list(dim=1, sim.func=sim.gbm, x0=100, 75 | swing.payoff=put.payoff, n.swing=3,K=100, 76 | sigma=0.3, r=0.05, div=0, 77 | T=1,dt=0.02,refract=0.1, 78 | N=800,pilot.nsims=1000,batch.nrep=25) 79 | swingModel$nk=16 # number of knots for the smoothing spline 80 | spl.swing <- swing.fixed.design(swingModel,input.domain=0.03, method ="spline") 81 | } 82 | -------------------------------------------------------------------------------- /man/swing.policy.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{swing.policy} 4 | \alias{swing.policy} 5 | \title{Forward simulation of a swing payoff based on a sequence of emulators} 6 | \usage{ 7 | swing.policy( 8 | x, 9 | M, 10 | fit, 11 | model, 12 | offset = 1, 13 | use.qv = FALSE, 14 | n.swing = 1, 15 | verbose = FALSE 16 | ) 17 | } 18 | \arguments{ 19 | \item{x}{a matrix of starting values (N x \code{model$dim}). 20 | If input \code{x} is a list, then use the grids specified by x} 21 | 22 | \item{M}{number of time steps to forward simulate} 23 | 24 | \item{fit}{a list of fitted emulators that determine the stopping classifiers to be used} 25 | 26 | \item{model}{List containing all model parameters. In particular uses \code{model$dt,model$r} 27 | for discounting and \code{model$swing.payoff} to compute payoffs} 28 | 29 | \item{offset}{deprecated} 30 | 31 | \item{use.qv}{experimental, do not use} 32 | 33 | \item{n.swing}{number of swing rights (integer, at least 1)} 34 | 35 | \item{verbose}{for debugging purposes} 36 | } 37 | \value{ 38 | a list containing: 39 | \itemize{ 40 | \item \code{payoff}: a vector of length `nrow(x)` containing the resulting payoffs NPV from $t=0$ 41 | \item \code{tau} matrix of the times when stopped. Columns represent the rights exercised 42 | \item \code{nsims} number of total 1-step simulations performed 43 | } 44 | } 45 | \description{ 46 | Simulate \eqn{\sum_k h(X_{tau_k})} using \code{fit} emulators 47 | } 48 | \details{ 49 | Should be used in conjuction with the \code{\link[mlOSP]{swing.fixed.design}} function that builds the emulators. 50 | } 51 | -------------------------------------------------------------------------------- /man/treeDivide.BW.1d.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{treeDivide.BW.1d} 4 | \alias{treeDivide.BW.1d} 5 | \title{1-d version of \code{\link{treeDivide.BW}} that stores all the local fits} 6 | \usage{ 7 | treeDivide.BW.1d(grid, curDim, model, test) 8 | } 9 | \arguments{ 10 | \item{grid}{dataset of x-values} 11 | 12 | \item{curDim}{dimension of the grid} 13 | 14 | \item{model}{a list containing all model parameters. In particular must have 15 | \code{model$nBins} defined} 16 | } 17 | \description{ 18 | 1-d version of \code{\link{treeDivide.BW}} that stores all the local fits 19 | } 20 | -------------------------------------------------------------------------------- /man/treeDivide.BW.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/mlOSP_utils.R 3 | \name{treeDivide.BW} 4 | \alias{treeDivide.BW} 5 | \title{Create a Bouchard-Warin equi-probable grid of regression sub-domains} 6 | \usage{ 7 | treeDivide.BW(grid, curDim, model, test.paths) 8 | } 9 | \arguments{ 10 | \item{grid}{dataset of x-values} 11 | 12 | \item{curDim}{dimension of the grid} 13 | 14 | \item{model}{a list containing all model parameters. In particular must have 15 | \code{model$nBins} defined} 16 | 17 | \item{test.paths}{testing paths to predict along as well} 18 | } 19 | \description{ 20 | Create a Bouchard-Warin equi-probable grid of regression sub-domains 21 | } 22 | \details{ 23 | Recursively sort along each of the d-coordinates 24 | At the end do local linear regression at each leaf 25 | This is a recursive algorithm! 26 | first column is reserved for the y-coordinate (timingValue) 27 | It's safest if nrows(grid) is divisible by \code{model$nBinsmodel$dim} 28 | } 29 | -------------------------------------------------------------------------------- /seqOSP.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: No 4 | SaveWorkspace: No 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Package 19 | PackageUseDevtools: Yes 20 | PackageInstallArgs: --no-multiarch --with-keep.source 21 | PackageRoxygenize: rd,collate,namespace 22 | -------------------------------------------------------------------------------- /vignettes/seqDesignADSA-demo.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "2D Bermudan Basket Put Demo" 3 | author: Xiong Lyu, Mike Ludkovski 4 | date: '`r Sys.Date()`' 5 | output: 6 | html_vignette: 7 | df_print: paged 8 | vignette: > 9 | %\VignetteEngine{knitr::rmarkdown} 10 | %\VignetteIndexEntry{2D Bermudan Put} 11 | %\VignetteEncoding{UTF-8} 12 | --- 13 | 14 | This is a demo file to generate Figure 6 in the paper "Adaptive Batching for Gaussian Process Surrogates with Application in Noisy Level Set Estimation". The plot shows the fitted exercise boundary with its 95\% credible interval (solid line and dashed line) obtained with Gaussian Process for two-dimensional basket put Bermudan option. Two batch heuristics, ABSUR and ADSA, are used to select the location of inputs and their replications, which are shown as the dots and their color/size. 15 | ```{r setup, include=FALSE} 16 | knitr::opts_chunk$set(echo = TRUE) 17 | library(ks) 18 | library(fields) # for plotting purposes 19 | library(mlOSP) 20 | library(DiceKriging) 21 | library(tgp) # use lhs from there 22 | library(randtoolbox) # use sobol and halton QMC sequences 23 | library(hetGP) 24 | library(laGP) 25 | library(ggplot2) 26 | library(pander) 27 | data("int_2d") 28 | ``` 29 | 30 | We consider adaptive batching for a Gaussian GP metamodel for a two-dimensional basket Put Bermudan option. The asset follows 31 | log-normal dynamics 32 | $$ 33 | {Z}_{t+\Delta t} = {Z}_{t} \cdot\exp \bigg((r-\frac{1}{2} diag{{\Xi}})\Delta t + \sqrt{\Delta t} \cdot {\Xi} (\Delta {W}_{t})\bigg) 34 | $$ 35 | where $\Delta W_t$ are independent Gaussians, and the payoff is $h_{Put}(t,{z}) = e^{-r t}( {\cal K} - z^1 - z^2)_+$. 36 | 37 | 38 | ## Set up the model for Two-dim basket put with parameters in Table 4 39 | 40 | The code below sets up the model for an arithmetic Basket Put with parameters in Table 4 of the article, including the total simulations $N_T$ stored in \textit{total.budget}, initial design size $k_0$ in \textit{init.size}, initial batch size $r_0$ in \textit{batch.nrep} and kernel function $K$ in \textit{kernel.family}. The parameters of the model are initialized as a list, which is later used as input in the main function \textbf{osp.seq.batch.design}. 41 | 42 | ```{r} 43 | model2d <- list(x0 = rep(40,2),K=40,sigma=rep(0.2,2),r=0.06,div=0,T=1,dt=0.04,dim=2,sim.func=sim.gbm, payoff.func=put.payoff) 44 | model2d$pilot.nsims <- 1000 45 | model2d$look.ahead <- 1 46 | model2d$cand.len <- 1000 # size of candidate set m_0 for acquisition function 47 | model2d$max.lengthscale <- c(50,50) 48 | model2d$min.lengthscale <- c(5,5) 49 | model2d$tmse.eps <- 0 50 | 51 | model2d$ucb.gamma <- 1.96 52 | 53 | model2d$seq.design.size <- 100 # budget N = r_0 * k = 2500 54 | model2d$batch.nrep <- 25 # initial replication r_0 55 | model2d$total.budget <- 2500 56 | 57 | model2d$init.size <- 20 # initial design size k_0 58 | model2d$init.grid <- int_2d 59 | 60 | model2d$tmse.eps <- 0 61 | model2d$kernel.family <- "Matern5_2" # kernel function for Gaussian Process 62 | model2d$ucb.gamma <- 1.96 63 | model2d$update.freq <- 10 # number of sequential design steps to update the GP surrogate 64 | model2d$r.cand <- c(20, 30,40,50,60, 80, 120, 160) # r_L 65 | ``` 66 | 67 | 68 | ## Compare results for GP with ABSUR and ADSA 69 | 70 | To implement adaptive batching we need to specify the batch heuristic via \textit{batch.heuristic} and the sequential design framework with \textit{ei.func}. The function \textbf{osp.seq.batch.design.simplified} then fits the GP simulator. Fitting is done through the \texttt{DiceKriging} library, namely the main \texttt{km} function there. Below we apply Adaptive Design with Sequential Allocation, ADSA (which relies on the MCU acquisition function) and Adaptive Batching with Stepwise Uncertainty Reduction, ABSUR. The method parameter \texttt{trainkm} means that the GP metamodels are trained using the default \texttt{DiceKriging::km} MLE optimizer. 71 | 72 | ```{r testing-homtp2d, message=FALSE, warning=FALSE, fig.width=6} 73 | ### GP + ADSA 74 | set.seed(110) 75 | model2d$batch.heuristic <- 'adsa' 76 | model2d$ei.func <- "amcu" 77 | oos.obj.adsa <- osp.seq.batch.design(model2d, method="hetgp") 78 | 79 | ### GP + ABSUR 80 | set.seed(122) 81 | model2d$batch.heuristic <- 'absur' 82 | model2d$ei.func <- 'absur' 83 | oos.obj.absur <- osp.seq.batch.design(model2d, method="hetgp") 84 | 85 | ### plot Figure 6 86 | plt.2d.surf.batch(oos.obj.adsa$fit[[15]], oos.obj.absur$fit[[15]], oos.obj.adsa$batches[1:oos.obj.adsa$ndesigns[15] - 1, 15], oos.obj.absur$batches[1:oos.obj.absur$ndesigns[15] - 1, 15], "ADSA", "ABSUR", x=seq(25,50,len=201),y = seq(25, 50,len=201)) 87 | ``` 88 | 89 | 90 | The objects \texttt{oos.obj.xxx} are lists that contain the GP metamodels for each time-step of the Bermudan option problem ($M=25$ in the example above). We visualize the fitted timing values at $t=0.6 = 15 \Delta t$. The respective zero contour is the \textit{exercise boundary}. 91 | 92 | The function \textbf{plt.2d.surf.with.batch} plots the fitted exercise boundary together with its 95\% credible interval (solid line and dashed curves, respectively). The inputs are shown as the dots and their color indicates the replication amounts $r_i$'s. The first argument is the fitted GP emulator (including the fitted model and the input sites), and the second argument is the batch sizes corresponding to the selected input sites. 93 | 94 | First we plot the figure for results obtained with ADSA. 95 | ```{r, message=FALSE, warning=FALSE, fig.height=3.5, fig.width=6, fig.cap="Timing Value (background color) and Exercise Boundary (zero-contour) at $t=0.6$ using ADSA"} 96 | oos.obj.adsa$ndesigns[15] # number of unique designs 97 | ### plot Figure 6 right panel - ADSA 98 | plt.2d.surf.with.batch(oos.obj.adsa$fit[[15]], 99 | oos.obj.adsa$batches[1:oos.obj.adsa$ndesigns[15] - 1, 15]) 100 | ``` 101 | 102 | The second one is for ABSUR 103 | ```{r, message=FALSE, warning=FALSE,fig.height=3.5, fig.width=6, fig.cap="Timing Value and Exercise Boundary using ABSUR"} 104 | oos.obj.absur$ndesigns[15] # number of unique designs 105 | ### plot Figure 6 left panel - ABSUR 106 | plt.2d.surf.with.batch(oos.obj.absur$fit[[15]], 107 | oos.obj.absur$batches[1:oos.obj.absur$ndesigns[15] - 1, 15]) 108 | ``` 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | --------------------------------------------------------------------------------