├── .Rbuildignore ├── .gitattributes ├── .gitignore ├── DESCRIPTION ├── LICENSE ├── NAMESPACE ├── NEWS.md ├── R ├── auxiliary_kalman.R ├── auxiliary_tools.R ├── fit.R ├── intradayModel-package.R ├── plot.R ├── use_model.R ├── volume_aapl.R └── volume_fdx.R ├── README.Rmd ├── README.html ├── README.md ├── R_buildignore ├── AAPL_volume_xts.rda ├── GE_volume.rda ├── GE_volume_xts.rda ├── data_generate.R ├── dataset_SP100 ├── developer_commands.R └── modelSpec_format.rda ├── cran-comments.md ├── data ├── volume_aapl.rda └── volume_fdx.rda ├── inst └── CITATION ├── intradayModeling.Rproj ├── man ├── decompose_volume.Rd ├── figures │ ├── README-unnamed-chunk-7-1.png │ └── README-unnamed-chunk-8-1.png ├── fit_volume.Rd ├── forecast_volume.Rd ├── generate_plots.Rd ├── intradayModel-package.Rd ├── volume_aapl.Rd └── volume_fdx.Rd ├── tests ├── testthat.R └── testthat │ ├── fixtures │ ├── ACN_expected_par │ ├── ACN_expected_pred │ ├── ACN_volume │ ├── ADBE_expected_par │ ├── ADBE_expected_pred │ ├── ADBE_volume │ ├── CVS_expected_par │ ├── CVS_expected_pred │ ├── CVS_volume │ ├── GE_expected_par │ └── GE_expected_pred │ ├── test-auxiliary.R │ ├── test-message.R │ ├── test-uniModelFit.R │ └── test-uniModelUse.R └── vignettes ├── .gitignore ├── apalike.csl ├── intradayModel.Rmd ├── intradayModel.html └── reference.bib /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^.*\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^R_buildignore$ 4 | ^cran-comments\.md$ 5 | .github 6 | ^README\.Rmd$ 7 | ^README\.html$ 8 | ^README_cache$ 9 | ^README_files$ 10 | ^doc$ 11 | ^Meta$ 12 | ^LICENSE$ 13 | ^vignettes/intradayModel_cache$ 14 | ^vignettes/intradayModel_files$ -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | vignettes/** linguist-vendored -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | .Ruserdata 5 | .DS_Store 6 | /doc/ 7 | /Meta/ 8 | *.pdf 9 | intradayModel.Rcheck 10 | *.tar.gz -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: intradayModel 2 | Title: Modeling and Forecasting Financial Intraday Signals 3 | Version: 0.0.1.9000 4 | Date: 2023-05-20 5 | Description: Models, analyzes, and forecasts financial intraday signals. This package 6 | currently supports a univariate state-space model for intraday trading volume provided 7 | by Chen (2016) . 8 | Authors@R: c( 9 | person("Shengjie", "Xiu", role = "aut", email = "sxiu@connect.ust.hk"), 10 | person("Yifan", "Yu", role = "aut", email = "yyuco@connect.ust.hk"), 11 | person(c("Daniel", "P."), "Palomar", role = c("cre", "aut", "cph"), email = "daniel.p.palomar@gmail.com") 12 | ) 13 | Maintainer: Daniel P. 
Palomar 14 | URL: https://github.com/convexfi/intradayModel, 15 | https://www.danielppalomar.com, 16 | https://dx.doi.org/10.2139/ssrn.3101695 17 | BugReports: https://github.com/convexfi/intradayModel/issues 18 | License: Apache License (== 2.0) 19 | Encoding: UTF-8 20 | RoxygenNote: 7.2.3 21 | Depends: R (>= 2.10) 22 | Imports: 23 | ggplot2, 24 | magrittr, 25 | patchwork, 26 | reshape2, 27 | scales, 28 | xts, 29 | zoo, 30 | utils 31 | Suggests: 32 | knitr, 33 | rmarkdown, 34 | R.rsp, 35 | testthat (>= 3.0.0), 36 | cleanrmd, 37 | devtools 38 | VignetteBuilder: 39 | knitr, 40 | rmarkdown, 41 | R.rsp 42 | LazyData: true 43 | Config/testthat/edition: 3 44 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. 
For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright 2023 Shengjie Xiu, Yifan Yu, Daniel P. Palomar 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 
202 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | export(decompose_volume) 4 | export(fit_volume) 5 | export(forecast_volume) 6 | export(generate_plots) 7 | import(ggplot2) 8 | import(patchwork) 9 | import(xts) 10 | importFrom(magrittr,"%>%") 11 | importFrom(scales,math_format) 12 | importFrom(scales,trans_breaks) 13 | importFrom(scales,trans_format) 14 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | ## Changes in intradayModel version 0.0.1 (2023-05-19) 2 | 3 | * Initial bare-bone implementation for intraday volume (Chen 2016). 4 | -------------------------------------------------------------------------------- /R/auxiliary_kalman.R: -------------------------------------------------------------------------------- 1 | # HEADER -------------------------------------------- 2 | # UNIvariate State-Space (UNISS) model 3 | # The engine of Kalman filter & smoother & em updates. 4 | # 5 | # Its object is names "uniss_obj", it stores all the required information 6 | # in the state-space world. 7 | # The user's "volume_model" is for user interface, and its detailed information 8 | # is stored in "uniss_obj". 9 | # -------------------------------------------------- 10 | 11 | # define the UNISS model 12 | # prepare all required information in log form 13 | specify_uniss <- function(...) { 14 | # read input information 15 | args <- list(...) 16 | data <- args$data # log intraday signal 17 | volume_model <- args$volume_model 18 | 19 | data.reform <- unlist(as.list(data)) 20 | n_bin <- nrow(data) 21 | n_day <- ncol(data) 22 | n_bin_total <- n_bin * n_day 23 | 24 | # define uniss_obj list 25 | uniss_obj <- list() 26 | 27 | ## uniss parameters 28 | uniss_obj$par <- list() 29 | all.pars.name <- c("a_eta", "a_mu", "var_eta", "var_mu", "r", "phi", "x0", "V0") 30 | init.default <- list( 31 | "x0" = c(mean(data.reform), 0), 32 | "a_eta" = 1, "a_mu" = 0, 33 | "r" = 1e-4, 34 | "var_eta" = 1e-4, "var_mu" = 1e-4, 35 | "V0" = c(1e-3, 1e-7, 1e-5), 36 | "phi" = c(rowMeans(matrix(data.reform, nrow = n_bin)) - mean(data.reform)) 37 | ) 38 | for (name in all.pars.name) { 39 | ### specify EM initial values 40 | if (!volume_model$converged[[name]]) { 41 | if (name %in% names(volume_model$init)) { 42 | uniss_obj$par[[name]] <- volume_model$init[[name]] 43 | } else { 44 | uniss_obj$par[[name]] <- init.default[[name]] 45 | } 46 | } else { ### set to fixed values 47 | uniss_obj$par[[name]] <- volume_model$par[[name]] 48 | } 49 | } 50 | 51 | ## uniss other properties 52 | uniss_obj$y <- data.reform 53 | uniss_obj$n_bin <- n_bin 54 | uniss_obj$n_day <- n_day 55 | uniss_obj$n_bin_total <- n_bin_total 56 | uniss_obj$converged <- volume_model$converged 57 | 58 | return(uniss_obj) 59 | } 60 | 61 | 62 | # one-time Kalman filter, smoother, and em update on uniss 63 | # type = c("filter", "smoother", "em_update") 64 | uniss_kalman <- function(uniss_obj, type = "em_update") { 65 | result <- list() 66 | 67 | # filter ----------------------------------- 68 | 69 | ## declare containers for filter results 70 | xtt1 <- matrix(NA, 2, uniss_obj$n_bin_total) 71 | Vtt1 <- array(NA, dim = c(2, 2, uniss_obj$n_bin_total)) 72 | xtt <- matrix(NA, 2, uniss_obj$n_bin_total) 73 | Vtt <- array(NA, dim = c(2, 2, uniss_obj$n_bin_total)) 74 | Kt 
<- matrix(NA, 2, uniss_obj$n_bin_total) 75 | 76 | ## fixed-variables to use 77 | A_jump <- matrix(c(uniss_obj$par$a_eta, 0, 0, uniss_obj$par$a_mu), 2, 2) 78 | A_intra <- matrix(c(1, 0, 0, uniss_obj$par$a_mu), 2, 2) 79 | Q_jump <- matrix(c(uniss_obj$par$var_eta, 0, 0, uniss_obj$par$var_mu), 2, 2) 80 | Q_intra <- matrix(c(0, 0, 0, uniss_obj$par$var_mu), 2, 2) 81 | r <- uniss_obj$par$r 82 | phi <- rep(uniss_obj$par$phi, uniss_obj$n_day) 83 | C <- matrix(1, nrow = 1, ncol = 2) 84 | 85 | ## initialize according to MARSS tinitx = 1 rule 86 | xtt1[, 1] <- uniss_obj$par$x0 87 | Vtt1[, , 1] <- matrix(c( 88 | uniss_obj$par$V0[1], uniss_obj$par$V0[2], 89 | uniss_obj$par$V0[2], uniss_obj$par$V0[3] 90 | ), 2) 91 | Kt[, 1] <- Vtt1[, , 1] %*% t(C) %*% solve(C %*% Vtt1[, , 1] %*% t(C) + r) 92 | xtt[, 1] <- xtt1[, 1] + Kt[, 1] %*% (uniss_obj$y[1] - phi[1] - C %*% xtt1[, 1]) 93 | Vtt[, , 1] <- Vtt1[, , 1] - Kt[, 1] %*% C %*% Vtt1[, , 1] 94 | 95 | ## Kalman filter recursion 96 | for (i in 1:(uniss_obj$n_bin_total - 1)) { 97 | ### prediction 98 | if ((i %% uniss_obj$n_bin) == 0) { 99 | xtt1[, i + 1] <- A_jump %*% xtt[, i] 100 | Vtt1[, , i + 1] <- A_jump %*% Vtt[, , i] %*% t(A_jump) + Q_jump 101 | } else { 102 | xtt1[, i + 1] <- A_intra %*% xtt[, i] 103 | Vtt1[, , i + 1] <- A_intra %*% Vtt[, , i] %*% t(A_intra) + Q_intra 104 | } 105 | 106 | ### kalman gain 107 | Kt[, i + 1] <- Vtt1[, , i + 1] %*% t(C) %*% solve(C %*% Vtt1[, , i + 1] %*% t(C) + r) 108 | 109 | ### measurement update 110 | xtt[, i + 1] <- xtt1[, i + 1] + Kt[, i + 1] %*% 111 | (uniss_obj$y[i + 1] - phi[i + 1] - C %*% xtt1[, i + 1]) 112 | Vtt[, , i + 1] <- Vtt1[, , i + 1] - Kt[, i + 1] %*% C %*% Vtt1[, , i + 1] 113 | } 114 | 115 | result$xtt1 <- xtt1 116 | result$Vtt1 <- Vtt1 117 | result$Kt <- Kt 118 | result$xtt <- xtt 119 | result$Vtt <- Vtt 120 | if (type == "filter") { 121 | return(result) 122 | } 123 | 124 | # smoother ----------------------------------- 125 | 126 | ## declare containers for smoother results 127 | xtT <- matrix(NA, 2, uniss_obj$n_bin_total) 128 | VtT <- array(NA, dim = c(2, 2, uniss_obj$n_bin_total)) 129 | Lt <- array(NA, dim = c(2, 2, (uniss_obj$n_bin_total - 1))) 130 | 131 | ## initialize 132 | xtT[, uniss_obj$n_bin_total] <- xtt[, uniss_obj$n_bin_total] 133 | VtT[, , uniss_obj$n_bin_total] <- Vtt[, , uniss_obj$n_bin_total] 134 | 135 | # Kalman smoother recursion 136 | for (i in (uniss_obj$n_bin_total - 1):1) { 137 | if ((i %% uniss_obj$n_bin) == 0) { 138 | Lt[, , i] <- Vtt[, , i] %*% t(A_jump) %*% solve(Vtt1[, , (i + 1)]) 139 | } else { 140 | Lt[, , i] <- Vtt[, , i] %*% t(A_intra) %*% solve(Vtt1[, , (i + 1)]) 141 | } 142 | xtT[, i] <- xtt[, i] + Lt[, , i] %*% (xtT[, i + 1] - xtt1[, i + 1]) 143 | VtT[, , i] <- Vtt[, , i] + Lt[, , i] %*% (VtT[, , i + 1] - Vtt1[, , i + 1]) %*% t(Lt[, , i]) 144 | } 145 | 146 | x0T <- xtT[, 1] 147 | V0T <- VtT[, , 1] 148 | 149 | result$x0T <- x0T 150 | result$V0T <- V0T 151 | result$xtT <- xtT 152 | result$VtT <- VtT 153 | result$Lt <- Lt 154 | if (type == "smoother") { 155 | return(result) 156 | } 157 | 158 | # em update ----------------------------------- 159 | 160 | all.pars.name <- c("a_eta", "a_mu", "var_eta", "var_mu", "r", "phi", "x0", "V0") 161 | unfitted_pars <- names(uniss_obj$converged[uniss_obj$converged == FALSE]) 162 | 163 | Pt <- Ptt1 <- array(NA, c(2, 2, uniss_obj$n_bin_total)) 164 | for (n in 1:uniss_obj$n_bin_total) { 165 | Pt[, , n] <- VtT[, , n] + xtT[, n] %*% t(xtT[, n]) 166 | } 167 | for (n in 2:uniss_obj$n_bin_total) { 168 | Ptt1[, , n] <- VtT[, , n] %*% t(Lt[, , n - 1]) 
+ xtT[, n] %*% t(xtT[, n - 1]) 169 | } 170 | jump_interval <- seq(uniss_obj$n_bin + 1, uniss_obj$n_bin_total, uniss_obj$n_bin) 171 | 172 | new_par <- uniss_obj$par 173 | for (name in unfitted_pars) { 174 | switch(name, 175 | "x0" = { 176 | new_par$x0 <- x0T 177 | }, 178 | "V0" = { 179 | new_par$V0[1] <- V0T[1, 1] 180 | new_par$V0[2] <- V0T[2, 1] 181 | new_par$V0[3] <- V0T[2, 2] 182 | }, 183 | "phi" = { 184 | new_par$phi <- rowMeans(matrix(uniss_obj$y - C %*% xtT, nrow = uniss_obj$n_bin)) 185 | new_par$phi <- new_par$phi - mean(new_par$phi) 186 | }, 187 | "r" = { 188 | if ("phi" %in% unfitted_pars) { 189 | phi_matrix <- rep(matrix(new_par$phi, nrow = 1), uniss_obj$n_day) 190 | } else { 191 | phi_matrix <- unlist(uniss_obj$par$phi) 192 | } ## need input 193 | new_par$r <- mean(uniss_obj$y^2 + apply(Pt, 3, function(p) C %*% p %*% t(C)) - 194 | 2 * uniss_obj$y * as.numeric(C %*% xtT) + 195 | phi_matrix^2 - 196 | 2 * uniss_obj$y * phi_matrix + 197 | 2 * phi_matrix * as.numeric(C %*% xtT)) 198 | }, 199 | "a_eta" = { 200 | new_par$a_eta <- sum(Ptt1[1, 1, jump_interval]) / 201 | sum(Pt[1, 1, jump_interval - 1]) 202 | }, 203 | "a_mu" = { 204 | new_par$a_mu <- sum(Ptt1[2, 2, 2:uniss_obj$n_bin_total]) / 205 | sum(Pt[2, 2, 1:(uniss_obj$n_bin_total - 1)]) 206 | }, 207 | "var_eta" = { 208 | if ("a_eta" %in% unfitted_pars) { 209 | curr_a_eta <- new_par$a_eta 210 | } else { 211 | curr_a_eta <- uniss_obj$par$a_eta 212 | } ## need input 213 | new_par$var_eta <- mean(Pt[1, 1, jump_interval] + 214 | curr_a_eta^2 * Pt[1, 1, jump_interval - 1] - 215 | 2 * curr_a_eta * Ptt1[1, 1, jump_interval]) 216 | }, 217 | "var_mu" = { 218 | if ("a_mu" %in% unfitted_pars) { 219 | curr_a_mu <- new_par$a_mu 220 | } else { 221 | curr_a_mu <- uniss_obj$par$a_mu 222 | } ## need input 223 | new_par$var_mu <- mean(Pt[2, 2, 2:uniss_obj$n_bin_total] + 224 | curr_a_mu^2 * Pt[2, 2, 1:(uniss_obj$n_bin_total - 1)] - 225 | 2 * curr_a_mu * Ptt1[2, 2, 2:uniss_obj$n_bin_total]) 226 | } 227 | ) 228 | } 229 | 230 | result$new_par <- new_par 231 | return(result) 232 | } 233 | -------------------------------------------------------------------------------- /R/auxiliary_tools.R: -------------------------------------------------------------------------------- 1 | # This is a library of auxiliary functions, which might be useful for other exported functions of this package. 
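# --- Illustrative aside (not package source; a minimal sketch with made-up
# numbers): one predict/update step of the Kalman filter recursion in
# uniss_kalman() above.  Within a trading day the recursion uses A_intra and
# Q_intra (daily component frozen); at a day boundary (i %% n_bin == 0) it
# would switch to A_jump and Q_jump.  All parameter values below are
# hypothetical and chosen only for illustration.
toy_par <- list(a_mu = 0.5, var_mu = 1e-4, r = 1e-4)
A_intra <- matrix(c(1, 0, 0, toy_par$a_mu), 2, 2)
Q_intra <- matrix(c(0, 0, 0, toy_par$var_mu), 2, 2)
C <- matrix(1, nrow = 1, ncol = 2)
x_filt_t <- c(15, 0)                    # filtered state x_{t|t}: (log daily, log dynamic)
V_filt_t <- diag(c(1e-3, 1e-5))         # filtered covariance V_{t|t}
y_next   <- 15.2                        # observed log volume at bin t+1
phi_next <- 0.1                         # seasonal term phi for bin t+1
x_pred <- A_intra %*% x_filt_t                                          # prediction
V_pred <- A_intra %*% V_filt_t %*% t(A_intra) + Q_intra
K      <- V_pred %*% t(C) %*% solve(C %*% V_pred %*% t(C) + toy_par$r)  # Kalman gain
x_filt <- x_pred + K %*% (y_next - phi_next - C %*% x_pred)             # measurement update
V_filt <- V_pred - K %*% C %*% V_pred
# --------------------------------------------------------------------------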
2 | # They should be invisible to package users 3 | 4 | # Define a Univariate State-Space Model 5 | spec_volume_model <- function(fixed_pars = NULL, init_pars = NULL) { 6 | volume_model <- list() 7 | class(volume_model) <- "volume_model" 8 | 9 | # error control 10 | if (!is.null(init_pars) && !is.list(init_pars)) stop("init_pars must be a list.") 11 | if (!is.null(fixed_pars) && !is.list(fixed_pars)) stop("fixed_pars must be a list.") 12 | 13 | # volume_model class properties 14 | all_pars_name <- c("a_eta", "a_mu", "var_eta", "var_mu", "r", "phi", "x0", "V0") 15 | volume_model$par$"a_eta" <- NA 16 | volume_model$par$"a_mu" <- NA 17 | volume_model$par$"var_eta" <- NA 18 | volume_model$par$"var_mu" <- NA 19 | volume_model$par$"r" <- NA 20 | volume_model$par$"phi" <- NA 21 | volume_model$par$"x0" <- rep(NA, 2) 22 | volume_model$par$"V0" <- rep(NA, 3) 23 | volume_model$init <- list() 24 | 25 | # read in input parameters 26 | fixed_clean_result <- clean_pars_list(fixed_pars) 27 | fixed_pars <- fixed_clean_result$input_list 28 | 29 | unecessary_init <- intersect(names(init_pars), names(fixed_pars)) 30 | init_pars <- init_pars[setdiff(names(init_pars), names(fixed_pars))] 31 | init_clean_result <- clean_pars_list(init_pars) 32 | init_pars <- init_clean_result$input_list 33 | 34 | # generate warning message 35 | msg <- c() 36 | if (length(fixed_clean_result$msg) > 0) { 37 | msg <- append(msg, "Warnings in fixed_pars:\n") 38 | # cat("Warnings in fixed_pars:\n") 39 | for (m in fixed_clean_result$msg) { 40 | msg <- append(msg, paste(" ", m, "\n", sep = "")) 41 | # cat(" ", m, "\n", sep = "") 42 | } 43 | } 44 | if (length(init_clean_result$msg) > 0 | length(unecessary_init) > 0) { 45 | msg <- append(msg, "Warnings in init_pars:\n") 46 | # cat("Warnings in init_pars:\n") 47 | if (!is.null(init_clean_result$msg)) { 48 | for (m in init_clean_result$msg) { 49 | msg <- append(msg, paste(" ", m, "\n", sep = "")) 50 | # cat(" ", m, "\n", sep = "") 51 | } 52 | } 53 | if (length(unecessary_init) > 0) { 54 | msg <- append(msg, paste(" Elements ", paste(unecessary_init, collapse = ", "), 55 | " have already been fixed.\n", 56 | sep = "" 57 | )) 58 | # cat(" Elements ", paste(unecessary_init, collapse = ", "), 59 | # " have already been fixed.\n", sep = "") 60 | } 61 | } 62 | if (length(msg) > 0) { 63 | warning(msg) 64 | } 65 | # if (length(fixed_clean_result$msg) > 0) { 66 | # cat("Warnings in fixed_pars:\n") 67 | # for (m in fixed_clean_result$msg) { 68 | # cat(" ", m, "\n", sep = "") 69 | # } 70 | # } 71 | # if (length(init_clean_result$msg) > 0 | length(unecessary_init) > 0) { 72 | # cat("Warnings in init_pars:\n") 73 | # if (!is.null(init_clean_result$msg)) { 74 | # for (m in init_clean_result$msg) { 75 | # cat(" ", m, "\n", sep = "") 76 | # } 77 | # } 78 | # if (length(unecessary_init) > 0) { 79 | # cat(" Elements ", paste(unecessary_init, collapse = ", "), 80 | # " have already been fixed.\n", sep = "") 81 | # } 82 | # } 83 | 84 | # store inputs in univariate model object 85 | for (name in all_pars_name) { 86 | if (name %in% names(fixed_pars)) { 87 | volume_model$par[[name]] <- fixed_pars[[name]] 88 | } else if (name %in% names(init_pars)) { 89 | volume_model$init[[name]] <- init_pars[[name]] 90 | } 91 | } 92 | 93 | volume_model$converged <- list() 94 | for (name in all_pars_name) { 95 | if (anyNA(volume_model$par[[name]])) { 96 | volume_model$converged[[name]] <- FALSE 97 | } else { 98 | volume_model$converged[[name]] <- TRUE 99 | } 100 | } 101 | 102 | return(volume_model) 103 | } 104 | 105 | # Remove 
anyday containing NA/missing bins 106 | # unify the data to matrix 107 | clean_data <- function(data) { 108 | if (is.xts(data)) { 109 | data <- intraday_xts_to_matrix(data) 110 | } else { 111 | index_NA_bin <- colnames(data)[apply(data, 2, anyNA)] 112 | data <- data[, !apply(data, 2, anyNA)] 113 | if (length(index_NA_bin) > 0) { 114 | msg <- paste("For input matrix:\n", " Remove trading days with missing bins: ", toString(index_NA_bin), ".\n", sep = "") 115 | warning(msg) 116 | # cat("Warning in input matrix:\n") 117 | # cat(" Remove trading days with missing bins: ", format(index_NA_bin), "\n") 118 | } 119 | } 120 | return(data) 121 | } 122 | 123 | # Process xts data 124 | # remove any day containing missing bins/NA 125 | # convert the data to matrix 126 | intraday_xts_to_matrix <- function(data_xts) { 127 | contain_NA <- xts::apply.daily(data_xts, function(x) as.integer(any(is.na(x)))) 128 | index_no_NA_bin <- zoo::index(to.daily(contain_NA[contain_NA == 0])) 129 | data_xts <- data_xts[format(index_no_NA_bin, format = "%Y-%m-%d")] 130 | 131 | bins_count <- apply.daily(data_xts, nrow) 132 | n_bin <- max(bins_count) 133 | index_full_bin <- zoo::index(to.daily(bins_count[bins_count == n_bin])) 134 | 135 | data_xts <- data_xts[format(index_no_NA_bin, format = "%Y-%m-%d")] 136 | data_xts <- data_xts[format(index_full_bin, format = "%Y-%m-%d")] 137 | data.mat <- matrix(data_xts, nrow = n_bin) 138 | 139 | index_NA_bin <- c() 140 | if (sum(contain_NA != 0) > 0) { 141 | index_NA_bin <- zoo::index(to.daily(contain_NA[contain_NA != 0])) 142 | } 143 | index_notfull_bin <- c() 144 | if (sum(bins_count != n_bin) > 0) { 145 | index_notfull_bin <- zoo::index(to.daily(bins_count[bins_count != n_bin])) 146 | } 147 | wrong_index <- c(index_NA_bin, index_notfull_bin) 148 | if (length(wrong_index) > 0) { 149 | msg <- paste("For input xts:\n", " Remove trading days with missing bins: ", toString(sort(format(wrong_index))), ".\n", sep = "") 150 | warning(msg) 151 | # cat("Warning in input xts:\n") 152 | # cat(" Remove trading days with missing bins: ", sort(format(wrong_index)), "\n") 153 | } 154 | 155 | return(data.mat) 156 | } 157 | 158 | 159 | # clean the univolume_model()'s input args (init_pars/fixed_pars) 160 | # remove any variable containing NA/inf/non-numeric 161 | # remove any variable that won't appear in model 162 | # flatten the variable if user input a high dimension one 163 | clean_pars_list <- function(input_list) { 164 | all_pars_name <- c("a_eta", "a_mu", "var_eta", "var_mu", "r", "x0", "V0", "phi") 165 | expected_pars_len <- list( 166 | "a_eta" = 1, "a_mu" = 1, 167 | "var_eta" = 1, "var_mu" = 1, "r" = 1, 168 | "x0" = 2, "V0" = 4 169 | ) 170 | 171 | invalid_param <- c() 172 | incorrect_param <- c() 173 | msg <- NULL 174 | 175 | # check if parameters are valid 176 | for (name in names(input_list)) { 177 | if (!(name %in% all_pars_name)) { 178 | input_list[[name]] <- NULL 179 | invalid_param <- c(invalid_param, name) 180 | next 181 | } 182 | input_list[[name]] <- unlist(as.list(input_list[[name]])) 183 | 184 | if (mode(input_list[[name]]) != "numeric" || 185 | any(is.na(input_list[[name]])) || 186 | any(is.infinite(input_list[[name]]))) { 187 | input_list[[name]] <- NULL 188 | incorrect_param <- c(incorrect_param, name) 189 | next 190 | } 191 | 192 | if (name == "phi") next 193 | 194 | if (expected_pars_len[[name]] != length(input_list[[name]])) { 195 | input_list[[name]] <- NULL 196 | incorrect_param <- c(incorrect_param, name) 197 | next 198 | } 199 | 200 | switch(name, 201 | "V0" = { 202 | 
V_matrix <- matrix(input_list[["V0"]], 2) 203 | eigen_value <- eigen(matrix(input_list[["V0"]], 2), only.values = TRUE)$values >= 0 204 | if (!isSymmetric(V_matrix) || !eigen_value[1] || !eigen_value[2]) { 205 | input_list[[name]] <- NULL 206 | incorrect_param <- c(incorrect_param, name) 207 | } else { 208 | input_list[["V0"]] <- input_list[["V0"]][c(1, 2, 4)] 209 | } 210 | }, 211 | "r" = { 212 | if (input_list[["r"]] < 0) { 213 | input_list[["r"]] <- NULL 214 | incorrect_param <- c(incorrect_param, "r") 215 | } 216 | } 217 | ) 218 | } 219 | 220 | if (length(invalid_param) > 0) { 221 | msg <- c(msg, paste("Elements ", paste(invalid_param, collapse = ", "), 222 | " are not allowed in parameter list.", 223 | sep = "" 224 | )) 225 | } 226 | if (length(incorrect_param) > 0) { 227 | msg <- c(msg, paste("Elements ", paste(unique(incorrect_param), collapse = ", "), 228 | " are invalid (check number/dimension/PSD).", 229 | sep = "" 230 | )) 231 | } 232 | clean_result <- list( 233 | "input_list" = input_list, 234 | "msg" = msg 235 | ) 236 | 237 | return(clean_result) 238 | } 239 | 240 | # part of error check for init_pars/fixed_pars 241 | check_pars_list <- function(volume_model, n_bin = NULL) { 242 | converged_list <- volume_model$converged 243 | par_list <- volume_model$par 244 | init_list <- volume_model$init 245 | 246 | all_par_list <- c("a_eta", "a_mu", "var_eta", "var_mu", "r", "x0", "V0") 247 | scalar_par_list <- c("a_eta", "a_mu", "var_eta", "var_mu", "r") 248 | len_expect <- list("a_eta" = 1L, "a_mu" = 1L, "var_eta" = 1L, "var_mu" = 1L, "r" = 1L, "x0" = 2L, "V0" = 3L) 249 | if (!identical(n_bin, NULL)) { 250 | all_par_list <- append(all_par_list, "phi") 251 | len_expect <- append(len_expect, list("phi" = as.integer(n_bin))) 252 | } 253 | unfixed <- intersect(names(converged_list[converged_list == FALSE]), all_par_list) 254 | fixed <- intersect(names(converged_list[converged_list == TRUE]), all_par_list) 255 | 256 | 257 | msg <- NULL 258 | for (name in fixed) { 259 | if (mode(par_list[[name]]) != "numeric" || any(is.na(par_list[[name]])) || any(is.infinite(par_list[[name]]))) { 260 | msg <- c(msg, paste(name, "must be numeric, have no NAs, and no Infs.\n")) 261 | } 262 | if (name %in% scalar_par_list && !identical(len_expect[[name]], length(par_list[[name]]))) { 263 | msg <- c(msg, paste("Length of volume_model$par$", name, " is wrong.\n", sep = "")) 264 | } 265 | } 266 | for (name in unfixed) { 267 | if (!all(is.na(par_list[[name]]))) { 268 | msg <- c(msg, paste("volume_model$par$", name, " and volume_model$converged$", name, " are conflicted.\n", sep = "")) 269 | } 270 | } 271 | 272 | 273 | for (name in fixed) { 274 | if (name %in% names(init_list)) { 275 | msg <- c(msg, paste(name, "is fixed. 
No need for init.\n")) 276 | } 277 | } 278 | unfixed_init <- intersect(unfixed, names(init_list)) 279 | for (name in unfixed_init) { 280 | if (mode(init_list[[name]]) != "numeric" || any(is.na(init_list[[name]])) || any(is.infinite(init_list[[name]]))) { 281 | msg <- c(msg, paste(name, "must be numeric, have no NAs, and no Infs.\n")) 282 | } 283 | if (name %in% scalar_par_list && !identical(len_expect[[name]], length(init_list[[name]]))) { 284 | msg <- c(msg, paste("Lenght of volume_model$init$", name, " is wrong.\n", sep = "")) 285 | } 286 | } 287 | return(msg) 288 | } 289 | 290 | # check whether the volume_model is correct 291 | is_volume_model <- function(volume_model, n_bin = NULL) { 292 | ## Check for required components 293 | el <- c("converged", "par", "init") 294 | # if some components are missing from the volume_model, rest of the tests won't work so stop now 295 | if (!all(el %in% names(volume_model))) { 296 | stop("Elements ", paste(el[!(el %in% names(volume_model))], collapse = ", "), " are missing from the model.\n") 297 | } 298 | 299 | # if some args are missing from the volume_model's components, the code will stop when all missing parts are found. 300 | msg <- NULL 301 | all_pars_name <- c("a_eta", "a_mu", "var_eta", "var_mu", "r", "phi", "x0", "V0") 302 | if (!all(all_pars_name %in% names(volume_model$par))) { 303 | msg <- c(msg, "Elements ", paste(all_pars_name[!(all_pars_name %in% names(volume_model$par))], collapse = ", "), " are missing from volume_model$par.\n") 304 | } 305 | if (!all(all_pars_name %in% names(volume_model$converged))) { 306 | msg <- c(msg, "Elements ", paste(all_pars_name[!(all_pars_name %in% names(volume_model$converged))], collapse = ", "), " are missing from volume_model$converged.\n") 307 | } 308 | if (!is.null(msg)) { # rest of the tests won't work so stop now 309 | stop(msg) 310 | } 311 | 312 | # check converged 313 | logical_check <- lapply(volume_model$converged, function(f) isTRUE(f) | identical(f, FALSE)) 314 | if (any(logical_check == FALSE)) { 315 | msg <- c("Elements in volume_model$converged must be TRUE/FALSE.\n") 316 | stop(msg) 317 | } 318 | 319 | # Check no NA inf and dimension 320 | msg <- check_pars_list(volume_model, n_bin) 321 | if (!is.null(msg)) { 322 | stop(msg) 323 | } 324 | } 325 | 326 | fetch_par_log <- function(par_log, index) { 327 | par_list <- list() 328 | for (i in 1:length(par_log)) { 329 | par_list <- append(par_list, par_log[[i]][[index]]) 330 | } 331 | return(do.call(cbind, par_list)) 332 | } 333 | 334 | calculate_mape <- function(referenced_data, predicted_data) { 335 | referenced_data <- as.vector(referenced_data) 336 | predicted_data <- as.vector(predicted_data) 337 | return(mean(abs(predicted_data - referenced_data) / referenced_data)) 338 | } 339 | 340 | calculate_mae <- function(referenced_data, predicted_data) { 341 | referenced_data <- as.vector(referenced_data) 342 | predicted_data <- as.vector(predicted_data) 343 | return(mean(abs(predicted_data - referenced_data))) 344 | } 345 | 346 | calculate_rmse <- function(referenced_data, predicted_data) { 347 | referenced_data <- as.vector(referenced_data) 348 | predicted_data <- as.vector(predicted_data) 349 | return(sqrt(mean((predicted_data - referenced_data)^2))) 350 | } 351 | -------------------------------------------------------------------------------- /R/fit.R: -------------------------------------------------------------------------------- 1 | #' @title Fit a Univariate State-Space Model on Intraday Trading Volume 2 | #' 3 | #' @description The main 
function for defining and fitting a univaraite state-space model on intraday trading volume. The model is proposed in (Chen et al., 2016) as 4 | #' \deqn{\mathbf{x}_{\tau+1} = \mathbf{A}_{\tau}\mathbf{x}_{\tau} + \mathbf{w}_{\tau},}{x(\tau+1) = A(\tau) x(\tau) + w(\tau),} 5 | #' \deqn{y_{\tau} = \mathbf{C}\mathbf{x}_{\tau} + \phi_{\tau} + v_\tau,}{y(\tau) = C x(\tau) + \phi(\tau) + v(\tau),} 6 | #' where 7 | #' \itemize{\item{\eqn{\mathbf{x}_{\tau} = [\eta_{\tau}, \mu_{\tau}]^\top}{x(\tau) = [\eta(\tau); \mu(\tau)]} is the hidden state vector containing the log daily component and the log intraday dynamic component;} 8 | #' \item{\eqn{\mathbf{A}_{\tau} = \left[\begin{array}{cc}a_{\tau}^{\eta}&0\\0&a^{\mu}\end{array} \right]}{A(\tau) = [a.\eta(\tau), 0; 0, a.\mu]} 9 | #' is the state transition matrix with \eqn{a_{\tau}^{\eta} = \left\{\begin{array}{cl}a^{\eta} & t = kI, k = 1,2,\dots\\0 & \textrm{otherwise};\end{array}\right.}{a.\eta(\tau) = a.\eta, when \tau = kI, k = 1, 2, ... , and zero otherwise;}} 10 | #' \item{\eqn{\mathbf{C} = [1, 1]}{C = [1, 1]} is the observation matrix;} 11 | #' \item{\eqn{\phi_{\tau}}{\phi(\tau)} is the corresponding element from \eqn{\boldsymbol{\phi} = [\phi_1,\dots, \phi_I]^\top}{\phi = [\phi(1); ... ; \phi(I)]}, which is the log seasonal component;} 12 | #' \item{\eqn{\mathbf{w}_{\tau} = [\epsilon_{\tau}^{\eta},\epsilon_{\tau}^{\mu}]^\top \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}_{\tau})}{w(\tau) = [\epsilon.\eta(\tau); \epsilon.\mu(\tau)] ~ N(0, Q(\tau))} 13 | #' represents the i.i.d. Gaussian noise in the state transition, with a time-varying covariance matrix 14 | #' \eqn{\mathbf{Q}_{\tau} = \left[\begin{array}{cc}(\sigma_{\tau}^{\eta})^2&0\\ 0&(\sigma^{\mu})^2\end{array} \right]}{Q(\tau) = [(\sigma.\eta(\tau))^2, 0; 0, (\sigma.\mu)^2]} 15 | #' and \eqn{\sigma_\tau^{\eta} = \left\{\begin{array}{cl} \sigma^{\eta} & t = kI, k = 1,2,\dots\\0 & \textrm{otherwise};\end{array}\right.}{\sigma.\eta(\tau) = \sigma.\eta, when \tau = kI, k = 1, 2, ... , and zero otherwise;}} 16 | #' \item{\eqn{v_\tau \sim \mathcal{N}(0, r)}{v(\tau) ~ N(0, r)} is the i.i.d. Gaussian noise in the observation;} 17 | #' \item{\eqn{\mathbf{x}_1}{x(1)} is the initial state at \eqn{\tau = 1}{\tau = 1}, and it follows \eqn{\mathcal{N}(\mathbf{x}_0, \mathbf{V}_0)}{N(x(0), V(0))}}.} 18 | #' In the model, \eqn{\boldsymbol{\Theta} = \left\{a^{\eta},a^{\mu},\sigma^{\eta},\sigma^{\mu},r,\boldsymbol{\phi}, \mathbf{x}_0, \mathbf{V}_0\right\}}{\Theta = {a.\eta, a.\mu, (\sigma.\eta)^2, (\sigma.\mu)^2, r, \phi, x(0), V(0)}} 19 | #' are treated as parameters. 20 | #' The model is fitted by expectation-maximization (EM) algorithms. The implementation follows (Chen et al., 2016), and the accelerated scheme is provided in (Varadhan and Roland, 2008). 21 | #' The algorithm terminates when \code{maxit} is reached or the condition \eqn{\|\Delta \boldsymbol{\Theta}_i\| \le \textrm{abstol}}{||\Delta \Theta(i)|| <= abstol} is satisfied. 22 | #' 23 | #' @author Shengjie Xiu, Yifan Yu and Daniel P. Palomar 24 | #' 25 | #' @param data An n_bin * n_day matrix or an \code{xts} object storing intraday trading volume. 26 | #' @param fixed_pars A list of parameters' fixed values. 
The allowed parameters are listed below, 27 | #' \itemize{\item{\code{"a_eta"}: \eqn{a^{\eta}}{a.\eta}} of size 1 ; 28 | #' \item{\code{"a_mu"}: \eqn{a^{\mu}}{a.\mu}} of size 1 ; 29 | #' \item{\code{"var_eta"}: \eqn{\sigma^{\eta}}{(\sigma.\eta)^2}} of size 1 ; 30 | #' \item{\code{"var_mu"}: \eqn{\sigma^{\mu}}{(\sigma.\mu)^2}} of size 1 ; 31 | #' \item{\code{"r"}: \eqn{r}{r} of size 1 ;} 32 | #' \item{\code{"phi"}: \eqn{\boldsymbol{\phi} = [\phi_1,\dots, \phi_I]^\top}{\phi = [\phi(1); ... ; \phi(I)]} of size \eqn{I} ;} 33 | #' \item{\code{"x0"}: \eqn{\mathbf{x}_0}{x(0)} of size 2 ;} 34 | #' \item{\code{"V0"}: \eqn{\mathbf{V}_0}{V(0)} of size 2 * 2 .}} 35 | #' @param init_pars A list of unfitted parameters' initial values. The parameters are the same as \code{fixed_pars}. 36 | #' If the user does not assign initial values for the unfitted parameters, default ones will be used. 37 | #' @param verbose An integer specifying the print level of information during the algorithm (default \code{1}). Possible numbers: 38 | #' \itemize{\item{\code{"0"}: no output;} 39 | #' \item{\code{"1"}: show the iteration number and \eqn{\|\Delta \boldsymbol{\Theta}_i\|}{||\Delta \Theta(i)||};} 40 | #' \item{\code{"2"}: 1 + show the obtained parameters.}} 41 | #' @param control A list of control values of EM algorithm: 42 | #' \itemize{\item{\code{acceleration}: TRUE/FALSE indicating whether to use the accelerated EM algorithm (default TRUE);} 43 | #' \item{\code{maxit}: Maximum number of iterations (default \code{3000});} 44 | #' \item{\code{abstol}: Absolute tolerance for parameters' change \eqn{\|\Delta \boldsymbol{\Theta}_i\|}{||\Delta \Theta(i)||} as the stopping criteria (default \code{1e-4})} 45 | #' \item{\code{log_switch}: TRUE/FALSE indicating whether to record the history of convergence progress (defalut TRUE).}} 46 | #' 47 | #' @return A list of class "\code{volume_model}" with the following elements (if the algorithm converges): 48 | #' \itemize{\item{\code{par}: }{A list of parameters' fitted values.} 49 | #' \item{\code{init}: }{A list of valid initial values from users.} 50 | #' \item{\code{par_log}: }{A list of intermediate parameters' values if \code{log_switch = TRUE}.} 51 | #' \item{\code{converged}: }{A list of logical values indicating whether each parameter is fitted.} 52 | #' } 53 | #' 54 | #' @references 55 | #' Chen, R., Feng, Y., and Palomar, D. (2016). Forecasting intraday trading volume: A Kalman filter approach. Available at SSRN 3101695. 56 | #' 57 | #' Varadhan, R., and Roland, C. (2008). Simple and globally convergent methods for accelerating the convergence of any EM algorithm. 58 | #' Scandinavian Journal of Statistics, 35(2), 335–353. 
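# --- Illustrative aside (not package source): a hedged simulation sketch of the
# state-space model described above, using hypothetical parameter values and an
# arbitrary size (n_bin = 26 bins per day, n_day = 5 days).  The daily component
# eta only transitions (with a_eta, var_eta) at the first bin of each day, while
# the intraday dynamic mu transitions at every bin; the var_* entries are treated
# as variances, so their square roots are used as standard deviations.
set.seed(42)
n_bin <- 26; n_day <- 5
sim_par <- list(a_eta = 0.95, a_mu = 0.5, var_eta = 1e-3, var_mu = 1e-3, r = 1e-2)
phi <- 0.5 * sin(seq(0, pi, length.out = n_bin))   # hypothetical seasonal shape
eta <- mu <- y <- numeric(n_bin * n_day)
eta_prev <- 15; mu_prev <- 0
for (tau in seq_len(n_bin * n_day)) {
  if (tau %% n_bin == 1) {   # first bin of a new day: daily component updates
    eta[tau] <- sim_par$a_eta * eta_prev + rnorm(1, 0, sqrt(sim_par$var_eta))
  } else {                   # otherwise the daily component is carried over
    eta[tau] <- eta_prev
  }
  mu[tau] <- sim_par$a_mu * mu_prev + rnorm(1, 0, sqrt(sim_par$var_mu))
  y[tau]  <- eta[tau] + mu[tau] + phi[(tau - 1) %% n_bin + 1] + rnorm(1, 0, sqrt(sim_par$r))
  eta_prev <- eta[tau]; mu_prev <- mu[tau]
}
volume_sim <- matrix(exp(y), nrow = n_bin)   # n_bin x n_day matrix on the volume scale
# Such a matrix could be passed to fit_volume(volume_sim) to check whether the EM
# estimates roughly recover the chosen parameters.
# --------------------------------------------------------------------------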
59 | #' 60 | #' @examples 61 | #' library(intradayModel) 62 | #' data(volume_aapl) 63 | #' volume_aapl_training <- volume_aapl[, 1:20] 64 | #' \donttest{ 65 | #' # fit model with no prior knowledge 66 | #' model_fit <- fit_volume(volume_aapl_training) 67 | #' } 68 | #' # fit model with fixed_pars and init_pars 69 | #' model_fit <- fit_volume(volume_aapl_training, fixed_pars = list(a_mu = 0.5, var_mu = 0.05), 70 | #' init_pars = list(a_eta = 0.5)) 71 | #' \donttest{ 72 | #' # fit model with other control options 73 | #' model_fit <- fit_volume(volume_aapl_training, verbose = 2, 74 | #' control = list(acceleration = FALSE, maxit = 1000, abstol = 1e-4, log_switch = FALSE)) 75 | #' } 76 | #' 77 | #' @importFrom magrittr %>% 78 | #' @import xts 79 | #' 80 | #' @export 81 | fit_volume <- function(data, fixed_pars = NULL, init_pars = NULL, verbose = 0, control = NULL) { 82 | # error control of data 83 | if (!is.xts(data) & !is.matrix(data)) { 84 | stop("data must be matrix or xts.") 85 | } 86 | data <- clean_data(data) 87 | 88 | # Define a Univariate State-Space Model 89 | volume_model <- spec_volume_model(fixed_pars, init_pars) 90 | is_volume_model(volume_model, nrow(data)) 91 | 92 | # check if fit is required 93 | if (Reduce("+", volume_model$converged) == 8) { 94 | if (verbose > 0) { 95 | cat("All parameters have already been fixed.\n") 96 | } 97 | return(volume_model) 98 | } 99 | 100 | # control list check 101 | ## initial control values 102 | control_final <- list( 103 | acceleration = TRUE, maxit = 3000, abstol = 1e-4, 104 | log_switch = TRUE 105 | ) 106 | if (is.list(control)) { 107 | for (prop in c("acceleration", "maxit", "abstol", "log_switch")) { 108 | if (prop %in% names(control)) { 109 | control_final[[prop]] <- control[[prop]] 110 | } 111 | } 112 | } 113 | 114 | # specify uniss-format model (volume_model is outer interface, uniss is inner obj) 115 | args <- list( 116 | data = log(data), 117 | volume_model = volume_model 118 | ) 119 | uniss_obj <- do.call(specify_uniss, args) 120 | 121 | # fit parameters with EM algorithm 122 | args <- append(args, list( 123 | uniss_obj = uniss_obj, 124 | verbose = verbose, 125 | control = control_final 126 | )) 127 | if (control_final$acceleration == FALSE) { 128 | em_result <- do.call(uniss_em_alg, args) 129 | } else { 130 | em_result <- do.call(uniss_em_alg_acc, args) 131 | } 132 | volume_model$par_log <- em_result$par_log 133 | 134 | if (length(em_result$warning_msg) > 0) { 135 | warning(em_result$warning_msg) 136 | } 137 | 138 | # update volume_model list object 139 | volume_model$par <- em_result$uniss_obj$par 140 | if (em_result$convergence) { 141 | volume_model$converged[] <- TRUE 142 | volume_model$init <- list() 143 | } 144 | 145 | # verbose 146 | if (verbose >= 2) { 147 | cat("--- obtained parameters ---\n") 148 | par_visual <- lapply(volume_model$par, as.numeric) 149 | par_visual$V0 <- matrix(c( 150 | par_visual$V0[1], par_visual$V0[2], 151 | par_visual$V0[2], par_visual$V0[3] 152 | ), 2) 153 | utils::str(par_visual) 154 | cat("---------------------------\n") 155 | } 156 | 157 | return(volume_model) 158 | } 159 | 160 | uniss_em_alg <- function(...) { 161 | # read input information 162 | args <- list(...) 
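  # Note: uniss_em_alg() is invoked via do.call() from fit_volume() above with the
  # named arguments data, volume_model, uniss_obj, verbose and control collected
  # into `args`; only uniss_obj, verbose and control are extracted and used here.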
163 | uniss_obj <- args$uniss_obj 164 | verbose <- args$verbose 165 | control <- args$control 166 | 167 | # required settings 168 | convergence <- FALSE 169 | par_log <- list(uniss_obj$par) 170 | 171 | # EM loops 172 | for (i in 1:control$maxit) { 173 | ## one update 174 | new_par <- uniss_kalman(uniss_obj, "em_update")$new_par 175 | 176 | ## logging 177 | if (control$log_switch == TRUE) { 178 | par_log <- append(par_log, list(new_par)) 179 | } 180 | 181 | ## verbose & stopping criteria 182 | diff <- norm(as.numeric(unlist(uniss_obj$par)) - 183 | as.numeric(unlist(new_par)), type = "2") 184 | if (verbose >= 1 & i %% 25 == 0) { 185 | cat("iter:", i, " diff:", diff, "\n", sep = "") 186 | } 187 | if (diff < control$abstol) { 188 | convergence <- TRUE 189 | break 190 | } 191 | 192 | uniss_obj$par <- new_par 193 | } 194 | 195 | # verbose 196 | warning_msg <- NULL 197 | if (!convergence) { 198 | warning_msg <- c(warning_msg, paste("Warning! Reached maxit before parameters converged. Maxit was ", control$maxit, ".\n", sep = "")) 199 | } else if (verbose > 0) { 200 | cat("Success! abstol test passed at", i, "iterations.\n") 201 | } 202 | 203 | result <- list( 204 | "uniss_obj" = uniss_obj, "convergence" = convergence, 205 | "par_log" = par_log, "warning_msg" = warning_msg 206 | ) 207 | return(result) 208 | } 209 | 210 | uniss_em_alg_acc <- function(...) { 211 | # read input information 212 | args <- list(...) 213 | uniss_obj <- args$uniss_obj 214 | verbose <- args$verbose 215 | control <- args$control 216 | 217 | # required settings 218 | convergence <- FALSE 219 | par_log <- list(uniss_obj$par) 220 | 221 | # EM loops 222 | for (i in 1:control$maxit) { 223 | ## one update 224 | curr_par <- uniss_obj$par 225 | new_par_1 <- uniss_kalman(uniss_obj, "em_update")$new_par 226 | uniss_obj$par <- new_par_1 227 | new_par_2 <- uniss_kalman(uniss_obj, "em_update")$new_par 228 | uniss_obj$par <- new_par_2 229 | 230 | new_par <- curr_par # copy the structure 231 | ## vector-wise acceleration for intraday periodic 232 | if (!uniss_obj$converged$phi) { 233 | r <- new_par_1$phi - curr_par$phi 234 | v <- new_par_2$phi - new_par_1$phi - r 235 | r_norm <- norm(r, "2") 236 | v_norm <- norm(v, "2") 237 | step_vec <- -r_norm / v_norm 238 | new_par$phi <- curr_par$phi - 239 | 2 * step_vec * r + step_vec^2 * v 240 | } 241 | 242 | ## element-wise acceleration for other parameters 243 | for (name in names(curr_par)) { 244 | if (name != "phi" & length(curr_par[[name]]) > 0) { 245 | r <- new_par_1[[name]] - curr_par[[name]] 246 | v <- new_par_2[[name]] - new_par_1[[name]] - r 247 | step_len <- -abs(r) / abs(v) 248 | for (n in 1:length(step_len)) { 249 | if (abs(v[n]) > 1e-8) { 250 | new_par[[name]][n] <- curr_par[[name]][n] - step_len[n] * r[n] 251 | } else { 252 | new_par[[name]][n] <- new_par_2[[name]][n] 253 | } 254 | } 255 | } 256 | } 257 | 258 | ## acceleration error check 259 | if (new_par$r < 0) { 260 | new_par$r <- new_par_2$r 261 | } 262 | if (new_par$var_eta < 0) { 263 | new_par$var_eta <- new_par_2$var_eta 264 | } 265 | if (new_par$var_mu < 0) { 266 | new_par$var_mu <- new_par_2$var_mu 267 | } 268 | 269 | ## logging 270 | if (control$log_switch == TRUE) { 271 | par_log <- append(par_log, list(new_par)) 272 | } 273 | 274 | ## verbose & stopping criteria 275 | diff <- norm(as.numeric(unlist(new_par_1)) - 276 | as.numeric(unlist(new_par_2)), type = "2") 277 | if (verbose >= 1 & i %% 5 == 0) { 278 | cat("iter:", i, " diff:", diff, "\n", sep = "") 279 | } 280 | if (diff < control$abstol) { 281 | convergence <- TRUE 282 
| break 283 | } 284 | 285 | uniss_obj$par <- new_par 286 | } 287 | 288 | # verbose 289 | warning_msg <- NULL 290 | if (!convergence) { 291 | warning_msg <- c(warning_msg, paste("Warning! Reached maxit before parameters converged. Maxit was ", control$maxit, ".\n", sep = "")) 292 | } else if (verbose > 0) { 293 | cat("Success! abstol test passed at", i, "iterations.\n") 294 | } 295 | 296 | result <- list( 297 | "uniss_obj" = uniss_obj, "convergence" = convergence, 298 | "par_log" = par_log, "warning_msg" = warning_msg 299 | ) 300 | return(result) 301 | } 302 | -------------------------------------------------------------------------------- /R/intradayModel-package.R: -------------------------------------------------------------------------------- 1 | #' intradayModel: Modeling and Forecasting Financial Intraday Signals 2 | #' 3 | #' This package uses state-of-the-art state-space models to facilitate the modeling, analyzing and forecasting of 4 | #' financial intraday signals. It currently offers a univariate model for intraday trading volume, 5 | #' with new features on intraday volatility and multivariate models in development. 6 | #' 7 | #' @section Functions: 8 | #' \code{\link{fit_volume}}, 9 | #' \code{\link{decompose_volume}}, 10 | #' \code{\link{forecast_volume}}, 11 | #' \code{\link{generate_plots}} 12 | #' 13 | #' @section Data: 14 | #' \code{\link{volume_aapl}}, 15 | #' \code{\link{volume_fdx}} 16 | #' 17 | #' @section Help: 18 | #' For a quick help see the README file: 19 | #' \href{https://github.com/convexfi/intradayModel/blob/master/README.md}{GitHub-README}. 20 | #' 21 | #' 22 | #' @author Shengjie Xiu, Yifan Yu and Daniel P. Palomar 23 | #' 24 | #' @docType package 25 | #' @name intradayModel-package 26 | NULL -------------------------------------------------------------------------------- /R/plot.R: -------------------------------------------------------------------------------- 1 | #' @title Plot Analysis and Forecast Result 2 | #' 3 | #' @description Generate plots for the analysis and forecast results. 4 | #' 5 | #' @author Shengjie Xiu, Yifan Yu and Daniel P. Palomar 6 | #' 7 | #' @param analysis_forecast_result Analysis/forecast result from \code{decompose_volume()} or \code{forecast_volume()}. 
8 | #' 9 | #' @return A list of \code{patchwork} objects: 10 | #' \itemize{ 11 | #' \item{\code{components}: }{Plot of components of intraday volume;} 12 | #' \item{\code{log_components}: }{Plot of components of intraday volume in their log10 scale;} 13 | #' \item{\code{original_and_smooth} / \code{original_and_forecast}: }{Plot of the original and the smooth/forecast intraday volume.}} 14 | #' 15 | #' @import patchwork 16 | #' @import ggplot2 17 | #' @importFrom scales trans_breaks trans_format math_format 18 | #' @importFrom magrittr %>% 19 | #' @examples 20 | #' library(intradayModel) 21 | #' data(volume_aapl) 22 | #' volume_aapl_training <- volume_aapl[, 1:20] 23 | #' volume_aapl_testing <- volume_aapl[, 21:50] 24 | #' 25 | #' # obtain analysis and forecast result 26 | #' model_fit <- fit_volume(volume_aapl_training, fixed_pars = list(a_mu = 0.5, var_mu = 0.05), 27 | #' init_pars = list(a_eta = 0.5)) 28 | #' analysis_result <- decompose_volume(purpose = "analysis", model_fit, volume_aapl_training) 29 | #' forecast_result <- forecast_volume(model_fit, volume_aapl_testing) 30 | #' 31 | #' # plot the analysis and forecast result 32 | #' generate_plots(analysis_result) 33 | #' generate_plots(forecast_result) 34 | #' 35 | #' @export 36 | generate_plots <- function(analysis_forecast_result) { 37 | plot_list <- list() 38 | 39 | plot_list$components <- plot_components(analysis_forecast_result, log = FALSE) 40 | plot_list$log_components <- plot_components(analysis_forecast_result, log = TRUE) 41 | if ("analysis" %in% attr(analysis_forecast_result, "type")) { 42 | plot_list$original_and_smooth <- plot_performance(analysis_forecast_result) 43 | } else { 44 | plot_list$original_and_forecast <- plot_performance(analysis_forecast_result) 45 | } 46 | return(plot_list) 47 | } 48 | 49 | plot_components <- function(analysis_forecast_result, log = TRUE) { 50 | if ("analysis" %in% attr(analysis_forecast_result, "type")) { 51 | if (log == TRUE) { 52 | title <- "Components of Log Intraday Volume (analysis)" 53 | } else { 54 | title <- "Components of Intraday Volume (analysis)" 55 | } 56 | } else { 57 | if (log == TRUE) { 58 | title <- "Components of Log Intraday Volume (forecast)" 59 | } else { 60 | title <- "Components of Intraday Volume (forecast)" 61 | } 62 | } 63 | 64 | i <- original <- daily <- seasonal <- dynamic <- residual <- NULL 65 | components <- analysis_forecast_result[[grep("components", names(analysis_forecast_result))]] 66 | 67 | plt_data <- 68 | data.frame( 69 | original = analysis_forecast_result$original_signal, 70 | daily = components$daily, 71 | seasonal = components$seasonal, 72 | dynamic = components$dynamic, 73 | residual = components$residual 74 | ) 75 | if (log == TRUE) { 76 | plt_data <- log10(plt_data) 77 | } 78 | plt_data$i <- plt_data$i <- c(1:nrow(plt_data)) 79 | 80 | text_size <- 10 81 | .x <- NULL 82 | 83 | if (log == TRUE) { 84 | p1 <- plt_data %>% 85 | ggplot() + 86 | geom_line(aes(x = i, y = original), alpha = 0.8, color = "steelblue", size = 0.4) + 87 | ylab("Original") + 88 | theme_bw() + 89 | theme( 90 | axis.title = element_text(size = text_size, face = "bold"), 91 | legend.position = "right", 92 | legend.justification = c(0, 1), 93 | legend.box.just = "left", 94 | legend.margin = margin(8, 8, 8, 8), 95 | legend.text = element_text(size = text_size, face = "bold"), 96 | legend.key.size = unit(1, "cm"), 97 | plot.title = element_text(size = 18, face = "bold", hjust = 0.5), 98 | axis.title.x = element_blank(), 99 | axis.text.x = element_blank() 100 | ) 101 | 102 | p2 <- 
plt_data %>% 103 | ggplot() + 104 | geom_line(aes(x = i, y = daily), alpha = 0.8, color = "steelblue", size = 0.6) + 105 | ylab("Daily") + 106 | theme_bw() + 107 | theme( 108 | axis.title = element_text(size = text_size, face = "bold"), 109 | legend.position = "right", 110 | legend.justification = c(0, 1), 111 | legend.box.just = "left", 112 | legend.margin = margin(8, 8, 8, 8), 113 | legend.text = element_text(size = text_size, face = "bold"), 114 | legend.key.size = unit(1, "cm"), 115 | plot.title = element_text(size = 18, face = "bold", hjust = 0.5), 116 | axis.title.x = element_blank(), 117 | axis.text.x = element_blank() 118 | ) 119 | } else { 120 | p1 <- plt_data %>% 121 | ggplot() + 122 | geom_line(aes(x = i, y = original), alpha = 0.8, color = "steelblue", size = 0.4) + 123 | ylab("Original") + 124 | scale_y_log10( 125 | breaks = trans_breaks("log10", function(x) 10^x), 126 | labels = trans_format("log10", math_format(10^.x)) 127 | ) + 128 | theme_bw() + 129 | theme( 130 | axis.title = element_text(size = text_size, face = "bold"), 131 | legend.position = "right", 132 | legend.justification = c(0, 1), 133 | legend.box.just = "left", 134 | legend.margin = margin(8, 8, 8, 8), 135 | legend.text = element_text(size = text_size, face = "bold"), 136 | legend.key.size = unit(1, "cm"), 137 | plot.title = element_text(size = 18, face = "bold", hjust = 0.5), 138 | axis.title.x = element_blank(), 139 | axis.text.x = element_blank() 140 | ) 141 | 142 | p2 <- plt_data %>% 143 | ggplot() + 144 | geom_line(aes(x = i, y = daily), alpha = 0.8, color = "steelblue", size = 0.6) + 145 | ylab("Daily") + 146 | scale_y_log10( 147 | breaks = trans_breaks("log10", function(x) 10^x), 148 | labels = trans_format("log10", math_format(10^.x)) 149 | ) + 150 | theme_bw() + 151 | theme( 152 | axis.title = element_text(size = text_size, face = "bold"), 153 | legend.position = "right", 154 | legend.justification = c(0, 1), 155 | legend.box.just = "left", 156 | legend.margin = margin(8, 8, 8, 8), 157 | legend.text = element_text(size = text_size, face = "bold"), 158 | legend.key.size = unit(1, "cm"), 159 | plot.title = element_text(size = 18, face = "bold", hjust = 0.5), 160 | axis.title.x = element_blank(), 161 | axis.text.x = element_blank() 162 | ) 163 | } 164 | 165 | 166 | p3 <- plt_data %>% 167 | ggplot() + 168 | geom_line(aes(x = i, y = seasonal), alpha = 0.8, color = "steelblue", size = 0.4) + 169 | ylab("Seasonal") + 170 | theme_bw() + 171 | theme( 172 | axis.title = element_text(size = text_size, face = "bold"), 173 | legend.position = "right", 174 | legend.justification = c(0, 1), 175 | legend.box.just = "left", 176 | legend.margin = margin(8, 8, 8, 8), 177 | legend.text = element_text(size = text_size, face = "bold"), 178 | legend.key.size = unit(1, "cm"), 179 | plot.title = element_text(size = 18, face = "bold", hjust = 0.5), 180 | axis.title.x = element_blank(), 181 | axis.text.x = element_blank() 182 | ) 183 | 184 | p4 <- plt_data %>% 185 | ggplot() + 186 | geom_line(aes(x = i, y = dynamic), alpha = 0.8, color = "steelblue", size = 0.4) + 187 | ylab("Intraday\nDynamic") + 188 | theme_bw() + 189 | theme( 190 | axis.title = element_text(size = text_size, face = "bold"), 191 | legend.position = "right", 192 | legend.justification = c(0, 1), 193 | legend.box.just = "left", 194 | legend.margin = margin(8, 8, 8, 8), 195 | legend.text = element_text(size = text_size, face = "bold"), 196 | legend.key.size = unit(1, "cm"), 197 | plot.title = element_text(size = 18, face = "bold", hjust = 0.5), 198 | 
axis.title.x = element_blank(), 199 | axis.text.x = element_blank() 200 | ) 201 | 202 | p5 <- plt_data %>% 203 | ggplot() + 204 | geom_line(aes(x = i, y = residual), alpha = 0.8, color = "steelblue", size = 0.4) + 205 | ylab("Residual") + 206 | theme_bw() + 207 | xlab("time (bins)") + 208 | theme( 209 | axis.title.x = element_text(size = 12, face = "bold"), 210 | axis.title.y = element_text(size = text_size, face = "bold"), 211 | legend.position = "right", 212 | legend.justification = c(0, 1), 213 | legend.box.just = "left", 214 | legend.margin = margin(8, 8, 8, 8), 215 | legend.text = element_text(size = text_size, face = "bold"), 216 | legend.key.size = unit(1, "cm"), 217 | plot.title = element_text(size = 18, face = "bold", hjust = 0.5) 218 | ) 219 | 220 | p <- p1 / p2 / p3 / p4 / p5 + 221 | plot_annotation( 222 | title = title, 223 | theme = theme(plot.title = element_text(size = 16, face = "bold", hjust = 0.5)) 224 | ) 225 | 226 | return(p) 227 | } 228 | 229 | plot_performance <- function(analysis_forecast_result) { 230 | i <- value <- variable <- .x <- NULL 231 | 232 | # determine type 233 | if ("analysis" %in% attr(analysis_forecast_result, "type")) { 234 | approximated_signal <- analysis_forecast_result$smooth_signal 235 | title <- "Original and Smooth Signals (analysis)" 236 | legend_name <- "smooth" 237 | } else { 238 | approximated_signal <- analysis_forecast_result$forecast_signal 239 | title <- "Original and One-bin-ahead Forecast signal (forecast)" 240 | legend_name <- "forecast" 241 | } 242 | 243 | plt_data <- 244 | data.frame( 245 | original = analysis_forecast_result$original_signal, 246 | output = approximated_signal 247 | ) 248 | 249 | plt_data$i <- c(1:nrow(plt_data)) 250 | 251 | plt_reshape <- plt_data %>% 252 | reshape2::melt( 253 | id.vars = c("i"), 254 | variable.name = "variable", value.name = "value" 255 | ) 256 | 257 | text_size <- 14 258 | p <- plt_reshape %>% 259 | ggplot() + 260 | geom_line(aes(x = i, y = value, color = variable), alpha = 0.8, size = 0.4) + 261 | scale_colour_manual(values = c(original = "steelblue", output = "#FD6467"), labels = c("original", legend_name)) + 262 | scale_y_log10( 263 | breaks = trans_breaks("log10", function(x) 10^x), 264 | labels = trans_format("log10", math_format(10^.x)) 265 | ) + 266 | xlab("time (bins)") + 267 | ylab("Intraday Volume") + 268 | theme_bw() + 269 | theme( 270 | axis.title = element_text(size = text_size, face = "bold"), 271 | legend.position = "bottom", 272 | legend.text = element_text(size = text_size, face = "bold"), 273 | legend.title = element_blank(), 274 | plot.title = element_text(size = 16, face = "bold", hjust = 0.5) 275 | ) + 276 | plot_annotation( 277 | title = title, 278 | theme = theme(plot.title = element_text(size = 16, face = "bold", hjust = 0.5)) 279 | ) 280 | 281 | return(p) 282 | } 283 | -------------------------------------------------------------------------------- /R/use_model.R: -------------------------------------------------------------------------------- 1 | #' @title Decompose Intraday Volume into Several Components 2 | #' 3 | #' @description This function decomposes the intraday volume into daily, seasonal, and intraday dynamic components according to (Chen et al., 2016). 4 | #' If \code{purpose = “analysis”} (aka Kalman smoothing), the optimal components are conditioned on both the past and future observations. 5 | #' Its mathematical expression is \eqn{\hat{x}_{\tau} = E[x_{\tau}|\{y_{j}\}_{j=1}^{M}]}{x*(\tau) = E[x(\tau) | y(j), j = 1, ... 
, M]}, 6 | #' where \eqn{M} is the total number of bins in the dataset. 7 | #' 8 | #' If \code{purpose = “forecast”} (aka Kalman forecasting), the optimal components are conditioned on only the past observations. 9 | #' Its mathematical expression is \eqn{\hat{x}_{\tau+1} = E[x_{\tau+1}|\{y_{j}\}_{j=1}^{\tau}]}{x*(\tau+1) = E[x(\tau + 1) | y(j), j = 1, ... , \tau]}. 10 | #' 11 | #' Three measures are used to evaluate the model performance: 12 | #' \itemize{\item{Mean absolute error (MAE): 13 | #' \eqn{\frac{1}{M}\sum_{\tau=1}^M|\hat{y}_{\tau} - y_{\tau}|}{\sum (|y*(\tau) - y(\tau)|) / M} ;} 14 | #' \item{Mean absolute percent error (MAPE): 15 | #' \eqn{\frac{1}{M}\sum_{\tau=1}^M\frac{|\hat{y}_{\tau} - y_{\tau}|}{y_{\tau}}}{\sum (|y*(\tau) - y(\tau)| / y(\tau)) / M} ;} 16 | #' \item{Root mean square error (RMSE): 17 | #' \eqn{\sqrt{\sum_{\tau=1}^M\frac{\left(\hat{y}_{\tau} - y_{\tau}\right)^2}{M}}}{[\sum ((y*(\tau) - y(\tau))^2 / M)]^0.5} .} 18 | #' } 19 | #' 20 | #' @author Shengjie Xiu, Yifan Yu and Daniel P. Palomar 21 | #' 22 | #' @param purpose String \code{"analysis"/"forecast"}. Indicates the purpose of using the provided model. 23 | #' @param model A model object of class "\code{volume_model}" from \code{fit_volume()}. 24 | #' @param data An n_bin * n_day matrix or an \code{xts} object storing intraday volume. 25 | #' @param burn_in_days Number of initial days in the burn-in period for forecast. Samples from the first \code{burn_in_days} are used to warm up the model and then are discarded. 26 | #' 27 | #' 28 | #' @return A list containing the following elements: 29 | #' \itemize{ 30 | #' \item{\code{original_signal}: }{A vector of original intraday volume;} 31 | #' \item{\code{smooth_signal} / \code{forecast_signal}: }{A vector of smooth/forecast intraday volume;} 32 | #' \item{\code{smooth_components} /\code{forecast_components}: }{A list of smooth/forecast components: daily, seasonal, intraday dynamic, and residual components.} 33 | #' \item{\code{error}: }{A list of three error measures: mae, mape, and rmse.} 34 | #' } 35 | #' 36 | #' 37 | #' @references 38 | #' Chen, R., Feng, Y., and Palomar, D. (2016). Forecasting intraday trading volume: A Kalman filter approach. Available at SSRN 3101695. 
39 | #' 40 | #' 41 | #' @examples 42 | #' library(intradayModel) 43 | #' data(volume_aapl) 44 | #' volume_aapl_training <- volume_aapl[, 1:20] 45 | #' volume_aapl_testing <- volume_aapl[, 21:50] 46 | #' model_fit <- fit_volume(volume_aapl_training, fixed_pars = list(a_mu = 0.5, var_mu = 0.05), 47 | #' init_pars = list(a_eta = 0.5)) 48 | #' 49 | #' # analyze training volume 50 | #' analysis_result <- decompose_volume(purpose = "analysis", model_fit, volume_aapl_training) 51 | #' 52 | #' # forecast testing volume 53 | #' forecast_result <- decompose_volume(purpose = "forecast", model_fit, volume_aapl_testing) 54 | #' 55 | #' # forecast testing volume with burn-in 56 | #' forecast_result <- decompose_volume(purpose = "forecast", model_fit, volume_aapl[, 1:50], 57 | #' burn_in_days = 20) 58 | #' 59 | #' @export 60 | decompose_volume <- function(purpose, model, data, burn_in_days = 0) { 61 | if (tolower(purpose) == "analysis") { 62 | res <- smooth_volume_model(data = data, volume_model = model) 63 | attr(res, "type") <- c("analysis", "smooth") 64 | } else if (tolower(purpose) == "forecast") { 65 | res <- forecast_volume_model(data = data, volume_model = model, burn_in_days = burn_in_days) 66 | attr(res, "type") <- "forecast" 67 | } else { 68 | stop("purpose must be either 'analysis' or 'forecast'.") # fail fast on an invalid purpose; 'res' would otherwise be undefined below 69 | } 70 | 71 | return(res) 72 | } 73 | 74 | 75 | #' @title Forecast One-bin-ahead Intraday Volume 76 | #' 77 | #' @description This function forecasts one-bin-ahead intraday volume. 78 | #' Its mathematical expression is \eqn{\hat{y}_{\tau+1} = E[y_{\tau+1}|\{y_{j}\}_{j=1}^{\tau}]}{y*(\tau+1) = E[y(\tau + 1) | y(j), j = 1, ... , \tau]}. 79 | #' It is a wrapper of \code{decompose_volume()} with \code{purpose = "forecast"}. 80 | #' 81 | #' @author Shengjie Xiu, Yifan Yu and Daniel P. Palomar 82 | #' 83 | #' @param model A model object of class "\code{volume_model}" from \code{fit_volume()}. 84 | #' @param data An n_bin * n_day matrix or an \code{xts} object storing intraday volume. 85 | #' @param burn_in_days Number of initial days in the burn-in period. Samples from the first \code{burn_in_days} are used to warm up the model and then are discarded. 86 | #' 87 | #' 88 | #' @return A list containing the following elements: 89 | #' \itemize{ 90 | #' \item{\code{original_signal}: }{A vector of original intraday volume;} 91 | #' \item{\code{forecast_signal}: }{A vector of forecast intraday volume;} 92 | #' \item{\code{forecast_components}: }{A list of the three forecast components: daily, seasonal, intraday dynamic, and residual components.} 93 | #' \item{\code{error}: }{A list of three error measures: mae, mape, and rmse.} 94 | #' } 95 | #' 96 | #' 97 | #' @references 98 | #' Chen, R., Feng, Y., and Palomar, D. (2016). Forecasting intraday trading volume: A Kalman filter approach. Available at SSRN 3101695.
99 | #' 100 | #' 101 | #' @examples 102 | #' library(intradayModel) 103 | #' data(volume_aapl) 104 | #' volume_aapl_training <- volume_aapl[, 1:20] 105 | #' volume_aapl_testing <- volume_aapl[, 21:50] 106 | #' model_fit <- fit_volume(volume_aapl_training, fixed_pars = list(a_mu = 0.5, var_mu = 0.05), 107 | #' init_pars = list(a_eta = 0.5)) 108 | #' 109 | #' # forecast testing volume 110 | #' forecast_result <- forecast_volume(model_fit, volume_aapl_testing) 111 | #' 112 | #' # forecast testing volume with burn-in 113 | #' forecast_result <- forecast_volume(model_fit, volume_aapl[, 1:50], burn_in_days = 20) 114 | #' 115 | #' @export 116 | forecast_volume <- function(model, data, burn_in_days = 0) { 117 | res <- decompose_volume("forecast", model, data, burn_in_days) 118 | return(res) 119 | } 120 | 121 | 122 | smooth_volume_model <- function(data, volume_model) { 123 | # error control of data 124 | if (!is.xts(data) & !is.matrix(data)) { 125 | stop("data must be matrix or xts.") 126 | } 127 | data <- clean_data(data) 128 | 129 | is_volume_model(volume_model, nrow(data)) 130 | 131 | # if model isn't optimally fitted (no convergence), it cannot filter 132 | if (Reduce("+", volume_model$converged) != 8) { 133 | msg <- c( 134 | "All parameters must be optimally fitted. ", 135 | "Parameters ", paste(names(volume_model$converged[volume_model$converged == FALSE]), collapse = ", "), " are not optimally fitted." 136 | ) 137 | stop(msg) 138 | } 139 | 140 | # filter using UNISS (our own Kalman) 141 | args <- list( 142 | data = log(data), 143 | volume_model = volume_model 144 | ) 145 | uniss_obj <- do.call(specify_uniss, args) 146 | Kf <- uniss_kalman(uniss_obj, "smoother") 147 | 148 | # tidy up components (scale change) 149 | smooth_components <- list( 150 | daily = exp(Kf$xtT[1, ]), 151 | dynamic = exp(Kf$xtT[2, ]), 152 | seasonal = exp(rep(uniss_obj$par$phi, uniss_obj$n_day)) 153 | ) 154 | smooth_signal <- smooth_components$daily * 155 | smooth_components$dynamic * smooth_components$seasonal 156 | original_signal <- as.vector(data) 157 | smooth_components$residual <- original_signal / smooth_signal 158 | error <- list( 159 | mae = calculate_mae(original_signal, smooth_signal), 160 | mape = calculate_mape(original_signal, smooth_signal), 161 | rmse = calculate_rmse(original_signal, smooth_signal) 162 | ) 163 | 164 | res <- list( 165 | original_signal = original_signal, 166 | smooth_signal = smooth_signal, 167 | smooth_components = smooth_components, 168 | error = error 169 | ) 170 | 171 | return(res) 172 | } 173 | 174 | forecast_volume_model <- function(data, volume_model, burn_in_days = 0) { 175 | # error control of data 176 | if (!is.xts(data) & !is.matrix(data)) { 177 | stop("data must be matrix or xts.") 178 | } 179 | data <- clean_data(data) 180 | if (burn_in_days > ncol(data)) stop("out_sample must be smaller than the number of columns in data matrix.") 181 | 182 | is_volume_model(volume_model, nrow(data)) 183 | 184 | # check if fit is necessary 185 | if (Reduce("+", volume_model$converged) != 8) { 186 | msg <- c( 187 | "All parameters must be fitted.\n ", 188 | "Parameter ", paste(names(volume_model$converged[volume_model$converged == FALSE]), collapse = ", "), " is not fitted." 
189 | ) 190 | stop(msg) 191 | } 192 | 193 | # one-step ahead prediction using UNISS (our own Kalman) 194 | args <- list( 195 | data = log(data), 196 | volume_model = volume_model 197 | ) 198 | uniss_obj <- do.call(specify_uniss, args) 199 | Kf <- uniss_kalman(uniss_obj, "filter") 200 | 201 | # tidy up components (scale change) 202 | forecast_components <- list( 203 | daily = exp(Kf$xtt1[1, ]), 204 | dynamic = exp(Kf$xtt1[2, ]), 205 | seasonal = exp(rep(uniss_obj$par$phi, uniss_obj$n_day)) 206 | ) 207 | components_out <- lapply(forecast_components, function(c) utils::tail(c, nrow(data) * (ncol(data) - burn_in_days))) 208 | forecast_signal <- components_out$daily * 209 | components_out$dynamic * components_out$seasonal 210 | 211 | # error measures 212 | original_signal <- utils::tail(as.vector(as.matrix(data)), nrow(data) * (ncol(data) - burn_in_days)) 213 | components_out$residual <- original_signal / forecast_signal 214 | error <- list( 215 | mae = calculate_mae(original_signal, forecast_signal), 216 | mape = calculate_mape(original_signal, forecast_signal), 217 | rmse = calculate_rmse(original_signal, forecast_signal) 218 | ) 219 | 220 | # result 221 | res <- list( 222 | original_signal = original_signal, 223 | forecast_signal = forecast_signal, 224 | forecast_components = components_out, 225 | error = error 226 | ) 227 | 228 | return(res) 229 | } 230 | -------------------------------------------------------------------------------- /R/volume_aapl.R: -------------------------------------------------------------------------------- 1 | #' 15-min Intraday Volume of AAPL 2 | #' 3 | #' @description A 26 * 124 matrix including 15-min trading volume of AAPL from 2019-01-02 to 2019-06-28. 4 | #' 5 | #' 6 | #' @docType data 7 | #' 8 | #' @usage data(volume_aapl) 9 | #' 10 | #' @format A 26 * 124 matrix. 11 | #' 12 | #' @source \href{https://www.barchart.com}{barchart} 13 | #' 14 | #' @keywords dataset 15 | #' 16 | "volume_aapl" -------------------------------------------------------------------------------- /R/volume_fdx.R: -------------------------------------------------------------------------------- 1 | #' 15-min Intraday Volume of FDX 2 | #' 3 | #' @description An \code{xts} object including 15-min trading volume of FDX from 2019-07-01 to 2019-12-31. 4 | #' 5 | #' @docType data 6 | #' 7 | #' @usage data(volume_fdx) 8 | #' 9 | #' @format An \code{xts} object. 10 | #' 11 | #' @source \href{https://www.barchart.com}{barchart} 12 | #' 13 | #' @keywords dataset 14 | #' 15 | "volume_fdx" -------------------------------------------------------------------------------- /README.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | output: 3 | md_document: 4 | variant: markdown_github 5 | html_document: 6 | variant: markdown_github 7 | --- 8 | 9 | 10 | 11 | ```{r, echo=FALSE, warning=FALSE} 12 | library(knitr) 13 | opts_chunk$set( 14 | collapse = TRUE, 15 | comment = "#>", 16 | fig.path = "man/figures/README-", 17 | fig.align = "center", 18 | fig.retina = 2, 19 | out.width = "100%", 20 | dpi = 96 21 | ) 22 | knit_hooks$set(pngquant = hook_pngquant) 23 | ``` 24 | 25 | # intradayModel 26 | Our package uses state-of-the-art state-space models to facilitate the modeling and forecasting of financial intraday signals. It currently offers a univariate model for intraday trading volume, with new features on intraday volatility and multivariate models in development. 
It is a valuable tool for anyone interested in exploring intraday, algorithmic, and high-frequency trading. 27 | 28 | ## Installation 29 | To install the latest stable version of **intradayModel** from [CRAN](https://cran.r-project.org/web/packages/intradayModel/index.html), run the following commands in R: 30 | ```{r, eval=FALSE} 31 | install.packages("intradayModel") 32 | ``` 33 | 34 | To install the development version of **intradayModel** from [GitHub](https://github.com/convexfi/intradayModel), run the following commands in R: 35 | ```{r, eval=FALSE} 36 | install.packages("devtools") 37 | devtools::install_github("convexfi/intradayModel") 38 | ``` 39 | 40 | Please cite **intradayModel** in publications: 41 | ```{r, eval=FALSE} 42 | citation("intradayModel") 43 | ``` 44 | 45 | ## Quick Start 46 | To get started, we load our package and sample data: the 15-minute intraday trading volume of AAPL from 2019-01-02 to 2019-06-28, covering 124 trading days. We use the first 104 trading days for fitting, and the last 20 days for evaluation of forecasting performance. 47 | 48 | ```{r, message = FALSE} 49 | library(intradayModel) 50 | data(volume_aapl) 51 | volume_aapl[1:5, 1:5] # print the head of data 52 | 53 | volume_aapl_training <- volume_aapl[, 1:104] 54 | volume_aapl_testing <- volume_aapl[, 105:124] 55 | ``` 56 | 57 | Next, we fit a univariate state-space model using `fit_volume()` function. 58 | 59 | ```{r} 60 | model_fit <- fit_volume(volume_aapl_training) 61 | ``` 62 | 63 | Once the model is fitted, we can analyze the hidden components of any intraday volume based on all its observations. By calling `decompose_volume()` function with `purpose = "analysis"`, we obtain the smoothed daily, seasonal, and intraday dynamic components. It involves incorporating both past and future observations to refine the state estimates. 64 | 65 | ```{r} 66 | analysis_result <- decompose_volume(purpose = "analysis", model_fit, volume_aapl_training) 67 | 68 | # visualization 69 | plots <- generate_plots(analysis_result) 70 | plots$log_components 71 | ``` 72 | 73 | To see how well our model performs on new data, we call `forecast_volume()` function to do one-bin-ahead forecast on the testing set. 74 | 75 | ```{r} 76 | forecast_result <- forecast_volume(model_fit, volume_aapl_testing) 77 | 78 | # visualization 79 | plots <- generate_plots(forecast_result) 80 | plots$original_and_forecast 81 | ``` 82 | 83 | ## Contributing 84 | We welcome all sorts of contributions. Please feel free to open an issue 85 | to report a bug or discuss a feature request. 86 | 87 | ## Citation 88 | If you make use of this software please consider citing: 89 | 90 | - Chen, R., Feng, Y., and Palomar, D. (2016). Forecasting intraday trading volume: A Kalman filter approach. 91 | 92 | ## Links 93 | Package: [GitHub](https://github.com/convexfi/intradayModel) 94 | 95 | Vignette: [GitHub-vignette](https://htmlpreview.github.io/?https://github.com/convexfi/intradayModel/blob/master/vignettes/intradayModel.html). 96 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | # intradayModel 4 | 5 | Our package uses state-of-the-art state-space models to facilitate the 6 | modeling and forecasting of financial intraday signals. It currently 7 | offers a univariate model for intraday trading volume, with new features 8 | on intraday volatility and multivariate models in development. 
It is a 9 | valuable tool for anyone interested in exploring intraday, algorithmic, 10 | and high-frequency trading. 11 | 12 | ## Installation 13 | 14 | To install the latest stable version of **intradayModel** from 15 | [CRAN](https://cran.r-project.org/web/packages/intradayModel/index.html), 16 | run the following commands in R: 17 | 18 | ``` r 19 | install.packages("intradayModel") 20 | ``` 21 | 22 | To install the development version of **intradayModel** from 23 | [GitHub](https://github.com/convexfi/intradayModel), run the following 24 | commands in R: 25 | 26 | ``` r 27 | install.packages("devtools") 28 | devtools::install_github("convexfi/intradayModel") 29 | ``` 30 | 31 | Please cite **intradayModel** in publications: 32 | 33 | ``` r 34 | citation("intradayModel") 35 | ``` 36 | 37 | ## Quick Start 38 | 39 | To get started, we load our package and sample data: the 15-minute 40 | intraday trading volume of AAPL from 2019-01-02 to 2019-06-28, covering 41 | 124 trading days. We use the first 104 trading days for fitting, and the 42 | last 20 days for evaluation of forecasting performance. 43 | 44 | ``` r 45 | library(intradayModel) 46 | data(volume_aapl) 47 | volume_aapl[1:5, 1:5] # print the head of data 48 | #> 2019-01-02 2019-01-03 2019-01-04 2019-01-07 2019-01-08 49 | #> 09:30 AM 10142172 3434769 20852127 15463747 14719388 50 | #> 09:45 AM 5691840 19751251 13374784 9962816 9515796 51 | #> 10:00 AM 6240374 14743180 11478596 7453044 6145623 52 | #> 10:15 AM 5273488 14841012 16024512 7270399 6031988 53 | #> 10:30 AM 4587159 18041115 8686059 7130980 5479852 54 | 55 | volume_aapl_training <- volume_aapl[, 1:104] 56 | volume_aapl_testing <- volume_aapl[, 105:124] 57 | ``` 58 | 59 | Next, we fit a univariate state-space model using `fit_volume()` 60 | function. 61 | 62 | ``` r 63 | model_fit <- fit_volume(volume_aapl_training) 64 | ``` 65 | 66 | Once the model is fitted, we can analyze the hidden components of any 67 | intraday volume based on all its observations. By calling 68 | `decompose_volume()` function with `purpose = "analysis"`, we obtain the 69 | smoothed daily, seasonal, and intraday dynamic components. It involves 70 | incorporating both past and future observations to refine the state 71 | estimates. 72 | 73 | ``` r 74 | analysis_result <- decompose_volume(purpose = "analysis", model_fit, volume_aapl_training) 75 | 76 | # visualization 77 | plots <- generate_plots(analysis_result) 78 | plots$log_components 79 | ``` 80 | 81 | 82 | 83 | To see how well our model performs on new data, we call 84 | `forecast_volume()` function to do one-bin-ahead forecast on the testing 85 | set. 86 | 87 | ``` r 88 | forecast_result <- forecast_volume(model_fit, volume_aapl_testing) 89 | 90 | # visualization 91 | plots <- generate_plots(forecast_result) 92 | plots$original_and_forecast 93 | ``` 94 | 95 | 96 | 97 | ## Contributing 98 | 99 | We welcome all sorts of contributions. Please feel free to open an issue 100 | to report a bug or discuss a feature request. 101 | 102 | ## Citation 103 | 104 | If you make use of this software please consider citing: 105 | 106 | - Chen, R., Feng, Y., and Palomar, D. (2016). Forecasting intraday 107 | trading volume: A Kalman filter approach. 108 | 109 | 110 | ## Links 111 | 112 | Package: [GitHub](https://github.com/convexfi/intradayModel) 113 | 114 | Vignette: 115 | [GitHub-vignette](https://htmlpreview.github.io/?https://github.com/convexfi/intradayModel/blob/master/vignettes/intradayModel.html). 
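## Example with xts Data

The same workflow also accepts intraday volume stored as an `xts` object; the bundled `volume_fdx` dataset (15-minute FDX volume from 2019-07-01 to 2019-12-31) is one example. The snippet below is a minimal sketch rather than a tuned benchmark: the date split is arbitrary, and trading days with missing bins are dropped automatically with a warning.

``` r
library(intradayModel)
library(xts)  # for date-based subsetting of the xts object

data(volume_fdx)

# illustrative split: fit on Jul-Oct 2019, forecast Nov-Dec 2019
volume_fdx_training <- volume_fdx["2019-07/2019-10"]
volume_fdx_testing <- volume_fdx["2019-11/2019-12"]

model_fit <- fit_volume(volume_fdx_training)

# one-bin-ahead forecasts on the testing period
forecast_result <- forecast_volume(model_fit, volume_fdx_testing)

# error measures (mae, mape, rmse) on the original volume scale
unlist(forecast_result$error)

# visualization
plots <- generate_plots(forecast_result)
plots$original_and_forecast
```

Because the forecast conditions only on past observations, you can instead pass the full series and set `burn_in_days` to the length of the training window (in days) to warm up the filter before evaluation.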
116 | -------------------------------------------------------------------------------- /R_buildignore/AAPL_volume_xts.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/R_buildignore/AAPL_volume_xts.rda -------------------------------------------------------------------------------- /R_buildignore/GE_volume.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/R_buildignore/GE_volume.rda -------------------------------------------------------------------------------- /R_buildignore/GE_volume_xts.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/R_buildignore/GE_volume_xts.rda -------------------------------------------------------------------------------- /R_buildignore/data_generate.R: -------------------------------------------------------------------------------- 1 | library(reshape2) 2 | library(xts) 3 | 4 | dataset_list <- readRDS("D:\\OneDrive - HKUST Connect\\HKUST\\Kalman\\dataset_SP500_100_batch1") 5 | data <- dataset_list[[8]] 6 | data$log_volume = log(data$Volume) 7 | n_bin <- 26 8 | data.train <- data['2019-01-01/2019-05-31'] 9 | 10 | y_train <- data.train[, c("log_volume","bin")] 11 | 12 | # xts to data.frame 13 | # data_frame <- as.data.frame(y_train) 14 | data_frame <- data.frame(date=index(y_train), coredata(y_train)) 15 | # modify the date colume 16 | data_frame$date <- as.Date(data_frame$date, format = "%Y%m%d") 17 | # long to wide 18 | dataset <- melt(data_frame, id=c("date","bin")) 19 | dataset <- dcast(dataset, bin ~ date) 20 | # delete the bin col 21 | dataset <- dataset[,-1] 22 | data_log_volume <- dataset 23 | 24 | 25 | # new --------------------------------------------------------------------- 26 | dataset_list <- readRDS("R_buildignore/dataset_SP100") 27 | data <- dataset_list[[1]] 28 | n_bin <- 26 29 | data.train <- data['2019-01-01/2019-06-30'] 30 | 31 | y_train <- data.train[, c("Volume","bin")] 32 | 33 | # xts to data.frame 34 | # data_frame <- as.data.frame(y_train) 35 | data_frame <- data.frame(date=index(y_train), coredata(y_train)) 36 | # modify the date colume 37 | data_frame$date <- as.Date(data_frame$date, format = "%Y%m%d") 38 | # long to wide 39 | dataset <- melt(data_frame, id=c("date","bin")) 40 | dataset <- dcast(dataset, bin ~ date) 41 | # delete the bin col 42 | dataset <- dataset[,-1] 43 | volume_aapl <- as.matrix(dataset) 44 | 45 | bin_names <- c("09:30 AM", "09:45 AM", "10:00 AM", "10:15 AM", "10:30 AM", 46 | "10:45 AM", "11:00 AM", "11:15 AM", "11:30 AM", "11:45 AM", 47 | "12:00 PM", "12:15 PM", "12:30 PM", "12:45 PM", "01:00 PM", 48 | "01:15 PM", "01:30 PM", "01:45 PM", "02:00 PM", "02:15 PM", 49 | "02:30 PM", "02:45 PM", "03:00 PM", "03:15 PM", "03:30 PM", "03:45 PM") 50 | rownames(volume_aapl) <- bin_names 51 | usethis::use_data(volume_aapl, overwrite = TRUE) 52 | -------------------------------------------------------------------------------- /R_buildignore/dataset_SP100: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/R_buildignore/dataset_SP100 
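The melt/dcast steps in data_generate.R above are the general recipe for turning a long intraday xts into the n_bin * n_day matrix that fit_volume() expects. Below is a minimal sketch of the same idea as a standalone helper; the function name is hypothetical (not part of the package) and it assumes the input xts carries a volume column plus a "bin" column, as in the datasets read above.

library(xts)
library(reshape2)

# Hypothetical helper (not part of the package): returns a matrix with one row
# per intraday bin and one column per trading day.
xts_to_volume_matrix <- function(volume_xts) {
  df <- data.frame(date = as.Date(index(volume_xts)), coredata(volume_xts))
  long <- melt(df, id.vars = c("date", "bin"))          # long format: date, bin, variable, value
  wide <- dcast(long, bin ~ date, value.var = "value")  # one column per trading day
  m <- as.matrix(wide[, -1, drop = FALSE])              # drop the bin column
  rownames(m) <- wide$bin
  m
}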
-------------------------------------------------------------------------------- /R_buildignore/developer_commands.R: -------------------------------------------------------------------------------- 1 | # install basic package for vignette 2 | install.packages(c("cleanrmd", "R.rsp")) 3 | 4 | ## 5 | ## User installation 6 | ## 7 | # Local installation 8 | install.packages(file.choose(), repos = NULL, type = "source") 9 | # Installation from GitHub 10 | devtools::install_github("convexfi/intradayModel") 11 | # devtools::install_github("convexfi/intradayModel", 12 | # ref="master" 13 | # ,auth_token = "") 14 | # Installation from CRAN 15 | install.packages("intradayModel") 16 | # Getting help 17 | library(intradayModel) 18 | help(package = "intradayModel") 19 | ?fit_volume 20 | citation("intradayModel") 21 | vignette(package = "intradayModel") 22 | 23 | 24 | ## 25 | ## Developer commands (https://r-pkgs.org/) 26 | ## 27 | devtools::load_all() #or Ctrl-Shift-L 28 | devtools::document() #to generate all documentation via roxygen 29 | devtools::install() 30 | devtools::install(dependencies = FALSE) 31 | library(intradayModel) 32 | 33 | 34 | # Code tests 35 | devtools::test() 36 | #covr::package_coverage() #coverage of tests 37 | 38 | 39 | # CRAN check and submission (https://r-pkgs.org/release.html) 40 | # checklist: https://kalimu.github.io/post/checklist-for-r-package-submission-to-cran/ 41 | devtools::check() # run_dont_test = TRUE 42 | rcmdcheck::rcmdcheck() # build_args = "--run-donttest" 43 | devtools::build() 44 | 45 | # Alternatives to the above three that ignore vignettes 46 | devtools::check(args = c('--ignore-vignettes'), build_args = c('--no-build-vignettes')) 47 | rcmdcheck::rcmdcheck(args = c('--ignore-vignettes'), build_args = c('--no-build-vignettes')) 48 | devtools::build(args = c('--no-build-vignettes')) 49 | 50 | 51 | #devtools::revdep(pkg = "intradayModel") # to check reverse dependencies 52 | #devtools::check_win_release() #to check under windows 53 | 54 | #R CMD build . --resave-data # this is to generate tarball 55 | #R CMD check intradayModel_0.0.1.tar.gz --as-cran --run-donttest # this is before submission to CRAN 56 | # (on mac) R CMD install intradayModel_0.0.1.tar.gz 57 | # (on win) Rcmd INSTALL intradayModel_0.0.1.tar.gz 58 | 59 | # check Mac builder at: https://mac.r-project.org/macbuilder/submit.html 60 | # or with devtools::check_mac_release() 61 | # check Windows and Linux builder: https://builder.r-hub.io/ 62 | # or with rhub::check_for_cran() 63 | 64 | #submit the tarball directly via the webform: https://cran.r-project.org/submit.html 65 | -------------------------------------------------------------------------------- /R_buildignore/modelSpec_format.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/R_buildignore/modelSpec_format.rda -------------------------------------------------------------------------------- /cran-comments.md: -------------------------------------------------------------------------------- 1 | ## R CMD check results 2 | 3 | 0 errors | 0 warnings | 0 note 4 | 5 | * This is a new release. 
6 | -------------------------------------------------------------------------------- /data/volume_aapl.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/data/volume_aapl.rda -------------------------------------------------------------------------------- /data/volume_fdx.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/data/volume_fdx.rda -------------------------------------------------------------------------------- /inst/CITATION: -------------------------------------------------------------------------------- 1 | citHeader("To cite 'intradayModel' in publications, please use:") 2 | 3 | bibentry(bibtype = "Manual", 4 | title = "intradayModel: Modeling and Forecasting Financial Intraday Signals", 5 | author = c(person("Shengjie", "Xiu", role = "aut", email = "sxiu@connect.ust.hk"), 6 | person("Yifan", "Yu", role = "aut", email = "yyuco@connect.ust.hk"), 7 | person(c("Daniel", "P."), "Palomar", role = c("cre", "aut", "cph"), email = "daniel.p.palomar@gmail.com")), 8 | note = "R package version 0.0.1", 9 | year = "2023", 10 | url = "https://CRAN.R-project.org/package=intradayModel", 11 | textVersion = 12 | paste("S. Xiu, Y. Yu, and D. P. Palomar (2023).", 13 | "intradayModel: Modeling and Forecasting Financial Intraday Signals.", 14 | "R package version 0.0.1.", 15 | "https://CRAN.R-project.org/package=intradayModel") 16 | ) 17 | 18 | bibentry(bibtype = "Article", 19 | title = "Forecasting Intraday Trading Volume: A Kalman Filter Approach", 20 | author = c(person("Ren", "Chen"), 21 | person("Yiyong", "Feng"), 22 | person(c("Daniel", "P."), "Palomar")), 23 | journal = "Available at SSRN", 24 | year = "2016", 25 | url = "https://dx.doi.org/10.2139/ssrn.3101695", 26 | textVersion = 27 | paste("R. Chen, Y. Feng, and D. P. Palomar (2016).", 28 | "Forecasting Intraday Trading Volume: A Kalman Filter Approach.", 29 | "Available at SSRN.", 30 | "https://dx.doi.org/10.2139/ssrn.3101695") 31 | ) 32 | -------------------------------------------------------------------------------- /intradayModeling.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | BuildType: Package 16 | PackageUseDevtools: Yes 17 | PackageInstallArgs: --no-multiarch --with-keep.source 18 | PackageBuildArgs: --no-build-vignettes 19 | PackageCheckArgs: --ignore-vignettes 20 | PackageRoxygenize: rd,collate,namespace 21 | -------------------------------------------------------------------------------- /man/decompose_volume.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/use_model.R 3 | \name{decompose_volume} 4 | \alias{decompose_volume} 5 | \title{Decompose Intraday Volume into Several Components} 6 | \usage{ 7 | decompose_volume(purpose, model, data, burn_in_days = 0) 8 | } 9 | \arguments{ 10 | \item{purpose}{String \code{"analysis"/"forecast"}. 
Indicates the purpose of using the provided model.} 11 | 12 | \item{model}{A model object of class "\code{volume_model}" from \code{fit_volume()}.} 13 | 14 | \item{data}{An n_bin * n_day matrix or an \code{xts} object storing intraday volume.} 15 | 16 | \item{burn_in_days}{Number of initial days in the burn-in period for forecast. Samples from the first \code{burn_in_days} are used to warm up the model and then are discarded.} 17 | } 18 | \value{ 19 | A list containing the following elements: 20 | \itemize{ 21 | \item{\code{original_signal}: }{A vector of original intraday volume;} 22 | \item{\code{smooth_signal} / \code{forecast_signal}: }{A vector of smooth/forecast intraday volume;} 23 | \item{\code{smooth_components} /\code{forecast_components}: }{A list of smooth/forecast components: daily, seasonal, intraday dynamic, and residual components.} 24 | \item{\code{error}: }{A list of three error measures: mae, mape, and rmse.} 25 | } 26 | } 27 | \description{ 28 | This function decomposes the intraday volume into daily, seasonal, and intraday dynamic components according to (Chen et al., 2016). 29 | If \code{purpose = “analysis”} (aka Kalman smoothing), the optimal components are conditioned on both the past and future observations. 30 | Its mathematical expression is \eqn{\hat{x}_{\tau} = E[x_{\tau}|\{y_{j}\}_{j=1}^{M}]}{x*(\tau) = E[x(\tau) | y(j), j = 1, ... , M]}, 31 | where \eqn{M} is the total number of bins in the dataset. 32 | 33 | If \code{purpose = “forecast”} (aka Kalman forecasting), the optimal components are conditioned on only the past observations. 34 | Its mathematical expression is \eqn{\hat{x}_{\tau+1} = E[x_{\tau+1}|\{y_{j}\}_{j=1}^{\tau}]}{x*(\tau+1) = E[x(\tau + 1) | y(j), j = 1, ... , \tau]}. 35 | 36 | Three measures are used to evaluate the model performance: 37 | \itemize{\item{Mean absolute error (MAE): 38 | \eqn{\frac{1}{M}\sum_{\tau=1}^M|\hat{y}_{\tau} - y_{\tau}|}{\sum (|y*(\tau) - y(\tau)|) / M} ;} 39 | \item{Mean absolute percent error (MAPE): 40 | \eqn{\frac{1}{M}\sum_{\tau=1}^M\frac{|\hat{y}_{\tau} - y_{\tau}|}{y_{\tau}}}{\sum (|y*(\tau) - y(\tau)| / y(\tau)) / M} ;} 41 | \item{Root mean square error (RMSE): 42 | \eqn{\sqrt{\sum_{\tau=1}^M\frac{\left(\hat{y}_{\tau} - y_{\tau}\right)^2}{M}}}{[\sum ((y*(\tau) - y(\tau))^2 / M)]^0.5} .} 43 | } 44 | } 45 | \examples{ 46 | library(intradayModel) 47 | data(volume_aapl) 48 | volume_aapl_training <- volume_aapl[, 1:20] 49 | volume_aapl_testing <- volume_aapl[, 21:50] 50 | model_fit <- fit_volume(volume_aapl_training, fixed_pars = list(a_mu = 0.5, var_mu = 0.05), 51 | init_pars = list(a_eta = 0.5)) 52 | 53 | # analyze training volume 54 | analysis_result <- decompose_volume(purpose = "analysis", model_fit, volume_aapl_training) 55 | 56 | # forecast testing volume 57 | forecast_result <- decompose_volume(purpose = "forecast", model_fit, volume_aapl_testing) 58 | 59 | # forecast testing volume with burn-in 60 | forecast_result <- decompose_volume(purpose = "forecast", model_fit, volume_aapl[, 1:50], 61 | burn_in_days = 20) 62 | 63 | } 64 | \references{ 65 | Chen, R., Feng, Y., and Palomar, D. (2016). Forecasting intraday trading volume: A Kalman filter approach. Available at SSRN 3101695. 66 | } 67 | \author{ 68 | Shengjie Xiu, Yifan Yu and Daniel P. 
Palomar 69 | } 70 | -------------------------------------------------------------------------------- /man/figures/README-unnamed-chunk-7-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/man/figures/README-unnamed-chunk-7-1.png -------------------------------------------------------------------------------- /man/figures/README-unnamed-chunk-8-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/man/figures/README-unnamed-chunk-8-1.png -------------------------------------------------------------------------------- /man/fit_volume.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/fit.R 3 | \name{fit_volume} 4 | \alias{fit_volume} 5 | \title{Fit a Univariate State-Space Model on Intraday Trading Volume} 6 | \usage{ 7 | fit_volume( 8 | data, 9 | fixed_pars = NULL, 10 | init_pars = NULL, 11 | verbose = 0, 12 | control = NULL 13 | ) 14 | } 15 | \arguments{ 16 | \item{data}{An n_bin * n_day matrix or an \code{xts} object storing intraday trading volume.} 17 | 18 | \item{fixed_pars}{A list of parameters' fixed values. The allowed parameters are listed below, 19 | \itemize{\item{\code{"a_eta"}: \eqn{a^{\eta}}{a.\eta}} of size 1 ; 20 | \item{\code{"a_mu"}: \eqn{a^{\mu}}{a.\mu}} of size 1 ; 21 | \item{\code{"var_eta"}: \eqn{\sigma^{\eta}}{(\sigma.\eta)^2}} of size 1 ; 22 | \item{\code{"var_mu"}: \eqn{\sigma^{\mu}}{(\sigma.\mu)^2}} of size 1 ; 23 | \item{\code{"r"}: \eqn{r}{r} of size 1 ;} 24 | \item{\code{"phi"}: \eqn{\boldsymbol{\phi} = [\phi_1,\dots, \phi_I]^\top}{\phi = [\phi(1); ... ; \phi(I)]} of size \eqn{I} ;} 25 | \item{\code{"x0"}: \eqn{\mathbf{x}_0}{x(0)} of size 2 ;} 26 | \item{\code{"V0"}: \eqn{\mathbf{V}_0}{V(0)} of size 2 * 2 .}}} 27 | 28 | \item{init_pars}{A list of unfitted parameters' initial values. The parameters are the same as \code{fixed_pars}. 29 | If the user does not assign initial values for the unfitted parameters, default ones will be used.} 30 | 31 | \item{verbose}{An integer specifying the print level of information during the algorithm (default \code{1}). 
Possible numbers: 32 | \itemize{\item{\code{"0"}: no output;} 33 | \item{\code{"1"}: show the iteration number and \eqn{\|\Delta \boldsymbol{\Theta}_i\|}{||\Delta \Theta(i)||};} 34 | \item{\code{"2"}: 1 + show the obtained parameters.}}} 35 | 36 | \item{control}{A list of control values of EM algorithm: 37 | \itemize{\item{\code{acceleration}: TRUE/FALSE indicating whether to use the accelerated EM algorithm (default TRUE);} 38 | \item{\code{maxit}: Maximum number of iterations (default \code{3000});} 39 | \item{\code{abstol}: Absolute tolerance for parameters' change \eqn{\|\Delta \boldsymbol{\Theta}_i\|}{||\Delta \Theta(i)||} as the stopping criteria (default \code{1e-4})} 40 | \item{\code{log_switch}: TRUE/FALSE indicating whether to record the history of convergence progress (defalut TRUE).}}} 41 | } 42 | \value{ 43 | A list of class "\code{volume_model}" with the following elements (if the algorithm converges): 44 | \itemize{\item{\code{par}: }{A list of parameters' fitted values.} 45 | \item{\code{init}: }{A list of valid initial values from users.} 46 | \item{\code{par_log}: }{A list of intermediate parameters' values if \code{log_switch = TRUE}.} 47 | \item{\code{converged}: }{A list of logical values indicating whether each parameter is fitted.} 48 | } 49 | } 50 | \description{ 51 | The main function for defining and fitting a univaraite state-space model on intraday trading volume. The model is proposed in (Chen et al., 2016) as 52 | \deqn{\mathbf{x}_{\tau+1} = \mathbf{A}_{\tau}\mathbf{x}_{\tau} + \mathbf{w}_{\tau},}{x(\tau+1) = A(\tau) x(\tau) + w(\tau),} 53 | \deqn{y_{\tau} = \mathbf{C}\mathbf{x}_{\tau} + \phi_{\tau} + v_\tau,}{y(\tau) = C x(\tau) + \phi(\tau) + v(\tau),} 54 | where 55 | \itemize{\item{\eqn{\mathbf{x}_{\tau} = [\eta_{\tau}, \mu_{\tau}]^\top}{x(\tau) = [\eta(\tau); \mu(\tau)]} is the hidden state vector containing the log daily component and the log intraday dynamic component;} 56 | \item{\eqn{\mathbf{A}_{\tau} = \left[\begin{array}{cc}a_{\tau}^{\eta}&0\\0&a^{\mu}\end{array} \right]}{A(\tau) = [a.\eta(\tau), 0; 0, a.\mu]} 57 | is the state transition matrix with \eqn{a_{\tau}^{\eta} = \left\{\begin{array}{cl}a^{\eta} & t = kI, k = 1,2,\dots\\0 & \textrm{otherwise};\end{array}\right.}{a.\eta(\tau) = a.\eta, when \tau = kI, k = 1, 2, ... , and zero otherwise;}} 58 | \item{\eqn{\mathbf{C} = [1, 1]}{C = [1, 1]} is the observation matrix;} 59 | \item{\eqn{\phi_{\tau}}{\phi(\tau)} is the corresponding element from \eqn{\boldsymbol{\phi} = [\phi_1,\dots, \phi_I]^\top}{\phi = [\phi(1); ... ; \phi(I)]}, which is the log seasonal component;} 60 | \item{\eqn{\mathbf{w}_{\tau} = [\epsilon_{\tau}^{\eta},\epsilon_{\tau}^{\mu}]^\top \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}_{\tau})}{w(\tau) = [\epsilon.\eta(\tau); \epsilon.\mu(\tau)] ~ N(0, Q(\tau))} 61 | represents the i.i.d. Gaussian noise in the state transition, with a time-varying covariance matrix 62 | \eqn{\mathbf{Q}_{\tau} = \left[\begin{array}{cc}(\sigma_{\tau}^{\eta})^2&0\\ 0&(\sigma^{\mu})^2\end{array} \right]}{Q(\tau) = [(\sigma.\eta(\tau))^2, 0; 0, (\sigma.\mu)^2]} 63 | and \eqn{\sigma_\tau^{\eta} = \left\{\begin{array}{cl} \sigma^{\eta} & t = kI, k = 1,2,\dots\\0 & \textrm{otherwise};\end{array}\right.}{\sigma.\eta(\tau) = \sigma.\eta, when \tau = kI, k = 1, 2, ... , and zero otherwise;}} 64 | \item{\eqn{v_\tau \sim \mathcal{N}(0, r)}{v(\tau) ~ N(0, r)} is the i.i.d. 
Gaussian noise in the observation;} 65 | \item{\eqn{\mathbf{x}_1}{x(1)} is the initial state at \eqn{\tau = 1}{\tau = 1}, and it follows \eqn{\mathcal{N}(\mathbf{x}_0, \mathbf{V}_0)}{N(x(0), V(0))}}.} 66 | In the model, \eqn{\boldsymbol{\Theta} = \left\{a^{\eta},a^{\mu},\sigma^{\eta},\sigma^{\mu},r,\boldsymbol{\phi}, \mathbf{x}_0, \mathbf{V}_0\right\}}{\Theta = {a.\eta, a.\mu, (\sigma.\eta)^2, (\sigma.\mu)^2, r, \phi, x(0), V(0)}} 67 | are treated as parameters. 68 | The model is fitted by expectation-maximization (EM) algorithms. The implementation follows (Chen et al., 2016), and the accelerated scheme is provided in (Varadhan and Roland, 2008). 69 | The algorithm terminates when \code{maxit} is reached or the condition \eqn{\|\Delta \boldsymbol{\Theta}_i\| \le \textrm{abstol}}{||\Delta \Theta(i)|| <= abstol} is satisfied. 70 | } 71 | \examples{ 72 | library(intradayModel) 73 | data(volume_aapl) 74 | volume_aapl_training <- volume_aapl[, 1:20] 75 | \donttest{ 76 | # fit model with no prior knowledge 77 | model_fit <- fit_volume(volume_aapl_training) 78 | } 79 | # fit model with fixed_pars and init_pars 80 | model_fit <- fit_volume(volume_aapl_training, fixed_pars = list(a_mu = 0.5, var_mu = 0.05), 81 | init_pars = list(a_eta = 0.5)) 82 | \donttest{ 83 | # fit model with other control options 84 | model_fit <- fit_volume(volume_aapl_training, verbose = 2, 85 | control = list(acceleration = FALSE, maxit = 1000, abstol = 1e-4, log_switch = FALSE)) 86 | } 87 | 88 | } 89 | \references{ 90 | Chen, R., Feng, Y., and Palomar, D. (2016). Forecasting intraday trading volume: A Kalman filter approach. Available at SSRN 3101695. 91 | 92 | Varadhan, R., and Roland, C. (2008). Simple and globally convergent methods for accelerating the convergence of any EM algorithm. 93 | Scandinavian Journal of Statistics, 35(2), 335–353. 94 | } 95 | \author{ 96 | Shengjie Xiu, Yifan Yu and Daniel P. Palomar 97 | } 98 | -------------------------------------------------------------------------------- /man/forecast_volume.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/use_model.R 3 | \name{forecast_volume} 4 | \alias{forecast_volume} 5 | \title{Forecast One-bin-ahead Intraday Volume} 6 | \usage{ 7 | forecast_volume(model, data, burn_in_days = 0) 8 | } 9 | \arguments{ 10 | \item{model}{A model object of class "\code{volume_model}" from \code{fit_volume()}.} 11 | 12 | \item{data}{An n_bin * n_day matrix or an \code{xts} object storing intraday volume.} 13 | 14 | \item{burn_in_days}{Number of initial days in the burn-in period. Samples from the first \code{burn_in_days} are used to warm up the model and then are discarded.} 15 | } 16 | \value{ 17 | A list containing the following elements: 18 | \itemize{ 19 | \item{\code{original_signal}: }{A vector of original intraday volume;} 20 | \item{\code{forecast_signal}: }{A vector of forecast intraday volume;} 21 | \item{\code{forecast_components}: }{A list of the three forecast components: daily, seasonal, intraday dynamic, and residual components.} 22 | \item{\code{error}: }{A list of three error measures: mae, mape, and rmse.} 23 | } 24 | } 25 | \description{ 26 | This function forecasts one-bin-ahead intraday volume. 27 | Its mathematical expression is \eqn{\hat{y}_{\tau+1} = E[y_{\tau+1}|\{y_{j}\}_{j=1}^{\tau}]}{y*(\tau+1) = E[y(\tau + 1) | y(j), j = 1, ... , \tau]}. 28 | It is a wrapper of \code{decompose_volume()} with \code{purpose = "forecast"}. 
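The \code{error} element follows the MAE/MAPE/RMSE definitions given in \code{\link{decompose_volume}}. As an illustrative check (assuming a result object named \code{forecast_result}, as in the examples below), MAPE can be recomputed from the returned vectors:
\preformatted{
mean(abs(forecast_result$forecast_signal - forecast_result$original_signal) /
       forecast_result$original_signal)
}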
29 | } 30 | \examples{ 31 | library(intradayModel) 32 | data(volume_aapl) 33 | volume_aapl_training <- volume_aapl[, 1:20] 34 | volume_aapl_testing <- volume_aapl[, 21:50] 35 | model_fit <- fit_volume(volume_aapl_training, fixed_pars = list(a_mu = 0.5, var_mu = 0.05), 36 | init_pars = list(a_eta = 0.5)) 37 | 38 | # forecast testing volume 39 | forecast_result <- forecast_volume(model_fit, volume_aapl_testing) 40 | 41 | # forecast testing volume with burn-in 42 | forecast_result <- forecast_volume(model_fit, volume_aapl[, 1:50], burn_in_days = 20) 43 | 44 | } 45 | \references{ 46 | Chen, R., Feng, Y., and Palomar, D. (2016). Forecasting intraday trading volume: A Kalman filter approach. Available at SSRN 3101695. 47 | } 48 | \author{ 49 | Shengjie Xiu, Yifan Yu and Daniel P. Palomar 50 | } 51 | -------------------------------------------------------------------------------- /man/generate_plots.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot.R 3 | \name{generate_plots} 4 | \alias{generate_plots} 5 | \title{Plot Analysis and Forecast Result} 6 | \usage{ 7 | generate_plots(analysis_forecast_result) 8 | } 9 | \arguments{ 10 | \item{analysis_forecast_result}{Analysis/forecast result from \code{decompose_volume()} or \code{forecast_volume()}.} 11 | } 12 | \value{ 13 | A list of \code{patchwork} objects: 14 | \itemize{ 15 | \item{\code{components}: }{Plot of components of intraday volume;} 16 | \item{\code{log_components}: }{Plot of components of intraday volume in their log10 scale;} 17 | \item{\code{original_and_smooth} / \code{original_and_forecast}: }{Plot of the original and the smooth/forecast intraday volume.}} 18 | } 19 | \description{ 20 | Generate plots for the analysis and forecast results. 21 | } 22 | \examples{ 23 | library(intradayModel) 24 | data(volume_aapl) 25 | volume_aapl_training <- volume_aapl[, 1:20] 26 | volume_aapl_testing <- volume_aapl[, 21:50] 27 | 28 | # obtain analysis and forecast result 29 | model_fit <- fit_volume(volume_aapl_training, fixed_pars = list(a_mu = 0.5, var_mu = 0.05), 30 | init_pars = list(a_eta = 0.5)) 31 | analysis_result <- decompose_volume(purpose = "analysis", model_fit, volume_aapl_training) 32 | forecast_result <- forecast_volume(model_fit, volume_aapl_testing) 33 | 34 | # plot the analysis and forecast result 35 | generate_plots(analysis_result) 36 | generate_plots(forecast_result) 37 | 38 | } 39 | \author{ 40 | Shengjie Xiu, Yifan Yu and Daniel P. Palomar 41 | } 42 | -------------------------------------------------------------------------------- /man/intradayModel-package.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/intradayModel-package.R 3 | \docType{package} 4 | \name{intradayModel-package} 5 | \alias{intradayModel-package} 6 | \title{intradayModel: Modeling and Forecasting Financial Intraday Signals} 7 | \description{ 8 | This package uses state-of-the-art state-space models to facilitate the modeling, analyzing and forecasting of 9 | financial intraday signals. It currently offers a univariate model for intraday trading volume, 10 | with new features on intraday volatility and multivariate models in development. 
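A typical workflow is sketched below (minimal and illustrative; see the individual function pages for details and tuning options):
\preformatted{
library(intradayModel)
data(volume_aapl)
model_fit <- fit_volume(volume_aapl[, 1:104])
forecast_result <- forecast_volume(model_fit, volume_aapl[, 105:124])
generate_plots(forecast_result)
}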
11 | } 12 | \section{Functions}{ 13 | 14 | \code{\link{fit_volume}}, 15 | \code{\link{decompose_volume}}, 16 | \code{\link{forecast_volume}}, 17 | \code{\link{generate_plots}} 18 | } 19 | 20 | \section{Data}{ 21 | 22 | \code{\link{volume_aapl}}, 23 | \code{\link{volume_fdx}} 24 | } 25 | 26 | \section{Help}{ 27 | 28 | For a quick help see the README file: 29 | \href{https://github.com/convexfi/intradayModel/blob/master/README.md}{GitHub-README}. 30 | } 31 | 32 | \author{ 33 | Shengjie Xiu, Yifan Yu and Daniel P. Palomar 34 | } 35 | -------------------------------------------------------------------------------- /man/volume_aapl.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/volume_aapl.R 3 | \docType{data} 4 | \name{volume_aapl} 5 | \alias{volume_aapl} 6 | \title{15-min Intraday Volume of AAPL} 7 | \format{ 8 | A 26 * 124 matrix. 9 | } 10 | \source{ 11 | \href{https://www.barchart.com}{barchart} 12 | } 13 | \usage{ 14 | data(volume_aapl) 15 | } 16 | \description{ 17 | A 26 * 124 matrix including 15-min trading volume of AAPL from 2019-01-02 to 2019-06-28. 18 | } 19 | \keyword{dataset} 20 | -------------------------------------------------------------------------------- /man/volume_fdx.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/volume_fdx.R 3 | \docType{data} 4 | \name{volume_fdx} 5 | \alias{volume_fdx} 6 | \title{15-min Intraday Volume of FDX} 7 | \format{ 8 | An \code{xts} object. 9 | } 10 | \source{ 11 | \href{https://www.barchart.com}{barchart} 12 | } 13 | \usage{ 14 | data(volume_fdx) 15 | } 16 | \description{ 17 | An \code{xts} object including 15-min trading volume of FDX from 2019-07-01 to 2019-12-31. 
18 | } 19 | \keyword{dataset} 20 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(intradayModel) 3 | 4 | Sys.setenv(NOT_CRAN='skip') 5 | test_check("intradayModel") 6 | -------------------------------------------------------------------------------- /tests/testthat/fixtures/ACN_expected_par: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/ACN_expected_par -------------------------------------------------------------------------------- /tests/testthat/fixtures/ACN_expected_pred: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/ACN_expected_pred -------------------------------------------------------------------------------- /tests/testthat/fixtures/ACN_volume: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/ACN_volume -------------------------------------------------------------------------------- /tests/testthat/fixtures/ADBE_expected_par: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/ADBE_expected_par -------------------------------------------------------------------------------- /tests/testthat/fixtures/ADBE_expected_pred: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/ADBE_expected_pred -------------------------------------------------------------------------------- /tests/testthat/fixtures/ADBE_volume: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/ADBE_volume -------------------------------------------------------------------------------- /tests/testthat/fixtures/CVS_expected_par: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/CVS_expected_par -------------------------------------------------------------------------------- /tests/testthat/fixtures/CVS_expected_pred: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/CVS_expected_pred -------------------------------------------------------------------------------- /tests/testthat/fixtures/CVS_volume: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/CVS_volume -------------------------------------------------------------------------------- /tests/testthat/fixtures/GE_expected_par: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/GE_expected_par -------------------------------------------------------------------------------- /tests/testthat/fixtures/GE_expected_pred: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/convexfi/intradayModel/5509ab033d8504bf49ed58bf4bb8caf9ee556d29/tests/testthat/fixtures/GE_expected_pred -------------------------------------------------------------------------------- /tests/testthat/test-auxiliary.R: -------------------------------------------------------------------------------- 1 | test_that("specify_uniss works with partial fixed params", { 2 | skip_on_cran() 3 | data <- readRDS(test_path("fixtures", "ADBE_volume"))[,1:104] 4 | n_bin <- nrow(data) 5 | n_day <- ncol(data) 6 | n_bin_total <- n_bin * n_day 7 | ## reform data 8 | data_reform <- log(data) %>% 9 | as.list() %>% 10 | unlist() 11 | 12 | fixed_pars <- list() 13 | fixed_pars$"a_mu" <- 1 14 | fixed_pars$"var_eta" <- 4 15 | fixed_pars$"x0" <- matrix(0,2) 16 | fixed_pars$"phi" <- matrix(2, n_bin) 17 | model_test <- spec_volume_model(fixed_pars = fixed_pars) 18 | 19 | # predefined model 20 | model_predefined <- list() 21 | 22 | model_predefined$par <- list("a_eta" = NA, "a_mu" = 1, "var_eta" = 4, "var_mu" = NA, 23 | "r" = NA, "phi"= rep(2, n_bin), 24 | "x0" = c(0,0), "V0" = rep(NA, 3)) 25 | model_predefined$init <- list() 26 | model_predefined$converged <- list("a_eta" = FALSE, "a_mu" = TRUE, "var_eta" = TRUE, "var_mu" = FALSE, 27 | "r" = FALSE, "phi" = TRUE, "x0" = TRUE, "V0" = FALSE) 28 | 29 | class(model_predefined) <- "volume_model" 30 | expect_equal(model_test, model_predefined) 31 | }) 32 | 33 | test_that("cleanParsList works", { 34 | skip_on_cran() 35 | fixed_pars <- list() 36 | fixed_pars$"a_mu" <- NA 37 | fixed_pars$"var_eta" <- 4 38 | fixed_pars$"x0" <- matrix(0, 2) 39 | fixed_pars$"V0" <- matrix(c(1,2,3,4), 2) 40 | 41 | fixed_pars <- clean_pars_list(fixed_pars)$input_list 42 | 43 | predefined.pars <- list() 44 | predefined.pars$"var_eta" <- 4 45 | predefined.pars$"x0" <- c(0, 0) 46 | 47 | expect_equal(fixed_pars, predefined.pars) 48 | }) 49 | -------------------------------------------------------------------------------- /tests/testthat/test-message.R: -------------------------------------------------------------------------------- 1 | data(volume_aapl) 2 | 3 | test_that("is_volume_model works", { 4 | skip_on_cran() 5 | data <- volume_aapl 6 | n_bin <- 26 7 | fixed_pars <- list() 8 | fixed_pars$"a_mu" <- 1 9 | fixed_pars$"var_eta" <- 4 10 | fixed_pars$"x0" <- matrix(0,2) 11 | fixed_pars$"phi" <- matrix(2, n_bin) 12 | modelSpec <- spec_volume_model(fixed_pars = fixed_pars) 13 | 14 | modelSpec_check1 <- modelSpec[c("par", "init")] 15 | expect_error(is_volume_model(modelSpec_check1), "Elements converged are missing from the model.\n") 16 | 17 | modelSpec_check2 <- modelSpec 18 | modelSpec_check2$par[["x0"]] <- NULL 19 | expect_error(is_volume_model(modelSpec_check2),"Elements x0 are missing from volume_model[$]par.\n") 20 | 21 | modelSpec_check3 <- modelSpec 22 | modelSpec_check3$converged[["var_eta"]] <- FALSE 23 | expect_error(is_volume_model(modelSpec_check3), "volume_model[$]par[$]var_eta and volume_model[$]converged[$]var_eta are conflicted.\n") 24 | modelSpec_check3$converged[["a_eta"]] <- Inf 25 | expect_error(is_volume_model(modelSpec_check3), 
"Elements in volume_model[$]converged must be TRUE/FALSE.\n") 26 | modelSpec_check3$converged[["a_eta"]] <- NA 27 | expect_error(is_volume_model(modelSpec_check3), "Elements in volume_model[$]converged must be TRUE/FALSE.\n") 28 | 29 | 30 | modelSpec_check4 <- modelSpec 31 | modelSpec_check4$par[["x0"]] <- 1 32 | modelSpec_check4$par[["var_eta"]] <- array(c(1,2)) 33 | error_message <- paste("Length of volume_model[$]par[$]var_eta is wrong.\n") 34 | expect_error(is_volume_model(modelSpec_check4, 25), error_message) 35 | }) 36 | 37 | test_that("forecast_volume_model/Smooth works", { 38 | skip_on_cran() 39 | data <- volume_aapl 40 | n_bin <- 26 41 | fixed_pars <- list() 42 | fixed_pars$"a_mu" <- 1 43 | fixed_pars$"var_eta" <- 4 44 | fixed_pars$"x0" <- matrix(0,2) 45 | fixed_pars$"phi" <- matrix(2, n_bin) 46 | modelSpec <- spec_volume_model(fixed_pars = fixed_pars) 47 | expect_error(smooth_volume_model(data, modelSpec), 48 | regexp = "All parameters must be optimally fitted. Parameters a_eta, var_mu, r, V0 are not optimally fitted.") 49 | 50 | 51 | fixed_pars$"r" <- 1 52 | fixed_pars$"a_eta" <- 1 53 | fixed_pars$"var_mu" <- 4 54 | fixed_pars$"V0" <- c(1,0,0,1) 55 | modelSpec <- spec_volume_model(fixed_pars = fixed_pars) 56 | expect_error(forecast_volume_model(data, modelSpec, 300), 57 | regexp = "out_sample must be smaller than the number of columns in data matrix.") 58 | 59 | }) 60 | 61 | test_that("spec_volume_model message", { 62 | skip_on_cran() 63 | init_pars <- list() 64 | init_pars$"a_eta" <- 1 65 | init_pars$"x0" <- matrix(0, 2, 2) 66 | init_pars$"V0" <- matrix(1, 4) 67 | init_pars$"xxx" <- 3 68 | 69 | fixed_pars <- list() 70 | fixed_pars$"a_mu" <- NA 71 | fixed_pars$"var_eta" <- 4 72 | fixed_pars$"x0" <- matrix(Inf, 2) 73 | fixed_pars$"V0" <- matrix(c(1,0,0,1), nrow= 2) 74 | 75 | # predefinde_model <- list() 76 | # predefinde_model$par <- list("a_eta" = NA, "a_mu" = NA, 77 | # "var_eta" = 4, "var_mu" = NA, 78 | # "r" = NA, "phi" = NA, 79 | # "x0" = rep(NA, 2), 80 | # "V0" = c(1,0,1)) 81 | # predefinde_model$init <- list("a_eta" = 1) 82 | # predefinde_model$converged <- list("a_eta" = TRUE, "a_mu" = TRUE, 83 | # "var_eta" = FALSE, "var_mu" = TRUE, 84 | # "r" = TRUE, "phi" = TRUE, 85 | # "x0" = TRUE, "V0" = FALSE) 86 | 87 | warning_message <- paste("Warnings in fixed_pars:\n"," Elements a_mu, x0 are invalid [(]check number/dimension/PSD[)].\n", 88 | "Warnings in init_pars:\n"," Elements xxx are not allowed in parameter list.\n", 89 | " Elements x0 are invalid [(]check number/dimension/PSD[)].\n"," Elements V0 have already been fixed." ,sep = "") 90 | # expect_output(spec_volume_model(init_pars = init_pars, fixed_pars = fixed_pars), warning_message) 91 | expect_warning(spec_volume_model(init_pars = init_pars, fixed_pars = fixed_pars), warning_message) 92 | 93 | }) 94 | 95 | test_that("fit_volume message", { 96 | skip_on_cran() 97 | data <- volume_aapl 98 | data_train <- volume_aapl[, 1:104] 99 | data_error_test <- data_train 100 | data_error_test[1,1] <- NA 101 | fixed_pars <- list( 102 | "x0" = c(0, 0), 103 | "a_eta" = 1, "a_mu" = 0, 104 | "r" = 1e-4, 105 | "var_eta" = 1e-4, "var_mu" = 1e-4, 106 | "V0" = c(1, 0,0,1), 107 | "phi" = rep(2,26) 108 | ) 109 | 110 | expect_warning(fit_volume(data_train, control = list(maxit = 1)), 111 | regexp = "Warning! Reached maxit before parameters converged. Maxit was 1.\n") 112 | expect_output(fit_volume(data_train,verbose = 1, control = list(maxit = 1000, acceleration = TRUE)), 113 | regexp = "Success! 
abstol test passed at") 114 | expect_error(fit_volume(c(1,1)), regexp = "data must be matrix or xts.") 115 | expect_output(fit_volume(data, fixed_pars = fixed_pars, verbose = 1), "All parameters have already been fixed.") 116 | 117 | # modelSpec.fit_acc <- fit_volume(data_train, modelSpec, maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = TRUE, verbose = 0) 118 | # 119 | # expect_output(fit_volume(data_train, modelSpec.fit_acc), "All parameters have already been fixed.") 120 | # 121 | 122 | }) 123 | 124 | test_that("clean_data message", { 125 | skip_on_cran() 126 | data_error_test <- volume_aapl 127 | data_error_test[1,1] <- NA 128 | data_error_test[2,3] <- NA 129 | expect_warning(clean_data(data_error_test),"For input matrix:\n Remove trading days with missing bins: 2019-01-02, 2019-01-04.\n") 130 | 131 | data("volume_fdx") 132 | expect_warning(clean_data(volume_fdx),"For input xts:\n Remove trading days with missing bins: 2019-07-03, 2019-11-29, 2019-12-24.\n") 133 | }) 134 | -------------------------------------------------------------------------------- /tests/testthat/test-uniModelFit.R: -------------------------------------------------------------------------------- 1 | test_that("fit_volume from raw (after zero constraint and initial noise), stock = ADBE", { 2 | skip_on_cran() 3 | data <- readRDS(test_path("fixtures", "ADBE_volume"))[,1:104] 4 | modelSpec.fit <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = FALSE)) 5 | modelSpec.fit_acc <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = TRUE)) 6 | 7 | # expected output 8 | expected_par <- readRDS(test_path("fixtures", "ADBE_expected_par")) 9 | expected_modelSpec <- list() 10 | expected_modelSpec$par$a_eta <- expected_par$B[1] 11 | expected_modelSpec$par$a_mu <- expected_par$B[2] 12 | expected_modelSpec$par$var_eta <- expected_par$Q[1] 13 | expected_modelSpec$par$var_mu <- expected_par$Q[2] 14 | expected_modelSpec$par$r <- expected_par$R[1] 15 | expected_modelSpec$par$phi <- as.vector(expected_par$A) 16 | expected_modelSpec$par$x0 <- as.vector(expected_par$x0) 17 | expected_modelSpec$par$V0 <- as.vector(expected_par$V0) 18 | 19 | compared_par <- c("a_eta", "a_mu", "var_eta", "var_mu", "r", "phi") 20 | expect_equal(modelSpec.fit$par[compared_par], expected_modelSpec$par[compared_par], tolerance = 5e-2) 21 | expect_equal(modelSpec.fit_acc$par[compared_par], expected_modelSpec$par[compared_par], tolerance = 5e-2) 22 | 23 | plot(fetch_par_log(modelSpec.fit$par_log, "a_eta")[1, ]) 24 | plot(fetch_par_log(modelSpec.fit_acc$par_log, "a_eta")[1, ]) 25 | }) 26 | 27 | test_that("fit_volume from raw (after zero constraint and initial noise), stock = ACN", { 28 | skip_on_cran() 29 | data <- readRDS(test_path("fixtures", "ACN_volume"))[,1:104] 30 | modelSpec.fit <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = FALSE)) 31 | modelSpec.fit_acc <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = TRUE)) 32 | 33 | # expected output 34 | expected_par <- readRDS(test_path("fixtures", "ACN_expected_par")) 35 | expected_modelSpec <- list() 36 | expected_modelSpec$par$a_eta <- expected_par$B[1] 37 | expected_modelSpec$par$a_mu <- expected_par$B[2] 38 | expected_modelSpec$par$var_eta <- expected_par$Q[1] 39 | expected_modelSpec$par$var_mu <- expected_par$Q[2] 40 | expected_modelSpec$par$r <- expected_par$R[1] 41 | 42 | expected_modelSpec$par$phi <- 
as.vector(expected_par$A) 43 | expected_modelSpec$par$x0 <- as.vector(expected_par$x0) 44 | expected_modelSpec$par$V0 <- as.vector(expected_par$V0) 45 | 46 | compared_par <- c("a_eta", "a_mu", "var_eta", "var_mu", "r", "phi") 47 | expect_equal(modelSpec.fit$par[compared_par], expected_modelSpec$par[compared_par], tolerance = 5e-2) 48 | expect_equal(modelSpec.fit_acc$par[compared_par], expected_modelSpec$par[compared_par], tolerance = 5e-2) 49 | 50 | plot(fetch_par_log(modelSpec.fit$par_log, "a_eta")[1, ]) 51 | plot(fetch_par_log(modelSpec.fit_acc$par_log, "a_eta")[1, ]) 52 | }) 53 | 54 | test_that("fit_volume from raw (after zero constraint and initial noise), stock = CVS", { 55 | skip_on_cran() 56 | data <- readRDS(test_path("fixtures", "CVS_volume"))[, 1:104] 57 | modelSpec.fit <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = FALSE)) 58 | modelSpec.fit_acc <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = TRUE)) 59 | 60 | # expected output 61 | expected_par <- readRDS(test_path("fixtures", "CVS_expected_par")) 62 | expected_modelSpec <- list() 63 | expected_modelSpec$par$a_eta <- expected_par$B[1] 64 | expected_modelSpec$par$a_mu <- expected_par$B[2] 65 | expected_modelSpec$par$var_eta <- expected_par$Q[1] 66 | expected_modelSpec$par$var_mu <- expected_par$Q[2] 67 | expected_modelSpec$par$r <- expected_par$R[1] 68 | expected_modelSpec$par$phi <- as.vector(expected_par$A) 69 | expected_modelSpec$par$x0 <- as.vector(expected_par$x0) 70 | expected_modelSpec$par$V0 <- as.vector(expected_par$V0) 71 | 72 | compared_par <- c("a_eta", "a_mu", "var_eta", "var_mu", "r", "phi") 73 | expect_equal(modelSpec.fit$par[compared_par], expected_modelSpec$par[compared_par], tolerance = 5e-2) 74 | expect_equal(modelSpec.fit_acc$par[compared_par], expected_modelSpec$par[compared_par], tolerance = 5e-2) 75 | 76 | plot(fetch_par_log(modelSpec.fit$par_log, "a_mu")[1, ]) 77 | plot(fetch_par_log(modelSpec.fit_acc$par_log, "a_mu")[1, ]) 78 | }) 79 | -------------------------------------------------------------------------------- /tests/testthat/test-uniModelUse.R: -------------------------------------------------------------------------------- 1 | test_that("forecast_volume_model, stock = ADBE", { 2 | skip_on_cran() 3 | data.pred <- readRDS(test_path("fixtures", "ADBE_volume")) 4 | data <- data.pred[,1:104] 5 | 6 | modelSpec.fit <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = FALSE)) 7 | modelSpec.fit_acc <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = TRUE)) 8 | 9 | log_volume_pred <- log(decompose_volume("forecast", modelSpec.fit, data.pred, 104)$forecast_signal) 10 | log_volume_pred_acc <- log(decompose_volume("forecast", modelSpec.fit_acc,data.pred, 104)$forecast_signal) 11 | log_volume_real <- log(utils::tail(as.vector(data.pred), 26 * 20)) 12 | 13 | mae <-calculate_mae(log_volume_real, log_volume_pred) 14 | mape <- calculate_mape(log_volume_real, log_volume_pred) 15 | rmse <-calculate_rmse(log_volume_real, log_volume_pred) 16 | 17 | expected_res <- readRDS(test_path("fixtures", "ADBE_expected_pred")) 18 | 19 | expect_equal(log_volume_pred, expected_res$log_volume_pred, tolerance = 1e-3) 20 | expect_equal(log_volume_pred_acc, expected_res$log_volume_pred, tolerance = 1e-3) 21 | expect_equal(c(mae, mape, rmse), c(expected_res$mae, expected_res$mape, expected_res$rmse), tolerance = 1e-4) 22 | }) 23 | 24 | 
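# The remaining tests repeat the one-bin-ahead forecast check for other stocks
# (ACN and CVS): fit the model on the first 104 days, forecast the last 20 days
# (26 bins per day), and compare the log-volume forecasts and error metrics
# (MAE, MAPE, RMSE) against the stored fixtures.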
test_that("forecast_volume_model, ACN", { 25 | skip_on_cran() 26 | data.pred <- readRDS(test_path("fixtures", "ACN_volume")) 27 | data <- data.pred[,1:104] 28 | 29 | modelSpec.fit <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = FALSE)) 30 | modelSpec.fit_acc <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = TRUE)) 31 | 32 | log_volume_pred <- log(decompose_volume("forecast", modelSpec.fit, data.pred, 104)$forecast_signal) 33 | log_volume_pred_acc <- log(decompose_volume("forecast", modelSpec.fit_acc,data.pred, 104)$forecast_signal) 34 | log_volume_real <- log(utils::tail(as.vector(as.matrix(data.pred)), 26 * 20)) 35 | 36 | mae <-calculate_mae(log_volume_real, log_volume_pred) 37 | mape <- calculate_mape(log_volume_real, log_volume_pred) 38 | rmse <-calculate_rmse(log_volume_real, log_volume_pred) 39 | 40 | expected_res <- readRDS(test_path("fixtures", "ACN_expected_pred")) 41 | 42 | expect_equal(log_volume_pred, expected_res$log_volume_pred, tolerance = 1e-4) 43 | expect_equal(log_volume_pred_acc, expected_res$log_volume_pred, tolerance = 1e-4) 44 | expect_equal(c(mae, mape, rmse), c(expected_res$mae, expected_res$mape, expected_res$rmse), tolerance = 5e-4) 45 | }) 46 | 47 | test_that("forecast_volume_model, stock = CVS", { 48 | skip_on_cran() 49 | data.pred <- readRDS(test_path("fixtures", "CVS_volume")) 50 | data <- data.pred[,1:104] 51 | 52 | modelSpec.fit <- fit_volume(data, control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = FALSE)) 53 | 54 | modelSpec.fit_acc <- fit_volume(data, init_pars = list(a_mu = 0), control = list(maxit = 1000, abstol = 1e-4, log_switch = TRUE, acceleration = TRUE)) 55 | 56 | log_volume_pred <- log(decompose_volume("forecast", modelSpec.fit, data.pred, 104)$forecast_signal) 57 | log_volume_pred_acc <- log(decompose_volume("forecast", modelSpec.fit_acc,data.pred, 104)$forecast_signal) 58 | log_volume_real <- log(utils::tail(as.vector(data.pred), 26 * 20)) 59 | 60 | mae <-calculate_mae(log_volume_real, log_volume_pred) 61 | mape <- calculate_mape(log_volume_real, log_volume_pred) 62 | rmse <-calculate_rmse(log_volume_real, log_volume_pred) 63 | 64 | expected_res <- readRDS(test_path("fixtures", "CVS_expected_pred")) 65 | 66 | expect_equal(log_volume_pred, expected_res$log_volume_pred, tolerance = 1e-3) 67 | expect_equal(log_volume_pred_acc, expected_res$log_volume_pred, tolerance = 1e-3) 68 | expect_equal(c(mae, mape, rmse), c(expected_res$mae, expected_res$mape, expected_res$rmse), tolerance = 1e-3) 69 | }) 70 | -------------------------------------------------------------------------------- /vignettes/.gitignore: -------------------------------------------------------------------------------- 1 | *.RData -------------------------------------------------------------------------------- /vignettes/apalike.csl: -------------------------------------------------------------------------------- 1 | 2 | 588 | -------------------------------------------------------------------------------- /vignettes/intradayModel.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "intradayModel: Modeling and Forecasting Financial Intraday Signals" 3 | author: | 4 | | Shengjie Xiu, Yifan Yu, and Daniel P. 
Palomar 5 | | The Hong Kong University of Science and Technology (HKUST) 6 | date: "2023-05-19" 7 | output: 8 | cleanrmd::html_document_clean: 9 | theme: "bamboo" 10 | mathjax: default 11 | toc: true 12 | toc_depth: 2 13 | csl: apalike.csl 14 | bibliography: reference.bib 15 | link-citations: yes 16 | vignette: > 17 | %\VignetteIndexEntry{Intraday Volume and Volatility: Modeling and Forecasting} 18 | %\VignetteEngine{knitr::rmarkdown} 19 | %\VignetteEncoding{UTF-8} 20 | %\VignetteKeyword{State-space Model} 21 | --- 22 | 23 | ```{r, echo=FALSE, warning=FALSE} 24 | library(knitr) 25 | opts_chunk$set( 26 | collapse = TRUE, 27 | comment = "#>", 28 | fig.align = "center", 29 | fig.retina = 2, 30 | out.width = "100%", 31 | dpi = 96 , 32 | pngquant = "--speed=1" 33 | ) 34 | knit_hooks$set(pngquant = hook_pngquant) # brew install pngquant 35 | ``` 36 | 37 | ```{css, echo = FALSE} 38 | /* ensure cleanrmd is centered */ 39 | body { 40 | margin: 0 auto; 41 | max-width: 1000px; 42 | padding: 2rem; 43 | } 44 | 45 | /* math is smaller */ 46 | .math { 47 | font-size: small; 48 | } 49 | 50 | /* set reference spacing in cleanrmd */ 51 | .references>div:first-child{ 52 | margin-bottom: 1.6em; 53 | } 54 | ``` 55 | 56 | ------------------------------------------------------------------------ 57 | 58 | > Welcome to the `intradayModel` package! This vignette provides an overview of the package's features and how to use them. `intradayModel` uses state-space models to model and forecast financial intraday signal, with a focus on intraday trading volume. Our team is currently working on expanding the package to include more support for intraday volatility. 59 | 60 | # Quick start 61 | 62 | To get started, we load our package and sample data: the 15-minute intraday trading volume of AAPL from 2019-01-02 to 2019-06-28, covering 124 trading days. We use the first 104 trading days for fitting, and the last 20 days for evaluation of forecasting performance. 63 | 64 | ```{r, message = FALSE} 65 | library(intradayModel) 66 | data(volume_aapl) 67 | volume_aapl[1:5, 1:5] # print the head of data 68 | 69 | volume_aapl_training <- volume_aapl[, 1:104] 70 | volume_aapl_testing <- volume_aapl[, 105:124] 71 | ``` 72 | 73 | Next, we fit a univariate state-space model using `fit_volume( )` function. 74 | 75 | ```{r} 76 | model_fit <- fit_volume(volume_aapl_training) 77 | ``` 78 | 79 | Once the model is fitted, we can analyze the hidden components of any intraday volume based on all its observations. By calling `decompose_volume( )` function with `purpose = "analysis"`, we obtain the smoothed daily, seasonal, and intraday dynamic components. It involves incorporating both past and future observations to refine the state estimates. 80 | 81 | ```{r, out.width="100%"} 82 | analysis_result <- decompose_volume(purpose = "analysis", model_fit, volume_aapl_training) 83 | 84 | # visualization 85 | plots <- generate_plots(analysis_result) 86 | plots$log_components 87 | ``` 88 | 89 | To see how well our model performs on new data, we call `forecast_volume( )` function to do one-bin-ahead forecast on the testing set. 90 | 91 | ```{r, out.width="100%"} 92 | forecast_result <- forecast_volume(model_fit, volume_aapl_testing) 93 | 94 | # visualization 95 | plots <- generate_plots(forecast_result) 96 | plots$original_and_forecast 97 | ``` 98 | 99 | Now that you have a quick start on using the package, let's explore the details and dive deeper into its functionalities and features. 
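Before diving in, the forecast accuracy above can also be summarized numerically. The following sketch computes the error measures introduced later in this vignette (MAE, MAPE, and RMSE) on the log-volume scale; it assumes that `forecast_result$forecast_signal` holds the one-bin-ahead volume forecasts in the same column-major (bin-by-bin, day-by-day) order as `volume_aapl_testing`.

```{r}
# sketch: one-bin-ahead forecast errors on the log-volume scale
# (assumes forecast_result$forecast_signal matches as.vector(volume_aapl_testing))
y_hat <- log(forecast_result$forecast_signal)
y     <- log(as.vector(volume_aapl_testing))
c(MAE  = mean(abs(y_hat - y)),
  MAPE = mean(abs(y_hat - y) / y),
  RMSE = sqrt(mean((y_hat - y)^2)))
```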
100 | 101 | &nbsp; 102 | 103 | # Usage of the package 104 | 105 | ## Preliminary theory 106 | 107 | Intraday observations of trading volume are divided into days, indexed by $t\in\{1,\dots,T\}$. Each day is further divided into bins, indexed by $i\in\{1,\dots,I\}$. To refer to a specific observation, we use the index $\tau = I \times (t-1) + i$. 108 | 109 | Our package uses a state-space model to extract several components of intraday volume. These components include the daily component, which adjusts the mean level of the time series; the seasonal component, which captures the U-shaped intraday periodic pattern; and the intraday dynamic component, which represents movements within a day. 110 | 111 | 112 | The observed intraday volume can be written as a multiplicative combination of the components [@brownlees2011intra]: 113 | 114 | $$ 115 | \large 116 | \text{intraday volume} = \text{daily} \times \text{seasonal} \times \text{intraday dynamic} \times \text{noise}. \tag{1} 117 | \small 118 | $$ 119 | 120 | Alternatively, after a logarithm transform, the intraday volume can also be regarded as an additive combination of these components: 121 | 122 | $$ 123 | \large 124 | y_{\tau} = \eta_{\tau} + \phi_i + \mu_{t,i} + v_{t,i}. \tag{2} 125 | \small 126 | $$ 127 | 128 | The state-space model proposed by [@chen2016forecasting] is built on Equation (2) as 129 | $$ 130 | \large 131 | \begin{aligned} 132 | \mathbf{x}_{\tau+1} &= \mathbf{A}_{\tau}\mathbf{x}_{\tau} + \mathbf{w}_{\tau},\\ 133 | y_{\tau} &= \mathbf{C}\mathbf{x}_{\tau} + \phi_{\tau} + v_\tau, 134 | \end{aligned} 135 | \tag{3} 136 | \small 137 | $$ 138 | where 139 | 140 | - $\mathbf{x}_{\tau} = [\eta_{\tau}, \mu_{\tau}]^\top$ is the hidden state vector containing the log daily component and the log intraday dynamic component; 141 | 142 | - $\mathbf{A}_{\tau} = \left[\begin{array}{cc}a_{\tau}^{\eta}&0\\0&a^{\mu}\end{array} \right]$ is the state transition matrix with $a_{\tau}^{\eta} = \begin{cases}a^{\eta}&\tau = kI, k = 1,2,\dots\\1&\text{otherwise};\end{cases}$ 143 | 144 | - $\mathbf{C} = [1, 1]$ is the observation matrix; 145 | 146 | - $\phi_{\tau}$ is the corresponding element from $\boldsymbol{\phi} = [\phi_1,\dots, \phi_I]^\top$, which is the log seasonal component; 147 | 148 | - $\mathbf{w}_{\tau} = \left[\epsilon_{\tau}^{\eta},\epsilon_{\tau}^{\mu}\right]^\top \sim \mathcal{N}(\mathbf{0}, \mathbf{Q}_{\tau})$ represents the i.i.d. Gaussian noise in the state transition, with a time-varying covariance matrix $\mathbf{Q}_{\tau} = \left[\begin{array}{cc}(\sigma_\tau^{\eta})^2&0\\0&(\sigma^{\mu})^2\end{array} \right]$ and $\sigma_\tau^{\eta} = \begin{cases}\sigma^{\eta}&\tau = kI, k = 1,2,\dots\\0&\text{otherwise};\end{cases}$ 149 | 150 | - $v_\tau \sim \mathcal{N}(0, r)$ is the i.i.d. Gaussian noise in the observation; 151 | 152 | - $\mathbf{x}_1$ is the initial state at $\tau = 1$, and it follows $\mathcal{N}(\mathbf{x}_0, \mathbf{V}_0)$. 153 | 154 | In this model, $\boldsymbol{\Theta} = \{a^{\eta}, a^{\mu}, (\sigma^{\eta})^2, (\sigma^{\mu})^2, r, \boldsymbol{\phi}, \mathbf{x}_0, \mathbf{V}_0 \}$ are treated as parameters. 155 | 156 | ## Datasets 157 | 158 | Two data classes of intraday volume are supported: 159 | 160 | - a 2D numeric matrix of size `(n_bin, n_day)`; 161 | 162 | - an xts object. 163 | 164 | To help you get started, we provide two sample datasets: a matrix-class `volume_aapl` and an xts-class `volume_fdx`. Here, we elaborate on the latter one.
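Both classes can be passed to the package functions directly, so no manual conversion is needed. Purely for illustration, the following sketch shows how an xts object such as `volume_fdx` could be reshaped into the `(n_bin, n_day)` matrix form by hand; it assumes 26 bins per regular trading day (as in the sample datasets) and simply drops days with missing bins.

```{r, warning=FALSE}
data(volume_fdx)
# sketch: reshape an xts of intraday volume into a (n_bin, n_day) matrix,
# keeping only days with a complete set of 26 bins
bins_by_day <- split(as.numeric(volume_fdx), as.Date(zoo::index(volume_fdx)))
full_days   <- Filter(function(x) length(x) == 26, bins_by_day)
volume_fdx_mat <- do.call(cbind, full_days)  # columns are named by trading day
dim(volume_fdx_mat)
```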
165 | 166 | ```{r, warning=FALSE} 167 | data(volume_fdx) 168 | head(volume_fdx) 169 | tail(volume_fdx) 170 | ``` 171 | 172 | ## Fitting 173 | 174 | > **fit_volume**(data, fixed_pars = NULL, init_pars = NULL, verbose = 0, control = NULL) 175 | 176 | To fit a univariate state-space model on intraday volume, you should use `fit_volume( )` function. If you want to fix some parameters to specific values, you can provide a list of values to `fixed_pars`. If you have prior knowledge of the initial values for the unfitted parameters, you can provide them through `init_pars`. In addition, `verbose` controls the level of printed output, and more control options can be set via `control`. 177 | 178 | The fitting process stops when either the maximum number of iterations is reached or the termination criterion $\|\Delta \boldsymbol{\Theta}_i\| \le \text{abstol}$ is met. 179 | 180 | The following code shows how to fit the model to the FDX stock. 181 | 182 | ```{r} 183 | # set fixed value 184 | fixed_pars <- list() 185 | fixed_pars$"x0" <- c(13.33, -0.37) 186 | 187 | # set initial value 188 | init_pars <- list() 189 | init_pars$"a_eta" <- 1 190 | 191 | volume_fdx_training <- volume_fdx['2019-07-01/2019-11-30'] 192 | model_fit <- fit_volume(volume_fdx_training, verbose = 2, control = list(acceleration = TRUE)) 193 | ``` 194 | 195 | Trading days with missing bins are automatically removed. They are 2019-07-03 (Independence Day) and 2019-11-29 (Thanksgiving Day), which had early market closes. 196 | 197 | ## Decomposition 198 | 199 | > **decompose_volume**(purpose, model, data, burn_in_days = 0) 200 | 201 | `decompose_volume( )` function allows you to decompose the intraday volume into its daily, seasonal, and intraday dynamic components. 202 | 203 | With `purpose = "analysis"`, it applies Kalman smoothing to estimate the hidden states given all the available observations. The daily component and intraday dynamic component at time $\tau$ are the smoothed state estimates conditioned on all the data, denoted by $\mathbb{E}[\mathbf{x}_{\tau}|\{y_{j}\}_{j=1}^{M}]$, where $M$ is the total number of bins in the dataset. The seasonal component takes the value of $\boldsymbol{\phi}$. 204 | 205 | 206 | ```{r} 207 | analysis_result <- decompose_volume(purpose = "analysis", model_fit, volume_fdx_training) 208 | 209 | str(analysis_result) 210 | ``` 211 | 212 | Function `generate_plots( )` visualizes the smoothed components and the smoothing performance. 213 | 214 | ```{r} 215 | plots <- generate_plots(analysis_result) 216 | plots$log_components 217 | plots$original_and_smooth 218 | ``` 219 | 220 | 221 | With `purpose = "forecast"`, it applies Kalman forecasting to estimate the one-bin-ahead hidden state based on the available observations, which is mathematically denoted by $\mathbb{E}[\mathbf{x}_{\tau+1}|\{y_{j}\}_{j=1}^{\tau}]$. Details can be found in the next subsection. 222 | 223 | This function also helps to evaluate the model performance with the following measures: 224 | 225 | - Mean absolute error (MAE): $\frac{1}{M}\sum_{\tau=1}^M\lvert\hat{y}_\tau - y_\tau\rvert$. 226 | 227 | - Mean absolute percent error (MAPE): $\frac{1}{M}\sum_{\tau=1}^M\frac{\lvert\hat{y}_\tau - y_\tau\rvert}{y_\tau}$. 228 | 229 | - Root mean square error (RMSE): $\sqrt{\sum_{\tau=1}^M\frac{\left(\hat{y}_\tau - y_\tau\right)^2}{M}}$. 230 | 231 | ## Forecasting 232 | 233 | > **forecast_volume**(model, data, burn_in_days = 0) 234 | 235 | `forecast_volume( )` function is a wrapper of `decompose_volume(purpose = "forecast", ...)`.
It forecasts the one-bin-ahead intraday volume on a new dataset. The one-bin-ahead forecast is mathematically denoted by $\hat{y}_{\tau+1} = \mathbb{E}[y_{\tau+1}|\{y_{j}\}_{j=1}^{\tau}]$. 236 | 237 | When encountering a new dataset with different statistical characteristics or from different stocks, the state space model may not initially start in an optimal state. To address this, the first `burn_in_days` days in the data can be utilized to warm up the Kalman filter, allowing it to reach the desired state. These initial days will be discarded after initialization. 238 | 239 | 240 | ```{r} 241 | # use training data for burn-in 242 | forecast_result <- forecast_volume(model_fit, volume_fdx, burn_in_days = 105) 243 | 244 | str(forecast_result) 245 | ``` 246 | 247 | Function `generate_plots( )` visualizes the one-bin-ahead forecast components and the forecasting performance. 248 | 249 | ```{r} 250 | plots <- generate_plots(forecast_result) 251 | plots$log_components 252 | plots$original_and_forecast 253 | ``` 254 | 255 | 256 |   257 | 258 | # Next steps 259 | 260 | This guide gives an overview of the package's main features. Check the manual for details on each function, including parameters and examples. 261 | 262 | The current version only supports univariate state-space models for intraday trading volume. Soon, we'll add models for intraday volatility and their multivariate versions. We hope you find these resources helpful and that our package will continue to be a valuable tool for your work. 263 | 264 |   265 | 266 | # References 267 | 268 |
-------------------------------------------------------------------------------- /vignettes/reference.bib: -------------------------------------------------------------------------------- 1 | @article{chen2016forecasting, 2 | title={Forecasting intraday trading volume: a {K}alman filter approach}, 3 | author={Chen, Ran and Feng, Yiyong and Palomar, Daniel}, 4 | journal={Available at SSRN 3101695}, 5 | year={2016} 6 | } 7 | 8 | @article{varadhan2008simple, 9 | title={Simple and globally convergent methods for accelerating the convergence of any {EM} algorithm}, 10 | author={Varadhan, Ravi and Roland, Christophe}, 11 | journal={Scandinavian Journal of Statistics}, 12 | volume={35}, 13 | number={2}, 14 | pages={335--353}, 15 | year={2008}, 16 | publisher={Wiley Online Library} 17 | } 18 | 19 | @article{brownlees2011intra, 20 | title={Intra-daily volume modeling and prediction for algorithmic trading}, 21 | author={Brownlees, Christian T and Cipollini, Fabrizio and Gallo, Giampiero M}, 22 | journal={Journal of Financial Econometrics}, 23 | volume={9}, 24 | number={3}, 25 | pages={489--518}, 26 | year={2011}, 27 | publisher={Oxford University Press} 28 | } 29 | 30 | --------------------------------------------------------------------------------