├── .github
│   └── workflows
│       └── publish.yml
├── CNAME
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── Example
│   ├── example_gensys.py
│   ├── example_snkm.ipynb
│   ├── example_snkm.py
│   ├── example_soe.py
│   └── example_us.py
├── LICENSE
├── README.md
├── Utilities
│   └── prior_table.xlsx
├── _config.yml
├── dsgepy
│   ├── __init__.py
│   ├── data_api.py
│   ├── kalman
│   │   ├── __init__.py
│   │   ├── kalman.py
│   │   └── utils.py
│   ├── lineardsge.py
│   └── pycsminwel.py
└── setup.py

--------------------------------------------------------------------------------
/.github/workflows/publish.yml:
--------------------------------------------------------------------------------
1 | name: Upload Python Package to PyPI when a Release is Created
2 | 
3 | on:
4 |   release:
5 |     types: [created]
6 | 
7 | jobs:
8 |   pypi-publish:
9 |     name: Publish release to PyPI
10 |     runs-on: ubuntu-latest
11 |     environment:
12 |       name: pypi
13 |       url: https://pypi.org/p/dsgepy
14 |     permissions:
15 |       id-token: write
16 |     steps:
17 |       - uses: actions/checkout@v4
18 |       - name: Set up Python
19 |         uses: actions/setup-python@v4
20 |         with:
21 |           python-version: "3.x"
22 |       - name: Install dependencies
23 |         run: |
24 |           python -m pip install --upgrade pip
25 |           pip install setuptools wheel
26 |       - name: Build package
27 |         run: |
28 |           python setup.py sdist bdist_wheel  # Could also be python -m build
29 |       - name: Publish package distributions to PyPI
30 |         uses: pypa/gh-action-pypi-publish@release/v1
--------------------------------------------------------------------------------
/CNAME:
--------------------------------------------------------------------------------
1 | dsgepy.com
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | # Code of Conduct
2 | 
3 | ## Our Pledge
4 | 
5 | In the interest of fostering an open and welcoming environment, we as
6 | contributors and maintainers pledge to making participation in our project and
7 | our community a harassment-free experience for everyone, regardless of age, body
8 | size, disability, ethnicity, gender identity and expression, level of experience,
9 | nationality, personal appearance, race, religion, or sexual identity and
10 | orientation.
11 | 
12 | ## Our Standards
13 | 
14 | Examples of behavior that contributes to creating a positive environment
15 | include:
16 | 
17 | * Using welcoming and inclusive language
18 | * Being respectful of differing viewpoints and experiences
19 | * Gracefully accepting constructive criticism
20 | * Focusing on what is best for the community
21 | * Showing empathy towards other community members
22 | 
23 | Examples of unacceptable behavior by participants include:
24 | 
25 | * The use of sexualized language or imagery and unwelcome sexual attention or
26 | advances
27 | * Trolling, insulting/derogatory comments, and personal or political attacks
28 | * Public or private harassment
29 | * Publishing others' private information, such as a physical or electronic
30 | address, without explicit permission
31 | * Other conduct which could reasonably be considered inappropriate in a
32 | professional setting
33 | 
34 | ## Our Responsibilities
35 | 
36 | Project maintainers are responsible for clarifying the standards of acceptable
37 | behavior and are expected to take appropriate and fair corrective action in
38 | response to any instances of unacceptable behavior.
39 | 
40 | Project maintainers have the right and responsibility to remove, edit, or
41 | reject comments, commits, code, wiki edits, issues, and other contributions
42 | that are not aligned to this Code of Conduct, or to ban temporarily or
43 | permanently any contributor for other behaviors that they deem inappropriate,
44 | threatening, offensive, or harmful.
45 | 
46 | ## Scope
47 | 
48 | This Code of Conduct applies both within project spaces and in public spaces
49 | when an individual is representing the project or its community. Examples of
50 | representing a project or community include using an official project e-mail
51 | address, posting via an official social media account, or acting as an appointed
52 | representative at an online or offline event. Representation of a project may be
53 | further defined and clarified by project maintainers.
54 | 
55 | ## Enforcement
56 | 
57 | Instances of abusive, harassing, or otherwise unacceptable behavior may be
58 | reported by contacting the project team at support@pydsge.com. All
59 | complaints will be reviewed and investigated and will result in a response that
60 | is deemed necessary and appropriate to the circumstances. The project team is
61 | obligated to maintain confidentiality with regard to the reporter of an incident.
62 | Further details of specific enforcement policies may be posted separately.
63 | 
64 | Project maintainers who do not follow or enforce the Code of Conduct in good
65 | faith may face temporary or permanent repercussions as determined by other
66 | members of the project's leadership.
67 | 
68 | ---
69 | # Attribution
70 | 
71 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
72 | available at [http://contributor-covenant.org/version/1/4][version]
73 | 
74 | [homepage]: http://contributor-covenant.org
75 | [version]: http://contributor-covenant.org/version/1/4/
76 | 
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | # Contributing
2 | 
3 | When contributing to this repository, please first discuss the change you wish to make via an issue, email, or any other
4 | method with the owners of this repository before making a change. Make sure that any contributions use consistent code
5 | conventions and clear function/method/variable names. The code must be clearly commented, documenting intentions and
6 | edge cases. There must be no sensitive materials in the revision history, issues, or pull requests (for example,
7 | passwords or other non-public information).
8 | 
9 | Please note we have a [code of conduct](https://github.com/gusamarante/pydsge/blob/master/CODE_OF_CONDUCT.md); please
10 | follow it in all your interactions with the project.
11 | 
12 | If you need more information and help, especially about contributing, you can contact Gustavo Amarante at
13 | developer@pydsge.com
14 | 
--------------------------------------------------------------------------------
/Example/example_gensys.py:
--------------------------------------------------------------------------------
1 | """
2 | This script implements a minimal example of the `gensys` solver of Sims (2002),
3 | applied to a small hand-written system in the canonical form
4 | g0 * y(t) = g1 * y(t-1) + c + psi * eps(t) + pi * eta(t)
5 | """
6 | 
7 | from dsgepy import gensys
8 | import numpy as np
9 | 
10 | g0 = np.array([[1, 0, 0, 0],
11 |                [0, 1, 0, 0],
12 |                [-1.1, 0, 1, 1],
13 |                [0, 1, 0, 0]])
14 | 
15 | g1 = np.array([[0, 0, 1, 0],
16 |                [0, 0, 0, 1],
17 |                [0, 0, 0, 0],
18 |                [0, 0.7, 0, 0]])
19 | 
20 | c = np.zeros((4, 1))
21 | 
22 | psi = np.array([[0, 0],
23 |                 [0, 0],
24 |                 [4, 1],
25 |                 [3, -2]])
26 | 
27 | pi = np.array([[1, 0],
28 |                [0, 1],
29 |                [0, 0],
30 |                [0, 0]])
31 | 
32 | # eu[0] == 1 signals existence of a solution, eu[1] == 1 signals uniqueness
33 | G1, C, impact, fmat, fwt, ywt, gev, eu, loose = gensys(g0, g1, c, psi, pi)
--------------------------------------------------------------------------------
/Example/example_snkm.py:
--------------------------------------------------------------------------------
1 | from dsgepy import DSGE
2 | from sympy import symbols, Matrix
3 | 
4 | 
5 | # ================================
6 | # ===== MODEL SPECIFICATION =====
7 | # ================================
8 | # endogenous variables at t
9 | y, pi, i, a, v, exp_y, exp_pi = symbols('y, pi, i, a, v, exp_y, exp_pi')
10 | endog = Matrix([y, pi, i, a, v, exp_y, exp_pi])
11 | 
12 | # endogenous variables at t - 1
13 | yl, pil, il, al, vl, exp_yl, exp_pil = symbols('yl, pil, il, al, vl, exp_yl, exp_pil')
14 | endogl = Matrix([yl, pil, il, al, vl, exp_yl, exp_pil])
15 | 
16 | # exogenous shocks
17 | eps_a, eps_v, eps_pi = symbols('eps_a, eps_v, eps_pi')
18 | 
19 | exog = Matrix([eps_a, eps_v, eps_pi])
20 | 
21 | # expectational shocks
22 | eta_y, eta_pi = symbols('eta_y, eta_pi')
23 | 
24 | expec = Matrix([eta_y, eta_pi])
25 | 
26 | # parameters
27 | sigma, varphi, alpha, beta, theta, phi_pi, phi_y, rho_a, sigma_a, rho_v, sigma_v, sigma_pi = \
28 |     symbols('sigma, varphi, alpha, beta, theta, phi_pi, phi_y, rho_a, sigma_a, rho_v, sigma_v, sigma_pi')
29 | 
30 | # Summary parameters
31 | psi_nya = (1 + varphi) / (sigma * (1 - alpha) + varphi + alpha)
32 | kappa = (1 - theta) * (1 - theta * beta) * (sigma * (1 - alpha) + varphi + alpha)
33 | 
34 | # model (state) equations
35 | eq1 = y - exp_y + (1/sigma)*(i - exp_pi) - psi_nya * (rho_a - 1) * a
36 | eq2 = pi - beta * exp_pi - kappa * y - sigma_pi * eps_pi
37 | eq3 = i - phi_pi * pi - phi_y * y - v
38 | eq4 = a - rho_a * al - sigma_a * eps_a
39 | eq5 = v - rho_v * vl - sigma_v * eps_v
40 | eq6 = y - exp_yl - eta_y
41 | eq7 = pi - exp_pil - eta_pi
42 | 
43 | equations = Matrix([eq1, eq2, eq3, eq4, eq5, eq6, eq7])
44 | 
45 | # observation equations
46 | obs01 = y
47 | obs02 = pi
48 | obs03 = 1/beta - 1 + i
49 | 
50 | obs_names = ['Output Gap', 'Inflation', 'Interest Rate']
51 | 
52 | obs_equations = Matrix([obs01, obs02, obs03])
53 | 
54 | 
55 | # ======================
56 | # ===== SIMULATION =====
57 | # ======================
58 | 
59 | calib_dict = {sigma: 1.3,
60 |               varphi: 1,
61 |               alpha: 0.4,
62 |               beta: 0.997805,
63 |               theta: 0.75,
64 |               phi_pi: 1.5,
65 |               phi_y: 0.2,
66 |               rho_a: 0.9,
67 |               sigma_a: 1.1,
68 |               rho_v: 0.5,
69 |               sigma_v: 0.3,
70 |               sigma_pi: 0.8}
71 | 
72 | dsge_simul = DSGE(endog=endog,
73 |                   endogl=endogl,
74 |                   exog=exog,
75 |                   expec=expec,
76 |                   state_equations=equations,
77 |                   calib_dict=calib_dict,
78 |                   obs_equations=obs_equations,
79 |                   obs_names=obs_names)
80 | 
81 | # IRFs from the theoretical Model
82 | dsge_simul.irf(periods=24, show_charts=True)
83 | 
84 | # Check existence and uniqueness
85 | print(dsge_simul.eu)
86 | 
87 | # Simulate observations
88 | df_obs, df_states = dsge_simul.simulate(n_obs=200, random_seed=1)
89 | 
90 | df_states = df_states.tail(100).reset_index(drop=True)
91 | df_obs = df_obs.tail(100).reset_index(drop=True)
92 | 
93 | # df_obs.plot()
94 | # plt.show()
95 | 
96 | 
97 | # =============================
98 | # ===== MODEL ESTIMATION =====
99 | # =============================
100 | # Not all parameters need to be estimated; these are going to be calibrated
101 | calib_param = {varphi: 1, alpha: 0.4, beta: 0.997805}
102 | estimate_param = Matrix([sigma, theta, phi_pi, phi_y, rho_a, sigma_a, rho_v, sigma_v, sigma_pi])
103 | 
104 | # priors
105 | prior_dict = {sigma: {'dist': 'normal', 'mean': 1.30, 'std': 0.20, 'label': '$\\sigma$'},
106 |               theta: {'dist': 'beta', 'mean': 0.60, 'std': 0.20, 'label': '$\\theta$'},
107 |               phi_pi: {'dist': 'normal', 'mean': 1.50, 'std': 0.35, 'label': '$\\phi_{\\pi}$'},
108 |               phi_y: {'dist': 'gamma', 'mean': 0.25, 'std': 0.10, 'label': '$\\phi_{y}$'},
109 |               rho_a: {'dist': 'beta', 'mean': 0.50, 'std': 0.25, 'label': '$\\rho_a$'},
110 |               sigma_a: {'dist': 'invgamma', 'mean': 0.50, 'std': 0.25, 'label': '$\\sigma_a$'},
111 |               rho_v: {'dist': 'beta', 'mean': 0.50, 'std': 0.25, 'label': '$\\rho_v$'},
112 |               sigma_v: {'dist': 'invgamma', 'mean': 0.50, 'std': 0.25, 'label': '$\\sigma_v$'},
113 |               sigma_pi: {'dist': 'invgamma', 'mean': 0.50, 'std': 0.25, 'label': '$\\sigma_{\\pi}$'}}
114 | 
115 | 
116 | dsge = DSGE(endog, endogl, exog, expec, equations,
117 |             estimate_params=estimate_param,
118 |             calib_dict=calib_param,
119 |             obs_equations=obs_equations,
120 |             prior_dict=prior_dict,
121 |             obs_data=df_obs,
122 |             obs_names=obs_names,
123 |             verbose=True)
124 | 
125 | dsge.estimate(nsim=1000, ck=0.1, file_path='snkm.h5')
126 | 
127 | dsge.eval_chains(burnin=0.1, show_charts=True)
128 | 
129 | print(dsge.posterior_table)
130 | 
131 | # IRFs from the estimated Model  # TODO compare with the originals
132 | dsge.irf(periods=24, show_charts=True)
133 | 
134 | # Extract state variables  # TODO compare with the originals
135 | df_states_hat, df_states_se = dsge.states()
136 | 
137 | # Historical Decomposition
138 | df_hd = dsge.hist_decomp(show_charts=True)
--------------------------------------------------------------------------------
/Example/example_soe.py:
--------------------------------------------------------------------------------
1 | from dsgepy import DSGE
2 | from sympy import symbols, Matrix
3 | 
4 | 
5 | # ================================
6 | # ===== MODEL SPECIFICATION =====
7 | # ================================
8 | # endogenous variables at t
9 | c, r, r_s, pi, pi_s, pi_h, pi_f, y, y_s, s, q, v_a, v_s, v_q, exp_c, exp_pi, exp_pih, exp_pif, exp_pis, exp_q = \
10 |     symbols('c, r, r_s, pi, pi_s, pi_h, pi_f, y, y_s, s, q, v_a, v_s, v_q, exp_c, exp_pi, exp_pih, exp_pif, exp_pis, exp_q')
11 | endog = Matrix([c, r, r_s, pi, pi_s, pi_h, pi_f, y, y_s, s, q, v_a, v_s, v_q, exp_c, exp_pi, exp_pih, exp_pif, exp_pis, exp_q])
12 | 
13 | # endogenous variables at t - 1
14 | cl, rl, r_sl, pil, pi_sl, pi_hl, pi_fl, yl, y_sl, sl, ql, v_al, v_sl, v_ql, exp_cl, exp_pil, exp_pihl, exp_pifl, exp_pisl, exp_ql = \
15 |     symbols('cl, rl, r_sl, pil, pi_sl, pi_hl, pi_fl, yl, y_sl, sl, ql, v_al, v_sl, v_ql, exp_cl, exp_pil, exp_pihl, exp_pifl, exp_pisl, exp_ql')
16 | endogl = Matrix([cl, rl, r_sl, pil, pi_sl, pi_hl, pi_fl, yl, y_sl, sl, ql, v_al, v_sl, v_ql, exp_cl, exp_pil, exp_pihl, exp_pifl, exp_pisl, exp_ql])
17 | 
18 | # exogenous shocks
19 | eps_a, eps_h, eps_f, eps_q, eps_s, eps_r, eps_pis, eps_ys, eps_rs = \
20 |     symbols('eps_a, eps_h, eps_f, eps_q, eps_s, eps_r, eps_pis, eps_ys, eps_rs')
21 | exog = Matrix([eps_a, eps_h, eps_f, eps_q, eps_s, eps_r, eps_pis, eps_ys, eps_rs])
22 | 
23 | # expectational shocks
24 | eta_c, eta_pi, eta_pih, eta_pif, eta_pis, eta_q = \
25 |     symbols('eta_c, eta_pi, eta_pih, eta_pif, eta_pis, eta_q')
26 | expec = Matrix([eta_c, eta_pi, eta_pih, eta_pif, eta_pis, eta_q])
27 | 
28 | # parameters
29 | sigma, h, phi, beta, alpha, eta, delta_h, delta_f, theta_h, theta_f, rho_r, mu_pi, mu_q, mu_y, sigma_r, rho_a, sigma_a, rho_s, sigma_s, rho_q, sigma_q, rho_pis, sigma_pis, rho_ys, sigma_ys, rho_rs, sigma_rs, sigma_h, sigma_f = \
30 |     symbols('sigma, h, phi, beta, alpha, eta, delta_h, delta_f, theta_h, theta_f, rho_r, mu_pi, mu_q, mu_y, sigma_r, rho_a, sigma_a, rho_s, sigma_s, rho_q, sigma_q, rho_pis, sigma_pis, rho_ys, sigma_ys, rho_rs, sigma_rs, sigma_h, sigma_f')
31 | 
32 | # Summary parameters
33 | lambda_h = (1 - theta_h * beta) * (1 - theta_h) / theta_h
34 | lambda_f = (1 - theta_f * beta) * (1 - theta_f) / theta_f
35 | 
36 | # model (state) equations
37 | eq01 = (1+h)*c - h*cl - exp_c + ((1-h)/sigma)*r - ((1-h)/sigma)*exp_pi
38 | eq02 = (1+delta_h)*pi_h - beta*exp_pih - delta_h*pi_hl - lambda_h*phi*y + lambda_h*(1+phi)*v_a - lambda_h*alpha*s - lambda_h*(sigma/(1-h))*c + lambda_h*(sigma/(1-h))*h*cl - lambda_h*sigma_h*eps_h
39 | eq03 = (1+delta_f)*pi_f - beta*exp_pif - delta_f*pi_fl - lambda_f*q + lambda_f*(1-alpha)*s - lambda_f*sigma_f*eps_f
40 | eq04 = exp_q - q - r + exp_pi + r_s - exp_pis - v_q
41 | eq05 = s - sl - pi_f + pi_h - v_s
42 | eq06 = y - (1-alpha)*c - alpha*eta*q - alpha*eta*s - alpha*y_s
43 | eq07 = pi - (1-alpha)*pi_h - alpha*pi_f
44 | eq08 = v_a - rho_a*v_al - sigma_a*eps_a
45 | eq09 = v_s - rho_s*v_sl - sigma_s*eps_s
46 | eq10 = v_q - rho_q*v_ql - sigma_q*eps_q
47 | eq11 = pi_s - rho_pis*pi_sl - sigma_pis*eps_pis
48 | eq12 = y_s - rho_ys*y_sl - sigma_ys*eps_ys
49 | eq13 = r_s - rho_rs*r_sl - sigma_rs*eps_rs
50 | eq14 = r - rho_r*rl - (1-rho_r)*mu_pi*pi - (1-rho_r)*mu_q*q - (1-rho_r)*mu_y*y - sigma_r*eps_r
51 | eq15 = c - exp_cl - eta_c
52 | eq16 = pi - exp_pil - eta_pi
53 | eq17 = pi_h - exp_pihl - eta_pih
54 | eq18 = pi_f - exp_pifl - eta_pif
55 | eq19 = pi_s - exp_pisl - eta_pis
56 | eq20 = q - exp_ql - eta_q
57 | 
58 | equations = Matrix([eq01, eq02, eq03, eq04, eq05, eq06, eq07, eq08, eq09, eq10, eq11, eq12, eq13, eq14, eq15, eq16, eq17, eq18, eq19, eq20])
59 | 
60 | # =======================
61 | # ===== CALIBRATION =====
62 | # =======================
63 | 
64 | calib_dict = {
65 |     sigma: 0.81,
66 |     h: 0.91,
67 |     phi: 1.58,
68 |     beta: 0.99,
69 |     alpha: 0.45,
70 |     eta: 0.36,
71 |     delta_h: 0.25,
72 |     delta_f: 0.05,
73 |     theta_h: 0.77,
74 |     theta_f: 0.68,
75 |     rho_r: 0.7,
76 |     mu_pi: 1.2,
77 |     mu_q: 0,
78 |     mu_y: 0,
79 |     sigma_r: 0.36,
80 |     rho_a: 0.81,
81 |     sigma_a: 5.17,
82 |     rho_s: 0.81,
83 |     sigma_s: 5.45,
84 |     rho_q: 0.68,
85 |     sigma_q: 0.75,
86 |     rho_pis: 0.26,
87 |     sigma_pis: 0.42,
88 |     rho_ys: 0.72,
89 |     sigma_ys: 0.54,
90 |     rho_rs: 0.89,
91 |     sigma_rs: 0.22,
92 |     sigma_h: 1.05,
93 |     sigma_f: 4.43,
94 | }
95 | 
96 | dsge_simul = DSGE(endog=endog,
97 |                   endogl=endogl,
98 |                   exog=exog,
99 |                   expec=expec,
100 |                   state_equations=equations,
101 |                   calib_dict=calib_dict,
102 |                   )
103 | 
104 | # Check existence and uniqueness
105 | print(dsge_simul.eu)
106 | 
107 | # IRFs from the theoretical Model
108 | dsge_simul.irf(periods=24, show_charts=True)
109 | 
--------------------------------------------------------------------------------
/Example/example_us.py:
--------------------------------------------------------------------------------
1 | """
2 | This example is not finished
3 | """
4 | 
5 | import numpy as np
6 | import pandas as pd
7 | from dsgepy import DSGE
8 | import matplotlib.pyplot as plt
9 | from sympy import Matrix, symbols
10 | from statsmodels.tsa.filters.hp_filter import hpfilter
11 | 
12 | from dsgepy import FRED
13 | 
14 | fred = FRED()
15 | 
16 | # ===== Grab and organize Data ===== #
17 | # Get data from the FRED
18 | series_dict = {'CPIAUCSL': 'CPI',
19 |                'GDP': 'GDP',
20 |                'DFF': 'Fed Funds Rate'}
21 | 
22 | df = fred.fetch(series_id=series_dict)
23 | 
24 | # Observed variables
25 | df_obs = pd.DataFrame()
26 | df_obs['CPI'] = df['CPI'].dropna().resample('Q').last().pct_change(1) * 4
27 | df_obs['FFR'] = df['Fed Funds Rate'].resample('Q').mean()
28 | df_obs['outputgap'], _ = hpfilter(np.log(df['GDP'].resample('Q').last().dropna()), 1600)
29 | 
30 | df_obs = df_obs.dropna()
31 | 
32 | df_obs = df_obs[df_obs.index >= '2000-01-01']
33 | 
34 | # ================================
35 | # ===== MODEL SPECIFICATION =====
36 | # ================================
37 | # endogenous variables at t
38 | 
39 | 
40 | # endogenous variables at t - 1
41 | 
42 | # exogenous shocks
43 | 
44 | 
45 | # expectational shocks
46 | 
47 | 
48 | # parameters
49 | 
50 | 
51 | # Summary parameters
52 | 
53 | 
54 | # model (state) equations
55 | 
56 | 
57 | # observation equations
58 | 
59 | 
60 | # =============================
61 | # ===== MODEL ESTIMATION =====
62 | # =============================
63 | # Not all parameters need to be estimated; these are going to be calibrated
64 | 
65 | # priors
66 | 
67 | 
68 | # DSGE object
69 | 
70 | # Estimation
71 | 
72 | # Posterior Table
73 | 
74 | # IRFs from the estimated Model
75 | 
76 | # Extract state variables
77 | 
78 | # Historical Decomposition
79 | 
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2019 Gustavo Amarante
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # dsgepy
2 | This is a Python library to specify, calibrate, solve, simulate, estimate and
3 | analyze linearized DSGE models. The specification interface is inspired by
4 | Dynare, which allows for symbolic declarations of parameters, variables and
5 | equations. Once a model is calibrated or estimated, it is solved using Sims'
6 | (2002) methodology.
7 | Simulated trajectories can be generated from a calibrated model. Estimation
8 | uses Bayesian methods, specifically Markov chain Monte Carlo (MCMC), to
9 | simulate the posterior distributions of the parameters. Analysis tools include
10 | impulse-response functions, historical decomposition and extraction of latent
11 | variables.
12 | 
13 | This library is an effort to bring the DSGE toolset into the open-source world
14 | in a full Python implementation, which makes it possible to embrace the advantages of this
15 | programming language when working with DSGEs.
16 | 
17 | ---
18 | # Installation
19 | You can install this development version using:
20 | ```
21 | pip install dsgepy
22 | ```
23 | 
24 | ---
25 | # Example
26 | A full example on how to use this library with a small New Keynesian model is available in
27 | [this Jupyter notebook](https://github.com/gusamarante/pydsge/blob/master/Example/example_snkm.ipynb). The model used
28 | in the example is described briefly by the following equations:
29 | 
30 | $$\tilde{y}\_{t}=E\_{t}\left(\tilde{y}\_{t+1}\right)-\frac{1}{\sigma}\left[\hat{i}\_{t}-E\_{t}\left(\pi\_{t+1}\right)\right]+\psi\_{ya}^{n}\left(\rho\_{a}-1\right)a\_{t}$$
31 | 
32 | $$\pi\_{t}=\beta E\_{t}\left(\pi\_{t+1}\right)+\kappa\tilde{y}\_{t}+\sigma\_{\pi}\varepsilon\_{t}^{\pi}$$
33 | 
34 | $$\hat{i}\_{t}=\phi\_{\pi}\pi\_{t}+\phi\_{y}\tilde{y}\_{t}+v\_{t}$$
35 | 
36 | $$a_{t}=\rho\_{a}a\_{t-1}+\sigma\_{a}\varepsilon\_{t}^{a}$$
37 | 
38 | $$v_{t}=\rho_{v}v_{t-1}+\sigma_{v}\varepsilon_{t}^{v}$$
39 | 
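As a quick preview of the workflow, the sketch below condenses `Example/example_snkm.py` from this repository (simulation only; estimation is covered in the notebook). All function names and arguments are taken from that example:

```
from sympy import symbols, Matrix
from dsgepy import DSGE

# Endogenous variables at t and t-1, exogenous shocks and expectational shocks
y, pi, i, a, v, exp_y, exp_pi = symbols('y, pi, i, a, v, exp_y, exp_pi')
yl, pil, il, al, vl, exp_yl, exp_pil = symbols('yl, pil, il, al, vl, exp_yl, exp_pil')
eps_a, eps_v, eps_pi = symbols('eps_a, eps_v, eps_pi')
eta_y, eta_pi = symbols('eta_y, eta_pi')

# Parameters and convenience combinations of them
sigma, varphi, alpha, beta, theta, phi_pi, phi_y, rho_a, sigma_a, rho_v, sigma_v, sigma_pi = \
    symbols('sigma, varphi, alpha, beta, theta, phi_pi, phi_y, rho_a, sigma_a, rho_v, sigma_v, sigma_pi')
psi_nya = (1 + varphi) / (sigma * (1 - alpha) + varphi + alpha)
kappa = (1 - theta) * (1 - theta * beta) * (sigma * (1 - alpha) + varphi + alpha)

# State equations (each expression is set to zero), in the same order as above
equations = Matrix([y - exp_y + (1 / sigma) * (i - exp_pi) - psi_nya * (rho_a - 1) * a,
                    pi - beta * exp_pi - kappa * y - sigma_pi * eps_pi,
                    i - phi_pi * pi - phi_y * y - v,
                    a - rho_a * al - sigma_a * eps_a,
                    v - rho_v * vl - sigma_v * eps_v,
                    y - exp_yl - eta_y,
                    pi - exp_pil - eta_pi])

# Calibrate and solve the model
calib_dict = {sigma: 1.3, varphi: 1, alpha: 0.4, beta: 0.997805, theta: 0.75,
              phi_pi: 1.5, phi_y: 0.2, rho_a: 0.9, sigma_a: 1.1, rho_v: 0.5,
              sigma_v: 0.3, sigma_pi: 0.8}

model = DSGE(endog=Matrix([y, pi, i, a, v, exp_y, exp_pi]),
             endogl=Matrix([yl, pil, il, al, vl, exp_yl, exp_pil]),
             exog=Matrix([eps_a, eps_v, eps_pi]),
             expec=Matrix([eta_y, eta_pi]),
             state_equations=equations,
             calib_dict=calib_dict,
             obs_equations=Matrix([y, pi, 1 / beta - 1 + i]),
             obs_names=['Output Gap', 'Inflation', 'Interest Rate'])

print(model.eu)                          # existence and uniqueness of the solution
model.irf(periods=24, show_charts=True)  # impulse-response functions
df_obs, df_states = model.simulate(n_obs=200, random_seed=1)  # simulated trajectories
```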
40 | 
41 | # Model Specification
42 | For now, the model equations have to be linearized around the steady state.
43 | Soon, there will be a functionality that allows for declaration with
44 | non-linearized equilibrium conditions.
45 | 
46 | # Model Solution
47 | The solution method used is based on the implementation of Christopher A. Sims' `gensys` function. You can find the
48 | author's original MATLAB code [here](https://dge.repec.org/codes/sims/linre3a/). The paper explaining the solution
49 | method is [this one](https://dge.repec.org/codes/sims/linre3a/LINRE3A.pdf).
50 | 
51 | # Model Estimation
52 | The models are estimated using Bayesian methods, specifically, by simulating the posterior distribution using MCMC
53 | sampling. This process is slow, so there is a functionality that allows you to stop a simulation and continue
54 | it later from where it stopped.
55 | 
56 | # Analysis
57 | There are functionalities for computing impulse-response functions for both state variables and observed variables.
58 | Historical decomposition is also available, but only when the number of exogenous shocks matches the number of
59 | observed variables.
60 | 
61 | ---
62 | # Drawbacks
63 | Since there is symbolic declaration of variables and equations, methods
64 | involving them are slow. Also, MCMC methods for macroeconomic models require
65 | many iterations to achieve convergence. Clearly, there is room for improvement
66 | on the efficiency of these estimation algorithms. Contributions are welcome.
67 | Speaking of contributions...
68 | 
69 | ---
70 | # Contributing
71 | If you would like to contribute to this repository, please check the
72 | [contributing guidelines](https://github.com/gusamarante/pydsge/blob/master/CONTRIBUTING.md). A
73 | [list of feature suggestions](https://github.com/gusamarante/pydsge/projects) is available on the projects page of this
74 | repository.
75 | 
76 | ---
77 | # More Information and Help
78 | If you need more information and help, especially about contributing, you can
79 | contact Gustavo Amarante at developer@dsgepy.com
--------------------------------------------------------------------------------
/Utilities/prior_table.xlsx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gusamarante/dsgepy/a792bac2d54c034929566d79954861272e3170e7/Utilities/prior_table.xlsx
--------------------------------------------------------------------------------
/_config.yml:
--------------------------------------------------------------------------------
1 | theme: jekyll-theme-cayman
--------------------------------------------------------------------------------
/dsgepy/__init__.py:
--------------------------------------------------------------------------------
1 | __all__ = ['DSGE', 'gensys', 'csminwel', 'FRED', 'SGS']
2 | 
3 | from .lineardsge import DSGE, gensys
4 | from .pycsminwel import csminwel
5 | from .data_api import FRED, SGS
--------------------------------------------------------------------------------
/dsgepy/data_api.py:
--------------------------------------------------------------------------------
1 | import pandas as pd
2 | 
3 | 
4 | class FRED(object):
5 |     """
6 |     Wrapper for the data API of the FRED
7 |     """
8 | 
9 |     def fetch(self, series_id, initial_date=None, end_date=None):
10 |         """
11 |         Grabs series from the FRED website and returns a pandas dataframe
12 |         :param series_id: a series ID string, a list of series ID strings, or a dict with series IDs as keys
13 |         :param initial_date: string in the format 'yyyy-mm-dd' (optional)
14 |         :param end_date: string in the format 'yyyy-mm-dd' (optional)
15 |         :return: pandas DataFrame with the requested series. If a dict is passed as series ID, the dict values are used
16 |         as column names.
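
        Example (illustrative usage; the series IDs below are the ones used in Example/example_us.py):

            fred = FRED()
            df = fred.fetch(series_id={'CPIAUCSL': 'CPI', 'GDP': 'GDP'},
                            initial_date='2000-01-01')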
17 |         """
18 | 
19 |         if type(series_id) is list:
20 | 
21 |             df = pd.DataFrame()
22 | 
23 |             for cod in series_id:
24 |                 single_series = self._fetch_single_code(cod)
25 |                 df = pd.concat([df, single_series], axis=1)
26 | 
27 |             df.sort_index(inplace=True)
28 | 
29 |         elif type(series_id) is dict:
30 | 
31 |             df = pd.DataFrame()
32 | 
33 |             for cod in series_id.keys():
34 |                 single_series = self._fetch_single_code(cod)
35 |                 df = pd.concat([df, single_series], axis=1)
36 | 
37 |             df.columns = series_id.values()
38 | 
39 |         else:
40 | 
41 |             df = self._fetch_single_code(series_id)
42 | 
43 |         df = self._correct_dates(df, initial_date, end_date)
44 | 
45 |         return df
46 | 
47 |     @staticmethod
48 |     def _fetch_single_code(series_id):
49 | 
50 |         url = r'https://fred.stlouisfed.org/data/' + series_id + '.txt'
51 |         df = pd.read_csv(url, sep='\n')
52 |         series_start = df[df[df.columns[0]].str.contains(r'DATE\s+VALUE')].index[0] + 1
53 |         df = df.loc[series_start:]
54 |         df = df[df.columns[0]].str.split(r'\s+', expand=True)
55 |         df = df[~(df[1] == '.')]
56 |         df = pd.DataFrame(data=df[1].values.astype(float),
57 |                           index=pd.to_datetime(df[0]),
58 |                           columns=[series_id])
59 |         df.index.rename('Date', inplace=True)
60 | 
61 |         return df
62 | 
63 |     @staticmethod
64 |     def _correct_dates(df, initial_date, end_date):
65 | 
66 |         if initial_date is not None:
67 |             dt_ini = pd.to_datetime(initial_date)
68 |             df = df[df.index >= dt_ini]
69 | 
70 |         if end_date is not None:
71 |             dt_end = pd.to_datetime(end_date)
72 |             df = df[df.index <= dt_end]
73 | 
74 |         return df
75 | 
76 | 
77 | class SGS(object):
78 |     """
79 |     Wrapper for the Data API of the SGS (Sistema de Gerenciamento de Séries) of the Brazilian Central Bank.
80 |     """
81 | 
82 |     def fetch(self, series_id, initial_date=None, end_date=None):
83 |         """
84 |         Grabs series from the SGS
85 |         :param series_id: series code(s) on the SGS (int, str, list of int or str, or dict with codes as keys)
86 |         :param initial_date: initial date for the result (optional)
87 |         :param end_date: end date for the result (optional)
88 |         :return: pandas DataFrame with the requested series. If a dict is passed as series ID, the dict values are used
89 |         as column names.
90 |         """
91 | 
92 |         if type(series_id) is list:  # loop all series codes
93 | 
94 |             df = pd.DataFrame()
95 | 
96 |             for cod in series_id:
97 |                 single_series = self._fetch_single_code(cod, initial_date, end_date)
98 |                 df = pd.concat([df, single_series], axis=1)
99 | 
100 |             df.sort_index(inplace=True)
101 | 
102 |         elif type(series_id) is dict:
103 | 
104 |             df = pd.DataFrame()
105 | 
106 |             for cod in series_id.keys():
107 |                 single_series = self._fetch_single_code(cod, initial_date, end_date)
108 |                 df = pd.concat([df, single_series], axis=1)
109 | 
110 |             df.columns = series_id.values()
111 | 
112 |         else:
113 |             df = self._fetch_single_code(series_id, initial_date, end_date)
114 | 
115 |         df = self._correct_dates(df, initial_date, end_date)
116 | 
117 |         return df
118 | 
119 |     def _fetch_single_code(self, series_id, initial_date=None, end_date=None):
120 |         """
121 |         Grabs a single series using the API of the SGS. Queries are built using the url
122 |         http://api.bcb.gov.br/dados/serie/bcdata.sgs.{seriesID}/dados?formato=json&dataInicial={initial_date}&dataFinal={end_date}
123 |         The url returns a json file which is read and parsed to a pandas DataFrame.
124 |         ISSUE: It seems like the SGS API URL is not working properly. Even if you pass an initial or end date argument,
125 |         the API does not filter the dates and always returns all the available values.
I added a date filter for the 126 | pandas dataframe that is passed to the user and left the old code intact, in case this gets corrected in the 127 | future. 128 | :param series_id: series code on the SGS 129 | :param initial_date: initial date for the result (optional) 130 | :param end_date: end date for the result (optional) 131 | :return: pandas DataFrame with a single series as column and the dates as the index 132 | """ 133 | 134 | url = self._build_url(series_id, initial_date, end_date) 135 | 136 | df = pd.read_json(url) 137 | df = df.set_index(pd.to_datetime(df['data'], dayfirst=True)).drop('data', axis=1) 138 | df.columns = [str(series_id)] 139 | 140 | return df 141 | 142 | @staticmethod 143 | def _build_url(series_id, initial_date, end_date): 144 | """ returns the search url as string """ 145 | 146 | url = 'http://api.bcb.gov.br/dados/serie/bcdata.sgs.' + str(series_id) + '/dados?formato=json' 147 | 148 | if not (initial_date is None): 149 | url = url + '&dataInicial=' + str(initial_date) 150 | 151 | if not (end_date is None): 152 | url = url + '&dataFinal=' + str(end_date) 153 | 154 | return url 155 | 156 | @staticmethod 157 | def _correct_dates(df, initial_date, end_date): 158 | 159 | if initial_date is not None: 160 | dt_ini = pd.to_datetime(initial_date, dayfirst=True) 161 | df = df[df.index >= dt_ini] 162 | 163 | if end_date is not None: 164 | dt_end = pd.to_datetime(end_date, dayfirst=True) 165 | df = df[df.index <= dt_end] 166 | 167 | return df 168 | -------------------------------------------------------------------------------- /dsgepy/kalman/__init__.py: -------------------------------------------------------------------------------- 1 | from .kalman import KalmanFilter 2 | 3 | __all__ = ['KalmanFilter'] 4 | -------------------------------------------------------------------------------- /dsgepy/kalman/kalman.py: -------------------------------------------------------------------------------- 1 | """ 2 | ===================================== 3 | Inference for Linear-Gaussian Systems 4 | ===================================== 5 | 6 | This module implements the Kalman Filter, Kalman Smoother, and 7 | EM Algorithm for Linear-Gaussian state space models 8 | """ 9 | import warnings 10 | 11 | import numpy as np 12 | from scipy import linalg 13 | 14 | from .utils import array1d, array2d, check_random_state, \ 15 | get_params, log_multivariate_normal_density, preprocess_arguments 16 | 17 | # Dimensionality of each Kalman Filter parameter for a single time step 18 | DIM = { 19 | 'transition_matrices': 2, 20 | 'transition_offsets': 1, 21 | 'observation_matrices': 2, 22 | 'observation_offsets': 1, 23 | 'transition_covariance': 2, 24 | 'observation_covariance': 2, 25 | 'initial_state_mean': 1, 26 | 'initial_state_covariance': 2, 27 | } 28 | 29 | 30 | def _arg_or_default(arg, default, dim, name): 31 | if arg is None: 32 | result = np.asarray(default) 33 | else: 34 | result = arg 35 | if len(result.shape) > dim: 36 | raise ValueError( 37 | ('%s is not constant for all time.' 38 | + ' You must specify it manually.') % (name,) 39 | ) 40 | return result 41 | 42 | 43 | def _determine_dimensionality(variables, default): 44 | """Derive the dimensionality of the state space 45 | 46 | Parameters 47 | ---------- 48 | variables : list of ({None, array}, conversion function, index) 49 | variables, functions to convert them to arrays, and indices in those 50 | arrays to derive dimensionality from. 
51 | default : {None, int} 52 | default dimensionality to return if variables is empty 53 | 54 | Returns 55 | ------- 56 | dim : int 57 | dimensionality of state space as derived from variables or default. 58 | """ 59 | # gather possible values based on the variables 60 | candidates = [] 61 | for (v, converter, idx) in variables: 62 | if v is not None: 63 | v = converter(v) 64 | candidates.append(v.shape[idx]) 65 | 66 | # also use the manually specified default 67 | if default is not None: 68 | candidates.append(default) 69 | 70 | # ensure consistency of all derived values 71 | if len(candidates) == 0: 72 | return 1 73 | else: 74 | if not np.all(np.array(candidates) == candidates[0]): 75 | raise ValueError( 76 | "The shape of all " + 77 | "parameters is not consistent. " + 78 | "Please re-check their values." 79 | ) 80 | return candidates[0] 81 | 82 | 83 | def _last_dims(X, t, ndims=2): 84 | """Extract the final dimensions of `X` 85 | 86 | Extract the final `ndim` dimensions at index `t` if `X` has >= `ndim` + 1 87 | dimensions, otherwise return `X`. 88 | 89 | Parameters 90 | ---------- 91 | X : array with at least dimension `ndims` 92 | t : int 93 | index to use for the `ndims` + 1th dimension 94 | ndims : int, optional 95 | number of dimensions in the array desired 96 | 97 | Returns 98 | ------- 99 | Y : array with dimension `ndims` 100 | the final `ndims` dimensions indexed by `t` 101 | """ 102 | X = np.asarray(X) 103 | if len(X.shape) == ndims + 1: 104 | return X[t] 105 | elif len(X.shape) == ndims: 106 | return X 107 | else: 108 | raise ValueError(("X only has %d dimensions when %d" + 109 | " or more are required") % (len(X.shape), ndims)) 110 | 111 | 112 | def _loglikelihoods(observation_matrices, observation_offsets, 113 | observation_covariance, predicted_state_means, 114 | predicted_state_covariances, observations): 115 | """Calculate log likelihood of all observations 116 | 117 | Parameters 118 | ---------- 119 | observation_matrices : [n_timesteps, n_dim_obs, n_dim_state] or [n_dim_obs, 120 | n_dim_state] array 121 | observation matrices for t in [0...n_timesteps-1] 122 | observation_offsets : [n_timesteps, n_dim_obs] or [n_dim_obs] array 123 | offsets for observations for t = [0...n_timesteps-1] 124 | observation_covariance : [n_dim_obs, n_dim_obs] array 125 | covariance matrix for all observations 126 | predicted_state_means : [n_timesteps, n_dim_state] array 127 | mean of state at time t given observations from times 128 | [0...t-1] for t in [0...n_timesteps-1] 129 | predicted_state_covariances : [n_timesteps, n_dim_state, n_dim_state] array 130 | covariance of state at time t given observations from times 131 | [0...t-1] for t in [0...n_timesteps-1] 132 | observations : [n_dim_obs] array 133 | All observations. If `observations[t]` is a masked array and any of 134 | its values are masked, the observation will be ignored. 
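
    For reference (this restates what the loop below computes), each entry is
    the log density of the one-step-ahead forecast of the observation, with
    observation matrix :math:`C_t`, offset :math:`d_t` and covariance
    :math:`R`:

    .. math::

        \log p(z_t \mid z_{0:t-1}) = \log \mathcal{N}(z_t;\;
            C_t x_{t|t-1} + d_t,\; C_t P_{t|t-1} C_t^T + R)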
135 | 
136 |     Returns
137 |     -------
138 |     loglikelihoods: [n_timesteps] array
139 |         `loglikelihoods[t]` is the probability density of the observation
140 |         generated at time step t
141 |     """
142 |     n_timesteps = observations.shape[0]
143 |     loglikelihoods = np.zeros(n_timesteps)
144 |     for t in range(n_timesteps):
145 |         observation = observations[t]
146 |         if not np.any(np.ma.getmask(observation)):
147 |             observation_matrix = _last_dims(observation_matrices, t)
148 |             observation_offset = _last_dims(observation_offsets, t, ndims=1)
149 |             predicted_state_mean = _last_dims(
150 |                 predicted_state_means, t, ndims=1
151 |             )
152 |             predicted_state_covariance = _last_dims(
153 |                 predicted_state_covariances, t
154 |             )
155 | 
156 |             predicted_observation_mean = (
157 |                 np.dot(observation_matrix,
158 |                        predicted_state_mean)
159 |                 + observation_offset
160 |             )
161 |             predicted_observation_covariance = (
162 |                 np.dot(observation_matrix,
163 |                        np.dot(predicted_state_covariance,
164 |                               observation_matrix.T))
165 |                 + observation_covariance
166 |             )
167 |             loglikelihoods[t] = log_multivariate_normal_density(
168 |                 observation[np.newaxis, :],
169 |                 predicted_observation_mean[np.newaxis, :],
170 |                 predicted_observation_covariance[np.newaxis, :, :]
171 |             )
172 |     return loglikelihoods
173 | 
174 | 
175 | def _filter_predict(transition_matrix, transition_covariance,
176 |                     transition_offset, current_state_mean,
177 |                     current_state_covariance):
178 |     r"""Calculate the mean and covariance of :math:`P(x_{t+1} | z_{0:t})`
179 | 
180 |     Using the mean and covariance of :math:`P(x_t | z_{0:t})`, calculate the
181 |     mean and covariance of :math:`P(x_{t+1} | z_{0:t})`.
182 | 
183 |     Parameters
184 |     ----------
185 |     transition_matrix : [n_dim_state, n_dim_state] array
186 |         state transition matrix from time t to t+1
187 |     transition_covariance : [n_dim_state, n_dim_state] array
188 |         covariance matrix for state transition from time t to t+1
189 |     transition_offset : [n_dim_state] array
190 |         offset for state transition from time t to t+1
191 |     current_state_mean: [n_dim_state] array
192 |         mean of state at time t given observations from times
193 |         [0...t]
194 |     current_state_covariance: [n_dim_state, n_dim_state] array
195 |         covariance of state at time t given observations from times
196 |         [0...t]
197 | 
198 |     Returns
199 |     -------
200 |     predicted_state_mean : [n_dim_state] array
201 |         mean of state at time t+1 given observations from times [0...t]
202 |     predicted_state_covariance : [n_dim_state, n_dim_state] array
203 |         covariance of state at time t+1 given observations from times
204 |         [0...t]
205 |     """
206 |     predicted_state_mean = (
207 |         np.dot(transition_matrix, current_state_mean)
208 |         + transition_offset
209 |     )
210 |     predicted_state_covariance = (
211 |         np.dot(transition_matrix,
212 |                np.dot(current_state_covariance,
213 |                       transition_matrix.T))
214 |         + transition_covariance
215 |     )
216 | 
217 |     return (predicted_state_mean, predicted_state_covariance)
218 | 
219 | 
220 | def _filter_correct(observation_matrix, observation_covariance,
221 |                     observation_offset, predicted_state_mean,
222 |                     predicted_state_covariance, observation):
223 |     r"""Correct a predicted state with a Kalman Filter update
224 | 
225 |     Incorporate observation `observation` from time `t` to turn
226 |     :math:`P(x_t | z_{0:t-1})` into :math:`P(x_t | z_{0:t})`
227 | 
228 |     Parameters
229 |     ----------
230 |     observation_matrix : [n_dim_obs, n_dim_state] array
231 |         observation matrix for time t
232 |     observation_covariance : [n_dim_obs, n_dim_obs] array
233 |         covariance
matrix for observation at time t 234 | observation_offset : [n_dim_obs] array 235 | offset for observation at time t 236 | predicted_state_mean : [n_dim_state] array 237 | mean of state at time t given observations from times 238 | [0...t-1] 239 | predicted_state_covariance : [n_dim_state, n_dim_state] array 240 | covariance of state at time t given observations from times 241 | [0...t-1] 242 | observation : [n_dim_obs] array 243 | observation at time t. If `observation` is a masked array and any of 244 | its values are masked, the observation will be ignored. 245 | 246 | Returns 247 | ------- 248 | kalman_gain : [n_dim_state, n_dim_obs] array 249 | Kalman gain matrix for time t 250 | corrected_state_mean : [n_dim_state] array 251 | mean of state at time t given observations from times 252 | [0...t] 253 | corrected_state_covariance : [n_dim_state, n_dim_state] array 254 | covariance of state at time t given observations from times 255 | [0...t] 256 | """ 257 | if not np.any(np.ma.getmask(observation)): 258 | predicted_observation_mean = ( 259 | np.dot(observation_matrix, 260 | predicted_state_mean) 261 | + observation_offset 262 | ) 263 | predicted_observation_covariance = ( 264 | np.dot(observation_matrix, 265 | np.dot(predicted_state_covariance, 266 | observation_matrix.T)) 267 | + observation_covariance 268 | ) 269 | 270 | kalman_gain = ( 271 | np.dot(predicted_state_covariance, 272 | np.dot(observation_matrix.T, 273 | linalg.pinv(predicted_observation_covariance))) 274 | ) 275 | 276 | corrected_state_mean = ( 277 | predicted_state_mean 278 | + np.dot(kalman_gain, observation - predicted_observation_mean) 279 | ) 280 | corrected_state_covariance = ( 281 | predicted_state_covariance 282 | - np.dot(kalman_gain, 283 | np.dot(observation_matrix, 284 | predicted_state_covariance)) 285 | ) 286 | else: 287 | n_dim_state = predicted_state_covariance.shape[0] 288 | n_dim_obs = observation_matrix.shape[0] 289 | kalman_gain = np.zeros((n_dim_state, n_dim_obs)) 290 | 291 | corrected_state_mean = predicted_state_mean 292 | corrected_state_covariance = predicted_state_covariance 293 | 294 | return (kalman_gain, corrected_state_mean, 295 | corrected_state_covariance) 296 | 297 | 298 | def _filter(transition_matrices, observation_matrices, transition_covariance, 299 | observation_covariance, transition_offsets, observation_offsets, 300 | initial_state_mean, initial_state_covariance, observations): 301 | """Apply the Kalman Filter 302 | 303 | Calculate posterior distribution over hidden states given observations up 304 | to and including the current time step. 
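
    In the notation of the ``_em_*`` docstrings below (transition matrix
    :math:`A_t`, offsets :math:`b_t` and :math:`d_t`, observation matrix
    :math:`C_t`, covariances :math:`Q` and :math:`R`), each iteration applies
    the standard predict/correct recursion implemented by `_filter_predict`
    and `_filter_correct`; this note simply restates those computations:

    .. math::

        x_{t|t-1} = A_{t-1} x_{t-1|t-1} + b_{t-1}, \qquad
        P_{t|t-1} = A_{t-1} P_{t-1|t-1} A_{t-1}^T + Q

        K_t = P_{t|t-1} C_t^T (C_t P_{t|t-1} C_t^T + R)^{-1}

        x_{t|t} = x_{t|t-1} + K_t (z_t - C_t x_{t|t-1} - d_t), \qquad
        P_{t|t} = P_{t|t-1} - K_t C_t P_{t|t-1}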
305 | 
306 |     Parameters
307 |     ----------
308 |     transition_matrices : [n_timesteps-1,n_dim_state,n_dim_state] or
309 |     [n_dim_state,n_dim_state] array-like
310 |         state transition matrices
311 |     observation_matrices : [n_timesteps, n_dim_obs, n_dim_state] or [n_dim_obs, \
312 |     n_dim_state] array-like
313 |         observation matrix
314 |     transition_covariance : [n_timesteps-1,n_dim_state,n_dim_state] or
315 |     [n_dim_state,n_dim_state] array-like
316 |         state transition covariance matrix
317 |     observation_covariance : [n_timesteps, n_dim_obs, n_dim_obs] or [n_dim_obs,
318 |     n_dim_obs] array-like
319 |         observation covariance matrix
320 |     transition_offsets : [n_timesteps-1, n_dim_state] or [n_dim_state] \
321 |     array-like
322 |         state offset
323 |     observation_offsets : [n_timesteps, n_dim_obs] or [n_dim_obs] array-like
324 |         offsets for observations for t = [0...n_timesteps-1]
325 |     initial_state_mean : [n_dim_state] array-like
326 |         mean of initial state distribution
327 |     initial_state_covariance : [n_dim_state, n_dim_state] array-like
328 |         covariance of initial state distribution
329 |     observations : [n_timesteps, n_dim_obs] array
330 |         observations from times [0...n_timesteps-1]. If `observations` is a
331 |         masked array and any of `observations[t]` is masked, then
332 |         `observations[t]` will be treated as a missing observation.
333 | 
334 |     Returns
335 |     -------
336 |     predicted_state_means : [n_timesteps, n_dim_state] array
337 |         `predicted_state_means[t]` = mean of hidden state at time t given
338 |         observations from times [0...t-1]
339 |     predicted_state_covariances : [n_timesteps, n_dim_state, n_dim_state] array
340 |         `predicted_state_covariances[t]` = covariance of hidden state at time t
341 |         given observations from times [0...t-1]
342 |     kalman_gains : [n_timesteps, n_dim_state, n_dim_obs] array
343 |         `kalman_gains[t]` = Kalman gain matrix for time t
344 |     filtered_state_means : [n_timesteps, n_dim_state] array
345 |         `filtered_state_means[t]` = mean of hidden state at time t given
346 |         observations from times [0...t]
347 |     filtered_state_covariances : [n_timesteps, n_dim_state, n_dim_state] array
348 |         `filtered_state_covariances[t]` = covariance of hidden state at time t
349 |         given observations from times [0...t]
350 |     """
351 |     n_timesteps = observations.shape[0]
352 |     n_dim_state = len(initial_state_mean)
353 |     n_dim_obs = observations.shape[1]
354 | 
355 |     predicted_state_means = np.zeros((n_timesteps, n_dim_state))
356 |     predicted_state_covariances = np.zeros(
357 |         (n_timesteps, n_dim_state, n_dim_state)
358 |     )
359 |     kalman_gains = np.zeros((n_timesteps, n_dim_state, n_dim_obs))
360 |     filtered_state_means = np.zeros((n_timesteps, n_dim_state))
361 |     filtered_state_covariances = np.zeros(
362 |         (n_timesteps, n_dim_state, n_dim_state)
363 |     )
364 | 
365 |     for t in range(n_timesteps):
366 |         if t == 0:
367 |             predicted_state_means[t] = initial_state_mean
368 |             predicted_state_covariances[t] = initial_state_covariance
369 |         else:
370 |             transition_matrix = _last_dims(transition_matrices, t - 1)
371 |             transition_covariance_t = _last_dims(transition_covariance, t - 1)  # separate name: don't clobber a time-varying array
372 |             transition_offset = _last_dims(transition_offsets, t - 1, ndims=1)
373 |             predicted_state_means[t], predicted_state_covariances[t] = (
374 |                 _filter_predict(
375 |                     transition_matrix,
376 |                     transition_covariance_t,
377 |                     transition_offset,
378 |                     filtered_state_means[t - 1],
379 |                     filtered_state_covariances[t - 1]
380 |                 )
381 |             )
382 | 
383 |         observation_matrix = _last_dims(observation_matrices, t)
384 |         observation_covariance_t = _last_dims(observation_covariance, t)  # separate name: don't clobber a time-varying array
385 |         observation_offset = _last_dims(observation_offsets, t, ndims=1)
386 |         (kalman_gains[t], filtered_state_means[t],
387 |          filtered_state_covariances[t]) = (
388 |             _filter_correct(observation_matrix,
389 |                 observation_covariance_t,
390 |                 observation_offset,
391 |                 predicted_state_means[t],
392 |                 predicted_state_covariances[t],
393 |                 observations[t]
394 |             )
395 |         )
396 | 
397 |     return (predicted_state_means, predicted_state_covariances,
398 |             kalman_gains, filtered_state_means,
399 |             filtered_state_covariances)
400 | 
401 | 
402 | def _smooth_update(transition_matrix, filtered_state_mean,
403 |                    filtered_state_covariance, predicted_state_mean,
404 |                    predicted_state_covariance, next_smoothed_state_mean,
405 |                    next_smoothed_state_covariance):
406 |     r"""Correct a predicted state with a Kalman Smoother update
407 | 
408 |     Calculates the posterior distribution of the hidden state at time `t`
409 |     given all observations via Kalman smoothing.
410 | 
411 |     Parameters
412 |     ----------
413 |     transition_matrix : [n_dim_state, n_dim_state] array
414 |         state transition matrix from time t to t+1
415 |     filtered_state_mean : [n_dim_state] array
416 |         mean of filtered state at time t given observations from
417 |         times [0...t]
418 |     filtered_state_covariance : [n_dim_state, n_dim_state] array
419 |         covariance of filtered state at time t given observations from
420 |         times [0...t]
421 |     predicted_state_mean : [n_dim_state] array
422 |         mean of predicted state at time t+1 given observations from
423 |         times [0...t]
424 |     predicted_state_covariance : [n_dim_state, n_dim_state] array
425 |         covariance of predicted state at time t+1 given observations from
426 |         times [0...t]
427 |     next_smoothed_state_mean : [n_dim_state] array
428 |         mean of smoothed state at time t+1 given observations from
429 |         times [0...n_timesteps-1]
430 |     next_smoothed_state_covariance : [n_dim_state, n_dim_state] array
431 |         covariance of smoothed state at time t+1 given observations from
432 |         times [0...n_timesteps-1]
433 | 
434 |     Returns
435 |     -------
436 |     smoothed_state_mean : [n_dim_state] array
437 |         mean of smoothed state at time t given observations from times
438 |         [0...n_timesteps-1]
439 |     smoothed_state_covariance : [n_dim_state, n_dim_state] array
440 |         covariance of smoothed state at time t given observations from
441 |         times [0...n_timesteps-1]
442 |     kalman_smoothing_gain : [n_dim_state, n_dim_state] array
443 |         correction matrix for Kalman Smoothing at time t
444 |     """
445 |     kalman_smoothing_gain = (
446 |         np.dot(filtered_state_covariance,
447 |                np.dot(transition_matrix.T,
448 |                       linalg.pinv(predicted_state_covariance)))
449 |     )
450 | 
451 |     smoothed_state_mean = (
452 |         filtered_state_mean
453 |         + np.dot(kalman_smoothing_gain,
454 |                  next_smoothed_state_mean - predicted_state_mean)
455 |     )
456 |     smoothed_state_covariance = (
457 |         filtered_state_covariance
458 |         + np.dot(kalman_smoothing_gain,
459 |                  np.dot(
460 |                      (next_smoothed_state_covariance
461 |                       - predicted_state_covariance),
462 |                      kalman_smoothing_gain.T
463 |                  ))
464 |     )
465 | 
466 |     return (smoothed_state_mean, smoothed_state_covariance,
467 |             kalman_smoothing_gain)
468 | 
469 | 
470 | def _smooth(transition_matrices, filtered_state_means,
471 |             filtered_state_covariances, predicted_state_means,
472 |             predicted_state_covariances):
473 |     """Apply the Kalman Smoother
474 | 
475 |     Estimate the hidden state at each time step given all
476 |     observations.
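
    For reference, each backward step applies the Rauch-Tung-Striebel
    recursion implemented by `_smooth_update`, with smoothing gain
    :math:`L_t` (the code uses a pseudo-inverse in place of the inverse):

    .. math::

        L_t = P_{t|t} A_t^T P_{t+1|t}^{-1}

        x_{t|T} = x_{t|t} + L_t (x_{t+1|T} - x_{t+1|t}), \qquad
        P_{t|T} = P_{t|t} + L_t (P_{t+1|T} - P_{t+1|t}) L_t^T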
477 | 478 | Parameters 479 | ---------- 480 | transition_matrices : [n_timesteps-1, n_dim_state, n_dim_state] or \ 481 | [n_dim_state, n_dim_state] array 482 | `transition_matrices[t]` = transition matrix from time t to t+1 483 | filtered_state_means : [n_timesteps, n_dim_state] array 484 | `filtered_state_means[t]` = mean state estimate for time t given 485 | observations from times [0...t] 486 | filtered_state_covariances : [n_timesteps, n_dim_state, n_dim_state] array 487 | `filtered_state_covariances[t]` = covariance of state estimate for time 488 | t given observations from times [0...t] 489 | predicted_state_means : [n_timesteps, n_dim_state] array 490 | `predicted_state_means[t]` = mean state estimate for time t given 491 | observations from times [0...t-1] 492 | predicted_state_covariances : [n_timesteps, n_dim_state, n_dim_state] array 493 | `predicted_state_covariances[t]` = covariance of state estimate for 494 | time t given observations from times [0...t-1] 495 | 496 | Returns 497 | ------- 498 | smoothed_state_means : [n_timesteps, n_dim_state] 499 | mean of hidden state distributions for times [0...n_timesteps-1] given 500 | all observations 501 | smoothed_state_covariances : [n_timesteps, n_dim_state, n_dim_state] array 502 | covariance matrix of hidden state distributions for times 503 | [0...n_timesteps-1] given all observations 504 | kalman_smoothing_gains : [n_timesteps-1, n_dim_state, n_dim_state] array 505 | Kalman Smoothing correction matrices for times [0...n_timesteps-2] 506 | """ 507 | n_timesteps, n_dim_state = filtered_state_means.shape 508 | 509 | smoothed_state_means = np.zeros((n_timesteps, n_dim_state)) 510 | smoothed_state_covariances = np.zeros((n_timesteps, n_dim_state, 511 | n_dim_state)) 512 | kalman_smoothing_gains = np.zeros((n_timesteps - 1, n_dim_state, 513 | n_dim_state)) 514 | 515 | smoothed_state_means[-1] = filtered_state_means[-1] 516 | smoothed_state_covariances[-1] = filtered_state_covariances[-1] 517 | 518 | for t in reversed(range(n_timesteps - 1)): 519 | transition_matrix = _last_dims(transition_matrices, t) 520 | (smoothed_state_means[t], smoothed_state_covariances[t], 521 | kalman_smoothing_gains[t]) = ( 522 | _smooth_update( 523 | transition_matrix, 524 | filtered_state_means[t], 525 | filtered_state_covariances[t], 526 | predicted_state_means[t + 1], 527 | predicted_state_covariances[t + 1], 528 | smoothed_state_means[t + 1], 529 | smoothed_state_covariances[t + 1] 530 | ) 531 | ) 532 | return (smoothed_state_means, smoothed_state_covariances, 533 | kalman_smoothing_gains) 534 | 535 | 536 | def _smooth_pair(smoothed_state_covariances, kalman_smoothing_gain): 537 | r"""Calculate pairwise covariance between hidden states 538 | 539 | Calculate covariance between hidden states at :math:`t` and :math:`t-1` for 540 | all time step pairs 541 | 542 | Parameters 543 | ---------- 544 | smoothed_state_covariances : [n_timesteps, n_dim_state, n_dim_state] array 545 | covariance of hidden state given all observations 546 | kalman_smoothing_gain : [n_timesteps-1, n_dim_state, n_dim_state] 547 | Correction matrices from Kalman Smoothing 548 | 549 | Returns 550 | ------- 551 | pairwise_covariances : [n_timesteps, n_dim_state, n_dim_state] array 552 | Covariance between hidden states at times t and t-1 for t = 553 | [1...n_timesteps-1]. Time 0 is ignored. 
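
    Each entry restates the computation in the loop below,

    .. math::

        \mathrm{Cov}(x_t, x_{t-1}) = P_{t|T} L_{t-1}^T

    where :math:`L_{t-1}` is the Kalman smoothing gain returned by
    `_smooth_update`.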
554 | """ 555 | n_timesteps, n_dim_state, _ = smoothed_state_covariances.shape 556 | pairwise_covariances = np.zeros((n_timesteps, n_dim_state, n_dim_state)) 557 | for t in range(1, n_timesteps): 558 | pairwise_covariances[t] = ( 559 | np.dot(smoothed_state_covariances[t], 560 | kalman_smoothing_gain[t - 1].T) 561 | ) 562 | return pairwise_covariances 563 | 564 | 565 | def _em(observations, transition_offsets, observation_offsets, 566 | smoothed_state_means, smoothed_state_covariances, pairwise_covariances, 567 | given={}): 568 | """Apply the EM Algorithm to the Linear-Gaussian model 569 | 570 | Estimate Linear-Gaussian model parameters by maximizing the expected log 571 | likelihood of all observations. 572 | 573 | Parameters 574 | ---------- 575 | observations : [n_timesteps, n_dim_obs] array 576 | observations for times [0...n_timesteps-1]. If observations is a 577 | masked array and any of observations[t] is masked, then it will be 578 | treated as a missing observation. 579 | transition_offsets : [n_dim_state] or [n_timesteps-1, n_dim_state] array 580 | transition offset 581 | observation_offsets : [n_dim_obs] or [n_timesteps, n_dim_obs] array 582 | observation offsets 583 | smoothed_state_means : [n_timesteps, n_dim_state] array 584 | smoothed_state_means[t] = mean of state at time t given all 585 | observations 586 | smoothed_state_covariances : [n_timesteps, n_dim_state, n_dim_state] array 587 | smoothed_state_covariances[t] = covariance of state at time t given all 588 | observations 589 | pairwise_covariances : [n_timesteps, n_dim_state, n_dim_state] array 590 | pairwise_covariances[t] = covariance between states at times t and 591 | t-1 given all observations. pairwise_covariances[0] is ignored. 592 | given: dict 593 | if one of the variables EM is capable of predicting is in given, then 594 | that value will be used and EM will not attempt to estimate it. e.g., 595 | if 'observation_matrix' is in given, observation_matrix will not be 596 | estimated and given['observation_matrix'] will be returned in its 597 | place. 
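
    Each update below maximizes the expected complete-data log likelihood

    .. math::

        \mathbb{E}_{x_{0:T-1} \mid z_{0:T-1}}
            \left[ \log p(z_{0:T-1}, x_{0:T-1}) \right]

    with respect to a single parameter, holding the smoothed moments fixed;
    the closed forms are stated in the docstrings of the `_em_*` helpers
    below.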
598 | 599 | Returns 600 | ------- 601 | transition_matrix : [n_dim_state, n_dim_state] array 602 | estimated transition matrix 603 | observation_matrix : [n_dim_obs, n_dim_state] array 604 | estimated observation matrix 605 | transition_offsets : [n_dim_state] array 606 | estimated transition offset 607 | observation_offsets : [n_dim_obs] array 608 | estimated observation offset 609 | transition_covariance : [n_dim_state, n_dim_state] array 610 | estimated covariance matrix for state transitions 611 | observation_covariance : [n_dim_obs, n_dim_obs] array 612 | estimated covariance matrix for observations 613 | initial_state_mean : [n_dim_state] array 614 | estimated mean of initial state distribution 615 | initial_state_covariance : [n_dim_state] array 616 | estimated covariance of initial state distribution 617 | """ 618 | if 'observation_matrices' in given: 619 | observation_matrix = given['observation_matrices'] 620 | else: 621 | observation_matrix = _em_observation_matrix( 622 | observations, observation_offsets, 623 | smoothed_state_means, smoothed_state_covariances 624 | ) 625 | 626 | if 'observation_covariance' in given: 627 | observation_covariance = given['observation_covariance'] 628 | else: 629 | observation_covariance = _em_observation_covariance( 630 | observations, observation_offsets, 631 | observation_matrix, smoothed_state_means, 632 | smoothed_state_covariances 633 | ) 634 | 635 | if 'transition_matrices' in given: 636 | transition_matrix = given['transition_matrices'] 637 | else: 638 | transition_matrix = _em_transition_matrix( 639 | transition_offsets, smoothed_state_means, 640 | smoothed_state_covariances, pairwise_covariances 641 | ) 642 | 643 | if 'transition_covariance' in given: 644 | transition_covariance = given['transition_covariance'] 645 | else: 646 | transition_covariance = _em_transition_covariance( 647 | transition_matrix, transition_offsets, 648 | smoothed_state_means, smoothed_state_covariances, 649 | pairwise_covariances 650 | ) 651 | 652 | if 'initial_state_mean' in given: 653 | initial_state_mean = given['initial_state_mean'] 654 | else: 655 | initial_state_mean = _em_initial_state_mean(smoothed_state_means) 656 | 657 | if 'initial_state_covariance' in given: 658 | initial_state_covariance = given['initial_state_covariance'] 659 | else: 660 | initial_state_covariance = _em_initial_state_covariance( 661 | initial_state_mean, smoothed_state_means, 662 | smoothed_state_covariances 663 | ) 664 | 665 | if 'transition_offsets' in given: 666 | transition_offset = given['transition_offsets'] 667 | else: 668 | transition_offset = _em_transition_offset( 669 | transition_matrix, 670 | smoothed_state_means 671 | ) 672 | 673 | if 'observation_offsets' in given: 674 | observation_offset = given['observation_offsets'] 675 | else: 676 | observation_offset = _em_observation_offset( 677 | observation_matrix, smoothed_state_means, 678 | observations 679 | ) 680 | 681 | return (transition_matrix, observation_matrix, transition_offset, 682 | observation_offset, transition_covariance, 683 | observation_covariance, initial_state_mean, 684 | initial_state_covariance) 685 | 686 | 687 | def _em_observation_matrix(observations, observation_offsets, 688 | smoothed_state_means, smoothed_state_covariances): 689 | r"""Apply the EM algorithm to parameter `observation_matrix` 690 | 691 | Maximize expected log likelihood of observations with respect to the 692 | observation matrix `observation_matrix`. 693 | 694 | .. 
math::
695 | 
696 |         C &= ( \sum_{t=0}^{T-1} (z_t - d_t) \mathbb{E}[x_t]^T )
697 |             ( \sum_{t=0}^{T-1} \mathbb{E}[x_t x_t^T] )^{-1}
698 | 
699 |     """
700 |     _, n_dim_state = smoothed_state_means.shape
701 |     n_timesteps, n_dim_obs = observations.shape
702 |     res1 = np.zeros((n_dim_obs, n_dim_state))
703 |     res2 = np.zeros((n_dim_state, n_dim_state))
704 |     for t in range(n_timesteps):
705 |         if not np.any(np.ma.getmask(observations[t])):
706 |             observation_offset = _last_dims(observation_offsets, t, ndims=1)
707 |             res1 += np.outer(observations[t] - observation_offset,
708 |                              smoothed_state_means[t])
709 |             res2 += (
710 |                 smoothed_state_covariances[t]
711 |                 + np.outer(smoothed_state_means[t], smoothed_state_means[t])
712 |             )
713 |     return np.dot(res1, linalg.pinv(res2))
714 | 
715 | 
716 | def _em_observation_covariance(observations, observation_offsets,
717 |                                observation_matrices, smoothed_state_means,
718 |                                smoothed_state_covariances):
719 |     r"""Apply the EM algorithm to parameter `observation_covariance`
720 | 
721 |     Maximize expected log likelihood of observations with respect to the
722 |     observation covariance matrix `observation_covariance`.
723 | 
724 |     .. math::
725 | 
726 |         R &= \frac{1}{T} \sum_{t=0}^{T-1}
727 |                 [z_t - C_t \mathbb{E}[x_t] - d_t]
728 |                     [z_t - C_t \mathbb{E}[x_t] - d_t]^T
729 |                 + C_t Var(x_t) C_t^T
730 |     """
731 |     _, n_dim_state = smoothed_state_means.shape
732 |     n_timesteps, n_dim_obs = observations.shape
733 |     res = np.zeros((n_dim_obs, n_dim_obs))
734 |     n_obs = 0
735 |     for t in range(n_timesteps):
736 |         if not np.any(np.ma.getmask(observations[t])):
737 |             observation_matrix = _last_dims(observation_matrices, t)
738 |             observation_offset = _last_dims(observation_offsets, t, ndims=1)
739 |             err = (
740 |                 observations[t]
741 |                 - np.dot(observation_matrix, smoothed_state_means[t])
742 |                 - observation_offset
743 |             )
744 |             res += (
745 |                 np.outer(err, err)
746 |                 + np.dot(observation_matrix,
747 |                          np.dot(smoothed_state_covariances[t],
748 |                                 observation_matrix.T))
749 |             )
750 |             n_obs += 1
751 |     if n_obs > 0:
752 |         return (1.0 / n_obs) * res
753 |     else:
754 |         return res
755 | 
756 | 
757 | def _em_transition_matrix(transition_offsets, smoothed_state_means,
758 |                           smoothed_state_covariances, pairwise_covariances):
759 |     r"""Apply the EM algorithm to parameter `transition_matrix`
760 | 
761 |     Maximize expected log likelihood of observations with respect to the state
762 |     transition matrix `transition_matrix`.
763 | 
764 |     ..
math:: 765 | 766 | A &= ( \sum_{t=1}^{T-1} \mathbb{E}[x_t x_{t-1}^{T}] 767 | - b_{t-1} \mathbb{E}[x_{t-1}]^T ) 768 | ( \sum_{t=1}^{T-1} \mathbb{E}[x_{t-1} x_{t-1}^T] )^{-1} 769 | """ 770 | n_timesteps, n_dim_state, _ = smoothed_state_covariances.shape 771 | res1 = np.zeros((n_dim_state, n_dim_state)) 772 | res2 = np.zeros((n_dim_state, n_dim_state)) 773 | for t in range(1, n_timesteps): 774 | transition_offset = _last_dims(transition_offsets, t - 1, ndims=1) 775 | res1 += ( 776 | pairwise_covariances[t] 777 | + np.outer(smoothed_state_means[t], 778 | smoothed_state_means[t - 1]) 779 | - np.outer(transition_offset, smoothed_state_means[t - 1]) 780 | ) 781 | res2 += ( 782 | smoothed_state_covariances[t - 1] 783 | + np.outer(smoothed_state_means[t - 1], 784 | smoothed_state_means[t - 1]) 785 | ) 786 | return np.dot(res1, linalg.pinv(res2)) 787 | 788 | 789 | def _em_transition_covariance(transition_matrices, transition_offsets, 790 | smoothed_state_means, smoothed_state_covariances, 791 | pairwise_covariances): 792 | r"""Apply the EM algorithm to parameter `transition_covariance` 793 | 794 | Maximize expected log likelihood of observations with respect to the 795 | transition covariance matrix `transition_covariance`. 796 | 797 | .. math:: 798 | 799 | Q &= \frac{1}{T-1} \sum_{t=0}^{T-2} 800 | (\mathbb{E}[x_{t+1}] - A_t \mathbb{E}[x_t] - b_t) 801 | (\mathbb{E}[x_{t+1}] - A_t \mathbb{E}[x_t] - b_t)^T 802 | + A_t Var(x_t) A_t^T + Var(x_{t+1}) 803 | - Cov(x_{t+1}, x_t) A_t^T - A_t Cov(x_t, x_{t+1}) 804 | """ 805 | n_timesteps, n_dim_state, _ = smoothed_state_covariances.shape 806 | res = np.zeros((n_dim_state, n_dim_state)) 807 | for t in range(n_timesteps - 1): 808 | transition_matrix = _last_dims(transition_matrices, t) 809 | transition_offset = _last_dims(transition_offsets, t, ndims=1) 810 | err = ( 811 | smoothed_state_means[t + 1] 812 | - np.dot(transition_matrix, smoothed_state_means[t]) 813 | - transition_offset 814 | ) 815 | Vt1t_A = ( 816 | np.dot(pairwise_covariances[t + 1], 817 | transition_matrix.T) 818 | ) 819 | res += ( 820 | np.outer(err, err) 821 | + np.dot(transition_matrix, 822 | np.dot(smoothed_state_covariances[t], 823 | transition_matrix.T)) 824 | + smoothed_state_covariances[t + 1] 825 | - Vt1t_A - Vt1t_A.T 826 | ) 827 | 828 | return (1.0 / (n_timesteps - 1)) * res 829 | 830 | 831 | def _em_initial_state_mean(smoothed_state_means): 832 | r"""Apply the EM algorithm to parameter `initial_state_mean` 833 | 834 | Maximize expected log likelihood of observations with respect to the 835 | initial state distribution mean `initial_state_mean`. 836 | 837 | .. math:: 838 | 839 | \mu_0 = \mathbb{E}[x_0] 840 | """ 841 | 842 | return smoothed_state_means[0] 843 | 844 | 845 | def _em_initial_state_covariance(initial_state_mean, smoothed_state_means, 846 | smoothed_state_covariances): 847 | r"""Apply the EM algorithm to parameter `initial_state_covariance` 848 | 849 | Maximize expected log likelihood of observations with respect to the 850 | covariance of the initial state distribution `initial_state_covariance`. 851 | 852 | .. 
math::
853 | 
854 |         \Sigma_0 = \mathbb{E}[x_0 x_0^T] - \mu_0 \mathbb{E}[x_0]^T
855 |                    - \mathbb{E}[x_0] \mu_0^T + \mu_0 \mu_0^T
856 |     """
857 |     x0 = smoothed_state_means[0]
858 |     x0_x0 = smoothed_state_covariances[0] + np.outer(x0, x0)
859 |     return (
860 |         x0_x0
861 |         - np.outer(initial_state_mean, x0)
862 |         - np.outer(x0, initial_state_mean)
863 |         + np.outer(initial_state_mean, initial_state_mean)
864 |     )
865 | 
866 | 
867 | def _em_transition_offset(transition_matrices, smoothed_state_means):
868 |     r"""Apply the EM algorithm to parameter `transition_offset`
869 | 
870 |     Maximize expected log likelihood of observations with respect to the
871 |     state transition offset `transition_offset`.
872 | 
873 |     .. math::
874 | 
875 |         b = \frac{1}{T-1} \sum_{t=1}^{T-1}
876 |                 \mathbb{E}[x_t] - A_{t-1} \mathbb{E}[x_{t-1}]
877 |     """
878 |     n_timesteps, n_dim_state = smoothed_state_means.shape
879 |     transition_offset = np.zeros(n_dim_state)
880 |     for t in range(1, n_timesteps):
881 |         transition_matrix = _last_dims(transition_matrices, t - 1)
882 |         transition_offset += (
883 |             smoothed_state_means[t]
884 |             - np.dot(transition_matrix, smoothed_state_means[t - 1])
885 |         )
886 |     if n_timesteps > 1:
887 |         return (1.0 / (n_timesteps - 1)) * transition_offset
888 |     else:
889 |         return np.zeros(n_dim_state)
890 | 
891 | 
892 | def _em_observation_offset(observation_matrices, smoothed_state_means,
893 |                            observations):
894 |     r"""Apply the EM algorithm to parameter `observation_offset`
895 | 
896 |     Maximize expected log likelihood of observations with respect to the
897 |     observation offset `observation_offset`.
898 | 
899 |     .. math::
900 | 
901 |         d = \frac{1}{T} \sum_{t=0}^{T-1} z_t - C_{t} \mathbb{E}[x_{t}]
902 |     """
903 |     n_timesteps, n_dim_obs = observations.shape
904 |     observation_offset = np.zeros(n_dim_obs)
905 |     n_obs = 0
906 |     for t in range(n_timesteps):
907 |         if not np.any(np.ma.getmask(observations[t])):
908 |             observation_matrix = _last_dims(observation_matrices, t)
909 |             observation_offset += (
910 |                 observations[t]
911 |                 - np.dot(observation_matrix, smoothed_state_means[t])
912 |             )
913 |             n_obs += 1
914 |     if n_obs > 0:
915 |         return (1.0 / n_obs) * observation_offset
916 |     else:
917 |         return observation_offset
918 | 
919 | 
920 | class KalmanFilter(object):
921 |     """Implements the Kalman Filter, Kalman Smoother, and EM algorithm.
922 | 
923 |     This class implements the Kalman Filter, Kalman Smoother, and EM Algorithm
924 |     for a Linear Gaussian model specified by,
925 | 
926 |     .. math::
927 | 
928 |         x_{t+1} &= A_{t} x_{t} + b_{t} + \\text{Normal}(0, Q_{t}) \\\\
929 |         z_{t} &= C_{t} x_{t} + d_{t} + \\text{Normal}(0, R_{t})
930 | 
931 |     The Kalman Filter is an algorithm designed to estimate
932 |     :math:`P(x_t | z_{0:t})`. As all state transitions and observations are
933 |     linear with Gaussian distributed noise, these distributions can be
934 |     represented exactly as Gaussian distributions with mean
935 |     `filtered_state_means[t]` and covariances `filtered_state_covariances[t]`.
936 | 
937 |     Similarly, the Kalman Smoother is an algorithm designed to estimate
938 |     :math:`P(x_t | z_{0:T-1})`.
939 | 
940 |     The EM algorithm aims to find, for
941 |     :math:`\\theta = (A, b, C, d, Q, R, \\mu_0, \\Sigma_0)`,
942 | 
943 |     .. math::
944 | 
945 |         \\max_{\\theta} P(z_{0:T-1}; \\theta)
946 | 
947 |     If we define :math:`L(x_{0:T-1},\\theta) = \\log P(z_{0:T-1}, x_{0:T-1};
948 |     \\theta)`, then the EM algorithm works by iteratively finding,
949 | 
950 |     ..
math:: 951 | 952 | P(x_{0:T-1} | z_{0:T-1}, \\theta_i) 953 | 954 | then by maximizing, 955 | 956 | .. math:: 957 | 958 | \\theta_{i+1} = \\arg\\max_{\\theta} 959 | \\mathbb{E}_{x_{0:T-1}} [ 960 | L(x_{0:T-1}, \\theta)| z_{0:T-1}, \\theta_i 961 | ] 962 | 963 | Parameters 964 | ---------- 965 | transition_matrices : [n_timesteps-1, n_dim_state, n_dim_state] or \ 966 | [n_dim_state,n_dim_state] array-like 967 | Also known as :math:`A`. state transition matrix between times t and 968 | t+1 for t in [0...n_timesteps-2] 969 | observation_matrices : [n_timesteps, n_dim_obs, n_dim_state] or [n_dim_obs, \ 970 | n_dim_state] array-like 971 | Also known as :math:`C`. observation matrix for times 972 | [0...n_timesteps-1] 973 | transition_covariance : [n_dim_state, n_dim_state] array-like 974 | Also known as :math:`Q`. state transition covariance matrix for times 975 | [0...n_timesteps-2] 976 | observation_covariance : [n_dim_obs, n_dim_obs] array-like 977 | Also known as :math:`R`. observation covariance matrix for times 978 | [0...n_timesteps-1] 979 | transition_offsets : [n_timesteps-1, n_dim_state] or [n_dim_state] \ 980 | array-like 981 | Also known as :math:`b`. state offsets for times [0...n_timesteps-2] 982 | observation_offsets : [n_timesteps, n_dim_obs] or [n_dim_obs] array-like 983 | Also known as :math:`d`. observation offset for times 984 | [0...n_timesteps-1] 985 | initial_state_mean : [n_dim_state] array-like 986 | Also known as :math:`\\mu_0`. mean of initial state distribution 987 | initial_state_covariance : [n_dim_state, n_dim_state] array-like 988 | Also known as :math:`\\Sigma_0`. covariance of initial state 989 | distribution 990 | random_state : optional, numpy random state 991 | random number generator used in sampling 992 | em_vars : optional, subset of ['transition_matrices', \ 993 | 'observation_matrices', 'transition_offsets', 'observation_offsets', \ 994 | 'transition_covariance', 'observation_covariance', 'initial_state_mean', \ 995 | 'initial_state_covariance'] or 'all' 996 | if `em_vars` is an iterable of strings only variables in `em_vars` 997 | will be estimated using EM. if `em_vars` == 'all', then all 998 | variables will be estimated. 999 | n_dim_state: optional, integer 1000 | the dimensionality of the state space. Only meaningful when you do not 1001 | specify initial values for `transition_matrices`, `transition_offsets`, 1002 | `transition_covariance`, `initial_state_mean`, or 1003 | `initial_state_covariance`. 1004 | n_dim_obs: optional, integer 1005 | the dimensionality of the observation space. Only meaningful when you 1006 | do not specify initial values for `observation_matrices`, 1007 | `observation_offsets`, or `observation_covariance`. 
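
    Example (a minimal sketch with made-up one-dimensional matrices; any
    model matrix left unspecified falls back to the identity/zero defaults
    described above)::

        import numpy as np
        kf = KalmanFilter(transition_matrices=np.array([[1.0]]),
                          observation_matrices=np.array([[1.0]]),
                          transition_covariance=np.array([[0.1]]),
                          observation_covariance=np.array([[1.0]]))
        filtered_means, filtered_covs = kf.filter(np.array([[1.0], [0.5], [-0.3]]))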
1008 | """ 1009 | def __init__(self, transition_matrices=None, observation_matrices=None, 1010 | transition_covariance=None, observation_covariance=None, 1011 | transition_offsets=None, observation_offsets=None, 1012 | initial_state_mean=None, initial_state_covariance=None, 1013 | random_state=None, 1014 | em_vars=['transition_covariance', 'observation_covariance', 1015 | 'initial_state_mean', 'initial_state_covariance'], 1016 | n_dim_state=None, n_dim_obs=None): 1017 | """Initialize Kalman Filter""" 1018 | 1019 | # determine size of state space 1020 | n_dim_state = _determine_dimensionality( 1021 | [(transition_matrices, array2d, -2), 1022 | (transition_offsets, array1d, -1), 1023 | (transition_covariance, array2d, -2), 1024 | (initial_state_mean, array1d, -1), 1025 | (initial_state_covariance, array2d, -2), 1026 | (observation_matrices, array2d, -1)], 1027 | n_dim_state 1028 | ) 1029 | n_dim_obs = _determine_dimensionality( 1030 | [(observation_matrices, array2d, -2), 1031 | (observation_offsets, array1d, -1), 1032 | (observation_covariance, array2d, -2)], 1033 | n_dim_obs 1034 | ) 1035 | 1036 | self.transition_matrices = transition_matrices 1037 | self.observation_matrices = observation_matrices 1038 | self.transition_covariance = transition_covariance 1039 | self.observation_covariance = observation_covariance 1040 | self.transition_offsets = transition_offsets 1041 | self.observation_offsets = observation_offsets 1042 | self.initial_state_mean = initial_state_mean 1043 | self.initial_state_covariance = initial_state_covariance 1044 | self.random_state = random_state 1045 | self.em_vars = em_vars 1046 | self.n_dim_state = n_dim_state 1047 | self.n_dim_obs = n_dim_obs 1048 | 1049 | def sample(self, n_timesteps, initial_state=None, random_state=None): 1050 | """Sample a state sequence :math:`n_{\\text{timesteps}}` timesteps in 1051 | length. 
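
        The initial state is drawn from the initial state distribution unless
        `initial_state` is provided; each subsequent state is generated from
        the transition equation and each observation from the observation
        equation, with Gaussian noise drawn from the corresponding covariance
        matrices.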
1052 | 1053 | Parameters 1054 | ---------- 1055 | n_timesteps : int 1056 | number of timesteps 1057 | 1058 | Returns 1059 | ------- 1060 | states : [n_timesteps, n_dim_state] array 1061 | hidden states corresponding to times [0...n_timesteps-1] 1062 | observations : [n_timesteps, n_dim_obs] array 1063 | observations corresponding to times [0...n_timesteps-1] 1064 | """ 1065 | (transition_matrices, transition_offsets, transition_covariance, 1066 | observation_matrices, observation_offsets, observation_covariance, 1067 | initial_state_mean, initial_state_covariance) = ( 1068 | self._initialize_parameters() 1069 | ) 1070 | 1071 | n_dim_state = transition_matrices.shape[-2] 1072 | n_dim_obs = observation_matrices.shape[-2] 1073 | states = np.zeros((n_timesteps, n_dim_state)) 1074 | observations = np.zeros((n_timesteps, n_dim_obs)) 1075 | 1076 | # logic for instantiating rng 1077 | if random_state is None: 1078 | rng = check_random_state(self.random_state) 1079 | else: 1080 | rng = check_random_state(random_state) 1081 | 1082 | # logic for selecting initial state 1083 | if initial_state is None: 1084 | initial_state = rng.multivariate_normal( 1085 | initial_state_mean, 1086 | initial_state_covariance 1087 | ) 1088 | 1089 | # logic for generating samples 1090 | for t in range(n_timesteps): 1091 | if t == 0: 1092 | states[t] = initial_state 1093 | else: 1094 | transition_matrix = _last_dims( 1095 | transition_matrices, t - 1 1096 | ) 1097 | transition_offset = _last_dims( 1098 | transition_offsets, t - 1, ndims=1 1099 | ) 1100 | transition_covariance = _last_dims( 1101 | transition_covariance, t - 1 1102 | ) 1103 | states[t] = ( 1104 | np.dot(transition_matrix, states[t - 1]) 1105 | + transition_offset 1106 | + rng.multivariate_normal( 1107 | np.zeros(n_dim_state), 1108 | transition_covariance.newbyteorder('=') 1109 | ) 1110 | ) 1111 | 1112 | observation_matrix = _last_dims( 1113 | observation_matrices, t 1114 | ) 1115 | observation_offset = _last_dims( 1116 | observation_offsets, t, ndims=1 1117 | ) 1118 | observation_covariance = _last_dims( 1119 | observation_covariance, t 1120 | ) 1121 | observations[t] = ( 1122 | np.dot(observation_matrix, states[t]) 1123 | + observation_offset 1124 | + rng.multivariate_normal( 1125 | np.zeros(n_dim_obs), 1126 | observation_covariance.newbyteorder('=') 1127 | ) 1128 | ) 1129 | 1130 | return (states, np.ma.array(observations)) 1131 | 1132 | def filter(self, X): 1133 | """Apply the Kalman Filter 1134 | 1135 | Apply the Kalman Filter to estimate the hidden state at time :math:`t` 1136 | for :math:`t = [0...n_{\\text{timesteps}}-1]` given observations up to 1137 | and including time `t`. Observations are assumed to correspond to 1138 | times :math:`[0...n_{\\text{timesteps}}-1]`. The output of this method 1139 | corresponding to time :math:`n_{\\text{timesteps}}-1` can be used in 1140 | :func:`KalmanFilter.filter_update` for online updating. 1141 | 1142 | Parameters 1143 | ---------- 1144 | X : [n_timesteps, n_dim_obs] array-like 1145 | observations corresponding to times [0...n_timesteps-1]. If `X` is 1146 | a masked array and any of `X[t]` is masked, then `X[t]` will be 1147 | treated as a missing observation. 
1148 | 1149 | Returns 1150 | ------- 1151 | filtered_state_means : [n_timesteps, n_dim_state] 1152 | mean of hidden state distributions for times [0...n_timesteps-1] 1153 | given observations up to and including the current time step 1154 | filtered_state_covariances : [n_timesteps, n_dim_state, n_dim_state] \ 1155 | array 1156 | covariance matrix of hidden state distributions for times 1157 | [0...n_timesteps-1] given observations up to and including the 1158 | current time step 1159 | """ 1160 | Z = self._parse_observations(X) 1161 | 1162 | (transition_matrices, transition_offsets, transition_covariance, 1163 | observation_matrices, observation_offsets, observation_covariance, 1164 | initial_state_mean, initial_state_covariance) = ( 1165 | self._initialize_parameters() 1166 | ) 1167 | 1168 | (_, _, _, filtered_state_means, 1169 | filtered_state_covariances) = ( 1170 | _filter( 1171 | transition_matrices, observation_matrices, 1172 | transition_covariance, observation_covariance, 1173 | transition_offsets, observation_offsets, 1174 | initial_state_mean, initial_state_covariance, 1175 | Z 1176 | ) 1177 | ) 1178 | return (filtered_state_means, filtered_state_covariances) 1179 | 1180 | def filter_update(self, filtered_state_mean, filtered_state_covariance, 1181 | observation=None, transition_matrix=None, 1182 | transition_offset=None, transition_covariance=None, 1183 | observation_matrix=None, observation_offset=None, 1184 | observation_covariance=None): 1185 | r"""Update a Kalman Filter state estimate 1186 | 1187 | Perform a one-step update to estimate the state at time :math:`t+1` 1188 | give an observation at time :math:`t+1` and the previous estimate for 1189 | time :math:`t` given observations from times :math:`[0...t]`. This 1190 | method is useful if one wants to track an object with streaming 1191 | observations. 1192 | 1193 | Parameters 1194 | ---------- 1195 | filtered_state_mean : [n_dim_state] array 1196 | mean estimate for state at time t given observations from times 1197 | [1...t] 1198 | filtered_state_covariance : [n_dim_state, n_dim_state] array 1199 | covariance of estimate for state at time t given observations from 1200 | times [1...t] 1201 | observation : [n_dim_obs] array or None 1202 | observation from time t+1. If `observation` is a masked array and 1203 | any of `observation`'s components are masked or if `observation` is 1204 | None, then `observation` will be treated as a missing observation. 1205 | transition_matrix : optional, [n_dim_state, n_dim_state] array 1206 | state transition matrix from time t to t+1. If unspecified, 1207 | `self.transition_matrices` will be used. 1208 | transition_offset : optional, [n_dim_state] array 1209 | state offset for transition from time t to t+1. If unspecified, 1210 | `self.transition_offset` will be used. 1211 | transition_covariance : optional, [n_dim_state, n_dim_state] array 1212 | state transition covariance from time t to t+1. If unspecified, 1213 | `self.transition_covariance` will be used. 1214 | observation_matrix : optional, [n_dim_obs, n_dim_state] array 1215 | observation matrix at time t+1. If unspecified, 1216 | `self.observation_matrices` will be used. 1217 | observation_offset : optional, [n_dim_obs] array 1218 | observation offset at time t+1. If unspecified, 1219 | `self.observation_offset` will be used. 1220 | observation_covariance : optional, [n_dim_obs, n_dim_obs] array 1221 | observation covariance at time t+1. If unspecified, 1222 | `self.observation_covariance` will be used. 
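
        Example of online tracking (a sketch; `new_observations` is a
        hypothetical iterable of incoming measurements)::

            means, covariances = kf.filter(X)
            mean, covariance = means[-1], covariances[-1]
            for z in new_observations:
                mean, covariance = kf.filter_update(mean, covariance,
                                                    observation=z)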
1223 | 1224 | Returns 1225 | ------- 1226 | next_filtered_state_mean : [n_dim_state] array 1227 | mean estimate for state at time t+1 given observations from times 1228 | [1...t+1] 1229 | next_filtered_state_covariance : [n_dim_state, n_dim_state] array 1230 | covariance of estimate for state at time t+1 given observations 1231 | from times [1...t+1] 1232 | """ 1233 | # initialize matrices 1234 | (transition_matrices, transition_offsets, transition_cov, 1235 | observation_matrices, observation_offsets, observation_cov, 1236 | initial_state_mean, initial_state_covariance) = ( 1237 | self._initialize_parameters() 1238 | ) 1239 | transition_offset = _arg_or_default( 1240 | transition_offset, transition_offsets, 1241 | 1, "transition_offset" 1242 | ) 1243 | observation_offset = _arg_or_default( 1244 | observation_offset, observation_offsets, 1245 | 1, "observation_offset" 1246 | ) 1247 | transition_matrix = _arg_or_default( 1248 | transition_matrix, transition_matrices, 1249 | 2, "transition_matrix" 1250 | ) 1251 | observation_matrix = _arg_or_default( 1252 | observation_matrix, observation_matrices, 1253 | 2, "observation_matrix" 1254 | ) 1255 | transition_covariance = _arg_or_default( 1256 | transition_covariance, transition_cov, 1257 | 2, "transition_covariance" 1258 | ) 1259 | observation_covariance = _arg_or_default( 1260 | observation_covariance, observation_cov, 1261 | 2, "observation_covariance" 1262 | ) 1263 | 1264 | # Make a masked observation if necessary 1265 | if observation is None: 1266 | n_dim_obs = observation_covariance.shape[0] 1267 | observation = np.ma.array(np.zeros(n_dim_obs)) 1268 | observation.mask = True 1269 | else: 1270 | observation = np.ma.asarray(observation) 1271 | 1272 | predicted_state_mean, predicted_state_covariance = ( 1273 | _filter_predict( 1274 | transition_matrix, transition_covariance, 1275 | transition_offset, filtered_state_mean, 1276 | filtered_state_covariance 1277 | ) 1278 | ) 1279 | (_, next_filtered_state_mean, 1280 | next_filtered_state_covariance) = ( 1281 | _filter_correct( 1282 | observation_matrix, observation_covariance, 1283 | observation_offset, predicted_state_mean, 1284 | predicted_state_covariance, observation 1285 | ) 1286 | ) 1287 | 1288 | return (next_filtered_state_mean, next_filtered_state_covariance) 1289 | 1290 | def smooth(self, X): 1291 | """Apply the Kalman Smoother 1292 | 1293 | Apply the Kalman Smoother to estimate the hidden state at time 1294 | :math:`t` for :math:`t = [0...n_{\\text{timesteps}}-1]` given all 1295 | observations. See :func:`_smooth` for more complex output 1296 | 1297 | Parameters 1298 | ---------- 1299 | X : [n_timesteps, n_dim_obs] array-like 1300 | observations corresponding to times [0...n_timesteps-1]. If `X` is 1301 | a masked array and any of `X[t]` is masked, then `X[t]` will be 1302 | treated as a missing observation. 
1303 | 1304 | Returns 1305 | ------- 1306 | smoothed_state_means : [n_timesteps, n_dim_state] 1307 | mean of hidden state distributions for times [0...n_timesteps-1] 1308 | given all observations 1309 | smoothed_state_covariances : [n_timesteps, n_dim_state] 1310 | covariances of hidden state distributions for times 1311 | [0...n_timesteps-1] given all observations 1312 | """ 1313 | Z = self._parse_observations(X) 1314 | 1315 | (transition_matrices, transition_offsets, transition_covariance, 1316 | observation_matrices, observation_offsets, observation_covariance, 1317 | initial_state_mean, initial_state_covariance) = ( 1318 | self._initialize_parameters() 1319 | ) 1320 | 1321 | (predicted_state_means, predicted_state_covariances, 1322 | _, filtered_state_means, filtered_state_covariances) = ( 1323 | _filter( 1324 | transition_matrices, observation_matrices, 1325 | transition_covariance, observation_covariance, 1326 | transition_offsets, observation_offsets, 1327 | initial_state_mean, initial_state_covariance, Z 1328 | ) 1329 | ) 1330 | (smoothed_state_means, smoothed_state_covariances) = ( 1331 | _smooth( 1332 | transition_matrices, filtered_state_means, 1333 | filtered_state_covariances, predicted_state_means, 1334 | predicted_state_covariances 1335 | )[:2] 1336 | ) 1337 | return (smoothed_state_means, smoothed_state_covariances) 1338 | 1339 | def em(self, X, y=None, n_iter=10, em_vars=None): 1340 | """Apply the EM algorithm 1341 | 1342 | Apply the EM algorithm to estimate all parameters specified by 1343 | `em_vars`. Note that all variables estimated are assumed to be 1344 | constant for all time. See :func:`_em` for details. 1345 | 1346 | Parameters 1347 | ---------- 1348 | X : [n_timesteps, n_dim_obs] array-like 1349 | observations corresponding to times [0...n_timesteps-1]. If `X` is 1350 | a masked array and any of `X[t]`'s components is masked, then 1351 | `X[t]` will be treated as a missing observation. 1352 | n_iter : int, optional 1353 | number of EM iterations to perform 1354 | em_vars : iterable of strings or 'all' 1355 | variables to perform EM over. Any variable not appearing here is 1356 | left untouched. 
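
        Example (sketch): estimate the noise covariances by EM, then smooth::

            kf = kf.em(X, n_iter=10,
                       em_vars=['transition_covariance',
                                'observation_covariance'])
            smoothed_means, smoothed_covariances = kf.smooth(X)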
1357 | """ 1358 | Z = self._parse_observations(X) 1359 | 1360 | # initialize parameters 1361 | (self.transition_matrices, self.transition_offsets, 1362 | self.transition_covariance, self.observation_matrices, 1363 | self.observation_offsets, self.observation_covariance, 1364 | self.initial_state_mean, self.initial_state_covariance) = ( 1365 | self._initialize_parameters() 1366 | ) 1367 | 1368 | # Create dictionary of variables not to perform EM on 1369 | if em_vars is None: 1370 | em_vars = self.em_vars 1371 | 1372 | if em_vars == 'all': 1373 | given = {} 1374 | else: 1375 | given = { 1376 | 'transition_matrices': self.transition_matrices, 1377 | 'observation_matrices': self.observation_matrices, 1378 | 'transition_offsets': self.transition_offsets, 1379 | 'observation_offsets': self.observation_offsets, 1380 | 'transition_covariance': self.transition_covariance, 1381 | 'observation_covariance': self.observation_covariance, 1382 | 'initial_state_mean': self.initial_state_mean, 1383 | 'initial_state_covariance': self.initial_state_covariance 1384 | } 1385 | em_vars = set(em_vars) 1386 | for k in list(given.keys()): 1387 | if k in em_vars: 1388 | given.pop(k) 1389 | 1390 | # If a parameter is time varying, print a warning 1391 | for (k, v) in get_params(self).items(): 1392 | if k in DIM and (not k in given) and len(v.shape) != DIM[k]: 1393 | warn_str = ( 1394 | '{0} has {1} dimensions now; after fitting, ' 1395 | + 'it will have dimension {2}' 1396 | ).format(k, len(v.shape), DIM[k]) 1397 | warnings.warn(warn_str) 1398 | 1399 | # Actual EM iterations 1400 | for i in range(n_iter): 1401 | (predicted_state_means, predicted_state_covariances, 1402 | kalman_gains, filtered_state_means, 1403 | filtered_state_covariances) = ( 1404 | _filter( 1405 | self.transition_matrices, self.observation_matrices, 1406 | self.transition_covariance, self.observation_covariance, 1407 | self.transition_offsets, self.observation_offsets, 1408 | self.initial_state_mean, self.initial_state_covariance, 1409 | Z 1410 | ) 1411 | ) 1412 | (smoothed_state_means, smoothed_state_covariances, 1413 | kalman_smoothing_gains) = ( 1414 | _smooth( 1415 | self.transition_matrices, filtered_state_means, 1416 | filtered_state_covariances, predicted_state_means, 1417 | predicted_state_covariances 1418 | ) 1419 | ) 1420 | sigma_pair_smooth = _smooth_pair( 1421 | smoothed_state_covariances, 1422 | kalman_smoothing_gains 1423 | ) 1424 | (self.transition_matrices, self.observation_matrices, 1425 | self.transition_offsets, self.observation_offsets, 1426 | self.transition_covariance, self.observation_covariance, 1427 | self.initial_state_mean, self.initial_state_covariance) = ( 1428 | _em(Z, self.transition_offsets, self.observation_offsets, 1429 | smoothed_state_means, smoothed_state_covariances, 1430 | sigma_pair_smooth, given=given 1431 | ) 1432 | ) 1433 | return self 1434 | 1435 | def loglikelihood(self, X): 1436 | """Calculate the log likelihood of all observations 1437 | 1438 | Parameters 1439 | ---------- 1440 | X : [n_timesteps, n_dim_obs] array 1441 | observations for time steps [0...n_timesteps-1] 1442 | 1443 | Returns 1444 | ------- 1445 | likelihood : float 1446 | likelihood of all observations 1447 | """ 1448 | Z = self._parse_observations(X) 1449 | 1450 | # initialize parameters 1451 | (transition_matrices, transition_offsets, 1452 | transition_covariance, observation_matrices, 1453 | observation_offsets, observation_covariance, 1454 | initial_state_mean, initial_state_covariance) = ( 1455 | self._initialize_parameters() 1456 
| ) 1457 | 1458 | # apply the Kalman Filter 1459 | (predicted_state_means, predicted_state_covariances, 1460 | kalman_gains, filtered_state_means, 1461 | filtered_state_covariances) = ( 1462 | _filter( 1463 | transition_matrices, observation_matrices, 1464 | transition_covariance, observation_covariance, 1465 | transition_offsets, observation_offsets, 1466 | initial_state_mean, initial_state_covariance, 1467 | Z 1468 | ) 1469 | ) 1470 | 1471 | # get likelihoods for each time step 1472 | loglikelihoods = _loglikelihoods( 1473 | observation_matrices, observation_offsets, observation_covariance, 1474 | predicted_state_means, predicted_state_covariances, Z 1475 | ) 1476 | 1477 | return np.sum(loglikelihoods) 1478 | 1479 | def _initialize_parameters(self): 1480 | """Retrieve parameters if they exist, else replace with defaults""" 1481 | n_dim_state, n_dim_obs = self.n_dim_state, self.n_dim_obs 1482 | 1483 | arguments = get_params(self) 1484 | defaults = { 1485 | 'transition_matrices': np.eye(n_dim_state), 1486 | 'transition_offsets': np.zeros(n_dim_state), 1487 | 'transition_covariance': np.eye(n_dim_state), 1488 | 'observation_matrices': np.eye(n_dim_obs, n_dim_state), 1489 | 'observation_offsets': np.zeros(n_dim_obs), 1490 | 'observation_covariance': np.eye(n_dim_obs), 1491 | 'initial_state_mean': np.zeros(n_dim_state), 1492 | 'initial_state_covariance': np.eye(n_dim_state), 1493 | 'random_state': 0, 1494 | 'em_vars': [ 1495 | 'transition_covariance', 1496 | 'observation_covariance', 1497 | 'initial_state_mean', 1498 | 'initial_state_covariance' 1499 | ], 1500 | } 1501 | converters = { 1502 | 'transition_matrices': array2d, 1503 | 'transition_offsets': array1d, 1504 | 'transition_covariance': array2d, 1505 | 'observation_matrices': array2d, 1506 | 'observation_offsets': array1d, 1507 | 'observation_covariance': array2d, 1508 | 'initial_state_mean': array1d, 1509 | 'initial_state_covariance': array2d, 1510 | 'random_state': check_random_state, 1511 | 'n_dim_state': int, 1512 | 'n_dim_obs': int, 1513 | 'em_vars': lambda x: x, 1514 | } 1515 | 1516 | parameters = preprocess_arguments([arguments, defaults], converters) 1517 | 1518 | return ( 1519 | parameters['transition_matrices'], 1520 | parameters['transition_offsets'], 1521 | parameters['transition_covariance'], 1522 | parameters['observation_matrices'], 1523 | parameters['observation_offsets'], 1524 | parameters['observation_covariance'], 1525 | parameters['initial_state_mean'], 1526 | parameters['initial_state_covariance'] 1527 | ) 1528 | 1529 | def _parse_observations(self, obs): 1530 | """Safely convert observations to their expected format""" 1531 | obs = np.ma.atleast_2d(obs) 1532 | if obs.shape[0] == 1 and obs.shape[1] > 1: 1533 | obs = obs.T 1534 | return obs 1535 | -------------------------------------------------------------------------------- /dsgepy/kalman/utils.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Utility functions taken from scikit-learn 3 | 4 | Correction for masked arrays 5 | https://gist.github.com/JesseLivezey/5d80ef651e75b21a7079 6 | ''' 7 | 8 | import inspect 9 | import itertools 10 | 11 | import numpy as np 12 | from scipy import linalg 13 | 14 | 15 | def array1d(X, dtype=None, order=None): 16 | """Returns at least 1-d array with data from X""" 17 | return np.asarray(np.atleast_1d(X), dtype=dtype, order=order) 18 | 19 | 20 | def array2d(X, dtype=None, order=None): 21 | """Returns at least 2-d array with data from X""" 22 | return np.asarray(np.atleast_2d(X), 
                      dtype=dtype, order=order)
23 | 
24 | 
25 | def log_multivariate_normal_density(X, means, covars, min_covar=1.e-7):
26 |     """Log probability for full covariance matrices. """
27 |     if hasattr(linalg, 'solve_triangular'):
28 |         # only in scipy since 0.9
29 |         solve_triangular = linalg.solve_triangular
30 |     else:
31 |         # slower, but works
32 |         solve_triangular = linalg.solve
33 |     n_samples, n_dim = X.shape
34 |     nmix = len(means)
35 |     log_prob = np.empty((n_samples, nmix))
36 |     for c, (mu, cv) in enumerate(zip(means, covars)):
37 |         try:
38 |             cv_chol = linalg.cholesky(np.asarray(cv), lower=True)
39 |         except linalg.LinAlgError:
40 |             # The model is most probably stuck in a component with too
41 |             # few observations; we need to reinitialize this component
42 | 
43 |             # The original command added a small ridge, but it was not always
44 |             # enough to assure that 'cv' was positive definite:
45 |             # cv_chol = linalg.cholesky(np.asarray(cv + min_covar * np.eye(n_dim)),
46 |             #                           lower=True)
47 | 
48 |             # Instead, find the nearest positive-definite matrix
49 |             near_pd = nearest_pd(np.asarray(cv))
50 |             cv_chol = linalg.cholesky(near_pd,
51 |                                       lower=True)
52 | 
53 |         cv_log_det = 2 * np.sum(np.log(np.diagonal(cv_chol)))
54 |         cv_sol = solve_triangular(cv_chol, np.asarray((X - mu).T), lower=True).T
55 |         log_prob[:, c] = - .5 * (np.sum(cv_sol ** 2, axis=1) + \
56 |                                  n_dim * np.log(2 * np.pi) + cv_log_det)
57 | 
58 |     return log_prob
59 | 
60 | 
61 | def nearest_pd(A):
62 |     """Find the nearest positive-definite matrix to input
63 | 
64 |     A Python/Numpy port of John D'Errico's `nearestSPD` MATLAB code [1], which
65 |     credits [2].
66 | 
67 |     [1] https://www.mathworks.com/matlabcentral/fileexchange/42885-nearestspd
68 | 
69 |     [2] N.J. Higham, "Computing a nearest symmetric positive semidefinite
70 |     matrix" (1988): https://doi.org/10.1016/0024-3795(88)90223-6
71 |     """
72 | 
73 |     B = (A + A.T) / 2
74 |     _, s, V = np.linalg.svd(B)
75 | 
76 |     H = np.dot(V.T, np.dot(np.diag(s), V))
77 | 
78 |     A2 = (B + H) / 2
79 | 
80 |     A3 = (A2 + A2.T) / 2
81 | 
82 |     if ispd(A3):
83 |         return A3
84 | 
85 |     spacing = np.spacing(np.linalg.norm(A))
86 |     # The above is different from [1]. It appears that MATLAB's `chol` Cholesky
87 |     # decomposition will accept matrices with exactly 0-eigenvalue, whereas
88 |     # Numpy's will not. So where [1] uses `eps(mineig)` (where `eps` is Matlab
89 |     # for `np.spacing`), we use the above definition. CAVEAT: our `spacing`
90 |     # will be much larger than [1]'s `eps(mineig)`, since `mineig` is usually on
91 |     # the order of 1e-16, and `eps(1e-16)` is on the order of 1e-34, whereas
92 |     # `spacing` will, for Gaussian random matrices of small dimension, be on
93 |     # the order of 1e-16. In practice, both ways converge, as the unit tests
94 |     # in the original implementation suggest.
95 |     I = np.eye(A.shape[0])
96 |     k = 1
97 |     while not ispd(A3):
98 |         mineig = np.min(np.real(np.linalg.eigvals(A3)))
99 |         A3 += I * (-mineig * k**2 + spacing)
100 |         k += 1
101 | 
102 |     return A3
103 | 
104 | 
105 | def ispd(B):
106 |     """Returns true when input is positive-definite, via Cholesky"""
107 |     try:
108 |         _ = np.linalg.cholesky(B)
109 |         return True
110 |     except np.linalg.LinAlgError:
111 |         return False
112 | 
113 | 
114 | def check_random_state(seed):
115 |     """Turn seed into a np.random.RandomState instance
116 | 
117 |     If seed is None, return the RandomState singleton used by np.random.
118 |     If seed is an int, return a new RandomState instance seeded with seed.
119 |     If seed is already a RandomState instance, return it.
120 |     Otherwise raise ValueError.
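
    Example (sketch)::

        rng = check_random_state(42)   # a fresh RandomState seeded with 42
        draws = rng.standard_normal(3)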
121 |     """
122 |     if seed is None or seed is np.random:
123 |         return np.random.mtrand._rand
124 |     if isinstance(seed, (int, np.integer)):
125 |         return np.random.RandomState(seed)
126 |     if isinstance(seed, np.random.RandomState):
127 |         return seed
128 |     raise ValueError('{0} cannot be used to seed a numpy.random.RandomState'
129 |                      ' instance'.format(seed))
130 | 
131 | 
132 | class Bunch(dict):
133 |     """Container object for datasets: dictionary-like object that exposes its
134 |     keys as attributes."""
135 | 
136 |     def __init__(self, **kwargs):
137 |         dict.__init__(self, kwargs)
138 |         self.__dict__ = self
139 | 
140 | 
141 | def get_params(obj):
142 |     '''Get names and values of all parameters in `obj`'s __init__'''
143 |     try:
144 |         # get names of every variable in the argument
145 |         args = inspect.getfullargspec(obj.__init__)[0]
146 |         args.pop(0)   # remove "self"
147 | 
148 |         # get values for each of the above in the object
149 |         argdict = dict([(arg, obj.__getattribute__(arg)) for arg in args])
150 |         return argdict
151 |     except Exception:
152 |         raise ValueError("object has no __init__ method")
153 | 
154 | 
155 | def preprocess_arguments(argsets, converters):
156 |     """convert and collect arguments in order of priority
157 | 
158 |     Parameters
159 |     ----------
160 |     argsets : [{argname: argval}]
161 |         a list of argument sets, each with lower levels of priority
162 |     converters : {argname: function}
163 |         conversion functions for each argument
164 | 
165 |     Returns
166 |     -------
167 |     result : {argname: argval}
168 |         processed arguments
169 |     """
170 |     result = {}
171 |     for argset in argsets:
172 |         for (argname, argval) in argset.items():
173 |             # check that this argument is necessary
174 |             if argname not in converters:
175 |                 raise ValueError("Unrecognized argument: {0}".format(argname))
176 | 
177 |             # potentially use this argument
178 |             if argname not in result and argval is not None:
179 |                 # convert to right type
180 |                 argval = converters[argname](argval)
181 | 
182 |                 # save
183 |                 result[argname] = argval
184 | 
185 |     # check that all arguments are covered
186 |     if len(converters.keys()) != len(result.keys()):
187 |         missing = set(converters.keys()) - set(result.keys())
188 |         s = "The following arguments are missing: {0}".format(list(missing))
189 |         raise ValueError(s)
190 | 
191 |     return result
192 | 
--------------------------------------------------------------------------------
/dsgepy/lineardsge.py:
--------------------------------------------------------------------------------
1 | """
2 | Author: Gustavo Amarante
3 | Classes and functions for linearized DSGEs.
4 | """
5 | import pandas as pd
6 | from tqdm import tqdm
7 | from scipy.linalg import qz
8 | from math import ceil, floor
9 | import matplotlib.pyplot as plt
10 | from dsgepy.kalman import KalmanFilter
11 | from sympy import simplify, Matrix
12 | from dsgepy.pycsminwel import csminwel
13 | from scipy.optimize import minimize, basinhopping
14 | from numpy.random import multivariate_normal, rand, seed
15 | from scipy.stats import beta, gamma, invgamma, norm, uniform
16 | from numpy.linalg import svd, inv, eig, matrix_power, LinAlgError
17 | from numpy import diagonal, vstack, array, eye, where, diag, sqrt, hstack, zeros, \
18 |     arange, exp, log, inf, nan, isnan, isinf, set_printoptions, matrix, linspace, ndarray
19 | 
20 | 
21 | class DSGE(object):
22 |     """
23 |     This is the main class which holds a DSGE model with all its attributes and methods.
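
    Example (a minimal, purely backward-looking sketch with hypothetical
    symbols; see the scripts in the Example folder of the repository for
    complete models)::

        from sympy import symbols, Matrix
        y, yl, eps, eta = symbols('y yl eps eta')
        model = DSGE(endog=Matrix([y]),
                     endogl=Matrix([yl]),
                     exog=Matrix([eps]),
                     expec=Matrix([eta]),
                     state_equations=Matrix([y - 0.9 * yl - eps]),
                     calib_dict=dict())
        df_obs, df_states = model.simulate(n_obs=100)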
24 | """ 25 | 26 | optim_methods = ['csminwel', 'basinhopping', 'bfgs'] 27 | 28 | resid = None 29 | chains = None 30 | prior_info = None 31 | has_solution = False 32 | posterior_table = None 33 | acceptance_rate = None 34 | 35 | def __init__(self, endog, endogl, exog, expec=None, state_equations=None, obs_equations=None, estimate_params=None, 36 | calib_dict=None, prior_dict=None, obs_data=None, verbose=False, optim_method='csminwel', 37 | obs_names=None): 38 | """ 39 | Model declaration requires passing SymPy symbols as variables and parameters. Some arguments can be left empty 40 | if you are working with simulations of calibrated models. 41 | 42 | @param endog: SymPy matrix of symbols containing the endogenous variables. 43 | @param endogl: SymPy matrix of symbols containing the lagged endogenous variables. 44 | @param exog: SymPy matrix of symbols containing the exogenous shocks. 45 | @param expec: SymPy matrix of symbols containing the expectational errors. 46 | @param state_equations: SymPy matrix of symbolic expressions representing the model's equilibrium conditions, 47 | with zeros on the right-hand side of the equality. 48 | @param obs_equations: SymPy matrix of symbolic expressions representing the model's observation equations, with 49 | observable variables on the left-hand side of the equation. This is only required if the 50 | model is going to be estimated. You do not need to provide observation equations to run 51 | simulations on a calibrated model. 52 | @param estimate_params: SymPy matrix of symbols containing the parameters that are free to be estimated. 53 | @param calib_dict: dict. Keys are the symbols of parameters that are going to be calibrated, and values are 54 | their calibrated value. 55 | @param prior_dict: dict. Entries must have symbols of parameters that are going to be estimated. Values are 56 | dictionaries containing the following entries: 57 | - 'dist': prior distribution. 'normal', 'beta', 'gamma' or 'invgamma'. 58 | - 'mean': mean of the prior distribution. 59 | - 'std': standard deviation of the prior distribution. 60 | - 'label': str with name/representation of the estimated parameter. This argument accepts 61 | LaTeX representations. 62 | @param obs_data: pandas DataFrame with the observable variables. Columns must be in the same order as the 63 | 'obs_equations' declarations. 
64 |         @param verbose: bool. If True, prints progress and diagnostic information.
65 |         """
66 |         # TODO document residuals require that n_obs == n_exog
67 |         # TODO document all of the attributes
68 |         # TODO full review of documentation
69 |         # TODO add functionality to save the chart to a given path
70 | 
71 |         assert optim_method in self.optim_methods, f"optimization method '{optim_method}' not implemented"
72 | 
73 |         self.optim_method = optim_method
74 |         self.verbose = verbose
75 |         self.endog = endog
76 |         self.endogl = endogl
77 |         self.exog = exog
78 |         self.expec = expec
79 |         self.params = estimate_params
80 |         self.state_equations = state_equations
81 |         self.obs_equations = obs_equations
82 |         self.obs_names = obs_names
83 |         self.prior_dict = prior_dict
84 |         self.data = obs_data
85 |         self.n_state = len(endog)
86 |         self.n_exog = len(exog)
87 |         self.n_obs = len(endog) if obs_equations is None else len(obs_equations)
88 |         self.n_param = None if estimate_params is None else len(estimate_params)
89 | 
90 |         if (obs_equations is None) and (obs_data is None):
91 |             generate_obs = True
92 |         else:
93 |             generate_obs = False
94 | 
95 |         self._get_jacobians(generate_obs=generate_obs)
96 | 
97 |         if estimate_params is None:
98 |             # If no parameters are going to be estimated, calibrate the whole model
99 |             self.Gamma0, self.Gamma1, self.Psi, self.Pi, self.C_in, self.obs_matrix, self.obs_offset = \
100 |                 self._eval_matrix(calib_dict, to_array=True)
101 | 
102 |             self.G1, self.C_out, self.impact, self.fmat, self.fwt, self.ywt, self.gev, self.eu, self.loose = \
103 |                 gensys(self.Gamma0, self.Gamma1, self.C_in, self.Psi, self.Pi)
104 | 
105 |             self.has_solution = True
106 |         else:
107 |             # Otherwise, calibrate only the required parameters, if calibration is requested
108 |             if calib_dict is not None:
109 |                 self.Gamma0, self.Gamma1, self.Psi, self.Pi, self.C_in, self.obs_matrix, self.obs_offset = \
110 |                     self._eval_matrix(calib_dict, to_array=False)
111 | 
112 |             self.prior_info = self._get_prior_info()
113 | 
114 |     def simulate(self, n_obs=100, random_seed=None):
115 |         """
116 |         Given a calibrated or estimated model, simulates values of the endogenous variables based on random samples of
117 |         the exogenous shocks.
118 |         @param n_obs: number of observations in the time dimension.
119 |         @param random_seed: random seed for the simulation.
120 |         @return: two pandas DataFrames. 'df_obs' contains the simulations for the observable variables and 'df_state'
121 |                  contains the simulations for the state/endogenous variables.
122 |         """
123 | 
124 |         assert self.has_solution, "No solution was generated yet"
125 | 
126 |         if random_seed is not None:
127 |             seed(random_seed)
128 | 
129 |         kf = KalmanFilter(self.G1, self.obs_matrix, self.impact @ self.impact.T, None,
130 |                           self.C_out.reshape(self.n_state), self.obs_offset.reshape(self.n_obs))
131 |         simul_data = kf.sample(n_obs)
132 | 
133 |         state_names = [str(s) for s in list(self.endog)]
134 |         if self.obs_names is None:
135 |             obs_names = [f'obs {i+1}' for i in range(self.obs_matrix.shape[0])]
136 |         else:
137 |             obs_names = self.obs_names
138 | 
139 |         df_obs = pd.DataFrame(data=simul_data[1], columns=obs_names)
140 |         df_states = pd.DataFrame(data=simul_data[0], columns=state_names)
141 | 
142 |         return df_obs, df_states
143 | 
144 |     def estimate(self, file_path, nsim=1000, ck=0.2, save_interval=100):
145 |         """
146 |         Run the MCMC estimation.
147 |         @param file_path: str. Save path where the MCMC chains are saved. The file format is HDF5 (.h5). This file
148 |                           format gets very heavy but has very fast read/write speed.
If the file already exists, the
149 |                           estimation will resume from these previously simulated chains.
150 |         @param nsim: Length of the MCMC chains to be generated. If the chains are already stable, this is the number of
151 |                      draws from the posterior distribution.
152 |         @param ck: float. Scaling factor of the hessian matrix of the mode of the posterior distribution, which is used
153 |                    as the covariance matrix for the MCMC algorithm. Bayesian literature says this value needs to be
154 |                    calibrated in order to achieve your desired acceptance rate from the posterior draws.
155 |         @param save_interval: int. Save the MCMC chains every `save_interval` iterations. This is useful in case
156 |                               something goes wrong and interrupts the program, so you can restart the MCMC sampling
157 |                               where it last saved.
158 |         @return: the 'chains' attribute of this DSGE instance is generated.
159 |         """
160 | 
161 |         try:
162 |             df_chains = pd.read_hdf(file_path, key='chains')
163 |             sigmak = pd.read_hdf(file_path, key='sigmak')
164 |             start = df_chains.index[-1]
165 | 
166 |         except FileNotFoundError:
167 |             def obj_func(theta_irr):
168 |                 theta_irr = {k: v for k, v in zip(self.params, theta_irr)}
169 |                 theta_res = self._irr2res(theta_irr)
170 |                 return -1 * self._calc_posterior(theta_res)
171 | 
172 |             theta_res0 = {k: v for k, v in zip(self.params, self.prior_info['mean'].values)}
173 |             theta_irr0 = self._res2irr(theta_res0)
174 |             theta_irr0 = array(list(theta_irr0.values()))
175 | 
176 |             # Optimization to find the posterior mode
177 |             if self.optim_method == 'csminwel':
178 |                 _, theta_mode_irr, _, h, _, _, retcodeh = csminwel(fcn=obj_func, x0=theta_irr0, crit=1e-14, nit=100,
179 |                                                                    verbose=True, h0=1*eye(len(theta_irr0)))
180 |                 theta_mode_irr = {k: v for k, v in zip(self.params, theta_mode_irr)}
181 |                 theta_mode_res = self._irr2res(theta_mode_irr)
182 |                 sigmak = ck * h
183 | 
184 |             elif self.optim_method == 'bfgs':
185 |                 res = minimize(obj_func, theta_irr0, options={'disp': True}, method='BFGS')
186 |                 theta_mode_irr = {k: v for k, v in zip(self.params, res.x)}
187 |                 theta_mode_res = self._irr2res(theta_mode_irr)
188 |                 sigmak = ck * res.hess_inv
189 | 
190 |             elif self.optim_method == 'basinhopping':
191 |                 res = basinhopping(obj_func, theta_irr0)
192 |                 theta_mode_irr = {k: v for k, v in zip(self.params, res.x)}
193 |                 theta_mode_res = self._irr2res(theta_mode_irr)
194 |                 sigmak = ck * res.hess_inv
195 | 
196 |             else:
197 |                 msg = f"optimization method '{self.optim_method}' not implemented"
198 |                 raise NotImplementedError(msg)
199 | 
200 |             if self.verbose:
201 |                 print('===== Posterior Mode =====')
202 |                 print(theta_mode_res, '\n')
203 |                 print('===== MH jump covariance =====')
204 |                 print(sigmak, '\n')
205 |                 print('===== Eigenvalues of MH jump covariance =====')
206 |                 print(eig(sigmak)[0], '\n')
207 | 
208 |             df_chains = pd.DataFrame(columns=[str(p) for p in list(self.params)], index=range(nsim))
209 |             df_chains.loc[0] = list(theta_mode_res.values())
210 |             start = 0
211 | 
212 |         # Metropolis-Hastings
213 |         muk = zeros(self.n_param)
214 |         accepted = 0
215 | 
216 |         for ii in tqdm(range(start + 1, start+nsim), 'Metropolis-Hastings'):
217 |             theta1 = {k: v for k, v in zip(self.params, df_chains.loc[ii - 1].values)}
218 |             pos1 = self._calc_posterior(theta1)
219 |             omega1 = self._res2irr(theta1)
220 |             omega2 = array(list(omega1.values())) + multivariate_normal(muk, sigmak)
221 |             omega2 = {k: v for k, v in zip(self.params, omega2)}
222 |             theta2 = self._irr2res(omega2)
223 |             pos2 = self._calc_posterior(theta2)
224 | 
225 |             ratio = exp(pos2 - pos1)
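            # pos1 and pos2 are log posteriors, so 'ratio' is the posterior
            # odds of the proposed draw relative to the current one. Accepting
            # whenever 'ratio' exceeds a Uniform(0, 1) draw implements the
            # Metropolis acceptance rule min(1, ratio).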
226 | 
227 |             if ratio > rand(1)[0]:
228 |                 accepted += 1
229 |                 df_chains.loc[ii] = list(theta2.values())
230 |             else:
231 |                 df_chains.loc[ii] = df_chains.loc[ii - 1]
232 | 
233 |             # Save the partial chains
234 |             if ii % save_interval == 0:
235 |                 store = pd.HDFStore(file_path)
236 |                 store['chains'] = df_chains
237 |                 store['sigmak'] = pd.DataFrame(data=sigmak)
238 |                 store.close()
239 | 
240 |         store = pd.HDFStore(file_path)
241 |         store['chains'] = df_chains
242 |         store['sigmak'] = pd.DataFrame(data=sigmak)
243 |         store.close()
244 | 
245 |         self.chains = df_chains.astype(float)
246 |         self.acceptance_rate = accepted / nsim
247 | 
248 |         if self.verbose:
249 |             print('Acceptance rate:', 100 * (accepted / nsim), 'percent')
250 | 
251 |     def eval_chains(self, burnin=0.3, conf=0.90, load_chain=None, show_charts=False):
252 |         """
253 |         This function evaluates the chains generated in the estimation step and calibrates the model with the posterior
254 |         mode. It also has functionalities to plot the generated chains and posterior distributions.
255 |         @param burnin: int or float. Number of observations at the beginning of the chain that are going to be dropped
256 |                        when computing posterior statistics.
257 |         @param conf: Level of credibility for the credibility intervals inferred from the posteriors.
258 |         @param load_chain: str. Save path of the HDF5 file with the chains. Only required if the chains were not loaded
259 |                            in the estimation step.
260 |         @param show_charts: bool. If True, the prior-posterior chart is shown. Red lines are the theoretical prior
261 |                             densities, blue bars are the empirical posterior densities.
262 |         @return: the 'posterior_table' attribute of this DSGE instance is generated.
263 |         """
264 | 
265 |         if load_chain is not None:
266 |             try:
267 |                 self.chains = pd.read_hdf(load_chain, key='chains').astype(float)
268 |             except FileNotFoundError:
269 |                 raise FileNotFoundError('Chain file not found')
270 | 
271 |         assert self.chains is not None, 'There are no loaded chains'
272 |         chain_size = self.chains.shape[0]
273 | 
274 |         if type(burnin) is float and 0 <= burnin < 1:
275 |             df_chains = self.chains.iloc[int(chain_size * burnin):]
276 |         elif type(burnin) is int and burnin < chain_size:
277 |             df_chains = self.chains.iloc[burnin + 1:]
278 |         else:
279 |             raise ValueError("'burnin' must be either an int smaller than the chain size or a float between 0 and 1")
280 | 
281 |         if show_charts:
282 |             self._plot_chains(chains=df_chains, show_charts=show_charts)
283 |             self._plot_prior_posterior(chains=df_chains, show_charts=show_charts)
284 | 
285 |         self.posterior_table = self._posterior_table(chains=df_chains, conf=conf)
286 | 
287 |         # Calibrate model with the posterior mode
288 |         calib_dict = self.posterior_table['posterior mode'].to_dict()
289 | 
290 |         self.Gamma0, self.Gamma1, self.Psi, self.Pi, self.C_in, self.obs_matrix, self.obs_offset = \
291 |             self._eval_matrix(calib_dict, to_array=True)
292 | 
293 |         self.G1, self.C_out, self.impact, self.fmat, self.fwt, self.ywt, self.gev, self.eu, self.loose = \
294 |             gensys(self.Gamma0, self.Gamma1, self.C_in, self.Psi, self.Pi)
295 | 
296 |         self.has_solution = True
297 | 
298 |         if self.n_obs == self.n_exog:
299 |             self.resid = self._get_residuals()
300 |         else:
301 |             self.resid = None
302 | 
303 |     def irf(self, periods=12, show_charts=False):
304 |         """
305 |         Generates and plots impulse-response functions (IRFs).
306 |         @param periods: int. Number of periods in the IRFs.
307 |         @param show_charts: bool. If True, plots the IRFs for each shock.
308 |         @return: a MultiIndex pandas.DataFrame with (shock, period) as the index and variables as the columns.
309 |         """
310 |         # TODO IRF standard errors
311 | 
312 |         assert self.has_solution, 'Model does not have a solution yet. Cannot compute IRFs'
313 | 
314 |         # number of periods ahead plus the contemporaneous effect
315 |         periods = periods + 1
316 | 
317 |         # Initialize with zeros
318 |         irf_values = zeros((periods, self.n_state, self.n_exog))
319 |         obs_irf_values = zeros((periods, self.n_obs, self.n_exog))
320 | 
321 |         # Compute the IRFs
322 |         partial = self.impact
323 |         for tt in range(periods):
324 |             irf_values[tt, :, :] = partial
325 |             obs_irf_values[tt, :, :] = self.obs_offset + self.obs_matrix @ partial
326 |             partial = self.G1 @ partial
327 | 
328 |         # organize state variables in a MultiIndex DataFrame
329 |         col_names = [str(var) for var in self.endog]
330 |         idx_names = [str(var) for var in self.exog]
331 |         mindex = pd.MultiIndex.from_product([idx_names, list(range(periods))], names=['exog', 'periods'])
332 |         df_irf = pd.DataFrame(columns=col_names, index=mindex)
333 | 
334 |         for count, var in enumerate(idx_names):
335 |             df_irf.loc[var] = irf_values[:, :, count]
336 | 
337 |         # organize observed variables in a MultiIndex DataFrame
338 |         if self.obs_names is None:
339 |             col_names_obs = ['obs ' + str(var + 1).zfill(2) for var in range(self.n_obs)]
340 |         else:
341 |             col_names_obs = self.obs_names
342 | 
343 |         df_irf_obs = pd.DataFrame(columns=col_names_obs, index=mindex)
344 | 
345 |         for count, var in enumerate(idx_names):
346 |             df_irf_obs.loc[var] = obs_irf_values[:, :, count]
347 | 
348 |         # Charts
349 |         if show_charts:
350 |             if self.obs_equations is None:  # TODO I need a cleaner way to do this
351 |                 shocks = df_irf.index.get_level_values(0).unique()
352 | 
353 |                 n_subplots = df_irf.shape[1]
354 |                 n_rows = floor(sqrt(n_subplots))
355 |                 n_cols = ceil(n_subplots / n_rows)
356 |                 subplot_shape = (n_rows, n_cols)
357 | 
358 |                 for ss in shocks:
359 |                     plot_irfs = pd.concat([df_irf.loc[ss]], axis=1)
360 |                     ax = plot_irfs.plot(title=f'Impulse Response Functions: Shock {ss}',
361 |                                         subplots=True,
362 |                                         layout=subplot_shape,
363 |                                         legend=None,
364 |                                         grid=True,
365 |                                         figsize=(12, 7),
366 |                                         sharey=False)
367 | 
368 |                     zero_irf = list(plot_irfs.columns[(plot_irfs.abs() < 1e-10).all()])
369 | 
370 |                     for axis, subptitle in zip(ax.ravel(), list(plot_irfs.columns)):
371 |                         axis.set_title(subptitle)
372 |                         if subptitle in zero_irf:
373 |                             axis.set_ylim((-1, 1))
374 | 
375 |                     plt.tight_layout()
376 |                     plt.show()
377 |             else:
378 |                 shocks = df_irf.index.get_level_values(0).unique()
379 | 
380 |                 n_subplots = df_irf.shape[1] + df_irf_obs.shape[1]
381 |                 n_rows = floor(sqrt(n_subplots))
382 |                 n_cols = ceil(n_subplots / n_rows)
383 |                 subplot_shape = (n_rows, n_cols)
384 | 
385 |                 for ss in shocks:
386 |                     plot_irfs = pd.concat([df_irf.loc[ss], df_irf_obs.loc[ss]], axis=1)
387 |                     ax = plot_irfs.plot(title=f'Impulse Response Functions: Shock {ss}',
388 |                                         subplots=True,
389 |                                         layout=subplot_shape,
390 |                                         legend=None,
391 |                                         grid=True,
392 |                                         figsize=(12, 7),
393 |                                         sharey=False)
394 | 
395 |                     zero_irf = list(plot_irfs.columns[(plot_irfs.abs() < 1e-10).all()])
396 | 
397 |                     for axis, subptitle in zip(ax.ravel(), list(plot_irfs.columns)):
398 |                         axis.set_title(subptitle)
399 |                         if subptitle in zero_irf:
400 |                             axis.set_ylim((-1, 1))
401 | 
402 |                     plt.tight_layout()
403 |                     plt.show()
404 | 
405 |         # Concatenate state and observed variables
406 |         df_irf = pd.concat([df_irf, df_irf_obs], axis=1)
407 | 
408 |         return df_irf
409 | 
410 |     def states(self, 
smoothed=True):
411 | 
412 |         kf = KalmanFilter(self.G1, self.obs_matrix, self.impact @ self.impact.T, None,
413 |                           self.C_out.reshape(self.n_state), self.obs_offset.reshape(self.n_obs))
414 | 
415 |         if smoothed:
416 |             states, states_cov = kf.smooth(self.data)
417 |         else:
418 |             states, states_cov = kf.filter(self.data)
419 | 
420 |         # Organize in a DataFrame
421 |         states = pd.DataFrame(data=states,
422 |                               index=self.data.index,
423 |                               columns=[str(var) for var in self.endog])
424 | 
425 |         states_std = pd.DataFrame(index=self.data.index,
426 |                                   columns=[str(var) for var in self.endog])
427 | 
428 |         for ii in range(states_cov.shape[0]):
429 |             states_std.iloc[ii] = diagonal(states_cov[ii]) ** 0.5
430 | 
431 |         return states, states_std
432 | 
433 |     def hist_decomp(self, smoothed=True, show_charts=False):
434 |         # TODO documentation
435 |         states, _ = self.states(smoothed=smoothed)
436 |         x0 = states.iloc[0].values.reshape((-1, 1))
437 |         a = self.obs_matrix
438 |         p0 = self.C_out
439 |         p1 = self.G1
440 |         b = self.impact
441 |         eps = self.resid
442 | 
443 |         # organize state variables in a MultiIndex DataFrame
444 |         if self.obs_names is None:
445 |             idx_names = ['obs ' + str(var + 1).zfill(2) for var in range(self.n_obs)]
446 |         else:
447 |             idx_names = self.obs_names
448 | 
449 |         col_names = [str(var) for var in self.exog]
450 |         mindex = pd.MultiIndex.from_product([idx_names, eps.index], names=['obs', 'date'])
451 |         df_hd = pd.DataFrame(columns=col_names, index=mindex)
452 | 
453 |         # Compute historical decomposition
454 |         for count, tt in tqdm(enumerate(eps.index), 'Computing historical decomposition', disable=not self.verbose):
455 | 
456 |             matrix_sum = zeros((self.n_state, self.n_exog))
457 | 
458 |             for k in range(count + 1):
459 |                 matrix_sum += eps.iloc[k].values * (matrix_power(p1, count - k) @ b)
460 | 
461 |             df_hd[df_hd.index.get_level_values(1) == tt] = a @ matrix_sum
462 | 
463 |         # compute the contributions of steady states and initial conditions
464 |         extra_contributions = []
465 |         for var in self.data.columns:
466 |             extra = self.data[var] - df_hd.loc[var].sum(axis=1)
467 |             extra = extra.rename('Steady State and Initial Conditions')
468 |             extra.index = pd.MultiIndex.from_tuples([(var, i) for i in extra.index])
469 |             extra_contributions.append(extra)
470 | 
471 |         # Organize the data
472 |         df_extras = pd.concat(extra_contributions, axis=0)
473 |         df_hd = pd.concat([df_extras, df_hd], axis=1)
474 | 
475 |         if show_charts:
476 |             self._plot_historical_decomposition(df_hd, show_charts=show_charts)
477 | 
478 |         return df_hd
479 | 
480 |     def _get_residuals(self):
481 | 
482 |         c = self.obs_offset
483 |         a = self.obs_matrix
484 |         p0 = self.C_out
485 |         p1 = self.G1
486 |         b = self.impact
487 |         ss = inv(eye(self.n_state) - p1) @ p0
488 | 
489 |         df_resid = pd.DataFrame(columns=[str(var) for var in self.exog],
490 |                                 index=self.data.index)
491 | 
492 |         # Compute residuals
493 |         accum_shocks = zeros((self.n_state, 1))
494 |         for tt in self.data.index:
495 |             y = self.data.loc[tt].values.reshape((self.n_obs, -1))
496 |             eps = inv(a @ b) @ ((y - c - a @ ss) - a @ accum_shocks)
497 |             df_resid.loc[tt] = eps.flatten()
498 |             accum_shocks = p1 @ (accum_shocks + b @ eps)
499 | 
500 |         df_resid = df_resid.astype(float)
501 |         return df_resid
502 | 
503 |     def _get_jacobians(self, generate_obs):
504 |         # State Equations
505 |         self.Gamma0 = self.state_equations.jacobian(self.endog)
506 |         self.Gamma1 = -self.state_equations.jacobian(self.endogl)
507 |         self.Psi = -self.state_equations.jacobian(self.exog)
508 |         self.Pi = -self.state_equations.jacobian(self.expec)
509 
        self.C_in = simplify(self.state_equations
                             - self.Gamma0 @ self.endog
                             + self.Gamma1 @ self.endogl
                             + self.Psi @ self.exog
                             + self.Pi @ self.expec)

        # Obs Equation
        if generate_obs:
            self.obs_matrix = Matrix(eye(self.n_obs))
            self.obs_offset = Matrix(zeros(self.n_obs))
        else:
            self.obs_matrix = self.obs_equations.jacobian(self.endog)
            self.obs_offset = self.obs_equations - self.obs_matrix @ self.endog

    def _calc_posterior(self, theta):
        P = self._calc_prior(theta)
        L = self._log_likelihood(theta)
        f = P + L
        return f * 1000  # the factor of 1000 is here to increase the precision of the posterior mode-finding algorithm

    def _calc_prior(self, theta):
        prior_dict = self.prior_dict
        df_prior = self.prior_info.copy()
        df_prior['pdf'] = nan

        for param in prior_dict.keys():
            mu = df_prior.loc[str(param)]['mean']
            sigma = df_prior.loc[str(param)]['std']
            dist = df_prior.loc[str(param)]['distribution'].lower()
            theta_i = theta[param]

            # Since we are going to take logs, the density only needs the terms that depend on
            # theta_i. This speeds up the code a little and does not affect the optimization output.
            if dist == 'beta':
                a = ((mu ** 2) * (1 - mu)) / (sigma ** 2) - mu
                b = a * mu / (1 - mu)
                pdf_i = (theta_i ** (a - 1)) * ((1 - theta_i) ** (b - 1))

            elif dist == 'gamma':
                a = (mu / sigma) ** 2
                b = mu / a
                pdf_i = theta_i ** (a - 1) * exp(-theta_i / b)

            elif dist == 'invgamma':
                a = (mu / sigma) ** 2 + 2
                b = mu * (a - 1)
                pdf_i = (theta_i ** (- a - 1)) * exp(-b / theta_i)

            elif dist == 'uniform':
                a = mu - sqrt(3) * sigma
                b = 2 * mu - a
                pdf_i = 1 / (b - a)

            else:  # Normal
                pdf_i = exp(-((theta_i - mu) ** 2) / (2 * (sigma ** 2)))

            df_prior.loc[str(param), 'pdf'] = pdf_i

        df_prior['log pdf'] = log(df_prior['pdf'].astype(float))

        P = df_prior['log pdf'].sum()

        return P

    def _log_likelihood(self, theta):
        Gamma0, Gamma1, Psi, Pi, C_in, obs_matrix, obs_offset = self._eval_matrix(theta, to_array=True)

        for mat in [Gamma0, Gamma1, Psi, Pi, C_in]:
            if isnan(mat).any() or isinf(mat).any():
                return -inf

        G1, C_out, impact, fmat, fwt, ywt, gev, eu, loose = gensys(Gamma0, Gamma1, C_in, Psi, Pi)

        # eu = [existence, uniqueness]: both must equal 1 for a unique stable solution
        if eu[0] == 1 and eu[1] == 1:
            kf = KalmanFilter(G1, obs_matrix, impact @ impact.T, None, C_out.reshape(self.n_state),
                              obs_offset.reshape(self.n_obs))
            try:
                L = kf.loglikelihood(self.data)
            except (LinAlgError, ValueError):
                L = -inf
        else:
            L = -inf

        return L
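
    # _eval_matrix substitutes a candidate parameter dict 'theta' into the symbolic
    # model matrices; with to_array=True the result is cast to float ndarrays for
    # gensys and the Kalman filter, otherwise sympy matrices are returned.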
    def _eval_matrix(self, theta, to_array):

        if to_array:
            # state matrices
            Gamma0 = array(self.Gamma0.subs(theta)).astype(float)
            Gamma1 = array(self.Gamma1.subs(theta)).astype(float)
            Psi = array(self.Psi.subs(theta)).astype(float)
            Pi = array(self.Pi.subs(theta)).astype(float)
            C_in = array(self.C_in.subs(theta)).astype(float)

            # observation matrices
            obs_matrix = array(self.obs_matrix.subs(theta)).astype(float)
            obs_offset = array(self.obs_offset.subs(theta)).astype(float)
        else:
            # state matrices
            Gamma0 = self.Gamma0.subs(theta)
            Gamma1 = self.Gamma1.subs(theta)
            Psi = self.Psi.subs(theta)
            Pi = self.Pi.subs(theta)
            C_in = self.C_in.subs(theta)

            # observation matrices
            obs_matrix = self.obs_matrix.subs(theta)
            obs_offset = self.obs_offset.subs(theta)

        return Gamma0, Gamma1, Psi, Pi, C_in, obs_matrix, obs_offset

    def _get_prior_info(self):
        prior_info = self.prior_dict

        param_names = [str(s) for s in list(self.params)]

        df_prior = pd.DataFrame(columns=['distribution', 'mean', 'std', 'param a', 'param b'],
                                index=param_names)

        for param in prior_info.keys():
            mu = prior_info[param]['mean']
            sigma = prior_info[param]['std']
            dist = prior_info[param]['dist'].lower()

            if dist == 'beta':
                a = ((mu ** 2) * (1 - mu)) / (sigma ** 2) - mu
                b = a * mu / (1 - mu)

            elif dist == 'gamma':
                a = (mu / sigma) ** 2
                b = mu / a

            elif dist == 'invgamma':
                a = (mu / sigma) ** 2 + 2
                b = mu * (a - 1)

            elif dist == 'uniform':
                a = mu - sqrt(3) * sigma
                b = 2 * mu - a

            else:  # Normal
                a = mu
                b = sigma

            df_prior.loc[str(param)] = [dist, mu, sigma, a, b]

        return df_prior

    def _res2irr(self, theta_res):
        """
        Converts a parameter vector from the restricted (model) space to the
        unrestricted space used by the optimizer.
        :param theta_res: dict of parameter values in the restricted space
        :return: dict of parameter values in the unrestricted space
        """
        prior_info = self.prior_info
        theta_irr = theta_res.copy()

        for param in theta_res.keys():
            a = prior_info.loc[str(param)]['param a']
            b = prior_info.loc[str(param)]['param b']
            dist = prior_info.loc[str(param)]['distribution'].lower()
            theta_i = theta_res[param]

            if dist == 'beta':
                theta_irr[param] = log(theta_i / (1 - theta_i))

            elif dist == 'gamma' or dist == 'invgamma':
                theta_irr[param] = log(theta_i)

            elif dist == 'uniform':
                theta_irr[param] = log((theta_i - a) / (b - theta_i))

            else:  # Normal
                theta_irr[param] = theta_i

        return theta_irr

    def _irr2res(self, theta_irr):
        """
        Converts a parameter vector from the unrestricted space used by the
        optimizer back to the restricted (model) space.
        :param theta_irr: dict of parameter values in the unrestricted space
        :return: dict of parameter values in the restricted space
        """
        prior_info = self.prior_info
        theta_res = theta_irr.copy()

        for param in theta_irr.keys():
            a = prior_info.loc[str(param)]['param a']
            b = prior_info.loc[str(param)]['param b']
            dist = prior_info.loc[str(param)]['distribution'].lower()
            lambda_i = theta_irr[param]

            if dist == 'beta':
                theta_res[param] = exp(lambda_i) / (1 + exp(lambda_i))

            elif dist == 'gamma' or dist == 'invgamma':
                theta_res[param] = exp(lambda_i)

            elif dist == 'uniform':
                theta_res[param] = (a + b * exp(lambda_i)) / (1 + exp(lambda_i))

            else:  # Normal
                theta_res[param] = lambda_i

        return theta_res

    def _plot_chains(self, chains, show_charts):
        n_cols = int(self.n_param ** 0.5)
        n_rows = (self.n_param + n_cols - 1) // n_cols  # ceiling division, so every parameter gets a subplot
        subplot_shape = (n_rows, n_cols)

        plt.figure(figsize=(7 * 1.61, 7))

        for count, param in enumerate(list(self.params)):
            ax = plt.subplot2grid(subplot_shape, (count // n_cols, count % n_cols))
            ax.plot(chains[str(param)], linewidth=0.5, color='darkblue')
            ax.set_title(self.prior_dict[param]['label'])

        plt.tight_layout()

        if show_charts:
            plt.show()

    def _plot_prior_posterior(self, chains, show_charts):
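        # Overlay each parameter's posterior histogram with its prior density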
        n_bins = int(sqrt(chains.shape[0]))  # TODO use the same logic as the IRF charts
        n_cols = int(self.n_param ** 0.5)
        n_rows = (self.n_param + n_cols - 1) // n_cols  # ceiling division, so every parameter gets a subplot
        subplot_shape = (n_rows, n_cols)

        plt.figure(figsize=(7 * 1.61, 7))

        for count, param in enumerate(list(self.params)):
            mu = self.prior_info.loc[str(param)]['mean']
            sigma = self.prior_info.loc[str(param)]['std']
            a = self.prior_info.loc[str(param)]['param a']
            b = self.prior_info.loc[str(param)]['param b']
            dist = self.prior_info.loc[str(param)]['distribution']

            ax = plt.subplot2grid(subplot_shape, (count // n_cols, count % n_cols))
            ax.hist(chains[str(param)], bins=n_bins, density=True,
                    color='royalblue', edgecolor='black')
            ax.set_title(self.prior_dict[param]['label'])
            x_min, x_max = ax.get_xlim()
            x = linspace(x_min, x_max, n_bins)

            if dist == 'beta':
                y = beta.pdf(x, a, b)
                ax.plot(x, y, color='red')

            elif dist == 'gamma':
                y = gamma.pdf(x, a, scale=b)
                ax.plot(x, y, color='red')

            elif dist == 'invgamma':
                y = (b ** a) * invgamma.pdf(x, a) * exp((1 - b) / x)
                ax.plot(x, y, color='red')

            elif dist == 'uniform':
                y = uniform.pdf(x, loc=a, scale=b - a)
                ax.plot(x, y, color='red')

            else:  # Normal
                y = norm.pdf(x, loc=mu, scale=sigma)
                ax.plot(x, y, color='red')

        plt.tight_layout()

        if show_charts:
            plt.show()

    def _posterior_table(self, chains, conf):

        low_conf = (1 - conf) / 2
        high_conf = (1 + conf) / 2

        df = self.prior_info[['distribution', 'mean', 'std']]
        df = df.rename({'distribution': 'prior dist', 'mean': 'prior mean', 'std': 'prior std'}, axis=1)
        df['posterior mode'] = chains.mode().mean()
        df['posterior mean'] = chains.mean()
        df[f'posterior {100 * low_conf:g}%'] = chains.quantile(low_conf)
        df[f'posterior {100 * high_conf:g}%'] = chains.quantile(high_conf)

        return df

    def _plot_historical_decomposition(self, df_hd, show_charts=False):

        for var in df_hd.index.get_level_values(0).unique():

            fig = plt.figure(figsize=(7 * 1.61, 7))
            ax = fig.gca()

            ax = df_hd.loc[var].plot(kind='bar', stacked=True, width=1, ax=ax, legend='best',
                                     title=var)
            ax.plot(self.data[var], color='black')

            ax.grid(axis='y')
            y_min, y_max = ax.get_ylim()

            if y_min <= 0 <= y_max:
                ax.axhline(y=0, color='black', linewidth=0.5)

            plt.tight_layout()

            if show_charts:
                plt.show()


def gensys(g0, g1, c, psi, pi, div=None, realsmall=0.000001):
    """
    This code is a translation from MATLAB to Python of Christopher Sims' 'gensys'.
    https://dge.repec.org/codes/sims/linre3a/
    """

    # Assertions
    all_inputs = [g0, g1, c, psi, pi]
    assert_cond = all([isinstance(inpt, ndarray) for inpt in all_inputs])
    assert assert_cond, "Inputs 'g0', 'g1', 'c', 'psi' and 'pi' must all be numpy.ndarray"

    unique = False
    eu = [0, 0]
    nunstab = 0
    zxz = False

    n = g1.shape[0]

    if div is None:
        div = 1.01
        fixdiv = False
    else:
        fixdiv = True

    a, b, q, z = qz(g0, g1, 'complex')

    # the numpy.matrix class may be deprecated in a future version of numpy. It should be changed to the ndarray.
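    # The QZ (generalized Schur) decomposition factors the pencil (g0, g1) into
    # upper-triangular a and b with unitary q and z; the ratios |b_ii / a_ii| are
    # the generalized eigenvalues used below to count and sort the unstable roots.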
    # SciPy's version of 'qz' is different from MATLAB's: Q needs to be Hermitian-transposed to get the same output
    q = array(matrix(q).H)

    for i in range(n):
        if not fixdiv:
            if abs(a[i, i]) > 0:

                divhat = abs(b[i, i] / a[i, i])

                if (1 + realsmall < divhat) and divhat <= div:
                    div = 0.5 * (1 + divhat)

        nunstab = nunstab + (abs(b[i, i]) > div * abs(a[i, i]))

        if (abs(a[i, i]) < realsmall) and (abs(b[i, i]) < realsmall):
            zxz = True

    if not zxz:
        a, b, q, z, _ = qzdiv(div, a, b, q, z)

    gev = vstack([diagonal(a), diagonal(b)]).T

    if zxz:
        # print('Coincident zeros. Indeterminacy and/or nonexistence')
        eu = [-2, -2]
        return None, None, None, None, None, None, None, eu, None

    q1 = q[:n - nunstab, :]
    q2 = q[n - nunstab:, :]

    # This was in the author's original code, but seems unused
    # z1 = z[:, :n - nunstab].T
    # z2 = z[:, n - nunstab:]
    # a2 = a[n - nunstab:, n - nunstab:]
    # b2 = a[n - nunstab:, n - nunstab:]

    etawt = q2 @ pi
    neta = pi.shape[1]

    # Case with no unstable roots
    if nunstab == 0:
        bigev = 0
        # etawt = zeros((0, neta))
        ueta = zeros((0, 0))
        deta = zeros((0, 0))
        veta = zeros((neta, 0))
    else:
        ueta, deta, veta = svd(etawt)
        deta = diag(deta)
        veta = array(matrix(veta).H)
        md = min(deta.shape)
        bigev = where(diagonal(deta[:md, :md]) > realsmall)[0]
        ueta = ueta[:, bigev]
        veta = veta[:, bigev]
        deta = deta[bigev, bigev]
        if deta.ndim == 1:
            deta = diag(deta)

    try:
        if len(bigev) >= nunstab:
            eu[0] = 1
    except TypeError:
        if bigev >= nunstab:
            eu[0] = 1

    # Case where all the roots are unstable
    if nunstab == n:
        etawt1 = zeros((0, neta))
        ueta1 = zeros((0, 0))
        deta1 = zeros((0, 0))
        veta1 = zeros((neta, 0))
        # bigev = 0
    else:
        etawt1 = q1 @ pi
        # ndeta1 = min(n - nunstab, neta)
        ueta1, deta1, veta1 = svd(etawt1)
        deta1 = diag(deta1)
        veta1 = array(matrix(veta1).H)
        md = min(deta1.shape)
        bigev = where(diagonal(deta1[:md, :md]) > realsmall)[0]
        ueta1 = ueta1[:, bigev]
        veta1 = veta1[:, bigev]
        deta1 = deta1[bigev, bigev]
        if deta1.ndim == 1:
            deta1 = diag(deta1)

    if 0 in veta1.shape:
        unique = True
    else:
        loose = veta1 - veta @ veta.T @ veta1
        ul, dl, vl = svd(loose)
        if dl.ndim == 1:
            dl = diag(dl)
        nloose = sum(abs(diagonal(dl))) > realsmall * n
        if not nloose:
            unique = True

    if unique:
        eu[1] = 1
    else:
        # print('Indeterminacy. Loose endogenous variables.')
        pass
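
    # tmat combines the rows of the stable block with the term that substitutes the
    # expectational errors out of the system; G0 and G1 then rebuild the transition
    # with the unstable block solved forward, and z rotates the solution back to the
    # original coordinates.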
    tmat = hstack((eye(n - nunstab), -(ueta @ (inv(deta) @ matrix(veta).H) @ veta1 @ deta1 @ matrix(ueta1).H).H))
    G0 = vstack((tmat @ a, hstack((zeros((nunstab, n - nunstab)), eye(nunstab)))))
    G1 = vstack((tmat @ b, zeros((nunstab, n))))

    G0I = inv(G0)
    G1 = G0I @ G1
    usix = arange(n - nunstab, n)
    C = G0I @ vstack((tmat @ q @ c, inv(a[usix, :][:, usix] - b[usix, :][:, usix]) @ q2 @ c))
    impact = G0I @ vstack((tmat @ q @ psi, zeros((nunstab, psi.shape[1]))))
    fmat = inv(b[usix, :][:, usix]) @ a[usix, :][:, usix]
    fwt = -inv(b[usix, :][:, usix]) @ q2 @ psi
    ywt = G0I[:, usix]

    loose = G0I @ vstack((etawt1 @ (eye(neta) - veta @ matrix(veta).H), zeros((nunstab, neta))))
    G1 = array((z @ G1 @ matrix(z).H).real)
    C = array((z @ C).real)
    impact = array((z @ impact).real)
    loose = array((z @ loose).real)
    ywt = array(z @ ywt)

    return G1, C, impact, fmat, fwt, ywt, gev, eu, loose


def qzdiv(stake, A, B, Q, Z, v=None):
    """
    This code is a translation from MATLAB to Python of Christopher Sims' 'qzdiv'.
    https://dge.repec.org/codes/sims/linre3a/
    """

    n = A.shape[0]

    root = abs(vstack([diagonal(A), diagonal(B)]).T)

    root[:, 0] = root[:, 0] - (root[:, 0] < 1.e-13) * (root[:, 0] + root[:, 1])
    root[:, 1] = root[:, 1] / root[:, 0]

    for i in reversed(range(n)):
        m = None
        for j in reversed(range(0, i + 1)):
            if (root[j, 1] > stake) or (root[j, 1] < -0.1):
                m = j
                break

        if m is None:
            return A, B, Q, Z, v

        for k in range(m, i):
            A, B, Q, Z = qzswitch(k, A, B, Q, Z)
            temp = root[k, 1]
            root[k, 1] = root[k + 1, 1]
            root[k + 1, 1] = temp

            if v is not None:
                temp = v[:, k]
                v[:, k] = v[:, k + 1]
                v[:, k + 1] = temp

    return A, B, Q, Z, v
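

# qzswitch exchanges the i-th and (i+1)-th diagonal terms of the QZ decomposition by
# applying 2x2 unitary rotations from the left (xy) and from the right (wz), keeping
# A and B upper triangular and Q and Z unitary.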
def qzswitch(i, A, B, Q, Z):
    """
    This code is a translation from MATLAB to Python of Christopher Sims' 'qzswitch'.
    https://dge.repec.org/codes/sims/linre3a/
    """

    eps = 2.2204e-16
    realsmall = sqrt(eps) * 10

    a, b, c = A[i, i], A[i, i + 1], A[i + 1, i + 1]
    d, e, f = B[i, i], B[i, i + 1], B[i + 1, i + 1]

    if (abs(c) < realsmall) and (abs(f) < realsmall):
        if abs(a) < realsmall:
            # l.r. coincident zeros with u.l. of A=0. Do nothing.
            return A, B, Q, Z
        else:
            # l.r. coincident zeros. Put zeros in u.l. of A.
            wz = array([[b], [-a]])
            wz = wz / ((matrix(wz).H @ matrix(wz))[0, 0] ** 0.5)
            wz = array([[wz[0][0], wz[1][0].conj()], [wz[1][0], -wz[0][0].conj()]])
            xy = eye(2)
    elif (abs(a) < realsmall) and (abs(d) < realsmall):
        if abs(c) < realsmall:
            # u.l. coincident zeros with u.l. of A=0. Do nothing.
            return A, B, Q, Z
        else:
            # u.l. coincident zeros. Put zeros in u.l. of A.
            wz = eye(2)
            xy = array([c, -b])
            xy = xy / (matrix(xy) @ matrix(xy).H)[0, 0] ** 0.5
            xy = array([[xy[1].conj(), -xy[0].conj()], [xy[0], xy[1]]])
    else:
        # Usual case
        wz = array([c * e - f * b, (c * d - f * a).conj()])
        xy = array([(b * d - e * a).conj(), (c * d - f * a).conj()])
        n = (matrix(wz) @ matrix(wz).H)[0, 0] ** 0.5
        m = (matrix(xy) @ matrix(xy).H)[0, 0] ** 0.5

        if m < eps * 100:
            # all elements of A and B are proportional
            return A, B, Q, Z

        wz = wz / n
        xy = xy / m
        wz = array([[wz[0], wz[1]], [-wz[1].conj(), wz[0].conj()]])
        xy = array([[xy[0], xy[1]], [-xy[1].conj(), xy[0].conj()]])

    A[i:i + 2, :] = xy @ A[i:i + 2, :]
    B[i:i + 2, :] = xy @ B[i:i + 2, :]
    A[:, i:i + 2] = A[:, i:i + 2] @ wz
    B[:, i:i + 2] = B[:, i:i + 2] @ wz
    Z[:, i:i + 2] = Z[:, i:i + 2] @ wz
    Q[i:i + 2, :] = xy @ Q[i:i + 2, :]

    return A, B, Q, Z
--------------------------------------------------------------------------------
/dsgepy/pycsminwel.py:
--------------------------------------------------------------------------------
"""
All the functions in this file were created by Christopher Sims in 1996.
They are translations to Python from the author's original MATLAB files.

The originals can be found at:
http://sims.princeton.edu/yftp/optimize/

I kept the author's original variable names so it is easier to compare
this code with his. Also, this code is not very pythonic; it is very
MATLAB-y. I need a better understanding of the algorithm before making
it more pythonic.
"""

import numpy as np
from warnings import warn


def csminwel(fcn, x0, h0=None, grad=None, crit=1e-14, nit=100, verbose=False):
    """
    This is a local minimization algorithm. It uses a quasi-Newton method with BFGS
    update of the estimated inverse Hessian. It is robust against certain
    pathologies common in likelihood functions. It attempts to be robust against
    "cliffs", i.e. hyperplane discontinuities, though it is not really clear
    whether what it does in such cases succeeds reliably.

    The author of this algorithm is Christopher Sims.

    :param fcn: The function to be minimized. Must have only one input.
    :param x0: Initial guess for the optimization.
    :param h0: Initial guess for the inverse Hessian matrix. Must be a positive
               definite matrix.
    :param grad: A function that calculates the gradient of 'fcn'. The function
                 must output a single array with the same dimension as x.
                 If None, the optimization computes a numerical gradient.
    :param crit: Convergence criterion. Iteration will cease when it proves
                 impossible to improve the 'fcn' value by more than 'crit'.
    :param nit: Maximum number of iterations.
    :param verbose: If True, prints the steps of the algorithm.
    :return: fh: minimal value of the function
             xh: optimal input
             gh: gradient at the optimal value
             h: inverse Hessian matrix at the optimal value
             itct: number of iterations
             fcount: number of times 'fcn' was evaluated
             retcodeh: return code.
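
    Example (a minimal usage sketch, not part of the original code; SciPy's
    Rosenbrock function is used purely as an illustration):

        import numpy as np
        from scipy.optimize import rosen

        x0 = np.array([0.3, 0.7])
        fh, xh, gh, h, itct, fcount, retcode = csminwel(rosen, x0, crit=1e-12, nit=200)
        # xh should end up close to array([1., 1.]), the minimizer of the Rosenbrock function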
    """

    assert isinstance(x0, np.ndarray) and x0.ndim == 1, "'x0' must be a numpy.ndarray with only 1 dimension"
    assert callable(fcn), "'fcn' must be a callable function"

    nx = x0.shape[0]

    if grad is not None:
        assert callable(grad), "'grad' must be a callable function"
        test_grad = grad(x0)
        assert test_grad.shape[0] == nx, "output of 'grad' does not match the size of 'x'"
        assert test_grad.ndim == 1, "output of 'grad' must have ndim equal to 1"

    itct = 0
    fcount = 0
    numGrad = grad is None

    if h0 is None:  # This is not the desired behaviour. Passing h0 is recommended.
        h0 = 0.5 * np.eye(nx)
    else:
        assert np.all(np.linalg.eigvals(h0) > 0), "'h0' must be positive definite"

    f0 = fcn(x0)

    if numGrad:
        g, badg = numgrad(fcn, x0)
    else:
        g = grad(x0)
        badg = False

    # To avoid declaration before assignment
    badg1 = None
    badg2 = None
    badg3 = None
    x2 = None
    x3 = None
    xh = None
    fh = None
    g1 = None
    g2 = None
    g3 = None
    gh = None
    retcodeh = None

    # Start the optimization
    x = x0
    f = f0
    h = h0
    done = False

    while not done:
        itct += 1

        if verbose:
            print(f'f at the beginning of iteration {itct} is', f)

        f1, x1, fc, retcode1 = csminit(fcn, x, f, g, badg, h)

        fcount += fc

        if retcode1 != 1:
            if retcode1 == 2 or retcode1 == 4:
                wall1 = True
                badg1 = True
            else:
                if numGrad:
                    g1, badg1 = numgrad(fcn, x1)
                else:
                    g1 = grad(x1)
                    badg1 = False

                wall1 = badg1

            if wall1 and h.shape[0] > 1:
                # Bad gradient or back and forth on step length. Possibly at a cliff edge.
                # Try perturbing the search direction if the problem is not 1-D.
                hCliff = h + np.diag(np.diag(h) * np.random.rand(nx))
                f2, x2, fc, retcode2 = csminit(fcn, x, f, g, badg, hCliff)
                fcount += fc

                if f2 < f:
                    if retcode2 == 2 or retcode2 == 4:
                        wall2 = True
                        badg2 = True
                    else:
                        if numGrad:
                            g2, badg2 = numgrad(fcn, x2)
                        else:
                            g2 = grad(x2)
                            badg2 = False

                        wall2 = badg2

                    if wall2:
                        # Cliff again. Try traversing.
                        if np.linalg.norm(x2 - x1) < 1e-13:
                            f3 = f
                            x3 = x
                            badg3 = True
                            retcode3 = 101
                        else:
                            gcliff = ((f2 - f1) / ((np.linalg.norm(x2 - x1)) ** 2)) * (x2 - x1)
                            # "if (size(x0, 2) > 1), gcliff=gcliff', end" <-- this is MATLAB code, only
                            # needed if x is a matrix, which is not our case.
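                            # gcliff is a synthetic gradient along x2 - x1, scaled so that a unit
                            # step along it reproduces the observed change in f; csminit then uses
                            # it with an identity 'inverse Hessian' to step along the cliff edge.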
                            f3, x3, fc, retcode3 = csminit(fcn, x, f, gcliff, False, np.eye(nx))
                            fcount += fc

                            if retcode3 == 2 or retcode3 == 4:
                                wall3 = True
                                badg3 = True
                            else:
                                if numGrad:
                                    g3, badg3 = numgrad(fcn, x3)
                                else:
                                    g3 = grad(x3)
                                    badg3 = False

                                wall3 = badg3

                    else:
                        f3 = f
                        x3 = x
                        badg3 = True
                        retcode3 = 101

                else:
                    f3 = f
                    x3 = x
                    badg3 = True
                    retcode3 = 101

            else:
                # normal iteration, no walls, or else 1D, or else we're finished here
                f2 = f
                f3 = f
                badg2 = True
                badg3 = True
                retcode2 = 101
                retcode3 = 101

        else:
            f2 = f
            f3 = f
            f1 = f
            retcode2 = retcode1
            retcode3 = retcode1

        if (f3 < f - crit) and (badg3 is False):
            ih = 3
            fh = f3
            xh = x3
            gh = g3
            badgh = badg3
            retcodeh = retcode3

        elif (f2 < f - crit) and (badg2 is False):
            ih = 2
            fh = f2
            xh = x2
            gh = g2
            badgh = badg2
            retcodeh = retcode2

        elif (f1 < f - crit) and (badg1 is False):
            ih = 1
            fh = f1
            xh = x1
            gh = g1
            badgh = badg1
            retcodeh = retcode1
        else:
            fh = min(f1, f2, f3)
            ih = np.argmin([f1, f2, f3])

            if ih == 0:
                xh = x1
            elif ih == 1:
                xh = x2
            elif ih == 2:
                xh = x3

            retcodei = [retcode1, retcode2, retcode3]
            retcodeh = retcodei[ih]

            # 'gh' always exists here (it is initialized to None), so test for None
            # instead of membership in locals()
            if gh is None:
                nogh = True
            else:
                nogh = gh.size == 0

            if nogh:
                if numGrad:
                    gh, badgh = numgrad(fcn, xh)
                else:
                    gh = grad(xh)
                    badgh = False

            badgh = True

        stuck = (np.abs(fh - f) < crit)

        if (not badg) and (not badgh) and (not stuck):
            h = bfgsi(h, gh - g, xh - x, verbose=verbose)

        if verbose:
            print(f'Improvement on iteration {itct} was {f - fh}')

        if itct > nit:
            if verbose:
                print('Maximum number of iterations reached')
            done = True

        elif stuck:
            if verbose:
                print("Convergence achieved. Improvement lower than 'crit'.")
            done = True

        if verbose:
            if retcodeh == 1:
                print('Zero gradient')
            elif retcodeh == 6:
                print('Smallest step still improving too slow, reversed gradient')
            elif retcodeh == 5:
                print('Largest step still improving too fast')
            elif retcodeh == 4 or retcodeh == 2:
                print('Back and forth on step length never finished')
            elif retcodeh == 3:
                print('Smallest step still improving too slow')
            elif retcodeh == 7:
                warn('Possible inaccuracy in the Hessian matrix')

            print('\n')

        f = fh
        x = xh
        g = gh
        badg = badgh

    return fh, xh, gh, h, itct, fcount, retcodeh


def csminit(fcn, x0, f0, g0, badg, h0):

    # Do not ask me where these come from
    angle = 0.005
    theta = 0.3
    fchange = 1000
    minlamb = 1e-9
    mindfac = 0.01

    retcode = None
    fcount = 0
    lambda_ = 1
    xhat = x0
    fhat = f0
    g = g0
    gnorm = np.linalg.norm(g)

    if gnorm < 1e-12 and not badg:
        # gradient convergence
        retcode = 1
    else:
        # With badg True, we don't try to match the rate of improvement to the
        # directional derivative. We are satisfied just to get *some*
        # improvement in f.
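
        # Quasi-Newton search direction: dx = -h0 @ g, where h0 approximates the
        # inverse Hessian.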
        dx = - h0 @ g
        dxnorm = np.linalg.norm(dx)

        if dxnorm > 1e12:  # near-singular H problem
            dx = dx * fchange / dxnorm

        dfhat = dx.T @ g0

        if not badg:
            # test for alignment of dx with gradient and fix if necessary
            a = - dfhat / (gnorm * dxnorm)
            if a < angle:
                dx = dx - (angle * dxnorm / gnorm + dfhat / (gnorm ** 2)) * g
                dx = dx * dxnorm / np.linalg.norm(dx)
                dfhat = dx.T @ g

        done = False
        factor = 3
        shrink = True
        lambdaMax = np.inf
        lambdaPeak = 0
        fPeak = f0

        while not done:
            if x0.shape[0] > 1:
                dxtest = x0 + dx.T * lambda_
            else:
                dxtest = x0 + dx * lambda_

            f = fcn(dxtest)

            if f < fhat:
                fhat = f
                xhat = dxtest

            fcount += 1

            shrinkSignal = ((not badg) and (f0 - f < max(- theta * dfhat * lambda_, 0))) or (badg and (f0 - f < 0))
            growSignal = (not badg) and ((lambda_ > 0) and (f0 - f > - (1 - theta) * dfhat * lambda_))

            if shrinkSignal and ((lambda_ > lambdaPeak) or (lambda_ < 0)):
                if lambda_ > 0 and ((not shrink) or (lambda_ / factor <= lambdaPeak)):
                    shrink = True
                    factor = factor ** 0.6

                    while lambda_ / factor <= lambdaPeak:
                        factor = factor ** 0.6

                    if np.abs(factor - 1) < mindfac:

                        if np.abs(lambda_) < 4:
                            retcode = 2
                        else:
                            retcode = 7

                        done = True

                if (lambda_ < lambdaMax) and (lambda_ > lambdaPeak):
                    lambdaMax = lambda_

                lambda_ = lambda_ / factor

                if np.abs(lambda_) < minlamb:
                    if (lambda_ > 0) and (f0 <= fhat):
                        # try going against the gradient, which may be inaccurate
                        lambda_ = - lambda_ * factor ** 6
                    else:
                        if lambda_ < 0:
                            retcode = 6
                        else:
                            retcode = 3

                        done = True

            elif (growSignal and lambda_ > 0) or (shrinkSignal and ((lambda_ <= lambdaPeak) and (lambda_ > 0))):
                if shrink:
                    shrink = False
                    factor = factor ** 0.6

                    if np.abs(factor - 1) < mindfac:
                        if np.abs(lambda_) < 4:
                            retcode = 4
                        else:
                            retcode = 7

                        done = True

                if (f < fPeak) and (lambda_ > 0):
                    fPeak = f
                    lambdaPeak = lambda_
                    if lambdaMax <= lambdaPeak:
                        lambdaMax = lambdaPeak * (factor ** 2)

                lambda_ = lambda_ * factor

                if np.abs(lambda_) > 1e20:
                    retcode = 5
                    done = True

            else:
                done = True
                if factor < 1.2:
                    retcode = 7
                else:
                    retcode = 0

    return fhat, xhat, fcount, retcode


def numgrad(fcn, x):
    """
    Computes the numerical gradient of a function at the point x.
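
    Uses a one-sided (forward) difference with step delta = 1e-6:
    g[i] ~ (fcn(x + delta * e_i) - fcn(x)) / delta, where e_i is the i-th unit vector.
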
    :param fcn: python function that must depend on a single argument
    :param x: 1-D numpy.array
    :return: tuple of a 1-D numpy.array with the gradient and a bool flagging a bad gradient
    """

    delta = 1e-6
    bad_gradient = False
    n = x.shape[0]
    tvec = delta * np.eye(n)
    g = np.zeros(n)

    f0 = fcn(x)

    for i in range(n):
        tvecv = tvec[i, :]

        g0 = (fcn(x + tvecv) - f0) / delta

        if np.abs(g0) < 1e15:  # good gradient
            g[i] = g0
        else:  # bad gradient
            g[i] = 0
            bad_gradient = True

    return g, bad_gradient


def bfgsi(h0, dg, dx, verbose=False):
    """
    BFGS update for the inverse Hessian.
    :param h0: previous value of the inverse Hessian
    :param dg: previous change in the gradient
    :param dx: previous change in x
    :param verbose: If True, prints updates of the algorithm
    :return: updated inverse Hessian matrix
    """

    dx = dx.reshape(-1, 1)
    dg = dg.reshape(-1, 1)

    hdg = h0 @ dg
    dgdx = (dg.T @ dx)[0, 0]

    if np.abs(dgdx) > 1e-12:
        # Standard BFGS update of the inverse Hessian:
        # H' = H + (1 + dg'H dg / dg'dx) dx dx' / dg'dx - (dx (H dg)' + (H dg) dx') / dg'dx
        h = h0 + (1 + (dg.T @ hdg) / dgdx) * (dx @ dx.T) / dgdx - (dx @ hdg.T + hdg @ dx.T) / dgdx

    else:
        if verbose:
            print('bfgs update failed')
            # disp(['|dg| = ' num2str(sqrt(dg'*dg)) ' | dx | = ' num2str(sqrt(dx' * dx))]);
            # disp(['dg''*dx = ' num2str(dgdx)])
            # disp(['|H*dg| = ' num2str(Hdg'*Hdg)])

        h = h0

    return h
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
from setuptools import setup, find_packages
import codecs
import os

here = os.path.abspath(os.path.dirname(__file__))

with codecs.open(os.path.join(here, "README.md"), encoding="utf-8") as fh:
    long_description = "\n" + fh.read()

VERSION = '1.1'
DESCRIPTION = 'Solve and estimate linearized DSGE models'

# Setting up
setup(
    name="dsgepy",
    version=VERSION,
    author="Gustavo Amarante",
    author_email="gusamarante@gmail.com",
    description=DESCRIPTION,
    long_description_content_type="text/markdown",
    long_description=long_description,
    packages=find_packages(),
    install_requires=[
        'matplotlib',
        'numpy',
        'pandas',
        'scipy',
        'sympy',
        'tqdm',
    ],
    keywords=['dsge', 'macroeconomics'],
)
--------------------------------------------------------------------------------