├── pqn.png ├── .gitignore ├── README.md ├── how_to_compute_drawdown_on_an_investment.ipynb ├── how_to_use_the_sharpe_ratio_for_riskadjusted_returns.ipynb ├── evaluate_a_real_trading_strategy_with_python_and_pandas.ipynb ├── a_simple_stat_arb_strategy_you_can_trade_now.ipynb ├── how_to_compute_volatility_6_ways_most_people_don’t_know.ipynb ├── how_to_simulate_stock_prices_with_python.ipynb ├── value_at_risk_manage_your_portfolios_risk.ipynb └── how_to_tell_if_options_are_cheap_with_volatility_cones.ipynb /pqn.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pyquantnews/PyQuantNewsletter/HEAD/pqn.png -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Jupyter Notebook 2 | .ipynb_checkpoints 3 | 4 | # Mac stuff 5 | .DS_Store 6 | .virtual_documents 7 | 8 | # Python 9 | matplotlibrc -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Code for the PyQuant Newsletter 2 | _Join traders, quants, and complete beginners using Python for algorithmic trading, market data analysis, and quant finance._ 3 | 4 | ## Overview 5 | This repository contains the code for the **PyQuant Newsletter**. Each issue focuses on practical, implementation-ready Python techniques for quantitative finance, algorithmic trading, factor research, portfolio construction, and data engineering. You can [subscribe to the PyQuant Newsletter on Substack free](https://pyquantnews.substack.com/subscribe). 6 | 7 | ## Why subscribe to the free PyQuant Newsletter 8 | 9 | The PyQuant Newsletter gives you applied quant finance in a compact format designed for working professionals: 10 | 11 | - **Hands-on code**: Each issue includes runnable Python examples for immediate use. 
12 | - **Actionable quant ideas**: Factor signals, trading edges, volatility modeling, portfolio techniques, data pipelines. 13 | - **Practical focus**: Real workflows used by quants, traders, and data scientists. 14 | - **Time-efficient**: Designed so readers can extract value in ~5 minutes. 15 | - **Trusted by practitioners**: Read by tens of thousands of quants, traders, engineers, and analysts. 16 | 17 | ## Alpha Lab Membership 18 | 19 | An [Alpha Lab membership](https://pyquantnews.substack.com/subscribe) is for readers who want access to backtested strategies and higher-touch interaction. 20 | 21 | #### Backtested Strategies 22 | Every month, Alpha Lab members receive a full research deep dive, including: 23 | - Python code for signal construction 24 | - Backtests and performance analysis 25 | - Interpretation of edge, risk, and failure modes 26 | - A reproducible research notebook 27 | 28 | #### Upcoming strategies include: 29 | - **Reversion Strength**: identifies exhaustion points where price is likely to revert 30 | - **Trend Persistence**: rewards assets with sustained directional strength 31 | - **Stability Premium**: prioritizes assets with persistent, lower-volatility behavior 32 | 33 | #### Start New Chat Threads 34 | Alpha Lab members can initiate new discussion threads inside the private chat environment, enabling: 35 | - Direct Q&A 36 | - Strategy-specific help 37 | - Environment setup guidance 38 | - More detailed and open-ended discussions 39 | 40 | ## Paid Membership Benefits 41 | 42 | You can also support PyQuant News and get deeper help, updated code, and access to private discussions through a [paid membership](https://pyquantnews.substack.com/subscribe). 
43 | 44 | #### Private Code Support 45 | Access to the **Substack Chat**, a private channel where members can get personalized help with: 46 | - Python code issues 47 | - Strategy design 48 | - Backtesting workflows 49 | - Execution logic 50 | - Troubleshooting library breaks and environment issues 51 | 52 | > Note: Only **Alpha Lab** members can start new chat threads. Paid members can reply and participate. 53 | 54 | #### Updated Code 55 | Python libraries evolve, dependencies break, and tutorials need maintenance. Paid members receive: 56 | - Updated notebooks 57 | - Rewritten code when APIs change 58 | - Fixes for breaking library updates 59 | - Continuously maintained examples that keep running over time 60 | 61 | #### Meaningful Discussions 62 | Every newsletter ends with a **members-only comments section**. 63 | Paid members get: 64 | - Direct interaction 65 | - Thoughtful replies 66 | - Deeper discussions around implementation and research decisions 67 | 68 | ## Who this is for 69 | The newsletter and membership benefits serve: 70 | - Quants and quant-curious developers 71 | - Systematic traders 72 | - Data scientists working with market data 73 | - Python programmers learning quant techniques 74 | - Researchers exploring factor models and trading strategies 75 | 76 | If you find this code useful, consider subscribing to receive future issues and member-only benefits. 77 | -------------------------------------------------------------------------------- /how_to_compute_drawdown_on_an_investment.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "e7ab5fbe", 6 | "metadata": {}, 7 | "source": [ 8 | "
PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk.
" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "8a8ad335", 14 | "metadata": {}, 15 | "source": [ 16 | "## Library installation\n", 17 | "Install the libraries needed to download market data, compute vectorized metrics, and render plots so the notebook runs end to end." 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": null, 23 | "id": "95ee3b4e", 24 | "metadata": {}, 25 | "outputs": [], 26 | "source": [ 27 | "!pip install yfinance pandas numpy matplotlib" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "id": "7ed921e1", 33 | "metadata": {}, 34 | "source": [ 35 | "yfinance gives us quick access to historical prices, pandas handles time series, numpy powers fast array ops, and matplotlib lets pandas render the underwater charts. Installing them up front avoids environment surprises when we get to plotting and rolling calculations." 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "id": "4907e200", 41 | "metadata": {}, 42 | "source": [ 43 | "## Imports and setup\n", 44 | "We use yfinance to download SPY price history for the study period, and numpy for vectorized operations used inside the drawdown calculations." 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": null, 50 | "id": "6a1541ab", 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "import yfinance as yf\n", 55 | "import numpy as np" 56 | ] 57 | }, 58 | { 59 | "cell_type": "markdown", 60 | "id": "1be9dbe5", 61 | "metadata": {}, 62 | "source": [ 63 | "Keeping imports minimal mirrors production pipelines where dependencies are kept lean and predictable. yfinance is fine for tutorials; in production you would typically swap in a data vendor but keep the same numpy-based logic." 
64 | ] 65 | }, 66 | { 67 | "cell_type": "markdown", 68 | "id": "f29c65dc", 69 | "metadata": {}, 70 | "source": [ 71 | "## Fetch price data and returns\n", 72 | "Pull SPY daily prices for the sample window and transform them into simple daily returns, which are the input to our drawdown functions." 73 | ] 74 | }, 75 | { 76 | "cell_type": "code", 77 | "execution_count": null, 78 | "id": "8538fd86", 79 | "metadata": { 80 | "lines_to_next_cell": 1 81 | }, 82 | "outputs": [], 83 | "source": [ 84 | "data = yf.download(\"SPY\", start=\"2020-01-01\", end=\"2022-07-31\")\n", 85 | "returns = data.Close.pct_change()" 86 | ] 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "id": "ccf2a845", 91 | "metadata": {}, 92 | "source": [ 93 | "Recent yfinance releases adjust Close for dividends and splits by default (auto_adjust=True), which aligns with what we care about: investable returns that include dividends and splits. pct_change introduces a leading NaN, which is expected and handled inside drawdown; the important part is that we’re creating a clean, noncumulative return stream that we can roll up into an equity curve." 94 | ] 95 | }, 96 | { 97 | "cell_type": "markdown", 98 | "id": "d982a9de", 99 | "metadata": {}, 100 | "source": [ 101 | "## Define drawdown utilities and metrics\n", 102 | "Implement a vectorized drawdown series and a companion max drawdown function so we can reuse the same logic in backtests, dashboards, and rolling risk checks." 
103 | ] 104 | }, 105 | { 106 | "cell_type": "code", 107 | "execution_count": null, 108 | "id": "ac127d46", 109 | "metadata": {}, 110 | "outputs": [], 111 | "source": [ 112 | "def drawdown(returns):\n", 113 | "    \"\"\"Determines the drawdown\n", 114 | "\n", 115 | "    Parameters\n", 116 | "    ----------\n", 117 | "    returns : pd.Series\n", 118 | "        Daily returns of an asset, noncumulative\n", 119 | "\n", 120 | "    Returns\n", 121 | "    -------\n", 122 | "    drawdown : pd.Series\n", 123 | "    \"\"\"\n", 124 | "    returns = returns.fillna(0.0)\n", 125 | "\n", 126 | "    cumulative = (returns + 1).cumprod()\n", 127 | "\n", 128 | "    running_max = np.maximum.accumulate(cumulative)\n", 129 | "\n", 130 | "    return (cumulative - running_max) / running_max" 131 | ] 132 | }, 133 | { 134 | "cell_type": "code", 135 | "execution_count": null, 136 | "id": "a065558e", 137 | "metadata": { 138 | "lines_to_next_cell": 1 139 | }, 140 | "outputs": [], 141 | "source": [ 142 | "def max_drawdown(returns):\n", 143 | "    \"\"\"Determines the maximum drawdown\n", 144 | "\n", 145 | "    Parameters\n", 146 | "    ----------\n", 147 | "    returns : pd.Series\n", 148 | "        Daily returns of an asset, noncumulative\n", 149 | "\n", 150 | "    Returns\n", 151 | "    -------\n", 152 | "    max_drawdown : float\n", 153 | "    \"\"\"\n", 154 | "    return np.min(drawdown(returns))" 155 | ] 156 | }, 157 | { 158 | "cell_type": "markdown", 159 | "id": "b2233c38", 160 | "metadata": {}, 161 | "source": [ 162 | "The drawdown series tracks how far underwater we are relative to the running peak, which is how pros experience risk day to day. Filling NaNs with zero keeps the leading NaN from contaminating the cumulative product; fillna returns a new Series here, so the input passed by the caller is left untouched. The minimum of the underwater series is the max drawdown, the single pain number allocators and risk teams anchor on." 
163 | ] 164 | }, 165 | { 166 | "cell_type": "markdown", 167 | "id": "c6427782", 168 | "metadata": {}, 169 | "source": [ 170 | "## Plot underwater and rolling risk\n", 171 | "Visualize the full-sample underwater curve, then a 30-day rolling max drawdown to spot regime shifts and trigger de-risking rules earlier." 172 | ] 173 | }, 174 | { 175 | "cell_type": "code", 176 | "execution_count": null, 177 | "id": "d1d734ee", 178 | "metadata": {}, 179 | "outputs": [], 180 | "source": [ 181 | "drawdown(returns).plot(kind=\"area\", color=\"salmon\", alpha=0.5)" 182 | ] 183 | }, 184 | { 185 | "cell_type": "code", 186 | "execution_count": null, 187 | "id": "35c27bfe", 188 | "metadata": {}, 189 | "outputs": [], 190 | "source": [ 191 | "returns.rolling(30).apply(max_drawdown).plot(\n", 192 | " kind=\"area\",\n", 193 | " color=\"salmon\",\n", 194 | " alpha=0.5,\n", 195 | ")" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "id": "b2890ed0", 201 | "metadata": {}, 202 | "source": [ 203 | "The underwater area chart shows the depth and duration of capital losses, which is more decision-relevant than average return when markets stress. Rolling max drawdown highlights localized pockets of pain that get averaged away in full-sample stats, making it a practical alert for position sizing and pause rules. In production, this series feeds a dashboard and simple thresholds to keep risk within cash and mandate limits." 204 | ] 205 | }, 206 | { 207 | "cell_type": "markdown", 208 | "id": "93b7ceeb", 209 | "metadata": {}, 210 | "source": [ 211 | "PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk." 
212 | ] 213 | }, 214 | { 215 | "cell_type": "code", 216 | "execution_count": null, 217 | "id": "e73bdad9-aaca-44a3-8302-2ca9bc14fce8", 218 | "metadata": {}, 219 | "outputs": [], 220 | "source": [] 221 | } 222 | ], 223 | "metadata": { 224 | "jupytext": { 225 | "cell_metadata_filter": "-all", 226 | "main_language": "python", 227 | "notebook_metadata_filter": "-all" 228 | }, 229 | "kernelspec": { 230 | "display_name": "Python 3 (ipykernel)", 231 | "language": "python", 232 | "name": "python3" 233 | }, 234 | "language_info": { 235 | "codemirror_mode": { 236 | "name": "ipython", 237 | "version": 3 238 | }, 239 | "file_extension": ".py", 240 | "mimetype": "text/x-python", 241 | "name": "python", 242 | "nbconvert_exporter": "python", 243 | "pygments_lexer": "ipython3", 244 | "version": "3.12.9" 245 | } 246 | }, 247 | "nbformat": 4, 248 | "nbformat_minor": 5 249 | } 250 | -------------------------------------------------------------------------------- /how_to_use_the_sharpe_ratio_for_riskadjusted_returns.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "e771ac8a", 6 | "metadata": {}, 7 | "source": [ 8 | "
" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "11d9e861", 14 | "metadata": {}, 15 | "source": [ 16 | "## Library installation" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "eb0a0f05", 22 | "metadata": {}, 23 | "source": [ 24 | "Install the libraries we need to fetch market data, compute statistics, and visualize results so this notebook runs cleanly anywhere." 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": null, 30 | "id": "1e52e486", 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "!pip install yfinance pandas numpy matplotlib" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "id": "728f4093", 40 | "metadata": {}, 41 | "source": [ 42 | "## Imports and setup" 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "id": "4e6a8342", 48 | "metadata": {}, 49 | "source": [ 50 | "We use yfinance to pull historical prices, numpy for fast numerical operations such as annualization, and matplotlib.pyplot to visualize Sharpe behavior over time." 51 | ] 52 | }, 53 | { 54 | "cell_type": "code", 55 | "execution_count": null, 56 | "id": "fbb1d5d0", 57 | "metadata": {}, 58 | "outputs": [], 59 | "source": [ 60 | "import yfinance as yf\n", 61 | "import numpy as np\n", 62 | "import matplotlib.pyplot as plt" 63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "id": "fb398b3b", 68 | "metadata": {}, 69 | "source": [ 70 | "Keeping imports minimal helps us focus on the Sharpe workflow without extra dependencies. This mirrors what pros do in quick analyses: load prices, compute stats, and plot diagnostics. It also makes results easier to reproduce across machines and environments." 
71 | ] 72 | }, 73 | { 74 | "cell_type": "markdown", 75 | "id": "467fa578", 76 | "metadata": {}, 77 | "source": [ 78 | "## Download and prepare daily returns" 79 | ] 80 | }, 81 | { 82 | "cell_type": "markdown", 83 | "id": "07e77e22", 84 | "metadata": {}, 85 | "source": [ 86 | "Pull adjusted close prices for SPY and AAPL over a fixed horizon so we compare assets on the same calendar and sampling frequency." 87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": null, 92 | "id": "bde52d3c", 93 | "metadata": {}, 94 | "outputs": [], 95 | "source": [ 96 | "data = yf.download([\"SPY\", \"AAPL\"], start=\"2020-01-01\", end=\"2024-12-31\")" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "id": "8b457731", 102 | "metadata": {}, 103 | "source": [ 104 | "Using a single source avoids mismatched calendars and survivorship issues that skew risk metrics. Adjusted prices account for splits and dividends, which is important when you want returns that reflect what holders experienced. Consistent sourcing reduces noisy differences that would otherwise leak into Sharpe comparisons." 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "id": "4af8620d", 110 | "metadata": {}, 111 | "source": [ 112 | "Convert prices to noncumulative daily returns using close-to-close changes, which is the correct input for Sharpe." 113 | ] 114 | }, 115 | { 116 | "cell_type": "code", 117 | "execution_count": null, 118 | "id": "75dfbad8", 119 | "metadata": { 120 | "lines_to_next_cell": 1 121 | }, 122 | "outputs": [], 123 | "source": [ 124 | "closes = data.Close\n", 125 | "spy_returns = closes.SPY.pct_change().dropna()\n", 126 | "aapl_returns = closes.AAPL.pct_change().dropna()" 127 | ] 128 | }, 129 | { 130 | "cell_type": "markdown", 131 | "id": "326d47af", 132 | "metadata": {}, 133 | "source": [ 134 | "We compute simple daily returns because Sharpe expects a stream of noncumulative observations; the difference versus log returns is negligible at daily horizons. 
Dropping missing values prevents artificial spikes or divide-by-zero artifacts that can inflate volatility. This gives us a clean, aligned return series for each asset." 135 | ] 136 | }, 137 | { 138 | "cell_type": "markdown", 139 | "id": "0c5e7da2", 140 | "metadata": {}, 141 | "source": [ 142 | "## Define and compute Sharpe ratios" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "id": "57fb4be9", 148 | "metadata": {}, 149 | "source": [ 150 | "Define a helper that computes an annualized Sharpe ratio from daily returns, with an optional constant daily cash adjustment for excess returns." 151 | ] 152 | }, 153 | { 154 | "cell_type": "code", 155 | "execution_count": null, 156 | "id": "1308284a", 157 | "metadata": { 158 | "lines_to_next_cell": 1 159 | }, 160 | "outputs": [], 161 | "source": [ 162 | "def sharpe_ratio(returns, adjustment_factor=0.0):\n", 163 | " \"\"\"\n", 164 | " Determines the Sharpe ratio of a strategy.\n", 165 | "\n", 166 | " Parameters\n", 167 | " ----------\n", 168 | " returns : pd.Series or np.ndarray\n", 169 | " Daily returns of the strategy, noncumulative.\n", 170 | " adjustment_factor : int, float\n", 171 | " Constant daily benchmark return throughout the period.\n", 172 | "\n", 173 | " Returns\n", 174 | " -------\n", 175 | " sharpe_ratio : float\n", 176 | "\n", 177 | " Note\n", 178 | " -----\n", 179 | " See https://en.wikipedia.org/wiki/Sharpe_ratio for more details.\n", 180 | " \"\"\"\n", 181 | " returns_risk_adj = returns - adjustment_factor\n", 182 | " return (\n", 183 | " returns_risk_adj.mean() / returns_risk_adj.std()\n", 184 | " ) * np.sqrt(252)" 185 | ] 186 | }, 187 | { 188 | "cell_type": "markdown", 189 | "id": "9e7ff189", 190 | "metadata": {}, 191 | "source": [ 192 | "This implementation uses the sample standard deviation (pandas’ default) on noncumulative returns and annualizes with sqrt(252), matching common practice on U.S. trading days. 
In production, we would subtract a time-varying daily cash rate rather than a constant when rates move. The function enforces the “excess return per unit of volatility” idea that lets us compare assets fairly." 193 | ] 194 | }, 195 | { 196 | "cell_type": "markdown", 197 | "id": "305a2595", 198 | "metadata": {}, 199 | "source": [ 200 | "Compute full-sample Sharpe ratios for SPY and AAPL to benchmark their risk-adjusted performance over the same window." 201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": null, 206 | "id": "b7637bdd", 207 | "metadata": {}, 208 | "outputs": [], 209 | "source": [ 210 | "sharpe_ratio(spy_returns)" 211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": null, 216 | "id": "44e5cdd5", 217 | "metadata": {}, 218 | "outputs": [], 219 | "source": [ 220 | "sharpe_ratio(aapl_returns)" 221 | ] 222 | }, 223 | { 224 | "cell_type": "markdown", 225 | "id": "03e1b134", 226 | "metadata": {}, 227 | "source": [ 228 | "Looking at full-sample Sharpe gives a quick first pass on which asset delivered more return per unit of risk, unlevered and on identical sampling. This avoids the “bigger CAGR wins” trap by normalizing for volatility and frequency. Treat these as starting points before you inspect stability through time." 229 | ] 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "id": "d700d765", 234 | "metadata": {}, 235 | "source": [ 236 | "## Visualize rolling Sharpe and differences" 237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "id": "f0e9f692", 242 | "metadata": {}, 243 | "source": [ 244 | "Track 30-day rolling Sharpe for AAPL, then plot its distribution and the rolling Sharpe difference versus SPY to see stability and relative edge." 
245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "execution_count": null, 250 | "id": "ec0e039a", 251 | "metadata": {}, 252 | "outputs": [], 253 | "source": [ 254 | "aapl_returns.rolling(30).apply(sharpe_ratio).plot()" 255 | ] 256 | }, 257 | { 258 | "cell_type": "code", 259 | "execution_count": null, 260 | "id": "e16c1491-3650-422a-a2bf-14240e8b04a4", 261 | "metadata": {}, 262 | "outputs": [], 263 | "source": [ 264 | "aapl_returns.rolling(30).apply(sharpe_ratio).hist(bins=50)" 265 | ] 266 | }, 267 | { 268 | "cell_type": "code", 269 | "execution_count": null, 270 | "id": "c5b942b2-293e-47a2-92b0-72c0e94cb704", 271 | "metadata": {}, 272 | "outputs": [], 273 | "source": [ 274 | "(\n", 275 | " aapl_returns.rolling(30).apply(sharpe_ratio)\n", 276 | " - spy_returns.rolling(30).apply(sharpe_ratio)\n", 277 | ").hist(bins=50)" 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "id": "0a014e70", 283 | "metadata": {}, 284 | "source": [ 285 | "Rolling windows reveal regime shifts that a single full-period number hides, which is how desks set guardrails and monitor drift. The histograms show how often AAPL’s efficiency clusters at certain levels and when it outperforms SPY on a risk-adjusted basis. Short windows can be noisy, so in real reviews we also test multiple horizons and subtract a realistic daily cash rate when conditions change." 286 | ] 287 | }, 288 | { 289 | "cell_type": "markdown", 290 | "id": "44b1756c", 291 | "metadata": {}, 292 | "source": [ 293 | "PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk." 
294 | ] 295 | } 296 | ], 297 | "metadata": { 298 | "jupytext": { 299 | "cell_metadata_filter": "-all", 300 | "main_language": "python", 301 | "notebook_metadata_filter": "-all" 302 | }, 303 | "kernelspec": { 304 | "display_name": "Python 3 (ipykernel)", 305 | "language": "python", 306 | "name": "python3" 307 | }, 308 | "language_info": { 309 | "codemirror_mode": { 310 | "name": "ipython", 311 | "version": 3 312 | }, 313 | "file_extension": ".py", 314 | "mimetype": "text/x-python", 315 | "name": "python", 316 | "nbconvert_exporter": "python", 317 | "pygments_lexer": "ipython3", 318 | "version": "3.12.9" 319 | } 320 | }, 321 | "nbformat": 4, 322 | "nbformat_minor": 5 323 | } 324 | -------------------------------------------------------------------------------- /evaluate_a_real_trading_strategy_with_python_and_pandas.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "b68b729a", 6 | "metadata": {}, 7 | "source": [ 8 | "
" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "7a93132c", 14 | "metadata": {}, 15 | "source": [ 16 | "## Library installation" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "fbf603bb", 22 | "metadata": {}, 23 | "source": [ 24 | "Install the required Python packages so the notebook runs end-to-end in a fresh environment. This ensures everyone tests the same versions when measuring a small calendar edge." 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": null, 30 | "id": "ff7e4199", 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "!pip install yfinance pandas numpy matplotlib" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "id": "641fa05a", 40 | "metadata": {}, 41 | "source": [ 42 | "These wheels install quickly on standard CPython via pip. For reproducibility, consider pinning versions in a requirements.txt when you move from exploration to backtesting. No non-standard system dependencies are needed for this notebook." 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "id": "7f868274", 48 | "metadata": {}, 49 | "source": [ 50 | "## Imports and setup" 51 | ] 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "id": "2367b3be", 56 | "metadata": {}, 57 | "source": [ 58 | "We import matplotlib.pyplot for plotting, pandas for time‑series manipulation, numpy for vectorized math and log returns, and yfinance to download historical TLT prices. This minimal stack keeps the focus on testing a calendar hypothesis rather than tooling." 
59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "id": "ba9386e3", 65 | "metadata": {}, 66 | "outputs": [], 67 | "source": [ 68 | "import matplotlib.pyplot as plt\n", 69 | "import pandas as pd\n", 70 | "import numpy as np\n", 71 | "import yfinance as yf" 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "id": "953d9fc4", 77 | "metadata": {}, 78 | "source": [ 79 | "Keeping the stack small reduces the chance of hidden defaults that skew results. Matplotlib’s stateful interface will render plots at the end when we call plt.show(), which keeps notebooks tidy. yfinance retrieves split/dividend-adjusted series we will treat as our baseline." 80 | ] 81 | }, 82 | { 83 | "cell_type": "markdown", 84 | "id": "ab0b43e7", 85 | "metadata": {}, 86 | "source": [ 87 | "## Download data and build features" 88 | ] 89 | }, 90 | { 91 | "cell_type": "markdown", 92 | "id": "1be8bfb9", 93 | "metadata": {}, 94 | "source": [ 95 | "Fetch daily TLT prices from Yahoo Finance to cover two decades of regimes for a robust seasonality check. The long sample lets us see whether month-end strength persists across different rate cycles." 96 | ] 97 | }, 98 | { 99 | "cell_type": "code", 100 | "execution_count": null, 101 | "id": "ea1aec3d", 102 | "metadata": {}, 103 | "outputs": [], 104 | "source": [ 105 | "tlt = yf.download(\"TLT\", start=\"2002-01-01\", end=\"2022-06-30\", auto_adjust=False)" 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "id": "e6770da8", 111 | "metadata": {}, 112 | "source": [ 113 | "With auto_adjust=False, yfinance returns a DataFrame with Open/High/Low/Close/Adj Close/Volume columns and a DatetimeIndex; recent releases default to auto_adjust=True, which drops the Adj Close column used in the next step. Using ETF data avoids survivorship bias from delisted bonds, though vendor adjustments can differ from your broker. Treat this as a quick scan; you can swap in official vendor data once the signal earns more work." 
114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "id": "9827125c", 119 | "metadata": {}, 120 | "source": [ 121 | "Compute daily log returns and attach calendar labels (day of month and year), then aggregate mean return by calendar day. This frames the problem as an event study across months." 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": null, 127 | "id": "d3c82fb2", 128 | "metadata": {}, 129 | "outputs": [], 130 | "source": [ 131 | "tlt[\"log_return\"] = np.log(tlt[\"Adj Close\"] / tlt[\"Adj Close\"].shift(1))\n", 132 | "tlt[\"day_of_month\"] = tlt.index.day\n", 133 | "tlt[\"year\"] = tlt.index.year" 134 | ] 135 | }, 136 | { 137 | "cell_type": "code", 138 | "execution_count": null, 139 | "id": "8d556bbf", 140 | "metadata": {}, 141 | "outputs": [], 142 | "source": [ 143 | "grouped_by_day = tlt.groupby(\"day_of_month\")[\"log_return\"].mean()" 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "id": "9188c11b", 149 | "metadata": {}, 150 | "source": [ 151 | "Log returns add across time and handle compounding cleanly, which makes later cumulation and comparisons less fragile. Grouping by the calendar day gives us a cross-month average without peeking into the future because each day uses only its own prior close. Short and long months contribute whatever days they have, so the average reflects actual trading calendars rather than an imposed template." 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "id": "d35e9436", 157 | "metadata": {}, 158 | "source": [ 159 | "## Define month-end strategy proxy" 160 | ] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "id": "bed72eac", 165 | "metadata": {}, 166 | "source": [ 167 | "Build a simple diagnostic spread: sum of returns in the last calendar week minus the first calendar week. It approximates buying into month-end strength and giving it back after the turn." 
168 | ] 169 | }, 170 | { 171 | "cell_type": "code", 172 | "execution_count": null, 173 | "id": "d4116ca8", 174 | "metadata": {}, 175 | "outputs": [], 176 | "source": [ 177 | "tlt[\"first_week_returns\"] = 0.0\n", 178 | "tlt.loc[tlt.day_of_month <= 7, \"first_week_returns\"] = tlt[\n", 179 | " tlt.day_of_month <= 7\n", 180 | "][\"log_return\"]" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": null, 186 | "id": "fc0de820", 187 | "metadata": {}, 188 | "outputs": [], 189 | "source": [ 190 | "tlt[\"last_week_returns\"] = 0.0\n", 191 | "tlt.loc[tlt.day_of_month >= 23, \"last_week_returns\"] = tlt[\n", 192 | " tlt.day_of_month >= 23\n", 193 | "][\"log_return\"]" 194 | ] 195 | }, 196 | { 197 | "cell_type": "code", 198 | "execution_count": null, 199 | "id": "43ceb859", 200 | "metadata": {}, 201 | "outputs": [], 202 | "source": [ 203 | "tlt[\"last_week_less_first_week\"] = (\n", 204 | " tlt[\"last_week_returns\"] - tlt[\"first_week_returns\"]\n", 205 | ")" 206 | ] 207 | }, 208 | { 209 | "cell_type": "markdown", 210 | "id": "eac825ae", 211 | "metadata": {}, 212 | "source": [ 213 | "This is not an executable PnL yet; it’s a clean indicator of relative strength around the boundary. The definitions (<=7 and >=23) sidestep exact month lengths and exchange holidays while keeping the effect focused. If the spread is positive on average and reasonably stable, it justifies deeper work with precise trading days and costs." 214 | ] 215 | }, 216 | { 217 | "cell_type": "markdown", 218 | "id": "52da0a2c", 219 | "metadata": {}, 220 | "source": [ 221 | "## Visualize results and check stability" 222 | ] 223 | }, 224 | { 225 | "cell_type": "markdown", 226 | "id": "df1cf7cc", 227 | "metadata": {}, 228 | "source": [ 229 | "Chart average log returns by calendar day to see where any excess clusters. Visual inspection quickly tells us whether the edge lives near the end or start." 
230 | ] 231 | }, 232 | { 233 | "cell_type": "code", 234 | "execution_count": null, 235 | "id": "559133b5", 236 | "metadata": {}, 237 | "outputs": [], 238 | "source": [ 239 | "grouped_by_day.plot.bar(\n", 240 | " title=\"Mean Log Returns by Calendar Day of Month\"\n", 241 | ")" 242 | ] 243 | }, 244 | { 245 | "cell_type": "markdown", 246 | "id": "6bb47041", 247 | "metadata": {}, 248 | "source": [ 249 | "Bars near the final trading days should stand out if flows lift prices into close, while early-month bars may dip if that strength mean-reverts. Plotting the average by day is the fastest way to validate the hypothesis before building a backtester. If nothing shows here, a more complex model probably will not rescue it." 250 | ] 251 | }, 252 | { 253 | "cell_type": "markdown", 254 | "id": "660e9ac8", 255 | "metadata": {}, 256 | "source": [ 257 | "Evaluate persistence with three views: yearly average of the spread, yearly cumulative contribution, and the full-sample cumulative path." 258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": null, 263 | "id": "5b75fe4e", 264 | "metadata": {}, 265 | "outputs": [], 266 | "source": [ 267 | "(\n", 268 | " tlt.groupby(\"year\")\n", 269 | " .last_week_less_first_week.mean()\n", 270 | " .plot.bar(title=\"Mean Log Strategy Returns by Year\")\n", 271 | ")" 272 | ] 273 | }, 274 | { 275 | "cell_type": "code", 276 | "execution_count": null, 277 | "id": "4a00d6d7", 278 | "metadata": {}, 279 | "outputs": [], 280 | "source": [ 281 | "(\n", 282 | " tlt.groupby(\"year\")\n", 283 | " .last_week_less_first_week.sum()\n", 284 | " .cumsum()\n", 285 | " .plot(title=\"Cumulative Sum of Returns By Year\")\n", 286 | ")" 287 | ] 288 | }, 289 | { 290 | "cell_type": "code", 291 | "execution_count": null, 292 | "id": "68cf155f", 293 | "metadata": {}, 294 | "outputs": [], 295 | "source": [ 296 | "tlt[\"last_week_less_first_week\"].cumsum().plot(\n", 297 | " title=\"Cumulative Sum of Returns By Day\"\n", 298 | ")" 299 | ] 300 | }, 
301 | { 302 | "cell_type": "code", 303 | "execution_count": null, 304 | "id": "c5c925de", 305 | "metadata": {}, 306 | "outputs": [], 307 | "source": [ 308 | "plt.show()" 309 | ] 310 | }, 311 | { 312 | "cell_type": "markdown", 313 | "id": "a04d6e08", 314 | "metadata": {}, 315 | "source": [ 316 | "Consistent positive bars by year indicate the behavior isn’t just a couple of lucky months. Yearly cumulative sums highlight which regimes drive or fade the edge, and the daily cumulative path reveals drawdowns you would have felt. We still ignore trading costs and ETF tracking error here; if the shape looks good, add slippage and calendars next." 317 | ] 318 | }, 319 | { 320 | "cell_type": "markdown", 321 | "id": "735e7b39", 322 | "metadata": {}, 323 | "source": [ 324 | "PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk." 325 | ] 326 | } 327 | ], 328 | "metadata": { 329 | "jupytext": { 330 | "cell_metadata_filter": "-all", 331 | "main_language": "python", 332 | "notebook_metadata_filter": "-all" 333 | } 334 | }, 335 | "nbformat": 4, 336 | "nbformat_minor": 5 337 | } 338 | -------------------------------------------------------------------------------- /a_simple_stat_arb_strategy_you_can_trade_now.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "44282607", 6 | "metadata": {}, 7 | "source": [ 8 | "
" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "8ca26c13", 14 | "metadata": {}, 15 | "source": [ 16 | "## Library installation\n", 17 | "Install the libraries we use so the notebook runs anywhere without manual setup." 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": null, 23 | "id": "909a1c11", 24 | "metadata": {}, 25 | "outputs": [], 26 | "source": [ 27 | "!pip install yfinance pandas numpy matplotlib seaborn statsmodels" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "id": "c88e4eb9", 33 | "metadata": {}, 34 | "source": [ 35 | "These are standard pip packages that cover data access (yfinance), manipulation (pandas, numpy), testing (statsmodels), and plotting (matplotlib, seaborn). If you use conda, you can install the same names via conda-forge to avoid binary build issues on some systems." 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "id": "5128b83a", 41 | "metadata": {}, 42 | "source": [ 43 | "## Imports and setup\n", 44 | "We use matplotlib and seaborn for plotting, numpy and pandas for array/DataFrame work, yfinance to fetch historical prices, and statsmodels’ coint for the Engle–Granger cointegration test." 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": null, 50 | "id": "530046a4", 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "import matplotlib.pyplot as plt\n", 55 | "import numpy as np\n", 56 | "import pandas as pd\n", 57 | "import seaborn as sns\n", 58 | "import yfinance as yf\n", 59 | "from statsmodels.tsa.stattools import coint" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "id": "b2af95ae", 65 | "metadata": {}, 66 | "source": [ 67 | "These imports set up a minimal stack that mirrors how professionals prototype stat-arb: get data, test the relationship, and visualize the spread. Keeping the toolset small helps us focus on the mechanics (stationarity and residuals) rather than juggling libraries." 
68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "id": "af07d50c", 73 | "metadata": {}, 74 | "source": [ 75 | "## Define universe and download data\n", 76 | "Define a related renewable-energy universe and pull daily closes, then put the columns in the same order as tickers and drop dates with missing values so the panel is aligned." 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": null, 82 | "id": "f1687414", 83 | "metadata": {}, 84 | "outputs": [], 85 | "source": [ 86 | "tickers = [\n", 87 | "    \"NEE\",\n", 88 | "    \"FSLR\",\n", 89 | "    \"ENPH\",\n", 90 | "    \"PLUG\",\n", 91 | "    \"BEP\",\n", 92 | "    \"AQN\",\n", 93 | "    \"PBW\",\n", 94 | "    \"FAN\",\n", 95 | "    \"ICLN\",\n", 96 | "]\n", 97 | "start_date = \"2014-01-01\"\n", 98 | "end_date = \"2015-01-01\"\n", 99 | "df = yf.download(\n", 100 | "    tickers,\n", 101 | "    start=start_date,\n", 102 | "    end=end_date,\n", 103 | "    auto_adjust=False,\n", 104 | ").Close\n", 105 | "df = df[tickers].dropna()" 106 | ] 107 | }, 108 | { 109 | "cell_type": "markdown", 110 | "id": "e68406a2", 111 | "metadata": {}, 112 | "source": [ 113 | "Starting with related names increases our odds of finding economic links that support a stable residual, rather than spurious correlation. Selecting df[tickers] keeps column positions in step with the tickers list, which matters because later cells index df by position and label results with tickers; yfinance is not guaranteed to return columns in the requested order. Dropping missing dates ensures the Engle–Granger test compares like-for-like observations, which protects your inference. The short window is for illustration; in practice, use longer histories and time splits to reduce sampling noise." 114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "id": "cad45526", 119 | "metadata": {}, 120 | "source": [ 121 | "## Test cointegration and select pairs\n", 122 | "Run Engle–Granger tests across all pairs to score stability (p-values) and keep candidates below a permissive threshold."
123 | ] 124 | }, 125 | { 126 | "cell_type": "code", 127 | "execution_count": null, 128 | "id": "4d7d2f86", 129 | "metadata": {}, 130 | "outputs": [], 131 | "source": [ 132 | "n = len(tickers)\n", 133 | "score_matrix = np.zeros((n, n))\n", 134 | "pvalue_matrix = np.ones((n, n))\n", 135 | "pairs = []\n", 136 | "for i in range(n):\n", 137 | "    for j in range(i + 1, n):\n", 138 | "        score, pval, _ = coint(df.iloc[:, i], df.iloc[:, j])\n", 139 | "        score_matrix[i, j] = score\n", 140 | "        pvalue_matrix[i, j] = pval\n", 141 | "        if pval < 0.10:\n", 142 | "            pairs.append((tickers[i], tickers[j]))" 143 | ] 144 | }, 145 | { 146 | "cell_type": "markdown", 147 | "id": "1e9e1748", 148 | "metadata": {}, 149 | "source": [ 150 | "Cointegration targets a stationary linear combination, which is what makes residual trading coherent when volatility shifts. A 10% screen is deliberately loose so we don’t overfit at this stage; we’ll rely on later standardization and validation to refine signals. In production research, account for multiple testing and re-estimate over rolling windows to keep parameters current." 151 | ] 152 | }, 153 | { 154 | "cell_type": "markdown", 155 | "id": "f7dfe0d3", 156 | "metadata": {}, 157 | "source": [ 158 | "Identify the strongest candidate (lowest p-value) and extract its two aligned price series for further analysis."
159 | ] 160 | }, 161 | { 162 | "cell_type": "code", 163 | "execution_count": null, 164 | "id": "c4b1274a", 165 | "metadata": {}, 166 | "outputs": [], 167 | "source": [ 168 | "mask = np.triu(np.ones_like(pvalue_matrix, dtype=bool), k=1)\n", 169 | "upper_vals = pvalue_matrix[mask]\n", 170 | "min_idx_flat = np.argmin(upper_vals)\n", 171 | "min_p = upper_vals[min_idx_flat]\n", 172 | "idx_pairs = np.column_stack(np.where(mask))\n", 173 | "i, j = idx_pairs[min_idx_flat]" 174 | ] 175 | }, 176 | { 177 | "cell_type": "code", 178 | "execution_count": null, 179 | "id": "8518ba7d", 180 | "metadata": {}, 181 | "outputs": [], 182 | "source": [ 183 | "S1, S2 = df.iloc[:, i], df.iloc[:, j]" 184 | ] 185 | }, 186 | { 187 | "cell_type": "markdown", 188 | "id": "5c3555e3", 189 | "metadata": {}, 190 | "source": [ 191 | "Picking the best-ranked pair gives us a concrete example to carry through the workflow. Remember that the Engle–Granger test depends on which series is regressed on the other, and we only tested one ordering per pair (the upper triangle), so it is worth rerunning the test with the order swapped. The tradable spread is order-dependent too, so the next refinement is to estimate a hedge ratio so the residual reflects the fitted linear relation rather than raw price levels. Treat this selection as a starting point to be confirmed out-of-sample." 192 | ] 193 | }, 194 | { 195 | "cell_type": "markdown", 196 | "id": "e6e585a9", 197 | "metadata": {}, 198 | "source": [ 199 | "## Standardize residuals and visualize signals\n", 200 | "Reconfirm the cointegration result on the chosen pair, then build a spread and a rolling z-score to make signals comparable over time."
201 | ] 202 | }, 203 | { 204 | "cell_type": "code", 205 | "execution_count": null, 206 | "id": "3d303266", 207 | "metadata": {}, 208 | "outputs": [], 209 | "source": [ 210 | "score, pvalue, _ = coint(S1, S2)\n", 211 | "print(\n", 212 | " f\"tickers with lowest p-value: {tickers[i]} x {tickers[j]}, p={pvalue}\"\n", 213 | ")\n", 214 | "spread = S1 - S2\n", 215 | "zscore = (\n", 216 | " spread - spread.rolling(21, min_periods=21).mean()\n", 217 | ") / spread.rolling(21, min_periods=21).std()" 218 | ] 219 | }, 220 | { 221 | "cell_type": "markdown", 222 | "id": "dcba250b", 223 | "metadata": {}, 224 | "source": [ 225 | "Standardizing with a 21-day rolling window avoids lookahead and puts the residual in “number of sigmas,” which makes entry and exit rules clean. The spread here is a simple difference; in practice, regress S1 on S2 and use residual = S1 − beta*S2 to respect scaling. Z-scores are portable across pairs and regimes, which helps when you later compare signals or batch-trade a basket." 226 | ] 227 | }, 228 | { 229 | "cell_type": "markdown", 230 | "id": "03c5308e", 231 | "metadata": {}, 232 | "source": [ 233 | "Visualize the cointegration p-values, masking non-candidates and the redundant lower triangle to see which names truly co-move in a stable way." 
234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": null, 239 | "id": "22e23505", 240 | "metadata": {}, 241 | "outputs": [], 242 | "source": [ 243 | "mask = (\n", 244 | " np.tril(np.ones_like(pvalue_matrix, dtype=bool))\n", 245 | " | (pvalue_matrix >= 0.10)\n", 246 | ")\n", 247 | "plt.figure(figsize=(8, 6))\n", 248 | "sns.heatmap(\n", 249 | " pvalue_matrix,\n", 250 | " mask=mask,\n", 251 | " xticklabels=tickers,\n", 252 | " yticklabels=tickers,\n", 253 | " cmap=\"RdYlGn_r\",\n", 254 | " annot=True,\n", 255 | " fmt=\".2f\",\n", 256 | " cbar=True,\n", 257 | " vmin=0,\n", 258 | " vmax=1,\n", 259 | ")\n", 260 | "plt.title(\"Cointegration Test p-value Matrix\")\n", 261 | "plt.show()" 262 | ] 263 | }, 264 | { 265 | "cell_type": "markdown", 266 | "id": "b00284a2", 267 | "metadata": {}, 268 | "source": [ 269 | "A heatmap helps us spot clusters of stable relationships within the renewable theme, which is more actionable than eyeballing charts. This is how we define a tradable subset before doing any signal tuning. Keep in mind that what looks good in one year may drift, so periodic re-tests are part of a robust workflow." 270 | ] 271 | }, 272 | { 273 | "cell_type": "markdown", 274 | "id": "5a4970e1", 275 | "metadata": {}, 276 | "source": [ 277 | "Plot the raw spread with simple bands and the standardized z-score so we can sketch an entry/exit plan around mean reversion." 
278 | ] 279 | }, 280 | { 281 | "cell_type": "code", 282 | "execution_count": null, 283 | "id": "a1a5dbb5", 284 | "metadata": {}, 285 | "outputs": [], 286 | "source": [ 287 | "fig, axes = plt.subplots(2, 1, figsize=(14, 8), sharex=True)\n", 288 | "axes[0].plot(spread.index, spread, label=\"Spread\")\n", 289 | "axes[0].axhline(spread.mean(), color=\"black\", linestyle=\"--\", lw=1)\n", 290 | "axes[0].axhline(\n", 291 | "    spread.mean() + spread.std(),\n", 292 | "    color=\"red\",\n", 293 | "    linestyle=\"--\",\n", 294 | "    lw=1,\n", 295 | ")\n", 296 | "axes[0].axhline(\n", 297 | "    spread.mean() - spread.std(),\n", 298 | "    color=\"green\",\n", 299 | "    linestyle=\"--\",\n", 300 | "    lw=1,\n", 301 | ")\n", 302 | "axes[0].set_ylabel(\"Spread\")\n", 303 | "axes[0].legend()\n", 304 | "axes[1].plot(zscore.index, zscore, label=\"Z-score\")\n", 305 | "axes[1].axhline(0, color=\"black\", linestyle=\"--\", lw=1)\n", 306 | "axes[1].axhline(1, color=\"red\", linestyle=\"--\", lw=1)\n", 307 | "axes[1].axhline(-1, color=\"green\", linestyle=\"--\", lw=1)\n", 308 | "axes[1].set_ylabel(\"Z-score\")\n", 309 | "axes[1].legend()\n", 310 | "plt.xlabel(\"Date\")\n", 311 | "plt.suptitle(f\"Spread and Rolling Z-score between {S1.name} and {S2.name}\")\n", 312 | "plt.tight_layout()\n", 313 | "plt.show()" 314 | ] 315 | }, 316 | { 317 | "cell_type": "markdown", 318 | "id": "07f264a9", 319 | "metadata": {}, 320 | "source": [ 321 | "Z-score bands make the rules explicit: for example, consider entries beyond ±1 and exits near 0, then evaluate with costs and borrow constraints. Treat these pictures as diagnostics; the next step is to backtest with rolling hedge-ratio estimation so bands reflect the true residual. Static thresholds are a useful baseline, but the edge lives in disciplined estimation and risk controls."
322 | ] 323 | }, 324 | { 325 | "cell_type": "markdown", 326 | "id": "f48eb190", 327 | "metadata": {}, 328 | "source": [ 329 | "PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk." 330 | ] 331 | } 332 | ], 333 | "metadata": { 334 | "jupytext": { 335 | "cell_metadata_filter": "-all", 336 | "main_language": "python", 337 | "notebook_metadata_filter": "-all" 338 | } 339 | }, 340 | "nbformat": 4, 341 | "nbformat_minor": 5 342 | } 343 | -------------------------------------------------------------------------------- /how_to_compute_volatility_6_ways_most_people_don’t_know.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "6e13693f", 6 | "metadata": {}, 7 | "source": [ 8 | "
" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "0a83b5ca", 14 | "metadata": {}, 15 | "source": [ 16 | "## Library installation" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "c76831c7", 22 | "metadata": {}, 23 | "source": [ 24 | "Install the runtime dependencies so this notebook runs anywhere without manual setup. We include yfinance, pandas, numpy, and matplotlib for data, arrays, and plots." 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": null, 30 | "id": "7c60b3cd", 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "!pip install yfinance pandas numpy matplotlib" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "id": "713da31a", 40 | "metadata": {}, 41 | "source": [ 42 | "We install pandas even if we do not import it directly because yfinance returns pandas DataFrames and Series. A single cell like this removes environment friction and makes the notebook portable. No special system packages are typically required for these libraries." 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "id": "aa3218b8", 48 | "metadata": {}, 49 | "source": [ 50 | "## Imports and setup" 51 | ] 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "id": "bd3c51b1", 56 | "metadata": {}, 57 | "source": [ 58 | "We import math for constants and square roots, numpy for fast elementwise log operations, yfinance to fetch OHLCV bars, and matplotlib.pyplot to plot the close and our volatility estimates." 
59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "id": "54637b7b", 65 | "metadata": {}, 66 | "outputs": [], 67 | "source": [ 68 | "import math" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "execution_count": null, 74 | "id": "b8369240", 75 | "metadata": {}, 76 | "outputs": [], 77 | "source": [ 78 | "import numpy as np\n", 79 | "import yfinance as yf\n", 80 | "import matplotlib.pyplot as plt" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "id": "129d715f", 86 | "metadata": {}, 87 | "source": [ 88 | "Keeping imports minimal helps focus on measurement rather than tooling. yfinance returns pandas objects, so we can use rolling windows without importing pandas explicitly. These are the only modules we need to build, compare, and visualize the estimators." 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "id": "f0a2b7ad", 94 | "metadata": {}, 95 | "source": [ 96 | "## Download OHLCV data for testing" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "id": "931aff52", 102 | "metadata": {}, 103 | "source": [ 104 | "Download daily OHLCV for AAPL over a multi-year span to feed each estimator with realistic bar information. This includes Open, High, Low, Close that capture intraday ranges and overnight gaps we care about." 105 | ] 106 | }, 107 | { 108 | "cell_type": "code", 109 | "execution_count": null, 110 | "id": "9c3198e4", 111 | "metadata": { 112 | "lines_to_next_cell": 1 113 | }, 114 | "outputs": [], 115 | "source": [ 116 | "data = yf.download(\"AAPL\", start=\"2017-01-01\", end=\"2022-06-30\")" 117 | ] 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "id": "2f8e82ab", 122 | "metadata": {}, 123 | "source": [ 124 | "Bar data avoids the close-only trap by retaining the range inside each session and the discontinuity between sessions. A longer sample improves stability of rolling statistics, but we will still see small-sample effects in short windows. 
You can swap the ticker or dates later to pressure-test how robust your sizing metric is across regimes." 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "id": "4dbcb5bc", 130 | "metadata": {}, 131 | "source": [ 132 | "## Define realized volatility estimators clearly" 133 | ] 134 | }, 135 | { 136 | "cell_type": "markdown", 137 | "id": "7f2b85af", 138 | "metadata": {}, 139 | "source": [ 140 | "Implement six realized volatility estimators on OHLC bars, all annualized with trading_periods (252 by default) and computed over trailing rolling windows that avoid lookahead by using only the current and earlier bars. Each function returns a Series indexed like the input; clean=True drops the warm-up NaNs." 141 | ] 142 | }, 143 | { 144 | "cell_type": "code", 145 | "execution_count": null, 146 | "id": "c69aa31c", 147 | "metadata": {}, 148 | "outputs": [], 149 | "source": [ 150 | "def standard_deviation(price_data, window=30, trading_periods=252, clean=True):\n", 151 | "    log_return = (price_data[\"Close\"] / price_data[\"Close\"].shift(1)).apply(np.log)\n", 152 | "\n", 153 | "    result = (\n", 154 | "        log_return.rolling(window=window, center=False).std()\n", 155 | "        * math.sqrt(trading_periods)\n", 156 | "    )\n", 157 | "\n", 158 | "    if clean:\n", 159 | "        return result.dropna()\n", 160 | "    else:\n", 161 | "        return result" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "id": "1da8695f", 168 | "metadata": {}, 169 | "outputs": [], 170 | "source": [ 171 | "def parkinson(price_data, window=30, trading_periods=252, clean=True):\n", 172 | "    rs = (1.0 / (4.0 * math.log(2.0))) * (\n", 173 | "        (price_data[\"High\"] / price_data[\"Low\"]).apply(np.log)\n", 174 | "    ) ** 2.0\n", 175 | "\n", 176 | "    def f(v):\n", 177 | "        return (trading_periods * v.mean()) ** 0.5\n", 178 | "\n", 179 | "    result = rs.rolling(window=window, center=False).apply(func=f)\n", 180 | "\n", 181 | "    if clean:\n", 182 | "        return result.dropna()\n", 183 | "    else:\n", 184 | "        return result" 185 | ] 186 |
}, 187 | { 188 | "cell_type": "code", 189 | "execution_count": null, 190 | "id": "84ba5f9f", 191 | "metadata": {}, 192 | "outputs": [], 193 | "source": [ 194 | "def garman_klass(price_data, window=30, trading_periods=252, clean=True):\n", 195 | "    log_hl = (price_data[\"High\"] / price_data[\"Low\"]).apply(np.log)\n", 196 | "    log_co = (price_data[\"Close\"] / price_data[\"Open\"]).apply(np.log)\n", 197 | "\n", 198 | "    rs = 0.5 * log_hl ** 2 - (2 * math.log(2) - 1) * log_co ** 2\n", 199 | "\n", 200 | "    def f(v):\n", 201 | "        return (trading_periods * v.mean()) ** 0.5\n", 202 | "\n", 203 | "    result = rs.rolling(window=window, center=False).apply(func=f)\n", 204 | "\n", 205 | "    if clean:\n", 206 | "        return result.dropna()\n", 207 | "    else:\n", 208 | "        return result" 209 | ] 210 | }, 211 | { 212 | "cell_type": "code", 213 | "execution_count": null, 214 | "id": "ad975476", 215 | "metadata": {}, 216 | "outputs": [], 217 | "source": [ 218 | "def hodges_tompkins(price_data, window=30, trading_periods=252, clean=True):\n", 219 | "    log_return = (price_data[\"Close\"] / price_data[\"Close\"].shift(1)).apply(np.log)\n", 220 | "\n", 221 | "    vol = (\n", 222 | "        log_return.rolling(window=window, center=False).std()\n", 223 | "        * math.sqrt(trading_periods)\n", 224 | "    )\n", 225 | "\n", 226 | "    h = window\n", 227 | "    n = (log_return.count() - h) + 1\n", 228 | "\n", 229 | "    adj_factor = 1.0 / (\n", 230 | "        1.0 - (h / n) + ((h ** 2 - 1) / (3 * (n ** 2)))\n", 231 | "    )\n", 232 | "\n", 233 | "    result = vol * adj_factor\n", 234 | "\n", 235 | "    if clean:\n", 236 | "        return result.dropna()\n", 237 | "    else:\n", 238 | "        return result" 239 | ] 240 | }, 241 | { 242 | "cell_type": "code", 243 | "execution_count": null, 244 | "id": "0f984959", 245 | "metadata": {}, 246 | "outputs": [], 247 | "source": [ 248 | "def rogers_satchell(price_data, window=30, trading_periods=252, clean=True):\n", 249 | "    log_ho = (price_data[\"High\"] / price_data[\"Open\"]).apply(np.log)\n", 250 | "    log_lo = 
(price_data[\"Low\"] / price_data[\"Open\"]).apply(np.log)\n", 251 | "    log_co = (price_data[\"Close\"] / price_data[\"Open\"]).apply(np.log)\n", 252 | "\n", 253 | "    rs = log_ho * (log_ho - log_co) + log_lo * (log_lo - log_co)\n", 254 | "\n", 255 | "    def f(v):\n", 256 | "        return (trading_periods * v.mean()) ** 0.5\n", 257 | "\n", 258 | "    result = rs.rolling(window=window, center=False).apply(func=f)\n", 259 | "\n", 260 | "    if clean:\n", 261 | "        return result.dropna()\n", 262 | "    else:\n", 263 | "        return result" 264 | ] 265 | }, 266 | { 267 | "cell_type": "code", 268 | "execution_count": null, 269 | "id": "03add9a4", 270 | "metadata": { 271 | "lines_to_next_cell": 1 272 | }, 273 | "outputs": [], 274 | "source": [ 275 | "def yang_zhang(price_data, window=30, trading_periods=252, clean=True):\n", 276 | "    log_ho = (price_data[\"High\"] / price_data[\"Open\"]).apply(np.log)\n", 277 | "    log_lo = (price_data[\"Low\"] / price_data[\"Open\"]).apply(np.log)\n", 278 | "    log_co = (price_data[\"Close\"] / price_data[\"Open\"]).apply(np.log)\n", 279 | "\n", 280 | "    log_oc = (price_data[\"Open\"] / price_data[\"Close\"].shift(1)).apply(np.log)\n", 281 | "    log_oc_sq = log_oc ** 2\n", 282 | "\n", 283 | "    log_cc = (price_data[\"Close\"] / price_data[\"Close\"].shift(1)).apply(np.log)\n", 284 | "    log_cc_sq = log_cc ** 2\n", 285 | "\n", 286 | "    rs = log_ho * (log_ho - log_co) + log_lo * (log_lo - log_co)\n", 287 | "\n", 288 | "    close_vol = (\n", 289 | "        log_cc_sq.rolling(window=window, center=False).sum()\n", 290 | "        * (1.0 / (window - 1.0))\n", 291 | "    )\n", 292 | "    open_vol = (\n", 293 | "        log_oc_sq.rolling(window=window, center=False).sum()\n", 294 | "        * (1.0 / (window - 1.0))\n", 295 | "    )\n", 296 | "    window_rs = (\n", 297 | "        rs.rolling(window=window, center=False).sum()\n", 298 | "        * (1.0 / (window - 1.0))\n", 299 | "    )\n", 300 | "\n", 301 | "    k = 0.34 / (1.34 + (window + 1) / (window - 1))\n", 302 | "    result = (\n", 303 | "        (open_vol + k * close_vol + (1 - k) * 
window_rs).apply(np.sqrt)\n", 304 | "        * math.sqrt(trading_periods)\n", 305 | "    )\n", 306 | "\n", 307 | "    if clean:\n", 308 | "        return result.dropna()\n", 309 | "    else:\n", 310 | "        return result" 311 | ] 312 | }, 313 | { 314 | "cell_type": "markdown", 315 | "id": "b816e969", 316 | "metadata": {}, 317 | "source": [ 318 | "standard_deviation is the close-to-close baseline; hodges_tompkins corrects the downward bias that overlapping windows introduce in small samples. parkinson and garman_klass use high–low (and open–close) ranges for higher efficiency; rogers_satchell is drift-robust, and yang_zhang blends the overnight gap with the intraday range to handle close-to-open discontinuities. These options let us see how bar-based measures react to spikes and gaps that closes miss, then pick a stable sizing input." 319 | ] 320 | }, 321 | { 322 | "cell_type": "markdown", 323 | "id": "ff3b9faf", 324 | "metadata": {}, 325 | "source": [ 326 | "## Visualize and compare volatility estimates" 327 | ] 328 | }, 329 | { 330 | "cell_type": "markdown", 331 | "id": "12b5e0ec", 332 | "metadata": {}, 333 | "source": [ 334 | "Plot the close and all six estimators together to inspect when they agree and when they diverge around gaps, trend days, and stress. We care about relative timing and spikes more than exact levels on this axis."
335 | ] 336 | }, 337 | { 338 | "cell_type": "code", 339 | "execution_count": null, 340 | "id": "80a22b2b", 341 | "metadata": {}, 342 | "outputs": [], 343 | "source": [ 344 | "data[\"Close\"].plot()\n", 345 | "standard_deviation(data).plot()\n", 346 | "parkinson(data).plot()\n", 347 | "garman_klass(data).plot()\n", 348 | "hodges_tompkins(data).plot()\n", 349 | "rogers_satchell(data).plot()\n", 350 | "yang_zhang(data).plot()\n", 351 | "plt.show()" 352 | ] 353 | }, 354 | { 355 | "cell_type": "markdown", 356 | "id": "674b9d94", 357 | "metadata": {}, 358 | "source": [ 359 | "The curves will not share units with price, so focus on how each estimator jumps and mean-reverts when regimes shift. In practice we would validate a candidate metric by comparing it to realized next-day moves or high–low ranges and then size positions off the most stable forecast. Watch for cases where close-vol looks calm while range-based measures surge; that mismatch is the failure mode we want to eliminate." 360 | ] 361 | }, 362 | { 363 | "cell_type": "markdown", 364 | "id": "1f208f8b", 365 | "metadata": {}, 366 | "source": [ 367 | "PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk." 
368 | ] 369 | } 370 | ], 371 | "metadata": { 372 | "jupytext": { 373 | "cell_metadata_filter": "-all", 374 | "main_language": "python", 375 | "notebook_metadata_filter": "-all" 376 | } 377 | }, 378 | "nbformat": 4, 379 | "nbformat_minor": 5 380 | } 381 | -------------------------------------------------------------------------------- /how_to_simulate_stock_prices_with_python.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "7f8b3f16", 6 | "metadata": {}, 7 | "source": [ 8 | "
" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "616a6ec6", 14 | "metadata": {}, 15 | "source": [ 16 | "## Library installation" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "b5f1d0c5", 22 | "metadata": {}, 23 | "source": [ 24 | "Install numpy and matplotlib so the notebook can simulate and plot GBM paths. This ensures a clean, reproducible environment regardless of your local setup." 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": null, 30 | "id": "51fab65b", 31 | "metadata": {}, 32 | "outputs": [], 33 | "source": [ 34 | "!pip install numpy matplotlib" 35 | ] 36 | }, 37 | { 38 | "cell_type": "markdown", 39 | "id": "6bbacdc5", 40 | "metadata": {}, 41 | "source": [ 42 | "If you are running in an environment that already ships these libraries (such as Colab or some IDEs), this command will simply confirm versions. Keeping dependencies minimal lets us focus on modeling choices like time-step units and volatility scaling rather than environment issues." 43 | ] 44 | }, 45 | { 46 | "cell_type": "markdown", 47 | "id": "33bfb5d0", 48 | "metadata": {}, 49 | "source": [ 50 | "## Imports and setup" 51 | ] 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "id": "2529b9a9", 56 | "metadata": {}, 57 | "source": [ 58 | "We use numpy for fast vectorized arrays and random draws, and matplotlib.pyplot for quick visual checks of simulated price paths." 59 | ] 60 | }, 61 | { 62 | "cell_type": "code", 63 | "execution_count": null, 64 | "id": "37edd0c7", 65 | "metadata": {}, 66 | "outputs": [], 67 | "source": [ 68 | "import numpy as np\n", 69 | "import matplotlib.pyplot as plt" 70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "id": "54cf882a", 75 | "metadata": {}, 76 | "source": [ 77 | "Visual checks are a core part of simulation work because they surface unit mismatches early. 
If you want reproducible runs later, you can set a seed with numpy’s RNG before calling any functions; reproducibility is essential when diagnosing differences across parameter choices." 78 | ] 79 | }, 80 | { 81 | "cell_type": "markdown", 82 | "id": "3a6efa61", 83 | "metadata": {}, 84 | "source": [ 85 | "## Define simulation parameters and units" 86 | ] 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "id": "a0598b63", 91 | "metadata": {}, 92 | "source": [ 93 | "Set the initial price, annualized volatility, and annualized drift, then define the Monte Carlo grid (paths, daily time step, total horizon). Using delta = 1/252 converts annual inputs into daily increments, and time = 252*5 targets a five-year window." 94 | ] 95 | }, 96 | { 97 | "cell_type": "code", 98 | "execution_count": null, 99 | "id": "0eff7483", 100 | "metadata": { 101 | "lines_to_next_cell": 1 102 | }, 103 | "outputs": [], 104 | "source": [ 105 | "s0 = 131.00\n", 106 | "sigma = 0.25\n", 107 | "mu = 0.35\n", 108 | "paths = 1000\n", 109 | "delta = 1.0 / 252.0\n", 110 | "time = 252 * 5" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "id": "28cf4ef0", 116 | "metadata": {}, 117 | "source": [ 118 | "Locking units up front prevents the classic mistake of feeding annual parameters into daily steps without the proper scaling. Consistent definitions make results interpretable and comparable as you vary drift or volatility. paths controls Monte Carlo precision, so increasing it is how we check convergence of any statistic we care about." 119 | ] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | "id": "1e2a69e4", 124 | "metadata": {}, 125 | "source": [ 126 | "## Build GBM and helper functions" 127 | ] 128 | }, 129 | { 130 | "cell_type": "markdown", 131 | "id": "17268d20", 132 | "metadata": {}, 133 | "source": [ 134 | "Generate Brownian shocks whose variance matches the chosen time step and whose scale reflects the (annualized) volatility. 
Each column is an independent price path, and rows step forward in time." 135 | ] 136 | }, 137 | { 138 | "cell_type": "code", 139 | "execution_count": null, 140 | "id": "a64a8fe8", 141 | "metadata": { 142 | "lines_to_next_cell": 1 143 | }, 144 | "outputs": [], 145 | "source": [ 146 | "def wiener_process(delta, sigma, time, paths):\n", 147 | "    \"\"\"Returns sigma-scaled Wiener process increments.\n", 148 | "\n", 149 | "    Parameters\n", 150 | "    ----------\n", 151 | "    delta : float\n", 152 | "        Time step in years (e.g. 1/252 for one trading day).\n", 153 | "    sigma : float\n", 154 | "        Annualized volatility as a decimal (0.25 = 25%).\n", 155 | "    time : int\n", 156 | "        Number of time steps to simulate.\n", 157 | "    paths : int\n", 158 | "        Number of price simulations to create.\n", 159 | "\n", 160 | "    Returns\n", 161 | "    -------\n", 162 | "    wiener_process : np.ndarray\n", 163 | "\n", 164 | "    Notes\n", 165 | "    -----\n", 166 | "    The increments are independent normal draws with standard deviation\n", 167 | "    sigma * sqrt(delta). The Wiener process is also called Brownian motion.\n", 168 | "    For more information about the Wiener process check out the Wikipedia page:\n", 169 | "    http://en.wikipedia.org/wiki/Wiener_process\n", 170 | "    \"\"\"\n", 171 | "    return sigma * np.random.normal(\n", 172 | "        loc=0,\n", 173 | "        scale=np.sqrt(delta),\n", 174 | "        size=(time, paths),\n", 175 | "    )" 176 | ] 177 | }, 178 | { 179 | "cell_type": "markdown", 180 | "id": "dffd23fa", 181 | "metadata": {}, 182 | "source": [ 183 | "Scaling the normal draws by sqrt(delta) gives each increment variance delta, and multiplying by sigma turns that into a per-step standard deviation of sigma*sqrt(delta), which is how the annualized volatility is brought down to the daily level. Treating paths as independent is the usual baseline for Monte Carlo pricing and is well suited to sandbox learning. Vectorized generation keeps the simulation fast enough to run thousands of paths interactively."
 184 | ] 185 | }, 186 | { 187 | "cell_type": "markdown", 188 | "id": "0dd3567d", 189 | "metadata": {}, 190 | "source": [ 191 | "Convert Brownian shocks into multiplicative GBM returns using the closed-form step of a constant-drift, constant-vol process. The (mu - 0.5*sigma**2)*delta term applies the Itô correction so that the expected gross return per step is exp(mu*delta), matching the specified drift; the expected log return is the smaller (mu - 0.5*sigma**2)*delta." 192 | ] 193 | }, 194 | { 195 | "cell_type": "code", 196 | "execution_count": null, 197 | "id": "60616bb8", 198 | "metadata": { 199 | "lines_to_next_cell": 1 200 | }, 201 | "outputs": [], 202 | "source": [ 203 | "def gbm_returns(delta, sigma, time, mu, paths):\n", 204 | " \"\"\"Gross per-step returns from a Geometric Brownian Motion (GBM).\n", 205 | "\n", 206 | " Parameters\n", 207 | " ----------\n", 208 | " delta : float\n", 209 | " Time step in years (e.g. 1/252 for one trading day).\n", 210 | " sigma : float\n", 211 | " Annualized volatility as a decimal (0.25 = 25%).\n", 212 | " time : int\n", 213 | " Number of time steps to simulate.\n", 214 | " mu : float\n", 215 | " Annualized drift as a decimal (0.35 = 35%).\n", 216 | " paths : int\n", 217 | " Number of price simulations to create.\n", 218 | "\n", 219 | " Returns\n", 220 | " -------\n", 221 | " gbm_returns : np.ndarray\n", 222 | "\n", 223 | " Notes\n", 224 | " -----\n", 225 | " This method constructs random Geometric Brownian Motion (GBM) returns.\n", 226 | " \"\"\"\n", 227 | " process = wiener_process(delta, sigma, time, paths)\n", 228 | " return np.exp(process + (mu - sigma**2 / 2) * delta)" 229 | ] 230 | }, 231 | { 232 | "cell_type": "markdown", 233 | "id": "259d0647", 234 | "metadata": {}, 235 | "source": [ 236 | "Many beginners omit the 0.5*sigma**2 term and end up with biased levels that drift too high. Keeping drift and volatility annualized while applying delta per step is the key to unit consistency. This block encodes those rules in one place so we can change parameters confidently."
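To see what the Itô term does, this sketch compares one-year growth factors with and without it. The parameter values mirror the ones set earlier; the variable names, the seed, and the path count are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for the illustration
mu, sigma, delta, steps, paths = 0.35, 0.25, 1.0 / 252.0, 252, 20000

# Sum of per-step shocks over one year for each path.
shock_sum = (sigma * rng.normal(0.0, np.sqrt(delta), size=(steps, paths))).sum(axis=0)
horizon = steps * delta  # one year

# Terminal growth factor S_T / S_0 with and without the Ito correction.
with_correction = np.exp(shock_sum + (mu - 0.5 * sigma**2) * horizon)
without_correction = np.exp(shock_sum + mu * horizon)

target = np.exp(mu * horizon)  # growth implied by the drift alone
print(target, with_correction.mean(), without_correction.mean())
```

With the correction, the average growth factor sits close to exp(mu); without it, the average is inflated by roughly exp(0.5*sigma**2), which is the upward bias described above.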
237 | ] 238 | }, 239 | { 240 | "cell_type": "markdown", 241 | "id": "b10c5f55", 242 | "metadata": {}, 243 | "source": [ 244 | "Compound the simulated returns from the initial price to produce level paths while ensuring strictly positive prices. We prepend ones so the first row represents the starting point before any move." 245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "execution_count": null, 250 | "id": "ea7a7279", 251 | "metadata": { 252 | "lines_to_next_cell": 1 253 | }, 254 | "outputs": [], 255 | "source": [ 256 | "def gbm_levels(s0, delta, sigma, time, mu, paths):\n", 257 | " \"\"\"Returns price paths starting at s0.\n", 258 | "\n", 259 | " Parameters\n", 260 | " ----------\n", 261 | " s0 : float\n", 262 | " The starting stock price.\n", 263 | " delta : float\n", 264 | " The increment to downsample sigma.\n", 265 | " sigma : float\n", 266 | " Percentage volatility.\n", 267 | " time : int\n", 268 | " Number of samples to create.\n", 269 | " mu : float\n", 270 | " Percentage drift.\n", 271 | " paths : int\n", 272 | " Number of price simulations to create.\n", 273 | "\n", 274 | " Returns\n", 275 | " -------\n", 276 | " gbm_levels : np.ndarray\n", 277 | " \"\"\"\n", 278 | " returns = gbm_returns(delta, sigma, time, mu, paths)\n", 279 | " stacked = np.vstack([np.ones(paths), returns])\n", 280 | " return s0 * stacked.cumprod(axis=0)" 281 | ] 282 | }, 283 | { 284 | "cell_type": "markdown", 285 | "id": "280c57fd", 286 | "metadata": {}, 287 | "source": [ 288 | "Working with levels is how option desks and risk teams visualize scenarios and payoffs. GBM’s multiplicative compounding matches how real prices behave and avoids negative levels that plague additive models. Including the initial price makes plots and summary stats easier to interpret." 
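One way to summarize level paths is with percentile bands across paths at each time step. This sketch rebuilds a small set of paths inline so that it runs on its own, rather than calling the notebook's functions; the seed and the choice of the 5th, 50th, and 95th percentiles are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed for the illustration
s0, mu, sigma, delta, steps, paths = 131.0, 0.35, 0.25, 1.0 / 252.0, 252, 2000

# Same construction as above: shocks -> gross returns -> compounded levels.
shocks = sigma * rng.normal(0.0, np.sqrt(delta), size=(steps, paths))
returns = np.exp(shocks + (mu - 0.5 * sigma**2) * delta)
levels = s0 * np.vstack([np.ones(paths), returns]).cumprod(axis=0)

# Percentiles across paths (axis=1) give one band value per time step.
bands = np.percentile(levels, [5, 50, 95], axis=1)
print(levels.shape, bands.shape)
```

Because the first row is the prepended row of ones, every band starts exactly at `s0`, and the bands widen as time moves forward.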
 289 | ] 290 | }, 291 | { 292 | "cell_type": "markdown", 293 | "id": "fa785bd6", 294 | "metadata": {}, 295 | "source": [ 296 | "## Run scenarios and visualize paths" 297 | ] 298 | }, 299 | { 300 | "cell_type": "markdown", 301 | "id": "88ee0f93", 302 | "metadata": {}, 303 | "source": [ 304 | "Simulate paths with a positive drift and count how many finish above the starting price as a quick sanity check. In a notebook the last expression is displayed automatically, and the count gives a rough sense of the distribution’s tilt under positive drift." 305 | ] 306 | }, 307 | { 308 | "cell_type": "code", 309 | "execution_count": null, 310 | "id": "035fa3f7", 311 | "metadata": {}, 312 | "outputs": [], 313 | "source": [ 314 | "price_paths = gbm_levels(s0, delta, sigma, time, mu, paths)\n", 315 | "len(price_paths[-1, price_paths[-1, :] > s0])" 316 | ] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "id": "b2caad37", 321 | "metadata": {}, 322 | "source": [ 323 | "This quick diagnostic connects drift to long-run tendency without overfitting historical quirks. As you increase paths, the fraction finishing above s0 (this count divided by paths) should stabilize, which is a simple convergence check. It’s a habit pros use before trusting any simulated payoff estimates." 324 | ] 325 | }, 326 | { 327 | "cell_type": "markdown", 328 | "id": "3735f237", 329 | "metadata": {}, 330 | "source": [ 331 | "Plot the simulated paths to visually verify unit choices and compounding behavior. Thin lines help us inspect many paths at once." 332 | ] 333 | }, 334 | { 335 | "cell_type": "code", 336 | "execution_count": null, 337 | "id": "c5c5ff0d", 338 | "metadata": {}, 339 | "outputs": [], 340 | "source": [ 341 | "plt.plot(price_paths, linewidth=0.25)\n", 342 | "plt.show()" 343 | ] 344 | }, 345 | { 346 | "cell_type": "markdown", 347 | "id": "5473ae6d", 348 | "metadata": {}, 349 | "source": [ 350 | "Early plots catch mistakes like mismatched time steps or mis-scaled volatility before you burn time on strategy code. 
Visual checks pair with seeded randomness to make discrepancies obvious when you tweak parameters. Treat these figures as fast feedback loops, not final analyses." 351 | ] 352 | }, 353 | { 354 | "cell_type": "markdown", 355 | "id": "c0500dc0", 356 | "metadata": {}, 357 | "source": [ 358 | "Rerun the engine with zero drift to isolate volatility’s effect and compare to the positive-drift case. Visualizing both back-to-back reinforces how drift sets direction while volatility controls dispersion." 359 | ] 360 | }, 361 | { 362 | "cell_type": "code", 363 | "execution_count": null, 364 | "id": "ce37b9f0", 365 | "metadata": {}, 366 | "outputs": [], 367 | "source": [ 368 | "price_paths = gbm_levels(s0, delta, sigma, time, 0.0, paths)\n", 369 | "plt.plot(price_paths, linewidth=0.25)\n", 370 | "plt.show()" 371 | ] 372 | }, 373 | { 374 | "cell_type": "markdown", 375 | "id": "73180f90", 376 | "metadata": {}, 377 | "source": [ 378 | "A zero-drift scenario is a common baseline in pricing and risk; it mirrors a risk-neutral setting with a zero interest rate, which is useful for building intuition. The expected price stays at s0, but log prices drift down at 0.5*sigma**2 per year (about 3% here), so the median path falls while level paths remain positively skewed due to compounding. Use this contrast to validate that your step size and scaling behave as expected." 379 | ] 380 | }, 381 | { 382 | "cell_type": "markdown", 383 | "id": "6355b4a7", 384 | "metadata": {}, 385 | "source": [ 386 | "PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk."
387 | ] 388 | }, 389 | { 390 | "cell_type": "code", 391 | "execution_count": null, 392 | "id": "f4831cb7-5a54-442f-a08d-2ede52162306", 393 | "metadata": {}, 394 | "outputs": [], 395 | "source": [] 396 | } 397 | ], 398 | "metadata": { 399 | "jupytext": { 400 | "cell_metadata_filter": "-all", 401 | "main_language": "python", 402 | "notebook_metadata_filter": "-all" 403 | }, 404 | "kernelspec": { 405 | "display_name": "Python 3 (ipykernel)", 406 | "language": "python", 407 | "name": "python3" 408 | }, 409 | "language_info": { 410 | "codemirror_mode": { 411 | "name": "ipython", 412 | "version": 3 413 | }, 414 | "file_extension": ".py", 415 | "mimetype": "text/x-python", 416 | "name": "python", 417 | "nbconvert_exporter": "python", 418 | "pygments_lexer": "ipython3", 419 | "version": "3.12.9" 420 | } 421 | }, 422 | "nbformat": 4, 423 | "nbformat_minor": 5 424 | } 425 | -------------------------------------------------------------------------------- /value_at_risk_manage_your_portfolios_risk.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "6771ed6e", 6 | "metadata": {}, 7 | "source": [ 8 | "
" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "0898293c", 14 | "metadata": {}, 15 | "source": [ 16 | "## Library installation\n", 17 | "Install the required Python packages so this notebook can fetch data, compute statistics, and plot results anywhere you run it. This ensures a clean, reproducible setup for the VaR workflow." 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": null, 23 | "id": "e679c8aa", 24 | "metadata": {}, 25 | "outputs": [], 26 | "source": [ 27 | "!pip install yfinance pandas numpy matplotlib scipy" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "id": "819fbfb4", 33 | "metadata": {}, 34 | "source": [ 35 | "yfinance pulls prices, pandas and numpy handle the tabular math, scipy provides the normal quantile, and matplotlib draws the chart. Running this cell once per environment avoids import errors later. If you are in a locked environment, confirm these are preinstalled and skip the install." 36 | ] 37 | }, 38 | { 39 | "cell_type": "markdown", 40 | "id": "58037a61", 41 | "metadata": {}, 42 | "source": [ 43 | "## Imports and setup\n", 44 | "We import pandas for DataFrame operations, numpy for vectorized linear algebra, scipy.stats.norm for the normal quantile, matplotlib.pyplot for plotting, and yfinance to download prices." 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": 1, 50 | "id": "f6ea1df0", 51 | "metadata": {}, 52 | "outputs": [], 53 | "source": [ 54 | "import pandas as pd\n", 55 | "import numpy as np\n", 56 | "from scipy.stats import norm\n", 57 | "import matplotlib.pyplot as plt\n", 58 | "import yfinance as yf" 59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "id": "aebdea5c", 64 | "metadata": {}, 65 | "source": [ 66 | "This small stack is enough to replicate a desk-style daily VaR check without heavy infrastructure. Keeping versions stable helps avoid small numerical drift across runs. 
In production, it’s good practice to pin versions and log them alongside your daily risk output." 67 | ] 68 | }, 69 | { 70 | "cell_type": "markdown", 71 | "id": "1ff69b72", 72 | "metadata": {}, 73 | "source": [ 74 | "## Define portfolio and fetch prices\n", 75 | "Set the portfolio composition, capital base, and tail probability, then download daily closes with yfinance over a fixed window. This establishes what risk we measure and provides the price history needed for returns." 76 | ] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": 4, 81 | "id": "1b72b58f", 82 | "metadata": {}, 83 | "outputs": [], 84 | "source": [ 85 | "tickers = [\"AAPL\", \"META\", \"C\", \"DIS\"]\n", 86 | "weights = np.array([0.25, 0.3, 0.15, 0.3])\n", 87 | "portfolio_value = 1_000\n", 88 | "confidence = 0.05" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 5, 94 | "id": "7e06b999", 95 | "metadata": {}, 96 | "outputs": [ 97 | { 98 | "name": "stderr", 99 | "output_type": "stream", 100 | "text": [ 101 | "/var/folders/6m/0ykpdmvn3lb5qkq15hk3s11c0000gn/T/ipykernel_22291/2418303877.py:1: FutureWarning: YF.download() has changed argument auto_adjust default to True\n", 102 | " data = yf.download(\n", 103 | "[*********************100%***********************] 4 of 4 completed\n" 104 | ] 105 | } 106 | ], 107 | "source": [ 108 | "data = yf.download(\n", 109 | " tickers, start=\"2018-01-01\", end=\"2021-12-31\"\n", 110 | ")[\"Close\"]" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "id": "7334e700", 116 | "metadata": {}, 117 | "source": [ 118 | "Weights should sum to one, capital sets the dollar scale, and the tail probability of 0.05 (stored in the variable confidence, often written as alpha) targets the left 5% tail for a 95% one-sided check. Always confirm symbols and data quality; FB is now META, and adjusted closes are preferable when accounting for splits and dividends. As the warning in the output notes, yf.download now defaults to auto_adjust=True, so the Close column here is already adjusted. In a daily workflow, pull a recent window and rerun this block each morning before sizing positions."
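The weight checks described above can be made explicit. This is a minimal sketch, assuming the downloaded price columns may come back in a different order than the requested tickers; the `prices` frame is synthetic stand-in data and the labelled `weights` Series is an illustrative alternative to the notebook's plain array.

```python
import numpy as np
import pandas as pd

tickers = ["AAPL", "META", "C", "DIS"]
# Labelled weights: each weight travels with its ticker instead of its position.
weights = pd.Series([0.25, 0.3, 0.15, 0.3], index=tickers)
assert np.isclose(weights.sum(), 1.0), "weights should sum to one"

# Synthetic stand-in for downloaded prices, with columns sorted alphabetically.
rng = np.random.default_rng(5)
prices = pd.DataFrame(
    100.0 + rng.normal(0.0, 1.0, size=(5, 4)).cumsum(axis=0),
    columns=sorted(tickers),
)

# Reorder the weights to match the price columns before any dot product.
aligned = weights.reindex(prices.columns)
print(aligned)
```

Aligning by label means a dot product between returns and weights pairs each asset with its intended weight, whatever order the data arrives in.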
 119 | ] 120 | }, 121 | { 122 | "cell_type": "markdown", 123 | "id": "0b283f80", 124 | "metadata": {}, 125 | "source": [ 126 | "## Compute returns and estimate moments\n", 127 | "Convert prices to daily percentage returns, estimate each asset’s mean and the covariance matrix, and combine them with weights to get portfolio drift and volatility. Express both in fraction and dollar terms to prepare for a dollar VaR." 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": 6, 133 | "id": "dc249469", 134 | "metadata": {}, 135 | "outputs": [], 136 | "source": [ 137 | "returns = data.pct_change()\n", 138 | "mean_returns = returns.mean()\n", 139 | "port_mean = mean_returns.dot(weights)\n", 140 | "investment_mean = (1 + port_mean) * portfolio_value\n", 141 | "cov_matrix = returns.cov()\n", 142 | "port_stdev = np.sqrt(weights.T.dot(cov_matrix).dot(weights))\n", 143 | "investment_stdev = portfolio_value * port_stdev" 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "id": "57c1b295", 149 | "metadata": {}, 150 | "source": [ 151 | "The covariance captures co-movement, which is the key upgrade over eyeballing single names. A recent rolling window (about 30–60 trading days for equities) keeps this estimate current; long samples can leave your risk estimate stale. Because weights is a plain array, it is applied by position, so check that the columns of data are in the same order as tickers; if they are not, reorder with data[tickers] before computing returns. Drift is small day to day, so many teams ignore it for speed, but we include it so you can see the full parametric form." 152 | ] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "id": "596f76b5", 157 | "metadata": {}, 158 | "source": [ 159 | "## Calculate VaR and scale horizon\n", 160 | "Map the left tail of the normal distribution to a dollar loss using norm.ppf with the tail probability (confidence), the mean, and the dollar volatility. Convert that threshold into a one-day VaR and extend it across multiple days under the square-root-of-time rule."
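The mapping described above can be sketched with round numbers. This example swaps in the standard library's `statistics.NormalDist` for scipy's `norm.ppf` so that it runs without extra packages; the daily mean of 0.1% and daily volatility of 1.7% are illustrative values, not estimates from the downloaded data.

```python
from statistics import NormalDist

import numpy as np

portfolio_value = 1_000
alpha = 0.05          # left-tail probability (the notebook calls this `confidence`)
port_mean = 0.001     # illustrative daily mean return
port_stdev = 0.017    # illustrative daily volatility

investment_mean = (1 + port_mean) * portfolio_value
investment_stdev = port_stdev * portfolio_value

# 5th percentile of tomorrow's portfolio value under the normal assumption.
cutoff = NormalDist(mu=investment_mean, sigma=investment_stdev).inv_cdf(alpha)
var_1d = portfolio_value - cutoff

# Square-root-of-time scaling for horizons of 1 to 30 days.
var_curve = var_1d * np.sqrt(np.arange(1, 31))
print(round(var_1d, 2), round(float(var_curve[-1]), 2))
```

The one-day figure equals 1.645 times the dollar volatility minus the expected dollar gain, and the four-day figure is exactly twice the one-day figure under this scaling.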
161 | ] 162 | }, 163 | { 164 | "cell_type": "code", 165 | "execution_count": 7, 166 | "id": "9c073a48", 167 | "metadata": {}, 168 | "outputs": [ 169 | { 170 | "name": "stdout", 171 | "output_type": "stream", 172 | "text": [ 173 | "Portfolio VaR: 28.257793003030656\n" 174 | ] 175 | } 176 | ], 177 | "source": [ 178 | "percent_point = norm.ppf(confidence, investment_mean, investment_stdev)\n", 179 | "value_at_risk = portfolio_value - percent_point\n", 180 | "print(f\"Portfolio VaR: {value_at_risk}\")" 181 | ] 182 | }, 183 | { 184 | "cell_type": "code", 185 | "execution_count": 8, 186 | "id": "52edba25", 187 | "metadata": {}, 188 | "outputs": [], 189 | "source": [ 190 | "value_at_risks = value_at_risk * np.sqrt(range(1, 31))" 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": 9, 196 | "id": "49310654", 197 | "metadata": {}, 198 | "outputs": [ 199 | { 200 | "data": { 201 | "text/plain": [ 202 | "[]" 203 | ] 204 | }, 205 | "execution_count": 9, 206 | "metadata": {}, 207 | "output_type": "execute_result" 208 | }, 209 | { 210 | "data": { 211 | "image/png": "", 212 | "text/plain": [ 213 | "
" 214 | ] 215 | }, 216 | "metadata": {}, 217 | "output_type": "display_data" 218 | } 219 | ], 220 | "source": [ 221 | "plt.xlabel(\"Day\")\n", 222 | "plt.ylabel(\"Max loss\")\n", 223 | "plt.title(\"Portfolio VaR\")\n", 224 | "plt.plot(value_at_risks, \"r\")" 225 | ] 226 | }, 227 | { 228 | "cell_type": "markdown", 229 | "id": "8b5cfc85", 230 | "metadata": {}, 231 | "source": [ 232 | "With small drift, the 95% parametric VaR is roughly 1.645 times dollar volatility, which is a good mental check against the printed value. Scaling assumes independent daily returns and stable volatility; crises, gaps, and fat tails can break this, so also inspect 99% and add simple shocks. The curve gives a concrete loss budget you can use to cap size and leverage per strategy." 233 | ] 234 | }, 235 | { 236 | "cell_type": "markdown", 237 | "id": "43e5fa68", 238 | "metadata": {}, 239 | "source": [ 240 | "PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk." 
241 | ] 242 | } 243 | ], 244 | "metadata": { 245 | "jupytext": { 246 | "cell_metadata_filter": "-all", 247 | "main_language": "python", 248 | "notebook_metadata_filter": "-all" 249 | }, 250 | "kernelspec": { 251 | "display_name": "Python 3 (ipykernel)", 252 | "language": "python", 253 | "name": "python3" 254 | }, 255 | "language_info": { 256 | "codemirror_mode": { 257 | "name": "ipython", 258 | "version": 3 259 | }, 260 | "file_extension": ".py", 261 | "mimetype": "text/x-python", 262 | "name": "python", 263 | "nbconvert_exporter": "python", 264 | "pygments_lexer": "ipython3", 265 | "version": "3.12.9" 266 | } 267 | }, 268 | "nbformat": 4, 269 | "nbformat_minor": 5 270 | } 271 | -------------------------------------------------------------------------------- /how_to_tell_if_options_are_cheap_with_volatility_cones.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "3bccd3fa", 6 | "metadata": {}, 7 | "source": [ 8 | "
" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "da59df26", 14 | "metadata": {}, 15 | "source": [ 16 | "## Library installation\n", 17 | "Install the libraries needed to fetch market data, compute realized volatility, and plot the cone so the notebook runs end-to-end." 18 | ] 19 | }, 20 | { 21 | "cell_type": "code", 22 | "execution_count": null, 23 | "id": "a70da6d5", 24 | "metadata": {}, 25 | "outputs": [], 26 | "source": [ 27 | "!pip install yfinance pandas numpy matplotlib" 28 | ] 29 | }, 30 | { 31 | "cell_type": "markdown", 32 | "id": "420e9534", 33 | "metadata": {}, 34 | "source": [ 35 | "## Imports and setup\n", 36 | "We use math for annualization, yfinance to download daily prices, numpy for vectorized math on returns, and matplotlib.pyplot to visualize the volatility cone." 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": 9, 42 | "id": "1a9d52eb", 43 | "metadata": {}, 44 | "outputs": [], 45 | "source": [ 46 | "import math" 47 | ] 48 | }, 49 | { 50 | "cell_type": "code", 51 | "execution_count": 10, 52 | "id": "0db724fd", 53 | "metadata": {}, 54 | "outputs": [], 55 | "source": [ 56 | "import matplotlib.pyplot as plt\n", 57 | "import numpy as np\n", 58 | "import pandas as pd\n", 59 | "import yfinance as yf" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "id": "bf338f7f", 65 | "metadata": {}, 66 | "source": [ 67 | "This small stack mirrors what desks use for quick pre-trade checks without heavy dependencies. Keeping tools simple encourages a consistent process, which matters more than model complexity when judging whether options are cheap or rich. Annualizing with math.sqrt(252) aligns our realized metric with how implied volatility is quoted." 
 68 | ] 69 | }, 70 | { 71 | "cell_type": "markdown", 72 | "id": "1e2d7555", 73 | "metadata": {}, 74 | "source": [ 75 | "## Define cone parameters and collect data\n", 76 | "Set the lookback windows we will test and the quantile bands to summarize, and prepare containers to hold each window’s historical stats and the latest realized value." 77 | ] 78 | }, 79 | { 80 | "cell_type": "code", 81 | "execution_count": 11, 82 | "id": "50bc34d9", 83 | "metadata": {}, 84 | "outputs": [], 85 | "source": [ 86 | "windows = [30, 60, 90, 120]\n", 87 | "quantiles = [0.25, 0.75]" 88 | ] 89 | }, 90 | { 91 | "cell_type": "code", 92 | "execution_count": 12, 93 | "id": "5ffe17d7", 94 | "metadata": {}, 95 | "outputs": [], 96 | "source": [ 97 | "min_ = []\n", 98 | "max_ = []\n", 99 | "median = []\n", 100 | "top_q = []\n", 101 | "bottom_q = []\n", 102 | "realized = []" 103 | ] 104 | }, 105 | { 106 | "cell_type": "markdown", 107 | "id": "d1c8e3a2", 108 | "metadata": {}, 109 | "source": [ 110 | "Each window maps to a common option tenor, so we can compare realized behavior on the same horizon as the trade. Quantile bands give a robust sense of typical ranges instead of chasing extremes. Initializing these lists up front keeps the loop clean and avoids mixing statistics across windows." 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "id": "2c0fcc78", 116 | "metadata": {}, 117 | "source": [ 118 | "Download daily JPM prices from 2020 through 2025 using yfinance; the Close column is the raw material for realized volatility."
119 | ] 120 | }, 121 | { 122 | "cell_type": "code", 123 | "execution_count": 13, 124 | "id": "1c5e5c03", 125 | "metadata": { 126 | "lines_to_next_cell": 1 127 | }, 128 | "outputs": [ 129 | { 130 | "name": "stderr", 131 | "output_type": "stream", 132 | "text": [ 133 | "/var/folders/6m/0ykpdmvn3lb5qkq15hk3s11c0000gn/T/ipykernel_99966/3699040392.py:1: FutureWarning: YF.download() has changed argument auto_adjust default to True\n", 134 | " data = yf.download(\"JPM\", start=\"2020-01-01\", end=\"2025-12-31\")\n", 135 | "[*********************100%***********************] 1 of 1 completed\n" 136 | ] 137 | } 138 | ], 139 | "source": [ 140 | "data = yf.download(\"JPM\", start=\"2020-01-01\", end=\"2025-12-31\")" 141 | ] 142 | }, 143 | { 144 | "cell_type": "markdown", 145 | "id": "946de500", 146 | "metadata": {}, 147 | "source": [ 148 | "We measure realized volatility from what the market actually did rather than guessing. Use trading-day data to stay consistent with the 252-day annualization. In practice, pull enough history to fill the longest window so the most recent estimates are not NaN." 149 | ] 150 | }, 151 | { 152 | "cell_type": "markdown", 153 | "id": "d58fab67", 154 | "metadata": {}, 155 | "source": [ 156 | "## Compute rolling realized volatility by window\n", 157 | "Define a helper that converts rolling log-return variability into annualized realized volatility for a chosen window." 
158 | ] 159 | }, 160 | { 161 | "cell_type": "code", 162 | "execution_count": 14, 163 | "id": "c24c6ea6", 164 | "metadata": { 165 | "lines_to_next_cell": 1 166 | }, 167 | "outputs": [], 168 | "source": [ 169 | "def realized_vol(price_data, window=30):\n", 170 | " log_return = (price_data[\"Close\"] / price_data[\"Close\"].shift(1)).apply(np.log)\n", 171 | " return log_return.rolling(window=window, center=False).std() * math.sqrt(252)" 172 | ] 173 | }, 174 | { 175 | "cell_type": "markdown", 176 | "id": "0799387b", 177 | "metadata": {}, 178 | "source": [ 179 | "Log returns make volatility scale-stable across price levels, which is important when comparing periods. Annualizing with the square root of 252 puts realized on the same units as implied volatility. Recomputing this per window builds the cone’s tenor-specific histories." 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "id": "8e17351c", 185 | "metadata": {}, 186 | "source": [ 187 | "Evaluate the realized volatility series for each window and summarize its historical distribution into bands." 
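Before running this on market data, the rolling estimate and its per-window summary can be checked against a series whose volatility is known. This is a sketch on synthetic prices; the 20% target volatility, the seed, and the helper name `rolling_realized_vol` are assumptions made for the example.

```python
import math

import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
true_vol = 0.20  # annualized volatility used to generate the synthetic series

# Daily log returns with standard deviation true_vol / sqrt(252).
log_ret = rng.normal(0.0, true_vol / math.sqrt(252), size=1500)
close = pd.Series(100.0 * np.exp(np.cumsum(log_ret)))


def rolling_realized_vol(close, window):
    """Annualized rolling standard deviation of daily log returns."""
    lr = np.log(close / close.shift(1))
    return lr.rolling(window=window).std() * math.sqrt(252)


# Quartiles of the rolling estimate for each window length.
summary = {
    w: rolling_realized_vol(close, w).dropna().quantile([0.25, 0.5, 0.75]).to_numpy()
    for w in (30, 60, 90, 120)
}
for w, q in summary.items():
    print(w, np.round(q, 3))
```

The medians should sit near 0.20 for every window, and the gap between the quartiles should narrow as the window lengthens, which is the shape a volatility cone is expected to have.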
188 | ] 189 | }, 190 | { 191 | "cell_type": "code", 192 | "execution_count": 15, 193 | "id": "a19b4ccd", 194 | "metadata": {}, 195 | "outputs": [], 196 | "source": [ 197 | "def _scalar(x):\n", 198 | " # robustly convert Series/ndarray/scalar -> Python float\n", 199 | " return float(np.asarray(x).squeeze())\n", 200 | "\n", 201 | "rows = []\n", 202 | "for w in windows:\n", 203 | " est = realized_vol(price_data=data, window=w).dropna()\n", 204 | "\n", 205 | " rows.append({\n", 206 | " \"window\": w,\n", 207 | " \"Min\": _scalar(est.min()),\n", 208 | " \"Max\": _scalar(est.max()),\n", 209 | " \"Median\": _scalar(est.median()),\n", 210 | " f\"{quantiles[1]*100:.0f} Prctl\": _scalar(est.quantile(quantiles[1])),\n", 211 | " f\"{quantiles[0]*100:.0f} Prctl\": _scalar(est.quantile(quantiles[0])),\n", 212 | " \"Realized\": _scalar(est.iloc[-1]),\n", 213 | " })\n", 214 | "\n", 215 | "dfw = pd.DataFrame(rows).sort_values(\"window\")" 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "id": "ad3777a9", 221 | "metadata": {}, 222 | "source": [ 223 | "These summaries create a stable yardstick by horizon, which is the core idea of a volatility cone. We also record today’s realized per window to see where it sits within its own historical range. Be mindful of thin history or recent events, which influence short windows more than long ones." 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "id": "dbe26f86", 229 | "metadata": {}, 230 | "source": [ 231 | "## Visualize volatility cone results clearly\n", 232 | "Plot each window’s distribution bands and overlay the latest realized point so we can judge today against history at matching tenors." 233 | ] 234 | }, 235 | { 236 | "cell_type": "code", 237 | "execution_count": 16, 238 | "id": "6fbbb701", 239 | "metadata": {}, 240 | "outputs": [ 241 | { 242 | "data": { 243 | "image/png": "", 244 | "text/plain": [ 245 | "
" 246 | ] 247 | }, 248 | "metadata": {}, 249 | "output_type": "display_data" 250 | } 251 | ], 252 | "source": [ 253 | "dfw = pd.DataFrame(rows).sort_values(\"window\")\n", 254 | "for col in [\n", 255 | " \"Min\",\n", 256 | " \"Max\",\n", 257 | " \"Median\",\n", 258 | " f\"{quantiles[1]*100:.0f} Prctl\",\n", 259 | " f\"{quantiles[0]*100:.0f} Prctl\",\n", 260 | " \"Realized\",\n", 261 | "]:\n", 262 | " plt.plot(\n", 263 | " dfw[\"window\"].to_numpy(), dfw[col].to_numpy(), \"-o\", linewidth=1, label=col\n", 264 | " )\n", 265 | "plt.legend()\n", 266 | "plt.xlabel(\"Window\")\n", 267 | "plt.ylabel(\"Annualized σ\")\n", 268 | "plt.xticks(windows)\n", 269 | "plt.show()" 270 | ] 271 | }, 272 | { 273 | "cell_type": "markdown", 274 | "id": "aaa04251", 275 | "metadata": {}, 276 | "source": [ 277 | "Read the cone across the x-axis by tenor, not as a time series. If today’s realized is near the lower band while same-tenor implied is high, selling optionality can be justified; the reverse suggests caution or buying vol. Always layer in event context like earnings or macro releases because cones anchor expectations but do not predict jumps." 278 | ] 279 | }, 280 | { 281 | "cell_type": "markdown", 282 | "id": "c6f34435", 283 | "metadata": {}, 284 | "source": [ 285 | "PyQuant News is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to get started with Python for quant finance. For educational purposes. Not investment advice. Use at your own risk." 
286 | ] 287 | }, 288 | { 289 | "cell_type": "code", 290 | "execution_count": null, 291 | "id": "2b23dc9a-12e5-4745-9831-930dcf89b46d", 292 | "metadata": {}, 293 | "outputs": [], 294 | "source": [] 295 | } 296 | ], 297 | "metadata": { 298 | "jupytext": { 299 | "cell_metadata_filter": "-all", 300 | "main_language": "python", 301 | "notebook_metadata_filter": "-all" 302 | }, 303 | "kernelspec": { 304 | "display_name": "Python 3 (ipykernel)", 305 | "language": "python", 306 | "name": "python3" 307 | }, 308 | "language_info": { 309 | "codemirror_mode": { 310 | "name": "ipython", 311 | "version": 3 312 | }, 313 | "file_extension": ".py", 314 | "mimetype": "text/x-python", 315 | "name": "python", 316 | "nbconvert_exporter": "python", 317 | "pygments_lexer": "ipython3", 318 | "version": "3.12.9" 319 | } 320 | }, 321 | "nbformat": 4, 322 | "nbformat_minor": 5 323 | } 324 | --------------------------------------------------------------------------------