├── Chapter 1 - Intro to Algorithm Design.ipynb ├── Chapter 2 - Algorithm Analysis.ipynb ├── Chapter 3 - Data Structures.ipynb ├── Chapter 4 - Sorting and Searching.ipynb ├── Chapter 5 - Graph Traversal.ipynb ├── Chapter 6 - Weighted Graph Algorithms.ipynb ├── Chapter 7 - Combinatorial Search and Heuristic Methods.ipynb ├── Chapter 8 - Dynamic Programming.ipynb ├── Chapter 9 - Intractable Problems and Approximation Algorithms.ipynb ├── Figures ├── Hallock_Fig_0-2.png ├── Hallock_Fig_1-1.jpg ├── Hallock_Fig_2-1.png ├── Hallock_Fig_6-1.jpg ├── Hallock_Fig_6-10.jpg ├── Hallock_Fig_6-11.jpg ├── Hallock_Fig_6-2.jpg ├── Hallock_Fig_6-3.jpg ├── Hallock_Fig_6-4.jpg ├── Hallock_Fig_6-5.jpg ├── Hallock_Fig_6-6.jpg ├── Hallock_Fig_6-7.jpg ├── Hallock_Fig_6-8.jpg ├── Hallock_Fig_6-9.jpg ├── Skiena_Fig_5-1.png ├── Skiena_Fig_5-2.png ├── Skiena_Fig_5-3.png ├── Skiena_Fig_5-4.png └── Skiena_Fig_5-5.png ├── GitHub Rendering Issue.ipynb ├── Other └── Dictionary_Words.txt └── README.md /Chapter 2 - Algorithm Analysis.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Chapter 2: Algorithm Analysis (Completed 15/52: 29%)" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "### Notes:\n", 15 | "\n", 16 | "TAKING LOGS AND EXPONENTS DOES NOT PRESERVE DOMINANCE RELATIONSHIPS.\n", 17 | "\n", 18 | "For example, $n^3$ dominates $n^2$, but $\\log n^3 = 3\\log n$ does NOT dominate $\\log n^2 = 2 \\log n$. \n", 19 | "\n", 20 | "Another example, $\\log(n^3) = O(\\log(n^2))$ but $n^3 \\neq O(n^2)$, since $e^{c\\cdot \\log(n^2)} = e^{ \\log(n^{2c})} = n^{2c} \\neq c n^2$.\n", 21 | "\n", 22 | "The FUNDAMENTAL reason for this is that $\\log cf(n) \\neq c\\log f(n)$. Same for $\\exp(cf(n))$. 
So a given $c$ used to show a given dominance relation can't be reused after taking the logarithm or exponent, since the $c$ is no longer a multiplying factor in front of the new function. The transformed inequality WILL HOLD, but it is not with the function you want.\n", 23 | "\n", 24 | "A result of this idea (that logs and exps don't preserve dominance relationships) was stated on page 51: \"Logarithms cut any function down to size\", specifically, the log of ANY polynomial is $O(\\lg n)$.\n", 25 | "\n", 26 | "Again, logs and exps WILL preserve inequalities, but likely will not preserve dominance relations.\n", 27 | "\n", 28 | "This is not true of taking the square: $(c\\cdot f(n))^2 = c^2 \\cdot f^2(n)$. So now $c^2$ is your new \"$c$\".\n", 29 | "\n", 30 | "By the way, this is a property called [homogeneity](http://www.wikiwand.com/en/Homogeneous_function), where $f(cx) = c^kf(x)$ for some $k$." 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "---\n", 38 | "\n", 39 | "ALSO, it seems that if $f >> g$, then $g = O(f)$ and $f = \\Omega(g)$. HOWEVER, if $g = O(f)$ this does NOT necessarily mean that $f >> g$. For example, $f = \\Theta(g)$ implies $f = O(g)$, $f = \\Omega(g)$, $g = \\Theta(f)$. But neither dominates each other.\n", 40 | "\n", 41 | "In other words, $f >> g$ ($f$ dominates $g$) is a stronger statement than $g = O(f)$." 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "metadata": {}, 47 | "source": [ 48 | "## Program Analysis" 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "metadata": {}, 54 | "source": [ 55 | "### 2.1 [3]\n", 56 | "\n", 57 | "What value is returned by the following function? Express your answer as a\n", 58 | "function of $n$. 
Give the worst-case running time using the Big Oh notation.\n", 59 | "\n", 60 | "$\\hspace{2em} function \\text{ mystery}(n)$ \n", 61 | "$\\hspace{4em} r := 0$ \n", 62 | "$\\hspace{4em} for\\ i := 1\\ to\\ n-1\\ do$ \n", 63 | "$\\hspace{6em} for\\ j := i+1\\ to\\ n\\ do$ \n", 64 | "$\\hspace{8em} for\\ k := 1\\ to\\ j\\ do$ \n", 65 | "$\\hspace{10em} r := r+1$ \n", 66 | "$\\hspace{4em} \\text{return}(r)$" 67 | ] 68 | }, 69 | { 70 | "cell_type": "markdown", 71 | "metadata": {}, 72 | "source": [ 73 | "*Solution:*\n", 74 | "\n", 75 | "Examining the loops, it is computing (I am not sure if I was required to simplify the expression):\n", 76 | "$$\\sum_{i=1}^{n-1}\\sum_{j=i+1}^{n}\\sum_{k=1}^{j}1$$\n", 77 | "$$= \\sum_{i=1}^{n-1}\\sum_{j=i+1}^{n}j$$\n", 78 | "$$= \\sum_{i=1}^{n-1} \\left[ \\sum_{j=1}^{n}j - \\sum_{j=1}^{i}j \\right]$$\n", 79 | "$$\\stackrel{(*)}{=} \\sum_{i=1}^{n-1} \\left[ \\frac{n(n+1)}{2} - \\frac{i(i+1)}{2} \\right]$$\n", 80 | "$$= \\sum_{i=1}^{n-1} \\left[ \\frac{n(n+1)}{2} \\right] - \\sum_{i=1}^{n-1} \\left[\\frac{i(i+1)}{2} \\right]$$\n", 81 | "$$= \\frac{n(n+1)}{2} \\sum_{i=1}^{n-1} 1 - \\frac{1}{2} \\sum_{i=1}^{n-1} \\left[i^2+i \\right]$$\n", 82 | "$$= \\frac{n(n-1)(n+1)}{2} - \\frac{1}{2} \\sum_{i=1}^{n-1} i - \\frac{1}{2} \\sum_{i=1}^{n-1} i^2 $$\n", 83 | "$$\\stackrel{(*)}{=} \\frac{n(n-1)(n+1)}{2} - \\frac{1}{2} \\cdot \\frac{n(n-1)}{2} - \\frac{1}{2} \\cdot \\frac{n(n-1)(2(n-1)+1)}{6} $$\n", 84 | "$$= \\frac{n(n-1)(n+1)}{2} - \\frac{n(n-1)}{4} - \\frac{n(n-1)(2n-1)}{12} $$\n", 85 | "$$= \\frac{n(n-1)}{12} \\cdot \\left[6(n+1) - 3 - (2n - 1) \\right]$$\n", 86 | "$$= \\frac{n(n-1)}{12} \\cdot \\left[4n + 4 \\right]$$\n", 87 | "$$= \\frac{n(n-1)(n+1)}{3}$$\n", 88 | "\n", 89 | "\n", 90 | "($*$) where we use the expressions $\\sum_{i=1}^n i = \\frac{n(n+1)}{2}$ and $\\sum_{i=1}^n i^2 = \\frac{n(n+1)(2n+1)}{6}$ proved earlier in exercises 1.10 and 1.11.\n", 91 | "\n", 92 | "The outer loop will run a maximum of $n-1$ times, the next inner loop 
will run a maximum of $n$ times, and the innermost loop will run a maximum of $n$ times, for a worst-case running time of $O(n^3)$." 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "metadata": {}, 98 | "source": [ 99 | "### 2.2 [3]\n", 100 | "\n", 101 | "What value is returned by the following function? Express your answer as a\n", 102 | "function of $n$. Give the worst-case running time using Big Oh notation.\n", 103 | "\n", 104 | "$\\hspace{2em} function \\text{ pesky}(n)$ \n", 105 | "$\\hspace{4em} r := 0$ \n", 106 | "$\\hspace{4em} for\\ i := 1\\ to\\ n\\ do$ \n", 107 | "$\\hspace{6em} for\\ j := 1\\ to\\ i\\ do$ \n", 108 | "$\\hspace{8em} for\\ k := j\\ to\\ i+j\\ do$ \n", 109 | "$\\hspace{10em} r := r+1$ \n", 110 | "$\\hspace{4em} \\text{return}(r)$" 111 | ] 112 | }, 113 | { 114 | "cell_type": "markdown", 115 | "metadata": {}, 116 | "source": [ 117 | "*Solution*:\n", 118 | "\n", 119 | "Examining the loops, it is computing:\n", 120 | "\n", 121 | "$$\\sum_{i=1}^{n}\\sum_{j=1}^{i}\\sum_{k=j}^{i+j}1$$\n", 122 | "$$= \\sum_{i=1}^{n}\\sum_{j=1}^{i}[i+1]$$\n", 123 | "$$= \\sum_{i=1}^{n}\\left[i\\sum_{j=1}^{i}1\\right] + \\sum_{i=1}^{n}\\sum_{j=1}^{i}1$$\n", 124 | "$$= \\sum_{i=1}^{n}i^2 + \\sum_{i=1}^{n}i$$\n", 125 | "$$= \\frac{n(n+1)(2n+1)}{6} + \\frac{n(n+1)}{2}$$\n", 126 | "$$= \\frac{n(n+1)}{6}\\cdot \\left[ 2n + 1 + 3 \\right]$$\n", 127 | "$$= \\frac{n(n+1)(n+2)}{3}$$\n", 128 | "\n", 129 | "The $i$ loop will run $n$ times, the $j$ loop will run at most $n$ times, and the $k$ loop will run at most $n+1$ times, for a total running time of $O(n^3)$." 130 | ] 131 | }, 132 | { 133 | "cell_type": "markdown", 134 | "metadata": {}, 135 | "source": [ 136 | "### 2.3 [5] Unfinished\n", 137 | "\n", 138 | "What value is returned by the following function? Express your answer as a\n", 139 | "function of $n$. 
Give the worst-case running time using Big Oh notation.\n", 140 | "\n", 141 | "$\\hspace{2em} function \\text{ pestiferous}(n)$ \n", 142 | "$\\hspace{4em} r := 0$ \n", 143 | "$\\hspace{4em} for\\ i := 1\\ to\\ n\\ do$ \n", 144 | "$\\hspace{6em} for\\ j := 1\\ to\\ i\\ do$ \n", 145 | "$\\hspace{8em} for\\ k := j\\ to\\ i+j\\ do$ \n", 146 | "$\\hspace{10em} for\\ l := 1\\ to\\ i+j-k\\ do$ \n", 147 | "$\\hspace{12em} r := r+1$ \n", 148 | "$\\hspace{4em} \\text{return}(r)$" 149 | ] 150 | }, 151 | { 152 | "cell_type": "markdown", 153 | "metadata": {}, 154 | "source": [ 155 | "---" 156 | ] 157 | }, 158 | { 159 | "cell_type": "markdown", 160 | "metadata": {}, 161 | "source": [ 162 | "## Big Oh" 163 | ] 164 | }, 165 | { 166 | "cell_type": "markdown", 167 | "metadata": { 168 | "collapsed": true 169 | }, 170 | "source": [ 171 | "### 2.7 [3]\n", 172 | "\n", 173 | "True or False?\n", 174 | "\n", 175 | "$\\hspace{1em}$(a): Is $2^{n+1} = O(2^n)$? \n", 176 | "$\\hspace{1em}$(b): Is $2^{2n} = O(2^n)$?" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "metadata": {}, 182 | "source": [ 183 | "*Solution:*\n", 184 | "\n", 185 | "Always go back to the definitions on these ones.\n", 186 | "\n", 187 | "**(a):**\n", 188 | "\n", 189 | "This looks like it would be False, except that this exact problem was covered in the text. It looks misleading because functions like $n^x$ look like $x^n$.\n", 190 | "\n", 191 | "$2^{n+1} = 2\\cdot2^n$, which is clearly $O(2^n)$. Specifically, $c \\cdot 2^n > 2 \\cdot 2^n$ for all $n \\geq 1 $ if we set $c=3$. So it's True.\n", 192 | "\n", 193 | "**(b):**\n", 194 | "\n", 195 | "False. Suppose there is some $c$ such that $2^{2n} \\leq c \\cdot 2^n$ holds for some $n$. Choose $n = \\lg c$. Then the LHS become $2^{2 \\lg c} = 2^{\\lg{c^2}} = c^2$, and the RHS becomes $c \\cdot 2^{\\lg c} = c^2$. Therefore at this particular $n$, $2^{2n} = c \\cdot 2^n$. 
Furthermore, we can compute the derivatives of each side to show that the LHS is growing faster.\n", 196 | "\n", 197 | "LHS: $= \\frac{\\text{d}}{\\text{d}n} \\left[2^{2n}\\right]\n", 198 | "= \\frac{\\text{d}}{\\text{d}n} \\left[ \\left( e^{\\ln2} \\right)^{2n} \\right]\n", 199 | "= \\frac{\\text{d}}{\\text{d}n} \\left[ e^{2n \\ln{2}} \\right] \n", 200 | "= 2 \\ln{2} \\cdot e^{2n \\ln{2}}\n", 201 | "= 2 \\ln{2} \\cdot 2^{2n}$ \n", 202 | "\n", 203 | "RHS: $= \\frac{\\text{d}}{\\text{d}n} \\left[c \\, 2^n \\right]\n", 204 | "= \\frac{\\text{d}}{\\text{d}n} \\left[c \\, e^{n \\ln{2}} \\right]\n", 205 | "= c \\ln2 \\cdot e^{n \\ln{2}}\n", 206 | "= c \\ln2 \\cdot 2^n$\n", 207 | "\n", 208 | "Plugging in $n = \\lg c$, we get $2 c^2 \\ln{2}$ on the LHS, and $c^2 \\ln2$ on the RHS. The LHS derivative is larger. And because of the exponent of $2n$ compared with $n$, it will continue to grow faster.\n", 209 | "\n", 210 | "Therefore for any possible $c$ such that $2^{2n} \\leq c \\cdot 2^n$ might hold, we have shown that at $n = \\lg c$ the two sides are equal and that the LHS is growing more rapidly, and will continue to do so. Therefore for $n > \\lg c$ we have $2^{2n} > c \\cdot 2^n$.\n", 211 | "\n", 212 | "So $2^{2n} \\neq O(2^n)$. It's False.\n", 213 | "\n", 214 | "\n", 215 | "\n", 216 | "\n", 217 | "\n", 218 | "\n", 219 | "\n" 220 | ] 221 | }, 222 | { 223 | "cell_type": "markdown", 224 | "metadata": {}, 225 | "source": [ 226 | "### 2.8 [3]\n", 227 | "\n", 228 | "For each of the following pairs of functions, either $f(n)$ is in $O(g(n))$, $\\,f(n)$ is in $\\Omega(g(n))$, or $f(n) = \\Theta(g(n))$. 
Determine which relationship is correct and briefly\n", 229 | "explain why.\n", 230 | "\n", 231 | "$\\hspace{1em}$ (a): $\\, f(n) = \\log n^2; \\, g(n) = \\log n + 5$ \n", 232 | "$\\hspace{1em}$ (b): $\\, f(n) = \\sqrt n; \\, g(n) = \\log n^2$ \n", 233 | "$\\hspace{1em}$ (c): $\\, f(n) = \\log^2 n; \\, g(n) = \\log n$ \n", 234 | "$\\hspace{1em}$ (d): $\\, f(n) = n; \\, g(n) = \\log^2 n$ \n", 235 | "$\\hspace{1em}$ (e): $\\, f(n) = n\\log n + n; \\, g(n) = \\log n$ \n", 236 | "$\\hspace{1em}$ (f): $\\, f(n) = 10; \\, g(n) = \\log 10$ \n", 237 | "$\\hspace{1em}$ (g): $\\, f(n) = 2^n; \\, g(n) = 10n^2$ \n", 238 | "$\\hspace{1em}$ (h): $\\, f(n) = 2^n; \\, g(n) = 3^n$ " 239 | ] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "metadata": {}, 244 | "source": [ 245 | "*Solution:*\n", 246 | "\n", 247 | "**(a):** $\\, f(n) = \\log n^2; \\, g(n) = \\log n + 5$ \n", 248 | "\n", 249 | "$\\,f(n)$ can be rewritten as $f(n) = \\log n^2 = 2\\log n$. Therefore $f(n)$ is always within a factor of $2$ of $g(n)$, and constant factors are ignored in Big Oh analysis. So $f(n) = \\Theta(g(n))$.\n", 250 | "\n", 251 | "**(b):** $\\, f(n) = \\sqrt n; \\, g(n) = \\log n^2$\n", 252 | "\n", 253 | "$\\,g(n) = \\log n^2 = 2 \\log n$, and $\\sqrt n >> \\log n$, so $f(n) = \\Omega(g(n))$. By the way, $\\sqrt n >> \\log n$ is clear when you look at the derivatives: $\\frac{1}{2\\sqrt n} > \\frac{2}{n}$ for $n>16$, since cross-multiplying gives $n > 4\\sqrt n$, i.e. $\\sqrt n > 4$.\n", 254 | "\n", 255 | "But what about $f(n) = O(g(n))$? Does there exist some $c$ and $n_0$ such that $\\sqrt n \\leq c \\cdot 2\\log n$ for all $n>n_0$? It's easily possible to find the $n$ where the derivatives are equal, above which $\\sqrt n$ grows more rapidly, but this does not guarantee that it will ever overtake $c\\cdot 2\\log n$, especially if $c$ is very large.\n", 256 | "\n", 257 | "Recall that $g >> f$ ($g$ *dominates* $f$) is a stronger statement than $f(n) = O(g(n))$; conversely, if $f$ dominates $g$, then certainly $f(n) \\neq O(g(n))$. 
The definition of dominance given on page 56 is that $g$ dominates $f$ if\n", 258 | "\n", 259 | "$$ \\lim_{n\\rightarrow \\infty} \\frac{f(n)}{g(n)} = 0$$\n", 260 | "\n", 261 | "We can show that $\\sqrt n$ *does* dominate $2\\log n$, which rules out $\\sqrt n = O(2\\log n)$.\n", 262 | "\n", 263 | "$$ \\lim_{n\\rightarrow \\infty} \\frac{g(n)}{f(n)}\n", 264 | "= \\lim_{n\\rightarrow \\infty} \\frac{2 \\log n}{\\sqrt n}\n", 265 | "= \\lim_{n\\rightarrow \\infty} \\frac{2 \\log \\sqrt{n}^2}{e^{\\log{\\sqrt n}}}\n", 266 | "= \\lim_{n\\rightarrow \\infty} \\frac{4 \\log \\sqrt{n}}{e^{\\log{\\sqrt n}}}\n", 267 | "= 0$$\n", 268 | "\n", 269 | "since, writing $x = \\log \\sqrt n$, the denominator grows like $e^x$ while the numerator grows like $x$. Because this ratio approaches $0$, its reciprocal $f/g$ grows without bound, so no constant $c$ can make $\\sqrt n \\leq c \\cdot 2\\log n$ hold for all large $n$, and $f(n) \\neq O(g(n))$.\n", 270 | "\n", 271 | "Therefore $f(n) = \\Omega(g(n))$.\n", 272 | "\n", 273 | "**(c):** $\\, f(n) = \\log^2 n; \\, g(n) = \\log n$ \n", 274 | "\n", 275 | "$(\\log n)^2 > \\log n$ once $\\log n$ becomes greater than $1$, which occurs once $n$ exceeds $e$. Therefore $\\log^2 n = \\Omega(\\log n)$. $\\log^2 n \\neq O(\\log n)$ since for any $c$, one can find an $n$ such that $\\log^2 n = c \\cdot \\log n$. Specifically, $n = e^c$. Above this, $\\log^2 n > c \\cdot \\log n$.\n", 276 | "\n", 277 | "Another way of answering: $x^2$ dominates $x$, and letting $x = \\log n$ won't change that.\n", 278 | "\n", 279 | "In any case, we have $\\log^2 n = \\Omega(\\log n)$.\n", 280 | "\n", 281 | "**(d):** $\\, f(n) = n; \\, g(n) = \\log^2 n$\n", 282 | "\n", 283 | "In part (b) we showed that $\\sqrt n = \\Omega(\\log n)$. This means that there exists $c$ and $n_0$ such that $\\sqrt n \\geq c \\cdot \\log n$ holds for all $n > n_0$. Because we know that both sides are positive, we can square both sides to produce $n \\geq c^2 \\cdot\\log^2 n$ for the same $c$, which will hold for all $n > n_0$ for the same $n_0$. 
Therefore we have $n = \\Omega(\\log^2 n)$.\n", 284 | "\n", 285 | "For the same reason (squaring preserves inequalities), the $n \\neq O(\\log^2 n)$ result from part (b) also applies here.\n", 286 | "\n", 287 | "So $n = \\Omega(\\log^2 n)$.\n", 288 | "\n", 289 | "**(e):** $\\, f(n) = n\\log n + n; \\, g(n) = \\log n$\n", 290 | "\n", 291 | "$n\\log n + n = \\Omega(\\log n)$ since $n\\log n + n> \\log n$ for all $n > 1$.\n", 292 | "\n", 293 | "Furthermore $n\\log n + n \\neq O(\\log n)$ since for any $c>0$ we can find an $n_0>0$ such that $n\\log n + n> c \\cdot\\log n$ for all $n>n_0$. Specifically, set $n=c$:\n", 294 | "\n", 295 | "$$\\left[n\\log n + n\\right]\\biggr|_{n=c} = c\\log c + c > c\\log c = \\left[ c \\cdot\\log n\\right] \\biggr|_{n=c}$$\n", 296 | "\n", 297 | "Therefore, $f(n) = \\Omega(g(n))$.\n", 298 | "\n", 299 | "**(f):** $\\, f(n) = 10; \\, g(n) = \\log 10$\n", 300 | "\n", 301 | "These are both constants. $f \\geq c \\cdot g$ if $c = (1,000,000)^{-1}$. On the other hand, $f \\leq c \\cdot g$ if $c = 1,000,000$. Both of these trivially hold for all $n>0$.\n", 302 | "\n", 303 | "Therefore $f = \\Theta(g)$.\n", 304 | "\n", 305 | "**(g):** $\\, f(n) = 2^n; \\, g(n) = 10n^2$\n", 306 | "\n", 307 | "We'll just assume that $2^n >> n^2$. It's obviously true and routinely taken for granted, but also proved in problem 2.11. So we know there exists $c$ and $n_0$ such that $2^n \\geq c \\cdot n^2$ for all $n > n_0$. Let $d = c/10$. Then $2^n \\geq d \\cdot 10n^2$ for all $n>n_0$, and we have shown that $2^n = \\Omega(10n^2)$. 
Similarly, $2^n \\neq O(10n^2)$ since multiplicative factors don't affect Big Oh analysis, and $2^n \\neq O(n^2)$.\n", 308 | "\n", 309 | "So $f(n) = \\Omega(g(n))$.\n", 310 | "\n", 311 | "**(h):** $\\, f(n) = 2^n; \\, g(n) = 3^n$\n", 312 | "\n", 313 | "$3^n >> 2^n:$\n", 314 | "\n", 315 | "$$ \\lim_{n\\rightarrow \\infty} \\frac{2^n}{3^n}\n", 316 | "= \\lim_{n\\rightarrow \\infty} \\left(\\frac{2}{3}\\right)^n\n", 317 | "= 0$$\n", 318 | "\n", 319 | "since $\\frac{2}{3} < 1$.\n", 320 | "\n", 321 | "Since $g$ dominates $f$, $f(n) = O(g(n))$." 322 | ] 323 | }, 324 | { 325 | "cell_type": "markdown", 326 | "metadata": {}, 327 | "source": [ 328 | "### 2.9 [3]\n", 329 | "\n", 330 | "For each of the following pairs of functions $f(n)$ and $g(n)$, determine whether $f(n) = O(g(n))$, $g(n) = O(f(n))$, or both.\n", 331 | "\n", 332 | "$\\hspace{1em}$ (a): $\\, f(n) = (n^2 - n)\\,/\\, 2, \\, g(n) = 6n$ \n", 333 | "$\\hspace{1em}$ (b): $\\, f(n) = n+2\\sqrt n, \\, g(n) = n^2$ \n", 334 | "$\\hspace{1em}$ (c): $\\, f(n) = n\\log n, \\, g(n) = n\\sqrt n \\, / \\, 2$ \n", 335 | "$\\hspace{1em}$ (d): $\\, f(n) = n + \\log n, \\, g(n) = \\sqrt n$ \n", 336 | "$\\hspace{1em}$ (e): $\\, f(n) = 2(\\log n)^2, \\, g(n) = \\log n + 1$ \n", 337 | "$\\hspace{1em}$ (f): $\\, f(n) = 4n\\log n + n, \\, g(n) = (n^2 - n)/2$" 338 | ] 339 | }, 340 | { 341 | "cell_type": "markdown", 342 | "metadata": {}, 343 | "source": [ 344 | "*Solution:*\n", 345 | "\n", 346 | "**(a):** $\\, f(n) = (n^2 - n)\\,/\\, 2, \\, g(n) = 6n$\n", 347 | "\n", 348 | "We can ignore constant factors and lower order terms, so $f(n) = O(n^2)$ and $g(n) = O(n)$. 
Since $n^2 >> n$ we have $g(n) = O(f(n))$.\n", 349 | "\n", 350 | "**(b):** $\\, f(n) = n+2\\sqrt n, \\, g(n) = n^2$\n", 351 | "\n", 352 | "Again, we ignore lower order terms, so $f(n) = O(n)$ and we have $f(n) = O(g(n))$.\n", 353 | "\n", 354 | "**(c):** $\\, f(n) = n\\log n, \\, g(n) = n\\sqrt n \\, / \\, 2$\n", 355 | "\n", 356 | "We can write $g(n) = n^\\frac{3}{2} \\, /\\, 2$, and then show that $g >> f$:\n", 357 | "\n", 358 | "$$\\lim_{n \\rightarrow \\infty} \\frac{f(n)}{g(n)}\n", 359 | "= \\lim_{n \\rightarrow \\infty} \\frac{n\\log n}{n^\\frac{3}{2} \\, /\\, 2}\n", 360 | "= \\lim_{n \\rightarrow \\infty} \\frac{2\\log n}{\\sqrt n}\n", 361 | "= 0$$\n", 362 | "\n", 363 | "since $n^r$ for any positive $r$ will grow much faster than $\\log n$. This was also shown in 2.8(b).\n", 364 | "\n", 365 | "Therefore $g >> f$ and so $f(n) = O(g(n))$.\n", 366 | "\n", 367 | "**(d):** $\\, f(n) = n + \\log n, \\, g(n) = \\sqrt n$\n", 368 | "\n", 369 | "Again, ignoring lower order terms we have $f(n) = O(n)$ (since $n >> \\log n$), and because $n >> \\sqrt n$ we have $g(n) = O(f(n))$.\n", 370 | "\n", 371 | "**(e):** $\\, f(n) = 2(\\log n)^2, \\, g(n) = \\log n + 1$\n", 372 | "\n", 373 | "Ignoring constant factors and terms, we have $f(n) \\sim (\\log n)^2$ and $g(n) \\sim \\log n$. Because $g\\,/\\,f \\sim 1\\,/\\,\\log n \\rightarrow 0$ as $n \\rightarrow \\infty$, we have $f >> g$ and therefore $g(n) = O(f(n))$.\n", 374 | "\n", 375 | "**(f):** $\\, f(n) = 4n\\log n + n, \\, g(n) = (n^2 - n)/2$\n", 376 | "\n", 377 | "Again, ignoring constant factors and lower order terms, we have $f \\sim n\\log n$ and $g \\sim n^2$. We then know that $f\\,/\\,g \\sim \\log n \\,/\\, n \\rightarrow 0$ as $n \\rightarrow \\infty$ since $n >> \\log n$. Therefore $g >> f$ and $f(n) = O(g(n))$." 378 | ] 379 | }, 380 | { 381 | "cell_type": "markdown", 382 | "metadata": {}, 383 | "source": [ 384 | "### 2.10 [3]\n", 385 | "\n", 386 | "Prove that $n^3 -3n^2 -n +1 = \\Theta(n^3)$." 
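Before the pencil-and-paper argument, a quick numeric sanity check can be run (a sketch added here, not part of the original notebook; it uses the constants $c = \frac{1}{3}$ and $n_0 = 100$ chosen in the solution that follows):

```python
def f(n):
    # the polynomial from problem 2.10
    return n**3 - 3*n**2 - n + 1

# Upper bound (Big Oh) with c = 1: f(n) <= n^3 for all n >= 1
assert all(f(n) <= n**3 for n in range(1, 5000))

# Lower bound (Omega) with c = 1/3, n0 = 100: 3 f(n) > n^3 beyond n0
assert all(3 * f(n) > n**3 for n in range(101, 5000))
```

The check does not replace the proof, but it confirms that the chosen constants are not off by an arithmetic slip.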
387 | ] 388 | }, 389 | { 390 | "cell_type": "markdown", 391 | "metadata": {}, 392 | "source": [ 393 | "*Solution:*\n", 394 | "\n", 395 | "First let's calculate the derivatives, which we will need.\n", 396 | "\n", 397 | "$\\frac{\\text{d}}{\\text{d}n} \\text{LHS}\n", 398 | "= \\frac{\\text{d}}{\\text{d}n} \\left[ n^3 - 3n^2 - n + 1\\right]\n", 399 | "= 3n^2 -6n -1$\n", 400 | "\n", 401 | "$\\frac{\\text{d}}{\\text{d}n} \\text{RHS}\n", 402 | "= \\frac{\\text{d}}{\\text{d}n} \\left[c \\cdot n^3\\right]\n", 403 | "= c \\cdot 3n^2$\n", 404 | "\n", 405 | "Note that for $n>0$, $\\frac{\\text{d}}{\\text{d}n} \\text{LHS} < \\frac{\\text{d}}{\\text{d}n} \\text{RHS}$. This means that (with $c=1$) the RHS will always be growing faster. Furthermore,\n", 406 | "$\\text{LHS}(n=1) = 1 - 3 - 1 + 1 = -2$\n", 407 | "while\n", 408 | "$\\text{RHS}(n=1) = 1$.\n", 409 | "So at $n=1$, $\\text{RHS} > \\text{LHS}$ and $\\text{RHS}$ will continue to grow more rapidly.\n", 410 | "\n", 411 | "This means that if we let $c=1$ and $n_0 = 1$, then $n^3 - 3n^2 - n + 1 < c \\cdot n^3$ for all $n > n_0$.\n", 412 | "\n", 413 | "We have therefore shown that $n^3 - 3n^2 - n + 1 = O(n^3)$.\n", 414 | "\n", 415 | "Now to demonstrate $\\Omega$, and therefore $\\Theta$, we need to provide numbers $c$ and $n_0$ such that $n^3 - 3n^2 - n + 1 > c \\cdot n^3$ for all $n>n_0$. Let $n_0=100$ and $c = \\frac{1}{3}$.\n", 416 | "\n", 417 | "At $n_0$ the LHS is\n", 418 | "\n", 419 | "$\\text{LHS}(n=100) = 1,000,000 - 30,000 - 100 + 1 = 969,901$\n", 420 | "\n", 421 | "and the RHS is\n", 422 | "\n", 423 | "$\\text{RHS}(n=100) = \\frac{1}{3} \\cdot 1,000,000 \\approx 333,333$\n", 424 | "\n", 425 | "So at $n=n_0$, $\\text{LHS} > \\text{RHS}$. 
Their derivatives at $n_0$ evaluate to\n", 426 | "\n", 427 | "$\\frac{\\text{d}}{\\text{d}n} \\text{LHS} \\biggr|_{n=100}\n", 428 | "= 30,000 - 600 - 1\n", 429 | "= 29,399$\n", 430 | "\n", 431 | "$\\frac{\\text{d}}{\\text{d}n} \\text{RHS} \\biggr|_{n=100}\n", 432 | "= 10,000$\n", 433 | "\n", 434 | "So at $n=n_0$, $\\text{LHS} > \\text{RHS}$ and the LHS is growing more rapidly. A look at the second derivatives\n", 435 | "$\\left(6n-6, 2n\\right)$ shows that the LHS will continue to grow more rapidly for all $n>n_0$.\n", 436 | "\n", 437 | "Therefore with $c=\\frac{1}{3}$ and $n_0 = 100$, $n^3 - 3n^2 - n + 1 > c \\cdot n^3$ for all $n>n_0$. So $n^3 - 3n^2 - n + 1 = \\Omega(n^3)$.\n", 438 | "\n", 439 | "Because $n^3 - 3n^2 - n + 1 = O(n^3)$ and $n^3 - 3n^2 - n + 1 = \\Omega(n^3)$, we have $n^3 - 3n^2 - n + 1 = \\Theta(n^3)$." 440 | ] 441 | }, 442 | { 443 | "cell_type": "markdown", 444 | "metadata": {}, 445 | "source": [ 446 | "### 2.11 [3]\n", 447 | "\n", 448 | "Prove that $n^2 = O(2^n)$." 449 | ] 450 | }, 451 | { 452 | "cell_type": "markdown", 453 | "metadata": {}, 454 | "source": [ 455 | "*Solution*:\n", 456 | "\n", 457 | "This holds because $2^n$ dominates any polynomial.\n", 458 | "\n", 459 | "Let $c=1$. We shall find an $n_0$ such that for all $n>n_0$, $2^n \\geq n^2$.\n", 460 | "\n", 461 | "First let's look at some values.\n", 462 | "\n", 463 | "$\\hspace{3em} (n^2, 2^n)$ \n", 464 | "$n=1: \\,(1,\\, 2)$ \n", 465 | "$n=2: \\,(4,\\, 4)$ \n", 466 | "$n=3: \\,(9,\\, 8)$ \n", 467 | "$n=4: \\,(16,\\, 16)$ \n", 468 | "$n=5: \\,(25,\\, 32)$ \n", 469 | "$n=6: \\,(36,\\, 64)$ \n", 470 | "$n=7: \\,(49,\\, 128)$ \n", 471 | "$n=8: \\,(64,\\, 256)$ \n", 472 | "$n=9: \\,(81,\\, 512)$ \n", 473 | "$n=10: \\,(100,\\, 1024)$ \n", 474 | "\n", 475 | "Note that at $n=3$, $n^2 > 2^n$, but at $n=4$ they are equal. And for $n>4$ it seems that $2^n > n^2$. 
So let's set $n_0 = 4$.\n", 476 | "\n", 477 | "To prove that $2^n > n^2$ we are going to look at their derivatives at that point and beyond.\n", 478 | "\n", 479 | "$\\frac{\\text{d}}{\\text{d}n} n^2 = 2n$\n", 480 | "\n", 481 | "$\\frac{\\text{d}}{\\text{d}n} 2^n\n", 482 | "= \\frac{\\text{d}}{\\text{d}n} \\left[\\left( e^{\\ln2} \\right)^n \\right]\n", 483 | "= \\frac{\\text{d}}{\\text{d}n} \\left[ e^{n\\ln2} \\right]\n", 484 | "= \\ln2 \\left[ e^{n\\ln2} \\right]\n", 485 | "= \\ln2 \\cdot2^n$\n", 486 | "\n", 487 | "Setting $n = 4$ we have\n", 488 | "\n", 489 | "$\\frac{\\text{d}}{\\text{d}n} n^2\\biggr|_{n=4}\n", 490 | " = 2n\\biggr|_{n=4}= 2(4) = 8$ \n", 491 | "$\\frac{\\text{d}}{\\text{d}n} 2^n\\biggr|_{n=4}\n", 492 | " = \\ln2 \\cdot2^n\\biggr|_{n=4}\n", 493 | " = \\ln2 \\cdot2^4\n", 494 | " = 16\\ln2\n", 495 | " \\approx 11$\n", 496 | "\n", 497 | "So at $n=4$, $2^n$ is growing faster than $n^2$.\n", 498 | "\n", 499 | "Furthermore, the second derivative of $n^2$ is $2$, a constant, and its third derivative is $0$: the slope of $n^2$ grows at a constant rate.\n", 500 | "\n", 501 | "On the other hand, the $k$th derivative of $2^n$ is $(\\ln2)^{k}2^n$. Every order of derivative is itself exponential, so $2^n$ only continues accelerating, faster and faster.\n", 502 | "\n", 503 | "Together, these results show that at $n=4$, $n^2 = 2^n$. But $2^n$ is growing more rapidly, and will continue to do so for all $n>4$. Therefore for all $n>4$, $2^n > n^2$, and therefore $n^2 = O(2^n)$."
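The crossover argument above can be cross-checked empirically (a small sketch, not part of the original notebook):

```python
# 2.11: n^2 = O(2^n) with c = 1 and n0 = 4
assert 3**2 > 2**3            # at n = 3 the square is still ahead
assert 4**2 == 2**4           # the two sides meet at n = 4
assert all(2**n > n**2 for n in range(5, 1000))

# the ratio n^2 / 2^n shrinks monotonically past the crossover,
# consistent with 2^n dominating n^2
ratios = [n**2 / 2**n for n in range(4, 60)]
assert all(a > b for a, b in zip(ratios, ratios[1:]))
```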
504 | ] 505 | }, 506 | { 507 | "cell_type": "markdown", 508 | "metadata": {}, 509 | "source": [ 510 | "### 2.12 [3]\n", 511 | "\n", 512 | "For each of the following pairs of functions $f(n)$ and $g(n)$, give an appropriate positive constant $c$ such that $f(n) \\leq c \\cdot g(n)$ for all $n>1$.\n", 513 | "\n", 514 | "$\\hspace{1em}$ (a): $\\, f(n) = n^2 + n + 1, \\, g(n) = 2n^3$ \n", 515 | "$\\hspace{1em}$ (b): $\\, f(n) = n\\sqrt n + n^2, \\, g(n) = n^2$ \n", 516 | "$\\hspace{1em}$ (c): $\\, f(n) = n^2 - n + 1, \\, g(n) = n^2\\,/\\,2$ " 517 | ] 518 | }, 519 | { 520 | "cell_type": "markdown", 521 | "metadata": {}, 522 | "source": [ 523 | "*Solution:*\n", 524 | "\n", 525 | "Our approach will be to choose $c$ such that $f \\leq c\\cdot g$ at $n=1$, but also that $\\,f' \\leq c\\cdot g'$, and $\\, f'' \\leq c\\cdot g''$ so that $c \\cdot g$ will continue to be greater than $f$ for all larger $n$.\n", 526 | "\n", 527 | "**(a):** $\\, f(n) = n^2 + n + 1, \\, g(n) = 2n^3$ \n", 528 | "\n", 529 | "Letting $c=2$ so that $g(n) = 4n^3$, the first and second derivatives of $f$ and $g$ are:\n", 530 | "\n", 531 | "$f'(n) = 2n + 1; \\, f''(n) = 2 $ \n", 532 | "$g'(n) = 12n^2; \\, g''(n) = 24n$\n", 533 | "\n", 534 | "Plugging in $n=1$ we have\n", 535 | "\n", 536 | "$f(1) = 3; \\, f'(1) = 3; \\, f''(1) = 2$ \n", 537 | "$g(1) = 4; \\, g'(1) = 12; \\, g''(1) = 24$\n", 538 | "\n", 539 | "We see that at $n=1$, $\\, f(n) < c \\cdot g(n)$ and that all orders of derivatives are also higher for the RHS. 
Therefore $\\, f(n) < c \\cdot g(n)$ holds for $n=1$ and will continue to hold for all $n>1$.\n", 540 | "\n", 541 | "\n", 542 | "**(b):** $\\, f(n) = n\\sqrt n + n^2, \\, g(n) = n^2$ \n", 543 | "\n", 544 | "Letting $c=2$ so that $g(n) = 2n^2$, the first and second derivatives of $f$ and $g$ are:\n", 545 | "\n", 546 | "$f'(n) = \\frac{3}{2}n^{\\frac{1}{2}} + 2n; \\, f''(n) = \\frac{3}{4\\sqrt n} + 2$ \n", 547 | "$g'(n) = 4n; \\, g''(n) = 4$\n", 548 | "\n", 549 | "Plugging in $n=1$ we have\n", 550 | "\n", 551 | "$f(1) = 2; \\, f'(1) = \\frac{7}{2}; \\, f''(1) = \\frac{11}{4}$ \n", 552 | "$g(1) = 2; \\, g'(1) = 4; \\, g''(1) = 4$\n", 553 | "\n", 554 | "We see that at $n=1$, $\\, f(n) = c \\cdot g(n)$, but the first and second derivatives on the RHS are greater. Further, the second derivative on the LHS will only get smaller, while the second derivative on the RHS is constant.\n", 555 | "\n", 556 | "Together these ensure that $\\, f(n) < c \\cdot g(n)$ for all $n>1$.\n", 557 | "\n", 558 | "\n", 559 | "**(c):** $\\, f(n) = n^2 - n + 1, \\, g(n) = n^2\\,/\\,2$ \n", 560 | "\n", 561 | "Letting $c=4$ so that $g(n) = 2n^2$, the first and second derivatives of $f$ and $g$ are:\n", 562 | "\n", 563 | "$f'(n) = 2n - 1; \\, f''(n) = 2$ \n", 564 | "$g'(n) = 4n; \\, g''(n) = 4$\n", 565 | "\n", 566 | "Plugging in $n=1$ we have\n", 567 | "\n", 568 | "$f(1) = 1; \\, f'(1) = 1; \\, f''(1) = 2$ \n", 569 | "$g(1) = 2; \\, g'(1) = 4; \\, g''(1) = 4$\n", 570 | "\n", 571 | "We see that $n=1$, $\\, f(n) < c \\cdot g(n)$. Further, the first and second derivatives of the RHS are greater than those of the left, and all higher order derivatives are zero. 
This ensures that $\, f(n) < c \cdot g(n)$ will continue to hold for all $n>1$.\n", 572 | "\n" 573 | ] 574 | }, 575 | { 576 | "cell_type": "markdown", 577 | "metadata": {}, 578 | "source": [ 579 | "---" 580 | ] 581 | }, 582 | { 583 | "cell_type": "markdown", 584 | "metadata": {}, 585 | "source": [ 586 | "## Summations" 587 | ] 588 | }, 589 | { 590 | "cell_type": "markdown", 591 | "metadata": { 592 | "collapsed": true 593 | }, 594 | "source": [ 595 | "### 2.32 [5]\n", 596 | "\n", 597 | "Prove that:\n", 598 | "$$1^2 - 2^2 + 3^2 - 4^2 + \\ldots + (-1)^{k-1}k^2 = (-1)^{k-1}\\frac{k(k+1)}{2}$$" 599 | ] 600 | }, 601 | { 602 | "cell_type": "markdown", 603 | "metadata": {}, 604 | "source": [ 605 | "*Solution:*\n", 606 | "\n", 607 | "We'll prove this with induction.\n", 608 | "\n", 609 | "*Base case:* $k = 0$. \n", 610 | "Both sides are 0.\n", 611 | "\n", 612 | "*Base case:* $k = 1$. \n", 613 | "LHS: $1^2 = 1$ \n", 614 | "RHS: $(-1)^{1-1}\\frac{1(1+1)}{2} = (-1)^{0}\\frac{2}{2} = 1 \\cdot 1 = 1$ \n", 615 | "They are equal.\n", 616 | "\n", 617 | "*General case:* $k = n$.\n", 618 | "\n", 619 | "Suppose the equation holds for all $k \\leq n - 1$. We would like to show that it holds for $k = n$.\n", 620 | "\n", 621 | "$$1^2 - 2^2 + 3^2 - 4^2 + \\ldots + (-1)^{n-1}n^2$$\n", 622 | "$$\\hspace{1em} = \\sum_{i=1}^{n} (-1)^{i-1}i^2$$\n", 623 | "$$\\hspace{1em} = (-1)^{n-1}n^2 + \\sum_{i=1}^{n-1} (-1)^{i-1}i^2$$\n", 624 | "$$\\hspace{1em} \\stackrel{(\\text{IH})}{=} (-1)^{n-1}n^2 + (-1)^{n-2}\\frac{n(n-1)}{2}$$\n", 625 | "$$\\hspace{1em} = \\frac{(-1)^{n-2}}{2} \\cdot \\left[(-1)^{1}2n^2 + n(n-1)\\right]$$\n", 626 | "$$\\hspace{1em} = \\frac{(-1)^{n-2}}{2} \\cdot \\left[-2n^2 + n^2-n\\right]$$\n", 627 | "$$\\hspace{1em} = \\frac{(-1)^{n-2}}{2} \\cdot \\left[-n^2 - n\\right]$$\n", 628 | "$$\\hspace{1em} = \\frac{(-1)^{n-2}}{2} \\cdot (-1)^1(n^2 + n)$$\n", 629 | "$$\\hspace{1em} = (-1)^{n-1}\\frac{n(n+1)}{2}$$\n", 630 | "\n", 631 | "where $(\\text{IH})$ marks the use of the induction hypothesis at $k = n-1$. This is the expression we wanted."
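The identity can also be spot-checked numerically (a brief sketch, not part of the original proof):

```python
def alternating_square_sum(k):
    # 1^2 - 2^2 + 3^2 - ... + (-1)^(k-1) k^2, summed term by term
    return sum((-1)**(i - 1) * i * i for i in range(1, k + 1))

def closed_form(k):
    # the claimed closed form (-1)^(k-1) k(k+1)/2
    # (k(k+1) is always even, so integer division is exact)
    return (-1)**(k - 1) * k * (k + 1) // 2

assert all(alternating_square_sum(k) == closed_form(k) for k in range(1, 500))
```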
632 | ] 633 | }, 634 | { 635 | "cell_type": "markdown", 636 | "metadata": {}, 637 | "source": [ 638 | "### 2.33 [5]\n", 639 | "\n", 640 | "Find an expression for the sum of the $i$th row of the following triangle, and prove its correctness. Each entry is the sum of the three entries directly above it. All non existing entries are considered 0.\n", 641 | "\n", 642 | "$$ \\_\\_ \\hspace{2em} \\_\\_ \\hspace{2em} \\_\\_ \\hspace{2em} \\_\\_ \\hspace{2em} \\_1 \\hspace{2em} \\_\\_ \\hspace{2em} \\_\\_ \\hspace{2em} \\_\\_ \\hspace{2em} \\_\\_ $$\n", 643 | "$$ \\_\\_ \\hspace{2em} \\_\\_ \\hspace{2em} \\_\\_ \\hspace{2em} \\_1 \\hspace{2em} \\_1 \\hspace{2em} \\_1 \\hspace{2em} \\_\\_ \\hspace{2em} \\_\\_ \\hspace{2em} \\_\\_ $$\n", 644 | "$$ \\_\\_ \\hspace{2em} \\_\\_ \\hspace{2em} \\_1 \\hspace{2em} \\_2 \\hspace{2em} \\_3 \\hspace{2em} \\_2 \\hspace{2em} \\_1 \\hspace{2em} \\_\\_ \\hspace{2em} \\_\\_ $$\n", 645 | "$$ \\_\\_ \\hspace{2em} \\_1 \\hspace{2em} \\_3 \\hspace{2em} \\_6 \\hspace{2em} \\_7 \\hspace{2em} \\_6 \\hspace{2em} \\_3 \\hspace{2em} \\_1 \\hspace{2em} \\_\\_ $$\n", 646 | "$$ \\_1 \\hspace{2em} \\_4 \\hspace{2em} 10 \\hspace{2em} 16 \\hspace{2em} 19 \\hspace{2em} 16 \\hspace{2em} 10 \\hspace{2em} \\_4 \\hspace{2em} \\_1 $$" 647 | ] 648 | }, 649 | { 650 | "cell_type": "markdown", 651 | "metadata": {}, 652 | "source": [ 653 | "*Solution*:\n", 654 | "\n", 655 | "The sums of the rows in the triangle above are 1, 3, 9, 27, and 81. This suggests that sum of the values in the $i$th row is $3^{i-1}$. 
Let's prove it.\n", 656 | "\n", 657 | "First note that the value in position $A[i,j]$ where $i>1$ and $j \\in \\mathbb{Z}$ is \n", 658 | "\n", 659 | "$$A[i,j] = A[i-1,j-1] + A[i-1,j] + A[i-1,j+1] = \\sum_{k=j-1}^{j+1}A[i-1,k]$$\n", 660 | "\n", 661 | "that is, the sum of the surrounding values in the previous row.\n", 662 | "\n", 663 | "Now let's show that the sum of the elements in row $i$, $\\text{Sum}[i]$, is equal to 3 times the sum in the previous row.\n", 664 | "\n", 665 | "$$\\text{Sum}[i] = \\sum_{j=-\\infty}^{\\infty}A[i,j]$$\n", 666 | "$$= \\sum_{j=-\\infty}^{\\infty} \\bigl[ A[i-1,j-1] + A[i-1,j] + A[i-1,j+1] \\bigr]$$\n", 667 | "$$= \\sum_{j=-\\infty}^{\\infty} A[i-1,j-1] + \\sum_{j=-\\infty}^{\\infty} A[i-1,j] + \\sum_{j=-\\infty}^{\\infty} A[i-1,j+1] $$\n", 668 | "$$= \\text{Sum}[i-1] + \\text{Sum}[i-1] + \\text{Sum}[i-1]$$\n", 669 | "$$= 3\\cdot \\text{Sum}[i-1] $$\n", 670 | "\n", 671 | "Now we can use this recurrence to find a closed-form solution.\n", 672 | "\n", 673 | "Base case: $i=1$. \n", 674 | "Row $1$ consists of all $0$s except for a single $1$ in position $j=0$. Therefore $\\text{Sum}[1] = 1$.\n", 675 | "\n", 676 | "General case. 
\n", 677 | "Suppose $\\text{Sum}[i] = 3^{i-1}$ for all $i 0$ is 1 to 1 (bijective) from $\\mathbb{R} \\rightarrow \\mathbb{R}^+$ and therefore has the inverse function $\\log_n{x}$, if we can show that $a^{\\text{LHS}} = a^{\\text{RHS}}$, then $\\text{LHS} = \\text{RHS}$.\n", 754 | "\n", 755 | "Raising $a$ to both sides: \n", 756 | "$a^\\text{LHS} = a^{\\log_{a}{(xy)}} = xy$ \n", 757 | "$a^\\text{RHS} = a^{\\log_{a}{x} + \\log_{a}{y}}\n", 758 | "= a^{\\log_{a}{x}} \\cdot a^{\\log_{a}{y}} = xy$ \n", 759 | "\n", 760 | "They are equal, so $\\text{LHS} = \\text{RHS}$.\n", 761 | "\n", 762 | "**(b):** \n", 763 | "$\\log_{a}{x^y}$ \n", 764 | "$\\hspace{1em} = \\log_{a}\\left[ x\\cdot x \\cdot \\ldots \\{ \\text{y times} \\}\n", 765 | "\\ldots \\cdot x \\right]$ \n", 766 | "$\\hspace{1em} = \\log_{a}x + \\log_{a}x + \\ldots \\{\\text{y times}\\} \\ldots + \\log_{a}x$ \n", 767 | "$\\hspace{1em} = y \\log_{a}x$\n", 768 | "\n", 769 | "Alternatively, \n", 770 | "$a^\\text{LHS} = a^{\\log_{a}{x^y}} = x^y$ \n", 771 | "$a^\\text{RHS}$\n", 772 | "$= a^{y \\log_{a}{x}}$\n", 773 | "$ = \\left(a^{\\log_{a}{x}}\\right)^y$\n", 774 | "$= x^y$\n", 775 | "\n", 776 | "So $a^\\text{LHS} = a^\\text{RHS}$, and therefore $\\text{LHS} = \\text{RHS}$ (since $a^x$ is injective).\n", 777 | "\n", 778 | "**(c):** \n", 779 | "$$\\log_{a}{x} = \\frac{\\log_{b}{x}}{\\log_{b}{a}}$$\n", 780 | "Multiply both sides by $\\log_{x}{a}$:\n", 781 | "$$\\log_{a}{x} \\cdot \\log_{b}{a} = \\log_{b}{x}$$\n", 782 | "We can rewrite the LHS as: \n", 783 | "$$\\text{LHS} = \\log_{a}{x} \\cdot \\log_{b}{a} = \\log_{b}{a^{\\log_{a}{x}}} = \\log_{b}{x} = \\text{RHS}$$ \n", 784 | "So they are equal.\n", 785 | "\n", 786 | "**(d):** \n", 787 | "Raising both sides to $\\frac{1}{\\log_bx}$, on the LHS we have\n", 788 | "\n", 789 | "$$ \\text{LHS}^{\\frac{1}{\\log_bx}}\n", 790 | "= \\left(x^{\\log_by}\\right)^{\\frac{1}{\\log_bx}}\n", 791 | "= x^{\\frac{\\log_by}{\\log_bx}} = x^{\\log_xy}\n", 792 | "= y$$\n", 793 | "\n", 794 | 
"and on the RHS we have\n", 795 | "\n", 796 | "$$ \\text{RHS}^{\\frac{1}{\\log_bx}}\n", 797 | " = \\left(y^{\\log_bx}\\right)^{\\frac{1}{\\log_bx}}\n", 798 | " = y^{\\frac{\\log_bx}{\\log_bx}}\n", 799 | " = y$$\n", 800 | "\n", 801 | "So $\\text{LHS}^\\frac{1}{\\log_bx} = \\text{RHS}^\\frac{1}{\\log_bx}$. Now because $f(x) = x^n$ and $f^{-1}(x) = x^{1/n}$ are 1 to 1 inverse operations, we know that $\\text{LHS} = \\text{RHS}$ also holds." 802 | ] 803 | }, 804 | { 805 | "cell_type": "markdown", 806 | "metadata": {}, 807 | "source": [ 808 | "### 2.40 [3]\n", 809 | "\n", 810 | "Show that $\\lceil \\lg{(n+1)}\\rceil = \\lfloor \\lg{n} \\rfloor + 1$" 811 | ] 812 | }, 813 | { 814 | "cell_type": "markdown", 815 | "metadata": {}, 816 | "source": [ 817 | "*Solution:*\n", 818 | "\n", 819 | "First let's plug in some numbers.\n", 820 | "\n", 821 | "$n=1$: $\\lceil \\lg{2}\\rceil = \\lceil 1 \\rceil = 1$ and $\\lfloor \\lg{1} \\rfloor + 1 = \\lfloor0 \\rfloor + 1 = 1$. \n", 822 | "$n=2$: $\\lceil \\lg{3}\\rceil = 2$ and $\\lfloor \\lg{2} \\rfloor + 1 = 1 + 1 = 2$. \n", 823 | "$n=3$: $\\lceil \\lg{4}\\rceil = 2$ and $\\lfloor \\lg{3} \\rfloor + 1 =1 + 1 = 1$.\n", 824 | "\n", 825 | "$n \\in \\mathbb{N}^+$.\n", 826 | "\n", 827 | "There are 3 cases for $n$: $n$ is a power of 2, $n+1$ is a power of 2, or neither $n$ nor $n+1$ is a power of 2. The only overlap between these cases is $n=1$, and we showed above that the equation is true for $n=1$.\n", 828 | "\n", 829 | "**Case 1:** Suppose that $n$ is a power of 2. Then there exists a positive integer $m$ such that $n = 2^m$. 
Furthermore, we can assume that $n+1$ is not a power of 2, otherwise $n=1$ which we already showed verified the equation.\n", 830 | "\n", 831 | "$\\text{RHS} = \\lfloor \\lg 2^m\\rfloor + 1 = \\lfloor m\\rfloor + 1 = m+1$\n", 832 | "\n", 833 | "Now because for $m>1$ the difference between $2^m$ and $n^{m+1}$ is greater than 1, we know that $n+1$ lies between $2^m$ and $2^{m+1}$, and therefore $\\lg{(n+1)}$ lies between $\\lg{2^m} = m$ and $\\lg{2^{m+1}} = m+1$. We can then write\n", 834 | "\n", 835 | "$\\text{LHS} = \\lceil \\lg{(n+1)}\\rceil = m+1$.\n", 836 | "\n", 837 | "Therefore for this case the equation holds.\n", 838 | "\n", 839 | "**Case 2:** Suppose that $n+1$ is a power of two. Again, we also assume that $n>1$; the equation has already been shown to hold for $n=1$.\n", 840 | "\n", 841 | "There exists a positive integer $m$ such that $n+1 = 2^m$. Since $n\\neq 1$ and the difference between adjacent powers of 2 is greater than 1 for powers greater than 1, we know that $n$ lies between $2^{m-1}$ and $2^m$, and therefore $\\lg{n}$ lies between $m-1$ and $m$. We can then write\n", 842 | "\n", 843 | "$\\text{LHS} = \\lceil \\lg{(n+1)}\\rceil = \\lceil \\lg{2^m}\\rceil = m$\n", 844 | "\n", 845 | "$\\text{RHS} = \\lfloor \\lg n\\rfloor + 1 = (m-1) + 1 = m$\n", 846 | "\n", 847 | "Since both sides equal $m$, the equation holds in the case.\n", 848 | "\n", 849 | "**Case 3:** Suppose that $n>1$ and that neither $n$ nor $n+1$ are a power of 2.\n", 850 | "\n", 851 | "Since for $n>1$ the distance between adjacent powers of 2 is greater than 1, we know that both $n$ and $n+1$ lie between $2^m$ and $2^{m+1}$ for some $m\\geq1$, and therefore that $\\lg n$ and $\\lg{(n+1)}$ lie between $m$ and $m+1$. 
We can then write\n", 852 | "\n", 853 | "$\\text{LHS} = \\lceil \\lg{(n+1)}\\rceil = m+1$\n", 854 | "\n", 855 | "$\\text{RHS} = \\lfloor \\lg n\\rfloor + 1 = (m) + 1 = m+1$\n", 856 | "\n", 857 | "Since both sides are equal, the equation holds for this case.\n", 858 | "\n", 859 | "Because the equation holds for all cases, it holds for all $n>0$." 860 | ] 861 | }, 862 | { 863 | "cell_type": "markdown", 864 | "metadata": {}, 865 | "source": [ 866 | "---" 867 | ] 868 | }, 869 | { 870 | "cell_type": "markdown", 871 | "metadata": { 872 | "collapsed": true 873 | }, 874 | "source": [ 875 | "## Interview Problems" 876 | ] 877 | }, 878 | { 879 | "cell_type": "markdown", 880 | "metadata": {}, 881 | "source": [ 882 | "### 2.43 [5]\n", 883 | "\n", 884 | "You are given a set $S$ of $n$ numbers. You must pick a subset $S'$ of $k$ numbers from $S$ such that the probability of each element of $S$ occurring in $S'$ is equal (i.e., each is selected with probability $k\\,/\\,n$). You may make only one pass over the numbers. What if $n$ is unknown?" 885 | ] 886 | }, 887 | { 888 | "cell_type": "markdown", 889 | "metadata": {}, 890 | "source": [ 891 | "*Solution:*\n", 892 | "\n", 893 | "First we address the case where $n$ is known.\n", 894 | "\n", 895 | "The problem imposes 2 constraints: 1) that the probability of each number in $S$ appearing in $S'$ be the same, and 2) that we select exactly $k$ numbers.\n", 896 | "\n", 897 | "My first thought is to iterate over the numbers in $S$, and for each number, choose to include it in $S'$ with probability $k\\,/\\,n$. One can imagine having a weighted coin where you can set the probability of heads arbitrarily, and flipping it with $p = k \\, / \\, n$ for each element of $S$. The problem with this procedure is that there is a non-zero probability of selecting 0 elements, all $n$ of the elements, or any number in between. Actually, the probability of $m$ numbers being included in $S'$ will form a binomial distribution. 
Yes, the mode and mean (expected value) will be $k$, but we are required to select exactly $k$, not just in the average case.\n", 898 | "\n", 899 | "My second though is to generate the set of all $n$ choose $k$ subsets of size $k$ containing numbers from $1$ through $n$ and pick one subset at random. But let's say that such an algorithm is not available to us.\n", 900 | "\n", 901 | "To enforce that we select exactly $k$ numbers, we can modify the probability of selection as we go. For example, if we have already included $k$ numbers, we set $\\text{P(selection)} = 0$ for all remaining elements of $S$. Actually, we are *conditioning* the selection probabilities on the outcomes of previous selections.\n", 902 | "\n", 903 | "Let's take an example. Let $S = \\{1, 2, 3\\}$ and let $k=1$. First we consider selecting number $1$. This element needs to have a selection probability of $1\\,/\\,3$, and since there are no previous decisions to condition on, we must set $P(1) = 1 \\, / \\, 3$. Now consider number 2. We must condition its selection probability on whether or not number 1 was included. Written out, $P(2) = P(2|1)\\cdot P(1) + P(2|\\bar{1})\\cdot P(\\bar{1})$, where $\\bar{1}$ indicates that number 1 was not selected. If number 1 was selected, then $P(2|1)= 0$, since we are only allowed to select 1 number. Therefore we have $P(2) = P(2|\\bar{1})\\cdot P(\\bar{1})$, meaning the only way for the number 2 to be selected is for number 1 to be NOT selected, and then for number 2 to be selected. This probability must come out to $1\\, / \\, 3$. Plugging in $P(\\bar{1}) = 1 - P(1) = 2 \\, / \\, 3$, we have $P(2) = P(2|\\bar{1})\\cdot 2 \\, / 3 = 1 \\,/ \\, 3$. Therefore $P(2|\\bar{1}) = 1 \\, / \\, 2$. Now consider number 3. If either numbers 1 or 2 have already been included, then $P(3) = 0$, otherwise $P(3) = 1$. Does this come out to an overall selection probability of $1 \\, / \\, 3$? 
The only way number 3 can be selected is for both numbers 1 and 2 to be NOT selected. The probability of this happening is $P(3) = P(\\bar{1}) \\cdot P(\\bar{2}|\\bar{1})$ $= (1 - P(1)) \\cdot (1 - P(2|\\bar{1}))$ $= (2 \\, / 3) \\cdot (1 \\, / \\, 2)$ $= 1 \\, / \\, 3$.\n", 904 | "\n", 905 | "In conditioning selection probabilities in this way, we have enforced the 2 desired constraints.\n", 906 | "\n", 907 | "Let's take one more example before we try to generalize to arbitrary $n$ and $k$. Let $S = \\{ 1, 2, 3\\}$ and $k = 2$. Every element must have a selection probability of $P = 2 \\, / \\, 3$. To enforce this on number 1, we set $P(1) = 2 \\, / \\, 3$. Now consider number 2. The general form of its selection probability is $P(2) = P(2|1) \\cdot P(1) + P(2|\\bar{1}) \\cdot P(\\bar{1})$. Plugging in the probabilities for number 1, we have $P(2) = P(2|1) \\cdot 2 \\, / \\, 3 + P(2|\\bar{1}) \\cdot 1 \\, / \\, 3$. If number 1 was NOT selected, since there are only 2 numbers remaining and $k = 2$, we must select number 2. Therefore $P(2|\\bar{1}) = 1$, leaving us with $P(2) = P(2|1) \\cdot 2 \\, / \\, 3 + 1 \\, / \\, 3$. Since this must equal $2 \\, / \\, 3$, we must have $P(2|1) = 1 \\, /\\,2$. Now consider number 3. If both numbers 1 and 2 were selected, then $P(3) = 0$. Otherwise, $P(3) = 1$. Does this produce an overall selection probability of $P(3) = 2 \\, / \\, 3$? The only way for 3 to be NOT selected is for both numbers 1 and 2 to be selected. This will occur with probability $P(1) \\cdot P(2|1) = 2 \\, / \\, 3 \\cdot 1 \\, / \\, 2 = 1 \\, / \\, 3$. Therefore $P(\\bar{3}) = 1 \\, / \\, 3$ and $P(3) = 1 - 1 \\, / \\, 3 = 2 \\, / \\, 3$.\n", 908 | "\n", 909 | "We successfully enforced both constraints.\n", 910 | "\n", 911 | "To generalize this to arbitrary $n$ and $k$, let's look at the selection of number 2 in the previous example. 
We will write $k'$ for the number of unassignmed \"slots\" remaining in $S'$, and $n'$ for the number of numbers in $S$ which have yet to be considered. If number 1 had already been selected, then we have $k' = 1$ and $n' = 2$; we are choosing whether number 2 or 3 should fill the remaining vacancy in $S'$. We calculated in this case that $P(2|1) = 1 \\, /\\,2$. In the event that number 1 was NOT selected, we have $k'=2$ and $n'= 2$. So both numbers 2 and 3 must be selected to fill the 2 vacancies in $S'$. Therefore $P(2|\\bar{1}) = 1$. In both these cases, $P(2) = k' \\, / \\, n'$. Indeed, this equation holds for numbers 1 and 3 as well. Regarding number 1, $k' = k$ and $n' = n$, and so we set $P(1) = k \\, / \\, n$. Regarding number 3, since this number was last, we had $n' = 1$ and $k'$ equal to either $0$ or $1$, resulting in selection probabilities of $0 \\, / \\, 1$ or $1 \\, / \\, 1$.\n", 912 | "\n", 913 | "This suggests that as we iterate over $S$, we keep track of the remaining number of vacancies in $S'$ as $k'$, and the remaining number of elements from $S$ to be considered as $n$, and for each number in $S$, include it in $S'$ with probability $k' / n'$. The two examples above suggests that this procedure works in general.\n", 914 | "\n", 915 | "To prove that this does in fact hold, suppose that we have already considered $\\bar{n}$ numbers and we are now considering including number $\\bar{n}+1$. 
We would have to show that\n", 916 | "\n", 917 | "$$ P(\\bar{n} + 1) = \\sum_{\\bar{k} = 0}^{\\bar{n}} P(\\bar{n} + 1|\\bar{k}) \\cdot F(\\bar{k}, \\bar{n}) = \\sum_{\\bar{k} = 0}^{\\bar{n}} \\frac{k'}{n'} \\cdot F(\\bar{k}, \\bar{n}) = \\sum_{\\bar{k} = 0}^{\\bar{n}} \\frac{k - \\bar{k}}{n - \\bar{n}} \\cdot F(\\bar{k}, \\bar{n}) \\stackrel{WTS}{=} \\frac{k}{n}$$\n", 918 | "\n", 919 | "where we have set $\\bar{n} = n - n'$ (the number of numbers preceding number $\\bar{n} + 1$), $\\bar{k} = k - k'$ (how many numbers were selected from those before number $\\bar{n} + 1$), and $F(\\bar{k}, \\bar{n})$ (the probability that exactly $\\bar{k}$ numbers were selected amongst the previous $\\bar{n}$).\n", 920 | "\n", 921 | "Since amongst the first $\\bar{n}$ numbers, each one has an equal probability of $k \\, / \\, n$ of being selected, the probability that exactly $\\bar{k}$ are included is the same as the probability of obtaining exactly $\\bar{k}$ heads while flipping a coin $\\bar{n}$ times that has a probability of heads of $p = k \\, / \\, n$. This is given by $(P_{heads})^{H} \\cdot (P_{tails})^{N-H} \\cdot \\binom{N}{H})$ where $H$ is the number of heads and $N$ is the total number of flips. Plugging this in for $F(\\bar{k}, \\bar{n})$ we have:\n", 922 | "\n", 923 | "$$\n", 924 | "\\sum_{\\bar{k} = 0}^{\\bar{n}} \\frac{k - \\bar{k}}{n - \\bar{n}} \\cdot F(\\bar{k}, \\bar{n})\n", 925 | "= \\sum_{\\bar{k} = 0}^{\\bar{n}} \\frac{k - \\bar{k}}{n - \\bar{n}} \\cdot \\left(\\frac{k}{n}\\right)^{\\bar{k}} \\cdot \\left(\\frac{n-k}{n}\\right)^{\\bar{n} - \\bar{k}} \\cdot \\binom{\\bar{n}}{\\bar{k}}\n", 926 | "\\stackrel{WTS}{=} \\frac{k}{n}\n", 927 | "$$\n", 928 | "\n", 929 | "[Upon reviewing this problem, I realize that this assumes that the outcomes of coin tosses are independent, which is contrary to our practice of conditioning later selections on earlier ones. But mysteriously, it still comes out right. 
I am not yet sure why.]\n", 930 | "\n", 931 | "Rather than show that this is correct by hand, I did two things. One, I plugged this into Wolfram Alpha and indeed it holds. Below is the screenshot, where $x = \\bar{k}$ and $m = \\bar{n}$. Visually, the $n-m$ should be under the $k-x$.\n", 932 | "\n", 933 | "![Hallock_Fig_2-1.png](Figures/Hallock_Fig_2-1.png)\n", 934 | "\n", 935 | "\n", 936 | "**Second, I found another way of thinking about the problem.** The problem asks us to iterate over the numbers and select whether each number is to be included in $S'$. A *probabilistically equivalent* way of thinking about the problem is to imagine that the numbers which are included in $S'$ *are already set*, and we sequentially \"*reveal*\" whether each one is indeed included. Since we have no prior information, number 1 is included with probability $k\\,/\\,n$. Once we \"*reveal*\" whether number 1 is included, we update our information about the other $n-1$ numbers, namely that they each have an equal chance of being found amongst the $k$ or $k-1$ remaining vacancies in $S'$. This formulation makes it apparent why the probability that each number is included is $k' \\, / \\, n'$. The math above shows that these two formulations are indeed the same.\n", 937 | "\n", 938 | "**If $n$ is unknown**, we cannot guarantee both constraints, that every number has an equal chance of being included, and that exactly $k$ are included. The best we can do is pick a surrogate $n$ based on the relative importance of giving every number an equal chance of inclusion (pick a high $n$, resulting in a low probability of inclusion) versus including exactly $k$ numbers (pick a low $n$, resulting in a high probability of inclusion, but once we select $k$ we have no more room in $S'$)." 
939 | ] 940 | }, 941 | { 942 | "cell_type": "markdown", 943 | "metadata": { 944 | "collapsed": true 945 | }, 946 | "source": [ 947 | "### 2.46 [5]\n", 948 | "\n", 949 | "You have a 100-story building and a couple of marbles. You must identify the lowest floor for which a marble will break if you drop it from this floor. How fast can you find this floor if you are given an infinite supply of marbles? What if you have only two marbles?" 950 | ] 951 | }, 952 | { 953 | "cell_type": "markdown", 954 | "metadata": {}, 955 | "source": [ 956 | "*Solution:*\n", 957 | "\n", 958 | "If we have an infinite number of marbles, we can do a binary search. Drop a marble from floor 50, and if it breaks, try floor 25. If it doesn't, try floor 75. And repeat. This will take $\\lceil \\lg 100 \\rceil = 7$ marbles. However, if a marble doesn't break, we can reuse it. On average, half of the drops will break the marble, and half won't. Therefore, the number of marbles is $\\lceil 7 \\, / \\, 2\\rceil = 4$. When the lowest floor is identified that breaks the marbles, if this was discovered with a reused marble, then there is a chance that the marble only broke because it was weakened from a previous drop. Therefore a new marble should be dropped from this floor to make sure that new marbles will break.\n", 959 | "\n", 960 | "If we only have 2 marbles, then we can leverage the fact that we can reuse marbles. We drop the first marble from floor 1, then floor 2, then floor 3, and so on until we find the first floor for which it breaks. Then we drop the second marble from that floor to be sure that it can break new marbles. If the new marble doesn't break, we keep dropping it from higher floors until we find the one that breaks the newer marble. 
If floor $n$ is the lowest breakable floor, this procedure will take $n + 1$ drops including the double check, 51 drops on average.\n", 961 | "\n", 962 | "If we assume that marbles do not become more brittle when they survive drops, which may be a fair assumption, then we can drop the first marble from just the odd-numbered floors, that is, from floor 1, then 3, then 5... If floor $n$ is the floor that breaks the marble, then either floor $n$ or $n - 1$ is the lowest such floor. Drop marble 2 from floor $n-1$ to find out which. This procedure will take $\\lceil (n+1) \\, / \\, 2 \\rceil + 1$ drops, 27 drops on average. For example, if $n = 9$, then we drop from floors 1, 3, 5, 7, 9, and then floor 8. This is a total of 6, which is equal to $\\lceil (9+1) \\, / \\, 2 \\rceil + 1 = 5 + 1 = 6$. If $n = 10$, then we drop from floors 1, 3, 5, 7, 9, 11, and then floor 10. This is a total of 7, which is equal to $\\lceil (10+1) \\, / \\, 2 \\rceil + 1 = 6 + 1 = 7$." 963 | ] 964 | }, 965 | { 966 | "cell_type": "code", 967 | "execution_count": null, 968 | "metadata": { 969 | "collapsed": true 970 | }, 971 | "outputs": [], 972 | "source": [] 973 | } 974 | ], 975 | "metadata": { 976 | "kernelspec": { 977 | "display_name": "Python 3", 978 | "language": "python", 979 | "name": "python3" 980 | }, 981 | "language_info": { 982 | "codemirror_mode": { 983 | "name": "ipython", 984 | "version": 3 985 | }, 986 | "file_extension": ".py", 987 | "mimetype": "text/x-python", 988 | "name": "python", 989 | "nbconvert_exporter": "python", 990 | "pygments_lexer": "ipython3", 991 | "version": "3.5.1" 992 | } 993 | }, 994 | "nbformat": 4, 995 | "nbformat_minor": 0 996 | } 997 | -------------------------------------------------------------------------------- /Chapter 3 - Data Structures.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Chapter 3: Data Structures 
(Completed 7/29: 24%)" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Stacks, Queues, and Lists" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "### 3.1 [3]\n", 22 | "\n", 23 | "A common problem for compilers and text editors is determining whether the\n", 24 | "parentheses in a string are balanced and properly nested. For example, the string ((())())() contains properly nested pairs of parentheses, which the strings )()( and ()) do not. Give an algorithm that returns true if a string contains properly nested and balanced parentheses, and false if otherwise. For full credit, identify the position of the first offending parenthesis if the string is not properly nested and balanced." 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "metadata": {}, 30 | "source": [ 31 | "*Solution:*\n", 32 | "\n", 33 | "The algorithm will look through a possibly long string $s$, and each left parenthesis \"(\" will be pushed onto a stack, and each right parenthesis \")\" will trigger a \"pop\" from the stack. 
If the stack is empty during an attempted \"pop\", then there is an extra \")\", and if there is an extra \"(\" on the stack after the end of the string $s$ has been reached, then there was an extra \"(\".\n", 34 | "\n", 35 | "$\\hspace{2em} function \\text{ parentheses_check}(s):$ \n", 36 | "$\\hspace{4em} i = -1$ \n", 37 | "$\\hspace{4em} stack = stack$ \n", 38 | "$\\hspace{4em} \\text{for } c \\text{ in } s:$ \n", 39 | "$\\hspace{6em} i = i + 1$ \n", 40 | "$\\hspace{6em} \\text{if } c \\neq \\text{'('} \\text{ and } c \\neq \\text{')'}:$ \n", 41 | "$\\hspace{8em} continue$ \n", 42 | "$\\hspace{6em} \\text{if } c = \\text{'('}:$ \n", 43 | "$\\hspace{8em} stack.push(c)$ \n", 44 | "$\\hspace{6em} else:$ \n", 45 | "$\\hspace{8em} try:$ \n", 46 | "$\\hspace{10em} stack.pop()$ \n", 47 | "$\\hspace{8em} except \\text{ StackEmpty}:$ \n", 48 | "$\\hspace{10em} return\\ i$ \n", 49 | "$\\hspace{4em} return \\text{ True}$ \n" 50 | ] 51 | }, 52 | { 53 | "cell_type": "markdown", 54 | "metadata": {}, 55 | "source": [ 56 | "### 3.2 [3]\n", 57 | "\n", 58 | "Write a program to reverse the direction of a given singly-linked list. In other words, after the reversal all pointers should now point backwards. Your algorithm should take linear time." 
59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "metadata": {}, 64 | "source": [ 65 | "*Solution:*\n", 66 | "\n", 67 | "$\\hspace{2em} function \\text{ reverse}(node1, node2):$ \n", 68 | "$\\hspace{4em} p = \\text{target of node2's \"next\" pointer}$ \n", 69 | "$\\hspace{4em} \\text{Change node2's \"next\" pointer to point to node1}$ \n", 70 | "$\\hspace{4em} \\text{If p == NULL}:$ \n", 71 | "$\\hspace{6em} \\text{Return pointer to node2}$ \n", 72 | "$\\hspace{4em} \\text{Else:}$ \n", 73 | "$\\hspace{6em} \\text{Return reverse}(node2, p)$\n", 74 | "\n", 75 | "\n" 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "metadata": {}, 81 | "source": [ 82 | "---" 83 | ] 84 | }, 85 | { 86 | "cell_type": "markdown", 87 | "metadata": {}, 88 | "source": [ 89 | "## Trees and Other Dictionary Structures" 90 | ] 91 | }, 92 | { 93 | "cell_type": "markdown", 94 | "metadata": {}, 95 | "source": [ 96 | "### 3.4 [3]\n", 97 | "\n", 98 | "Design a dictionary data structure in which search, insertion, and deletion can all be processed in $O(1)$ time in the worst case. You may assume the set elements are integers drawn from a finite set $1, 2, \\ldots , n$, and initialization can take $O(n)$ time." 99 | ] 100 | }, 101 | { 102 | "cell_type": "markdown", 103 | "metadata": {}, 104 | "source": [ 105 | "*Solution:*\n", 106 | "\n", 107 | "A hash table of length $n$ that hashs the integer $i$ to value $i$ should do the trick. Searching is constant time, since the number $i$ is in position $i$.\n", 108 | "\n", 109 | "Upon insertion is when you would have to worry about collisions. The only way two elements could collide upon hashing in this case is if they are the same number. If such duplicates are allowed (i.e. 
the problem allows for the same number to be added more than once), then each position in the array should contain a stack, with the duplicates pushed onto the stack upon insertion.\n", 110 | "\n", 111 | "Upon deletion, the number $i$ should be popped off of the stack in position $i$.\n", 112 | "\n", 113 | "If the problem does not allow for the same number to be added twice, then a static array of size $n$ with the above hash function would work." 114 | ] 115 | }, 116 | { 117 | "cell_type": "markdown", 118 | "metadata": {}, 119 | "source": [ 120 | "### 3.5 [3]\n", 121 | "\n", 122 | "Find the overhead fraction (the ratio of data space over total space) for each of the following binary tree implementations on $n$ nodes:\n", 123 | "\n", 124 | "1. All nodes store data, two child pointers, and a parent pointer. The data field requires four bytes and each pointer requires four bytes.\n", 125 | "2. Only leaf nodes store data; internal nodes store two child pointers. The data field requires four bytes and each pointer requires two bytes." 126 | ] 127 | }, 128 | { 129 | "cell_type": "markdown", 130 | "metadata": {}, 131 | "source": [ 132 | "*Solution:*\n", 133 | "\n", 134 | "1.\n", 135 | "Each node has 1 data item and 3 pointers for a total of 16 bytes. 4 of those bytes are for data, so the overhead fraction is $\\frac{4}{16} = \\frac{1}{4}$.\n", 136 | "\n", 137 | "2.\n", 138 | "Note here that both leaf and internal nodes need 4 bytes. So the total memory storage needed is 4n.\n", 139 | "\n", 140 | "The amount of data storage depends on the number of leaf nodes, and differently shaped trees will have different amounts of leaf nodes. If we know the number of leaf nodes $l$, then the overhead is $\\frac{l}{n}$.\n", 141 | "\n", 142 | "In the case that we don't know the number of leaf nodes, for simplicity, we'll assume that $n=2^m - 1$ for some integer $m$ and the tree is perfectly balanced. In this case, there will be $2^{m-1}$ leaf nodes where $m =\\lg{(n+1)}$. 
This can be simplified: $2^{\\lg{(n+1)}-1}=\\frac{n+1}{2}$. For example, if $n=15$, then $m=4$, and there are 1, 2, 4, and 8 nodes at each depth, and therefore $2^3 = \\frac{16}{2}= 8$ nodes in the final row.\n", 143 | "\n", 144 | "The overhead in this case is then given by\n", 145 | "\n", 146 | "$\\frac{4l}{4n}\n", 147 | "= \\frac{l}{n}\n", 148 | "= \\frac{n+1}{2n}\n", 149 | "\\approx \\frac{1}{2}$\n", 150 | "\n" 151 | ] 152 | }, 153 | { 154 | "cell_type": "markdown", 155 | "metadata": {}, 156 | "source": [ 157 | "---" 158 | ] 159 | }, 160 | { 161 | "cell_type": "markdown", 162 | "metadata": {}, 163 | "source": [ 164 | "## Applications of Tree Structures" 165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "metadata": {}, 170 | "source": [ 171 | "### 3.12 [5]\n", 172 | "\n", 173 | "Suppose you are given an input set $S$ of $n$ numbers, and a black box that if given any sequence of real numbers and an integer $k$ instantly and correctly answers whether there is a subset of input sequence whose sum is exactly $k$. Show how to use the black box $O(n)$ times to find a subset of $S$ that adds up to $k$." 174 | ] 175 | }, 176 | { 177 | "cell_type": "markdown", 178 | "metadata": {}, 179 | "source": [ 180 | "*Solution:*\n", 181 | "\n", 182 | "$\\hspace{2em} \\text{If black_box}(S) == \\text{False}$ \n", 183 | "$\\hspace{4em} \\text{Return False}$ \n", 184 | "$\\hspace{2em} \\text{For } i \\text{ from } 0 \\text{ to } n-1, do:$ \n", 185 | "$\\hspace{4em} s = S[i]$ \n", 186 | "$\\hspace{4em} \\text{Delete } S[i]$ \n", 187 | "$\\hspace{4em} \\text{If black_box}(S) == \\text{True}$ \n", 188 | "$\\hspace{6em} \\text{Continue}$ \n", 189 | "$\\hspace{4em} \\text{Else}$ \n", 190 | "$\\hspace{6em} S[i] = s$ \n", 191 | "$\\hspace{2em} \\text{Return }S$ \n", 192 | "\n", 193 | "After checking that a subset does exist that adds up to $k$, sequentially delete items from $S$ and test if a valid (meaning its elements add up to $k$) subset still exists. 
If it does, move on to the next iteration. Otherwise undo the deletion.\n", 194 | "\n", 195 | "After the last iteration, there can be only one remaining valid subset. Elements that do not belong to every possible remaining valid subset would have been deleted, leaving those other sets as still valid subsets that add up to $k$. Only elements that appear in all the remaining valid subsets would be retained. But if all the remaining valid subsets have the same elements, then they are all the same, i.e. there is only 1 such subset remaining in $S$. Since elements that do not appear in *any* valid subset would have been deleted as well, the returned set $S$ is itself a valid subset; its elements add up to $k$.\n", 196 | "\n", 197 | "The black box gets used once for every element of the list. So it was used $O(n)$ times, specifically, $n$ times." 198 | ] 199 | }, 200 | { 201 | "cell_type": "markdown", 202 | "metadata": {}, 203 | "source": [ 204 | "---" 205 | ] 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "metadata": {}, 210 | "source": [ 211 | "## Implementation Projects" 212 | ] 213 | }, 214 | { 215 | "cell_type": "markdown", 216 | "metadata": {}, 217 | "source": [ 218 | "---" 219 | ] 220 | }, 221 | { 222 | "cell_type": "markdown", 223 | "metadata": {}, 224 | "source": [ 225 | "## Interview Problems" 226 | ] 227 | }, 228 | { 229 | "cell_type": "markdown", 230 | "metadata": {}, 231 | "source": [ 232 | "### 3.18 [3]\n", 233 | "\n", 234 | "What method would you use to look up a word in a dictionary?" 235 | ] 236 | }, 237 | { 238 | "cell_type": "markdown", 239 | "metadata": {}, 240 | "source": [ 241 | "*Solution:*\n", 242 | "\n", 243 | "One method would be binary search. 
I would open to roughly the middle page of the dictionary, and if the word is not on that page, I would repeat the process on the lower or upper portion of the dictionary depending on whether the word is alphabetically earlier or later than the words on that page.\n", 244 | "\n", 245 | "However, if I knew roughly where each letter started in the dictionary, say if it had tabs, then I would isolate that portion of the dictionary corresponding to the first letter of my word, and then perform binary search on that section." 246 | ] 247 | }, 248 | { 249 | "cell_type": "markdown", 250 | "metadata": {}, 251 | "source": [ 252 | "### 3.19 [3]\n", 253 | "\n", 254 | "Imagine you have a closet full of shirts. What can you do to organize your shirts for easy retrieval?" 255 | ] 256 | }, 257 | { 258 | "cell_type": "markdown", 259 | "metadata": {}, 260 | "source": [ 261 | "*Solution*:\n", 262 | "\n", 263 | "It depends on how I pick my shirt in the morning. If I want to wear all my shirts and I don't care about the particular order, then each time I do laundry I'll put the clean shirts on the left, and each morning wear the shirt furthest to the right.\n", 264 | "\n", 265 | "Another general option is to sort the shirts. I usually identify them most by color, so I could put them in some order based on color." 
266 | ] 267 | } 268 | ], 269 | "metadata": { 270 | "kernelspec": { 271 | "display_name": "Python 3", 272 | "language": "python", 273 | "name": "python3" 274 | }, 275 | "language_info": { 276 | "codemirror_mode": { 277 | "name": "ipython", 278 | "version": 3 279 | }, 280 | "file_extension": ".py", 281 | "mimetype": "text/x-python", 282 | "name": "python", 283 | "nbconvert_exporter": "python", 284 | "pygments_lexer": "ipython3", 285 | "version": "3.5.1" 286 | } 287 | }, 288 | "nbformat": 4, 289 | "nbformat_minor": 0 290 | } 291 | -------------------------------------------------------------------------------- /Chapter 5 - Graph Traversal.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Chapter 5: Graph Traversal (Completed 10/32: 31%)" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "### Notes:\n", 15 | "\n", 16 | "Note that BFS and DFS differ regarding when a node is marked as \"discovered\", when the parent relation is encoded, and therefore when nodes/edges are added to the search tree. This is partly due to the fact that DFS does not use a stack, even though it is occasionally helpful to think of it this way, in contrast to the queue used in BFS; DFS is implemented recursively.\n", 17 | "\n", 18 | "In BFS, all of a node's adjacent undiscovered vertices are immediately marked as \"discovered\", marked as a child of the current node (added with Tree edges to the tree), and put on the queue. In BFS, nodes are added to the tree while the *parent* is being processed.\n", 19 | "\n", 20 | "In DFS however, when processing a node, at first all but one of its nodes are *completely ignored*; they aren't put on the stack because there is no stack; no parent relation is recorded and nothing is marked as \"discovered\". 
Instead, the single node chosen to be first is recorded as a child (adding it with a Tree edge to the tree), then DFS is called on this child node, and then it is marked as \"discovered\". In DFS, nodes are added to the tree, essentially, when they themselves are being processed, when they have been \"entered into\". Thinking of a stack gives the false impression that when a node is being processed, all of its edges are looked at up front, before diving into the next node.\n", 21 | "\n", 22 | "This incorrect thinking was confusing me for a bit: if a path from the first child can lead back to another child of the same parent (think of a triangle with 3 edges), the other child will still be undiscovered; it is then marked as a descendant of the other child and a Back edge is created back to the original parent. If you imagine a stack being used, you may think both nodes were at first made as children of the parent, leading to a Cross edge between them. This cannot be correct: on undirected graphs, DFS labels every edge as either a Tree edge or a Back edge; no Cross edges." 23 | ] 24 | }, 25 | { 26 | "cell_type": "markdown", 27 | "metadata": {}, 28 | "source": [ 29 | "---" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "## Simulating Graph Algorithms" 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "metadata": {}, 42 | "source": [ 43 | "### 5.1 [3]\n", 44 | "\n", 45 | "For the following graphs $G_1$ (left) and $G_2$ (right):\n", 46 | "\n", 47 | "(a): Report the order of the vertices encountered on a breadth-first search starting from vertex $A$. Break all ties by picking the vertices in alphabetical order (i.e., $A$ before $Z$). \n", 48 | "(b): Report the order of the vertices encountered on a depth-first search starting from vertex $A$. 
Break all ties by picking the vertices in alphabetical order (i.e., $A$ before $Z$).\n", 49 | "\n", 50 | "![Skiena Fig 5-1](Figures/Skiena_Fig_5-1.png)" 51 | ] 52 | }, 53 | { 54 | "cell_type": "markdown", 55 | "metadata": {}, 56 | "source": [ 57 | "*Solution:* \n", 58 | "\n", 59 | "**(a):**\n", 60 | "\n", 61 | "On left: $A\n", 62 | "\\rightarrow B\n", 63 | "\\rightarrow D\n", 64 | "\\rightarrow I\n", 65 | "\\rightarrow C\n", 66 | "\\rightarrow E\n", 67 | "\\rightarrow G\n", 68 | "\\rightarrow J\n", 69 | "\\rightarrow F\n", 70 | "\\rightarrow H$\n", 71 | "\n", 72 | "On right: $A\n", 73 | "\\rightarrow B\n", 74 | "\\rightarrow E\n", 75 | "\\rightarrow C\n", 76 | "\\rightarrow F\n", 77 | "\\rightarrow I\n", 78 | "\\rightarrow D\n", 79 | "\\rightarrow G\n", 80 | "\\rightarrow J\n", 81 | "\\rightarrow M\n", 82 | "\\rightarrow H\n", 83 | "\\rightarrow K\n", 84 | "\\rightarrow N\n", 85 | "\\rightarrow L\n", 86 | "\\rightarrow O\n", 87 | "\\rightarrow P$\n", 88 | "\n", 89 | "**(b):**\n", 90 | "\n", 91 | "On left: $A\n", 92 | "\\rightarrow B\n", 93 | "\\rightarrow C\n", 94 | "\\rightarrow E\n", 95 | "\\rightarrow D\n", 96 | "\\rightarrow G\n", 97 | "\\rightarrow H\n", 98 | "\\rightarrow F\n", 99 | "\\rightarrow J\n", 100 | "\\rightarrow I$\n", 101 | "\n", 102 | "On right: $A\n", 103 | "\\rightarrow B\n", 104 | "\\rightarrow C\n", 105 | "\\rightarrow D\n", 106 | "\\rightarrow H\n", 107 | "\\rightarrow G\n", 108 | "\\rightarrow F\n", 109 | "\\rightarrow E\n", 110 | "\\rightarrow I\n", 111 | "\\rightarrow J\n", 112 | "\\rightarrow K\n", 113 | "\\rightarrow L\n", 114 | "\\rightarrow P\n", 115 | "\\rightarrow O\n", 116 | "\\rightarrow N\n", 117 | "\\rightarrow M$" 118 | ] 119 | }, 120 | { 121 | "cell_type": "markdown", 122 | "metadata": {}, 123 | "source": [ 124 | "### 5.2 [3]\n", 125 | "\n", 126 | "Do a topological sort of the following graph $G$:\n", 127 | "\n", 128 | "![Skiena Fig 5-2](Figures/Skiena_Fig_5-2.png)\n", 129 | "\n", 130 | "Errata: \"this graph is not acyclic. 
Reverse the edge $(F,H)$.\"" 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": {}, 136 | "source": [ 137 | "*Solution:*\n", 138 | "\n", 139 | "To find a topological sort, we perform a Depth-First Search and reverse the order the vertices were labeled \"processed\".\n", 140 | "\n", 141 | "A DFS starting at vertex $A$ will pass through the nodes in the following order:\n", 142 | "\n", 143 | "$A\n", 144 | "\\rightarrow B\n", 145 | "\\rightarrow C\n", 146 | "\\rightarrow F\n", 147 | "\\rightarrow D\n", 148 | "\\rightarrow E\n", 149 | "\\rightarrow G\n", 150 | "\\rightarrow I\n", 151 | "\\rightarrow J\n", 152 | "\\Rightarrow H$\n", 153 | "\n", 154 | "where \"$\\Rightarrow$\" indicates we had to initialize a new component, since with the correction there is no way of reaching vertex $H$.\n", 155 | "\n", 156 | "What order were they labeled \"processed\"? Vertices are labeled \"processed\" when the search exits from them (backtracks from them). In the traversal order above, they are labeled \"processed\" in the following order:\n", 157 | "\n", 158 | "$F\n", 159 | "\\rightarrow C\n", 160 | "\\rightarrow J\n", 161 | "\\rightarrow I\n", 162 | "\\rightarrow G\n", 163 | "\\rightarrow E\n", 164 | "\\rightarrow D\n", 165 | "\\rightarrow B\n", 166 | "\\rightarrow A\n", 167 | "\\rightarrow H$\n", 168 | "\n", 169 | "Reversing the direction of the arrows produces a topological sort:\n", 170 | "\n", 171 | "$H\n", 172 | "\\rightarrow A\n", 173 | "\\rightarrow B\n", 174 | "\\rightarrow D\n", 175 | "\\rightarrow E\n", 176 | "\\rightarrow G\n", 177 | "\\rightarrow I\n", 178 | "\\rightarrow J\n", 179 | "\\rightarrow C\n", 180 | "\\rightarrow F$\n", 181 | "\n", 182 | "One can check that for every edge $(x,y)$ in the graph, vertex $x$ appears before vertex $y$ in the given topological sort." 
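The procedure used above (run DFS, then reverse the order in which vertices are labeled "processed") can be sketched in Python. The edge list below is a small made-up DAG, not the graph from the figure:

```python
def topological_sort(vertices, edges):
    """DFS from every unvisited vertex; a vertex is labeled 'processed' when
    the search backtracks out of it. The reverse of that order is a
    topological sort."""
    adj = {v: [] for v in vertices}
    for x, y in edges:
        adj[x].append(y)
    visited, processed = set(), []

    def dfs(v):
        visited.add(v)
        for w in adj[v]:
            if w not in visited:
                dfs(w)
        processed.append(v)  # all of v's descendants are finished

    for v in vertices:  # restart for each component, as with vertex H above
        if v not in visited:
            dfs(v)
    return processed[::-1]

edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
order = topological_sort("ABCD", edges)
print(order)  # ['A', 'C', 'B', 'D']
```

As in the worked example, one can check that every edge $(x,y)$ has $x$ before $y$ in the returned order.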
183 | ] 184 | }, 185 | { 186 | "cell_type": "markdown", 187 | "metadata": {}, 188 | "source": [ 189 | "---" 190 | ] 191 | }, 192 | { 193 | "cell_type": "markdown", 194 | "metadata": {}, 195 | "source": [ 196 | "## Traversal" 197 | ] 198 | }, 199 | { 200 | "cell_type": "markdown", 201 | "metadata": {}, 202 | "source": [ 203 | "### 5.3 [3]\n", 204 | "\n", 205 | "Prove by induction that there is a unique path between any pair of vertices in a tree." 206 | ] 207 | }, 208 | { 209 | "cell_type": "markdown", 210 | "metadata": {}, 211 | "source": [ 212 | "*Solution:*\n", 213 | "\n", 214 | "Let $K$ be the number of levels in the tree.\n", 215 | "\n", 216 | "**Base case: $K = 1$.**\n", 217 | "\n", 218 | "If there is only one level in the tree, there can only be one node in the tree, otherwise the tree would be disconnected. This follows from the fact that all edges connect parents to children, and so all edges connect one level to another. If there is only one level, there can't be any edges. This case is trivially true since there are no pairs of vertices in a tree with only one node.\n", 219 | "\n", 220 | "**Base case: $K = 2$.**\n", 221 | "\n", 222 | "In this case there is a single parent, and the rest of the nodes are children connected directly to the parent. For a given pair of nodes $(x,y)$, if one of them is the parent, then there is a single edge connecting them. An alternate route between them would have to enter/exit the parent from another child, and therefore would require there to be edges directly between children, which is not allowed. Therefore the edge between child and parent is a unique path between them. If neither $x$ nor $y$ is the parent, then they are both children. Since edges directly between children are not allowed, any path between them must pass through the parent. 
Because the path from child to parent is unique, the path from one child, to the parent, to a different child is also unique.\n", 223 | "\n", 224 | "Therefore there is a unique path between any pair of vertices in a tree with only $2$ levels.\n", 225 | "\n", 226 | "**Inductive step: trees with $K$ levels.**\n", 227 | "\n", 228 | "Suppose there is a unique path between any pair of vertices in a tree with fewer than $K$ levels.\n", 229 | "\n", 230 | "A tree with $K$ levels can be viewed as multiple subtrees with at most $K-1$ levels with their roots connected directly to the overall root.\n", 231 | "\n", 232 | "For any pair of vertices $(x,y)$, if they are both in the same subtree, then by our induction hypothesis there is a unique path between them.\n", 233 | "\n", 234 | "If $x$ is the overall root, let $p$ be the root of the subtree containing $y$. $p$ must be connected directly to $x$, and the edge between them is a unique path from $x$ to $p$; otherwise there would have to be edges between subtrees, which is not possible since all edges are edges between parents and children. Put another way, the root of a subtree provides the only (that is, unique) access point to that subtree. Now there is also a unique path from $p$ to $y$ by our induction hypothesis, since they are both in a tree with at most $K-1$ levels. Therefore there is a unique path from $x$ to $p$ to $y$, which is from $x$ to $y$. \n", 235 | "\n", 236 | "If $x$ and $y$ are in different subtrees, any path between them must pass through the overall root $p$, otherwise there would have to be an edge directly connecting subtrees. As argued above, the path from $x$ to the overall root $p$ is unique, and so is the path between $y$ and $p$. Therefore connecting them into a path from $x$ to $p$ to $y$ provides a unique path from $x$ to $y$.\n", 237 | "\n", 238 | "Therefore there is a unique path between any pair of vertices in any tree with an arbitrary number of levels." 
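The claim can be sanity-checked on a small example by brute-force enumeration of simple paths. The 3-level tree below is hypothetical, and the check is an illustration, not a substitute for the proof:

```python
from itertools import combinations

def count_simple_paths(adj, s, t, seen=frozenset()):
    """Count the simple paths (no repeated vertices) from s to t."""
    if s == t:
        return 1
    seen = seen | {s}
    return sum(count_simple_paths(adj, w, t, seen)
               for w in adj[s] if w not in seen)

# A hypothetical 3-level tree: root 0 with children 1 and 2; node 2 has children 3 and 4.
adj = {0: [1, 2], 1: [0], 2: [0, 3, 4], 3: [2], 4: [2]}
assert all(count_simple_paths(adj, s, t) == 1
           for s, t in combinations(adj, 2))
print("every pair of vertices is joined by exactly one path")
```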
239 | ] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "metadata": {}, 244 | "source": [ 245 | "### 5.4 [3]\n", 246 | "\n", 247 | "Prove that in a breadth-first search on an undirected graph $G$, every edge is either a tree edge or a cross edge, where $x$ is neither an ancestor nor descendant of $y$, in cross edge $(x, y)$." 248 | ] 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "metadata": {}, 253 | "source": [ 254 | "*Solution:*\n", 255 | "\n", 256 | "BFS and DFS classify all edges $(x,y)$ into 4 possible classes:\n", 257 | "\n", 258 | "- **Tree edge**: $y$ was discovered or \"entered into\" directly from $x$; $x$ is $y$'s parent in the traversal tree\n", 259 | "- **Forward edge**: $y$ was already a descendant of $x$ when edge $(x,y)$ was processed\n", 260 | "- **Back edge**: $y$ was an ancestor of $x$ when edge $(x,y)$ was processed\n", 261 | "- **Cross edge**: $y$ was already discovered or \"entered into\" when edge $(x,y)$ was processed, but is not an ancestor or descendant of $x$\n", 262 | "\n", 263 | "On undirected graphs, DFS can only produce Tree and Back edges, as described on page 178. Here we wish to show that BFS can only produce Tree and Cross edges. The simplest example that shows this differing behavior is computing the traversal tree for a simple triangle with 3 nodes and 3 edges. DFS produces 2 Tree edges and 1 Back edge, while BFS produces 2 Tree edges and 1 Cross edge.\n", 264 | "\n", 265 | "**No Forward Edges:** Suppose $(x,z)$ is a potential Forward edge, meaning $z$ is already a descendant of $x$. We assume that this graph is \"simple\", meaning there are no multi-edges. Then there must be an intermediate node $y$ such that $y$ is a descendant of $x$ and $z$ is a descendant of $y$. But in BFS, the edges of node $x$ would have been completely processed before those of $y$, and $z$ would have been made a direct child of $x$ rather than a descendant of $y$. 
The edge $(x,z)$ would therefore be a Tree edge, and could not be a Forward edge.\n", 266 | "\n", 267 | "**No Back Edges:** Similarly, suppose edge $(z,x)$ is being processed from node $z$, and $z$ is already a descendant of $x$, suggesting this should be a Back edge. Since this is BFS, all of $x$'s edges would have been processed before those of any of its descendants. Therefore this edge would have been processed while processing node $x$, not node $z$, making it a Tree edge." 268 | ] 269 | }, 270 | { 271 | "cell_type": "markdown", 272 | "metadata": {}, 273 | "source": [ 274 | "### 5.5 [3]\n", 275 | "\n", 276 | "Give a linear algorithm to compute the chromatic number of graphs where each vertex has degree at most $2$. Must such graphs be bipartite?" 277 | ] 278 | }, 279 | { 280 | "cell_type": "markdown", 281 | "metadata": {}, 282 | "source": [ 283 | "*Solution:*\n", 284 | "\n", 285 | "**Chromatic number:** the smallest number of colors sufficient to vertex-color a graph, meaning no adjacent nodes are the same color.\n", 286 | "\n", 287 | "**Bipartite graph:** A graph whose nodes can be divided into two groups such that every edge in the graph connects a node in one group with a node in the other; or equivalently, no edge in the graph contains 2 nodes in the same group. Or equivalently, the chromatic number of the graph is at most $2$.\n", 288 | "\n", 289 | "In our graph $G$, every node is of degree $0$, $1$, or $2$. The nodes of degree $0$ can be colored any color without conflict, since they have no adjacent nodes. Therefore assume that no node has degree $0$. In this case, the graph is either a line, with 2 nodes of degree $1$, or a circle, with no nodes of degree $1$. The rest of the nodes are of degree $2$. The graph could also contain multiple components, with some of either type. But in this case we could just color the components separately; the coloring of one component cannot affect the coloring of the others. 
Therefore we assume our graph is connected, and must be either a circle or a line.\n", 290 | "\n", 291 | "The restriction to nodes of degree at most $2$ makes this problem easy, because when entering a node from a parent, there is at most one way out. The complexity of graph problems often arises from the multiplicity of possible paths through the graph. But in this case, once a starting node and direction are chosen, there is a unique path.\n", 292 | "\n", 293 | "Before we develop an algorithm, we can work out which cases give rise to which chromatic numbers. If $G$ is a line, we could traverse it from one end to the other, coloring the nodes alternating colors. Therefore a line has chromatic number $2$.\n", 294 | "\n", 295 | "If $G$ is a circle, we can color one vertex one color, say, blue, then move around the circle alternating colors between blue and another color, say, red. However, we might end up back at our starting vertex with 2 adjacent blues. This will happen if the circle has an odd number of vertices. (For example, look at a triangle with 3 edges.) In this case we will need a third color to color the last vertex.\n", 296 | "\n", 297 | "To summarize, lines have chromatic number 2, circles with an even number of vertices have chromatic number 2, and circles with an odd number of vertices have chromatic number 3. 
A line can be differentiated from a circle by the presence of 2 nodes with degree $1$.\n", 298 | "\n", 299 | "Therefore we have the following algorithm:\n", 300 | "\n", 301 | "$\hspace{2em} \text{For each component:}$ \n", 302 | "$\hspace{4em} \text{Pick a random starting node}$ \n", 303 | "$\hspace{4em} \text{If the node has degree 0:}$ \n", 304 | "$\hspace{6em} \text{The chromatic number of that component is 1}$ \n", 305 | "$\hspace{6em} \text{Continue to next component}$ \n", 306 | "$\hspace{4em} \text{Traverse the component, counting the number of vertices}$ \n", 307 | "$\hspace{4em} \text{If a node of degree 1 is reached:}$ \n", 308 | "$\hspace{6em} \text{The chromatic number of that component is 2}$ \n", 309 | "$\hspace{4em} \text{If the starting node is reached and there was an even number of nodes:}$ \n", 310 | "$\hspace{6em} \text{The chromatic number of that component is 2}$ \n", 311 | "$\hspace{4em} \text{If the starting node is reached and there was an odd number of nodes:}$ \n", 312 | "$\hspace{6em} \text{The chromatic number of that component is 3}$ \n", 313 | "$\hspace{6em} \text{Return 3}$ \n", 314 | "$\hspace{2em} \text{Return the maximum chromatic number over all components}$ \n", 315 | "\n", 316 | "In this algorithm, each node is touched at most once, less if a node of degree 1 is reached and the starting vertex wasn't the other end of the line. Each node is processed in constant time (check degree number, increment number of nodes, and change chromatic number if necessary). Therefore this is a linear-time algorithm.\n", 317 | "\n", 318 | "Another possible algorithm would be to actually color the vertices with alternating colors until they are all colored or until 2 adjacent nodes would be colored the same color, in which case short-circuit and return 3. As for the second question: no, such graphs need not be bipartite, since an odd circle requires 3 colors, and a bipartite graph is exactly one colorable with at most 2." 
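A Python sketch of the algorithm above, assuming the graph is given as adjacency lists with every vertex of degree at most 2:

```python
def chromatic_number_deg2(adj):
    """adj: dict vertex -> list of neighbors, every vertex of degree <= 2.
    Each component is an isolated vertex (1 color), a line (2 colors),
    an even circle (2 colors), or an odd circle (3 colors)."""
    seen, best = set(), 0
    for v in adj:
        if v in seen:
            continue
        # collect v's component with a simple traversal (each vertex touched once)
        stack, comp = [v], []
        seen.add(v)
        while stack:
            u = stack.pop()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        if len(comp) == 1:
            best = max(best, 1)                      # isolated vertex
        elif any(len(adj[u]) == 1 for u in comp):
            best = max(best, 2)                      # a line
        else:
            best = max(best, 2 if len(comp) % 2 == 0 else 3)  # a circle
    return best

triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}  # odd circle
line = {1: [2], 2: [1, 3], 3: [2]}            # a line
print(chromatic_number_deg2(triangle), chromatic_number_deg2(line))  # 3 2
```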
319 | ] 320 | }, 321 | { 322 | "cell_type": "markdown", 323 | "metadata": {}, 324 | "source": [ 325 | "### 5.9 [3]\n", 326 | "\n", 327 | "Suppose an arithmetic expression is given as a tree. Each leaf is an integer and each internal node is one of the standard arithmetical operations $(+,−, *, /)$. For example, the expression $2 + 3 * 4 + (3 * 4)/5$ is represented by the tree in Figure 5.17(a). Give an $O(n)$ algorithm for evaluating such an expression, where there are $n$ nodes in the tree.\n", 328 | "\n", 329 | "![Skiena Fig 5-1](Figures/Skiena_Fig_5-5.png)" 330 | ] 331 | }, 332 | { 333 | "cell_type": "markdown", 334 | "metadata": {}, 335 | "source": [ 336 | "*Solution:*\n", 337 | "\n", 338 | "Note that if a node's value is an operator, it must have exactly $2$ children; if its value is a number, then it must have exactly $0$ children. The algorithm below is recursive and would be called on the root of the tree. The convention is that if node $p$ has 2 children, they are labeled $p1$ and $p2$.\n", 339 | "\n", 340 | "$\\hspace{2em} \\text{Solve(node p):}$ \n", 341 | "$\\hspace{4em} \\text{If Value is a number:}$ \n", 342 | "$\\hspace{6em} \\text{Return Value}$ \n", 343 | "$\\hspace{4em} \\text{If Value is an operator:}$ \n", 344 | "$\\hspace{6em} \\text{A = Solve(node p1)}$ \n", 345 | "$\\hspace{6em} \\text{B = Solve(node p2)}$ \n", 346 | "$\\hspace{6em} \\text{Return (A operator B)}$ \n", 347 | "\n", 348 | "In this algorithm, each node is processed exactly once. Other than recursive calls and passing back return values, the only computations are the binary operations computing \"$\\text{(A operator B)}$\" for the nodes that contain operators. Therefore this algorithm will be linear in the number of nodes." 
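The recursive Solve procedure translates directly into Python if we represent an internal node as a tuple (operator, left child, right child) and a leaf as a bare number:

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def evaluate(node):
    """Evaluate an expression tree: a leaf is a number, an internal node
    is a tuple (op, left, right)."""
    if not isinstance(node, tuple):
        return node                                  # leaf: return its value
    op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))  # A operator B

# The tree for 2 + 3*4 + (3*4)/5 from Figure 5.17(a):
tree = ("+", ("+", 2, ("*", 3, 4)), ("/", ("*", 3, 4), 5))
print(evaluate(tree))  # 2 + 12 + 12/5
```

Each node is visited exactly once and does constant work, so the run time is linear in the number of nodes, as argued above.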
349 | ] 350 | }, 351 | { 352 | "cell_type": "markdown", 353 | "metadata": {}, 354 | "source": [ 355 | "---" 356 | ] 357 | }, 358 | { 359 | "cell_type": "markdown", 360 | "metadata": {}, 361 | "source": [ 362 | "## Algorithm Design" 363 | ] 364 | }, 365 | { 366 | "cell_type": "markdown", 367 | "metadata": {}, 368 | "source": [ 369 | "### 5.14 [3]\n", 370 | "\n", 371 | "A *vertex cover* of a graph $G = (V,E)$ is a subset of vertices $V' \subseteq V$ such that every edge in $E$ contains at least one vertex from $V'$. Delete all the leaves from any depth-first search tree of $G$. Must the remaining vertices form a vertex cover of $G$? Give a proof or a counterexample." 372 | ] 373 | }, 374 | { 375 | "cell_type": "markdown", 376 | "metadata": {}, 377 | "source": [ 378 | "*Solution:*\n", 379 | "\n", 380 | "Upon deleting all leaf nodes from a Depth-First Search tree, only edges between these deleted nodes are at risk of not being covered. Therefore the question reduces to whether or not there can be edges between leaf nodes. Edges between leaf nodes are not Tree Edges, since they do not define parent/child relations nor express the order of discovery during traversal. Nor are they Forward or Back Edges, which must point to ancestors or descendants. Therefore they are Cross Edges. These edges can only be present in directed graphs. **Therefore if $G$ is undirected, deleting the leaf nodes will result in a vertex cover.**\n", 381 | "\n", 382 | "Suppose $G$ is directed. Can there be cross edges between leaf nodes? Yes. A counterexample can be found in the book (Figure 5.14, weird since this is problem 5.14):\n", 383 | "\n", 384 | "![Skiena Fig 5-2](Figures/Skiena_Fig_5-4.png)\n", 385 | "\n", 386 | "The \"Cross Edges\" case on the right contains an edge between leaf nodes. A graph that has that search tree can be produced from the tree itself by making all the Tree Edges be directed downward. 
This shows that if the graph $G$ is directed, then deleting the leaf nodes from *any* DFS tree does not always produce a vertex cover." 387 | ] 388 | }, 389 | { 390 | "cell_type": "markdown", 391 | "metadata": {}, 392 | "source": [ 393 | "---" 394 | ] 395 | }, 396 | { 397 | "cell_type": "markdown", 398 | "metadata": {}, 399 | "source": [ 400 | "## Directed Graphs" 401 | ] 402 | }, 403 | { 404 | "cell_type": "markdown", 405 | "metadata": {}, 406 | "source": [ 407 | "### 5.24 [3]\n", 408 | "\n", 409 | "Adding a single directed edge to a directed graph can reduce the number of weakly connected components, but by at most how many components? What about the number of strongly connected components?" 410 | ] 411 | }, 412 | { 413 | "cell_type": "markdown", 414 | "metadata": {}, 415 | "source": [ 416 | "*Solution:*\n", 417 | "\n", 418 | "**Weakly connected components:** disjoint subsets of vertices such that *if all directed edges were replaced with undirected edges*, for any pair of vertices $(x,y)$ in the component, you could travel between $x$ and $y$ in either direction without leaving the component.\n", 419 | "\n", 420 | "Adding the edge $(x,y)$ connects all nodes previously connected with node $x$ with those connected with node $y$. If they were already connected, there is no change in the number of weakly connected components. If not, then 2 components are now 1, a reduction of 1.\n", 421 | "\n", 422 | "**Strongly connected components:** disjoint subsets of vertices such that for any pair of vertices $(x,y)$ in the component, you can travel between $x$ and $y$ in either direction without leaving the component; directed cycles are the smallest non-trivial strongly connected components.\n", 423 | "\n", 424 | "When addressing *strongly* connected components, it is important to note that the number of components ranges from $V$ to $1$, where $V$ is the number of vertices in the graph. 
Initially, each vertex is its own component, and as directed cycles are identified, the nodes in the cycle are merged into the same component. Therefore it is never the case that a graph has NO strongly connected components, but rather we say there are $V$ such components. In support of this convention is the fact that as the number of components decreases, the graph is increasingly connected. Therefore the lowest such possible number of connected components should refer to a maximally strongly connected graph. In keeping with this pattern, the minimally strongly connected graph has $V$ components and the maximally connected graph has $1$ component.\n", 425 | "\n", 426 | "Suppose we have a circle of $V$ vertices with edges along the perimeter all pointing in the same direction. This is a single strongly connected graph, since any vertex can be reached from any other simply by traversing around the circle. But suppose we remove one edge, essentially creating a directed line of vertices. Now for any pair of vertices $(x,y)$, either $x$ can be reached from $y$ or vice versa, but never both, since you can only travel in one direction and you never loop back. Therefore in this case there are $V$ strongly connected components (or 0 in the intuitive but erroneous convention described above). Inserting the removed edge back in reduces the number of components from $V$ back down to 1, a reduction of $V-1$. Since $V$ is arbitrary, we see that adding a single directed edge can reduce the number of strongly connected components by *any* number greater than or equal to 0." 
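The circle example can be checked in code. For clarity, this sketch computes strongly connected components straight from the mutual-reachability definition (quadratic work, not the linear-time SCC algorithms):

```python
def reachable(adj, s):
    """Set of vertices reachable from s by directed edges (including s)."""
    seen, stack = {s}, [s]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def num_sccs(vertices, edges):
    """Two vertices share an SCC iff each can reach the other."""
    adj = {v: [] for v in vertices}
    for x, y in edges:
        adj[x].append(y)
    reach = {v: reachable(adj, v) for v in vertices}
    sccs = {frozenset(w for w in vertices if v in reach[w] and w in reach[v])
            for v in vertices}
    return len(sccs)

V = 6
path = [(i, i + 1) for i in range(V - 1)]       # directed line: V components
print(num_sccs(range(V), path))                 # 6
print(num_sccs(range(V), path + [(V - 1, 0)]))  # closing the circle: 1 component
```

Adding the single edge $(V-1, 0)$ drops the count from $V$ to $1$, a reduction of $V-1$, matching the argument above.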
427 | ] 428 | }, 429 | { 430 | "cell_type": "markdown", 431 | "metadata": {}, 432 | "source": [ 433 | "---" 434 | ] 435 | }, 436 | { 437 | "cell_type": "markdown", 438 | "metadata": {}, 439 | "source": [ 440 | "## Articulation Vertices" 441 | ] 442 | }, 443 | { 444 | "cell_type": "markdown", 445 | "metadata": {}, 446 | "source": [ 447 | "### 5.30 [3]\n", 448 | "\n", 449 | "Suppose $G$ is a connected undirected graph. An edge $e$ whose removal disconnects the graph is called a *bridge*. Must every bridge $e$ be an edge in a depth-first search tree of $G$? Give a proof or a counterexample." 450 | ] 451 | }, 452 | { 453 | "cell_type": "markdown", 454 | "metadata": {}, 455 | "source": [ 456 | "*Solution:*\n", 457 | "\n", 458 | "DFS labels edges as either Tree or Back, with only Tree edges being explicitly included in the search tree. Therefore an equivalent question is whether or not a bridge can be a Back edge.\n", 459 | "\n", 460 | "If $(y,x)$ is a Back edge, this means that $y$ was already a descendant of $x$, which in turn means there exists an alternate route from $x$ to $y$. Therefore deleting edge $(y,x)$ does not disconnect $x$ from $y$. So the edge cannot be a bridge.\n", 461 | "\n", 462 | "We have just shown that BACK implies NOT BRIDGE. This is logically equivalent to BRIDGE implies NOT BACK. And since DFS on an undirected graph classifies every edge as either a Tree edge or a Back edge, this means that BRIDGE implies TREE. In other words, yes, every bridge $e$ must be an edge in a depth-first search tree of $G$." 
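The claim can be sanity-checked by brute force on a small hypothetical graph: find the bridges by deleting each edge and testing connectivity, run a DFS, and confirm every bridge shows up as a tree edge (an illustration, not a proof):

```python
def build(vertices, edges):
    """Undirected adjacency sets."""
    adj = {v: set() for v in vertices}
    for x, y in edges:
        adj[x].add(y)
        adj[y].add(x)
    return adj

def connected(adj, vertices):
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == set(vertices)

def bridges(vertices, edges):
    """Brute force: an edge is a bridge iff removing it disconnects the graph."""
    return {e for e in edges
            if not connected(build(vertices, [f for f in edges if f != e]), vertices)}

def dfs_tree_edges(adj, root):
    tree, seen = set(), {root}
    def dfs(v):
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                tree.add(frozenset((v, w)))  # (v, w) is a Tree edge
                dfs(w)
    dfs(root)
    return tree

# A triangle 1-2-3 plus a pendant edge (3,4); only (3,4) is a bridge.
vertices, edges = [1, 2, 3, 4], [(1, 2), (2, 3), (1, 3), (3, 4)]
tree = dfs_tree_edges(build(vertices, edges), 1)
assert all(frozenset(b) in tree for b in bridges(vertices, edges))
print("every bridge is a DFS tree edge")
```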
463 | ] 464 | }, 465 | { 466 | "cell_type": "markdown", 467 | "metadata": {}, 468 | "source": [ 469 | "---" 470 | ] 471 | }, 472 | { 473 | "cell_type": "markdown", 474 | "metadata": {}, 475 | "source": [ 476 | "## Interview Problems" 477 | ] 478 | }, 479 | { 480 | "cell_type": "markdown", 481 | "metadata": { 482 | "collapsed": true 483 | }, 484 | "source": [ 485 | "### 5.31 [3]\n", 486 | "\n", 487 | "Which data structures are used in depth-first and breadth-first search?" 488 | ] 489 | }, 490 | { 491 | "cell_type": "markdown", 492 | "metadata": {}, 493 | "source": [ 494 | "*Solution:*\n", 495 | "\n", 496 | "**Adjacency lists or matrices** to represent the graph.\n", 497 | "\n", 498 | "**Arrays** to record which nodes have been discovered, which have been processed, and the entry/exit times in the case of depth-first search, and to record the parent relationships for the search tree.\n", 499 | "\n", 500 | "A **queue** in the case of breadth-first search to organize the traversal order.\n", 501 | "\n", 502 | "DFS can be implemented with a **stack**, but instead should be implemented recursively." 
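A minimal sketch of both traversals showing these data structures in play (adjacency lists for the graph, a deque as the BFS queue, sets/dicts for discovered and parent, recursion in place of an explicit stack for DFS):

```python
from collections import deque

adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}  # adjacency lists

def bfs(start):
    parent, discovered = {}, {start}
    queue = deque([start])              # the BFS queue
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in discovered:
                discovered.add(w)       # discovered while the *parent* is processed
                parent[w] = v
                queue.append(w)
    return order

def dfs(v, discovered=None, order=None):
    discovered = discovered if discovered is not None else set()
    order = order if order is not None else []
    discovered.add(v)                   # recursion replaces an explicit stack
    order.append(v)
    for w in adj[v]:
        if w not in discovered:
            dfs(w, discovered, order)
    return order

print(bfs(1))  # [1, 2, 3, 4]
print(dfs(1))  # [1, 2, 4, 3]
```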
503 | ] 504 | } 505 | ], 506 | "metadata": { 507 | "kernelspec": { 508 | "display_name": "Python 3", 509 | "language": "python", 510 | "name": "python3" 511 | }, 512 | "language_info": { 513 | "codemirror_mode": { 514 | "name": "ipython", 515 | "version": 3 516 | }, 517 | "file_extension": ".py", 518 | "mimetype": "text/x-python", 519 | "name": "python", 520 | "nbconvert_exporter": "python", 521 | "pygments_lexer": "ipython3", 522 | "version": "3.5.1" 523 | } 524 | }, 525 | "nbformat": 4, 526 | "nbformat_minor": 0 527 | } 528 | -------------------------------------------------------------------------------- /Chapter 6 - Weighted Graph Algorithms.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Chapter 6: Weighted Graph Algorithms (Completed 10/25: 40%)" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Simulating Graph Algorithms" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "### 6.1 [3]\n", 22 | "\n", 23 | "For the graphs in Problem 5-1:\n", 24 | "\n", 25 | "(a): Draw the spanning forest after every iteration of the main loop in Kruskal’s algorithm. \n", 26 | "(b): Draw the spanning forest after every iteration of the main loop in Prim’s algorithm. \n", 27 | "(c): Find the shortest path spanning tree rooted in $A$. \n", 28 | "(d): Compute the maximum flow from $A$ to $H$. 
\n", 29 | "\n", 30 | "![Skiena Fig 5-1](Figures/Skiena_Fig_5-1.png)" 31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "*Solution:*\n", 38 | "\n", 39 | "**Graph 1**:\n", 40 | "\n", 41 | "First we put the edges in sorted order:\n", 42 | "\n", 43 | "$\\text{(B,E): 1}$ \n", 44 | "$\\text{(I,J): 1}$ \n", 45 | "$\\text{(C,E): 2}$ \n", 46 | "$\\text{(C,F): 2}$ \n", 47 | "$\\text{(G,I): 2}$ \n", 48 | "$\\text{(G,J): 2}$ \n", 49 | "$\\text{(B,C): 3}$ \n", 50 | "$\\text{(B,D): 3}$ \n", 51 | "$\\text{(G,H): 3}$ \n", 52 | "$\\text{(A,D): 4}$ \n", 53 | "$\\text{(D,E): 4}$ \n", 54 | "$\\text{(H,J): 4}$ \n", 55 | "$\\text{(A,B): 6}$ \n", 56 | "$\\text{(D,G): 6}$ \n", 57 | "$\\text{(E,G): 6}$ \n", 58 | "$\\text{(E,H): 7}$ \n", 59 | "$\\text{(E,F): 8}$ \n", 60 | "$\\text{(A,I): 9}$ \n", 61 | "$\\text{(F,H): 11}$ \n", 62 | "\n", 63 | "**(a):** Kruskal's\n", 64 | "\n", 65 | "In Kruskal's algorithm, we sequentially add the next least-weight edge, unless there is already a path between the two vertices in the current tree/forest.\n", 66 | "\n", 67 | "![Skiena Fig 5-1](Figures/Hallock_Fig_6-1.jpg)\n", 68 | "\n", 69 | "**(b):** Prim's\n", 70 | "\n", 71 | "In Prim's algorithm, we add the next least-weight edge that connects a non-tree node with the tree. Unlike in Kruskal's algorithm, there is only a single tree at all times.\n", 72 | "\n", 73 | "![Skiena Fig 5-1](Figures/Hallock_Fig_6-2.jpg)\n", 74 | "\n", 75 | "**(c):** Shortest Path from $A$\n", 76 | "\n", 77 | "![Skiena Fig 5-1](Figures/Hallock_Fig_6-3.jpg)\n", 78 | "\n", 79 | "**(d):** Maximum Flow from $A$ to $H$\n", 80 | "\n", 81 | "In the graphs below I successively found the shortest path from $A$ to $H$, then computed the Residual Flow Graph by subtracting the lowest edge-weight along that path from all the edges in the path. The maximum flow was 13. Note that this is equal to the weight of the minimum $A-H$ cut. 
Specifically the minimum $A-H$ cut consists of deleting edges $(A,B), (A,D), (G,I),$ and $(G,J)$, which have a total weight of 13 and which are exactly the edges with weight zero (missing) in the final Residual Flow Graph.\n", 82 | "\n", 83 | "![Skiena Fig 5-1](Figures/Hallock_Fig_6-4.jpg)\n", 84 | "![Skiena Fig 5-1](Figures/Hallock_Fig_6-5.jpg)" 85 | ] 86 | }, 87 | { 88 | "cell_type": "markdown", 89 | "metadata": {}, 90 | "source": [ 91 | "**Graph 2**:\n", 92 | "\n", 93 | "Again, first we put the edges in sorted order:\n", 94 | "\n", 95 | "$\\text{(A,B): 1}$ \n", 96 | "$\\text{(C,G): 1}$ \n", 97 | "$\\text{(M,N): 1}$ \n", 98 | "$\\text{(A,E): 2}$ \n", 99 | "$\\text{(I,M): 3}$ \n", 100 | "$\\text{(B,F): 4}$ \n", 101 | "$\\text{(D,H): 4}$ \n", 102 | "$\\text{(J,K): 4}$ \n", 103 | "$\\text{(E,I): 5}$ \n", 104 | "$\\text{(H,L): 5}$ \n", 105 | "$\\text{(F,J): 6}$ \n", 106 | "$\\text{(O,P): 8}$ \n", 107 | "$\\text{(J,N): 9}$ \n", 108 | "$\\text{(L,P): 9}$ \n", 109 | "$\\text{(N,O): 9}$ \n", 110 | "$\\text{(E,F): 10}$ \n", 111 | "$\\text{(G,K): 11}$ \n", 112 | "$\\text{(I,J): 11}$ \n", 113 | "$\\text{(K,O): 12}$ \n", 114 | "$\\text{(F,G): 13}$ \n", 115 | "$\\text{(B,C): 18}$ \n", 116 | "$\\text{(C,D): 20}$ \n", 117 | "$\\text{(G,H): 22}$ \n", 118 | "$\\text{(K,L): 23}$ \n", 119 | "\n", 120 | "**(a):** Kruskal's\n", 121 | "\n", 122 | "![Skiena Fig 5-1](Figures/Hallock_Fig_6-6.jpg)\n", 123 | "\n", 124 | "**(b):** Prim's\n", 125 | "\n", 126 | "![Skiena Fig 5-1](Figures/Hallock_Fig_6-7.jpg)\n", 127 | "\n", 128 | "**(c):** Shortest Path from $A$\n", 129 | "\n", 130 | "Here the edges represent the *cumulative* cost of reaching the vertex.\n", 131 | "\n", 132 | "![Skiena Fig 5-1](Figures/Hallock_Fig_6-8.jpg)\n", 133 | "\n", 134 | "**(d):** Maximum Flow from $A$ to $H$\n", 135 | "\n", 136 | "The maximum flow in this case was only 3. Note once again that this is equal to the weight of the minimum $A-H$ cut. 
Specifically, the minimum $A-H$ cut consists of deleting edges $(A,B)$ and $(A,E)$, which have a total weight of 3 and which are exactly the edges with weight zero (missing) in the final Residual Flow Graph.\n", 137 | "\n", 138 | "![Skiena Fig 5-1](Figures/Hallock_Fig_6-9.jpg)" 139 | ] 140 | }, 141 | { 142 | "cell_type": "markdown", 143 | "metadata": {}, 144 | "source": [ 145 | "---" 146 | ] 147 | }, 148 | { 149 | "cell_type": "markdown", 150 | "metadata": {}, 151 | "source": [ 152 | "## Minimum Spanning Trees" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "metadata": {}, 158 | "source": [ 159 | "### 6.2 [3]\n", 160 | "\n", 161 | "Is the path between two vertices in a minimum spanning tree necessarily a shortest path between the two vertices in the full graph? Give a proof or a counterexample." 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": {}, 167 | "source": [ 168 | "*Solution:*\n", 169 | "\n", 170 | "Definitely not. Otherwise, what was the point of Dijkstra's algorithm, if we could just use Prim's or Kruskal's to find the minimum spanning tree?\n", 171 | "\n", 172 | "As a counterexample, think of a triangle with 3 edges, each with weight 1. Any minimum spanning tree will only have 2 of these edges. Pick the 2 vertices with degree 1 in the tree. In the tree, the shortest path between them passes through the 3rd vertex and has weight 2, while in the full graph there is a direct edge of weight 1 connecting them." 173 | ] 174 | }, 175 | { 176 | "cell_type": "markdown", 177 | "metadata": {}, 178 | "source": [ 179 | "### 6.3 [3]\n", 180 | "\n", 181 | "Assume that all edges in the graph have distinct edge weights (i.e., no pair of edges have the same weight). Is the path between a pair of vertices in a minimum spanning tree necessarily a shortest path between the two vertices in the full graph? Give a proof or a counterexample." 
182 | ] 183 | }, 184 | { 185 | "cell_type": "markdown", 186 | "metadata": {}, 187 | "source": [ 188 | "*Solution:*\n", 189 | "\n", 190 | "No.\n", 191 | "\n", 192 | "As a counterexample, think of a triangle with 3 edges with weights 2, 3, and 4. The minimum spanning tree will only have the edges of weight 2 and 3. The path between the two leaves must travel through the root, and therefore has weight $2 + 3 = 5$. However, in the full graph, there is a direct edge between them with weight $4$. Therefore the MST-path was not the shortest path on the full graph." 193 | ] 194 | }, 195 | { 196 | "cell_type": "markdown", 197 | "metadata": {}, 198 | "source": [ 199 | "### 6.4 [3]\n", 200 | "\n", 201 | "Can Prim’s and Kruskal’s algorithms yield different minimum spanning trees? Explain why or why not." 202 | ] 203 | }, 204 | { 205 | "cell_type": "markdown", 206 | "metadata": {}, 207 | "source": [ 208 | "*Solution:*\n", 209 | "\n", 210 | "They both yield correct results, so this could only occur if there is more than one correct minimum spanning tree.\n", 211 | "\n", 212 | "I can see 3 potential ways that 2 minimum spanning trees can differ, without changing the total weight of the tree. I'll briefly discuss each one and why I think each can or cannot occur between MSTs produced by Prim's and Kruskal's algorithms. They are ordered by decreasing triviality.\n", 213 | " \n", 214 | "**1) Different root, but otherwise same nodes and edges.** This difference can occur because Prim's algorithm allows you to select the starting point, which becomes the root of the MST.\n", 215 | "\n", 216 | "**2) Equal-weight edges are swapped.** Here it depends on the implementation. For example, a triangle with 3 weight-1 edges will engender an MST with one of the edges missing. Prim's algorithm seems to be deterministic once a starting point is selected, but Kruskal's algorithm, as given in the textbook, uses quicksort to sort the edges based on weight.
The random nature of quicksort can give rise to different orderings among the 3 edges since they are of equal weight. Therefore the 2 edges selected to be in the MST may differ on repeated runs of Kruskal's algorithm, only one of which can match Prim's. If a stable sort like merge sort was used instead, so that equal-weight edges were placed in order based on their array index, this may lead to a deterministic result that may always match Prim's.\n", 217 | "\n", 218 | "**3) Different edges with different weights, but the sum is still the same.** For example, one algorithm might include edges of weights 2 and 4, while the other might include edges of weights 3 and 3. Both add to 6, but clearly they are different edges with different weights. I do not believe this situation can occur because 1) both algorithms are greedy, and so would choose to include the edge of weight 2 as soon as possible. 2) I have not been able to construct an example where the selection of an edge meaningfully forces the algorithms to make different decisions later. And 3) choosing the edge with weight 3 doesn't pay off until after the other edge of weight 3 is included; therefore at this moment, the two algorithms may have included the same nodes but have different total weights. If we were to delete the non-tree nodes at this point, only one of the algorithms could be correct. You may object that the incorrect algorithm was making its decision based on the presence of the rest of the nodes, and that deleting them \"tricked\" the algorithm. But this isn't true because both algorithms are greedy, and are not capable of that sort of planning ahead." 219 | ] 220 | }, 221 | { 222 | "cell_type": "markdown", 223 | "metadata": {}, 224 | "source": [ 225 | "### 6.5 [3]\n", 226 | "\n", 227 | "Does either Prim’s or Kruskal’s algorithm work if there are negative edge weights? Explain why or why not." 
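The tie-breaking behavior described in case (2) can be checked directly. Below is a small sketch (my own, not the book's C implementation): a bare-bones Kruskal's using Python's stable sort, run twice on a triangle of three weight-1 edges with the input order reversed. The chosen edge sets differ, but the total weights agree.

```python
# Sketch: Kruskal's with a stable sort, so equal-weight edges keep their
# input order. Two tie-break orders can yield different (equally valid) MSTs.

def find(parent, x):
    # Walk up to the root of x's component (no path compression, for brevity).
    while parent[x] != x:
        x = parent[x]
    return x

def kruskal(n, edges):
    # edges: list of (weight, u, v) tuples on vertices 0..n-1
    parent = list(range(n))
    tree = []
    for w, u, v in sorted(edges, key=lambda e: e[0]):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:               # only add edges joining distinct components
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

triangle = [(1, 0, 1), (1, 1, 2), (1, 0, 2)]
t1 = kruskal(3, triangle)
t2 = kruskal(3, triangle[::-1])    # same edges, opposite tie-break order
print(t1, t2)
print(sum(w for w, _, _ in t1) == sum(w for w, _, _ in t2))  # True
```

The two runs pick different pairs of edges out of the three, yet both spanning trees have total weight 2, consistent with the discussion above.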
228 | ] 229 | }, 230 | { 231 | "cell_type": "markdown", 232 | "metadata": {}, 233 | "source": [ 234 | "*Solution:*\n", 235 | "\n", 236 | "They both work. Both algorithms check if the 2 vertices in a candidate edge are already connected. This ensures that both algorithms will add only $n-1$ edges to span a graph with $n$ vertices, even if adding additional edges would lower the weight. At each stage, both algorithms seek to add the edge with the lowest weight; it doesn't matter if the number happens to be negative. Note that when dealing with negative weights, non-edges should have weight infinity, not zero.\n", 237 | "\n", 238 | "Problems where you care about cumulative weights across paths are likely to have more difficulty with negative weights. Intuitively, this might have to do with the fact that the number of vertices in different paths can vary; for example, a 3-edge path might be of lower weight than a 2-edge path between the same pair of vertices. In the MST problem, spanning trees must have $n-1$ edges, so this is not an issue here.\n", 239 | "\n", 240 | "This problem will likely be related to question 6.7, since in this question you could simply add a large constant value to every edge weight to ensure they are all positive, find the minimum spanning tree, and then subtract off the value." 241 | ] 242 | }, 243 | { 244 | "cell_type": "markdown", 245 | "metadata": {}, 246 | "source": [ 247 | "### 6.7 [5]\n", 248 | "\n", 249 | "**(a):** Let $T$ be a minimum spanning tree of a weighted graph $G$. Construct a new graph $G'$ by adding a weight of $k$ to every edge of $G$. Do the edges of $T$ form a minimum spanning tree of $G'$? Prove the statement or give a counterexample.\n", 250 | "\n", 251 | "**(b):** Let $P = \\{s, . . . , t\\}$ describe a shortest weighted path between vertices $s$ and $t$\n", 252 | "of a weighted graph $G$. Construct a new graph $G'$ by adding a weight of $k$ to every edge of $G$.
Does $P$ describe a shortest path from $s$ to $t$ in $G'$? Prove the statement or give a counterexample." 253 | ] 254 | }, 255 | { 256 | "cell_type": "markdown", 257 | "metadata": {}, 258 | "source": [ 259 | "*Solution:*\n", 260 | "\n", 261 | "**(a):**\n", 262 | "\n", 263 | "Yes. We'll prove this by contradiction.\n", 264 | "\n", 265 | "Let $G'$ be a new graph constructed by adding a weight of $k$ to every edge of $G$. Let $T_1$ be a minimum spanning tree of $G$, and suppose for contradiction that the edges of $T_1$ do not form a MST of $G'$; then there is some spanning tree $T_2$ with strictly smaller weight on $G'$. Let $w(T)$ and $w'(T)$ be the weight of tree $T$ on graphs $G$ and $G'$, respectively.\n", 266 | "\n", 267 | "We know that $w(T_1) \\leq w(T_2)$, since $T_1$ is a MST of $G$. By our supposition, $w'(T_2) < w'(T_1)$.\n", 268 | "\n", 269 | "To show that these two expressions contradict one another, we need to relate $w$ to $w'$.\n", 270 | "\n", 271 | "Constructing $G'$ involved adding $k$ to every edge in $G$. Therefore the weight of a tree will increase by $k$ for each edge it contains. Since spanning trees of a graph with $n$ vertices must contain $n-1$ edges, we have $w'(T_1) = w(T_1) + k(n-1)$ and $w'(T_2) = w(T_2) + k(n-1)$. Importantly, both increase by the same amount.\n", 272 | "\n", 273 | "We can then rewrite our second inequality $w'(T_2) < w'(T_1)$ as $w(T_2) + k(n-1) < w(T_1) + k(n-1)$. Removing the common term on both sides yields $w(T_2) < w(T_1)$. But this directly contradicts our first inequality $w(T_1) \\leq w(T_2)$.\n", 274 | "\n", 275 | "In other words, if the weight of $T_2$ were less than that of $T_1$ on $G'$, then this would also be true on $G$, since moving between graphs changes their weights by the same amount, because they contain the same number of edges.
But this contradicts the assumption that $T_1$ was a MST of $G$.\n", 276 | "\n", 277 | "Therefore the weight of $T_2$ cannot be less than that of $T_1$ on $G'$, and so $T_1$ must be a minimum spanning tree of $G'$ as well as of $G$.\n", 278 | "\n", 279 | "**(b):**\n", 280 | "\n", 281 | "No. This is because different paths between $s$ and $t$ can contain different numbers of edges, and therefore their total weights will change by different amounts when converting between $G$ and $G'$.\n", 282 | "\n", 283 | "As a simple counterexample, imagine a triangle with edges of weights 1, 1, and 3. The shortest path between the two vertices contained in the weight-3 edge is to go around through the other vertex. This path will have weight 2, less than the direct path of weight 3. However, if all edge weights are increased by 2, becoming 3, 3, and 5, then the shortest route between the same two vertices is the direct edge of weight 5, rather than going around for a total weight of 6. The shortest weighted path between these two vertices is different on graphs $G$ and $G'$.\n" 284 | ] 285 | }, 286 | { 287 | "cell_type": "markdown", 288 | "metadata": {}, 289 | "source": [ 290 | "---" 291 | ] 292 | }, 293 | { 294 | "cell_type": "markdown", 295 | "metadata": {}, 296 | "source": [ 297 | "## Union-Find" 298 | ] 299 | }, 300 | { 301 | "cell_type": "markdown", 302 | "metadata": {}, 303 | "source": [ 304 | "---" 305 | ] 306 | }, 307 | { 308 | "cell_type": "markdown", 309 | "metadata": {}, 310 | "source": [ 311 | "## Shortest Paths" 312 | ] 313 | }, 314 | { 315 | "cell_type": "markdown", 316 | "metadata": {}, 317 | "source": [ 318 | "### 6.14 [3]\n", 319 | "\n", 320 | "The *single-destination shortest path* problem for a directed graph seeks the shortest path *from* every vertex to a specified vertex $v$. Give an efficient algorithm to solve the single-destination shortest paths problem."
321 | ] 322 | }, 323 | { 324 | "cell_type": "markdown", 325 | "metadata": {}, 326 | "source": [ 327 | "*Solution:*\n", 328 | "\n", 329 | "Dijkstra's algorithm gives the shortest path **from** a vertex $v$ **to** every other vertex. If we flipped the direction of every edge and then used Dijkstra's algorithm, the resulting shortest paths **from** $v$ would correspond to shortest paths **to** $v$ in the original graph.\n", 330 | "\n", 331 | "Since Dijkstra's algorithm is efficient ($O(m + n \\log n)$ with a priority queue implementation), we just need an efficient way to flip the direction of every edge. It was implied that this could be done in linear time on unweighted graphs on page 181, in the context of determining if a directed graph was strongly connected. We need to iterate over every edge $(x,y)$, adding the reverse edge $(y,x)$ each time to the new graph. Using adjacency lists, iterating over all edges would be $O(m)$." 332 | ] 333 | }, 334 | { 335 | "cell_type": "markdown", 336 | "metadata": {}, 337 | "source": [ 338 | "### 6.15 [3]\n", 339 | "\n", 340 | "Let $G = (V,E)$ be an undirected weighted graph, and let $T$ be the shortest-path spanning tree rooted at a vertex $v$. Suppose now that all the edge weights in $G$ are increased by a constant number $k$. Is $T$ still the shortest-path spanning tree from $v$?" 341 | ] 342 | }, 343 | { 344 | "cell_type": "markdown", 345 | "metadata": {}, 346 | "source": [ 347 | "*Solution:*\n", 348 | "\n", 349 | "No. The reason is that the amount by which the weight of a path increases is proportional to the number of edges it contains. Therefore the weights of different paths can increase by different amounts.\n", 350 | "\n", 351 | "As a counterexample, imagine a triangle with two edges of weight 1 and one edge of weight 3. Let $u$ and $v$ be the two vertices incident on this weightier edge.
The shortest-path spanning tree rooted at $v$ will be composed of a single branch going around the perimeter, containing the two edges of weight 1. The path to vertex $u$ will have weight 2, since 2 weight-1 edges were traversed.\n", 352 | "\n", 353 | "Now add a weight of 2 to every edge, making 2 edges with weight 3 and 1 edge with weight 5. The path from $v$ to $u$ on the original shortest-path tree now has weight 6, since 2 weight-3 edges were traversed. However, the direct edge between $u$ and $v$ now has weight 5, and is therefore now the shorter path. On this new graph, the shortest-path spanning tree rooted at $v$ will have 2 branches, each of length 1, of weights 3 and 5.\n", 354 | "\n", 355 | "Therefore adding a constant weight to every edge can alter the shortest-path spanning tree." 356 | ] 357 | }, 358 | { 359 | "cell_type": "markdown", 360 | "metadata": {}, 361 | "source": [ 362 | "### 6.16 [3]\n", 363 | "\n", 364 | "Answer all of the following:\n", 365 | "\n", 366 | "**(a):** Give an example of a weighted connected graph $G = (V,E)$ and a vertex $v$, such that the minimum spanning tree of $G$ is the same as the shortest-path spanning tree rooted at $v$.\n", 367 | "\n", 368 | "**(b):** Give an example of a weighted connected directed graph $G = (V,E)$ and a vertex $v$, such that the minimum-cost spanning tree of $G$ is very different from the shortest-path spanning tree rooted at $v$.\n", 369 | "\n", 370 | "**(c):** Can the two trees be completely disjointed?" 371 | ] 372 | }, 373 | { 374 | "cell_type": "markdown", 375 | "metadata": {}, 376 | "source": [ 377 | "*Solution:*\n", 378 | "\n", 379 | "**(a):**\n", 380 | "\n", 381 | "A triangle with edge weights 1, 1, and 10. For any vertex chosen as the root, the Minimum Spanning Tree (MST) and Shortest-Path Tree (SPT) will be the same.
Specifically, they both will contain the two weight-1 edges and will not contain the weight-10 edge.\n", 382 | "\n", 383 | "\n", 384 | "**(b):**\n", 385 | "\n", 386 | "In the example given below, the MST and SPT rooted at $A$ are very different, only having one edge in common.\n", 387 | "\n", 388 | "![Skiena Fig 5-1](Figures/Hallock_Fig_6-10.jpg)\n", 389 | "\n", 390 | "**(c):**\n", 391 | "\n", 392 | "No. Prim's algorithm for finding the MST and Dijkstra's algorithm for finding the SPT differ in how they rate the desirability of potential new tree nodes and edges. Specifically, Dijkstra's includes the cost of the path from the root to the connecting tree node, while Prim's only cares about the cost of the new edge.\n", 393 | "\n", 394 | "On the first iteration of these algorithms, these conditions are identical, because Dijkstra's cannot include the cost of the adjoining tree-path because there isn't one: the tree has only one node. Therefore they both will select the same edge, specifically, the one with the lowest weight.\n", 395 | "\n", 396 | "This can be seen in the example from part (b): the single edge that the MST and SPT have in common is the lowest-weight edge adjacent to $A$, specifically edge $(A,B)$ of weight 2.\n", 397 | "\n", 398 | "What if there are multiple edges leaving the root that tie for the smallest weight? The SPT will contain all of them, and the MST must contain at least one of them. Therefore they still will have at least one edge in common.\n", 399 | "\n", 400 | "Using Kruskal's algorithm instead of Prim's to find the MST doesn't change the result: the lowest-weight edge leaving the root will still be selected to be included in the tree (or one of them, if there are multiple). And this edge will also be in the SPT." 
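The argument for part (c) can be illustrated concretely. The sketch below is my own (the graph is invented for illustration; `prim` and `dijkstra_tree` are simplified stand-ins, not the book's implementations): both trees are grown from the same root with a priority queue, differing only in how they score candidate edges, and they end up sharing the lightest edge at the root.

```python
import heapq

def prim(adj, root):
    # Grow an MST from root, scoring each candidate edge by its own weight.
    seen, tree = {root}, set()
    heap = [(w, root, v) for v, w in adj[root]]
    heapq.heapify(heap)
    while heap:
        w, u, v = heapq.heappop(heap)
        if v in seen:
            continue                      # edge would close a cycle
        seen.add(v)
        tree.add(frozenset((u, v)))
        for x, wx in adj[v]:
            heapq.heappush(heap, (wx, v, x))
    return tree

def dijkstra_tree(adj, root):
    # Grow an SPT from root, scoring each candidate by root-to-vertex distance.
    seen, tree = {root}, set()
    heap = [(w, root, v) for v, w in adj[root]]
    heapq.heapify(heap)
    while heap:
        d, u, v = heapq.heappop(heap)
        if v in seen:
            continue
        seen.add(v)
        tree.add(frozenset((u, v)))
        for x, wx in adj[v]:
            heapq.heappush(heap, (d + wx, v, x))
    return tree

# Triangle with weights 2, 2, 3: the trees differ, but share edge (0, 1),
# the lightest edge incident to the root.
adj = {0: [(1, 2), (2, 3)], 1: [(0, 2), (2, 2)], 2: [(0, 3), (1, 2)]}
mst, spt = prim(adj, 0), dijkstra_tree(adj, 0)
print(mst & spt)   # the lightest root edge appears in both trees
```

On this graph the MST is $\{(0,1), (1,2)\}$ while the SPT is $\{(0,1), (0,2)\}$: different trees, but not disjoint, matching the argument above.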
401 | ] 402 | }, 403 | { 404 | "cell_type": "markdown", 405 | "metadata": {}, 406 | "source": [ 407 | "### 6.17 [3]\n", 408 | "\n", 409 | "Either prove the following or give a counterexample:\n", 410 | "\n", 411 | "**(a):** Is the path between a pair of vertices in a minimum spanning tree of an undirected graph necessarily the shortest (minimum weight) path?\n", 412 | "\n", 413 | "**(b):** Suppose that the minimum spanning tree of the graph is unique. Is the path between a pair of vertices in a minimum spanning tree of an undirected graph necessarily the shortest (minimum weight) path?" 414 | ] 415 | }, 416 | { 417 | "cell_type": "markdown", 418 | "metadata": {}, 419 | "source": [ 420 | "*Solution:*\n", 421 | "\n", 422 | "**(a):**\n", 423 | "\n", 424 | "This is identical to problem 6.2, and so I quote my solution to that problem:\n", 425 | "\n", 426 | "\"Definitely not. Otherwise, what was the point of Dijkstra's algorithm, if we could just use Prim's or Kruskal's to find the minimum spanning tree?\"\n", 427 | "\n", 428 | "\"As a counterexample, think of a triangle with 3 edges, each with weight 1. Any minimum spanning tree will only have 2 of these edges. Pick the 2 vertices with degree 1 in the tree. In the tree, the shortest path between them passes through the 3rd vertex and has weight 2, while in the full graph there is a direct edge of weight 1 connecting them.\"\n", 429 | "\n", 430 | "**(b):**\n", 431 | "\n", 432 | "No. Suppose we have a triangle with edge weights 2, 3, and 4. The unique MST will consist of the edges of weights 2 and 3, leaving out the edge of weight 4. On the MST, traveling between the two vertices included in the weight-4 edge would require passing through the root, and the path would have weight $2 + 3 = 5$. But on the original graph the shortest path is the edge between them, which has weight $4$. 
Therefore even in the case where the MST is unique, finding the MST still does not mean you have found the shortest path between any two vertices." 433 | ] 434 | }, 435 | { 436 | "cell_type": "markdown", 437 | "metadata": {}, 438 | "source": [ 439 | "---" 440 | ] 441 | }, 442 | { 443 | "cell_type": "markdown", 444 | "metadata": {}, 445 | "source": [ 446 | "## Network Flow and Matching" 447 | ] 448 | }, 449 | { 450 | "cell_type": "markdown", 451 | "metadata": {}, 452 | "source": [ 453 | "### 6.24 [3] Unfinished\n", 454 | "\n", 455 | "A matching in a graph is a set of disjoint edges—i.e., edges that do not share any vertices in common. Give a linear-time algorithm to find a maximum matching in a tree." 456 | ] 457 | }, 458 | { 459 | "cell_type": "markdown", 460 | "metadata": {}, 461 | "source": [ 462 | "*Solution:*\n", 463 | "\n", 464 | "Some thoughts.\n", 465 | "\n", 466 | "First note that all trees are bipartite: just make each layer a different color.\n", 467 | "\n", 468 | "In the book, this problem is only described in terms of unweighted graphs as an application of the network flow problem, which does use weighted graphs. The problem is solved by connecting a source to all L vertices and a sink to all R vertices. All edges have weight 1, both original and added. Then find the maximum flow from source to sink.\n", 469 | "\n", 470 | "One idea I had was to use breadth-first search to compute the tree. Then count the number of clusters in each row. One edge can be included from each cluster since they all connect to the same parent, so the number of edges you can include from each level is equal to the number of clusters. But you can only include edges from every other level. So just add up the number of clusters on the even numbered levels and the odd numbered levels, and the greater one is the one you choose to include in the matching.\n", 471 | "\n", 472 | "This idea doesn't work though in general, maybe only if the tree is fully dense. 
If a level includes a dead-end edge, then you can include both the dead-end edge and edges in the next lower level.\n", 473 | "\n", 474 | "Below is an example of a graph, with the BFS tree computed starting from vertex 1. The vertices are numbered in the order that they were added to the tree. The brackets indicate the number of clusters in each level, and the circled edges are the ones that perhaps form a maximum matching. Note that the numbers of clusters are $1-1-1-1-2-2-2$, and summing up alternating levels equals $6$ and $4$, but the circled matching contains $8$ edges, showing that the maximum matching can exceed these values.\n", 475 | "\n", 476 | "![Hallock Fig 6-11](Figures/Hallock_Fig_6-11.jpg)\n", 477 | "\n", 478 | "Perhaps an idea is to include as many leaf edges as you can at each level, starting at the bottom. In the example above, 6 out of the 8 edges are leaves.\n", 479 | "\n", 480 | "How about a recursive algorithm that, when it finds a leaf, deletes the leaf and its parent, along with any other children of the parent?\n", 481 | "\n", 482 | "Iterate over the adjacency list, looking for vertices with degree 1?\n", 483 | "\n", 484 | "Is there a way to do this conceptually on the original graph? Perhaps targeting dead-ends first? The original graph is still a tree, but isn't organized into a top-down shape with a root."
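One way to make the leaf-first idea concrete (this is my own sketch, not a worked solution from the text): compute a BFS ordering from any root, then sweep it bottom-up, matching each vertex to its parent whenever both are still unmatched. Matching a leaf to its parent is always safe by a standard exchange argument, and each vertex and edge is touched a constant number of times, so this runs in linear time.

```python
from collections import deque

def max_tree_matching(adj, root=0):
    # adj: adjacency lists of a tree on vertices 0..n-1
    parent = {root: None}
    order = []
    q = deque([root])
    while q:                      # BFS to get a top-down ordering
        u = q.popleft()
        order.append(u)
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                q.append(v)
    matched = set()
    matching = []
    for u in reversed(order):     # leaves first, root last
        p = parent[u]
        if p is not None and u not in matched and p not in matched:
            matched.update((u, p))      # greedily match u with its parent
            matching.append((p, u))
    return matching

# Path on 4 vertices, 0-1-2-3: has a perfect matching of size 2.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(max_tree_matching(path))    # [(2, 3), (0, 1)]
```

On the path it finds both disjoint edges; on a star it finds the single possible edge, consistent with the dead-end observation above.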
485 | ] 486 | }, 487 | { 488 | "cell_type": "code", 489 | "execution_count": null, 490 | "metadata": { 491 | "collapsed": true 492 | }, 493 | "outputs": [], 494 | "source": [] 495 | } 496 | ], 497 | "metadata": { 498 | "kernelspec": { 499 | "display_name": "Python 3", 500 | "language": "python", 501 | "name": "python3" 502 | }, 503 | "language_info": { 504 | "codemirror_mode": { 505 | "name": "ipython", 506 | "version": 3 507 | }, 508 | "file_extension": ".py", 509 | "mimetype": "text/x-python", 510 | "name": "python", 511 | "nbconvert_exporter": "python", 512 | "pygments_lexer": "ipython3", 513 | "version": "3.5.1" 514 | } 515 | }, 516 | "nbformat": 4, 517 | "nbformat_minor": 0 518 | } 519 | -------------------------------------------------------------------------------- /Chapter 7 - Combinatorial Search and Heuristic Methods.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Chapter 7: Combinatorial Search and Heuristic Methods (Completed 5/19: 26%)" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Backtracking" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "Below is a general purpose Backtracking routine, adapted into Python from the book's C implementation, that will be used in some of the exercises." 
22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 1, 27 | "metadata": { 28 | "collapsed": true 29 | }, 30 | "outputs": [], 31 | "source": [ 32 | "def backtrack(A, k, **data):\n", 33 | " if is_a_solution(A, k, **data):\n", 34 | " process_solution(A,k, **data)\n", 35 | " else:\n", 36 | " k += 1\n", 37 | " candidates = construct_candidates(A,k, **data)\n", 38 | " for i, _ in enumerate(candidates):\n", 39 | " A.append(candidates[i])\n", 40 | " make_move(A, k, **data)\n", 41 | " backtrack(A, k, **data)\n", 42 | " unmake_move(A, k, **data)\n", 43 | " A.pop()" 44 | ] 45 | }, 46 | { 47 | "cell_type": "markdown", 48 | "metadata": {}, 49 | "source": [ 50 | "### 7.1 [3]\n", 51 | "\n", 52 | "A *derangement* is a permutation $p$ of $\\{1, . . . , n\\}$ such that no item is in its proper position, i.e. $p_i \\neq i$ for all $1 \\leq i \\leq n$. Write an efficient backtracking program with pruning that constructs all the derangements of $n$ items." 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "metadata": {}, 58 | "source": [ 59 | "*Solution:*\n", 60 | "\n", 61 | "For simplicity, the program computes the derangements of numbers $\\{0, . . . , n-1\\}$.\n", 62 | "\n", 63 | "A closed form expression for the number of derangements, notated as $!n$, from [Wikipedia](http://www.wikiwand.com/en/Derangement), is given by:\n", 64 | "\n", 65 | "$$ !n = \\left[ \\frac{n!}{e} \\right] = \\left \\lfloor \\frac{n!}{e} + \\frac{1}{2} \\right \\rfloor$$\n", 66 | "\n", 67 | "where $[x]$ is the nearest integer function. Therefore this number grows proportionally with $n!$, the number of permutations.\n", 68 | "\n", 69 | "This problem is very similar to constructing permutations, a partial solution to which is provided in the text. However, we must ensure that we don't insert number $i$ at position $i$. Therefore we include the check `(i != k)` when screening potential candidates in the `construct_candidates` function. 
This effectively prunes partial solutions that are doomed to fail, rather than putting such a check in the `is_a_solution`, which would require a linear scan through all $!n$ solutions checking if `A[i] == i`. This indeed would be exhaustive." 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": 2, 75 | "metadata": { 76 | "collapsed": false 77 | }, 78 | "outputs": [ 79 | { 80 | "name": "stdout", 81 | "output_type": "stream", 82 | "text": [ 83 | "[1, 0, 3, 2]\n", 84 | "Solution number: 1\n", 85 | "[1, 2, 3, 0]\n", 86 | "Solution number: 2\n", 87 | "[1, 3, 0, 2]\n", 88 | "Solution number: 3\n", 89 | "[2, 0, 3, 1]\n", 90 | "Solution number: 4\n", 91 | "[2, 3, 0, 1]\n", 92 | "Solution number: 5\n", 93 | "[2, 3, 1, 0]\n", 94 | "Solution number: 6\n", 95 | "[3, 0, 1, 2]\n", 96 | "Solution number: 7\n", 97 | "[3, 2, 0, 1]\n", 98 | "Solution number: 8\n", 99 | "[3, 2, 1, 0]\n", 100 | "Solution number: 9\n" 101 | ] 102 | } 103 | ], 104 | "source": [ 105 | "## Problem-specific functions \n", 106 | "def construct_candidates(A, k, n, V, m):\n", 107 | " candidates = []\n", 108 | " for i in range(n):\n", 109 | " if (V[i] == False) and (i != k):\n", 110 | " candidates.append(i)\n", 111 | " return candidates\n", 112 | "\n", 113 | "def make_move(A, k, n, V, m):\n", 114 | " V[A[k]] = True\n", 115 | " \n", 116 | "def unmake_move(A, k, n, V, m):\n", 117 | " V[A[k]] = False\n", 118 | "\n", 119 | "def is_a_solution(A, k, n, V, m):\n", 120 | " return k == n - 1\n", 121 | "\n", 122 | "def process_solution(A, k, n, V, m):\n", 123 | " print(A)\n", 124 | " m[0] += 1\n", 125 | " print(\"Solution number:\", m[0])\n", 126 | " \n", 127 | "def generate_derangements(n):\n", 128 | " A = []\n", 129 | " V = []\n", 130 | " m = [0] #number of solutions\n", 131 | " \n", 132 | " for i in range(n):\n", 133 | " V.append(False)\n", 134 | " \n", 135 | " backtrack(A, -1, n = n, V = V, m = m)\n", 136 | "\n", 137 | "generate_derangements(4)\n", 138 | "\n", 139 | "#No number is in the position 
equal to its value" 140 | ] 141 | }, 142 | { 143 | "cell_type": "markdown", 144 | "metadata": {}, 145 | "source": [ 146 | "### 7.2 [4]\n", 147 | "\n", 148 | "*Multisets* are allowed to have repeated elements. A multiset of $n$ items may thus have fewer than $n!$ distinct permutations. For example, $\\{1, 1, 2, 2\\}$ has only six different permutations: $\\{1, 1, 2, 2\\}$, $\\{1, 2, 1, 2\\}$, $\\{1, 2, 2, 1\\}$, $\\{2, 1, 1, 2\\}$, $\\{2, 1, 2, 1\\}$, and $\\{2, 2, 1, 1\\}$. Design and implement an efficient algorithm for constructing all\n", 149 | "permutations of a multiset." 150 | ] 151 | }, 152 | { 153 | "cell_type": "markdown", 154 | "metadata": {}, 155 | "source": [ 156 | "*Solution:*\n", 157 | "\n", 158 | "I originally tried to solve this problem by enumerating permutations of the index and pruning them appropriately. (Skip this paragraph if you want.) For example, $S = \\{1, 1, 2, 2\\}$ contains 4 numbers, so permutations of the numbers 1 through 4 will correspond to permutations of $S$. The only problem is that some of these index permutations will correspond to identical set permutations, due to the repeated elements. For example, both index permutations $\\{1, 3, 2, 4\\}$ and $\\{1, 4, 2, 3 \\}$ produce the set permutation $\\{1, 2, 1, 2\\}$. However, we may note that only one of these satisfies the condition that indices corresponding to identical set elements are in sorted order. Specifically, both indices $3$ and $4$ correspond to the number $2$ in the set $S$. So if we require that such numbers be in sorted order, then only one of these index permutations is valid, $\\{1, 3, 2, 4\\}$. However, I was unable to encode this without using a \"stack\" data structure. Specifically, for set-values already included, a second auxiliary data structure would be needed to record what index was used most recently that corresponds to this value. (The first data structure is a bit vector for included indices.)
For example, when adding index $3$ to the partial solution $\{1, 4, 2\}$, we would need to check that $S[3] = 2$ was already included using index $4$, and since $3 < 4$ this would break our requirement that indices that correspond to equal set values be in sorted order. The problem is how to implement the `unmake_move` function when there are more than 2 identical elements. For example, if there are 5 3's in the set $S$, then our second auxiliary data structure would essentially have to record the order in which these 5 different indices were added. This could be done with a stack. However, I was able to find a more elementary and direct solution to the problem.\n", 159 | "\n", 160 | "To solve this problem, we can compute a histogram-like structure that tells us how many more times we can include a given value. For example, say you have 3 unique numbers, like $S = [1, 1, 2, 2, 2, 7, 7]$. We can turn this into two arrays $H = [2, 3, 2]$ and $I = [1, 2, 7]$, where the first tells you how many of each number you have, and the second is what the number is. Together, these form a sort of histogram, with $I$ the x values and $H$ the y values.\n", 161 | "\n", 162 | "At each iteration in the backtracking algorithm, the potential candidates for the next position in our solution vector $A$ are any index from $0$ to $m-1$ such that $H[i] \\neq 0$, where $m$ is the number of unique values in $S$. When a number $i$ is added to $A$, we decrement $H[i]$ by one to indicate that we have one less of that number available for future positions. To undo this, we increment $H[i]$ by one.\n", 163 | "\n", 164 | "Interestingly, although this solution is conceptually different from how the permutation problem is treated, it may in fact be a generalization. The permutation problem uses a bit vector to indicate what indices from $1$ to $n$ have already been included.
This is analogous to our $H$ array, only in the permutation problem we know that each number appears exactly once, and so True/False values can be used instead of 1's and 0's. Also, there is no need for an $I$ array in the permutation case because the value and the index are equal." 165 | ] 166 | }, 167 | { 168 | "cell_type": "code", 169 | "execution_count": 3, 170 | "metadata": { 171 | "collapsed": false 172 | }, 173 | "outputs": [ 174 | { 175 | "name": "stdout", 176 | "output_type": "stream", 177 | "text": [ 178 | "The set: [1, 1, 2, 2] \n", 179 | "\n", 180 | "Solution number: 1\n", 181 | "Permutation; [1, 1, 2, 2]\n", 182 | "\n", 183 | "Solution number: 2\n", 184 | "Permutation; [1, 2, 1, 2]\n", 185 | "\n", 186 | "Solution number: 3\n", 187 | "Permutation; [1, 2, 2, 1]\n", 188 | "\n", 189 | "Solution number: 4\n", 190 | "Permutation; [2, 1, 1, 2]\n", 191 | "\n", 192 | "Solution number: 5\n", 193 | "Permutation; [2, 1, 2, 1]\n", 194 | "\n", 195 | "Solution number: 6\n", 196 | "Permutation; [2, 2, 1, 1]\n", 197 | "\n" 198 | ] 199 | } 200 | ], 201 | "source": [ 202 | "## Problem-specific functions \n", 203 | "def construct_candidates(A, k, n, H, I, m):\n", 204 | " candidates = []\n", 205 | " for i, _ in enumerate(I):\n", 206 | " if H[i] != 0:\n", 207 | " candidates.append(i) \n", 208 | " return candidates\n", 209 | "\n", 210 | "def make_move(A, k, n, H, I, m):\n", 211 | " H[A[k]] -= 1\n", 212 | " \n", 213 | "def unmake_move(A, k, n, H, I, m):\n", 214 | " H[A[k]] += 1\n", 215 | "\n", 216 | "def is_a_solution(A, k, n, H, I, m):\n", 217 | " return k == n - 1\n", 218 | "\n", 219 | "def process_solution(A, k, n, H, I, m):\n", 220 | " m[0] += 1\n", 221 | " Solution = []\n", 222 | " for i, _ in enumerate(A):\n", 223 | " Solution.append(I[A[i]])\n", 224 | " \n", 225 | " print(\"Solution number:\", m[0])\n", 226 | " print('Permutation;', Solution)\n", 227 | " print()\n", 228 | " \n", 229 | "def construct_histogram(S):\n", 230 | " H = []\n", 231 | " I = []\n", 232 | " \n", 
233 | " for i, _ in enumerate(S):\n", 234 | " if i == 0 or S[i] > S[i - 1]: #New number\n", 235 | " I.append(S[i])\n", 236 | " H.append(1)\n", 237 | " else:\n", 238 | " H[-1] += 1\n", 239 | " return H, I\n", 240 | "\n", 241 | "def generate_multiset_permutations(S):\n", 242 | " S.sort()\n", 243 | " A = []\n", 244 | " H, I = construct_histogram(S)\n", 245 | " m = [0] #number of solutions\n", 246 | " n = len(S)\n", 247 | "\n", 248 | " backtrack(A, -1, n = n, H = H, I = I, m = m)\n", 249 | "\n", 250 | "S = [1, 1, 2, 2]\n", 251 | "print('The set:', S, '\\n')\n", 252 | "generate_multiset_permutations(S)\n", 253 | "\n", 254 | "# Possible Multisets from the given example:\n", 255 | "#{1,1,2,2}\n", 256 | "#{1,2,1,2}\n", 257 | "#{1,2,2,1}\n", 258 | "#{2,1,1,2}\n", 259 | "#{2,1,2,1}\n", 260 | "#{2,2,1,1}" 261 | ] 262 | }, 263 | { 264 | "cell_type": "markdown", 265 | "metadata": {}, 266 | "source": [ 267 | "---" 268 | ] 269 | }, 270 | { 271 | "cell_type": "markdown", 272 | "metadata": {}, 273 | "source": [ 274 | "## Combinatorial Optimization" 275 | ] 276 | }, 277 | { 278 | "cell_type": "markdown", 279 | "metadata": {}, 280 | "source": [ 281 | "---" 282 | ] 283 | }, 284 | { 285 | "cell_type": "markdown", 286 | "metadata": {}, 287 | "source": [ 288 | "## Interview Problems" 289 | ] 290 | }, 291 | { 292 | "cell_type": "markdown", 293 | "metadata": {}, 294 | "source": [ 295 | "For this section I will not use the general purpose backtracking algorithm at the beginning of this notebook, but rather will write the algorithms from scratch as if I was in an \"interview\"." 296 | ] 297 | }, 298 | { 299 | "cell_type": "markdown", 300 | "metadata": {}, 301 | "source": [ 302 | "### 7.14 [4]\n", 303 | "\n", 304 | "Write a function to find all permutations of the letters in a particular string." 
305 | ] 306 | }, 307 | { 308 | "cell_type": "markdown", 309 | "metadata": {}, 310 | "source": [ 311 | "*Solution:*\n", 312 | "\n", 313 | "If we may use Python's dictionary structure, we can convert the string into a dictionary mapping each letter to the number of times it appears in the string. Think of this as a histogram. We can then use a backtracking algorithm to successively build up solution strings by taking letters off the histogram." 314 | ] 315 | }, 316 | { 317 | "cell_type": "code", 318 | "execution_count": 4, 319 | "metadata": { 320 | "collapsed": false 321 | }, 322 | "outputs": [ 323 | { 324 | "name": "stdout", 325 | "output_type": "stream", 326 | "text": [ 327 | "ello\n", 328 | "elol\n", 329 | "eoll\n", 330 | "lelo\n", 331 | "leol\n", 332 | "lleo\n", 333 | "lloe\n", 334 | "loel\n", 335 | "lole\n", 336 | "oell\n", 337 | "olel\n", 338 | "olle\n" 339 | ] 340 | } 341 | ], 342 | "source": [ 343 | "def string_to_histogram(s):\n", 344 | " D = {}\n", 345 | " for c in s:\n", 346 | " if c in D:\n", 347 | " D[c] += 1\n", 348 | " else:\n", 349 | " D[c] = 1\n", 350 | " return D\n", 351 | "\n", 352 | "def string_backtrack(A, k, D, n, m):\n", 353 | " if k == n - 1:\n", 354 | " m[0] += 1\n", 355 | " #print('Solution', m[0]) #Uncomment to count solutions\n", 356 | " print(''.join(A))\n", 357 | " else:\n", 358 | " k += 1\n", 359 | " candidates = construct_candidates(D)\n", 360 | " for c in candidates:\n", 361 | " A.append(c)\n", 362 | " D[c] -= 1\n", 363 | " string_backtrack(A, k, D, n, m)\n", 364 | " D[c] += 1\n", 365 | " A.pop()\n", 366 | "\n", 367 | "def construct_candidates(D):\n", 368 | " candidates = []\n", 369 | " for s in D:\n", 370 | " if D[s] != 0:\n", 371 | " candidates.append(s)\n", 372 | " return candidates\n", 373 | " \n", 374 | "def string_permutations(s):\n", 375 | " A = []\n", 376 | " k = -1\n", 377 | " D = string_to_histogram(s)\n", 378 | " n = len(s)\n", 379 | " m = [0]\n", 380 | " \n", 381 | " string_backtrack(A, k, D, n, m)\n", 382 | "\n", 383 | 
"string_permutations('ello')" 384 | ] 385 | }, 386 | { 387 | "cell_type": "markdown", 388 | "metadata": {}, 389 | "source": [ 390 | "### 7.15 [4]\n", 391 | "\n", 392 | "Implement an efficient algorithm for listing all $k$-element subsets of $n$ items." 393 | ] 394 | }, 395 | { 396 | "cell_type": "markdown", 397 | "metadata": {}, 398 | "source": [ 399 | "*Solution:*\n", 400 | "\n", 401 | "I assume that all $n$ items are distinguishable. We can then number the items from $0$ to $n-1$ and enumerate all subsets of size $k$ of these numbers. We can do this using a backtracking algorithm. An array $A$ will hold our partial solutions, and we'll use a bit vector to record which items have already been included. Candidate values to add next are those values which haven't yet been included, which is a constant time lookup with the bit vector.\n", 402 | "\n", 403 | "We also require that the element being added be greater than the previous element. For example, of all the orderings that correspond to the size-3 subset $\\{1, 2, 4\\}$, only the increasing ordering $1, 2, 4$ obeys this rule; every other ordering must at some point add an element smaller than the previous one. So this condition prevents equivalent permutations of the same subset from showing up, and therefore significantly prunes our search.\n", 404 | "\n", 405 | "NOTE: An elementary but still recursive function for doing this was done as a subproblem for problem 4.10. It was quite challenging, so it is remarkable that using backtracking the problem becomes somewhat easy."
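As a sanity check (my addition, not from the original notebook), the same pruning rule — only append elements larger than the last one chosen — can be written compactly and compared against `itertools.combinations`, which emits $k$-subsets in the same lexicographic order:

```python
from itertools import combinations

def k_subsets_list(n, k):
    # Collect all k-element subsets of {0, ..., n-1} by backtracking,
    # extending A only with elements larger than the last one chosen.
    out = []
    def backtrack(A, start):
        if len(A) == k:
            out.append(A[:])
            return
        for i in range(start, n):
            A.append(i)
            backtrack(A, i + 1)
            A.pop()
    backtrack([], 0)
    return out

print(k_subsets_list(5, 3) == [list(c) for c in combinations(range(5), 3)])  # True
```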
406 | ] 407 | }, 408 | { 409 | "cell_type": "code", 410 | "execution_count": 5, 411 | "metadata": { 412 | "collapsed": false 413 | }, 414 | "outputs": [ 415 | { 416 | "name": "stdout", 417 | "output_type": "stream", 418 | "text": [ 419 | "[0, 1, 2]\n", 420 | "[0, 1, 3]\n", 421 | "[0, 1, 4]\n", 422 | "[0, 2, 3]\n", 423 | "[0, 2, 4]\n", 424 | "[0, 3, 4]\n", 425 | "[1, 2, 3]\n", 426 | "[1, 2, 4]\n", 427 | "[1, 3, 4]\n", 428 | "[2, 3, 4]\n" 429 | ] 430 | } 431 | ], 432 | "source": [ 433 | "# Note that j plays the role of k\n", 434 | "# compared to most of the other\n", 435 | "# backtracking implementations here\n", 436 | "\n", 437 | "def k_subset_backtrack(A, j, n, k, V):\n", 438 | " if j == k - 1:\n", 439 | " print(A)\n", 440 | " else:\n", 441 | " j += 1\n", 442 | " candidates = construct_candidates(A, j, n, k, V)\n", 443 | " for c in candidates:\n", 444 | " A.append(c)\n", 445 | " V[c] = True\n", 446 | " k_subset_backtrack(A, j, n, k, V)\n", 447 | " V[c] = False\n", 448 | " A.pop()\n", 449 | "\n", 450 | "def construct_candidates(A, j, n, k, V):\n", 451 | " candidates = []\n", 452 | " for i, v in enumerate(V):\n", 453 | " if v == False and ((j == 0) or (A[j-1] < i)):\n", 454 | " candidates.append(i)\n", 455 | " return candidates\n", 456 | " \n", 457 | "def k_subsets(n, k):\n", 458 | " A = []\n", 459 | " V = []\n", 460 | " j = -1\n", 461 | " \n", 462 | " for i in range(n): #Initialize bit-vector\n", 463 | " V.append(False)\n", 464 | " \n", 465 | " k_subset_backtrack(A, j, n, k, V)\n", 466 | " \n", 467 | "k_subsets(5, 3) " 468 | ] 469 | }, 470 | { 471 | "cell_type": "markdown", 472 | "metadata": {}, 473 | "source": [ 474 | "---" 475 | ] 476 | }, 477 | { 478 | "cell_type": "markdown", 479 | "metadata": { 480 | "collapsed": true 481 | }, 482 | "source": [ 483 | "### 7.16 [5]\n", 484 | "\n", 485 | "An anagram is a rearrangement of the letters in a given string into a sequence of dictionary words, like *Steven Skiena* into *Vainest Knees*. 
Propose an algorithm to construct all the anagrams of a given string." 486 | ] 487 | }, 488 | { 489 | "cell_type": "markdown", 490 | "metadata": {}, 491 | "source": [ 492 | "*Solution:*\n", 493 | "\n", 494 | "This problem is similar to enumerating all the permutations of a set of $n$ numbers, where $n$ is the length of the string, except for three complications. First, strings are multisets, meaning they can have repeated elements. Permuting equal letters doesn't change a word, so unless we account for this, the same anagram will appear multiple times. The second complication is that anagrams are free to include differing numbers of words separated by spaces, as long as they contain the same set of letters.\n", 495 | "\n", 496 | "The third complication is the requirement that the resulting strings be dictionary words.\n", 497 | "\n", 498 | "To construct the anagrams, we will convert the original string into a histogram-like structure that will track how many more times we may use each letter. \"Space\" will also be an option unless the previously added character was also a \"space\". We will use a backtracking algorithm to recursively extend partial string vectors $A$ one character at a time, using the histogram to track what characters remain available for insertion.\n", 499 | "\n", 500 | "To enforce the requirement that the words be found in the dictionary, each time we add a \"space\" we will check if the just-completed word is contained in a list of dictionary words, found in `'Other/Dictionary_words.txt'`. The `is_word()` function makes use of the `bisect` module to implement a binary search. When we have found a potential solution, we use the `is_word()` function again to check if the last word in the solution is valid.\n", 501 | "\n", 502 | "The program implemented below works, but takes about a minute to run on the string 'steven skien', and is too slow to run on 'steven skiena'.\n", 503 | "\n", 504 | "One important area where the program can be improved is the `is_word()` function. 
Currently a binary search is performed on a list of valid words that originally was stored at `/usr/share/dict/words` on my computer. However, this list **does not contain plurals** even though it contains 235886 words. I found a list of dictionary words online that contained 354986 words, but many of them did not seem to be valid words, like \"wl\", and many other two-letter combinations. For example, the word \"pillow\" produces 33 or 2498 different anagrams depending on which of these two dictionaries is used. I opted for the smaller one.\n", 505 | "\n", 506 | "Instead of manually performing a binary search on a list of dictionary words, I could have used a specialized spell checker library that would probably provide better and faster results, but I am not sure what modules GitHub has available on their system." 507 | ] 508 | }, 509 | { 510 | "cell_type": "code", 511 | "execution_count": 6, 512 | "metadata": { 513 | "collapsed": false 514 | }, 515 | "outputs": [ 516 | { 517 | "name": "stdout", 518 | "output_type": "stream", 519 | "text": [ 520 | "235886\n" 521 | ] 522 | }, 523 | { 524 | "data": { 525 | "text/plain": [ 526 | "True" 527 | ] 528 | }, 529 | "execution_count": 6, 530 | "metadata": {}, 531 | "output_type": "execute_result" 532 | } 533 | ], 534 | "source": [ 535 | "import bisect\n", 536 | "\n", 537 | "# Index function code from\n", 538 | "# bisect module page\n", 539 | "def index(a, x):\n", 540 | " 'Locate the leftmost value exactly equal to x'\n", 541 | " i = bisect.bisect_left(a, x)\n", 542 | " if i != len(a) and a[i] == x:\n", 543 | " return True\n", 544 | " else:\n", 545 | " return False\n", 546 | "\n", 547 | "def is_word(x):\n", 548 | " return index(dictionary_words, x)\n", 549 | "\n", 550 | "with open('Other/Dictionary_words.txt', 'r') as file:\n", 551 | " dictionary_words = [line.rstrip() for line in file]\n", 552 | "dictionary_words.sort()\n", 553 | "\n", 554 | "print(len(dictionary_words))\n", 555 | "is_word('all')" 556 | ] 557 | }, 558 | { 
559 | "cell_type": "code", 560 | "execution_count": 7, 561 | "metadata": { 562 | "collapsed": false 563 | }, 564 | "outputs": [ 565 | { 566 | "name": "stdout", 567 | "output_type": "stream", 568 | "text": [ 569 | "pillows\n", 570 | "{'l': 2, 'i': 1, 'p': 1, 'o': 1, 's': 1, 'w': 1}\n", 571 | "Anagram 1: lip slow\n", 572 | "Anagram 2: lip sowl\n", 573 | "Anagram 3: lisp low\n", 574 | "Anagram 4: lisp owl\n", 575 | "Anagram 5: lis plow\n", 576 | "Anagram 6: low lisp\n", 577 | "Anagram 7: low slip\n", 578 | "Anagram 8: ill wops\n", 579 | "Anagram 9: plow lis\n", 580 | "Anagram 10: plow sil\n", 581 | "Anagram 11: pill sow\n", 582 | "Anagram 12: poll wis\n", 583 | "Anagram 13: pow sill\n", 584 | "Anagram 14: po swill\n", 585 | "Anagram 15: owl lisp\n", 586 | "Anagram 16: owl slip\n", 587 | "Anagram 17: ow spill\n", 588 | "Anagram 18: slip low\n", 589 | "Anagram 19: slip owl\n", 590 | "Anagram 20: slow lip\n", 591 | "Anagram 21: sill pow\n", 592 | "Anagram 22: sill wop\n", 593 | "Anagram 23: sil plow\n", 594 | "Anagram 24: spill ow\n", 595 | "Anagram 25: spill wo\n", 596 | "Anagram 26: sop will\n", 597 | "Anagram 27: sowl lip\n", 598 | "Anagram 28: sow pill\n", 599 | "Anagram 29: swill po\n", 600 | "Anagram 30: will sop\n", 601 | "Anagram 31: wis poll\n", 602 | "Anagram 32: wops ill\n", 603 | "Anagram 33: wop sill\n", 604 | "Anagram 34: wo spill\n", 605 | "CPU times: user 101 ms, sys: 2.35 ms, total: 103 ms\n", 606 | "Wall time: 104 ms\n" 607 | ] 608 | } 609 | ], 610 | "source": [ 611 | "# A = partial solution string\n", 612 | "# D = dictionary \"histogram\" of remaining letters\n", 613 | "# m = number of letters currently in solution string\n", 614 | "# n = number of letters in original string\n", 615 | "# sol = solution number\n", 616 | "\n", 617 | "def anagram_backtracking(A, D, m, n, sol):\n", 618 | " if m == n and last_word_in_dict(A, D, m, n, sol):\n", 619 | " sol[0] += 1\n", 620 | " print('Anagram %s:' % sol[0], ''.join(A))\n", 621 | " else:\n", 622 | " candidates 
= construct_candidates_anagram(A, D, m, n, sol)\n", 623 | " for c in candidates:\n", 624 | " A.append(c)\n", 625 | " if c != ' ':\n", 626 | " D[c] -= 1\n", 627 | " m += 1\n", 628 | " anagram_backtracking(A, D, m, n, sol)\n", 629 | " if c != ' ':\n", 630 | " D[c] += 1\n", 631 | " m -= 1\n", 632 | " A.pop()\n", 633 | "\n", 634 | "def construct_candidates_anagram(A, D, m, n, sol):\n", 635 | " candidates = []\n", 636 | " for c, h in D.items():\n", 637 | " if h != 0:\n", 638 | " candidates.append(c) \n", 639 | " if m != 0 and A[-1] != ' ': # Consider 'space'\n", 640 | " if last_word_in_dict(A, D, m, n, sol):\n", 641 | " candidates.append(' ')\n", 642 | " return candidates\n", 643 | "\n", 644 | "def last_word_in_dict(A, D, m, n, sol):\n", 645 | " s = 0 # Position of previous 'space'\n", 646 | " for i in range(len(A)):\n", 647 | " if A[-(i+1)] == ' ':\n", 648 | " s = i\n", 649 | " break\n", 650 | " word = ''.join(A[-s:])\n", 651 | " if len(word) == 1:\n", 652 | " return (word == 'a' or word == 'i')\n", 653 | " else:\n", 654 | " #return word in dictionary_words\n", 655 | " return is_word(word)\n", 656 | " \n", 657 | "def construct_histogram(s):\n", 658 | " D = {}\n", 659 | " for c in s:\n", 660 | " if c == ' ': # Ignore spaces\n", 661 | " continue\n", 662 | " if c in D:\n", 663 | " D[c] += 1\n", 664 | " else:\n", 665 | " D[c] = 1\n", 666 | " return D\n", 667 | "\n", 668 | "def find_anagrams(s):\n", 669 | " s = s.lower().replace(' ', '')\n", 670 | " print(s)\n", 671 | " A = []\n", 672 | " D = construct_histogram(s)\n", 673 | " m = 0\n", 674 | " n = len(s)\n", 675 | " sol = [0]\n", 676 | " print(D)\n", 677 | " anagram_backtracking(A, D, m, n, sol)\n", 678 | "\n", 679 | "%time find_anagrams('pillows')" 680 | ] 681 | }, 682 | { 683 | "cell_type": "markdown", 684 | "metadata": {}, 685 | "source": [ 686 | "## 7.17 [5] Unfinished\n", 687 | "\n", 688 | "Telephone keypads have letters on each numerical key. 
"Write a program that generates all possible words resulting from translating a given digit sequence (e.g., 145345) into letters." 689 | ] 690 | }, 691 | { 692 | "cell_type": "markdown", 693 | "metadata": {}, 694 | "source": [ 695 | "*Solution:*" 696 | ] 697 | }, 698 | { 699 | "cell_type": "code", 700 | "execution_count": 8, 701 | "metadata": { 702 | "collapsed": false 703 | }, 704 | "outputs": [], 705 | "source": [ 706 | "keypad = {1 : ['a', 'b', 'c'],\n", 707 | " 2 : ['d', 'e', 'f']}\n", 708 | "\n", 709 | "def keypad_backtracking(A, k, digits, D):\n", 710 | " if k == len(digits):\n", 711 | " print(''.join(A))\n", 712 | " else:\n", 713 | " candidates = keypad_construct_candidates(digits, k, D)\n", 714 | " for c in candidates:\n", 715 | " A.append(c)\n", 716 | " keypad_backtracking(A, k + 1, digits, D)\n", 717 | " A.pop()\n", 718 | "\n", 719 | "def keypad_construct_candidates(digits, k, D):\n", 720 | " return D[digits[k]] # Letters available on key digits[k]\n", 721 | "keypad_backtracking([], 0, [1, 2], keypad) # Prints ad, ae, af, ..., cf" 722 | ] 723 | }, 724 | { 725 | "cell_type": "code", 726 | "execution_count": null, 727 | "metadata": { 728 | "collapsed": false 729 | }, 730 | "outputs": [], 731 | "source": [] 732 | } 733 | ], 734 | "metadata": { 735 | "kernelspec": { 736 | "display_name": "Python 3", 737 | "language": "python", 738 | "name": "python3" 739 | }, 740 | "language_info": { 741 | "codemirror_mode": { 742 | "name": "ipython", 743 | "version": 3 744 | }, 745 | "file_extension": ".py", 746 | "mimetype": "text/x-python", 747 | "name": "python", 748 | "nbconvert_exporter": "python", 749 | "pygments_lexer": "ipython3", 750 | "version": "3.5.1" 751 | } 752 | }, 753 | "nbformat": 4, 754 | "nbformat_minor": 0 755 | } 756 | -------------------------------------------------------------------------------- /Chapter 8 - Dynamic Programming.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Chapter 8: Dynamic Programming (Completed 6/26: 23%)" 8 | ] 9 | }, 10 | { 11 | "cell_type": 
"markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Edit Distance" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "### 8.1 [3]\n", 22 | "\n", 23 | "Typists often make transposition errors exchanging neighboring characters, such as typing “setve” when you mean “steve.” This requires two substitutions to fix under the conventional definition of edit distance. Incorporate a swap operation into our edit distance function, so that such neighboring transposition errors can be fixed at the cost of one operation." 24 | ] 25 | }, 26 | { 27 | "cell_type": "markdown", 28 | "metadata": {}, 29 | "source": [ 30 | "*Solution:*\n", 31 | "\n", 32 | "The following is nearly verbatim from the text (including errata), except for the addition of option 4, which is the new swap operation.\n", 33 | "\n", 34 | "Let $D[i,j]$ be the minimum number of differences between the segment of $P$ ending at position $i$ and the segment of $T$ ending at position $j$. $D[i,j]$ is the *minimum* of the **four** possible ways to extend smaller strings:\n", 35 | "\n", 36 | "1. If ($P_i = T_j$), then $D[i-1, j-1]$, else $D[i-1, j-1] + 1$. This means we either match or substitute the $i$th and $j$th characters, depending upon whether the tail characters are the same.\n", 37 | "2. $D[i − 1, j] + 1$. This means that there is an extra character in the text to account for, so we do not advance the pattern pointer and pay the cost of an insertion.\n", 38 | "3. $D[i, j − 1] + 1$. This means that there is an extra character in the pattern to remove, so we do not advance the text pointer and pay the cost of a deletion.\n", 39 | "4. If ($P_{i} = T_{j-1}$ and $P_{i-1} = T_{j}$), then $D[i-1, j-1]$. This means that characters in P at positions $i-1$ and $i$ are the same as those in T at positions $j-1$ and $j$, only swapped. 
We must have already paid a cost of 1 for the mismatch between $P_{i-1}$ and $T_{j-1}$ so we advance both pointers and do not pay a second time, for a total cost of 1 for a swap operation." 40 | ] 41 | }, 42 | { 43 | "cell_type": "markdown", 44 | "metadata": {}, 45 | "source": [ 46 | "### 8.2 [4]\n", 47 | "\n", 48 | "Suppose you are given three strings of characters: $X$, $Y$, and $Z$, where $|X| = n$, $|Y| = m$, and $|Z| = n + m$. $Z$ is said to be a *shuffle* of $X$ and $Y$ iff $Z$ can be formed by interleaving the characters from $X$ and $Y$ in a way that maintains the left-to-right ordering of the characters from each string.\n", 49 | "\n", 50 | "**(a):** Show that \"*cchocohilaptes*\" is a shuffle of \"*chocolate*\" and \"*chips*\", but \"*chocochilatspe*\"\n", 51 | "is not. \n", 52 | "**(b):** Give an efficient dynamic-programming algorithm that determines whether $Z$ is a shuffle of $X$ and $Y$. Hint: the values of the dynamic programming matrix you construct should be Boolean, not numeric." 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "metadata": {}, 58 | "source": [ 59 | "*Solution:*\n", 60 | "\n", 61 | "**(a):**\n", 62 | "\n", 63 | "\"$\\text{cchocohilaptes}$\" can be written as \"$\\text{(c)choco(h)(i)la(p)te(s)}$\" where letters surrounded by parentheses indicate they are from the word \"chips\", and letters without parentheses are from \"chocolate\". The letters in each group are in the same order as in their parent word. Therefore \"cchocohilaptes\" is indeed a shuffle of \"chocolate\" and \"chips\".\n", 64 | "\n", 65 | "However, within \"$\\text{chocochilatspe}$\", there is an \"s\" followed by a \"p\". Neither of these letters appears in \"chocolate\", and so they must be from \"chips\". But in \"chips\" their order is \"p\" then \"s\", while in \"chocochilatspe\" it is \"s\" then \"p\". Proper shuffles must maintain the letter orders of the parent strings. 
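Problem 8.1's four-option recurrence above is stated in prose only; the following is my sketch of it in code (not from the notebook), with the swap case written in the equivalent $D[i-2][j-2] + 1$ form, since by the time both swapped characters have been seen the single unit of swap cost can be charged directly:

```python
# D[i][j] = minimum edits between the first i characters of P
# and the first j characters of T.
def edit_distance_with_swap(P, T):
    n, m = len(P), len(T)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i                                    # delete everything
    for j in range(m + 1):
        D[0][j] = j                                    # insert everything
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(D[i-1][j-1] + (P[i-1] != T[j-1]),  # option 1: match/substitute
                       D[i-1][j] + 1,                     # option 2
                       D[i][j-1] + 1)                     # option 3
            if i > 1 and j > 1 and P[i-1] == T[j-2] and P[i-2] == T[j-1]:
                best = min(best, D[i-2][j-2] + 1)         # option 4: swap neighbors
            D[i][j] = best
    return D[n][m]

print(edit_distance_with_swap('setve', 'steve'))  # 1
```

Without option 4, "setve" versus "steve" would cost 2 substitutions; the swap case brings it down to 1.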
Therefore \"chocochilatspe\" cannot be a shuffle of \"chocolate\" and \"chips\".\n", 66 | "\n", 67 | "**(b):**\n", 68 | "\n", 69 | "We wish to answer the question \"can string $Z$ be expressed as a shuffle of strings $X$ and $Y$?\". This is a boolean question. We would like to express this question in terms of smaller substrings, enabling us to construct a dynamic programming matrix of answers to identical questions on these smaller substrings.\n", 70 | "\n", 71 | "Two natural possibilities are to parametrize the matrix with an index $k$ along string $Z$, or with two indices $i$ and $j$, the first along $X$ and the second along $Y$. Note that we must have $i + j = k$, meaning that if the first $i$ characters of $X$ and the first $j$ characters of $Y$ can be shuffled to match the beginning of $Z$, this beginning substring of $Z$ must have $i + j = k$ characters. This suggests using a 2D matrix for indices $i$ and $j$, instead of a 1D array for index $k$.\n", 72 | "\n", 73 | "Let $B[i,j]$ indicate whether the first $i$ characters of $X$ and the first $j$ characters of $Y$ can be shuffled to match the first $i + j = k$ characters of $Z$.\n", 74 | "\n", 75 | "Two subtle but important things to note. First, our definition of $B$ assumes that the beginnings of the strings are padded with a space, otherwise $i$ would mean the first $i+1$ characters of $X$. For example, $X[1]$ would refer to the *second* character of $X$ rather than the *first*. Second, this padding with a space is necessary for another reason. If a word has $n$ letters, we need to be able to express including 0 of these letters up to including all of them. 
Therefore we need $n+1$ different values for our index.\n", 76 | "\n", 77 | "$B$ can then be written recursively as:\n", 78 | "\n", 79 | "$$ B[i,j] = \\left(B[i-1, j] \\text{ AND } Z[i+j]=X[i]\\right)\\text{ OR } \\left( B[i, j-1] \\text{ AND } Z[i+j]=Y[j] \\right)$$\n", 80 | "\n", 81 | "meaning the current character of $Z$, $Z[i+j]$, must match either $X[i]$ or $Y[j]$, and for each, the previous substring of $Z$ up to character $i + j -1$ must be a shuffle with that parent string's index decremented.\n", 82 | "\n", 83 | "Basis cases: \n", 84 | "$B[0, j] = B[0, j-1] \\text{ AND } Z[j]=Y[j]$ \n", 85 | "$B[i, 0] = B[i-1, 0] \\text{ AND } Z[i]=X[i]$ \n", 86 | "$B[0, 0] = \\text{True}$\n", 87 | "\n", 88 | "Note that the matrix will be of dimension $(|X|+1)\\times(|Y|+1) = (n+1) \\times (m+1)$. In the program below, however, $n$ and $m$ are defined as the lengths of strings $X$ and $Y$ *after* the strings have been padded with a space.\n", 89 | "\n", 90 | "The algorithm is $O(nm)$ since there are two nested loops of size $O(n)$ and $O(m)$, with constant time operations within the inner loop." 
91 | ] 92 | }, 93 | { 94 | "cell_type": "code", 95 | "execution_count": 1, 96 | "metadata": { 97 | "collapsed": false 98 | }, 99 | "outputs": [ 100 | { 101 | "name": "stdout", 102 | "output_type": "stream", 103 | "text": [ 104 | "True\n", 105 | "False\n" 106 | ] 107 | } 108 | ], 109 | "source": [ 110 | "def is_shuffle(X, Y, Z):\n", 111 | " X = ' ' + X\n", 112 | " Y = ' ' + Y\n", 113 | " Z = ' ' + Z\n", 114 | " \n", 115 | " n = len(X)\n", 116 | " m = len(Y)\n", 117 | " \n", 118 | " if (n + m - 1) != len(Z):\n", 119 | " return False\n", 120 | " \n", 121 | " Matrix = [[False for x in range(m)] for y in range(n)]\n", 122 | " \n", 123 | " for i in range(n):\n", 124 | " for j in range(m):\n", 125 | " if (i == 0) and (j == 0):\n", 126 | " Matrix[i][j] = True\n", 127 | " if (i == 0) and (j != 0):\n", 128 | " if Matrix[0][j-1] and (Y[j] == Z[j]):\n", 129 | " Matrix[i][j] = True\n", 130 | " if (i != 0) and (j == 0):\n", 131 | " if Matrix[i-1][0] and (X[i] == Z[i]):\n", 132 | " Matrix[i][j] = True\n", 133 | " if (i != 0) and (j != 0):\n", 134 | " if (Matrix[i][j-1] and (Y[j] == Z[i+j])) or (Matrix[i-1][j] and X[i] == Z[i+j]):\n", 135 | " Matrix[i][j] = True\n", 136 | "\n", 137 | " return Matrix[n-1][m-1]\n", 138 | "\n", 139 | "X = 'chocolate'\n", 140 | "Y = 'chips'\n", 141 | "Z_True = 'cchocohilaptes' # True\n", 142 | "Z_False = 'chocochilatspe' # False\n", 143 | " \n", 144 | "print(is_shuffle(X, Y, Z_True))\n", 145 | "print(is_shuffle(X, Y, Z_False))" 146 | ] 147 | }, 148 | { 149 | "cell_type": "markdown", 150 | "metadata": {}, 151 | "source": [ 152 | "***" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "metadata": {}, 158 | "source": [ 159 | "## Greedy Algorithms" 160 | ] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "metadata": {}, 165 | "source": [ 166 | "### 8.5 [4]\n", 167 | "\n", 168 | "Let $P_1, P_2, . . . , P_n$ be $n$ programs to be stored on a disk with capacity $D$ megabytes. Program $P_i$ requires $s_i$ megabytes of storage. 
We cannot store them all because $D < \\sum_{i=1}^{n} s_i$.\n", 169 | "\n", 170 | "**(a):** Does a greedy algorithm that selects programs in order of nondecreasing $s_i$ maximize the number of programs held on the disk? Prove or give a counterexample. \n", 171 | "**(b):** Does a greedy algorithm that selects programs in order of nonincreasing $s_i$ use as much of the capacity of the disk as possible? Prove or give a counterexample." 172 | ] 173 | }, 174 | { 175 | "cell_type": "markdown", 176 | "metadata": {}, 177 | "source": [ 178 | "*Solution:*\n", 179 | "\n", 180 | "**(a):**\n", 181 | "\n", 182 | "The statement is true: a greedy algorithm would maximize the number of programs held on the disk. We'll prove this by contradiction.\n", 183 | "\n", 184 | "First, let: \n", 185 | "$P = $ set of $n$ programs \n", 186 | "$P^{(S)} = $ our \"solution\" set, a subset of $P$ \n", 187 | "$P^{(U)} = $ the set of programs **not** in the solution; the complement of $P^{(S)}$; not empty since \"we cannot store them all\" \n", 188 | "$P_{largest}^{(S)} = $ the largest program in the solution \n", 189 | "$P_{smallest}^{(U)} = $ the smallest program **not** in the solution \n", 190 | "\n", 191 | "A solution $P^{(S)}$ is *greedy* if $P_i < P_{largest}^{(S)}$ implies $P_i \\in P^{(S)}$, that is, if all the programs smaller than $P_{largest}^{(S)}$ are also in the solution $P^{(S)}$. This is equivalent to the condition: $P^{(S)}$ is *greedy* if $P_{smallest}^{(U)} \\geq P_{largest}^{(S)}$, that is, if the smallest program **not** in the solution is larger than or equal to the largest program **in** the solution.\n", 192 | "\n", 193 | "To prove that the greedy solution contains the largest possible number of programs, we'll assume otherwise and produce a contradiction. Suppose that $P^{(S)}$ contains the largest possible number of programs while having a total size less than $D$, and that it is not greedy. 
Further, assume that a greedy solution would contain a smaller number of programs, and therefore would not be optimal.\n", 194 | "\n", 195 | "\n", 196 | "Since the solution set $P^{(S)}$ is not the greedy solution, there must be programs **not** in the solution that are smaller than $P_{largest}^{(S)}$. Specifically, we must have $P_{smallest}^{(U)} < P_{largest}^{(S)}$. We now propose the following algorithm for converting the given non-greedy solution into the greedy solution: we keep placing the smallest not-included program $P_{smallest}^{(U)}$ into the solution set if there is room, and if there isn't, we swap it with the largest program currently in the solution $P_{largest}^{(S)}$. Repeat this until the greedy solution is reached. In pseudocode:\n", 197 | "\n", 198 | "$\\hspace{2em} \\text{while } P_{smallest}^{(U)} < P_{largest}^{(S)}:$ \n", 199 | "$\\hspace{4em} \\text{if } \\text{size}(P^{(S)}) + P_{smallest}^{(U)} \\leq D:$ \n", 200 | "$\\hspace{6em} \\text{add } P_{smallest}^{(U)} \\text{ to } P^{(S)}$ \n", 201 | "$\\hspace{4em} \\text{else}:$ \n", 202 | "$\\hspace{6em} \\text{swap } P_{smallest}^{(U)} \\text{ with } P_{largest}^{(S)}$ \n", 203 | "$\\hspace{2em} \\text{If there is room, extend } P^{(S)} \\text{ with the greedy algorithm}$\n", 204 | "\n", 205 | "We wish to show two things:\n", 206 | "\n", 207 | "**(1)** *that this process can always be done*: as long as the solution isn't greedy, and therefore $P_{smallest}^{(U)} < P_{largest}^{(S)}$, swapping them will only decrease the size of the solution set, and therefore will always be allowed; and\n", 208 | "\n", 209 | "**(2)** *that the resulting greedy solution contains at least as many programs as the original solution*: the swap operations maintain the same number of programs in the solution, but adding $P_{smallest}^{(U)}$ into the solution adds an additional program; therefore the size of the solution can only increase.\n", 210 | "\n", 211 | "Therefore the greedy solution will be at least 
as good (i.e., contain at least as many programs) as the original non-greedy solution. And if it contains the same number of programs, then it freed up more space on the disk by swapping larger programs for smaller ones.\n", 212 | "\n", 213 | "**(b):**\n", 214 | "\n", 215 | "No. Suppose we have three programs of size 4, 3, and 3, and that our disk size is $D = 6$. The greedy algorithm first stores the size-4 program, after which neither size-3 program fits, producing the solution set $\\{4 | 2\\}$ and leaving 2 units of free space. However, the solution $\\{3, 3 | 0\\}$ would leave no free space on the disk." 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "metadata": {}, 221 | "source": [ 222 | "### 8.6 [5]\n", 223 | "\n", 224 | "Coins in the United States are minted with denominations of 1, 5, 10, 25, and 50 cents. Now consider a country whose coins are minted with denominations of $\\{d_1, . . . , d_k\\}$ units. We seek an algorithm to make change of $n$ units using the minimum number of coins for this country.\n", 225 | "\n", 226 | "**(a):** The greedy algorithm repeatedly selects the biggest coin no bigger than the amount to be changed and repeats until it is zero. Show that the greedy algorithm does not always use the minimum number of coins in a country whose denominations are $\\{1, 6, 10\\}$. \n", 227 | "**(b):** Give an efficient algorithm that correctly determines the minimum number of coins needed to make change of $n$ units using denominations $\\{d_1, . . . , d_k\\}$. Analyze its running time." 228 | ] 229 | }, 230 | { 231 | "cell_type": "markdown", 232 | "metadata": {}, 233 | "source": [ 234 | "*Solution:*\n", 235 | "\n", 236 | "**(a):**\n", 237 | "\n", 238 | "Suppose we want change of 13 cents. The greedy algorithm will produce coins $\\{10, 1, 1, 1\\}$, while a smaller set of coins is $\\{6, 6, 1\\}$. 
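This counterexample can be checked mechanically; here is a minimal sketch of the biggest-coin-first greedy changer (my addition, not from the notebook):

```python
def greedy_change(n, D):
    # Repeatedly take the biggest coin that still fits into n.
    coins = []
    for d in sorted(D, reverse=True):
        while n >= d:
            n -= d
            coins.append(d)
    return coins

print(greedy_change(13, [1, 6, 10]))  # [10, 1, 1, 1]: four coins vs the optimal three, {6, 6, 1}
```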
Therefore the greedy algorithm did not produce a minimal set of coins.\n", 239 | "\n", 240 | "**(b):**\n", 241 | "\n", 242 | "Let $C[n] = $ the smallest number of coins needed to make change of $n$ cents.\n", 243 | "\n", 244 | "This can be written recursively as:\n", 245 | "\n", 246 | "$$C[n] = \\min_{d \\in D,\\, d \\leq n} C[n-d] + 1$$\n", 247 | "\n", 248 | "where $D = \\{d_1, . . . , d_k\\}$ are the denominations of the available coins, with base case $C[0] = 0$. For example, change of 57 cents using US coins is \n", 249 | "$$C[57] = \\min_{d \\in D,\\, d \\leq 57} C[57-d] + 1 = \\min \\left\\{ C[56],\\, C[52],\\, C[47],\\, C[32],\\, C[7] \\right\\} + 1$$\n", 250 | "\n", 251 | "Intuitively, 57 cents can be made with one additional coin from any set of coins with value $57 - d$, for any $d \\in D$.\n", 252 | "\n", 253 | "We can implement this recursive relation using dynamic programming, where to find the number of coins needed to make change of $n$ cents, we make a 1D array of length $n+1$ and calculate the value of $C[i]$ for all $i$ up to $n$.\n", 254 | "\n", 255 | "It is assumed that a 1-cent coin is included so that coins can produce any amount of cents. Due to this, every value between $1$ and $n$ will be calculated at some point. If this weren't true, it might be more space efficient to use explicit recursion with a dictionary cache, so that only the needed values are computed.\n", 256 | "\n", 257 | "The algorithm fills up an array of size $n+1$, and for each cell computes the minimum of $\\leq k$ values, where $k$ is the number of different types of coins, each of which is a constant time look up. Therefore the algorithm is $O(nk)$." 
258 | ] 259 | }, 260 | { 261 | "cell_type": "code", 262 | "execution_count": 2, 263 | "metadata": { 264 | "collapsed": false 265 | }, 266 | "outputs": [ 267 | { 268 | "name": "stdout", 269 | "output_type": "stream", 270 | "text": [ 271 | "CPU times: user 25 ms, sys: 795 µs, total: 25.8 ms\n", 272 | "Wall time: 25.3 ms\n" 273 | ] 274 | }, 275 | { 276 | "data": { 277 | "text/plain": [ 278 | "203" 279 | ] 280 | }, 281 | "execution_count": 2, 282 | "metadata": {}, 283 | "output_type": "execute_result" 284 | } 285 | ], 286 | "source": [ 287 | "def subtract_coins(n, C, D):\n", 288 | " A = []\n", 289 | " for coin in D:\n", 290 | " if n - coin >= 0:\n", 291 | " A.append(C[n - coin])\n", 292 | " return A\n", 293 | "\n", 294 | "def changer(n, D):\n", 295 | " C = [0]\n", 296 | " for i in range(1, n + 1):\n", 297 | " C.append(min(subtract_coins(i, C, D)) + 1)\n", 298 | " return C[n]\n", 299 | "\n", 300 | "D = [1, 5, 10, 25, 50]\n", 301 | "%time changer(10031, D)" 302 | ] 303 | }, 304 | { 305 | "cell_type": "markdown", 306 | "metadata": {}, 307 | "source": [ 308 | "**It is possible** to improve the efficiency on repeated function calls by using a **closure**. This allows the cached values to persist beyond a single call. However, the set of coin denominations $D$ must be constant. Therefore the `changer_with_closure(D)` function implemented below takes a set of coin denominations and produces a `changer(n)` function that will \"change\" different coin values with that fixed currency, with cached values persisting.\n", 309 | "\n", 310 | "Repeated calls are shown to be much faster." 
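An alternative to the hand-rolled closure cache (my addition, not in the original notebook) is `functools.lru_cache`, which persists results across calls in the same way, at the cost of hitting Python's recursion limit for very large $n$:

```python
from functools import lru_cache

def make_changer(D):
    @lru_cache(maxsize=None)
    def change(n):
        # Minimum number of coins for n cents; change(0) = 0 is the base case.
        if n == 0:
            return 0
        return min(change(n - d) for d in D if d <= n) + 1
    return change

change_us = make_changer((1, 5, 10, 25, 50))
print(change_us(57))  # 4 coins: 50 + 5 + 1 + 1
```

Note that the explicit bottom-up loop in the closure version avoids deep recursion, which is why it can handle amounts like 10031 that would overflow the default recursion limit here.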
311 | ] 312 | }, 313 | { 314 | "cell_type": "code", 315 | "execution_count": 3, 316 | "metadata": { 317 | "collapsed": false 318 | }, 319 | "outputs": [ 320 | { 321 | "name": "stdout", 322 | "output_type": "stream", 323 | "text": [ 324 | "CPU times: user 28.2 ms, sys: 1.47 ms, total: 29.7 ms\n", 325 | "Wall time: 30.3 ms\n" 326 | ] 327 | }, 328 | { 329 | "data": { 330 | "text/plain": [ 331 | "203" 332 | ] 333 | }, 334 | "execution_count": 3, 335 | "metadata": {}, 336 | "output_type": "execute_result" 337 | } 338 | ], 339 | "source": [ 340 | "# Implementation using a closure\n", 341 | "def changer_with_closure(D):\n", 342 | " C = [0]\n", 343 | " \n", 344 | " def dynamic_changer(n): \n", 345 | " if len(C) - 1 < n:\n", 346 | " for i in range(len(C), n + 1):\n", 347 | " C.append(min(subtract_coins(i)) + 1)\n", 348 | " return C[n]\n", 349 | " \n", 350 | " def subtract_coins(n):\n", 351 | " A = []\n", 352 | " for coin in D:\n", 353 | " if n - coin >= 0:\n", 354 | " A.append(C[n - coin])\n", 355 | " return A\n", 356 | " \n", 357 | " return dynamic_changer\n", 358 | " \n", 359 | "D = [1, 5, 10, 25, 50]\n", 360 | "changer = changer_with_closure(D)\n", 361 | "%time changer(10031)" 362 | ] 363 | }, 364 | { 365 | "cell_type": "code", 366 | "execution_count": 4, 367 | "metadata": { 368 | "collapsed": false 369 | }, 370 | "outputs": [ 371 | { 372 | "name": "stdout", 373 | "output_type": "stream", 374 | "text": [ 375 | "CPU times: user 3 µs, sys: 0 ns, total: 3 µs\n", 376 | "Wall time: 5.01 µs\n" 377 | ] 378 | }, 379 | { 380 | "data": { 381 | "text/plain": [ 382 | "203" 383 | ] 384 | }, 385 | "execution_count": 4, 386 | "metadata": {}, 387 | "output_type": "execute_result" 388 | } 389 | ], 390 | "source": [ 391 | "# Repeat SAME computation on 100.31 dollars \n", 392 | "%time changer(10031)" 393 | ] 394 | }, 395 | { 396 | "cell_type": "code", 397 | "execution_count": 5, 398 | "metadata": { 399 | "collapsed": false 400 | }, 401 | "outputs": [ 402 | { 403 | "name": "stdout", 404 | 
"output_type": "stream", 405 | "text": [ 406 | "Change for 0 cents with 0 coins\n", 407 | "Change for 1 cents with 1 coins\n", 408 | "Change for 2 cents with 2 coins\n", 409 | "Change for 3 cents with 3 coins\n", 410 | "Change for 4 cents with 4 coins\n", 411 | "Change for 5 cents with 1 coins\n", 412 | "Change for 6 cents with 2 coins\n", 413 | "Change for 7 cents with 3 coins\n", 414 | "Change for 8 cents with 4 coins\n", 415 | "Change for 9 cents with 5 coins\n", 416 | "Change for 10 cents with 1 coins\n", 417 | "Change for 11 cents with 2 coins\n", 418 | "Change for 12 cents with 3 coins\n", 419 | "Change for 13 cents with 4 coins\n", 420 | "Change for 14 cents with 5 coins\n", 421 | "Change for 15 cents with 2 coins\n", 422 | "Change for 16 cents with 3 coins\n", 423 | "Change for 17 cents with 4 coins\n", 424 | "Change for 18 cents with 5 coins\n", 425 | "Change for 19 cents with 6 coins\n", 426 | "Change for 20 cents with 2 coins\n", 427 | "Change for 21 cents with 3 coins\n", 428 | "Change for 22 cents with 4 coins\n", 429 | "Change for 23 cents with 5 coins\n", 430 | "Change for 24 cents with 6 coins\n", 431 | "Change for 25 cents with 1 coins\n" 432 | ] 433 | } 434 | ], 435 | "source": [ 436 | "for i in range(0, 26):\n", 437 | " print('Change for %d cents with %d coins' % (i, changer(i)))" 438 | ] 439 | }, 440 | { 441 | "cell_type": "markdown", 442 | "metadata": {}, 443 | "source": [ 444 | "***" 445 | ] 446 | }, 447 | { 448 | "cell_type": "markdown", 449 | "metadata": {}, 450 | "source": [ 451 | "## Number Problems" 452 | ] 453 | }, 454 | { 455 | "cell_type": "markdown", 456 | "metadata": {}, 457 | "source": [ 458 | "***" 459 | ] 460 | }, 461 | { 462 | "cell_type": "markdown", 463 | "metadata": {}, 464 | "source": [ 465 | "## Graph Problems" 466 | ] 467 | }, 468 | { 469 | "cell_type": "markdown", 470 | "metadata": {}, 471 | "source": [ 472 | "***" 473 | ] 474 | }, 475 | { 476 | "cell_type": "markdown", 477 | "metadata": {}, 478 | "source": [ 479 | "## 
Design Problems" 480 | ] 481 | }, 482 | { 483 | "cell_type": "markdown", 484 | "metadata": {}, 485 | "source": [ 486 | "### 8.18 [4] Unfinished\n", 487 | "\n", 488 | "Consider the problem of storing $n$ books on shelves in a library. The order of the books is fixed by the cataloging system and so cannot be rearranged. Therefore, we can speak of a book $b_i$, where $1 \\leq i \\leq n$, that has a thickness $t_i$ and height $h_i$. The length of each bookshelf at this library is $L$.\n", 489 | "\n", 490 | "Suppose all the books have the same height $h$ (i.e. , $h = h_i = h_j$ for all $i$, $j$) and the shelves are all separated by a distance of greater than $h$, so any book fits on any shelf. The greedy algorithm would fill the first shelf with as many books as we can until we get the smallest $i$ such that $b_i$ does not fit, and then repeat with subsequent shelves. Show that the greedy algorithm always finds the optimal shelf placement, and analyze its time complexity." 491 | ] 492 | }, 493 | { 494 | "cell_type": "markdown", 495 | "metadata": {}, 496 | "source": [ 497 | "*Solution:*\n", 498 | "\n", 499 | "First note that the fact that you cannot reorder the books is crucial for the correctness of the greedy algorithm. For example, suppose 7 books have thicknesses $\\{20, 20, 20, 50, 20, 20, 50\\}$ and shelves have length $100$. If we had to keep them in order, the greedy algorithm would place them on shelves as $\\text{Shelf 1} = \\{20, 20, 20 | -40\\}$, $\\text{Shelf 2} = \\{50, 20, 20 | -10\\}$, and $\\text{Shelf 3} = \\{50 | -50\\}$, using 3 shelves with 100 units of free space. If they could be reordered, the optimal packing would be $\\text{Shelf 1} = \\{50, 50 |0\\}$ and $\\text{Shelf 2} = \\{20, 20, 20, 20, 20 | 0\\}$, using 2 shelves and leaving no free space.\n", 500 | "\n", 501 | "Second, note that Skiena does not define what he means by the \"optimal shelf placement\". 
This could mean \"using the smallest number of shelves\" or \"leaving the smallest amount of space free\". However, these two definitions are equivalent; an arrangement of the same books on the same number of shelves will take up the same amount of space, and therefore will leave the same amount of space free. For two arrangements to leave different amounts of space free, they must use different numbers of shelves.\n", 502 | "\n", 503 | "We will prove the optimality of the greedy algorithm by contradiction. Suppose that a given arrangement $P$ uses $n$ shelves, but the greedy algorithm would use more than $n$ shelves.\n", 504 | "\n", 505 | "For simplicity, assume that there is no empty shelf space between books, meaning that on every shelf the books are pushed all the way to the left, and any free space occurs to their right.\n", 506 | "\n", 507 | "Without loss of generality, we will speak of the books being placed in their positions in catalog order.\n", 508 | "\n", 509 | "Because the current arrangement $P$ is not greedy, there must have occurred a time when a book was placed on the next shelf, even though there was still room for it on the current shelf. Therefore in the current arrangement, there must be a book $i$ at the beginning of a shelf that can be moved to the end of the previous shelf. Let us make that move. Doing so will reduce the amount of free space on the earlier shelf and will increase the amount of free space on the later shelf. If the moved book was the only book on its shelf, then we have just freed up a shelf. Any further moves will ignore that shelf. If we repeat this move as many times as possible, we will have produced the greedy arrangement. And because each move can only decrease the number of shelves used, never increase it, we can be sure that the greedy solution uses no more shelves than the original, \"optimal\" arrangement. 
But this contradicts our assumption that the greedy solution was not optimal.\n", 510 | "\n", 511 | "Therefore, the greedy arrangement is optimal." 512 | ] 513 | }, 514 | { 515 | "cell_type": "markdown", 516 | "metadata": {}, 517 | "source": [ 518 | "***" 519 | ] 520 | }, 521 | { 522 | "cell_type": "markdown", 523 | "metadata": {}, 524 | "source": [ 525 | "## Interview Problems" 526 | ] 527 | }, 528 | { 529 | "cell_type": "markdown", 530 | "metadata": {}, 531 | "source": [ 532 | "### 8.24 [5]\n", 533 | "\n", 534 | "Given a set of coin denominations, find the minimum number of coins to make a certain amount of change." 535 | ] 536 | }, 537 | { 538 | "cell_type": "markdown", 539 | "metadata": {}, 540 | "source": [ 541 | "*Solution:*\n", 542 | "\n", 543 | "Let $D = \\{d_1, ..., d_k\\}$ be the denominations of $k$ different types of coins in a currency, and let $C[n]$ be the minimal number of coins needed to equal $n$ cents. If we knew the value of $C[i]$ for all $i < n$, we could look up the number of coins needed to produce values of $n-d$ for all $d \\in D$. From each of these sets of coins, we would only need to add a single coin of denomination $d$ to equal $n$ cents. Therefore $C[n]$ is equal to the number of coins in the smallest of these sets, plus one. Written out, we have the recursion relation:\n", 544 | "\n", 545 | "$$C[n] = \\min_{d \\in D, \\, d \\leq n} C[n-d] + 1$$\n", 546 | "\n", 547 | "where the additional condition $d \\leq n$ is needed to ensure that $n-d \\geq 0$; otherwise we would be trying to produce a negative amount of cents, which... doesn't make sense.\n", 548 | "\n", 549 | "This recursive relation can be used to implement a \"dynamic\" program that calculates the value of $C[i]$ for all $i$ up to $n$, caching the results. Each computation will require a comparison of up to $k$ different values. 
Therefore the algorithm will be $O(nk)$.\n", 550 | "\n", 551 | "As done above for problem 8.6, this could be implemented using a closure so that the cached computations will persist across function calls. But in the spirit of an \"interview\" problem, I implemented this quickly and focused on correctness, not additional functionality." 552 | ] 553 | }, 554 | { 555 | "cell_type": "code", 556 | "execution_count": 6, 557 | "metadata": { 558 | "collapsed": false 559 | }, 560 | "outputs": [ 561 | { 562 | "name": "stdout", 563 | "output_type": "stream", 564 | "text": [ 565 | "0 cents with 0 coins\n", 566 | "1 cents with 1 coins\n", 567 | "2 cents with 2 coins\n", 568 | "3 cents with 3 coins\n", 569 | "4 cents with 4 coins\n", 570 | "5 cents with 1 coins\n", 571 | "6 cents with 2 coins\n", 572 | "7 cents with 3 coins\n", 573 | "8 cents with 4 coins\n", 574 | "9 cents with 5 coins\n", 575 | "10 cents with 1 coins\n", 576 | "11 cents with 2 coins\n", 577 | "12 cents with 3 coins\n", 578 | "13 cents with 4 coins\n", 579 | "14 cents with 5 coins\n", 580 | "15 cents with 2 coins\n", 581 | "16 cents with 3 coins\n", 582 | "17 cents with 4 coins\n", 583 | "18 cents with 5 coins\n", 584 | "19 cents with 6 coins\n", 585 | "20 cents with 2 coins\n", 586 | "21 cents with 3 coins\n", 587 | "22 cents with 4 coins\n", 588 | "23 cents with 5 coins\n", 589 | "24 cents with 6 coins\n", 590 | "25 cents with 1 coins\n" 591 | ] 592 | } 593 | ], 594 | "source": [ 595 | "def changer(n, D):\n", 596 | " C = []\n", 597 | " C.append(0)\n", 598 | " for i in range(1, n + 1):\n", 599 | " C.append(min(smaller_coins(i, C, D)) + 1)\n", 600 | " return C[n]\n", 601 | "\n", 602 | "def smaller_coins(i, C, D):\n", 603 | " A = []\n", 604 | " for d in D:\n", 605 | " value = i - d\n", 606 | " if value >= 0:\n", 607 | " A.append(C[value])\n", 608 | " return A\n", 609 | "\n", 610 | "D = [1, 5, 10, 25]\n", 611 | "for i in range(26):\n", 612 | " print('%d cents with %d coins' % (i, changer(i, D)))" 613 | ] 614 | }, 
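The same recurrence can also be evaluated top-down with memoization, caching only the amounts actually reached instead of filling the whole table. A minimal sketch of that idea (the `make_change` name and the use of `functools.lru_cache` are my own, not from the text):

```python
from functools import lru_cache

def make_change(n, D):
    """Top-down evaluation of C[n] = min over {d in D : d <= n} of C[n - d] + 1.

    Assumes D contains a 1-cent coin, so every amount can be made.
    """
    D = tuple(D)  # freeze the denominations so the cached helper can close over them

    @lru_cache(maxsize=None)
    def C(i):
        if i == 0:
            return 0  # zero cents needs zero coins
        return min(C(i - d) for d in D if d <= i) + 1

    return C(n)
```

Note that Python's default recursion limit makes the bottom-up version above the safer choice for amounts in the thousands.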
615 | { 616 | "cell_type": "markdown", 617 | "metadata": {}, 618 | "source": [ 619 | "### 8.25 [5]\n", 620 | "\n", 621 | "You are given an array of $n$ numbers, each of which may be positive, negative, or zero. Give an efficient algorithm to identify the index positions $i$ and $j$ that maximize the sum of the ith through jth numbers." 622 | ] 623 | }, 624 | { 625 | "cell_type": "markdown", 626 | "metadata": {}, 627 | "source": [ 628 | "*Solution:*\n", 629 | "\n", 630 | "Let $A = \\{a_1, ..., a_n\\}$ be the array of numbers.\n", 631 | "\n", 632 | "A brute force method to find the largest contiguous sum would try all possible combinations of $i$ and $j$, calculating the sum each time. This would be $O(n^3)$.\n", 633 | "\n", 634 | "If we pre-compute the partial sums $S[i] = \\sum_{x=0}^{i} a_x$ and define $S[-1] = 0$, then the sum of the numbers between $i$ and $j$ would be $S[j] - S[i-1]$. This is a constant time operation, reducing our complexity to $O(n^2)$.\n", 635 | "\n", 636 | "Sorting the array isn't allowed, but an $O(n \\log n)$ divide-and-conquer solution does exist; still, we can do better.\n", 637 | "\n", 638 | "This problem has an inherent left-rightness, and so a dynamic programming solution might work. Since there are 2 indices, we will try to find a solution that constructs either a 1D or 2D matrix.\n", 639 | "\n", 640 | "Suppose we isolate a left portion of the set $A$ between positions $0$ and some $j < n$. Furthermore, suppose we knew the maximum possible sum of a contiguous set of numbers *ending on position $j$*. We can denote this as $C[j]$. Now we want to know the largest possible sum of contiguous numbers *ending on position $j+1$*, $C[j+1]$. Well, $A[j+1]$ must be included in the sum. Is it possible to produce a sum larger than $A[j+1]$ by including previous numbers? If so, the maximum possible contribution of this set of \"previous numbers\" is exactly $C[j]$. 
Therefore these numbers should be included only if $C[j] > 0$.\n", 641 | "\n", 642 | "Writing out this recursion relation, we have:\n", 643 | "$$ C[j] = A[j] + \\max\\left(0, C[j-1]\\right)$$\n", 644 | "\n", 645 | "This is a constant time operation. Therefore computing $C[j]$ for all $j$ from $0$ to $n-1$ will be $O(n)$. Afterwards, we can do a single linear scan to find the maximum value in $C$, which will be the largest possible contiguous sum. Then we can do one more scan backwards from the location of the maximum to find the matching index $i$ that produces that sum.\n", 646 | "\n", 647 | "To increase efficiency, the first of those two final linear scans can be avoided by tracking the largest value seen so far during the computation of the array $C$. This costs a small, constant amount of space, and saves us a search over all $n$ items in the array. \n", 648 | "\n", 649 | "The second final linear scan to locate the matching index $i$ can also be avoided if, during the computation of $C$, we record in a second array $I$ the locations of the corresponding $i$ indices. This will double the amount of linear space used, but save us a linear amount of time, depending on how many values are in the maximal contiguous set, which could be just $1$ or could be all $n$.\n", 650 | "\n", 651 | "These two improvements reduce the number of linear scans over the array from 3 to 1.\n", 652 | "\n", 653 | "This algorithm will be $O(n)$."
654 | ] 655 | }, 656 | { 657 | "cell_type": "code", 658 | "execution_count": 1, 659 | "metadata": { 660 | "collapsed": false 661 | }, 662 | "outputs": [ 663 | { 664 | "name": "stdout", 665 | "output_type": "stream", 666 | "text": [ 667 | "Largest sum is 11 between i = 0 and j = 5\n" 668 | ] 669 | } 670 | ], 671 | "source": [ 672 | "def largest_contiguous_sequence(A):\n", 673 | " n = len(A)\n", 674 | " I = [0]\n", 675 | " C = [A[0]]\n", 676 | " max_j = 0\n", 677 | " \n", 678 | " for j in range(1, n):\n", 679 | " i = j\n", 680 | " value = A[j]\n", 681 | " if C[-1] > 0:\n", 682 | " value += C[-1]\n", 683 | " i = I[-1]\n", 684 | " I.append(i)\n", 685 | " C.append(value)\n", 686 | " if value > C[max_j]:\n", 687 | " max_j = j\n", 688 | " \n", 689 | " print('Largest sum is %d between i = %d and j = %d' % (C[max_j], I[max_j], max_j))\n", 690 | "\n", 691 | "largest_contiguous_sequence([10, -1, -1, -1, -1, 5, -1])" 692 | ] 693 | }, 694 | { 695 | "cell_type": "code", 696 | "execution_count": null, 697 | "metadata": { 698 | "collapsed": true 699 | }, 700 | "outputs": [], 701 | "source": [] 702 | } 703 | ], 704 | "metadata": { 705 | "kernelspec": { 706 | "display_name": "Python 3", 707 | "language": "python", 708 | "name": "python3" 709 | }, 710 | "language_info": { 711 | "codemirror_mode": { 712 | "name": "ipython", 713 | "version": 3 714 | }, 715 | "file_extension": ".py", 716 | "mimetype": "text/x-python", 717 | "name": "python", 718 | "nbconvert_exporter": "python", 719 | "pygments_lexer": "ipython3", 720 | "version": "3.5.1" 721 | } 722 | }, 723 | "nbformat": 4, 724 | "nbformat_minor": 0 725 | } 726 | -------------------------------------------------------------------------------- /Chapter 9 - Intractable Problems and Approximation Algorithms.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Chapter 9: Intractable Problems and 
Approximation Algorithms (Completed 7/30: 23%)" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "## Transformations and Satisfiability" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "### 9.1 [2]\n", 22 | "\n", 23 | "Give the 3-SAT formula that results from applying the reduction of SAT to 3-SAT for the formula:\n", 24 | "\n", 25 | "$$ (x + y + \\bar{z} + w + u + \\bar{v}) \\cdot (\\bar{x} + \\bar{y} + z + \\bar{w} + u + v) \\cdot (x + \\bar{y} + \\bar{z} + w + u + \\bar{v}) \\cdot (x + \\bar{y})$$" 26 | ] 27 | }, 28 | { 29 | "cell_type": "markdown", 30 | "metadata": {}, 31 | "source": [ 32 | "*Solution:*\n", 33 | "\n", 34 | "We have 4 clauses:\n", 35 | "\n", 36 | "$C_1 = (x + y + \\bar{z} + w + u + \\bar{v})$ \n", 37 | "$C_2 = (\\bar{x} + \\bar{y} + z + \\bar{w} + u + v)$ \n", 38 | "$C_3 = (x + \\bar{y} + \\bar{z} + w + u + \\bar{v})$ \n", 39 | "$C_4 = (x + \\bar{y})$\n", 40 | "\n", 41 | "We wish to transform each clause into a set of 3-clauses such that sets of truth assignments that satisfy the original clause also satisfy the engendered set of 3-clauses, and vice versa. 
We will follow the transformation procedure given in the text.\n", 42 | "\n", 43 | "$C_1 = (x + y + \\bar{z} + w + u + \\bar{v})$ transforms to\n", 44 | "\n", 45 | "$C_1^{1} = (x + y + a_1)$ \n", 46 | "$C_1^{2} = (\\bar{a_1} + \\bar{z} + a_2)$ \n", 47 | "$C_1^{3} = (\\bar{a_2} + w + a_3)$ \n", 48 | "$C_1^{4} = (\\bar{a_3} + u + \\bar{v})$\n", 49 | "\n", 50 | "$C_2 = (\\bar{x} + \\bar{y} + z + \\bar{w} + u + v)$ transforms to\n", 51 | "\n", 52 | "$C_2^{1} = (\\bar{x} + \\bar{y} + a_4)$ \n", 53 | "$C_2^{2} = (\\bar{a_4} + z + a_5)$ \n", 54 | "$C_2^{3} = (\\bar{a_5} + \\bar{w} + a_6)$ \n", 55 | "$C_2^{4} = (\\bar{a_6} + u + v)$\n", 56 | "\n", 57 | "$C_3 = (x + \\bar{y} + \\bar{z} + w + u + \\bar{v})$ transforms to\n", 58 | "\n", 59 | "$C_3^{1} = (x + \\bar{y} + a_7)$ \n", 60 | "$C_3^{2} = (\\bar{a_7} + \\bar{z} + a_8)$ \n", 61 | "$C_3^{3} = (\\bar{a_8} + w + a_9)$ \n", 62 | "$C_3^{4} = (\\bar{a_9} + u + \\bar{v})$\n", 63 | "\n", 64 | "and $C_4 = (x + \\bar{y})$ transforms to\n", 65 | "\n", 66 | "$C_4^{1} = (x + \\bar{y} + a_{10})$ \n", 67 | "$C_4^{2} = (x + \\bar{y} + \\bar{a_{10}})$\n", 68 | "\n", 69 | "\n", 70 | "$C_1$, which contains 6 literals, is transformed into $6-2 = 4$ clauses (every literal except the first and the last gets its own clause) with the addition of $6-3 = 3$ new variables. Since the number of clauses exceeds the number of new variables by 1, the new variables by themselves are not able to ensure that all the clauses are True; at least one of the original literals must also be True, meaning that the original clause must be True. The same reasoning also shows that the satisfiability of clauses 2 and 3 is equivalent to their engendered sets of clauses. Similarly, regarding the two clauses engendered by clause 4, the new variable $a_{10}$ cannot make both clauses True. Rather, both can be True only if one of the original two literals is True, which is equivalent to the original clause being True. 
Therefore the satisfiability of clause 4 is equivalent to the satisfiability of its two engendered clauses.\n", 71 | "\n", 72 | "Putting everything together, the equivalent 3-SAT formula is:\n", 73 | "\n", 74 | "$$\n", 75 | "(x + y + a_1)\n", 76 | "\\cdot (\\bar{a_1} + \\bar{z} + a_2)\n", 77 | "\\cdot (\\bar{a_2} + w + a_3)\n", 78 | "\\cdot (\\bar{a_3} + u + \\bar{v})$$\n", 79 | "$$\\cdot (\\bar{x} + \\bar{y} + a_4)\n", 80 | "\\cdot (\\bar{a_4} + z + a_5)\n", 81 | "\\cdot (\\bar{a_5} + \\bar{w} + a_6)\n", 82 | "\\cdot (\\bar{a_6} + u + v)$$\n", 83 | "$$\\cdot (x + \\bar{y} + a_7)\n", 84 | "\\cdot (\\bar{a_7} + \\bar{z} + a_8)\n", 85 | "\\cdot (\\bar{a_8} + w + a_9)\n", 86 | "\\cdot (\\bar{a_9} + u + \\bar{v})$$\n", 87 | "$$\\cdot (x + \\bar{y} + a_{10})\n", 88 | "\\cdot (x + \\bar{y} + \\bar{a_{10}})\n", 89 | "$$\n", 90 | "\n", 91 | "Note that for every new variable, the negated literal appears in the very next clause. It is this sort of chaining that ensures that the new variables don't alter the satisfiability of the original clauses and literals.\n", 92 | "\n" 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "metadata": {}, 98 | "source": [ 99 | "### 9.3 [4]\n", 100 | "\n", 101 | "Suppose we are given a subroutine which can solve the traveling salesman decision problem of page 318 in, say, linear time. Give an efficient algorithm to find the actual TSP tour by making a polynomial number of calls to this subroutine." 102 | ] 103 | }, 104 | { 105 | "cell_type": "markdown", 106 | "metadata": {}, 107 | "source": [ 108 | "*Solution:*\n", 109 | "\n", 110 | "The Traveling Salesman Problem as given on page 319 is:\n", 111 | "\n", 112 | "*Problem:* The Traveling Salesman Decision Problem \n", 113 | "*Input:* A weighted graph $G$ and integer $k$. \n", 114 | "*Output:* Does there exist a TSP tour with cost $\\leq k$?\n", 115 | "\n", 116 | "Suppose that we have a subroutine that can solve the decision problem version of TSP in $O(m)$ running time. 
We could use this to find the smallest tour. First we call the subroutine with increasing values of $k$ until we find the smallest $k$ for which a tour exists. This will be $O(mW)$, where $W$ is the total weight of the graph (a binary search over $k$ would reduce this to $O(m \\log W)$). Then we iterate over the edges in the graph, and for each edge, we delete the edge and call the subroutine again; if there still exists a tour of cost $\\leq k$, we keep the edge deleted, otherwise we restore it. Only edges that are a part of every tour that is possible with the remaining edges, and therefore are \"necessary\", will remain. But if all remaining edges are part of every remaining possible tour, then there is only one such tour, composed of exactly the remaining edges. Therefore this algorithm finds a tour of weight $k$, where $k$ is optimal. The second part of the algorithm runs in $O(m|E|)$ time, for a total running time of $O(m(W + |E|))$." 117 | ] 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "metadata": {}, 122 | "source": [ 123 | "---" 124 | ] 125 | }, 126 | { 127 | "cell_type": "markdown", 128 | "metadata": {}, 129 | "source": [ 130 | "## Basic Reductions" 131 | ] 132 | }, 133 | { 134 | "cell_type": "markdown", 135 | "metadata": {}, 136 | "source": [ 137 | "### 9.7 [4]\n", 138 | "\n", 139 | "An instance of the *set cover* problem consists of a set $X$ of $n$ elements, a family $F$ of subsets of $X$, and an integer $k$. The question is, does there exist $k$ subsets from $F$ whose union is $X$?\n", 140 | "\n", 141 | "For example, if $X = \\{1, 2, 3, 4\\}$ and $F = \\{\\{1, 2\\}, \\{2, 3\\}, \\{4\\}, \\{2, 4\\}\\}$, there does not exist a solution for $k = 2$, but there does for $k = 3$ (for example, $\\{1, 2\\}, \\{2, 3\\}, \\{4\\}$).\n", 142 | "\n", 143 | "Prove that set cover is NP-complete with a reduction from vertex cover." 
144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "metadata": {}, 149 | "source": [ 150 | "*Solution:*\n", 151 | "\n", 152 | "We wish to map instances of the Vertex Cover problem (VCP) to equivalent instances of the Set Cover problem (SCP).\n", 153 | "\n", 154 | "In VCP, we seek to select a set of vertices such that every edge is covered. In SCP we seek to select a set of sets such that every element in the universal set is covered. This suggests we map each edge of our graph $G$ to a unique number from $1$ to $|E| = n$, and map each vertex of $G$ to a set that contains the numbers corresponding to its adjacent edges. If a collection of sets is found whose union is every number from $1$ to $n$, this can be translated back to a set of vertices that are adjacent to every edge.\n", 155 | "\n", 156 | "This translation from instances of VCP to SCP can be made in polynomial time, requiring every edge and vertex in the graph to be visited once.\n", 157 | "\n", 158 | "With this correspondence in place, if a polynomial-time algorithm were to be found to solve the Set Cover problem, it could also be used to solve the Vertex Cover problem. But the Vertex Cover problem is known to be NP-complete, meaning that no deterministic polynomial-time algorithm is known to solve it; nor is one likely to exist. Therefore none can exist for the Set Cover problem either, otherwise the NP-completeness of VCP would be violated.\n", 159 | "\n", 160 | "Therefore the *Set Cover* problem must be at least as hard as the *Vertex Cover* problem, and is therefore NP-complete." 161 | ] 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "metadata": {}, 166 | "source": [ 167 | "### 9.8 [4]\n", 168 | "\n", 169 | "The *baseball card collector problem* is as follows. Given packets $P_1, . . . 
, P_m$, each of which contains a subset of this year’s baseball cards, is it possible to collect all the year’s cards by buying $\\leq k$ packets?\n", 170 | "\n", 171 | "For example, if the players are $\\{Aaron,Mays,Ruth, Skiena\\}$ and the packets are\n", 172 | "\n", 173 | "$$ \\{\\{Aaron,Mays\\}, \\{Mays,Ruth\\}, \\{Skiena\\}, \\{Mays, Skiena\\}\\}, $$\n", 174 | "\n", 175 | "there does not exist a solution for $k = 2$, but there does for $k = 3$, such as\n", 176 | "\n", 177 | "$$ \\{Aaron,Mays\\}, \\{Mays,Ruth\\}, \\{Skiena\\} $$\n", 178 | "\n", 179 | "Prove that the baseball card collector problem is NP-hard using a reduction from vertex cover." 180 | ] 181 | }, 182 | { 183 | "cell_type": "markdown", 184 | "metadata": {}, 185 | "source": [ 186 | "*Solution:*\n", 187 | "\n", 188 | "We wish to map instances of the Vertex Cover problem (VCP) to instances of the Baseball Card Collector problem (BCCP).\n", 189 | "\n", 190 | "In VCP we seek to select a subset of vertices, while in BCCP we seek to select a subset of packets. This suggests we map vertices in the VCP graph to packets, and the edges adjacent to each vertex to the contents of its corresponding packet. Specifically, label every edge with a unique number from $1$ to $|E| = n$. Packets will contain the numbers of the edges adjacent to the corresponding vertex. The union of the packets is the total set of this year's baseball cards, in this case the total set of edges.\n", 191 | "\n", 192 | "An output of BCCP will select a subset of packets (vertices) such that every baseball card (edge in $E$) is contained in their union. This is identical to the definition of every edge being \"covered\" by a subset of vertices. Therefore outputs of BCCP are correct outputs for VCP.\n", 193 | "\n", 194 | "With this correspondence between the two problems in place, we can use algorithms for BCCP to solve VCP. Since VCP is known to be NP-complete, BCCP must be NP-complete as well. 
Specifically, if a deterministic polynomial time algorithm were found to solve BCCP, it could be used to solve VCP, thereby violating the purported \"hardness\" of VCP.\n", 195 | "\n", 196 | "Another way of thinking about this is that this correspondence between VCP and BCCP shows that BCCP must be \"at least as hard as VCP\". This is because the space of possible inputs to VCP is included in the space of inputs to BCCP; but the space of inputs to BCCP may be larger, meaning there may be inputs to BCCP that do not correspond to VCP inputs. Since BCCP is \"at least as hard as VCP\", and VCP is known to be NP-complete, BCCP must be NP-complete as well. The definition of NP-complete includes being NP-hard. Therefore BCCP is NP-hard and NP-complete." 197 | ] 198 | }, 199 | { 200 | "cell_type": "markdown", 201 | "metadata": {}, 202 | "source": [ 203 | "---" 204 | ] 205 | }, 206 | { 207 | "cell_type": "markdown", 208 | "metadata": {}, 209 | "source": [ 210 | "## Creative Reductions" 211 | ] 212 | }, 213 | { 214 | "cell_type": "markdown", 215 | "metadata": {}, 216 | "source": [ 217 | "### 9.13 [5]\n", 218 | "\n", 219 | "Prove that the following problem is NP-complete:\n", 220 | "\n", 221 | "*Problem:* Hitting Set \n", 222 | "*Input:* A collection $C$ of subsets of a set $S$, positive integer $k$. \n", 223 | "*Output:* Does $S$ contain a subset $S'$ such that $|S'| \\leq k$ and each subset in $C$ contains at least one element from $S'$?" 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "metadata": {}, 229 | "source": [ 230 | "*Solution:*\n", 231 | "\n", 232 | "This is a selection problem, and therefore we will try to use a reduction from the Vertex Cover problem (VCP), the archetype of \"Selection\".\n", 233 | "\n", 234 | "VCP is NP-complete. 
Therefore if we can map instances of VCP to instances of the Hitting Set problem (HSP), then HSP must be at least as hard as VCP, otherwise an algorithm for HSP could be used to violate the hardness of VCP.\n", 235 | "\n", 236 | "Note that a graph is already in the form of an input to HSP. Specifically, the set of vertices $V$ is the set $S$, and the collection of edges $E$ is the collection $C$ of subsets of $S$, where each edge is a subset of size 2 containing the two vertices it connects.\n", 237 | "\n", 238 | "The output from HSP is whether there is a subset (of vertices) of size less than or equal to $k$ such that each subset (edge) in $C$ ($E$) contains at least one element from this subset. The condition that every edge contain an element from $S'$ ($V'$) is the definition of what it means for every edge to be \"covered\". Therefore a solution to HSP is a solution to the equivalent VCP problem.\n", 239 | "\n", 240 | "Interestingly, note that the size of the input space to HSP is larger than that of VCP, since every input of VCP has an equivalent HSP input, but most HSP inputs can't be mapped to VCP inputs. Specifically, subsets of $S$ that have more than 2 elements can't be interpreted as edges of a graph. So even if we were to restrict the types of inputs of HSP to allow only subsets of size 2, the problem would still be NP-complete. This reduction elucidates what makes HSP difficult, namely that it is a problem of selection." 241 | ] 242 | }, 243 | { 244 | "cell_type": "markdown", 245 | "metadata": {}, 246 | "source": [ 247 | "---" 248 | ] 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "metadata": {}, 253 | "source": [ 254 | "## Algorithms for Special Cases" 255 | ] 256 | }, 257 | { 258 | "cell_type": "markdown", 259 | "metadata": {}, 260 | "source": [ 261 | "---" 262 | ] 263 | }, 264 | { 265 | "cell_type": "markdown", 266 | "metadata": {}, 267 | "source": [ 268 | "## P=NP?" 
269 | ] 270 | }, 271 | { 272 | "cell_type": "markdown", 273 | "metadata": {}, 274 | "source": [ 275 | "### 9.23 [4]\n", 276 | "\n", 277 | "Show that the following problems are in NP:\n", 278 | "- Does graph $G$ have a simple path (i.e. , with no vertex repeated) of length $k$?\n", 279 | "- Is integer $n$ composite (i.e. , not prime)?\n", 280 | "- Does graph $G$ have a vertex cover of size $k$?" 281 | ] 282 | }, 283 | { 284 | "cell_type": "markdown", 285 | "metadata": {}, 286 | "source": [ 287 | "*Solution:*\n", 288 | "\n", 289 | "A problem is in NP if a potential solution can be verified in polynomial time.\n", 290 | "\n", 291 | "**- Does graph $G$ have a simple path (i.e. , with no vertex repeated) of length $k$?**\n", 292 | "\n", 293 | "If given a potential path $P$, we could traverse the path, keeping track of which vertices we have visited and maintaining a counter of how many we have visited. If we never revisit a vertex and if the counter equals $k$ at the end of the path, then the solution is valid. This traversal will take linear time, which is polynomial time. Therefore this problem is in NP.\n", 294 | "\n", 295 | "**- Is integer $n$ composite (i.e. , not prime)?**\n", 296 | "\n", 297 | "If given a number $n$ that is supposedly not prime, can we verify this in polynomial time? Yes, provided the certificate is a purported nontrivial factor $d$ of $n$. We check that $2 \\leq d < n$ and that $d$ evenly divides $n$, which requires a single division. This is polynomial in the number of digits of $n$. (Trial division by all numbers up to $\\sqrt{n}$ would also decide the question, but it is exponential in the number of digits, so it does not establish membership in NP.) Therefore this problem is in NP.\n", 298 | "\n", 299 | "**- Does graph $G$ have a vertex cover of size $k$?**\n", 300 | "\n", 301 | "If given a graph $G$ and a subset of vertices $V'$, can we verify that $V'$ is a cover of size $k$ in polynomial time? Yes. We can iterate over the vertices in $V'$ and for each vertex, delete its adjacent edges from $E$ and increment a counter. After the iteration is complete, we check that the counter equals $k$ and that there are no edges remaining in $E$. 
This involves a loop of size $\\leq |V|$ over the cover vertices, and for each vertex a loop of size $\\leq |E|$ over the adjacent edges. Therefore this verification algorithm is $O(|V| \\times |E|)$ and so the problem is in NP." 302 | ] 303 | }, 304 | { 305 | "cell_type": "markdown", 306 | "metadata": {}, 307 | "source": [ 308 | "---" 309 | ] 310 | }, 311 | { 312 | "cell_type": "markdown", 313 | "metadata": {}, 314 | "source": [ 315 | "## Approximation Algorithms" 316 | ] 317 | }, 318 | { 319 | "cell_type": "markdown", 320 | "metadata": { 321 | "collapsed": true 322 | }, 323 | "source": [ 324 | "### 9.25 [4]\n", 325 | "\n", 326 | "In the *maximum-satisfiability* problem, we seek a truth assignment that satisfies as many clauses as possible. Give an heuristic that always satisfies at least half as many clauses as the optimal solution." 327 | ] 328 | }, 329 | { 330 | "cell_type": "markdown", 331 | "metadata": {}, 332 | "source": [ 333 | "*Solution*:\n", 334 | "\n", 335 | "First some terminology. Say we have a clause $C = (v_1, \\bar{v_2})$. This clause contains the *literals* $v_1$ and $\\bar{v_2}$, which are determined by the state of the *variables* $v_1$ and $v_2$.\n", 336 | "\n", 337 | "An idea for a simple heuristic is to iterate over either the clauses or the variables, making greedy truth assignments while keeping track of the consequences of your decisions.\n", 338 | "\n", 339 | "Here is such a simple heuristic.\n", 340 | "\n", 341 | "$\\hspace{2em}$For every variable $v$: \n", 342 | "$\\hspace{4em}$Assign $v$ the truth value that makes the greater number of undeleted clauses True \n", 343 | "$\\hspace{4em}$Delete the clauses that contained literals of variable $v$\n", 344 | "\n", 345 | "In the heuristic, for each variable $v$, a subset of the remaining clauses is considered, at least half of them are made to be True, and then they all are deleted. Since every clause contains variables, every clause is deleted in this process. 
Therefore the total collection of clauses is broken down into this disjoint chain of subsets of clauses, and in each subset, at least half of the clauses are made to be True. Therefore at least half of all the clauses will be True. Since the optimal solution can make at most all of the clauses True, this heuristic satisfies at least half as many clauses as the optimal solution.\n", 346 | "\n", 347 | "As a very simple example, look at the set of clauses:\n", 348 | "\n", 349 | "$$ (v_1 + v_2)\\cdot(v_1 + \\bar{v_2})\\cdot(\\bar{v_1} + v_2)\\cdot(\\bar{v_1} + \\bar{v_2})$$\n", 350 | "\n", 351 | "The algorithm will start with $v_1$ and see that 2 clauses contain $v_1$ and 2 clauses contain $\\bar{v_1}$. Assigning $v_1$ either True or False will make 2 clauses True and leave 2 unsatisfied. So let's say that the algorithm assigns $v_1 = $ True and deletes all the clauses that contained a literal of the variable $v_1$. In this case, all the clauses did, so there are no more clauses and $v_2$ can be set to anything. Half the clauses were directly satisfied by the algorithm, and the random assignment of $v_2$ will satisfy another clause.\n", 352 | "\n", 353 | "As another example:\n", 354 | "\n", 355 | "$$ (v_1 + \\bar{v_2})\\cdot(v_2)\\cdot(v_2)\\cdot(\\bar{v_2})$$\n", 356 | "\n", 357 | "$v_1$ only appears in one clause, and so the algorithm will set $v_1 = $ True and delete that clause. In the remaining 3 clauses, $v_2$ appears twice and $\\bar{v_2}$ appears once. Therefore the algorithm will set $v_2 =$ True and delete these clauses. In total, 3 out of 4 clauses were satisfied.\n", 358 | "\n", 359 | "A possible way to improve the heuristic is to only delete the clauses that are now being made True by the variable assignment, not all the clauses that contain the variable." 
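The greedy heuristic above can be sketched in a few lines (my own implementation, not from the solution text; representing a literal as a (variable, polarity) pair and a clause as a set of literals is an assumption of this sketch):

```python
# Sketch of the greedy MAX-SAT heuristic described above (illustrative code,
# not from the text). A literal is a (variable, is_positive) pair and a
# clause is a set of literals.

def greedy_max_sat(variables, clauses):
    """Greedily assign each variable; return (assignment, lower bound on
    the number of satisfied clauses)."""
    remaining = list(clauses)
    assignment = {}
    num_true = 0  # counts clauses satisfied at the moment they are deleted
    for v in variables:
        # Count undeleted clauses containing each literal of v.
        pos = sum(1 for c in remaining if (v, True) in c)
        neg = sum(1 for c in remaining if (v, False) in c)
        assignment[v] = pos >= neg  # pick the value that satisfies more clauses
        num_true += max(pos, neg)
        # Delete every clause that contains a literal of v.
        remaining = [c for c in remaining
                     if (v, True) not in c and (v, False) not in c]
    return assignment, num_true

# The second example above: (v1 + ~v2)(v2)(v2)(~v2)
clauses = [{("v1", True), ("v2", False)},
           {("v2", True)}, {("v2", True)}, {("v2", False)}]
assignment, num_true = greedy_max_sat(["v1", "v2"], clauses)
print(num_true)  # 3
```

Note that `num_true` is only a lower bound: for the first example above it returns 2, even though the assignment it produces happens to satisfy 3 of the 4 clauses, because a clause deleted as unsatisfied may still be satisfied later by another variable.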
360 | ] 361 | }, 362 | { 363 | "cell_type": "markdown", 364 | "metadata": {}, 365 | "source": [] 366 | } 367 | ], 368 | "metadata": { 369 | "kernelspec": { 370 | "display_name": "Python 3", 371 | "language": "python", 372 | "name": "python3" 373 | }, 374 | "language_info": { 375 | "codemirror_mode": { 376 | "name": "ipython", 377 | "version": 3 378 | }, 379 | "file_extension": ".py", 380 | "mimetype": "text/x-python", 381 | "name": "python", 382 | "nbconvert_exporter": "python", 383 | "pygments_lexer": "ipython3", 384 | "version": "3.5.1" 385 | } 386 | }, 387 | "nbformat": 4, 388 | "nbformat_minor": 0 389 | } 390 | -------------------------------------------------------------------------------- /Figures/Hallock_Fig_0-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_0-2.png -------------------------------------------------------------------------------- /Figures/Hallock_Fig_1-1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_1-1.jpg -------------------------------------------------------------------------------- /Figures/Hallock_Fig_2-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_2-1.png -------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-1.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-1.jpg 
-------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-10.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-10.jpg -------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-11.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-11.jpg -------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-2.jpg -------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-3.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-3.jpg -------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-4.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-4.jpg -------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-5.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-5.jpg 
-------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-6.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-6.jpg -------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-7.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-7.jpg -------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-8.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-8.jpg -------------------------------------------------------------------------------- /Figures/Hallock_Fig_6-9.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Hallock_Fig_6-9.jpg -------------------------------------------------------------------------------- /Figures/Skiena_Fig_5-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Skiena_Fig_5-1.png -------------------------------------------------------------------------------- /Figures/Skiena_Fig_5-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Skiena_Fig_5-2.png 
-------------------------------------------------------------------------------- /Figures/Skiena_Fig_5-3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Skiena_Fig_5-3.png -------------------------------------------------------------------------------- /Figures/Skiena_Fig_5-4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Skiena_Fig_5-4.png -------------------------------------------------------------------------------- /Figures/Skiena_Fig_5-5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jhallock7/Algorithm_Exercises/9abb2a60c5a42620940d28967528f5e0adbe762a/Figures/Skiena_Fig_5-5.png -------------------------------------------------------------------------------- /GitHub Rendering Issue.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "$n$ *italicized* *again* **bold** $n$" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "$i < n$. A large amount of text can follow." 15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": 1, 20 | "metadata": { 21 | "collapsed": true 22 | }, 23 | "outputs": [], 24 | "source": [ 25 | "#Now some code that includes a commented dollar sign\n", 26 | "\n", 27 | "# $\n", 28 | "\n", 29 | "# The text below should render exactly the same as above\n", 30 | "# ...and does on my computer." 
31 | ] 32 | }, 33 | { 34 | "cell_type": "markdown", 35 | "metadata": {}, 36 | "source": [ 37 | "$n$ *italicized* *again* **bold** $n$" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": {}, 43 | "source": [ 44 | "$i < n$. A large amount of text can follow." 45 | ] 46 | }, 47 | { 48 | "cell_type": "code", 49 | "execution_count": 2, 50 | "metadata": { 51 | "collapsed": true 52 | }, 53 | "outputs": [], 54 | "source": [ 55 | "#Now another dollar sign\n", 56 | "\n", 57 | "# $\n", 58 | "\n", 59 | "# And it works again" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "metadata": {}, 65 | "source": [ 66 | "$n$ *italicized* *again* **bold** $n$" 67 | ] 68 | }, 69 | { 70 | "cell_type": "markdown", 71 | "metadata": {}, 72 | "source": [ 73 | "$i < n$. A large amount of text can follow." 74 | ] 75 | } 76 | ], 77 | "metadata": { 78 | "kernelspec": { 79 | "display_name": "Python 3", 80 | "language": "python", 81 | "name": "python3" 82 | }, 83 | "language_info": { 84 | "codemirror_mode": { 85 | "name": "ipython", 86 | "version": 3 87 | }, 88 | "file_extension": ".py", 89 | "mimetype": "text/x-python", 90 | "name": "python", 91 | "nbconvert_exporter": "python", 92 | "pygments_lexer": "ipython3", 93 | "version": "3.5.1" 94 | } 95 | }, 96 | "nbformat": 4, 97 | "nbformat_minor": 0 98 | } 99 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Algorithm_Exercises 2 | 3 | In this repo are my solutions to some of the exercises from Steven Skiena’s Algorithm Design Manual. 
4 | 5 | ## Progress 6 | 7 | **Chapter 1: Introduction to Algorithm Design** - 32 solved out of 34 = 94% 8 | **Chapter 2: Algorithm Analysis** - 15 solved out of 52 = 29% 9 | **Chapter 3: Data Structures** - 7 solved out of 29 = 24% 10 | **Chapter 4: Sorting and Searching** - 12 solved out of 46 = 26% 11 | **Chapter 5: Graph Traversal** - 10 solved out of 32 = 31% 12 | **Chapter 6: Weighted Graph Algorithms** - 10 solved out of 25 = 40% 13 | **Chapter 7: Combinatorial Search and Heuristic Methods** - 5 solved out of 19 = 26% 14 | **Chapter 8: Dynamic Programming** - 6 solved out of 26 = 23% 15 | **Chapter 9: Intractable Problems and Approximation Algorithms** - 7 solved out of 30 = 23% 16 | 17 | **Total:** 104 solved out of 293 = 35% 18 | 19 | ![Progress](Figures/Hallock_Fig_0-2.png) --------------------------------------------------------------------------------