# algforopt-errata

Errata for the *First Edition* of the Algorithms for Optimization book

## First printing

* p. 10: Eq 1.14 and 1.16 should use x* in x*+hy (thanks to Chris Peel)
* p. 38: Figure 3.6: F_2 should be F_3 and F_3 should be F_4 (thanks to Zdeněk Hurák)
* p. 57: Change "strong Wolfe condition" to "strong curvature condition" (thanks to Chris Peel)
* p. 64: Added comments to Alg 4.4, and changed `r` to `eta` to match the text (thanks to Chelsea Sidrane)
* p. 78: Change eq 5.26 to use s of k+1 rather than s of k. RMSProp uses the most recent value. (thanks to Michael Gobble)
* Fig 5.7: Change "hypermomentum" in the legend to "hypergradient" and change the caption to begin: "Hypergradient versions of gradient descent and Nesterov momentum compared on..."
* p. 110: Change "elseif yr > ys" to use \geq
* p. 138: Covariance matrix adaptation had inconsistencies with the recommended implementation in the original work. Primarily, this changes eq 8.18 such that the first m_elite entries sum to 1, all entries sum to approximately 0, and weights need not be non-negative. Eqs 8.19 and 8.22 only sum to m_elite. (thanks to Arec Jamgochian)
* p. 160: The ref to eq. 9.8 in the last paragraph should actually be to eq. 9.7. (thanks to Anthony Corso and Joan Creus-Costa)
* p. 171: There should be minus signs instead of plus signs before each of the x_i terms under the square root (thanks to Zhengyu Chen)
* p. 174: Eq 10.20 should have a + instead of a -, as the gradients must point in opposite directions and mu is non-negative (thanks to Erez Krimsky)
* p. 176: Stationarity condition should use a + instead of a -
* p. 177: Stationarity condition (eq 10.31) should use + instead of - (thanks to Erez Krimsky)
* p. 179: Example 10.5 optimum should actually be at -pi/2 (approx. -1.57), rather than -1.5. Future releases will use x^2 < 1 so the optimum will be at -1. (thanks to Steven Pauly and yingjieMiao)
* p. 180: Ex 10.6 incorrectly optimizes the dual function. It should recognize that the dual function is unbounded below when lambda is less than 1/2. The dual problem is maximized at 1/2, yielding two optimal solutions.
* p. 195: In Ex. 11.4, the minimization should be over x and s in the second problem and over x+, x-, and s in the third problem (thanks to Holly Dinkel)
* p. 200: The leaving index should be in B.
* p. 203: Ex 11.7 should read "causes x_4 to become zero" (thanks to Wouter Van Gijseghem)
* p. 208: The dual form should have a >= constraint. This must also be changed in Alg 11.6. (thanks to Wouter Van Gijseghem)
* p. 209: Example 11.9 should use a >= constraint for A'mu.
* p. 248: Alg 13.11 should use 13 instead of 6 in its implementation. (thanks to Sudarshan Chawathe)
* p. 277: Ex 15.1 has 1^-1, which should be 2^-1 (thanks to Ryan Samuels)
* p. 284: In the matrix in Ex 15.3, the subscript of row 5, column 6 should be 21, not 12 (thanks to Veronika Korneyeva)
* p. 287: Add a period to the end of line 5. (thanks to Javier Yu)
* p. 324: Ex 18.1 analytic solution for variance was incorrect. (The plot appears to be correct.) (thanks to Veronika Korneyeva)
* p. 336: Fix q vector formatting in Ex. 18.6
* p. 347: Ex 19.3: Each of -0.091 - floor(-0.091) and -0.455 - floor(-0.455) should be reversed (thanks to Veronika Korneyeva)
* p. 426: Alg B1: The Euler constant (`\euler` in Julia) does not exist in the DejaVu Sans Mono font, so it is missing. (thanks to Sudarshan Chawathe)
* p. 449: Eq D.2 should read: 4 * (-27) = -108
* p. 454: Changed two instances of sigma to mu on the left-hand side of Exercise 8.4.
* p. 459: Ex. 11.3: Solution should be "We can add a slack variable $x_3$ and split $x_1$ and $x_2$ into $x_1^+$, $x_1^-$, $x_2^+$, and $x_2^-$. We minimize $6x_1^+ - 6x_1^- + 5x_2^+ - 5x_2^-$ subject to the constraints $-3x_1 + 2x_2 + x_3 = -5$ and $x_1^+, x_1^-, x_2^+, x_2^-, x_3 \geq 0$." (thanks to Jimin Park)
* p. 463: Ex. 13.2: The probability converges to one. (thanks to Wouter Van Gijseghem and Jacob West)

### Minor

* p. 46: Change Eq 3.14 to use y_min on both sides for consistency
* Example 4.2: "satisfied" is twice misspelled as "satisifed."
* Fig 4.9: Needs one more iteration to get to x^10.
* Sec 4.5: "termination" is misspelled as "terminiation"
* Mu vector prior to Eq 8.18 should be bold.
* p. 201: Changed "positive" to "non-negative" in the 2nd-to-last paragraph
* p. 347: In Example 19.3, in the last array of equations, the right pair had too much space before the equals sign.
* The latest version of Julia includes Iterators in Base, so `product` can be accessed via `Iterators.product`.

## Second printing

* p. 2: Drop "al-" in "al-Kit\={a}b al-jabr wal-muq\={a}bala" (thanks to Tarek Zougari)
* p. 38: The right-hand side of Eq 3.3 should use n and n-1 rather than n+1 and n (thanks to Tarek Zougari)
* p. 46: Change termination conditions to be f(x(n)) - y(n) rather than y(n) - f(x(n)) (thanks to Esen Yel)
* p. 46: Eq. 3.14 should be $\left[ x^{(i)} - \frac{1}{\ell}(y_{min}-y^{(i)}), x^{(i)} + \frac{1}{\ell}(y_{min}-y^{(i)}) \right]$ (thanks to Ross Alexander)
* p. 47: Change for clarity: updated algorithm to use y_min and replaced j with i. (thanks to Ross Alexander)
* p. 58: Add missing alpha and betas in the first inequality (thanks to Ethan Strijbosch)
* p. 60: Dropped all indexing in eqs 4.7-9, as it was inconsistent and not strictly needed. (thanks to Remy Zawislak)
* p. 64: Keep up with the Convex.jl API using solve!(p, SCS.Optimizer) in Alg 4.4 (thanks to Ross Alexander)
* p. 72: Change for clarity: adjust indexing of d and g, but not beta, in conjugate gradient equations by -1. (thanks to Raman Vilkhu)
* p. 72: Change for clarity: use cdot instead of the dot() function in Alg 5.2 (thanks to Cooper Shea)
* p. 76: Ref 4, subscript should be superscript (thanks to Robert Moss)
* p. 82: Eqs 5.34-5.37: Update superscripts to properly index into alpha and match Eq 4.1 (thanks to Alexandros Tzikas)
* p. 83: Alg 5.10 had been implemented from the paper, which was based on a different form of Nesterov momentum (thanks to Dylan Asmar)
* p. 84: Ex 5.2: Change "step size" to "step factor" (thanks to John Wu)
* p. 85: Ex 5.7: Remove "normalized" to have the exercise match the code implementation (thanks to Alexandros Tzikas)
* p. 88: Deleted "horizontal", as the 2nd derivative is zero for any line (thanks to Alexandros Tzikas)
* p. 89: Eq 6.5, drop the k superscript in the argument (thanks to Alexandros Tzikas)
* p. 90: Update sidenote 2 in chapter 6 to refer to chapter 4 rather than chapter 5 (thanks to Alexandros Tzikas)
* p. 92: Update text around the DFP update to say "at the same iteration" rather than "at iteration k" to avoid potential confusion (thanks to Alexandros Tzikas)
* p. 97: Gamma and delta in the optimization problem should be bold (thanks to Alexandros Tzikas)
* p. 99: Add subscripts to the LHS of the cyclic coordinate descent equations (thanks to Alexandros Tzikas)
* p. 101: Add a clarifying "n-dimensional" before "quadratic function" (thanks to Chelsea Sidrane)
* p. 102: Alg 7.4, the second instance of line_search should have x' instead of x (thanks to Kaijun Feng)
* p. 109: Alg 7.7, yr ≤ yh changed to yr < yh (thanks to Ellis Brown)
* p. 116: Multivariate DIRECT was using the max half-width rather than the distance from cell center to vertex. Changed text to "The lower bound for each hyper-rectangle can be computed based on the longest center-to-vertex distance and center value." Updated Algs 7.10 and 7.11 to use vertex_dist() and OrderedDict. (thanks to Ross Alexander)
* p. 119: Alg 7.11 method for computing potentially optimal intervals was incorrect. [Updated algorithm is here.](https://github.com/sisl/algforopt-errata/blob/master/p119.pdf)
* p. 122: Ex 7.2: The interval centers were incorrect. (thanks to Ross Alexander)
* p. 125: Eq. 8.1: The g should be replaced with d (thanks to Lucy Brown)
* p. 127: Ex 8.1: Change caption to: "Positive spanning sets for $\mathbb{R}^2$." (thanks to Ross Alexander)
  Swap columns 2 and 3 and note that "the lower triangular generation strategy can only generate the first two columns of spanning sets." (thanks to Kaijun Feng)
* p. 127: Make x's in the final sentence bold
* p. 128: Eq 8.5: Remove unnecessary min
* p. 128: Alg 8.2: Change for loop over j to set the lower triangular components rather than upper.
  Use D on the right-hand side of `D = L[:,randperm(n)]`.
* p. 132: Change "`r` is drawn uniformly at random from [1,-1]" to use [1,-1] (thanks to Nancy Ammar)
* p. 132: Fig 8.4: Change upper y tick label from c to 1+c (thanks to Nancy Ammar)
* p. 140: Eq. 8.23, $\delta^{(i)}$ should be boldface on the right-hand side (thanks to Robert Moss)
* p. 141: Eq. 8.30 used c_c instead of c_Sigma. (thanks to Pranav Maheshwari)
* p. 141: Eq. 8.30 should have a "rank-m_elite" subscript for the 3rd term, and the rank-mu update should be a rank-m_elite update (thanks to Ye Li)
* p. 141: Equation 8.28: p_\Sigma -> p_\sigma (thanks to Shogo Kishimoto)
* p. 147: Eq. 9.1: Remove boldface x (thanks to Ross Alexander)
* p. 151: Change "choose" to "select" for truncation selection for clarity (thanks to Chelsea Sidrane)
* p. 152: Alg 9.6: Refer to `select` rather than `selection` and `y` rather than `f`. Added some basic comments. (thanks to Chelsea Sidrane)
* p. 171: Ex 10.2: Simplify the example to not require any constraints by making h(x) a linear function; then x_n can be determined from x_1 through x_{n-1}. Thank you TSeyhan for finding an addition in this change, which is now fixed.
* p. 174: Last terms in equations 10.16 and 10.17 should have a minus sign (thanks to Vladislav Ankudinov)
* p. 176: Description of complementary slackness changed to not suggest that the zeros are mutually exclusive (thanks to Zdeněk Hurák)
* p. 177: Sentence before eq. 10.36 should have "minimization" not "maximization" (thanks to Kaijun Feng)
* p. 183: Swap order of lambda and rho updates in Alg 10.2 (thanks to Stuart Rogers)
* p. 183: Equation 10.43 and Alg 10.2 should use the opposite sign for the second term (λ⋅h(x)) (thanks to Vineet Theodore)
* p. 183: Delete f(x) from Alg. 10.2 where p is defined (thanks to Kaijun Feng)
* p. 183: 0 above eq. 10.43 should be in bold (thanks to Ross Alexander)
* p. 184: Add clarifying "in the feasible region" to condition 2 (thanks to Alexandros Tzikas)
* p. 196: "If a linear program contains feasible points, it also contains at least one vertex" -> "If a linear program has a bounded solution, then it also contains at least one vertex."
* p. 204: Alg 11.3: Fixed whitespace affecting the indentation of some 'end's (thanks to Alexandros Tzikas)
* p. 207: Alg 11.5: Need c'' = vcat(c, zeros(m)) (thanks to Andre Tkacenko)
* p. 208-209: mu should be lambda and the polarity of the constraint in the dual form should be A^T lambda <= c, [see corrected pages](https://github.com/sisl/algforopt-errata/blob/master/p208-209.pdf) with additional explanation (thanks to Masha Itkina)
* p. 213: When defining the utopia point, change criterion space to objective space (thanks to Alexandros Tzikas)
* p. 238: Fig 13.6: Half of the dots were dropped to truly have a uniform projection plan
* p. 242: Change Morris-Mitchell criterion list from {1,2,3,10,20,50,100} to {1,2,5,10,20,50,100}. (thanks to Stephan Milius)
* p. 245: Alg 13.8: Change `S = [X[rand(1:m)]]` to `S = [sample(X)]` (thanks to Trương Minh Nhật)
* p. 246: Alg 13.9: Change `S = X[randperm(m)]` to `S = sample(X, m, replace=false)` (thanks to Trương Minh Nhật)
* p. 254: Theta should be upright bold in the sentence before eq. 14.7 (thanks to Ross Alexander)
* p. 256: Figure 14.2: Change nonsingular to singular (thanks to Yuki Matsuoka)
* p. 263: Change sidenote 8 to add " if $\lambda = 0$" and change "sufficiently large" to "positive". (thanks to Chris Peel)
* p. 294: Square sigma hat in eq. 16.5.
* p. 295: Eq 16.12 for expected improvement needs a sigma squared leading the 2nd term (thanks to Philippe Weingertner)
* p. 296: Alg 16.2 for expected improvement needs a sigma squared leading the 2nd term (thanks to Philippe Weingertner)
  Figure 16.6 was updated to reflect this change to expected improvement.
* p. 309: Eq 17.4, append "and (x,z') in F for all z'"
* p. 312: First sentence of Section 17.3, change "uses" to "use" (thanks to Alexandros Tzikas)
* p. 313: Example 17.3, change 5 exp to 6 exp. (thanks to Mindaugas Kepalas)
* p. 316: Eq 17.10: Change P((x,z) in F) to P(x in F), as z is marginalized out (thanks to Alexandros Tzikas)
* p. 325: Remove repeated equation 18.18 and insert a step between 18.13 and 18.14. (thanks to Christoph Buchner)
* p. 329: Update to the Polynomials.jl interface change: Poly -> Polynomial, polyint -> integrate, polyder -> derivative
* p. 334: Update eq 18.40 to include an additional expectation over p(z). This change was propagated to equations 18.{44-46}. (thanks to Kouhei Harada)
* p. 335: Eq 18.50: Copy over the approximation introduced in 18.40 rather than presenting an exact equality. (thanks to Kouhei Harada)
* p. 335: Eq 18.51: Change \hat \mu(z^i) to simply z^i. (thanks to Kouhei Harada)
* p. 349: Fig 19.4: For consistency with the algorithm, shifted the blue region up so we are splitting on a max-distance variable. (thanks to Mykel Kochenderfer)
* p. 376: Example 20.6, one of the B's had the wrong font. (thanks to Christoph Buchner)
* p. 376: Example 20.6, change the second w_1^B to w_2^B (thanks to Shogo Kishimoto)
* p. 383: Algorithm 20.16: Updated to call `prune!` recursively (thanks to Shogo Kishimoto)
* p. 385: Exercise 20.9: Change p_F, p_I to w_F, w_I (thanks to Shogo Kishimoto)
* p. 402: Figure 21.11: The expression "f(v, s, r, p,..." was missing the c^{(d)} argument. (thanks to Christoph Buchner)
* p. 402: Figure 21.11: On the right-hand side of the top block, y^{(d)} should replace c^{(d)} (thanks to Loren Newton)
* p. 409: Exercise 21.4: The problem should use 10 degrees rather than 10 radians. (thanks to Sudarshan Chawathe)
* p. 418: Replace `subtype` with `subtypes` (thanks to Ellis Brown)
* p. 439: Change Chebyschev to Chebyshev (thanks to Shogo Kishimoto)
* p. 441: Eq C.29: For clarity, reversed the order of terms in each addition pair (thanks to Anil Yildiz)
* p. 447: Solution 2.1 has an x that should be bold
* p. 447: Solution 2.1: Change "where ei is the ith basis vector with ei = 1" to "where $\vect e_i$ is the $i$th basis vector with its $i$th component equal to one" (thanks to Shogo Kishimoto)
* p. 448: Exercise 2.1: Second x on the RHS should be bolded (thanks to Yash Taneja)
* p. 449: Exercise 5.4: Change the Hessian term to return a column vector (thanks to Tarek Zougari)
* p. 450: Ex 5.7: Use the unnormalized gradient to have the exercise match the code implementation (thanks to Alexandros Tzikas)
* p. 452: Ex 7.1 solution: Add an equation for the Hessian evaluation and show that it is 4 terms (thanks to Alex Fuster)
* p. 453: Change "multivariate normal distributions" to "multivariate mixture distributions" (thanks to Javier Yu)
* p. 456: Ex 10.4: No need to consider the FONC for constrained problems; can go directly to f'(x_p) = 0 (thanks to Alexandros Tzikas)
* p. 459: Solution 11.3 additionally splits the constraint by positive and negative variable (thanks to Sydney Hsu)
* p. 478: Change the call to `phenotype` to `decode` (thanks to Shogo Kishimoto)
* p. 480: Ex 20.10 solution: Change p_F, p_I to w_F, w_I (thanks to Shogo Kishimoto)
* p. 495: Change "BGFS" to "BFGS" (thanks to Martijn Ruppert)