├── .gitattributes
├── 1.Introduction; numerics; error analysis
│   ├── introduction.pdf
│   └── numerics_and_error.pdf
├── 10.Systems of equations; optimization in one variable
│   ├── nonlinear_systems_ii.pdf
│   └── optimization_i.pdf
├── 11.Optimization Multiple variables, constraints
│   ├── optimization_ii.pdf
│   └── optimization_iii.pdf
├── 12.Conjugate gradients I Gradient descent, setup
│   └── cg_i.pdf
├── 13.Conjugate gradients II Formulation, preconditioning, and variants
│   └── cg_ii.pdf
├── 14.Interpolation
│   └── interpolation.pdf
├── 15.Numerical integration and differentiation
│   └── integration_and_differentiation.pdf
├── 16.Initial value problems and basics of ODE
│   └── ode_i.pdf
├── 17.Time-stepping strategies
│   └── ode_ii.pdf
├── 18.PDE I Examplestheory, derivative operators
│   └── pde_i.pdf
├── 19.PDE II Basic solution techniques
│   ├── conclusion.pdf
│   └── pde_ii.pdf
├── 2.Linear systems and LU
│   └── linear_systems_and_lu.pdf
├── 3.More LU; conditioning and sensitivity
│   ├── more_lu.pdf
│   └── sensitivity_and_conditioning.pdf
├── 4.Designing linear systems (incl. least-squares); special structure (Cholesky, sparsity)
│   └── designing_and_analyzing_systems.pdf
├── 5.Column spaces and QR
│   ├── column_spaces_and_qr.pdf
│   └── designing_and_analyzing_systems.pdf
├── 6.Eigenproblems How they arise, properties
│   └── eigenproblems_i.pdf
├── 7.Eigenproblems II Algorithms
│   └── eigenproblems_ii.pdf
├── 8.Eigenproblems III QR iteration, conditioning; singular value decomposition (SVD)
│   ├── eigenproblems_iii.pdf
│   └── svd.pdf
├── 9.Nonlinear equations and convergence analysis
│   ├── linear_algebra_review.pdf
│   └── nonlinear_systems.pdf
├── README.md
├── 作业
│   ├── hw0
│   │   ├── hw0.pdf
│   │   ├── hw0_solution.md
│   │   └── hw0_solution.pdf
│   ├── hw1
│   │   ├── hw1.pdf
│   │   ├── hw1_solution.md
│   │   └── hw1_solution.pdf
│   ├── hw2
│   │   ├── hw2.pdf
│   │   ├── hw2_solution.md
│   │   └── hw2_solution.pdf
│   ├── hw3
│   │   ├── hw3.pdf
│   │   ├── hw3_solution.md
│   │   └── hw3_solution.pdf
│   ├── hw4
│   │   ├── hw4.pdf
│   │   ├── hw4_solution.md
│   │   └── hw4_solution.pdf
│   ├── hw5
│   │   ├── hw5.pdf
│   │   ├── hw5_solution.md
│   │   └── hw5_solution.pdf
│   ├── hw6
│   │   ├── code
│   │   │   ├── hw6.zip
│   │   │   ├── plotGraph.m
│   │   │   ├── problem1.m
│   │   │   └── unitCircleFEM.mat
│   │   ├── hw6.pdf
│   │   ├── hw6_solution.md
│   │   └── hw6_solution.pdf
│   ├── hw7
│   │   ├── code
│   │   │   └── Problem 4.py
│   │   ├── hw7.pdf
│   │   ├── hw7_solution.md
│   │   └── hw7_solution.pdf
│   ├── hw8
│   │   ├── code
│   │   │   ├── backwardEuler.m
│   │   │   ├── firstOrderMatrix.m
│   │   │   ├── forceMatrix.m
│   │   │   ├── forwardEuler.m
│   │   │   ├── fw.fig
│   │   │   ├── hw8.m
│   │   │   ├── hw8.zip
│   │   │   ├── leapfrog.m
│   │   │   ├── plotGraph.m
│   │   │   └── trapezoidalODE.m
│   │   ├── hw8.pdf
│   │   ├── hw8_solution.md
│   │   └── hw8_solution.pdf
│   └── 参考资料
│       ├── hw1_solutions.pdf
│       ├── hw2_solutions.pdf
│       ├── hw6_solutions.pdf
│       └── review_1.pdf
└── 讲义以及参考书
    ├── cs205a_notes.pdf
    ├── numerical_book.pdf
    └── 科学计算导论(第2版).MICHAEL.T.HEATH..pdf

/.gitattributes:
--------------------------------------------------------------------------------
1 | # Auto detect text files and perform LF normalization
2 | * text=auto
--------------------------------------------------------------------------------
/1.Introduction; numerics; error analysis/introduction.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/1.Introduction; numerics; error analysis/introduction.pdf
--------------------------------------------------------------------------------
/1.Introduction; numerics; error analysis/numerics_and_error.pdf: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/1.Introduction; numerics; error analysis/numerics_and_error.pdf -------------------------------------------------------------------------------- /10.Systems of equations; optimization in one variable/nonlinear_systems_ii.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/10.Systems of equations; optimization in one variable/nonlinear_systems_ii.pdf -------------------------------------------------------------------------------- /10.Systems of equations; optimization in one variable/optimization_i.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/10.Systems of equations; optimization in one variable/optimization_i.pdf -------------------------------------------------------------------------------- /11.Optimization Multiple variables, constraints/optimization_ii.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/11.Optimization Multiple variables, constraints/optimization_ii.pdf -------------------------------------------------------------------------------- /11.Optimization Multiple variables, constraints/optimization_iii.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/11.Optimization Multiple variables, constraints/optimization_iii.pdf -------------------------------------------------------------------------------- /12.Conjugate gradients I Gradient descent, setup/cg_i.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/12.Conjugate gradients I Gradient descent, setup/cg_i.pdf -------------------------------------------------------------------------------- /13.Conjugate gradients II Formulation, preconditioning, and variants/cg_ii.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/13.Conjugate gradients II Formulation, preconditioning, and variants/cg_ii.pdf -------------------------------------------------------------------------------- /14.Interpolation/interpolation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/14.Interpolation/interpolation.pdf -------------------------------------------------------------------------------- /15.Numerical integration and 
differentiation/integration_and_differentiation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/15.Numerical integration and differentiation/integration_and_differentiation.pdf -------------------------------------------------------------------------------- /16.Initial value problems and basics of ODE/ode_i.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/16.Initial value problems and basics of ODE/ode_i.pdf -------------------------------------------------------------------------------- /17.Time-stepping strategies/ode_ii.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/17.Time-stepping strategies/ode_ii.pdf -------------------------------------------------------------------------------- /18.PDE I Examplestheory, derivative operators/pde_i.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/18.PDE I Examplestheory, derivative operators/pde_i.pdf -------------------------------------------------------------------------------- /19.PDE II Basic solution techniques/conclusion.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/19.PDE II Basic solution techniques/conclusion.pdf -------------------------------------------------------------------------------- /19.PDE II Basic solution techniques/pde_ii.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/19.PDE II Basic solution techniques/pde_ii.pdf -------------------------------------------------------------------------------- /2.Linear systems and LU/linear_systems_and_lu.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/2.Linear systems and LU/linear_systems_and_lu.pdf -------------------------------------------------------------------------------- /3.More LU; conditioning and sensitivity/more_lu.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/3.More LU; conditioning and sensitivity/more_lu.pdf -------------------------------------------------------------------------------- /3.More LU; conditioning and sensitivity/sensitivity_and_conditioning.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/3.More LU; conditioning and sensitivity/sensitivity_and_conditioning.pdf -------------------------------------------------------------------------------- /4.Designing linear systems (incl. least-squares); special structure (Cholesky, sparsity)/designing_and_analyzing_systems.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/4.Designing linear systems (incl. least-squares); special structure (Cholesky, sparsity)/designing_and_analyzing_systems.pdf -------------------------------------------------------------------------------- /5.Column spaces and QR/column_spaces_and_qr.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/5.Column spaces and QR/column_spaces_and_qr.pdf -------------------------------------------------------------------------------- /5.Column spaces and QR/designing_and_analyzing_systems.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/5.Column spaces and QR/designing_and_analyzing_systems.pdf -------------------------------------------------------------------------------- /6.Eigenproblems How they arise, properties/eigenproblems_i.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/6.Eigenproblems How they arise, properties/eigenproblems_i.pdf -------------------------------------------------------------------------------- /7.Eigenproblems II Algorithms/eigenproblems_ii.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/7.Eigenproblems II Algorithms/eigenproblems_ii.pdf -------------------------------------------------------------------------------- /8.Eigenproblems III QR iteration, conditioning; singular value decomposition (SVD)/eigenproblems_iii.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/8.Eigenproblems III QR iteration, conditioning; singular value decomposition (SVD)/eigenproblems_iii.pdf -------------------------------------------------------------------------------- /8.Eigenproblems III QR iteration, conditioning; singular value decomposition (SVD)/svd.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/8.Eigenproblems III QR iteration, conditioning; singular value decomposition (SVD)/svd.pdf 
-------------------------------------------------------------------------------- /9.Nonlinear equations and convergence analysis/linear_algebra_review.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/9.Nonlinear equations and convergence analysis/linear_algebra_review.pdf -------------------------------------------------------------------------------- /9.Nonlinear equations and convergence analysis/nonlinear_systems.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/9.Nonlinear equations and convergence analysis/nonlinear_systems.pdf -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CS205A Mathematical Methods for Robotics, Vision, and Graphics 2 | 3 | 斯坦福数值分析公开课学习资料,内容丰富,介绍了数值分析常见的内容,涵盖了机器学习中绝大多数优化方法,部分作业难度较大。 4 | 5 | 6 | 7 | 课程主页: 8 | 9 | https://graphics.stanford.edu/courses/cs205a-13-fall/schedule.html 10 | 11 | 个人笔记: 12 | 13 | https://doraemonzzz.com/tags/CS205A/ 14 | 15 | -------------------------------------------------------------------------------- /作业/hw0/hw0.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/作业/hw0/hw0.pdf -------------------------------------------------------------------------------- /作业/hw0/hw0_solution.md: -------------------------------------------------------------------------------- 1 | #### Problem 1 2 | 3 | $\forall f,g \in C^1 (\mathbb R), \forall \alpha, \beta \in \mathbb R​$,我们有$\alpha f+ \beta g​$连续可导,所以$\alpha f + \beta g \in C^1 (\mathbb R)​$,因此$C^1 (\mathbb R)​$是线性空间。 4 | 5 | 考虑$C^1 (\mathbb R)$的子集多项式全体,显然全体多项式的维度为$\infty$,因此$C^1 (\mathbb R)$的维度为$\infty$。 6 | 7 | 8 | 9 | #### Problem 2 10 | 11 | $$ 12 | A^T A = \left[ 13 | \begin{matrix} 14 | \vec c_1^T \vec c_1& \ldots & \vec c_1^T \vec c_n \\ 15 | \ldots & \ldots & \ldots \\ 16 | \vec c_n^T \vec c_1& \ldots & \vec c_n^T \vec c_n \\ 17 | \end{matrix} 18 | \right] \in \mathbb R^{ n\times n} , 19 | AA^T = \left[ 20 | \begin{matrix} 21 | \vec r_1^T \vec r_1& \ldots & \vec r_1^T \vec r_m \\ 22 | \ldots & \ldots & \ldots \\ 23 | \vec r_m^T \vec r_1& \ldots & \vec r_m^T \vec r_m \\ 24 | \end{matrix} 25 | \right] \in \mathbb R^{ m\times m} 26 | $$ 27 | 28 | 29 | 30 | #### Problem 3 31 | 32 | 注意到原问题等价于最小化 33 | $$ 34 | \begin{aligned} 35 | f^2(\vec x) 36 | &=||A\vec x -\vec b ||^2 \\ 37 | &=(A\vec x -\vec b)^T (A\vec x -\vec b)\\ 38 | &=\vec x ^T A^TA\vec x -\vec b^T A\vec x - \vec x ^T A^T \vec b + \vec b^T\vec b \\ 39 | &=\vec x ^T A^TA\vec x -2\vec x ^T A^T \vec b + \vec b^T\vec b 40 | \end{aligned} 41 | $$ 42 | 对上式关于$\vec x $求梯度可得 43 | $$ 44 | \begin{aligned} 45 | \nabla_{\vec x} f^2 (\vec x) 46 | &= \nabla_{\vec x}( \vec x ^T A^TA\vec x -2\vec x ^T A^T \vec b + \vec b^T\vec b) \\ 47 | &= 2 A^TA\vec x - 2A^T \vec b 48 | \end{aligned} 49 | $$ 50 | 令上式为$0​$可得 51 | $$ 52 | A^TA\vec x=A^T \vec b \\ 53 | \vec x = (A^TA)^{-1}A^T \vec b 54 | $$ 55 | 56 | 57 | 58 | #### Problem 4 59 | 60 | 注意到原问题等价于最小化 61 | $$ 62 | || A\vec x||^2 = 
\vec x^T A^TA\vec x 63 | $$ 64 | 约束条件等价于 65 | $$ 66 | ||B\vec x||^2 = \vec x^T B^TB\vec x =1 67 | $$ 68 | 根据该条件构造拉格朗日乘子: 69 | $$ 70 | L(\vec x , \lambda) = \vec x^T A^TA\vec x -\lambda(\vec x^T B^TB\vec x -1) 71 | $$ 72 | 求梯度可得 73 | $$ 74 | \nabla_{\vec x } L(\vec x ,\lambda) =2 A^TA\vec x -2\lambda B^TB\vec x \\ 75 | \nabla_{\lambda } L(\vec x ,\lambda) = -\vec x^T B^TB\vec x +1 76 | $$ 77 | 令上式为$0​$可得 78 | $$ 79 | \begin{eqnarray*} 80 | A^T A\vec x &&=\lambda B^T B\vec x \tag 1 \\ 81 | \vec x^T B^TB\vec x&&=1 \tag 2 82 | \end{eqnarray*} 83 | $$ 84 | 将$(1),(2)​$带入目标函数可得 85 | $$ 86 | \vec x^T A^TA\vec x = \lambda \vec x^T B^T B\vec x =\lambda 87 | $$ 88 | 所以接下来只要求出$\lambda $即可,对等式$(1)​$稍作变形可得 89 | $$ 90 | (A^T A-\lambda B^T B) \vec x = 0 91 | $$ 92 | 由约束条件可知$\vec x\neq 0$,所以上述线性方程有非零解,因此 93 | $$ 94 | |A^T A -\lambda B^T B| = 0 95 | $$ 96 | 解该$n$次方程即可求出$\lambda_1,...,\lambda _n$,记最小的正根为$\lambda _i$,最大的正根为$\lambda _j $,所以 97 | $$ 98 | || A\vec x||^2 =\lambda \in [\lambda_i, \lambda_j] 99 | $$ 100 | 101 | 102 | 103 | #### Problem 5 104 | 105 | 注意约束条件等价于 106 | $$ 107 | \vec x^T \vec x =1 108 | $$ 109 | 根据该条件构造拉格朗日乘子: 110 | $$ 111 | \begin{aligned} 112 | L(\vec x ,\lambda) &= \vec a . \vec x -\lambda ( \vec x^T \vec x-1) \\ 113 | &=\vec a ^T \vec x-\lambda ( \vec x^T \vec x-1) 114 | \end{aligned} 115 | $$ 116 | 求梯度可得 117 | $$ 118 | \nabla_{\vec x } L(\vec x ,\lambda) = \vec a -\lambda \vec x \\ 119 | \nabla_{\lambda } L(\vec x ,\lambda) = \vec x^T \vec x-1 120 | $$ 121 | 令上式为$0$可得 122 | $$ 123 | \vec x = \frac 1 \lambda \vec a \\ 124 | \vec x ^T \vec x = \frac 1 {\lambda^2} \vec a ^T \vec a = 1 \\ 125 | \lambda = \pm ||\vec a || 126 | $$ 127 | 将$\vec x = \frac 1 \lambda \vec a ​$带入可得 128 | $$ 129 | f(\vec x) = \frac 1 \lambda \vec a ^T \vec a =\frac 1 \lambda ||\vec a||^2 =\pm ||\vec a|| 130 | $$ 131 | 所以 132 | $$ 133 | \max f(\vec x) = ||\vec a || 134 | $$ 135 | 136 | -------------------------------------------------------------------------------- /作业/hw0/hw0_solution.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/作业/hw0/hw0_solution.pdf -------------------------------------------------------------------------------- /作业/hw1/hw1.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/作业/hw1/hw1.pdf -------------------------------------------------------------------------------- /作业/hw1/hw1_solution.md: -------------------------------------------------------------------------------- 1 | #### Problem 1 2 | 3 | 最小化$f(x)$等价于求$f'(x)=0$的根,所以 4 | 5 | (a)向前误差: 6 | $$ 7 | |x^*- x_{test}| 8 | $$ 9 | (b)向后误差: 10 | $$ 11 | |f'(x^*)- f'(x_{test})| 12 | $$ 13 | (c)条件数: 14 | $$ 15 | \begin{aligned} 16 | \frac{|x^*- x_{test}|}{|f'(x^*)- f'(x_{test})|} 17 | &\approx \frac{|x^*- x_{test}|}{|f''(x_{test})(x^*- x_{test})|} \\ 18 | &=\frac 1 {|f''(x_{test})|} 19 | \end{aligned} 20 | $$ 21 | 22 | 23 | 24 | #### Problem 2 25 | 26 | (a)$\epsilon $相当于相对误差,相比于$(x \Diamond y) +\epsilon $的形式,$(1+\epsilon )(x \Diamond y) $可以更方便的比较不同测量值的误差大小。 27 | 28 | (b)证明:存在性是显然的,所以我们只需证明 29 | $$ 30 | 0\le |\epsilon| < \epsilon_{\max} 31 | $$ 32 | 左边的不等号显然,只需考虑右边的不等号,利用反证法,假设 33 | $$ 34 | |\epsilon| \ge \epsilon_{\max} 35 | $$ 36 | 如果$\epsilon \ge \epsilon_{\max}​$,那么 37 | 
$$ 38 | \prod_{i=1}^k (1+\epsilon_i) = (1+\epsilon)^k \ge (1+ \epsilon_{\max})^k >1 39 | $$ 40 | 所以必存在$\epsilon_j ​$,使得 41 | $$ 42 | 1+\epsilon_j \ge 1+\epsilon_{\max} \\ 43 | \epsilon_j \ge \epsilon_{\max} 44 | $$ 45 | 这就产生了矛盾,因此$\epsilon \ge \epsilon_{\max}$不可能发生。 46 | 47 | 如果$\epsilon \le -\epsilon_{\max}​$,那么 48 | $$ 49 | \prod_{i=1}^k (1+\epsilon_i) = (1+\epsilon)^k \le (1- \epsilon_{\max})^k <1 50 | $$ 51 | 所以必存在$\epsilon_j $,使得 52 | $$ 53 | 1+\epsilon_j \le 1-\epsilon_{\max} \\ 54 | \epsilon_j \le -\epsilon_{\max} 55 | $$ 56 | 这就产生了矛盾,因此$\epsilon \le -\epsilon_{\max}$不可能发生。 57 | 58 | 所以 59 | $$ 60 | |\epsilon| < \epsilon_{\max} 61 | $$ 62 | (c)这里要注意一点,我们的运算也会产生误差,所以 63 | 64 | (i) 65 | $$ 66 | \begin{aligned} 67 | \overline {\bar x. \bar y} 68 | &=(1+\epsilon_1)(1+\epsilon_2) xy.(1+\epsilon_3)\\ 69 | &=(1+\epsilon)^3 xy \\ 70 | &= (1+3\epsilon +O(\epsilon^2))xy 71 | \end{aligned} 72 | $$ 73 | 其中$0\le |\epsilon| <\epsilon_{\max}$,第二个等号是因为(b)。由上式可得,我们的误差上界为 74 | $$ 75 | 3\epsilon_{\max} + O(\epsilon_{\max}^2) 76 | $$ 77 | (ii) 78 | $$ 79 | \begin{aligned} 80 | \frac {\bar x} {\bar y} 81 | &=\frac{1+\epsilon_1}{1+\epsilon_2} \frac x y .(1+\epsilon_3)\\ 82 | &=\frac{(1+\epsilon_1)(1+\epsilon_2)(1+\epsilon_3)}{(1+\epsilon_2)(1+\epsilon_2)} \frac x y \\ 83 | &=\frac{(1+\epsilon)^3}{(1+\epsilon_2)^2} \frac x y\\ 84 | &\le {(1+\epsilon)^3} \frac x y\\ 85 | &=(1+3\epsilon +O(\epsilon^2))\frac x y 86 | \end{aligned} 87 | $$ 88 | 其中$0\le |\epsilon| <\epsilon_{\max}$,第三个等号是因为(b)。有上式可得,我们的误差上界为 89 | $$ 90 | 3\epsilon_{\max} + O(\epsilon_{\max}^2) 91 | $$ 92 | (d)注意到 93 | $$ 94 | \begin{aligned} 95 | \overline {\bar x- \bar y} 96 | &=\big((1+\epsilon_1) x-(1+\epsilon_2)y \big) .(1+\epsilon_3)\\ 97 | &=(1+\epsilon_3)(x+\epsilon_1 x -y-\epsilon_2 y) \\ 98 | &=x-y+(\epsilon_1+\epsilon_3 +\epsilon_1 \epsilon_3) x- 99 | (\epsilon_2 +\epsilon_3+\epsilon_2 \epsilon_3)y 100 | \end{aligned} 101 | $$ 102 | 计算相对误差可得: 103 | $$ 104 | \begin{aligned} 105 | \frac{|\overline {\bar x- \bar y}-(x-y)|}{|x-y|} 106 | &= \frac{|(\epsilon_1+\epsilon_3 +\epsilon_1 \epsilon_3) x- 107 | (\epsilon_2 +\epsilon_3+\epsilon_2 \epsilon_3)y|}{|x-y|} \\ 108 | &=|\epsilon_1+\epsilon_3 +\epsilon_1 \epsilon_3- 109 | \frac{(\epsilon_1-\epsilon_2 +\epsilon_1 \epsilon_3 -\epsilon_2 \epsilon_3)y}{x-y}| 110 | \end{aligned} 111 | $$ 112 | 如果$x,y$非常接近,那么不难看出上式趋于无穷大,因此减法的相对误差无法估计。 113 | 114 | (e)考虑带误差的递推式: 115 | $$ 116 | \begin{aligned} 117 | \bar s_{k} 118 | &=(\bar s_{k-1}+(1+\epsilon_0) x).(1+\epsilon_{k-1})\\ 119 | &=\bar s_{k-1}(1+\epsilon_{k-1}) +x(1+\epsilon_0) (1+\epsilon_{k-1}) \\ 120 | &=\big(\bar s_{k-2}(1+\epsilon_{k-2}) +x(1+\epsilon_0) (1+\epsilon_{k-2}) \big) 121 | (1+\epsilon_{k-1}) 122 | +x(1+\epsilon_0) (1+\epsilon_{k-1}) \\ 123 | &=\bar s_{k-2}(1+\epsilon_{k-2})(1+\epsilon_{k-1})+x(1+\epsilon_0)\Big( (1+\epsilon_{k-2})(1+\epsilon_{k-1})+(1+\epsilon_{k-1}) \Big) \\ 124 | &=\ldots \\ 125 | &=\bar s_{1} \prod_{i=1}^{k-1}(1+\epsilon_i) + x(1+\epsilon_0) 126 | \Big(\sum_{j=1}^{k-1} \prod_{i=j}^{k-1} (1+\epsilon_j) \Big)\\ 127 | &=x (1+\epsilon)^{k-1}+x(1+\epsilon_0) 128 | \Big(\sum_{j=1}^{k-1} (1+\epsilon_j')^{k-j} \Big)\\ 129 | &=x \big( 1+(k-1) \epsilon \big)+x(1+\epsilon_0) \Big(\sum_{j=1}^{k-1} (1+(k-j)\epsilon'_j) \Big)+O(\epsilon_{\max}^2)\\ 130 | &=x +(k-1)\epsilon x +x(1+\epsilon_0) 131 | \Big( k-1+\sum_{j=1}^{k-1} (k-j)\epsilon'_j \Big)+O(\epsilon_{\max}^2)\\ 132 | &=x +(k-1)\epsilon x +(k-1)x+ 133 | \Big( (k-1)\epsilon_0+\sum_{j=1}^{k-1} (k-j)\epsilon'_j \Big)x+O(\epsilon_{\max}^2)\\ 134 | &=kx +\Big( (k-1)\epsilon 
+(k-1)\epsilon_0+\sum_{j=1}^{k-1} (k-j)\epsilon'_j \Big)x+O(\epsilon_{\max}^2) 135 | \end{aligned} 136 | $$ 137 | 138 | 对大括号的式子进行估计: 139 | $$ 140 | \begin{aligned} 141 | |(k-1)\epsilon +(k-1)\epsilon_0+\sum_{j=1}^{k-1} (k-j)\epsilon'_j| 142 | &\le |\epsilon_{\max}| |2k-2+\frac {(k-1)k}{2}|\\ 143 | &=|\epsilon_{\max}| |(k-1)\frac{k+4}{2}| 144 | \end{aligned} 145 | $$ 146 | 计算相对误差可得 147 | $$ 148 | \begin{aligned} 149 | |\frac{\bar {s_k} -s_k}{s_k}| 150 | &=\Big|\frac{\Big( (k-1)\epsilon +(k-1)\epsilon_0+\sum_{j=1}^{k-1} (k-j)\epsilon'_j \Big)x+O(\epsilon_{\max}^2)}{kx}\Big|\\ 151 | &\le |\epsilon_{\max}| |\frac{(k+4)(k-1)}{2k}| +O(\epsilon_{\max}^2)\\ 152 | &=\frac k 2 |\epsilon_{\max}| +O(\epsilon_{\max}^2) 153 | \end{aligned} 154 | $$ 155 | (f)考虑带误差的递推式: 156 | $$ 157 | \begin{aligned} 158 | \bar q_{k} 159 | &=(\bar q_{k-1}+\bar q_{k-1}).(1+\epsilon_{k-1})\\ 160 | &=2\bar q_{k-1}(1+\epsilon_{k-1}) \\ 161 | &=2^2\bar q_{k-2}(1+\epsilon_{k-1})(1+\epsilon_{k-2})\\ 162 | &=2^{k}\bar q_0 \prod_{i=0}^{k-1} (1+\epsilon_{i})\\ 163 | &=2^{k}x (1+\epsilon)^k \\ 164 | &=q_k(1+k\epsilon +O(\epsilon^2)) 165 | \end{aligned} 166 | $$ 167 | 因为 168 | $$ 169 | |k\epsilon| =|\epsilon|. \log_2 n \le |\epsilon_{\max}|. \log_2 n 170 | $$ 171 | 所以相对误差的上界约等于 172 | $$ 173 | |\epsilon_{\max}|. \log_2 n 174 | $$ 175 | 176 | Kahan求和的方法比较复杂,可以参考计算机程序设计艺术(第2卷)中文版第235页和598页,这里只给出结果: 177 | $$ 178 | \hat S_n =\sum_{i=1}^n (1+\mu_i)x_i\\ 179 | |\mu_i| \le 2u +O(nu^2) 180 | $$ 181 | 182 | 这个结果说明Kahan求和产生的误差和计算次数无关。 183 | 184 | 185 | 186 | #### Problem 3 187 | 188 | (a)假设$A,B \in \mathbb R^{n\times n}​$是上三角矩阵,即 189 | $$ 190 | 当i>j时,a_{ij}=b_{ij}=0 191 | $$ 192 | 记$C=AB$,考虑$b_{ij}(i>j)​$ 193 | $$ 194 | \begin{aligned} 195 | b_{ij} 196 | &=\sum_{k=1}^n a_{ik}b_{kj}\\ 197 | &=\sum_{k=1}^{j} a_{ik}b_{kj}+\sum_{k=j+1}^n a_{ik}b_{kj} 198 | \end{aligned} 199 | $$ 200 | 对前一项来说,因为$i>j \ge k$,所以$a_{ik}=0$;对于后一项来说,$k>j$,所以$b_{kj}=0$,因此对于$i>j$,我们有 201 | $$ 202 | c_{ij}=0 203 | $$ 204 | 这说明$C=AB​$是上三角矩阵。 205 | 206 | (b)直接计算特征多项式即可: 207 | $$ 208 | \left| 209 | \begin{matrix} 210 | \lambda-u_{11} & \ldots &\ldots & \ldots\\ 211 | 0& \lambda-u_{22} &\ldots&\ldots \\ 212 | 0 &0 & \ldots &\ldots \\ 213 | 0&0&0& \lambda-u_{nn} 214 | \end{matrix} 215 | \right|=\prod_{i=1}^n (\lambda- u_{ii}) 216 | $$ 217 | 所以上三角阵的特征值为其对角元。 218 | 219 | 下面证明$\{ \vec v_1,...,\vec v_k\} ​$线性无关,假设 220 | $$ 221 | \sum_{i=1}^{k} \alpha_i \vec v_i=0 222 | $$ 223 | 两边左乘$U^m$可得 224 | $$ 225 | \begin{eqnarray*} 226 | \sum_{i=1}^{k} \alpha_i U^m \vec v_i &=0\\ 227 | \sum_{i=1}^{k} \alpha_iu_{ii} U^{m-1} \vec v_i&=0\\ 228 | \ldots \\ 229 | \sum_{i=1}^{k} \alpha_i u_{ii}^m\vec v_i&=0 230 | \end{eqnarray*} 231 | $$ 232 | 对$m=0,...,k-1​$,将这些等式写成矩阵形式可得: 233 | $$ 234 | (\alpha_1 \vec v_1,..., \alpha_k \vec v_k) 235 | \left( 236 | \begin{matrix} 237 | 1 & u_{11} & \ldots & u_{11}^{k-1} \\ 238 | \vdots & \vdots & \ldots & \vdots \\ 239 | 1 & u_{kk} & \ldots & u_{kk}^{k-1} 240 | \end{matrix} 241 | \right) =(0,\ldots,0) 242 | $$ 243 | 记 244 | $$ 245 | A= \left( 246 | \begin{matrix} 247 | 1 & u_{11} & \ldots & u_{11}^{k-1} \\ 248 | \vdots & \vdots & \ldots & \vdots \\ 249 | 1 & u_{kk} & \ldots & u_{kk}^{k-1} 250 | \end{matrix} 251 | \right) 252 | $$ 253 | $A$的行列式为范德蒙行列式,因为$u_{ii}$互不相同,所以$|A|\neq 0$,从而$A$可逆,因此 254 | $$ 255 | (\alpha_1 \vec v_1,..., \alpha_k \vec v_k) =(0,\ldots,0) 256 | $$ 257 | 所以 258 | $$ 259 | \alpha_i \vec v_i =0 260 | $$ 261 | 因为$\vec v_i \neq 0$,所以$\alpha_i= 0$,从而$\{ \vec v_1,...,\vec v_k\} $线性无关。 262 | 263 | (c)假设$A\in \mathbb R^{n\times n}$是下三角矩阵,即 264 | $$ 265 
| 当ii 309 | $$ 310 | 所以$B$是下三角矩阵。 -------------------------------------------------------------------------------- /作业/hw1/hw1_solution.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/作业/hw1/hw1_solution.pdf -------------------------------------------------------------------------------- /作业/hw2/hw2.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/作业/hw2/hw2.pdf -------------------------------------------------------------------------------- /作业/hw2/hw2_solution.md: -------------------------------------------------------------------------------- 1 | #### Problem 1 2 | 3 | (a)$\forall A,B \in \mathbb R^{n\times n}$,以及满足$||\vec x||=1 $的$\vec x $,我们有 4 | $$ 5 | \begin{aligned} 6 | ||(A+B)\vec x || 7 | &=||A \vec x + B\vec x||\\ 8 | &\le ||A \vec x|| + ||B\vec x|| \\ 9 | &\le ||A|| + ||B|| 10 | \end{aligned} 11 | $$ 12 | 其中最后一个不等号是由$||A||$的定义。 13 | 14 | 所以 15 | $$ 16 | ||A+B||= \max \{||(A+B)\vec x||: ||\vec x||=1 \} 17 | \le ||A||+||B|| 18 | $$ 19 | (b)利用等价定义: 20 | $$ 21 | ||A|| =\max_{\vec x \in \mathbb R^n \setminus \{0\}} 22 | \frac{||A\vec x||}{||\vec x||} 23 | $$ 24 | 那么显然有 25 | $$ 26 | \begin{aligned} 27 | \frac{||A\vec x||}{||\vec x||}& \le ||A|| \\ 28 | ||A\vec x||&\le ||A||.||\vec x|| 29 | \end{aligned} 30 | $$ 31 | 任取$\vec x \in \mathbb R^n$,我们有 32 | $$ 33 | \begin{aligned} 34 | ||AB \vec x || 35 | &=||A(B\vec x) ||\\ 36 | &\le ||A|| .||B\vec x|| \\ 37 | &\le ||A|| .||B||. ||\vec x|| 38 | \end{aligned} 39 | $$ 40 | 所以 41 | $$ 42 | \frac{||AB \vec x ||}{||\vec x||} \le ||A|| .||B|| 43 | $$ 44 | 对左边取最大值可得: 45 | $$ 46 | ||AB|| \le ||A|| .||B|| 47 | $$ 48 | (c)原不等式等价于 49 | $$ 50 | ||A^k|| \ge |\lambda|^k 51 | $$ 52 | 对$A$的特征值$\lambda_i$,取对应的特征向量$\vec x_i \in \mathbb R^n​$,我们有 53 | $$ 54 | \begin{aligned} 55 | ||A^k \vec {x_i} || 56 | &=||A^{k-1}\lambda_i \vec {x_i} ||\\ 57 | &=|\lambda_i| ||A^{k-1} \vec {x_i}||\\ 58 | &=\ldots \\ 59 | &= |\lambda_i|^k .||\vec {x_i}|| 60 | \end{aligned} 61 | $$ 62 | 所以 63 | $$ 64 | \frac{|| A^k\vec {x_i} ||}{||\vec x ||}=|\lambda_i|^k 65 | $$ 66 | 因此对任意特征值$\lambda_i ​$,我们有 67 | $$ 68 | ||A^k|| =\max_{\vec x \in \mathbb R^n \setminus \{0\}} 69 | \frac{||A^k\vec x||}{||\vec x||} \ge |\lambda_i|^k 70 | $$ 71 | (d)我们假设$||\vec x||_1 =1$,即 72 | $$ 73 | \sum_{i=1}^n |x_i|=1 74 | $$ 75 | 计算$||A\vec x||_1​$可得: 76 | $$ 77 | \begin{aligned} 78 | ||A\vec x||_1 &=\sum_{i=1}^n \sum_{j=1}^n |a_{ij}||x_j|\\ 79 | &=\sum_{j=1}^n\sum_{i=1}^n |a_{ij}||x_j|\\ 80 | &\le \sum_{j=1}^n (\max_{1\le j\le n} \sum_{i=1}^n |a_{ij}|)|x_j| \\ 81 | &=(\max_{1\le j\le n} \sum_{i=1}^n |a_{ij}|)\sum_{j=1}^n |x_j| \\ 82 | &=\max_{1\le j\le n} \sum_{i=1}^n |a_{ij}| 83 | \end{aligned} 84 | $$ 85 | 所以我们有 86 | $$ 87 | ||A||_1 =\max_{1\le j\le n} \sum_{i=1}^n |a_{ij}| 88 | $$ 89 | 补充题: 90 | 91 | 由(c)可得,对任意特征值$\lambda_i$,我们有 92 | $$ 93 | \lim_{k\to \infty} ||A^k||^{\frac 1 k} \ge |\lambda_i| 94 | $$ 95 | 所以 96 | $$ 97 | \lim_{k\to \infty} ||A^k||^{\frac 1 k}\ge \max \{|\lambda_i|\}=\rho(A) 98 | $$ 99 | 接下来证明另一个方向的不等式,证明参考[维基百科](https://en.wikipedia.org/wiki/Spectral_radius#cite_note-1) 100 | $$ 101 | \lim_{k\to \infty} A^k = 0\Leftrightarrow \rho(A) <1 102 | $$ 103 | 注意到对于任意矩阵$A​$,假设其特征值为 104 | $$ 105 | \lambda_1,...,\lambda_n 
106 | $$ 107 | 那么$\frac A k ​$的特征为 108 | $$ 109 | \frac {\lambda_1} k,...,\frac{\lambda_n} k 110 | $$ 111 | 所以$\forall \epsilon > 0$,构造如下矩阵 112 | $$ 113 | A_+ =\frac 1 {\rho(A)+ \epsilon} A 114 | $$ 115 | 由之前叙述可得 116 | $$ 117 | \rho(A_+) < 1 118 | $$ 119 | 所以 120 | $$ 121 | \lim_{k\to \infty} A_+^k = 0 122 | $$ 123 | 所以由极限的定义可得,存在$N_+$,当$k\ge N_+$时,我们有 124 | $$ 125 | ||A_+^k ||=\frac{||A^k ||}{(\rho(A)+ \epsilon)^k} <1 126 | $$ 127 | 从而 128 | $$ 129 | ||A^k ||^{\frac 1 k }\le \rho(A)+ \epsilon 130 | $$ 131 | 结合之前的结果可得 132 | $$ 133 | \rho(A) \le ||A^k ||^{\frac 1 k }\le \rho(A)+ \epsilon 134 | $$ 135 | 令$\epsilon \to 0$,我们有 136 | $$ 137 | \rho(A) =||A^k ||^{\frac 1 k } 138 | $$ 139 | 140 | 141 | 142 | #### Problem 2 143 | 144 | (a)对于固定的$k$,记 145 | $$ 146 | E_{c,l}= I+c\vec e_l \vec e_k^T 147 | $$ 148 | 我们计算$E_{c_1,s} E_{c_2,s}​$ 149 | $$ 150 | \begin{aligned} 151 | E_{c_1,s} E_{c_2,t} 152 | &= (I+c_1 \vec e_s \vec e_k^T) (I+c_2\vec e_t \vec e_k^T) \\ 153 | &=I+(c_1\vec e_s +c_2 \vec e_t) \vec e_k^T + c_1 c_2 \vec e_s \vec e_k^T\vec e_t \vec e_k^T \\ 154 | &=I+(c_1\vec e_s +c_2 \vec e_t) \vec e_k^T + c_1 c_2 \vec e_s (\vec e_k^T\vec e_t) \vec e_k^T\\ 155 | &=I+(c_1\vec e_s +c_2 \vec e_t) \vec e_k^T 156 | \end{aligned} 157 | $$ 158 | 注意到forward substitution等价于左乘矩阵$E_{c,l}, l>k$,结合上述事实,我们有 159 | $$ 160 | \begin{aligned} 161 | M_k 162 | &= \prod_{i=k+1}^n E_{c_i, i} \\ 163 | &= I+ (\sum_{i=k+1}^{n}c_i\vec e_i ) \vec e_k^T 164 | \end{aligned} 165 | $$ 166 | 记 167 | $$ 168 | \vec m_k = -(\sum_{i=k+1}^{n}c_i\vec e_i ) 169 | $$ 170 | 第一部分结论得证。接着验证 171 | $$ 172 | L_k M_k =I 173 | $$ 174 | 事实上,我们有 175 | $$ 176 | \begin{aligned} 177 | L_k M_k 178 | &= (I+\vec m_k\vec e_k^T )(I-\vec m_k\vec e_k^T ) \\ 179 | &=I+\vec m_k\vec e_k^T -\vec m_k\vec e_k^T+\vec m_k(\vec e_k^T\vec m_k)\vec e_k^T \\ 180 | &=I+\vec m_k(\vec e_k^T\vec m_k)\vec e_k^T 181 | \end{aligned} 182 | $$ 183 | 注意到$\vec e_k$的第$k$个元素为$1$,$\vec m_k$第$k$个元素为$0$,所以 184 | $$ 185 | \vec m_k(\vec e_k^T\vec m_k)\vec e_k^T =\vec 0 186 | $$ 187 | 因此 188 | $$ 189 | L_k M_k =I 190 | $$ 191 | (b) 192 | $$ 193 | \begin{aligned} 194 | L_k P^{(ij)} 195 | &= (I+\vec m_k\vec e_k^T ) P^{(ij)}\\ 196 | &=P^{(ij)} +\vec m_k \big(\vec e_k^TP^{(ij)} \big) 197 | \end{aligned} 198 | $$ 199 | 因为$\vec e_k^TP^{(ij)}​$的作用是交换$\vec e_k^T​$的第$i,j ​$列,而$\vec e_k^T ​$的第$i,j​$列均为$0​$,$k\neq i,j​$,所以 200 | $$ 201 | \vec e_k^TP^{(ij)} = \vec e_k^T 202 | $$ 203 | 注意到我们显然有 204 | $$ 205 | P^{(ij)}P^{(ij)}=I 206 | $$ 207 | 所以 208 | $$ 209 | \begin{aligned} 210 | L_k P^{(ij)} 211 | &=P^{(ij)} +\vec m_k \big(\vec e_k^TP^{(ij)} \big) \\ 212 | &=P^{(ij)} +\vec m_k\vec e_k^T \\ 213 | &=P^{(ij)} +P^{(ij)}P^{(ij)}\vec{m_k}\vec e_k^T\\ 214 | &=P^{(ij)}(I+P^{(ij)}\vec{m_k}\vec e_k^T) 215 | \end{aligned} 216 | $$ 217 | (c)令 218 | $$ 219 | \begin{aligned} 220 | G(k) &= P_{k+1}L_{k+1}...P_{n-1}L_{n-1} \\ 221 | G'(k)&= P_{k+1}...P_{n-1}L_{k+1}^p ...L_{n-1}^p 222 | \end{aligned} 223 | $$ 224 | 我们的目标是证明 225 | $$ 226 | G(0)=G'(0) 227 | $$ 228 | 所以关于$k$做数学归纳法即可。当$k=n-2$时, 229 | $$ 230 | \begin{aligned} 231 | 232 | G(n-2)&= P_{n-1} L_{n-1}=P_{n-1} (I+\vec m_{n-1}\vec e_{n-1}^T)\\ 233 | G'(n-2)&= P_{n-1} L^p_{n-1}=P_{n-1} (I+\vec m_{n-1}\vec e_{n-1}^T) 234 | \end{aligned} 235 | $$ 236 | 所以 237 | $$ 238 | G(n-2) =G'(n-2) 239 | $$ 240 | 因此$k=n-2$时结论成立。假设$k=s$时结论,现在证明$k=s-1$时结论也成立,此时有 241 | $$ 242 | \begin{aligned} 243 | G(s) &= P_{s+1}L_{s+1}...P_{n-1}L_{n-1} \\ 244 | &= G'(s)\\ 245 | &= P_{s+1}...P_{n-1}L_{s+1}^p ...L_{n-1}^p 246 | \end{aligned} 247 | $$ 248 | 由定义,我们有 249 | $$ 250 | \begin{aligned} 
251 | G(s-1) 252 | &=P_s L_s G(s)\\ 253 | &=P_s L_s P_{s+1}...P_{n-1}L_{s+1}^p ...L_{n-1}^p 254 | \end{aligned} 255 | $$ 256 | 利用(b)计算$P_s L_s P_{s+1}...P_{n-1}$可得 257 | $$ 258 | \begin{aligned} 259 | P_s L_s P_{s+1}...P_{n-1} 260 | &=P_s (I+ \vec m_s \vec e_s)P_{s+1}...P_{n-1} \\ 261 | &=P_sP_{s+1} (I+P_{s+1}\vec m_s \vec e_s)P_{s+2}...P_{n-1} 262 | \end{aligned} 263 | $$ 264 | 注意到$P_{s+1}\vec m_s$的特点依然为$i\le s$的元素为$0$,所以仍然可以使用(b)的性质,即 265 | $$ 266 | (I+P_{s+1}\vec m_s \vec e_s)P_{s+2} = P_{s+2}(I+P_{s+2}P_{s+1}\vec m_s \vec e_s) 267 | $$ 268 | 所以 269 | $$ 270 | \begin{aligned} 271 | P_s L_s P_{s+1}...P_{n-1} 272 | &=P_sP_{s+1} (I+P_{s+1}\vec m_s \vec e_s)P_{s+2}...P_{n-1} \\ 273 | &= P_sP_{s+1} P_{s+2}(I+P_{s+2}P_{s+1}\vec m_s \vec e_s) 274 | P_{s+3}...P_{n-1} \\ 275 | &\ldots \\ 276 | &=P_s...P_{n-1} (I+P_{n-1} \ldots P_{s+1}\vec m_s \vec e_s)\\ 277 | &=P_s...P_{n-1} L_{s}^p 278 | \end{aligned} 279 | $$ 280 | 从而 281 | $$ 282 | \begin{aligned} 283 | G(s-1) 284 | &=P_s L_s P_{s+1}...P_{n-1}L_{s+1}^p ...L_{n-1}^p\\ 285 | &= P_s...P_{n-1} L_{s}^p L_{s+1}^p ...L_{n-1}^p\\ 286 | &=G'(s-1) 287 | \end{aligned} 288 | $$ 289 | 因此$k=s-1 ​$时结论也成立。 290 | 291 | (d)首先考虑 292 | $$ 293 | S_k \triangleq \vec m_k \vec e_k^T 294 | $$ 295 | 由$\vec m_k ,\vec e_k​$的性质可得,只有当$i \ge k+1​$且$j=k​$时,$(S_k)_{ij}\neq 0​$,所以当$i 0$,所以当$t\to \infty$时,上式趋于$0$,即 224 | $$ 225 | e^{-\Lambda t} \to \text{diag}\{0,...,0\} 226 | $$ 227 | 因此 228 | $$ 229 | e^{-At}= Q e^{-\Lambda t}Q^T \to 0 230 | $$ 231 | 232 | 233 | 234 | #### Problem 3 235 | 236 | (a)直接验证即可 237 | $$ 238 | \begin{aligned} 239 | A^T (I_m- QQ^T) 240 | &=R^TQ^T(I_m- QQ^T)\\ 241 | &=R^TQ^T-R^TQ^TQQ^T\\ 242 | &=R^TQ^T-R^TQ^T\\ 243 | &=0 244 | \end{aligned} 245 | $$ 246 | 所以结论成立。 247 | 248 | (b)因为 249 | $$ 250 | \vec a = \frac {\vec a }{\Arrowvert \vec a \Arrowvert}.\Arrowvert \vec a \Arrowvert 251 | $$ 252 | 利用第五讲的定义可得 253 | $$ 254 | Q_1 = \frac {\vec a }{\Arrowvert \vec a \Arrowvert}, R_1 = \Arrowvert \vec a \Arrowvert 255 | $$ 256 | (c)注意上式等价于 257 | $$ 258 | AQ^T =R 259 | $$ 260 | 所以只要利用正交矩阵对$A$做列变换,使得最终结果为上三角阵即可,所以存在上述分解。 261 | 262 | (d)注意上式等价于 263 | $$ 264 | Q^T A=L 265 | $$ 266 | 所以只要利用正交矩阵对$A​$做行变换,使得最终结果为下三角阵即可,所以存在上述分解。 267 | 268 | -------------------------------------------------------------------------------- /作业/hw3/hw3_solution.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/作业/hw3/hw3_solution.pdf -------------------------------------------------------------------------------- /作业/hw4/hw4.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/作业/hw4/hw4.pdf -------------------------------------------------------------------------------- /作业/hw4/hw4_solution.md: -------------------------------------------------------------------------------- 1 | #### Problem 1 2 | 3 | (a)假设 4 | $$ 5 | \begin{aligned} 6 | \Delta J&= J_k -J_{k-1}\\ 7 | \Delta \vec x&= \vec x_k -\vec x_{k-1}\\ 8 | \vec d &= f(\vec x_k) -f(\vec x_{k-1}) 9 | -J_{k-1}\Delta \vec x 10 | \end{aligned} 11 | $$ 12 | 那么原问题可以化为如下形式: 13 | $$ 14 | \begin{aligned} 15 | \min_{\Delta J}&\ \Arrowvert \Delta J\Arrowvert^2_{\text{Fro}}\\ 16 | \text{such that}&\ \Delta J.\Delta \vec x =\vec d 17 | \end{aligned} 18 | $$ 19 | 构造拉格朗日乘子: 20 | $$ 21 | 
\Lambda =\Arrowvert \Delta J\Arrowvert^2_{\text{Fro}} + 22 | \vec \lambda^T(\Delta J.\Delta \vec x -\vec d) 23 | $$ 24 | 关于$(\Delta J)_{ij}$求偏导并令其为$0$得到 25 | $$ 26 | 0 =\frac{\partial \Lambda}{(\Delta J)_{ij}} 27 | 28 | =2(\Delta J)_{ij} +\lambda_i (\Delta \vec x)_j 29 | $$ 30 | 所以 31 | $$ 32 | \Delta J =-\frac 1 2 \vec \lambda (\Delta \vec x)^T 33 | $$ 34 | 带回$\Delta J.\Delta \vec x =\vec d$可得 35 | $$ 36 | \vec \lambda(\Delta \vec x)^T (\Delta \vec x)=-2\vec d \Rightarrow 37 | \vec \lambda =-\frac{2\vec d}{\Arrowvert \Delta \vec x\Arrowvert^2} 38 | $$ 39 | 因此 40 | $$ 41 | \Delta J=-\frac 1 2 \vec \lambda (\Delta \vec x)^T=\frac{\vec d(\Delta \vec x)^T}{\Arrowvert \Delta \vec x\Arrowvert^2} 42 | $$ 43 | 回顾各项的定义,我们得到 44 | $$ 45 | \begin{aligned} 46 | J_k &=J_{k-1}+\Delta J\\ 47 | &=J_{k-1} +\frac{\vec d(\Delta \vec x)^T}{\Arrowvert \Delta \vec x\Arrowvert^2}\\ 48 | &=J_{k-1}+\frac{(f(\vec x_k) -f(\vec x_{k-1}) 49 | -J_{k-1}\Delta \vec x)} 50 | {\Arrowvert \vec x_k -\vec x_{k-1}\Arrowvert^2} 51 | (x_k -\vec x_{k-1})^T 52 | \end{aligned} 53 | $$ 54 | (b)带入验证即可: 55 | $$ 56 | \begin{aligned} 57 | \Big(A+\vec u \vec v^T\Big)\Big( A^{-1} 58 | -\frac{A^{-1}\vec u\vec v^T A^{-1}}{1+\vec v^T A^{-1}\vec u}\Big) 59 | &= I+\vec u \vec v^TA^{-1} - \frac{\vec u\vec v^T A^{-1}}{1+\vec v^T A^{-1}\vec u} 60 | -\frac{\vec u \vec v^TA^{-1}\vec u\vec v^T A^{-1}}{1+\vec v^T A^{-1}\vec u}\\ 61 | &= I+\vec u \vec v^TA^{-1} - \frac{\vec u\vec v^T A^{-1}}{1+\vec v^T A^{-1}\vec u} 62 | -\frac{\vec u (\vec v^TA^{-1}\vec u)\vec v^T A^{-1}}{1+\vec v^T A^{-1}\vec u}\\ 63 | &=I+\vec u \vec v^TA^{-1}-(\vec u\vec v^T A^{-1}) 64 | \frac{1+\vec v^TA^{-1}\vec u}{1+\vec v^TA^{-1}\vec u}\\ 65 | &=I+\vec u \vec v^TA^{-1} -\vec u\vec v^T A^{-1}\\ 66 | &=I 67 | \end{aligned} 68 | $$ 69 | (c)原始的迭代形式为: 70 | $$ 71 | J_k=J_{k-1}+\vec u_k \vec v_k^T 72 | $$ 73 | 由(b)可得 74 | $$ 75 | \begin{aligned} 76 | J_k^{-1} 77 | &=J_{k-1}^{-1} - 78 | \frac{J_{k-1}^{-1}\vec u_k \vec v_k^TJ_{k-1}^{-1}} 79 | {1+\vec v_k^T J_{k-1}^{-1}\vec u_k} 80 | \end{aligned} 81 | $$ 82 | 其中 83 | $$ 84 | \begin{aligned} 85 | \vec u_k &= \frac{(f(\vec x_k) -f(\vec x_{k-1}) 86 | -J_{k-1}\Delta \vec x)} 87 | {\Arrowvert \vec x_k -\vec x_{k-1}\Arrowvert^2} \\ 88 | \vec v_k &=x_k -\vec x_{k-1} 89 | \end{aligned} 90 | $$ 91 | 92 | 93 | 94 | #### Problem 2 95 | 96 | (a)使用奇异值分解: 97 | $$ 98 | A=U\Sigma V^T 99 | $$ 100 | 因为$m=n$,所以$\Sigma$为对角阵,即 101 | $$ 102 | \Sigma^T= \Sigma 103 | $$ 104 | 因此 105 | $$ 106 | \begin{aligned} 107 | A^TA 108 | &=V\Sigma ^T U^T U\Sigma V^T\\ 109 | &=V\Sigma^2V^T\\ 110 | \sqrt {A^TA} 111 | &= V\Sigma V^T 112 | \end{aligned} 113 | $$ 114 | 计算trace可得 115 | $$ 116 | \begin{aligned} 117 | \text{trace}(\sqrt {A^TA}) 118 | &=\text{trace}(V\Sigma V^T)\\ 119 | &=\text{trace}( V^TV\Sigma)&\text{trace}(AB)=\text{trace}(BA)\\ 120 | &=\text{trace}(\Sigma)\\ 121 | &=\sum_{i=1}^n \sigma_i(A)\\ 122 | &=\Arrowvert A\Arrowvert_* 123 | \end{aligned} 124 | $$ 125 | (b)证明一般情形,如果$A\in \mathbb R^{n\times m},B\in \mathbb R^{m\times n}​$,那么 126 | $$ 127 | \text{trace}(AB)=\text{trace}(BA) 128 | $$ 129 | 注意到 130 | $$ 131 | (AB)_{ii}=\sum_{s=1}^m A_{is} B_{si},(BA)_{ss}=\sum_{i=1}^n B_{si} A_{is} 132 | $$ 133 | 注意到 134 | $$ 135 | AB \in \mathbb R^{n\times n},BA\in \mathbb R^{m\times m} 136 | $$ 137 | 所以 138 | $$ 139 | \begin{aligned} 140 | \text{trace}(AB) 141 | &=\sum_{i=1}^n (AB)_{ii}\\ 142 | &=\sum_{i=1}^n \sum_{s=1}^m A_{is} B_{si}\\ 143 | &= \sum_{s=1}^m \sum_{i=1}^nB_{si}A_{is}\\ 144 | &=\sum_{s=1}^m (BA)_{ss}\\ 145 | &=\text{trace}(BA) 146 | \end{aligned} 147 | $$ 
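A quick numerical illustration of the two identities just shown — $\Arrowvert A\Arrowvert_* = \text{trace}(\sqrt{A^TA}) = \sum_i \sigma_i(A)$ and $\text{trace}(AB)=\text{trace}(BA)$ — is sketched below. This is an editorial addition rather than part of the original write-up; it assumes NumPy/SciPy are available, and the random test matrices are purely illustrative.

```python
import numpy as np
from scipy.linalg import sqrtm

# Sanity check of Problem 2 (a)/(b): the nuclear norm computed as
# trace(sqrt(A^T A)) versus the sum of singular values, and the
# cyclic property trace(AB) = trace(BA).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

nuclear_from_svd = np.linalg.svd(A, compute_uv=False).sum()   # sum of sigma_i(A)
nuclear_from_trace = np.trace(sqrtm(A.T @ A)).real            # trace of sqrt(A^T A)

print(abs(nuclear_from_svd - nuclear_from_trace))   # agree up to round-off
print(abs(np.trace(A @ B) - np.trace(B @ A)))       # trace(AB) = trace(BA)
```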
148 | (c)由SVD分解可得 149 | $$ 150 | AC=U\Sigma V^TC 151 | $$ 152 | 记 153 | $$ 154 | (V')^T=V^TC 155 | $$ 156 | 那么 157 | $$ 158 | AC=U\Sigma (V')^T 159 | $$ 160 | 并且 161 | $$ 162 | (V')(V')^T=C^TVV^TC=C^TC=I 163 | $$ 164 | 利用定义可得 165 | $$ 166 | \begin{aligned} 167 | \text{trace}(AC) 168 | &=\text{trace}(U\Sigma (V')^T)\\ 169 | &=\sum_{i=1}^n \sigma_i(A)u_i^Tv_i'\\ 170 | &\le \sum_{i=1}^n \sigma_i(A) 171 | \Arrowvert u_i\Arrowvert.\Arrowvert v_i'\Arrowvert\\ 172 | &=\sum_{i=1}^n \sigma_i(A) \\ 173 | &=\Arrowvert A\Arrowvert_* 174 | \end{aligned} 175 | $$ 176 | 当且仅当 177 | $$ 178 | v_i'=u_i 179 | $$ 180 | 时等号成立,即 181 | $$ 182 | V'=C^TV=U,C^T=UV^{T},C=VU^T 183 | $$ 184 | (d)对于满足条件$C^TC=I$的$C$,我们有 185 | $$ 186 | \begin{aligned} 187 | \text{trace}\Big((A+B)C\Big) 188 | &=\text{trace}(AC)+\text{trace}(BC)\\ 189 | &\le \Arrowvert A\Arrowvert_*+\Arrowvert B\Arrowvert_* 190 | \end{aligned} 191 | $$ 192 | 所以 193 | $$ 194 | \begin{aligned} 195 | \Arrowvert A+B\Arrowvert_* 196 | &=\max_{C^TC=I}\text{trace}\Big((A+B)C\Big)\\ 197 | &\le \Arrowvert A\Arrowvert_*+\Arrowvert B\Arrowvert_* 198 | \end{aligned} 199 | $$ 200 | (e)令 201 | $$ 202 | A'=(\sigma_1(A),...,\sigma_n(A))^T 203 | $$ 204 | 那么我们需要最小化 205 | $$ 206 | \Arrowvert A-A_0\Arrowvert^2_{\text{Fro}}+ \Arrowvert A'\Arrowvert_1 207 | $$ 208 | 由$L_1$正则化的特性,我们的结果会使得$A'$某些项为$0$,不妨设非零项的下标为 209 | $$ 210 | k_1,...,k_m 211 | $$ 212 | 由SVD可得 213 | $$ 214 | A=\sum_{\sigma_i(A)\neq 0}\sigma_i(A) u_i v_i^T 215 | =\sum_{j=1}^m\sigma_{k_j}(A) u_{k_j} v_{k_j}^T 216 | $$ 217 | 所以得到$A_0$的低秩近似。 218 | 219 | 220 | 221 | #### Problem 3 222 | 223 | (a)回顾割线法的定义: 224 | $$ 225 | x_{k+1}=x_k -\frac{f(x_k)(x_k -x_{k-1})}{f(x_k) -f(x_{k-1})} 226 | $$ 227 | 注意到 228 | $$ 229 | f(x')=0 230 | $$ 231 | 如果$x_k =x'$,那么 232 | $$ 233 | f(x_k)=0 234 | $$ 235 | 即 236 | $$ 237 | \begin{aligned} 238 | x_{k+1} 239 | &=x_k -\frac{f(x_k)(x_k -x_{k-1})}{f(x_k) -f(x_{k-1})}\\ 240 | &=x' 241 | \end{aligned} 242 | $$ 243 | 如果$x_{k-1}=x'$,那么 244 | $$ 245 | f(x_{k-1})=0 246 | $$ 247 | 即 248 | $$ 249 | \begin{aligned} 250 | x_{k+1} 251 | &=x_k -\frac{f(x_k)(x_k -x')}{f(x_k)}\\ 252 | &=x_k -(x_k -x')\\ 253 | &=x' 254 | \end{aligned} 255 | $$ 256 | (b)假设$A$的奇异值为 257 | $$ 258 | \sigma_1(A)\ge \ldots \ge \sigma_k(A) 259 | $$ 260 | 回顾SVD的推导,我们知道 261 | $$ 262 | R_A(\vec x) =\frac{\Arrowvert A\vec x \Arrowvert}{\Arrowvert \vec x \Arrowvert}\in [\sigma_k(A),\sigma_1(A)] 263 | $$ 264 | 现在假设增加一行$\vec \alpha^T​$,那么 265 | $$ 266 | \tilde A= \left[ 267 | \begin{matrix} 268 | A \\ 269 | \vec \alpha^T 270 | \end{matrix} 271 | \right],\tilde A\vec x= \left[ 272 | \begin{matrix} 273 | A \vec x \\ 274 | \vec \alpha^T \vec x 275 | \end{matrix} 276 | \right] 277 | $$ 278 | 因此 279 | $$ 280 | \begin{aligned} 281 | R_{\tilde A}(\vec x) 282 | &=\frac{\Arrowvert \tilde A\vec x \Arrowvert}{\Arrowvert \vec x \Arrowvert}\\ 283 | &=\frac{\Big\Arrowvert \left[ 284 | \begin{matrix} 285 | A \vec x \\ 286 | \vec \alpha^T \vec x 287 | \end{matrix} 288 | \right] \Big\Arrowvert}{\Arrowvert \vec x \Arrowvert}\\ 289 | &\ge \frac{ 290 | \Arrowvert A \vec x \Arrowvert}{\Arrowvert \vec x \Arrowvert}\\ 291 | &=R_A(\vec x) 292 | \end{aligned} 293 | $$ 294 | 因此$\tilde A$的最小奇异值和最大奇异值均不小于$A$的最小奇异值和最大奇异值。 295 | 296 | 297 | 298 | #### Problem 4 299 | 300 | (a)将$\frac 1 a$视为 301 | $$ 302 | f(x)=\frac 1 x - a 303 | $$ 304 | 的零点,然后利用牛顿迭代法迭代即可,注意 305 | $$ 306 | f'(x)=-\frac 1 {x^2} 307 | $$ 308 | 所以 309 | $$ 310 | x_{k+1}=x_k-\frac{f(x_k)}{f'(x_k)}=x_k+ x_k^2(\frac 1 {x_k}-a)=2x_k -ax_k^2 311 | $$ 312 | (b) 313 | $$ 314 | \begin{aligned} 315 | 
\epsilon_{k+1} 316 | &=ax_{k+1}-1\\ 317 | &=2ax_k-a^2x_k^2-1\\ 318 | &=-(ax_k-1)^2\\ 319 | &=-\epsilon_{k}^2 320 | \end{aligned} 321 | $$ 322 | (c)由(b)可得 323 | $$ 324 | |\epsilon_{k+1}|=|\epsilon_k^2|, 325 | |\epsilon_{k}|=|\epsilon_0|^{2^k} 326 | $$ 327 | 要使得计算结果达到$d$位$2$进制小数,我们有 328 | $$ 329 | \begin{aligned} 330 | |\epsilon_0|^{2^k}&=2^{-d}\\ 331 | 2^k\ln |\epsilon_0|&=-d \ln 2\\ 332 | 2^k&=-\frac{d\ln 2}{\ln |\epsilon_0|}\\ 333 | k&=\log_2 (-\frac{d\ln 2}{\ln |\epsilon_0|}) 334 | \end{aligned} 335 | $$ 336 | -------------------------------------------------------------------------------- /作业/hw4/hw4_solution.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/作业/hw4/hw4_solution.pdf -------------------------------------------------------------------------------- /作业/hw5/hw5.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Doraemonzzz/CS205A-Mathematical-Methods-for-Robotics--Vision--and-Graphics/47f3aba77a5233cde2292ac944a9e0f5ee20a4bc/作业/hw5/hw5.pdf -------------------------------------------------------------------------------- /作业/hw5/hw5_solution.md: -------------------------------------------------------------------------------- 1 | #### Problem 1 2 | 3 | (a) 4 | $$ 5 | \begin{aligned} 6 | \nabla_{\vec x} f(\vec x) 7 | &= A\vec x -\vec b\\ 8 | \nabla_{\vec x}^2 f(\vec x) 9 | &=\nabla(A\vec x -\vec b)\\ 10 | &= A^T\\ 11 | &=A 12 | \end{aligned} 13 | $$ 14 | 令第一个式子为$\vec 0$可得 15 | $$ 16 | \vec x' =A^{-1} \vec b 17 | $$ 18 | 因为$A$正定,所以$\vec x' ​$是极小值点。 19 | 20 | (b)首先考虑如何将两个向量变成$A-$共轭,假设两个向量为$\vec v_1, \vec v_2$,取$\vec w_1=\vec v_1$,设 21 | $$ 22 | \vec w_2 =\vec v_2 -\alpha \vec w_1 23 | $$ 24 | 要使得$\vec w_1, \vec w_2$为$A-$共轭,那么 25 | $$ 26 | \begin{aligned} 27 | \langle \vec w_2, \vec w_1 \rangle 28 | &=\langle \vec v_2 -\alpha \vec w_1, \vec w_1 \rangle\\ 29 | &= \langle \vec v_2 , \vec w_1 \rangle-\alpha\langle \vec w_1, \vec w_1 \rangle\\ 30 | &=0 \\ 31 | \alpha&= \frac{\langle \vec v_2 , \vec w_1 \rangle}{\langle \vec w_1, \vec w_1 \rangle} 32 | \end{aligned} 33 | $$ 34 | 不难看出 35 | $$ 36 | \text{span}\{\vec w_1 ,\vec w_2\} =\text{span}\{\vec v_1 ,\vec v_2\} 37 | $$ 38 | 接着,假设我们已经有了$k$个$A-$共轭向量$\vec w_1 ,\ldots ,\vec w_k​$,并且 39 | $$ 40 | \text{span}\{\vec w_1,\ldots ,\vec w_k\} =\text{span}\{\vec v_1,\ldots ,\vec v_k\} 41 | $$ 42 | 现在讨论如何获得第$k+1$个$A-$共轭向量$\vec w_{k+1}$,假设 43 | $$ 44 | \vec w_{k+1}= \vec v_{k+1} -\sum_{j=1}^k \alpha_j \vec w_j 45 | $$ 46 | 我们需要的条件为 47 | $$ 48 | \langle \vec w_{k+1}, \vec w_i \rangle =0, 1\le i \le k 49 | $$ 50 | 那么$\forall 1\le i \le k$ 51 | $$ 52 | \begin{aligned} 53 | \langle \vec w_{k+1}, \vec w_i \rangle 54 | &=\langle \vec v_{k+1} -\sum_{j=1}^k \alpha_j \vec w_j, \vec w_i \rangle\\ 55 | &= \langle \vec v_{k+1} , \vec w_i \rangle-\sum_{j=1}^k\alpha_j\langle \vec w_j, 56 | \vec w_i \rangle\\ 57 | &=\langle \vec v_{k+1} , \vec w_i \rangle -\alpha_i \langle \vec w_i, 58 | \vec w_i \rangle \\ 59 | &=0 \\ 60 | \alpha_i&= \frac{\langle \vec v_{k+1} , \vec w_i \rangle}{\langle \vec w_i, 61 | \vec w_i \rangle} 62 | \end{aligned} 63 | $$ 64 | 显然有 65 | $$ 66 | \text{span}\{\vec w_1,\ldots ,\vec w_{k+1}\} =\text{span}\{\vec v_1,\ldots ,\vec v_{k+1}\} 67 | $$ 68 | 69 | 70 | 上述讨论可以得到如下算法: 71 | 72 | 1. 令 73 | $$ 74 | {\vec w_1} ={\vec v_1} 75 | $$ 76 | 77 | 2. 对$k=2...n​$ 78 | 79 | 1. 
对$i=1,\ldots ,k-1$,计算 80 | $$ 81 | \alpha_i= \frac{\langle \vec v_{k+1} , \vec w_i \rangle}{\langle \vec w_i, 82 | \vec w_i \rangle} 83 | $$ 84 | 85 | 2. 86 | 87 | $$ 88 | \vec w_k = \vec v_{k} -\sum_{i=1}^{k-1} \alpha_i \vec w_i 89 | $$ 90 | 91 | 92 | 93 | #### Problem 2 94 | 95 | (a)拉朗格朗日乘子为 96 | $$ 97 | \Lambda(\vec{x}, \vec{\lambda}, \vec{\mu}) = \vec c^T\vec x-\vec{\lambda}^T (A\vec x-\vec b)-\vec{\mu}^T \vec x 98 | $$ 99 | 所以KKT条件为 100 | 101 | 1. stationarity 102 | $$ 103 | \nabla_{\vec x}\Lambda(\vec{x}, \vec{\lambda}, \vec{\mu}) =\vec c - A^T\vec \lambda-\vec \mu =\vec 0 104 | $$ 105 | 106 | 2. primal feasibility 107 | $$ 108 | \begin{aligned} 109 | A\vec x&=\vec b\\ 110 | \vec x& \ge \vec 0 111 | \end{aligned} 112 | $$ 113 | 114 | 3. complementary slackness 115 | $$ 116 | \mu_j x_j = 0 117 | $$ 118 | 119 | 4. dual feasibility 120 | $$ 121 | \mu_j \ge 0 122 | $$ 123 | 124 | (b)令 125 | $$ 126 | x_{n+1} =d- \vec v^T \vec x=d-\sum_{i=1}^n v_i x_i\ge 0 127 | $$ 128 | 记 129 | $$ 130 | \tilde A= 131 | \left[ 132 | \begin{array}{c|c} 133 | A& \vec 0 \\ 134 | \hline 135 | \vec v^T& 1 136 | \end{array} 137 | \right], \tilde {\vec b}=\left[ 138 | \begin{matrix} 139 | \vec b \\ 140 | d 141 | \end{matrix} 142 | \right], 143 | 144 | \tilde {\vec c} =\left[ 145 | \begin{matrix} 146 | \vec c \\ 147 | 0 148 | \end{matrix} 149 | \right], 150 | 151 | \tilde {\vec x} =\left[ 152 | \begin{matrix} 153 | \vec x \\ 154 | x_{n+1} 155 | \end{matrix} 156 | \right] 157 | $$ 158 | 那么 159 | $$ 160 | \begin{aligned} 161 | \tilde {\vec c}^T \tilde {\vec x}&=\vec c^T\vec x \\ 162 | \tilde A \tilde {\vec x}& = \left[ 163 | \begin{matrix} 164 | A\vec x \\ 165 | \vec v^T \vec x +x_{n+1} 166 | \end{matrix} 167 | \right] =\left[ 168 | \begin{matrix} 169 | \vec b \\ 170 | d 171 | \end{matrix} 172 | \right] =\tilde{\vec b}\\ 173 | \tilde{\vec x}& =\left[ 174 | \begin{matrix} 175 | \vec x \\ 176 | x_{n+1} 177 | \end{matrix} 178 | \right]\ge 0 179 | \end{aligned} 180 | $$ 181 | 所以新的线性规划问题为 182 | $$ 183 | \begin{aligned} 184 | \text { minimize }& \tilde {\vec c}^T \tilde {\vec x} \\ 185 | \text { such that } & \tilde A \tilde {\vec x} =\tilde{\vec b} \\ 186 | &\tilde{\vec x} \ge \vec 0 187 | \end{aligned} 188 | $$ 189 | (c)对偶问题可以化为 190 | $$ 191 | \begin{aligned} 192 | \text { minimize }& -\vec b^T \vec y \\ 193 | \text { such that } & 194 | -A^T\vec y +\vec c \ge \vec 0 195 | \end{aligned} 196 | $$ 197 | 记 198 | $$ 199 | \Lambda'(\vec{y}, \vec{\lambda}) = -\vec b^T\vec y-\vec{\lambda}^T (-A^T\vec y +\vec c) 200 | $$ 201 | 所以stationarity条件为 202 | $$ 203 | \begin{aligned} 204 | \nabla_{\vec y}\Lambda'(\vec{y}, \vec{\lambda})& = -\vec b+A\vec{\lambda}=\vec 0\\ 205 | \vec b&= A\vec \lambda 206 | \end{aligned} 207 | $$ 208 | 带回原式可得 209 | $$ 210 | \begin{aligned} 211 | \Lambda'(\vec{y}, \vec{\lambda}) 212 | &= - \vec \lambda^T A^T\vec y + \vec \lambda^T A^T\vec y 213 | -\vec \lambda^T \vec c \\ 214 | &=-\vec \lambda^T \vec c\\ 215 | &=-\vec c^T \vec \lambda 216 | \end{aligned} 217 | $$ 218 | complementary slackness条件为 219 | $$ 220 | \vec{\lambda}^T (-A^T\vec y +\vec c) = 0 221 | $$ 222 | 所以在最优解处 223 | $$ 224 | \begin{aligned} 225 | -\vec b^T \vec y&=\Lambda'(\vec{y}, \vec{\lambda})= -\vec c^T \vec \lambda\\ 226 | \vec b^T \vec y & =\vec c^T \vec \lambda 227 | \end{aligned} 228 | $$ 229 | 因为驻点唯一,所以如下方程的解唯一: 230 | $$ 231 | \begin{eqnarray*} 232 | A\vec \lambda&&=\vec b \tag 1\\ 233 | \vec \lambda&&\ge \vec 0 \tag 2 234 | \end{eqnarray*} 235 | $$ 236 | 设上述方程的解为 237 | $$ 238 | \vec \lambda ' 239 | $$ 240 | 所以最优解满足 241 | $$ 242 | 
\vec b^T \vec y= \vec c^T \vec \lambda' \tag 3 243 | $$ 244 | 因为原始问题的驻点唯一,所以原始问题的解唯一,设为$\vec x'$,那么最优解为 245 | $$ 246 | {\vec c}^T {\vec x}' \tag 4 247 | $$ 248 | 其中$\vec x '​$满足 249 | $$ 250 | \begin{eqnarray*} 251 | 252 | & A {\vec x} ={\vec b} \tag 5\\ 253 | &{\vec x} \ge \vec 0 \tag 6 254 | \end{eqnarray*} 255 | $$ 256 | 此即为对偶问题的解。注意(3)(4)和(5)(6)的形式一致,所以(3)(4)相等,即原问题和对偶问题的最优值相同。 257 | 258 | 259 | 260 | #### Problem 3 261 | 262 | (a)(i)取 263 | $$ 264 | g(\vec x) =\| \vec x\|^2 =\vec x^T \vec x 265 | $$ 266 | (ii)因为 267 | $$ 268 | \nabla_{\vec x} g(\vec x) =2\vec x 269 | $$ 270 | 所以随机梯度下降法计算的梯度为 271 | $$ 272 | \frac 2 k \sum_{i=1}^k (\vec x_i -\vec x) 273 | =2\left( \frac 1k \sum_{i=1}^k \vec x_i -\vec x\right) 274 | $$ 275 | (b)(i)不一定,如果$f$无下界,那么无法收敛到局部最小值。 276 | 277 | (ii)利用单调有界数列必收敛即可。 278 | 279 | 280 | 281 | #### Problem 4 282 | 283 | $\vec x​$是Pareto optimal的含义为 284 | $$ 285 | \forall \vec y, 或者\exists i, s.t\ f_i(\vec x ) >f_i(\vec y), 286 | 或者\forall i, f_i(\vec x )\ge f_i(\vec y) 287 | $$ 288 | (i)如果第一个条件不成立,即$\forall y,i$, 289 | $$ 290 | f_i(\vec x ) \le f_i(\vec y) 291 | $$ 292 | 293 | 294 | 如果$f_i​$全部相同,那么结论显然;如果$f_i​$不全相同,那么不妨设$f_i \neq f_j​$,那么必然存在函数值不相同的点$\vec x​$,设 295 | $$ 296 | f_i(\vec x )>f_j(\vec x) 297 | $$ 298 | 299 | (b)首先由$f_i$的凸性以及$\vec \gamma \ge \vec 0$,我们可得$g$也是凸函数,所以存在最小值。接着将$\gamma_i$视为变量,所以得到如下优化问题 300 | $$ 301 | \begin{aligned} 302 | \text {minimize }& \sum_{i=1}^k\gamma_i f_i(\vec x) \\ 303 | \text {such that } 304 | & \gamma_i \ge 0, i=1,\ldots, n \\ 305 | & \sum_{i=1}^n \gamma_i =1 306 | \end{aligned} 307 | $$ 308 | 构造拉格朗日乘子: 309 | $$ 310 | \begin{aligned} 311 | L(\vec x, \gamma) 312 | =\sum_{i=1}^k\gamma_i f_i(\vec x)-\sum_{i=1}^k \alpha_i \gamma_i - 313 | \beta \left(\sum_{i=1}^k \gamma_i -1 \right) 314 | \end{aligned} 315 | $$ 316 | 关于$\vec x, \gamma$求梯度并为$0$可得 317 | $$ 318 | \begin{aligned} 319 | \nabla_{\vec x} L(\vec x, \gamma) 320 | &=\sum_{i=1}^k\gamma_i \nabla_{\vec x} f_i(\vec x) =\vec 0\\ 321 | \frac{\partial L(\vec x, \gamma)}{\partial \gamma_i} 322 | &=f_i(\vec x) - \alpha_i -\beta=0,i=1,\ldots, n 323 | 324 | \end{aligned} 325 | $$ 326 | 所以 327 | $$ 328 | f_i(\vec x') =\alpha_i +\beta,i=1,\ldots, n 329 | $$ 330 | 对偶互补条件为 331 | $$ 332 | \alpha_i \gamma_i =0,i=1,\ldots, n 333 | $$ 334 | 因为 335 | $$ 336 | \gamma_i >0 337 | $$ 338 | 所以 339 | $$ 340 | \alpha_i =0 341 | $$ 342 | 因此 343 | $$ 344 | f_i(\vec x') =\beta,i=1,\ldots, n 345 | $$ 346 | 以及 347 | $$ 348 | g(\vec x') =\sum_{i=1}^k \gamma_i f_i(\vec x') =\sum_{i=1}^k \gamma_i \beta =\beta 349 | $$ 350 | 注意我们有 351 | $$ 352 | g(\vec x) =\sum_{i=1}^k \gamma_i f_i(\vec x) \ge \beta 353 | $$ 354 | 如果不存在$i ,\vec x $,使得 355 | $$ 356 | f_i(\vec x) >\beta 357 | $$ 358 | 那么必然有 359 | $$ 360 | g(\vec x) =\sum_{i=1}^k \gamma_i f_i(\vec x) \le \sum_{i=1}^k \gamma_i\beta=\beta 361 | $$ 362 | 所以$\forall i,\vec x $, 363 | $$ 364 | f_i(\vec x) =\beta 365 | $$ 366 | 此时$\vec x' ​$是Pareto optimal。 367 | 368 | 如果存在$i ,\vec x $,使得 369 | $$ 370 | f_i(\vec x) >\beta = f_i(\vec x') 371 | $$ 372 | 那么$\vec x' $同样是Pareto optimal。 373 | 374 | (c)感觉原题有误,$\vec x'$应该是Pareto dominate。 375 | 376 | 关于$\vec x ​$求梯度可得 377 | $$ 378 | \begin{aligned} 379 | \nabla h(\vec x) 380 | &=2\sum_{i=1}^k (f_i(\vec x) - z_i) \nabla f_i(\vec x)\\ 381 | \nabla^2 h(\vec x) 382 | &=2\sum_{i=1}^k (f_i(\vec x) - z_i)\nabla^2 f_i(\vec x) 383 | \end{aligned} 384 | $$ 385 | 因为$\forall i$ 386 | $$ 387 | f_i(\vec x) -z_i \ge 0 388 | $$ 389 | 以及$\nabla^2 f_i(\vec x) \ge $(半正定),所以$\nabla^2 h(\vec x)$半正定,因此最小值点唯一。 390 | 391 | 注意$\forall \vec x \neq \vec 
x'$ 392 | $$ 393 | \sum_{i=1}^k (f_i(\vec x') - z_i)^2 <\sum_{i=1}^k (f_i(\vec x) - z_i)^2 394 | $$ 395 | 所以必然存在$i​$,使得 396 | $$ 397 | \begin{aligned} 398 | (f_i(\vec x') - z_i)^2& < (f_i(\vec x) - z_i)^2 \\ 399 | f_i(\vec x') - z_i & < f_i(\vec x) - z_i \\ 400 | f_i(\vec x')& < f_i(\vec x) 401 | \end{aligned} 402 | $$ 403 | 因此$\vec x'​$是Pareto dominate。 404 | 405 | 406 | 407 | #### Problem 5 408 | 409 | (a)当满足如下条件时$f​$取最小值 410 | $$ 411 | x_1= 1, x_2 =x_1^2 =1 412 | $$ 413 | (b) 414 | $$ 415 | \begin{aligned} 416 | \frac {\partial f}{\partial x_1} 417 | &= (x_1^2 -x_2)\times 2x_1 +(x_1-1)\\ 418 | &=2x_1^3-2x_1x_2 +x_1 -1\\ 419 | \frac {\partial f}{\partial x_2} 420 | &= (x_1^2 -x_2)\times (-1) \\ 421 | &=x_2 -x_1^2\\ 422 | 423 | \frac {\partial^2 f}{\partial x_1^2} 424 | &= \frac {\partial }{\partial x_1}(2x_1^3-2x_1x_2 +x_1 -1)\\ 425 | &=6x_1^2-2x_2+1\\ 426 | \frac {\partial^2 f}{\partial x_2^2} 427 | &= \frac {\partial }{\partial x_2}(x_2 -x_1^2)\\ 428 | &=1\\ 429 | 430 | \frac {\partial^2 f}{\partial x_2\partial x_1} 431 | &= \frac {\partial }{\partial x_2}(2x_1^3-2x_1x_2 +x_1 -1)\\ 432 | &=-2x_1\\ 433 | \end{aligned} 434 | $$ 435 | 所以 436 | $$ 437 | \begin{aligned} 438 | \nabla f\Big|_{\vec x_0=(2,2)} &= \left[ 439 | \begin{matrix} 440 | 9\\ 441 | -2 442 | \end{matrix} 443 | \right] \\ 444 | H= \nabla^2 f\Big|_{\vec x_0=(2,2)}&= \left[ 445 | \begin{matrix} 446 | 21 & -4\\ 447 | -4 & 1 448 | \end{matrix} 449 | \right]\\ 450 | H^{-1} \nabla f 451 | &=\left[ 452 | \begin{matrix} 453 | \frac 1 5\\ 454 | - \frac 6 5 455 | \end{matrix} 456 | \right] 457 | \end{aligned} 458 | $$ 459 | 所以 460 | $$ 461 | \begin{aligned} 462 | \vec x_1 463 | &= \vec x_0 - H^{-1} \nabla f\\ 464 | &=\left[ 465 | \begin{matrix} 466 | 2\\ 467 | 2 468 | \end{matrix} 469 | \right]-\left[ 470 | \begin{matrix} 471 | \frac 1 5\\ 472 | - \frac 6 5 473 | \end{matrix} 474 | \right]\\ 475 | &=\left[ 476 | \begin{matrix} 477 | \frac 9 5\\ 478 | \frac {16} 5 479 | \end{matrix} 480 | \right] 481 | \end{aligned} 482 | $$ 483 | (c)显然 484 | $$ 485 | f(\vec x_1) 0 $,所以 411 | $$ 412 | \left|\lambda_{0}-a_{r r}\right|\le R_{r} 413 | $$ 414 | 415 | 416 | 回到原题,我们有 417 | $$ 418 | \begin{aligned} 419 | R_{i}&=\sum_{j \neq 1}^{n}\left|g_{i j}\right|\\ 420 | &=\left|g_{i 1}\right|+\cdots+\left|g_{i, i-1}\right|+\left|g_{i, i+1}\right|+\cdots+\left|g_{i n}\right|\\ 421 | &=\frac {1}{|a_{ii}|} \left(\left|a_{i 1}\right|+\cdots+\left|a_{i, i-1}\right|+\left|a_{i, i+1}\right|+\cdots+\left|a_{i n}\right|\right) \\ 422 | &<1 423 | \end{aligned} 424 | $$ 425 | 而$g_{ii}=0$,所以$G​$的特征值满足 426 | $$ 427 | |\lambda|