├── .gitignore
├── README.md
├── midterm
│   ├── midterm2023.lyx
│   ├── midterm2023.pdf
│   ├── midterm2023soln.lyx
│   └── midterm2023soln.pdf
├── notes
│   ├── 18.335_Lecture1.pdf
│   ├── 18.335_Lecture10.pdf
│   ├── 18.335_Lecture11.pdf
│   ├── 18.335_Lecture12.pdf
│   ├── 18.335_Lecture13.pdf
│   ├── 18.335_Lecture14.pdf
│   ├── 18.335_Lecture15.pdf
│   ├── 18.335_Lecture16.pdf
│   ├── 18.335_Lecture17.pdf
│   ├── 18.335_Lecture18.pdf
│   ├── 18.335_Lecture19.pdf
│   ├── 18.335_Lecture2.pdf
│   ├── 18.335_Lecture20.pdf
│   ├── 18.335_Lecture21.pdf
│   ├── 18.335_Lecture22.pdf
│   ├── 18.335_Lecture3.pdf
│   ├── 18.335_Lecture4.pdf
│   ├── 18.335_Lecture5.pdf
│   ├── 18.335_Lecture6.pdf
│   ├── 18.335_Lecture7.pdf
│   ├── 18.335_Lecture8.pdf
│   ├── 18.335_Lecture9.pdf
│   ├── Floating-Point-Intro.ipynb
│   ├── Gram-Schmidt.ipynb
│   ├── Is-Gaussian-Elimination-Unstable.ipynb
│   ├── Three-Ways-To-Solve-Least-Squares.ipynb
│   ├── notes_spring_2023
│   │   ├── BFGS_SJnotes.pdf
│   │   ├── Implicit_Q.ipynb
│   │   ├── Lectures
│   │   │   ├── Lecture_11.pdf
│   │   │   ├── Lecture_12.pdf
│   │   │   ├── Lecture_13.pdf
│   │   │   ├── Lecture_14.pdf
│   │   │   ├── Lecture_15.pdf
│   │   │   ├── Lecture_16.pdf
│   │   │   ├── Lecture_17.pdf
│   │   │   ├── Lecture_18.pdf
│   │   │   ├── Lecture_19.pdf
│   │   │   ├── Lecture_20.pdf
│   │   │   ├── Lecture_21.pdf
│   │   │   ├── Lecture_22.pdf
│   │   │   ├── Lecture_25.pdf
│   │   │   ├── Lecture_4.pdf
│   │   │   ├── Lecture_5-6.pdf
│   │   │   ├── Lecture_8.pdf
│   │   │   ├── Lecture_9-10.pdf
│   │   │   └── Lectures_1-3.pdf
│   │   ├── Reflections-Rotations-And-Orthogonal-Transformations.ipynb
│   │   └── matrix-mult-experiments.pdf
│   ├── painless-conjugate-gradient.pdf
│   ├── restarting-arnoldi.pdf
│   └── solver-options.pdf
├── project
│   └── final_project_spring2025.pdf
└── psets
    ├── notes_spring_2023
    │   ├── pset1.ipynb
    │   ├── pset1.lyx
    │   ├── pset1.pdf
    │   ├── pset2.ipynb
    │   ├── pset2.lyx
    │   ├── pset2.pdf
    │   ├── pset3.lyx
    │   └── pset3.pdf
    ├── pset1.ipynb
    ├── pset1.pdf
    ├── pset2.pdf
    └── pset3.pdf
/.gitignore:
--------------------------------------------------------------------------------
1 | psets-old
2 | old.md
3 | .DS_Store
4 | *.lyx~
5 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # 18.335J/6.7310J: Introduction to Numerical Methods
2 |
3 | This is the repository of course materials for the 18.335J/6.7310J course at MIT, taught by Dr. [Shi Chen](https://math.mit.edu/directory/profile.html?pid=2701) in Spring 2025.
4 |
5 | Syllabus
6 | --------
7 |
8 | **Lectures**: Monday/Wednesday 9:30am–11am in room 2-190
9 |
10 | **Office Hours:** Monday 11am–12pm and Friday 3pm-4pm in room 2-239.
11 |
12 | **Contact:** schen636@mit.edu
13 |
14 | **Topics**: Advanced introduction to numerical linear algebra and related numerical methods. Topics include direct and iterative methods for linear systems, eigenvalue decompositions and QR/SVD factorizations, stability and accuracy of numerical algorithms, the IEEE floating-point standard. Other topics may include nonlinear optimization, numerical integration, and FFTs. Problem sets will involve use of [Julia](http://julialang.org/), a Matlab-like environment (little or no prior experience required; you will learn as you go).
15 |
16 | Launch a Julia environment in the cloud: [Launch Binder](https://mybinder.org/v2/gh/mitmath/binder-env/main?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252Fmitmath%252F18335%26urlpath%3Dtree%252F18335%252F%26branch%3Dmaster)
17 |
18 | **Prerequisites**: Understanding of linear algebra ([18.06](http://web.mit.edu/18.06/www/), [18.700](http://ocw.mit.edu/OcwWeb/Mathematics/18-700Fall-2005/CourseHome/), or equivalents). 18.335 is a graduate-level subject, however, so much more mathematical maturity, ability to deal with abstractions and proofs, and general exposure to mathematics is assumed than for 18.06!
19 |
20 | **Textbook**: The primary textbook for the course is [_Numerical Linear Algebra_ by Trefethen and Bau](https://mit.primo.exlibrisgroup.com/discovery/fulldisplay?docid=alma990008205740106761&context=L&vid=01MIT_INST:MIT&lang=en&search_scope=all&adaptor=Local%20Search%20Engine&isFrbr=true&tab=all&query=any,contains,numerical%20linear%20algebra&sortby=date_d&facet=frbrgroupid,include,9020987051618639766&offset=0). For a classical (and rigorous) treatment, see Higham's [Accuracy and Stability of Numerical Algorithms](https://epubs.siam.org/doi/book/10.1137/1.9780898718027), and Golub and Van Loan's [Matrix Computations](https://github.com/CompPhysics/ComputationalPhysicsMSU/blob/master/doc/Lectures/Golub%2C%20Van%20Loan%20-%20Matrix%20Computations.pdf). For a contemporary perspective, see Solomon's [Numerical Algorithms](https://people.csail.mit.edu/jsolomon/share/book/numerical_book.pdf).
21 |
22 | **Other Reading**: Previous terms can be found in [branches of the 18335 git repository](https://github.com/mitmath/18335/branches). The [course notes from 18.335 in much earlier terms](https://ocw.mit.edu/courses/mathematics/18-335j-introduction-to-numerical-methods-fall-2010/) can be found on OpenCourseWare. For a review of iterative methods, the online books [Templates for the Solution of Linear Systems](http://www.netlib.org/linalg/html_templates/Templates.html) (Barrett et al.) and [Templates for the Solution of Algebraic Eigenvalue Problems](http://www.cs.utk.edu/~dongarra/etemplates/book.html) (Bai et al.) are useful surveys.
23 |
24 | **Grading**: 40% problem sets (three psets, released and due every other Friday), 30% **take-home mid-term exam** (second week of April), 30% **final project** (see details below).
25 |
26 | * Psets will be **submitted electronically via [Gradescope](https://www.gradescope.com/)** (sign up for Gradescope with the entry code on Canvas). Submit a good-quality PDF *scan* of any handwritten solutions and *also* a PDF *printout* of a Julia notebook of your computational solutions.
27 |
28 | * [Piazza discussion board](https://www.piazza.com/mit/spring2025/18335/home)
29 |
30 | * previous midterms: [fall 2008](https://github.com/mitmath/18335/blob/fall08/midterm.pdf) and [solutions](https://github.com/mitmath/18335/blob/fall08/midterm-sol.pdf), [fall 2009](https://github.com/mitmath/18335/blob/fall09/midterm-f09.pdf) (no solutions), [fall 2010](https://github.com/mitmath/18335/blob/fall10/midterm-f10.pdf) and [solutions](https://github.com/mitmath/18335/blob/fall10/midterm-sol-f10.pdf), [fall 2011](https://github.com/mitmath/18335/blob/fall11/midterm-f11.pdf) and [solutions](https://github.com/mitmath/18335/blob/fall11/midtermsol-f11.pdf), [fall 2012](https://github.com/mitmath/18335/blob/fall12/midterm-f12.pdf) and [solutions](https://github.com/mitmath/18335/blob/fall12/midtermsol-f12.pdf), [fall 2013](https://github.com/mitmath/18335/blob/fall13/midterm-f13.pdf) and [solutions](https://github.com/mitmath/18335/blob/fall13/midtermsol-f13.pdf), [spring 2015](https://github.com/mitmath/18335/blob/spring15/exams/midterm-s15.pdf) and [solutions](https://github.com/mitmath/18335/blob/spring15/exams/midtermsol-s15.pdf), [spring 2019](https://github.com/mitmath/18335/blob/spring19/psets/midterm.pdf) and [solutions](https://github.com/mitmath/18335/blob/spring19/psets/midtermsol.pdf), [spring 2020](https://github.com/mitmath/18335/blob/spring20/psets/midterm.pdf) and [solutions](https://github.com/mitmath/18335/blob/spring20/psets/midtermsol.pdf).
31 |
32 | **Collaboration policy**: Talk to anyone you want to and read anything you want to, with three exceptions: First, you **may not refer to homework solutions from the previous terms** in which I taught 18.335. Second, make a solid effort to solve a problem on your own before discussing it with classmates or googling. Third, no matter whom you talk to or what you read, write up the solution on your own, without having their answer in front of you.
33 |
34 | * You can use [psetpartners.mit.edu](https://psetpartners.mit.edu/) to help you find classmates to chat with.
35 |
36 | **Final Projects**: The final project will explore a numerical topic of your choice that is related to the course material but has not been covered in depth during lectures or problem sets. The project consists of two components:
37 | * **Proposal (due by 11:59 PM, Sunday, April 13)**: A one-page summary outlining your chosen topic and its relevance.
38 | * **Final submission**: You may choose one of the following formats:
39 | * **Technical report (5–15 pages)** reviewing an interesting numerical algorithm not covered in the course, due by **11:59 PM, Thursday, May 15**.
40 | * **Technical presentation (35-minute in-class lecture)**. Limited spots are available, and you may collaborate with one other classmate. You need to submit your code.
41 |
42 | See [this page](https://github.com/mitmath/18335/blob/master/project/final_project_spring2025.pdf) for more details.
43 |
44 | Assignments
45 | ------------
46 |
47 | * [Pset 1](https://github.com/mitmath/18335/blob/master/psets/pset1.pdf) is due on March 2 at 11:59pm.
48 |
49 | * [Pset 2](https://github.com/mitmath/18335/blob/master/psets/pset2.pdf) is due on April 6 at 11:59pm.
50 |
51 | * [Pset 3](https://github.com/mitmath/18335/blob/master/psets/pset3.pdf) is due on May 4 at 11:59pm.
52 |
53 |
54 | Lecture Summaries and Handouts
55 | ------------------------------
56 |
57 | ### Lecture 1 (February 3)
58 | * **Scope of Numerical Methods and the key concerns:** A broad introduction to the vast field of numerical methods. Unlike pure mathematics, numerical analysis must account for performance (efficiency) and accuracy (precision and stability).
59 | * **The Importance of Numerical Linear Algebra (NLA):** Why is NLA fundamental? Through examples, we demonstrate how NLA naturally arises in solving a wide range of problems in continuous mathematics.
60 |
61 | **Further Reading:** L.N. Trefethen, [The Definition of Numerical Analysis](https://people.maths.ox.ac.uk/trefethen/essays.html), [Lecture notes 1](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture1.pdf).
62 |
63 | ### Lecture 2 (February 5)
64 | * Floating point arithmetic, exact rounding, and the "fundamental axiom"
65 | * Stability of summation algorithms
66 | * Forward and backward stability
67 |
68 | **Further Reading:** L.N. Trefethen, Lectures 13 and 14. Also, see the [notebook](https://github.com/mitmath/18335/blob/master/notes/Floating-Point-Intro.ipynb) about floating point, [Lecture notes 2](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture2.pdf).
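
As a quick illustration, here is a minimal Julia sketch of both points (the vector length and the `Float32` working precision are arbitrary choices that make the error visible):

```julia
using LinearAlgebra

# The "fundamental axiom" of floating point: each operation is exactly rounded,
# fl(x ⊙ y) = (x ⊙ y)(1 + δ) with |δ| ≤ ε_mach.
@show eps(Float64)            # machine epsilon, ≈ 2.2e-16
@show (0.1 + 0.2) - 0.3       # every step exactly rounded, yet the result ≠ 0

# Stability of summation: left-to-right accumulation has error growing like n·ε,
# while Julia's built-in `sum` uses the more accurate pairwise summation.
x = rand(Float32, 10^7)
sequential = foldl(+, x)                 # one rounding error per addition
pairwise   = sum(x)                      # pairwise (divide-and-conquer) order
reference  = Float32(sum(Float64.(x)))   # higher-precision reference value
@show abs(sequential - reference)
@show abs(pairwise - reference)
```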
69 |
70 | ### Lecture 3 (February 10)
71 | * Recap on forward and backward stability, and condition numbers
72 | * Accuracy <= backward stable algorithms + well-conditioned problems
73 | * Vector and matrix norms
74 |
75 | **Further Reading:** L. N. Trefethen, Lectures 12 and 15. Also, see the notes about [stability of sum](https://github.com/mitmath/18335/blob/spring21/notes/summation-stability.pdf) and [equivalence of norms](https://github.com/mitmath/18335/blob/spring21/notes/norm-equivalence.pdf), [Lecture notes 3](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture3.pdf).
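
The slogan in action, as a minimal Julia sketch (the graded random matrix below is just one convenient way to manufacture an ill-conditioned system):

```julia
using LinearAlgebra

n = 100
A = randn(n, n) * Diagonal(10.0 .^ range(0, -12, length=n))   # cond(A) ≳ 1e12
x_true = ones(n)
b = A * x_true

xhat = A \ b                                          # backward stable LU solve
@show norm(A * xhat - b) / (opnorm(A) * norm(xhat))   # backward error: O(ε_mach)
@show norm(xhat - x_true) / norm(x_true)              # forward error: O(cond(A)·ε_mach)
@show cond(A) * eps()                                 # conditioning explains the gap
```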
76 |
77 | ### Lecture 4 (February 12)
78 | * Solving Ax = b
79 | * Condition number of A
80 | * Orthogonal/Unitary matrices
81 | * The singular value decomposition (SVD)
82 |
83 | **Further Reading:** L. N. Trefethen, Lectures 4 and 5. [Lecture notes 4](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture4.pdf).
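
A few of these facts, checked numerically in a small Julia sketch (matrix sizes are arbitrary):

```julia
using LinearAlgebra

A = randn(100, 100)
U, s, V = svd(A)                    # A = U * Diagonal(s) * V'
@show opnorm(A) ≈ s[1]              # the matrix 2-norm is the largest singular value
@show cond(A) ≈ s[1] / s[end]       # κ₂(A) = σ_max / σ_min

Q = Matrix(qr(randn(100, 100)).Q)   # a random orthogonal matrix
@show svdvals(Q * A) ≈ s            # singular values are unitarily invariant
```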
84 |
85 | ### Lecture 5 (February 18)
86 |
87 | * Recap on SVD and its applications
88 | * Least-squares solutions to overdetermined linear systems
89 | * Normal equations and pseudoinverse
90 | * QR factorization
91 |
92 | **Further Reading:** L. N. Trefethen, Lectures 11, 18 and 19. Also, see the notes about [least-squares](https://github.com/mitmath/18335/blob/master/notes/Three-Ways-To-Solve-Least-Squares.ipynb), [Lecture notes 5](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture5.pdf).
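
In Julia, the three approaches from the least-squares notebook look roughly like this (a sketch on a random, well-conditioned test problem):

```julia
using LinearAlgebra

A, b = randn(200, 10), randn(200)

x_ne  = (A'A) \ (A'b)      # normal equations: cheapest, but squares cond(A)
x_qr  = qr(A) \ b          # Householder QR: the backward stable workhorse
x_svd = pinv(A) * b        # pseudoinverse via the SVD: most robust, most work

@show norm(x_ne - x_qr)    # all three agree here because A is well-conditioned
@show norm(x_svd - x_qr)
@show A \ b ≈ x_qr         # backslash on a tall matrix also solves least squares
```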
93 |
94 | ### Lecture 6 (February 19)
95 |
96 | * Gram-Schmidt orthogonalization and its stability
97 | * Householder transform, Householder QR algorithm and its stability
98 |
99 | **Further Reading:** L. N. Trefethen, Lectures 7, 8 and 10. Also, see the notes about [Gram-Schmidt](https://github.com/mitmath/18335/blob/master/notes/Gram-Schmidt.ipynb), [Lecture notes 6](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture6.pdf).
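
The instability is easy to reproduce; here is a minimal Julia sketch in the spirit of the Gram-Schmidt notebook (the Vandermonde matrix is just a convenient ill-conditioned example):

```julia
using LinearAlgebra

# Classical Gram-Schmidt: loss of orthogonality grows with cond(A).
function classical_gram_schmidt(A)
    m, n = size(A)
    Q, R = zeros(m, n), zeros(n, n)
    for j in 1:n
        R[1:j-1, j] = Q[:, 1:j-1]' * A[:, j]      # project against previous q's
        v = A[:, j] - Q[:, 1:j-1] * R[1:j-1, j]
        R[j, j] = norm(v)
        Q[:, j] = v / R[j, j]
    end
    return Q, R
end

A = [t^k for t in range(0, 1, length=50), k in 0:14]  # ill-conditioned Vandermonde
Q_cgs, _ = classical_gram_schmidt(A)
Q_hh = Matrix(qr(A).Q)                                # Householder QR (thin Q)
@show opnorm(Q_cgs' * Q_cgs - I)   # large: orthogonality degrades with cond(A)
@show opnorm(Q_hh' * Q_hh - I)     # ~ε_mach: the Householder Q stays orthogonal
```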
100 |
101 | ### Lecture 7 (February 24)
102 |
103 | * Solving Ax=b via QR factorization: flops and stability
104 | * Solving Ax=b via LU factorization: Gaussian elimination, flops and stability
105 |
106 | **Further Reading:** L. N. Trefethen, Lectures 20, 21, and 22. [Lecture notes 7](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture7.pdf)
107 |
108 | ### Lecture 8 (February 26)
109 |
110 | * Pivoting in LU factorization
111 | * LU factorization for symmetric positive definite matrix: Cholesky factorization
112 |
113 | **Further Reading:** L. N. Trefethen, Lectures 21 and 22. [Lecture notes 8](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture8.pdf). Can you bring Gaussian elimination's instability to life in this [example](https://github.com/mitmath/18335/blob/master/notes/Is-Gaussian-Elimination-Unstable.ipynb)?
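
For a preview, here is the classic 2×2 small-pivot example in Julia (a sketch; `NoPivot()` requires Julia 1.7 or later):

```julia
using LinearAlgebra

# A tiny pivot makes unpivoted Gaussian elimination unstable.
A = [1e-20 1.0; 1.0 1.0]
b = [1.0, 2.0]

F = lu(A, NoPivot())               # Gaussian elimination, no row exchanges
@show opnorm(F.L * F.U - A)        # O(1) backward error: the (2,2) entry is lost
@show norm(A * (F \ b) - b)        # the computed solution is badly wrong

P = lu(A)                          # partial pivoting (the default)
@show norm(A * (P \ b) - b)        # residual at the level of ε_mach

# Symmetric positive definite matrices need no pivoting: Cholesky always works.
S = A' * A + I
C = cholesky(S)
@show C.L * C.L' ≈ S
```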
114 |
115 | ### Lecture 9 (March 3)
116 |
117 | * Iterative methods for linear systems
118 | * Convergence of vector sequences
119 | * Matrix splittings and examples
120 | * Convergence of stationary iterative methods
121 |
122 | **Further Reading:** Y. Saad, Iterative Methods for Sparse Linear Systems, [Chapter 4](https://epubs.siam.org/doi/10.1137/1.9780898718003.ch4). [Lecture notes 9](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture9.pdf).
123 |
124 | ### Lecture 10 (March 5)
125 |
126 | * Jacobi iteration
127 | * Gauss-Seidel iteration
128 | * Successive over-relaxation (SOR) and optimal relaxation parameter
129 |
130 | **Further Reading:** Y. Saad, Iterative Methods for Sparse Linear Systems, [Chapter 4](https://epubs.siam.org/doi/10.1137/1.9780898718003.ch4). [Lecture notes 10](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture10.pdf).
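
All three methods fit one template: split A = M − N and iterate x ← x + M⁻¹(b − Ax). A minimal Julia sketch on the 1-D Laplacian (matrix, size, and iteration count are arbitrary; the SOR parameter below is the known optimal one for this particular matrix):

```julia
using LinearAlgebra

# Stationary iteration x <- x + M \ (b - A*x) for a splitting A = M - N;
# it converges iff the spectral radius of I - M\A is below 1.
function stationary(M, A, b; iters=500)
    x = zeros(length(b))
    for _ in 1:iters
        x += M \ (b - A * x)
    end
    return x
end

n = 20
A = Matrix(SymTridiagonal(fill(2.0, n), fill(-1.0, n - 1)))   # 1-D Laplacian, SPD
b = randn(n)
x_true = A \ b

ω = 2 / (1 + sin(π / (n + 1)))     # optimal SOR relaxation parameter for this A
splittings = [
    ("Jacobi",       Matrix(Diagonal(A))),          # M = D
    ("Gauss-Seidel", Matrix(LowerTriangular(A))),   # M = D + L
    ("SOR",          Diagonal(A) / ω + tril(A, -1)),
]
for (name, M) in splittings
    ρ   = maximum(abs, eigvals(I - M \ A))          # convergence factor
    err = norm(stationary(M, A, b) - x_true) / norm(x_true)
    println(name, ": spectral radius = ", round(ρ, digits=3), ", rel. error = ", err)
end
```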
131 |
132 | ### Lecture 11 (March 10)
133 |
134 | * Review basics of eigenvalue problems
135 | * Schur factorization
136 | * Stability/perturbation theory of eigenvalue problems
137 |
138 | **Further Reading:** L. N. Trefethen, Lectures 24 and 25, [Lecture notes 11](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture11.pdf).
139 |
140 | ### Lecture 12 (March 12)
141 |
142 | * Power methods
143 | * Rayleigh quotient methods
144 | * Simultaneous power iteration
145 |
146 | **Further Reading:** L. N. Trefethen, Lectures 27 and 28, [Lecture notes 12](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture12.pdf).
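
Both ideas in a minimal Julia sketch (matrix and iteration counts are arbitrary, and there is no safeguard against an exactly singular shift):

```julia
using LinearAlgebra

# Power iteration: linear convergence to the eigenvalue of largest magnitude.
function power_iteration(A; iters=300)
    x = normalize(randn(size(A, 1)))
    for _ in 1:iters
        x = normalize(A * x)
    end
    return x' * A * x, x          # Rayleigh quotient estimate and eigenvector
end

# Rayleigh quotient iteration: re-shift by the current estimate at every step;
# cubically convergent for symmetric A, at the price of a shifted solve per step.
function rayleigh_quotient_iteration(A, x; iters=8)
    for _ in 1:iters
        μ = x' * A * x
        x = normalize((A - μ * I) \ x)
    end
    return x' * A * x, x
end

A = Matrix(Symmetric(randn(50, 50)))
λ_pow, _ = power_iteration(A)
λ_rqi, _ = rayleigh_quotient_iteration(A, normalize(randn(50)))
@show maximum(abs, eigvals(Symmetric(A))) - abs(λ_pow)   # small after many steps
@show minimum(abs, eigvals(Symmetric(A)) .- λ_rqi)       # ~ε after a few steps
```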
147 |
148 | ### Lecture 13 (March 17)
149 |
150 | * QR algorithm
151 | * Upper Hessenberg factorization
152 | * QR algorithm with shift
153 |
154 | **Further Reading:** L. N. Trefethen, Lectures 26 and 29, [Lecture notes 13](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture13.pdf).
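
A compact Julia sketch of the shifted QR algorithm with deflation (for simplicity it calls a dense `qr` at each step rather than exploiting the Hessenberg structure with Givens rotations, and it has no convergence safeguards):

```julia
using LinearAlgebra

# Shifted QR iteration: H <- RQ + σI is a similarity transform, so eigenvalues
# are preserved; with the corner-entry (Rayleigh) shift, the last subdiagonal
# entry converges rapidly and the matrix deflates.
function qr_algorithm_eigvals(A; tol=1e-12)
    H = Matrix(hessenberg(A).H)   # reduce to Hessenberg (tridiagonal if symmetric)
    vals = Float64[]
    n = size(H, 1)
    while n > 1
        while abs(H[n, n-1]) > tol * (abs(H[n-1, n-1]) + abs(H[n, n]))
            σ = H[n, n]           # Rayleigh-quotient shift
            Q, R = qr(H - σ * I)
            H = R * Q + σ * I
        end
        push!(vals, H[n, n])      # deflate the converged eigenvalue
        H = H[1:n-1, 1:n-1]
        n -= 1
    end
    push!(vals, H[1, 1])
    return sort(vals)
end

A = Matrix(Symmetric(randn(8, 8)))
@show qr_algorithm_eigvals(A) ≈ sort(eigvals(Symmetric(A)))
```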
155 |
156 | ### Lecture 14 (March 19)
157 |
158 | * Other eigenvalue algorithms
159 | * Computing SVD
160 |
161 | **Further Reading:** L. N. Trefethen, Lectures 30 and 31, [Lecture notes 14](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture14.pdf)
162 |
163 | ### Lecture 15 (March 31)
164 |
165 | * Iterative methods for sparse matrices, and Krylov subspaces
166 | * Galerkin condition and Rayleigh-Ritz projection for eigenvalue problems
167 | * Arnoldi's iteration for finding orthonormal basis in Krylov subspaces
168 |
169 | **Further Reading:** L. N. Trefethen, Lectures 30 and 31, [Lecture notes 15](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture15.pdf)
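
The core of the Arnoldi iteration is short enough to write out; a minimal Julia sketch (sizes arbitrary, no handling of "happy breakdown" when the norm hits zero):

```julia
using LinearAlgebra

# Arnoldi: build an orthonormal basis Q of K_n(A, b) together with the
# (n+1) x n upper Hessenberg matrix H satisfying A*Q_n = Q_{n+1}*H.
function arnoldi(A, b, n)
    m = length(b)
    Q = zeros(m, n + 1)
    H = zeros(n + 1, n)
    Q[:, 1] = b / norm(b)
    for j in 1:n
        v = A * Q[:, j]
        for i in 1:j                   # modified Gram-Schmidt against previous q's
            H[i, j] = Q[:, i]' * v
            v -= H[i, j] * Q[:, i]
        end
        H[j+1, j] = norm(v)
        Q[:, j+1] = v / H[j+1, j]
    end
    return Q, H
end

A, b = randn(100, 100), randn(100)
Q, H = arnoldi(A, b, 20)
@show opnorm(Q' * Q - I)               # the basis is orthonormal
@show opnorm(A * Q[:, 1:20] - Q * H)   # the Arnoldi relation holds to ~ε
```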
170 |
171 | ### Lecture 16 (April 2)
172 |
173 | * Arnoldi's method for Hermitian matrices, and the Lanczos algorithm for sparse eigenvalue problems
174 | * Convergence of the Lanczos algorithm and Arnoldi's method
175 |
176 | **Further Reading:** L. N. Trefethen, Lectures 33 and 34, [Lecture notes 16](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture16.pdf). You can read about the [implicitly restarted Arnoldi iteration](https://github.com/mitmath/18335/blob/master/notes/restarting-arnoldi.pdf) and the [bells and whistles](https://epubs.siam.org/doi/10.1137/S0895479895281484) that made it [a standard iterative eigensolver](https://epubs.siam.org/doi/book/10.1137/1.9780898719628) for non-Hermitian sparse matrices.
177 |
178 | ### Lecture 17 (April 7)
179 |
180 | * Krylov subspace methods for linear systems
181 | * Projection methods
182 | * Arnoldi's method for linear systems
183 |
184 | **Further Reading:** Y. Saad, Iterative Methods for Sparse Linear Systems, [Chapter 5.1](https://epubs.siam.org/doi/10.1137/1.9780898718003.ch5), and [6.4](https://epubs.siam.org/doi/10.1137/1.9780898718003.ch6), [Lecture notes 17](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture17.pdf)
185 |
186 | ### Lecture 18 (April 9)
187 |
188 | * GMRES as a projection problem and a least-squares problem
189 | * Convergence of GMRES
190 |
191 | **Further Reading:** L. N. Trefethen, Lecture 35, [Lecture notes 18](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture18.pdf).
192 |
193 | ### Lecture 19 (April 14)
194 |
195 | * Steepest descent
196 | * Conjugate Gradient (CG)
197 | * Convergence of CG
198 |
199 | **Further Reading:** L. N. Trefethen, Lecture 38, [Introduction to CG](https://github.com/mitmath/18335/blob/master/notes/painless-conjugate-gradient.pdf), [Lecture notes 19](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture19.pdf).
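
The textbook CG loop is only a few lines; a minimal Julia sketch for an SPD system (the test matrix and tolerance are arbitrary):

```julia
using LinearAlgebra

# Conjugate gradient for SPD A: step k minimizes the A-norm of the error
# over the growing Krylov subspace K_k(A, b).
function conjugate_gradient(A, b; tol=1e-10, maxiter=length(b))
    x = zeros(length(b))
    r = copy(b)                       # residual b - A*x for x = 0
    p = copy(r)                       # search direction
    rs = r' * r
    for _ in 1:maxiter
        Ap = A * p
        α = rs / (p' * Ap)            # exact line search along p
        x += α * p
        r -= α * Ap
        rs_new = r' * r
        sqrt(rs_new) < tol && break
        p = r + (rs_new / rs) * p     # keeps directions A-conjugate
        rs = rs_new
    end
    return x
end

n = 200
A = Matrix(SymTridiagonal(fill(2.0, n), fill(-1.0, n - 1)))   # SPD test matrix
b = randn(n)
@show norm(conjugate_gradient(A, b) - A \ b)
```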
200 |
201 | ### Lecture 20 (April 16)
202 |
203 | * Preconditioning
204 | * Biorthogonalization methods
205 |
206 | **Further Reading:** L. N. Trefethen, Lecture 39, [Lecture notes 20](https://github.com/mitmath/18335/blob/master/notes/18.335_Lecture20.pdf), [Notes on different options for solving Ax=b](https://github.com/mitmath/18335/blob/master/notes/solver-options.pdf).
207 |
208 | ### Lecture 21 (April 23)
209 |
210 | * Many big data matrices are approximately low-rank
211 | * Optimal low-rank approximation and the SVD
212 | * Randomized range-finders
213 |
214 | **Further Reading:** See the now classic review paper by [Halko, Martinsson, and Tropp](https://epubs.siam.org/doi/10.1137/090771806) for an introduction to randomized range-finders and approximate matrix decompositions. This interesting paper by [Udell and Townsend](https://epubs.siam.org/doi/10.1137/18M1183480) explores the origins of low-rank structure in big-data matrices.
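
The Eckart-Young theorem in one small Julia experiment (the Hilbert matrix is a standard example of rapid singular-value decay; the truncation rank is arbitrary):

```julia
using LinearAlgebra

# Optimal low-rank approximation: truncating the SVD at rank k is the best
# rank-k approximation, with 2-norm error exactly σ_{k+1}.
A = [1 / (i + j - 1) for i in 1:100, j in 1:100]   # Hilbert matrix
U, s, V = svd(A)
k = 5
Ak = U[:, 1:k] * Diagonal(s[1:k]) * V[:, 1:k]'
@show opnorm(A - Ak), s[k+1]   # equal up to rounding (Eckart-Young)
@show s[k+1] / s[1]            # relative error of the best rank-5 approximation
```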
215 |
216 | ### Lecture 22 (April 28)
217 |
218 | * Accuracy of randomized range-finders
219 | * Oversampling and failure probability
220 |
221 | **Further Reading:** See for example this [repo](https://github.com/JuliaLinearAlgebra/LowRankApprox.jl/tree/master) for a Julia implementation of randomized low-rank approximation.
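
For flavor, a bare-bones randomized range-finder in Julia (in the spirit of Halko-Martinsson-Tropp; the test matrix, target rank, and oversampling parameter are arbitrary choices):

```julia
using LinearAlgebra

# Randomized range-finder: sample the range of A with a Gaussian test matrix,
# orthonormalize the sample, and project. Oversampling by p extra columns makes
# the failure probability decay exponentially in p.
function randomized_lowrank(A, k; p=10)
    Ω = randn(size(A, 2), k + p)    # Gaussian sketch
    Q = Matrix(qr(A * Ω).Q)         # orthonormal basis for the sampled range
    return Q * (Q' * A)             # rank-(k+p) approximation Q*Q'*A
end

# Synthetic test matrix with geometrically decaying singular values.
m = 300
U = Matrix(qr(randn(m, m)).Q)
V = Matrix(qr(randn(m, m)).Q)
s = 2.0 .^ -(0:m-1)
A = U * Diagonal(s) * V'
k = 20
@show opnorm(A - randomized_lowrank(A, k))   # near σ_{k+1} with high probability
@show s[k+1]                                 # the optimal rank-k error
```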
222 |
223 | ### Lecture 23 (April 30)
224 |
225 | Guest lecture by Dr. [Ziang Chen](https://sites.mit.edu/ziangchen/) on Graph Neural Networks
226 |
227 | ### Lecture 24 (May 5)
228 |
229 | Student Presentation I:
230 |
231 | * **Anghel David-Andrei, Paulius Aleknavicius:** Fast matrix multiplication
232 | * **Mary Foxen, John Readlinger:** Fast discrete cosine transforms
233 |
234 | ### Lecture 25 (May 7)
235 |
236 | Student Presentation II:
237 |
238 | * **Akhil Sadam:** Adjoints for SDEs
239 | * **Zimi Zhang, Cynthia Cao:** Fast Multipole Methods for N-Body Problems
240 |
241 | ### Lecture 26 (May 12)
242 |
243 | Student Presentation III:
244 |
245 | * **Julia Zhao, Gandhar Mahadeshwar:** Monte-Carlo Integration with Adaptive Importance Sampling
246 | * **Max Misterka, Tristan Kay:** Nonlinear Complementarity
--------------------------------------------------------------------------------
/midterm/midterm2023.lyx:
--------------------------------------------------------------------------------
1 | #LyX 2.3 created this file. For more info see http://www.lyx.org/
2 | \lyxformat 544
3 | \begin_document
4 | \begin_header
5 | \save_transient_properties true
6 | \origin unavailable
7 | \textclass article
8 | \begin_preamble
9 |
10 | \renewcommand{\vec}[1]{\mathbf{#1}}
11 |
12 | \renewcommand{\labelenumi}{(\alph{enumi})}
13 | \renewcommand{\labelenumii}{(\roman{enumii})}
14 |
15 | \newcommand{\tr}{\operatorname{tr}}
16 | \end_preamble
17 | \use_default_options false
18 | \maintain_unincluded_children false
19 | \language english
20 | \language_package default
21 | \inputencoding auto
22 | \fontencoding global
23 | \font_roman "times" "default"
24 | \font_sans "default" "default"
25 | \font_typewriter "default" "default"
26 | \font_math "auto" "auto"
27 | \font_default_family default
28 | \use_non_tex_fonts false
29 | \font_sc false
30 | \font_osf false
31 | \font_sf_scale 100 100
32 | \font_tt_scale 100 100
33 | \use_microtype false
34 | \use_dash_ligatures true
35 | \graphics default
36 | \default_output_format default
37 | \output_sync 0
38 | \bibtex_command default
39 | \index_command default
40 | \paperfontsize default
41 | \spacing single
42 | \use_hyperref false
43 | \papersize default
44 | \use_geometry true
45 | \use_package amsmath 2
46 | \use_package amssymb 2
47 | \use_package cancel 1
48 | \use_package esint 0
49 | \use_package mathdots 1
50 | \use_package mathtools 1
51 | \use_package mhchem 1
52 | \use_package stackrel 1
53 | \use_package stmaryrd 1
54 | \use_package undertilde 1
55 | \cite_engine basic
56 | \cite_engine_type default
57 | \biblio_style plain
58 | \use_bibtopic false
59 | \use_indices false
60 | \paperorientation portrait
61 | \suppress_date false
62 | \justification true
63 | \use_refstyle 0
64 | \use_minted 0
65 | \index Index
66 | \shortcut idx
67 | \color #008000
68 | \end_index
69 | \topmargin 1in
70 | \secnumdepth 3
71 | \tocdepth 3
72 | \paragraph_separation indent
73 | \paragraph_indentation default
74 | \is_math_indent 0
75 | \math_numbering_side default
76 | \quotes_style english
77 | \dynamic_quotes 0
78 | \papercolumns 1
79 | \papersides 2
80 | \paperpagestyle default
81 | \tracking_changes false
82 | \output_changes false
83 | \html_math_output 0
84 | \html_css_as_file 0
85 | \html_be_strict false
86 | \end_header
87 |
88 | \begin_body
89 |
90 | \begin_layout Section*
91 | 18.335 Take-Home Midterm Exam: Spring 2023
92 | \end_layout
93 |
94 | \begin_layout Standard
95 | Posted Friday 12:30pm April 14, due
96 | \series bold
97 | 11:59pm Monday April 17.
98 | \end_layout
99 |
100 | \begin_layout Subsection*
101 | Problem 0: Honor code
102 | \end_layout
103 |
104 | \begin_layout Standard
105 | Copy and sign the following in your solutions:
106 | \end_layout
107 |
108 | \begin_layout Standard
109 |
110 | \emph on
111 | I have not used any resources to complete this exam other than my own 18.335
112 | notes, the textbook, running my own Julia code, and posted 18.335 course
113 | materials.
114 | \end_layout
115 |
116 | \begin_layout Standard
117 | \begin_inset VSpace 30pt
118 | \end_inset
119 |
120 |
121 | \end_layout
122 |
123 | \begin_layout Standard
124 | \begin_inset Tabular
125 |
126 |
127 |
128 |
129 |
130 |
131 | \begin_inset Text
132 |
133 | \begin_layout Plain Layout
134 | \begin_inset space \hspace{}
135 | \length 30col%
136 | \end_inset
137 |
138 |
139 | \end_layout
140 |
141 | \end_inset
142 | |
143 |
144 | \begin_inset Text
145 |
146 | \begin_layout Plain Layout
147 | \begin_inset CommandInset line
148 | LatexCommand rule
149 | offset "0ex"
150 | width "60col%"
151 | height "1pt"
152 |
153 | \end_inset
154 |
155 |
156 | \end_layout
157 |
158 | \end_inset
159 | |
160 |
161 |
162 |
163 | \begin_inset Text
164 |
165 | \begin_layout Plain Layout
166 |
167 | \end_layout
168 |
169 | \end_inset
170 | |
171 |
172 | \begin_inset Text
173 |
174 | \begin_layout Plain Layout
175 | your signature
176 | \end_layout
177 |
178 | \end_inset
179 | |
180 |
181 |
182 |
183 | \end_inset
184 |
185 |
186 | \end_layout
187 |
188 | \begin_layout Subsection*
189 | Problem 1: (32 points)
190 | \end_layout
191 |
192 | \begin_layout Standard
193 | Given two real vectors
194 | \begin_inset Formula $u=(u_{1},u_{2},\ldots,u_{n})^{T}$
195 | \end_inset
196 |
197 | and
198 | \begin_inset Formula $v=(v_{1},v_{2},\ldots,v_{n})^{T}$
199 | \end_inset
200 |
201 | , computing the dot product
202 | \begin_inset Formula $f(u,v)=u_{1}v_{1}+u_{2}v_{2}+\cdots+u_{n}v_{n}=u^{T}v$
203 | \end_inset
204 |
205 | in floating point arithmetic with left to right summation is backward stable.
206 | The computed dot product
207 | \begin_inset Formula $\hat{f}(u,v)$
208 | \end_inset
209 |
210 | satisfies the
211 | \emph on
212 | component-wise
213 | \emph default
214 | backward error criteria
215 | \begin_inset Formula
216 | \[
217 | \hat{f}(u,v)=(u+\delta u)^{T}v,\qquad\text{where}\ensuremath{\qquad|\delta u|\leq n\epsilon_{{\rm mach}}|u|+\mathcal{O}(\epsilon_{{\rm mach}}^{2})}.
218 | \]
219 |
220 | \end_inset
221 |
222 | The notation
223 | \begin_inset Formula $|w|$
224 | \end_inset
225 |
226 | indicates the vector
227 | \begin_inset Formula $|w|=(|w_{1}|,|w_{2}|,\ldots,|w_{n}|)^{T}$
228 | \end_inset
229 |
230 | , i.e., the vector obtained by taking the absolute value of each entry of
231 |
232 | \begin_inset Formula $w$
233 | \end_inset
234 |
235 | .
236 | \end_layout
237 |
238 | \begin_layout Enumerate
239 | Using the dot product algorithm
240 | \begin_inset Formula $\hat{f}(u,v)$
241 | \end_inset
242 |
243 | , derive an algorithm
244 | \begin_inset Formula $\hat{g}(A,b)$
245 | \end_inset
246 |
247 | for computing the matrix-vector product
248 | \begin_inset Formula $g(A,b)=Ab$
249 | \end_inset
250 |
251 | in floating point arithmetic, and show that it satisfies the component-wise
252 | backward stability criteria
253 | \begin_inset Formula
254 | \[
255 | \hat{g}(A,b)=(A+\delta A)b,\qquad\text{where\ensuremath{\qquad}|\ensuremath{\delta A|\leq n\epsilon_{{\rm mach}}|A|+\mathcal{O}(\epsilon_{{\rm mach}}^{2}),}}
256 | \]
257 |
258 | \end_inset
259 |
260 | where the notation
261 | \begin_inset Formula $|B|$
262 | \end_inset
263 |
264 | indicates the matrix obtained by taking the absolute value of each entry
265 | of
266 | \begin_inset Formula $B$
267 | \end_inset
268 |
269 | .
270 | \end_layout
271 |
272 | \begin_layout Enumerate
273 | Suppose the algorithm
274 | \begin_inset Formula $\hat{g}(A,b)$
275 | \end_inset
276 |
277 | is used to compute matrix-matrix products
278 | \begin_inset Formula $C=AB$
279 | \end_inset
280 |
281 | by computing one column of the matrix
282 | \begin_inset Formula $C$
283 | \end_inset
284 |
285 | at a time.
286 | Is the resulting floating-point algorithm
287 | \begin_inset Formula $\hat{h}(A,B)$
288 | \end_inset
289 |
290 | component-wise backward stable in the sense that there is a matrix
291 | \begin_inset Formula $\delta A$
292 | \end_inset
293 |
294 | such that
295 | \begin_inset Formula
296 | \[
297 | \hat{h}(A,B)=(A+\delta A)B,\qquad\text{where\ensuremath{\qquad|\delta A|\leq n\epsilon_{{\rm mach}}|A|}+\ensuremath{\mathcal{O}(\epsilon_{{\rm mach}}^{2})}}?
298 | \]
299 |
300 | \end_inset
301 |
302 | Explain why or why not.
303 | \end_layout
304 |
305 | \begin_layout Subsection*
306 | Problem 2: (32 points)
307 | \end_layout
308 |
309 | \begin_layout Standard
310 | Given an
311 | \begin_inset Formula $n$
312 | \end_inset
313 |
314 | -dimensional subspace
315 | \begin_inset Formula $\mathcal{V}$
316 | \end_inset
317 |
318 | , the standard Rayleigh–Ritz projection approximates a few (
319 | \begin_inset Formula $n\ll m$
320 | \end_inset
321 |
322 | ) eigenvalues of an
323 | \begin_inset Formula $m\times m$
324 | \end_inset
325 |
326 | matrix
327 | \begin_inset Formula $A$
328 | \end_inset
329 |
330 | by finding a scalar
331 | \begin_inset Formula $\lambda$
332 | \end_inset
333 |
334 | and
335 | \begin_inset Formula $x\in\mathcal{V}$
336 | \end_inset
337 |
338 | such that
339 | \begin_inset Formula $Ax-\lambda x\perp\mathcal{V}$
340 | \end_inset
341 |
342 | , i.e., the residual is perpendicular to the subspace.
343 | A
344 | \emph on
345 | two-sided
346 | \emph default
347 | Rayleigh–Ritz projection uses a second subspace
348 | \begin_inset Formula $\mathcal{W}$
349 | \end_inset
350 |
351 | (not orthogonal to
352 | \begin_inset Formula $\mathcal{V}$
353 | \end_inset
354 |
355 | ) and searches for a scalar
356 | \begin_inset Formula $\lambda$
357 | \end_inset
358 |
359 | and
360 | \begin_inset Formula $x\in\mathcal{V}$
361 | \end_inset
362 |
363 | such that
364 | \begin_inset Formula
365 | \begin{equation}
366 | Ax-\lambda x\perp\mathcal{\mathcal{W}},\qquad\text{and\ensuremath{\qquad x\in\mathcal{V}},}
367 | \end{equation}
368 |
369 | \end_inset
370 |
371 | i.e., the residual is perpendicular to the
372 | \emph on
373 | second
374 | \emph default
375 | subspace.
376 | In this problem,
377 | \begin_inset Formula $A$
378 | \end_inset
379 |
380 | is diagonalizable.
381 | \end_layout
382 |
383 | \begin_layout Enumerate
384 | Let
385 | \begin_inset Formula $V$
386 | \end_inset
387 |
388 | and
389 | \begin_inset Formula $W$
390 | \end_inset
391 |
392 | be a pair of bases for
393 | \begin_inset Formula $\mathcal{V}$
394 | \end_inset
395 |
396 | and
397 | \begin_inset Formula $\mathcal{W}$
398 | \end_inset
399 |
400 | , and let
401 | \begin_inset Formula $\lambda$
402 | \end_inset
403 |
404 | (finite) and
405 | \begin_inset Formula $w$
406 | \end_inset
407 |
408 | solve the eigenvalue problem
409 | \begin_inset Formula $Bw=\lambda Mw$
410 | \end_inset
411 |
412 | , where
413 | \begin_inset Formula $B=W^{T}AV$
414 | \end_inset
415 |
416 | and
417 | \begin_inset Formula $M=W^{T}V$
418 | \end_inset
419 |
420 | .
421 | Show that
422 | \begin_inset Formula $\lambda$
423 | \end_inset
424 |
425 | and
426 | \begin_inset Formula $x=Vw$
427 | \end_inset
428 |
429 | satisfy the criteria in (1).
430 | \end_layout
431 |
432 | \begin_layout Enumerate
433 | Suppose that
434 | \begin_inset Formula $\mathcal{V}={\rm span}\{x_{1},\ldots,x_{n}\}$
435 | \end_inset
436 |
437 | and
438 | \begin_inset Formula $\mathcal{W={\rm span}}\{y_{1},\ldots,y_{n}\}$
439 | \end_inset
440 |
441 | , where
442 | \begin_inset Formula $Ax_{i}=\lambda_{i}x_{i}$
443 | \end_inset
444 |
445 | and
446 | \begin_inset Formula $A^{T}y_{i}=\lambda_{i}y_{i}$
447 | \end_inset
448 |
449 | for
450 | \begin_inset Formula $i=1,\ldots,n$
451 | \end_inset
452 |
453 | , are a pair of
454 | \begin_inset Formula $n$
455 | \end_inset
456 |
457 | -dimensional
458 | \emph on
459 | right and left invariant subspaces
460 | \emph default
461 | of
462 | \begin_inset Formula $A$
463 | \end_inset
464 |
465 | .
466 | If the bases
467 | \begin_inset Formula $V$
468 | \end_inset
469 |
470 | and
471 | \begin_inset Formula $W$
472 | \end_inset
473 |
474 | are chosen to be
475 | \emph on
476 | bi-orthonormal
477 | \emph default
478 | , meaning that
479 | \begin_inset Formula $W^{T}V=I$
480 | \end_inset
481 |
482 | , show that
483 | \begin_inset Formula $\lambda$
484 | \end_inset
485 |
486 | and
487 | \begin_inset Formula $x$
488 | \end_inset
489 |
490 | from part (a) are an eigenpair of the full
491 | \begin_inset Formula $m\times m$
492 | \end_inset
493 |
494 | matrix
495 | \begin_inset Formula $A$
496 | \end_inset
497 |
498 | , i.e., that
499 | \begin_inset Formula $Ax=\lambda x$
500 | \end_inset
501 |
502 | .
503 | \end_layout
504 |
505 | \begin_layout Standard
506 |
507 | \series bold
508 | Hint 1:
509 | \series default
510 | In part (b), consider the similarity transform
511 | \begin_inset Formula $[W\quad W_{2}]^{T}A[V\quad V_{2}],$
512 | \end_inset
513 |
514 | where
515 | \begin_inset Formula $V_{2}$
516 | \end_inset
517 |
518 | and
519 | \begin_inset Formula $W_{2}$
520 | \end_inset
521 |
522 | are biorthonormal bases for the subspaces
523 | \begin_inset Formula $\mathcal{V}_{2}=\{x_{n+1},\ldots,x_{m}\}$
524 | \end_inset
525 |
526 | and
527 | \begin_inset Formula $\mathcal{W}_{2}=\{y_{n+1},\ldots,y_{m}\}$
528 | \end_inset
529 |
530 | , respectively.
531 |
532 | \series bold
533 | Hint 2:
534 | \series default
535 | The right and left eigenvectors of a diagonalizable matrix can be made
536 | biorthonormal (why?), so
537 | \begin_inset Formula $\mathcal{V}$
538 | \end_inset
539 |
540 | and
541 | \begin_inset Formula $\mathcal{W}_{2}$
542 | \end_inset
543 |
544 | are orthogonal subspaces.
545 | \end_layout
546 |
547 | \begin_layout Subsection*
548 | Problem 3: (36 points)
549 | \end_layout
550 |
551 | \begin_layout Standard
552 | The method of Generalized Minimal RESiduals (GMRES) uses
553 | \begin_inset Formula $n$
554 | \end_inset
555 |
556 | iterations of the Arnoldi method to construct a sequence of approximate
557 | solutions
558 | \begin_inset Formula $x_{1},x_{2},\ldots,x_{n}$
559 | \end_inset
560 |
561 | to the
562 | \begin_inset Formula $m\times m$
563 | \end_inset
564 |
565 | linear system
566 | \begin_inset Formula $Ax=b$
567 | \end_inset
568 |
569 | .
570 | At the
571 | \begin_inset Formula $n^{{\rm th}}$
572 | \end_inset
573 |
574 | iteration, the approximate solution
575 | \begin_inset Formula $x_{n}=Q_{n}y_{n}$
576 | \end_inset
577 |
578 | is constructed by solving the least-squares problem,
579 | \begin_inset Formula
580 | \[
581 | y_{n}={\rm argmin}_{y}\|\tilde{H}_{n}y-\|b\|e_{1}\|,
582 | \]
583 |
584 | \end_inset
585 |
586 | where
587 | \begin_inset Formula $\tilde{H}_{n}$
588 | \end_inset
589 |
590 | is an
591 | \begin_inset Formula $(n+1)\times n$
592 | \end_inset
593 |
594 | upper Hessenberg matrix and
595 | \begin_inset Formula $Q_{n}$
596 | \end_inset
597 |
598 | is the usual orthonormal basis for the Krylov subspace
599 | \begin_inset Formula $\mathcal{K}_{n}(A,b)={\rm span}\{b,Ab,A^{2}b,\ldots,A^{n-1}b\}.$
600 | \end_inset
601 |
602 |
603 | \end_layout
604 |
605 | \begin_layout Enumerate
606 | Describe an algorithm based on Givens rotations that exploits the upper
607 | Hessenberg structure of
608 | \begin_inset Formula $\tilde{H}_{n}$
609 | \end_inset
610 |
611 | to solve the
612 | \begin_inset Formula $(n+1)\times n$
613 | \end_inset
614 |
615 | least-squares problem in
616 | \begin_inset Formula $\mathcal{O}(n^{2})$
617 | \end_inset
618 |
619 | flops.
620 | \end_layout
621 |
622 | \begin_layout Enumerate
623 | If the QR factorization
624 | \begin_inset Formula $\tilde{H}_{n-1}=\Omega_{n-1}R_{n-1}$
625 | \end_inset
626 |
627 | is known from the previous iteration, explain how to update the QR factorization to
629 | \begin_inset Formula $\tilde{H}_{n}=\Omega_{n}R_{n}$
630 | \end_inset
631 |
632 | cheaply using a single Givens rotation.
633 | \end_layout
634 |
635 | \begin_layout Enumerate
636 | Using your result from part (b), explain how the solution to the least-squares
637 | problem can also be updated cheaply from the solution at the previous iteration.
638 | \end_layout
639 |
640 | \begin_layout Enumerate
641 | What is the approximate flop count for updating the least-squares solution
642 | at the
643 | \begin_inset Formula $n^{{\rm th}}$
644 | \end_inset
645 |
646 | step of GMRES? You may use big-
647 | \begin_inset Formula $O$
648 | \end_inset
649 |
650 | notation to express the asymptotic scaling in
651 | \begin_inset Formula $n$
652 | \end_inset
653 |
654 | .
655 | \end_layout
656 |
657 | \end_body
658 | \end_document
659 |
--------------------------------------------------------------------------------
/midterm/midterm2023.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/midterm/midterm2023.pdf
--------------------------------------------------------------------------------
/midterm/midterm2023soln.lyx:
--------------------------------------------------------------------------------
1 | #LyX 2.3 created this file. For more info see http://www.lyx.org/
2 | \lyxformat 544
3 | \begin_document
4 | \begin_header
5 | \save_transient_properties true
6 | \origin unavailable
7 | \textclass article
8 | \begin_preamble
9 |
10 | \renewcommand{\vec}[1]{\mathbf{#1}}
11 |
12 | \renewcommand{\labelenumi}{(\alph{enumi})}
13 | \renewcommand{\labelenumii}{(\roman{enumii})}
14 |
15 | \newcommand{\tr}{\operatorname{tr}}
16 | \end_preamble
17 | \use_default_options false
18 | \maintain_unincluded_children false
19 | \language english
20 | \language_package default
21 | \inputencoding auto
22 | \fontencoding global
23 | \font_roman "times" "default"
24 | \font_sans "default" "default"
25 | \font_typewriter "default" "default"
26 | \font_math "auto" "auto"
27 | \font_default_family default
28 | \use_non_tex_fonts false
29 | \font_sc false
30 | \font_osf false
31 | \font_sf_scale 100 100
32 | \font_tt_scale 100 100
33 | \use_microtype false
34 | \use_dash_ligatures true
35 | \graphics default
36 | \default_output_format default
37 | \output_sync 0
38 | \bibtex_command default
39 | \index_command default
40 | \paperfontsize default
41 | \spacing single
42 | \use_hyperref false
43 | \papersize default
44 | \use_geometry true
45 | \use_package amsmath 2
46 | \use_package amssymb 2
47 | \use_package cancel 1
48 | \use_package esint 0
49 | \use_package mathdots 1
50 | \use_package mathtools 1
51 | \use_package mhchem 1
52 | \use_package stackrel 1
53 | \use_package stmaryrd 1
54 | \use_package undertilde 1
55 | \cite_engine basic
56 | \cite_engine_type default
57 | \biblio_style plain
58 | \use_bibtopic false
59 | \use_indices false
60 | \paperorientation portrait
61 | \suppress_date false
62 | \justification true
63 | \use_refstyle 0
64 | \use_minted 0
65 | \index Index
66 | \shortcut idx
67 | \color #008000
68 | \end_index
69 | \topmargin 1in
70 | \secnumdepth 3
71 | \tocdepth 3
72 | \paragraph_separation indent
73 | \paragraph_indentation default
74 | \is_math_indent 0
75 | \math_numbering_side default
76 | \quotes_style english
77 | \dynamic_quotes 0
78 | \papercolumns 1
79 | \papersides 2
80 | \paperpagestyle default
81 | \tracking_changes false
82 | \output_changes false
83 | \html_math_output 0
84 | \html_css_as_file 0
85 | \html_be_strict false
86 | \end_header
87 |
88 | \begin_body
89 |
90 | \begin_layout Section*
91 | 18.335 Take-Home Midterm Exam: Spring 2023
92 | \end_layout
93 |
94 | \begin_layout Standard
95 | Posted Friday 12:30pm April 14, due
96 | \series bold
97 | 11:59pm Monday April 17.
98 | \end_layout
99 |
100 | \begin_layout Subsection*
101 | Problem 0: Honor code
102 | \end_layout
103 |
104 | \begin_layout Standard
105 | Copy and sign the following in your solutions:
106 | \end_layout
107 |
108 | \begin_layout Standard
109 |
110 | \emph on
111 | I have not used any resources to complete this exam other than my own 18.335
112 | notes, the textbook, running my own Julia code, and posted 18.335 course
113 | materials.
114 | \end_layout
115 |
116 | \begin_layout Standard
117 | \begin_inset VSpace 30pt
118 | \end_inset
119 |
120 |
121 | \end_layout
122 |
123 | \begin_layout Standard
124 | \begin_inset Tabular
125 |
126 |
127 |
128 |
129 |
130 |
131 | \begin_inset Text
132 |
133 | \begin_layout Plain Layout
134 | \begin_inset space \hspace{}
135 | \length 30col%
136 | \end_inset
137 |
138 |
139 | \end_layout
140 |
141 | \end_inset
142 | |
143 |
144 | \begin_inset Text
145 |
146 | \begin_layout Plain Layout
147 | \begin_inset CommandInset line
148 | LatexCommand rule
149 | offset "0ex"
150 | width "60col%"
151 | height "1pt"
152 |
153 | \end_inset
154 |
155 |
156 | \end_layout
157 |
158 | \end_inset
159 | |
160 |
161 |
162 |
163 | \begin_inset Text
164 |
165 | \begin_layout Plain Layout
166 |
167 | \end_layout
168 |
169 | \end_inset
170 | |
171 |
172 | \begin_inset Text
173 |
174 | \begin_layout Plain Layout
175 | your signature
176 | \end_layout
177 |
178 | \end_inset
179 | |
180 |
181 |
182 |
183 | \end_inset
184 |
185 |
186 | \end_layout
187 |
188 | \begin_layout Subsection*
189 | Problem 1: (32 points)
190 | \end_layout
191 |
192 | \begin_layout Standard
193 | Given two real vectors
194 | \begin_inset Formula $u=(u_{1},u_{2},\ldots,u_{n})^{T}$
195 | \end_inset
196 |
197 | and
198 | \begin_inset Formula $v=(v_{1},v_{2},\ldots,v_{n})^{T}$
199 | \end_inset
200 |
201 | , computing the dot product
202 | \begin_inset Formula $f(u,v)=u_{1}v_{1}+u_{2}v_{2}+\cdots+u_{n}v_{n}=u^{T}v$
203 | \end_inset
204 |
205 | in floating point arithmetic with left to right summation is backward stable.
206 | The computed dot product
207 | \begin_inset Formula $\hat{f}(u,v)$
208 | \end_inset
209 |
210 | satisfies the
211 | \emph on
212 | component-wise
213 | \emph default
214 | backward error criteria
215 | \begin_inset Formula
216 | \[
217 | \hat{f}(u,v)=(u+\delta u)^{T}v,\qquad\text{where}\ensuremath{\qquad|\delta u|\leq n\epsilon_{{\rm mach}}|u|+\mathcal{O}(\epsilon_{{\rm mach}}^{2})}.
218 | \]
219 |
220 | \end_inset
221 |
222 | The notation
223 | \begin_inset Formula $|w|$
224 | \end_inset
225 |
226 | indicates the vector
227 | \begin_inset Formula $|w|=(|w_{1}|,|w_{2}|,\ldots,|w_{n}|)^{T}$
228 | \end_inset
229 |
230 | , i.e., the vector obtained by taking the absolute value of each entry of
231 |
232 | \begin_inset Formula $w$
233 | \end_inset
234 |
235 | .
236 | \end_layout
237 |
238 | \begin_layout Enumerate
239 | Using the dot product algorithm
240 | \begin_inset Formula $\hat{f}(u,v)$
241 | \end_inset
242 |
243 | , derive an algorithm
244 | \begin_inset Formula $\hat{g}(A,b)$
245 | \end_inset
246 |
247 | for computing the matrix-vector product
248 | \begin_inset Formula $g(A,b)=Ab$
249 | \end_inset
250 |
251 | in floating point arithmetic, and show that it satisfies the component-wise
252 | backward stability criteria
253 | \begin_inset Formula
254 | \[
255 | \hat{g}(A,b)=(A+\delta A)b,\qquad\text{where\ensuremath{\qquad}|\ensuremath{\delta A|\leq n\epsilon_{{\rm mach}}|A|+\mathcal{O}(\epsilon_{{\rm mach}}^{2}),}}
256 | \]
257 |
258 | \end_inset
259 |
260 | where the notation
261 | \begin_inset Formula $|B|$
262 | \end_inset
263 |
264 | indicates the matrix obtained by taking the absolute value of each entry
265 | of
266 | \begin_inset Formula $B$
267 | \end_inset
268 |
269 | .
270 | \begin_inset VSpace medskip
271 | \end_inset
272 |
273 |
274 | \series bold
275 | \color blue
276 | Solution:
277 | \series default
278 | The
279 | \begin_inset Formula $i^{{\rm th}}$
280 | \end_inset
281 |
282 | entry of the matrix-vector product
283 | \begin_inset Formula $Ab$
284 | \end_inset
285 |
286 | is the dot product of the
287 | \begin_inset Formula $i^{{\rm th}}$
288 | \end_inset
289 |
290 | row of
291 | \begin_inset Formula $A$
292 | \end_inset
293 |
294 | with the vector
295 | \begin_inset Formula $b$
296 | \end_inset
297 |
298 | .
299 | Using the floating-point algorithm
300 | \begin_inset Formula $\hat{f}(u,v)$
301 | \end_inset
302 |
303 | for each of these dot products results in a computed vector
304 | \begin_inset Formula $\hat{g}(A,b)$
305 | \end_inset
306 |
307 | whose
308 | \begin_inset Formula $i^{{\rm th}}$
309 | \end_inset
310 |
311 | entry is
312 | \begin_inset Formula $\hat{f}(A_{i,:},b)=(A_{i,:}+\delta A_{i})b$
313 | \end_inset
314 |
315 | .
316 | Denoting the matrix whose
317 | \begin_inset Formula $i^{{\rm th}}$
318 | \end_inset
319 |
320 | row is
321 | \begin_inset Formula $\delta A_{i}$
322 | \end_inset
323 |
324 | by
325 | \begin_inset Formula $\delta A$
326 | \end_inset
327 |
328 | , we have that
329 | \begin_inset Formula $\hat{g}(A,b)=(A+\delta A)b$
330 | \end_inset
331 |
332 | as desired.
333 | The componentwise bounds on
334 | \begin_inset Formula $|\delta A|$
335 | \end_inset
336 |
337 | follow immediately from the component-wise backward error bounds for
338 | \begin_inset Formula $\hat{f}(A_{i,:},b)$
339 | \end_inset
340 |
341 | , i.e., the component-wise bounds on the rows
342 | \begin_inset Formula $\delta A_{i}$
343 | \end_inset
344 |
345 | , for
346 | \begin_inset Formula $1\leq i\leq n$
347 | \end_inset
348 |
349 | .
350 | \end_layout
351 |
352 | \begin_layout Enumerate
353 | Suppose the algorithm
354 | \begin_inset Formula $\hat{g}(A,b)$
355 | \end_inset
356 |
357 | is used to compute matrix-matrix products
358 | \begin_inset Formula $C=AB$
359 | \end_inset
360 |
361 | by computing one column of the matrix
362 | \begin_inset Formula $C$
363 | \end_inset
364 |
365 | at a time.
366 | Is the resulting floating-point algorithm
367 | \begin_inset Formula $\hat{h}(A,B)$
368 | \end_inset
369 |
370 | component-wise backward stable in the sense that there is a matrix
371 | \begin_inset Formula $\delta A$
372 | \end_inset
373 |
374 | such that
375 | \begin_inset Formula
376 | \[
377 | \hat{h}(A,B)=(A+\delta A)B,\qquad\text{where\ensuremath{\qquad|\delta A|\leq n\epsilon_{{\rm mach}}|A|}+\ensuremath{\mathcal{O}(\epsilon_{{\rm mach}}^{2})}}?
378 | \]
379 |
380 | \end_inset
381 |
382 | Explain why or why not.
383 | \begin_inset VSpace medskip
384 | \end_inset
385 |
386 |
387 | \series bold
388 | \color blue
389 | Solution:
390 | \series default
391 | We can apply the matrix-vector product algorithm
392 | \begin_inset Formula $\hat{g}(A,b)$
393 | \end_inset
394 |
395 | from part (a) to compute one column of
396 | \begin_inset Formula $C=AB$
397 | \end_inset
398 |
399 | at a time.
400 | The columns of the computed matrix
401 | \begin_inset Formula $\hat{C}=\hat{h}(A,B)$
402 | \end_inset
403 |
404 | then satisfy
405 | \begin_inset Formula $\hat{C}_{:,j}=\hat{g}(A,B_{:,j})=(A+\delta A_{j})B_{:,j}$
406 | \end_inset
407 |
408 | .
409 | The problem here is that the
410 | \begin_inset Formula $j^{{\rm th}}$
411 | \end_inset
412 |
413 | computed column of
414 | \begin_inset Formula $\hat{C}$
415 | \end_inset
416 |
417 | is the result of multiplying a column of
418 | \begin_inset Formula $B$
419 | \end_inset
420 |
421 | by a
422 | \emph on
423 | different
424 | \emph default
425 | perturbed matrix
426 | \begin_inset Formula $A+\delta A_{j}$
427 | \end_inset
428 |
429 | , so it is impossible to express
430 | \begin_inset Formula $\hat{C}$
431 | \end_inset
432 |
433 | as a product of
434 | \begin_inset Formula $B$
435 | \end_inset
436 |
437 | with a single perturbed matrix:
438 | \begin_inset Formula $\hat{C}\neq(A+\delta A)B$
439 | \end_inset
440 |
441 | for some
442 | \begin_inset Formula $\delta A$
443 | \end_inset
444 |
445 | .
446 | Matrix-matrix multiplication is
447 | \emph on
448 | not
449 | \emph default
450 | backward stable.
451 | See Higham's book (chapter 3.5) for more discussion and complementary forward
452 | error bounds.
453 | \end_layout
454 |
455 | \begin_layout Subsection*
456 | Problem 2: (32 points)
457 | \end_layout
458 |
459 | \begin_layout Standard
460 | Given an
461 | \begin_inset Formula $n$
462 | \end_inset
463 |
464 | -dimensional subspace
465 | \begin_inset Formula $\mathcal{V}$
466 | \end_inset
467 |
468 | , the standard Rayleigh–Ritz projection approximates a few (
469 | \begin_inset Formula $n\ll m$
470 | \end_inset
471 |
472 | ) eigenvalues of an
473 | \begin_inset Formula $m\times m$
474 | \end_inset
475 |
476 | matrix
477 | \begin_inset Formula $A$
478 | \end_inset
479 |
480 | by finding a scalar
481 | \begin_inset Formula $\lambda$
482 | \end_inset
483 |
484 | and
485 | \begin_inset Formula $x\in\mathcal{V}$
486 | \end_inset
487 |
488 | such that
489 | \begin_inset Formula $Ax-\lambda x\perp\mathcal{V}$
490 | \end_inset
491 |
492 | , i.e., the residual is perpendicular to the subspace.
493 | A
494 | \emph on
495 | two-sided
496 | \emph default
497 | Rayleigh–Ritz projection uses a second subspace
498 | \begin_inset Formula $\mathcal{W}$
499 | \end_inset
500 |
501 | (not orthogonal to
502 | \begin_inset Formula $\mathcal{V}$
503 | \end_inset
504 |
505 | ) and searches for a scalar
506 | \begin_inset Formula $\lambda$
507 | \end_inset
508 |
509 | and
510 | \begin_inset Formula $x\in\mathcal{V}$
511 | \end_inset
512 |
513 | such that
514 | \begin_inset Formula
515 | \begin{equation}
516 | Ax-\lambda x\perp\mathcal{\mathcal{W}},\qquad\text{and\ensuremath{\qquad x\in\mathcal{V}},}
517 | \end{equation}
518 |
519 | \end_inset
520 |
521 | i.e., the residual is perpendicular to the
522 | \emph on
523 | second
524 | \emph default
525 | subspace.
526 | In this problem,
527 | \begin_inset Formula $A$
528 | \end_inset
529 |
530 | is diagonalizable.
531 | \end_layout
532 |
533 | \begin_layout Enumerate
534 | Let
535 | \begin_inset Formula $V$
536 | \end_inset
537 |
538 | and
539 | \begin_inset Formula $W$
540 | \end_inset
541 |
542 | be a pair of bases for
543 | \begin_inset Formula $\mathcal{V}$
544 | \end_inset
545 |
546 | and
547 | \begin_inset Formula $\mathcal{W}$
548 | \end_inset
549 |
550 | , and let
551 | \begin_inset Formula $\lambda$
552 | \end_inset
553 |
554 | (finite) and
555 | \begin_inset Formula $w$
556 | \end_inset
557 |
558 | solve the eigenvalue problem
559 | \begin_inset Formula $Bw=\lambda Mw$
560 | \end_inset
561 |
562 | , where
563 | \begin_inset Formula $B=W^{T}AV$
564 | \end_inset
565 |
566 | and
567 | \begin_inset Formula $M=W^{T}V$
568 | \end_inset
569 |
570 | .
571 | Show that
572 | \begin_inset Formula $\lambda$
573 | \end_inset
574 |
575 | and
576 | \begin_inset Formula $x=Vw$
577 | \end_inset
578 |
579 | satisfy the criteria in (1).
580 | \begin_inset VSpace medskip
581 | \end_inset
582 |
583 |
584 | \series bold
585 | \color blue
586 | Solution:
587 | \series default
588 | Since the columns of
589 | \begin_inset Formula $V$
590 | \end_inset
591 |
592 | form a basis for
593 | \begin_inset Formula $\mathcal{V}$
594 | \end_inset
595 |
596 | , the vector
597 | \begin_inset Formula $x=Vw\in\mathcal{V}$
598 | \end_inset
599 |
600 | as it is a linear combination of the columns of
601 | \begin_inset Formula $V$
602 | \end_inset
603 |
604 | .
605 | On the other hand, we have that
606 | \begin_inset Formula
607 | \[
608 | Bw-\lambda Mw=W^{T}AVw-\lambda W^{T}Vw=W^{T}(Ax-\lambda x)=0,
609 | \]
610 |
611 | \end_inset
612 |
613 | which means that the residual
614 | \begin_inset Formula $Ax-\lambda x$
615 | \end_inset
616 |
617 | is orthogonal to the columns of
618 | \begin_inset Formula $W$
619 | \end_inset
620 |
621 | .
622 | Since the columns of
623 | \begin_inset Formula $W$
624 | \end_inset
625 |
626 | form a basis for
627 | \begin_inset Formula $\mathcal{W}$
628 | \end_inset
629 |
630 | , the residual is orthogonal to the whole subspace
631 | \begin_inset Formula $\mathcal{W}$
632 | \end_inset
633 |
634 | , i.e.,
635 | \begin_inset Formula $Ax-\lambda x\perp\mathcal{W}$
636 | \end_inset
637 |
638 | .
639 | \end_layout
640 |
641 | \begin_layout Enumerate
642 | Suppose that
643 | \begin_inset Formula $\mathcal{V}={\rm span}\{x_{1},\ldots,x_{n}\}$
644 | \end_inset
645 |
646 | and
647 | \begin_inset Formula $\mathcal{W={\rm span}}\{y_{1},\ldots,y_{n}\}$
648 | \end_inset
649 |
650 | , where
651 | \begin_inset Formula $Ax_{i}=\lambda_{i}x_{i}$
652 | \end_inset
653 |
654 | and
655 | \begin_inset Formula $A^{T}y_{i}=\lambda_{i}y_{i}$
656 | \end_inset
657 |
658 | for
659 | \begin_inset Formula $i=1,\ldots,n$
660 | \end_inset
661 |
662 | , are a pair of
663 | \begin_inset Formula $n$
664 | \end_inset
665 |
666 | -dimensional
667 | \emph on
668 | right and left invariant subspaces
669 | \emph default
670 | of
671 | \begin_inset Formula $A$
672 | \end_inset
673 |
674 | .
675 | If the bases
676 | \begin_inset Formula $V$
677 | \end_inset
678 |
679 | and
680 | \begin_inset Formula $W$
681 | \end_inset
682 |
683 | are chosen to be
684 | \emph on
685 | bi-orthonormal
686 | \emph default
687 | , meaning that
688 | \begin_inset Formula $W^{T}V=I$
689 | \end_inset
690 |
691 | , show that
692 | \begin_inset Formula $\lambda$
693 | \end_inset
694 |
695 | and
696 | \begin_inset Formula $x$
697 | \end_inset
698 |
699 | from part (a) are an eigenpair of the full
700 | \begin_inset Formula $m\times m$
701 | \end_inset
702 |
703 | matrix
704 | \begin_inset Formula $A$
705 | \end_inset
706 |
707 | , i.e., that
708 | \begin_inset Formula $Ax=\lambda x$
709 | \end_inset
710 |
711 | .
712 | \begin_inset VSpace medskip
713 | \end_inset
714 |
715 |
716 | \series bold
717 | \color blue
718 | Solution:
719 | \series default
720 | If the bases
721 | \begin_inset Formula $V$
722 | \end_inset
723 |
724 | and
725 | \begin_inset Formula $W$
726 | \end_inset
727 |
728 | are biorthonormal, the generalized eigenvalue problem from part (a) becomes
729 | the standard eigenvalue problem
730 | \begin_inset Formula $Bw=\lambda w$
731 | \end_inset
732 |
733 | .
734 | Following the first hint, we consider the similarity transform
735 | \begin_inset Formula
736 | \[
737 | D=\left(\begin{array}{cc}
738 | W & W_{2}\end{array}\right)^{T}A\left(\begin{array}{cc}
739 | V & V_{2}\end{array}\right)=\left(\begin{array}{cc}
740 | W^{T}AV & W^{T}AV_{2}\\
741 | W_{2}^{T}AV & W_{2}^{T}AV_{2}
742 | \end{array}\right).
743 | \]
744 |
745 | \end_inset
746 |
747 | First, we can verify that this is indeed a similarity transform because
748 |
749 | \begin_inset Formula $[W\quad W_{2}]^{T}[V\quad V_{2}]=I$
750 | \end_inset
751 |
752 | by the biorthogonality conditions and, therefore,
753 | \begin_inset Formula $[W\quad W_{2}]^{T}=[V\quad V_{2}]^{-1}$
754 | \end_inset
755 |
756 | .
757 | Similar matrices have the same eigenvalues, so
758 | \begin_inset Formula $D$
759 | \end_inset
760 |
761 | and
762 | \begin_inset Formula $A$
763 | \end_inset
764 |
765 | have the same eigenvalues.
766 | Second, notice that the upper left block is the matrix
767 | \begin_inset Formula $W^{T}AV=B$
768 | \end_inset
769 |
770 | .
771 | What about the remaining blocks? By the second hint,
772 | \begin_inset Formula $\mathcal{V}$
773 | \end_inset
774 |
775 | and
776 | \begin_inset Formula $\mathcal{W}$
777 | \end_inset
778 |
779 | are orthogonal to
780 | \begin_inset Formula $\mathcal{W}_{2}$
781 | \end_inset
782 |
783 | and
784 | \begin_inset Formula $\mathcal{V}_{2}$
785 | \end_inset
786 |
787 | , respectively.
788 | Now,
789 | \begin_inset Formula $\mathcal{V}$
790 | \end_inset
791 |
792 | and
793 | \begin_inset Formula $\mathcal{W}$
794 | \end_inset
795 |
796 | are right and left invariant subspaces of
797 | \begin_inset Formula $A$
798 | \end_inset
799 |
800 | so the columns of
801 | \begin_inset Formula $AV$
802 | \end_inset
803 |
804 | are vectors in
805 | \begin_inset Formula $\mathcal{V}$
806 | \end_inset
807 |
808 | and the rows of
809 | \begin_inset Formula $W^{T}A$
810 | \end_inset
811 |
812 | are vectors in
813 | \begin_inset Formula $\mathcal{W}$
814 | \end_inset
815 |
816 | .
817 | Therefore, the off-diagonal blocks vanish because the columns of
818 | \begin_inset Formula $AV$
819 | \end_inset
820 |
821 | are orthogonal to the rows of
822 | \begin_inset Formula $W_{2}^{T}$
823 | \end_inset
824 |
825 | and the rows of
826 | \begin_inset Formula $W^{T}A$
827 | \end_inset
828 |
829 | are orthogonal to the columns of
830 | \begin_inset Formula $V_{2}$
831 | \end_inset
832 |
833 | .
834 | The eigenvalues of a block diagonal matrix are the eigenvalues of the diagonal
835 | blocks, so the eigenvalues of the upper left block
836 | \begin_inset Formula $B$
837 | \end_inset
838 |
839 | are eigenvalues of the full matrix
840 | \begin_inset Formula $D$
841 | \end_inset
842 |
843 | , which are eigenvalues of
844 | \begin_inset Formula $A$
845 | \end_inset
846 |
847 | by similarity.
848 | Therefore, if
849 | \begin_inset Formula $\lambda$
850 | \end_inset
851 |
852 | is an eigenvalue of
853 | \begin_inset Formula $B$
854 | \end_inset
855 |
856 | , it is also an eigenvalue of
857 | \begin_inset Formula $A$
858 | \end_inset
859 |
860 | .
861 |
862 | \begin_inset VSpace medskip
863 | \end_inset
864 |
865 | How are the eigenvectors of
866 | \begin_inset Formula $B$
867 | \end_inset
868 |
869 | related to eigenvectors of
870 | \begin_inset Formula $A$
871 | \end_inset
872 |
873 | ? First, by similarity, the right eigenvectors of
874 | \begin_inset Formula $A$
875 | \end_inset
876 |
877 | are related to those of
878 | \begin_inset Formula $D$
879 | \end_inset
880 |
881 | by
882 | \begin_inset Formula
883 | \[
884 | x_{i}=\left(\begin{array}{cc}
885 | V & V_{2}\end{array}\right)\chi_{i},\qquad\text{where\ensuremath{\qquad D\chi_{i}=\lambda_{i}\chi_{i}.}}
886 | \]
887 |
888 | \end_inset
889 |
890 | Consider the vector
891 | \begin_inset Formula $\chi=[w\quad0]^{T}$
892 | \end_inset
893 |
894 | of length
895 | \begin_inset Formula $m$
896 | \end_inset
897 |
898 | , and, using that
899 | \begin_inset Formula $Bw=\lambda w$
900 | \end_inset
901 |
902 | , calculate directly that
903 | \begin_inset Formula
904 | \[
905 | D\chi=\left(\begin{array}{cc}
906 | W^{T}AV\\
907 | & W_{2}^{T}AV_{2}
908 | \end{array}\right)\left(\begin{array}{c}
909 | w\\
910 | 0
911 | \end{array}\right)=\left(\begin{array}{c}
912 | Bw\\
913 | 0
914 | \end{array}\right)=\lambda\left(\begin{array}{c}
915 | w\\
916 | 0
917 | \end{array}\right).
918 | \]
919 |
920 | \end_inset
921 |
922 | So,
923 | \begin_inset Formula $\chi=[w\quad0]^{T}$
924 | \end_inset
925 |
926 | is an eigenvector of
927 | \begin_inset Formula $D$
928 | \end_inset
929 |
930 | with eigenvalue
931 | \begin_inset Formula $\lambda$
932 | \end_inset
933 |
934 | , and therefore, using the connection between eigenvectors of similar matrices
935 | from above, we have that
936 | \begin_inset Formula
937 | \[
938 | \left(\begin{array}{cc}
939 | V & V_{2}\end{array}\right)\left(\begin{array}{c}
940 | w\\
941 | 0
942 | \end{array}\right)=Vw=x
943 | \]
944 |
945 | \end_inset
946 |
947 | is an eigenvector of
948 | \begin_inset Formula $A$
949 | \end_inset
950 |
951 | with eigenvalue
952 | \begin_inset Formula $\lambda$
953 | \end_inset
954 |
955 | .
956 | \begin_inset VSpace defskip
957 | \end_inset
958 |
959 | There is an alternative elegant way to prove the statement using orthogonality
960 | relations for the residual.
961 | From part (a) we know that
962 | \begin_inset Formula $Ax-\lambda x\perp\mathcal{W}$
963 | \end_inset
964 |
965 | when
966 | \begin_inset Formula $x=Vw$
967 | \end_inset
968 |
969 | and
970 | \begin_inset Formula $(\lambda,w)$
971 | \end_inset
972 |
973 | solves
974 | \begin_inset Formula $Bw=\lambda Mw$
975 | \end_inset
976 |
977 | .
978 | If
979 | \begin_inset Formula $\mathcal{V}$
980 | \end_inset
981 |
982 | is also invariant under
983 | \begin_inset Formula $A$
984 | \end_inset
985 |
986 | , then we also have that
987 | \begin_inset Formula $Ax-\lambda x\in\mathcal{V}$
988 | \end_inset
989 |
990 | .
991 | This implies
992 | \begin_inset Formula $Ax-\lambda x\perp\mathcal{W}_{2}$
993 | \end_inset
994 |
995 | because
996 | \begin_inset Formula $\mathcal{V}$
997 | \end_inset
998 |
999 | and
1000 | \begin_inset Formula $\mathcal{W}_{2}$
1001 | \end_inset
1002 |
1003 | are orthogonal subspaces.
1004 | Since
1005 | \begin_inset Formula $A$
1006 | \end_inset
1007 |
1008 | is diagonalizable,
1009 | \begin_inset Formula $\mathcal{W}\oplus\mathcal{W}_{2}=\mathbb{R}^{m}$
1010 | \end_inset
1011 |
1012 | so the only vector orthogonal to both is the zero vector, which means that
1013 |
1014 | \begin_inset Formula $Ax-\lambda x=0$
1015 | \end_inset
1016 |
1017 | .
1018 | \end_layout
1019 |
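As a quick numerical sanity check of this result, here is a minimal Julia sketch (the random matrix, the dimensions, and the variable names below are illustrative, not part of the exam): build biorthonormal bases from the eigenvectors of a diagonalizable matrix and verify that the eigenvalues of B = W^T A V are a subset of the eigenvalues of A.

```julia
using LinearAlgebra

m, n = 8, 3
Λ = Diagonal(1.0:m)          # distinct eigenvalues 1, 2, ..., m
S = randn(m, m)              # generically invertible eigenvector matrix
A = S * Λ / S                # diagonalizable A = S Λ S⁻¹
V = S[:, 1:n]                # right eigenvectors spanning 𝒱
W = (inv(S)')[:, 1:n]        # left eigenvectors; biorthonormal, WᵀV = I
B = W' * A * V               # compression from the similarity transform
sort(real(eigvals(B)))       # ≈ [1.0, 2.0, 3.0], a subset of eigvals(A)
```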
1020 | \begin_layout Standard
1021 |
1022 | \series bold
1023 | Hint 1:
1024 | \series default
1025 | In part (b), consider the similarity transform
1026 | \begin_inset Formula $[W\quad W_{2}]^{T}A[V\quad V_{2}],$
1027 | \end_inset
1028 |
1029 | where
1030 | \begin_inset Formula $V_{2}$
1031 | \end_inset
1032 |
1033 | and
1034 | \begin_inset Formula $W_{2}$
1035 | \end_inset
1036 |
1037 | are biorthonormal bases for the subspaces
1038 | \begin_inset Formula $\mathcal{V}_{2}={\rm span}\{x_{n+1},\ldots,x_{m}\}$
1039 | \end_inset
1040 |
1041 | and
1042 | \begin_inset Formula $\mathcal{W}_{2}={\rm span}\{y_{n+1},\ldots,y_{m}\}$
1043 | \end_inset
1044 |
1045 | , respectively.
1046 |
1047 | \series bold
1048 | Hint 2:
1049 | \series default
1050 | The right and left eigenvectors of a diagonalizable matrix can be made
1051 | biorthonormal (why?), so
1052 | \begin_inset Formula $\mathcal{V}$
1053 | \end_inset
1054 |
1055 | and
1056 | \begin_inset Formula $\mathcal{W}_{2}$
1057 | \end_inset
1058 |
1059 | are orthogonal subspaces.
1060 | \end_layout
1061 |
1062 | \begin_layout Subsection*
1063 | Problem 3: (36 points)
1064 | \end_layout
1065 |
1066 | \begin_layout Standard
1067 | The method of Generalized Minimal RESiduals (GMRES) uses
1068 | \begin_inset Formula $n$
1069 | \end_inset
1070 |
1071 | iterations of the Arnoldi method to construct a sequence of approximate
1072 | solutions
1073 | \begin_inset Formula $x_{1},x_{2},\ldots,x_{n}$
1074 | \end_inset
1075 |
1076 | to the
1077 | \begin_inset Formula $m\times m$
1078 | \end_inset
1079 |
1080 | linear system
1081 | \begin_inset Formula $Ax=b$
1082 | \end_inset
1083 |
1084 | .
1085 | At the
1086 | \begin_inset Formula $n^{{\rm th}}$
1087 | \end_inset
1088 |
1089 | iteration, the approximate solution
1090 | \begin_inset Formula $x_{n}=Q_{n}y_{n}$
1091 | \end_inset
1092 |
1093 | is constructed by solving the least-squares problem,
1094 | \begin_inset Formula
1095 | \[
1096 | y_{n}={\rm argmin}_{y}\|\tilde{H}_{n}y-\|b\|e_{1}\|,
1097 | \]
1098 |
1099 | \end_inset
1100 |
1101 | where
1102 | \begin_inset Formula $\tilde{H}_{n}$
1103 | \end_inset
1104 |
1105 | is an
1106 | \begin_inset Formula $(n+1)\times n$
1107 | \end_inset
1108 |
1109 | upper Hessenberg matrix and
1110 | \begin_inset Formula $Q_{n}$
1111 | \end_inset
1112 |
1113 | is the usual orthonormal basis for the Krylov subspace
1114 | \begin_inset Formula $\mathcal{K}_{n}(A,b)={\rm span}\{b,Ab,A^{2}b,\ldots,A^{n-1}b\}.$
1115 | \end_inset
1116 |
1117 |
1118 | \end_layout
1119 |
1120 | \begin_layout Enumerate
1121 | Describe an algorithm based on Givens rotations that exploits the upper
1122 | Hessenberg structure of
1123 | \begin_inset Formula $\tilde{H}_{n}$
1124 | \end_inset
1125 |
1126 | to solve the
1127 | \begin_inset Formula $(n+1)\times n$
1128 | \end_inset
1129 |
1130 | least-squares problem in
1131 | \begin_inset Formula $\mathcal{O}(n^{2})$
1132 | \end_inset
1133 |
1134 | flops.
1135 | \begin_inset VSpace medskip
1136 | \end_inset
1137 |
1138 |
1139 | \series bold
1140 | \color blue
1141 | Solution:
1142 | \series default
1143 | The
1144 | \begin_inset Formula $(n+1)\times n$
1145 | \end_inset
1146 |
1147 | upper Hessenberg matrix
1148 | \begin_inset Formula $\tilde{H}_{n}$
1149 | \end_inset
1150 |
1151 | has
1152 | \begin_inset Formula $n$
1153 | \end_inset
1154 |
1155 | (potentially) nonzero entries on the subdiagonal.
1156 | We can compute its QR factorization efficiently by applying Givens rotations
1157 | to eliminate these subdiagonal entries and triangularize
1158 | \begin_inset Formula $\tilde{H}_{n}$
1159 | \end_inset
1160 |
1161 | .
1162 | We begin by applying a Givens rotation,
1163 | \begin_inset Formula $G_{1}$
1164 | \end_inset
1165 |
1166 | , that mixes the first two rows in order to eliminate the
1167 | \begin_inset Formula $(2,1)$
1168 | \end_inset
1169 |
1170 | entry:
1171 | \begin_inset Formula
1172 | \[
1173 | \tilde{H}_{n}=\left(\begin{array}{ccccc}
1174 | \times & \times & \times & \cdots & \times\\
1175 | \times & \times & \times & \cdots & \times\\
1176 | & \times & \times & \cdots & \times\\
1177 | & & \ddots & \ddots & \vdots\\
1178 | & & & \times & \times\\
1179 | & & & & \times
1180 | \end{array}\right)\qquad\implies\qquad G_{1}\tilde{H}_{n}=\left(\begin{array}{ccccc}
1181 | \boxtimes & \boxtimes & \boxtimes & \cdots & \boxtimes\\
1182 | 0 & \boxtimes & \boxtimes & \cdots & \boxtimes\\
1183 | & \times & \times & \cdots & \times\\
1184 | & & \ddots & \ddots & \vdots\\
1185 | & & & \times & \times\\
1186 | & & & & \times
1187 | \end{array}\right).
1188 | \]
1189 |
1190 | \end_inset
1191 |
1192 | Note that only the first two rows are affected by the first Givens rotation
1193 | and no new nonzeros appear below the first subdiagonal.
1194 | Next, we apply a Givens rotation,
1195 | \begin_inset Formula $G_{2}$
1196 | \end_inset
1197 |
1198 | , that mixes the second and third rows in order to eliminate the
1199 | \begin_inset Formula $(3,2)$
1200 | \end_inset
1201 |
1202 | entry:
1203 | \begin_inset Formula
1204 | \[
1205 | G_{1}\tilde{H}_{n}=\left(\begin{array}{ccccc}
1206 | \times & \times & \times & \cdots & \times\\
1207 | 0 & \times & \times & \cdots & \times\\
1208 | & \times & \times & \cdots & \times\\
1209 | & & \ddots & \ddots & \vdots\\
1210 | & & & \times & \times\\
1211 | & & & & \times
1212 | \end{array}\right)\qquad\implies\qquad G_{2}G_{1}\tilde{H}_{n}=\left(\begin{array}{ccccc}
1213 | \times & \times & \times & \cdots & \times\\
1214 | 0 & \boxtimes & \boxtimes & \cdots & \boxtimes\\
1215 | & 0 & \boxtimes & \cdots & \boxtimes\\
1216 | & & \ddots & \ddots & \vdots\\
1217 | & & & \times & \times\\
1218 | & & & & \times
1219 | \end{array}\right).
1220 | \]
1221 |
1222 | \end_inset
1223 |
1224 | Note that only the second and third rows are affected by the second Givens
1225 | rotation and there is no fill-in (the introduction of
1226 | \begin_inset Quotes eld
1227 | \end_inset
1228 |
1229 | unwanted
1230 | \begin_inset Quotes erd
1231 | \end_inset
1232 |
1233 | nonzeros) below the diagonal.
1234 | We continue applying Givens rotations, eliminating the
1235 | \begin_inset Formula $(k+1,k)$
1236 | \end_inset
1237 |
1238 | entry with
1239 | \begin_inset Formula $G_{k}$
1240 | \end_inset
1241 |
1242 | , which mixes rows
1243 | \begin_inset Formula $k$
1244 | \end_inset
1245 |
1246 | and
1247 | \begin_inset Formula $k+1$
1248 | \end_inset
1249 |
1250 | at the
1251 | \begin_inset Formula $k$
1252 | \end_inset
1253 |
1254 | th step.
1255 | After
1256 | \begin_inset Formula $n-1$
1257 | \end_inset
1258 |
1259 | Givens rotations, we apply a final Givens rotation to eliminate the single
1260 | nonzero entry in the last row of the rectangular Hessenberg matrix
1261 | \begin_inset Formula $\tilde{H}_{n}$
1262 | \end_inset
1263 |
1264 | :
1265 | \begin_inset Formula
1266 | \[
1267 | G_{n-1}\cdots G_{1}\tilde{H}_{n}=\left(\begin{array}{ccccc}
1268 | \times & \times & \times & \cdots & \times\\
1269 | 0 & \times & \times & \cdots & \times\\
1270 | & 0 & \times & \cdots & \times\\
1271 | & & \ddots & \ddots & \vdots\\
1272 | & & & 0 & \times\\
1273 | & & & & \times
1274 | \end{array}\right)\qquad\implies\qquad G_{n}\cdots G_{1}\tilde{H}_{n}=\left(\begin{array}{ccccc}
1275 | \times & \times & \times & \cdots & \times\\
1276 | 0 & \times & \times & \cdots & \times\\
1277 | & 0 & \times & \cdots & \times\\
1278 | & & \ddots & \ddots & \vdots\\
1279 | & & & 0 & \boxtimes\\
1280 | & & & & 0
1281 | \end{array}\right).
1282 | \]
1283 |
1284 | \end_inset
1285 |
1286 | Now,
1287 | \begin_inset Formula $G_{n}\ldots G_{1}\tilde{H}_{n}=R_{n}$
1288 | \end_inset
1289 |
1290 | is an
1291 | \begin_inset Formula $(n+1)\times n$
1292 | \end_inset
1293 |
1294 | upper triangular matrix,
1295 | \begin_inset Formula $\Omega_{n}=G_{1}^{T}\ldots G_{n}^{T}$
1296 | \end_inset
1297 |
1298 | is an
1299 | \begin_inset Formula $(n+1)\times(n+1)$
1300 | \end_inset
1301 |
1302 | orthogonal matrix (usually
1303 | \emph on
1304 | not
1305 | \emph default
1306 | stored explicitly), and
1307 | \begin_inset Formula $\tilde{H}_{n}=\Omega_{n}R_{n}$
1308 | \end_inset
1309 |
1310 | .
1311 | We can use the QR factorization to solve the least squares problem in the
1312 | usual way by applying the Givens rotations to the right-hand side,
1313 | \begin_inset Formula $d=\|b\|\Omega_{n}^{T}e_{1}=\|b\|G_{n}\ldots G_{1}e_{1}$
1314 | \end_inset
1315 |
1316 | , and solving the
1317 | \begin_inset Formula $n\times n$
1318 | \end_inset
1319 |
1320 | triangular system
1321 | \begin_inset Formula $(R_{n})_{1:n,1:n}y_{n}=d_{1:n}$
1322 | \end_inset
1323 |
1324 | with backsubstitution.
1325 | The
1326 | \begin_inset Formula $k$
1327 | \end_inset
1328 |
1329 | th step of the QR factorization of
1330 | \begin_inset Formula $\tilde{H}_{n}$
1331 | \end_inset
1332 |
1333 | requires
1334 | \begin_inset Formula $\mathcal{O}(n-k)$
1335 | \end_inset
1336 |
1337 | flops because rows of length
1338 | \begin_inset Formula $n-k+1$
1339 | \end_inset
1340 |
1341 | are combined by the Givens rotation
1342 | \begin_inset Formula $G_{k}$
1343 | \end_inset
1344 |
1345 | .
1346 | After
1347 | \begin_inset Formula $n$
1348 | \end_inset
1349 |
1350 | steps, the total flop count is
1351 | \begin_inset Formula $\mathcal{O}(n^{2})$
1352 | \end_inset
1353 |
1354 | .
1355 | Applying the
1356 | \begin_inset Formula $n$
1357 | \end_inset
1358 |
1359 | Givens rotations to
1360 | \begin_inset Formula $e_{1}$
1361 | \end_inset
1362 |
1363 | costs
1364 | \begin_inset Formula $\mathcal{O}(n)$
1365 | \end_inset
1366 |
1367 | flops and backsubstitution for the triangular system costs
1368 | \begin_inset Formula $\mathcal{O}(n^{2})$
1369 | \end_inset
1370 |
1371 | flops.
1372 | Therefore, the total cost of computing the least-squares solution is
1373 | \begin_inset Formula $\mathcal{O}(n^{2})$
1374 | \end_inset
1375 |
1376 | .
1377 | \end_layout
1378 |
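For concreteness, here is a minimal Julia sketch of this procedure (the function name hess_lstsq and its interface are illustrative, not part of the exam): it triangularizes the rectangular Hessenberg matrix with n Givens rotations, rotates the right-hand side along the way, and finishes with backsubstitution, for O(n²) flops in total.

```julia
using LinearAlgebra

# Solve min_y ||H*y - β*e1|| for an (n+1) x n upper Hessenberg H in O(n²) flops.
function hess_lstsq(H::AbstractMatrix, β::Real)
    np1, n = size(H)
    @assert np1 == n + 1
    R = float.(H)                       # working copy, triangularized in place
    d = zeros(n + 1); d[1] = β          # right-hand side β*e1
    for k in 1:n
        # Givens rotation on rows k, k+1 that zeros the subdiagonal entry R[k+1,k]
        r = hypot(R[k, k], R[k+1, k])
        c, s = R[k, k] / r, R[k+1, k] / r
        for j in k:n                    # O(n-k) work: columns left of k are already zero
            R[k, j], R[k+1, j] = c*R[k, j] + s*R[k+1, j], -s*R[k, j] + c*R[k+1, j]
        end
        d[k], d[k+1] = c*d[k] + s*d[k+1], -s*d[k] + c*d[k+1]
    end
    return UpperTriangular(R[1:n, 1:n]) \ d[1:n]   # O(n²) backsubstitution
end

# agrees with a dense least-squares solve on a random Hessenberg example
n = 6
H = triu(randn(n + 1, n), -1)
e1 = [1.0; zeros(n)]
norm(hess_lstsq(H, 1.0) - H \ e1)      # should be near machine precision
```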
1379 | \begin_layout Enumerate
1380 | If the QR factorization
1381 | \begin_inset Formula $\tilde{H}_{n-1}=\Omega_{n-1}R_{n-1}$
1382 | \end_inset
1383 |
1384 | is known from the previous iteration, explain how to update the QR factorization
1385 | to
1386 | \begin_inset Formula $\tilde{H}_{n}=\Omega_{n}R_{n}$
1387 | \end_inset
1388 |
1389 | cheaply using a single Givens rotation.
1390 | \begin_inset VSpace medskip
1391 | \end_inset
1392 |
1393 |
1394 | \series bold
1395 | \color blue
1396 | Solution:
1397 | \series default
1398 | If the QR factorization is known at the previous iteration, we can write
1399 |
1400 | \begin_inset Formula $\tilde{H}_{n}$
1401 | \end_inset
1402 |
1403 | in the block form
1404 | \begin_inset Formula
1405 | \[
1406 | \tilde{H}_{n}=\left(\begin{array}{cc}
1407 | \Omega_{n-1}R_{n-1} & h_{1:n,n}\\
1408 |  & h_{n+1,n}
1409 | \end{array}\right)=\left(\begin{array}{cc}
1410 | \Omega_{n-1}\\
1411 | & 1
1412 | \end{array}\right)\left(\begin{array}{cc}
1413 | R_{n-1} & \Omega_{n-1}^{T}h_{1:n,n}\\
1414 |  & h_{n+1,n}
1415 | \end{array}\right).
1416 | \]
1417 |
1418 | \end_inset
1419 |
1420 | Using the full QR decomposition (as in part (a)), note that
1421 | \begin_inset Formula $R_{n-1}$
1422 | \end_inset
1423 |
1424 | is an
1425 | \begin_inset Formula $n\times(n-1)$
1426 | \end_inset
1427 |
1428 | rectangular upper triangular matrix and
1429 | \begin_inset Formula $\Omega_{n-1}$
1430 | \end_inset
1431 |
1432 | is a
1433 | \begin_inset Formula $n\times n$
1434 | \end_inset
1435 |
1436 | orthogonal matrix.
1437 | Therefore, the first factor is an
1438 | \begin_inset Formula $(n+1)\times(n+1)$
1439 | \end_inset
1440 |
1441 | orthogonal matrix and the first
1442 | \begin_inset Formula $n-1$
1443 | \end_inset
1444 |
1445 | columns of the second factor are already upper triangular.
1446 | It remains to apply a single additional Givens rotation to the second factor,
1447 | mixing the last two rows to eliminate the single subdiagonal entry
1448 | \begin_inset Formula $h_{n+1,n}$
1449 | \end_inset
1450 |
1451 | .
1452 | We start with the structure
1453 | \begin_inset Formula
1454 | \[
1455 | \left(\begin{array}{cc}
1456 | R_{n-1} & \Omega_{n-1}^{T}h_{1:n,n}\\
1457 |  & h_{n+1,n}
1458 | \end{array}\right)=\left(\begin{array}{ccccc}
1459 | \times & \times & \times & \cdots & \times\\
1460 | 0 & \times & \times & \cdots & \times\\
1461 | & 0 & \times & \cdots & \times\\
1462 | & & \ddots & \ddots & \vdots\\
1463 | & & & 0 & \times\\
1464 | & & & & \times
1465 | \end{array}\right),
1466 | \]
1467 |
1468 | \end_inset
1469 |
1470 | and end up with the structure
1471 | \begin_inset Formula
1472 | \[
1473 | G\left(\begin{array}{cc}
1474 | R_{n-1} & \Omega_{n-1}^{T}h_{1:n,n}\\
1475 |  & h_{n+1,n}
1476 | \end{array}\right)=\left(\begin{array}{ccccc}
1477 | \times & \times & \times & \cdots & \times\\
1478 | 0 & \times & \times & \cdots & \times\\
1479 | & 0 & \times & \cdots & \times\\
1480 | & & \ddots & \ddots & \vdots\\
1481 | & & & 0 & \boxtimes\\
1482 | & & & & 0
1483 | \end{array}\right)
1484 | \]
1485 |
1486 | \end_inset
1487 |
1488 | Since Givens rotations are orthogonal matrices, we have that
1489 | \begin_inset Formula $G^{T}G=I$
1490 | \end_inset
1491 |
1492 | , and we can reformulate
1493 | \begin_inset Formula
1494 | \[
1495 | \tilde{H}_{n}=\left(\begin{array}{cc}
1496 | \Omega_{n-1}R_{n-1} & h_{1:n,n}\\
1497 |  & h_{n+1,n}
1498 | \end{array}\right)=\left(\begin{array}{cc}
1499 | \Omega_{n-1}\\
1500 | & 1
1501 | \end{array}\right)G^{T}G\left(\begin{array}{cc}
1502 | R_{n-1} & \Omega_{n-1}^{T}h_{1:n,n}\\
1503 |  & h_{n+1,n}
1504 | \end{array}\right).
1505 | \]
1506 |
1507 | \end_inset
1508 |
1509 | The product of the first two matrices on the right is the orthogonal matrix
1510 |
1511 | \begin_inset Formula $\Omega_{n}$
1512 | \end_inset
1513 |
1514 | and the product of the second two matrices on the right is the triangular
1515 | matrix
1516 | \begin_inset Formula $R_{n}$
1517 | \end_inset
1518 |
1519 | .
1520 | Note that computing the updated QR factorization means applying the previous
1521 | Givens rotations to the new column
1522 | \begin_inset Formula $h_{1:n,n}$
1523 | \end_inset
1524 |
1525 | , i.e., computing
1526 | \begin_inset Formula $\Omega_{n-1}^{T}h_{1:n,n}$
1527 | \end_inset
1528 |
1529 | , and then computing and applying the new Givens rotation
1530 | \begin_inset Formula $G$
1531 | \end_inset
1532 |
1533 | .
1534 | The total cost of the update is
1535 | \begin_inset Formula $\mathcal{O}(n)$
1536 | \end_inset
1537 |
1538 | flops.
1539 | \end_layout
1540 |
1541 | \begin_layout Enumerate
1542 | Using your result from part (b), explain how the solution to the least-squares
1543 | problem can also be updated cheaply from the solution at the previous iteration.
1544 | \begin_inset VSpace medskip
1545 | \end_inset
1546 |
1547 |
1548 | \series bold
1549 | \color blue
1550 | Solution:
1551 | \series default
1552 | After computing
1553 | \begin_inset Formula $\tilde{H}_{n}=\Omega_{n}R_{n}$
1554 | \end_inset
1555 |
1556 | using the fast update in part (b), we simply solve the triangular system
1557 |
1558 | \begin_inset Formula $(R_{n})_{1:n,1:n}y_{n}=d_{1:n}^{(n)}$
1559 | \end_inset
1560 |
1561 | , where
1562 | \begin_inset Formula
1563 | \[
1564 | d^{(n)}=\|b\|\Omega_{n}^{T}e_{1}=\|b\|G\left(\begin{array}{cc}
1565 | \Omega_{n-1}^{T}\\
1566 | & 1
1567 | \end{array}\right)e_{1}=G\left(\begin{array}{c}
1568 | d^{(n-1)}\\
1569 | 0
1570 | \end{array}\right).
1571 | \]
1572 |
1573 | \end_inset
1574 |
1575 | In other words, we apply the new Givens rotation
1576 | \begin_inset Formula $G$
1577 | \end_inset
1578 |
1579 | (from the QR update) to update the right-hand side from
1580 | \begin_inset Formula $d^{(n-1)}$
1581 | \end_inset
1582 |
1583 | to
1584 | \begin_inset Formula $d^{(n)}$
1585 | \end_inset
1586 |
1587 | and then solve the new triangular system by backsubstitution as usual.
1588 | \end_layout
1589 |
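Parts (b) and (c) together amount to a short per-iteration update. Here is a minimal Julia sketch (the name update_qr! and the in-place interface are illustrative, not part of the exam): cs and sn hold the cosines and sines of the previously applied Givens rotations, h is the new length-(n+1) Hessenberg column, and d is the rotated right-hand side from the previous step.

```julia
# One GMRES least-squares update step in O(n) flops (sketch).
function update_qr!(cs, sn, h, d)
    n = length(cs) + 1                # h is the n-th Hessenberg column
    for k in 1:n-1                    # apply the n-1 stored rotations to h
        h[k], h[k+1] = cs[k]*h[k] + sn[k]*h[k+1], -sn[k]*h[k] + cs[k]*h[k+1]
    end
    r = hypot(h[n], h[n+1])           # new rotation that zeros h[n+1]
    c, s = h[n]/r, h[n+1]/r
    h[n], h[n+1] = r, zero(r)
    push!(cs, c); push!(sn, s)        # store the new rotation
    push!(d, zero(r))                 # extend the rhs with a zero, then rotate
    d[n], d[n+1] = c*d[n] + s*d[n+1], -s*d[n] + c*d[n+1]
    return h                          # h[1:n] is the new last column of R_n
end
```

As a bonus, after the update the magnitude of d[n+1] equals the current residual norm, so GMRES can monitor convergence without forming the approximate solution at every step.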
1590 | \begin_layout Enumerate
1591 | What is the approximate flop count for updating the least-squares solution
1592 | at the
1593 | \begin_inset Formula $n^{{\rm th}}$
1594 | \end_inset
1595 |
1596 | step of GMRES? You may use big-
1597 | \begin_inset Formula $O$
1598 | \end_inset
1599 |
1600 | notation to express the asymptotic scaling in
1601 | \begin_inset Formula $n$
1602 | \end_inset
1603 |
1604 | .
1605 | \begin_inset VSpace medskip
1606 | \end_inset
1607 |
1608 |
1609 | \series bold
1610 | \color blue
1611 | Solution:
1612 | \series default
1613 | In part (a), both the Hessenberg QR factorization and the solution of the
1614 | triangular system were
1615 | \begin_inset Formula $\mathcal{O}(n^{2})$
1616 | \end_inset
1617 |
1618 | flops.
1619 | Using the fast QR update from part (b), we can reduce the cost of the QR
1620 | factorization, but the solution of the triangular system remains at
1621 | \begin_inset Formula $\mathcal{O}(n^{2})$
1622 | \end_inset
1623 |
1624 | flops.
1625 | Therefore, updating the least-squares solution at the
1626 | \begin_inset Formula $n$
1627 | \end_inset
1628 |
1629 | th step of GMRES remains
1630 | \begin_inset Formula $\mathcal{O}(n^{2})$
1631 | \end_inset
1632 |
1633 | .
1634 | Note that both the
1635 | \begin_inset Formula $m\times m$
1636 | \end_inset
1637 |
1638 | matrix-vector multiplication and the
1639 | \begin_inset Formula $\mathcal{O}(mn^{2})$
1640 | \end_inset
1641 |
1642 | orthogonalization cost of the Arnoldi process are typically much more expensive
1643 | than the
1644 | \begin_inset Formula $\mathcal{O}(n^{2})$
1645 | \end_inset
1646 |
1647 | cost of the least-squares update in GMRES, since
1648 | \begin_inset Formula $n\ll m$
1649 | \end_inset
1650 |
1651 | in almost all practical situations.
1652 | \end_layout
1653 |
1654 | \end_body
1655 | \end_document
1656 |
--------------------------------------------------------------------------------
/midterm/midterm2023soln.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/midterm/midterm2023soln.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture1.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture10.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture10.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture11.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture11.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture12.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture12.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture13.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture13.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture14.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture14.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture15.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture15.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture16.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture16.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture17.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture17.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture18.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture18.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture19.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture19.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture2.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture20.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture20.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture21.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture21.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture22.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture22.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture3.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture4.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture4.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture5.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture5.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture6.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture6.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture7.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture7.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture8.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture8.pdf
--------------------------------------------------------------------------------
/notes/18.335_Lecture9.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/18.335_Lecture9.pdf
--------------------------------------------------------------------------------
/notes/Is-Gaussian-Elimination-Unstable.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "b4738d86",
6 | "metadata": {},
7 | "source": [
8 | "Gaussian elimination for $A = LU$ may not be backward stable when $||U|| >> ||A||$. Here is a special matrix for which this occurs."
9 | ]
10 | },
11 | {
12 | "cell_type": "code",
13 | "execution_count": 2,
14 | "id": "6103f719",
15 | "metadata": {},
16 | "outputs": [],
17 | "source": [
18 | "using LinearAlgebra"
19 | ]
20 | },
21 | {
22 | "cell_type": "code",
23 | "execution_count": 3,
24 | "id": "93919aaa",
25 | "metadata": {},
26 | "outputs": [
27 | {
28 | "data": {
29 | "text/plain": [
30 | "8×8 Matrix{Float64}:\n",
31 | " 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0\n",
32 | " -1.0 1.0 0.0 0.0 0.0 0.0 0.0 1.0\n",
33 | " -1.0 -1.0 1.0 0.0 0.0 0.0 0.0 1.0\n",
34 | " -1.0 -1.0 -1.0 1.0 0.0 0.0 0.0 1.0\n",
35 | " -1.0 -1.0 -1.0 -1.0 1.0 0.0 0.0 1.0\n",
36 | " -1.0 -1.0 -1.0 -1.0 -1.0 1.0 0.0 1.0\n",
37 | " -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.0 1.0\n",
38 | " -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.0"
39 | ]
40 | },
41 | "metadata": {},
42 | "output_type": "display_data"
43 | }
44 | ],
45 | "source": [
46 | "m=8\n",
47 | "A = tril(2*I - ones(m,m))\n",
48 | "A[:,end] = ones(m)\n",
49 | "display(A)\n"
50 | ]
51 | },
52 | {
53 | "cell_type": "markdown",
54 | "id": "aabda139",
55 | "metadata": {},
56 | "source": [
57 | "Let's compute $A = LU$ and compare $||U||$ and $||A||$."
58 | ]
59 | },
60 | {
61 | "cell_type": "code",
62 | "execution_count": 4,
63 | "id": "c055f548",
64 | "metadata": {},
65 | "outputs": [
66 | {
67 | "data": {
68 | "text/plain": [
69 | "8×8 Matrix{Float64}:\n",
70 | " 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0\n",
71 | " -1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0\n",
72 | " -1.0 -1.0 1.0 0.0 0.0 0.0 0.0 0.0\n",
73 | " -1.0 -1.0 -1.0 1.0 0.0 0.0 0.0 0.0\n",
74 | " -1.0 -1.0 -1.0 -1.0 1.0 0.0 0.0 0.0\n",
75 | " -1.0 -1.0 -1.0 -1.0 -1.0 1.0 0.0 0.0\n",
76 | " -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.0 0.0\n",
77 | " -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 -1.0 1.0"
78 | ]
79 | },
80 | "metadata": {},
81 | "output_type": "display_data"
82 | },
83 | {
84 | "data": {
85 | "text/plain": [
86 | "8×8 Matrix{Float64}:\n",
87 | " 1.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0\n",
88 | " 0.0 1.0 0.0 0.0 0.0 0.0 0.0 2.0\n",
89 | " 0.0 0.0 1.0 0.0 0.0 0.0 0.0 4.0\n",
90 | " 0.0 0.0 0.0 1.0 0.0 0.0 0.0 8.0\n",
91 | " 0.0 0.0 0.0 0.0 1.0 0.0 0.0 16.0\n",
92 | " 0.0 0.0 0.0 0.0 0.0 1.0 0.0 32.0\n",
93 | " 0.0 0.0 0.0 0.0 0.0 0.0 1.0 64.0\n",
94 | " 0.0 0.0 0.0 0.0 0.0 0.0 0.0 128.0"
95 | ]
96 | },
97 | "metadata": {},
98 | "output_type": "display_data"
99 | }
100 | ],
101 | "source": [
102 | "F = lu(A)\n",
103 | "display(F.L)\n",
104 | "display(F.U)"
105 | ]
106 | },
107 | {
108 | "cell_type": "code",
109 | "execution_count": 5,
110 | "id": "642a2b3b",
111 | "metadata": {},
112 | "outputs": [
113 | {
114 | "data": {
115 | "text/plain": [
116 | "6.557438524302"
117 | ]
118 | },
119 | "metadata": {},
120 | "output_type": "display_data"
121 | },
122 | {
123 | "data": {
124 | "text/plain": [
125 | "147.82421993705904"
126 | ]
127 | },
128 | "metadata": {},
129 | "output_type": "display_data"
130 | }
131 | ],
132 | "source": [
133 | "display(norm(A))\n",
134 | "display(norm(F.U))"
135 | ]
136 | },
137 | {
138 | "cell_type": "markdown",
139 | "id": "914617dc",
140 | "metadata": {},
141 | "source": [
142 | "Because $||U|| / ||A||$ grows exponentially with $m$, the backward error $||A - \\tilde L \\tilde U||$ may be exponentially large! Solving $Ax = b$ with Gaussian elimination sounds like a bad idea.\n",
143 | "\n",
144 | "But what actually happens on a computer?"
145 | ]
146 | },
147 | {
148 | "cell_type": "code",
149 | "execution_count": 6,
150 | "id": "c8473ff4",
151 | "metadata": {},
152 | "outputs": [
153 | {
154 | "data": {
155 | "text/plain": [
156 | "0.0"
157 | ]
158 | },
159 | "metadata": {},
160 | "output_type": "display_data"
161 | }
162 | ],
163 | "source": [
164 | "display(norm(A - F.L*F.U))"
165 | ]
166 | },
167 | {
168 | "cell_type": "markdown",
169 | "id": "d55ebb2b",
170 | "metadata": {},
171 | "source": [
172 | "It looks like the backward error for the LU factorization is actually perfect. Why?\n",
173 | "\n",
174 | "Can you modify the problem to bring the expected exponentially bad errors to life? (See Trefethen, problem 22.2.)\n",
175 | "\n",
176 | "Note that we DO lose accuracy rapidly (as dimension $m$ increases) when using the LU factors to solve Ax = b. Why?"
177 | ]
178 | },
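{
"cell_type": "markdown",
"id": "ed4b1c20",
"metadata": {},
"source": [
"One way to see the rapid loss of accuracy (a minimal added sketch, reusing the construction above): repeat the factor-and-solve for growing $m$. Since $||U||$ grows like $2^m$, the residual grows roughly like $2^m \\epsilon_{machine}$."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ed4b1c21",
"metadata": {},
"outputs": [],
"source": [
"# sketch: residual growth with dimension m for the special matrix\n",
"for m in (10, 20, 40, 60)\n",
"    Am = tril(2*I - ones(m,m)); Am[:,end] = ones(m)\n",
"    Fm = lu(Am)\n",
"    bm = rand(m)\n",
"    xm = Fm.U \\ (Fm.L \\ bm)\n",
"    println((m, norm(Am*xm - bm)))\n",
"end"
]
},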
179 | {
180 | "cell_type": "code",
181 | "execution_count": 7,
182 | "id": "d0fae337",
183 | "metadata": {},
184 | "outputs": [
185 | {
186 | "data": {
187 | "text/plain": [
188 | "2.1181705310112556e-15"
189 | ]
190 | },
191 | "metadata": {},
192 | "output_type": "display_data"
193 | }
194 | ],
195 | "source": [
196 | "b = rand(m)\n",
197 | "y = F.L \\ b\n",
198 | "x = F.U \\ y\n",
199 | "norm(A*x - b)"
200 | ]
201 | },
202 | {
203 | "cell_type": "markdown",
204 | "id": "c7d74597",
205 | "metadata": {},
206 | "source": [
207 | "In practice, it is extremely rare to encounter matrices for which $||U|| >> ||A||$. We can get a sense of this by examining the ratio $\\rho = ||U||/||A||$ for various random matrices."
208 | ]
209 | },
210 | {
211 | "cell_type": "code",
212 | "execution_count": 8,
213 | "id": "95e9b192",
214 | "metadata": {},
215 | "outputs": [
216 | {
217 | "data": {
218 | "text/plain": [
219 | "2.23189077910945"
220 | ]
221 | },
222 | "metadata": {},
223 | "output_type": "display_data"
224 | }
225 | ],
226 | "source": [
227 | "n_iter = 10000\n",
228 | "ρ = zeros(n_iter)\n",
229 | "for i = 1:n_iter\n",
230 | " A = randn(m,m)\n",
231 | " F = lu(A)\n",
232 | " ρ[i] = norm(F.U) / norm(A)\n",
233 | "end\n",
234 | "maximum(ρ)"
235 | ]
236 | }
237 | ],
238 | "metadata": {
239 | "kernelspec": {
240 | "display_name": "Julia 1.6.3",
241 | "language": "julia",
242 | "name": "julia-1.6"
243 | },
244 | "language_info": {
245 | "file_extension": ".jl",
246 | "mimetype": "application/julia",
247 | "name": "julia",
248 | "version": "1.6.3"
249 | }
250 | },
251 | "nbformat": 4,
252 | "nbformat_minor": 5
253 | }
254 |
--------------------------------------------------------------------------------
/notes/Three-Ways-To-Solve-Least-Squares.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "attachments": {},
5 | "cell_type": "markdown",
6 | "id": "789653bc",
7 | "metadata": {},
8 | "source": [
9 | "# Overdetermined least-squares problems\n",
10 | "\n",
11 | "This notebook explores three ways to solve an overdetermined least-squares problem from approximation theory (taken from Lecture 19 in Trefethen and Bau). Suppose we want to fit a degree $14$ polynomial $p(x) = c_0 + c_1 x + \\cdots + c_{14} x^{14}$ to $100$ data points $b_k = B\\exp(\\sin(4x_k))$ at equispaced points $x_1,x_2,\\ldots,x_{100}$ in $[0,1]$, where $B$ is a normalization constant so that the least squares solution has $c_{14} = 1$. Each data point gives us an equation for the $15$ unkown polynomial coefficients: the result is the $100\\times 15$ overdetermined _Vandermonde_ system $Ac=b$,\n",
12 | "\n",
13 | "$$\n",
14 | "\\begin{pmatrix}\n",
15 | "1 && x_1 && \\cdots && x_1^{14} \\\\\n",
16 | "1 && x_2 && \\cdots && x_2^{14} \\\\\n",
17 | "\\ddots && \\ddots &&\\vdots && \\ddots \\\\\n",
18 | "1 && x_{100} && \\cdots && x_{100}^{14}\n",
19 | "\\end{pmatrix}\n",
20 | "\\begin{pmatrix}\n",
21 | "c_0 \\\\ c_1 \\\\ \\ddots \\\\ c_{14}\n",
22 | "\\end{pmatrix}\n",
23 | "=\n",
24 | "\\begin{pmatrix}\n",
25 | "b_1 \\\\ b_2 \\\\ \\ddots \\\\ b_{100}\n",
26 | "\\end{pmatrix}\n",
27 | "$$\n",
28 | "\n",
29 | "The system has more equations than unknowns and there is no unique solution. The Vandermonde matrix has full column rank but is highly ill-conditioned."
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": null,
35 | "id": "ef3744a4",
36 | "metadata": {},
37 | "outputs": [],
38 | "source": [
39 | "using LinearAlgebra"
40 | ]
41 | },
42 | {
43 | "cell_type": "code",
44 | "execution_count": null,
45 | "id": "6a55415e",
46 | "metadata": {},
47 | "outputs": [],
48 | "source": [
49 | "# set up least-squares problem A * c = b so that a[15] = 1\n",
50 | "m = 100\n",
51 | "n = 15\n",
52 | "xpts = collect(0:m-1)/(m-1)\n",
53 | "A = zeros(m,n)\n",
54 | "for k in 1:n\n",
55 | " A[:,k] = xpts .^ (k-1)\n",
56 | "end\n",
57 | "b = exp.(sin.(4*xpts)) / 2006.787453080206\n",
58 | "\n",
59 | "cond(A) # display condition number of Vandermonde matrix"
60 | ]
61 | },
62 | {
63 | "attachments": {},
64 | "cell_type": "markdown",
65 | "id": "6c119a5c",
66 | "metadata": {},
67 | "source": [
68 | "Before we test out the three methods to solve $\\min\\|Ac-b\\|$, let's estimate the condition number of the least-squares solution $c_*$ associated with perturbations in the right-hand side, $\\kappa_{b\\rightarrow c}$, and the coefficient matrix, $\\kappa_{A\\rightarrow c}$. We'll use \"backslash\" to get the approximate least-squares solution."
69 | ]
70 | },
71 | {
72 | "cell_type": "code",
73 | "execution_count": null,
74 | "id": "5f9a95db",
75 | "metadata": {},
76 | "outputs": [],
77 | "source": [
78 | "c = A \\ b\n",
79 | "y = A * c\n",
80 | "κ = cond(A)\n",
81 | "cosθ = norm(y)/norm(b)\n",
82 | "η = norm(A)*norm(c) / norm(y)\n",
83 | "κA = κ / cosθ\n",
84 | "κb = κ / (η * cosθ)\n",
85 | "display(κA) # display condition number for perturbations to A\n",
86 | "display(κb) # display condition number for perturbations to b"
87 | ]
88 | },
89 | {
90 | "attachments": {},
91 | "cell_type": "markdown",
92 | "id": "7836dd42",
93 | "metadata": {},
94 | "source": [
95 | "Now, let's compare the three methods to compute the least-squares solution. The least-squares solution has $a_{14} = 1$ by construction. \n",
96 | "\n",
97 | "\n",
98 | "## 1. The Normal Equations\n",
99 | " \n",
100 | "This means solving $A^TAc = A^Tb$."
101 | ]
102 | },
103 | {
104 | "cell_type": "code",
105 | "execution_count": null,
106 | "id": "994f8b72",
107 | "metadata": {},
108 | "outputs": [],
109 | "source": [
110 | "# normal equations\n",
111 | "cN = (A'*A) \\ (A' * b)\n",
112 | "abs(cN[end]-1)"
113 | ]
114 | },
115 | {
116 | "attachments": {},
117 | "cell_type": "markdown",
118 | "id": "0e46d800",
119 | "metadata": {},
120 | "source": [
121 | "Not even a single correct digit! Computing the least-squares solution via the normal equations is unstable when the system is ill-conditioned. It is stable for restricted classes of problems, e.g., problems with bounded condition number.\n",
122 | "\n",
123 | "## 2. The Singular Value Decomposition\n",
124 | "\n",
125 | "Now, let's try diagonal reduction with the thin SVD, $A = U\\Sigma V^T$. We need to solve $\\Sigma d = U^T b$ and then $c = Vd$."
126 | ]
127 | },
128 | {
129 | "cell_type": "code",
130 | "execution_count": null,
131 | "id": "aa1bd786",
132 | "metadata": {},
133 | "outputs": [],
134 | "source": [
135 | "# diagonal reduction (SVD)\n",
136 | "F = svd(A)\n",
137 | "d = (F.U' * b) ./ F.S\n",
138 | "c = F.V * d\n",
139 | "abs(1-c[end])"
140 | ]
141 | },
142 | {
143 | "attachments": {},
144 | "cell_type": "markdown",
145 | "id": "22f90419",
146 | "metadata": {},
147 | "source": [
148 | "Seven correct digits is much better! This is about as many correct digits as we can hope for when $\\kappa_{A\\rightarrow c}\\approx 10^{10}$. The SVD approach is backward stable.\n",
149 | "\n",
150 | "## The QR Decomposition (three ways)\n",
151 | "\n",
152 | "Finally, let's compute the least-squares solution using the thin QR factorization, $A=QR$. We need to solve $Rc = Q^Tb$.\n",
153 | "\n",
154 | "First, we will compute $A = QR$ with modified Gram-Schmidt."
155 | ]
156 | },
157 | {
158 | "cell_type": "code",
159 | "execution_count": null,
160 | "id": "b9188ce2",
161 | "metadata": {},
162 | "outputs": [],
163 | "source": [
164 | "function mgs(A)\n",
165 | " V = copy(A)\n",
166 | " n = size(A,2) # number of columns\n",
167 | " R = UpperTriangular(zeros(n,n))\n",
168 | " for i in 1:n\n",
169 | " R[i,i] = norm(V[:,i])\n",
170 | " V[:,i] = V[:, i] / R[i,i]\n",
171 | " for j in i+1:n\n",
172 | " R[i,j] = V[:,i]'*V[:,j]\n",
173 | " V[:,j] = V[:,j] - R[i,j]*V[:,i]\n",
174 | " end\n",
175 | " end\n",
176 | "\n",
177 | " return V, R\n",
178 | "end\n",
179 | "\n",
180 | "# unpack Q and R from modified Gram-Schmidt\n",
181 | "F = mgs(A)\n",
182 | "Q = F[1] # 100 x 15 ONB from Gram-Schmidt\n",
183 | "R = F[2] # 15 x 15 upper triangular matrix from Gram-Schmidt\n",
184 | "\n",
185 | "# solve R * c = Q' * b\n",
186 | "c = R \\ (Q' * b)\n",
187 | "abs(1-c[end])"
188 | ]
189 | },
190 | {
191 | "attachments": {},
192 | "cell_type": "markdown",
193 | "id": "d312621a",
194 | "metadata": {},
195 | "source": [
196 | "Modified Gram-Schmidt is also performing pretty poorly here! A part of this is the loss of orthogonality in the columns of $Q$ due to ill-conditioning in $A$, exacerbated by ill-conditioning in $R$ (which has the same condition number as $A$). However, we can stabilize modified Gram-Schmidt by computing the product $Q^Tb$ in a clever way. We orthogonalize the augmented matrix $[A\\,\\, b]$ with MGS and extract the last column, which is (in exact arithmetic) $Q^T b$."
197 | ]
198 | },
199 | {
200 | "cell_type": "code",
201 | "execution_count": null,
202 | "id": "94a93cb2",
203 | "metadata": {},
204 | "outputs": [],
205 | "source": [
206 | "F = mgs([A b])\n",
207 | "\n",
208 | "# unpack Q and R from modified Gram-Schmidt\n",
209 | "R = F[2] # 16 x 16 upper triangular matrix from Gram-Schmidt\n",
210 | "Qb = R[1:n,n+1] # extract Q'*b from the last column of ONB for augmented matrix\n",
211 | "R = R[1:n,1:n] # extract R from principle n x n submatrix of triangular factor\n",
212 | "\n",
213 | "# solve R * c = Q' * b\n",
214 | "c = R \\ Qb\n",
215 | "abs(1-c[end])"
216 | ]
217 | },
218 | {
219 | "attachments": {},
220 | "cell_type": "markdown",
221 | "id": "ac27a9bd",
222 | "metadata": {},
223 | "source": [
224 | "Seven digits again! We are doing almost as well as the SVD! Remarkably, this augmented MGS method is a backward stable method for computing least-squares solutions, even though MGS is not a backward stable algorithm for factoring $A=QR$.\n",
225 | "\n",
226 | "Finally, let's try out the gold standard algorithm for \"daily-use\" - Householder QR."
227 | ]
228 | },
229 | {
230 | "cell_type": "code",
231 | "execution_count": null,
232 | "id": "d2aad83c",
233 | "metadata": {},
234 | "outputs": [],
235 | "source": [
236 | "F = qr(A)\n",
237 | "Qb = F.Q' * b\n",
238 | "c = F.R \\ Qb[1:n]\n",
239 | "abs(1-c[end])"
240 | ]
241 | },
242 | {
243 | "attachments": {},
244 | "cell_type": "markdown",
245 | "id": "5281d2c0",
246 | "metadata": {},
247 | "source": [
248 | "Householder QR gives us about 6 digits of accuracy, which is (again) about as many correct digits as we can hope for from a least-squares problem with condition number $\\kappa_{A\\rightarrow c} \\approx 10^{10}$. "
249 | ]
250 | },
251 | {
252 | "attachments": {},
253 | "cell_type": "markdown",
254 | "id": "24d33a6c",
255 | "metadata": {},
256 | "source": [
257 | "## Beyond numerical linear algebra\n",
258 | "\n",
259 | "The ill-conditioning in this polynomial regression problem can be avoided altogether by working with \n",
260 | "\n",
261 | "* better polynomial bases than the monomials $1, x, \\ldots, x^{14}$, and\n",
262 | "* better sample points that equispaced points in the unit interval $[-1, 1]$.\n",
263 | "\n",
264 | "The first $15$ Chebyshev polynomials are a much better choice of basis. We can build the matrix column by column using the three term recurrence satisfied by the Chebyshev polynomials, with $T_0(x) = 1$, $T_1(x) = x$, and\n",
265 | "\n",
266 | "$$ T_{k+1}(x) = 2xT_k(x) - T_{k-1}(x).$$\n",
267 | "\n",
268 | "The roots of the Chebyshev polynomials are an excellent set of sample points to build approximations from:\n",
269 | "\n",
270 | "$$x_k = \\cos(\\pi (k + 1/2)/100), \\qquad k = 1,2,\\ldots,100.$$\n",
271 | "\n",
272 | "What is the condition number of the Vandermonde matrix associated with Chebyshev polynomials and points?\n",
273 | "\n",
274 | "\n",
275 | "\n"
276 | ]
277 | },
278 | {
279 | "cell_type": "code",
280 | "execution_count": null,
281 | "id": "19c651e9",
282 | "metadata": {},
283 | "outputs": [],
284 | "source": [
285 | "# set up a least-squares problem A * c = b so that a[15] = 1, in Chebyshev basis\n",
286 | "m = 100\n",
287 | "n = 15\n",
288 | "xpts = cos.(pi*(0.5.+collect(0:m-1))/m)\n",
289 | "A = zeros(m,n)\n",
290 | "A[:,1] = ones(m,1)\n",
291 | "A[:,2] = xpts\n",
292 | "for k in 3:n\n",
293 | " A[:,k] = 2*xpts.*A[:,k-1] - A[:,k-2] # xpts .^ (k-1) #\n",
294 | "end\n",
295 | "b = exp.(sin.(4*xpts)) / 2006.787453080206\n",
296 | "\n",
297 | "# condition number of Vandermonde matrix\n",
298 | "cond(A)"
299 | ]
300 | },
301 | {
302 | "attachments": {},
303 | "cell_type": "markdown",
304 | "id": "4aded3f5",
305 | "metadata": {},
306 | "source": [
307 | "This is really the domain of _approximation theory_: how can we represent and construct highly accurate approximations to continuous objects like functions on a computer? What bases should we employ? Where should we sample/measure our function? The now classic reference is Approximation Theory and Practice by Nick Trefethen (who happens to be the first author of our NLA textbook).\n",
308 | "\n",
309 | "If you want to check accuracy in the solution, note that we have changed the least-squares problem entirely. The least-squares solution to _this problem_ may not be normalized to $1$ at the endpoint anymore. However, we can check the condition numbers again and be confident that a backward-stable algorithm achieves about 15 digits of accuracy."
310 | ]
311 | },
312 | {
313 | "cell_type": "code",
314 | "execution_count": null,
315 | "id": "7055fbfe",
316 | "metadata": {},
317 | "outputs": [],
318 | "source": [
319 | "c = A \\ b\n",
320 | "y = A * c\n",
321 | "κ = cond(A)\n",
322 | "cosθ = norm(y)/norm(b)\n",
323 | "η = norm(A)*norm(c) / norm(y)\n",
324 | "κA = κ / cosθ\n",
325 | "κb = κ / (η * cosθ)\n",
326 | "display(κA)\n",
327 | "display(κb) "
328 | ]
329 | },
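{
"cell_type": "markdown",
"id": "ed7f3a10",
"metadata": {},
"source": [
"As a final check (a minimal added sketch): with this well-conditioned basis, two backward-stable solvers, Householder QR via backslash and the SVD, should agree to nearly machine precision."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ed7f3a11",
"metadata": {},
"outputs": [],
"source": [
"c_qr = A \\ b # Householder QR via backslash\n",
"F = svd(A)\n",
"c_svd = F.V * ((F.U' * b) ./ F.S)\n",
"norm(c_qr - c_svd) / norm(c_qr) # tiny: both solvers are backward stable here"
]
}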
330 | ],
331 | "metadata": {
332 | "kernelspec": {
333 | "display_name": "Julia 1.6.3",
334 | "language": "julia",
335 | "name": "julia-1.6"
336 | },
337 | "language_info": {
338 | "file_extension": ".jl",
339 | "mimetype": "application/julia",
340 | "name": "julia",
341 | "version": "1.6.3"
342 | }
343 | },
344 | "nbformat": 4,
345 | "nbformat_minor": 5
346 | }
347 |
--------------------------------------------------------------------------------
/notes/notes_spring_2023/BFGS_SJnotes.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/BFGS_SJnotes.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_11.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_11.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_12.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_12.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_13.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_13.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_14.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_14.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_15.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_15.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_16.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_16.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_17.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_17.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_18.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_18.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_19.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_19.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_20.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_20.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_21.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_21.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_22.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_22.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_25.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_25.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_4.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_4.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_5-6.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_5-6.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_8.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_8.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lecture_9-10.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lecture_9-10.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Lectures/Lectures_1-3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/Lectures/Lectures_1-3.pdf
--------------------------------------------------------------------------------
/notes/notes_spring_2023/Reflections-Rotations-And-Orthogonal-Transformations.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "attachments": {},
5 | "cell_type": "markdown",
6 | "id": "5c33e8d6",
7 | "metadata": {},
8 | "source": [
9 | "# Householder Reflections and Givens Rotations\n",
10 | "\n",
11 | "
\n",
12 | "\n",
13 | "Using rotations and reflections to introduce zeros in a matrix is a powerful paradigm in numerical linear algebra. \n",
14 | "\n",
15 | "
\n",
16 | "\n",
17 | "Reflections and rotations are orthogonal transformations. They preserve the Euclidean lengths of real vectors. \n",
18 | "\n",
19 | "
\n",
20 | "\n",
21 | "They don't shrink and they don't stretch too much - these transformations have the perfect condition number, $\\kappa=1$ (at least, in the 2-norm and other unitarily invariant norms like the Frobenius norm).\n",
22 | "\n",
23 | "
\n",
24 | "\n",
25 | "This notebook walks through the basic construction and application of two important variants. \n",
26 | "\n",
27 | "* **Householder reflections:** roughly speaking, these are used for dense problems, introducing zeros column by column.\n",
28 | "\n",
29 | "* **Givens rotations:** These are often used for data-sparse problems, for instance, introducing zeros one entry at a time in a sparse matrix."
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": null,
35 | "id": "a275e098",
36 | "metadata": {},
37 | "outputs": [],
38 | "source": [
39 | "using LinearAlgebra\n",
40 | "using SparseArrays"
41 | ]
42 | },
43 | {
44 | "attachments": {},
45 | "cell_type": "markdown",
46 | "id": "c2782277",
47 | "metadata": {},
48 | "source": [
49 | "## Householder Reflections\n",
50 | "\n",
51 | "The Householder reflector $F_v$, constructed from a real $k$-dimensional vector $v$, is a rank-one modification of the identity: \n",
52 | "\n",
53 | "$$F_v = I - 2\\alpha xx^*, \\quad\\text{where}\\quad x = {\\rm sign}(v_1)\\|v\\|e_1+v \\quad\\text{and}\\quad \\alpha = \\|x\\|^2.$$\n",
54 | "\n",
55 | "It looks like the orthogonal projection onto subspace orthogonal to $x$, but it is actually a _reflection across_ the subspace orthogonal to $x$. The orthogonal projection has rank $k-1$, but the reflection has rank $k$ and $F_v^{-1}=F_v^T=F_v$.\n",
56 | "\n",
57 | "The vector $x$ is chosen so that the reflection takes the vector $v$ to the first coordinate axis:\n",
58 | "\n",
59 | "$$ F_v\\begin{pmatrix}v_1 \\\\ v_2 \\\\ \\vdots \\\\ v_k \\end{pmatrix} = \\begin{pmatrix}\\|v\\| \\\\ 0 \\\\ \\vdots \\\\ 0\\end{pmatrix}.$$"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": null,
65 | "id": "a3fdac43",
66 | "metadata": {},
67 | "outputs": [],
68 | "source": [
69 | "# compute the Householder reflector\n",
70 | "function hhr(v)\n",
71 | "\n",
72 | " x = copy(v) # copy v to x\n",
73 | " x[1] = x[1] + sign(x[1])*norm(x) # modify first entry of x\n",
74 | " return x\n",
75 | "end\n",
76 | "\n",
77 | "\n",
78 | "v = randn(6)\n",
79 | "v = v / norm(v)\n",
80 | "x = hhr(v)\n",
81 | "Fv = I-2*x*x' / (x'*x)\n",
82 | "\n",
83 | "display(Fv) # here's the full reflection matrix\n",
84 | "display(norm(I - Fv'*Fv)) # orthogonal transformations have transpose = inverse\n",
85 | "display(Fv*v) # x is chosen so that Fv*v = ||v|| * e1 (on the computer, remember that rounding errors occur)"
86 | ]
87 | },
88 | {
89 | "attachments": {},
90 | "cell_type": "markdown",
91 | "id": "5b9eef81",
92 | "metadata": {},
93 | "source": [
94 | "In practice, we never form the matrix $F_v$ explicitly. We _apply_ it to vectors as a linear transformation by computing\n",
95 | "\n",
96 | "$$ F_vw = w - 2\\alpha x (x^*w). $$\n",
97 | "\n",
98 | "The arithmetic cost is a dot product, a vector addition, and a multiplication. Much better than building the whole matrix and multiplying by a vector.\n",
99 | "\n",
100 | "We also only need to store the reflection vector $x$ (you could also store the scalar to avoid calculating $\\alpha = \\|x\\|^2$ again if you want)."
101 | ]
102 | },
103 | {
104 | "cell_type": "code",
105 | "execution_count": null,
106 | "id": "6de72dfb",
107 | "metadata": {},
108 | "outputs": [],
109 | "source": [
110 | "function apply_hhr(x, w)\n",
111 | " \n",
112 | " return w - 2 * x * (x' * w) / (x' * x)\n",
113 | "end\n",
114 | "\n",
115 | "w = randn(6)\n",
116 | "Fvw = apply_hhr(x, w) # we can use the computed reflector x to apply Fv to any vector without forming the full reflection matrix\n",
117 | "display(Fvw) # vectors other than v get reflected across the same subspace, but don't usually end up along a coordinate axis (mostly nonzeros)\n",
118 | "display(norm(Fvw)-norm(w)) # but, reflection means norm is preserved for _any_ vector"
119 | ]
120 | },
121 | {
122 | "attachments": {},
123 | "cell_type": "markdown",
124 | "id": "33046caa",
125 | "metadata": {},
126 | "source": [
127 | "Householder reflections really come into their strength when we use them to introduce zeros and factor matrices. Here's a toy implementation for a familiar example: $A=QR$."
128 | ]
129 | },
130 | {
131 | "cell_type": "code",
132 | "execution_count": null,
133 | "id": "d6b49962",
134 | "metadata": {},
135 | "outputs": [],
136 | "source": [
137 | "function hqr(A, k)\n",
138 | " # Householder triangularization on the first k columns of A\n",
139 | " \n",
140 | " R = copy(A)\n",
141 | " n = size(A,2)\n",
142 | " X = zeros(size(A))\n",
143 | " for j in 1:k\n",
144 | " X[j:end,j] = hhr(R[j:end,j]') # get Householder reflector\n",
145 | " R[j:end,j:end] = apply_hhr(X[j:end,j],R[j:end,j:end]) # introduce zeros in n-j x n-j lower right submatrix\n",
146 | " end\n",
147 | " return X, R # return reflectors (for orthogonal Q) and upper triangular R\n",
148 | "end\n",
149 | "\n",
150 | "A = randn(8,5)\n",
151 | "F = hqr(A,5)\n",
152 | "display(abs.(F[2]).>1e-14) # R is now (numerically) zero below the diagonal\n",
153 | "display(abs.(F[1]).>1e-14) # F[1] contains Householder vectors of decreasing size"
154 | ]
155 | },
156 | {
157 | "attachments": {},
158 | "cell_type": "markdown",
159 | "id": "22d1b2fe",
160 | "metadata": {},
161 | "source": [
162 | "In practice, the Householder reflectors are usually stored in a compact blocked format that enjoys better storage properties and enables faster matrix-matrix multiplication implementations. \n",
163 | "\n",
164 | "The point is, we store the reflectors, and not the full reflection matrix!\n",
165 | "\n",
166 | "Next week, Householder reflectors will play a starring role in the \"crown jewel\" of numerical analysis: the QR algorithm for computing eigenvalues (not to be mistaken with the QR factorization)."
167 | ]
168 | },
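    | {
    | "attachments": {},
    | "cell_type": "markdown",
    | "id": "added-applyq",
    | "metadata": {},
    | "source": [
    | "To make \"store the reflectors\" concrete, here is an added sketch (the helper name `apply_Qt` is ours): it applies $Q^T$ to a vector using only the reflector columns returned by `hqr` above, never forming $Q$. Since $Q^TA=R$, applying it to every column of $A$ should reproduce the triangular factor."
    | ]
    | },
    | {
    | "cell_type": "code",
    | "execution_count": null,
    | "id": "added-applyq-code",
    | "metadata": {},
    | "outputs": [],
    | "source": [
    | "# (added sketch) apply Q^T = F_k...F_1 to a vector using only the stored reflectors\n",
    | "function apply_Qt(X, b)\n",
    | "    w = copy(b)\n",
    | "    for j in 1:size(X,2) # apply the reflectors in the order they were computed\n",
    | "        w[j:end] = apply_hhr(X[j:end,j], w[j:end])\n",
    | "    end\n",
    | "    return w\n",
    | "end\n",
    | "\n",
    | "QtA = hcat([apply_Qt(F[1], A[:,j]) for j in 1:size(A,2)]...)\n",
    | "display(norm(QtA - F[2])) # Q^T A recovers the R factor computed by hqr above"
    | ]
    | },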
169 | {
170 | "attachments": {},
171 | "cell_type": "markdown",
172 | "id": "6dca2d2e",
173 | "metadata": {},
174 | "source": [
175 | "## Givens rotations\n",
176 | "\n",
177 | "Householder reflections naturally operate on columns of $A$. But what if most column entries in $A$ are nonzero? We can introduce zeros one entry at a time with Givens rotations.\n",
178 | "\n",
179 | "
\n",
180 | "\n",
181 | "You can see the idea clearest in two dimensions first, where the vector $x = (x_1,\\,\\,x_2)^T$ is rotated counterclockwise by an angle $\\theta$ into the vector $y = (y_1,\\,\\,y_2)^T$ by\n",
182 | "\n",
183 | "$$\n",
184 | "\\begin{pmatrix} \n",
185 | "y_1 \\\\ y_2\n",
186 | "\\end{pmatrix}\n",
187 | "=\n",
188 | "\\begin{pmatrix}\n",
189 | "\\cos(\\theta) & -\\sin(\\theta) \\\\\n",
190 | "\\sin(\\theta) & \\cos(\\theta)\n",
191 | "\\end{pmatrix}\n",
192 | "\\begin{pmatrix} \n",
193 | "x_1 \\\\ x_2\n",
194 | "\\end{pmatrix}.\n",
195 | "$$\n",
196 | "\n",
197 | "Given $x$, how should we chose $\\theta$ so that $y_2 = 0$? We need to rotate $x$ counterclockwise so that it lies along the $e_1=(1,\\,\\,0)^T$ coordinate axis! If we choose $\\cos(\\theta) = x_1/\\|x\\|$ and $\\sin(\\theta) = - x_2/\\|x\\|$, then\n",
198 | "\n",
199 | "$$\n",
200 | "\\begin{pmatrix} \n",
201 | "\\|x\\| \\\\ 0\n",
202 | "\\end{pmatrix}\n",
203 | "=\n",
204 | "\\frac{1}{\\|x\\|}\\begin{pmatrix}\n",
205 | "x_1 & x_2 \\\\\n",
206 | "-x_2 & x_1\n",
207 | "\\end{pmatrix}\n",
208 | "\\begin{pmatrix} \n",
209 | "x_1 \\\\ x_2\n",
210 | "\\end{pmatrix}.\n",
211 | "$$\n",
212 | "\n",
213 | "The matrix we constructed is a rotation - an orthogonal transformation - that zeros out the second entry of the special vector $x$. A Givens rotation is the $2\\times 2$ rotation analogue of a Housholder reflection!\n",
214 | "\n",
215 | "
\n",
216 | "\n",
217 | "In numerical linear algebra, we are usually concerned with more than just two dimensions. But we can use Givens rotations to introduce one zero at a time by, e.g., mixing two rows at a time. Conceptually, this means embedding the Givens rotation matrix into a larger identity matrix. For example if we want to use the first entry of a $5$-dimensional vector to zero out its last entry, we could write\n",
218 | "\n",
219 | "$$\n",
220 | "\\begin{pmatrix} \n",
221 | "\\sqrt{x_1^2+x_5^2} \\\\ x_2 \\\\ x_3 \\\\ x_4 \\\\ 0\n",
222 | "\\end{pmatrix}\n",
223 | "=\n",
224 | "\\begin{pmatrix}\n",
225 | "c & 0 & 0 & 0 & s \\\\\n",
226 | "0 & 1 & 0 & 0 & 0 \\\\\n",
227 | "0 & 0 & 1 & 0 & 0 \\\\\n",
228 | "0 & 0 & 0 & 1 & 0 \\\\\n",
229 | "-s & 0 & 0 & 0 & c\n",
230 | "\\end{pmatrix}\n",
231 | "\\begin{pmatrix} \n",
232 | "x_1 \\\\ x_2 \\\\ x_3 \\\\ x_4 \\\\ x_5\n",
233 | "\\end{pmatrix}, \n",
234 | "\\qquad\\text{where}\\qquad\n",
235 | "c = \\frac{x_1}{\\sqrt{x_1^2+x_5^2}}\n",
236 | "\\qquad\\text{and}\\qquad\n",
237 | "s = \\frac{x_5}{\\sqrt{x_1^2+x_5^2}}.\n",
238 | "$$\n",
239 | "\n",
240 | "Just as with Householder reflections, we never form the full rotation matrix on the computer. We store $c$ and $s$ and apply the Givens rotation as a linear transformation that combines two rows (when applied from the left)."
241 | ]
242 | },
243 | {
244 | "cell_type": "code",
245 | "execution_count": null,
246 | "id": "eb47426d",
247 | "metadata": {},
248 | "outputs": [],
249 | "source": [
250 | "function toy_givens(x, j, k)\n",
251 | " \n",
252 | " r = sqrt(x[j]^2 + x[k]^2)\n",
253 | " c = x[j] / r\n",
254 | " s = x[k] / r\n",
255 | "\n",
256 | " return c, s\n",
257 | "end\n",
258 | "\n",
259 | "function apply_givens(v, c, s, j, k)\n",
260 | "\n",
261 | " w = copy(v)\n",
262 | "\n",
263 | " w[j,:] = c*v[j,:] + s*v[k,:]\n",
264 | " w[k,:] = -s*v[j,:] + c*v[k,:]\n",
265 | "\n",
266 | " return w\n",
267 | "end\n",
268 | "\n",
269 | "N = 10\n",
270 | "A = diagm(-1 => -ones(N-1), 0 => 2*ones(N), 1 => -ones(N-1))\n",
271 | "g = toy_givens(A[:,1], 1, 2) # compute Givens rotation to zero out first subdiagonal entry\n",
272 | "B = apply_givens(A, g[1], g[2], 1, 2) # apply Givens rotation to mix first two rows of A\n",
273 | "display(sparse(A)) # display matrix before Givens\n",
274 | "display(sparse(B)) # display matrix after Givens\n",
275 | "display( norm(A[:,2]) - norm(B[:,2]) ) # column norm is preserved"
276 | ]
277 | },
278 | {
279 | "attachments": {},
280 | "cell_type": "markdown",
281 | "id": "effffdd1",
282 | "metadata": {},
283 | "source": [
284 | "There are a number of subtle points to get right when doing Givens on the computer. Luckily for us, Julia has a great ready-to-use function for computing and working with Givens rotations. Let's test it out on the same example we used for our \"toy\" Givens rotation code."
285 | ]
286 | },
287 | {
288 | "cell_type": "code",
289 | "execution_count": null,
290 | "id": "b6844880",
291 | "metadata": {},
292 | "outputs": [],
293 | "source": [
294 | "N = 10\n",
295 | "A = diagm(-1 => -ones(N-1), 0 => 2*ones(N), 1 => -ones(N-1))\n",
296 | "G = givens(A, 1, 2, 1)\n",
297 | "B = G[1]*A\n",
298 | "display(sparse(A))\n",
299 | "display(sparse(B))\n",
300 | "display( norm(A[:,2]) - norm(B[:,2]) )"
301 | ]
302 | },
303 | {
304 | "attachments": {},
305 | "cell_type": "markdown",
306 | "id": "8359af8a",
307 | "metadata": {},
308 | "source": [
309 | "It looks like our toy code did okay on this simple example. Now, let's string a few more Givens rotations together to triangularize the tridiagonal matrix $A$ above. We'll also allow it to accumulate the same Givens rotations applied to another matrix B, which is useful if we want to express the orthogonal factor $Q$ as a matrix, or if we want to compute $Q^Tb$ for least-squares problems."
310 | ]
311 | },
312 | {
313 | "cell_type": "code",
314 | "execution_count": null,
315 | "id": "1ce4a736",
316 | "metadata": {},
317 | "outputs": [],
318 | "source": [
319 | "function triQR(A,B)\n",
320 | " # compute the QR decomposition of a tridiagonal matrix using Givens rotations\n",
321 | " \n",
322 | " R = copy(A)\n",
323 | " QTB = copy(B)\n",
324 | " n = size(R,2)\n",
325 | " for j in 1:n-1\n",
326 | " G = givens(R, j, j + 1, j)\n",
327 | " R = G[1]*R\n",
328 | " QTB = G[1]*QTB\n",
329 | " end\n",
330 | "\n",
331 | " return R, QTB\n",
332 | "end\n",
333 | "\n",
334 | "F = triQR(A,diagm(0 => ones(N)))\n",
335 | "display(abs.(F[1]).>1e-14) # F[1] is the triangular factor - it is banded with upper bandwidth = 3\n",
336 | "display(norm(F[2]'*F[2]-I)) # F[2] is the transpose of the orthogonal factor"
337 | ]
338 | },
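    | {
    | "attachments": {},
    | "cell_type": "markdown",
    | "id": "added-trisolve",
    | "metadata": {},
    | "source": [
    | "An added usage example: with $B=I$ above, `F[2]` is $Q^T$, so solving $Ax=b$ costs only a product with $Q^T$ and one banded back substitution, $x = R^{-1}(Q^Tb)$."
    | ]
    | },
    | {
    | "cell_type": "code",
    | "execution_count": null,
    | "id": "added-trisolve-code",
    | "metadata": {},
    | "outputs": [],
    | "source": [
    | "# (added sketch) solve A x = b from the Givens QR factors: x = R \\\\ (Q^T b)\n",
    | "b = randn(N)\n",
    | "x = UpperTriangular(F[1]) \\\\ (F[2]*b) # back substitution with the banded R\n",
    | "display(norm(A*x - b)) # residual should be tiny for this well-conditioned A"
    | ]
    | },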
339 | {
340 | "attachments": {},
341 | "cell_type": "markdown",
342 | "id": "8a864530",
343 | "metadata": {},
344 | "source": [
345 | "Banded matrices are very special. What happens for other types of sparse matrices? Let's add a row of ones to the first row of the difference matrix and see how it effects the $QR$ factorization.\n",
346 | "\n",
347 | " $$\n",
348 | " A = \\begin{pmatrix}\n",
349 | " 1 & 1 & 1 & 1 & \\cdots & 1 \\\\\n",
350 | " -1 & 2 & -1 & 0 & \\cdots & 0 \\\\\n",
351 | " 0 & -1 & -2 & 1 & \\cdots & 0 \\\\\n",
352 | " \\vdots & & \\ddots & \\ddots & \\ddots & \\vdots \\\\\n",
353 | " 0 & \\cdots & & 0 & -1 & 2\n",
354 | " \\end{pmatrix}\n",
355 | " $$"
356 | ]
357 | },
358 | {
359 | "cell_type": "code",
360 | "execution_count": null,
361 | "id": "ad523721",
362 | "metadata": {},
363 | "outputs": [],
364 | "source": [
365 | "A[1,:] = ones(1,N)\n",
366 | "F = triQR(A,diagm(0 => ones(N)))\n",
367 | "display(abs.(F[1]).>1e-14) # F[1] is the triangular factor - it is banded with upper bandwidth = 3\n",
368 | "display(norm(F[2]'*F[2]-I)) # F[2] is the transpose of the orthogonal factor"
369 | ]
370 | },
371 | {
372 | "attachments": {},
373 | "cell_type": "markdown",
374 | "id": "56ff1f07",
375 | "metadata": {},
376 | "source": [
377 | "The upper triangular factor is now completely dense... what happened?\n",
378 | "\n",
379 | "
\n",
380 | "\n",
381 | "We can explore by running Householder QR on $A$ one column at a time, stopping to visualize how the upper triangular factor fills in."
382 | ]
383 | },
384 | {
385 | "cell_type": "code",
386 | "execution_count": null,
387 | "id": "d379f452",
388 | "metadata": {},
389 | "outputs": [],
390 | "source": [
391 | "F = hqr(A, 0)\n",
392 | "display(abs.(F[2]).>1e-14) # R is now (numerically) zero below the diagonal"
393 | ]
394 | },
395 | {
396 | "attachments": {},
397 | "cell_type": "markdown",
398 | "id": "0f93d48e",
399 | "metadata": {},
400 | "source": [
401 | "Just as we saw with Gaussian elimination and $A=LU$, the factors of $A$ can suffer from _fill-in_: many more nonzero entries than in the original matrix."
402 | ]
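    | },
    | {
    | "attachments": {},
    | "cell_type": "markdown",
    | "id": "added-lu-fill",
    | "metadata": {},
    | "source": [
    | "For an added side-by-side comparison, we can display the nonzero patterns of the (pivoted) LU factors of the same matrix - sparse direct solvers fight exactly this kind of fill-in with row and column reorderings."
    | ]
    | },
    | {
    | "cell_type": "code",
    | "execution_count": null,
    | "id": "added-lu-fill-code",
    | "metadata": {},
    | "outputs": [],
    | "source": [
    | "# (added sketch) fill-in of Gaussian elimination (A = LU with partial pivoting) on the same matrix\n",
    | "Flu = lu(A)\n",
    | "display(abs.(Flu.L).>1e-14) # nonzero pattern of L\n",
    | "display(abs.(Flu.U).>1e-14) # nonzero pattern of U"
    | ]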
403 | }
404 | ],
405 | "metadata": {
406 | "kernelspec": {
407 | "display_name": "Julia 1.6.3",
408 | "language": "julia",
409 | "name": "julia-1.6"
410 | },
411 | "language_info": {
412 | "file_extension": ".jl",
413 | "mimetype": "application/julia",
414 | "name": "julia",
415 | "version": "1.6.3"
416 | }
417 | },
418 | "nbformat": 4,
419 | "nbformat_minor": 5
420 | }
421 |
--------------------------------------------------------------------------------
/notes/notes_spring_2023/matrix-mult-experiments.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/notes_spring_2023/matrix-mult-experiments.pdf
--------------------------------------------------------------------------------
/notes/painless-conjugate-gradient.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/painless-conjugate-gradient.pdf
--------------------------------------------------------------------------------
/notes/restarting-arnoldi.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/restarting-arnoldi.pdf
--------------------------------------------------------------------------------
/notes/solver-options.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/notes/solver-options.pdf
--------------------------------------------------------------------------------
/project/final_project_spring2025.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/project/final_project_spring2025.pdf
--------------------------------------------------------------------------------
/psets/notes_spring_2023/pset1.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# 18.335 Problem Set 1\n",
8 | "\n",
9 | "This notebook accompanies the first problem set posted on the [18.335 web page](https://github.com/mitmath/18335), and is here to get you started with your own Julia computations.\n",
10 | "\n",
11 | "Download this notebook (a `pset1.ipynb` file) by **right-clicking the download link** at the upper-right to *Save As* a file, and then drag this file into your Jupyter dashboard to upload it (e.g. on [](https://mybinder.org/v2/gh/mitmath/binder-env/main?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252Fmitmath%252F18335%26urlpath%3Dtree%252F18335%252F%26branch%3Dmaster) or in a local installation).\n",
12 | "\n",
13 | "Modify it as needed, then choose **Print Preview** from the \"File\" menu and *print to a PDF* file to submit electronically."
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {},
19 | "source": [
20 | "# Problem 2: Floating point"
21 | ]
22 | },
23 | {
24 | "cell_type": "markdown",
25 | "metadata": {},
26 | "source": [
27 | "Give your solution to Trefethen problem 13.2(c) below. Note that you can use the *Insert* menu (or ctrl-m b) to insert new code cells as needed."
28 | ]
29 | },
30 | {
31 | "cell_type": "code",
32 | "execution_count": null,
33 | "metadata": {},
34 | "outputs": [],
35 | "source": []
36 | },
37 | {
38 | "cell_type": "markdown",
39 | "metadata": {},
40 | "source": [
41 | "# Problem 3: Funny functions"
42 | ]
43 | },
44 | {
45 | "cell_type": "markdown",
46 | "metadata": {},
47 | "source": [
48 | "## part (a)\n",
49 | "\n",
50 | "Compute $(|x|^4 + |y|^4)^{1/4}$:"
51 | ]
52 | },
53 | {
54 | "cell_type": "code",
55 | "execution_count": null,
56 | "metadata": {},
57 | "outputs": [],
58 | "source": [
59 | "L4(x,y) = ..."
60 | ]
61 | },
62 | {
63 | "cell_type": "markdown",
64 | "metadata": {},
65 | "source": [
66 | "Some tests:"
67 | ]
68 | },
69 | {
70 | "cell_type": "code",
71 | "execution_count": null,
72 | "metadata": {},
73 | "outputs": [],
74 | "source": [
75 | "L4(1e-100,0.0)"
76 | ]
77 | },
78 | {
79 | "cell_type": "code",
80 | "execution_count": null,
81 | "metadata": {},
82 | "outputs": [],
83 | "source": [
84 | "L4(1e+100,0.0)"
85 | ]
86 | },
87 | {
88 | "cell_type": "markdown",
89 | "metadata": {},
90 | "source": [
91 | "## part (b)\n",
92 | "\n",
93 | "Compute $\\cot(x) - \\cot(x + y)$:"
94 | ]
95 | },
96 | {
97 | "cell_type": "code",
98 | "execution_count": null,
99 | "metadata": {},
100 | "outputs": [],
101 | "source": [
102 | "cotdiff(x,y) = ..."
103 | ]
104 | },
105 | {
106 | "cell_type": "markdown",
107 | "metadata": {},
108 | "source": [
109 | "Some tests:"
110 | ]
111 | },
112 | {
113 | "cell_type": "code",
114 | "execution_count": null,
115 | "metadata": {},
116 | "outputs": [],
117 | "source": [
118 | "cotdiff(1.0, 1e-20)"
119 | ]
120 | },
121 | {
122 | "cell_type": "code",
123 | "execution_count": null,
124 | "metadata": {},
125 | "outputs": [],
126 | "source": [
127 | "cotdiff(big(1.0), big(1e-20)) # compute in BigFloat precision"
128 | ]
129 | },
130 | {
131 | "cell_type": "markdown",
132 | "metadata": {},
133 | "source": [
134 | "# Problem 4: Addition, another way"
135 | ]
136 | },
137 | {
138 | "cell_type": "code",
139 | "execution_count": 7,
140 | "metadata": {},
141 | "outputs": [],
142 | "source": [
143 | "# Sum x[first:last]. This function works, but is a little slower than we would like.\n",
144 | "function div2sum(x, first=1, last=length(x))\n",
145 | " n = last - first + 1;\n",
146 | " if n < 2\n",
147 | " s = zero(eltype(x))\n",
148 | " for i = first:last\n",
149 | " s += x[i]\n",
150 | " end\n",
151 | " return s\n",
152 | " else\n",
153 | " mid = div(first + last, 2) # find middle as (first+last)/2, rounding down\n",
154 | " return div2sum(x, first, mid) + div2sum(x, mid+1, last)\n",
155 | " end\n",
156 | "end\n",
157 | "\n",
158 | "# check its accuracy for a set logarithmically spaced n's. Since div2sum is slow,\n",
159 | "# we won't go to very large n or use too many points\n",
160 | "N = round.(Int, 10 .^ range(1,7,length=50)) # 50 points from 10¹ to 10⁷\n",
161 | "err = Float64[]\n",
162 | "for n in N\n",
163 | " x = rand(Float32, n)\n",
164 | " xdouble = Float64.(x)\n",
165 | " push!(err, abs(div2sum(x) - sum(xdouble)) / abs(sum(xdouble)))\n",
166 | "end\n",
167 | "\n",
168 | "using PyPlot\n",
169 | "loglog(N, err, \"bo-\")\n",
170 | "title(\"simple div2sum\")\n",
171 | "xlabel(\"number of summands\")\n",
172 | "ylabel(\"relative error\")\n",
173 | "grid()"
174 | ]
175 | },
176 | {
177 | "cell_type": "markdown",
178 | "metadata": {},
179 | "source": [
180 | "Time it vs. the built-in `sum` (which is also written in Julia):"
181 | ]
182 | },
183 | {
184 | "cell_type": "code",
185 | "execution_count": 6,
186 | "metadata": {},
187 | "outputs": [
188 | {
189 | "name": "stdout",
190 | "output_type": "stream",
191 | "text": [
192 | " 0.047769 seconds (1 allocation: 16 bytes)\n"
193 | ]
194 | },
195 | {
196 | "name": "stdout",
197 | "output_type": "stream",
198 | "text": [
199 | " 0.046905 seconds (1 allocation: 16 bytes)\n",
200 | " 0.044345 seconds (1 allocation: 16 bytes)\n",
201 | " 0.002188 seconds (1 allocation: 16 bytes)\n",
202 | " 0.002432 seconds (1 allocation: 16 bytes)\n",
203 | " 0.002175 seconds (1 allocation: 16 bytes)\n"
204 | ]
205 | },
206 | {
207 | "data": {
208 | "text/plain": [
209 | "4.997705f6"
210 | ]
211 | },
212 | "metadata": {},
213 | "output_type": "display_data"
214 | }
215 | ],
216 | "source": [
217 | "x = rand(Float32, 10^7)\n",
218 | "@time div2sum(x)\n",
219 | "@time div2sum(x)\n",
220 | "@time div2sum(x)\n",
221 | "@time sum(x)\n",
222 | "@time sum(x)\n",
223 | "@time sum(x)"
224 | ]
225 | },
226 | {
227 | "cell_type": "markdown",
228 | "metadata": {},
229 | "source": [
230 | "You should notice that it's pretty darn slow compared to `sum`, although in an absolute sense it is pretty good. Make it faster:"
231 | ]
232 | },
233 | {
234 | "cell_type": "code",
235 | "execution_count": null,
236 | "metadata": {},
237 | "outputs": [],
238 | "source": [
239 | "function fast_div2sum(x, first=1, last=length(x))\n",
240 | " # ???\n",
241 | "end"
242 | ]
243 | }
244 | ],
245 | "metadata": {
246 | "@webio": {
247 | "lastCommId": null,
248 | "lastKernelId": null
249 | },
250 | "kernelspec": {
251 | "display_name": "Julia 1.6.3",
252 | "language": "julia",
253 | "name": "julia-1.6"
254 | },
255 | "language_info": {
256 | "file_extension": ".jl",
257 | "mimetype": "application/julia",
258 | "name": "julia",
259 | "version": "1.6.3"
260 | }
261 | },
262 | "nbformat": 4,
263 | "nbformat_minor": 1
264 | }
265 |
--------------------------------------------------------------------------------
/psets/notes_spring_2023/pset1.lyx:
--------------------------------------------------------------------------------
1 | #LyX 2.3 created this file. For more info see http://www.lyx.org/
2 | \lyxformat 544
3 | \begin_document
4 | \begin_header
5 | \save_transient_properties true
6 | \origin unavailable
7 | \textclass article
8 | \begin_preamble
9 |
10 | \renewcommand{\vec}[1]{\mathbf{#1}}
11 |
12 | \renewcommand{\labelenumi}{(\alph{enumi})}
13 | \renewcommand{\labelenumii}{(\roman{enumii})}
14 | \end_preamble
15 | \use_default_options false
16 | \maintain_unincluded_children false
17 | \language english
18 | \language_package default
19 | \inputencoding auto
20 | \fontencoding global
21 | \font_roman "default" "default"
22 | \font_sans "default" "default"
23 | \font_typewriter "default" "default"
24 | \font_math "auto" "auto"
25 | \font_default_family default
26 | \use_non_tex_fonts false
27 | \font_sc false
28 | \font_osf false
29 | \font_sf_scale 100 100
30 | \font_tt_scale 100 100
31 | \use_microtype false
32 | \use_dash_ligatures true
33 | \graphics default
34 | \default_output_format default
35 | \output_sync 0
36 | \bibtex_command default
37 | \index_command default
38 | \paperfontsize default
39 | \spacing single
40 | \use_hyperref false
41 | \papersize default
42 | \use_geometry true
43 | \use_package amsmath 1
44 | \use_package amssymb 1
45 | \use_package cancel 1
46 | \use_package esint 0
47 | \use_package mathdots 0
48 | \use_package mathtools 1
49 | \use_package mhchem 1
50 | \use_package stackrel 1
51 | \use_package stmaryrd 1
52 | \use_package undertilde 1
53 | \cite_engine basic
54 | \cite_engine_type default
55 | \biblio_style plain
56 | \use_bibtopic false
57 | \use_indices false
58 | \paperorientation portrait
59 | \suppress_date false
60 | \justification true
61 | \use_refstyle 0
62 | \use_minted 0
63 | \index Index
64 | \shortcut idx
65 | \color #008000
66 | \end_index
67 | \topmargin 1in
68 | \secnumdepth 3
69 | \tocdepth 3
70 | \paragraph_separation indent
71 | \paragraph_indentation default
72 | \is_math_indent 0
73 | \math_numbering_side default
74 | \quotes_style english
75 | \dynamic_quotes 0
76 | \papercolumns 2
77 | \papersides 2
78 | \paperpagestyle default
79 | \tracking_changes false
80 | \output_changes false
81 | \html_math_output 0
82 | \html_css_as_file 0
83 | \html_be_strict false
84 | \end_header
85 |
86 | \begin_body
87 |
88 | \begin_layout Section*
89 | 18.335 Problem Set 1
90 | \end_layout
91 |
92 | \begin_layout Standard
93 | Due February 24, 2023 at 11:59pm.
94 | You should submit your problem set
95 | \series bold
96 | electronically
97 | \series default
98 | on the 18.335 Gradescope page.
99 | Submit
100 | \series bold
101 | both
102 | \series default
103 | a
104 | \emph on
105 | scan
106 | \emph default
107 | of any handwritten solutions (I recommend an app like TinyScanner or similar
108 | to create a good-quality black-and-white
109 | \begin_inset Quotes eld
110 | \end_inset
111 |
112 | thresholded
113 | \begin_inset Quotes erd
114 | \end_inset
115 |
116 | scan) and
117 | \series bold
118 | also
119 | \series default
120 | a
121 | \emph on
122 | PDF printout
123 | \emph default
124 | of the Julia notebook of your computer solutions.
125 | A
126 | \series bold
127 | template Julia notebook is posted
128 | \series default
129 | in the 18.335 web site to help you get started.
130 | \end_layout
131 |
132 | \begin_layout Subsection*
133 | Problem 0: Pset Honor Code
134 | \end_layout
135 |
136 | \begin_layout Standard
137 | Include the following statement in your solutions:
138 | \end_layout
139 |
140 | \begin_layout Quotation
141 |
142 | \emph on
143 | I will not look at 18.335 pset solutions from previous semesters.
144 | I may discuss problems with my classmates or others, but I will write up
145 | my solutions on my own.
146 |
151 | \end_layout
152 |
153 | \begin_layout Subsection*
154 | Problem 1: Jupyter notebook
155 | \end_layout
156 |
157 | \begin_layout Standard
158 | On the course home page,
159 | \begin_inset Quotes eld
160 | \end_inset
161 |
162 | launch a Julia environment in the cloud
163 | \begin_inset Quotes erd
164 | \end_inset
165 |
166 | and open the
167 | \begin_inset Quotes eld
168 | \end_inset
169 |
170 | Floating-Point-Intro.ipynb
171 | \begin_inset Quotes erd
172 | \end_inset
173 |
174 | notebook in the notes folder.
175 | Read through it, get familiar with Julia and the notebook environment,
176 | and play! (You don't need to submit a notebook print out or turn in work
177 | for this question.)
178 | \end_layout
179 |
180 | \begin_layout Subsection*
181 | Problem 2: Floating point
182 | \end_layout
183 |
184 | \begin_layout Standard
185 | (From Trefethen and Bau, Exercise 13.2.) The floating point numbers
186 | \begin_inset Formula $\mathbb{F}$
187 | \end_inset
188 |
189 | can be written compactly in the form (e.g., see (13.2) in Trefethen and Bau)
190 |
191 | \begin_inset Formula $x=\pm(m/\beta^{t})\beta^{e}$
192 | \end_inset
193 |
194 | , with integer base
195 | \begin_inset Formula $\beta\geq2$
196 | \end_inset
197 |
198 | , significand
199 | \begin_inset Formula $\beta^{-t}\leq m/\beta^{t}\leq1$
200 | \end_inset
201 |
202 | , and integer exponent
203 | \begin_inset Formula $e$
204 | \end_inset
205 |
206 | .
207 | This floating point system includes many integers, but not all of them.
208 | \end_layout
209 |
210 | \begin_layout Enumerate
211 | Give an exact formula for the smallest positive integer
212 | \begin_inset Formula $n$
213 | \end_inset
214 |
215 | that does not belong to
216 | \begin_inset Formula $\mathbb{F}$
217 | \end_inset
218 |
219 | .
220 | \end_layout
221 |
222 | \begin_layout Enumerate
223 | In particular, what are the values of
224 | \begin_inset Formula $n$
225 | \end_inset
226 |
227 | for IEEE single and double precision arithmetic?
228 | \end_layout
229 |
230 | \begin_layout Enumerate
231 | Figure out a way to verify this result for your own computer.
232 | Specifically, design and run a program that produces evidence that
233 | \begin_inset Formula $n-3$
234 | \end_inset
235 |
236 | ,
237 | \begin_inset Formula $n-2$
238 | \end_inset
239 |
240 | , and
241 | \begin_inset Formula $n-1$
242 | \end_inset
243 |
244 | belong to
245 | \begin_inset Formula $\mathbb{F}$
246 | \end_inset
247 |
248 | but
249 | \begin_inset Formula $n$
250 | \end_inset
251 |
252 | does not.
253 | What about
254 | \begin_inset Formula $n+1$
255 | \end_inset
256 |
257 | ,
258 | \begin_inset Formula $n+2$
259 | \end_inset
260 |
261 | , and
262 | \begin_inset Formula $n+3$
263 | \end_inset
264 |
265 | ?
266 | \end_layout
267 |
268 | \begin_layout Standard
269 | (In part (c), you can use Julia, which employs IEEE double precision by
270 | default.
271 | However, unlike Matlab, Julia distinguishes between integer and floating-point
272 | scalars.
273 | For example,
274 | \family typewriter
275 | 2^50
276 | \family default
277 | in Julia will produce a 64-bit integer result; to get a 64-bit/double floating-
278 | point result, do e.g.
279 |
280 | \family typewriter
281 | 2.0^50
282 | \family default
283 | instead.)
284 | \end_layout
285 |
286 | \begin_layout Subsection*
287 | Problem 3: Funny functions
288 | \end_layout
289 |
290 | \begin_layout Enumerate
291 | Write a function
292 | \family typewriter
293 | L4(x,y)
294 | \family default
295 | in Julia to compute the
296 | \begin_inset Formula $L_{4}$
297 | \end_inset
298 |
299 | norm
300 | \begin_inset Formula $(|x|^{4}+|y|^{4})^{1/4}$
301 | \end_inset
302 |
303 | of two scalars
304 | \begin_inset Formula $x$
305 | \end_inset
306 |
307 | and
308 | \begin_inset Formula $y$
309 | \end_inset
310 |
311 | .
312 | Does your code give an accurate answer for
313 | \family typewriter
314 | L4(1e-100,0.0)
315 | \family default
316 | ? What about
317 | \family typewriter
318 | L4(1e+100,0.0)
319 | \family default
320 | ? Without using arbitrary-precision (
321 | \family typewriter
322 | BigFloat
323 | \family default
324 | ) calculations,
325 | \series bold
326 | fix your code
327 | \series default
328 | so that it gives an answer whose relative error
329 | \begin_inset Formula $\frac{|\text{computed}-\text{correct}|}{|\text{correct}|}$
330 | \end_inset
331 |
332 | is within a small multiple of
333 | \family typewriter
334 | eps()
335 | \family default
336 | =
337 | \begin_inset Formula $\epsilon_{\text{machine}}$
338 | \end_inset
339 |
340 | (a few
341 | \begin_inset Quotes eld
342 | \end_inset
343 |
344 | ulps
345 | \begin_inset Quotes erd
346 | \end_inset
347 |
348 | , or
349 | \begin_inset Quotes eld
350 | \end_inset
351 |
352 | units in the last place
353 | \begin_inset Quotes erd
354 | \end_inset
355 |
356 | ) of the exactly rounded answer for all double-precision
357 | \begin_inset Formula $x$
358 | \end_inset
359 |
360 | and
361 | \begin_inset Formula $y$
362 | \end_inset
363 |
364 | .
365 | (You can test your code by comparing to
366 | \family typewriter
367 | L4(big(x),big(y))
368 | \family default
369 | , i.e.
370 | arbitrary-precision calculation.)
371 | \end_layout
372 |
373 | \begin_layout Enumerate
374 | Write a function
375 | \family typewriter
376 | cotdiff(x,y)
377 | \family default
378 | that computes
379 | \begin_inset Formula $\cot(x)-\cot(x+y)$
380 | \end_inset
381 |
382 | .
383 | Does your code give an accurate answer for
384 | \family typewriter
385 | cotdiff(1.0, 1e-20)
386 | \family default
387 | ? Without using arbitrary-precision (
388 | \family typewriter
389 | BigFloat
390 | \family default
391 | ) calculations,
392 | \series bold
393 | fix your code
394 | \series default
395 | so that it gives an accurate
396 | \family typewriter
397 | Float64
398 | \family default
399 | answer (within a few ulps) even when
400 | \begin_inset Formula $|y|\ll|x|$
401 | \end_inset
402 |
403 | (without hurting the accuracy when
404 | \begin_inset Formula $y$
405 | \end_inset
406 |
407 | and
408 | \begin_inset Formula $x$
409 | \end_inset
410 |
411 | are comparable!).
412 | (Hint: one option would be to switch over to Taylor expansion when
413 | \begin_inset Formula $|y|/|x|$
414 | \end_inset
415 |
416 | is sufficiently small, but a simpler solution is possible by applying some
417 | trigonometric identities.)
418 | \end_layout
419 |
420 | \begin_layout Subsection*
421 | Problem 4: Addition, another way
422 | \end_layout
423 |
424 | \begin_layout Standard
425 | Here you will analyze
426 | \begin_inset Formula $f(x)=\sum_{i=1}^{n}x_{i}$
427 | \end_inset
428 |
429 | , but you will compute
430 | \begin_inset Formula $\tilde{f}(x)$
431 | \end_inset
432 |
433 | in a different way from the naive sum considered in class.
434 | In particular, compute
435 | \begin_inset Formula $\tilde{f}(x)$
436 | \end_inset
437 |
438 | by a recursive divide-and-conquer approach, recursively dividing the set
439 | of values to be summed in two halves and then summing the halves:
440 | \begin_inset Formula
441 | \[
442 | \tilde{f}(x)=\begin{cases}
443 | 0 & \mbox{if }n=0\\
444 | x_{1} & \mbox{if }n=1\\
445 | \tilde{f}(x_{1:\left\lfloor n/2\right\rfloor })\oplus\tilde{f}(x_{\left\lfloor n/2\right\rfloor +1:n}) & \mbox{if }n>1
446 | \end{cases},
447 | \]
448 |
449 | \end_inset
450 |
451 | where
452 | \begin_inset Formula $\left\lfloor y\right\rfloor $
453 | \end_inset
454 |
455 | denotes the greatest integer
456 | \begin_inset Formula $\leq y$
457 | \end_inset
458 |
459 | (i.e.
460 |
461 | \begin_inset Formula $y$
462 | \end_inset
463 |
464 | rounded down).
465 | In exact arithmetic, this computes
466 | \begin_inset Formula $f(x)$
467 | \end_inset
468 |
469 | exactly, but in floating-point arithmetic this will have very different
470 | error characteristics than the simple loop-based summation in class.
471 | \end_layout
472 |
473 | \begin_layout Enumerate
474 | For simplicity, assume
475 | \begin_inset Formula $n$
476 | \end_inset
477 |
478 | is a power of 2 (so that the set of numbers to add divides evenly in two
479 | at each stage of the recursion).
480 | Prove that
481 | \begin_inset Formula $|\tilde{f}(x)-f(x)|\leq\epsilon_{\mbox{machine}}\log_{2}(n)\sum_{i=1}^{n}|x_{i}|+O(\epsilon_{\mbox{machine}}^{2})$
482 | \end_inset
483 |
484 | .
485 | That is, show that the worst-case error bound grows
486 | \emph on
487 | logarithmically
488 | \emph default
489 | rather than
490 | \emph on
491 | linearly
492 | \emph default
493 | with
494 | \begin_inset Formula $n$
495 | \end_inset
496 |
497 | !
498 | \end_layout
499 |
500 | \begin_layout Enumerate
501 | Pete R.
502 | Stunt, a Microsoft employee, complains,
503 | \begin_inset Quotes eld
504 | \end_inset
505 |
506 | While doing this kind of recursion may have nice error characteristics in
507 | theory, it is ridiculous in the real world because it will be insanely
508 | slow—I'm proud of my efficient software and can't afford to have a function-cal
509 | l overhead for every number I want to add!
510 | \begin_inset Quotes erd
511 | \end_inset
512 |
513 | Explain to Pete how to implement a slight variation of this algorithm with
514 | the same logarithmic error bounds (possibly with a worse constant factor)
515 | but roughly the same performance as a simple loop.
516 | \end_layout
517 |
518 | \begin_layout Enumerate
519 | In the pset 1 Julia notebook, there is a function
520 | \begin_inset Quotes eld
521 | \end_inset
522 |
523 | div2sum
524 | \begin_inset Quotes erd
525 | \end_inset
526 |
527 | that computes
528 | \begin_inset Formula $\tilde{f}(x)=$
529 | \end_inset
530 |
531 |
532 | \family typewriter
533 | div2sum(x)
534 | \family default
535 | in single precision by the above algorithm.
536 | Modify it to not be horrendously slow via your suggestion in (b), and then
537 | plot its errors for random inputs as a function of
538 | \begin_inset Formula $n$
539 | \end_inset
540 |
541 | with the help of the example code in the Julia notebook (but with a larger
542 | range of lengths
543 | \begin_inset Formula $n$
544 | \end_inset
545 |
546 | ).
547 | Are your results consistent with your error bounds above?
548 | \end_layout
549 |
550 | \end_body
551 | \end_document
552 |
--------------------------------------------------------------------------------
/psets/notes_spring_2023/pset1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/psets/notes_spring_2023/pset1.pdf
--------------------------------------------------------------------------------
/psets/notes_spring_2023/pset2.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "attachments": {},
5 | "cell_type": "markdown",
6 | "metadata": {},
7 | "source": [
8 | "# 18.335 Problem Set 2\n",
9 | "\n",
10 | "This notebook accompanies the second problem set posted on the [18.335 web page](https://github.com/mitmath/18335), and is here to get you started with your own Julia computations.\n",
11 | "\n",
12 | "Download this notebook (a `pset2.ipynb` file) by **right-clicking the download link** at the upper-right to *Save As* a file, and then drag this file into your Jupyter dashboard to upload it (e.g. on [](https://mybinder.org/v2/gh/mitmath/binder-env/main?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252Fmitmath%252F18335%26urlpath%3Dtree%252F18335%252F%26branch%3Dmaster) or in a local installation).\n",
13 | "\n",
14 | "Modify it as needed, then choose **Print Preview** from the \"File\" menu and *print to a PDF* file to submit electronically."
15 | ]
16 | },
17 | {
18 | "attachments": {},
19 | "cell_type": "markdown",
20 | "metadata": {},
21 | "source": [
22 | "# Problem 2: Banded factorization of a finite-difference matrix"
23 | ]
24 | },
25 | {
26 | "attachments": {},
27 | "cell_type": "markdown",
28 | "metadata": {},
29 | "source": [
30 | "Develop your algorithm for a banded factorization of $A=I+\\sigma D$ in the following sections. The flops required to compute the factorization should scale linearly with the dimension of $A$ (e.g., if $A$ is $n\\times n$, then ${\\rm \\#flops}=\\mathcal{O}(n))$. Here are the problem parameters and a banded representation of our finite-difference discretization to get you started."
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": null,
36 | "metadata": {},
37 | "outputs": [],
38 | "source": [
39 | "using LinearAlgebra\n",
40 | "using BandedMatrices\n",
41 | "\n",
42 | "## PDE and discretization parameters\n",
43 | "α = 1 # velocity\n",
44 | "n = 199 # discretization size\n",
45 | "Δx = 1 / (n+1) # grid spacing\n",
46 | "Δt = 0.005 # time step\n",
47 | "σ = α*Δt/Δx # shift\n",
48 | "\n",
49 | "## scaled 2nd order central difference matrix plus identity\n",
50 | "D = BandedMatrix(-2 => ones(n-2)/12, -1 => -2*ones(n-1)/3, 0=> zeros(n), 1 => 2*ones(n-1)/3, 2 => -ones(n-2)/12)\n",
51 | "A = BandedMatrix(Eye(n), (2,2)) + σ * D"
52 | ]
53 | },
54 | {
55 | "attachments": {},
56 | "cell_type": "markdown",
57 | "metadata": {},
58 | "source": [
59 | "Your algorithm goes here! You can overwrite the factors $L$, $D$, and $U$ on the copy of $A$ by writing the entries in $L$ and $U$ to the strictly lower and upper triangular entries and $D$ on the diagonal (no need to store the unit diagonal entries of $L$ and $U$ explicitly)."
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": null,
65 | "metadata": {},
66 | "outputs": [],
67 | "source": [
68 | "function ldu( A )\n",
69 | " # YOUR SOLUTION HERE: compute banded factorization A = LDU via elimination\n",
70 | " \n",
71 | " # write factors L, D, and U onto a copy of A \n",
72 | " F = copy(A)\n",
73 | "\n",
74 | " # banded elimination\n",
75 | "\n",
76 | " return F\n",
77 | "end"
78 | ]
79 | },
80 | {
81 | "attachments": {},
82 | "cell_type": "markdown",
83 | "metadata": {},
84 | "source": [
85 | "Let's check the backward error $\\|\\Delta A\\| = \\|A-LDU\\|$ in the computed factorization."
86 | ]
87 | },
88 | {
89 | "cell_type": "code",
90 | "execution_count": null,
91 | "metadata": {},
92 | "outputs": [],
93 | "source": [
94 | "F = ldu(A)\n",
95 | "D = BandedMatrix(0 => diag(F))\n",
96 | "U = UpperTriangular(F) - D + BandedMatrix(Eye(n), (0,2))\n",
97 | "L = LowerTriangular(F) - D + BandedMatrix(Eye(n), (2,0))\n",
98 | "norm(A - L*D*U)"
99 | ]
100 | },
101 | {
102 | "attachments": {},
103 | "cell_type": "markdown",
104 | "metadata": {},
105 | "source": [
106 | "Finally, let's use our factorization to solve the advection equation with the square-wave initial condition $$u(0,x) = \\begin{cases} 0,\\qquad |x-1/2| > 0.1 \\\\ 1, \\qquad |x-1/2| \\leq 0.1 \\end{cases}$$\n",
107 | "\n",
108 | "Provide a function to advance the solution from $u_k$ to $u_{k+1}$, using only the factors $L$, $D$, and $U$, in the segment below.\n",
109 | "\n"
110 | ]
111 | },
112 | {
113 | "cell_type": "code",
114 | "execution_count": null,
115 | "metadata": {},
116 | "outputs": [],
117 | "source": [
118 | "function advec_step(L, D, U, uk)\n",
119 | " # YOUR SOLUTION HERE: advance advection solution ub one time step with LDU factorization of finite-difference discretization\n",
120 | "\n",
121 | " return ukp1\n",
122 | "end"
123 | ]
124 | },
125 | {
126 | "attachments": {},
127 | "cell_type": "markdown",
128 | "metadata": {},
129 | "source": [
130 | "Now, we'll take a look at the initial condition and the numerical solution."
131 | ]
132 | },
133 | {
134 | "cell_type": "code",
135 | "execution_count": null,
136 | "metadata": {},
137 | "outputs": [],
138 | "source": [
139 | "using Plots\n",
140 | "\n",
141 | "# initial condition\n",
142 | "b = zeros(n)\n",
143 | "b[80:120] = ones(41)\n",
144 | "plot(range(0.01, 0.99, length=n), b)"
145 | ]
146 | },
147 | {
148 | "attachments": {},
149 | "cell_type": "markdown",
150 | "metadata": {},
151 | "source": [
152 | "In the exact (weak) solution, the square-wave moves to the right with velocity $v=1$, i.e., $u(x,t)=u(0,x-vt)$ (at least, until it hits the boundary). What do you observe in the numerical solution?\n",
153 | "\n",
154 | "Try out the second gaussian initial condition too!"
155 | ]
156 | },
157 | {
158 | "cell_type": "code",
159 | "execution_count": null,
160 | "metadata": {},
161 | "outputs": [],
162 | "source": [
163 | "# initial condition 1 (square-wave)\n",
164 | "b = zeros(n)\n",
165 | "b[80:120] = ones(41)\n",
166 | "\n",
167 | "# initial condition 2 (gaussian)\n",
168 | "#b = zeros(n)\n",
169 | "#x = range(0.01, 0.99, length=n)\n",
170 | "#b = exp.(-200*(x.-0.25).^2)\n",
171 | "\n",
172 | "# time stepping gif\n",
173 | "anim = Animation()\n",
174 | "m = 100 # number of steps in time \n",
175 | "for k ∈ 1:m # animate solution\n",
176 | " plot(range(0.01, 0.99, length=n), b, linecolor = :blue, legend = false)\n",
177 | " ylims!(0.0,1.5)\n",
178 | " b = advec_step(L,D,U,b)\n",
179 | " frame(anim)\n",
180 | "end\n",
181 | "gif(anim)\n",
182 | "\n",
183 | " "
184 | ]
185 | },
186 | {
187 | "attachments": {},
188 | "cell_type": "markdown",
189 | "metadata": {},
190 | "source": [
191 | "For a deeper understanding of the movie, one needs to go beyond linear algebra and understand the approximation properties of finite-difference schemes for partial differential equations like the advection equation."
192 | ]
193 | },
194 | {
195 | "attachments": {},
196 | "cell_type": "markdown",
197 | "metadata": {},
198 | "source": [
199 | "# Problem 3: Regularized least-squares solutions"
200 | ]
201 | },
202 | {
203 | "attachments": {},
204 | "cell_type": "markdown",
205 | "metadata": {},
206 | "source": [
207 | "Implement a structure-exploiting Givens-based QR solver for the least-squares problem $$x_* = {\\rm argmin}_x\\left\\|\\begin{pmatrix}A \\\\ \\\\ \\sqrt{\\lambda}I \\end{pmatrix}x - \\begin{pmatrix}b \\\\ \\\\ 0 \\end{pmatrix}\\right\\|_2^2.$$"
208 | ]
209 | },
210 | {
211 | "cell_type": "code",
212 | "execution_count": null,
213 | "metadata": {},
214 | "outputs": [],
215 | "source": [
216 | "function qrsolve(A, b, λ)\n",
217 | " ## YOUR SOLUTION HERE\n",
218 | "\n",
219 | " return xmin\n",
220 | "end\n"
221 | ]
222 | },
223 | {
224 | "attachments": {},
225 | "cell_type": "markdown",
226 | "metadata": {},
227 | "source": [
228 | "## NOT FOR CREDIT\n",
229 | "\n",
230 | "The rest of this notebook explores solutions to the regularized least-squares problem. Experiment if you are curious! There are no tasks \"for credit\" beyond this point."
231 | ]
232 | },
233 | {
234 | "attachments": {},
235 | "cell_type": "markdown",
236 | "metadata": {},
237 | "source": [
238 | "Consider the following $100\\times 50$ ill-conditioned least-squares problem $Ax=b$, where the last $20$ singular values of $A$ decay rapidly to about $10^{-6}$."
239 | ]
240 | },
241 | {
242 | "cell_type": "code",
243 | "execution_count": null,
244 | "metadata": {},
245 | "outputs": [],
246 | "source": [
247 | "# singular values with decay profile\n",
248 | "x = 1:50\n",
249 | "v = 1e-6 .+ (1 ./ (1 .+ exp.(2*(x .- 30))))\n",
250 | "plot(x,log10.(v), legend = false)"
251 | ]
252 | },
253 | {
254 | "cell_type": "code",
255 | "execution_count": null,
256 | "metadata": {},
257 | "outputs": [],
258 | "source": [
259 | "# matrix constructed from SVD\n",
260 | "U = qr(rand(100,100)).Q\n",
261 | "V = qr(randn(50,50)).Q\n",
262 | "Σ = [diagm(v); zeros(50,50)]\n",
263 | "A = U * Σ * V'\n",
264 | "cond(A)"
265 | ]
266 | },
267 | {
268 | "attachments": {},
269 | "cell_type": "markdown",
270 | "metadata": {},
271 | "source": [
272 | "Given a random right-hand side $b$, plot the terms $\\|Ax_*-b\\|$ and $\\|x_*\\|$ as $\\lambda\\rightarrow 0$."
273 | ]
274 | },
275 | {
276 | "cell_type": "code",
277 | "execution_count": null,
278 | "metadata": {},
279 | "outputs": [],
280 | "source": [
281 | "# random right-hand side\n",
282 | "b = randn(100)\n",
283 | "\n",
284 | "# \"exact\" least-squares solution (warning: ill-conditioned)\n",
285 | "x0 = A \\ b\n",
286 | "res = norm(A*x0 - b)\n",
287 | "\n",
288 | "# range of \\lambda\n",
289 | "l = 20\n",
290 | "p = LinRange(-1,15,l)\n",
291 | "λ = 10 .^ (-p)\n",
292 | "\n",
293 | "# iterate over \\lambda\n",
294 | "errx = zeros(l)\n",
295 | "resAx = zeros(l)\n",
296 | "normx = zeros(l)\n",
297 | "for j ∈ 1:l\n",
298 | " x = V * ( (Σ'*Σ+λ[j]*I) \\ (Σ' * U' *b) )\n",
299 | " resAx[j] = norm(A*x-b)\n",
300 | " normx[j] = norm(x)\n",
301 | "end\n",
302 | "\n",
303 | "p1 = plot(log10.(λ), resAx, legend = false, title = \"log10(||Ax-b||)\", xlabel = \"log10(lambda)\")\n",
304 | "plot!(log10.(λ), res * ones(length(λ)), legend = false)\n",
305 | "p2 = plot(log10.(λ), log10.(normx), legend = false, title = \"log10||x||\", xlabel = \"log10(lambda)\")\n",
306 | "plot!(log10.(λ), log10(norm(x0)) * ones(length(λ)), legend = false)\n",
307 | "display(p1)\n",
308 | "display(p2)\n"
309 | ]
310 | },
311 | {
312 | "attachments": {},
313 | "cell_type": "markdown",
314 | "metadata": {},
315 | "source": [
316 | "Now, try adjusting the singular value profile of $A$ (for example, lower the plateau at $10^{-6}$ to $10^{-9}$) and see what changes!"
317 | ]
318 | }
319 | ],
320 | "metadata": {
321 | "@webio": {
322 | "lastCommId": null,
323 | "lastKernelId": null
324 | },
325 | "kernelspec": {
326 | "display_name": "Julia 1.6.3",
327 | "language": "julia",
328 | "name": "julia-1.6"
329 | },
330 | "language_info": {
331 | "file_extension": ".jl",
332 | "mimetype": "application/julia",
333 | "name": "julia",
334 | "version": "1.6.3"
335 | }
336 | },
337 | "nbformat": 4,
338 | "nbformat_minor": 1
339 | }
340 |
--------------------------------------------------------------------------------
/psets/notes_spring_2023/pset2.lyx:
--------------------------------------------------------------------------------
1 | #LyX 2.3 created this file. For more info see http://www.lyx.org/
2 | \lyxformat 544
3 | \begin_document
4 | \begin_header
5 | \save_transient_properties true
6 | \origin unavailable
7 | \textclass article
8 | \begin_preamble
9 |
10 | \renewcommand{\vec}[1]{\mathbf{#1}}
11 |
12 | \renewcommand{\labelenumi}{(\alph{enumi})}
13 | \renewcommand{\labelenumii}{(\roman{enumii})}
14 | \end_preamble
15 | \use_default_options false
16 | \maintain_unincluded_children false
17 | \language english
18 | \language_package default
19 | \inputencoding auto
20 | \fontencoding global
21 | \font_roman "default" "default"
22 | \font_sans "default" "default"
23 | \font_typewriter "default" "default"
24 | \font_math "auto" "auto"
25 | \font_default_family default
26 | \use_non_tex_fonts false
27 | \font_sc false
28 | \font_osf false
29 | \font_sf_scale 100 100
30 | \font_tt_scale 100 100
31 | \use_microtype false
32 | \use_dash_ligatures true
33 | \graphics default
34 | \default_output_format default
35 | \output_sync 0
36 | \bibtex_command default
37 | \index_command default
38 | \paperfontsize default
39 | \spacing single
40 | \use_hyperref false
41 | \papersize default
42 | \use_geometry true
43 | \use_package amsmath 1
44 | \use_package amssymb 1
45 | \use_package cancel 1
46 | \use_package esint 0
47 | \use_package mathdots 0
48 | \use_package mathtools 1
49 | \use_package mhchem 1
50 | \use_package stackrel 1
51 | \use_package stmaryrd 1
52 | \use_package undertilde 1
53 | \cite_engine basic
54 | \cite_engine_type default
55 | \biblio_style plain
56 | \use_bibtopic false
57 | \use_indices false
58 | \paperorientation portrait
59 | \suppress_date false
60 | \justification true
61 | \use_refstyle 0
62 | \use_minted 0
63 | \index Index
64 | \shortcut idx
65 | \color #008000
66 | \end_index
67 | \topmargin 1in
68 | \secnumdepth 3
69 | \tocdepth 3
70 | \paragraph_separation indent
71 | \paragraph_indentation default
72 | \is_math_indent 0
73 | \math_numbering_side default
74 | \quotes_style english
75 | \dynamic_quotes 0
76 | \papercolumns 2
77 | \papersides 2
78 | \paperpagestyle default
79 | \tracking_changes false
80 | \output_changes false
81 | \html_math_output 0
82 | \html_css_as_file 0
83 | \html_be_strict false
84 | \end_header
85 |
86 | \begin_body
87 |
88 | \begin_layout Section*
89 | 18.335 Problem Set 2
90 | \end_layout
91 |
92 | \begin_layout Standard
93 | Due March 17, 2023 at 11:59pm.
94 | You should submit your problem set
95 | \series bold
96 | electronically
97 | \series default
98 | on the 18.335 Gradescope page.
99 | Submit
100 | \series bold
101 | both
102 | \series default
103 | a
104 | \emph on
105 | scan
106 | \emph default
107 | of any handwritten solutions (I recommend an app like TinyScanner or similar
108 | to create a good-quality black-and-white
109 | \begin_inset Quotes eld
110 | \end_inset
111 |
112 | thresholded
113 | \begin_inset Quotes erd
114 | \end_inset
115 |
116 | scan) and
117 | \series bold
118 | also
119 | \series default
120 | a
121 | \emph on
122 | PDF printout
123 | \emph default
124 | of the Julia notebook of your computer solutions.
125 | A
126 | \series bold
127 | template Julia notebook is posted
128 | \series default
129 | in the 18.335 web site to help you get started.
130 | \end_layout
131 |
132 | \begin_layout Subsection*
133 | Problem 0: Pset Honor Code
134 | \end_layout
135 |
136 | \begin_layout Standard
137 | Include the following statement in your solutions:
138 | \end_layout
139 |
140 | \begin_layout Quotation
141 |
142 | \emph on
143 | I will not look at 18.335 pset solutions from previous semesters.
144 | I may discuss problems with my classmates or others, but I will write up
145 | my solutions on my own.
146 |
151 | \end_layout
152 |
153 | \begin_layout Subsection*
154 | Problem 1: Stability and conditioning for linear systems
155 | \end_layout
156 |
157 | \begin_layout Enumerate
158 | (From Trefethen and Bau, Exercise 18.1.) Consider the example
159 | \begin_inset Formula
160 | \[
161 | A=\left(\begin{array}{cc}
162 | 1 & 1\\
163 | 1 & 1.0001\\
164 | 1 & 1.0001
165 | \end{array}\right),\text{ }b=\left(\begin{array}{c}
166 | 2\\
167 | 0.0001\\
168 | 4.0001
169 | \end{array}\right).
170 | \]
171 |
172 | \end_inset
173 |
174 |
175 | \end_layout
176 |
177 | \begin_deeper
178 | \begin_layout Enumerate
179 | What are the matrices
180 | \begin_inset Formula $A^{+}$
181 | \end_inset
182 |
183 | and
184 | \begin_inset Formula $P$
185 | \end_inset
186 |
187 | in this example? Give exact answers.
188 | \end_layout
189 |
190 | \begin_layout Enumerate
191 | Find the exact solutions
192 | \begin_inset Formula $x$
193 | \end_inset
194 |
195 | and
196 | \begin_inset Formula $y=Ax$
197 | \end_inset
198 |
199 | to the least squares problem
200 | \begin_inset Formula ${\rm x=argmin}_{v}\|Av-b\|_{2}$
201 | \end_inset
202 |
203 | .
204 | \end_layout
205 |
206 | \begin_layout Enumerate
207 | What are
208 | \begin_inset Formula $\kappa(A)$
209 | \end_inset
210 |
211 | ,
212 | \begin_inset Formula $\theta$
213 | \end_inset
214 |
215 | , and
216 | \begin_inset Formula $\eta$
217 | \end_inset
218 |
219 | ? Numerical answers computed with, e.g., Julia, are acceptable.
220 | \end_layout
221 |
222 | \begin_layout Enumerate
223 | What numerical values do the four condition numbers of Theorem 18.1 take
224 | for this problem?
225 | \end_layout
226 |
227 | \begin_layout Enumerate
228 | Give examples of perturbations
229 | \begin_inset Formula $\delta b$
230 | \end_inset
231 |
232 | and
233 | \begin_inset Formula $\delta$
234 | \end_inset
235 |
236 | A that approximately attain these four condition numbers.
237 | \end_layout
238 |
239 | \end_deeper
240 | \begin_layout Enumerate
241 | (From Trefethen and Bau, Exercise 21.6.) Suppose
242 | \begin_inset Formula $A\in\mathbb{\mathbb{C}}^{m\times m}$
243 | \end_inset
244 |
245 | is
246 | \emph on
247 | strictly column diagonally dominant
248 | \emph default
249 | , which means that for each column index
250 | \begin_inset Formula $k$
251 | \end_inset
252 |
253 | ,
254 | \begin_inset Formula
255 | \[
256 | |a_{kk}|>\sum_{j\neq k}|a_{jk}|.
257 | \]
258 |
259 | \end_inset
260 |
261 |
262 | \end_layout
263 |
264 | \begin_deeper
265 | \begin_layout Standard
266 | Show that if Gaussian elimination with partial pivoting is applied to
267 | \begin_inset Formula $A$
268 | \end_inset
269 |
270 | , no row interchanges take place.
271 | \end_layout
272 |
273 | \end_deeper
274 | \begin_layout Enumerate
275 | (From Trefethen and Bau, Exercise 23.2.) Using the proof of Theorem 16.2 as
276 | a guide, derive Theorem 23.3 from Theorems 23.2 and 17.1.
277 | In other words, show that solving symmetric positive definite (SPD) linear
278 | systems with a Cholesky factorization followed by forward and backward
279 | substitution is backward stable.
280 | \end_layout
281 |
282 | \begin_layout Subsection*
283 | Problem 2: Banded factorization of a finite-difference matrix
284 | \end_layout
285 |
286 | \begin_layout Standard
287 | Consider the advection equation
288 | \begin_inset Formula $\frac{\partial u}{\partial t}+\frac{\partial u}{\partial x}=0$
289 | \end_inset
290 |
291 | with
292 | \begin_inset Formula $(t,x)\in[0,T]\times[0,1]$
293 | \end_inset
294 |
295 | for some
296 | \begin_inset Formula $T>0$
297 | \end_inset
298 |
299 | .
300 | Let
301 | \begin_inset Formula $u(x,t)$
302 | \end_inset
303 |
304 | have initial condition
305 | \begin_inset Formula $u(x,0)=b(x)$
306 | \end_inset
307 |
308 | and boundary conditions
309 | \begin_inset Formula $u(0,t)=u(1,t)=0$
310 | \end_inset
311 |
312 | .
313 | For numerical approximation with finite-differences at equispaced times
314 |
315 | \begin_inset Formula $00$
472 | \end_inset
473 |
474 | , given by
475 | \begin_inset Formula
476 | \[
477 | x_{*}={\rm argmin}_{x}\|Ax-b\|_{2}^{2}+\lambda\|x\|_{2}^{2}.
478 | \]
479 |
480 | \end_inset
481 |
482 |
483 | \end_layout
484 |
485 | \begin_layout Enumerate
486 | Show that the regularized least-squares problem is equivalent to a standard
487 | least-squares problem,
488 | \begin_inset Formula
489 | \[
490 | x_{*}={\rm argmin}_{x}\left\Vert \left(\begin{array}{c}
491 | A\\
492 | \sqrt{\lambda}I
493 | \end{array}\right)x-\left(\begin{array}{c}
494 | b\\
495 | 0
496 | \end{array}\right)\right\Vert _{2}^{2}.
497 | \]
498 |
499 | \end_inset
500 |
501 |
502 | \end_layout
503 |
504 | \begin_layout Enumerate
505 | If the SVD of
506 | \begin_inset Formula $A$
507 | \end_inset
508 |
509 | is
510 | \begin_inset Formula $A=U\Sigma V^{*}$
511 | \end_inset
512 |
513 | , show that the unique solution is
514 | \begin_inset Formula
515 | \[
516 | x_{*}=V(\Sigma^{*}\Sigma+\lambda I)^{-1}\Sigma^{*}U^{*}b.
517 | \]
518 |
519 | \end_inset
520 |
521 |
522 | \end_layout
523 |
524 | \begin_layout Enumerate
525 | Under what conditions on
526 | \begin_inset Formula $A$
527 | \end_inset
528 |
529 | does the regularized solution converge to the usual least-squares solution
530 |
531 | \begin_inset Formula ${\rm argmin}_{x}||Ax-b\|_{2}^{2}$
532 | \end_inset
533 |
534 | in the limit
535 | \begin_inset Formula $\lambda\rightarrow0$
536 | \end_inset
537 |
538 | ?
539 | \end_layout
540 |
541 | \begin_layout Enumerate
542 | Describe a structure-exploiting Givens-based QR solver for the equivalent
543 | standard least-squares problem in part (a).
544 | How does the flop count compare to the standard QR solver for
545 | \begin_inset Formula ${\rm argmin}||Ax-b\|_{2}^{2}$
546 | \end_inset
547 |
548 | ?
549 | \end_layout
550 |
551 | \begin_layout Enumerate
552 | (Not for credit.) Play with the example in the accompanying Julia notebook!
553 | \end_layout
554 |
555 | \end_body
556 | \end_document
557 |
--------------------------------------------------------------------------------
/psets/notes_spring_2023/pset2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/psets/notes_spring_2023/pset2.pdf
--------------------------------------------------------------------------------
/psets/notes_spring_2023/pset3.lyx:
--------------------------------------------------------------------------------
1 | #LyX 2.3 created this file. For more info see http://www.lyx.org/
2 | \lyxformat 544
3 | \begin_document
4 | \begin_header
5 | \save_transient_properties true
6 | \origin unavailable
7 | \textclass article
8 | \begin_preamble
9 |
10 | \renewcommand{\vec}[1]{\mathbf{#1}}
11 |
12 | \renewcommand{\labelenumi}{(\alph{enumi})}
13 | \renewcommand{\labelenumii}{(\roman{enumii})}
14 |
15 | \newcommand{\tr}{\operatorname{tr}}
16 | \end_preamble
17 | \use_default_options false
18 | \maintain_unincluded_children false
19 | \language english
20 | \language_package default
21 | \inputencoding auto
22 | \fontencoding global
23 | \font_roman "times" "default"
24 | \font_sans "default" "default"
25 | \font_typewriter "default" "default"
26 | \font_math "auto" "auto"
27 | \font_default_family default
28 | \use_non_tex_fonts false
29 | \font_sc false
30 | \font_osf false
31 | \font_sf_scale 100 100
32 | \font_tt_scale 100 100
33 | \use_microtype false
34 | \use_dash_ligatures true
35 | \graphics default
36 | \default_output_format default
37 | \output_sync 0
38 | \bibtex_command default
39 | \index_command default
40 | \paperfontsize default
41 | \spacing single
42 | \use_hyperref false
43 | \papersize default
44 | \use_geometry true
45 | \use_package amsmath 2
46 | \use_package amssymb 2
47 | \use_package cancel 1
48 | \use_package esint 0
49 | \use_package mathdots 1
50 | \use_package mathtools 1
51 | \use_package mhchem 1
52 | \use_package stackrel 1
53 | \use_package stmaryrd 1
54 | \use_package undertilde 1
55 | \cite_engine basic
56 | \cite_engine_type default
57 | \biblio_style plain
58 | \use_bibtopic false
59 | \use_indices false
60 | \paperorientation portrait
61 | \suppress_date false
62 | \justification true
63 | \use_refstyle 0
64 | \use_minted 0
65 | \index Index
66 | \shortcut idx
67 | \color #008000
68 | \end_index
69 | \topmargin 1in
70 | \secnumdepth 3
71 | \tocdepth 3
72 | \paragraph_separation indent
73 | \paragraph_indentation default
74 | \is_math_indent 0
75 | \math_numbering_side default
76 | \quotes_style english
77 | \dynamic_quotes 0
78 | \papercolumns 2
79 | \papersides 2
80 | \paperpagestyle default
81 | \tracking_changes false
82 | \output_changes false
83 | \html_math_output 0
84 | \html_css_as_file 0
85 | \html_be_strict false
86 | \end_header
87 |
88 | \begin_body
89 |
90 | \begin_layout Section*
91 | 18.335 Problem Set 3
92 | \end_layout
93 |
94 | \begin_layout Standard
95 | Due Friday, April 14, 2023 at 11:59pm.
96 | \end_layout
97 |
98 | \begin_layout Subsection*
99 | Problem 1: Generalized eigenproblems
100 | \end_layout
101 |
102 | \begin_layout Standard
103 | Consider the
104 | \begin_inset Quotes eld
105 | \end_inset
106 |
107 | generalized
108 | \begin_inset Quotes erd
109 | \end_inset
110 |
111 | eigenvalue problem
112 | \begin_inset Formula $Ax=\lambda Bx$
113 | \end_inset
114 |
115 | where
116 | \begin_inset Formula $A$
117 | \end_inset
118 |
119 | and
120 | \begin_inset Formula $B$
121 | \end_inset
122 |
123 | are Hermitian and
124 | \begin_inset Formula $B$
125 | \end_inset
126 |
127 | is positive-definite.
128 | You could turn this into an ordinary eigenvalue problem
129 | \begin_inset Formula $B^{-1}Ax=\lambda x$
130 | \end_inset
131 |
132 | , but then you disguise the Hermitian nature of the underlying problem.
133 | \end_layout
134 |
135 | \begin_layout Enumerate
136 | Show that the eigenvalue solutions
137 | \begin_inset Formula $\lambda$
138 | \end_inset
139 |
140 | of
141 | \begin_inset Formula $Ax=\lambda Bx$
142 | \end_inset
143 |
144 | must be real (with
145 | \begin_inset Formula $x\ne0$
146 | \end_inset
147 |
148 | as usual) and that eigenvectors
149 | \begin_inset Formula $x_{1}$
150 | \end_inset
151 |
152 | and
153 | \begin_inset Formula $x_{2}$
154 | \end_inset
155 |
156 | for distinct eigenvalues
157 | \begin_inset Formula $\lambda_{1}\ne\lambda_{2}$
158 | \end_inset
159 |
160 | must satisfy a modified orthogonality relationship
161 | \begin_inset Formula $x_{1}^{*}Bx_{2}=0$
162 | \end_inset
163 |
164 | .
165 | \end_layout
166 |
167 | \begin_layout Enumerate
168 | A Hermitian matrix
169 | \begin_inset Formula $A$
170 | \end_inset
171 |
172 | can be diagonalized as
173 | \begin_inset Formula $A=Q\Lambda Q^{*}$
174 | \end_inset
175 |
176 | ; write down an analogous diagonalization formula arising from the eigenvectors
177 |
178 | \begin_inset Formula $X$
179 | \end_inset
180 |
181 | (suitably normalized) and eigenvalues of the generalized problem, in terms
182 | of
183 | \begin_inset Formula $\{A,B,X,\Lambda\}$
184 | \end_inset
185 |
186 | .
187 | (Your formula should contain no matrix inverses, only adjoints like
188 | \begin_inset Formula $X^{*}$
189 | \end_inset
190 |
191 | .)
192 | \end_layout
193 |
194 | \begin_layout Enumerate
195 | Using the Cholesky factorization of
196 | \begin_inset Formula $B$
197 | \end_inset
198 |
199 | , show that you can derive an ordinary Hermitian eigenvalue problem
200 | \begin_inset Formula $Hy=\lambda y$
201 | \end_inset
202 |
203 | where
204 | \begin_inset Formula $H=H^{*}$
205 | \end_inset
206 |
207 | is Hermitian, the eigenvalues
208 | \begin_inset Formula $\lambda$
209 | \end_inset
210 |
211 | are the same as those of
212 | \begin_inset Formula $Ax=\lambda Bx$
213 | \end_inset
214 |
215 | , and there is a simple relationship between the corresponding eigenvectors
216 |
217 | \begin_inset Formula $x$
218 | \end_inset
219 |
220 | and
221 | \begin_inset Formula $y$
222 | \end_inset
223 |
224 | .
225 | \end_layout
226 |
227 | \begin_layout Subsection*
228 | Problem 2: Shifted-inverse iteration
229 | \end_layout
230 |
231 | \begin_layout Standard
232 | Trefethen, problem 27.5.
233 | \end_layout
234 |
235 | \begin_layout Subsection*
236 | Problem 3: Arnoldi
237 | \end_layout
238 |
239 | \begin_layout Standard
240 | Trefethen, problem 33.2.
241 | \end_layout
242 |
243 | \end_body
244 | \end_document
245 |
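
A compact sanity check of all three parts of Problem 1, written for the real symmetric case for simplicity (the complex Hermitian case is identical with `'` as conjugate-transpose); the size is an arbitrary choice, and the `X*BX = I` normalization is what LAPACK's generalized Hermitian solver should return:

```julia
using LinearAlgebra

n = 6
A = Symmetric(randn(n, n))          # Hermitian test matrix (real symmetric case)
M = randn(n, n)
B = Symmetric(M * M' + n * I)       # Hermitian positive-definite

# Part (c): with B = L L*, the pencil Ax = λBx becomes Hy = λy,
# where H = L⁻¹ A L⁻* is Hermitian and y = L* x.
L = cholesky(B).L
H = Symmetric(L \ Matrix(A) / L')
@show norm(eigvals(H) - eigvals(A, B))           # same real eigenvalues, ≈ 0

# Parts (a) and (b): eigenvectors normalized so X* B X = I, hence A = B X Λ X* B.
Λ, X = eigen(A, B)
@show norm(X' * B * X - I)                       # B-orthonormality from part (a)
@show norm(B * X * Diagonal(Λ) * X' * B - A)     # the part-(b) factorization
```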
--------------------------------------------------------------------------------
/psets/notes_spring_2023/pset3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/psets/notes_spring_2023/pset3.pdf
--------------------------------------------------------------------------------
/psets/pset1.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# 18.335 Problem Set 0.1\n",
8 | "\n",
9 | "This notebook accompanies the first problem set posted on the [18.335 web page](https://github.com/mitmath/18335), and is here to get you started with your own Julia computations.\n",
10 | "\n",
11 | "Download this notebook (a `pset1.ipynb` file) by **right-clicking the download link** at the upper-right to *Save As* a file, and then drag this file into your Jupyter dashboard to upload it (e.g. on [](https://mybinder.org/v2/gh/mitmath/binder-env/main?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252Fmitmath%252F18335%26urlpath%3Dtree%252F18335%252F%26branch%3Dmaster) or in a local installation).\n",
12 | "\n",
13 | "Modify it as needed, then choose **Print Preview** from the \"File\" menu and *print to a PDF* file to submit electronically."
14 | ]
15 | },
16 | {
17 | "cell_type": "markdown",
18 | "metadata": {},
19 | "source": [
20 | "# Problem 1: Floating point"
21 | ]
22 | },
23 | {
24 | "cell_type": "markdown",
25 | "metadata": {},
26 | "source": [
27 | "## part (a)\n",
28 | "\n",
29 | "Give your code to check the largest floating point integer below."
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": null,
35 | "metadata": {},
36 | "outputs": [],
37 | "source": []
38 | },
39 | {
40 | "cell_type": "markdown",
41 | "metadata": {},
42 | "source": [
43 | "## part (b)\n",
44 | "\n",
45 | "Give your code for computing the average of two floating-point numbers in the problem below."
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": null,
51 | "metadata": {
52 | "scrolled": true
53 | },
54 | "outputs": [],
55 | "source": [
56 | "# Install DecFP package\n",
57 | "using Pkg\n",
58 | "Pkg.add(\"DecFP\")"
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": null,
64 | "metadata": {},
65 | "outputs": [],
66 | "source": [
67 | "using DecFP"
68 | ]
69 | },
70 | {
71 | "cell_type": "code",
72 | "execution_count": null,
73 | "metadata": {},
74 | "outputs": [],
75 | "source": [
76 | "a = Dec32()\n",
77 | "b = Dec32()"
78 | ]
79 | },
80 | {
81 | "cell_type": "code",
82 | "execution_count": null,
83 | "metadata": {},
84 | "outputs": [],
85 | "source": [
86 | "(a+b)/2 >= a"
87 | ]
88 | },
89 | {
90 | "cell_type": "markdown",
91 | "metadata": {},
92 | "source": [
93 | "# Problem 2: Rounding error analysis"
94 | ]
95 | },
96 | {
97 | "cell_type": "markdown",
98 | "metadata": {},
99 | "source": [
100 | "## Part (a)\n",
101 | "\n",
102 | "You may compare the worst case rounding error you get in theory to the error you get in practice"
103 | ]
104 | },
105 | {
106 | "cell_type": "code",
107 | "execution_count": null,
108 | "metadata": {},
109 | "outputs": [],
110 | "source": []
111 | },
112 | {
113 | "cell_type": "markdown",
114 | "metadata": {},
115 | "source": [
116 | "## Part (b)"
117 | ]
118 | },
119 | {
120 | "cell_type": "code",
121 | "execution_count": null,
122 | "metadata": {},
123 | "outputs": [],
124 | "source": [
125 | "# Sum x[first:last].\n",
126 | "function div2sum(x, first=1, last=length(x))\n",
127 | " n = last - first + 1;\n",
128 | " if n < 2\n",
129 | " s = zero(eltype(x))\n",
130 | " for i = first:last\n",
131 | " s += x[i]\n",
132 | " end\n",
133 | " return s\n",
134 | " else\n",
135 | " mid = div(first + last, 2) # find middle as (first+last)/2, rounding down\n",
136 | " return div2sum(x, first, mid) + div2sum(x, mid+1, last)\n",
137 | " end\n",
138 | "end"
139 | ]
140 | },
141 | {
142 | "cell_type": "markdown",
143 | "metadata": {},
144 | "source": [
145 | "# Problem 3: Matrix norm"
146 | ]
147 | },
148 | {
149 | "cell_type": "markdown",
150 | "metadata": {},
151 | "source": [
152 | "## part (a)\n",
153 | "\n",
154 | "In the problem, you proved that $\\|A\\|_\\infty = \\underset{1\\leq i\\leq m}{\\max} \\|a_i^\\ast\\|_1$.\n",
155 | "\n",
156 | "You can check your result in Julia — it is a good habit to try out matrix properties on random matrices, both to check for mistakes and to get used to matrix calculations."
157 | ]
158 | },
159 | {
160 | "cell_type": "code",
161 | "execution_count": null,
162 | "metadata": {},
163 | "outputs": [],
164 | "source": [
165 | "using LinearAlgebra"
166 | ]
167 | },
168 | {
169 | "cell_type": "code",
170 | "execution_count": null,
171 | "metadata": {},
172 | "outputs": [],
173 | "source": [
174 | "A = "
175 | ]
176 | },
177 | {
178 | "cell_type": "code",
179 | "execution_count": null,
180 | "metadata": {},
181 | "outputs": [],
182 | "source": [
183 | "opnorm(A, Inf)"
184 | ]
185 | },
186 | {
187 | "cell_type": "markdown",
188 | "metadata": {},
189 | "source": [
190 | "# Problem 4: SVD"
191 | ]
192 | },
193 | {
194 | "cell_type": "markdown",
195 | "metadata": {},
196 | "source": [
197 | "## part (a)"
198 | ]
199 | },
200 | {
201 | "cell_type": "code",
202 | "execution_count": null,
203 | "metadata": {},
204 | "outputs": [],
205 | "source": [
206 | "C = "
207 | ]
208 | },
209 | {
210 | "cell_type": "code",
211 | "execution_count": null,
212 | "metadata": {},
213 | "outputs": [],
214 | "source": [
215 | "S = svd(C)\n",
216 | "S.S"
217 | ]
218 | },
219 | {
220 | "cell_type": "markdown",
221 | "metadata": {},
222 | "source": [
223 | "## part (b)"
224 | ]
225 | },
226 | {
227 | "cell_type": "code",
228 | "execution_count": null,
229 | "metadata": {},
230 | "outputs": [],
231 | "source": [
232 | "function srank(A)\n",
233 | " A_F = \n",
234 | " A_2 = \n",
235 | " return \n",
236 | "end"
237 | ]
238 | },
239 | {
240 | "cell_type": "code",
241 | "execution_count": null,
242 | "metadata": {},
243 | "outputs": [],
244 | "source": [
245 | "srank(A) <= rank(A)"
246 | ]
247 | }
248 | ],
249 | "metadata": {
250 | "@webio": {
251 | "lastCommId": null,
252 | "lastKernelId": null
253 | },
254 | "kernelspec": {
255 | "display_name": "Julia 1.11.3",
256 | "language": "julia",
257 | "name": "julia-1.11"
258 | },
259 | "language_info": {
260 | "file_extension": ".jl",
261 | "mimetype": "application/julia",
262 | "name": "julia",
263 | "version": "1.11.3"
264 | }
265 | },
266 | "nbformat": 4,
267 | "nbformat_minor": 1
268 | }
269 |
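
For problem 1(a), one reading (an assumption; the pset PDF has the precise wording) is the largest $n$ such that $0,1,\ldots,n$ are all exactly representable in `Float64`; Julia exposes it directly:

```julia
maxintfloat(Float64)      # 9.007199254740992e15 == 2^53
2.0^53 + 1 == 2.0^53      # true: 2^53 + 1 is not representable and rounds back down
2.0^53 - 1 == 2.0^53      # false: integers below 2^53 are still exact
```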
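For problem 1(b), the `Dec32` template above tests whether the computed average of two numbers can escape the interval $[a,b]$. In 7-significant-digit decimal arithmetic it can; the values below are one illustrative choice (an assumption, not the pset's):

```julia
using DecFP

a = parse(Dec32, "5.000001")
b = parse(Dec32, "5.000003")
a + b              # 10.00000: the exact sum 10.000004 needs 8 digits and is rounded
(a + b) / 2 >= a   # false! the rounded average 5.000000 falls below a
```

In binary floating point, by contrast, the rounded `(a+b)/2` always stays within $[a,b]$ unless `a+b` overflows (division by 2 is exact in binary); the alternative `a + (b - a)/2` avoids the overflow when `a` and `b` have the same sign.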
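For problem 2(b), a quick experiment contrasting `div2sum` (as defined in the notebook cell above, which must be run first) with naive left-to-right accumulation; the `Float32` input size is an arbitrary choice:

```julia
function naivesum(x)                   # left-to-right loop: worst-case error O(n·ϵ)
    s = zero(eltype(x))
    for v in x
        s += v
    end
    return s
end

x = rand(Float32, 10^7)
exact = sum(Float64, x)                # double-precision reference value
@show abs(naivesum(x) - exact) / exact
@show abs(div2sum(x) - exact) / exact  # pairwise: worst-case O(ϵ·log n), far smaller
```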
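For problem 3(a), the random-matrix check the notebook suggests can be as short as (sizes arbitrary):

```julia
using LinearAlgebra

A = randn(7, 5)
opnorm(A, Inf) ≈ maximum(sum(abs, A, dims=2))   # max row sum of absolute values
```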
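For problem 4(b), one way to fill in the `srank` template, assuming the usual definition of stable rank, $\mathrm{srank}(A)=\|A\|_F^2/\|A\|_2^2$ (check against the pset PDF):

```julia
using LinearAlgebra

function srank(A)
    A_F = norm(A)        # Frobenius norm: sqrt of the sum of squared singular values
    A_2 = opnorm(A)      # spectral norm: the largest singular value
    return (A_F / A_2)^2
end

A = randn(8, 5)
srank(A) <= rank(A)      # true: ‖A‖_F² = Σᵢ σᵢ² ≤ rank(A) · σ₁²
```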
--------------------------------------------------------------------------------
/psets/pset1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/psets/pset1.pdf
--------------------------------------------------------------------------------
/psets/pset2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/psets/pset2.pdf
--------------------------------------------------------------------------------
/psets/pset3.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/mitmath/18335/7ea6e53750958a6d886bdfa4b00a3d32d6f7c23b/psets/pset3.pdf
--------------------------------------------------------------------------------