├── README.md ├── homework ├── homework1.pdf └── homework2.pdf ├── note ├── intro_nn.pdf ├── lec1.pdf ├── lec2.pdf ├── lec3.pdf ├── lec4.pdf ├── lec5.pdf ├── lec6.pdf ├── lec7.1.pdf ├── lec7.2.pdf ├── lec7.pdf ├── lec9.1.pdf ├── lec9.2.pdf ├── lec9.3.pdf ├── overview.pdf └── recent_progresses.pdf ├── paper_list.md ├── pre_schedule.txt └── template ├── scribe.sty └── template.tex /README.md: -------------------------------------------------------------------------------- 1 | 2 | ## Mathematical Theory of Neural Network Models 3 | 4 | ### Announcements 5 | - **7/26**: Lectures 7 and 9 are out. 6 | - **7/25**: The paper review report is due on 8/2, 12 pm. 7 | - 7/19: The [schedule](pre_schedule.txt) of presentations is out. 8 | - 7/18: A draft of Lecture [4](note/lec4.pdf) is out. 9 | - 7/17: Drafts of Lectures [3](note/lec3.pdf), [5](note/lec5.pdf) and [6](note/lec6.pdf) are out. 10 | - 7/12: A draft of [Lecture 2](note/lec2.pdf) is out. 11 | - 7/12: Some references on random feature models, Barron spaces and the regularization theory of two-layer nets have been added. 12 | - 7/9: A draft of [Lecture 1](note/lec1.pdf) is out. 13 | - 7/9: [Homework 2](homework/homework2.pdf) is out. It is due on Tuesday, 7/16, 12 pm. 14 | - 7/6: [Homework 1](homework/homework1.pdf) is out. It is due on Friday, 7/12, 12 pm. 15 | 16 | 17 | ### Administrative information 18 | 19 | - **Instructors:** 20 | - [Weinan E](https://web.math.princeton.edu/~weinan/) 21 | - [Lei Wu](https://scholar.google.com/citations?user=CMweeYcAAAAJ&hl=en), leiwu@princeton.edu 22 | - Chao Ma, chaom@princeton.edu 23 | 24 | - **Time:** Tue: 2:00-5:00 pm; Thu: 2:00-5:00 pm; Fri: 3:00-5:00 pm. 25 | 26 | - **Location:** Room 515, [Teaching Building 2](https://maps.baidu.com/poi/%E5%8C%97%E4%BA%AC%E5%A4%A7%E5%AD%A6(%E7%87%95%E5%9B%AD%E6%A0%A1%E5%8C%BA)%E7%AC%AC%E4%BA%8C%E6%95%99%E5%AD%A6%E6%A5%BC(%E6%9D%8E%E5%85%86%E5%9F%BA%E6%A5%BC)/@12948834.869857343,4837581.844142513,19.6z?uid=82548a63754afc91735e80e4&primaryUid=10472254985355704340&ugc_type=3&ugc_ver=1&device_ratio=1&compat=1&querytype=detailConInfo&da_src=shareurl) 27 | 28 | 29 | 30 | 31 | ### Course Content 32 | **Description:** 33 | 34 | This course introduces the basic models for supervised learning, including the kernel method, two-layer neural networks and residual networks. We then present a unified approach to analyzing these models. 35 | 36 | 37 | **Topics:** 38 | 39 | - Supervised learning, generalization/approximation/estimation errors, a priori/a posteriori estimates 40 | - Kernel method, two-layer neural network, residual network 41 | - Reproducing kernel Hilbert space, Barron space, compositional function space 42 | - Rademacher complexity, margin, gradient descent, implicit regularization 43 | 44 | **Prerequisites:** 45 | 46 | - A solid background in linear algebra, real analysis and probability/measure theory 47 | - Basic knowledge of (convex) optimization and statistics 48 | 49 | 50 | ### Grading 51 | **Coursework:** 52 | - **Homework** (45%) 53 | - **Paper review** (45%): You are asked to choose a paper from this [paper list](paper_list.md) and write a review. The review should not only summarize the paper but also identify the novelty and limitations of its results. A good paper review at least attempts to answer the following four questions: 54 | - What is the main result of the paper? 55 | - Why is the result important and significant compared with other papers? 56 | - What are the limitations of the result? 57 | - What potential research directions are inspired by the paper? 
58 | 59 | You are required to give a presentation (15%) and submit a report of 3 pages (30%). 60 | 61 | - **Scribe notes** (10%): You are asked to scribe one lecture note in LaTeX. The scribe notes can be done in pairs. Please use this [template](template/). 62 | 63 | **Collaboration policy:** We encourage you to form study groups and discuss the coursework. However, you must write up all coursework from scratch on your own, without referring to anyone else's notes. 64 | 65 | 66 | 67 | ### Texts and References 68 | - [Peter Bartlett's course: Statistical Learning Theory](https://www.stat.berkeley.edu/~bartlett/courses/2014spring-cs281bstat241b/) 69 | - [MIT's course: Statistical Learning Theory](http://www.mit.edu/~9.520/fall18/) 70 | - [Mohri's book: Foundations of Machine Learning](https://cs.nyu.edu/~mohri/mlbook/) 71 | - [Shai Shalev-Shwartz's book: Understanding Machine Learning: From Theory to Algorithms](https://www.cs.huji.ac.il/~shais/UnderstandingMachineLearning/copy.html) 72 | 73 | --- 74 | ### Schedule (subject to change) 75 | 76 | #### Week 1 77 | - Tue 7/2: Introduction to supervised learning methods 78 | - [Lecture 1](note/lec1.pdf) 79 | - [Random Features for Large-Scale Kernel Machines](https://papers.nips.cc/paper/3182-random-features-for-large-scale-kernel-machines) 80 | - Thu 7/4: Overview of the mathematical theory of neural network models 81 | - [Lecture 2](note/lec2.pdf) 82 | - [Slides of Prof. E](note/overview.pdf) 83 | - [A priori estimates](https://en.wikipedia.org/wiki/A_priori_estimate) 84 | - Fri 7/5: Rademacher complexity, covering numbers, metric entropy and uniform bounds 85 | - [Lecture 3](note/lec3.pdf) 86 | - [Concentration inequalities](https://www.stat.berkeley.edu/~mjwain/stat210b/Chap2_TailBounds_Jan22_2015.pdf) 87 | 88 | #### Week 2 89 | - Reproducing kernel Hilbert space and the random feature model 90 | - [Lecture 4](note/lec4.pdf) 91 | - [What is an RKHS?](http://www.stats.ox.ac.uk/~sejdinov/teaching/atml14/Theory_2014.pdf) 92 | - [Uniform Approximation of Functions with Random Bases](https://people.eecs.berkeley.edu/~brecht/papers/08.Rah.Rec.Allerton.pdf) 93 | - Error estimates for the random feature model with explicit and implicit regularization 94 | - [Lecture 5](note/lec5.pdf) 95 | - The analysis of implicit regularization for the random feature model can be found in this [paper](https://arxiv.org/abs/1904.04326) 96 | - [Learning with SGD and Random Features](https://arxiv.org/abs/1807.06343) 97 | - [Optimal Rates for the Regularized Least-Squares Algorithm](https://link.springer.com/article/10.1007/s10208-006-0196-8) 98 | - Barron space and the regularization theory of two-layer neural networks 99 | - [Lecture 6](note/lec6.pdf) 100 | - Properties of the Barron space can be found in Section 2 of this [paper](https://arxiv.org/abs/1906.08039) 101 | - A priori estimates for regularized two-layer neural networks can be found in this [paper](https://arxiv.org/abs/1810.06397) 102 | - [The must-read classic paper of Andrew Barron](http://www.stat.yale.edu/~arb4/publications_files/UniversalApproximationBoundsForSuperpositionsOfASigmoidalFunction.pdf) (This is the first paper that provides an approximation rate without the curse of dimensionality; a paraphrased statement of the rate is sketched below.) 
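For orientation, here is a paraphrased sketch of the dimension-independent rate referred to in the last item above; the constants, the exact norm and the precise hypotheses are as in Barron's paper, and the notation ($C_f$ for the spectral norm, $\sigma$ for a sigmoidal activation, $\mu$ for a probability measure on a bounded domain) is only illustrative.

```latex
% Paraphrase of Barron's approximation rate (see the paper for exact constants/hypotheses).
% If f has a Fourier representation with finite spectral norm C_f, then two-layer
% sigmoidal networks with m units approximate f in L^2 at rate m^{-1/2},
% with no explicit dependence on the input dimension d.
\[
  C_f \;=\; \int_{\mathbb{R}^d} \|\omega\|\,\bigl|\hat f(\omega)\bigr|\,\mathrm{d}\omega \;<\; \infty
  \quad\Longrightarrow\quad
  \inf_{f_m} \;\|f - f_m\|_{L^2(\mu)}^2 \;\lesssim\; \frac{C_f^2}{m},
  \qquad
  f_m(x) \;=\; \sum_{k=1}^{m} a_k\,\sigma\bigl(w_k^\top x + b_k\bigr).
\]
```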
103 | 104 | #### Week 3 105 | - Implicit regularization for two-layer neural networks 106 | - Lecture [7](note/lec7.pdf), [7.1](note/lec7.1.pdf), [7.2](note/lec7.2.pdf) 107 | - The main material can be found in this [paper](https://arxiv.org/pdf/1904.04326v1.pdf) 108 | - A priori estimates for regularized deep residual networks 109 | - [A Priori Estimates of the Population Risk for Residual Networks](https://arxiv.org/abs/1903.02154) 110 | - The F-principle and its application in deep learning (Guest speakers: Zhiqin Xu, Yaoyu Zhang, Tao Luo) 111 | - An introduction to the F-principle: [Lecture 9.1](note/lec9.1.pdf) 112 | - Application of the F-principle to learning two-layer neural networks: [Lecture 9.2](note/lec9.2.pdf) 113 | - General theory of the F-principle: [Lecture 9.3](note/lec9.3.pdf) 114 | 115 | #### Week 4 116 | - Compositional function spaces for deep residual networks 117 | - The mathematical theory of compositional function spaces can be found in Section 3 of this [paper](https://arxiv.org/abs/1906.08039) 118 | - Overview of recent progress in theoretical deep learning 119 | - [Slides](note/recent_progresses.pdf) 120 | 121 | 122 | 123 | 124 | 125 | 126 | 127 | 128 | 129 | 130 | -------------------------------------------------------------------------------- /homework/homework1.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/homework/homework1.pdf -------------------------------------------------------------------------------- /homework/homework2.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/homework/homework2.pdf -------------------------------------------------------------------------------- /note/intro_nn.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/intro_nn.pdf -------------------------------------------------------------------------------- /note/lec1.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec1.pdf -------------------------------------------------------------------------------- /note/lec2.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec2.pdf -------------------------------------------------------------------------------- /note/lec3.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec3.pdf -------------------------------------------------------------------------------- /note/lec4.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec4.pdf -------------------------------------------------------------------------------- /note/lec5.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec5.pdf -------------------------------------------------------------------------------- /note/lec6.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec6.pdf -------------------------------------------------------------------------------- /note/lec7.1.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec7.1.pdf -------------------------------------------------------------------------------- /note/lec7.2.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec7.2.pdf -------------------------------------------------------------------------------- /note/lec7.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec7.pdf -------------------------------------------------------------------------------- /note/lec9.1.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec9.1.pdf -------------------------------------------------------------------------------- /note/lec9.2.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec9.2.pdf -------------------------------------------------------------------------------- /note/lec9.3.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/lec9.3.pdf -------------------------------------------------------------------------------- /note/overview.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/overview.pdf -------------------------------------------------------------------------------- /note/recent_progresses.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leiwu0/course.math_theory_nn/6a635d692305b383f719afbd285e74b0ce72db76/note/recent_progresses.pdf -------------------------------------------------------------------------------- /paper_list.md: -------------------------------------------------------------------------------- 1 | ### Approximation and estimation theory 2 | - [x] [Optimal Approximation with Sparsely Connected Deep Neural Networks](https://arxiv.org/abs/1705.01714) 3 | - [x] [On the Expressive Power of Deep Polynomial Neural Networks](https://arxiv.org/pdf/1905.12207.pdf) 4 | - [x] [Deep Network Approximation Characterized by Number of Neurons](https://arxiv.org/pdf/1906.05497v1.pdf) 5 | - [ ] [The phase diagram of approximation rates for deep neural networks](https://arxiv.org/pdf/1906.09477v1.pdf) 6 | - [ ] [Adaptivity of deep ReLU network 
for learning in Besov and mixed smooth Besov spaces: optimal rate and curse of dimensionality](https://openreview.net/forum?id=H1ebTsActm) 7 | - [x] [Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks](https://arxiv.org/abs/1903.10047) 8 | - [ ] [Optimal Approximation of Piecewise Smooth Functions Using Deep ReLU Neural Networks](https://arxiv.org/pdf/1709.05289v4.pdf) 9 | - [ ] [Approximation Spaces of Deep Neural Networks](https://arxiv.org/pdf/1905.01208v2.pdf) 10 | - [x] [On the Power and Limitations of Random Features for Understanding Neural Networks](https://arxiv.org/abs/1904.00687) 11 | - [x] [Error bounds for deep ReLU networks using the Kolmogorov--Arnold superposition theorem](https://arxiv.org/abs/1906.11945) 12 | - [x] [Benign Overfitting in Linear Regression](https://arxiv.org/abs/1906.11300) 13 | 14 | ### Implicit regularization 15 | - [x] [Risk and Parameter Convergence of Logistic Regression](https://arxiv.org/abs/1803.07300) 16 | - [x] [Implicit Regularization in Deep Matrix Factorization](https://arxiv.org/pdf/1905.13655.pdf) 17 | - [x] [Gradient Descent Maximizes the Margin of Homogeneous Neural Networks](https://arxiv.org/pdf/1906.05890.pdf) 18 | - [x] [Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets](https://arxiv.org/abs/1906.06247v1) 19 | - [x] [Gradient Dynamics of Shallow Univariate ReLU Networks](https://arxiv.org/abs/1906.07842) 20 | 21 | ### PDE 22 | - [ ] [Analysis of the generalization error: Empirical risk minimization over deep artificial neural networks overcomes the curse of dimensionality in the numerical approximation of Black-Scholes partial differential equations](https://arxiv.org/abs/1809.03062) 23 | - [x] [A proof that rectified deep neural networks overcome the curse of dimensionality in the numerical approximation of semilinear heat equations](https://arxiv.org/abs/1901.10854) 24 | -------------------------------------------------------------------------------- /pre_schedule.txt: -------------------------------------------------------------------------------- 1 | Schedule of presentations 2 | 3 | 4 | ============================= 5 | Thursday (7/25) 6 | ============================= 7 | 14:00 - 14:20 On the Power and Limitations of Random Features for Understanding Neural Networks 8 | 14:20 - 14:40 On the Expressive Power of Deep Polynomial Neural Networks 9 | 14:40 - 15:00 Benign Overfitting in Linear Regression 10 | 11 | 15:10 - 15:30 Risk and Parameter Convergence of Logistic Regression 12 | 15:30 - 15:50 Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks 13 | 14 | 16:00 - 16:20 Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets 15 | 16:20 - 16:40 Gradient Dynamics of Shallow Univariate ReLU Networks 16 | 17 | 18 | ============================= 19 | Friday (7/26) 20 | ============================= 21 | 15:10 - 15:30 Gradient Descent Maximizes the Margin of Homogeneous Neural Networks 22 | 15:30 - 15:50 Deep Network Approximation Characterized by Number of Neurons 23 | 15:50 - 16:10 Optimal Approximation with Sparsely Connected Deep Neural Networks 24 | 25 | 16:20 - 16:40 Error bounds for deep ReLU networks using the Kolmogorov--Arnold superposition theorem 26 | 16:40 - 17:00 A proof that rectified deep neural networks overcome the curse of dimensionality in the numerical approximation of semilinear heat equations 27 | 17:00 - 17:20 Implicit Regularization in Deep Matrix Factorization 28 | 
-------------------------------------------------------------------------------- /template/scribe.sty: -------------------------------------------------------------------------------- 1 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 2 | % Scribe notes style file 3 | % 4 | % This file should be called scribe.sty 5 | % 6 | % Your main LaTeX file should look like this: 7 | % 8 | % \documentclass[12pt]{article} 9 | % \usepackage{scribe} 10 | % 11 | % \Scribe{YOUR NAME} 12 | % \Lecturer{Anupam Gupta OR Ryan O'Donnell} 13 | % \LectureNumber{N} 14 | % \LectureDate{DATE} 15 | % \LectureTitle{A TITLE FOR THE LECTURE} 16 | % 17 | % \begin{document} 18 | % \MakeScribeTop 19 | % 20 | % \section{SECTION NAME} 21 | % 22 | % NOTES GO HERE 23 | % 24 | % \section{ANOTHER SECTION NAME} 25 | % 26 | % MORE NOTES GO HERE 27 | % 28 | % etc. 29 | % 30 | % \bibliographystyle{abbrv} % if you need a bibliography 31 | % \bibliography{mybib} % assuming yours is named mybib.bib 32 | % 33 | % \end{document} 34 | % 35 | % 36 | % A .bib file is a text file containing a sequence like... 37 | % 38 | % @article{ADR82, 39 | % author = "Alain Aspect and Jean Dalibard and G{\'e}rard Roger", 40 | % title = "Experimental Test of {B}ell's Inequalities Using Time-Varying Analyzers", 41 | % journal = "Phys.\ Rev.\ Lett.", 42 | % volume = 49, 43 | % number = 25, 44 | % pages = "1804--1807", 45 | % year = 1982 46 | % } 47 | % 48 | % @inproceedings{Fei91, 49 | % author = "Uriel Feige", 50 | % title = "On the success probability of the two provers in one round proof systems", 51 | % booktitle = "Proc.\ 6th Symp.\ on Structure in Complexity Theory (CCC)", 52 | % pages = "116--123", 53 | % year = 1991 54 | % } 55 | % 56 | % 57 | % 58 | % 59 | % 60 | % 61 | % For your LaTeX files, there are some macros you may want to use below... 
62 | 63 | 64 | \oddsidemargin 0in \evensidemargin 0in \marginparwidth 40pt 65 | \marginparsep 10pt \topmargin 0pt \headsep 0in \headheight 0in 66 | \textheight 8.5in \textwidth 6.5in \brokenpenalty=10000 67 | 68 | \usepackage{amssymb} 69 | \usepackage{amsfonts} 70 | \usepackage{amsmath} 71 | \usepackage{amsthm} 72 | \usepackage{latexsym} 73 | \usepackage{epsfig} 74 | \usepackage{bm} 75 | \usepackage{xspace} 76 | \usepackage{times} 77 | \usepackage[utf8x]{inputenc} 78 | \usepackage[T1]{fontenc} 79 | \usepackage{listings} 80 | \usepackage{color} 81 | 82 | \definecolor{codegreen}{rgb}{0.3,0.6,0.4} 83 | \definecolor{codegray}{rgb}{0.5,0.5,0.5} 84 | \definecolor{codepurple}{rgb}{0.58,0,0.82} 85 | \definecolor{backcolour}{rgb}{0.95,0.95,0.92} 86 | 87 | \lstdefinestyle{mystyle}{ 88 | backgroundcolor=\color{backcolour}, 89 | commentstyle=\color{codegreen}, 90 | keywordstyle=\color{magenta}, 91 | numberstyle=\tiny\color{codegray}, 92 | stringstyle=\color{codepurple}, 93 | basicstyle=\footnotesize, 94 | breakatwhitespace=false, 95 | breaklines=true, 96 | captionpos=b, 97 | keepspaces=true, 98 | numbers=left, 99 | numbersep=5pt, 100 | showspaces=false, 101 | showstringspaces=false, 102 | showtabs=false, 103 | tabsize=2 104 | } 105 | 106 | %% 107 | %% Julia definition (c) 2014 Jubobs 108 | %% 109 | \lstdefinelanguage{Julia}% 110 | {morekeywords={abstract,break,case,catch,const,continue,do,else,elseif,% 111 | end,export,false,for,function,immutable,import,importall,if,in,% 112 | macro,module,otherwise,quote,return,switch,true,try,type,typealias,% 113 | using,while},% 114 | sensitive=true,% 115 | alsoother={$},% 116 | morecomment=[l]\#,% 117 | morecomment=[n]{\#=}{=\#},% 118 | morestring=[s]{"}{"},% 119 | morestring=[m]{'}{'},% 120 | }[keywords,comments,strings]% 121 | 122 | \lstset{% 123 | language = Julia, 124 | basicstyle = \ttfamily, 125 | keywordstyle = \bfseries\color{blue}, 126 | stringstyle = \color{magenta}, 127 | commentstyle = \color{ForestGreen}, 128 | showstringspaces = false, 129 | } 130 | 131 | 132 | \newtheorem{theorem}{Theorem}[section] 133 | \newtheorem{lemma}[theorem]{Lemma} 134 | \newtheorem{claim}[theorem]{Claim} 135 | \newtheorem{proposition}[theorem]{Proposition} 136 | \newtheorem{corollary}[theorem]{Corollary} 137 | \newtheorem{fact}[theorem]{Fact} 138 | \newtheorem{example}[theorem]{Example} 139 | \newtheorem{notation}[theorem]{Notation} 140 | \newtheorem{observation}[theorem]{Observation} 141 | \newtheorem{conjecture}[theorem]{Conjecture} 142 | 143 | \theoremstyle{definition} 144 | \newtheorem{definition}[theorem]{Definition} 145 | 146 | \theoremstyle{remark} 147 | \newtheorem{remark}[theorem]{Remark} 148 | 149 | % Setting the theorem style back to plain in case theorems are defined in the main file 150 | \theoremstyle{plain} 151 | 152 | 153 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 154 | % Useful macros 155 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 156 | 157 | % for temporarily chunks of text 158 | \newcommand{\ignore}[1]{} 159 | 160 | % Probability/expectation operators. The ones ending in x should be used if you want 161 | % subscripts that go directly *below* the operator (in math mode); no x means the subscripts 162 | % go below and to the right. NB: \P is remapped below for the complexity class P. 
163 | \renewcommand{\Pr}{{\bf Pr}} 164 | \newcommand{\Prx}{\mathop{\bf Pr\/}} 165 | \newcommand{\E}{{\bf E}} 166 | \newcommand{\Ex}{\mathop{\bf E\/}} 167 | \newcommand{\Var}{{\bf Var}} 168 | \newcommand{\Varx}{\mathop{\bf Var\/}} 169 | \newcommand{\Cov}{{\bf Cov}} 170 | \newcommand{\Covx}{\mathop{\bf Cov\/}} 171 | 172 | % shortcuts for symbol names that are too long to type 173 | \newcommand{\eps}{\epsilon} 174 | \newcommand{\lam}{\lambda} 175 | \renewcommand{\l}{\ell} 176 | \newcommand{\la}{\langle} 177 | \newcommand{\ra}{\rangle} 178 | \newcommand{\wh}{\widehat} 179 | \newcommand{\wt}{\widetilde} 180 | 181 | % "blackboard-fonted" letters for the reals, naturals etc. 182 | \newcommand{\R}{\mathbb R} 183 | \newcommand{\N}{\mathbb N} 184 | \newcommand{\Z}{\mathbb Z} 185 | \newcommand{\F}{\mathbb F} 186 | \newcommand{\Q}{\mathbb Q} 187 | \newcommand{\C}{\mathbb C} 188 | 189 | % operators that should be typeset in Roman font 190 | \newcommand{\poly}{\mathrm{poly}} 191 | \newcommand{\polylog}{\mathrm{polylog}} 192 | \newcommand{\sgn}{\mathrm{sgn}} 193 | \newcommand{\avg}{\mathop{\mathrm{avg}}} 194 | \newcommand{\val}{{\mathrm{val}}} 195 | 196 | % complexity classes 197 | \renewcommand{\P}{\mathrm{P}} 198 | \newcommand{\NP}{\mathrm{NP}} 199 | \newcommand{\BPP}{\mathrm{BPP}} 200 | \newcommand{\DTIME}{\mathrm{DTIME}} 201 | \newcommand{\ZPTIME}{\mathrm{ZPTIME}} 202 | \newcommand{\BPTIME}{\mathrm{BPTIME}} 203 | \newcommand{\NTIME}{\mathrm{NTIME}} 204 | 205 | % values associated to optimization algorithm instances 206 | \newcommand{\Opt}{{\mathsf{Opt}}} 207 | \newcommand{\Alg}{{\mathsf{Alg}}} 208 | \newcommand{\Lp}{{\mathsf{Lp}}} 209 | \newcommand{\Sdp}{{\mathsf{Sdp}}} 210 | \newcommand{\Exp}{{\mathsf{Exp}}} 211 | 212 | % if you think the sum and product signs are too big in your math mode; x convention 213 | % as in the probability operators 214 | \newcommand{\littlesum}{{\textstyle \sum}} 215 | \newcommand{\littlesumx}{\mathop{{\textstyle \sum}}} 216 | \newcommand{\littleprod}{{\textstyle \prod}} 217 | \newcommand{\littleprodx}{\mathop{{\textstyle \prod}}} 218 | 219 | % horizontal line across the page 220 | \newcommand{\horz}{ 221 | \vspace{-.4in} 222 | \begin{center} 223 | \begin{tabular}{p{\textwidth}}\\ 224 | \hline 225 | \end{tabular} 226 | \end{center} 227 | } 228 | 229 | % calligraphic letters 230 | \newcommand{\calA}{{\cal A}} 231 | \newcommand{\calB}{{\cal B}} 232 | \newcommand{\calC}{{\cal C}} 233 | \newcommand{\calD}{{\cal D}} 234 | \newcommand{\calE}{{\cal E}} 235 | \newcommand{\calF}{{\cal F}} 236 | \newcommand{\calG}{{\cal G}} 237 | \newcommand{\calH}{{\cal H}} 238 | \newcommand{\calI}{{\cal I}} 239 | \newcommand{\calJ}{{\cal J}} 240 | \newcommand{\calK}{{\cal K}} 241 | \newcommand{\calL}{{\cal L}} 242 | \newcommand{\calM}{{\cal M}} 243 | \newcommand{\calN}{{\cal N}} 244 | \newcommand{\calO}{{\cal O}} 245 | \newcommand{\calP}{{\cal P}} 246 | \newcommand{\calQ}{{\cal Q}} 247 | \newcommand{\calR}{{\cal R}} 248 | \newcommand{\calS}{{\cal S}} 249 | \newcommand{\calT}{{\cal T}} 250 | \newcommand{\calU}{{\cal U}} 251 | \newcommand{\calV}{{\cal V}} 252 | \newcommand{\calW}{{\cal W}} 253 | \newcommand{\calX}{{\cal X}} 254 | \newcommand{\calY}{{\cal Y}} 255 | \newcommand{\calZ}{{\cal Z}} 256 | 257 | % bold letters (useful for random variables) 258 | \renewcommand{\a}{{\boldsymbol a}} 259 | \renewcommand{\b}{{\boldsymbol b}} 260 | \renewcommand{\c}{{\boldsymbol c}} 261 | \renewcommand{\d}{{\boldsymbol d}} 262 | \newcommand{\e}{{\boldsymbol e}} 263 | \newcommand{\f}{{\boldsymbol f}} 264 | 
\newcommand{\g}{{\boldsymbol g}} 265 | \newcommand{\h}{{\boldsymbol h}} 266 | \renewcommand{\i}{{\boldsymbol i}} 267 | \renewcommand{\j}{{\boldsymbol j}} 268 | \renewcommand{\k}{{\boldsymbol k}} 269 | \newcommand{\m}{{\boldsymbol m}} 270 | \newcommand{\n}{{\boldsymbol n}} 271 | \renewcommand{\o}{{\boldsymbol o}} 272 | \newcommand{\p}{{\boldsymbol p}} 273 | \newcommand{\q}{{\boldsymbol q}} 274 | \renewcommand{\r}{{\boldsymbol r}} 275 | \newcommand{\s}{{\boldsymbol s}} 276 | \renewcommand{\t}{{\boldsymbol t}} 277 | \renewcommand{\u}{{\boldsymbol u}} 278 | \renewcommand{\v}{{\boldsymbol v}} 279 | \newcommand{\w}{{\boldsymbol w}} 280 | \newcommand{\x}{{\boldsymbol x}} 281 | \newcommand{\y}{{\boldsymbol y}} 282 | \newcommand{\z}{{\boldsymbol z}} 283 | \newcommand{\A}{{\boldsymbol A}} 284 | \newcommand{\B}{{\boldsymbol B}} 285 | \newcommand{\D}{{\boldsymbol D}} 286 | \newcommand{\G}{{\boldsymbol G}} 287 | \renewcommand{\H}{{\boldsymbol H}} 288 | \newcommand{\I}{{\boldsymbol I}} 289 | \newcommand{\J}{{\boldsymbol J}} 290 | \newcommand{\K}{{\boldsymbol K}} 291 | \renewcommand{\L}{{\boldsymbol L}} 292 | \newcommand{\M}{{\boldsymbol M}} 293 | \renewcommand{\O}{{\boldsymbol O}} 294 | \renewcommand{\S}{{\boldsymbol S}} 295 | \newcommand{\T}{{\boldsymbol T}} 296 | \newcommand{\U}{{\boldsymbol U}} 297 | \newcommand{\V}{{\boldsymbol V}} 298 | \newcommand{\W}{{\boldsymbol W}} 299 | \newcommand{\X}{{\boldsymbol X}} 300 | \newcommand{\Y}{{\boldsymbol Y}} 301 | 302 | 303 | 304 | % useful for Fourier analysis 305 | \newcommand{\bits}{\{-1,1\}} 306 | \newcommand{\bitsn}{\{-1,1\}^n} 307 | \newcommand{\bn}{\bitsn} 308 | \newcommand{\isafunc}{{: \bitsn \rightarrow \bits}} 309 | \newcommand{\fisafunc}{{f : \bitsn \rightarrow \bits}} 310 | 311 | % if you want 312 | \newcommand{\half}{{\textstyle \frac12}} 313 | 314 | \newcommand{\myfig}[4]{\begin{figure}[h] \begin{center} \includegraphics[width=#1\textwidth]{#2} \caption{#3} \label{#4} \end{center} \end{figure}} 315 | 316 | 317 | %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 318 | % Feel free to ignore the rest of this file 319 | 320 | 321 | 322 | \def\ScribeStr{??} 323 | \def\LecStr{??} 324 | \def\LecNum{??} 325 | \def\LecTitle{??} 326 | \def\LecDate{??} 327 | \newcommand{\Scribe}[1]{\def\ScribeStr{Scribe: #1}} 328 | \newcommand{\Scribes}[1]{\def\ScribeStr{Scribes: #1}} 329 | \newcommand{\Lecturer}[1]{\def\LecStr{Lecturer: #1}} 330 | \newcommand{\Lecturers}[1]{\def\LecStr{Lecturers: #1}} 331 | \newcommand{\LectureNumber}[1]{\def\LecNum{#1}} 332 | \newcommand{\LectureDate}[1]{\def\LecDate{#1}} 333 | \newcommand{\LectureTitle}[1]{\def\LecTitle{#1}} 334 | 335 | \newdimen\headerwidth 336 | 337 | \newcommand{\MakeScribeTop}{ 338 | \noindent 339 | \begin{center} 340 | \framebox{ 341 | \vbox{ 342 | \headerwidth=\textwidth 343 | \advance\headerwidth by -0.22in 344 | \hbox to \headerwidth {\hfill Mathematical Theory of Neural Network Models} 345 | \vspace{4mm} 346 | \hbox to \headerwidth {{\Large \hfill Lecture \LecNum: {\LecTitle} \hfill}} 347 | \vspace{2mm} 348 | \hbox to \headerwidth {\hfill \LecDate \hfill} 349 | \vspace{2mm} 350 | \hbox to \headerwidth {{\it \LecStr \hfill \ScribeStr}} 351 | } 352 | } 353 | \end{center} 354 | \vspace*{4mm}} 355 | -------------------------------------------------------------------------------- /template/template.tex: -------------------------------------------------------------------------------- 1 | \documentclass[12pt]{article} 2 | \usepackage[english]{babel} 3 | \usepackage[utf8x]{inputenc} 4 | 
\usepackage[T1]{fontenc} 5 | \usepackage{scribe} 6 | \usepackage{listings} 7 | 8 | \Scribe{Shuhai Zhao, Yilei Han} 9 | \Lecturer{Lei Wu} 10 | \LectureNumber{1} 11 | \LectureDate{July 2} 12 | \LectureTitle{An Introduction to Supervised Learning} 13 | 14 | \lstset{style=mystyle} 15 | 16 | \begin{document} 17 | \MakeScribeTop 18 | 19 | \section{Supervised Learning} 20 | Some basic terminology: 21 | \begin{itemize} 22 | \item \textit{Features}: The set of attributes, often represented as a vector, associated with an example. 23 | \item \textit{Hypothesis space}: A set $\mathcal{F}$ of functions mapping features to the set of labels $\mathcal{Y}$. 24 | \item \textit{Loss function}: A function $l$ that measures the difference, or loss, between a predicted label and a true label: $l:\mathcal{Y}\times\mathcal{Y}\to \mathbb{R}_{+}$, for example, $l(y,y')=(y-y')^{2}$. 25 | \end{itemize} 26 | 27 | 28 | \bibliographystyle{abbrv} % if you need a bibliography 29 | \bibliography{mybib} % assuming yours is named mybib.bib 30 | 31 | \end{document} --------------------------------------------------------------------------------
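As a supplement to the terminology defined in template.tex above, here is a minimal LaTeX sketch of the empirical risk minimization setup those definitions feed into; the symbols $S$, $\widehat{\mathcal{R}}_n$ and $\hat{f}_n$ are illustrative notation, not taken from the course notes.

```latex
% Minimal sketch (illustrative notation): given a training set S = {(x_i, y_i)}_{i=1}^{n},
% a hypothesis space \mathcal{F} and a loss l, the learner minimizes the empirical risk.
\[
  \widehat{\mathcal{R}}_n(f) \;=\; \frac{1}{n}\sum_{i=1}^{n} l\bigl(f(x_i),\, y_i\bigr),
  \qquad
  \hat{f}_n \;\in\; \operatorname*{arg\,min}_{f \in \mathcal{F}} \widehat{\mathcal{R}}_n(f).
\]
% With the squared loss l(y, y') = (y - y')^2 from the template, this is least squares over \mathcal{F}.
```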