# Some resources for ML research

Personal and biased selection of ML resources.

**Disclaimer:** I'm a novice in ML research, and I have read only a few of the items on this list.


## Table of Contents
- [Beginner's Guide](#beginners-guide)
- [Machine Learning](#machine-learning)
- [Deep Learning](#deep-learning)
- [Generative Model](#generative-model)
- [Reinforcement Learning](#reinforcement-learning)
- [Graphical Model](#graphical-model)
- [Optimization](#optimization)
- [Learning Theory](#learning-theory)
- [Statistics](#statistics)
- [Topics in Machine Learning](#topics-in-machine-learning)
- [Math Backgrounds](#math-backgrounds)
- [Blogs](#blogs)


## Beginner's Guide

**Must Read**
- Machine Learning: A Probabilistic Perspective (Murphy)
- Deep Learning (Goodfellow et al.)
- Reinforcement Learning: An Introduction (Sutton & Barto)

**Recommended**
- Convex Optimization (Boyd & Vandenberghe)
- Graphical Models, Exponential Families, and Variational Inference (Wainwright & Jordan)
- Learning from Data (Abu-Mostafa) *-> for those interested in learning theory*

**Recent Topics**
- Read research blogs (e.g., [OpenAI](https://blog.openai.com/), [BAIR](http://bair.berkeley.edu/blog/), [CMU](https://blog.ml.cmu.edu/))
- Read lectures from Berkeley, Stanford, CMU, or UofT (e.g., [unsupervised learning](https://sites.google.com/view/berkeley-cs294-158-sp19))
- There are many other good sources, but I have stopped keeping this list up to date


## Machine Learning
There are many ML books, but most of them are encyclopedic.
I recommend taking a course based on the Murphy or Bishop book.

### Textbook
- Machine Learning: A Probabilistic Perspective (Murphy) :sparkles:
- Pattern Recognition and Machine Learning (Bishop) :sparkles:
- The Elements of Statistical Learning (Hastie et al.)
- Pattern Classification (Duda et al.)
- Bayesian Reasoning and Machine Learning (Barber)

### Lecture
- [Stanford CS229 Machine Learning](http://cs229.stanford.edu) :sparkles:
- [CMU 10701 Machine Learning](http://www.cs.cmu.edu/~tom/10701_sp11/)
- [CMU 10702 Statistical Machine Learning](http://www.stat.cmu.edu/~larry/=sml/)
- [Oxford Machine Learning](https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/)


## Deep Learning
Goodfellow et al. is the new classic.
For vision and NLP, the Stanford lectures would be helpful.

### Textbook
- Deep Learning (Goodfellow et al.) :sparkles:

### Lecture (Practice)
- [Deep Learning book](http://www.deeplearningbook.org/lecture_slides.html) :sparkles:
- [Stanford CS231n Convolutional Neural Networks for Visual Recognition](http://cs231n.stanford.edu) :sparkles:
- [Stanford CS224d Deep Learning for Natural Language Processing](http://cs224d.stanford.edu)
- [Stanford CS224s Spoken Language Processing](http://web.stanford.edu/class/cs224s/)
- [Oxford Deep Natural Language Processing](https://github.com/oxford-cs-deepnlp-2017/lectures)
- [CMU 11747 Neural Networks for NLP](http://phontron.com/class/nn4nlp2017/index.html)

### Lecture (Theory)
- [Stanford Stat385 Theories of Deep Learning](https://stats385.github.io/)


## Generative Model
I separated generative models into an independent topic,
since I think it is a big and important area.

### Lecture
- [Toronto CSC2541 Differentiable Inference and Generative Models](https://www.cs.toronto.edu/~duvenaud/courses/csc2541/index.html)
- [Toronto CSC2547 Learning Discrete Latent Structure](https://duvenaud.github.io/learn-discrete/)
- [Toronto CSC2541 Scalable and Flexible Models of Uncertainty](https://csc2541-f17.github.io/)


## Reinforcement Learning
For classic (non-deep) RL, Sutton & Barto is the classic text.
For deep RL, the lectures from Berkeley and CMU look good.

### Textbook
- Reinforcement Learning: An Introduction (Sutton & Barto) :sparkles:

### Lecture
- [UCL Reinforcement Learning](http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html) :sparkles:
- [Berkeley CS294 Deep Reinforcement Learning](http://rll.berkeley.edu/deeprlcourse/) :sparkles:
- [CMU 10703 Deep Reinforcement Learning and Control](https://katefvision.github.io/)


## Graphical Model
Koller & Friedman is comprehensive, but too encyclopedic.
I recommend taking an introductory course based on the Koller & Friedman book.

Wainwright & Jordan focuses only on variational inference,
but it gives really good intuition for probabilistic models.

### Textbook
- Probabilistic Graphical Models: Principles and Techniques (Koller & Friedman)
- Graphical Models, Exponential Families, and Variational Inference (Wainwright & Jordan) :sparkles:

### Lecture
- [Stanford CS228 Probabilistic Graphical Models](http://cs.stanford.edu/~ermon/cs228/index.html)
- [CMU 10708 Probabilistic Graphical Models](http://www.cs.cmu.edu/~epxing/Class/10708/) :sparkles:


## Optimization
Boyd & Vandenberghe is the classic, but I think it's too boring.
Reading chapters 2-5 would be enough.

Bertsekas concentrates more on convex analysis.
Nocedal & Wright concentrates more on numerical optimization.

### Textbook
- Convex Optimization (Boyd & Vandenberghe) :sparkles:
- Convex Optimization Theory (Bertsekas)
- Numerical Optimization (Nocedal & Wright)

### Lecture
- [Stanford EE364a Convex Optimization I](http://stanford.edu/class/ee364a/) :sparkles:
- [Stanford EE364b Convex Optimization II](http://stanford.edu/class/ee364b/)
- [MIT 6.253 Convex Analysis and Optimization](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-253-convex-analysis-and-optimization-spring-2012/index.htm)


## Learning Theory
In my understanding, there are two major topics in learning theory:

- **Learning Theory:** VC-dimension, PAC-learning
- **Online Learning:** regret bound, multi-armed bandit

For learning theory, Kearns & Vazirani is the classic, but it's too old-fashioned.
Abu-Mostafa is a good introductory book, and I think it's enough for most people.

For online learning, Cesa-Bianchi & Lugosi is the classic.
For multi-armed bandits, Bubeck & Cesa-Bianchi provide a good survey.

### Textbook (Learning Theory)
- Learning from Data (Abu-Mostafa) :sparkles:
- Foundations of Machine Learning (Mohri et al.)
- An Introduction to Computational Learning Theory (Kearns & Vazirani)

### Textbook (Online Learning)
- Prediction, Learning, and Games (Cesa-Bianchi & Lugosi)
- Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems (Bubeck & Cesa-Bianchi)

### Lecture
- [Caltech Learning from Data](https://work.caltech.edu/telecourse.html) :sparkles:
- [CMU 15859 Machine Learning Theory](http://www.cs.cmu.edu/~avrim/ML14/)
- [Berkeley CS281b/Stat241b Statistical Learning Theory](https://www.stat.berkeley.edu/~bartlett/courses/2014spring-cs281bstat241b/)
- [MIT 9.520 Statistical Learning Theory and Applications](http://www.mit.edu/~9.520/fall15/)


## Statistics
Statistics is a broad area, so I list only a few resources here.
For advanced topics, the lectures from Berkeley/Stanford/CMU/MIT look really cool.

### Textbook (Statistical Inference)
- All of Statistics (Wasserman)
- Computer Age Statistical Inference (Efron & Hastie) :sparkles:
- Time Series Analysis and Its Applications: With R Examples (Shumway & Stoffer)

### Textbook (Nonparametrics)
- All of Nonparametric Statistics (Wasserman)
- Introduction to Nonparametric Estimation (Tsybakov)
- Gaussian Processes for Machine Learning (Rasmussen & Williams) :sparkles:
- Bayesian Nonparametrics (Ghosh & Ramamoorthi) :sparkles:

### Textbook (Advanced Topics)
- High-Dimensional Statistics: A Non-Asymptotic Viewpoint (Wainwright) :sparkles:
- Statistics for High-Dimensional Data (Bühlmann & van de Geer)
- Asymptotic Statistics (van der Vaart)
- Empirical Processes in M-Estimation (van de Geer)

### Lecture
- [Berkeley Stat210a Theoretical Statistics I](https://www.stat.berkeley.edu/~wfithian/courses/stat210a/)
- [Berkeley Stat210b Theoretical Statistics II](https://people.eecs.berkeley.edu/~jordan/courses/210B-spring17/)
- [Stanford Stat300a Theory of Statistics](https://web.stanford.edu/~lmackey/stats300a/)
- [Stanford CS369m Algorithms for Massive Data Set Analysis](http://cs.stanford.edu/people/mmahoney/cs369m/)
- [CMU 36755 Advanced Statistical Theory I](http://www.stat.cmu.edu/~arinaldo/36755/F16/)
- [MIT 18.S997 High-Dimensional Statistics](https://ocw.mit.edu/courses/mathematics/18-s997-high-dimensional-statistics-spring-2015/)


## Topics in Machine Learning
Miscellaneous topics related to machine learning.
There are many more subfields, but I won't list them all here.

### Information Theory
- Elements of Information Theory (Cover & Thomas)
- Information Theory, Inference, and Learning Algorithms (MacKay)

### Network Science
- Networks, Crowds, and Markets (Easley & Kleinberg)
- Social and Economic Networks (Jackson)

### Markov Chain
- Markov Chains (Norris)
- Markov Chains and Mixing Times (Levin et al.)

### Game Theory
- Algorithmic Game Theory (Nisan et al.)
- Multiagent Systems (Shoham & Leyton-Brown)

### Combinatorics
- The Probabilistic Method (Alon & Spencer)
- A First Course in Combinatorial Optimization (Lee)

### Algorithm
- Introduction to Algorithms (Cormen et al.)
- Randomized Algorithms (Motwani & Raghavan)
- Approximation Algorithms (Vazirani)

### Geometric View
- Topological Data Analysis (Wasserman)
- Methods of Information Geometry (Amari & Nagaoka)
- Algebraic Geometry and Statistical Learning Theory (Watanabe)

### Some Lectures
- [MIT 18.409 Algorithmic Aspects of Machine Learning](http://people.csail.mit.edu/moitra/409.html)
- [MIT 18.409 An Algorithmist's Toolkit](http://stellar.mit.edu/S/course/18/fa09/18.409/)


## Math Backgrounds
I selected essential topics for machine learning.
Personally, I think more analysis, matrix theory, and geometry never hurt.

### Probability
- Probability: Theory and Examples (Durrett)
- Theoretical Statistics (Keener)
- Stochastic Processes (Bass)
- Probability and Statistics Cookbook (Vallentin)

### Linear Algebra
- Linear Algebra (Hoffman & Kunze)
- Matrix Analysis (Horn & Johnson)
- Matrix Computations (Golub & Van Loan)
- The Matrix Cookbook (Petersen & Pedersen)

### Large Deviations
- Concentration Inequalities and Martingale Inequalities (Chung & Lu)
- An Introduction to Matrix Concentration Inequalities (Tropp)


## Blogs

- [Google AI Blog](https://ai.googleblog.com/)
- [DeepMind Blog](https://deepmind.com/blog/?category=research)
- [OpenAI Blog](https://blog.openai.com/)
- [FAIR Blog](https://research.fb.com/blog/)
- [Distill.pub](https://distill.pub/)
- [BAIR Blog](http://bair.berkeley.edu/blog/)
- [CMU Blog](https://blog.ml.cmu.edu/)
- [Off the convex path](http://www.offconvex.org/)
- [inFERENCe](http://www.inference.vc/)
- [Sebastian Ruder](http://ruder.io/#open)
- [Lunit Tech Blog (Korean)](https://blog.lunit.io/category/paper-review/)