├── README.md
└── slides
    ├── GM_1.pdf
    ├── GM_2.pdf
    ├── GM_3_4.pdf
    ├── GM_4_Sampling.pdf
    ├── classification+1(1).pdf
    ├── classification+1.pdf
    ├── classification+2.pdf
    ├── clustering.pdf
    ├── data+preprocessing.pdf
    ├── dimensionality+reduction(1).pdf
    ├── dimensionality+reduction.pdf
    ├── ensemble.pdf
    ├── generative+models.pdf
    ├── introduction+to+deep+learning.pdf
    ├── lec1+lec2.pdf
    ├── recurrent_network.pdf
    ├── regression.pdf
    └── topic+models.pdf

/README.md:
--------------------------------------------------------------------------------
## Introduction to Data Science


### Announcements


### Administrative information

- **Lecturer:** [Zhanxing Zhu](https://sites.google.com/view/zhanxingzhu/)
- **Teaching Assistants:**
  - Hantao Guo, guohantao@pku.edu.cn
  - Yuanjin Zhu, syxz1995@gmail.com
  - Junzhao Zhang, 1801213960@pku.edu.cn

- **Time:** Mon: 1:00-2:50 pm; Thu (odd weeks): 3:10-5:00 pm.

- **Location:** Room 107, [Teaching Building 2](https://maps.baidu.com/poi/%E5%8C%97%E4%BA%AC%E5%A4%A7%E5%AD%A6(%E7%87%95%E5%9B%AD%E6%A0%A1%E5%8C%BA)%E7%AC%AC%E4%BA%8C%E6%95%99%E5%AD%A6%E6%A5%BC(%E6%9D%8E%E5%85%86%E5%9F%BA%E6%A5%BC)/@12948834.869857343,4837581.844142513,19.6z?uid=82548a63754afc91735e80e4&primaryUid=10472254985355704340&ugc_type=3&ugc_ver=1&device_ratio=1&compat=1&querytype=detailConInfo&da_src=shareurl)


### Course Content

**Description:**

This course prepares students for further study in learning from data, artificial intelligence, and related application areas.

**Prerequisites:**

- Calculus, linear algebra, basic statistics, numerical optimization, signal processing
- Programming and algorithms, e.g. Python


### Grading

**Mid-term exam** (40%): written exam (Dec 9)

**Final projects** (60%):

- Kaggle-in-Class competition: choose 1 of the 3 competitions listed below
- Includes submission to the Kaggle platform and a written report
- Deadline: Jan 19, 2020 (strict)
- At most 2 students per team
- Competition links:
  - [Traffic Prediction](https://www.kaggle.com/t/966947f0c9454eec8fab55e15d2bbacd)
  - [Flower Classification](https://www.kaggle.com/t/25f30048035046078ae2359a85241a39)
  - [PES Challenge](https://www.kaggle.com/t/85b078a8fa4f45648c01b488b8925876)


### References

- The Elements of Statistical Learning
- Pattern Recognition and Machine Learning
- Vapnik. The Nature of Statistical Learning Theory
- 《数据科学导引》 (Introduction to Data Science)


---
### Schedule (subject to change)

#### Week 1
- Mon 9/9: Introduction to Data Science
  - [Lecture 1](slides/lec1+lec2.pdf)
- Thu 9/12: Review of Preliminary Knowledge
  - [Lecture 2](slides/lec1+lec2.pdf)
#### Week 2
- Mon 9/16: Data Preprocessing
  - [Lecture 3](slides/data+preprocessing.pdf)
#### Week 3
- Mon 9/23: Regression
  - [Lecture 4](slides/regression.pdf)
#### Week 5
- Mon 10/7: Classification 1
  - [Lecture 5](slides/classification+1.pdf)
- Thu 10/10: Classification 2
  - [Lecture 6](slides/classification+1(1).pdf)
#### Week 6
- Mon 10/14: Classification 3
  - [Lecture 7](slides/classification+2.pdf)
- Thu 10/17: Ensemble
  - [Lecture 8](slides/ensemble.pdf)
#### Week 7
- Mon 10/21: Clustering
  - [Lecture 9](slides/clustering.pdf)
- Thu 10/24: Dimensionality Reduction 1
  - [Lecture 10](slides/dimensionality+reduction.pdf)
#### Week 8
- Mon 10/28: Dimensionality Reduction 2
  - [Lecture 11](slides/dimensionality+reduction(1).pdf)
#### Week 9
- Mon 11/4: Graphical Models 1
  - [Lecture 12](slides/GM_1.pdf)
- Thu 11/7: Graphical Models 2
  - [Lecture 13](slides/GM_2.pdf)
#### Week 10
- Mon 11/11: Graphical Models 3
  - [Lecture 14](slides/GM_3_4.pdf)
#### Week 11
- Mon 11/18: Graphical Models 4
- Thu 11/21: Graphical Models 5
  - [Lecture 15](slides/GM_4_Sampling.pdf)
#### Week 12
- Mon 11/25: Topic Model
  - [Lecture 16](slides/topic+models.pdf)
#### Week 13
- Mon 12/2: Introduction to Deep Learning
  - [Lecture 17](slides/introduction+to+deep+learning.pdf)
- Thu 12/7
#### Week 14
- Mon 12/9: Mid-term Exam
#### Week 15
- Mon 12/16
- Thu 12/19
#### Week 16
- Mon 12/23

--------------------------------------------------------------------------------
/slides/GM_1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/GM_1.pdf

--------------------------------------------------------------------------------
/slides/GM_2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/GM_2.pdf

--------------------------------------------------------------------------------
/slides/GM_3_4.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/GM_3_4.pdf

--------------------------------------------------------------------------------
/slides/GM_4_Sampling.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/GM_4_Sampling.pdf
--------------------------------------------------------------------------------
/slides/classification+1(1).pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/classification+1(1).pdf

--------------------------------------------------------------------------------
/slides/classification+1.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/classification+1.pdf

--------------------------------------------------------------------------------
/slides/classification+2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/classification+2.pdf

--------------------------------------------------------------------------------
/slides/clustering.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/clustering.pdf

--------------------------------------------------------------------------------
/slides/data+preprocessing.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/data+preprocessing.pdf

--------------------------------------------------------------------------------
/slides/dimensionality+reduction(1).pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/dimensionality+reduction(1).pdf

--------------------------------------------------------------------------------
/slides/dimensionality+reduction.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/dimensionality+reduction.pdf

--------------------------------------------------------------------------------
/slides/ensemble.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/ensemble.pdf

--------------------------------------------------------------------------------
/slides/generative+models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/generative+models.pdf

--------------------------------------------------------------------------------
/slides/introduction+to+deep+learning.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/introduction+to+deep+learning.pdf

--------------------------------------------------------------------------------
/slides/lec1+lec2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/lec1+lec2.pdf

--------------------------------------------------------------------------------
/slides/recurrent_network.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/recurrent_network.pdf

--------------------------------------------------------------------------------
/slides/regression.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/regression.pdf

--------------------------------------------------------------------------------
/slides/topic+models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/PKUAI26/IntroductionToDataScience2019Fall/845890e839d911d2d4611414d35b01240850a99e/slides/topic+models.pdf

--------------------------------------------------------------------------------