├── Ass1 ├── autotrace.sh ├── spamData.mat ├── start_work.sh ├── report.bib ├── README.md ├── report_en.tex ├── report_zh.tex ├── dataloader.py └── naive_bayes_classifier.py ├── Ass2 ├── autotrace.sh ├── report.bib ├── setup_svm.sh ├── start_work.sh ├── README.md ├── report_en.tex ├── report_zh.tex └── dataloader.py ├── Ass1_partIII ├── autotrace.sh ├── report.bib ├── start_work.sh ├── README.md ├── report_en.tex ├── report_zh.tex ├── dataloader.py └── discriminative_classfier.py ├── Ass0_warming_up ├── part3 │ ├── fig │ │ └── 3d_lda.png │ ├── report_en.pdf │ ├── report_zh.pdf │ ├── report.bib │ ├── report_zh.tex │ └── report_en.tex └── README.md └── README.md /Ass1/autotrace.sh: -------------------------------------------------------------------------------- 1 | while true; do git add *.tex *.bib *.py; git commit -m "autotrace"; sleep 60; done 2 | -------------------------------------------------------------------------------- /Ass1/spamData.mat: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/taineleau-zz/Fudan_PRML2017_Assignment/HEAD/Ass1/spamData.mat -------------------------------------------------------------------------------- /Ass2/autotrace.sh: -------------------------------------------------------------------------------- 1 | while true; do git add *.tex *.bib *.py; git commit -m "autotrace"; sleep 60; done 2 | -------------------------------------------------------------------------------- /Ass1_partIII/autotrace.sh: -------------------------------------------------------------------------------- 1 | while true; do git add *.tex *.bib *.py; git commit -m "autotrace"; sleep 60; done 2 | -------------------------------------------------------------------------------- /Ass0_warming_up/part3/fig/3d_lda.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/taineleau-zz/Fudan_PRML2017_Assignment/HEAD/Ass0_warming_up/part3/fig/3d_lda.png -------------------------------------------------------------------------------- /Ass0_warming_up/part3/report_en.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/taineleau-zz/Fudan_PRML2017_Assignment/HEAD/Ass0_warming_up/part3/report_en.pdf -------------------------------------------------------------------------------- /Ass0_warming_up/part3/report_zh.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/taineleau-zz/Fudan_PRML2017_Assignment/HEAD/Ass0_warming_up/part3/report_zh.pdf -------------------------------------------------------------------------------- /Ass1/start_work.sh: -------------------------------------------------------------------------------- 1 | mkdir ../../prml_ass1 2 | cp *.py *.sh *.tex *.bib ../../prml_ass1 3 | cd ../../prml_ass1 4 | git init 5 | git add . 
6 | git commit -m "initial init for ass 1" 7 | -------------------------------------------------------------------------------- /Ass1/report.bib: -------------------------------------------------------------------------------- 1 | @book{bishop2006pattern, 2 | title={Pattern Recognition and Machine Learning}, 3 | author={Bishop, Christopher M}, 4 | publisher={Springer}, 5 | year={2006} 6 | } -------------------------------------------------------------------------------- /Ass2/report.bib: -------------------------------------------------------------------------------- 1 | @book{bishop2006pattern, 2 | title={Pattern Recognition and Machine Learning}, 3 | author={Bishop, Christopher M}, 4 | publisher={Springer}, 5 | year={2006} 6 | } -------------------------------------------------------------------------------- /Ass2/setup_svm.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | git clone https://github.com/cjlin1/libsvm.git 3 | cd libsvm/python 4 | make 5 | cd ../../ 6 | cp __init__.py libsvm 7 | cp __init__.py libsvm/python 8 | cd .. 
-------------------------------------------------------------------------------- /Ass1_partIII/report.bib: -------------------------------------------------------------------------------- 1 | @book{bishop2006pattern, 2 | title={Pattern Recognition and Machine Learning}, 3 | author={Bishop, Christopher M}, 4 | publisher={Springer}, 5 | year={2006} 6 | } -------------------------------------------------------------------------------- /Ass0_warming_up/part3/report.bib: -------------------------------------------------------------------------------- 1 | @book{bishop2006pattern, 2 | title={Pattern Recognition and Machine Learning}, 3 | author={Bishop, Christopher M}, 4 | publisher={Springer}, 5 | year={2006} 6 | } -------------------------------------------------------------------------------- /Ass1_partIII/start_work.sh: -------------------------------------------------------------------------------- 1 | mkdir ../../prml_ass1_partIII 2 | cp *.py *.sh *.tex *.bib ../../prml_ass1_partIII 3 | cd ../../prml_ass1_partIII 4 | git init 5 | git add . 6 | git commit -m "initial init for ass 1 part III" 7 | -------------------------------------------------------------------------------- /Ass2/start_work.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | if [ $# -eq 1 ] 3 | then 4 | mkdir ../../prml_ass2_$1 5 | cp *.py *.sh *.tex *.bib ../../prml_ass2_$1 6 | cd ../../prml_ass2_$1 7 | git init 8 | git add . 9 | git commit -m "initial init for ass 2" 10 | else 11 | echo "Usage: $0 stuid" 12 | fi 13 | -------------------------------------------------------------------------------- /Ass1/README.md: -------------------------------------------------------------------------------- 1 | # Guide 2 | 3 | First, make sure you have installed [`git`](https://git-scm.com/book/en/v1/Getting-Started-Installing-Git). 
4 | 5 | Second, run `start_work.sh`: 6 | 7 | `./start_work.sh` 8 | 9 | Third, change to your new working directory by running: 10 | 11 | `cd ../../prml_ass1` 12 | 13 | 14 | Have fun! -------------------------------------------------------------------------------- /Ass2/README.md: -------------------------------------------------------------------------------- 1 | # Guide 2 | 3 | First, make sure you have installed [`git`](https://git-scm.com/book/en/v1/Getting-Started-Installing-Git). 4 | 5 | Second, run `start_work.sh`: 6 | 7 | `./start_work.sh 1330000000`, replacing `1330000000` with your student ID. 8 | 9 | Third, change to your new working directory by running: 10 | 11 | `cd ../../prml_ass2_1330000000`, replacing `1330000000` with your student ID. 12 | 13 | 14 | Have fun! -------------------------------------------------------------------------------- /Ass1_partIII/README.md: -------------------------------------------------------------------------------- 1 | # Guide 2 | 3 | First, make sure you have installed [`git`](https://git-scm.com/book/en/v1/Getting-Started-Installing-Git). 4 | 5 | Second, run `start_work.sh`: 6 | 7 | `./start_work.sh` 8 | 9 | Third, change to your new working directory by running: 10 | 11 | `cd ../../prml_ass1_partIII` 12 | 13 | Fourth, copy `spamData.mat` from the previous assignment's directory to your working directory. 14 | 15 | Have fun! 
-------------------------------------------------------------------------------- /Ass1/report_en.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt, oneside]{article} 2 | \usepackage[a4paper,bindingoffset=0.2in,% 3 | left=1in,right=1in,top=1in,bottom=1in,% 4 | footskip=.25in]{geometry} 5 | \usepackage{graphicx} 6 | \usepackage{enumerate} 7 | \usepackage{url} 8 | \usepackage{hyperref} 9 | \usepackage{pbox} 10 | \usepackage{CJKutf8} 11 | 12 | 13 | \begin{document} 14 | \title{Report} 15 | \author{taineleau, 13300000001 \\ Department of CS} 16 | \maketitle 17 | \section{Chapter 1} 18 | \subsection{Chapter 1.1} 19 | 20 | 21 | reference \cite{bishop2006pattern}, url: \url{http://fudan.edu.cn} 22 | 23 | \bibliographystyle{plain} 24 | \bibliography{report} 25 | 26 | \end{document} -------------------------------------------------------------------------------- /Ass2/report_en.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt, oneside]{article} 2 | \usepackage[a4paper,bindingoffset=0.2in,% 3 | left=1in,right=1in,top=1in,bottom=1in,% 4 | footskip=.25in]{geometry} 5 | \usepackage{graphicx} 6 | \usepackage{enumerate} 7 | \usepackage{url} 8 | \usepackage{hyperref} 9 | \usepackage{pbox} 10 | \usepackage{CJKutf8} 11 | 12 | 13 | \begin{document} 14 | \title{Report} 15 | \author{taineleau, 13300000001 \\ Department of CS} 16 | \maketitle 17 | \section{Chapter 1} 18 | \subsection{Chapter 1.1} 19 | 20 | 21 | reference \cite{bishop2006pattern}, url: \url{http://fudan.edu.cn} 22 | 23 | \bibliographystyle{plain} 24 | \bibliography{report} 25 | 26 | \end{document} -------------------------------------------------------------------------------- /Ass1_partIII/report_en.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt, oneside]{article} 2 | \usepackage[a4paper,bindingoffset=0.2in,% 3 | 
left=1in,right=1in,top=1in,bottom=1in,% 4 | footskip=.25in]{geometry} 5 | \usepackage{graphicx} 6 | \usepackage{enumerate} 7 | \usepackage{url} 8 | \usepackage{hyperref} 9 | \usepackage{pbox} 10 | \usepackage{CJKutf8} 11 | 12 | 13 | \begin{document} 14 | \title{Report} 15 | \author{taineleau, 13300000001 \\ Department of CS} 16 | \maketitle 17 | \section{Chapter 1} 18 | \subsection{Chapter 1.1} 19 | 20 | 21 | reference \cite{bishop2006pattern}, url: \url{http://fudan.edu.cn} 22 | 23 | \bibliographystyle{plain} 24 | \bibliography{report} 25 | 26 | \end{document} -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Assignments of Fudan's PRML course in 2017 Spring 2 | Hi, class, this is the assignment page of Fudan University's Pattern Recognition and Machine Learning course (COMP130137). 3 | The course is taught by Professor Xipeng Qiu. I (Danlu Chen) serve as a teaching and lab assistant for this course in Spring 2017. 4 | 5 | Feel free to comment on any issue concerning the assignments. 6 | Please let us know if you have an idea for improving this course project. 7 | 8 | 9 | ## Assignment 10 | 11 | - [Assignment 0](https://github.com/taineleau/Fudan_PRML2017_Assignment/tree/master/Ass0_warming_up): warming up 12 | - Release: Jan. 
4, 2017 13 | - Due: March 5, 2017, Week 1 Sunday 23:55 14 | - Estimated time to finish: 2 hrs 15 | 16 | ## Contact 17 | 18 | ### Lecturer 19 | 20 | - Xipeng Qiu (@xpqiu) 21 | 22 | ### TAs 23 | 24 | - Danlu Chen (@taineleau) 25 | 26 | - Qipeng Guo (@QipengGuo) 27 | 28 | - Linghao Zhang (@dnc1994) 29 | -------------------------------------------------------------------------------- /Ass1/report_zh.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt, oneside]{ctexart} 2 | \usepackage[a4paper,bindingoffset=0.2in,% 3 | left=1in,right=1in,top=1in,bottom=1in,% 4 | footskip=.25in]{geometry} 5 | \usepackage{graphicx} 6 | \usepackage{enumerate} 7 | \usepackage{url} 8 | \usepackage{hyperref} 9 | \usepackage{pbox} 10 | \usepackage{CJKutf8} 11 | 12 | 13 | \begin{document} 14 | \title{报告} 15 | \author{王小明, 13300000001 \\ 计算机科学与技术学院} 16 | \maketitle 17 | \section{第一章} 18 | \subsection{第一节} 19 | 20 | 21 | 表格插入: 22 | 23 | \begin{table}[!htbp] 24 | % \resizebox{1.4\linewidth}{!} 25 | \centering 26 | \scalebox{1}{ 27 | \begin{tabular}{ l || c | c | c | c | c | c | c | c | c | c} 28 | \hline 29 | dim & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 20 \\ \hline 30 | Err. 
\% & 100 & 100 &100 &100 &100 &100 &100 &100 &100 &100 \\ 31 | \hline 32 | \end{tabular} 33 | } 34 | \caption{错误} 35 | \label{tb:lda_knn} 36 | \end{table} 37 | 38 | 引用 \cite{bishop2006pattern}, 链接: \url{http://fudan.edu.cn} 39 | 40 | \bibliographystyle{plain} 41 | \bibliography{report} 42 | 43 | 44 | \end{document} -------------------------------------------------------------------------------- /Ass2/report_zh.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt, oneside]{ctexart} 2 | \usepackage[a4paper,bindingoffset=0.2in,% 3 | left=1in,right=1in,top=1in,bottom=1in,% 4 | footskip=.25in]{geometry} 5 | \usepackage{graphicx} 6 | \usepackage{enumerate} 7 | \usepackage{url} 8 | \usepackage{hyperref} 9 | \usepackage{pbox} 10 | \usepackage{CJKutf8} 11 | 12 | 13 | \begin{document} 14 | \title{报告} 15 | \author{王小明, 13300000001 \\ 计算机科学与技术学院} 16 | \maketitle 17 | \section{第一章} 18 | \subsection{第一节} 19 | 20 | 21 | 表格插入: 22 | 23 | \begin{table}[!htbp] 24 | % \resizebox{1.4\linewidth}{!} 25 | \centering 26 | \scalebox{1}{ 27 | \begin{tabular}{ l || c | c | c | c | c | c | c | c | c | c} 28 | \hline 29 | dim & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 20 \\ \hline 30 | Err. 
\% & 100 & 100 &100 &100 &100 &100 &100 &100 &100 &100 \\ 31 | \hline 32 | \end{tabular} 33 | } 34 | \caption{错误} 35 | \label{tb:lda_knn} 36 | \end{table} 37 | 38 | 引用 \cite{bishop2006pattern}, 链接: \url{http://fudan.edu.cn} 39 | 40 | \bibliographystyle{plain} 41 | \bibliography{report} 42 | 43 | 44 | \end{document} -------------------------------------------------------------------------------- /Ass1_partIII/report_zh.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt, oneside]{ctexart} 2 | \usepackage[a4paper,bindingoffset=0.2in,% 3 | left=1in,right=1in,top=1in,bottom=1in,% 4 | footskip=.25in]{geometry} 5 | \usepackage{graphicx} 6 | \usepackage{enumerate} 7 | \usepackage{url} 8 | \usepackage{hyperref} 9 | \usepackage{pbox} 10 | \usepackage{CJKutf8} 11 | 12 | 13 | \begin{document} 14 | \title{报告} 15 | \author{王小明, 13300000001 \\ 计算机科学与技术学院} 16 | \maketitle 17 | \section{第一章} 18 | \subsection{第一节} 19 | 20 | 21 | 表格插入: 22 | 23 | \begin{table}[!htbp] 24 | % \resizebox{1.4\linewidth}{!} 25 | \centering 26 | \scalebox{1}{ 27 | \begin{tabular}{ l || c | c | c | c | c | c | c | c | c | c} 28 | \hline 29 | dim & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 20 \\ \hline 30 | Err. 
\% & 100 & 100 &100 &100 &100 &100 &100 &100 &100 &100 \\ 31 | \hline 32 | \end{tabular} 33 | } 34 | \caption{错误} 35 | \label{tb:lda_knn} 36 | \end{table} 37 | 38 | 引用 \cite{bishop2006pattern}, 链接: \url{http://fudan.edu.cn} 39 | 40 | \bibliographystyle{plain} 41 | \bibliography{report} 42 | 43 | 44 | \end{document} -------------------------------------------------------------------------------- /Ass0_warming_up/part3/report_zh.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt, oneside]{ctexart} 2 | \usepackage[a4paper,bindingoffset=0.2in,% 3 | left=1in,right=1in,top=1in,bottom=1in,% 4 | footskip=.25in]{geometry} 5 | \usepackage{graphicx} 6 | \usepackage{enumerate} 7 | \usepackage{url} 8 | \usepackage{hyperref} 9 | \usepackage{pbox} 10 | \usepackage{CJKutf8} 11 | 12 | 13 | \begin{document} 14 | \title{报告} 15 | \author{王小明, 13300000001 \\ 计算机科学与技术学院} 16 | \maketitle 17 | \section{第一章} 18 | \subsection{第一节} 19 | 20 | 图片\footnote{脚注}插入: 21 | 22 | \begin{figure*}[!htbp] 23 | \centering 24 | \includegraphics[width=0.9\linewidth]{fig/3d_lda.png} 25 | \caption{3D LDA 可视化} 26 | \label{fig:3d_lda} 27 | \end{figure*} 28 | 29 | 30 | 表格插入: 31 | 32 | \begin{table}[!htbp] 33 | % \resizebox{1.4\linewidth}{!} 34 | \centering 35 | \scalebox{1}{ 36 | \begin{tabular}{ l || c | c | c | c | c | c | c | c | c | c} 37 | \hline 38 | dim & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 20 \\ \hline 39 | Err. 
\% & 100 & 100 &100 &100 &100 &100 &100 &100 &100 &100 \\ 40 | \hline 41 | \end{tabular} 42 | } 43 | \caption{错误} 44 | \label{tb:lda_knn} 45 | \end{table} 46 | 47 | 引用 \cite{bishop2006pattern}, 链接: \url{http://fudan.edu.cn} 48 | 49 | \bibliographystyle{plain} 50 | \bibliography{report} 51 | 52 | 53 | \end{document} -------------------------------------------------------------------------------- /Ass0_warming_up/part3/report_en.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt, oneside]{article} 2 | \usepackage[a4paper,bindingoffset=0.2in,% 3 | left=1in,right=1in,top=1in,bottom=1in,% 4 | footskip=.25in]{geometry} 5 | \usepackage{graphicx} 6 | \usepackage{enumerate} 7 | \usepackage{url} 8 | \usepackage{hyperref} 9 | \usepackage{pbox} 10 | \usepackage{CJKutf8} 11 | 12 | 13 | \begin{document} 14 | \title{Report} 15 | \author{taineleau, 13300000001 \\ Department of CS} 16 | \maketitle 17 | \section{Chapter 1} 18 | \subsection{Chapter 1.1} 19 | 20 | Figure: 21 | 22 | \begin{figure*}[!htbp] 23 | \centering 24 | \includegraphics[width=0.9\linewidth]{fig/3d_lda.png} 25 | \caption{3D LDA Visualization} 26 | \label{fig:3d_lda} 27 | \end{figure*} 28 | 29 | 30 | Table \footnote{footnote}: 31 | 32 | \begin{table}[!htbp] 33 | % \resizebox{1.4\linewidth}{!} 34 | \centering 35 | \scalebox{1}{ 36 | \begin{tabular}{ l || c | c | c | c | c | c | c | c | c | c} 37 | \hline 38 | dim & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 20 \\ \hline 39 | Err. 
\% & 100 & 100 &100 &100 &100 &100 &100 &100 &100 &100 \\ 40 | \hline 41 | \end{tabular} 42 | } 43 | \caption{Error rate} 44 | \label{tb:lda_knn} 45 | \end{table} 46 | 47 | reference \cite{bishop2006pattern}, url: \url{http://fudan.edu.cn} 48 | 49 | \bibliographystyle{plain} 50 | \bibliography{report} 51 | 52 | \end{document} -------------------------------------------------------------------------------- /Ass1/dataloader.py: -------------------------------------------------------------------------------- 1 | import scipy.io as sio 2 | import numpy as np 3 | 4 | def load_data(t="", preprocess=False, verbose=False, bias=False): 5 | data = sio.loadmat("./spamData.mat") 6 | feature_size = len(data["Xtrain"][1]) 7 | 8 | def binarized(d): 9 | d = [[1 if e > 0 else 0 for e in dd] for dd in d] 10 | return np.array(d, dtype=int) 11 | 12 | def logtransform(d): 13 | # TODO: Please implement me. 14 | return d 15 | 16 | def znormalization(d): 17 | # TODO: Please implement me. 18 | return d 19 | 20 | def flat(d): 21 | return np.array(d, dtype=int).flatten() 22 | 23 | Xtrain = data['Xtrain'] 24 | ytrain = flat(data['ytrain']) 25 | Xtest = data['Xtest'] 26 | ytest = flat(data['ytest']) 27 | 28 | if verbose: 29 | print "Loaded " + str(len(ytrain)) + " train samples, " + \ 30 | "and " + str(len(ytest)) + " test samples, " + \ 31 | "feature of X is of length " + str(feature_size) + "." 32 | 33 | # TODO: explore different preprocessing methods if needed. 34 | if t == "binarized": 35 | print "Binarized preprocessing!" 36 | return binarized(Xtrain), ytrain, binarized(Xtest), ytest, feature_size 37 | elif t == "log": 38 | print "log-transformation preprocessing!" 39 | return logtransform(Xtrain), ytrain, logtransform(Xtest), ytest, feature_size 40 | elif t == "z": 41 | print "z-normalization preprocessing!" 
42 | return znormalization(Xtrain), ytrain, znormalization(Xtest), ytest, feature_size 43 | else: 44 | return Xtrain, ytrain, Xtest, ytest, feature_size 45 | 46 | 47 | if __name__ == '__main__': 48 | load_data(verbose=True) 49 | -------------------------------------------------------------------------------- /Ass1_partIII/dataloader.py: -------------------------------------------------------------------------------- 1 | import scipy.io as sio 2 | import numpy as np 3 | 4 | def load_data(t="", preprocess=False, verbose=False, bias=False): 5 | data = sio.loadmat("./spamData.mat") 6 | feature_size = len(data["Xtrain"][1]) 7 | 8 | def binarized(d): 9 | d = [[1 if e > 0 else 0 for e in dd] for dd in d] 10 | return np.array(d, dtype=int) 11 | 12 | def logtransform(d): 13 | # TODO: Please implement me. 14 | return d 15 | 16 | def znormalization(d): 17 | # TODO: Please implement me. 18 | return d 19 | 20 | def flat(d): 21 | return np.array(d, dtype=int).flatten() 22 | 23 | Xtrain = data['Xtrain'] 24 | ytrain = flat(data['ytrain']) 25 | Xtest = data['Xtest'] 26 | ytest = flat(data['ytest']) 27 | 28 | if verbose: 29 | print "Loaded " + str(len(ytrain)) + " train samples, " + \ 30 | "and " + str(len(ytest)) + " test samples, " + \ 31 | "feature of X is of length " + str(feature_size) + "." 32 | 33 | # TODO: explore different preprocessing methods if needed. 34 | if t == "binarized": 35 | print "Binarized preprocessing!" 36 | return binarized(Xtrain), ytrain, binarized(Xtest), ytest, feature_size 37 | elif t == "log": 38 | print "log-transformation preprocessing!" 39 | return logtransform(Xtrain), ytrain, logtransform(Xtest), ytest, feature_size 40 | elif t == "z": 41 | print "z-normalization preprocessing!" 
42 | return znormalization(Xtrain), ytrain, znormalization(Xtest), ytest, feature_size 43 | else: 44 | return Xtrain, ytrain, Xtest, ytest, feature_size 45 | 46 | 47 | if __name__ == '__main__': 48 | load_data(verbose=True) 49 | -------------------------------------------------------------------------------- /Ass1/naive_bayes_classifier.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import unicode_literals 3 | from dataloader import load_data 4 | import numpy as np 5 | import cPickle as pkl 6 | 7 | 8 | class GenerativeClassifier(object): 9 | """ 10 | Base class for generative classifiers. 11 | """ 12 | 13 | def __init__(self, preprocessing="", shuffle_train=False, shuffle_test=False): 14 | self.x_train, self.y_train, self.x_test, self.y_test, self.feature_size = load_data(t=preprocessing) 15 | self.train_size = len(self.y_train) 16 | self.test_size = len(self.x_test) 17 | 18 | # TODO: reset the prior distribution 19 | self.pi_1 = 0 20 | 21 | if shuffle_train: 22 | idx = np.arange(self.train_size) 23 | np.random.shuffle(idx) 24 | self.x_train = [self.x_train[idx[i]] for i in range(self.train_size)] 25 | self.y_train = [self.y_train[idx[i]] for i in range(self.train_size)] 26 | 27 | if shuffle_test: 28 | idx = np.arange(self.test_size) 29 | np.random.shuffle(idx) 30 | self.x_test = [self.x_test[idx[i]] for i in range(self.test_size)] 31 | self.y_test = [self.y_test[idx[i]] for i in range(self.test_size)] 32 | 33 | 34 | """ 35 | This method uses the Beta distribution as the conjugate prior. 
36 | """ 37 | 38 | 39 | class BetaNaiveBayesClassifier(GenerativeClassifier): 40 | 41 | def __init__(self, alpha=500, preprocessing=""): 42 | super(BetaNaiveBayesClassifier, self).__init__(preprocessing=preprocessing) 43 | self.alpha = alpha 44 | self.beta = np.full((2, self.feature_size), self.alpha, dtype=float) 45 | self.count = [0, 0] 46 | 47 | def train(self): 48 | pass  # TODO 49 | 50 | def test(self): 51 | pass  # TODO 52 | 53 | 54 | 55 | 56 | class GaussianNaiveBayesClassifier(GenerativeClassifier): 57 | 58 | def __init__(self, preprocessing=""): 59 | super(GaussianNaiveBayesClassifier, self).__init__(preprocessing=preprocessing) 60 | self.ML = None 61 | 62 | def train(self): 63 | pass  # TODO 64 | 65 | 66 | def test(self): 67 | pass  # TODO -------------------------------------------------------------------------------- /Ass1_partIII/discriminative_classfier.py: -------------------------------------------------------------------------------- 1 | from dataloader import * 2 | import numpy as np 3 | from numpy.linalg import pinv, norm 4 | 5 | 6 | class DiscriminativeClassifier(object): 7 | def __init__(self, preprocessing="", bias=False): 8 | self.x_train, self.y_train, self.x_test, self.y_test, self.feature_size = \ 9 | load_data(t=preprocessing, bias=bias) 10 | self.train_size = len(self.y_train) 11 | self.test_size = len(self.x_test) 12 | 13 | def sigmoid(x): 14 | return 1.0 / (1.0 + np.exp((-1)*x)) 15 | 16 | 17 | # calc gradient (and/or Hessian matrix) 18 | # TODO: you should implement gradient descent or another optimizer on top of this, 19 | # so please calculate the gradient (and/or Hessian) in this function 20 | # to help the TAs understand your code. 21 | # Note that you are only allowed to use numpy here. 22 | def calc_grad(W, X, Y): 23 | # W: weight matrix 24 | # X: input 25 | # Y: ground truth 26 | # TODO: write your code here :D 27 | grad, Hess = None, None 28 | return grad, Hess 29 | 30 | 31 | class LogisticRegression(DiscriminativeClassifier): 32 | 33 | def __init__(self, l2norm=1, 
preprocessing="", eta=1e-4, max_epoch=30, l2_on=False): 34 | """ 35 | :param l2norm: l2 norm penalty 36 | :param preprocessing: preprocessing method 37 | :param eta: learning rate (step size) 38 | :param max_epoch: maximum number of training epochs (optional) 39 | :param l2_on: whether to apply the l2 penalty 40 | """ 41 | super(LogisticRegression, self).__init__(preprocessing=preprocessing, bias=True) 42 | self.weight = np.ones((self.feature_size + 1, 1)) 43 | self.L2norm = l2norm 44 | self.eta = eta 45 | self.max_epoch = max_epoch 46 | self.l2_on = l2_on 47 | # mask selects the weights and excludes the bias term 48 | self.mask = np.ones_like(self.weight) 49 | self.mask[self.feature_size][0] = 0 50 | 51 | def train(self, eps=1e-4): 52 | epoch = 0 53 | while True: 54 | 55 | # TODO: calc update here using calc_grad() or anything you want 56 | update = 0 57 | # update weight 58 | self.weight = self.weight 59 | 60 | if norm(update) < self.eta * 0.1: # TODO: you should think about some early stopping scheme here 61 | break 62 | 63 | epoch += 1 64 | 65 | print "epoch", epoch 66 | if epoch > self.max_epoch: 67 | break 68 | 69 | def test(self): 70 | err = 0 71 | return err 72 | 73 | 74 | class KNNClassifier(DiscriminativeClassifier): 75 | 76 | def __init__(self, preprocessing="", K=4): 77 | """ 78 | 79 | :param preprocessing: preprocessing method 80 | :param K: number of neighbours to use 81 | """ 82 | super(KNNClassifier, self).__init__(preprocessing=preprocessing) 83 | self.binary = (preprocessing == "binarized")  # matches the "binarized" option in dataloader.py 84 | self.K = K 85 | 86 | # TODO: implement the distance between samples here 87 | def __calc_distance(self, a, b): 88 | if self.binary: 89 | return 90 | else: 91 | return 92 | 93 | def train(self): 94 | # TODO 95 | return 96 | 97 | def test(self): 98 | # TODO 99 | err = 0 100 | return err 101 | -------------------------------------------------------------------------------- /Ass2/dataloader.py: 
-------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from __future__ import unicode_literals 3 | """ 4 | This downloading script is modified from: 5 | https://github.com/sorki/python-mnist 6 | 7 | """ 8 | 9 | import os 10 | import struct 11 | from array import array 12 | import numpy as np 13 | 14 | class MNIST(object): 15 | def __init__(self, path='.'): 16 | self.path = path 17 | 18 | self.test_img_fname = 't10k-images-idx3-ubyte' 19 | self.test_lbl_fname = 't10k-labels-idx1-ubyte' 20 | 21 | self.train_img_fname = 'train-images-idx3-ubyte' 22 | self.train_lbl_fname = 'train-labels-idx1-ubyte' 23 | 24 | self.test_images = [] 25 | self.test_labels = [] 26 | 27 | self.train_images = [] 28 | self.train_labels = [] 29 | 30 | def load_testing(self): 31 | ims, labels = self.load(os.path.join(self.path, self.test_img_fname), 32 | os.path.join(self.path, self.test_lbl_fname)) 33 | 34 | self.test_images = np.array(ims) 35 | self.test_labels = np.array(labels) 36 | 37 | return self.test_images, self.test_labels 38 | 39 | def load_training(self): 40 | ims, labels = self.load(os.path.join(self.path, self.train_img_fname), 41 | os.path.join(self.path, self.train_lbl_fname)) 42 | 43 | self.train_images = np.array(ims) 44 | self.train_labels = np.array(labels) 45 | 46 | return self.train_images, self.train_labels 47 | 48 | @classmethod 49 | def load(cls, path_img, path_lbl): 50 | with open(path_lbl, 'rb') as file: 51 | magic, size = struct.unpack(">II", file.read(8)) 52 | if magic != 2049: 53 | raise ValueError('Magic number mismatch, expected 2049,' 54 | 'got {}'.format(magic)) 55 | 56 | labels = array("B", file.read()) 57 | 58 | with open(path_img, 'rb') as file: 59 | magic, size, rows, cols = struct.unpack(">IIII", file.read(16)) 60 | if magic != 2051: 61 | raise ValueError('Magic number mismatch, expected 2051,' 62 | 'got {}'.format(magic)) 63 | 64 | image_data = array("B", file.read()) 65 | 66 | images = [] 67 | for 
i in range(size): 68 | images.append([0] * rows * cols) 69 | 70 | for i in range(size): 71 | images[i][:] = image_data[i * rows * cols:(i + 1) * rows * cols] 72 | 73 | return images, labels 74 | 75 | @classmethod 76 | def display(cls, img, width=28, threshold=200): 77 | render = '' 78 | for i in range(len(img)): 79 | if i % width == 0: 80 | render += '\n' 81 | if img[i] > threshold: 82 | render += '@' 83 | else: 84 | render += '.' 85 | return render 86 | 87 | 88 | def load_data(): 89 | d = np.load('data/data.npz') 90 | return d['x_train'], d['y_train'], d['x_test'], d['y_test'] 91 | 92 | 93 | if __name__ == '__main__': 94 | if not os.path.exists('data/data.npz'): 95 | data = MNIST('./data') 96 | data.load_training() 97 | data.load_testing() 98 | print("Saving data to data/data.npz!") 99 | np.savez('data/data.npz', x_train=data.train_images, y_train=data.train_labels, 100 | x_test=data.test_images, y_test=data.test_labels) -------------------------------------------------------------------------------- /Ass0_warming_up/README.md: -------------------------------------------------------------------------------- 1 | ## Assignment 0: Warming up (Winter Holiday) [6 %] 2 | > Release: During Winter Holiday 3 | > 4 | > Due: March 5, 2017, Week 1 Sunday 23:55 5 | > 6 | > Grading: Pass / Fail (You will obtain full marks if you submit it) 7 | > 8 | > Estimated time to finish: 2 hrs 9 | > 10 | > 11 | > Don't panic! This assignment is released so early mainly because we need some warm-up practice and a recap. 12 | > If this is your first time setting up an environment for `LaTeX` and Python (numerical programming), it could be really tricky. 13 | > Don't wait until the last minute! 14 | 15 | Keywords: 16 | 17 | - Linear Algebra 18 | 19 | - Numpy 20 | 21 | - Matplotlib 22 | 23 | - LaTeX 24 | 25 | - Eigen Decomposition & SVD 26 | 27 | 28 | 29 | 30 | 31 | ### Part 1: Python & Numpy [2 %] 32 | 33 | This part is for those who are not familiar with `Numpy` and numerical programming. 
34 | 35 | Please finish (parts of) the exercises: https://github.com/Kyubyong/numpy_exercises 36 | 37 | Specifically, you need to finish the following parts: 38 | 39 | Array creation routines 40 | Array manipulation routines 41 | Numpy-specific help functions 42 | Linear algebra (numpy.linalg) 43 | Logic functions 44 | Random sampling (numpy.random) 45 | Sorting, searching, and counting 46 | 47 | --- 48 | 49 | > For reference: 50 | 51 | > - Numpy Tutorial: http://cs231n.github.io/python-numpy-tutorial/ 52 | 53 | > - Matrix calculus: https://en.wikipedia.org/wiki/Matrix_calculus 54 | 55 | ### Part 2: Plot & Visualization [2 %] 56 | 57 | - matplotlib: http://matplotlib.org/examples/index.html 58 | 59 | - tutorial: https://github.com/rougier/matplotlib-tutorial 60 | 61 | To finish this part, please reproduce the plots listed below. You may use arbitrary data as long as your plot looks like the sample one. 62 | 63 | - Scatter Plot, as on page 47 of https://web.stanford.edu/~schmit/cme193/lec/lec5.pdf 64 | - Histogram, as on page 49 65 | - Box Plot, as on page 51 66 | - Wire Plot, as on page 55 67 | 68 | Several APIs you might find helpful: 69 | 70 | - subplots 71 | - tight_layout 72 | - legend 73 | - savefig 74 | - imshow 75 | 76 | 77 | ### Part 3: Eigen Decomposition and SVD (Introduction) [2 %] 78 | 79 | > This part aims at reviewing some basic ideas in linear algebra and matrix analysis. 80 | 81 | > Besides, learning how to use `LaTeX` as a tool to write a scientific report will be helpful in the following assignments. 82 | 83 | Please read parts I and II of http://www.cc.gatech.edu/~dellaert/pub/svd-note.pdf, and answer the following question **briefly**: 84 | 85 | What's the relationship between eigendecomposition and singular value decomposition? 
86 | 87 | Please answer it from the following points of view: 88 | - math 89 | - code implementation 90 | 91 | 92 | --- 93 | > For reference: 94 | 95 | > - http://cims.nyu.edu/~donev/Teaching/SciComp-Spring2012/Lecture5.handout.pdf 96 | 97 | --- 98 | 99 | The LaTeX templates `report_en.tex` and `report_zh.tex` (you can write your report in either English or Chinese) are provided in the folder `part3`. 100 | You must hand in a report written in `LaTeX` (you may use your own template if you are comfortable enough with TeX). 101 | You may compile your `*.tex` locally or online (`https://www.overleaf.com/`) using the command `xelatex`. 102 | 103 | - For macOS, it's sufficient to install [MacTeX](http://www.tug.org/mactex/). 104 | 105 | - For Windows/Linux, please use [TeXLive](https://www.tug.org/texlive/). 106 | 107 | You may find it helpful to set up an environment in [Sublime](https://www.zhihu.com/question/23918126) to compile `*.tex`. Windows users may find the IDE [WinEdt](http://www.winedt.com/) very convenient. 108 | 109 | A LaTeX [cheatsheet](https://wch.github.io/latexsheet/) may be helpful; if you have never used `LaTeX` before, please search the Internet for a good tutorial. 110 | For example, [CTeX.org](http://bbs.ctex.org/forum.php?mod=forumdisplay&fid=9) is a good community for LaTeX beginners. 111 | 112 | --- 113 | 114 | ### Hand-in format 115 | 116 | - Please use your Student ID as the folder name and zip it as `Student_ID.zip`. The file structure is as follows: 117 | 118 | 119 | /13300000001 120 | /README.md 121 | / part1 122 | | *.ipynb 123 | / part2 124 | | fig1.png 125 | | fig2.png 126 | | fig3.png 127 | | fig4.png 128 | / part3 129 | | report.pdf 130 | | report.tex 131 | 132 | - In `README.md`, you should report your workload. 133 | Please give an estimate of how much time you spent on this assignment (please specify how many hours you spent on each part). 
134 | Note that your grade does not depend on how much or how little time you spent. 135 | We just want an honest estimate, which will help us improve the following assignments. 136 | Other feedback is welcome. Whether your feedback is positive or negative, it will not affect your grade. 137 | --------------------------------------------------------------------------------
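The hand-in format described in `/Ass0_warming_up/README.md` (a folder named after the Student ID, zipped as `Student_ID.zip`) could be produced with a short script. The following is only a minimal sketch under stated assumptions: `pack_submission` is a hypothetical helper that is not part of the course materials, and it assumes the submission folder already contains `README.md`, `part1`, `part2`, and `part3` as listed in the README.

```python
import os
import zipfile


def pack_submission(student_dir, zip_path=None):
    """Zip a submission folder, keeping the folder name as the top-level entry.

    Hypothetical helper for illustration only; the course does not provide it.
    """
    student_dir = os.path.abspath(student_dir)
    parent, folder = os.path.split(student_dir)
    if zip_path is None:
        # e.g. /some/path/13300000001 -> /some/path/13300000001.zip
        zip_path = os.path.join(parent, folder + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(student_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                # Paths are stored relative to the parent directory, so every
                # archive entry starts with the Student ID folder.
                zf.write(full, os.path.relpath(full, parent))
    return zip_path
```

Calling `pack_submission("13300000001")` from the directory that contains the folder would write `13300000001.zip` next to it. The archive is written outside the folder being walked, so it never includes itself.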