├── README.md ├── problem_1 ├── 10000.png ├── 100000.png ├── 1000000.png ├── README.md └── main.py ├── problem_2 ├── README.md ├── main.py └── result.png ├── problem_3 ├── 1.png ├── 2.png ├── README.md ├── data.png └── main.py ├── problem_4 ├── 1.png ├── 2.png ├── README.md ├── data.png └── main.py ├── problem_5 ├── README.md ├── data.png └── main.py ├── problem_6 ├── README.md ├── eta.png ├── h_num.png ├── iteration_batch.png ├── iteration_single.png └── main.py ├── problem_7 ├── README.md ├── data ├── data.png ├── diff.png ├── main.py └── result.png ├── problem_8 ├── README.md ├── acc-k.png ├── acc-sigma-k20.png ├── acc-sigma-k5.png ├── acc-sigma-k50.png ├── data.png ├── data.txt ├── main.py └── result.png └── problem_9 ├── README.md ├── data.zip └── main.py
/README.md:
--------------------------------------------------------------------------------
1 | # 模式识别 Pattern Recognition
2 | ## 0. 简介
3 | * 模式识别的一些实例问题Python实现
4 | * 环境: Ubuntu16.04 + Python3.6
5 | 
6 | ## 1. 实例问题
7 | ### Problem 1
8 | 通过下列步骤说明这样一个事实:大量独立的随机变量的平均将近似为一高斯分布。
9 | * 写一个程序,从均一分布$U(x_l,x_u)$中产生n个随机整数。
10 | * 写一个程序,从范围$-100 \le x_l < x_u \le 100$中随机取$x_l$和$x_u$,以及在范围$0 < n \le 1000$中随机取n(即随机整数的个数)。
11 | * 重复上述采样过程,直到样本总数分别达到$10^4$、$10^5$和$10^6$个,并绘制样本的直方图。
12 | * 计算全部样本的均值和标准差,据此绘制相应的正态分布曲线,与直方图对比。
13 | 
14 | ### Problem 2
15 | 
16 | 根据以下步骤测试经验误差:
17 | * 写一个程序产生d维空间的样本点,服从均值为$\mu$和协方差矩阵$\Sigma$的正态分布。
18 | * 考虑正态分布
19 | $$
20 | p(x|\omega_1) \sim N (\dbinom{1}{0}, I)\\
21 | p(x|\omega_2) \sim N (\dbinom{-1}{0}, I)
22 | $$
23 | 且$P(\omega_1)=P(\omega_2)=0.5$,说明贝叶斯判决边界。
24 | * 产生100个点(50个$\omega_1$类的点,50个$\omega_2$类的点),并计算经验误差。
25 | 
26 | ### Problem 3
27 | 
28 | 实现批量感知器算法.初始权向量$a = 0$,
29 | * 使用程序利用$\omega_1$和$\omega_2$的数据进行训练,记录收敛步数.
30 | * 使用程序利用$\omega_2$和$\omega_3$的数据进行训练,记录收敛步数.
31 | 
32 | 
33 | <div align=center>
34 | <img src="problem_3/data.png">
35 | </div>
36 | 
37 | 
38 | > 本数据同时被问题3/4/5使用
39 | 
40 | ### Problem 4
41 | 
42 | 实现Ho-Kashyap算法,并使用它分别对$\omega_1$和$\omega_3$与$\omega_2$和$\omega_4$进行分类.给出分类误差并分析.
43 | 
44 | ### Problem 5
45 | 
46 | 请写一个程序,实现 MSE 多类扩展方法。每一类用前 8 个样本来构造分类器,用后两个样本作测试。请给出你的正确率.
47 | 
48 | ### Problem 6
49 | 
50 | 本题使用的数据如下:
51 | 
52 | 第一类 10 个样本(三维空间):
53 | 
54 | [1.58, 2.32, -5.8], [0.67, 1.58, -4.78], [1.04, 1.01, -3.63], [-1.49, 2.18, -3.39], [-0.41, 1.21, -4.73],
55 | [1.39, 3.16, 2.87], [1.20, 1.40, -1.89], [-0.92, 1.44, -3.22], [0.45, 1.33, -4.38], [-0.76, 0.84, -1.96]
56 | 
57 | 第二类 10 个样本(三维空间):
58 | 
59 | [0.21, 0.03, -2.21], [0.37, 0.28, -1.8], [0.18, 1.22, 0.16], [-0.24, 0.93, -1.01], [-1.18, 0.39, -0.39],
60 | [0.74, 0.96, -1.16], [-0.38, 1.94, -0.48], [0.02, 0.72, -0.17], [ 0.44, 1.31, -0.14], [0.46, 1.49, 0.68]
61 | 
62 | 第三类 10 个样本(三维空间):
63 | 
64 | [-1.54, 1.17, 0.64], [5.41, 3.45, -1.33], [1.55, 0.99, 2.69], [1.86, 3.19, 1.51], [1.68, 1.79, -0.87],
65 | [3.51, -0.22, -1.39], [1.40, -0.44, -0.92], [0.44, 0.83, 1.97], [ 0.25, 0.68, -0.99], [0.66, -0.45, 0.08]
66 | 
67 | * 请编写两个通用的三层前向神经网络反向传播算法程序,一个采用批量方式更新权重,另一个采用单样本方式更新权重。其中,隐含层结点的激励函数采用双曲正切函数,输出层的激励函数采用 sigmoid 函数。目标函数采用平方误差准则函数。
68 | 
69 | * 请利用上面的数据验证你写的程序,分析如下几点:
70 |     * 隐含层不同结点数目对训练精度的影响;
71 |     * 观察不同的梯度更新步长对训练的影响,并给出一些描述或解释;
72 |     * 在网络结构固定的情况下,绘制出目标函数随着迭代步数增加的变化曲线。
73 | 
74 | ### Problem 7
75 | 
76 | 现有1000个二维空间的数据点, 其$\sigma=[1,0;0,1]$, $\mu_1=[1,-1],\mu_2=[5.5,-4.5], \mu_3=[1,4], \mu_4=[6,4.5], \mu_5=[9,0]$.
77 | 请完成如下工作:
78 | 
79 | * 编写一个程序, 实现经典的K-means聚类算法;
80 | * 令聚类个数为5, 采用不同的初始值观察最后的聚类中心, 给出你所估计的聚类中心, 指出每个中心有多少个样本; 指出你所得到的聚类中心与对应的真实分布的均值之间的误差(对5个聚类, 给出均方误差即可).
81 | 
82 | ### Problem 8
83 | 
84 | 关于谱聚类。有如下 200 个数据点,它们是通过两个半月形分布生成的。如图所示:
85 | 
86 | <div align=center>
87 | <img src="problem_8/data.png">
88 | </div>
89 | 
90 | * 请编写一个谱聚类算法,实现"Normalized Spectral Clustering—Algorithm 3 (Ng 算法)".
91 | * 设点对亲和性(即边权值)采用如下计算公式:
92 | 
93 | $$
94 | w_{ij} = e^{-\frac{||x_i-x_j||^2_2}{2\sigma^2}}
95 | $$
96 | 
97 | 同时,数据图采用 k-近邻方法来生成(即对每个数据点$x_i$,首先在所有样本中找出不包含$x_i$的 k 个最邻近的样本点,然后$x_i$与每个邻近样本点均有一条边相连,从而完成图构造)。
98 | 
99 | 注意,为了保证亲和度矩阵 W 是对称矩阵,可以令$W=\frac{(W^{T} +W)}{2}$. 假设已知前 100 个点为一个聚类, 后 100 个点为一个聚类,请分析分别取不同的$\sigma$值和 k 值对聚类结果的影响。
100 | (本题可以给出关于聚类精度随着$\sigma$值和 k 值的变化曲线。在实验中,可以固定一个,变化另一个).
101 | 
102 | ### Problem 9
103 | 
104 | 从MNIST数据集中选择两类,对其进行SVM分类,可调用现有的SVM工具.
105 | 
--------------------------------------------------------------------------------
/problem_1/10000.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_1/10000.png
--------------------------------------------------------------------------------
/problem_1/100000.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_1/100000.png
--------------------------------------------------------------------------------
/problem_1/1000000.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_1/1000000.png
--------------------------------------------------------------------------------
/problem_1/README.md:
--------------------------------------------------------------------------------
1 | # Problem 1
2 | ## 1. 问题描述
3 | 通过下列步骤说明这样一个事实:大量独立的随机变量的平均将近似为一高斯分布。
4 | * 写一个程序,从均一分布$U(x_l,x_u)$中产生n个随机整数。
5 | * 写一个程序,从范围$-100 \le x_l < x_u \le 100$中随机取$x_l$和$x_u$,以及在范围$0 < n \le 1000$中随机取n(即随机整数的个数)。
6 | * 重复上述采样过程,直到样本总数分别达到$10^4$、$10^5$和$10^6$个,并绘制样本的直方图。
7 | * 计算全部样本的均值和标准差,据此绘制相应的正态分布曲线,与直方图对比。
8 | 
9 | ## 2. 实现思路
10 | 
11 | 每轮采样时先随机确定本轮样本个数n与采样区间$[x_l, x_u]$,再从该均匀分布中采样n个整数;循环至样本总数达到N后,统计全部样本的均值与标准差,并据此绘制正态分布曲线,与直方图对比。
12 | 
13 | ## 3. Python代码
14 | ```Python
15 | import numpy as np
16 | import matplotlib.pyplot as plt
17 | 
18 | def normfunc(x,mu,sigma):
19 |     """
20 |     Return a list of value depending on the parameters according to Normal distribution.
21 |     Parameters:
22 |         x: a list of x value
23 |         mu: mean value
24 |         sigma: std
25 |     """
26 |     y = np.exp(-((x - mu)**2)/(2*sigma**2)) / (sigma * np.sqrt(2*np.pi))
27 |     return y
28 | 
29 | def gen_cal_plot(N):
30 |     """
31 |     i) Generate N samples and plot them.
32 |     ii) Calculate the mean & std and plot.
33 |     Parameters:
34 |         N : the number of the samples
35 |     """
36 |     S = []
37 |     # 当剩余样本数大于1000时随机确定此轮采样样本个数n
38 |     while N >= 1000:
39 |         n = np.random.randint(1,1001)
40 |         N -= n
41 |         # 随机选取此轮采样上下界
42 |         x_l, x_u = np.random.randint(-100, 101, 2)
43 |         if x_l > x_u: x_l, x_u = x_u, x_l
44 |         # 采样
45 |         s = np.array(np.random.uniform(x_l, x_u+1, n), np.int)
46 |         S.extend(s)
47 |     # 当剩余样本数小于1000时,完成最后一轮采样
48 |     if N:
49 |         x_l, x_u = np.random.randint(-100, 101, 2)
50 |         if x_l > x_u: x_l, x_u = x_u, x_l
51 |         s = np.array(np.random.uniform(x_l, x_u+1, N), np.int)
52 |         S.extend(s)
53 |     # 计算均值和标准差
54 |     mean = np.mean(S)
55 |     std = np.std(S)
56 |     # 根据均值标准差计算正态分布曲线
57 |     x = np.arange(-100, 101, 1)
58 |     y = normfunc(x, mean, std)
59 |     # 绘图
60 |     plt.hist(S, bins=200, range=[-100,100], normed=True, color='orange')
61 |     plt.plot(x, y)
62 |     plt.ylabel("Frequency")
63 |     plt.show()
64 | 
65 | if __name__ == "__main__":
66 |     gen_cal_plot(10000)
67 |     gen_cal_plot(100000)
68 |     gen_cal_plot(1000000)
69 | ```
70 | 
71 | ## 4. 结果与讨论
72 | 
73 | 分别采样$10^4$、$10^5$和$10^6$个样本后,其直方图如下,图中曲线为根据均值和标准差绘制的正态分布曲线。
74 | 
75 | <div align=center>
76 | <img src="10000.png">
77 | <img src="100000.png">
78 | <img src="1000000.png">
79 | </div>
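
补充: 以上代码以 Ubuntu16.04 + Python3.6 时期的库版本为准, 其中 `np.int` 与 `plt.hist` 的 `normed` 参数在较新的 NumPy (≥1.24) / Matplotlib (≥3.1) 中已被移除。若在新版环境下复现, 可参考如下等价写法(示意代码, 非原仓库内容, 函数名 `gen_cal_plot_modern` 为假设):

```Python
import numpy as np
import matplotlib.pyplot as plt

def gen_cal_plot_modern(N, seed=None):
    """gen_cal_plot 的新版 API 等价写法(示意)"""
    rng = np.random.default_rng(seed)
    S = []
    while N > 0:
        # 此轮采样个数, 不超过剩余样本数(将原来的两段循环合并为一段)
        n = min(int(rng.integers(1, 1001)), N)
        N -= n
        # 随机选取此轮采样区间的上下界
        x_l, x_u = sorted(rng.integers(-100, 101, size=2))
        # np.int 已移除, 改用内置 int
        S.extend(rng.uniform(x_l, x_u + 1, n).astype(int))
    mean, std = np.mean(S), np.std(S)
    x = np.arange(-100, 101)
    y = np.exp(-((x - mean)**2) / (2 * std**2)) / (std * np.sqrt(2 * np.pi))
    # normed 参数已移除, 改用 density
    plt.hist(S, bins=200, range=[-100, 100], density=True, color='orange')
    plt.plot(x, y)
    plt.ylabel("Frequency")
    plt.show()

if __name__ == "__main__":
    gen_cal_plot_modern(10000)
```

其中 `default_rng` 为 NumPy 新的随机数接口, 采样逻辑与原代码一致。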
80 | 81 | 对比三个图可以看出,当采样点数为$10^4$时,样本的分布不呈现什么规律。随着样本个数的增多,样本分布逐渐规整:在均值处样本最多,在均值两侧基本对称,这在样本个数为$10^6$时较为明显。 -------------------------------------------------------------------------------- /problem_1/main.py: -------------------------------------------------------------------------------- 1 | """ 2 | @leofansq 3 | https://github.com/leofansq 4 | """ 5 | import numpy as np 6 | import matplotlib.pyplot as plt 7 | 8 | def normfunc(x,mu,sigma): 9 | """ 10 | Return a list of value depending on the parameters according to Normal distribution. 11 | Parameters: 12 | x: a list of x value 13 | mu: mean value 14 | sigma: std 15 | """ 16 | y = np.exp(-((x - mu)**2)/(2*sigma**2)) / (sigma * np.sqrt(2*np.pi)) 17 | return y 18 | 19 | def gen_cal_plot(N): 20 | """ 21 | i) Generate N samples and plot them. 22 | ii) Calculate the mean & std and plot. 23 | Parameters: 24 | N : the number of the samples 25 | """ 26 | S = [] 27 | # 当剩余样本数大于1000时随机确定此轮采样样本个数n 28 | while N >= 1000: 29 | n = np.random.randint(1,1001) 30 | N -= n 31 | # 随机选取此轮采样上下界 32 | x_l, x_u = np.random.randint(-100, 101, 2) 33 | if x_l > x_u: x_l, x_u = x_u, x_l 34 | # 采样 35 | s = np.array(np.random.uniform(x_l, x_u+1, n), np.int) 36 | S.extend(s) 37 | # 当剩余样本数小于1000时,完成最后一轮采样 38 | if N: 39 | x_l, x_u = np.random.randint(-100, 101, 2) 40 | if x_l > x_u: x_l, x_u = x_u, x_l 41 | s = np.array(np.random.uniform(x_l, x_u+1, N), np.int) 42 | S.extend(s) 43 | # 计算均值和方差 44 | mean = np.mean(S) 45 | std = np.std(S) 46 | # 根据均值方差计算正态分布曲线 47 | x = np.arange(-100, 101, 1) 48 | y = normfunc(x, mean, std) 49 | # 绘图 50 | plt.hist(S, bins=200, range=[-100,100], normed=True, color='orange') 51 | plt.plot(x, y) 52 | plt.ylabel("Frequency") 53 | plt.show() 54 | 55 | if __name__ == "__main__": 56 | gen_cal_plot(10000) 57 | gen_cal_plot(100000) 58 | gen_cal_plot(1000000) -------------------------------------------------------------------------------- /problem_2/README.md: -------------------------------------------------------------------------------- 1 | # Problem 2 2 | ## 1. 问题描述 3 | 根据以下步骤测试经验误差: 4 | * 写一个程序产生d维空间的样本点,服从均值为$\mu$和协方差矩阵$\Sigma$的正态分布。 5 | * 考虑正态分布 6 | $$ 7 | p(x|\omega_1) \sim N (\dbinom{1}{0}, I)\\ 8 | p(x|\omega_2) \sim N (\dbinom{-1}{0}, I) 9 | $$ 10 | 11 | 且$P(\omega_1)=P(\omega_2)=0.5$,说明贝叶斯判决边界。 12 | * 产生100个点(50个$\omega_1$类的点,50个$\omega_2$类的点),并计算经验误差。 13 | 14 | ## 2. 实现思路 15 | 首先,根据(b)中所给信息计算确定判决边界。 16 | 17 | 由于$p(x|\omega_1)$和$p(x|\omega_2)$均服从正态分布,且有$P(\omega_1)=P(\omega_2)=0.5$。 18 | 19 | 因此,贝叶斯判决边界为 20 | $$ 21 | \dbinom{2}{0} \times x = 0 22 | $$ 23 | 24 | ## 3. 
Python代码 25 | ```Python 26 | import numpy as np 27 | import matplotlib.pyplot as plt 28 | 29 | def main(): 30 | # 设置参数 31 | mean_1 = [1, 0] 32 | mean_2 = [-1, 0] 33 | cov = np.eye(2) 34 | 35 | # 生成样本点 36 | s_1 = np.random.multivariate_normal(mean_1, cov, 50) 37 | s_2 = np.random.multivariate_normal(mean_2, cov, 50) 38 | 39 | # 绘制样本点 40 | x_1 = s_1[:,0] 41 | y_1 = s_1[:,1] 42 | x_2 = s_2[:,0] 43 | y_2 = s_2[:,1] 44 | plt.subplot(211).set_title("Real Distribution") 45 | plt.scatter(x_1,y_1) 46 | plt.scatter(x_2,y_2) 47 | 48 | # 根据贝叶斯判决边界绘制判决结果 49 | S = [] 50 | S.extend(s_1) 51 | S.extend(s_2) 52 | X_1 = [] 53 | Y_1 = [] 54 | X_2 = [] 55 | Y_2 = [] 56 | for i in S: 57 | if i[0]>0: 58 | X_1.append(i[0]) 59 | Y_1.append(i[1]) 60 | else: 61 | X_2.append(i[0]) 62 | Y_2.append(i[1]) 63 | plt.subplot(212).set_title("Decision Distribution") 64 | plt.scatter(X_1,Y_1) 65 | plt.scatter(X_2,Y_2) 66 | plt.tight_layout() 67 | plt.show() 68 | 69 | # 计算错误率 70 | err_1 = [i for i in s_1 if i[0]<0] 71 | err_2 = [i for i in s_2 if i[0]>0] 72 | err_cnt = len(err_1) + len(err_2) 73 | rate = err_cnt/100 74 | print (rate) 75 | 76 | if __name__ == "__main__": 77 | main() 78 | ``` 79 | 80 | ## 4. 结果与讨论 81 | 82 | 两类各采样50个样本点的分布和根据所求的贝叶斯判决边界判决的结果如下图所示。 83 | 84 |
85 | <img src="result.png">
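
补充第2节中省略的推导步骤(由等协方差正态分布的对数似然比得到, 非原文内容):

$$
g(x) = \ln\frac{p(x|\omega_1)}{p(x|\omega_2)} = -\frac{1}{2}||x-\mu_1||^2 + \frac{1}{2}||x-\mu_2||^2 = (\mu_1-\mu_2)^T x = \dbinom{2}{0}^T x = 2x_1
$$

(两均值范数相等, 常数项消去), 故判决边界即$x_1 = 0$。由此还可解析地算出贝叶斯误差$P(e) = \Phi(-\frac{||\mu_1-\mu_2||}{2}) = \Phi(-1) \approx 0.1587$, 可与下文的经验误差对照。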
87 | 88 | 经计算,在上述仿真过程中,经验误差为0.15. 89 | 90 | -------------------------------------------------------------------------------- /problem_2/main.py: -------------------------------------------------------------------------------- 1 | """ 2 | @leofansq 3 | https://github.com/leofansq 4 | """ 5 | import numpy as np 6 | import matplotlib.pyplot as plt 7 | 8 | def main(): 9 | # 设置参数 10 | mean_1 = [1, 0] 11 | mean_2 = [-1, 0] 12 | cov = np.eye(2) 13 | 14 | # 生成样本点 15 | s_1 = np.random.multivariate_normal(mean_1, cov, 50) 16 | s_2 = np.random.multivariate_normal(mean_2, cov, 50) 17 | 18 | # 绘制样本点 19 | x_1 = s_1[:,0] 20 | y_1 = s_1[:,1] 21 | x_2 = s_2[:,0] 22 | y_2 = s_2[:,1] 23 | plt.subplot(211).set_title("Real Distribution") 24 | plt.scatter(x_1,y_1) 25 | plt.scatter(x_2,y_2) 26 | 27 | # 根据贝叶斯判决边界绘制判决结果 28 | S = [] 29 | S.extend(s_1) 30 | S.extend(s_2) 31 | X_1 = [] 32 | Y_1 = [] 33 | X_2 = [] 34 | Y_2 = [] 35 | for i in S: 36 | if i[0]>0: 37 | X_1.append(i[0]) 38 | Y_1.append(i[1]) 39 | else: 40 | X_2.append(i[0]) 41 | Y_2.append(i[1]) 42 | plt.subplot(212).set_title("Decision Distribution") 43 | plt.scatter(X_1,Y_1) 44 | plt.scatter(X_2,Y_2) 45 | plt.tight_layout() 46 | plt.show() 47 | 48 | # 计算错误率 49 | err_1 = [i for i in s_1 if i[0]<0] 50 | err_2 = [i for i in s_2 if i[0]>0] 51 | err_cnt = len(err_1) + len(err_2) 52 | rate = err_cnt/100 53 | print (rate) 54 | 55 | 56 | if __name__ == "__main__": 57 | main() 58 | 59 | 60 | 61 | 62 | -------------------------------------------------------------------------------- /problem_2/result.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_2/result.png -------------------------------------------------------------------------------- /problem_3/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_3/1.png -------------------------------------------------------------------------------- /problem_3/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_3/2.png -------------------------------------------------------------------------------- /problem_3/README.md: -------------------------------------------------------------------------------- 1 | # Problem 3 2 | ## 1. 问题描述 3 | 4 | 实现批量感知器算法.初始权向量$a = 0$, 5 | * 使用程序利用$\omega_1$和$\omega_2$的数据进行训练,记录收敛步数. 6 | * 使用程序利用$\omega_2$和$\omega_3$的数据进行训练,记录收敛步数. 7 | 8 |
9 | <img src="data.png">
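
补充: 下文第2、3节实现的批处理感知器, 对应准则函数$J_p(a) = \sum_{y \in \mathcal{Y}}(-a^Ty)$的梯度下降, 其更新公式为(对照代码中对错分样本求和的循环):

$$
a(k+1) = a(k) + \eta(k)\sum_{y \in \mathcal{Y}_k}y
$$

其中$\mathcal{Y}_k$为第k步被错分(即$a^Ty \le 0$)的规范化增广样本集合, 代码中$\eta$恒为1。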
11 | 12 | ## 2. 实现思路 13 | 14 | * 规范化增广样本 15 | * 初始化参数 16 | * 基于批处理感知算法更新权向量. 17 | 18 | ## 3. Python代码 19 | ### 3.1 规范化增广样本 20 | ```Python 21 | import numpy as np 22 | import copy 23 | def samples_trans(w1, w2): 24 | """ 25 | 规范化增广样本 26 | Parameters: 27 | w1: 类1样本 28 | w2: 类2样本 29 | Return: 30 | w: 规范化增广样本 31 | """ 32 | # 复制样本,防止后续操作改变原始样本 33 | w_1 = copy.deepcopy(w1) 34 | w_2 = copy.deepcopy(w2) 35 | 36 | # 增广 37 | for i in w_1: i.append(1) 38 | for i in w_2: i.append(1) 39 | 40 | # 规范化 41 | w_1 = np.array(w_1) 42 | w_2 = np.array(w_2) 43 | w_2 = -w_2 44 | w = np.concatenate([w_1, w_2]) 45 | 46 | return w 47 | ``` 48 | 49 | ### 3.2 批处理感知算法 50 | ```Python 51 | def batch_perception(w1, w2): 52 | """ 53 | Batch Perception 54 | Parameters: 55 | w1: 类1样本 56 | w2: 类2样本 57 | Return: 58 | a 59 | k: 迭代次数 60 | """ 61 | # 规范化增广样本 62 | w = samples_trans(w1, w2) 63 | 64 | # 初始化参数 65 | a = np.zeros_like(w[1]) 66 | yita = 1 67 | theta = np.zeros_like(w[1])+1e-6 68 | k = 0 69 | 70 | # 迭代 71 | while True: 72 | y = np.zeros_like(w[1]) 73 | for i in w: 74 | if np.matmul(a.T, i) <= 0:y += i 75 | yita_y = yita * y 76 | 77 | if all(np.abs(yita_y)<=theta):break 78 | 79 | a += yita_y 80 | k += 1 81 | 82 | return a, k 83 | ``` 84 | 85 | ### 3.3 求解(a)(b) 86 | ```Python 87 | # (a) 88 | w_1 = [[0.1, 1.1], [6.8, 7.1], [-3.5, -4.1], [2.0, 2.7], [4.1, 2.8], 89 | [3.1, 5.0], [-0.8, -1.3], [0.9, 1.2], [5.0, 6.4], [3.9, 4.0]] 90 | w_2 = [[7.1, 4.2], [-1.4, -4.3], [4.5, 0.0], [6.3, 1.6], [4.2, 1.9], 91 | [1.4, -3.2], [2.4, -4.0], [2.5, -6.1], [8.4, 3.7], [4.1, -2.2]] 92 | a, k = batch_perception(w_1, w_2) 93 | print ("a:{}\nk:{}".format(a, k)) 94 | 95 | # (b) 96 | w_3 = [[-3.0, -2.9], [0.5, 8.7], [2.9, 2.1], [-0.1, 5.2], [-4.0, 2.2], 97 | [-1.3, 3.7], [-3.4, 6.2], [-4.1, 3.4], [-5.1, 1.6], [1.9, 5.1]] 98 | a, k = batch_perception(w_3, w_2) 99 | print ("a:{}\nk:{}".format(a, k)) 100 | ``` 101 | 102 | ## 4. 结果与讨论 103 | 104 | 对于$\omega_1$和$\omega_2$两类数据,迭代23次,得到权向量$(-30.4, 34.1, 34)^T$ 105 | 106 |
107 | <img src="1.png">
109 | 110 | 对于$\omega_2$和$\omega_3$两类数据,迭代16次,得到权向量$(-41.4, 48.6, 19)^T$. 111 | 112 |
113 | <img src="2.png">
115 | 116 | 117 | -------------------------------------------------------------------------------- /problem_3/data.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_3/data.png -------------------------------------------------------------------------------- /problem_3/main.py: -------------------------------------------------------------------------------- 1 | """ 2 | @leofansq 3 | https://github.com/leofansq 4 | """ 5 | import numpy as np 6 | import copy 7 | 8 | def samples_trans(w1, w2): 9 | """ 10 | 规范化增广样本 11 | Parameters: 12 | w1: 类1样本 13 | w2: 类2样本 14 | Return: 15 | w: 规范化增广样本 16 | """ 17 | # 复制样本,防止后续操作改变原始样本 18 | w_1 = copy.deepcopy(w1) 19 | w_2 = copy.deepcopy(w2) 20 | 21 | # 增广 22 | for i in w_1: i.append(1) 23 | for i in w_2: i.append(1) 24 | 25 | # 规范化 26 | w_1 = np.array(w_1) 27 | w_2 = np.array(w_2) 28 | w_2 = -w_2 29 | w = np.concatenate([w_1, w_2]) 30 | 31 | return w 32 | 33 | def batch_perception(w1, w2): 34 | """ 35 | Batch Perception 36 | Parameters: 37 | w1: 类1样本 38 | w2: 类2样本 39 | Return: 40 | a 41 | k: 迭代次数 42 | """ 43 | # 规范化增广样本 44 | w = samples_trans(w1, w2) 45 | 46 | # 初始化参数 47 | a = np.zeros_like(w[1]) 48 | yita = 1 49 | theta = np.zeros_like(w[1])+1e-6 50 | k = 0 51 | 52 | # 迭代 53 | while True: 54 | y = np.zeros_like(w[1]) 55 | for i in w: 56 | if np.matmul(a.T, i) <= 0:y += i 57 | yita_y = yita * y 58 | 59 | if all(np.abs(yita_y)<=theta):break 60 | 61 | a += yita_y 62 | k += 1 63 | 64 | return a, k 65 | 66 | def show_result(w1, w2, a): 67 | import matplotlib.pyplot as plt 68 | 69 | w_1 = np.array(w1) 70 | x = w_1[:, 0] 71 | y = w_1[:, 1] 72 | plt.scatter(x, y, marker = '.',color = 'red') 73 | 74 | w_2 = np.array(w2) 75 | x = w_2[:, 0] 76 | y = w_2[:, 1] 77 | plt.scatter(x, y, marker = '.',color = 'blue') 78 | 79 | x = np.arange(-10, 10, 0.1) 80 | y = -a[0]/a[1]*x - a[2]/a[1] 81 | plt.plot(x, y) 82 | 83 | plt.xlabel('x_1') 84 | plt.ylabel('x_2') 85 | plt.title('Classfication Result') 86 | plt.show() 87 | 88 | 89 | 90 | 91 | if __name__ == "__main__": 92 | # (a) 93 | w_1 = [[0.1, 1.1], [6.8, 7.1], [-3.5, -4.1], [2.0, 2.7], [4.1, 2.8], [3.1, 5.0], [-0.8, -1.3], [0.9, 1.2], [5.0, 6.4], [3.9, 4.0]] 94 | w_2 = [[7.1, 4.2], [-1.4, -4.3], [4.5, 0.0], [6.3, 1.6], [4.2, 1.9], [1.4, -3.2], [2.4, -4.0], [2.5, -6.1], [8.4, 3.7], [4.1, -2.2]] 95 | a, k = batch_perception(w_1, w_2) 96 | print ("a:{}\nk:{}".format(a, k)) 97 | show_result(w_1, w_2, a) 98 | # (b) 99 | w_3 = [[-3.0, -2.9], [0.5, 8.7], [2.9, 2.1], [-0.1, 5.2], [-4.0, 2.2], [-1.3, 3.7], [-3.4, 6.2], [-4.1, 3.4], [-5.1, 1.6], [1.9, 5.1]] 100 | a, k = batch_perception(w_3, w_2) 101 | print ("a:{}\nk:{}".format(a, k)) 102 | show_result(w_3, w_2, a) 103 | -------------------------------------------------------------------------------- /problem_4/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_4/1.png -------------------------------------------------------------------------------- /problem_4/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_4/2.png -------------------------------------------------------------------------------- /problem_4/README.md: 
-------------------------------------------------------------------------------- 1 | # Problem 4 2 | ## 1. 问题描述 3 | 4 | 实现Ho-Kashyap算法,并使用它分别对$\omega_1$和$\omega_3$与$\omega_2$和$\omega_4$进行分类.给出分类误差并分析. 5 | 6 |
7 | <img src="data.png">
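
补充: 下文实现的Ho-Kashyap迭代可概括为如下公式(对照代码中的e, e_p, b, a):

$$
e(k) = Wa(k) - b(k), \qquad e^+(k) = \frac{1}{2}(e(k) + |e(k)|)
$$

$$
b(k+1) = b(k) + 2\eta e^+(k), \qquad a(k+1) = (W^TW)^{-1}W^Tb(k+1)
$$

若某步$e(k)$的各分量均非正且不全为零, 则可判定两类线性不可分。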
9 | 10 | ## 2. 实现思路 11 | 12 | * 规范化增广样本 13 | * 初始化参数 14 | * 基于Ho-Kashyap算法迭代更新a和b. 15 | 16 | ## 3. Python代码 17 | ### 3.1 规范化增广样本 18 | ```Python 19 | import numpy as np 20 | import copy 21 | def samples_trans(w1, w2): 22 | """ 23 | 规范化增广样本 24 | Parameters: 25 | w1: 类1样本 26 | w2: 类2样本 27 | Return: 28 | w: 规范化增广样本 29 | """ 30 | # 复制样本,防止后续操作改变原始样本 31 | w_1 = copy.deepcopy(w1) 32 | w_2 = copy.deepcopy(w2) 33 | 34 | # 增广 35 | for i in w_1: i.append(1) 36 | for i in w_2: i.append(1) 37 | 38 | # 规范化 39 | w_1 = np.array(w_1) 40 | w_2 = np.array(w_2) 41 | w_2 = -w_2 42 | w = np.concatenate([w_1, w_2]) 43 | 44 | return w 45 | ``` 46 | 47 | ### 3.2 Ho-Kashyap算法 48 | ```Python 49 | def HK_algorithm(w1, w2): 50 | """ 51 | Ho-Kashyap Algorithm 52 | Parameters: 53 | w1: 类1样本 54 | w2: 类2样本 55 | Return: 56 | a, b, k 57 | """ 58 | # 规范化增广样本 59 | w = samples_trans(w1, w2) 60 | 61 | # 初始化 62 | a = np.zeros_like(w[1]) 63 | b = np.zeros(w.shape[0]) + 0.5 64 | yita = 0.5 65 | th_b = np.zeros(w.shape[0]) + 1e-3 66 | th_k = 10000 67 | k = 1 68 | 69 | # 迭代 70 | while k <= th_k: 71 | e = np.matmul(w, a) - b 72 | e_p = 0.5 * (e + np.abs(e)) 73 | b += 2 * (yita) * e_p 74 | a = np.matmul(np.matmul(np.linalg.inv(np.matmul(w.T, w)), w.T), b) 75 | k += 1 76 | 77 | # 判断是否线性不可分 78 | if any(e) < 0 and any(e) > 0: break 79 | 80 | if all(np.abs(e) <= th_b): return a, e, k 81 | 82 | print ("No solution found !", k) 83 | return None, None, k 84 | ``` 85 | 86 | ### 3.3 求解(a)(b) 87 | ```Python 88 | w_1 = [[0.1, 1.1], [6.8, 7.1], [-3.5, -4.1], [2.0, 2.7], [4.1, 2.8], 89 | [3.1, 5.0], [-0.8, -1.3], [0.9, 1.2], [5.0, 6.4], [3.9, 4.0]] 90 | w_2 = [[7.1, 4.2], [-1.4, -4.3], [4.5, 0.0], [6.3, 1.6], [4.2, 1.9], 91 | [1.4, -3.2], [2.4, -4.0], [2.5, -6.1], [8.4, 3.7], [4.1, -2.2]] 92 | w_3 = [[-3.0, -2.9], [0.5, 8.7], [2.9, 2.1], [-0.1, 5.2], [-4.0, 2.2], 93 | [-1.3, 3.7], [-3.4, 6.2], [-4.1, 3.4], [-5.1, 1.6], [1.9, 5.1]] 94 | w_4 = [[-2.0, -8.4], [-8.9, 0.2], [-4.2, -7.7], [-8.5, -3.2], [-6.7, -4.0], 95 | [-0.5, -9.2], [-5.3, -6.7], [-8.7, -6.4], [-7.1, -9.7], [-8.0, -6.3]] 96 | # (a) 97 | a, b, k = HK_algorithm(w_1, w_3) 98 | print (a, b, k) 99 | show_result(w_1, w_3, a) 100 | 101 | # (b) 102 | a, b, k = HK_algorithm(w_2, w_4) 103 | print (a, b, k) 104 | show_result(w_2, w_4, a) 105 | ``` 106 | 107 | ## 4. 结果与讨论 108 | 109 | 利用程序对$\omega_1$和$\omega_3$分类.算法判断此两类线性不可分,为验证其真实情况,做上述两类样本的散点图如下图所示. 110 | 111 |
112 | <img src="1.png">
114 | 
115 | 利用程序对$\omega_2$和$\omega_4$分类,迭代得到权向量$a = (0.28705495, 0.25649182, 2.0037991)^T$.此时的误差是
117 | 
118 | ```Python
119 | [3.67685249e-05 -9.97335486e-04 2.40112701e-05 3.04323559e-05
120 | 2.69355530e-05 1.21333562e-05 1.25869927e-05 8.90828697e-06
121 | 3.83537956e-05 1.91839467e-05 4.01030966e-06 -3.09893675e-04
122 | 6.96252457e-06 6.95894683e-06 4.96249797e-06 -5.49458729e-04
123 | 7.24029286e-06 1.32441658e-05 1.62424168e-05 1.17098451e-05]
124 | ```
125 | 
126 | 分类情况如下图所示.
127 | 
128 | <img src="2.png">
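
注: 上述代码中用于判定线性不可分的条件 `if any(e) < 0 and any(e) > 0: break` 并不会按预期触发——内置 `any(e)` 返回布尔值, 它与0比较后整个条件恒为假, 代码实际是靠迭代次数上限退出的。按前述终止准则(e各分量非正且不全为零即线性不可分), 一个可行的逐元素写法如下(示意, 非原仓库代码); 同理, main.py 中的 `if a != None:` 对NumPy数组会触发逐元素比较, 宜写作 `if a is not None:`。

```Python
# 线性不可分判定: e的各分量均非正, 且存在严格为负的分量
if np.all(e <= 0) and np.any(e < 0):
    break
```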
130 | 131 | 132 | -------------------------------------------------------------------------------- /problem_4/data.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_4/data.png -------------------------------------------------------------------------------- /problem_4/main.py: -------------------------------------------------------------------------------- 1 | """ 2 | @leofansq 3 | https://github.com/leofansq 4 | """ 5 | import numpy as np 6 | import copy 7 | 8 | def samples_trans(w1, w2): 9 | """ 10 | 规范化增广样本 11 | Parameters: 12 | w1: 类1样本 13 | w2: 类2样本 14 | Return: 15 | w: 规范化增广样本 16 | """ 17 | # 复制样本,防止后续操作改变原始样本 18 | w_1 = copy.deepcopy(w1) 19 | w_2 = copy.deepcopy(w2) 20 | 21 | # 增广 22 | for i in w_1: i.append(1) 23 | for i in w_2: i.append(1) 24 | 25 | # 规范化 26 | w_1 = np.array(w_1) 27 | w_2 = np.array(w_2) 28 | w_2 = -w_2 29 | w = np.concatenate([w_1, w_2]) 30 | 31 | return w 32 | 33 | def HK_algorithm(w1, w2): 34 | """ 35 | Ho-Kashyap Algorithm 36 | Parameters: 37 | w1: 类1样本 38 | w2: 类2样本 39 | Return: 40 | a, b, k 41 | """ 42 | # 规范化增广样本 43 | w = samples_trans(w1, w2) 44 | 45 | # 初始化 46 | a = np.zeros_like(w[1]) 47 | b = np.zeros(w.shape[0]) + 0.5 48 | yita = 0.5 49 | th_b = np.zeros(w.shape[0]) + 1e-3 50 | th_k = 10000 51 | k = 1 52 | 53 | # 迭代 54 | while k <= th_k: 55 | e = np.matmul(w, a) - b 56 | e_p = 0.5 * (e + np.abs(e)) 57 | b += 2 * (yita) * e_p 58 | a = np.matmul(np.matmul(np.linalg.inv(np.matmul(w.T, w)), w.T), b) 59 | k += 1 60 | 61 | if any(e) < 0 and any(e) > 0: break 62 | if all(np.abs(e) <= th_b): return a, e 63 | 64 | print ("No solution found !") 65 | return None, None 66 | 67 | def show_result(w1, w2, a): 68 | """ 69 | 可视化结果 70 | Parameters: 71 | w1: 类1样本点 72 | w2: 类2样本点 73 | a: 权向量 74 | """ 75 | import matplotlib.pyplot as plt 76 | # 可视化类1样本点 77 | w_1 = np.array(w1) 78 | x = w_1[:, 0] 79 | y = w_1[:, 1] 80 | plt.scatter(x, y, marker = '.',color = 'red') 81 | # 可视化类2样本点 82 | w_2 = np.array(w2) 83 | x = w_2[:, 0] 84 | y = w_2[:, 1] 85 | plt.scatter(x, y, marker = '.',color = 'blue') 86 | # 可视化判别面 87 | if a != None: 88 | x = np.arange(-10, 10, 0.1) 89 | y = -a[0]/a[1]*x - a[2]/a[1] 90 | plt.plot(x, y) 91 | 92 | plt.xlabel('x_1') 93 | plt.ylabel('x_2') 94 | plt.title('Classfication Result') 95 | plt.show() 96 | 97 | 98 | if __name__ == "__main__": 99 | w_1 = [[0.1, 1.1], [6.8, 7.1], [-3.5, -4.1], [2.0, 2.7], [4.1, 2.8], [3.1, 5.0], [-0.8, -1.3], [0.9, 1.2], [5.0, 6.4], [3.9, 4.0]] 100 | w_2 = [[7.1, 4.2], [-1.4, -4.3], [4.5, 0.0], [6.3, 1.6], [4.2, 1.9], [1.4, -3.2], [2.4, -4.0], [2.5, -6.1], [8.4, 3.7], [4.1, -2.2]] 101 | w_3 = [[-3.0, -2.9], [0.5, 8.7], [2.9, 2.1], [-0.1, 5.2], [-4.0, 2.2], [-1.3, 3.7], [-3.4, 6.2], [-4.1, 3.4], [-5.1, 1.6], [1.9, 5.1]] 102 | w_4 = [[-2.0, -8.4], [-8.9, 0.2], [-4.2, -7.7], [-8.5, -3.2], [-6.7, -4.0], [-0.5, -9.2], [-5.3, -6.7], [-8.7, -6.4], [-7.1, -9.7], [-8.0, -6.3]] 103 | # (a) 104 | a, b = HK_algorithm(w_1, w_3) 105 | print (a, b) 106 | show_result(w_1, w_3, a) 107 | # (b) 108 | a, b = HK_algorithm(w_2, w_4) 109 | print (a, b) 110 | show_result(w_2, w_4, a) 111 | 112 | -------------------------------------------------------------------------------- /problem_5/README.md: -------------------------------------------------------------------------------- 1 | # Problem 5 2 | ## 1. 问题描述 3 | 4 | 请写一个程序,实现 MSE 多类扩展方法。每一类用前 8 个样本来构造分类器,用后两个样本作测试。请给出你的正确率. 5 | 6 |
7 | <img src="data.png">
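
补充: 下文的MSE多类扩展即最小平方误差准则的闭式解。将各类增广样本按列排成$W$, 对应的one-hot标签按列排成$Y$, 则权向量矩阵为

$$
a = (WW^T)^{-1}WY^T = W^+Y^T
$$

判决时取$\arg\max_i a_i^Tx$, 即代码中的 `np.argmax(np.matmul(a.T, j))`。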
9 | 10 | ## 2. 实现思路 11 | 12 | * 根据样本类别建立Label矩阵Y 13 | * 基于MSE多类扩展方法计算权向量$a = W^+ Y^T$ 14 | 15 | ## 3. Python代码 16 | ### 3.1 MSE 多类扩展方法 17 | ```Python 18 | def MSE_multi(wi): 19 | """ 20 | MSE 多类扩展训练 21 | Parameters: 22 | wi : 由多类样本构成的列表[[类1样本],[类2样本],...] 23 | Return: 24 | a: 权重矩阵 25 | """ 26 | # 增广 & 构建label矩阵y 27 | w_i = copy.deepcopy(wi) 28 | w = [] 29 | y = np.zeros((len(w_i), len(w_i)*len(w_i[0]))) 30 | for idx, i in enumerate(w_i): 31 | for j in i: 32 | j.append(1) 33 | w.append(j) 34 | y[idx, idx*len(w_i[0]):(idx+1)*len(w_i[0])] = 1 35 | w = np.array(w).T 36 | 37 | # 计算权向量矩阵a 38 | a = np.matmul(np.matmul(np.linalg.inv(np.matmul(w, w.T)), w), y.T) 39 | 40 | return a 41 | ``` 42 | 43 | ### 3.2 测试函数,计算正确率 44 | ```Python 45 | def test(w_test, a): 46 | """ 47 | 测试并计算错误率. 48 | Parameters: 49 | w_test: 测试样本集 50 | a : 权向量矩阵 51 | Return: 52 | f_ratio: 错误率 53 | """ 54 | w_t = copy.deepcopy(w_test) 55 | f_cnt = 0 56 | for idx_i, i in enumerate(w_t): 57 | for idx_j, j in enumerate(i): 58 | j.append(1) 59 | j = np.array(j) 60 | if np.argmax(np.matmul(a.T, j)) != idx_i: f_cnt += 1 61 | 62 | f_ratio = f_cnt / ((idx_i+1)*(idx_j+1)) 63 | 64 | return f_ratio 65 | ``` 66 | 67 | ### 3.3 求解 68 | ```Python 69 | w_1 = [[0.1, 1.1], [6.8, 7.1], [-3.5, -4.1], [2.0, 2.7], [4.1, 2.8], 70 | [3.1, 5.0], [-0.8, -1.3], [0.9, 1.2], [5.0, 6.4], [3.9, 4.0]] 71 | w_2 = [[7.1, 4.2], [-1.4, -4.3], [4.5, 0.0], [6.3, 1.6], [4.2, 1.9], 72 | [1.4, -3.2], [2.4, -4.0], [2.5, -6.1], [8.4, 3.7], [4.1, -2.2]] 73 | w_3 = [[-3.0, -2.9], [0.5, 8.7], [2.9, 2.1], [-0.1, 5.2], [-4.0, 2.2], 74 | [-1.3, 3.7], [-3.4, 6.2], [-4.1, 3.4], [-5.1, 1.6], [1.9, 5.1]] 75 | w_4 = [[-2.0, -8.4], [-8.9, 0.2], [-4.2, -7.7], [-8.5, -3.2], [-6.7, -4.0], 76 | [-0.5, -9.2], [-5.3, -6.7], [-8.7, -6.4], [-7.1, -9.7], [-8.0, -6.3]] 77 | 78 | wi = [w_1[:8], w_2[:8], w_3[:8], w_4[:8]] 79 | a = MSE_multi(wi) 80 | print (a) 81 | 82 | w_test = [w_1[8:], w_2[8:], w_3[8:], w_4[8:]] 83 | f_ratio = test(w_test, a) 84 | print (f_ratio) 85 | ``` 86 | 87 | ## 4. 结果与讨论 88 | 89 | 基于MSE多类扩展方法,利用每一类的前8个样本,计算得到权向量a为 90 | 91 | $$\left[ 92 | \begin{matrix} 93 | 0.02049668 & 0.06810971 & -0.04087307 & -0.04773332 \\ 94 | 0.01626151 & -0.03603827 & 0.05969134 & -0.03991458 \\ 95 | 0.26747287 & 0.27372075 & 0.25027714 & 0.20852924 96 | \end{matrix} 97 | \right]$$ 98 | 99 | 经测试, 上述权向量可以将测试样本全部正确分类,错误率为0. 100 | 101 | 102 | -------------------------------------------------------------------------------- /problem_5/data.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_5/data.png -------------------------------------------------------------------------------- /problem_5/main.py: -------------------------------------------------------------------------------- 1 | """ 2 | @leofansq 3 | https://github.com/leofansq 4 | """ 5 | import numpy as np 6 | import copy 7 | 8 | def MSE_multi(wi): 9 | """ 10 | MSE 多类扩展训练 11 | Parameters: 12 | wi : 由多类样本构成的列表[[类1样本],[类2样本],...] 
13 | Return: 14 | a: 权重矩阵 15 | """ 16 | # 增广 & 构建label矩阵y 17 | w_i = copy.deepcopy(wi) 18 | w = [] 19 | y = np.zeros((len(w_i), len(w_i)*len(w_i[0]))) 20 | for idx, i in enumerate(w_i): 21 | for j in i: 22 | j.append(1) 23 | w.append(j) 24 | y[idx, idx*len(w_i[0]):(idx+1)*len(w_i[0])] = 1 25 | w = np.array(w).T 26 | 27 | # 计算权向量矩阵a 28 | a = np.matmul(np.matmul(np.linalg.inv(np.matmul(w, w.T)), w), y.T) 29 | 30 | return a 31 | 32 | def test(w_test, a): 33 | """ 34 | 测试并计算错误率. 35 | Parameters: 36 | w_test: 测试样本集 37 | a : 权向量矩阵 38 | Return: 39 | f_ratio: 错误率 40 | """ 41 | w_t = copy.deepcopy(w_test) 42 | f_cnt = 0 43 | for idx_i, i in enumerate(w_t): 44 | for idx_j, j in enumerate(i): 45 | j.append(1) 46 | j = np.array(j) 47 | if np.argmax(np.matmul(a.T, j)) != idx_i: f_cnt += 1 48 | 49 | f_ratio = f_cnt / ((idx_i+1)*(idx_j+1)) 50 | 51 | return f_ratio 52 | 53 | 54 | 55 | 56 | if __name__ == "__main__": 57 | w_1 = [[0.1, 1.1], [6.8, 7.1], [-3.5, -4.1], [2.0, 2.7], [4.1, 2.8], [3.1, 5.0], [-0.8, -1.3], [0.9, 1.2], [5.0, 6.4], [3.9, 4.0]] 58 | w_2 = [[7.1, 4.2], [-1.4, -4.3], [4.5, 0.0], [6.3, 1.6], [4.2, 1.9], [1.4, -3.2], [2.4, -4.0], [2.5, -6.1], [8.4, 3.7], [4.1, -2.2]] 59 | w_3 = [[-3.0, -2.9], [0.5, 8.7], [2.9, 2.1], [-0.1, 5.2], [-4.0, 2.2], [-1.3, 3.7], [-3.4, 6.2], [-4.1, 3.4], [-5.1, 1.6], [1.9, 5.1]] 60 | w_4 = [[-2.0, -8.4], [-8.9, 0.2], [-4.2, -7.7], [-8.5, -3.2], [-6.7, -4.0], [-0.5, -9.2], [-5.3, -6.7], [-8.7, -6.4], [-7.1, -9.7], [-8.0, -6.3]] 61 | # Train 62 | wi = [w_1[:8], w_2[:8], w_3[:8], w_4[:8]] 63 | a = MSE_multi(wi) 64 | print (a) 65 | # Test 66 | w_test = [w_1[8:], w_2[8:], w_3[8:], w_4[8:]] 67 | f_ratio = test(w_test, a) 68 | print (f_ratio) 69 | 70 | 71 | -------------------------------------------------------------------------------- /problem_6/README.md: -------------------------------------------------------------------------------- 1 | # Problem 6 2 | ## 1. 问题描述 3 | 4 | 本题使用的数据如下: 5 | 6 | 第一类 10 个样本(三维空间): 7 | 8 | [1.58, 2.32, -5.8], [0.67, 1.58, -4.78], [1.04, 1.01, -3.63], [-1.49, 2.18, -3.39], [-0.41, 1.21, -4.73], 9 | [1.39, 3.16, 2.87], [1.20, 1.40, -1.89], [-0.92, 1.44, -3,22], [0.45, 1.33, -4.38], [-0.76, 0.84, -1.96] 10 | 11 | 第二类 10 个样本(三维空间): 12 | 13 | [0.21, 0.03, -2.21], [0.37, 0.28, -1.8], [0.18, 1.22, 0.16], [-0.24, 0.93, -1.01], [-1.18, 0.39, -0.39], 14 | [0.74, 0.96, -1.16], [-0.38, 1.94, -0.48], [0.02, 0.72, -0.17], [ 0.44, 1.31, -0.14], [0.46, 1.49, 0.68] 15 | 16 | 第三类 10 个样本(三维空间): 17 | 18 | [-1.54, 1.17, 0.64], [5.41, 3.45, -1.33], [1.55, 0.99, 2.69], [1.86, 3.19, 1.51], [1.68, 1.79, -0.87], 19 | [3.51, -0.22, -1.39], [1.40, -0.44, -0.92], [0.44, 0.83, 1.97], [ 0.25, 0.68, -0.99], [0.66, -0.45, 0.08] 20 | 21 | * 请编写两个通用的三层前向神经网络反向传播算法程序,一个采用批量方式更新权重,另一个采用单样本方式更新权重。其中,隐含层结点的激励函数采用双曲正切函数,输出层的激励函数采用 sigmoid 函数。目标函数采用平方误差准则函数。 22 | 23 | * 请利用上面的数据验证你写的程序,分析如下几点: 24 | * 隐含层不同结点数目对训练精度的影响; 25 | * 观察不同的梯度更新步长对训练的影响,并给出一些描述或解释; 26 | * 在网络结构固定的情况下,绘制出目标函数随着迭代步数增加的变化曲线。 27 | 28 | ## 2. 实现思路 29 | 30 | 为实现题目中所要求的功能, 需要实现以下子功能: 31 | 32 | * 根据题目所提供的输入数据,生成相应的Label, 形成最终的训练数据; 33 | 34 | * 初始化三层网络, 其中隐含层结点个数可改变; 35 | 36 | * 网络训练: 包括前向传播和反向传播. 包含两种权重矩阵更新方式, 单样本更新方式是指每次得到基于单个样本的权重更新矩阵后, 立即更新权重矩阵; 批量方式更新是指将所有单次得到更新矩阵累加, 在批量数据全部运算完毕后一起更新权重矩阵. 37 | 38 | 因此, 代码实现时主要分为两部分: 训练数据生成函数和三层网络类. 后者包含网络初始化和训练所需的各个函数. 39 | 40 | ## 3. Python代码 41 | ### 3.1 训练数据生成函数 42 | ```Python 43 | import numpy as np 44 | import copy 45 | def gen_train_data(data_input): 46 | """ 47 | 根据输入数据, 生成相应的label, 形成训练数据 48 | Parameter: 49 | data_input: 输入数据列表 [[类1数据], [类2数据], [类3数据], ...] 
50 | Return: 51 | train_data: 训练用数据列表 [[数据1], [数据2], ...] 52 | train_label: 训练用Label列表 [[数据1对应Label], [数据2对应Label], ...] 53 | """ 54 | train_data = [] 55 | train_label = [] 56 | for idx, i in enumerate(data_input): 57 | for j in i: 58 | # 数据列表 59 | data = np.array(j) 60 | train_data.append(data) 61 | # Label列表: 对应类别为1, 其余为0 62 | label = np.zeros_like(data) 63 | label[idx] = 1 64 | train_label.append(label) 65 | 66 | return train_data, train_label 67 | ``` 68 | 69 | ### 3.2 三层网络类 70 | ```Python 71 | class net: 72 | """ 73 | 三层网络类 74 | """ 75 | def __init__(self, train_data, train_label, h_num): 76 | """ 77 | 网络初始化 78 | Parameters: 79 | train_data: 训练用数据列表 80 | train_label: 训练用Label列表 81 | h_num: 隐含层结点数 82 | """ 83 | # 初始化数据 84 | self.train_data = train_data 85 | self.train_label = train_label 86 | self.h_num = h_num 87 | # 随机初始化权重矩阵 88 | self.w_ih = np.random.rand(train_data[0].shape[0], h_num) 89 | self.w_hj = np.random.rand(h_num, train_label[0].shape[0]) 90 | 91 | def tanh(self, data): 92 | """ 93 | tanh函数 94 | """ 95 | return (np.exp(data) - np.exp(-data)) / (np.exp(data) + np.exp(-data)) 96 | 97 | def sigmoid(self, data): 98 | """ 99 | Sigmoid函数 100 | """ 101 | return 1 / (1 + np.exp(-data)) 102 | 103 | def forward(self, data): 104 | """ 105 | 前向传播 106 | Parameter: 107 | data: 单个样本输入数据 108 | Return: 109 | z_j: 单个输入数据对应的网络输出 110 | y_h: 对应的隐含层输出, 用于后续反向传播时权重更新矩阵的计算 111 | """ 112 | # 计算隐含层输出 113 | net_h = np.matmul(data.T, self.w_ih) 114 | y_h = self. tanh(net_h) 115 | # 计算输出层输出 116 | net_j = np.matmul(y_h.T, self.w_hj) 117 | z_j = self.sigmoid(net_j) 118 | 119 | return z_j, y_h 120 | 121 | def backward(self, z, label, eta, y_h, x_i): 122 | """ 123 | 反向传播 124 | Parameters: 125 | z: 前向传播计算的网络输出 126 | label: 对应的Label 127 | eta: 学习率 128 | y_h: 对应的隐含层输出 129 | x_i: 对应的输入数据 130 | Return: 131 | delta_w_hj: 隐含层-输出层权重更新矩阵 132 | delta_w_ih: 输入层-隐含层权重更新矩阵 133 | error: 样本输出误差, 用于后续可视化 134 | """ 135 | # 矩阵维度整理 136 | z = np.reshape(z, (z.shape[0], 1)) 137 | label = np.reshape(label, (label.shape[0], 1)) 138 | y_h = np.reshape(y_h, (y_h.shape[0], 1)) 139 | x_i = np.reshape(x_i, (x_i.shape[0], 1)) 140 | # 计算输出误差 141 | error = np.matmul((label-z).T, (label-z))[0][0] 142 | # 计算隐含层-输出层权重更新矩阵 143 | error_j = (label - z) * z * (1-z) 144 | delta_w_hj = eta * np.matmul(y_h, error_j.T) 145 | # 计算输入层-隐含层权重更新矩阵 146 | error_h = np.matmul(((label - z) * z * (1-z)).T, self.w_hj.T).T * (1-y_h**2) 147 | delta_w_ih = eta * np.matmul(x_i, error_h.T) 148 | 149 | return delta_w_hj, delta_w_ih, error 150 | 151 | def train(self, bk_mode, eta, epoch_num): 152 | """ 153 | 网络训练 154 | Parameters: 155 | bk_mode: 反向传播方式('single' or 'batch') 156 | eta: 学习率 157 | epoch_num: 全部训练数据迭代次数 158 | """ 159 | # 单样本更新 160 | if bk_mode == 'single': 161 | E = [] 162 | for _ in range(epoch_num): 163 | e = [] 164 | for idx, x_i in enumerate(self.train_data): 165 | # 前向传播 166 | z, y_h = self.forward(x_i) 167 | # 反向传播 168 | delta_w_hj, delta_w_ih, error = self.backward(z, self.train_label[idx], eta, y_h, x_i) 169 | # 权重矩阵更新 170 | self.w_hj += delta_w_hj 171 | self.w_ih += delta_w_ih 172 | 173 | e.append(error) 174 | E.append(np.mean(e)) 175 | 176 | # 批次更新 177 | if bk_mode == 'batch': 178 | E = [] 179 | for _ in range(epoch_num): 180 | e = [] 181 | Delta_w_hj = 0 182 | Delta_w_ih = 0 183 | for idx, x_i in enumerate(self.train_data): 184 | # 前向传播 185 | z, y_h = self.forward(x_i) 186 | # 反向传播 187 | delta_w_hj, delta_w_ih, error = self.backward(z, self.train_label[idx], eta, y_h, x_i) 188 | # 更新权重矩阵累加 189 | Delta_w_hj += delta_w_hj 190 | Delta_w_ih += 
delta_w_ih 191 | 192 | e.append(error) 193 | # 权重矩阵批次更新 194 | self.w_hj += Delta_w_hj 195 | self.w_ih += Delta_w_ih 196 | E.append(np.mean(e)) 197 | 198 | # 可视化迭代优化过程 199 | import matplotlib.pyplot as plt 200 | plt.plot(E) 201 | plt.show() 202 | ``` 203 | 204 | ### 3.3 实验主程序 205 | ```Python 206 | # 输入数据 207 | data_1 = [[1.58, 2.32, -5.8], [0.67, 1.58, -4.78], [1.04, 1.01, -3.63], 208 | [-1.49, 2.18, -3.39], [-0.41, 1.21, -4.73], [1.39, 3.16, 2.87], 209 | [1.20, 1.40, -1.89], [-0.92, 1.44, -3.22], [0.45, 1.33, -4.38], 210 | [-0.76, 0.84, -1.96]] 211 | data_2 = [[0.21, 0.03, -2.21], [0.37, 0.28, -1.8], [0.18, 1.22, 0.16], 212 | [-0.24, 0.93, -1.01], [-1.18, 0.39, -0.39], [0.74, 0.96, -1.16], 213 | [-0.38, 1.94, -0.48], [0.02, 0.72, -0.17], [0.44, 1.31, -0.14], 214 | [0.46, 1.49, 0.68]] 215 | data_3 = [[-1.54, 1.17, 0.64], [5.41, 3.45, -1.33], [1.55, 0.99, 2.69], 216 | [1.86, 3.19, 1.51], [1.68, 1.79, -0.87], [3.51, -0.22, -1.39], 217 | [1.40, -0.44, -0.92], [0.44, 0.83, 1.97], [0.25, 0.68, -0.99], 218 | [0.66, -0.45, 0.08]] 219 | # 生成训练数据 220 | train_data, train_label = gen_train_data([data_1, data_2, data_3]) 221 | # 初始化网络 222 | n = net(train_data, train_label, h_num=5) 223 | # 网络训练 224 | n.train(bk_mode='batch', eta=0.1, epoch_num=100) 225 | ``` 226 | 227 | ## 4. 结果与讨论 228 | 229 | ### 4.1 隐含层不同结点数目对训练精度的影响 230 | 231 | 实验时,保持网络其他参数不变, 梯度更新步长$\eta=0.1$, 采用单样本更新方式, 改变隐含层结点个数:分别为3,9,15. 不同隐含层结点个数下的网络输出误差随迭代次数增加的变化曲线如下图所示. 232 | 233 |
234 | <img src="h_num.png">
236 | 237 | 可以看到, 当迭代次数足够多且保持不变的情况下, 隐含层结点越多, 最终的训练误差越小, 训练精度越高. 而当迭代次数较少时, 隐含层结点数量的增多导致网络训练参数的增多, 较难训练, 因此结点数越多精度越低. 238 | 239 | ### 4.2 观察不同的梯度更新步长对训练的影响,并给出一些描述或解释; 240 | 241 | 实验时, 保持网络结构不变, 隐含层包含10个结点, 改变梯度更新步长$\eta$依次为0.1, 0.4, 0.8, 不同更新步长下的网络输出误差随迭代次数增加的变化曲线如下图所示. 242 | 243 |
244 | 245 |
246 | 247 | 可以看到, 梯度更新步长的增加可以加快误差的下降, 在迭代次数相同的情况下, 梯度更新步长一定程度上的增大, 可以提升训练精度. 此现象是容易理解的, 由于更新步长的增大, 权重矩阵每次更新向最优矩阵迈出的步伐越大, 会以更快的速度接近最优矩阵. 248 | 249 | 但同时也可以发现, 当梯度更新步长过大时, 网络训练变得不稳定, 会出现震荡的现象. 此现象也是容易理解的, 过大的更新步长可能导致权重矩阵更新时越过最优权重, 出现矫枉过正的情况, 进而上下往复调整. 250 | 251 | ### 4.3 在网络结构固定的情况下,绘制出目标函数随着迭代步数增加的变化曲线 252 | 253 | 其实, 目标函数随着迭代步数增加的变化曲线在上两问中就已展示. 本题中以隐含层结点为15个, 单样本更新方式, 更新步长$\eta = 0.6$, 迭代300次的结果为例, 如下图所示. 254 | 255 |
256 | 257 |
258 | 259 | 此时, 训练过程较为平稳, 曲线较为光滑, 最终的训练误差在0.05左右. 260 | 261 | 接着,对比相同参数下,采用批量更新方式的情况, 如下图所示. 262 | 263 |
264 | 265 |
266 | 267 | 可以看到, 相同参数下, 批量更新方式下的曲线是震荡下降的. 若希望得到平稳的训练过程, 则需要适量减小更新步长. 268 | -------------------------------------------------------------------------------- /problem_6/eta.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_6/eta.png -------------------------------------------------------------------------------- /problem_6/h_num.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_6/h_num.png -------------------------------------------------------------------------------- /problem_6/iteration_batch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_6/iteration_batch.png -------------------------------------------------------------------------------- /problem_6/iteration_single.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_6/iteration_single.png -------------------------------------------------------------------------------- /problem_6/main.py: -------------------------------------------------------------------------------- 1 | """ 2 | @leofansq 3 | https://github.com/leofansq 4 | """ 5 | import numpy as np 6 | import matplotlib.pyplot as plt 7 | 8 | def gen_train_data(data_input): 9 | """ 10 | 根据输入数据, 生成相应的label, 形成训练数据 11 | Parameter: 12 | data_input: 输入数据列表 [[类1数据], [类2数据], [类3数据], ...] 13 | Return: 14 | train_data: 训练用数据列表 [[数据1], [数据2], ...] 15 | train_label: 训练用Label列表 [[数据1对应Label], [数据2对应Label], ...] 16 | """ 17 | train_data = [] 18 | train_label = [] 19 | for idx, i in enumerate(data_input): 20 | for j in i: 21 | # 数据列表 22 | data = np.array(j) 23 | train_data.append(data) 24 | # Label列表: 对应类别为1, 其余为0 25 | label = np.zeros_like(data) 26 | label[idx] = 1 27 | train_label.append(label) 28 | 29 | return train_data, train_label 30 | 31 | class net: 32 | """ 33 | 三层网络类 34 | """ 35 | def __init__(self, train_data, train_label, h_num): 36 | """ 37 | 网络初始化 38 | Parameters: 39 | train_data: 训练用数据列表 40 | train_label: 训练用Label列表 41 | h_num: 隐含层结点数 42 | """ 43 | # 初始化数据 44 | self.train_data = train_data 45 | self.train_label = train_label 46 | self.h_num = h_num 47 | # 随机初始化权重矩阵 48 | self.w_ih = np.random.rand(train_data[0].shape[0], h_num) 49 | self.w_hj = np.random.rand(h_num, train_label[0].shape[0]) 50 | 51 | def tanh(self, data): 52 | """ 53 | tanh函数 54 | """ 55 | return (np.exp(data) - np.exp(-data)) / (np.exp(data) + np.exp(-data)) 56 | 57 | def sigmoid(self, data): 58 | """ 59 | Sigmoid函数 60 | """ 61 | return 1 / (1 + np.exp(-data)) 62 | 63 | def forward(self, data): 64 | """ 65 | 前向传播 66 | Parameter: 67 | data: 单个样本输入数据 68 | Return: 69 | z_j: 单个输入数据对应的网络输出 70 | y_h: 对应的隐含层输出, 用于后续反向传播时权重更新矩阵的计算 71 | """ 72 | # 计算隐含层输出 73 | net_h = np.matmul(data.T, self.w_ih) 74 | y_h = self. 
tanh(net_h) 75 | # 计算输出层输出 76 | net_j = np.matmul(y_h.T, self.w_hj) 77 | z_j = self.sigmoid(net_j) 78 | 79 | return z_j, y_h 80 | 81 | def backward(self, z, label, eta, y_h, x_i): 82 | """ 83 | 反向传播 84 | Parameters: 85 | z: 前向传播计算的网络输出 86 | label: 对应的Label 87 | eta: 学习率 88 | y_h: 对应的隐含层输出 89 | x_i: 对应的输入数据 90 | Return: 91 | delta_w_hj: 隐含层-输出层权重更新矩阵 92 | delta_w_ih: 输入层-隐含层权重更新矩阵 93 | error: 样本输出误差, 用于后续可视化 94 | """ 95 | # 矩阵维度整理 96 | z = np.reshape(z, (z.shape[0], 1)) 97 | label = np.reshape(label, (label.shape[0], 1)) 98 | y_h = np.reshape(y_h, (y_h.shape[0], 1)) 99 | x_i = np.reshape(x_i, (x_i.shape[0], 1)) 100 | # 计算输出误差 101 | error = np.matmul((label-z).T, (label-z))[0][0] 102 | # 计算隐含层-输出层权重更新矩阵 103 | error_j = (label - z) * z * (1-z) 104 | delta_w_hj = eta * np.matmul(y_h, error_j.T) 105 | # 计算输入层-隐含层权重更新矩阵 106 | error_h = np.matmul(((label - z) * z * (1-z)).T, self.w_hj.T).T * (1-y_h**2) 107 | delta_w_ih = eta * np.matmul(x_i, error_h.T) 108 | 109 | return delta_w_hj, delta_w_ih, error 110 | 111 | def train(self, bk_mode, eta, epoch_num): 112 | """ 113 | 网络训练 114 | Parameters: 115 | bk_mode: 反向传播方式('single' or 'batch') 116 | eta: 学习率 117 | epoch_num: 全部训练数据迭代次数 118 | """ 119 | # 单样本更新 120 | if bk_mode == 'single': 121 | E = [] 122 | for _ in range(epoch_num): 123 | e = [] 124 | for idx, x_i in enumerate(self.train_data): 125 | # 前向传播 126 | z, y_h = self.forward(x_i) 127 | # 反向传播 128 | delta_w_hj, delta_w_ih, error = self.backward(z, self.train_label[idx], eta, y_h, x_i) 129 | # 权重矩阵更新 130 | self.w_hj += delta_w_hj 131 | self.w_ih += delta_w_ih 132 | 133 | e.append(error) 134 | E.append(np.mean(e)) 135 | 136 | # 批次更新 137 | if bk_mode == 'batch': 138 | E = [] 139 | for _ in range(epoch_num): 140 | e = [] 141 | Delta_w_hj = 0 142 | Delta_w_ih = 0 143 | for idx, x_i in enumerate(self.train_data): 144 | # 前向传播 145 | z, y_h = self.forward(x_i) 146 | # 反向传播 147 | delta_w_hj, delta_w_ih, error = self.backward(z, self.train_label[idx], eta, y_h, x_i) 148 | # 更新权重矩阵累加 149 | Delta_w_hj += delta_w_hj 150 | Delta_w_ih += delta_w_ih 151 | 152 | e.append(error) 153 | # 权重矩阵批次更新 154 | self.w_hj += Delta_w_hj 155 | self.w_ih += Delta_w_ih 156 | E.append(np.mean(e)) 157 | 158 | # 可视化迭代优化过程 159 | # plt.plot(E) 160 | plt.plot(E, label="{}".format(eta)) 161 | 162 | 163 | if __name__ == "__main__": 164 | # 输入数据 165 | data_1 = [[1.58, 2.32, -5.8], [0.67, 1.58, -4.78], [1.04, 1.01, -3.63], 166 | [-1.49, 2.18, -3.39], [-0.41, 1.21, -4.73], [1.39, 3.16, 2.87], 167 | [1.20, 1.40, -1.89], [-0.92, 1.44, -3.22], [0.45, 1.33, -4.38], 168 | [-0.76, 0.84, -1.96]] 169 | data_2 = [[0.21, 0.03, -2.21], [0.37, 0.28, -1.8], [0.18, 1.22, 0.16], 170 | [-0.24, 0.93, -1.01], [-1.18, 0.39, -0.39], [0.74, 0.96, -1.16], 171 | [-0.38, 1.94, -0.48], [0.02, 0.72, -0.17], [0.44, 1.31, -0.14], 172 | [0.46, 1.49, 0.68]] 173 | data_3 = [[-1.54, 1.17, 0.64], [5.41, 3.45, -1.33], [1.55, 0.99, 2.69], 174 | [1.86, 3.19, 1.51], [1.68, 1.79, -0.87], [3.51, -0.22, -1.39], 175 | [1.40, -0.44, -0.92], [0.44, 0.83, 1.97], [0.25, 0.68, -0.99], 176 | [0.66, -0.45, 0.08]] 177 | # 生成训练数据 178 | train_data, train_label = gen_train_data([data_1, data_2, data_3]) 179 | # 初始化网络 180 | n = net(train_data, train_label, h_num=10) 181 | # 网络训练 182 | n.train(bk_mode='single', eta=0.1, epoch_num=100) 183 | 184 | # 初始化对比网络1 185 | n = net(train_data, train_label, h_num=10) 186 | # 对比网络1训练 187 | n.train(bk_mode='single', eta=0.4, epoch_num=100) 188 | 189 | # 初始化对比网络2 190 | n = net(train_data, train_label, h_num=10) 191 | # 对比网络2训练 192 | 
n.train(bk_mode='single', eta=0.8, epoch_num=100) 193 | 194 | plt.legend(bbox_to_anchor=(0., 1.02, 1., .102), loc=0, 195 | ncol=10, mode="expand", borderaxespad=0.) 196 | 197 | plt.show() 198 | -------------------------------------------------------------------------------- /problem_7/README.md: -------------------------------------------------------------------------------- 1 | # Problem 7 2 | ## 1. 问题描述 3 | 4 | 现有1000个二维空间的数据点, 其$\sigma=[1,0;0,1]$, $\mu_1=[1,-1],\mu_2=[5.5,-4.5], \mu_3=[1,4], \mu_4=[6,4.5], \mu_5=[9,0]$. 5 | 请完成如下工作: 6 | 7 | * 编写一个程序, 实现经典的K-means聚类算法; 8 | * 令聚类个数为5, 采用不同的初始值观察最后的聚类中心, 给出你所估计的聚类中心, 指出每个中心有多少个样本; 指出你所得到的聚类中心与对应的真实分布的均值之间的误差(对5个聚类, 给出均方误差即可). 9 | 10 | ## 2. 实现思路 11 | 12 | 为实现上述功能, 需实现以下部分和相应功能: 13 | 14 | * 数据生成函数: 由于实验平台为Python, 因此需根据原题中提供的Matlab代码在Python中实现相应功能. 包括: 根据已给的$\sigma$和$\mu$随机生成待聚类数据, 数据可视化和数据的储存. 15 | * K-means聚类函数: 基于经典K-means算法, 根据输入待聚类数据, 对输入的初始化类中心迭代更新, 最终实现聚类. 16 | 17 | ## 3. Python代码 18 | ### 3.1 数据生成函数 19 | ```Python 20 | def generate_data(save_path): 21 | """ 22 | 根据设定的Sigma和mu随机生成待聚类数据, 保存用于后续实验 23 | """ 24 | # 设置Sigma 25 | sigma = np.array([[1.0, 0.0], [0.0, 1.0]]) 26 | # 设置mu 27 | mu_1 = np.array([1.0, -1.0]) 28 | mu_2 = np.array([5.5, -4.5]) 29 | mu_3 = np.array([1.0, 4.0]) 30 | mu_4 = np.array([6.0, 4.5]) 31 | mu_5 = np.array([9.0, 0.0]) 32 | # 随机生成数据 33 | x_1 = np.random.multivariate_normal(mu_1, sigma, 200) 34 | x_2 = np.random.multivariate_normal(mu_2, sigma, 200) 35 | x_3 = np.random.multivariate_normal(mu_3, sigma, 200) 36 | x_4 = np.random.multivariate_normal(mu_4, sigma, 200) 37 | x_5 = np.random.multivariate_normal(mu_5, sigma, 200) 38 | 39 | x = np.concatenate([x_1, x_2], axis=0) 40 | x = np.concatenate([x, x_3], axis=0) 41 | x = np.concatenate([x, x_4], axis=0) 42 | x = np.concatenate([x, x_5], axis=0) 43 | # 数据可视化 44 | plt.scatter(x_1[:,0], x_1[:,1], marker = '.',color = 'red') 45 | plt.scatter(x_2[:,0], x_2[:,1], marker = '.',color = 'blue') 46 | plt.scatter(x_3[:,0], x_3[:,1], marker = '.',color = 'black') 47 | plt.scatter(x_4[:,0], x_4[:,1], marker = '.',color = 'green') 48 | plt.scatter(x_5[:,0], x_5[:,1], marker = '.',color = 'purple') 49 | 50 | plt.title('Data') 51 | plt.savefig('data.png') 52 | # 数据保存 53 | with open(save_path, 'wb') as f: 54 | pickle.dump(x, f, -1) 55 | ``` 56 | 57 | ### 3.2 K-means聚类函数 58 | ```Python 59 | def k_means(data, mu): 60 | """ 61 | K-means聚类 62 | Parameters: 63 | data: 待聚类数据(np.array) 64 | mu: 初始化聚类中心(np.array) 65 | Return: 66 | c: 聚类结果[[第一类数据], [第二类数据], ... , [第c类数据]] 67 | mu: 类中心结果[第一类类中心, 第二类类中心, ... 
, 第c类类中心] 68 | cnt: 迭代次数 69 | """ 70 | # 待聚类数据矩阵调整(复制矩阵使其从n*d变为n*c*d, 便于后续矩阵运算) 71 | data = np.tile(np.expand_dims(data, axis=1), (1,mu.shape[0],1)) 72 | # 初始化变量 73 | mu_temp = np.zeros_like(mu) # 保存前一次mu结果 74 | cnt = 0 75 | 76 | # 迭代更新类中心 77 | while np.sum(mu - mu_temp): 78 | mu_temp = mu 79 | cnt += 1 80 | label = np.zeros((data.shape[0]), dtype=np.uint8) 81 | # mu矩阵调整(复制矩阵使其从c*d变为n*c*d, 便于后续矩阵运算) 82 | mu = np.tile(np.expand_dims(mu, axis=0), (data.shape[0],1,1)) 83 | # 生成距离矩阵(n*c) 84 | dist = np.sum((data-mu)**2, axis=-1) 85 | # 初始化聚类结果 & 根据距离确定样本类别 86 | c = [] 87 | for _ in range(data.shape[1]): 88 | c.append([]) 89 | 90 | for idx, sample in enumerate(data): 91 | c[np.argmin(dist[idx])].append(sample[0]) 92 | label[idx] = np.argmin(dist[idx]) 93 | c = np.array(c) 94 | # 更新类中心 95 | mu = [] 96 | for i in c: mu.append(np.mean(i, axis=0)) 97 | mu = np.array(mu) 98 | 99 | return c, label, mu, cnt 100 | ``` 101 | 102 | ### 3.3 实验主程序 103 | ```Python 104 | # # 初次生成数据 105 | # generate_data('./data') 106 | 107 | # 加载数据 108 | with open('./data', 'rb') as f: 109 | data = pickle.load(f) 110 | 111 | # 类中心初始化 112 | mu_1 = np.array([0.5, -4.3]) 113 | mu_2 = np.array([3.8, -6.5]) 114 | mu_3 = np.array([-3.1, 6.4]) 115 | mu_4 = np.array([0.7, 5.5]) 116 | mu_5 = np.array([1.5, 7.8]) 117 | mu_rand = np.array([mu_1, mu_2, mu_3, mu_4, mu_5]) 118 | 119 | # K-means 聚类 120 | c, _, mu, cnt = k_means(data, mu_rand) 121 | 122 | # 聚类结果分析 & 可视化 123 | mu_gt = np.array([[1.0, -1.0], [5.5, -4.5], [1.0, 4.0], [6.0, 4.5], [9.0, 0.0]]) 124 | 125 | print ("共迭代了{}次".format(cnt)) 126 | E = 0 127 | color = ['red', 'blue', 'black', 'green', 'purple'] 128 | for idx, i in enumerate(c): 129 | i = np.array(i) 130 | e = np.matmul((mu[idx]-mu_gt[idx]).T, (mu[idx]-mu_gt[idx])) 131 | E += e 132 | print ("第{}类: 初始化类中心{}, 结果为{}, 样本数为{}, 聚类中心均方误差为{}".format(idx, mu_rand[idx], mu[idx], i.shape[0], e)) 133 | plt.scatter(i[:,0], i[:,1], marker = '.',color = color[idx]) 134 | plt.title('Data') 135 | plt.show() 136 | print ("聚类整体均方误差和为{}".format(E)) 137 | ``` 138 | 139 | ## 4. 结果与讨论 140 | 141 | 实验之前, 首先生成实验所需数据, 并存储. 其数据可视化结果如下图所示. 142 | 143 |
144 | <img src="data.png">
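
注: 上文 `k_means` 以 `while np.sum(mu - mu_temp):` 判断是否继续迭代, 但各分量差值的正负可能恰好相互抵消, 导致算法在未收敛时提前停止。更稳妥的是逐元素比较, 例如(示意, 非原仓库代码, 函数名为假设):

```Python
import numpy as np

def centers_changed(mu, mu_temp, tol=0.0):
    """逐元素判断两次迭代的类中心是否仍在变化 (tol>0 时允许数值容差)"""
    return not np.allclose(mu, mu_temp, atol=tol)
```

将循环条件替换为 `while centers_changed(mu, mu_temp):` 即可, 其余逻辑不变。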
146 | 147 | 利用实现的K-means对上述数据进行聚类. 多次尝试不同的初始值, 如: 148 | * $\mu=[[3.3, -2.1],[7.6, -3.2], [0.5, 7.2], [5.4, 6.3], [13.2, -1.3]]$ 149 | * $\mu=[[-2.5, -4.1],[-1.6, 1.2], [3.4, 5.1], [1.7, 9.1], [-2.6, -3.1]$ 150 | * $\mu=[[0.5, -4.3],[3.8, -6.5], [-3.1, 6.4], [0.7, 5.5], [1.5, 7.8]$等 151 | 152 | 经观察聚类结果发现, 在本问题中, 不同初始值只会影响聚类所需的迭代次数, 如对于第一种初始值需迭代5次, 对于第三种初始值需迭代17次. 最终的聚类中心均一致.聚类所得的中心分别为: 153 | 154 | $$ 155 | \mu_1 = [1.092081, -1.03069906]\\ 156 | \mu_2 = [5.48195172, -4.46507794]\\ 157 | \mu_3 = [0.93495792, 3.91569933]\\ 158 | \mu_4 = [5.95600064, 4.45909185]\\ 159 | \mu_5 = [9.02265324, 0.02529175] 160 | $$ 161 | 162 | 聚类结果的可视化结果如下图所示. 5个类别分别包含样本数为: 197, 201, 201, 204, 197. 163 | 164 |
165 | <img src="result.png">
167 | 168 | 通过与真实分类情况对比, 可以发现错分点如下图所示(黄框中). 169 | 170 |
171 | <img src="diff.png">
173 | 174 | 可以看到, 被错分点由于原理真实分类的类中心, 因此从离类中心距离来看, 确实距离被错归的类中心更近, 错分情况可以理解. 5个类别的均方误差分别约为(保留5位小数): 0.00942, 0.00155, 0.01134, 0.00361 和 0.00115. 总的聚类均方误差和约为0.02707. 175 | -------------------------------------------------------------------------------- /problem_7/data: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_7/data -------------------------------------------------------------------------------- /problem_7/data.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_7/data.png -------------------------------------------------------------------------------- /problem_7/diff.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_7/diff.png -------------------------------------------------------------------------------- /problem_7/main.py: -------------------------------------------------------------------------------- 1 | """ 2 | K均值聚类: generate_data(save_path)实验数据生成; k_means(data, mu)K均值聚类; 3 | 4 | @leofansq 5 | https://github.com/leofansq 6 | 7 | """ 8 | import numpy as np 9 | import matplotlib.pyplot as plt 10 | import pickle 11 | 12 | def generate_data(save_path): 13 | """ 14 | 根据设定的Sigma和mu随机生成待聚类数据, 保存用于后续实验 15 | """ 16 | # 设置Sigma 17 | sigma = np.array([[1.0, 0.0], [0.0, 1.0]]) 18 | # 设置mu 19 | mu_1 = np.array([1.0, -1.0]) 20 | mu_2 = np.array([5.5, -4.5]) 21 | mu_3 = np.array([1.0, 4.0]) 22 | mu_4 = np.array([6.0, 4.5]) 23 | mu_5 = np.array([9.0, 0.0]) 24 | # 随机生成数据 25 | x_1 = np.random.multivariate_normal(mu_1, sigma, 200) 26 | x_2 = np.random.multivariate_normal(mu_2, sigma, 200) 27 | x_3 = np.random.multivariate_normal(mu_3, sigma, 200) 28 | x_4 = np.random.multivariate_normal(mu_4, sigma, 200) 29 | x_5 = np.random.multivariate_normal(mu_5, sigma, 200) 30 | 31 | x = np.concatenate([x_1, x_2], axis=0) 32 | x = np.concatenate([x, x_3], axis=0) 33 | x = np.concatenate([x, x_4], axis=0) 34 | x = np.concatenate([x, x_5], axis=0) 35 | # 数据可视化 36 | plt.scatter(x_1[:,0], x_1[:,1], marker = '.',color = 'red') 37 | plt.scatter(x_2[:,0], x_2[:,1], marker = '.',color = 'blue') 38 | plt.scatter(x_3[:,0], x_3[:,1], marker = '.',color = 'black') 39 | plt.scatter(x_4[:,0], x_4[:,1], marker = '.',color = 'green') 40 | plt.scatter(x_5[:,0], x_5[:,1], marker = '.',color = 'purple') 41 | 42 | plt.title('Data') 43 | plt.savefig('data.png') 44 | # 数据保存 45 | with open(save_path, 'wb') as f: 46 | pickle.dump(x, f, -1) 47 | 48 | def k_means(data, mu): 49 | """ 50 | K-means聚类 51 | Parameters: 52 | data: 待聚类数据(np.array) 53 | mu: 初始化聚类中心(np.array) 54 | Return: 55 | c: 聚类结果[[第一类数据], [第二类数据], ... , [第c类数据]] 56 | mu: 类中心结果[第一类类中心, 第二类类中心, ... 
, 第c类类中心] 57 | cnt: 迭代次数 58 | """ 59 | # 待聚类数据矩阵调整(复制矩阵使其从n*d变为n*c*d, 便于后续矩阵运算) 60 | data = np.tile(np.expand_dims(data, axis=1), (1,mu.shape[0],1)) 61 | # 初始化变量 62 | mu_temp = np.zeros_like(mu) # 保存前一次mu结果 63 | cnt = 0 64 | 65 | # 迭代更新类中心 66 | while np.sum(mu - mu_temp): 67 | mu_temp = mu 68 | cnt += 1 69 | label = np.zeros((data.shape[0]), dtype=np.uint8) 70 | # mu矩阵调整(复制矩阵使其从c*d变为n*c*d, 便于后续矩阵运算) 71 | mu = np.tile(np.expand_dims(mu, axis=0), (data.shape[0],1,1)) 72 | # 生成距离矩阵(n*c) 73 | dist = np.sum((data-mu)**2, axis=-1) 74 | # 初始化聚类结果 & 根据距离确定样本类别 75 | c = [] 76 | for _ in range(data.shape[1]): 77 | c.append([]) 78 | 79 | for idx, sample in enumerate(data): 80 | c[np.argmin(dist[idx])].append(sample[0]) 81 | label[idx] = np.argmin(dist[idx]) 82 | c = np.array(c) 83 | # 更新类中心 84 | mu = [] 85 | for i in c: mu.append(np.mean(i, axis=0)) 86 | mu = np.array(mu) 87 | 88 | return c, label, mu, cnt 89 | 90 | if __name__ == "__main__": 91 | # # 初次生成数据 92 | # generate_data('./data') 93 | 94 | # 加载数据 95 | with open('./data', 'rb') as f: 96 | data = pickle.load(f) 97 | # 类中心初始化 98 | 99 | # mu_1 = np.array([3.3, -2.1]) 100 | # mu_2 = np.array([7.6, -3.2]) 101 | # mu_3 = np.array([0.5, 7.2]) 102 | # mu_4 = np.array([5.4, 6.3]) 103 | # mu_5 = np.array([13.2, -1.3]) 104 | 105 | mu_1 = np.array([0.5, -4.3]) 106 | mu_2 = np.array([3.8, -6.5]) 107 | mu_3 = np.array([-3.1, 6.4]) 108 | mu_4 = np.array([0.7, 5.5]) 109 | mu_5 = np.array([1.5, 7.8]) 110 | mu_rand = np.array([mu_1, mu_2, mu_3, mu_4, mu_5]) 111 | # K-means 聚类 112 | c, _, mu, cnt = k_means(data, mu_rand) 113 | # 聚类结果分析 & 可视化 114 | mu_gt = np.array([[1.0, -1.0], [5.5, -4.5], [1.0, 4.0], [6.0, 4.5], [9.0, 0.0]]) 115 | 116 | print ("共迭代了{}次".format(cnt)) 117 | 118 | # E = 0 119 | color = ['red', 'blue', 'black', 'green', 'purple'] 120 | for idx, i in enumerate(c): 121 | i = np.array(i) 122 | # e = np.matmul((mu[idx]-mu_gt[idx]).T, (mu[idx]-mu_gt[idx])) 123 | # E += e 124 | # print ("第{}类: 初始化类中心{}, 结果为{}, 样本数为{}, 聚类中心均方误差为{}".format(idx, mu_rand[idx], mu[idx], i.shape[0], e)) 125 | print ("第{}类: 初始化类中心{}, 结果为{}, 样本数为{}".format(idx, mu_rand[idx], mu[idx], i.shape[0])) 126 | plt.scatter(i[:,0], i[:,1], marker = '.',color = color[idx]) 127 | plt.title('Result') 128 | plt.show() 129 | 130 | # print ("聚类整体均方误差和为{}".format(E)) 131 | 132 | 133 | 134 | -------------------------------------------------------------------------------- /problem_7/result.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_7/result.png -------------------------------------------------------------------------------- /problem_8/README.md: -------------------------------------------------------------------------------- 1 | # Problem 8 2 | ## 1. 问题描述 3 | 4 | 关于谱聚类。有如下 200 个数据点,它们是通过两个半月形分布生成的。如图所示: 5 | 6 |
7 | <img src="data.png">
9 | 10 | * 请编写一个谱聚类算法,实现"Normalized Spectral Clustering—Algorithm 3 (Ng 算法)". 11 | * 设点对亲和性(即边权值)采用如下计算公式: 12 | 13 | $$ 14 | w_{ij} = e^{-\frac{||x_i-x_j||^2_2}{2\sigma^2}} 15 | $$ 16 | 17 | 同时,数据图采用 k-近邻方法来生成(即对每个数据点$x_i$,首先在所有样本中找出不包含$x_i$的 k 个最邻近的样本点,然后$x_i$与每个邻近样本点均有一条边相连,从而完成图构造)。 18 | 19 | 注意,为了保证亲和度矩阵 W 是对称矩阵,可以令$W=\frac{(W^{T} +W)}{2}$. 假设已知前 100 个点为一个聚类, 后 100 个点为一个聚类,请分析分别取不同的$\sigma$值和 k 值对聚类结果的影响。 20 | (本题可以给出关于聚类精度随着$\sigma$值和 k 值的变化曲线。在实验中,可以固定一个,变化另一个). 21 | 22 | ## 2. 实现思路 23 | 24 | 为实现上述功能, 需实现以下部分和相应功能: 25 | 26 | * 数据加载函数: 从指定txt文件中读取实验数据. 27 | * 图构造函数: 基于输入的数据和参数构造图, 图构造时首先计算每个样本点到其他样本点的距离, 利用K近邻方法生成图, 进而基于亲和性公式生成亲和性矩阵. 28 | * 谱聚类(Ng算法)函数: 基于输入的实验数据亲和性矩阵, 依次计算D, L 和 $L_sym$, 进而求取$L_sym$的最小的前c个特征值对应的特征向量, 归一化后作为样本新特征, 利用K-means实现最终聚类. 29 | * K-means聚类算法(上题中已实现) 30 | 31 | ## 3. Python代码 32 | ### 3.1 数据加载函数 33 | ```Python 34 | def load_data(file_name): 35 | """ 36 | 加载数据 37 | """ 38 | data = [] 39 | with open(file_name, 'r') as f: 40 | content = f.readlines() 41 | for i in content: 42 | i = i[:-1].split(" ") 43 | data.append([float(i[0]), float(i[1])]) 44 | data = np.array(data) 45 | 46 | return data 47 | ``` 48 | 49 | ### 3.2 图构造函数 50 | ```Python 51 | def generate_graph(data, k, theta): 52 | """ 53 | 构造图 54 | Parameter: 55 | data: 待聚类数据 56 | k: k近邻数 57 | theta: 亲和性参数 58 | Return: 59 | w: 亲和性矩阵 60 | """ 61 | # 构造data行列矩阵(n*n*d)以便后续矩阵运算: data_c每列相同 = data_r每行相同 = 样本数据 62 | data_c = np.tile(np.expand_dims(data.copy(), axis=1), (1,data.shape[0],1)) 63 | data_r = np.tile(np.expand_dims(data.copy(), axis=0), (data.shape[0],1,1)) 64 | 65 | # 生成Dist矩阵 66 | dist = np.sum((data_c - data_r)**2, axis=-1) 67 | 68 | # 生成W矩阵 69 | # 初始化w 70 | w = np.zeros_like(dist) 71 | for idx_sample, i in enumerate(dist): 72 | idx = np.arange(0, i.shape[0]) 73 | # 构造 距离-索引 序列, 将距离和索引一一对应 74 | i_idx = zip(i, idx) 75 | # 按照距离递增排序 76 | i_sorted = sorted(i_idx, key=lambda i_idx: i_idx[0]) 77 | # 生成w矩阵: 循环时排除自身距离为0的干扰 78 | for j in range(1,k+1): 79 | w[idx_sample, i_sorted[j][1]] = np.exp(-i_sorted[j][0]/(2*(theta**2))) 80 | # w调整:为保证w为对称矩阵 81 | w = (w.T + w)/2 82 | 83 | return w 84 | ``` 85 | 86 | ### 3.3 谱聚类(Ng算法)函数 87 | ```Python 88 | def ng_algo(W, c): 89 | """ 90 | Ng谱聚类算法 91 | Parameters: 92 | W: 亲和性矩阵 93 | c: 聚类类别数 94 | Return: 95 | label: 聚类结果label列表 [样本1类别, 样本2类别, ... 
, 样本n类别] 96 | """ 97 | # 计算D & D^(-1/2)矩阵: 为避免生成D后计算会出现分母为0的情况, 直接计算D^(-1/2) 98 | W_rowsum = np.sum(W, axis=1) 99 | D = np.diag(W_rowsum) 100 | # W_rowsum = 1/(np.sqrt(W_rowsum)) 101 | W_rowsum = W_rowsum**(-0.5) 102 | D_invsqrt = np.diag(W_rowsum) 103 | # 计算L矩阵 104 | L = D - W 105 | # 计算L_sym矩阵 106 | L_sym = np.matmul(np.matmul(D_invsqrt, L), D_invsqrt) 107 | # L_sym特征值 & 特征向量 108 | e_value, e_vector = np.linalg.eig(L_sym) 109 | e_vector = e_vector.T 110 | e = zip(e_value, e_vector) 111 | e_sorted = sorted(e, key=lambda e: e[0]) 112 | # 生成新特征 113 | new_feature = [] 114 | for i in range(c): 115 | new_feature.append(e_sorted[i][1]) 116 | new_feature = np.array(new_feature).T 117 | # 归一化新特征 118 | norm_feature = [] 119 | for i in new_feature: 120 | i = i/(np.sqrt(np.sum(i**2))+1e-10) 121 | norm_feature.append([i[0], i[1]]) 122 | norm_feature = np.array(norm_feature) 123 | # 对新特征做K-means 124 | mu = np.array([norm_feature[50], norm_feature[150]]) 125 | _, label, _, _ = k_means(norm_feature, mu) 126 | 127 | return label 128 | ``` 129 | 130 | ### 3.4 实验主函数 131 | ```Python 132 | # 加载数据 133 | data = load_data("./data.txt") 134 | # 构造图 135 | k = 5 136 | theta = 2 137 | w = generate_graph(data, k, theta) 138 | # 谱聚类 139 | c = 2 140 | label = ng_algo(w, c) 141 | # 可视化 142 | color = ['red', 'blue'] 143 | for idx, i in enumerate(data): 144 | i = np.array(i) 145 | plt.scatter(i[0], i[1], marker = '.',color = color[label[idx]]) 146 | plt.title('Result') 147 | plt.show() 148 | ``` 149 | 150 | ## 4. 结果与讨论 151 | 152 | 首先,测试实现的谱聚类算法的有效性. 当$k=5, \sigma=2$时, 分类完全正确, 其可视化结果如下图所示. 153 | 154 | <div align=center> 
155 | 156 |
157 | 158 | 然后, 探究不同的k值和$\sigma$值对聚类结果的影响. 实验时, 保持其他参数不变(固定K-means初始化值), 固定k和$\sigma$中的一个, 改变另一个, 观察聚类结果(精度)的变化. 159 | 160 | 当固定$\sigma=1$时, 如下图所示. 当$k \in (5,15)$时, 能达到$100\%$的聚类精度. 之后随着k值的增加, 聚类精度下降, 具体体现为在k=20左右时突降, 之后缓慢下降, 最终维持在0.72左右. 161 | 162 |
163 | 164 |
165 | 166 | 在探究聚类精度与亲和性参数$\sigma$之间关系时, 由于其关系在k值不同时表现不尽相同, 故分别在k = 5, 20和50的情况下进行实验. 实验时, $\sigma \in (0,2)$, 步长0.1. 结果如下图所示. 167 | 168 |
169 | 170 | 171 | 172 |
173 | 174 | 当固定$k=5$时,随$\sigma$变化, 聚类精度未发生变化, 恒为$100\%$. 且后续实验发现, 在$k \in (5,15)$, 均为上述现象; 175 | 当$k>15$时, 随$\sigma$增大, 聚类精度下降, 且随k值的增大, 下降速度越快. 176 | 177 | 有上述实验结果可以看出, 图的构造对谱聚类的最终结果影响很大, 只有选择合适的k近邻数和亲和性系数才能得到较好的谱聚类结果. -------------------------------------------------------------------------------- /problem_8/acc-k.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_8/acc-k.png -------------------------------------------------------------------------------- /problem_8/acc-sigma-k20.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_8/acc-sigma-k20.png -------------------------------------------------------------------------------- /problem_8/acc-sigma-k5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_8/acc-sigma-k5.png -------------------------------------------------------------------------------- /problem_8/acc-sigma-k50.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_8/acc-sigma-k50.png -------------------------------------------------------------------------------- /problem_8/data.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_8/data.png -------------------------------------------------------------------------------- /problem_8/data.txt: -------------------------------------------------------------------------------- 1 | -1.3046 -0.1606 2 | -1.4341 -0.3372 3 | -1.3475 -0.0421 4 | -1.3426 0.0746 5 | -1.2433 -0.0451 6 | -1.3477 0.0140 7 | -1.2695 0.0159 8 | -1.2560 -0.0217 9 | -1.4525 -0.1600 10 | -1.4813 0.2405 11 | -1.2533 0.1410 12 | -1.1160 0.1336 13 | -1.2506 0.0134 14 | -1.4179 0.1585 15 | -1.1427 0.1662 16 | -1.2654 0.2749 17 | -1.3093 0.4972 18 | -0.8894 0.5117 19 | -1.2147 0.4365 20 | -1.0589 0.5416 21 | -1.0974 0.3831 22 | -1.1807 0.5650 23 | -1.0185 0.5226 24 | -1.1234 0.5162 25 | -0.9260 0.5602 26 | -1.1131 0.6486 27 | -1.0704 0.4117 28 | -1.3090 0.6600 29 | -0.9298 0.6564 30 | -0.7473 0.7469 31 | -0.8765 0.7687 32 | -1.0327 0.6897 33 | -0.8800 0.6879 34 | -0.7897 0.8371 35 | -0.9968 0.5561 36 | -0.7329 0.8403 37 | -0.9754 0.4914 38 | -0.5870 0.8492 39 | -0.7836 0.9945 40 | -0.5976 0.7718 41 | -0.6082 1.1435 42 | -0.5593 0.8088 43 | -0.4512 0.6593 44 | -0.4470 0.9503 45 | -0.5153 0.8144 46 | -0.4882 0.8137 47 | -0.3907 0.9800 48 | -0.3310 0.9054 49 | -0.4595 0.8447 50 | -0.2568 0.8148 51 | -0.1615 0.6493 52 | -0.1068 0.9994 53 | -0.3376 0.9333 54 | -0.0439 0.8333 55 | -0.0312 0.8441 56 | -0.1402 0.8472 57 | -0.0961 1.0671 58 | 0.0436 0.9985 59 | -0.2503 0.7038 60 | -0.1422 0.9170 61 | -0.1054 0.9390 62 | -0.0536 0.8220 63 | 0.3137 0.8682 64 | 0.0720 0.6508 65 | 0.0782 0.8938 66 | -0.0973 0.6904 67 | 0.1411 0.7367 68 | 0.1296 0.7934 69 | 0.2079 0.5609 70 | 0.2143 0.8637 71 | 0.0133 0.8896 72 | 0.2598 0.6927 73 | 0.1868 0.9497 74 | 0.1563 0.3887 75 | 0.2798 0.5154 76 | 0.2081 0.6446 77 | 0.3619 0.4604 78 | 
0.4530 0.2400 79 | 0.3655 0.4805 80 | 0.4878 0.3051 81 | 0.3224 0.4864 82 | 0.4199 0.3399 83 | 0.4536 0.2371 84 | 0.4703 0.3144 85 | 0.3250 0.3584 86 | 0.4466 0.2982 87 | 0.6724 0.3708 88 | 0.5432 0.2119 89 | 0.6769 0.1165 90 | 0.5262 0.2860 91 | 0.8319 0.2159 92 | 0.6931 -0.0227 93 | 0.3347 -0.0805 94 | 0.6928 0.0936 95 | 0.4681 0.0717 96 | 0.6455 -0.0896 97 | 0.6603 -0.0498 98 | 0.6617 -0.1023 99 | 0.5885 -0.0414 100 | 0.5709 -0.1069 101 | -0.5597 0.1367 102 | -0.6311 0.2434 103 | -0.4894 0.0448 104 | -0.6578 0.1441 105 | -0.4873 0.2604 106 | -0.5392 -0.0516 107 | -0.6547 -0.0502 108 | -0.6288 -0.1486 109 | -0.7275 -0.1015 110 | -0.6298 0.0501 111 | -0.3436 -0.2539 112 | -0.6768 -0.1921 113 | -0.6052 -0.2849 114 | -0.4088 -0.1798 115 | -0.5129 -0.2064 116 | -0.3433 -0.3513 117 | -0.5111 -0.3768 118 | -0.4578 -0.3235 119 | -0.3441 -0.5589 120 | -0.4199 -0.3592 121 | -0.3887 -0.4941 122 | -0.6470 -0.5287 123 | -0.4376 -0.4311 124 | -0.2540 -0.5804 125 | -0.5833 -0.5855 126 | -0.2180 -0.5837 127 | -0.3327 -0.4320 128 | -0.4364 -0.6702 129 | -0.4299 -0.6884 130 | -0.4017 -0.7903 131 | -0.2743 -0.5594 132 | -0.2300 -0.5676 133 | -0.0773 -0.6538 134 | -0.0194 -0.7473 135 | -0.1527 -0.7179 136 | -0.2595 -0.5159 137 | -0.0402 -0.8188 138 | 0.1271 -0.7987 139 | 0.0809 -0.5309 140 | 0.1404 -0.8573 141 | 0.1402 -0.6199 142 | -0.0565 -0.6858 143 | 0.0977 -0.8960 144 | 0.1566 -0.7853 145 | 0.2867 -0.9838 146 | 0.1906 -1.0191 147 | 0.0526 -0.7277 148 | -0.0401 -0.8271 149 | 0.3977 -0.7768 150 | 0.5240 -0.9277 151 | 0.3055 -0.9099 152 | 0.5350 -0.8659 153 | 0.5637 -0.8476 154 | 0.4200 -0.9719 155 | 0.5199 -0.7543 156 | 0.6421 -0.7453 157 | 0.6160 -0.7585 158 | 0.4763 -0.8627 159 | 0.5213 -0.8194 160 | 0.6220 -0.7689 161 | 0.6079 -0.7382 162 | 0.8823 -0.8498 163 | 0.6742 -0.6028 164 | 1.0800 -0.5724 165 | 0.7631 -0.7106 166 | 0.8309 -0.7879 167 | 0.9476 -0.7258 168 | 0.6503 -0.6596 169 | 0.9792 -0.6558 170 | 0.9327 -0.5266 171 | 0.9087 -0.4398 172 | 1.2249 -0.5257 173 | 1.0020 -0.6705 174 | 0.8749 -0.6966 175 | 0.9947 -0.4664 176 | 1.2899 -0.7225 177 | 1.2173 -0.3204 178 | 1.1954 -0.5276 179 | 1.2815 -0.4800 180 | 1.2609 -0.4442 181 | 1.2198 -0.5365 182 | 1.2043 -0.3410 183 | 1.2503 -0.3602 184 | 1.2035 -0.3879 185 | 1.3811 -0.0959 186 | 1.2714 -0.1807 187 | 1.4145 -0.0326 188 | 1.1280 -0.3075 189 | 1.3946 -0.0686 190 | 1.1405 -0.2190 191 | 1.2109 -0.1448 192 | 1.4998 0.0787 193 | 1.1156 -0.0481 194 | 1.5878 0.1745 195 | 1.2817 0.4193 196 | 1.3417 -0.0501 197 | 1.2290 -0.1015 198 | 1.4039 0.2343 199 | 1.2912 0.1612 200 | 1.2759 0.1342 201 | -------------------------------------------------------------------------------- /problem_8/main.py: -------------------------------------------------------------------------------- 1 | """ 2 | Ng谱聚类: load_data(file_name) 数据加载; 3 | generate_graph(data, k, theta) 图构造; 4 | k_means(data, mu) K-means; 5 | ng_algo(W, c) 谱聚类 6 | 7 | @leofansq 8 | https://github.com/leofansq 9 | """ 10 | import numpy as np 11 | import matplotlib.pyplot as plt 12 | 13 | def load_data(file_name): 14 | """ 15 | 加载数据 16 | """ 17 | data = [] 18 | with open(file_name, 'r') as f: 19 | content = f.readlines() 20 | for i in content: 21 | i = i[:-1].split(" ") 22 | data.append([float(i[0]), float(i[1])]) 23 | data = np.array(data) 24 | 25 | return data 26 | 27 | def generate_graph(data, k, theta): 28 | """ 29 | 构造图 30 | Parameter: 31 | data: 待聚类数据 32 | k: k近邻数 33 | theta: 亲和性参数 34 | Return: 35 | w: 亲和性矩阵 36 | """ 37 | # 构造data行列矩阵(n*n*d)以便后续矩阵运算: data_c每列相同 = data_r每行相同 = 样本数据 38 | data_c = 
np.tile(np.expand_dims(data.copy(), axis=1), (1,data.shape[0],1)) 39 | data_r = np.tile(np.expand_dims(data.copy(), axis=0), (data.shape[0],1,1)) 40 | 41 | # 生成Dist矩阵 42 | dist = np.sum((data_c - data_r)**2, axis=-1) 43 | 44 | # 生成W矩阵 45 | # 初始化w 46 | w = np.zeros_like(dist) 47 | for idx_sample, i in enumerate(dist): 48 | idx = np.arange(0, i.shape[0]) 49 | # 构造 距离-索引 序列, 将距离和索引一一对应 50 | i_idx = zip(i, idx) 51 | # 按照距离递增排序 52 | i_sorted = sorted(i_idx, key=lambda i_idx: i_idx[0]) 53 | # 生成w矩阵: 循环时排除自身距离为0的干扰 54 | for j in range(1,k+1): 55 | w[idx_sample, i_sorted[j][1]] = np.exp(-i_sorted[j][0]/(2*(theta**2))) 56 | # w调整:为保证w为对称矩阵 57 | w = (w.T + w)/2 58 | 59 | return w 60 | 61 | def k_means(data, mu): 62 | """ 63 | K-means聚类 64 | Parameters: 65 | data: 待聚类数据(np.array) 66 | mu: 初始化聚类中心(np.array) 67 | Return: 68 | c: 聚类结果[[第一类数据], [第二类数据], ... , [第c类数据]] 69 | label: 聚类结果label列表 [样本1类别, 样本2类别, ... , 样本n类别] 70 | mu: 类中心结果[第一类类中心, 第二类类中心, ... , 第c类类中心] 71 | cnt: 迭代次数 72 | """ 73 | # 待聚类数据矩阵调整(复制矩阵使其从n*d变为n*c*d, 便于后续矩阵运算) 74 | data = np.tile(np.expand_dims(data, axis=1), (1,mu.shape[0],1)) 75 | # 初始化变量 76 | mu_temp = np.zeros_like(mu) # 保存前一次mu结果 77 | cnt = 0 78 | 79 | # 迭代更新类中心 80 | while np.abs(np.sum((mu - mu_temp)**2))>1e-10 : 81 | mu_temp = mu 82 | cnt += 1 83 | label = np.zeros((data.shape[0]), dtype=np.uint8) 84 | # mu矩阵调整(复制矩阵使其从c*d变为n*c*d, 便于后续矩阵运算) 85 | mu = np.tile(np.expand_dims(mu, axis=0), (data.shape[0],1,1)) 86 | # 生成距离矩阵(n*c) 87 | dist = np.sum((data-mu)**2, axis=-1) 88 | # 初始化聚类结果 & 根据距离确定样本类别 89 | c = [] 90 | for _ in range(data.shape[1]): 91 | c.append([]) 92 | 93 | for idx, sample in enumerate(data): 94 | c[np.argmin(dist[idx])].append(sample[0]) 95 | label[idx] = np.argmin(dist[idx]) 96 | c = np.array(c) 97 | # 更新类中心 98 | mu = [] 99 | for i in c: mu.append(np.mean(i, axis=0)) 100 | mu = np.array(mu) 101 | 102 | return c, label, mu, cnt 103 | 104 | def ng_algo(W, c): 105 | """ 106 | Ng谱聚类算法 107 | Parameters: 108 | W: 亲和性矩阵 109 | c: 聚类类别数 110 | Return: 111 | label: 聚类结果label列表 [样本1类别, 样本2类别, ... 
, 样本n类别] 112 | """ 113 | # 计算D & D^(-1/2)矩阵: 为避免生成D后计算会出现分母为0的情况, 直接计算D^(-1/2) 114 | W_rowsum = np.sum(W, axis=1) 115 | D = np.diag(W_rowsum) 116 | # W_rowsum = 1/(np.sqrt(W_rowsum)) 117 | W_rowsum = W_rowsum**(-0.5) 118 | D_invsqrt = np.diag(W_rowsum) 119 | # 计算L矩阵 120 | L = D - W 121 | # 计算L_sym矩阵 122 | L_sym = np.matmul(np.matmul(D_invsqrt, L), D_invsqrt) 123 | # L_sym特征值 & 特征向量 124 | e_value, e_vector = np.linalg.eig(L_sym) 125 | e_vector = e_vector.T 126 | e = zip(e_value, e_vector) 127 | e_sorted = sorted(e, key=lambda e: e[0]) 128 | # 生成新特征 129 | new_feature = [] 130 | for i in range(c): 131 | new_feature.append(e_sorted[i][1]) 132 | new_feature = np.array(new_feature).T 133 | # 归一化新特征 134 | norm_feature = [] 135 | for i in new_feature: 136 | i = i/(np.sqrt(np.sum(i**2))+1e-10) 137 | norm_feature.append([i[0], i[1]]) 138 | norm_feature = np.array(norm_feature) 139 | # 对新特征做K-means 140 | mu = np.array([norm_feature[50], norm_feature[150]]) 141 | _, label, _, _ = k_means(norm_feature, mu) 142 | 143 | return label 144 | 145 | 146 | if __name__ == "__main__": 147 | # 加载数据 & 可视化 148 | data = load_data("./data.txt") 149 | # plt.scatter(data[:,0], data[:,1], marker=".", color="black") 150 | # plt.title("Data") 151 | # plt.savefig('data.png') 152 | 153 | # 构造图 154 | k = 5 155 | theta = 2 156 | w = generate_graph(data, k, theta) 157 | 158 | # 谱聚类 159 | c = 2 160 | label = ng_algo(w, c) 161 | 162 | # 可视化 163 | color = ['red', 'blue'] 164 | for idx, i in enumerate(data): 165 | i = np.array(i) 166 | plt.scatter(i[0], i[1], marker = '.',color = color[label[idx]]) 167 | plt.title('Result') 168 | plt.show() 169 | 170 | # # ACC-K/Sigma 曲线 171 | # gt = np.zeros((200)) 172 | # for i in range(100,200): gt[i]=1 173 | 174 | # Acc = [] 175 | # # for k in range(1,199): 176 | # for theta in np.arange(0.1,2,0.1): 177 | # print (theta) 178 | # k = 5 179 | # # theta = 1 180 | # w = generate_graph(data, k, theta) 181 | # c = 2 182 | # label = ng_algo(w, c) 183 | # acc = (200 - np.sum(np.abs(label - gt)))/200 184 | # Acc.append(acc) 185 | # Acc = np.array(Acc) 186 | # # plt.plot(np.arange(1,199), Acc) 187 | # plt.plot(np.arange(0.1,2,0.1), Acc) 188 | # plt.title("Acc-Sigma (k={})".format(k)) 189 | # plt.xlabel("sigma") 190 | # plt.ylabel("acc") 191 | # plt.show() 192 | 193 | 194 | 195 | 196 | 197 | 198 | -------------------------------------------------------------------------------- /problem_8/result.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_8/result.png -------------------------------------------------------------------------------- /problem_9/README.md: -------------------------------------------------------------------------------- 1 | # Problem 9 2 | ## 1. 问题描述 3 | 4 | 从MNIST数据集中选择两类,对其进行SVM分类,可调用现有的SVM工具. 5 | 6 | ## 2. 实现思路 7 | 8 | 为实现上述功能, 需实现以下子功能: 9 | 10 | * 加载图像数据与Label数据: Mnist数据集原始图像数据以idx3-ubyte格式存储, Label数据以idx1-ubyte格式存储. 因此需根据其特定格式加载数据. 11 | 12 | * 根据需求生成实际使用数据: 由于需从数据集中选取两类进行分类, 因此需生成指定类别的图像数据和其对应的Label数据, 以数组的形式返回. 13 | 14 | * SVM训练与模型保存: 利用训练数据对SVM进行训练, 直接调用Sk-learn的SVM工具. 将训练好的模型保存, 以便后续使用. 15 | 16 | * SVM测试: 加载已有的SVM模型, 利用测试数据进行测试, 统计并评价分类的准确率. 17 | 18 | ## 3. 
Python代码 19 | 20 | 代码实现时, 使用以下库: numpy, struct.unpack, sklearn.svm, pickle 21 | 22 | ### 3.1 图像数据与Label数据的加载 23 | ```Python 24 | def load_imgs(path): 25 | """ 26 | 加载图像数据 27 | Parameter: 28 | path: 图像数据文件路径 29 | Return: 30 | imgs: 每行为一个图像数据(n, img.size) 31 | """ 32 | with open(path, 'rb') as f: 33 | _, num, rows, cols = unpack('>4I', f.read(16)) 34 | imgs = np.fromfile(f, dtype=np.uint8).reshape(num, rows*cols) 35 | return imgs 36 | 37 | def load_labels(path): 38 | """ 39 | 加载Label数据 40 | Parameter: 41 | path: Label数据文件路径 42 | Return: 43 | labels: 每行为一个label标签(n,) 44 | """ 45 | with open(path, 'rb') as f: 46 | _, num = unpack('>2I', f.read(8)) 47 | labels = np.fromfile(f, dtype=np.uint8) 48 | return labels 49 | ``` 50 | 51 | ### 3.2 生成与整合数据 52 | ```Python 53 | def generate_data(img_path, label_path, c_list): 54 | """ 55 | 生成整合数据 56 | Parameter: 57 | img_path: 图像数据文件路径 58 | label_path: Label数据文件路径 59 | c_list: 需生成的类别名列表 60 | Return: 61 | [img_c, label_c]: img_c为图像数据数组,每行为一个图像的数据(n,img.size);label_c为标签数据数组,每行为一个标签的数据(n,) 62 | """ 63 | # 加载图像数据和Label 64 | imgs = load_imgs(img_path) 65 | labels = load_labels(label_path) 66 | 67 | # 选取特定类别并生成所需数据 68 | img_c = [] 69 | label_c = [] 70 | for c in c_list: 71 | idx = np.where(labels==c) 72 | img_c.extend(imgs[idx[0]]) 73 | label_c.extend(labels[idx[0]]) 74 | # 图像数据归一化 75 | img_c = np.array(img_c)/255.0 76 | label_c = np.array(label_c) 77 | 78 | return [img_c, label_c] 79 | ``` 80 | 81 | ### 3.3 SVM训练与模型保存 82 | ```Python 83 | def train_svm(train_data, c, model_path): 84 | """ 85 | SVM训练 & 模型保存 86 | Parameter: 87 | train_data: [img_c, label_c], generate_data函数返回的数据格式 88 | c: SVM参数c 89 | model_path: 训练生成的模型保存路径 90 | """ 91 | # SVM训练 92 | print ("Start Training...") 93 | classifier = svm.SVC(C=c, decision_function_shape='ovr') 94 | classifier.fit(train_data[0], train_data[1]) 95 | 96 | # 模型保存 97 | save = pickle.dumps(classifier) 98 | with open(model_path, 'wb+') as f: f.write(save) 99 | print ("Training Done. Model is saved in {}.".format(model_path)) 100 | ``` 101 | 102 | ### 3.4 SVM测试 103 | ```Python 104 | def test_svm(test_data, model_path): 105 | """ 106 | SVM测试 107 | Parameter: 108 | test_data: [img_c, label_c], generate_data函数返回的数据格式 109 | model_path: 测试的模型的文件路径 110 | """ 111 | # 加载待测试模型 112 | print ("Start Testing...") 113 | with open(model_path, 'rb') as f: s = f.read() 114 | classifier = pickle.loads(s) 115 | 116 | # 模型测试 117 | score = classifier.score(test_data[0], test_data[1]) 118 | print ("Testing Accuracy:", score) 119 | ``` 120 | 121 | ### 3.5 实验主程序 122 | ```Python 123 | # Option 124 | # 文件路径 125 | TRAIN_IMG_PATH = "./data/train-images.idx3-ubyte" 126 | TRAIN_LABEL_PATH = "./data/train-labels.idx1-ubyte" 127 | TEST_IMG_PATH = "./data/t10k-images.idx3-ubyte" 128 | TEST_LABEL_PATH = "./data/t10k-labels.idx1-ubyte" 129 | # 训练&测试的指定类别 130 | CLASS = [1, 2] 131 | # 模型保存路径 132 | MODEL_PATH = "svm.model" 133 | 134 | # 数据加载 135 | train_data = generate_data(TRAIN_IMG_PATH, TRAIN_LABEL_PATH, CLASS) 136 | test_data = generate_data(TEST_IMG_PATH, TEST_LABEL_PATH, CLASS) 137 | # 训练 138 | train_svm(train_data, 1, MODEL_PATH) 139 | # 测试 140 | test_svm(test_data, MODEL_PATH) 141 | ``` 142 | 143 | ## 4. 结果与讨论 144 | 145 | 通过运行python main.py利用上述代码对SVM进行训练与模型测试.(为方便作业的上传与下载, 提交时对数据集数据进行了压缩, 测试前需解压) 146 | 147 | 实验发现, SVM能较好地对数据进行分类. 在参数C相同的情况下, 对于本身相似度较小的两个类别, 如1和2, 测试正确率可达到99.5\%; 148 | 对于本身相似度较高的两个类别, 如0和6, 1和7, 测试正确率可达到99.0\%. 149 | 150 | 在待分类类别固定不变的情况下, 调整参数C可以使测试正确率发生变化. 
对于1和2两个类别, 当C=1时, 测试正确率为99.4\%; 当C=100时, 正确率为99.5\%; 151 | 当C=200时, 正确率为99.7\%; 但当C=1000时, 正确率降回99.5\%. 152 | 上述现象可以从理论上进行分析. 增大C意味着减小对错分情况的松弛, 可以达到更好的分类情况, 但同时也可能带来过拟合的隐患, 一旦引起过拟合, 模型虽然能在训练数据上达到更优的分类效果, 但在测试数据上正确率则会降低. -------------------------------------------------------------------------------- /problem_9/data.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leofansq/AI01001H-PatternRecognition/70efb93b2028f0c61a3e1d4549f07b867f576034/problem_9/data.zip -------------------------------------------------------------------------------- /problem_9/main.py: -------------------------------------------------------------------------------- 1 | """ 2 | SVM训练 & 测试, MNIST数据集 3 | 4 | @leofansq 5 | https://github.com/leofansq 6 | """ 7 | import numpy as np 8 | from struct import unpack 9 | from sklearn import svm 10 | 11 | import pickle 12 | 13 | def load_imgs(path): 14 | """ 15 | 加载图像数据 16 | Parameter: 17 | path: 图像数据文件路径 18 | Return: 19 | imgs: 每行为一个图像数据(n, img.size) 20 | """ 21 | with open(path, 'rb') as f: 22 | _, num, rows, cols = unpack('>4I', f.read(16)) 23 | imgs = np.fromfile(f, dtype=np.uint8).reshape(num, rows*cols) 24 | return imgs 25 | 26 | def load_labels(path): 27 | """ 28 | 加载Label数据 29 | Parameter: 30 | path: Label数据文件路径 31 | Return: 32 | labels: 每行为一个label标签(n,) 33 | """ 34 | with open(path, 'rb') as f: 35 | _, num = unpack('>2I', f.read(8)) 36 | labels = np.fromfile(f, dtype=np.uint8) 37 | return labels 38 | 39 | def generate_data(img_path, label_path, c_list): 40 | """ 41 | 生成整合数据 42 | Parameter: 43 | img_path: 图像数据文件路径 44 | label_path: Label数据文件路径 45 | c_list: 需生成的类别名列表 46 | Return: 47 | [img_c, label_c]: img_c为图像数据数组,每行为一个图像的数据(n,img.size);label_c为标签数据数组,每行为一个标签的数据(n,) 48 | """ 49 | # 加载图像数据和Label 50 | imgs = load_imgs(img_path) 51 | labels = load_labels(label_path) 52 | 53 | # 选取特定类别并生成所需数据 54 | img_c = [] 55 | label_c = [] 56 | for c in c_list: 57 | idx = np.where(labels==c) 58 | img_c.extend(imgs[idx[0]]) 59 | label_c.extend(labels[idx[0]]) 60 | # 图像数据归一化 61 | img_c = np.array(img_c)/255.0 62 | label_c = np.array(label_c) 63 | 64 | return [img_c, label_c] 65 | 66 | def train_svm(train_data, c, model_path): 67 | """ 68 | SVM训练 & 模型保存 69 | Parameter: 70 | train_data: [img_c, label_c], generate_data函数返回的数据格式 71 | c: SVM参数c 72 | model_path: 训练生成的模型保存路径 73 | """ 74 | # SVM训练 75 | print ("Start Training...") 76 | classifier = svm.SVC(C=c, decision_function_shape='ovr') 77 | classifier.fit(train_data[0], train_data[1]) 78 | 79 | # 模型保存 80 | save = pickle.dumps(classifier) 81 | with open(model_path, 'wb+') as f: f.write(save) 82 | print ("Training Done. 
Model is saved in {}.".format(model_path)) 83 | 84 | def test_svm(test_data, model_path): 85 | """ 86 | SVM测试 87 | Parameter: 88 | test_data: [img_c, label_c], generate_data函数返回的数据格式 89 | model_path: 测试的模型的文件路径 90 | """ 91 | # 加载待测试模型 92 | print ("Start Testing...") 93 | with open(model_path, 'rb') as f: s = f.read() 94 | classifier = pickle.loads(s) 95 | 96 | # 模型测试 97 | score = classifier.score(test_data[0], test_data[1]) 98 | print ("Testing Accuracy:", score) 99 | 100 | if __name__ == "__main__": 101 | # Option 102 | # 文件路径 103 | TRAIN_IMG_PATH = "./data/train-images.idx3-ubyte" 104 | TRAIN_LABEL_PATH = "./data/train-labels.idx1-ubyte" 105 | TEST_IMG_PATH = "./data/t10k-images.idx3-ubyte" 106 | TEST_LABEL_PATH = "./data/t10k-labels.idx1-ubyte" 107 | # 训练&测试的指定类别 108 | CLASS = [1, 2] 109 | # 模型保存路径 110 | MODEL_PATH = "svm.model" 111 | 112 | # 数据加载 113 | train_data = generate_data(TRAIN_IMG_PATH, TRAIN_LABEL_PATH, CLASS) 114 | test_data = generate_data(TEST_IMG_PATH, TEST_LABEL_PATH, CLASS) 115 | # 训练 116 | train_svm(train_data, 200, MODEL_PATH) 117 | # 测试 118 | test_svm(test_data, MODEL_PATH) 119 | 120 | --------------------------------------------------------------------------------
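
补充: 关于 Problem 9 中参数C的选择, 上文的讨论指出增大C可能带来过拟合隐患. 一种常见做法是在训练集上用交叉验证自动选择C. 下面给出一个基于sklearn的示意性代码(假设性示例: 函数名select_c与候选C值均为演示用, 数据复用上文generate_data返回的[img_c, label_c]格式):

```Python
# 示意性代码(假设): 用5折交叉验证在训练集上选择SVM的参数C
from sklearn import svm
from sklearn.model_selection import GridSearchCV

def select_c(train_data, c_candidates=(1, 100, 200, 1000)):
    """
    交叉验证选择参数C
    Parameter:
        train_data: [img_c, label_c], generate_data函数返回的数据格式
        c_candidates: 候选C值(仅作演示)
    Return:
        best_c: 交叉验证精度最高的C值
    """
    # 对每个候选C做5折交叉验证, 保留验证精度最高者
    classifier = svm.SVC(decision_function_shape='ovr')
    search = GridSearchCV(classifier, param_grid={'C': list(c_candidates)}, cv=5)
    search.fit(train_data[0], train_data[1])
    return search.best_params_['C']

# 用法示例: best_c = select_c(train_data); train_svm(train_data, best_c, MODEL_PATH)
```

这样选出的C以验证精度而非训练精度为依据, 可在一定程度上规避上文提到的"训练集上更优、测试集上反而下降"的过拟合现象.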