├── docs
│   ├── .nojekyll
│   ├── images
│   │   ├── 23-1.png
│   │   ├── 24-4.png
│   │   ├── 24-7.png
│   │   ├── 25-1.png
│   │   ├── 25-4.png
│   │   ├── 25-6.png
│   │   ├── 28-6.png
│   │   ├── 23-4-1.png
│   │   ├── 23-4-2.png
│   │   ├── 23-5-1.png
│   │   ├── 23-5-2.png
│   │   ├── 26-4-CNN.png
│   │   ├── 5-1-prune.png
│   │   ├── 26-1-S2S-LSTM.png
│   │   ├── 5-2-prune-T4.png
│   │   ├── 20-5-EM-E step.png
│   │   ├── 20-5-EM-M step.png
│   │   ├── 26-5-Attention.png
│   │   ├── 18-3-Text-Dataset.png
│   │   ├── 26-5-Transformer.png
│   │   ├── 3-1-KD-Tree-Demo.png
│   │   ├── 11-1-Maximal-Clique.png
│   │   ├── 15-2-Search-Click-Data.png
│   │   └── 17-1-word-text-matrix.png
│   ├── resources
│   │   ├── qrcode.jpeg
│   │   ├── part3_images.pptx
│   │   └── machine-learning-method-book.png
│   ├── chapter02
│   │   ├── output_8_0.png
│   │   └── output_17_1.png
│   ├── chapter03
│   │   └── output_6_0.png
│   ├── chapter06
│   │   └── output_25_0.png
│   ├── chapter07
│   │   └── output_22_0.png
│   ├── chapter09
│   │   └── output_29_0.png
│   ├── chapter19
│   │   ├── output_109_1.png
│   │   └── output_93_2.png
│   ├── chapter27
│   │   ├── output_70_0.png
│   │   ├── output_70_1.png
│   │   └── output_70_2.png
│   ├── _sidebar.md
│   ├── index.html
│   ├── chapter05
│   │   └── output_7_0.svg
│   ├── chapter21
│   │   └── ch21.md
│   └── chapter01
│       └── ch01.md
├── images
│   ├── qrcode.jpeg
│   └── machine-learning-method-book.png
├── codes
│   ├── ch05
│   │   ├── Source.gv.pdf
│   │   ├── Source.gv
│   │   ├── decision_tree_demo.py
│   │   ├── my_least_squares_regression_tree.py
│   │   └── my_decision_tree.py
│   ├── ch15
│   │   ├── outer_product_expansion.py
│   │   └── my_svd.py
│   ├── ch03
│   │   ├── kd_tree_demo.py
│   │   ├── k_neighbors_classifier.py
│   │   └── my_kd_tree.py
│   ├── ch08
│   │   ├── adaboost_demo.py
│   │   └── my_adaboost.py
│   ├── ch21
│   │   └── page_rank.py
│   ├── ch17
│   │   ├── lsa_svd.py
│   │   └── divergence_nmf_lsa.py
│   ├── ch16
│   │   └── pca_svd.py
│   ├── ch28
│   │   └── zero_sum_game.py
│   ├── ch09
│   │   ├── gmm_demo.py
│   │   ├── three_coin_EM.py
│   │   └── my_gmm.py
│   ├── ch19
│   │   ├── monte_carlo_method.py
│   │   ├── metropolis_hastings.py
│   │   └── gibbs_sampling.py
│   ├── ch07
│   │   └── svm_demo.py
│   ├── ch11
│   │   └── crf_matrix.py
│   ├── ch18
│   │   └── em_plsa.py
│   ├── ch10
│   │   ├── hidden_markov_backward.py
│   │   ├── hidden_markov_forward_backward.py
│   │   └── hidden_markov_viterbi.py
│   ├── ch27
│   │   ├── auto_encoder.py
│   │   └── bi-lstm-text-classification.py
│   ├── ch14
│   │   └── divisive_clustering.py
│   ├── ch23
│   │   └── feedforward_nn_backpropagation.py
│   ├── ch02
│   │   └── perceptron.py
│   ├── summary
│   │   └── merge_docs.py
│   ├── ch26
│   │   ├── lstm_seq2seq.py
│   │   └── cnn_seq2seq.py
│   ├── ch06
│   │   ├── my_logistic_regression.py
│   │   └── maxent_dfp.py
│   ├── ch20
│   │   └── gibbs_sampling_lda.py
│   └── ch24
│       └── cnn-text-classification.py
├── notebook
│   ├── part03
│   │   └── images
│   │       ├── 23-1.png
│   │       ├── 24-4.png
│   │       ├── 24-7.png
│   │       ├── 25-1.png
│   │       ├── 25-4.png
│   │       ├── 25-6.png
│   │       ├── 28-6.png
│   │       ├── 23-4-1.png
│   │       ├── 23-4-2.png
│   │       ├── 23-5-1.png
│   │       ├── 23-5-2.png
│   │       ├── 26-4-CNN.png
│   │       ├── 26-1-S2S-LSTM.png
│   │       ├── 26-5-Attention.png
│   │       ├── part03_images.pptx
│   │       └── 26-5-Transformer.png
│   ├── part04
│   │   ├── images
│   │   │   ├── 35-1.png
│   │   │   └── part04_images.pptx
│   │   └── notes
│   │       ├── ch34.ipynb
│   │       ├── ch38.ipynb
│   │       ├── ch40.ipynb
│   │       ├── ch36.ipynb
│   │       ├── ch37.ipynb
│   │       └── ch39.ipynb
│   ├── part01
│   │   └── images
│   │       ├── 5-1-prune.png
│   │       ├── 5-2-prune-T4.png
│   │       ├── 3-1-KD-Tree-Demo.png
│   │       └── 11-1-Maximal-Clique.png
│   └── part02
│       └── images
│           ├── 20-5-EM-E step.png
│           ├── 20-5-EM-M step.png
│           ├── 18-3-Text-Dataset.png
│           ├── 15-2-Search-Click-Data.png
│           └── 17-1-word-text-matrix.png
├── requirements.txt
└── .gitignore
/docs/.nojekyll: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /images/qrcode.jpeg: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/images/qrcode.jpeg -------------------------------------------------------------------------------- /docs/images/23-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/23-1.png -------------------------------------------------------------------------------- /docs/images/24-4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/24-4.png -------------------------------------------------------------------------------- /docs/images/24-7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/24-7.png -------------------------------------------------------------------------------- /docs/images/25-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/25-1.png -------------------------------------------------------------------------------- /docs/images/25-4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/25-4.png -------------------------------------------------------------------------------- /docs/images/25-6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/25-6.png -------------------------------------------------------------------------------- /docs/images/28-6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/28-6.png -------------------------------------------------------------------------------- /codes/ch05/Source.gv.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/codes/ch05/Source.gv.pdf -------------------------------------------------------------------------------- /docs/images/23-4-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/23-4-1.png -------------------------------------------------------------------------------- /docs/images/23-4-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/23-4-2.png -------------------------------------------------------------------------------- /docs/images/23-5-1.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/23-5-1.png -------------------------------------------------------------------------------- /docs/images/23-5-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/23-5-2.png -------------------------------------------------------------------------------- /docs/images/26-4-CNN.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/26-4-CNN.png -------------------------------------------------------------------------------- /docs/images/5-1-prune.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/5-1-prune.png -------------------------------------------------------------------------------- /docs/resources/qrcode.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/resources/qrcode.jpeg -------------------------------------------------------------------------------- /docs/chapter02/output_8_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter02/output_8_0.png -------------------------------------------------------------------------------- /docs/chapter03/output_6_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter03/output_6_0.png -------------------------------------------------------------------------------- /docs/images/26-1-S2S-LSTM.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/26-1-S2S-LSTM.png -------------------------------------------------------------------------------- /docs/images/5-2-prune-T4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/5-2-prune-T4.png -------------------------------------------------------------------------------- /docs/chapter02/output_17_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter02/output_17_1.png -------------------------------------------------------------------------------- /docs/chapter06/output_25_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter06/output_25_0.png -------------------------------------------------------------------------------- /docs/chapter07/output_22_0.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter07/output_22_0.png -------------------------------------------------------------------------------- /docs/chapter09/output_29_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter09/output_29_0.png -------------------------------------------------------------------------------- /docs/chapter19/output_109_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter19/output_109_1.png -------------------------------------------------------------------------------- /docs/chapter19/output_93_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter19/output_93_2.png -------------------------------------------------------------------------------- /docs/chapter27/output_70_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter27/output_70_0.png -------------------------------------------------------------------------------- /docs/chapter27/output_70_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter27/output_70_1.png -------------------------------------------------------------------------------- /docs/chapter27/output_70_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/chapter27/output_70_2.png -------------------------------------------------------------------------------- /docs/images/20-5-EM-E step.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/20-5-EM-E step.png -------------------------------------------------------------------------------- /docs/images/20-5-EM-M step.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/20-5-EM-M step.png -------------------------------------------------------------------------------- /docs/images/26-5-Attention.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/26-5-Attention.png -------------------------------------------------------------------------------- /notebook/part03/images/23-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/23-1.png -------------------------------------------------------------------------------- /notebook/part03/images/24-4.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/24-4.png -------------------------------------------------------------------------------- /notebook/part03/images/24-7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/24-7.png -------------------------------------------------------------------------------- /notebook/part03/images/25-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/25-1.png -------------------------------------------------------------------------------- /notebook/part03/images/25-4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/25-4.png -------------------------------------------------------------------------------- /notebook/part03/images/25-6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/25-6.png -------------------------------------------------------------------------------- /notebook/part03/images/28-6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/28-6.png -------------------------------------------------------------------------------- /notebook/part04/images/35-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part04/images/35-1.png -------------------------------------------------------------------------------- /docs/images/18-3-Text-Dataset.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/18-3-Text-Dataset.png -------------------------------------------------------------------------------- /docs/images/26-5-Transformer.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/26-5-Transformer.png -------------------------------------------------------------------------------- /docs/images/3-1-KD-Tree-Demo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/3-1-KD-Tree-Demo.png -------------------------------------------------------------------------------- /docs/resources/part3_images.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/resources/part3_images.pptx 
-------------------------------------------------------------------------------- /notebook/part03/images/23-4-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/23-4-1.png -------------------------------------------------------------------------------- /notebook/part03/images/23-4-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/23-4-2.png -------------------------------------------------------------------------------- /notebook/part03/images/23-5-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/23-5-1.png -------------------------------------------------------------------------------- /notebook/part03/images/23-5-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/23-5-2.png -------------------------------------------------------------------------------- /docs/images/11-1-Maximal-Clique.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/11-1-Maximal-Clique.png -------------------------------------------------------------------------------- /notebook/part01/images/5-1-prune.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part01/images/5-1-prune.png -------------------------------------------------------------------------------- /notebook/part03/images/26-4-CNN.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/26-4-CNN.png -------------------------------------------------------------------------------- /docs/images/15-2-Search-Click-Data.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/15-2-Search-Click-Data.png -------------------------------------------------------------------------------- /docs/images/17-1-word-text-matrix.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/images/17-1-word-text-matrix.png -------------------------------------------------------------------------------- /images/machine-learning-method-book.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/images/machine-learning-method-book.png -------------------------------------------------------------------------------- /notebook/part01/images/5-2-prune-T4.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part01/images/5-2-prune-T4.png -------------------------------------------------------------------------------- /notebook/part02/images/20-5-EM-E step.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part02/images/20-5-EM-E step.png -------------------------------------------------------------------------------- /notebook/part02/images/20-5-EM-M step.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part02/images/20-5-EM-M step.png -------------------------------------------------------------------------------- /notebook/part03/images/26-1-S2S-LSTM.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/26-1-S2S-LSTM.png -------------------------------------------------------------------------------- /notebook/part03/images/26-5-Attention.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/26-5-Attention.png -------------------------------------------------------------------------------- /notebook/part03/images/part03_images.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/part03_images.pptx -------------------------------------------------------------------------------- /notebook/part04/images/part04_images.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part04/images/part04_images.pptx -------------------------------------------------------------------------------- /notebook/part01/images/3-1-KD-Tree-Demo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part01/images/3-1-KD-Tree-Demo.png -------------------------------------------------------------------------------- /notebook/part02/images/18-3-Text-Dataset.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part02/images/18-3-Text-Dataset.png -------------------------------------------------------------------------------- /notebook/part03/images/26-5-Transformer.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part03/images/26-5-Transformer.png -------------------------------------------------------------------------------- /notebook/part01/images/11-1-Maximal-Clique.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part01/images/11-1-Maximal-Clique.png -------------------------------------------------------------------------------- /docs/resources/machine-learning-method-book.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/docs/resources/machine-learning-method-book.png -------------------------------------------------------------------------------- /notebook/part02/images/15-2-Search-Click-Data.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part02/images/15-2-Search-Click-Data.png -------------------------------------------------------------------------------- /notebook/part02/images/17-1-word-text-matrix.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/datawhalechina/statistical-learning-method-solutions-manual/HEAD/notebook/part02/images/17-1-word-text-matrix.png -------------------------------------------------------------------------------- /codes/ch15/outer_product_expansion.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: outer_product_expansion.py 6 | @time: 2022/7/15 16:09 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题15.2 外积展开式 9 | """ 10 | 11 | import numpy as np 12 | 13 | A = np.array([[2, 4], 14 | [1, 3], 15 | [0, 0], 16 | [0, 0]]) 17 | 18 | # 调用numpy的svd方法 19 | U, S, V = np.linalg.svd(A) 20 | np.set_printoptions() 21 | 22 | print("U=", U) 23 | print("S=", S) 24 | print("V=", V.T) 25 | 26 | calc = S[0] * np.outer(U[:, 0], V[:, 0]) + S[1] * np.outer(U[:, 1], V[:, 1]) 27 | print("A=", calc) 28 | 29 | 30 | -------------------------------------------------------------------------------- /codes/ch05/Source.gv: -------------------------------------------------------------------------------- 1 | digraph Tree { 2 | node [shape=box, style="filled, rounded", color="black", fontname=SimSun] ; 3 | edge [fontname=helvetica] ; 4 | 0 [label=<有自己的房子 ≤ 3.0
gini = 0.48
samples = 15
value = [6, 9]
class = 是>, fillcolor="#bddef6"] ; 5 | 1 [label=<有工作 ≤ 3.0
gini = 0.444
samples = 9
value = [6, 3]
class = 否>, fillcolor="#f2c09c"] ; 6 | 0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ; 7 | 2 [label=samples = 6
value = [6, 0]
class = 否>, fillcolor="#e58139"] ; 8 | 1 -> 2 ; 9 | 3 [label=samples = 3
value = [0, 3]
class = 是>, fillcolor="#399de5"] ; 10 | 1 -> 3 ; 11 | 4 [label=samples = 6
value = [0, 6]
class = 是>, fillcolor="#399de5"] ; 12 | 0 -> 4 [labeldistance=2.5, labelangle=-45, headlabel="False"] ; 13 | } 14 | -------------------------------------------------------------------------------- /codes/ch03/kd_tree_demo.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: kd_tree_demo.py 6 | @time: 2021/8/3 17:08 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题3.2 kd树的构建与求最近邻点 9 | """ 10 | 11 | import numpy as np 12 | from sklearn.neighbors import KDTree 13 | 14 | # 构造例题3.2的数据集 15 | train_data = np.array([[2, 3], 16 | [5, 4], 17 | [9, 6], 18 | [4, 7], 19 | [8, 1], 20 | [7, 2]]) 21 | # (1)使用sklearn的KDTree类,构建平衡kd树 22 | # 设置leaf_size为2,表示平衡树 23 | tree = KDTree(train_data, leaf_size=2) 24 | 25 | # (2)使用tree.query方法,设置k=1,查找(3, 4.5)的最近邻点 26 | # dist表示与最近邻点的距离,ind表示最近邻点在train_data的位置 27 | dist, ind = tree.query(np.array([[3, 4.5]]), k=1) 28 | node_index = ind[0] 29 | 30 | # 得到最近邻点 31 | x1 = train_data[node_index][0][0] 32 | x2 = train_data[node_index][0][1] 33 | print("x点的最近邻点是({0}, {1})".format(x1, x2)) 34 | -------------------------------------------------------------------------------- /codes/ch08/adaboost_demo.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: adaboost_demo.py 6 | @time: 2021/8/13 17:16 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题8.1 使用sklearn的AdaBoostClassifier分类器实现 9 | """ 10 | 11 | from sklearn.ensemble import AdaBoostClassifier 12 | import numpy as np 13 | 14 | # 加载训练数据 15 | X = np.array([[0, 1, 3], 16 | [0, 3, 1], 17 | [1, 2, 2], 18 | [1, 1, 3], 19 | [1, 2, 3], 20 | [0, 1, 2], 21 | [1, 1, 2], 22 | [1, 1, 1], 23 | [1, 3, 1], 24 | [0, 2, 1] 25 | ]) 26 | y = np.array([-1, -1, -1, -1, -1, -1, 1, 1, -1, -1]) 27 | 28 | # 使用sklearn的AdaBoostClassifier分类器 29 | clf = AdaBoostClassifier() 30 | # 进行分类器训练 31 | clf.fit(X, y) 32 | # 对数据进行预测 33 | y_predict = clf.predict(X) 34 | # 得到分类器的预测准确率 35 | score = clf.score(X, y) 36 | print("原始输出:", y) 37 | print("预测输出:", y_predict) 38 | print("预测准确率:{:.2%}".format(score)) 39 | -------------------------------------------------------------------------------- /codes/ch21/page_rank.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: page_rank.py 6 | @time: 2022/7/18 9:14 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题21.2 基本定义的PageRank的迭代算法 9 | """ 10 | 11 | import numpy as np 12 | 13 | 14 | def page_rank_basic(M, R0, max_iter=1000): 15 | """ 16 | 迭代求解基本定义的PageRank 17 | :param M: 转移矩阵 18 | :param R0: 初始分布向量 19 | :param max_iter: 最大迭代次数 20 | :return: Rt: 极限向量 21 | """ 22 | Rt = R0 23 | for _ in range(max_iter): 24 | Rt = np.dot(M, Rt) 25 | return Rt 26 | 27 | 28 | if __name__ == '__main__': 29 | # 使用例21.1的转移矩阵M 30 | M = np.array([[0, 1 / 2, 1, 0], 31 | [1 / 3, 0, 0, 1 / 2], 32 | [1 / 3, 0, 0, 1 / 2], 33 | [1 / 3, 1 / 2, 0, 0]]) 34 | 35 | # 使用5个不同的初始分布向量R0 36 | for _ in range(5): 37 | R0 = np.random.rand(4) 38 | R0 = R0 / np.linalg.norm(R0, ord=1) 39 | Rt = page_rank_basic(M, R0) 40 | print("R0 =", R0) 41 | print("Rt =", Rt) 42 | print() -------------------------------------------------------------------------------- /codes/ch17/lsa_svd.py: 
-------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: lsa_svd.py 6 | @time: 2022/7/11 15:21 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题17.1 利用矩阵奇异值分解进行潜在语义分析 9 | """ 10 | 11 | import numpy as np 12 | 13 | 14 | def lsa_svd(X, k): 15 | """ 16 | 潜在语义分析的矩阵奇异值分解 17 | :param X: 单词-文本矩阵 18 | :param k: 话题数 19 | :return: 话题向量空间、文本集合在话题向量空间的表示 20 | """ 21 | # 单词-文本矩阵X的奇异值分解 22 | U, S, V = np.linalg.svd(X) 23 | # 矩阵的截断奇异值分解,取前k个 24 | U = U[:, :k] 25 | S = np.diag(S[:k]) 26 | V = V[:k, :] 27 | 28 | return U, np.dot(S, V) 29 | 30 | 31 | if __name__ == '__main__': 32 | X = np.array([[2, 0, 0, 0], 33 | [0, 2, 0, 0], 34 | [0, 0, 1, 0], 35 | [0, 0, 2, 3], 36 | [0, 0, 0, 1], 37 | [1, 2, 2, 1]]) 38 | 39 | # 设置精度为2 40 | np.set_printoptions(precision=2, suppress=True) 41 | # 假设话题的个数是3个 42 | U, SV = lsa_svd(X, k=3) 43 | print("话题空间U:") 44 | print(U) 45 | print("文本在话题空间的表示SV:") 46 | print(SV) 47 | -------------------------------------------------------------------------------- /docs/_sidebar.md: -------------------------------------------------------------------------------- 1 | * [目录](README.md) 2 | * 第1篇 监督学习 3 | * [第1章 统计学习方法概论](chapter01/ch01.md) 4 | * [第2章 感知机](chapter02/ch02.md) 5 | * [第3章 k近邻法](chapter03/ch03.md) 6 | * [第4章 朴素贝叶斯法](chapter04/ch04.md) 7 | * [第5章 决策树](chapter05/ch05.md) 8 | * [第6章 Logistic回归与最大熵模型](chapter06/ch06.md) 9 | * [第7章 支持向量机](chapter07/ch07.md) 10 | * [第8章 提升方法](chapter08/ch08.md) 11 | * [第9章 EM算法及其推广](chapter09/ch09.md) 12 | * [第10章 隐马尔可夫模型](chapter10/ch10.md) 13 | * [第11章 条件随机场](chapter11/ch11.md) 14 | * 第2篇 无监督学习 15 | * [第14章 聚类方法](chapter14/ch14.md) 16 | * [第15章 奇异值分解](chapter15/ch15.md) 17 | * [第16章 主成分分析](chapter16/ch16.md) 18 | * [第17章 潜在语义分析](chapter17/ch17.md) 19 | * [第18章 概率潜在语义分析](chapter18/ch18.md) 20 | * [第19章 马尔可夫链蒙特卡罗法](chapter19/ch19.md) 21 | * [第20章 潜在狄利克雷分配](chapter20/ch20.md) 22 | * [第21章 PageRank算法](chapter21/ch21.md) 23 | * 第3篇 深度学习 24 | * [第23章 前馈神经网络](chapter23/ch23.md) 25 | * [第24章 卷积神经网络](chapter24/ch24.md) 26 | * [第25章 循环神经网络](chapter25/ch25.md) 27 | * [第26章 序列到序列模型](chapter26/ch26.md) 28 | * [第27章 预训练语言模型](chapter27/ch27.md) 29 | * [第28章 生成对抗网络](chapter28/ch28.md) -------------------------------------------------------------------------------- /codes/ch16/pca_svd.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: pca_svd.py 6 | @time: 2022/7/14 10:48 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题16.1 样本矩阵的奇异值分解的主成分分析算法 9 | """ 10 | 11 | import numpy as np 12 | 13 | 14 | def pca_svd(X, k): 15 | """ 16 | 样本矩阵的奇异值分解的主成分分析算法 17 | :param X: 样本矩阵X 18 | :param k: 主成分个数k 19 | :return: 特征向量V,样本主成分矩阵Y 20 | """ 21 | n_samples = X.shape[1] 22 | 23 | # 构造新的n×m矩阵 24 | T = X.T / np.sqrt(n_samples - 1) 25 | 26 | # 对矩阵T进行截断奇异值分解 27 | U, S, V = np.linalg.svd(T) 28 | V = V[:, :k] 29 | 30 | # 求k×n的样本主成分矩阵 31 | return V, np.dot(V.T, X) 32 | 33 | 34 | if __name__ == '__main__': 35 | X = np.array([[2, 3, 3, 4, 5, 7], 36 | [2, 4, 5, 5, 6, 8]]) 37 | X = X.astype("float64") 38 | 39 | # 规范化变量 40 | avg = np.average(X, axis=1) 41 | var = np.var(X, axis=1) 42 | for i in range(X.shape[0]): 43 | X[i] = (X[i, :] - avg[i]) / np.sqrt(var[i]) 44 | 45 | # 设置精度为3 46 | np.set_printoptions(precision=3, suppress=True) 47 | V, vnk = pca_svd(X, 2) 48 | 49 | print("正交矩阵V:") 50 | print(V) 51 | 
print("样本主成分矩阵Y:") 52 | print(vnk) 53 | -------------------------------------------------------------------------------- /codes/ch28/zero_sum_game.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: zero_sum_game.py 6 | @time: 2023/3/17 18:48 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题28.2 零和博弈的代码验证 9 | """ 10 | 11 | import numpy as np 12 | 13 | 14 | def minmax_function(A): 15 | """ 16 | 从收益矩阵中计算minmax的算法 17 | :param A: 收益矩阵 18 | :return: 计算得到的minmax结果 19 | """ 20 | index_max = [] 21 | for i in range(len(A)): 22 | # 计算每一行的最大值 23 | index_max.append(A[i, :].max()) 24 | 25 | # 计算每一行的最大值中的最小值 26 | minmax = min(index_max) 27 | return minmax 28 | 29 | 30 | def maxmin_function(A): 31 | """ 32 | 从收益矩阵中计算maxmin的算法 33 | :param A: 收益矩阵 34 | :return: 计算得到的maxmin结果 35 | """ 36 | column_min = [] 37 | for i in range(len(A)): 38 | # 计算每一列的最小值 39 | column_min.append(A[:, i].min()) 40 | 41 | # 计算每一列的最小值中的最大值 42 | maxmin = max(column_min) 43 | return maxmin 44 | 45 | 46 | if __name__ == '__main__': 47 | # 创建收益矩阵 48 | A = np.array([[-1, 2], [4, 1]]) 49 | # 计算maxmin 50 | maxmin = maxmin_function(A) 51 | # 计算minmax 52 | minmax = minmax_function(A) 53 | # 输出结果 54 | print("maxmin =", maxmin) 55 | print("minmax =", minmax) 56 | -------------------------------------------------------------------------------- /codes/ch09/gmm_demo.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: gmm_demo.py 6 | @time: 2021/8/14 16:04 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题9.3 使用GaussianMixture求解两个分量高斯混合模型的6个参数 9 | """ 10 | 11 | from sklearn.mixture import GaussianMixture 12 | import numpy as np 13 | import matplotlib.pyplot as plt 14 | 15 | # 初始化观测数据 16 | data = np.array([-67, -48, 6, 8, 14, 16, 23, 24, 28, 29, 41, 49, 56, 60, 75]).reshape(-1, 1) 17 | 18 | # 设置n_components=2,表示两个分量高斯混合模型 19 | gmm_model = GaussianMixture(n_components=2) 20 | # 对模型进行参数估计 21 | gmm_model.fit(data) 22 | # 对数据进行聚类 23 | labels = gmm_model.predict(data) 24 | 25 | # 得到分类结果 26 | print("分类结果:labels = {}\n".format(labels)) 27 | print("两个分量高斯混合模型的6个参数如下:") 28 | # 得到参数u1,u2 29 | print("means =", gmm_model.means_.reshape(1, -1)) 30 | # 得到参数sigma1, sigma1 31 | print("covariances =", gmm_model.covariances_.reshape(1, -1)) 32 | # 得到参数a1, a2 33 | print("weights = ", gmm_model.weights_.reshape(1, -1)) 34 | 35 | # 绘制观测数据的聚类情况 36 | for i in range(0, len(labels)): 37 | if labels[i] == 0: 38 | plt.scatter(i, data.take(i), s=15, c='red') 39 | elif labels[i] == 1: 40 | plt.scatter(i, data.take(i), s=15, c='blue') 41 | plt.title('Gaussian Mixture Model') 42 | plt.xlabel('x') 43 | plt.ylabel('y') 44 | plt.show() 45 | -------------------------------------------------------------------------------- /codes/ch19/monte_carlo_method.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: monte_carlo_method_demo.py 6 | @time: 2022/7/18 10:52 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题19.1 蒙特卡洛法积分计算 9 | """ 10 | 11 | import numpy as np 12 | 13 | 14 | class MonteCarloIntegration: 15 | def __init__(self, func_f, func_p): 16 | # 所求期望的函数 17 | self.func_f = func_f 18 | # 抽样分布的概率密度函数 19 | self.func_p = func_p 20 | 21 | def solve(self, 
num_samples): 22 | """ 23 | 蒙特卡罗积分法 24 | :param num_samples: 抽样样本数量 25 | :return: 样本的函数均值 26 | """ 27 | samples = self.func_p(num_samples) 28 | vfunc_f = lambda x: self.func_f(x) 29 | vfunc_f = np.vectorize(vfunc_f) 30 | y = vfunc_f(samples) 31 | return np.sum(y) / num_samples 32 | 33 | 34 | if __name__ == '__main__': 35 | def func_f(x): 36 | """定义函数f""" 37 | return x ** 2 * np.sqrt(2 * np.pi) 38 | 39 | 40 | def func_p(n): 41 | """定义在分布上随机抽样的函数g""" 42 | return np.random.standard_normal(int(n)) 43 | 44 | 45 | # 设置样本数量 46 | num_samples = 1e6 47 | 48 | # 使用蒙特卡罗积分法进行求解 49 | monte_carlo_integration = MonteCarloIntegration(func_f, func_p) 50 | result = monte_carlo_integration.solve(num_samples) 51 | print("抽样样本数量:", num_samples) 52 | print("近似解:", result) 53 | -------------------------------------------------------------------------------- /codes/ch07/svm_demo.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: svm.py 6 | @time: 2021/8/12 22:51 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题7.2 根据题目中的数据训练模型,并在图中画出分离超平面、间隔边界及支持向量 9 | """ 10 | from sklearn.svm import SVC 11 | import matplotlib.pyplot as plt 12 | import numpy as np 13 | 14 | # 加载数据 15 | X = [[1, 2], [2, 3], [3, 3], [2, 1], [3, 2]] 16 | y = [1, 1, 1, -1, -1] 17 | 18 | # 训练SVM模型 19 | clf = SVC(kernel='linear', C=10000) 20 | clf.fit(X, y) 21 | 22 | print("w =", clf.coef_) 23 | print("b =", clf.intercept_) 24 | print("support vectors =", clf.support_vectors_) 25 | 26 | # 绘制数据点 27 | color_seq = ['red' if v == 1 else 'blue' for v in y] 28 | plt.scatter([i[0] for i in X], [i[1] for i in X], c=color_seq) 29 | # 得到x轴的所有点 30 | xaxis = np.linspace(0, 3.5) 31 | w = clf.coef_[0] 32 | # 计算斜率 33 | a = -w[0] / w[1] 34 | # 得到分离超平面 35 | y_sep = a * xaxis - (clf.intercept_[0]) / w[1] 36 | # 下边界超平面 37 | b = clf.support_vectors_[0] 38 | yy_down = a * xaxis + (b[1] - a * b[0]) 39 | # 上边界超平面 40 | b = clf.support_vectors_[-1] 41 | yy_up = a * xaxis + (b[1] - a * b[0]) 42 | # 绘制超平面 43 | plt.plot(xaxis, y_sep, 'k-') 44 | plt.plot(xaxis, yy_down, 'k--') 45 | plt.plot(xaxis, yy_up, 'k--') 46 | # 绘制支持向量 47 | plt.xlabel('$x^{(1)}$') 48 | plt.ylabel('$x^{(2)}$') 49 | plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], 50 | s=150, facecolors='none', edgecolors='k') 51 | plt.show() 52 | -------------------------------------------------------------------------------- /codes/ch15/my_svd.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: svd.py 6 | @time: 2022/7/15 10:06 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题15.1 自编程实现奇异值分解 9 | """ 10 | import numpy as np 11 | from scipy.linalg import null_space 12 | 13 | 14 | def my_svd(A): 15 | m = A.shape[0] 16 | 17 | # (1) 计算对称矩阵 A^T A 的特征值与特征向量, 18 | W = np.dot(A.T, A) 19 | # 返回的特征值lambda_value是升序的,特征向量V是单位化的特征向量 20 | lambda_value, V = np.linalg.eigh(W) 21 | # 并按特征值从大到小排列 22 | lambda_value = lambda_value[::-1] 23 | lambda_value = lambda_value[lambda_value > 0] 24 | # (2) 计算n阶正价矩阵V 25 | V = V[:, -1::-1] 26 | 27 | # (3) 求 m * n 对角矩阵 28 | sigma = np.sqrt(lambda_value) 29 | S = np.diag(sigma) @ np.eye(*A.shape) 30 | 31 | # (4.1) 求A的前r个正奇异值 32 | r = np.linalg.matrix_rank(A) 33 | U1 = np.hstack([(np.dot(A, V[:, i]) / sigma[i])[:, np.newaxis] for i in range(r)]) 34 | # (4.2) 求A^T的零空间的一组标准正交基 35 | U = U1 36 | if r < m: 
37 | U2 = null_space(A.T) 38 | # null_space(A.T)恰好返回所需的m-r个标准正交列向量,直接使用U2即可 39 | U = np.hstack([U, U2]) 40 | 41 | return U, S, V 42 | 43 | 44 | if __name__ == '__main__': 45 | A = np.array([[1, 2, 0], 46 | [2, 0, 2]]) 47 | 48 | np.set_printoptions(precision=2, suppress=True) 49 | 50 | U, S, V = my_svd(A) 51 | print("U=", U) 52 | print("S=", S) 53 | print("V=", V) 54 | calc = np.dot(np.dot(U, S), V.T) 55 | print("A=", calc) 56 | -------------------------------------------------------------------------------- /docs/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 机器学习方法习题解答 6 | 7 | 8 | 10 | 11 | 12 | 13 |
14 | 15 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | -------------------------------------------------------------------------------- /codes/ch05/decision_tree_demo.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: decision_tree_demo.py 6 | @time: 2021/8/5 15:11 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题5.1 调用sklearn的DecisionTreeClassifier类使用C4.5算法生成决策树 9 | """ 10 | 11 | from sklearn.tree import DecisionTreeClassifier 12 | from sklearn import preprocessing 13 | import numpy as np 14 | import pandas as pd 15 | 16 | from sklearn import tree 17 | import graphviz 18 | 19 | features_names = ["年龄", "有工作", "有自己的房子", "信贷情况"] 20 | X_train = pd.DataFrame([ 21 | ["青年", "否", "否", "一般"], 22 | ["青年", "否", "否", "好"], 23 | ["青年", "是", "否", "好"], 24 | ["青年", "是", "是", "一般"], 25 | ["青年", "否", "否", "一般"], 26 | ["中年", "否", "否", "一般"], 27 | ["中年", "否", "否", "好"], 28 | ["中年", "是", "是", "好"], 29 | ["中年", "否", "是", "非常好"], 30 | ["中年", "否", "是", "非常好"], 31 | ["老年", "否", "是", "非常好"], 32 | ["老年", "否", "是", "好"], 33 | ["老年", "是", "否", "好"], 34 | ["老年", "是", "否", "非常好"], 35 | ["老年", "否", "否", "一般"] 36 | ]) 37 | y_train = pd.DataFrame(["否", "否", "是", "是", "否", 38 | "否", "否", "是", "是", "是", 39 | "是", "是", "是", "是", "否"]) 40 | class_names = [str(k) for k in np.unique(y_train)] 41 | # 数据预处理 42 | le_x = preprocessing.LabelEncoder() 43 | le_x.fit(np.unique(X_train)) 44 | X_train = X_train.apply(le_x.transform) 45 | # 调用sklearn的DecisionTreeClassifier建立决策树模型 46 | model_tree = DecisionTreeClassifier() 47 | # 训练模型 48 | model_tree.fit(X_train, y_train) 49 | 50 | # 导出决策树的可视化文件,文件格式是dot 51 | dot_data = tree.export_graphviz(model_tree, out_file=None, 52 | feature_names=features_names, 53 | class_names=class_names, 54 | filled=True, rounded=True, 55 | special_characters=True) 56 | # 使用graphviz包,对决策树进行展示 57 | graph = graphviz.Source(dot_data) 58 | # 可使用view方法展示决策树 59 | # 中文乱码:需要对源码_export.py文件(文件路径:sklearn/tree/_export.py)修改,在文件第451行中将helvetica改成SimSun 60 | graph.view() 61 | # 打印决策树 62 | tree_text = tree.export_text(model_tree, feature_names=features_names) 63 | print(tree_text) 64 | -------------------------------------------------------------------------------- /codes/ch11/crf_matrix.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: crf_matrix.py 6 | @time: 2021/8/19 2:15 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题11.4 使用条件随机场矩阵形式,计算所有路径状态序列的概率及概率最大的状态序列 9 | """ 10 | 11 | import numpy as np 12 | 13 | 14 | class CRFMatrix: 15 | def __init__(self, M, start, stop): 16 | # 随机矩阵 17 | self.M = M 18 | # 19 | self.start = start 20 | self.stop = stop 21 | self.path_prob = None 22 | 23 | def _create_path(self): 24 | """按照图11.6的状态路径图,生成路径""" 25 | # 初始化start结点 26 | path = [self.start] 27 | for i in range(1, len(self.M)): 28 | paths = [] 29 | for _, r in enumerate(path): 30 | temp = np.transpose(r) 31 | # 添加状态结点1 32 | paths.append(np.append(temp, 1)) 33 | # 添加状态结点2 34 | paths.append(np.append(temp, 2)) 35 | path = paths.copy() 36 | 37 | # 添加stop结点 38 | path = [np.append(r, self.stop) for _, r in enumerate(path)] 39 | return path 40 | 41 | def fit(self): 42 | path = self._create_path() 43 | pr = [] 44 | for _, row in enumerate(path): 45 | p = 1 46 | for i in range(len(row) - 1): 47 | a = row[i] 48 | b = 
row[i + 1] 49 | # 根据公式11.24,计算条件概率 50 | p *= M[i][a - 1][b - 1] 51 | pr.append((row.tolist(), p)) 52 | # 按照概率从大到小排列 53 | pr = sorted(pr, key=lambda x: x[1], reverse=True) 54 | self.path_prob = pr 55 | 56 | def print(self): 57 | # 打印结果 58 | print("以start=%s为起点stop=%s为终点的所有路径的状态序列y的概率为:" % (self.start, self.stop)) 59 | for path, p in self.path_prob: 60 | print(" 路径为:" + "->".join([str(x) for x in path]), end=" ") 61 | print("概率为:" + str(p)) 62 | print("概率最大[" + str(self.path_prob[0][1]) + "]的状态序列为:", 63 | "->".join([str(x) for x in self.path_prob[0][0]])) 64 | 65 | 66 | if __name__ == '__main__': 67 | # 创建随机矩阵 68 | M1 = [[0, 0], [0.5, 0.5]] 69 | M2 = [[0.3, 0.7], [0.7, 0.3]] 70 | M3 = [[0.5, 0.5], [0.6, 0.4]] 71 | M4 = [[0, 1], [0, 1]] 72 | M = [M1, M2, M3, M4] 73 | # 构建条件随机场的矩阵模型 74 | crf = CRFMatrix(M=M, start=2, stop=2) 75 | # 得到所有路径的状态序列的概率 76 | crf.fit() 77 | # 打印结果 78 | crf.print() 79 | -------------------------------------------------------------------------------- /codes/ch03/k_neighbors_classifier.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: k_neighbors_classifier.py 6 | @time: 2021/8/3 14:58 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题3.1 k近邻算法关于k值的模型比较 9 | """ 10 | 11 | import numpy as np 12 | from sklearn.neighbors import KNeighborsClassifier 13 | import matplotlib.pyplot as plt 14 | from matplotlib.colors import ListedColormap 15 | 16 | data = np.array([[5, 12, 1], 17 | [6, 21, 0], 18 | [14, 5, 0], 19 | [16, 10, 0], 20 | [13, 19, 0], 21 | [13, 32, 1], 22 | [17, 27, 1], 23 | [18, 24, 1], 24 | [20, 20, 0], 25 | [23, 14, 1], 26 | [23, 25, 1], 27 | [23, 31, 1], 28 | [26, 8, 0], 29 | [30, 17, 1], 30 | [30, 26, 1], 31 | [34, 8, 0], 32 | [34, 19, 1], 33 | [37, 28, 1]]) 34 | # 得到特征向量 35 | X_train = data[:, 0:2] 36 | # 得到类别向量 37 | y_train = data[:, 2] 38 | 39 | # 分别构造k=1和k=2的k近邻模型 40 | models = (KNeighborsClassifier(n_neighbors=1, n_jobs=-1), 41 | KNeighborsClassifier(n_neighbors=2, n_jobs=-1)) 42 | # 模型训练 43 | models = (clf.fit(X_train, y_train) for clf in models) 44 | 45 | # 设置图形标题 46 | titles = ('K Neighbors with k=1', 47 | 'K Neighbors with k=2') 48 | 49 | # 设置图形的大小和图间距 50 | fig = plt.figure(figsize=(15, 5)) 51 | plt.subplots_adjust(wspace=0.4, hspace=0.4) 52 | 53 | # 分别获取第1个和第2个特征向量 54 | X0, X1 = X_train[:, 0], X_train[:, 1] 55 | 56 | # 得到坐标轴的最小值和最大值 57 | x_min, x_max = X0.min() - 1, X0.max() + 1 58 | y_min, y_max = X1.min() - 1, X1.max() + 1 59 | 60 | # 构造网格点坐标矩阵(https://blog.csdn.net/lllxxq141592654/article/details/81532855) 61 | # 设置0.2的目的是生成更多的网格点,数值越小,划分空间之间的分隔线越清晰 62 | xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.2), 63 | np.arange(y_min, y_max, 0.2)) 64 | 65 | for clf, title, ax in zip(models, titles, fig.subplots(1, 2).flatten()): 66 | # 对所有网格点进行预测 67 | Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]) 68 | Z = Z.reshape(xx.shape) 69 | # 设置颜色列表 70 | colors = ('red', 'green', 'lightgreen', 'gray', 'cyan') 71 | # 根据类别数生成颜色 72 | cmap = ListedColormap(colors[:len(np.unique(Z))]) 73 | # 绘制分隔线,contourf函数用于绘制等高线,alpha表示颜色的透明度,一般设置成0.5 74 | ax.contourf(xx, yy, Z, cmap=cmap, alpha=0.5) 75 | 76 | # 绘制样本点 77 | ax.scatter(X0, X1, c=y_train, s=50, edgecolors='k', cmap=cmap, alpha=0.5) 78 | 79 | # 计算预测准确率 80 | acc = clf.score(X_train, y_train) 81 | # 设置标题 82 | ax.set_title(title + ' (Accuracy: %d%%)' % (acc * 100)) 83 | 84 | plt.show() 85 | -------------------------------------------------------------------------------- 
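A remark on codes/ch03/k_neighbors_classifier.py above: with n_neighbors=1 the training accuracy is always 100%, because every training point is its own nearest neighbor, so the accuracy printed on the k=1 plot says nothing about generalization; only the k=2 model can disagree with its own training labels. A minimal illustration on made-up toy data (not the exercise's data) follows.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# toy two-class data, for illustration only
X = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

for k in (1, 2):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    # k=1 scores 1.0 by construction; k>=2 need not
    print(f"k={k}, training accuracy={clf.score(X, y):.2f}")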
/codes/ch09/three_coin_EM.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: three_coin_EM.py 6 | @time: 2021/8/14 0:56 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题9.1 三硬币模型的EM算法 9 | """ 10 | 11 | import math 12 | 13 | 14 | class ThreeCoinEM: 15 | def __init__(self, prob, tol=1e-6, max_iter=1000): 16 | """ 17 | 初始化模型参数 18 | :param prob: 模型参数的初值 19 | :param tol: 收敛阈值 20 | :param max_iter: 最大迭代次数 21 | """ 22 | self.prob_A, self.prob_B, self.prob_C = prob 23 | self.tol = tol 24 | self.max_iter = max_iter 25 | 26 | def calc_mu(self, j): 27 | """ 28 | (E步)计算mu 29 | :param j: 观测数据y的第j个 30 | :return: 在模型参数下观测数据yj来自掷硬币B的概率 31 | """ 32 | # 掷硬币A观测结果为正面 33 | pro_1 = self.prob_A * math.pow(self.prob_B, data[j]) * math.pow((1 - self.prob_B), 1 - data[j]) 34 | # 掷硬币A观测结果为反面 35 | pro_2 = (1 - self.prob_A) * math.pow(self.prob_C, data[j]) * math.pow((1 - self.prob_C), 1 - data[j]) 36 | return pro_1 / (pro_1 + pro_2) 37 | 38 | def fit(self, data): 39 | count = len(data) 40 | print("模型参数的初值:") 41 | print("prob_A={}, prob_B={}, prob_C={}".format(self.prob_A, self.prob_B, self.prob_C)) 42 | print("EM算法训练过程:") 43 | for i in range(self.max_iter): 44 | # (E步)得到在模型参数下观测数据yj来自掷硬币B的概率 45 | _mu = [self.calc_mu(j) for j in range(count)] 46 | # (M步)计算模型参数的新估计值 47 | prob_A = 1 / count * sum(_mu) 48 | prob_B = sum([_mu[k] * data[k] for k in range(count)]) \ 49 | / sum([_mu[k] for k in range(count)]) 50 | prob_C = sum([(1 - _mu[k]) * data[k] for k in range(count)]) \ 51 | / sum([(1 - _mu[k]) for k in range(count)]) 52 | print('第{}次:prob_A={:.4f}, prob_B={:.4f}, prob_C={:.4f}'.format(i + 1, prob_A, prob_B, prob_C)) 53 | # 计算误差值 54 | error = abs(self.prob_A - prob_A) + abs(self.prob_B - prob_B) + abs(self.prob_C - prob_C) 55 | self.prob_A = prob_A 56 | self.prob_B = prob_B 57 | self.prob_C = prob_C 58 | # 判断是否收敛 59 | if error < self.tol: 60 | print("模型参数的极大似然估计:") 61 | print("prob_A={:.4f}, prob_B={:.4f}, prob_C={:.4f}".format(self.prob_A, self.prob_B, 62 | self.prob_C)) 63 | break 64 | 65 | 66 | if __name__ == '__main__': 67 | # 加载数据 68 | data = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1] 69 | # 模型参数的初值 70 | init_prob = [0.46, 0.55, 0.67] 71 | 72 | # 三硬币模型的EM模型 73 | em = ThreeCoinEM(prob=init_prob, tol=1e-5, max_iter=100) 74 | # 模型训练 75 | em.fit(data) 76 | -------------------------------------------------------------------------------- /codes/ch18/em_plsa.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: em_plsa.py 6 | @time: 2022/7/13 18:48 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题18.3 基于生成模型的EM算法的概率潜在语义分析 9 | """ 10 | 11 | import numpy as np 12 | 13 | 14 | class EMPlsa: 15 | def __init__(self, max_iter=100, random_state=2022): 16 | """ 17 | 基于生成模型的EM算法的概率潜在语义模型 18 | :param max_iter: 最大迭代次数 19 | :param random_state: 随机种子 20 | """ 21 | self.max_iter = max_iter 22 | self.random_state = random_state 23 | 24 | def fit(self, X, K): 25 | """ 26 | :param X: 单词-文本矩阵 27 | :param K: 话题个数 28 | :return: P(w_i|z_k) 和 P(z_k|d_j) 29 | """ 30 | # M, N分别为单词个数和文本个数 31 | M, N = X.shape 32 | 33 | # 计算n(d_j) 34 | n_d = [np.sum(X[:, j]) for j in range(N)] 35 | 36 | # (1)设置参数P(w_i|z_k)和P(z_k|d_j)的初始值 37 | np.random.seed(self.random_state) 38 | p_wz = np.random.random((M, K)) 39 | p_zd = np.random.random((K, N)) 40 | 41 | # (2)迭代执行E步和M步,直至收敛为止 42 | for _ 
in range(self.max_iter): 43 | # E步 44 | P = np.zeros((M, N, K)) 45 | for i in range(M): 46 | for j in range(N): 47 | for k in range(K): 48 | P[i][j][k] = p_wz[i][k] * p_zd[k][j] 49 | P[i][j] /= np.sum(P[i][j]) 50 | 51 | # M步 52 | for k in range(K): 53 | for i in range(M): 54 | p_wz[i][k] = np.sum([X[i][j] * P[i][j][k] for j in range(N)]) 55 | p_wz[:, k] /= np.sum(p_wz[:, k]) 56 | 57 | for k in range(K): 58 | for j in range(N): 59 | p_zd[k][j] = np.sum([X[i][j] * P[i][j][k] for i in range(M)]) / n_d[j] 60 | 61 | return p_wz, p_zd 62 | 63 | 64 | if __name__ == "__main__": 65 | # 输入文本-单词矩阵,共有9个文本,11个单词 66 | X = np.array([[0, 0, 1, 1, 0, 0, 0, 0, 0], 67 | [0, 0, 0, 0, 0, 1, 0, 0, 1], 68 | [0, 1, 0, 0, 0, 0, 0, 1, 0], 69 | [0, 0, 0, 0, 0, 0, 1, 0, 1], 70 | [1, 0, 0, 0, 0, 1, 0, 0, 0], 71 | [1, 1, 1, 1, 1, 1, 1, 1, 1], 72 | [1, 0, 1, 0, 0, 0, 0, 0, 0], 73 | [0, 0, 0, 0, 0, 0, 1, 0, 1], 74 | [0, 0, 0, 0, 0, 2, 0, 0, 1], 75 | [1, 0, 1, 0, 0, 0, 0, 1, 0], 76 | [0, 0, 0, 1, 1, 0, 0, 0, 0]]) 77 | 78 | # 设置精度为3 79 | np.set_printoptions(precision=3, suppress=True) 80 | 81 | # 假设话题的个数是3个 82 | k = 3 83 | 84 | em_plsa = EMPlsa(max_iter=100) 85 | 86 | p_wz, p_zd = em_plsa.fit(X, 3) 87 | 88 | print("参数P(w_i|z_k):") 89 | print(p_wz) 90 | print("参数P(z_k|d_j):") 91 | print(p_zd) 92 | -------------------------------------------------------------------------------- /codes/ch17/divergence_nmf_lsa.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: divergence_nmf_lsa.py 6 | @time: 2022/7/11 16:50 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题17.2 损失函数是散度损失时的非负矩阵分解算法 9 | """ 10 | import numpy as np 11 | 12 | 13 | class DivergenceNmfLsa: 14 | def __init__(self, max_iter=1000, tol=1e-6, random_state=0): 15 | """ 16 | 损失函数是散度损失时的非负矩阵分解 17 | :param max_iter: 最大迭代次数 18 | :param tol: 容差 19 | :param random_state: 随机种子 20 | """ 21 | self.max_iter = max_iter 22 | self.tol = tol 23 | self.random_state = random_state 24 | np.random.seed(self.random_state) 25 | 26 | def _init_param(self, X, k): 27 | self.__m, self.__n = X.shape 28 | self.__W = np.random.random((self.__m, k)) 29 | self.__H = np.random.random((k, self.__n)) 30 | 31 | def _div_loss(self, X, W, H): 32 | Y = np.dot(W, H) 33 | loss = 0 34 | for i in range(self.__m): 35 | for j in range(self.__n): 36 | loss += (X[i][j] * np.log(X[i][j] / Y[i][j]) if X[i][j] * Y[i][j] > 0 else 0) - X[i][j] + Y[i][j] 37 | 38 | return loss 39 | 40 | def fit(self, X, k): 41 | """ 42 | :param X: 单词-文本矩阵 43 | :param k: 话题个数 44 | :return: 45 | """ 46 | # (1)初始化 47 | self._init_param(X, k) 48 | # (2.c)计算散度损失 49 | loss = self._div_loss(X, self.__W, self.__H) 50 | 51 | for _ in range(self.max_iter): 52 | # (2.a)更新W的元素 53 | WH = np.dot(self.__W, self.__H) 54 | for i in range(self.__m): 55 | for l in range(k): 56 | s1 = sum(self.__H[l][j] * X[i][j] / WH[i][j] for j in range(self.__n)) 57 | s2 = sum(self.__H[l][j] for j in range(self.__n)) 58 | self.__W[i][l] *= s1 / s2 59 | 60 | # (2.b)更新H的元素 61 | WH = np.dot(self.__W, self.__H) 62 | for l in range(k): 63 | for j in range(self.__n): 64 | s1 = sum(self.__W[i][l] * X[i][j] / WH[i][j] for i in range(self.__m)) 65 | s2 = sum(self.__W[i][l] for i in range(self.__m)) 66 | self.__H[l][j] *= s1 / s2 67 | 68 | new_loss = self._div_loss(X, self.__W, self.__H) 69 | if abs(new_loss - loss) < self.tol: 70 | break 71 | 72 | loss = new_loss 73 | 74 | return self.__W, self.__H 75 | 76 | 77 | if __name__ == 
'__main__': 78 | X = np.array([[2, 0, 0, 0], 79 | [0, 2, 0, 0], 80 | [0, 0, 1, 0], 81 | [0, 0, 2, 3], 82 | [0, 0, 0, 1], 83 | [1, 2, 2, 1]]) 84 | 85 | # 设置精度为2 86 | np.set_printoptions(precision=2, suppress=True) 87 | # 假设话题的个数是3个 88 | k = 3 89 | div_nmf = DivergenceNmfLsa(max_iter=1000, random_state=2022) 90 | W, H = div_nmf.fit(X, k) 91 | print("话题空间W:") 92 | print(W) 93 | print("文本在话题空间的表示H:") 94 | print(H) 95 | -------------------------------------------------------------------------------- /codes/ch10/hidden_markov_backward.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: hidden_markov_backward.py 6 | @time: 2021/8/16 21:17 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题10.1 隐马尔可夫模型的后向算法 9 | """ 10 | 11 | import numpy as np 12 | 13 | 14 | class HiddenMarkovBackward: 15 | def __init__(self, verbose=False): 16 | self.betas = None 17 | self.backward_P = None 18 | self.verbose = verbose 19 | 20 | def backward(self, Q, V, A, B, O, PI): 21 | """ 22 | 后向算法 23 | :param Q: 所有可能的状态集合 24 | :param V: 所有可能的观测集合 25 | :param A: 状态转移概率矩阵 26 | :param B: 观测概率矩阵 27 | :param O: 观测序列 28 | :param PI: 初始状态概率向量 29 | """ 30 | # 状态序列的大小 31 | N = len(Q) 32 | # 观测序列的大小 33 | M = len(O) 34 | # (1)初始化后向概率beta值,书中第201页公式(10.19) 35 | betas = np.ones((N, M)) 36 | if self.verbose: 37 | self.print_betas_T(N, M) 38 | 39 | # (2)对观测序列逆向遍历,M-2即为T-1 40 | if self.verbose: 41 | print("\n从时刻T-1到1观测序列的后向概率:") 42 | for t in range(M - 2, -1, -1): 43 | # 得到序列对应的索引 44 | index_of_o = V.index(O[t + 1]) 45 | # 遍历状态序列 46 | for i in range(N): 47 | # 书中第201页公式(10.20) 48 | betas[i][t] = np.dot(np.multiply(A[i], [b[index_of_o] for b in B]), 49 | [beta[t + 1] for beta in betas]) 50 | real_t = t + 1 51 | real_i = i + 1 52 | if self.verbose: 53 | self.print_betas_t(A, B, N, betas, i, index_of_o, real_i, real_t, t) 54 | 55 | # 取出第一个值索引,用于得到o1 56 | index_of_o = V.index(O[0]) 57 | self.betas = betas 58 | # 书中第201页公式(10.21) 59 | P = np.dot(np.multiply(PI, [b[index_of_o] for b in B]), 60 | [beta[0] for beta in betas]) 61 | self.backward_P = P 62 | self.print_P(B, N, P, PI, betas, index_of_o) 63 | 64 | @staticmethod 65 | def print_P(B, N, P, PI, betas, index_of_o): 66 | print("\n观测序列概率:") 67 | print("P(O|lambda) = ", end="") 68 | for i in range(N): 69 | print("%.1f * %.1f * %.5f + " 70 | % (PI[0][i], B[i][index_of_o], betas[i][0]), end="") 71 | print("0 = %f" % P) 72 | 73 | @staticmethod 74 | def print_betas_t(A, B, N, betas, i, index_of_o, real_i, real_t, t): 75 | print("beta%d(%d) = sum[a%dj * bj(o%d) * beta%d(j)] = (" 76 | % (real_t, real_i, real_i, real_t + 1, real_t + 1), end='') 77 | for j in range(N): 78 | print("%.2f * %.2f * %.2f + " 79 | % (A[i][j], B[j][index_of_o], betas[j][t + 1]), end='') 80 | print("0) = %.3f" % betas[i][t]) 81 | 82 | @staticmethod 83 | def print_betas_T(N, M): 84 | print("初始化后向概率:") 85 | for i in range(N): 86 | print('beta%d(%d) = 1' % (M, i + 1)) 87 | 88 | 89 | if __name__ == '__main__': 90 | Q = [1, 2, 3] 91 | V = ['红', '白'] 92 | A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]] 93 | B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]] 94 | O = ['红', '白', '红', '白'] 95 | PI = [[0.2, 0.4, 0.4]] 96 | 97 | hmm_backward = HiddenMarkovBackward() 98 | hmm_backward.backward(Q, V, A, B, O, PI) 99 | -------------------------------------------------------------------------------- /notebook/part04/notes/ch34.ipynb: 
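A cross-check for codes/ch10/hidden_markov_backward.py above: the forward algorithm must yield the same observation probability P(O|lambda) as the backward computation. A minimal sketch follows; the helper name is an illustrative addition (the repository's fuller treatment appears to be codes/ch10/hidden_markov_forward_backward.py).

import numpy as np

def hmm_forward_prob(A, B, PI, O, V):
    """Forward algorithm for P(O|lambda); should equal backward_P above (sketch)."""
    A, B = np.array(A), np.array(B)
    # initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = np.array(PI).flatten() * B[:, V.index(O[0])]
    for t in range(1, len(O)):
        # recursion: alpha_{t+1}(i) = [sum_j alpha_t(j) * a_ji] * b_i(o_{t+1})
        alpha = (alpha @ A) * B[:, V.index(O[t])]
    # termination: P(O|lambda) = sum_i alpha_T(i)
    return alpha.sum()

# With Q, V, A, B, O, PI as defined in the __main__ block above,
# hmm_forward_prob(A, B, PI, O, V) should match hmm_backward.backward_P.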
-------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "57b2e607924b7d11", 6 | "metadata": {}, 7 | "source": [ 8 | "# 第34章 强化学习简介" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "d0c3aeb2", 14 | "metadata": {}, 15 | "source": [ 16 | "## 习题34.1" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "cdd16750", 22 | "metadata": {}, 23 | "source": [ 24 | "  写出奖励、回报、价值(状态价值)的定义,比较三者之间的关系。" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "id": "a24401c0", 30 | "metadata": {}, 31 | "source": [ 32 | "**解答:** " 33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "id": "a26f336d", 38 | "metadata": {}, 39 | "source": [ 40 | "**解答思路:**" 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "id": "3abf0c86", 46 | "metadata": {}, 47 | "source": [ 48 | "**解答步骤:** " 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "id": "aaca44cd", 54 | "metadata": {}, 55 | "source": [ 56 | "## 习题34.2" 57 | ] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "id": "c394480f", 62 | "metadata": {}, 63 | "source": [ 64 | "  证明状态-动作的转移概率分布也具有马尔可夫性(34.3)。" 65 | ] 66 | }, 67 | { 68 | "cell_type": "markdown", 69 | "id": "9bab9ea7", 70 | "metadata": {}, 71 | "source": [ 72 | "**解答:** " 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "id": "14093b3c", 78 | "metadata": {}, 79 | "source": [ 80 | "**解答思路:**" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "id": "d6d30d87", 86 | "metadata": {}, 87 | "source": [ 88 | "**解答步骤:** " 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "id": "e39fcb90", 94 | "metadata": {}, 95 | "source": [ 96 | "## 习题34.3" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "id": "35a2a2ee", 102 | "metadata": {}, 103 | "source": [ 104 | "  证明两种价值函数之间的关系(34.8)。" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "id": "eb3cbb2c", 110 | "metadata": {}, 111 | "source": [ 112 | "**解答:** " 113 | ] 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "id": "f266fc48", 118 | "metadata": {}, 119 | "source": [ 120 | "**解答思路:**" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "id": "42450a35", 126 | "metadata": {}, 127 | "source": [ 128 | "**解答步骤:** " 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": null, 134 | "id": "f8562f85", 135 | "metadata": {}, 136 | "outputs": [], 137 | "source": [] 138 | } 139 | ], 140 | "metadata": { 141 | "kernelspec": { 142 | "display_name": "Python 3 (ipykernel)", 143 | "language": "python", 144 | "name": "python3" 145 | }, 146 | "language_info": { 147 | "codemirror_mode": { 148 | "name": "ipython", 149 | "version": 3 150 | }, 151 | "file_extension": ".py", 152 | "mimetype": "text/x-python", 153 | "name": "python", 154 | "nbconvert_exporter": "python", 155 | "pygments_lexer": "ipython3", 156 | "version": "3.10.5" 157 | } 158 | }, 159 | "nbformat": 4, 160 | "nbformat_minor": 5 161 | } 162 | -------------------------------------------------------------------------------- /notebook/part04/notes/ch38.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "cc7f09af7eaa7e62", 6 | "metadata": {}, 7 | "source": [ 8 | "# 第38章 深度Q网络" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "fea08308", 14 | "metadata": {}, 15 | "source": [ 16 | "## 习题38.1" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "141abbbc", 22 | "metadata": {}, 23 | "source": [ 24 | "  
写出函数近似的SARSA算法。" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "id": "a24401c0", 30 | "metadata": {}, 31 | "source": [ 32 | "**解答:** " 33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "id": "a26f336d", 38 | "metadata": {}, 39 | "source": [ 40 | "**解答思路:**" 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "id": "3abf0c86", 46 | "metadata": {}, 47 | "source": [ 48 | "**解答步骤:** " 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "id": "52bcbab9", 54 | "metadata": {}, 55 | "source": [ 56 | "## 习题38.2" 57 | ] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "id": "2a2bc70e", 62 | "metadata": {}, 63 | "source": [ 64 | "  假设使用函数近似的价值函数出现了过拟合或欠拟合现象,应该如何调整训练方法?" 65 | ] 66 | }, 67 | { 68 | "cell_type": "markdown", 69 | "id": "f7c31f8a", 70 | "metadata": {}, 71 | "source": [ 72 | "**解答:** " 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "id": "34d157ad", 78 | "metadata": {}, 79 | "source": [ 80 | "**解答思路:**" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "id": "6a2440f4", 86 | "metadata": {}, 87 | "source": [ 88 | "**解答步骤:** " 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "id": "1e194422", 94 | "metadata": {}, 95 | "source": [ 96 | "## 习题38.3" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "id": "9e00f0b9", 102 | "metadata": {}, 103 | "source": [ 104 | "  比较DQN与传统Q学习算法的异同点,并分析DQN在处理大规模问题上的优势。" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "id": "9aff71d4", 110 | "metadata": {}, 111 | "source": [ 112 | "**解答:** " 113 | ] 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "id": "c16037fa", 118 | "metadata": {}, 119 | "source": [ 120 | "**解答思路:**" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "id": "5dc1f464", 126 | "metadata": {}, 127 | "source": [ 128 | "**解答步骤:** " 129 | ] 130 | }, 131 | { 132 | "cell_type": "code", 133 | "execution_count": null, 134 | "id": "7a104b7eac663d0c", 135 | "metadata": {}, 136 | "outputs": [], 137 | "source": [] 138 | } 139 | ], 140 | "metadata": { 141 | "kernelspec": { 142 | "display_name": "Python 3 (ipykernel)", 143 | "language": "python", 144 | "name": "python3" 145 | }, 146 | "language_info": { 147 | "codemirror_mode": { 148 | "name": "ipython", 149 | "version": 3 150 | }, 151 | "file_extension": ".py", 152 | "mimetype": "text/x-python", 153 | "name": "python", 154 | "nbconvert_exporter": "python", 155 | "pygments_lexer": "ipython3", 156 | "version": "3.10.5" 157 | } 158 | }, 159 | "nbformat": 4, 160 | "nbformat_minor": 5 161 | } 162 | -------------------------------------------------------------------------------- /codes/ch27/auto_encoder.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: auto_encoder.py 6 | @time: 2023/3/14 14:13 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题27.3 2层卷积神经网络编码器和2层卷积神经网络解码器组成的自动编码器 9 | """ 10 | import torch 11 | import torch.nn as nn 12 | import torchvision.transforms as transforms 13 | import tqdm 14 | from matplotlib import pyplot as plt 15 | from torch import optim 16 | from torch.utils.data import DataLoader 17 | from torchvision.datasets import mnist 18 | from torchviz import make_dot 19 | 20 | 21 | class AutoEncoder(nn.Module): 22 | def __init__(self): 23 | super(AutoEncoder, self).__init__() 24 | # 2层卷积神经网络编码器 25 | self.encoder = nn.Sequential( 26 | nn.Conv2d(1, 16, 3, 2, 1), 27 | nn.ReLU(), 28 | nn.Conv2d(16, 32, 3, 2, 1), 29 | nn.ReLU() 30 | ) 31 | # 2层卷积神经网络解码器 32 | 
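# (编注:以下为补充说明,假设输入为MNIST的28x28单通道灰度图)
# 编码器中两次stride=2的卷积把特征图从28x28压缩到14x14再到7x7;
# 下面的解码器用两次stride=2、output_padding=1的转置卷积把7x7还原到14x14再到28x28,
# 使输出与输入尺寸一致,因此重构损失可直接用MSE逐像素计算。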
self.decoder = nn.Sequential( 33 | nn.ConvTranspose2d(32, 16, 3, 2, 1, output_padding=1), 34 | nn.ReLU(), 35 | 36 | nn.ConvTranspose2d(16, 1, 3, 2, 1, output_padding=1), 37 | nn.Sigmoid() 38 | ) 39 | 40 | def forward(self, x): 41 | x = self.encoder(x) 42 | x = self.decoder(x) 43 | return x 44 | 45 | 46 | def save_model_structure(model, device): 47 | x = torch.randn(1, 1, 28, 28).requires_grad_(True).to(device) 48 | y = model(x) 49 | vise = make_dot(y, params=dict(list(model.named_parameters()) + [('x', x)])) 50 | vise.format = "png" 51 | vise.directory = "./data" 52 | vise.view() 53 | 54 | 55 | if __name__ == '__main__': 56 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') 57 | 58 | # 使用MNIST数据集 59 | train_set = mnist.MNIST('./data', transform=transforms.ToTensor(), train=True, download=True) 60 | test_set = mnist.MNIST('./data', transform=transforms.ToTensor(), train=False, download=True) 61 | train_dataloader = DataLoader(train_set, batch_size=32, shuffle=True) 62 | test_dataloader = DataLoader(test_set, batch_size=8, shuffle=False) 63 | 64 | model = AutoEncoder().to(device) 65 | 66 | # 设置损失函数 67 | criterion = nn.MSELoss() 68 | # 设置优化器 69 | optimizer = optim.Adam(model.parameters(), lr=1e-2) 70 | 71 | # 模型训练 72 | EPOCHES = 10 73 | for epoch in range(EPOCHES): 74 | for img, _ in tqdm.tqdm(train_dataloader): 75 | optimizer.zero_grad() 76 | 77 | img = img.to(device) 78 | out = model(img) 79 | loss = criterion(out, img) 80 | loss.backward() 81 | 82 | optimizer.step() 83 | 84 | # 将生成图片和原始图片进行对比 85 | for i, data in enumerate(test_dataloader): 86 | img, _ = data 87 | img = img.to(device) 88 | model = model.to(device) 89 | img_new = model(img).detach().cpu().numpy() 90 | img = img.cpu().numpy() 91 | plt.figure(figsize=(8, 2)) 92 | for j in range(8): 93 | plt.subplot(2, 8, j + 1) 94 | plt.axis('off') 95 | plt.imshow(img_new[j].squeeze()) 96 | plt.subplot(2, 8, 8 + j + 1) 97 | plt.axis('off') 98 | plt.imshow(img[j].squeeze()) 99 | if i >= 2: 100 | break 101 | 102 | save_model_structure(model, device) 103 | -------------------------------------------------------------------------------- /codes/ch14/divisive_clustering.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: divisive_clustering.py 6 | @time: 2022/7/17 0:42 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题14.1 分裂聚类算法 9 | """ 10 | import numpy as np 11 | 12 | 13 | class DivisiveClustering: 14 | def __init__(self, num_class): 15 | # 聚类类别个数 16 | self.num_class = num_class 17 | # 聚类数据集 18 | self.cluster_data = [] 19 | if num_class > 1: 20 | self.cluster_data = [[] for _ in range(num_class)] 21 | 22 | def fit(self, data): 23 | """ 24 | :param data: 数据集 25 | """ 26 | num_sample = data.shape[0] 27 | 28 | if self.num_class == 1: 29 | # 如果只设定了一类,将所有数据放入到该类中 30 | for d in data: 31 | self.cluster_data.append(d) 32 | else: 33 | # (1) 构造1个类,该类包含全部样本 34 | # 初始化类中心 35 | class_center = [] 36 | 37 | # (2) 计算n个样本两两之间的欧氏距离 38 | distance = np.zeros((num_sample, num_sample)) 39 | for i in range(num_sample): 40 | for j in range(i + 1, num_sample): 41 | distance[j, i] = distance[i, j] = np.linalg.norm(data[i, :] - data[j, :], ord=2) 42 | 43 | # (3) 分裂距离最大的两个样本,并设置为各自的类中心 44 | index = np.where(np.max(distance) == distance) 45 | class_1 = index[1][0] 46 | class_2 = index[1][1] 47 | # 记录已经分裂完成的样本 48 | finished_data = [class_1, class_2] 49 | class_center.append(data[class_1, :]) 50 | 
class_center.append(data[class_2, :]) 51 | 52 | num_class_temp = 2 53 | # (5) 判断类的个数是否满足设定的样本类别数 54 | while num_class_temp != self.num_class: 55 | # (4.1) 计算未分裂的样本与目前各个类中心的距离 56 | data2class_distance = np.zeros((num_sample, 1)) 57 | for i in range(num_sample): 58 | # 计算样本到各类中心的距离总和 59 | data2class_sum = 0 60 | for j, center_data in enumerate(class_center): 61 | if i not in finished_data: 62 | data2class_sum += np.linalg.norm(data[i, :] - center_data) 63 | data2class_distance[i] = data2class_sum 64 | 65 | # (4.2) 分裂类间距离最大的样本作为新的类中心,构造一个新类 66 | class_new_index = np.argmax(data2class_distance) 67 | num_class_temp += 1 68 | finished_data.append(class_new_index) 69 | # 添加到类中心集合中 70 | class_center.append(data[class_new_index, :]) 71 | 72 | # 根据当前的类中心,按照最近邻的原则对整个数据集进行分类 73 | for i in range(num_sample): 74 | data2class_distance = [] 75 | for j, center_data in enumerate(class_center): 76 | # 计算每个样本到类中心的距离 77 | data2class_distance.append(np.linalg.norm(data[i, :] - center_data)) 78 | 79 | # 将样本划分到最近的中心的类中 80 | label = np.argmin(data2class_distance) 81 | self.cluster_data[label].append(data[i, :]) 82 | 83 | 84 | if __name__ == '__main__': 85 | # 使用书中例14.2的样本数据集 86 | dataset = np.array([[0, 2], 87 | [0, 0], 88 | [1, 0], 89 | [5, 0], 90 | [5, 2]]) 91 | 92 | num_class = 2 93 | divi_cluster = DivisiveClustering(num_class=num_class) 94 | divi_cluster.fit(dataset) 95 | print("分类数:", num_class) 96 | for i in range(num_class): 97 | print(f"class_{i}:", divi_cluster.cluster_data[i]) 98 | -------------------------------------------------------------------------------- /codes/ch10/hidden_markov_forward_backward.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: hidden_markov_forward_backward.py 6 | @time: 2021/8/16 22:08 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题10.2 隐马尔可夫模型的前向后向算法 9 | """ 10 | 11 | import numpy as np 12 | from hidden_markov_backward import HiddenMarkovBackward 13 | 14 | 15 | class HiddenMarkovForwardBackward(HiddenMarkovBackward): 16 | def __init__(self, verbose=False): 17 | super(HiddenMarkovForwardBackward, self).__init__() 18 | self.alphas = None 19 | self.forward_P = None 20 | self.verbose = verbose 21 | 22 | def forward(self, Q, V, A, B, O, PI): 23 | """ 24 | 前向算法 25 | :param Q: 所有可能的状态集合 26 | :param V: 所有可能的观测集合 27 | :param A: 状态转移概率矩阵 28 | :param B: 观测概率矩阵 29 | :param O: 观测序列 30 | :param PI: 初始状态概率向量 31 | """ 32 | # 状态序列的大小 33 | N = len(Q) 34 | # 观测序列的大小 35 | M = len(O) 36 | # 初始化前向概率alpha值 37 | alphas = np.zeros((N, M)) 38 | # 时刻数=观测序列数 39 | T = M 40 | # (2)对观测序列遍历,遍历每一个时刻,计算前向概率alpha值 41 | 42 | for t in range(T): 43 | if self.verbose: 44 | if t == 0: 45 | print("前向概率初值:") 46 | elif t == 1: 47 | print("\n从时刻2到T观测序列的前向概率:") 48 | # 得到序列对应的索引 49 | index_of_o = V.index(O[t]) 50 | # 遍历状态序列 51 | for i in range(N): 52 | if t == 0: 53 | # (1)初始化alpha初值,书中第198页公式(10.15) 54 | alphas[i][t] = PI[t][i] * B[i][index_of_o] 55 | if self.verbose: 56 | self.print_alpha_t1(alphas, i, t) 57 | else: 58 | # (2)递推,书中第198页公式(10.16) 59 | alphas[i][t] = np.dot([alpha[t - 1] for alpha in alphas], 60 | [a[i] for a in A]) * B[i][index_of_o] 61 | if self.verbose: 62 | self.print_alpha_t(alphas, i, t) 63 | # (3)终止,书中第198页公式(10.17) 64 | self.forward_P = np.sum([alpha[M - 1] for alpha in alphas]) 65 | self.alphas = alphas 66 | 67 | @staticmethod 68 | def print_alpha_t(alphas, i, t): 69 | print("alpha%d(%d) = [sum alpha%d(j) * aj%d] * b%d(o%d) = %f" 70 | % (t + 1, i + 1, t
, i + 1, i + 1, t + 1, alphas[i][t])) 71 | 72 | @staticmethod 73 | def print_alpha_t1(alphas, i, t): 74 | print('alpha1(%d) = pi%d * b%d(o1) = %f' 75 | % (i + 1, i + 1, i + 1, alphas[i][t])) 76 | 77 | def calc_t_qi_prob(self, t, qi): 78 | result = (self.alphas[qi - 1][t - 1] * self.betas[qi - 1][t - 1]) / self.backward_P[0] 79 | if self.verbose: 80 | print("计算P(i%d=q%d|O,lambda):" % (t, qi)) 81 | print("P(i%d=q%d|O,lambda) = alpha%d(%d) * beta%d(%d) / P(O|lambda) = %f" 82 | % (t, qi, t, qi, t, qi, result)) 83 | 84 | return result 85 | 86 | 87 | if __name__ == '__main__': 88 | Q = [1, 2, 3] 89 | V = ['红', '白'] 90 | A = [[0.5, 0.1, 0.4], [0.3, 0.5, 0.2], [0.2, 0.2, 0.6]] 91 | B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]] 92 | O = ['红', '白', '红', '红', '白', '红', '白', '白'] 93 | PI = [[0.2, 0.3, 0.5]] 94 | 95 | hmm_forward_backward = HiddenMarkovForwardBackward(verbose=True) 96 | hmm_forward_backward.forward(Q, V, A, B, O, PI) 97 | print() 98 | hmm_forward_backward.backward(Q, V, A, B, O, PI) 99 | print() 100 | hmm_forward_backward.calc_t_qi_prob(t=4, qi=3) 101 | -------------------------------------------------------------------------------- /codes/ch05/my_least_squares_regression_tree.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: my_least_squares_regression_tree.py 6 | @time: 2021/8/6 21:37 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题5.2 最小二乘回归树生成算法 9 | """ 10 | import json 11 | 12 | import numpy as np 13 | 14 | 15 | # 节点类 16 | class Node: 17 | def __init__(self, value, feature, left=None, right=None): 18 | self.value = value.tolist() 19 | self.feature = feature.tolist() 20 | self.left = left 21 | self.right = right 22 | 23 | def __repr__(self): 24 | return json.dumps(self, indent=3, default=lambda obj: obj.__dict__, ensure_ascii=False) 25 | 26 | 27 | class MyLeastSquareRegTree: 28 | def __init__(self, train_X, y, epsilon): 29 | # 训练集特征值 30 | self.x = train_X 31 | # 输出值 32 | self.y = y 33 | # 特征总数 34 | self.feature_count = train_X.shape[1] 35 | # 损失阈值 36 | self.epsilon = epsilon 37 | # 回归树 38 | self.tree = None 39 | 40 | def _fit(self, x, y, feature_count): 41 | # (1)选择最优切分点变量j与切分点s,得到选定的对(j,s),并解得c1,c2 42 | (j, s, minval, c1, c2) = self._divide(x, y, feature_count) 43 | # 初始化树 44 | tree = Node(feature=j, value=x[s, j], left=None, right=None) 45 | # 用选定的对(j,s)划分区域,并确定相应的输出值 46 | if minval < self.epsilon or len(y[np.where(x[:, j] <= x[s, j])]) <= 1: 47 | tree.left = c1 48 | else: 49 | # 对左子区域调用步骤(1)、(2) 50 | tree.left = self._fit(x[np.where(x[:, j] <= x[s, j])], 51 | y[np.where(x[:, j] <= x[s, j])], 52 | self.feature_count) 53 | if minval < self.epsilon or len(y[np.where(x[:, j] > x[s, j])]) <= 1: 54 | tree.right = c2 55 | else: 56 | # 对右子区域调用步骤(1)、(2) 57 | tree.right = self._fit(x[np.where(x[:, j] > x[s, j])], 58 | y[np.where(x[:, j] > x[s, j])], 59 | self.feature_count) 60 | return tree 61 | 62 | def fit(self): 63 | self.tree = self._fit(self.x, self.y, self.feature_count) 64 | return self 65 | 66 | @staticmethod 67 | def _divide(x, y, feature_count): 68 | # 初始化损失误差 69 | cost = np.zeros((feature_count, len(x))) 70 | # 公式5.21 71 | for i in range(feature_count): 72 | for k in range(len(x)): 73 | # k行i列的特征值 74 | value = x[k, i] 75 | y1 = y[np.where(x[:, i] <= value)] 76 | c1 = np.mean(y1) 77 | y2 = y[np.where(x[:, i] > value)] 78 | if len(y2) == 0: 79 | c2 = 0 80 | else: 81 | c2 = np.mean(y2) 82 | y1[:] = y1[:] - c1 83 | y2[:] = y2[:] - c2 84 | cost[i, k] = 
np.sum(y1 * y1) + np.sum(y2 * y2) 85 | # 选取最优损失误差点 86 | cost_index = np.where(cost == np.min(cost)) 87 | # 所选取的特征 88 | j = cost_index[0][0] 89 | # 选取特征的切分点 90 | s = cost_index[1][0] 91 | # 求两个区域的均值c1,c2 92 | c1 = np.mean(y[np.where(x[:, j] <= x[s, j])]) 93 | c2 = np.mean(y[np.where(x[:, j] > x[s, j])]) 94 | return j, s, cost[cost_index], c1, c2 95 | 96 | def __repr__(self): 97 | return str(self.tree) 98 | 99 | 100 | if __name__ == '__main__': 101 | train_X = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]).T 102 | y = np.array([4.50, 4.75, 4.91, 5.34, 5.80, 7.05, 7.90, 8.23, 8.70, 9.00]) 103 | 104 | model_tree = MyLeastSquareRegTree(train_X, y, epsilon=0.2) 105 | model_tree.fit() 106 | print(model_tree) 107 | -------------------------------------------------------------------------------- /codes/ch23/feedforward_nn_backpropagation.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: feedforward_nn_backpropagation.py 6 | @time: 2023/3/17 18:32 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题23.3 自编程实现前馈神经网络的反向传播算法,使用MNIST数据构建手写数字识别网络 9 | """ 10 | 11 | import numpy as np 12 | from sklearn.datasets import fetch_openml 13 | from sklearn.model_selection import train_test_split 14 | from sklearn.preprocessing import LabelBinarizer 15 | 16 | from tqdm import tqdm 17 | 18 | np.random.seed(2023) 19 | 20 | 21 | class NeuralNetwork: 22 | def __init__(self, layers, alpha=0.1): 23 | # 网络层的神经元个数,其中第一层和第二层是隐层 24 | self.layers = layers 25 | # 学习率 26 | self.alpha = alpha 27 | # 权重 28 | self.weights = [] 29 | # 偏置 30 | self.biases = [] 31 | # 初始化权重和偏置 32 | for i in range(1, len(layers)): 33 | self.weights.append(np.random.randn(layers[i - 1], layers[i])) 34 | self.biases.append(np.random.randn(layers[i])) 35 | 36 | def sigmoid(self, x): 37 | return 1 / (1 + np.exp(-x)) 38 | 39 | def sigmoid_derivative(self, x): 40 | return x * (1 - x) 41 | 42 | def feedforward(self, inputs): 43 | """ 44 | (1)正向传播 45 | """ 46 | self.activations = [inputs] 47 | self.weighted_inputs = [] 48 | for i in range(len(self.weights)): 49 | weighted_input = np.dot(self.activations[-1], self.weights[i]) + self.biases[i] 50 | self.weighted_inputs.append(weighted_input) 51 | # 得到各层的输出h 52 | activation = self.sigmoid(weighted_input) 53 | self.activations.append(activation) 54 | 55 | return self.activations[-1] 56 | 57 | def backpropagate(self, expected): 58 | """ 59 | (2)反向传播 60 | """ 61 | # 计算各层的误差 62 | errors = [expected - self.activations[-1]] 63 | # 计算各层的梯度 64 | deltas = [errors[-1] * self.sigmoid_derivative(self.activations[-1])] 65 | 66 | for i in range(len(self.weights) - 1, 0, -1): 67 | error = deltas[-1].dot(self.weights[i].T) 68 | errors.append(error) 69 | delta = errors[-1] * self.sigmoid_derivative(self.activations[i]) 70 | deltas.append(delta) 71 | deltas.reverse() 72 | 73 | for i in range(len(self.weights)): 74 | # 更新参数 75 | self.weights[i] += self.alpha * np.array([self.activations[i]]).T.dot(np.array([deltas[i]])) 76 | self.biases[i] += self.alpha * np.sum(deltas[i], axis=0) 77 | 78 | def train(self, inputs, expected_outputs, epochs): 79 | for i in tqdm(range(epochs)): 80 | for j in range(len(inputs)): 81 | self.feedforward(inputs[j]) 82 | self.backpropagate(expected_outputs[j]) 83 | 84 | 85 | if __name__ == '__main__': 86 | # 加载MNIST手写数字数据集 87 | mnist = fetch_openml('mnist_784', parser='auto') 88 | X = mnist.data.astype('float32') / 255.0 89 | y = mnist.target.astype('int') 90 | 91 | 
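# (编注)fetch_openml('mnist_784')共返回70000个样本:X形状为(70000, 784),除以255.0后像素值归一化到[0, 1];
# 下面的LabelBinarizer将0~9的标签转为10维one-hot向量,以匹配输出层的10个神经元。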
# 将标签进行one-hot编码,再划分训练集和测试集 92 | lb = LabelBinarizer() 93 | y = lb.fit_transform(y) 94 | X = np.array(X) 95 | 96 | X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42) 97 | 98 | # 训练神经网络,两个隐层分别有100个和50个神经元 99 | nn = NeuralNetwork([784, 100, 50, 10], alpha=0.1) 100 | nn.train(X_train, y_train, epochs=10) 101 | 102 | # 使用测试集对模型进行评估 103 | correct = 0 104 | 105 | for i in range(len(X_test)): 106 | output = nn.feedforward(X_test[i]) 107 | prediction = np.argmax(output) 108 | actual = np.argmax(y_test[i]) 109 | if prediction == actual: 110 | correct += 1 111 | 112 | accuracy = correct / len(X_test) * 100 113 | print("Accuracy: {:.2f} %".format(accuracy)) 114 | -------------------------------------------------------------------------------- /codes/ch02/perceptron.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: perceptron.py 6 | @time: 2021/8/2 21:51 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题2.2 构建从训练数据求解感知机模型的例子 9 | """ 10 | 11 | import numpy as np 12 | from matplotlib import pyplot as plt 13 | 14 | 15 | class Perceptron: 16 | def __init__(self, X, Y, lr=0.001, plot=True): 17 | """ 18 | 初始化感知机 19 | :param X: 特征向量 20 | :param Y: 类别 21 | :param lr: 学习率 22 | :param plot: 是否绘制图形 23 | """ 24 | self.X = X 25 | self.Y = Y 26 | self.lr = lr 27 | self.plot = plot 28 | if plot: 29 | self.__model_plot = self._ModelPlot(self.X, self.Y) 30 | self.__model_plot.open_in() 31 | 32 | def fit(self): 33 | # (1)初始化weight, b 34 | weight = np.zeros(self.X.shape[1]) 35 | b = 0 36 | # 训练次数 37 | train_counts = 0 38 | # 分类错误标识 39 | mistake_flag = True 40 | while mistake_flag: 41 | # 开始前,将mistake_flag设置为False,用于判断本次循环是否有分类错误 42 | mistake_flag = False 43 | # (2)从训练集中选取x,y 44 | for index in range(self.X.shape[0]): 45 | if self.plot: 46 | self.__model_plot.plot(weight, b, train_counts) 47 | # 计算y(w·x+b),由其符号判断是否误分类 48 | loss = self.Y[index] * (weight @ self.X[index] + b) 49 | # (3)如果该值不大于0,则该点是误分类点 50 | if loss <= 0: 51 | # 更新weight, b 52 | weight += self.lr * self.Y[index] * self.X[index] 53 | b += self.lr * self.Y[index] 54 | # 训练次数加1 55 | train_counts += 1 56 | print("Epoch {}, weight = {}, b = {}, formula: {}".format( 57 | train_counts, weight, b, self.__model_plot.formula(weight, b))) 58 | # 本次循环有误分类点(即分类错误),置为True 59 | mistake_flag = True 60 | break 61 | if self.plot: 62 | self.__model_plot.close() 63 | # (4)直至训练集中没有误分类点 64 | return weight, b 65 | 66 | class _ModelPlot: 67 | def __init__(self, X, Y): 68 | self.X = X 69 | self.Y = Y 70 | 71 | @staticmethod 72 | def open_in(): 73 | # 打开交互模式,用于展示动态交互图 74 | plt.ion() 75 | 76 | @staticmethod 77 | def close(): 78 | # 关闭交互模式,并显示最终的图形 79 | plt.ioff() 80 | plt.show() 81 | 82 | def plot(self, weight, b, epoch): 83 | plt.cla() 84 | # x轴表示x1 85 | plt.xlim(0, np.max(self.X.T[0]) + 1) 86 | # y轴表示x2 87 | plt.ylim(0, np.max(self.X.T[1]) + 1) 88 | # 画出散点图,并添加图示 89 | scatter = plt.scatter(self.X.T[0], self.X.T[1], c=self.Y) 90 | plt.legend(*scatter.legend_elements()) 91 | if True in list(weight == 0): 92 | plt.plot(0, 0) 93 | else: 94 | x1 = -b / weight[0] 95 | x2 = -b / weight[1] 96 | # 画出分离超平面 97 | plt.plot([x1, 0], [0, x2]) 98 | # 绘制公式 99 | text = self.formula(weight, b) 100 | plt.text(0.3, x2 - 0.1, text) 101 | plt.title('Epoch %d' % epoch) 102 | plt.pause(0.01) 103 | 104 | @staticmethod 105 | def formula(weight, b): 106 | text = 'x1 ' if weight[0] == 1 else '%d*x1 ' % weight[0] 107 | text += '+ x2 ' if weight[1] == 1 
else ( 108 | '+ %d*x2 ' % weight[1] if weight[1] > 0 else '- %d*x2 ' % -weight[1]) 109 | text += '= 0' if b == 0 else ('+ %d = 0' % b if b > 0 else '- %d = 0' % -b) 110 | return text 111 | 112 | 113 | if __name__ == '__main__': 114 | X = np.array([[3, 3], [4, 3], [1, 1]]) 115 | Y = np.array([1, 1, -1]) 116 | model = Perceptron(X, Y, lr=1) 117 | weight, b = model.fit() -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | allennlp==2.10.1 2 | anyio==3.6.2 3 | argon2-cffi==21.3.0 4 | argon2-cffi-bindings==21.2.0 5 | arrow==1.2.3 6 | asttokens==2.2.1 7 | attrs==22.2.0 8 | autopep8==2.0.1 9 | backcall==0.2.0 10 | base58==2.1.1 11 | beautifulsoup4==4.11.2 12 | bleach==6.0.0 13 | blis==0.7.9 14 | boto3==1.26.114 15 | botocore==1.29.114 16 | cached-path==1.1.6 17 | cachetools==5.3.0 18 | catalogue==2.0.8 19 | certifi==2022.12.7 20 | cffi==1.15.1 21 | charset-normalizer==3.0.1 22 | click==8.1.3 23 | colorama==0.4.6 24 | comm==0.1.2 25 | commonmark==0.9.1 26 | contourpy==1.0.7 27 | cycler==0.11.0 28 | cymem==2.0.7 29 | debugpy==1.6.6 30 | decorator==5.1.1 31 | defusedxml==0.7.1 32 | dill==0.3.6 33 | docker-pycreds==0.4.0 34 | exceptiongroup==1.1.1 35 | executing==1.2.0 36 | fairscale==0.4.6 37 | fastjsonschema==2.16.3 38 | filelock==3.7.1 39 | fonttools==4.38.0 40 | fqdn==1.5.1 41 | gitdb==4.0.10 42 | GitPython==3.1.31 43 | google-api-core==2.11.0 44 | google-auth==2.17.3 45 | google-cloud-core==2.3.2 46 | google-cloud-storage==2.8.0 47 | google-crc32c==1.5.0 48 | google-resumable-media==2.4.1 49 | googleapis-common-protos==1.59.0 50 | graphviz==0.20.1 51 | h5py==3.8.0 52 | huggingface-hub==0.10.1 53 | idna==3.4 54 | iniconfig==2.0.0 55 | ipykernel==6.21.2 56 | ipython==8.11.0 57 | ipython-genutils==0.2.0 58 | ipywidgets==8.0.4 59 | isoduration==20.11.0 60 | jedi==0.18.2 61 | Jinja2==3.1.2 62 | jmespath==1.0.1 63 | joblib==1.2.0 64 | jsonpointer==2.3 65 | jsonschema==4.17.3 66 | jupyter-contrib-core==0.4.2 67 | jupyter-events==0.6.3 68 | jupyter-nbextensions-configurator==0.6.1 69 | jupyter_client==8.0.3 70 | jupyter_core==5.2.0 71 | jupyter_server==2.3.0 72 | jupyter_server_terminals==0.4.4 73 | jupyterlab-pygments==0.2.2 74 | jupyterlab-widgets==3.0.5 75 | kiwisolver==1.4.4 76 | langcodes==3.3.0 77 | lmdb==1.4.0 78 | MarkupSafe==2.1.2 79 | matplotlib==3.7.0 80 | matplotlib-inline==0.1.6 81 | mistune==2.0.5 82 | more-itertools==9.1.0 83 | murmurhash==1.0.9 84 | nbclassic==0.5.2 85 | nbclient==0.7.2 86 | nbconvert==7.2.9 87 | nbformat==5.7.3 88 | nest-asyncio==1.5.6 89 | nltk==3.8.1 90 | notebook==6.5.3 91 | notebook_shim==0.2.2 92 | numpy==1.24.2 93 | packaging==23.0 94 | pandas==1.5.3 95 | pandocfilters==1.5.0 96 | parso==0.8.3 97 | pathtools==0.1.2 98 | pathy==0.10.1 99 | pickleshare==0.7.5 100 | Pillow==9.4.0 101 | platformdirs==3.0.0 102 | pluggy==1.0.0 103 | portalocker==2.7.0 104 | preshed==3.0.8 105 | prometheus-client==0.16.0 106 | promise==2.3 107 | prompt-toolkit==3.0.38 108 | protobuf==3.20.3 109 | psutil==5.9.4 110 | pure-eval==0.2.2 111 | pyasn1==0.4.8 112 | pyasn1-modules==0.2.8 113 | pycodestyle==2.10.0 114 | pycparser==2.21 115 | pydantic==1.8.2 116 | Pygments==2.14.0 117 | pyparsing==3.0.9 118 | pyrsistent==0.19.3 119 | pytest==7.2.2 120 | python-dateutil==2.8.2 121 | python-json-logger==2.0.7 122 | pytz==2022.7.1 123 | pywin32==305 124 | pywinpty==2.0.10 125 | PyYAML==6.0 126 | pyzmq==25.0.0 127 | regex==2023.3.23 128 | requests==2.28.2 129 | 
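# (编注)下方的torch、torchaudio、torchvision固定为+cu116本地构建版本,PyPI上没有对应的wheel,
# 安装时需附加PyTorch官方源,例如:pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu116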
rfc3339-validator==0.1.4 130 | rfc3986-validator==0.1.1 131 | rich==12.6.0 132 | rsa==4.9 133 | s3transfer==0.6.0 134 | sacremoses==0.0.53 135 | scikit-learn==1.2.1 136 | scipy==1.10.1 137 | Send2Trash==1.8.0 138 | sentencepiece==0.1.97 139 | sentry-sdk==1.19.1 140 | setproctitle==1.3.2 141 | shortuuid==1.0.11 142 | six==1.16.0 143 | smart-open==6.3.0 144 | smmap==5.0.0 145 | sniffio==1.3.0 146 | soupsieve==2.4 147 | spacy==3.3.2 148 | spacy-legacy==3.0.12 149 | spacy-loggers==1.0.4 150 | srsly==2.4.6 151 | stack-data==0.6.2 152 | tensorboardX==2.6 153 | termcolor==1.1.0 154 | terminado==0.17.1 155 | thinc==8.0.17 156 | threadpoolctl==3.1.0 157 | tinycss2==1.2.1 158 | tokenizers==0.12.1 159 | tomli==2.0.1 160 | torch==1.12.1+cu116 161 | torchaudio==0.12.1+cu116 162 | torchdata==0.4.1 163 | torchtext==0.13.1 164 | torchvision==0.13.1+cu116 165 | torchviz==0.0.2 166 | tornado==6.2 167 | tqdm==4.65.0 168 | traitlets==5.9.0 169 | transformers==4.20.1 170 | typer==0.4.2 171 | typing_extensions==4.5.0 172 | uri-template==1.2.0 173 | urllib3==1.26.14 174 | wandb==0.12.21 175 | wasabi==0.10.1 176 | wcwidth==0.2.6 177 | webcolors==1.12 178 | webencodings==0.5.1 179 | websocket-client==1.5.1 180 | wget==3.2 181 | widgetsnbextension==4.0.5 182 | -------------------------------------------------------------------------------- /codes/ch19/metropolis_hastings.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: metropolis_hastings.py 6 | @time: 2022/7/18 16:05 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题19.7 使用Metropolis-Hastings算法求后验概率分布的均值和方差 9 | """ 10 | import matplotlib.pyplot as plt 11 | import numpy as np 12 | from scipy.stats import beta, binom 13 | 14 | 15 | class MetropolisHastings: 16 | def __init__(self, proposal_dist, accepted_dist, m=1e4, n=1e5): 17 | """ 18 | Metropolis Hastings 算法 19 | 20 | :param proposal_dist: 建议分布 21 | :param accepted_dist: 接受分布 22 | :param m: 收敛步数 23 | :param n: 迭代步数 24 | """ 25 | self.proposal_dist = proposal_dist 26 | self.accepted_dist = accepted_dist 27 | self.m = int(m) 28 | self.n = int(n) 29 | 30 | @staticmethod 31 | def __calc_acceptance_ratio(q, p, x, x_prime): 32 | """ 33 | 计算接受概率 34 | 35 | :param q: 建议分布 36 | :param p: 接受分布 37 | :param x: 上一状态 38 | :param x_prime: 候选状态 39 | """ 40 | prob_1 = p.prob(x_prime) * q.joint_prob(x_prime, x) 41 | prob_2 = p.prob(x) * q.joint_prob(x, x_prime) 42 | alpha = np.min((1., prob_1 / prob_2)) 43 | return alpha 44 | 45 | def solve(self): 46 | """ 47 | Metropolis Hastings 算法求解 48 | """ 49 | all_samples = np.zeros(self.n) 50 | # (1) 任意选择一个初始值 51 | x_0 = np.random.random() 52 | # (2) 循环执行 53 | for i in range(self.n): 54 | x = x_0 if i == 0 else all_samples[i - 1] 55 | # (2.a) 从建议分布中抽样选取 56 | x_prime = self.proposal_dist.sample() 57 | # (2.b) 计算接受概率 58 | alpha = self.__calc_acceptance_ratio(self.proposal_dist, self.accepted_dist, x, x_prime) 59 | # (2.c) 从区间 (0,1) 中按均匀分布随机抽取一个数 u 60 | u = np.random.uniform(0, 1) 61 | # 根据 u <= alpha,选择 x 或 x_prime 进行赋值 62 | if u <= alpha: 63 | all_samples[i] = x_prime 64 | else: 65 | all_samples[i] = x 66 | 67 | # (3) 去掉前m个收敛步数,得到随机样本集合 68 | samples = all_samples[self.m:] 69 | # 函数样本均值 70 | dist_mean = samples.mean() 71 | # 函数样本方差 72 | dist_var = samples.var() 73 | return samples, dist_mean, dist_var 74 | 75 | @staticmethod 76 | def visualize(samples, bins=50): 77 | """ 78 | 可视化展示 79 | :param samples: 抽取的随机样本集合 80 | :param bins: 频率直方图的分组个数 81 | """
82 | fig, ax = plt.subplots() 83 | ax.set_title('Metropolis Hastings') 84 | ax.hist(samples, bins, alpha=0.7, label='Samples Distribution') 85 | ax.set_xlim(0, 1) 86 | ax.legend() 87 | plt.show() 88 | 89 | 90 | class ProposalDistribution: 91 | """ 92 | 建议分布 93 | """ 94 | 95 | @staticmethod 96 | def sample(): 97 | """ 98 | 从建议分布中抽取一个样本 99 | """ 100 | # B(1,1) 101 | return beta.rvs(1, 1, size=1) 102 | 103 | @staticmethod 104 | def prob(x): 105 | """ 106 | P(X = x) 的概率 107 | """ 108 | return beta.pdf(x, 1, 1) 109 | 110 | def joint_prob(self, x_1, x_2): 111 | """ 112 | P(X = x_1, Y = x_2) 的联合概率 113 | """ 114 | return self.prob(x_1) * self.prob(x_2) 115 | 116 | 117 | class AcceptedDistribution: 118 | """ 119 | 接受分布 120 | """ 121 | 122 | @staticmethod 123 | def prob(x): 124 | """ 125 | P(X = x) 的概率 126 | """ 127 | # Bin(4, 10) 128 | return binom.pmf(4, 10, x) 129 | 130 | 131 | if __name__ == '__main__': 132 | # 收敛步数 133 | m = 1000 134 | # 迭代步数 135 | n = 10000 136 | 137 | # 建议分布 138 | proposal_dist = ProposalDistribution() 139 | # 接受分布 140 | accepted_dist = AcceptedDistribution() 141 | 142 | metropolis_hastings = MetropolisHastings(proposal_dist, accepted_dist, m, n) 143 | 144 | # 使用 Metropolis-Hastings 算法进行求解 145 | samples, dist_mean, dist_var = metropolis_hastings.solve() 146 | print("均值:", dist_mean) 147 | print("方差:", dist_var) 148 | 149 | # 对结果进行可视化 150 | metropolis_hastings.visualize(samples, bins=20) -------------------------------------------------------------------------------- /codes/summary/merge_docs.py: -------------------------------------------------------------------------------- 1 | import os 2 | import shutil 3 | from pathlib import Path 4 | 5 | PROJECT_ROOT = Path(__file__).parent.parent.parent.absolute() 6 | 7 | 8 | def gather_md(summary_dir): 9 | docs_dir = PROJECT_ROOT / "docs" 10 | # 汇总所有md文件内容 11 | summary_md = summary_dir / "summary.md" 12 | exclude_files = ["_sidebar.md", "README.md"] 13 | 14 | # 定义三份输出文件路径(对应三个章节范围) 15 | part1_file = summary_dir / "summary_part1.md" # 第01-11章 16 | part2_file = summary_dir / "summary_part2.md" # 第14-21章 17 | part3_file = summary_dir / "summary_part3.md" # 第23-28章 18 | 19 | # 打开三份文件并初始化(添加目录和分隔符) 20 | with open(part1_file, "w", encoding="utf-8") as part1_out, \ 21 | open(part2_file, "w", encoding="utf-8") as part2_out, \ 22 | open(part3_file, "w", encoding="utf-8") as part3_out: 23 | 24 | # 为每份文件添加独立目录和分隔符 25 | part1_out.write("[toc]\n\n---\n\n") 26 | part2_out.write("[toc]\n\n---\n\n") 27 | part3_out.write("[toc]\n\n---\n\n") 28 | 29 | # 遍历所有md文件 30 | for root, _, files in os.walk(docs_dir): 31 | for file in files: 32 | if file.endswith(".md") and file not in exclude_files: 33 | md_path = Path(root) / file 34 | 35 | try: 36 | rel_path = md_path.relative_to(docs_dir) 37 | chapter_dir = rel_path.parts[0] 38 | if not chapter_dir.startswith("chapter"): 39 | continue 40 | chapter_num = int(chapter_dir[len("chapter"):]) 41 | except (ValueError, IndexError): 42 | continue 43 | 44 | # 根据章节号判断归属文件 45 | target_out = None 46 | if 1 <= chapter_num <= 11: 47 | target_out = part1_out # 第01-11章 → part1 48 | elif 14 <= chapter_num <= 21: 49 | target_out = part2_out # 第14-21章 → part2 50 | elif 23 <= chapter_num <= 28: 51 | target_out = part3_out # 第23-28章 → part3 52 | else: 53 | continue # 不在目标范围内的章节跳过 54 | 55 | # 读取内容并写入对应文件 56 | with open(md_path, "r", encoding="utf-8") as infile: 57 | content = infile.read() 58 | content = content.replace("../images", "./images") 59 | target_out.write(content) 60 | target_out.write("\n\n") 61 | 62 | 63 | def 
gather_output_images(summary_dir): 64 | # 收集所有图片文件并复制到汇总目录 65 | image_files = [] 66 | docs_dir = PROJECT_ROOT / "docs" 67 | for root, _, files in os.walk(docs_dir): 68 | for file in files: 69 | if (file.startswith("output_") and 70 | (file.endswith(".png") or file.endswith(".svg"))): 71 | src_path = Path(root) / file 72 | # 处理可能的文件名冲突 73 | dest_filename = file 74 | counter = 1 75 | while (summary_dir / dest_filename).exists(): 76 | name, ext = os.path.splitext(file) 77 | dest_filename = f"{name}_{counter}{ext}" 78 | counter += 1 79 | 80 | dest_path = summary_dir / dest_filename 81 | shutil.copy2(src_path, dest_path) 82 | image_files.append((src_path, dest_path)) 83 | return image_files 84 | 85 | 86 | def copy_images(summary_dir): 87 | docs_images = PROJECT_ROOT / "docs" / "images" 88 | if docs_images.exists() and docs_images.is_dir(): 89 | target_images = summary_dir / "images" 90 | if target_images.exists(): 91 | shutil.rmtree(target_images) 92 | shutil.copytree(docs_images, target_images) 93 | 94 | 95 | def main(): 96 | # 创建汇总目录 97 | summary_dir = Path("summary_docs") 98 | summary_dir.mkdir(exist_ok=True) 99 | 100 | # 拷贝docs/images目录到summary_docs 101 | # copy_images(summary_dir) 102 | 103 | # gather_output_images(summary_dir) 104 | 105 | gather_md(summary_dir) 106 | 107 | print(f"汇总完成!结果保存在 {summary_dir} 目录中") 108 | 109 | 110 | if __name__ == "__main__": 111 | main() 112 | -------------------------------------------------------------------------------- /codes/ch19/gibbs_sampling.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: gibbs_sampling.py 6 | @time: 2022/7/18 17:22 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题19.8 使用吉布斯抽样算法估计参数的均值和方差 9 | """ 10 | import matplotlib.pyplot as plt 11 | import numpy as np 12 | 13 | 14 | class GibbsSampling: 15 | def __init__(self, target_dist, j, m=1e4, n=1e5): 16 | """ 17 | Gibbs Sampling 算法 18 | 19 | :param target_dist: 目标分布 20 | :param j: 变量维度 21 | :param m: 收敛步数 22 | :param n: 迭代步数 23 | """ 24 | self.target_dist = target_dist 25 | self.j = j 26 | self.m = int(m) 27 | self.n = int(n) 28 | 29 | def solve(self): 30 | """ 31 | Gibbs Sampling 算法求解 32 | """ 33 | # (1) 初始化 34 | all_samples = np.zeros((self.n, self.j)) 35 | # 任意选择一个初始值 36 | x_0 = np.random.random(self.j) 37 | # (2) 循环执行 38 | for i in range(self.n): 39 | x = x_0 if i == 0 else all_samples[i - 1] 40 | # 满条件分布抽取 41 | for k in range(self.j): 42 | x[k] = self.target_dist.sample(x, k) 43 | all_samples[i] = x 44 | # (3) 去掉前m个收敛步数,得到样本集合 45 | samples = all_samples[self.m:] 46 | # (4) 计算样本均值和方差 47 | dist_mean = samples.mean(0) 48 | dist_var = samples.var(0) 49 | return samples, dist_mean, dist_var 50 | 51 | @staticmethod 52 | def visualize(samples, bins=50): 53 | """ 54 | 可视化展示 55 | :param samples: 抽取的随机样本集合 56 | :param bins: 频率直方图的分组个数 57 | """ 58 | fig, ax = plt.subplots() 59 | ax.set_title('Gibbs Sampling') 60 | ax.hist(samples[:, 0], bins, alpha=0.7, label='$\\theta$') 61 | ax.hist(samples[:, 1], bins, alpha=0.7, label='$\\eta$') 62 | ax.set_xlim(0, 1) 63 | ax.legend() 64 | plt.show() 65 | 66 | 67 | class TargetDistribution: 68 | """ 69 | 目标概率分布 70 | """ 71 | 72 | def __init__(self): 73 | # 联合概率值过小,可对建议分布进行放缩 74 | self.c = self.__select_prob_scaler() 75 | 76 | def sample(self, x, k=0): 77 | """ 78 | 使用接受-拒绝方法从满条件分布中抽取新的分量 x_k 79 | """ 80 | theta, eta = x 81 | if k == 0: 82 | while True: 83 | new_theta = np.random.uniform(0, 1 - eta) 84 | alpha = 
np.random.uniform() 85 | if (alpha * self.c) < self.__prob([new_theta, eta]): 86 | return new_theta 87 | elif k == 1: 88 | while True: 89 | new_eta = np.random.uniform(0, 1 - theta) 90 | alpha = np.random.uniform() 91 | if (alpha * self.c) < self.__prob([theta, new_eta]): 92 | return new_eta 93 | 94 | def __select_prob_scaler(self): 95 | """ 96 | 选择合适的建议分布放缩尺度 97 | """ 98 | prob_list = [] 99 | step = 1e-3 100 | for theta in np.arange(step, 1, step): 101 | for eta in np.arange(step, 1 - theta + step, step): 102 | prob = self.__prob((theta, eta)) 103 | prob_list.append(prob) 104 | searched_max_prob = max(prob_list) 105 | upper_bound_prob = searched_max_prob * 10 106 | return upper_bound_prob 107 | 108 | @staticmethod 109 | def __prob(x): 110 | """ 111 | P(X = x) 的概率 112 | """ 113 | theta = x[0] 114 | eta = x[1] 115 | p1 = (theta / 4 + 1 / 8) ** 14 116 | p2 = theta / 4 117 | p3 = eta / 4 118 | p4 = (eta / 4 + 3 / 8) 119 | p5 = 1 / 2 * (1 - theta - eta) ** 5 120 | p = (p1 * p2 * p3 * p4 * p5) 121 | return p 122 | 123 | 124 | if __name__ == '__main__': 125 | # 收敛步数 126 | m = 1e3 127 | # 迭代步数 128 | n = 1e4 129 | 130 | # 目标分布 131 | target_dist = TargetDistribution() 132 | 133 | # 使用 Gibbs Sampling 算法进行求解 134 | gibbs_sampling = GibbsSampling(target_dist, 2, m, n) 135 | 136 | samples, dist_mean, dist_var = gibbs_sampling.solve() 137 | 138 | print(f'theta均值:{dist_mean[0]}, theta方差:{dist_var[0]}') 139 | print(f'eta均值:{dist_mean[1]}, eta方差:{dist_var[1]}') 140 | 141 | # 对结果进行可视化 142 | GibbsSampling.visualize(samples, bins=20) 143 | -------------------------------------------------------------------------------- /codes/ch26/lstm_seq2seq.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: lstm_seq2seq.py 6 | @time: 2023/3/17 18:37 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题26.1 4层LSTM组成的序列到序列的基本模型 9 | """ 10 | 11 | import torch 12 | from torch import nn 13 | import numpy as np 14 | 15 | 16 | class S2SEncoder(nn.Module): 17 | r"""由LSTM组成的序列到序列编码器。 18 | 19 | Args: 20 | inp_size: 嵌入层的输入维度 21 | embed_size: 嵌入层的输出维度 22 | num_hids: LSTM隐层向量维度 23 | num_layers: LSTM层数,本题目设置为4 24 | """ 25 | 26 | def __init__(self, inp_size, embed_size, num_hids, 27 | num_layers, dropout=0, **kwargs): 28 | super(S2SEncoder, self).__init__(**kwargs) 29 | 30 | self.embed = nn.Embedding(inp_size, embed_size) 31 | self.rnn = nn.LSTM(embed_size, num_hids, num_layers, 32 | dropout=dropout) 33 | 34 | def forward(self, inputs): 35 | # inputs.shape(): (seq_length, embed_size) 36 | inputs = self.embed(inputs) 37 | 38 | # output.shape(): (seq_length, num_hids) 39 | # states.shape(): (num_layers, num_hids) 40 | output, state = self.rnn(inputs) 41 | 42 | return output, state 43 | 44 | 45 | class S2SDecoder(nn.Module): 46 | r"""由LSTM组成的序列到序列解码器。 47 | 48 | Args: 49 | inp_size: 嵌入层的输入维度。 50 | embed_size: 嵌入层的输出维度。 51 | num_hids: LSTM 隐层向量维度。 52 | num_layers: LSTM 层数,本题目设置为4。 53 | """ 54 | 55 | def __init__(self, inp_size, embed_size, num_hids, 56 | num_layers, dropout=0, **kwargs): 57 | super(S2SDecoder, self).__init__(**kwargs) 58 | self.num_layers = num_layers 59 | self.embed = nn.Embedding(inp_size, embed_size) 60 | # 解码器 LSTM 的输入,由目标序列的嵌入向量和编码器的隐层向量拼接而成。 61 | self.rnn = nn.LSTM(embed_size + num_hids, num_hids, num_layers, 62 | dropout=dropout) 63 | 64 | self.linear = nn.Linear(num_hids, inp_size) 65 | 66 | def init_state(self, enc_outputs, *args): 67 | return enc_outputs[1][-1] 68 | 69 | def 
forward(self, inputs, state): 70 | # inputs: (seq_length, batch, embed_size) 71 | inputs = self.embed(inputs) 72 | 73 | # 广播 context,使其具有与 inputs 相同的长度 74 | # context: (seq_length, batch, num_hids) 75 | context = state[-1].repeat(inputs.shape[0], 1, 1) 76 | inputs = torch.cat((inputs, context), 2) 77 | # output: (seq_length, batch, num_hids) 78 | output, _ = self.rnn(inputs) 79 | 80 | output = self.linear(output) 81 | 82 | return output 83 | 84 | 85 | class EncoderDecoder(nn.Module): 86 | r"""基于 LSTM 的序列到序列模型。 87 | 88 | Args: 89 | encoder: 编码器。 90 | decoder: 解码器。 91 | """ 92 | 93 | def __init__(self, encoder, decoder, **kwargs): 94 | super(EncoderDecoder, self).__init__(**kwargs) 95 | self.encoder = encoder 96 | self.decoder = decoder 97 | 98 | def forward(self, enc_inp, dec_inp): 99 | enc_out = self.encoder(enc_inp) 100 | dec_state = self.decoder.init_state(enc_out) 101 | 102 | return self.decoder(dec_inp, dec_state) 103 | 104 | 105 | if __name__ == '__main__': 106 | # 搭建一个4层LSTM构成的序列到序列模型,进行前向计算 107 | inp_size, embed_size, num_hids, num_layers = 10, 8, 16, 4 108 | encoder = S2SEncoder(inp_size, embed_size, num_hids, num_layers) 109 | decoder = S2SDecoder(inp_size, embed_size, num_hids, num_layers) 110 | model = EncoderDecoder(encoder, decoder) 111 | 112 | enc_inp_seq = "I love you !" 113 | dec_inp_seq = "我 爱 你 !" 114 | enc_inp, dec_inp = [], [] 115 | 116 | # 自己构造的词典:将单词映射为词表索引(nn.Embedding的输入是索引,而非one-hot向量) 117 | word2idx = {"I": 0, 118 | "love": 1, 119 | "you": 2, 120 | "!": 3, 121 | "我": 4, 122 | "爱": 5, 123 | "你": 6, 124 | "!": 7} 125 | 126 | for word in enc_inp_seq.split(): 127 | enc_inp.append(word2idx[word]) 128 | 129 | enc_inp = torch.tensor(enc_inp).unsqueeze(1)  # 增加batch维,形状为(seq_length, 1) 130 | 131 | for word in dec_inp_seq.split(): 132 | dec_inp.append(word2idx[word]) 133 | 134 | dec_inp = torch.tensor(dec_inp).unsqueeze(1)  # (seq_length, 1) 135 | output = model(enc_inp, dec_inp) 136 | -------------------------------------------------------------------------------- /notebook/part04/notes/ch40.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "3c95a48bc526d8f3", 6 | "metadata": {}, 7 | "source": [ 8 | "# 第40章 近端策略优化PPO" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "6438df9d", 14 | "metadata": {}, 15 | "source": [ 16 | "## 习题40.1" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "5b03583b", 22 | "metadata": {}, 23 | "source": [ 24 | "  策略梯度算法REINFORCE、带基线的REINFORCE、演员-评论员、TRPO、PPO是一步步发展而来的,总结每一个算法对之前算法的主要改进点。" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "id": "a24401c0", 30 | "metadata": {}, 31 | "source": [ 32 | "**解答:** " 33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "id": "a26f336d", 38 | "metadata": {}, 39 | "source": [ 40 | "**解答思路:**" 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "id": "3abf0c86", 46 | "metadata": {}, 47 | "source": [ 48 | "**解答步骤:** " 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "id": "bcbdb752", 54 | "metadata": {}, 55 | "source": [ 56 | "## 习题40.2" 57 | ] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "id": "d850e4bc", 62 | "metadata": {}, 63 | "source": [ 64 | "  费舍尔信息矩阵的一般定义是\n", 65 | "$$\n", 66 | "F(\\theta) = \\mathcal{E}_{p_{\\theta}(x)} \\left[ \\nabla_{\\theta} \\log p_{\\theta}(x) (\\nabla_{\\theta} \\log p_{\\theta} (x))^T \\right]\n", 67 | "$$\n", 68 | "证明与引理40.2中的定义等价。" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "id": "79a2ab8e", 74 | "metadata": {}, 75 | 
"source": [ 76 | "**解答:** " 77 | ] 78 | }, 79 | { 80 | "cell_type": "markdown", 81 | "id": "98215627", 82 | "metadata": {}, 83 | "source": [ 84 | "**解答思路:**" 85 | ] 86 | }, 87 | { 88 | "cell_type": "markdown", 89 | "id": "ed67f6f0", 90 | "metadata": {}, 91 | "source": [ 92 | "**解答步骤:** " 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "id": "fa80866c", 98 | "metadata": {}, 99 | "source": [ 100 | "## 习题40.3" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "id": "e1360bc2", 106 | "metadata": {}, 107 | "source": [ 108 | "  写出PPO-Clip算法实现中的计算公式(40.16)\\~(40.17)的函数表,验证它与截断目标函数的等价性。" 109 | ] 110 | }, 111 | { 112 | "cell_type": "markdown", 113 | "id": "ee9b66ea", 114 | "metadata": {}, 115 | "source": [ 116 | "**解答:** " 117 | ] 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "id": "d891b876", 122 | "metadata": {}, 123 | "source": [ 124 | "**解答思路:**" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "id": "f37a4fa0", 130 | "metadata": {}, 131 | "source": [ 132 | "**解答步骤:** " 133 | ] 134 | }, 135 | { 136 | "cell_type": "markdown", 137 | "id": "73518130", 138 | "metadata": {}, 139 | "source": [ 140 | "## 习题40.4" 141 | ] 142 | }, 143 | { 144 | "cell_type": "markdown", 145 | "id": "67dedac3", 146 | "metadata": {}, 147 | "source": [ 148 | "  列出PPO和深度Q网络的不同点。" 149 | ] 150 | }, 151 | { 152 | "cell_type": "markdown", 153 | "id": "fe285d39", 154 | "metadata": {}, 155 | "source": [ 156 | "**解答:** " 157 | ] 158 | }, 159 | { 160 | "cell_type": "markdown", 161 | "id": "cc51edff", 162 | "metadata": {}, 163 | "source": [ 164 | "**解答思路:**" 165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "id": "11cb028d", 170 | "metadata": {}, 171 | "source": [ 172 | "**解答步骤:** " 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": null, 178 | "id": "340ace35040ef97", 179 | "metadata": {}, 180 | "outputs": [], 181 | "source": [] 182 | } 183 | ], 184 | "metadata": { 185 | "kernelspec": { 186 | "display_name": "Python 3 (ipykernel)", 187 | "language": "python", 188 | "name": "python3" 189 | }, 190 | "language_info": { 191 | "codemirror_mode": { 192 | "name": "ipython", 193 | "version": 3 194 | }, 195 | "file_extension": ".py", 196 | "mimetype": "text/x-python", 197 | "name": "python", 198 | "nbconvert_exporter": "python", 199 | "pygments_lexer": "ipython3", 200 | "version": "3.10.5" 201 | } 202 | }, 203 | "nbformat": 4, 204 | "nbformat_minor": 5 205 | } 206 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # ignore config 2 | ### JetBrains template 3 | # Covers JetBrains IDEs: IntelliJ, RubyMine, PhpStorm, AppCode, PyCharm, CLion, Android Studio and WebStorm 4 | # Reference: https://intellij-support.jetbrains.com/hc/en-us/articles/206544839 5 | 6 | # User-specific stuff 7 | .idea/**/workspace.xml 8 | .idea/**/tasks.xml 9 | .idea/**/usage.statistics.xml 10 | .idea/**/dictionaries 11 | .idea/**/shelf 12 | 13 | # Generated files 14 | .idea/**/contentModel.xml 15 | 16 | # Sensitive or high-churn files 17 | .idea/**/dataSources/ 18 | .idea/**/dataSources.ids 19 | .idea/**/dataSources.local.xml 20 | .idea/**/sqlDataSources.xml 21 | .idea/**/dynamic.xml 22 | .idea/**/uiDesigner.xml 23 | .idea/**/dbnavigator.xml 24 | 25 | # Gradle 26 | .idea/**/gradle.xml 27 | .idea/**/libraries 28 | 29 | # Gradle and Maven with auto-import 30 | # When using Gradle or Maven with auto-import, you should exclude module files, 31 | # since they will be 
recreated, and may cause churn. Uncomment if using 32 | # auto-import. 33 | # .idea/modules.xml 34 | # .idea/*.iml 35 | # .idea/modules 36 | # *.iml 37 | # *.ipr 38 | 39 | # CMake 40 | cmake-build-*/ 41 | 42 | # Mongo Explorer plugin 43 | .idea/**/mongoSettings.xml 44 | 45 | # File-based project format 46 | *.iws 47 | 48 | # IntelliJ 49 | out/ 50 | 51 | # mpeltonen/sbt-idea plugin 52 | .idea_modules/ 53 | 54 | # JIRA plugin 55 | atlassian-ide-plugin.xml 56 | 57 | # Cursive Clojure plugin 58 | .idea/replstate.xml 59 | 60 | # Crashlytics plugin (for Android Studio and IntelliJ) 61 | com_crashlytics_export_strings.xml 62 | crashlytics.properties 63 | crashlytics-build.properties 64 | fabric.properties 65 | 66 | # Editor-based Rest Client 67 | .idea/httpRequests 68 | 69 | # Android studio 3.1+ serialized cache file 70 | .idea/caches/build_file_checksums.ser 71 | 72 | ### JupyterNotebooks template 73 | # gitignore template for Jupyter Notebooks 74 | # website: http://jupyter.org/ 75 | 76 | .ipynb_checkpoints 77 | */.ipynb_checkpoints/* 78 | 79 | # Remove previous ipynb_checkpoints 80 | # git rm -r .ipynb_checkpoints/ 81 | # 82 | 83 | ### Python template 84 | # Byte-compiled / optimized / DLL files 85 | __pycache__/ 86 | *.py[cod] 87 | *$py.class 88 | 89 | # C extensions 90 | *.so 91 | 92 | # Distribution / packaging 93 | .Python 94 | build/ 95 | develop-eggs/ 96 | dist/ 97 | downloads/ 98 | eggs/ 99 | .eggs/ 100 | lib/ 101 | lib64/ 102 | parts/ 103 | sdist/ 104 | var/ 105 | wheels/ 106 | pip-wheel-metadata/ 107 | share/python-wheels/ 108 | *.egg-info/ 109 | .installed.cfg 110 | *.egg 111 | MANIFEST 112 | 113 | # PyInstaller 114 | # Usually these files are written by a python script from a template 115 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 116 | *.manifest 117 | *.spec 118 | 119 | # Installer logs 120 | pip-log.txt 121 | pip-delete-this-directory.txt 122 | 123 | # Unit test / coverage reports 124 | htmlcov/ 125 | .tox/ 126 | .nox/ 127 | .coverage 128 | .coverage.* 129 | .cache 130 | nosetests.xml 131 | coverage.xml 132 | *.cover 133 | .hypothesis/ 134 | .pytest_cache/ 135 | 136 | # Translations 137 | *.mo 138 | *.pot 139 | 140 | # Django stuff: 141 | *.log 142 | local_settings.py 143 | db.sqlite3 144 | 145 | # Flask stuff: 146 | instance/ 147 | .webassets-cache 148 | 149 | # Scrapy stuff: 150 | .scrapy 151 | 152 | # Sphinx documentation 153 | docs/_build/ 154 | 155 | # PyBuilder 156 | target/ 157 | 158 | # Jupyter Notebook 159 | .ipynb_checkpoints 160 | 161 | # IPython 162 | profile_default/ 163 | ipython_config.py 164 | 165 | # pyenv 166 | .python-version 167 | 168 | # pipenv 169 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. 170 | # However, in case of collaboration, if having platform-specific dependencies or dependencies 171 | # having no cross-platform support, pipenv may install dependencies that don't work, or not 172 | # install all needed dependencies. 
173 | #Pipfile.lock 174 | 175 | # celery beat schedule file 176 | celerybeat-schedule 177 | 178 | # SageMath parsed files 179 | *.sage.py 180 | 181 | # Environments 182 | .env 183 | .venv 184 | env/ 185 | venv/ 186 | ENV/ 187 | env.bak/ 188 | venv.bak/ 189 | 190 | # Spyder project settings 191 | .spyderproject 192 | .spyproject 193 | 194 | # Rope project settings 195 | .ropeproject 196 | 197 | # mkdocs documentation 198 | /site 199 | 200 | # mypy 201 | .mypy_cache/ 202 | .dmypy.json 203 | dmypy.json 204 | 205 | # Pyre type checker 206 | .pyre/ 207 | .idea/ 208 | 209 | /codes/ch27/data/ 210 | /codes/ch24/data/ 211 | /notebook/part03/notes/data/ 212 | -------------------------------------------------------------------------------- /codes/ch09/my_gmm.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: my_gmm.py 6 | @time: 2021/8/14 2:46 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题9.3 自编程实现求两个分量的高斯混合模型的5个参数 9 | """ 10 | 11 | import numpy as np 12 | import itertools 13 | 14 | 15 | class MyGMM: 16 | def __init__(self, alphas_init, means_init, covariances_init, tol=1e-6, n_components=2, max_iter=50): 17 | # (1)设置参数的初始值 18 | # 分模型权重 19 | self.alpha_ = np.array(alphas_init, dtype="float16").reshape(n_components, 1) 20 | # 分模型均值 21 | self.mean_ = np.array(means_init, dtype="float16").reshape(n_components, 1) 22 | # 分模型标准差(方差的平方) 23 | self.covariances_ = np.array(covariances_init, dtype="float16").reshape(n_components, 1) 24 | # 迭代停止的阈值 25 | self.tol = tol 26 | # 高斯混合模型分量个数 27 | self.K = n_components 28 | # 最大迭代次数 29 | self.max_iter = max_iter 30 | # 观测数据 31 | self._y = None 32 | # 实际迭代次数 33 | self.n_iter_ = 0 34 | 35 | def gaussian(self, mean, convariances): 36 | """计算高斯分布概率密度""" 37 | return 1 / np.sqrt(2 * np.pi * convariances) * np.exp( 38 | -(self._y - mean) ** 2 / (2 * convariances)) 39 | 40 | def update_r(self, mean, convariances, alpha): 41 | """更新r_jk, 分模型k对观测数据yi的响应度""" 42 | r_jk = alpha * self.gaussian(mean, convariances) 43 | return r_jk / r_jk.sum(axis=0) 44 | 45 | def update_params(self, r): 46 | """更新mean, alpha, covariances每个分模型k的均值、权重、方差""" 47 | u = self.mean_[-1] 48 | _mean = ((r * self._y).sum(axis=1) / r.sum(axis=1)).reshape(self.K, 1) 49 | _covariances = ((r * (self._y - u) ** 2).sum(axis=1) / r.sum(axis=1)).reshape(self.K, 1) 50 | _alpha = (r.sum(axis=1) / self._y.size).reshape(self.K, 1) 51 | return _mean, _covariances, _alpha 52 | 53 | def judge_stop(self, mean, covariances, alpha): 54 | """中止条件判断""" 55 | a = np.linalg.norm(self.mean_ - mean) 56 | b = np.linalg.norm(self.covariances_ - covariances) 57 | c = np.linalg.norm(self.alpha_ - alpha) 58 | return True if np.sqrt(a ** 2 + b ** 2 + c ** 2) < self.tol else False 59 | 60 | def fit(self, y): 61 | self._y = np.copy(np.array(y)) 62 | """迭代训练获得预估参数""" 63 | # (2)E步:计算分模型k对观测数据yi的响应度 64 | # 更新r_jk, 分模型k对观测数据yi的响应度 65 | r = self.update_r(self.mean_, self.covariances_, self.alpha_) 66 | # 更新mean, alpha, covariances每个分模型k的均值、权重、方差的平方 67 | _mean, _covariances, _alpha = self.update_params(r) 68 | for i in range(self.max_iter): 69 | if not self.judge_stop(_mean, _covariances, _alpha): 70 | # (4)未达到阈值条件,重复迭代 71 | r = self.update_r(_mean, _covariances, _alpha) 72 | # (3)M步:计算新一轮迭代的模型参数 73 | _mean, _covariances, _alpha = self.update_params(r) 74 | else: 75 | # 达到阈值条件,停止迭代 76 | self.n_iter_ = i 77 | break 78 | 79 | self.mean_ = _mean 80 | self.covariances_ = _covariances 81 | self.alpha_ = 
_alpha 82 | 83 | def score(self): 84 | """计算该局部最优解的score,即似然函数值""" 85 | return (self.alpha_ * self.gaussian(self.mean_, self.covariances_)).sum() 86 | 87 | 88 | if __name__ == "__main__": 89 | # 观测数据 90 | y = np.array([-67, -48, 6, 8, 14, 16, 23, 24, 28, 29, 41, 49, 56, 60, 75]).reshape(1, 15) 91 | # 预估均值和方差,以其邻域划分寻优范围 92 | y_mean = y.mean() // 1 93 | y_std = (y.std() ** 2) // 1 94 | 95 | # 网格搜索,对不同的初值进行参数估计 96 | alpha = [[i, 1 - i] for i in np.linspace(0.1, 0.9, 9)] 97 | mean = [[y_mean + i, y_mean + j] for i in range(-10, 10, 5) for j in range(-10, 10, 5)] 98 | covariances = [[y_std + i, y_std + j] for i in range(-1000, 1000, 500) for j in range(-1000, 1000, 500)] 99 | results = [] 100 | for i in itertools.product(alpha, mean, covariances): 101 | init_alpha = i[0] 102 | init_mean = i[1] 103 | init_covariances = i[2] 104 | clf = MyGMM(alphas_init=init_alpha, means_init=init_mean, covariances_init=init_covariances, 105 | n_components=2, tol=1e-6) 106 | clf.fit(y) 107 | # 得到不同初值收敛的局部最优解 108 | results.append([clf.alpha_, clf.mean_, clf.covariances_, clf.score()]) 109 | # 根据score,从所有局部最优解找到相对最优解 110 | best_value = max(results, key=lambda x: x[3]) 111 | 112 | print("alpha : {}".format(best_value[0].T)) 113 | print("mean : {}".format(best_value[1].T)) 114 | print("std : {}".format(best_value[2].T)) 115 | -------------------------------------------------------------------------------- /codes/ch10/hidden_markov_viterbi.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: hidden_markov_viterbi.py 6 | @time: 2021/8/16 23:13 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题10.3 隐马尔可夫模型的维特比算法 9 | """ 10 | 11 | import numpy as np 12 | 13 | 14 | class HiddenMarkovViterbi: 15 | def __init__(self, verbose=False): 16 | self.verbose = verbose 17 | 18 | def viterbi(self, Q, V, A, B, O, PI): 19 | """ 20 | 维特比算法 21 | :param Q: 所有可能的状态集合 22 | :param V: 所有可能的观测集合 23 | :param A: 状态转移概率矩阵 24 | :param B: 观测概率矩阵 25 | :param O: 观测序列 26 | :param PI: 初始状态概率向量 27 | """ 28 | # 状态序列的大小 29 | N = len(Q) 30 | # 观测序列的大小 31 | M = len(O) 32 | # 初始化deltas 33 | deltas = np.zeros((N, M)) 34 | # 初始化psis 35 | psis = np.zeros((N, M)) 36 | 37 | # 初始化最优路径矩阵,该矩阵维度与观测序列维度相同 38 | I = np.zeros((1, M)) 39 | # (2)递推,遍历观测序列 40 | for t in range(M): 41 | if self.verbose: 42 | if t == 0: 43 | print("初始化Psi1和delta1:") 44 | elif t == 1: 45 | print("\n从时刻2到T的所有单个路径中概率最大值delta和概率最大的路径的第t-1个结点Psi:") 46 | 47 | # (2)递推从t=2开始 48 | real_t = t + 1 49 | # 得到序列对应的索引 50 | index_of_o = V.index(O[t]) 51 | for i in range(N): 52 | real_i = i + 1 53 | if t == 0: 54 | # (1)初始化 55 | deltas[i][t] = PI[0][i] * B[i][index_of_o] 56 | psis[i][t] = 0 57 | 58 | self.print_delta_t1(B, PI, deltas, i, index_of_o, real_i, t) 59 | self.print_psi_t1(real_i) 60 | else: 61 | # (2)递推,对t=2,3,...,T 62 | deltas[i][t] = np.max(np.multiply([delta[t - 1] for delta in deltas], 63 | [a[i] for a in A])) * B[i][index_of_o] 64 | self.print_delta_t(A, B, deltas, i, index_of_o, real_i, real_t, t) 65 | 66 | psis[i][t] = np.argmax(np.multiply([delta[t - 1] for delta in deltas], 67 | [a[i] for a in A])) 68 | self.print_psi_t(i, psis, real_i, real_t, t) 69 | 70 | last_deltas = [delta[M - 1] for delta in deltas] 71 | # (3)终止,得到所有路径的终结点最大的概率值 72 | P = np.max(last_deltas) 73 | # (3)得到最优路径的终结点 74 | I[0][M - 1] = np.argmax(last_deltas) 75 | if self.verbose: 76 | print("\n所有路径的终结点最大的概率值:") 77 | print("P = %f" % P) 78 | if self.verbose: 79 | 
print("\n最优路径的终结点:")
80 |             print("i%d = argmax[deltaT(i)] = %d" % (M, I[0][M - 1] + 1))
81 |             print("\n最优路径的其他结点:")
82 | 
83 |         # (4)递归由后向前得到其他结点
84 |         for t in range(M - 2, -1, -1):
85 |             I[0][t] = psis[int(I[0][t + 1])][t + 1]
86 |             if self.verbose:
87 |                 print("i%d = Psi%d(i%d) = %d" % (t + 1, t + 2, t + 2, I[0][t] + 1))
88 | 
89 |         # 输出最优路径
90 |         print("\n最优路径是:", "->".join([str(int(i + 1)) for i in I[0]]))
91 | 
92 |     def print_psi_t(self, i, psis, real_i, real_t, t):
93 |         if self.verbose:
94 |             print("Psi%d(%d) = argmax[delta%d(j) * aj%d] = %d"
95 |                   % (real_t, real_i, real_t - 1, real_i, psis[i][t]))
96 | 
97 |     def print_delta_t(self, A, B, deltas, i, index_of_o, real_i, real_t, t):
98 |         if self.verbose:
99 |             print("delta%d(%d) = max[delta%d(j) * aj%d] * b%d(o%d) = %.2f * %.2f = %.5f"
100 |                   % (real_t, real_i, real_t - 1, real_i, real_i, real_t,
101 |                      np.max(np.multiply([delta[t - 1] for delta in deltas],
102 |                                         [a[i] for a in A])),
103 |                      B[i][index_of_o], deltas[i][t]))
104 | 
105 |     def print_psi_t1(self, real_i):
106 |         if self.verbose:
107 |             print("Psi1(%d) = 0" % real_i)
108 | 
109 |     def print_delta_t1(self, B, PI, deltas, i, index_of_o, real_i, t):
110 |         if self.verbose:
111 |             print("delta1(%d) = pi%d * b%d(o1) = %.2f * %.2f = %.2f"
112 |                   % (real_i, real_i, real_i, PI[0][i], B[i][index_of_o], deltas[i][t]))
113 | 
114 | 
115 | if __name__ == '__main__':
116 |     Q = [1, 2, 3]
117 |     V = ['红', '白']
118 |     A = [[0.5, 0.2, 0.3], [0.3, 0.5, 0.2], [0.2, 0.3, 0.5]]
119 |     B = [[0.5, 0.5], [0.4, 0.6], [0.7, 0.3]]
120 |     O = ['红', '白', '红', '白']
121 |     PI = [[0.2, 0.4, 0.4]]
122 | 
123 |     HMM = HiddenMarkovViterbi(verbose=True)
124 |     HMM.viterbi(Q, V, A, B, O, PI)
125 | 
--------------------------------------------------------------------------------
/codes/ch06/my_logistic_regression.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # encoding: utf-8
3 | """
4 | @author: HuRuiFeng
5 | @file: my_logistic_regression.py
6 | @time: 2021/8/9 20:23
7 | @project: statistical-learning-method-solutions-manual
8 | @desc: 习题6.2 实现Logistic回归模型学习的梯度下降法
9 | """
10 | 
11 | import matplotlib.pyplot as plt
12 | import numpy as np
13 | from pylab import mpl
14 | from scipy.optimize import fminbound
15 | 
16 | # 图像显示中文
17 | mpl.rcParams['font.sans-serif'] = ['Microsoft YaHei']  # Windows字体;其他平台可替换为本地已安装的中文字体
18 | 
19 | 
20 | class MyLogisticRegression:
21 |     def __init__(self, max_iter=10000, distance=3, epsilon=1e-6):
22 |         """
23 |         逻辑斯谛回归
24 |         :param max_iter: 最大迭代次数
25 |         :param distance: 一维搜索的长度范围
26 |         :param epsilon: 迭代停止阈值
27 |         """
28 |         self.max_iter = max_iter
29 |         self.epsilon = epsilon
30 |         # 权重
31 |         self.w = None
32 |         self.distance = distance
33 |         self._X = None
34 |         self._y = None
35 | 
36 |     @staticmethod
37 |     def preprocessing(X):
38 |         """将原始X末尾加上一列,该列数值全部为1"""
39 |         row = X.shape[0]
40 |         y = np.ones(row).reshape(row, 1)
41 |         return np.hstack((X, y))
42 | 
43 |     @staticmethod
44 |     def sigmoid(x):
45 |         return 1 / (1 + np.exp(-x))
46 | 
47 |     def grad(self, w):
48 |         z = np.dot(self._X, w.T)
49 |         grad = self._X * (self._y - self.sigmoid(z))
50 |         grad = grad.sum(axis=0)
51 |         return grad
52 | 
53 |     def likelihood_func(self, w):
54 |         z = np.dot(self._X, w.T)
55 |         f = self._y * z - np.log(1 + np.exp(z))
56 |         return np.sum(f)
57 | 
58 |     def fit(self, data_x, data_y):
59 |         self._X = self.preprocessing(data_x)
60 |         self._y = data_y.T
61 |         # (1)取初始化w
62 |         w = np.array([[0] * self._X.shape[1]], dtype=float)  # 注:np.float已从新版NumPy中移除,这里使用内置float
63 |         k = 0
64 |         # (2)计算f(w)
65 |         fw = self.likelihood_func(w)
66 |         for _ in range(self.max_iter):
67 |             # 计算梯度g(w)
68 |             grad = self.grad(w)
69 |             # (3)当梯度g(w)的模长小于精度时,停止迭代
70 |             if (np.linalg.norm(grad, axis=0, keepdims=True) < self.epsilon).all():
71 |                 self.w = w
72 |                 break
73 | 
74 |             # 梯度方向的一维函数
75 |             def f(x):
76 |                 z = w - np.dot(x, grad)
77 |                 return -self.likelihood_func(z)
78 | 
79 |             # (3)进行一维搜索,找到使得函数最大的lambda
80 |             _lambda = fminbound(f, -self.distance, self.distance)
81 | 
82 |             # (4)设置w(k+1)
83 |             w1 = w - np.dot(_lambda, grad)
84 |             fw1 = self.likelihood_func(w1)
85 | 
86 |             # (4)当|f(w(k+1))-f(w(k))|小于精度,或w(k+1)-w(k)的范数小于精度时,停止迭代
87 |             if np.linalg.norm(fw1 - fw) < self.epsilon or \
88 |                     (np.linalg.norm((w1 - w), axis=0, keepdims=True) < self.epsilon).all():
89 |                 self.w = w1
90 |                 break
91 | 
92 |             # (5) 设置k=k+1
93 |             k += 1
94 |             w, fw = w1, fw1
95 | 
96 |         self.grad_ = grad
97 |         self.n_iter_ = k
98 |         self.coef_ = self.w[0][:-1]
99 |         self.intercept_ = self.w[0][-1]
100 | 
101 |     def predict(self, x):
102 |         p = self.sigmoid(np.dot(self.preprocessing(x), self.w.T))
103 |         p[np.where(p >= 0.5)] = 1  # 以0.5为阈值判定类别,p恰为0.5时归入正类
104 |         p[np.where(p < 0.5)] = 0
105 |         return p
106 | 
107 |     def score(self, X, y):
108 |         y_c = self.predict(X)
109 |         # 先计算错误率,再由1-错误率得到准确率
110 |         error_rate = np.sum(np.abs(y_c - y.T)) / y_c.shape[0]
111 |         return 1 - error_rate
112 | 
113 |     def draw(self, X, y):
114 |         # 分隔正负实例点
115 |         y = y[0]
116 |         X_po = X[np.where(y == 1)]
117 |         X_ne = X[np.where(y == 0)]
118 |         # 绘制数据集散点图(X的每一行是一个样本,按列取三个坐标分量;原实现误按行取,坐标发生转置)
119 |         ax = plt.axes(projection='3d')
120 |         x_1 = X_po[:, 0]
121 |         y_1 = X_po[:, 1]
122 |         z_1 = X_po[:, 2]
123 |         x_2 = X_ne[:, 0]
124 |         y_2 = X_ne[:, 1]
125 |         z_2 = X_ne[:, 2]
126 |         ax.scatter(x_1, y_1, z_1, c="r", label="正实例")
127 |         ax.scatter(x_2, y_2, z_2, c="b", label="负实例")
128 |         ax.legend(loc='best')
129 |         # 绘制透明度为0.5的分隔超平面
130 |         x = np.linspace(-3, 3, 3)
131 |         y = np.linspace(-3, 3, 3)
132 |         x_3, y_3 = np.meshgrid(x, y)
133 |         a, b, c, d = self.w[0]
134 |         z_3 = -(a * x_3 + b * y_3 + d) / c
135 |         # 调节透明度
136 |         ax.plot_surface(x_3, y_3, z_3, alpha=0.5)
137 |         plt.show()
138 | 
139 | 
140 | if __name__ == '__main__':
141 |     # 训练数据集
142 |     X_train = np.array([[3, 3, 3], [4, 3, 2], [2, 1, 2], [1, 1, 1], [-1, 0, 1], [2, -2, 1]])
143 |     y_train = np.array([[1, 1, 1, 0, 0, 0]])
144 |     # 构建实例,进行训练
145 |     clf = MyLogisticRegression(epsilon=1e-6)
146 |     clf.fit(X_train, y_train)
147 |     clf.draw(X_train, y_train)
148 |     print("迭代次数:{}次".format(clf.n_iter_))
149 |     print("梯度:{}".format(clf.grad_))
150 |     print("权重:{}".format(clf.w[0]))
151 |     print("模型准确率:{:.2%}".format(clf.score(X_train, y_train)))
152 | 
--------------------------------------------------------------------------------
/codes/ch20/gibbs_sampling_lda.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # encoding: utf-8
3 | """
4 | @author: HuRuiFeng
5 | @file: gibbs_sampling_lda.py
6 | @time: 2022/7/12 20:30
7 | @project: statistical-learning-method-solutions-manual
8 | @desc: 习题20.2 LDA吉布斯抽样算法
9 | """
10 | 
11 | import numpy as np
12 | 
13 | 
14 | class GibbsSamplingLDA:
15 |     def __init__(self, iter_max=1000):
16 |         self.iter_max = iter_max
17 |         self.weights_ = []
18 | 
19 |     def fit(self, words, K):
20 |         """
21 |         :param words: 单词-文本矩阵
22 |         :param K: 话题个数
23 |         :return: 文本话题序列z
24 |         """
25 |         # 转置后每行对应一个文本:M为文本个数,Nm为单词个数
26 |         words = words.T
27 |         M, Nm = words.shape
28 | 
29 |         # 初始化超参数alpha, beta,其中alpha为文本的话题分布相关参数,beta为话题的单词分布相关参数
30 |         alpha = np.array([1 / K] * K)
31 |         beta = np.array([1 / Nm] * Nm)
32 | 
33 |         # 初始化参数theta, varphi,其中theta为文本关于话题的多项分布参数,varphi为话题关于单词的多项分布参数
34 |         theta = np.zeros([M, K])
35 |         varphi = np.zeros([K, Nm])
36 | 
37 |         # 输出文本的话题序列z
38 |         z =
np.zeros(words.shape, dtype='int') 39 | 40 | # (1)设所有计数矩阵的元素n_mk、n_kv,计数向量的元素n_m、n_k初值为 0 41 | n_mk = np.zeros([M, K]) 42 | n_kv = np.zeros([K, Nm]) 43 | n_m = np.zeros(M) 44 | n_k = np.zeros(K) 45 | 46 | # (2)对所有M个文本中的所有单词进行循环 47 | for m in range(M): 48 | for v in range(Nm): 49 | # 如果单词v存在于文本m 50 | if words[m, v] != 0: 51 | # (2.a)抽样话题 52 | z[m, v] = np.random.choice(list(range(K))) 53 | # 增加文本-话题计数 54 | n_mk[m, z[m, v]] += 1 55 | # 增加文本-话题和计数 56 | n_m[m] += 1 57 | # 增加话题-单词计数 58 | n_kv[z[m, v], v] += 1 59 | # 增加话题-单词和计数 60 | n_k[z[m, v]] += 1 61 | 62 | # (3)对所有M个文本中的所有单词进行循环,直到进入燃烧期 63 | zi = 0 64 | for i in range(self.iter_max): 65 | for m in range(M): 66 | for v in range(Nm): 67 | # (3.a)如果单词v存在于文本m,那么当前单词是第v个单词,话题指派z_mv是第k个话题 68 | if words[m, v] != 0: 69 | # 减少计数 70 | n_mk[m, z[m, v]] -= 1 71 | n_m[m] -= 1 72 | n_kv[z[m, v], v] -= 1 73 | n_k[z[m, v]] -= 1 74 | 75 | # (3.b)按照满条件分布进行抽样 76 | max_zi_value, max_zi_index = -float('inf'), z[m, v] 77 | for k in range(K): 78 | zi = ((n_kv[k, v] + beta[v]) / (n_kv[k, :].sum() + beta.sum())) * \ 79 | ((n_mk[m, k] + alpha[k]) / (n_mk[m, :].sum() + alpha.sum())) 80 | 81 | # 得到新的第 k‘个话题,分配给 z_mv 82 | if max_zi_value < zi: 83 | max_zi_value, max_zi_index = zi, k 84 | z[m, v] = max_zi_index 85 | 86 | # (3.c) (3.d)增加计数并得到两个更新的计数矩阵的n_kv和n_mk 87 | n_mk[m, z[m, v]] += 1 88 | n_m[m] += 1 89 | n_kv[z[m, v], v] += 1 90 | n_k[z[m, v]] += 1 91 | 92 | # (4)利用得到的样本计数,计算模型参数 93 | for m in range(M): 94 | for k in range(K): 95 | theta[m, k] = (n_mk[m, k] + alpha[k]) / (n_mk[m, :].sum() + alpha.sum()) 96 | 97 | for k in range(K): 98 | for v in range(Nm): 99 | varphi[k, v] = (n_kv[k, v] + beta[v]) / (n_kv[k, :].sum() + beta.sum()) 100 | 101 | self.weights_ = [varphi, theta] 102 | return z.T, n_kv, n_mk 103 | 104 | 105 | if __name__ == '__main__': 106 | gibbs_sampling_lda = GibbsSamplingLDA(iter_max=1000) 107 | 108 | # 输入文本-单词矩阵,共有9个文本,11个单词 109 | words = np.array([[0, 0, 1, 1, 0, 0, 0, 0, 0], 110 | [0, 0, 0, 0, 0, 1, 0, 0, 1], 111 | [0, 1, 0, 0, 0, 0, 0, 1, 0], 112 | [0, 0, 0, 0, 0, 0, 1, 0, 1], 113 | [1, 0, 0, 0, 0, 1, 0, 0, 0], 114 | [1, 1, 1, 1, 1, 1, 1, 1, 1], 115 | [1, 0, 1, 0, 0, 0, 0, 0, 0], 116 | [0, 0, 0, 0, 0, 0, 1, 0, 1], 117 | [0, 0, 0, 0, 0, 2, 0, 0, 1], 118 | [1, 0, 1, 0, 0, 0, 0, 1, 0], 119 | [0, 0, 0, 1, 1, 0, 0, 0, 0]]) 120 | # 假设话题数量为3 121 | K = 3 122 | 123 | # 设置精度为3 124 | np.set_printoptions(precision=3, suppress=True) 125 | 126 | z, n_kv, n_mk = gibbs_sampling_lda.fit(words, K) 127 | varphi = gibbs_sampling_lda.weights_[0] 128 | theta = gibbs_sampling_lda.weights_[1] 129 | 130 | print("文本的话题序列z:") 131 | print(z) 132 | print("样本的计数矩阵N_KV:") 133 | print(n_kv) 134 | print("样本的计数矩阵N_MK:") 135 | print(n_mk) 136 | print("模型参数varphi:") 137 | print(varphi) 138 | print("模型参数theta:") 139 | print(theta) 140 | -------------------------------------------------------------------------------- /notebook/part04/notes/ch36.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "584ea112ddc4d14e", 6 | "metadata": {}, 7 | "source": [ 8 | "# 第36章 多臂老虎机" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "65b08216", 14 | "metadata": {}, 15 | "source": [ 16 | "## 习题36.1" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "7e93fc9c", 22 | "metadata": {}, 23 | "source": [ 24 | "  探索优先算法中首先对每个手臂$a$进行$N$次选择,得到估计的期望奖励$\\hat{Q}(a)$,然后一直选择估计的期望奖励最大的手臂,总轮数是$T$。证明以下概率不等式成立。\n", 25 | "$$\n", 26 | "P\\left[ \\hat{Q}(a) \\leqslant Q(a) + \\sqrt{\\frac{2 \\log 
T}{N}} \\right] \\geqslant 1 - \\frac{1}{T^4}\n", 27 | "$$\n", 28 | "其中,$Q(a)$是手臂$a$真实的期望奖励。提示:使用Hoeffding不等式。" 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "id": "a24401c0", 34 | "metadata": {}, 35 | "source": [ 36 | "**解答:** " 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "id": "a26f336d", 42 | "metadata": {}, 43 | "source": [ 44 | "**解答思路:**" 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "id": "3abf0c86", 50 | "metadata": {}, 51 | "source": [ 52 | "**解答步骤:** " 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "id": "739a6d06", 58 | "metadata": {}, 59 | "source": [ 60 | "## 习题36.2" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "id": "1297c611", 66 | "metadata": {}, 67 | "source": [ 68 | "  假设多臂老虎机有三个手臂,奖励取值都是0或1,分布都是伯努利分布,期望分别是$Q(a_1) = 0.4,Q(a_2)=0.5,Q(a_3)=0.6$。与老虎机交互的总轮数是$T=10000$。通过模拟计算探索优先算法($N=100$)、$\\varepsilon$贪心算法($\\varepsilon = 0.2$)、UCB算法、汤普森采样算法在这个多臂老虎机的总体奖励。各自进行10次模拟后取平均。" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "id": "62e1bb10", 74 | "metadata": {}, 75 | "source": [ 76 | "  UCB算法使用更紧的上界\n", 77 | "$$\n", 78 | "\\frac{n(a)}{N(a)} + \\sqrt{\\frac{\\frac{n(a)}{N(a)} \\log t}{N(a)}} + \\frac{\\log t}{N(a)}\n", 79 | "$$\n", 80 | "其中,$t$是轮数,$N(a)$是手臂$a$被选择的次数,是$n(a)$中选择手臂结果为1的次数。" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "id": "10466ef6", 86 | "metadata": {}, 87 | "source": [ 88 | "**解答:** " 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "id": "05a83efb", 94 | "metadata": {}, 95 | "source": [ 96 | "**解答思路:**" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "id": "5e92fe05", 102 | "metadata": {}, 103 | "source": [ 104 | "**解答步骤:** " 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "id": "b2ac97c0", 110 | "metadata": {}, 111 | "source": [ 112 | "## 习题36.3" 113 | ] 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "id": "22cae8ee", 118 | "metadata": {}, 119 | "source": [ 120 | "  $\\varepsilon$贪心算法中的探索概率$\\varepsilon$可以不是一个定量,而是一个随着轮数$t$递减的变量$\\varepsilon_t$。设计一个$\\varepsilon_t$的函数,并观察这个$\\varepsilon$贪心算法在例题36.2中的效果。" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "id": "59980dc3", 126 | "metadata": {}, 127 | "source": [ 128 | "**解答:** " 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "id": "0a76104a", 134 | "metadata": {}, 135 | "source": [ 136 | "**解答思路:**" 137 | ] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "id": "19b9a258", 142 | "metadata": {}, 143 | "source": [ 144 | "**解答步骤:** " 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "id": "659e7f89", 150 | "metadata": {}, 151 | "source": [ 152 | "## 习题36.4" 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "id": "78e7497d", 158 | "metadata": {}, 159 | "source": [ 160 | "  解析为什么汤普森采样算法中$s$值越小越趋向探索,而$s$值越大越趋向利用。" 161 | ] 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "id": "903f07ed", 166 | "metadata": {}, 167 | "source": [ 168 | "**解答:** " 169 | ] 170 | }, 171 | { 172 | "cell_type": "markdown", 173 | "id": "35e702cf", 174 | "metadata": {}, 175 | "source": [ 176 | "**解答思路:**" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "id": "1e0b032b", 182 | "metadata": {}, 183 | "source": [ 184 | "**解答步骤:** " 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": null, 190 | "id": "6fb03f6d", 191 | "metadata": {}, 192 | "outputs": [], 193 | "source": [] 194 | } 195 | ], 196 | "metadata": { 197 | "kernelspec": { 198 | "display_name": "Python 3 (ipykernel)", 199 | "language": "python", 200 | "name": "python3" 201 | }, 
202 | "language_info": { 203 | "codemirror_mode": { 204 | "name": "ipython", 205 | "version": 3 206 | }, 207 | "file_extension": ".py", 208 | "mimetype": "text/x-python", 209 | "name": "python", 210 | "nbconvert_exporter": "python", 211 | "pygments_lexer": "ipython3", 212 | "version": "3.10.5" 213 | } 214 | }, 215 | "nbformat": 4, 216 | "nbformat_minor": 5 217 | } 218 | -------------------------------------------------------------------------------- /docs/chapter05/output_7_0.svg: -------------------------------------------------------------------------------- 1 | 2 | 4 | 6 | 7 | 9 | 10 | Tree 11 | 12 | 13 | 14 | 0 15 | 16 | 有自己的房子 ≤ 3.0 17 | gini = 0.48 18 | samples = 15 19 | value = [6, 9] 20 | class = 是 21 | 22 | 23 | 24 | 1 25 | 26 | 有工作 ≤ 3.0 27 | gini = 0.444 28 | samples = 9 29 | value = [6, 3] 30 | class = 否 31 | 32 | 33 | 34 | 0->1 35 | 36 | 37 | True 38 | 39 | 40 | 41 | 4 42 | 43 | gini = 0.0 44 | samples = 6 45 | value = [0, 6] 46 | class = 是 47 | 48 | 49 | 50 | 0->4 51 | 52 | 53 | False 54 | 55 | 56 | 57 | 2 58 | 59 | gini = 0.0 60 | samples = 6 61 | value = [6, 0] 62 | class = 否 63 | 64 | 65 | 66 | 1->2 67 | 68 | 69 | 70 | 71 | 72 | 3 73 | 74 | gini = 0.0 75 | samples = 3 76 | value = [0, 3] 77 | class = 是 78 | 79 | 80 | 81 | 1->3 82 | 83 | 84 | 85 | 86 | 87 | -------------------------------------------------------------------------------- /codes/ch08/my_adaboost.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: my_adaboost.py 6 | @time: 2021/8/13 21:17 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题8.1 自编程实现AdaBoost算法 9 | """ 10 | 11 | import numpy as np 12 | 13 | 14 | class MyAdaBoost: 15 | def __init__(self, tol=0.05, max_iter=10): 16 | # 特征 17 | self.X = None 18 | # 标签 19 | self.y = None 20 | # 分类误差小于精度时,分类器训练中止 21 | self.tol = tol 22 | # 最大迭代次数 23 | self.max_iter = max_iter 24 | # 权值分布 25 | self.w = None 26 | # 弱分类器集合 27 | self.G = [] 28 | 29 | def build_stump(self): 30 | """ 31 | 以带权重的分类误差最小为目标,选择最佳分类阈值,得到最佳的决策树桩 32 | best_stump['dim'] 合适特征的所在维度 33 | best_stump['thresh'] 合适特征的阈值 34 | best_stump['ineq'] 树桩分类的标识lt,rt 35 | """ 36 | m, n = np.shape(self.X) 37 | # 分类误差 38 | min_error = np.inf 39 | # 小于分类阈值的样本所属的标签类别 40 | sign = None 41 | # 最优决策树桩 42 | best_stump = {} 43 | for i in range(n): 44 | # 求每一种特征的最小值和最大值 45 | range_min = self.X[:, i].min() 46 | range_max = self.X[:, i].max() 47 | step_size = (range_max - range_min) / n 48 | for j in range(-1, int(n) + 1): 49 | # 根据n的值,构造切分点 50 | thresh_val = range_min + j * step_size 51 | # 计算左子树和右子树的误差 52 | for inequal in ['lt', 'rt']: 53 | # (a)得到基本分类器 54 | predict_values = self.base_estimator(self.X, i, thresh_val, inequal) 55 | # (b)计算在训练集上的分类误差率 56 | err_arr = np.array(np.ones(m)) 57 | err_arr[predict_values.T == self.y.T] = 0 58 | weighted_error = np.dot(self.w, err_arr) 59 | if weighted_error < min_error: 60 | min_error = weighted_error 61 | sign = predict_values 62 | best_stump['dim'] = i 63 | best_stump['thresh'] = thresh_val 64 | best_stump['ineq'] = inequal 65 | return best_stump, sign, min_error 66 | 67 | def updata_w(self, alpha, predict): 68 | """ 69 | 更新样本权重w 70 | :param alpha: alpha 71 | :param predict: yi 72 | :return: 73 | """ 74 | # (d)根据迭代公式,更新权值分布 75 | P = self.w * np.exp(-alpha * self.y * predict) 76 | self.w = P / P.sum() 77 | 78 | @staticmethod 79 | def base_estimator(X, dimen, thresh_val, thresh_ineq): 80 | """ 81 | 计算单个弱分类器(决策树桩)预测输出 82 | :param X: 特征 83 | 
:param dimen: 特征的位置(即第几个特征) 84 | :param thresh_val: 切分点 85 | :param thresh_ineq: 标记结点的位置,可取左子树(lt),右子树(rt) 86 | :return: 返回预测结果矩阵 87 | """ 88 | # 预测结果矩阵 89 | ret_array = np.ones(np.shape(X)[0]) 90 | # 左叶子 ,整个矩阵的样本进行比较赋值 91 | if thresh_ineq == 'lt': 92 | ret_array[X[:, dimen] >= thresh_val] = -1.0 93 | else: 94 | ret_array[X[:, dimen] < thresh_val] = -1.0 95 | return ret_array 96 | 97 | def fit(self, X, y): 98 | """ 99 | 对分类器进行训练 100 | """ 101 | self.X = X 102 | self.y = y 103 | # (1)初始化训练数据的权值分布 104 | self.w = np.full((X.shape[0]), 1 / X.shape[0]) 105 | G = 0 106 | # (2)对m=1,2,...,M进行遍历 107 | for i in range(self.max_iter): 108 | # (b)得到Gm(x)的分类误差error,获取当前迭代最佳分类阈值sign 109 | best_stump, sign, error = self.build_stump() 110 | # (c)计算弱分类器Gm(x)的系数 111 | alpha = 1 / 2 * np.log((1 - error) / error) 112 | # 弱分类器Gm(x)权重 113 | best_stump['alpha'] = alpha 114 | # 保存弱分类器Gm(x),得到分类器集合G 115 | self.G.append(best_stump) 116 | # 计算当前总分类器(之前所有弱分类器加权和)误差率 117 | G += alpha * sign 118 | y_predict = np.sign(G) 119 | # 使用MAE计算误差 120 | error_rate = np.sum(np.abs(y_predict - self.y)) / self.y.shape[0] 121 | if error_rate < self.tol: 122 | # 满足中止条件,则跳出循环 123 | print("迭代次数:{}次".format(i + 1)) 124 | break 125 | else: 126 | # (d)更新训练数据集的权值分布 127 | self.updata_w(alpha, alpha * sign) 128 | 129 | def predict(self, X): 130 | """对新数据进行预测""" 131 | m = np.shape(X)[0] 132 | G = np.zeros(m) 133 | for i in range(len(self.G)): 134 | stump = self.G[i] 135 | # 遍历每一个弱分类器,进行加权 136 | _G = self.base_estimator(X, stump['dim'], stump['thresh'], stump['ineq']) 137 | alpha = stump['alpha'] 138 | # (3)构建基本分类器的线性组合 139 | G += alpha * _G 140 | # 计算最终分类器的预测结果 141 | y_predict = np.sign(G) 142 | return y_predict.astype(int) 143 | 144 | def score(self, X, y): 145 | """计算分类器的预测准确率""" 146 | y_predict = self.predict(X) 147 | # 使用MAE计算误差 148 | error_rate = np.sum(np.abs(y_predict - y)) / y.shape[0] 149 | return 1 - error_rate 150 | 151 | def print_G(self): 152 | i = 1 153 | s = "G(x) = sign[f(x)] = sign[" 154 | for stump in self.G: 155 | if i != 1: 156 | s += " + " 157 | s += "{}·G{}(x)".format(round(stump['alpha'], 4), i) 158 | i += 1 159 | s += "]" 160 | return s 161 | 162 | 163 | if __name__ == '__main__': 164 | # 加载训练数据 165 | X = np.array([[0, 1, 3], 166 | [0, 3, 1], 167 | [1, 2, 2], 168 | [1, 1, 3], 169 | [1, 2, 3], 170 | [0, 1, 2], 171 | [1, 1, 2], 172 | [1, 1, 1], 173 | [1, 3, 1], 174 | [0, 2, 1] 175 | ]) 176 | y = np.array([-1, -1, -1, -1, -1, -1, 1, 1, -1, -1]) 177 | 178 | clf = MyAdaBoost() 179 | clf.fit(X, y) 180 | y_predict = clf.predict(X) 181 | score = clf.score(X, y) 182 | print("原始输出:", y) 183 | print("预测输出:", y_predict) 184 | print("预测正确率:{:.2%}".format(score)) 185 | print("最终分类器G(x)为:", clf.print_G()) 186 | -------------------------------------------------------------------------------- /codes/ch05/my_decision_tree.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: my_decision_tree.py 6 | @time: 2021/8/5 17:11 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题5.1 自编程实现C4.5生成算法 9 | """ 10 | import json 11 | from collections import Counter 12 | 13 | import numpy as np 14 | 15 | 16 | # 节点类 17 | class Node: 18 | def __init__(self, node_type, class_name, feature_name=None, 19 | info_gain_ratio_value=0.0): 20 | # 结点类型(internal或leaf) 21 | self.node_type = node_type 22 | # 特征名 23 | self.feature_name = feature_name 24 | # 类别名 25 | self.class_name = class_name 26 | # 子结点树 27 | self.child_nodes = 
[]
28 |         # 信息增益比的值(原注释误写为Gini指数)
29 |         self.info_gain_ratio_value = info_gain_ratio_value
30 | 
31 |     def __repr__(self):
32 |         return json.dumps(self, indent=3, default=lambda obj: obj.__dict__, ensure_ascii=False)
33 | 
34 |     def add_sub_tree(self, key, sub_tree):
35 |         self.child_nodes.append({"condition": key, "sub_tree": sub_tree})
36 | 
37 | 
38 | class MyDecisionTree:
39 |     def __init__(self, epsilon):
40 |         self.epsilon = epsilon
41 |         self.tree = None
42 | 
43 |     def fit(self, train_set, y, feature_names):
44 |         features_indices = list(range(len(feature_names)))
45 |         self.tree = self._fit(train_set, y, features_indices, feature_names)
46 |         return self
47 | 
48 |     # C4.5算法
49 |     def _fit(self, train_data, y, features_indices, feature_labels):
50 |         LEAF = 'leaf'
51 |         INTERNAL = 'internal'
52 |         class_num = len(np.unique(y))
53 | 
54 |         # (1)如果训练数据集所有实例都属于同一类Ck
55 |         label_set = set(y)
56 |         if len(label_set) == 1:
57 |             # 将Ck作为该结点的类
58 |             return Node(node_type=LEAF, class_name=label_set.pop())
59 | 
60 |         # (2)如果特征集为空
61 |         # 计算每一个类出现的个数
62 |         class_len = Counter(y).most_common()
63 |         (max_class, max_len) = class_len[0]
64 | 
65 |         if len(features_indices) == 0:
66 |             # 将实例数最大的类Ck作为该结点的类
67 |             return Node(LEAF, class_name=max_class)
68 | 
69 |         # (3)按式(5.10)计算信息增益比,并选择信息增益比最大的特征
70 |         max_feature = 0
71 |         max_gda = 0
72 |         D = y.copy()
73 |         # 计算特征集A中各特征
74 |         for feature in features_indices:
75 |             # 选择训练集中的第feature列(即第feature个特征)
76 |             A = np.array(train_data[:, feature].flat)
77 |             # 计算信息增益
78 |             gda = self._calc_ent_grap(A, D)
79 |             if self._calc_ent(A) != 0:
80 |                 # 计算信息增益比
81 |                 gda /= self._calc_ent(A)
82 |             # 选择信息增益比最大的特征Ag
83 |             if gda > max_gda:
84 |                 max_gda, max_feature = gda, feature
85 | 
86 |         # (4)如果Ag的信息增益比小于阈值
87 |         if max_gda < self.epsilon:
88 |             # 将训练集中实例数最大的类Ck作为该结点的类
89 |             return Node(LEAF, class_name=max_class)
90 | 
91 |         max_feature_label = feature_labels[max_feature]
92 | 
93 |         # (6)从特征下标集合中移除已选特征Ag
94 |         sub_feature_indices = np.setdiff1d(features_indices, max_feature)
95 |         sub_feature_labels = feature_labels  # 标签数组保持不变:若用setdiff1d删减会重新排序,导致下标与原特征列错位
96 | 
97 |         # (5)构建非空子集
98 |         # 构建结点
99 |         feature_name = max_feature_label
100 |         tree = Node(INTERNAL, class_name=None, feature_name=feature_name,
101 |                     info_gain_ratio_value=max_gda)
102 | 
103 |         max_feature_col = np.array(train_data[:, max_feature].flat)
104 |         # 将类按照对应的实例数递减顺序排列
105 |         feature_value_list = [x[0] for x in Counter(max_feature_col).most_common()]
106 |         # 遍历Ag的每一个可能值ai
107 |         for feature_value in feature_value_list:
108 |             index = []
109 |             for i in range(len(y)):
110 |                 if train_data[i][max_feature] == feature_value:
111 |                     index.append(i)
112 | 
113 |             # 递归调用步(1)~步(5),得到子树
114 |             sub_train_set = train_data[index]
115 |             sub_train_label = y[index]
116 |             sub_tree = self._fit(sub_train_set, sub_train_label, sub_feature_indices, sub_feature_labels)
117 |             # 在结点中,添加其子结点构成的树
118 |             tree.add_sub_tree(feature_value, sub_tree)
119 | 
120 |         return tree
121 | 
122 |     # 计算数据集x的经验熵H(x)
123 |     @staticmethod
124 |     def _calc_ent(x):
125 |         x_value_list = set([x[i] for i in range(x.shape[0])])
126 |         ent = 0.0
127 |         for x_value in x_value_list:
128 |             p = float(x[x == x_value].shape[0]) / x.shape[0]
129 |             logp = np.log2(p)
130 |             ent -= p * logp
131 | 
132 |         return ent
133 | 
134 |     # 计算条件熵H(y|x)
135 |     def _calc_condition_ent(self, x, y):
136 |         x_value_list = set([x[i] for i in range(x.shape[0])])
137 |         ent = 0.0
138 |         for x_value in x_value_list:
139 |             sub_y = y[x == x_value]
140 |             temp_ent = self._calc_ent(sub_y)
141 |             ent += (float(sub_y.shape[0]) / y.shape[0]) * temp_ent
142 | 
143 |         return ent
144 | 
145 |     # 计算信息增益
146
| def _calc_ent_grap(self, x, y): 147 | base_ent = self._calc_ent(y) 148 | condition_ent = self._calc_condition_ent(x, y) 149 | ent_grap = base_ent - condition_ent 150 | 151 | return ent_grap 152 | 153 | def __repr__(self): 154 | return str(self.tree) 155 | 156 | 157 | if __name__ == '__main__': 158 | # 表5.1的训练数据集 159 | feature_names = np.array(["年龄", "有工作", "有自己的房子", "信贷情况"]) 160 | X_train = np.array([ 161 | ["青年", "否", "否", "一般"], 162 | ["青年", "否", "否", "好"], 163 | ["青年", "是", "否", "好"], 164 | ["青年", "是", "是", "一般"], 165 | ["青年", "否", "否", "一般"], 166 | ["中年", "否", "否", "一般"], 167 | ["中年", "否", "否", "好"], 168 | ["中年", "是", "是", "好"], 169 | ["中年", "否", "是", "非常好"], 170 | ["中年", "否", "是", "非常好"], 171 | ["老年", "否", "是", "非常好"], 172 | ["老年", "否", "是", "好"], 173 | ["老年", "是", "否", "好"], 174 | ["老年", "是", "否", "非常好"], 175 | ["老年", "否", "否", "一般"] 176 | ]) 177 | y = np.array(["否", "否", "是", "是", "否", 178 | "否", "否", "是", "是", "是", 179 | "是", "是", "是", "是", "否"]) 180 | 181 | dt_tree = MyDecisionTree(epsilon=0.1) 182 | dt_tree.fit(X_train, y, feature_names) 183 | print(dt_tree) 184 | -------------------------------------------------------------------------------- /notebook/part04/notes/ch37.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "fcfcce7a1bd48a2b", 6 | "metadata": {}, 7 | "source": [ 8 | "# 第37章 基于价值的方法" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "7a2b895f", 14 | "metadata": {}, 15 | "source": [ 16 | "## 习题37.1" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "01e4ebdf", 22 | "metadata": {}, 23 | "source": [ 24 | "  算法37.1的蒙特卡罗预测算法可以估计状态价值函数。还有一个对应的算法用于估计动作价值函数,写出该算法。" 25 | ] 26 | }, 27 | { 28 | "cell_type": "markdown", 29 | "id": "a24401c0", 30 | "metadata": {}, 31 | "source": [ 32 | "**解答:** " 33 | ] 34 | }, 35 | { 36 | "cell_type": "markdown", 37 | "id": "a26f336d", 38 | "metadata": {}, 39 | "source": [ 40 | "**解答思路:**" 41 | ] 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "id": "3abf0c86", 46 | "metadata": {}, 47 | "source": [ 48 | "**解答步骤:** " 49 | ] 50 | }, 51 | { 52 | "cell_type": "markdown", 53 | "id": "7b5ac88a", 54 | "metadata": {}, 55 | "source": [ 56 | "## 习题37.2" 57 | ] 58 | }, 59 | { 60 | "cell_type": "markdown", 61 | "id": "1603e34e", 62 | "metadata": {}, 63 | "source": [ 64 | "  例37.1的问题中,假设折扣因子$\\gamma = 0.9$。用蒙特卡罗预测估计策略$\\pi_1$在每一个状态的价值,这时分别使用第一次访问估计和每一次访问估计。" 65 | ] 66 | }, 67 | { 68 | "cell_type": "markdown", 69 | "id": "e06b6cda", 70 | "metadata": {}, 71 | "source": [ 72 | "**解答:** " 73 | ] 74 | }, 75 | { 76 | "cell_type": "markdown", 77 | "id": "112725a8", 78 | "metadata": {}, 79 | "source": [ 80 | "**解答思路:**" 81 | ] 82 | }, 83 | { 84 | "cell_type": "markdown", 85 | "id": "8d0ab2f3", 86 | "metadata": {}, 87 | "source": [ 88 | "**解答步骤:** " 89 | ] 90 | }, 91 | { 92 | "cell_type": "markdown", 93 | "id": "7bb4b8f9", 94 | "metadata": {}, 95 | "source": [ 96 | "## 习题37.3" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "id": "3067f3a1", 102 | "metadata": {}, 103 | "source": [ 104 | "  例37.1的问题中,假设策略$\\pi_2$是在每一个格点都向左移动。用蒙特卡罗预测估计策略$\\pi_2$在每一个状态的价值。比较策略$\\pi_1$和策略$\\pi_2$的价值。" 105 | ] 106 | }, 107 | { 108 | "cell_type": "markdown", 109 | "id": "8ad83031", 110 | "metadata": {}, 111 | "source": [ 112 | "**解答:** " 113 | ] 114 | }, 115 | { 116 | "cell_type": "markdown", 117 | "id": "411bdfdf", 118 | "metadata": {}, 119 | "source": [ 120 | "**解答思路:**" 121 | ] 122 | }, 123 | { 124 | "cell_type": "markdown", 125 | "id": 
"3a84efde", 126 | "metadata": {}, 127 | "source": [ 128 | "**解答步骤:** " 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "id": "bc9c4bd5", 134 | "metadata": {}, 135 | "source": [ 136 | "## 习题37.4" 137 | ] 138 | }, 139 | { 140 | "cell_type": "markdown", 141 | "id": "efe411cb", 142 | "metadata": {}, 143 | "source": [ 144 | "  算法37.2的TD(0)算法可以估计状态价值函数。还有一个对应的算法用于估计动作价值函数,写出该算法。" 145 | ] 146 | }, 147 | { 148 | "cell_type": "markdown", 149 | "id": "1d66d71e", 150 | "metadata": {}, 151 | "source": [ 152 | "**解答:** " 153 | ] 154 | }, 155 | { 156 | "cell_type": "markdown", 157 | "id": "fc28a743", 158 | "metadata": {}, 159 | "source": [ 160 | "**解答思路:**" 161 | ] 162 | }, 163 | { 164 | "cell_type": "markdown", 165 | "id": "9024abaa", 166 | "metadata": {}, 167 | "source": [ 168 | "**解答步骤:** " 169 | ] 170 | }, 171 | { 172 | "cell_type": "markdown", 173 | "id": "27ba6917", 174 | "metadata": {}, 175 | "source": [ 176 | "## 习题37.5" 177 | ] 178 | }, 179 | { 180 | "cell_type": "markdown", 181 | "id": "eff422f6", 182 | "metadata": {}, 183 | "source": [ 184 | "  在以下表中标出答案为“是”的部分。" 185 | ] 186 | }, 187 | { 188 | "cell_type": "markdown", 189 | "id": "49e4c1cc", 190 | "metadata": {}, 191 | "source": [ 192 | "| | 动态规划 | 蒙特卡罗预测 | 时序差分预测 |\n", 193 | "| :---: | :---: | :---: | :---: |\n", 194 | "| 可用于模型无关的情况 | | | |\n", 195 | "| 可用于无限期MDP | | | |\n", 196 | "| 不依赖于马尔可夫假设 | | | |\n", 197 | "| 在极限收敛于真实值 | | | |\n", 198 | "| 可得到价值的无偏估计 | | | |" 199 | ] 200 | }, 201 | { 202 | "cell_type": "markdown", 203 | "id": "29cad240", 204 | "metadata": {}, 205 | "source": [ 206 | "**解答:** " 207 | ] 208 | }, 209 | { 210 | "cell_type": "markdown", 211 | "id": "c74544d0", 212 | "metadata": {}, 213 | "source": [ 214 | "**解答思路:**" 215 | ] 216 | }, 217 | { 218 | "cell_type": "markdown", 219 | "id": "20503f16", 220 | "metadata": {}, 221 | "source": [ 222 | "**解答步骤:** " 223 | ] 224 | }, 225 | { 226 | "cell_type": "markdown", 227 | "id": "bd280fcb", 228 | "metadata": {}, 229 | "source": [ 230 | "## 习题37.6" 231 | ] 232 | }, 233 | { 234 | "cell_type": "markdown", 235 | "id": "c8e03ea3", 236 | "metadata": {}, 237 | "source": [ 238 | "  比较蒙特卡罗预测算法、蒙特卡罗控制算法、TD(0)算法、SARSA算法、Q学习的学习收敛条件。注意这些条件都是充分条件而不是必要条件。" 239 | ] 240 | }, 241 | { 242 | "cell_type": "markdown", 243 | "id": "fc224740", 244 | "metadata": {}, 245 | "source": [ 246 | "**解答:** " 247 | ] 248 | }, 249 | { 250 | "cell_type": "markdown", 251 | "id": "52875025", 252 | "metadata": {}, 253 | "source": [ 254 | "**解答思路:**" 255 | ] 256 | }, 257 | { 258 | "cell_type": "markdown", 259 | "id": "d40edba2", 260 | "metadata": {}, 261 | "source": [ 262 | "**解答步骤:** " 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "execution_count": null, 268 | "id": "d0a84e4abb4434", 269 | "metadata": {}, 270 | "outputs": [], 271 | "source": [] 272 | } 273 | ], 274 | "metadata": { 275 | "kernelspec": { 276 | "display_name": "Python 3 (ipykernel)", 277 | "language": "python", 278 | "name": "python3" 279 | }, 280 | "language_info": { 281 | "codemirror_mode": { 282 | "name": "ipython", 283 | "version": 3 284 | }, 285 | "file_extension": ".py", 286 | "mimetype": "text/x-python", 287 | "name": "python", 288 | "nbconvert_exporter": "python", 289 | "pygments_lexer": "ipython3", 290 | "version": "3.10.5" 291 | } 292 | }, 293 | "nbformat": 4, 294 | "nbformat_minor": 5 295 | } 296 | -------------------------------------------------------------------------------- /codes/ch27/bi-lstm-text-classification.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env 
python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: bi-lstm-text-classification.py 6 | @time: 2023/3/15 14:30 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题27.1 基于双向LSTM的ELMo预训练语言模型,假设下游任务是文本分类 9 | """ 10 | import os 11 | import time 12 | 13 | import torch 14 | import torch.nn as nn 15 | import wget 16 | from allennlp.modules.elmo import Elmo 17 | from allennlp.modules.elmo import batch_to_ids 18 | from torch.utils.data import DataLoader 19 | from torch.utils.data.dataset import random_split 20 | from torchtext.data.functional import to_map_style_dataset 21 | from torchtext.datasets import AG_NEWS 22 | 23 | 24 | def get_elmo_model(): 25 | elmo_options_file = './data/elmo_2x1024_128_2048cnn_1xhighway_options.json' 26 | elmo_weight_file = './data/elmo_2x1024_128_2048cnn_1xhighway_weights.hdf5' 27 | url = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x1024_128_2048cnn_1xhighway/elmo_2x1024_128_2048cnn_1xhighway_options.json" 28 | if (not os.path.exists(elmo_options_file)): 29 | wget.download(url, elmo_options_file) 30 | url = "https://s3-us-west-2.amazonaws.com/allennlp/models/elmo/2x1024_128_2048cnn_1xhighway/elmo_2x1024_128_2048cnn_1xhighway_weights.hdf5" 31 | if (not os.path.exists(elmo_weight_file)): 32 | wget.download(url, elmo_weight_file) 33 | 34 | elmo = Elmo(elmo_options_file, elmo_weight_file, 1) 35 | return elmo 36 | 37 | 38 | # 加载ELMo模型 39 | elmo = get_elmo_model() 40 | 41 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 42 | 43 | label_pipeline = lambda x: int(x) - 1 44 | 45 | 46 | def collate_batch(batch): 47 | label_list, text_list = [], [] 48 | for (_label, _text) in batch: 49 | label_list.append(label_pipeline(_label)) 50 | text_list.append(_text.split()) 51 | label_list = torch.tensor(label_list, dtype=torch.int64) 52 | return label_list.to(device), text_list 53 | 54 | 55 | # 加载AG_NEWS数据集 56 | train_iter, test_iter = AG_NEWS(root='./data') 57 | train_dataset = to_map_style_dataset(train_iter) 58 | test_dataset = to_map_style_dataset(test_iter) 59 | num_train = int(len(train_dataset) * 0.95) 60 | split_train_, split_valid_ = \ 61 | random_split(train_dataset, [num_train, len(train_dataset) - num_train]) 62 | 63 | BATCH_SIZE = 128 64 | train_dataloader = DataLoader(split_train_, batch_size=BATCH_SIZE, 65 | shuffle=True, collate_fn=collate_batch) 66 | valid_dataloader = DataLoader(split_valid_, batch_size=BATCH_SIZE, 67 | shuffle=False, collate_fn=collate_batch) 68 | test_dataloader = DataLoader(test_dataset, batch_size=BATCH_SIZE, 69 | shuffle=False, collate_fn=collate_batch) 70 | 71 | 72 | class TextClassifier(nn.Module): 73 | def __init__(self, embedding_dim, hidden_dim, num_classes): 74 | super().__init__() 75 | # 使用预训练的ELMO 76 | self.elmo = elmo 77 | 78 | # 使用双向LSTM 79 | self.lstm = nn.LSTM(embedding_dim, hidden_dim, bidirectional=True, batch_first=True) 80 | 81 | # 使用线性函数进行文本分类任务 82 | self.fc = nn.Linear(hidden_dim * 2, num_classes) 83 | 84 | self.dropout = nn.Dropout(0.5) 85 | self.init_weights() 86 | 87 | def init_weights(self): 88 | initrange = 0.1 89 | self.fc.weight.data.uniform_(-initrange, initrange) 90 | self.fc.bias.data.uniform_(-initrange, initrange) 91 | 92 | def forward(self, sentence_lists): 93 | character_ids = batch_to_ids(sentence_lists) 94 | character_ids = character_ids.to(device) 95 | 96 | embeddings = self.elmo(character_ids) 97 | embedded = embeddings['elmo_representations'][0] 98 | 99 | x, _ = self.lstm(embedded) 100 | x = x.mean(1) 101 | x = self.dropout(x) 102 | x = 
self.fc(x) 103 | return x 104 | 105 | 106 | EMBED_DIM = 256 107 | HIDDEN_DIM = 64 108 | NUM_CLASSES = 4 109 | LEARNING_RATE = 1e-2 110 | NUM_EPOCHS = 1 111 | 112 | model = TextClassifier(EMBED_DIM, HIDDEN_DIM, NUM_CLASSES).to(device) 113 | 114 | 115 | def train(dataloader): 116 | model.train() 117 | 118 | for idx, (label, text) in enumerate(dataloader): 119 | optimizer.zero_grad() 120 | predicted_label = model(text) 121 | loss = criterion(predicted_label, label) 122 | loss.backward() 123 | torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0) 124 | optimizer.step() 125 | 126 | 127 | def evaluate(dataloader): 128 | model.eval() 129 | total_acc, total_count = 0, 0 130 | 131 | with torch.no_grad(): 132 | for idx, (label, text) in enumerate(dataloader): 133 | predicted_label = model(text) 134 | total_acc += (predicted_label.argmax(1) == label).sum().item() 135 | total_count += label.size(0) 136 | 137 | return total_acc / total_count 138 | 139 | 140 | # 使用交叉熵损失函数 141 | criterion = nn.CrossEntropyLoss().to(device) 142 | # 设置优化器 143 | optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE) 144 | 145 | for epoch in range(1, NUM_EPOCHS + 1): 146 | epoch_start_time = time.time() 147 | train(train_dataloader) 148 | accu_val = evaluate(valid_dataloader) 149 | print('-' * 59) 150 | print('| end of epoch {:3d} | time: {:5.2f}s | ' 151 | 'valid accuracy {:8.1f}% '.format(epoch, 152 | time.time() - epoch_start_time, 153 | accu_val * 100)) 154 | print('-' * 59) 155 | 156 | ag_news_label = {1: "World", 2: "Sports", 3: "Business", 4: "Sci/Tec"} 157 | 158 | 159 | def predict(text): 160 | with torch.no_grad(): 161 | output = model([text]) 162 | return output.argmax(1).item() + 1 163 | 164 | 165 | ex_text_str = """ 166 | Our younger Fox Cubs (Y2-Y4) also had a great second experience 167 | of swimming competition in February when they travelled over to 168 | NIS at the end of February to compete in the SSL Development 169 | Series R2 event. For students aged 9 and under these SSL 170 | Development Series events are a great introduction to 171 | competitive swimming, focussed on fun and participation whilst 172 | also building basic skills and confidence as students build up 173 | to joining the full SSL team in Year 5 and beyond. 
174 | """ 175 | 176 | print("This is a %s news" % ag_news_label[predict(ex_text_str)]) 177 | -------------------------------------------------------------------------------- /notebook/part04/notes/ch39.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "adcf9054dbd97916", 6 | "metadata": {}, 7 | "source": [ 8 | "# 第39章 基于策略的方法" 9 | ] 10 | }, 11 | { 12 | "cell_type": "markdown", 13 | "id": "6438df9d", 14 | "metadata": {}, 15 | "source": [ 16 | "## 习题39.1" 17 | ] 18 | }, 19 | { 20 | "cell_type": "markdown", 21 | "id": "7ebdd547", 22 | "metadata": {}, 23 | "source": [ 24 | "  推导式(39.18)。证明当$\\text{Var}(\\nabla_\\theta \\log \\pi (\\tau | \\theta) (G(\\tau) - B)) = 0 $,基线$B$满足以下条件。\n", 25 | "$$\n", 26 | "B = \\frac{\\mathbb{E}_{\\tau \\sim \\pi(\\tau | \\theta)} (\\nabla_\\theta \\log \\pi (\\tau | \\theta))^2 G(\\tau) }{ \\mathbb{E}_{\\tau \\sim \\pi(\\tau | \\theta)} (\\nabla_\\theta \\log \\pi (\\tau | \\theta))^2 }\n", 27 | "$$\n", 28 | "提示:利用$\\text{Var}(x) = \\mathbb{E}[x^2] - \\mathbb{E}[x]^2$。" 29 | ] 30 | }, 31 | { 32 | "cell_type": "markdown", 33 | "id": "a24401c0", 34 | "metadata": {}, 35 | "source": [ 36 | "**解答:** " 37 | ] 38 | }, 39 | { 40 | "cell_type": "markdown", 41 | "id": "a26f336d", 42 | "metadata": {}, 43 | "source": [ 44 | "**解答思路:**" 45 | ] 46 | }, 47 | { 48 | "cell_type": "markdown", 49 | "id": "3abf0c86", 50 | "metadata": {}, 51 | "source": [ 52 | "**解答步骤:** " 53 | ] 54 | }, 55 | { 56 | "cell_type": "markdown", 57 | "id": "a3e029f5", 58 | "metadata": {}, 59 | "source": [ 60 | "## 习题39.2" 61 | ] 62 | }, 63 | { 64 | "cell_type": "markdown", 65 | "id": "11c951f9", 66 | "metadata": {}, 67 | "source": [ 68 | "  解释为什么本章介绍的REINFORCE和演员-评论员是在策略学习算法,而不是离策略方法。考虑什么样的情况下需要离策略学习的REINFORCE和演员-评论员算法。" 69 | ] 70 | }, 71 | { 72 | "cell_type": "markdown", 73 | "id": "66e44a47", 74 | "metadata": {}, 75 | "source": [ 76 | "**解答:** " 77 | ] 78 | }, 79 | { 80 | "cell_type": "markdown", 81 | "id": "64290b3a", 82 | "metadata": {}, 83 | "source": [ 84 | "**解答思路:**" 85 | ] 86 | }, 87 | { 88 | "cell_type": "markdown", 89 | "id": "86bf1cc9", 90 | "metadata": {}, 91 | "source": [ 92 | "**解答步骤:** " 93 | ] 94 | }, 95 | { 96 | "cell_type": "markdown", 97 | "id": "1f956c2a", 98 | "metadata": {}, 99 | "source": [ 100 | "## 习题39.3" 101 | ] 102 | }, 103 | { 104 | "cell_type": "markdown", 105 | "id": "49cc435f", 106 | "metadata": {}, 107 | "source": [ 108 | "  本章介绍的演员-评论员算法使用蒙特卡罗法进行数据采样,学习是小批量模式。试写出使用时间差分法进行数据采样,学习是在线模式的算法。" 109 | ] 110 | }, 111 | { 112 | "cell_type": "markdown", 113 | "id": "8cec3e82", 114 | "metadata": {}, 115 | "source": [ 116 | "**解答:** " 117 | ] 118 | }, 119 | { 120 | "cell_type": "markdown", 121 | "id": "5c11f559", 122 | "metadata": {}, 123 | "source": [ 124 | "**解答思路:**" 125 | ] 126 | }, 127 | { 128 | "cell_type": "markdown", 129 | "id": "aab7ccde", 130 | "metadata": {}, 131 | "source": [ 132 | "**解答步骤:** " 133 | ] 134 | }, 135 | { 136 | "cell_type": "markdown", 137 | "id": "f8cfa0c9", 138 | "metadata": {}, 139 | "source": [ 140 | "## 习题39.4" 141 | ] 142 | }, 143 | { 144 | "cell_type": "markdown", 145 | "id": "ed50e8e7", 146 | "metadata": {}, 147 | "source": [ 148 | "  为什么演员-评论员算法直接定义目标函数的梯度函数,而不定义目标函数?请给出解释。" 149 | ] 150 | }, 151 | { 152 | "cell_type": "markdown", 153 | "id": "a42a9c8c", 154 | "metadata": {}, 155 | "source": [ 156 | "**解答:** " 157 | ] 158 | }, 159 | { 160 | "cell_type": "markdown", 161 | "id": "ffe9b30b", 162 | "metadata": {}, 163 | "source": [ 164 | "**解答思路:**" 
165 | ] 166 | }, 167 | { 168 | "cell_type": "markdown", 169 | "id": "8c99b546", 170 | "metadata": {}, 171 | "source": [ 172 | "**解答步骤:** " 173 | ] 174 | }, 175 | { 176 | "cell_type": "markdown", 177 | "id": "e1dadc61", 178 | "metadata": {}, 179 | "source": [ 180 | "## 习题39.5" 181 | ] 182 | }, 183 | { 184 | "cell_type": "markdown", 185 | "id": "85614959", 186 | "metadata": {}, 187 | "source": [ 188 | "  证明策略梯度定理在使用优势函数时仍然成立。\n", 189 | "$$\n", 190 | "\\nabla_\\theta J(\\theta) = \\mathbb{E}_{\\rho_{\\theta}(s)} \\left [ \\mathbb{E}_{\\pi_{\\theta}(a|s)} \\left [\\nabla_{\\theta} \\log \\pi_{\\theta}(a | s) A_{\\pi_{\\theta}} (s, a) \\right ] \\right ]\n", 191 | "$$" 192 | ] 193 | }, 194 | { 195 | "cell_type": "markdown", 196 | "id": "034423f9", 197 | "metadata": {}, 198 | "source": [ 199 | "**解答:** " 200 | ] 201 | }, 202 | { 203 | "cell_type": "markdown", 204 | "id": "4f40d483", 205 | "metadata": {}, 206 | "source": [ 207 | "**解答思路:**" 208 | ] 209 | }, 210 | { 211 | "cell_type": "markdown", 212 | "id": "c1b188cd", 213 | "metadata": {}, 214 | "source": [ 215 | "**解答步骤:** " 216 | ] 217 | }, 218 | { 219 | "cell_type": "markdown", 220 | "id": "ea7c08c0", 221 | "metadata": {}, 222 | "source": [ 223 | "## 习题39.6" 224 | ] 225 | }, 226 | { 227 | "cell_type": "markdown", 228 | "id": "f1d1d157", 229 | "metadata": {}, 230 | "source": [ 231 | "  考虑为什么在阿尔法狗的学习中,策略网络的学习要通过新旧策略网络对弈,而价值网络的学习通过已学好的策略网络的自己对弈。" 232 | ] 233 | }, 234 | { 235 | "cell_type": "markdown", 236 | "id": "5b9cd403", 237 | "metadata": {}, 238 | "source": [ 239 | "**解答:** " 240 | ] 241 | }, 242 | { 243 | "cell_type": "markdown", 244 | "id": "6b11a442", 245 | "metadata": {}, 246 | "source": [ 247 | "**解答思路:**" 248 | ] 249 | }, 250 | { 251 | "cell_type": "markdown", 252 | "id": "0b2e1493", 253 | "metadata": {}, 254 | "source": [ 255 | "**解答步骤:** " 256 | ] 257 | }, 258 | { 259 | "cell_type": "code", 260 | "execution_count": null, 261 | "id": "d2ad0dca8d50c14", 262 | "metadata": {}, 263 | "outputs": [], 264 | "source": [] 265 | } 266 | ], 267 | "metadata": { 268 | "kernelspec": { 269 | "display_name": "Python 3 (ipykernel)", 270 | "language": "python", 271 | "name": "python3" 272 | }, 273 | "language_info": { 274 | "codemirror_mode": { 275 | "name": "ipython", 276 | "version": 3 277 | }, 278 | "file_extension": ".py", 279 | "mimetype": "text/x-python", 280 | "name": "python", 281 | "nbconvert_exporter": "python", 282 | "pygments_lexer": "ipython3", 283 | "version": "3.10.5" 284 | } 285 | }, 286 | "nbformat": 4, 287 | "nbformat_minor": 5 288 | } 289 | -------------------------------------------------------------------------------- /codes/ch06/maxent_dfp.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: maxent_dfp.py 6 | @time: 2021/8/10 2:28 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题6.3 最大熵模型学习的DFP算法 9 | """ 10 | import copy 11 | from collections import defaultdict 12 | 13 | import numpy as np 14 | from scipy.optimize import fminbound 15 | 16 | 17 | class MaxEntDFP: 18 | def __init__(self, epsilon, max_iter=1000, distance=0.01): 19 | """ 20 | 最大熵的DFP算法 21 | :param epsilon: 迭代停止阈值 22 | :param max_iter: 最大迭代次数 23 | :param distance: 一维搜索的长度范围 24 | """ 25 | self.distance = distance 26 | self.epsilon = epsilon 27 | self.max_iter = max_iter 28 | self.w = None 29 | self._dataset_X = None 30 | self._dataset_y = None 31 | # 标签集合,相当去去重后的y 32 | self._y = set() 33 | # key为(x,y), value为对应的索引号ID 34 | 
self._xyID = {} 35 | # key为对应的索引号ID, value为(x,y) 36 | self._IDxy = {} 37 | # 经验分布p(x,y) 38 | self._pxy_dic = defaultdict(int) 39 | # 样本数 40 | self._N = 0 41 | # 特征键值(x,y)的个数 42 | self._n = 0 43 | # 实际迭代次数 44 | self.n_iter_ = 0 45 | 46 | # 初始化参数 47 | def init_params(self, X, y): 48 | self._dataset_X = copy.deepcopy(X) 49 | self._dataset_y = copy.deepcopy(y) 50 | self._N = X.shape[0] 51 | 52 | for i in range(self._N): 53 | xi, yi = X[i], y[i] 54 | self._y.add(yi) 55 | for _x in xi: 56 | self._pxy_dic[(_x, yi)] += 1 57 | 58 | self._n = len(self._pxy_dic) 59 | # 初始化权重w 60 | self.w = np.zeros(self._n) 61 | 62 | for i, xy in enumerate(self._pxy_dic): 63 | self._pxy_dic[xy] /= self._N 64 | self._xyID[xy] = i 65 | self._IDxy[i] = xy 66 | 67 | def calc_zw(self, X, w): 68 | """书中第100页公式6.23,计算Zw(x)""" 69 | zw = 0.0 70 | for y in self._y: 71 | zw += self.calc_ewf(X, y, w) 72 | return zw 73 | 74 | def calc_ewf(self, X, y, w): 75 | """书中第100页公式6.22,计算分子""" 76 | sum_wf = self.calc_wf(X, y, w) 77 | return np.exp(sum_wf) 78 | 79 | def calc_wf(self, X, y, w): 80 | sum_wf = 0.0 81 | for x in X: 82 | if (x, y) in self._pxy_dic: 83 | sum_wf += w[self._xyID[(x, y)]] 84 | return sum_wf 85 | 86 | def calc_pw_yx(self, X, y, w): 87 | """计算Pw(y|x)""" 88 | return self.calc_ewf(X, y, w) / self.calc_zw(X, w) 89 | 90 | def calc_f(self, w): 91 | """计算f(w)""" 92 | fw = 0.0 93 | for i in range(self._n): 94 | x, y = self._IDxy[i] 95 | for dataset_X in self._dataset_X: 96 | if x not in dataset_X: 97 | continue 98 | fw += np.log(self.calc_zw(x, w)) - self._pxy_dic[(x, y)] * self.calc_wf(dataset_X, y, w) 99 | 100 | return fw 101 | 102 | # DFP算法 103 | def fit(self, X, y): 104 | self.init_params(X, y) 105 | 106 | def calc_dfw(i, w): 107 | """计算书中第107页的拟牛顿法f(w)的偏导""" 108 | 109 | def calc_ewp(i, w): 110 | """计算偏导左边的公式""" 111 | ep = 0.0 112 | x, y = self._IDxy[i] 113 | for dataset_X in self._dataset_X: 114 | if x not in dataset_X: 115 | continue 116 | ep += self.calc_pw_yx(dataset_X, y, w) / self._N 117 | return ep 118 | 119 | def calc_ep(i): 120 | """计算关于经验分布P(x,y)的期望值""" 121 | (x, y) = self._IDxy[i] 122 | return self._pxy_dic[(x, y)] 123 | 124 | return calc_ewp(i, w) - calc_ep(i) 125 | 126 | # 算出g(w),是n*1维矩阵 127 | def calc_gw(w): 128 | return np.array([[calc_dfw(i, w) for i in range(self._n)]]).T 129 | 130 | # (1)初始正定对称矩阵,单位矩阵 131 | Gk = np.array(np.eye(len(self.w), dtype=float)) 132 | 133 | # (2)计算g(w0) 134 | w = self.w 135 | gk = calc_gw(w) 136 | # 判断gk的范数是否小于阈值 137 | if np.linalg.norm(gk, ord=2) < self.epsilon: 138 | self.w = w 139 | return 140 | 141 | k = 0 142 | for _ in range(self.max_iter): 143 | # (3)计算pk 144 | pk = -Gk.dot(gk) 145 | 146 | # 梯度方向的一维函数 147 | def _f(x): 148 | z = w + np.dot(x, pk).T[0] 149 | return self.calc_f(z) 150 | 151 | # (4)进行一维搜索,找到使得函数最小的lambda 152 | _lambda = fminbound(_f, -self.distance, self.distance) 153 | 154 | delta_k = _lambda * pk 155 | # (5)更新权重 156 | w += delta_k.T[0] 157 | 158 | # (6)计算gk+1 159 | gk1 = calc_gw(w) 160 | # 判断gk1的范数是否小于阈值 161 | if np.linalg.norm(gk1, ord=2) < self.epsilon: 162 | self.w = w 163 | break 164 | # 根据DFP算法的迭代公式(附录B.24公式)计算Gk 165 | yk = gk1 - gk 166 | Pk = delta_k.dot(delta_k.T) / (delta_k.T.dot(yk)) 167 | Qk = Gk.dot(yk).dot(yk.T).dot(Gk) / (yk.T.dot(Gk).dot(yk)) * (-1) 168 | Gk = Gk + Pk + Qk 169 | gk = gk1 170 | 171 | # (7)置k=k+1 172 | k += 1 173 | 174 | self.w = w 175 | self.n_iter_ = k 176 | 177 | def predict(self, x): 178 | result = {} 179 | for y in self._y: 180 | prob = self.calc_pw_yx(x, y, self.w) 181 | result[y] = prob 182 | 183 | return result 184 | 
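# ---- 编者补充(非原书代码):DFP 矩阵更新的独立示例 ----
# fit() 中按附录 B.24 的公式内联计算了 G_{k+1} = G_k + P_k + Q_k,
# 下面给出一个与之等价的独立函数,便于单独验证该更新公式;
# 函数名与注释为编者假设,仅作示意,不影响上面类的行为。
def dfp_update(Gk, delta_k, yk):
    """DFP 拟牛顿法的矩阵迭代:G_{k+1} = G_k + dd^T/(d^T y) - G_k y y^T G_k/(y^T G_k y)"""
    Pk = delta_k.dot(delta_k.T) / (delta_k.T.dot(yk))
    Qk = -Gk.dot(yk).dot(yk.T).dot(Gk) / (yk.T.dot(Gk).dot(yk))
    return Gk + Pk + Qk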
185 | 186 | if __name__ == '__main__': 187 | # 训练数据集 188 | dataset = np.array([['no', 'sunny', 'hot', 'high', 'FALSE'], 189 | ['no', 'sunny', 'hot', 'high', 'TRUE'], 190 | ['yes', 'overcast', 'hot', 'high', 'FALSE'], 191 | ['yes', 'rainy', 'mild', 'high', 'FALSE'], 192 | ['yes', 'rainy', 'cool', 'normal', 'FALSE'], 193 | ['no', 'rainy', 'cool', 'normal', 'TRUE'], 194 | ['yes', 'overcast', 'cool', 'normal', 'TRUE'], 195 | ['no', 'sunny', 'mild', 'high', 'FALSE'], 196 | ['yes', 'sunny', 'cool', 'normal', 'FALSE'], 197 | ['yes', 'rainy', 'mild', 'normal', 'FALSE'], 198 | ['yes', 'sunny', 'mild', 'normal', 'TRUE'], 199 | ['yes', 'overcast', 'mild', 'high', 'TRUE'], 200 | ['yes', 'overcast', 'hot', 'normal', 'FALSE'], 201 | ['no', 'rainy', 'mild', 'high', 'TRUE']]) 202 | 203 | X_train = dataset[:, 1:] 204 | y_train = dataset[:, 0] 205 | 206 | mae = MaxEntDFP(epsilon=1e-4, max_iter=1000, distance=0.01) 207 | mae.fit(X_train, y_train) 208 | print("模型训练迭代次数:{}次".format(mae.n_iter_)) 209 | print("模型权重:{}".format(mae.w)) 210 | 211 | result = mae.predict(['overcast', 'mild', 'high', 'FALSE']) 212 | print("预测结果:", result) 213 | -------------------------------------------------------------------------------- /codes/ch24/cnn-text-classification.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: cnn-text-classification.py 6 | @time: 2023/3/16 16:22 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题24.7 基于CNN的自然语言句子分类模型 9 | """ 10 | import time 11 | import torch 12 | from torch import nn, optim 13 | from torch.utils.data import random_split, DataLoader 14 | from torchtext.data import get_tokenizer, to_map_style_dataset 15 | from torchtext.datasets import AG_NEWS 16 | from torchtext.vocab import build_vocab_from_iterator 17 | 18 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 19 | # 加载AG_NEWS数据集 20 | train_iter, test_iter = AG_NEWS(root='./data') 21 | 22 | # 定义tokenizer 23 | tokenizer = get_tokenizer('basic_english') 24 | 25 | 26 | # 定义数据处理函数 27 | def yield_tokens(data_iter): 28 | for _, text in data_iter: 29 | yield tokenizer(text) 30 | 31 | 32 | # 构建词汇表 33 | vocab = build_vocab_from_iterator(yield_tokens(train_iter), specials=[""]) 34 | vocab.set_default_index(vocab[""]) 35 | 36 | # 将数据集映射到MapStyleDataset格式 37 | train_dataset = list(to_map_style_dataset(train_iter)) 38 | test_dataset = list(to_map_style_dataset(test_iter)) 39 | # 划分验证集 40 | num_train = int(len(train_dataset) * 0.9) 41 | train_dataset, val_dataset = random_split(train_dataset, [num_train, len(train_dataset) - num_train]) 42 | 43 | # 设置文本和标签的处理函数 44 | text_pipeline = lambda x: vocab(tokenizer(x)) 45 | label_pipeline = lambda x: int(x) - 1 46 | 47 | 48 | def collate_batch(batch): 49 | """ 50 | 对数据集进行数据处理 51 | """ 52 | label_list, text_list, offsets = [], [], [0] 53 | for (_label, _text) in batch: 54 | label_list.append(label_pipeline(_label)) 55 | processed_text = torch.tensor(text_pipeline(_text), dtype=torch.int64) 56 | text_list.append(processed_text) 57 | offsets.append(processed_text.size(0)) 58 | label_list = torch.tensor(label_list, dtype=torch.int64) 59 | offsets = torch.tensor(offsets[:-1]).cumsum(dim=0) 60 | text_list = torch.cat(text_list) 61 | return label_list.to(device), text_list.to(device), offsets.to(device) 62 | 63 | # 构建数据集的数据加载器 64 | BATCH_SIZE = 64 65 | train_dataloader = DataLoader(train_dataset, batch_size=BATCH_SIZE, 66 | shuffle=True, 
collate_fn=collate_batch) 67 | valid_dataloader = DataLoader(val_dataset, batch_size=BATCH_SIZE, 68 | shuffle=True, collate_fn=collate_batch) 69 | test_dataloader = DataLoader(test_dataset, batch_size=BATCH_SIZE, 70 | shuffle=True, collate_fn=collate_batch) 71 | 72 | 73 | class CNN_Text(nn.Module): 74 | """ 75 | 基于CNN的文本分类模型 76 | """ 77 | def __init__(self, vocab_size, embed_dim, class_num=4, dropout=0.5, kernel_size: list = None): 78 | super(CNN_Text, self).__init__() 79 | if kernel_size is None: 80 | kernel_size = [3, 4, 5] 81 | self.embedding = nn.EmbeddingBag(vocab_size, embed_dim, sparse=False) 82 | self.convs = nn.ModuleList( 83 | [nn.Conv1d(in_channels=1, out_channels=256, kernel_size=k) for k in kernel_size]) 84 | self.fc = nn.Sequential( 85 | nn.Dropout(p=dropout), 86 | nn.Linear(256 * len(kernel_size), 256), 87 | nn.ReLU(), 88 | nn.Dropout(p=dropout), 89 | nn.Linear(256, class_num) 90 | ) 91 | 92 | def forward(self, text, offsets): 93 | embedded = self.embedding(text, offsets) 94 | embedded = embedded.unsqueeze(0) 95 | embedded = embedded.permute(1, 0, 2) 96 | conv_outputs = [] 97 | for conv in self.convs: 98 | conv_outputs.append(nn.functional.relu(conv(embedded))) 99 | pooled_outputs = [] 100 | for conv_output in conv_outputs: 101 | pooled = nn.functional.max_pool1d(conv_output, conv_output.shape[-1]).squeeze(-1) 102 | pooled_outputs.append(pooled) 103 | cat = torch.cat(pooled_outputs, dim=-1) 104 | return self.fc(cat) 105 | 106 | 107 | # 设置超参数 108 | vocab_size = len(vocab) 109 | embed_dim = 64 110 | class_num = len(set([label for label, _ in train_iter])) 111 | lr = 1e-3 112 | dropout = 0.5 113 | epochs = 10 114 | 115 | # 创建模型、优化器和损失函数 116 | model = CNN_Text(vocab_size, embed_dim, class_num, dropout).to(device) 117 | optimizer = optim.Adam(model.parameters(), lr=lr) 118 | criterion = nn.CrossEntropyLoss() 119 | 120 | 121 | def train(dataloader): 122 | """ 123 | 模型训练 124 | """ 125 | model.train() 126 | 127 | for label, text, offsets in dataloader: 128 | optimizer.zero_grad() 129 | predicted_label = model(text, offsets) 130 | loss = criterion(predicted_label, label) 131 | loss.backward() 132 | torch.nn.utils.clip_grad_norm_(model.parameters(), 0.1) 133 | optimizer.step() 134 | 135 | 136 | def evaluate(dataloader): 137 | """ 138 | 模型验证 139 | """ 140 | model.eval() 141 | total_acc, total_count = 0, 0 142 | 143 | with torch.no_grad(): 144 | for label, text, offsets in dataloader: 145 | predicted_label = model(text, offsets) 146 | criterion(predicted_label, label) 147 | total_acc += (predicted_label.argmax(1) == label).sum().item() 148 | total_count += label.size(0) 149 | return total_acc / total_count 150 | 151 | 152 | max_accu = 0 153 | for epoch in range(1, epochs + 1): 154 | epoch_start_time = time.time() 155 | train(train_dataloader) 156 | accu_val = evaluate(valid_dataloader) 157 | print('-' * 59) 158 | print('| end of epoch {:3d} | time: {:5.2f}s | ' 159 | 'valid accuracy {:8.1f}% '.format(epoch, 160 | time.time() - epoch_start_time, 161 | accu_val * 100)) 162 | print('-' * 59) 163 | if max_accu < accu_val: 164 | best_model = model 165 | max_accu = accu_val 166 | 167 | # 在测试集上测试模型 168 | test_acc = 0.0 169 | with torch.no_grad(): 170 | for label, text, offsets in test_dataloader: 171 | output = best_model(text, offsets) 172 | pred = output.argmax(dim=1) 173 | test_acc += (pred == label).sum().item() 174 | test_acc /= len(test_dataset) 175 | 176 | print(f"Test Acc: {test_acc * 100 :.1f}%") 177 | 178 | # 新闻的分类标签 179 | ag_news_label = {1: "World", 180 | 2: "Sports", 181 | 3: 
"Business", 182 | 4: "Sci/Tec"} 183 | 184 | 185 | def predict(text, text_pipeline): 186 | with torch.no_grad(): 187 | text = torch.tensor(text_pipeline(text)) 188 | output = best_model(text, torch.tensor([0])) 189 | return output.argmax(1).item() + 1 190 | 191 | # 预测一个文本的类别 192 | ex_text_str = """ 193 | Our younger Fox Cubs (Y2-Y4) also had a great second experience of swimming competition in February when they travelled 194 | over to NIS at the end of February to compete in the SSL Development Series R2 event. For students aged 9 and under 195 | these SSL Development Series events are a great introduction to competitive swimming, focussed on fun and participation 196 | whilst also building basic skills and confidence as students build up to joining the full SSL team in Year 5 and beyond. 197 | """ 198 | model = best_model.to("cpu") 199 | print("This is a %s news" % ag_news_label[predict(ex_text_str, text_pipeline)]) 200 | -------------------------------------------------------------------------------- /codes/ch03/my_kd_tree.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # encoding: utf-8 3 | """ 4 | @author: HuRuiFeng 5 | @file: my_kd_tree.py 6 | @time: 2021/8/3 20:10 7 | @project: statistical-learning-method-solutions-manual 8 | @desc: 习题3.3 用kd树的k邻近搜索算法 9 | """ 10 | import json 11 | 12 | 13 | class Node: 14 | """节点类""" 15 | 16 | def __init__(self, value, index, left_child, right_child): 17 | self.value = value.tolist() 18 | self.index = index 19 | self.left_child = left_child 20 | self.right_child = right_child 21 | 22 | def __repr__(self): 23 | return json.dumps(self, indent=3, default=lambda obj: obj.__dict__, ensure_ascii=False) 24 | 25 | 26 | class KDTree: 27 | """kd tree类""" 28 | 29 | def __init__(self, data): 30 | # 数据集 31 | self.data = np.asarray(data) 32 | # kd树 33 | self.kd_tree = None 34 | # 创建平衡kd树 35 | self._create_kd_tree(data) 36 | 37 | def _split_sub_tree(self, data, depth=0): 38 | # 算法3.2第3步:直到子区域没有实例存在时停止 39 | if len(data) == 0: 40 | return None 41 | # 算法3.2第2步:选择切分坐标轴, 从0开始(书中是从1开始) 42 | l = depth % data.shape[1] 43 | # 对数据进行排序 44 | data = data[data[:, l].argsort()] 45 | # 算法3.2第1步:将所有实例坐标的中位数作为切分点 46 | median_index = data.shape[0] // 2 47 | # 获取结点在数据集中的位置 48 | node_index = [i for i, v in enumerate(self.data) if list(v) == list(data[median_index])] 49 | return Node( 50 | # 本结点 51 | value=data[median_index], 52 | # 本结点在数据集中的位置 53 | index=node_index[0], 54 | # 左子结点 55 | left_child=self._split_sub_tree(data[:median_index], depth + 1), 56 | # 右子结点 57 | right_child=self._split_sub_tree(data[median_index + 1:], depth + 1) 58 | ) 59 | 60 | def _create_kd_tree(self, X): 61 | self.kd_tree = self._split_sub_tree(X) 62 | 63 | def query(self, data, k=1): 64 | data = np.asarray(data) 65 | hits = self._search(data, self.kd_tree, k=k, k_neighbors_sets=list()) 66 | dd = np.array([hit[0] for hit in hits]) 67 | ii = np.array([hit[1] for hit in hits]) 68 | return dd, ii 69 | 70 | def __repr__(self): 71 | return str(self.kd_tree) 72 | 73 | @staticmethod 74 | def _cal_node_distance(node1, node2): 75 | """计算两个结点之间的距离""" 76 | return np.sqrt(np.sum(np.square(node1 - node2))) 77 | 78 | def _search(self, point, tree=None, k=1, k_neighbors_sets=None, depth=0): 79 | n = point.shape[1] 80 | if k_neighbors_sets is None: 81 | k_neighbors_sets = [] 82 | if tree is None: 83 | return k_neighbors_sets 84 | 85 | # (1)找到包含目标点x的叶节点 86 | if tree.left_child is None and tree.right_child is None: 87 | # 更新当前k近邻集 88 | return 
        # recursively descend the kd-tree
        if point[0][depth % n] < tree.value[depth % n]:
            direct = 'left'
            next_branch = tree.left_child
        else:
            direct = 'right'
            next_branch = tree.right_child

        if next_branch is not None:
            # first search the subtree on the same side as the target point
            k_neighbors_sets = self._search(point, tree=next_branch, k=k, depth=depth + 1,
                                            k_neighbors_sets=k_neighbors_sets)
            # distance from the target point to the splitting hyperplane through this node
            temp_dist = abs(tree.value[depth % n] - point[0][depth % n])

            # (3)(b) check whether the hypersphere intersects the hyperplane,
            # i.e. whether the region of the other child may contain closer points
            if not (k_neighbors_sets[0][0] < temp_dist and len(k_neighbors_sets) == k):  # cross to the other side
                # if they intersect, continue the nearest-neighbor search recursively
                # (3)(a) examine the current node and update the current k-nearest-neighbor set
                k_neighbors_sets = self._update_k_neighbor_sets(k_neighbors_sets, k, tree, point)  # here `tree` is the parent node
                if direct == 'left':
                    return self._search(point, tree=tree.right_child, k=k, depth=depth + 1,
                                        k_neighbors_sets=k_neighbors_sets)
                else:
                    return self._search(point, tree=tree.left_child, k=k, depth=depth + 1,
                                        k_neighbors_sets=k_neighbors_sets)
        else:
            # if the chosen subtree is empty, decide directly whether to backtrack into the other subtree;
            # the other subtree must be searched whenever the k-nearest-neighbor set is not yet full
            if len(k_neighbors_sets) < k:
                # (3)(a) examine the current node and update the current k-nearest-neighbor set
                k_neighbors_sets = self._update_k_neighbor_sets(k_neighbors_sets, k, tree, point)  # here `tree` is the parent node
                if direct == 'left':
                    return self._search(point, tree=tree.right_child, k=k, depth=depth + 1,
                                        k_neighbors_sets=k_neighbors_sets)
                else:
                    return self._search(point, tree=tree.left_child, k=k, depth=depth + 1,
                                        k_neighbors_sets=k_neighbors_sets)

        return k_neighbors_sets

    def _update_k_neighbor_sets(self, best, k, tree, point):
        # distance between the target point and the current node
        node_distance = self._cal_node_distance(point, tree.value)
        if len(best) == 0:
            best.append((node_distance, tree.index, tree.value))
        elif len(best) < k:
            # the current k-nearest-neighbor set holds fewer than k elements
            self._insert_k_neighbor_sets(best, tree, node_distance)
        else:
            # the node is closer than the farthest point in the current k-nearest-neighbor set
            if best[0][0] > node_distance:
                best = best[1:]
                self._insert_k_neighbor_sets(best, tree, node_distance)
        return best

    @staticmethod
    def _insert_k_neighbor_sets(best, tree, node_distance):
        """Keep the set sorted so that the farthest node comes first"""
        n = len(best)
        for i, item in enumerate(best):
            if item[0] < node_distance:
                # insert before the first closer node, so larger distances stay in front
                best.insert(i, (node_distance, tree.index, tree.value))
                break
        if len(best) == n:
            best.append((node_distance, tree.index, tree.value))


def print_k_neighbor_sets(k, ii, dd):
    if k == 1:
        text = "The nearest neighbor of x is "
    else:
        text = "The %d nearest neighbors of x are " % k

    for i, index in enumerate(ii):
        res = X_train[index]
        if i == 0:
            text += str(tuple(res))
        else:
            text += ", " + str(tuple(res))

    if k == 1:
        text += ", at distance "
    else:
        text += ", at distances "
    for i, dist in enumerate(dd):
        if i == 0:
            text += "%.4f" % dist
        else:
            text += ", %.4f" % dist

    print(text)


if __name__ == '__main__':
    X_train = np.array([[2, 3],
                        [5, 4],
                        [9, 6],
                        [4, 7],
                        [8, 1],
                        [7, 2]])
    kd_tree = KDTree(X_train)
    k = 1
    dists, indices = kd_tree.query(np.array([[3, 4.5]]), k=k)
    print_k_neighbor_sets(k, indices, dists)
    print(kd_tree)
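    # Optional cross-check (not part of the original file): if scikit-learn
    # happens to be installed, its KDTree should agree with the implementation
    # above, reporting (2, 3) at distance ~1.8028 for the query point (3, 4.5).
    try:
        from sklearn.neighbors import KDTree as SkKDTree

        sk_dists, sk_indices = SkKDTree(X_train).query(np.array([[3, 4.5]]), k=k)
        print("sklearn cross-check:", sk_dists, sk_indices)
    except ImportError:
        pass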
--------------------------------------------------------------------------------
/codes/ch26/cnn_seq2seq.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# encoding: utf-8
"""
@author: HuRuiFeng
@file: cnn_seq2seq.py
@time: 2023/3/17 18:44
@project: statistical-learning-method-solutions-manual
@desc: Exercise 26.4: CNN-based sequence-to-sequence model
"""

import torch
import torch.nn as nn
import torch.nn.functional as F


class CNNEncoder(nn.Module):
    r"""CNN encoder for sequence-to-sequence learning.

    Args:
        inp_dim: input dimension of the embedding layer (source vocabulary size).
        emb_dim: output dimension of the embedding layer.
        hid_dim: dimension of the CNN hidden vectors.
        num_layers: number of CNN layers.
        kernel_size: convolution kernel size.
    """

    def __init__(self, inp_dim, emb_dim, hid_dim,
                 num_layers, kernel_size):
        super().__init__()

        self.embed = nn.Embedding(inp_dim, emb_dim)

        self.emb2hid = nn.Linear(emb_dim, int(hid_dim / 2))
        self.hid2emb = nn.Linear(int(hid_dim / 2), emb_dim)

        self.convs = nn.ModuleList([nn.Conv1d(in_channels=emb_dim,
                                              out_channels=hid_dim,
                                              kernel_size=kernel_size,
                                              padding=(kernel_size - 1) // 2)
                                    for _ in range(num_layers)])

    def forward(self, inputs):
        # inputs: (batch_size, src_len) token indices
        # conv_inp: (batch_size, emb_dim, src_len)
        conv_inp = self.embed(inputs).permute(0, 2, 1)

        for _, conv in enumerate(self.convs):
            # apply the convolution
            # conv_out: (batch_size, hid_dim, src_len)
            conv_out = conv(conv_inp)

            # gated linear unit activation (halves the channel dimension)
            conved = F.glu(conv_out, dim=1)

            # residual connection
            conved = self.hid2emb(conved.permute(0, 2, 1)).permute(0, 2, 1)
            conved = conved + conv_inp
            conv_inp = conved

        # element-wise sum of the convolution output and the embeddings,
        # used by the decoder's attention
        # combined: (batch_size, emb_dim, src_len)
        combined = conved + conv_inp

        return conved, combined


class CNNDecoder(nn.Module):
    r"""CNN decoder for sequence-to-sequence learning.

    Args:
        out_dim: input dimension of the embedding layer (target vocabulary size).
        emb_dim: output dimension of the embedding layer.
        hid_dim: dimension of the CNN hidden vectors.
        num_layers: number of CNN layers.
        kernel_size: convolution kernel size.
        trg_pad_idx: padding index for the target sequence.
    """

    def __init__(self, out_dim, emb_dim, hid_dim,
                 num_layers, kernel_size, trg_pad_idx):
        super().__init__()

        self.kernel_size = kernel_size
        self.trg_pad_idx = trg_pad_idx

        self.embed = nn.Embedding(out_dim, emb_dim)

        self.emb2hid = nn.Linear(emb_dim, int(hid_dim / 2))
        self.hid2emb = nn.Linear(int(hid_dim / 2), emb_dim)

        self.attn_hid2emb = nn.Linear(int(hid_dim / 2), emb_dim)
        self.attn_emb2hid = nn.Linear(emb_dim, int(hid_dim / 2))

        self.fc_out = nn.Linear(emb_dim, out_dim)

        self.convs = nn.ModuleList([nn.Conv1d(in_channels=emb_dim,
                                              out_channels=hid_dim,
                                              kernel_size=kernel_size)
                                    for _ in range(num_layers)])

    def calculate_attention(self, embed, conved, encoder_conved, encoder_combined):
        # embed: (batch_size, emb_dim, trg_len)
        # conved: (batch_size, hid_dim / 2, trg_len)
        # encoder_conved, encoder_combined: (batch_size, emb_dim, src_len)
        # first linear projection of the attention layer, back to emb_dim
        conved_emb = self.attn_hid2emb(conved.permute(0, 2, 1)).permute(0, 2, 1)

        # conved_emb: (batch_size, emb_dim, trg_len)
        combined = conved_emb + embed
        energy = torch.matmul(combined.permute(0, 2, 1), encoder_conved)

        # attention: (batch_size, trg_len, src_len)
        attention = F.softmax(energy, dim=2)
        attended_encoding = torch.matmul(attention, encoder_combined.permute(0, 2, 1))
        # attended_encoding: (batch_size, trg_len, emb_dim)
        # second linear projection of the attention layer, back to hid_dim / 2
        attended_encoding = self.attn_emb2hid(attended_encoding)

        # attended_encoding: (batch_size, trg_len, hid_dim / 2)
        # residual connection
        attended_combined = conved + attended_encoding.permute(0, 2, 1)

        return attention, attended_combined

    def forward(self, targets, encoder_conved, encoder_combined):
        # targets: (batch_size, trg_len) token indices
        # encoder_conved: (batch_size, emb_dim, src_len)
        # encoder_combined: (batch_size, emb_dim, src_len)
        conv_inp = self.embed(targets).permute(0, 2, 1)

        batch_size = conv_inp.shape[0]
        emb_dim = conv_inp.shape[1]

        for _, conv in enumerate(self.convs):
            # left-pad the target sequence so the decoder can't "cheat"
            # by looking at future positions
            padding = torch.zeros(batch_size, emb_dim,
                                  self.kernel_size - 1).fill_(self.trg_pad_idx)

            padded_conv_input = torch.cat((padding, conv_inp), dim=-1)

            # padded_conv_input: (batch_size, emb_dim, trg_len + kernel_size - 1)

            # apply the convolution
            conved = conv(padded_conv_input)

            # gated linear unit activation
            conved = F.glu(conved, dim=1)

            # compute the attention scores
            attention, conved = self.calculate_attention(conv_inp, conved,
                                                         encoder_conved,
                                                         encoder_combined)

            # residual connection
            conved = self.hid2emb(conved.permute(0, 2, 1)).permute(0, 2, 1)

            conved = conved + conv_inp
            conv_inp = conved

        output = self.fc_out(conved.permute(0, 2, 1))
        return output, attention


class EncoderDecoder(nn.Module):
    r"""CNN sequence-to-sequence model.
    """

    def __init__(self, encoder, decoder):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder

    def forward(self, enc_inp, dec_inp):
        # encoder: encode the source sentence into vectors for the decoder
        encoder_conved, encoder_combined = self.encoder(enc_inp)

        # decoder: predict next-token probabilities from the encoder states and the
        # decoder input; its attention layer aligns source and target positions
        output, attention = self.decoder(dec_inp, encoder_conved, encoder_combined)

        return output, attention


if __name__ == '__main__':
    # build a toy CNN-based sequence-to-sequence model
    inp_dim, out_dim, emb_dim, hid_dim, num_layers, kernel_size = 8, 10, 12, 16, 1, 3
    encoder = CNNEncoder(inp_dim, emb_dim, hid_dim, num_layers, kernel_size)
    decoder = CNNDecoder(out_dim, emb_dim, hid_dim, num_layers, kernel_size, trg_pad_idx=0)
    model = EncoderDecoder(encoder, decoder)

    enc_inp_seq = "I love you"
    dec_inp_seq = "我 爱 你"
    enc_inp, dec_inp = [], []

    # a hand-made toy vocabulary
    word2vec = {"I": [1, 0, 0, 0],
                "love": [0, 1, 0, 0],
                "you": [0, 0, 1, 0],
                "!": [0, 0, 0, 1],
                "我": [1, 0, 0, 0],
                "爱": [0, 1, 0, 0],
                "你": [0, 0, 1, 0],
                "!": [0, 0, 0, 1]}

    for word in enc_inp_seq.split():
        enc_inp.append(word2vec[word])

    enc_inp = torch.tensor(enc_inp)

    for word in dec_inp_seq.split():
        dec_inp.append(word2vec[word])
    dec_inp = torch.tensor(dec_inp)

    output, attention = model(enc_inp, dec_inp)
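    # Aside (not part of the original script): with the toy dimensions above
    # (a batch of 3 rows, 4 positions each, out_dim=10), the tensor contract of
    # the model can be made visible by printing the output shapes.
    print(output.shape)     # torch.Size([3, 4, 10]) -- scores over the target vocabulary
    print(attention.shape)  # torch.Size([3, 4, 4])  -- target-position x source-position weights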
--------------------------------------------------------------------------------
/docs/chapter21/ch21.md:
--------------------------------------------------------------------------------
# Chapter 21: The PageRank Algorithm

## Exercise 21.1
&emsp;&emsp;Suppose the square matrix $A$ is a stochastic matrix, i.e., each of its elements is non-negative and each of its columns sums to 1. Prove that $A^k$ is still a stochastic matrix, where $k$ is a natural number.

**Solution:**

**Approach:**
1. State the definition of a stochastic matrix;
2. Prove that the product of stochastic matrices is still a stochastic matrix;
3. Prove that $A^k$ is still a stochastic matrix.

**Steps:**

**Step 1: the definition of a stochastic matrix**

&emsp;&emsp;From the definition of a stochastic matrix in Section 21.1.2 of the book:

> The transition matrix is an $n$-th order matrix $M$
> $$ M = [m_{ij}]_{n \times n} \tag{21.1} $$
> satisfying the properties
> $$ \begin{align}
m_{ij} \geqslant 0 \tag{21.2}\\
\displaystyle \sum_{i=1}^n m_{ij} = 1 \tag{21.3}
\end{align} $$
> i.e., each element is non-negative and each column sums to 1; such a matrix $M$ is called a stochastic matrix.

&emsp;&emsp;By assumption, the stochastic matrix $A$ satisfies:
&emsp;&emsp;(1) it is square;
&emsp;&emsp;(2) each element is non-negative;
&emsp;&emsp;(3) each column sums to 1.

**Step 2: prove that the product of stochastic matrices is still a stochastic matrix**
&emsp;&emsp;Suppose the product of the stochastic matrices $A \in R^{n \times n}$ and $B \in R^{n \times n}$ is the matrix $C$, i.e.
$$
C_{ij} = \sum_{k=1}^n A_{ik} B_{kj}
$$
&emsp;&emsp;$\because$ $A$ and $B$ are both stochastic matrices
&emsp;&emsp;$\therefore A_{ik} \geqslant 0, B_{kj} \geqslant 0$
&emsp;&emsp;$\therefore$ clearly, $C_{ij}$ is non-negative, and
$$
\begin{aligned}
\sum_{i=1}^n C_{ij}
&= \sum_{i=1}^n \sum_{k=1}^n A_{ik} B_{kj} \\
&= \sum_{k=1}^n B_{kj} \sum_{i=1}^n A_{ik}
\end{aligned}
$$
&emsp;&emsp;$\because A$ is a stochastic matrix
&emsp;&emsp;$\therefore \displaystyle \sum_{i=1}^n A_{ik} = 1$
&emsp;&emsp;$\therefore \displaystyle \sum_{i=1}^n C_{ij} = \sum_{k=1}^n B_{kj} $
&emsp;&emsp;$\because B$ is a stochastic matrix
&emsp;&emsp;$\therefore \displaystyle \sum_{k=1}^n B_{kj} = 1$
&emsp;&emsp;$\therefore \displaystyle \sum_{i=1}^n C_{ij} = 1$

&emsp;&emsp;The matrix $C$ therefore satisfies:
&emsp;&emsp;(1) it is square;
&emsp;&emsp;(2) each element is non-negative;
&emsp;&emsp;(3) each column sums to 1.
&emsp;&emsp;$\therefore$ the matrix $C$ is a stochastic matrix, i.e., the product of stochastic matrices is still a stochastic matrix.

**Step 3: prove that $A^k$ is still a stochastic matrix**
&emsp;&emsp;By Step 2 and induction on $k$, a product of stochastic matrices is again a stochastic matrix, so $A^k$ is still a stochastic matrix.

## Exercise 21.2
&emsp;&emsp;In Example 21.1, iterating from different initial distribution vectors $R_0$ still yields the same limit vector $R$, i.e., the PageRank. Please verify this.

**Solution:**

**Approach:**

1. State the basic definition of PageRank
2. Implement the iterative algorithm for PageRank under the basic definition
3. Using the transition matrix of Example 21.1, iterate from different initial distribution vectors $R_0$ and verify that the same limit vector $R$ is obtained

**Steps:**

**Step 1: the basic definition of PageRank**

&emsp;&emsp;From Definition 21.3 in Section 21.1.3 of the book:

> **Definition 21.3 (basic definition of PageRank)** Given a strongly connected and aperiodic directed graph with $n$ nodes $v_1, v_2, \cdots, v_n$, define a random walk model on the directed graph, i.e., a first-order Markov chain. The random walk has the property that the transition probabilities from a node to all the nodes reached by its outgoing edges are equal, with transition matrix $M$. This Markov chain has a stationary distribution $R$
> $$ MR = R \tag{21.6} $$
> The stationary distribution $R$ is called the PageRank of the directed graph, and its components are the PageRank values of the individual nodes.
> $$ R = \left[ \begin{array}{c}
P R(v_1) \\
P R(v_2) \\
\vdots \\
P R(v_n)
\end{array} \right] $$
> where $P R(v_i), i=1,2,\cdots, n$, denotes the PageRank value of node $v_i$.

**Step 2: implement the iterative algorithm for PageRank under the basic definition**


```python
import numpy as np


def page_rank_basic(M, R0, max_iter=1000):
    """
    Iteratively solve for PageRank under the basic definition
    :param M: transition matrix
    :param R0: initial distribution vector
    :param max_iter: maximum number of iterations
    :return: Rt: the limit vector
    """
    Rt = R0
    for _ in range(max_iter):
        Rt = np.dot(M, Rt)
    return Rt
```

**Step 3: iterate from different initial distribution vectors $R_0$ and verify that the same limit vector $R$ is obtained**


```python
# transition matrix M of Example 21.1
M = np.array([[0, 1 / 2, 1, 0],
              [1 / 3, 0, 0, 1 / 2],
              [1 / 3, 0, 0, 1 / 2],
              [1 / 3, 1 / 2, 0, 0]])

# try 5 different initial distribution vectors R0
for _ in range(5):
    R0 = np.random.rand(4)
    R0 = R0 / np.linalg.norm(R0, ord=1)
    Rt = page_rank_basic(M, R0)
    print("R0 =", R0)
    print("Rt =", Rt)
    print()
```

    R0 = [0.24051216 0.26555451 0.22997054 0.26396279]
    Rt = [0.33333333 0.22222222 0.22222222 0.22222222]
    
    R0 = [0.0208738 0.60050438 0.26292553 0.11569629]
    Rt = [0.33333333 0.22222222 0.22222222 0.22222222]
    
    R0 = [0.31824487 0.19805355 0.27130894 0.21239265]
    Rt = [0.33333333 0.22222222 0.22222222 0.22222222]
    
    R0 = [0.16258713 0.37625269 0.18512522 0.27603496]
    Rt = [0.33333333 0.22222222 0.22222222 0.22222222]
    
    R0 = [0.27067789 0.16907504 0.31245762 0.24778945]
    Rt = [0.33333333 0.22222222 0.22222222 0.22222222]

&emsp;&emsp;We can see that iterating from different initial distribution vectors $R_0$ still yields the same limit vector $R$.
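&emsp;&emsp;As an additional cross-check (not part of the original solution), the limit vector can also be obtained directly as the eigenvector of $M$ for the eigenvalue 1, normalized to sum to 1 — a minimal sketch:


```python
import numpy as np

M = np.array([[0, 1 / 2, 1, 0],
              [1 / 3, 0, 0, 1 / 2],
              [1 / 3, 0, 0, 1 / 2],
              [1 / 3, 1 / 2, 0, 0]])

# the stationary distribution is the eigenvector for the eigenvalue closest to 1
w, v = np.linalg.eig(M)
r = np.real(v[:, np.argmin(np.abs(w - 1))])
print(r / r.sum())  # ~[0.3333 0.2222 0.2222 0.2222], matching the iteration above
```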
## Exercise 21.3
&emsp;&emsp;Prove that the Markov chain in the general definition of PageRank has a stationary distribution, i.e., that formula (21.11) holds.

**Solution:**

**Approach:**
1. State the general definition of PageRank
2. State the stationary-distribution theorem for Markov chains
3. Prove that the Markov chain in the general definition of PageRank satisfies the conditions of the stationary-distribution theorem

**Steps:**

**Step 1: the general definition of PageRank**

&emsp;&emsp;From Definition 21.4 in Section 21.1.4 of the book:

> **Definition 21.4 (general definition of PageRank)** Given an arbitrary directed graph with $n$ nodes, define a general random walk model on it, i.e., a first-order Markov chain. The transition matrix of the general random walk model is a linear combination of two parts: one part is the basic transition matrix $M$ of the directed graph, in which the transition probabilities from a node to all the nodes it links to are equal; the other part is a completely random transition matrix, in which the transition probability from any node to any node is $1/n$. The combination coefficient is the damping factor $d$ ($0 \leqslant d \leqslant 1$). The Markov chain of this general random walk has a stationary distribution, denoted $R$. The stationary distribution vector $R$ is defined to be the general PageRank of the directed graph. $R$ is determined by
> $$ R = d M R + \frac{1-d}{n} \boldsymbol{1} \tag{21.10} $$
> where $\boldsymbol{1}$ is the $n$-dimensional vector whose components are all 1.

&emsp;&emsp;The component form of the general PageRank definition, from Section 21.1.4 of the book:

> $$ P R(v_i) = d \left( \sum_{v_j \in M(v_i)} \frac{P R(v_j)}{L(v_j)} \right ) + \frac{1-d}{n}, \quad i = 1, 2, \cdots, n \tag{21.11} $$
> where $M(v_i)$ is the set of nodes pointing to node $v_i$, and $L(v_j)$ is the number of edges going out of node $v_j$.

&emsp;&emsp;The book's Section 21.1.4 interprets the general PageRank as follows:

> &emsp;&emsp;The general definition of PageRank means that a web surfer walks randomly through the web as follows: on any page, the surfer either decides, with probability $d$, to follow a hyperlink, jumping with equal probability to one of the pages linked from the current page; or decides, with probability $(1 - d)$, to jump completely at random, moving to any page with equal probability $1/n$. The second mechanism guarantees that the surfer can leave even pages with no outgoing hyperlinks. This guarantees the existence of the stationary distribution, i.e., of the general PageRank, so the general PageRank applies to networks of any structure.

**Step 2: the stationary-distribution theorem for Markov chains**

&emsp;&emsp;From Theorem 21.1 in Section 21.1.3 of the book:

> **Theorem 21.1** An irreducible and aperiodic finite-state Markov chain has a unique stationary distribution, and as time tends to infinity its state distribution converges to this unique stationary distribution.

&emsp;&emsp;From formula (21.22) in Section 21.2.2 of the book:

> The general PageRank equation can be written with a single transition matrix as
> $$ R = \left( d M + \frac{1-d}{n} \boldsymbol{E} \right ) R = A R \tag{21.22} $$
> where $d$ is the damping factor and $\boldsymbol{E}$ is the $n$-th order square matrix whose elements are all 1.

&emsp;&emsp;Combining this with Theorem 21.1, it suffices to show that the transition matrix $A$ of the Markov chain in the general definition of PageRank satisfies the following conditions:
1. $A$ is non-negative;
2. $A$ is irreducible;
3. $A$ is aperiodic;
4. $A$ is finite.

**Step 3: prove that the Markov chain in the general definition of PageRank satisfies the conditions of the theorem**

1. $A$ is non-negative
Every element of the basic transition matrix $M$ is non-negative, so clearly every element of $A$ is also non-negative; indeed, for $d < 1$ every element of $A$ is strictly positive.
2. $A$ is irreducible
A chain is irreducible if there is a non-zero probability of moving from any state to any other state, i.e., the graph is strongly connected. Because the completely random transition matrix is part of $A$ (with positive weight when $d < 1$), every state reaches every other state with positive probability, so $A$ is irreducible.
3. $A$ is aperiodic
Because of the completely random transition matrix, every node has an edge pointing to itself, i.e., there is a path of length 1 from every node back to itself, so $A$ is aperiodic.
4. $A$ is finite
By the general definition of PageRank, the number of web pages is finite, so $A$ is finite.
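&emsp;&emsp;As a numerical illustration (not part of the original solution), one can check that for $d < 1$ the matrix $A = d M + \frac{1-d}{n} \boldsymbol{E}$ is strictly positive, hence irreducible and aperiodic, and that the iteration $R \leftarrow A R$ converges from any starting distribution — a minimal sketch using the transition matrix of Example 21.1:


```python
import numpy as np

M = np.array([[0, 1 / 2, 1, 0],
              [1 / 3, 0, 0, 1 / 2],
              [1 / 3, 0, 0, 1 / 2],
              [1 / 3, 1 / 2, 0, 0]])
d, n = 0.85, 4
A = d * M + (1 - d) / n * np.ones((n, n))

print((A > 0).all())   # True: strictly positive, hence irreducible and aperiodic

R = np.full(n, 1 / n)  # any initial distribution works
for _ in range(100):
    R = A @ R
print(R)               # the general PageRank vector; A is column-stochastic, so R sums to 1
```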
## Exercise 21.4
&emsp;&emsp;Prove that the largest eigenvalue of a stochastic matrix is 1.

**Solution:**

**Approach:**
1. Prove that 1 is an eigenvalue of a stochastic matrix
2. Prove by contradiction that 1 is the largest eigenvalue

**Steps:**

**Step 1: prove that 1 is an eigenvalue of a stochastic matrix**

&emsp;&emsp;Let $A \in R^{n \times n}$ be a stochastic matrix and let $A^T$ be its transpose; then every row of $A^T$ sums to 1. Clearly the all-ones vector $\boldsymbol{1}$ is an eigenvector of $A^T$ with eigenvalue 1, i.e.:
$$
A^T \boldsymbol{1} = 1 \cdot \boldsymbol{1}
$$
&emsp;&emsp;$\because$ $A$ and $A^T$ are transposes of each other, so they have the same eigenvalues
&emsp;&emsp;$\therefore$ 1 is also an eigenvalue of $A$

**Step 2: prove by contradiction that 1 is the largest eigenvalue**

&emsp;&emsp;Suppose there were an eigenvalue $\lambda$ greater than 1, with
$$
A^T \boldsymbol{v} = \lambda \boldsymbol{v}
$$
&emsp;&emsp;Let $v_k$ be the largest element of $\boldsymbol{v}$. Since every element of $A^T$ is non-negative and every row of $A^T$ sums to 1, every element of $\lambda \boldsymbol{v} = A^T \boldsymbol{v}$ is a convex combination of the elements of $\boldsymbol{v}$.

> [The notion of a convex combination](https://baike.baidu.com/item/%E5%87%B8%E7%BB%84%E5%90%88/18999826?fr=aladdin)
> Given vectors $\{x_i\}, i=1,2,\cdots, n$, if there are real numbers $\lambda_i \geqslant 0$ with $\displaystyle \sum_{i=1}^n \lambda_i = 1$, then $\displaystyle \sum_{i=1}^n \lambda_i x_i$ is called a convex combination (convex linear combination) of the vectors $\{x_i\}$.

&emsp;&emsp;Hence every element of $\lambda \boldsymbol{v}$ is at most $v_k$, i.e.:
$$
\sum_{j=1}^n (A^T)_{ij} v_{j} = \lambda v_{i} \leqslant v_k
$$

&emsp;&emsp;If $\lambda > 1$, then taking $i = k$ would give $\lambda v_{k} > v_k$, contradicting the inequality above; so the assumption that some eigenvalue $\lambda$ exceeds 1 fails.

&emsp;&emsp;Therefore the largest eigenvalue of $A^T$ is 1, and hence the largest eigenvalue of $A$ is also 1.
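&emsp;&emsp;A one-line numerical check (not part of the original solution), using the column-stochastic transition matrix of Example 21.1:


```python
import numpy as np

A = np.array([[0, 1 / 2, 1, 0],
              [1 / 3, 0, 0, 1 / 2],
              [1 / 3, 0, 0, 1 / 2],
              [1 / 3, 1 / 2, 0, 0]])  # each column sums to 1

print(np.max(np.abs(np.linalg.eigvals(A))))  # ~1.0: the spectral radius is 1
```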
## References

【1】The notion of a convex combination: https://baike.baidu.com/item/%E5%87%B8%E7%BB%84%E5%90%88/18999826?fr=aladdin
--------------------------------------------------------------------------------
/docs/chapter01/ch01.md:
--------------------------------------------------------------------------------
# Chapter 1: Introduction to Statistical Learning Methods

## Exercise 1.1
&emsp;&emsp;Describe the three elements of statistical learning methods in the maximum likelihood estimation and the Bayesian estimation of the Bernoulli model. The Bernoulli model is a probability distribution defined on a random variable taking the values 0 and 1. Suppose we observe $n$ independent outcomes generated by the Bernoulli model, of which $k$ outcomes equal 1; the probability of the outcome 1 can then be estimated by maximum likelihood estimation or by Bayesian estimation.

**Solution:**

**Approach:**
1. Write down the Bernoulli model;
2. Give the three elements of statistical learning methods for the maximum likelihood estimation and the Bayesian estimation of the Bernoulli model;
3. Estimate the probability of the outcome 1 by maximum likelihood estimation of the Bernoulli model;
4. Estimate the probability of the outcome 1 by Bayesian estimation of the Bernoulli model.

**Steps:**

**Step 1: the Bernoulli model**
&emsp;&emsp;By assumption, the Bernoulli model is a probability distribution defined on a random variable taking the values 0 and 1.
&emsp;&emsp;For the random variable $X$:
$$
P(X=1)=p \\ P(X=0)=1-p
$$

where $p$ is the probability that the random variable $X$ takes the value 1, and $1-p$ the probability that it takes the value 0.
&emsp;&emsp;Since the random variable $X$ takes only the two values 0 and 1, its probability distribution, i.e., the Bernoulli model, can be written as:
$$
P_p(X=x)=p^x (1-p)^{(1-x)}, \quad 0 \leqslant p \leqslant 1
$$

&emsp;&emsp;The hypothesis space of the Bernoulli model is therefore:
$$
\mathcal{F}=\{P|P_p(X)=p^x(1-p)^{(1-x)}, p\in [0,1] \}
$$

**Step 2: the three elements of statistical learning for the maximum likelihood and Bayesian estimates of the Bernoulli model**
(1) Maximum likelihood estimation
&emsp;&emsp;Model: the Bernoulli model
&emsp;&emsp;Strategy: empirical risk minimization. Maximum likelihood estimation is equivalent to empirical risk minimization when the model is a conditional probability distribution and the loss function is the log loss.
&emsp;&emsp;Algorithm: maximize the likelihood: $\displaystyle \mathop{\arg\max} \limits_{p} L(p|X)= \mathop{\arg\max} \limits_{p} P(X|p)$

(2) Bayesian estimation
&emsp;&emsp;Model: the Bernoulli model
&emsp;&emsp;Strategy: structural risk minimization. Maximum a posteriori estimation in Bayesian estimation is equivalent to structural risk minimization when the model is a conditional probability distribution, the loss function is the log loss, and the model complexity is expressed through the prior probability of the model.
&emsp;&emsp;Algorithm: maximize the posterior probability: $\displaystyle \mathop{\arg\max} \limits_{p} \pi (p|X)= \displaystyle \mathop{\arg\max} \limits_{p} \frac{P(X|p)\pi(p)}{\int P(X|p)\pi(p)dp}$


**Step 3: maximum likelihood estimation for the Bernoulli model**

&emsp;&emsp;The general steps of maximum likelihood estimation
&emsp;&emsp;(see the Wikipedia entry: https://en.wikipedia.org/wiki/Maximum_likelihood_estimation):
> 1. Write down the probability distribution of the random variable;
> 2. Write down the likelihood function;
> 3. Take the logarithm of the likelihood function to obtain the log-likelihood, and simplify;
> 4. Differentiate with respect to the parameter and set the derivative to 0;
> 5. Solve the likelihood equation to obtain the value of the parameter.

&emsp;&emsp;For $n$ independent outcomes generated by the Bernoulli model, of which $k$ equal 1, the likelihood function is:
$$
\begin{aligned} L(p|X) &= P(X|p) \\
&= \prod_{i=1}^{n} P(x^{(i)}|p) \\
&=p^k (1-p)^{n-k}
\end{aligned}
$$

&emsp;&emsp;Taking logarithms gives the log-likelihood:
$$
\begin{aligned} \log L(p|X) &= \log p^k (1-p)^{n-k} \\
&= \log(p^k) + \log\left( (1-p)^{n-k} \right) \\
&= k\log p + (n-k)\log (1-p)
\end{aligned}
$$

&emsp;&emsp;Solving for the parameter $p$:
$$
\begin{aligned}
\hat{p} &= \mathop{\arg\max} \limits_{p} L(p|X) \\
&= \mathop{\arg\max} \limits_{p} \left[ k\log p + (n-k)\log (1-p) \right]
\end{aligned}
$$

&emsp;&emsp;Differentiating with respect to $p$ and finding the $p$ at which the derivative vanishes:
$$
\begin{aligned}
\frac{\partial \log L(p)}{\partial p} &= \frac{k}{p} - \frac{n-k}{1-p} \\
&= \frac{k(1-p) - p(n-k)}{p(1-p)} \\
&= \frac{k-np}{p(1-p)}
\end{aligned}
$$

&emsp;&emsp;Setting $\displaystyle \frac{\partial \log L(p)}{\partial p} = 0$ gives $k-np=0$, i.e., $\displaystyle p=\frac{k}{n}$
&emsp;&emsp;Therefore $\displaystyle P(X=1)=\frac{k}{n}$
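&emsp;&emsp;A quick numerical check (not part of the original solution): on a grid of candidate values of $p$, the log-likelihood is indeed maximized at $k/n$ — the values $n=10$, $k=7$ are arbitrary:


```python
import numpy as np

n, k = 10, 7
p_grid = np.linspace(0.001, 0.999, 999)
log_lik = k * np.log(p_grid) + (n - k) * np.log(1 - p_grid)
print(p_grid[np.argmax(log_lik)])  # 0.7 == k / n
```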
**Step 4: Bayesian estimation for the Bernoulli model**

Solution 1: the maximum a posteriori estimate

&emsp;&emsp;The general steps of Bayesian estimation (maximum a posteriori estimation)
&emsp;&emsp;(see the Wikipedia entry: https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation):
> 1. Determine the prior probability $p(\theta)$ of the parameter $\theta$
> 2. From the sample set $D=\{ x_1, x_2, \ldots, x_n \}$, compute the likelihood function $P(D|\theta)$: $\displaystyle P(D|\theta)=\prod_{i=1}^n P(x_i|\theta)$
> 3. Using Bayes' formula, write down the posterior maximization problem:
> $$ \mathop{\arg\max} \limits_{\theta} P(\theta|D)=\mathop{\arg\max} \limits_{\theta} \frac{P(D|\theta)P(\theta)}{\displaystyle \int \limits_\Theta P(D|\theta) P(\theta) d \theta} = \mathop{\arg\max} \limits_{\theta} P(D|\theta)P(\theta)$$
> 4. Differentiate to find the parameter value at which the posterior probability is largest

&emsp;&emsp;For the parameter $p$ of the Bernoulli model, under Bayesian estimation this parameter is itself a random variable.
&emsp;&emsp;If the prior distribution $\pi(p)$ of $p$ is the uniform distribution, the maximum a posteriori estimate reduces to the maximum likelihood estimate.
&emsp;&emsp;In general, in Bayesian estimation, if the posterior distribution belongs to the same family of distributions as the prior (a conjugate pair), the prior is called a conjugate prior for the likelihood function.

&emsp;&emsp;From [Maximum likelihood estimation and Bayesian estimation](https://zhuanlan.zhihu.com/p/61593112):
> Choosing a conjugate prior has, among others, the following advantages:
> (1) it matches intuition: the prior and the posterior have the same form;
> (2) the posterior has a closed analytic form;
> (3) it yields a chain of priors: the current posterior can serve as the prior for the next computation, and since the forms are identical the chain can be continued.

&emsp;&emsp;The conjugate prior of the Bernoulli distribution is the Beta distribution, so here we take the prior $\pi(p)$ to be a Beta distribution.

> **Background: the Beta distribution**
> (source: Wikipedia, https://zh.wikipedia.org/wiki/%CE%92%E5%88%86%E5%B8%83)
> The Beta distribution is a family of continuous probability distributions defined on the interval ${\displaystyle (0,1)}$, with two parameters $\alpha, \beta>0$.
> Density: $\displaystyle f(x; \alpha, \beta)= \frac{1}{{\rm B}(\alpha, \beta)}x^{(\alpha-1)}(1-x)^{\beta-1}$
> where ${\rm B}(\alpha, \beta)$ is the Beta function, $\displaystyle {\rm B}(\alpha, \beta) =\int _{0}^{1} x^{\alpha-1}(1-x)^{\beta-1}dx$
> A random variable $X$ following the Beta distribution with parameters $\alpha, \beta$ is written $X \sim {\rm Be}(\alpha, \beta)$
> Mean: $\displaystyle {\rm E}(X) = \frac{\alpha}{\alpha+\beta}$
> Relation to the uniform distribution: for $\alpha=1, \beta=1$, the Beta distribution is the uniform distribution

&emsp;&emsp;The prior of $p$ is:
$$
\displaystyle \pi (p) = \frac{1}{B(\alpha, \beta)} p^{(\alpha-1)} (1-p)^{\beta-1}
$$

&emsp;&emsp;The likelihood function is the same as in Step 3:
$$
\begin{aligned} L(p|X) &= P(X|p) \\
&= \prod_{i=1}^{n} P(x^{(i)}|p) \\
&=p^k (1-p)^{n-k}
\end{aligned}
$$

&emsp;&emsp;Maximizing the posterior probability and solving for $p$:
$$
\begin{aligned}
\hat{p} &= \mathop{\arg\max} \limits_{p} \frac{P(X|p)\pi(p)}{\displaystyle \int P(X|p)\pi(p)dp} \\
&= \mathop{\arg\max} \limits_{p} P(X|p)\pi(p) \\
&= \mathop{\arg\max} \limits_{p} p^k (1-p)^{n-k} \frac{1}{B(\alpha, \beta)} p^{(\alpha-1)} (1-p)^{\beta-1} \\
&= \mathop{\arg\max} \limits_{p} \frac{1}{B(\alpha, \beta)} p^{k+\alpha-1} (1-p)^{n-k+\beta-1}
\end{aligned}
$$

&emsp;&emsp;Let $\displaystyle g(p) = \frac{1}{B(\alpha, \beta)} p^{k+\alpha-1} (1-p)^{n-k+\beta-1}$. Taking the logarithm of $g(p)$ (under which the constant factor $\frac{1}{B(\alpha, \beta)}$ disappears upon differentiation) and differentiating with respect to $p$ gives
$$ \frac{\partial \log g(p)}{\partial p} = \frac{k+\alpha-1}{p} - \frac{n-k+\beta-1}{1-p} $$

&emsp;&emsp;Setting this to 0 yields $\displaystyle \hat{p} = \frac{k+\alpha-1}{n+\alpha+\beta-2}$, where $\alpha, \beta$ are the parameters of the Beta distribution.

&emsp;&emsp;The maximum a posteriori estimate is therefore $\displaystyle P(X=1)=\frac{k+\alpha-1}{n+\alpha+\beta-2}$

Solution 2: the expectation of the posterior distribution

&emsp;&emsp;Estimation via the posterior mean
&emsp;&emsp;(see the Wikipedia entry: https://en.wikipedia.org/wiki/Bayes_estimator):
> &emsp;&emsp;Maximum a posteriori estimation yields the mode of the posterior distribution of the model parameter $\theta$ and is usually regarded as a point estimate. Since the hallmark of Bayesian methods is to use distributions to summarize data and draw inferences, Bayesian methods tend instead to report the posterior mean or median, together with credible intervals.
> &emsp;&emsp;Bayesian estimation using the expectation (mean) of the posterior distribution as the parameter estimate shares the first two steps with maximum a posteriori estimation; steps 3 and 4 become:
> 3. Using Bayes' formula, compute the posterior of $\theta$: $\displaystyle P(\theta|D)=\frac{P(D|\theta)P(\theta)}{\displaystyle \int \limits_\Theta P(D|\theta) P(\theta) d \theta}$
> 4. Compute the expectation of $\theta$ under the posterior, giving the Bayes estimate: $\displaystyle \hat{\theta}=\int \limits_{\Theta} \theta \cdot P(\theta|D) d \theta$

&emsp;&emsp;Given the likelihood function and the prior of the parameter $p$, the posterior of $p$ is:
$$
\begin{aligned}
P(p|X) &= \frac{P(X|p)\pi(p)}{\displaystyle \int P(X|p)\pi(p)dp} \\
&=\frac{\displaystyle \frac{1}{B(\alpha, \beta)} p^{k+\alpha-1} (1-p)^{n-k+\beta-1}}{\displaystyle \int \frac{1}{B(\alpha, \beta)} p^{k+\alpha-1} (1-p)^{n-k+\beta-1} dp} \\
&=\frac{ p^{k+\alpha-1} (1-p)^{n-k+\beta-1}}{\displaystyle \int p^{k+\alpha-1} (1-p)^{n-k+\beta-1} dp} \\
&=\frac{1}{B(k+\alpha, n-k+\beta)} p^{k+\alpha-1} (1-p)^{n-k+\beta-1} \\
&\sim \text{Be}(k+\alpha, n-k+\beta)
\end{aligned}
$$

&emsp;&emsp;The expectation of the posterior distribution is:
$$
\begin{aligned}
E_p(p|X)&=E_p({\rm Be}(k+\alpha, n-k+\beta)) \\
&=\frac{k+\alpha}{(k+\alpha)+(n-k+\beta)} \\
&=\frac{k+\alpha}{n+\alpha+\beta}
\end{aligned}
$$

&emsp;&emsp;Taking the expectation of the posterior distribution as the Bayes estimate of the parameter:
$$
\displaystyle \hat{p}=\frac{k+\alpha}{n+\alpha+\beta}
$$

&emsp;&emsp;The Bayes estimate is therefore $\displaystyle P(X=1)=\frac{k+\alpha}{n+\alpha+\beta}$
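&emsp;&emsp;A short numerical comparison of the three estimates (not part of the original solution); scipy is used only to cross-check the mean of the ${\rm Be}(k+\alpha, n-k+\beta)$ posterior, and the values $n=10$, $k=7$, $\alpha=\beta=2$ are arbitrary:


```python
import numpy as np
from scipy.stats import beta as beta_dist

n, k = 10, 7
a0, b0 = 2.0, 2.0  # Beta prior parameters alpha, beta

p_mle = k / n                             # maximum likelihood estimate
p_map = (k + a0 - 1) / (n + a0 + b0 - 2)  # posterior mode (maximum a posteriori)
p_mean = (k + a0) / (n + a0 + b0)         # posterior mean
print(p_mle, p_map, p_mean)               # 0.7  0.666...  0.6428...

# cross-check against the Be(k + alpha, n - k + beta) posterior
posterior = beta_dist(k + a0, n - k + b0)
print(posterior.mean())                   # 0.6428..., matching p_mean
```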
## Exercise 1.2
&emsp;&emsp;Derive maximum likelihood estimation from empirical risk minimization. Prove that when the model is a conditional probability distribution and the loss function is the log loss, empirical risk minimization is equivalent to maximum likelihood estimation.

**Solution:**

**Approach:**
1. From the definition of empirical risk minimization, write down the objective function;
2. Rewrite the objective function using the log loss;
3. The conclusion follows from the definition of the likelihood function and the general steps of maximum likelihood estimation (taking logarithms along the way).

**Steps:**
&emsp;&emsp;Suppose the conditional probability distribution of the model is $P_{\theta}(Y|X)$ and the sample set is $D=\{(x_1,y_1),(x_2,y_2),\ldots,(x_N,y_N)\}$. By formula (1.12) on page 17 of the book, the log loss is:
$$
L(Y,P(Y|X)) = -\log P(Y|X)
$$
&emsp;&emsp;By formula (1.15) on page 18 of the book, finding the optimal model by empirical risk minimization amounts to solving the optimization problem:
$$
\min \limits_{f \in \mathcal{F}} \frac{1}{N} \sum_{i=1}^N L(y_i, f(x_i))
$$
&emsp;&emsp;Combining the two formulas above gives the empirical risk minimization problem:
$$
\begin{aligned}
\mathop{\arg\min} \limits_{f \in \mathcal{F}} \frac{1}{N} \sum_{i=1}^N L(y_i, f(x_i)) &= \mathop{\arg\min} \limits_{f \in \mathcal{F}} \frac{1}{N} \sum_D [-\log P(Y|X)] \\
&= \mathop{\arg\max} \limits_{f \in \mathcal{F}} \frac{1}{N} \sum_D \log P(Y|X) \\
&= \mathop{\arg\max} \limits_{f \in \mathcal{F}} \frac{1}{N} \log \prod_D P(Y|X) \\
&= \mathop{\arg\max} \limits_{f \in \mathcal{F}} \log \prod_D P(Y|X)
\end{aligned}
$$
where the last step uses the fact that the positive constant factor $\frac{1}{N}$ does not change the maximizer.
&emsp;&emsp;From the definition of the likelihood function, $\displaystyle L(\theta)=\prod_D P_{\theta}(Y|X)$, and the general steps of maximum likelihood estimation, we get:
$$
\mathop{\arg\min} \limits_{f \in \mathcal{F}} \frac{1}{N} \sum_{i=1}^N L(y_i, f(x_i)) = \mathop{\arg\max} \limits_{f \in \mathcal{F}} \log L(\theta)
$$
&emsp;&emsp;That is, empirical risk minimization is equivalent to maximum likelihood estimation, which completes the proof.

## References

【1】The general steps of maximum likelihood estimation (Wikipedia): https://en.wikipedia.org/wiki/Maximum_likelihood_estimation
【2】The general steps of Bayesian (maximum a posteriori) estimation (Wikipedia): https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation
【3】Maximum likelihood estimation and Bayesian estimation (Zhihu): https://zhuanlan.zhihu.com/p/61593112
【4】The Beta distribution (Wikipedia): https://zh.wikipedia.org/wiki/%CE%92%E5%88%86%E5%B8%83
【5】The expectation of the posterior distribution (Wikipedia): https://en.wikipedia.org/wiki/Bayes_estimator
--------------------------------------------------------------------------------