├── LICENSE
├── README.md
├── hw1
├── Homework1.png
├── hw1.ipynb
├── hw1_15_train.dat
├── hw1_18_test.dat
└── hw1_18_train.dat
├── hw2
├── Homework2.png
├── hw2.ipynb
├── hw2_test.dat
└── hw2_train.dat
├── hw3
├── Homework3.png
├── hw3.ipynb
├── hw3_test.dat
└── hw3_train.dat
├── hw4
├── Homework4.png
├── hw4.ipynb
├── hw4_test.dat
└── hw4_train.dat
├── hw5
├── Homework5.png
├── features_test.dat
├── features_train.dat
└── hw5.ipynb
├── hw6
├── Homework6.png
├── hw2_adaboost_test.dat
├── hw2_adaboost_train.dat
├── hw2_lssvm_all.dat
└── hw6.ipynb
├── hw7
├── Homework7.png
├── hw3_test.dat
├── hw3_train.dat
└── hw7.ipynb
├── hw8
├── Homework8.png
├── hw4_kmeans_train.dat
├── hw4_knn_test.dat
├── hw4_knn_train.dat
├── hw4_nnet_test.dat
├── hw4_nnet_train.dat
└── hw8.ipynb
└── lecture
├── MLF1-1.md
├── MLF1-2.md
├── MLF1.md
├── MLF1
├── pic1.png
├── pic10.png
├── pic11.png
├── pic12.png
├── pic13.png
├── pic14.png
├── pic15.png
├── pic1_1.png
├── pic2.png
├── pic3.png
├── pic4.png
├── pic5.png
├── pic6.png
├── pic7.png
├── pic8.png
└── pic9.png
├── MLF2-1.md
├── MLF2-1
├── Q10.png
├── Q10_1.png
├── Q11_1.png
├── Q14.png
├── Q6.png
├── Q8.png
└── output_2_1.png
├── MLF2-2.md
├── MLF2-2
└── Q6.png
├── MLF2.md
├── MLF2
├── pic1.png
├── pic10.png
├── pic11.png
├── pic12.png
├── pic13.png
├── pic14.png
├── pic15.png
├── pic16.png
├── pic17.png
├── pic2.png
├── pic3.png
├── pic4.png
├── pic5.png
├── pic6.png
├── pic7.png
├── pic8.png
└── pic9.png
├── MLF3-1.md
├── MLF3-1
├── Q1.png
└── Q3.png
├── MLF3-2.md
├── MLF3.md
├── MLF3
├── pic1.png
├── pic10.png
├── pic11.png
├── pic12.png
├── pic13.png
├── pic14.png
├── pic15.png
├── pic16.png
├── pic18.png
├── pic2.png
├── pic3.png
├── pic4.png
├── pic5.png
├── pic6.png
├── pic7.png
├── pic8.png
└── pic9.png
├── MLF4-1.md
├── MLF4-1
├── Q4_1.png
└── Q5.png
├── MLF4-2.md
├── MLF4-2
└── hw4.png
├── MLF4.md
├── MLF4
├── pic1.png
├── pic10.png
├── pic11.png
├── pic12.png
├── pic13.png
├── pic2.png
├── pic3.png
├── pic4.png
├── pic5.png
├── pic6.png
├── pic7.png
├── pic8.png
└── pic9.png
├── MLT1-1.md
├── MLT1-1
├── output_1_0.png
├── pic1.png
├── pic2.png
└── pic3.png
├── MLT1-2.md
├── MLT1.md
├── MLT1
├── pic1.png
├── pic10.png
├── pic11.png
├── pic12.png
├── pic13.png
├── pic14.png
├── pic15.png
├── pic16.png
├── pic17.png
├── pic18.png
├── pic19.png
├── pic2.png
├── pic20.png
├── pic21.png
├── pic3.png
├── pic4.png
├── pic5.png
├── pic6.png
├── pic7.png
├── pic8.png
└── pic9.png
├── MLT2-1.md
├── MLT2-1
└── pic1.png
├── MLT2-2.md
├── MLT2.md
├── MLT2
├── pic1.png
├── pic10.png
├── pic11.png
├── pic12.png
├── pic13.png
├── pic14.png
├── pic15.png
├── pic16.png
├── pic17.png
├── pic18.png
├── pic19.png
├── pic2.png
├── pic20.png
├── pic21.png
├── pic22.png
├── pic23.png
├── pic24.png
├── pic25.png
├── pic26.png
├── pic27.png
├── pic3.png
├── pic4.png
├── pic5.png
├── pic6.png
├── pic7.png
├── pic8.png
└── pic9.png
├── MLT3-1.md
├── MLT3-1
├── pic1.png
├── pic2.png
└── pic3.png
├── MLT3-2.md
├── MLT3.md
├── MLT3
├── pic1.png
├── pic10.png
├── pic11.png
├── pic12.png
├── pic13.png
├── pic14.png
├── pic15.png
├── pic16.png
├── pic17.png
├── pic18.png
├── pic19.png
├── pic2.png
├── pic20.png
├── pic21.png
├── pic22.png
├── pic23.png
├── pic24.png
├── pic25.png
├── pic26.png
├── pic27.png
├── pic28.png
├── pic3.png
├── pic4.png
├── pic5.png
├── pic6.png
├── pic7.png
├── pic8.png
└── pic9.png
├── MLT4-1.md
├── MLT4-1
├── pic1.png
└── pic2.png
├── MLT4-2.md
├── MLT4.md
└── MLT4
├── pic1.png
├── pic10.png
├── pic11.png
├── pic12.png
├── pic13.png
├── pic14.png
├── pic15.png
├── pic16.png
├── pic17.png
├── pic18.png
├── pic19.png
├── pic2.png
├── pic20.png
├── pic21.png
├── pic22.png
├── pic23.png
├── pic24.png
├── pic25.png
├── pic26.png
├── pic27.png
├── pic28.png
├── pic29.png
├── pic3.png
├── pic30.png
├── pic31.png
├── pic32.png
├── pic33.png
├── pic34.png
├── pic35.png
├── pic36.png
├── pic37.png
├── pic38.png
├── pic39.png
├── pic4.png
├── pic40.png
├── pic5.png
├── pic6.png
├── pic7.png
├── pic8.png
└── pic9.png
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 Ace
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Machine Learning Foundations Homework
2 |
3 | - Lec1~Lec4: Homework 1
4 | - Lec5~Lec8: Homework 2
5 | - Lec9~Lec12: Homework 3
6 | - Lec13~Lec16: Homework 4
7 |
8 | ## Machine Learning Techniques Homework
9 |
10 | - Lec1~Lec4: Homework 1
11 | - Lec5~Lec8: Homework 2
12 | - Lec9~Lec12: Homework 3
13 | - Lec13~Lec16: Homework 4
14 |
15 | ## Notes
16 |
17 | ① This repository mainly holds the code for the programming problems. For easy viewing, it is kept in Jupyter notebook format.
18 |
19 | ② The problem statements of each homework are included as .png images, together with the corresponding datasets.
20 |
21 | ③ For the detailed solutions and answers, see the [blog](https://acecoooool.github.io/).
22 |
23 | Friendly reminders:
24 |
25 | 1. Since the blog pages sometimes load slowly, a copy of all the `markdown` files is also kept in this repository. (They were written with [typora](https://www.typora.io/); you can clone the repo and view them locally.)
26 | 2. The contents of this repository were written several years ago, so some of the code may be rather dated, and I no longer remember many of the details. I will find time to revisit this material soon.
27 |
28 |
--------------------------------------------------------------------------------
/hw1/Homework1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/hw1/Homework1.png
--------------------------------------------------------------------------------
/hw2/Homework2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/hw2/Homework2.png
--------------------------------------------------------------------------------
/hw2/hw2_train.dat:
--------------------------------------------------------------------------------
1 | 8.105 -3.500 4.769 4.541 -9.829 5.252 3.838 -3.408 -4.824 -1
2 | -6.273 -2.097 9.404 1.143 3.487 -5.206 0.061 5.024 -6.687 1
3 | 1.624 -1.173 4.260 -3.607 -6.632 4.431 -8.355 7.206 -8.977 1
4 | -10.000 7.758 -2.670 -8.880 -1.099 -9.183 -4.086 8.962 5.841 1
5 | 8.464 1.762 2.729 2.724 8.155 6.096 -2.844 9.800 3.302 -1
6 | -0.135 6.193 7.705 7.195 7.313 -3.395 8.012 -6.773 -4.433 1
7 | 0.934 -8.379 -2.083 -6.337 4.346 -3.928 9.759 -8.499 -4.128 1
8 | 8.923 -0.018 -6.837 6.628 -2.823 -9.524 -6.767 -4.811 -6.296 1
9 | -9.028 7.010 -9.063 -1.111 -9.328 5.282 4.960 -9.569 6.784 -1
10 | -9.706 1.392 6.562 -6.543 -1.980 -6.261 -6.067 1.254 -1.071 1
11 | -6.891 -4.157 1.057 -5.954 4.732 1.729 9.328 -0.308 2.160 1
12 | -0.845 -5.858 -0.486 -4.282 -2.401 7.534 -0.543 1.531 -1.212 -1
13 | -9.596 -3.929 9.556 1.461 0.117 4.288 -6.810 -0.555 -6.020 1
14 | 9.124 7.287 -7.506 -1.363 -6.995 0.093 -3.828 2.462 -8.376 1
15 | 7.514 7.608 -0.175 7.071 -0.931 9.942 1.359 2.259 -0.613 -1
16 | -1.805 -2.265 -9.636 0.689 6.373 -6.631 -9.218 -7.456 5.831 -1
17 | -3.048 8.819 -8.509 6.777 5.889 0.560 6.719 -2.752 -7.181 -1
18 | -5.873 -9.376 -3.226 -5.509 1.313 -6.853 -2.140 2.095 -4.309 -1
19 | 4.250 -5.350 -6.683 5.741 -8.574 9.207 -3.699 8.145 -3.545 -1
20 | 8.587 -0.571 -7.906 -4.638 3.920 3.407 -1.491 -8.220 -4.498 1
21 | -8.107 0.089 -7.650 -4.790 -4.171 -6.223 -5.583 2.130 -8.078 1
22 | -8.616 9.386 -9.095 -6.522 -5.252 4.825 6.886 3.256 6.605 -1
23 | -10.000 -3.258 -1.998 -7.559 1.952 3.832 -3.782 6.369 -4.038 1
24 | -4.212 -1.462 -2.603 -3.308 2.016 2.144 -8.483 -1.099 -4.600 1
25 | 8.112 3.770 -5.551 -3.885 6.211 6.401 9.946 -7.571 2.770 -1
26 | -8.868 0.669 5.703 -1.472 7.361 -2.282 -9.328 8.879 6.620 1
27 | 6.635 5.312 5.358 -8.916 -8.574 1.569 7.485 -8.628 3.998 1
28 | 7.432 -8.466 -9.884 3.135 0.062 7.477 -9.147 0.734 6.355 -1
29 | -3.031 2.371 -4.132 -7.674 3.454 -2.706 3.895 0.939 -1.334 1
30 | -10.000 -1.108 7.883 -7.978 -7.973 -2.055 9.498 -7.120 8.679 1
31 | 10.000 2.703 -6.408 -4.365 5.029 7.046 2.929 -1.076 -2.015 -1
32 | 3.891 1.182 -0.468 1.774 3.203 1.559 9.719 2.702 4.439 -1
33 | -4.895 7.533 3.229 -1.304 -6.832 -1.742 -4.258 6.097 7.182 1
34 | -6.454 -0.875 4.457 3.077 -9.100 -2.340 -5.364 -9.381 -10.000 -1
35 | 4.393 8.004 -5.783 -2.378 -3.299 -2.615 5.880 2.443 -6.518 1
36 | 0.337 2.622 -4.467 -5.206 -4.301 -3.567 2.454 0.335 -2.949 1
37 | -1.583 7.670 6.972 2.634 -4.708 -6.327 -9.980 -8.828 6.116 1
38 | -8.917 1.634 -6.017 -3.384 6.428 -0.318 3.049 -1.118 -10.000 1
39 | -4.864 1.848 0.375 -7.892 -5.517 5.667 -4.218 -5.498 6.839 -1
40 | 5.545 3.762 -5.996 9.528 -9.622 -9.568 -0.789 3.427 -0.686 -1
41 | 1.361 -5.169 -3.709 -8.264 -3.060 0.774 7.403 2.721 5.276 -1
42 | 7.686 4.347 -0.279 -8.310 3.875 0.099 -7.878 -6.914 -6.474 1
43 | 6.890 -7.670 -8.421 -6.819 -5.934 -1.481 3.954 -8.532 -8.760 1
44 | -1.530 8.711 -0.993 8.191 -9.599 -7.117 -1.710 -7.477 -4.031 1
45 | -4.384 3.295 1.583 -2.805 6.476 5.649 5.713 0.430 7.117 -1
46 | -2.528 -9.359 2.564 6.479 8.832 2.966 9.362 -2.878 5.489 1
47 | 2.867 3.421 9.149 -5.550 -9.384 5.625 -9.901 6.329 -3.945 1
48 | -6.103 3.564 8.529 6.461 0.044 7.361 -0.573 -0.595 -5.517 -1
49 | -10.000 1.217 -5.353 9.365 5.667 -4.737 4.989 5.765 -8.408 -1
50 | -5.352 -3.079 4.530 -6.823 -6.618 -5.426 -9.462 2.809 3.979 1
51 | 9.667 2.303 8.283 -5.686 1.668 3.949 -0.423 -3.343 -0.286 1
52 | -2.993 9.110 2.642 -8.462 -7.713 6.024 -3.888 -7.175 -1.167 1
53 | 5.873 5.954 0.947 4.155 -9.732 -7.385 -1.896 -0.155 -0.728 1
54 | -3.765 4.062 0.545 8.877 5.600 2.833 4.901 -8.289 5.658 -1
55 | -1.065 -3.518 5.746 9.882 -9.363 6.014 -7.503 -1.259 -4.141 -1
56 | -9.823 3.309 -2.012 0.723 2.186 -6.412 -6.445 -2.913 -4.701 1
57 | -7.490 0.047 -5.807 8.256 -0.070 -5.170 4.271 2.427 3.572 -1
58 | -9.071 3.115 -9.485 -1.083 -6.162 2.701 2.505 -2.607 9.788 1
59 | -7.382 1.835 -8.231 -3.189 0.091 1.698 1.642 -5.638 -5.875 1
60 | 2.551 2.422 4.373 3.066 -8.661 8.210 -4.233 3.844 -4.397 -1
61 | -2.114 9.172 3.369 -0.345 -4.017 -6.540 -8.647 7.625 -2.178 1
62 | 5.056 -9.265 6.228 -0.571 3.801 7.567 -2.361 9.569 1.411 -1
63 | -3.013 -0.825 8.785 -9.643 8.830 -5.231 -6.183 -9.817 -7.606 1
64 | -2.241 4.515 4.151 -6.012 -6.056 -2.047 -8.445 1.584 -2.479 1
65 | 5.637 7.266 -6.890 4.422 7.623 -8.061 9.191 -8.560 -7.878 -1
66 | -9.766 -5.208 -8.244 4.386 -1.221 -4.299 -7.662 0.334 7.284 -1
67 | 6.440 4.960 -0.344 9.550 -0.618 -2.722 -8.511 -1.426 -1.281 -1
68 | 8.634 7.211 -6.378 -9.609 1.597 2.401 -3.909 3.935 -7.265 1
69 | 7.875 -7.259 -9.684 -2.469 -7.710 -0.301 4.809 -6.221 8.272 -1
70 | -5.843 7.417 -7.380 -2.221 7.808 4.217 -9.820 -6.101 -1.848 1
71 | 4.305 0.635 -9.011 4.622 8.166 -6.721 -5.679 2.975 -2.941 -1
72 | 6.433 -4.014 0.649 9.053 3.765 -1.543 3.269 3.946 2.356 -1
73 | 1.617 -9.885 -6.974 2.606 4.737 -8.808 5.885 9.057 4.168 -1
74 | 0.624 -0.892 8.487 -8.727 -1.840 2.252 -0.271 -8.570 -3.802 1
75 | 4.106 -2.164 -1.017 7.132 -9.558 -6.280 8.325 6.327 -7.223 1
76 | 5.663 -2.714 -3.790 4.150 -1.441 4.370 -3.598 8.288 5.800 -1
77 | -5.474 6.195 -7.293 3.509 3.328 -6.851 7.229 1.652 9.476 -1
78 | -8.465 -7.029 -7.304 -2.255 7.120 1.255 -7.885 -6.478 -0.456 1
79 | 1.437 6.306 -1.798 4.145 -0.185 -8.470 7.294 -2.956 3.182 1
80 | 0.927 3.018 -2.395 3.623 -9.236 -5.275 -5.121 -7.121 -1.753 1
81 | 6.346 -1.202 2.456 -5.452 -7.057 -7.729 -3.923 -9.763 -0.685 1
82 | -8.780 -6.548 -9.133 -1.175 7.075 -8.370 3.550 -8.046 -5.491 1
83 | -7.684 7.061 1.463 4.771 -8.391 4.406 7.042 -2.314 4.643 -1
84 | 0.571 -5.249 -2.373 1.438 3.575 -5.297 3.069 -2.875 -3.343 1
85 | -4.453 7.404 -9.191 7.010 2.175 -7.582 1.417 -0.783 0.104 -1
86 | -8.114 -1.131 -4.669 -0.486 -9.693 8.906 4.216 3.376 -3.969 -1
87 | -2.346 9.384 -2.555 -1.536 6.394 9.620 0.882 -2.189 -1.162 -1
88 | 8.614 3.468 1.580 -6.056 -7.018 1.887 -7.150 7.198 -4.737 -1
89 | 3.875 -0.368 -0.563 -8.680 8.095 -4.169 -9.060 -1.023 3.642 1
90 | 6.901 -3.390 2.563 -1.520 0.554 5.544 -9.633 3.405 2.742 -1
91 | 1.901 9.995 -7.577 -8.662 -8.685 -9.482 -2.830 -7.745 -0.505 1
92 | -2.580 -6.876 4.063 9.982 1.604 -5.383 5.527 1.971 8.022 -1
93 | 1.874 1.349 -3.578 4.296 2.687 -2.263 4.814 9.857 -0.008 -1
94 | 1.218 6.413 1.371 -4.719 6.396 -7.025 -0.102 1.922 4.946 1
95 | 4.655 1.148 -6.657 -8.923 -4.556 6.031 -1.186 -9.741 5.888 1
96 | -0.921 9.551 -8.037 -9.549 -5.168 8.359 -6.574 4.731 0.281 1
97 | -7.088 -4.467 -9.106 -3.745 -3.390 -3.662 -7.714 5.423 -3.404 1
98 | -9.721 -5.860 9.048 -7.758 -5.410 -6.119 -9.399 -1.984 8.611 1
99 | 1.099 -9.784 7.673 1.993 -3.529 -5.718 8.331 -1.243 9.706 -1
100 | 5.588 -8.062 3.135 4.636 -5.819 7.725 8.517 -5.218 -4.259 -1
101 |
--------------------------------------------------------------------------------
/hw3/Homework3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/hw3/Homework3.png
--------------------------------------------------------------------------------
/hw4/Homework4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/hw4/Homework4.png
--------------------------------------------------------------------------------
/hw4/hw4_train.dat:
--------------------------------------------------------------------------------
1 | 0.568304 0.568283 1
2 | 0.310968 0.310956 -1
3 | 0.103376 0.103373 -1
4 | 0.0531882 0.053218 -1
5 | 0.97006 0.970064 1
6 | 0.0941873 0.0941707 -1
7 | 0.655902 0.655892 1
8 | 0.370821 0.370839 -1
9 | 0.558482 0.558476 1
10 | 0.849389 0.849383 1
11 | 0.796038 0.796051 1
12 | 0.723246 0.723252 1
13 | 0.571236 0.571254 1
14 | 0.385144 0.38512 -1
15 | 0.877176 0.877168 1
16 | 0.74655 0.746552 1
17 | 0.0676164 0.0676087 -1
18 | 0.0412524 0.0412649 -1
19 | 0.851637 0.851661 1
20 | 0.586989 0.58698 1
21 | 0.661014 0.660994 1
22 | 0.587988 0.587968 1
23 | 0.257615 0.257628 -1
24 | 0.680505 0.680485 1
25 | 0.895242 0.895257 1
26 | 0.381124 0.381139 -1
27 | 0.314332 0.31433 -1
28 | 0.157744 0.157747 -1
29 | 0.670923 0.670925 1
30 | 0.531716 0.531736 1
31 | 0.810956 0.810938 1
32 | 0.514937 0.51493 1
33 | 0.188567 0.188587 -1
34 | 0.778528 0.778527 1
35 | 0.904966 0.904955 1
36 | 0.563699 0.563708 1
37 | 0.599768 0.59978 1
38 | 0.619909 0.619928 1
39 | 0.650556 0.650556 1
40 | 0.131949 0.131967 -1
41 | 0.251546 0.251546 -1
42 | 0.690874 0.690863 1
43 | 0.381249 0.381284 -1
44 | 0.559231 0.559232 1
45 | 0.197361 0.197367 -1
46 | 0.784776 0.784781 1
47 | 0.620494 0.620499 1
48 | 0.229646 0.229647 -1
49 | 0.0891466 0.0891438 -1
50 | 0.981857 0.981861 1
51 | 0.64711 0.647102 1
52 | 0.725596 0.725592 1
53 | 0.614771 0.614764 1
54 | 0.976315 0.976321 1
55 | 0.250716 0.250708 -1
56 | 0.281071 0.281096 -1
57 | 0.550196 0.550187 1
58 | 0.955756 0.955751 1
59 | 0.251821 0.251838 -1
60 | 0.538196 0.538183 1
61 | 0.58285 0.582836 1
62 | 0.48367 0.48368 -1
63 | 0.481451 0.481471 -1
64 | 0.291576 0.291561 -1
65 | 0.181592 0.181596 -1
66 | 0.232746 0.232759 -1
67 | 0.488322 0.488349 -1
68 | 0.664499 0.664487 1
69 | 0.0420094 0.0420475 -1
70 | 0.950521 0.950524 1
71 | 0.445707 0.445706 -1
72 | 0.430385 0.430396 -1
73 | 0.747574 0.747583 1
74 | 0.245047 0.245078 -1
75 | 0.742838 0.742833 1
76 | 0.284625 0.284627 -1
77 | 0.0613909 0.061374 -1
78 | 0.612767 0.612754 1
79 | 0.378545 0.378555 -1
80 | 0.818764 0.818763 1
81 | 0.0507026 0.0507136 -1
82 | 0.882725 0.882731 1
83 | 0.0810847 0.0810796 -1
84 | 0.836278 0.836279 1
85 | 0.696709 0.696695 1
86 | 0.603346 0.603334 1
87 | 0.513718 0.513712 1
88 | 0.247789 0.247802 -1
89 | 0.704221 0.704213 1
90 | 0.546723 0.546724 1
91 | 0.881583 0.881592 1
92 | 0.13456 0.134545 -1
93 | 0.86883 0.868815 1
94 | 0.980909 0.980887 1
95 | 0.369986 0.369986 -1
96 | 0.194455 0.194457 -1
97 | 0.483858 0.483875 -1
98 | 0.43807 0.43808 -1
99 | 0.159602 0.159592 -1
100 | 0.923499 0.923504 1
101 | 0.419902 0.419906 -1
102 | 0.659252 0.659271 1
103 | 0.419546 0.419546 -1
104 | 0.935494 0.935512 1
105 | 0.712397 0.71239 1
106 | 0.952567 0.952549 1
107 | 0.915359 0.915379 1
108 | 0.182693 0.182675 -1
109 | 0.668527 0.668522 1
110 | 0.0965221 0.0965266 -1
111 | 0.984174 0.984197 1
112 | 0.7437 0.743702 1
113 | 0.213357 0.213341 -1
114 | 0.617402 0.617386 1
115 | 0.335604 0.335604 -1
116 | 0.632581 0.632597 1
117 | 0.515744 0.515757 1
118 | 0.786921 0.786912 1
119 | 0.502608 0.502599 1
120 | 0.164538 0.164537 -1
121 | 0.507454 0.507469 1
122 | 0.822809 0.822806 1
123 | 0.42883 0.428821 -1
124 | 0.157678 0.157693 -1
125 | 0.674884 0.674896 1
126 | 0.276618 0.276622 -1
127 | 0.374795 0.374795 -1
128 | 0.396781 0.396815 -1
129 | 0.132116 0.132101 -1
130 | 0.966203 0.966249 1
131 | 0.961164 0.961159 1
132 | 0.0140044 0.014014 -1
133 | 0.509361 0.509379 1
134 | 0.195082 0.195097 -1
135 | 0.853012 0.853012 1
136 | 0.852883 0.852896 1
137 | 0.574279 0.574282 1
138 | 0.316965 0.316939 -1
139 | 0.386753 0.386761 -1
140 | 0.764792 0.764815 1
141 | 0.680442 0.680428 1
142 | 0.125299 0.125304 -1
143 | 0.619824 0.619818 1
144 | 0.687672 0.687662 1
145 | 0.760271 0.760289 1
146 | 0.227148 0.22713 -1
147 | 0.224288 0.224295 -1
148 | 0.0150326 0.0150352 -1
149 | 0.585322 0.585314 1
150 | 0.732755 0.732777 1
151 | 0.864553 0.864569 1
152 | 0.0788415 0.0788569 -1
153 | 0.4326 0.432602 -1
154 | 0.804816 0.804801 1
155 | 0.50957 0.509589 1
156 | 0.405003 0.404988 -1
157 | 0.465702 0.465691 -1
158 | 0.368576 0.368574 -1
159 | 0.56202 0.562033 1
160 | 0.552361 0.552356 1
161 | 0.18263 0.182606 -1
162 | 0.672912 0.672906 1
163 | 0.642397 0.642413 1
164 | 0.816308 0.816316 1
165 | 0.264986 0.264978 -1
166 | 0.799168 0.799179 1
167 | 0.311442 0.311432 -1
168 | 0.715291 0.715278 1
169 | 0.913262 0.913265 1
170 | 0.703566 0.70358 1
171 | 0.0868818 0.0868856 -1
172 | 0.507828 0.507835 1
173 | 0.77619 0.776196 1
174 | 0.503254 0.503257 1
175 | 0.0585257 0.0585251 -1
176 | 0.668003 0.667995 1
177 | 0.409675 0.409686 -1
178 | 0.00104673 0.00105247 -1
179 | 0.6743 0.674268 1
180 | 0.461383 0.461378 -1
181 | 0.957667 0.957677 1
182 | 0.386593 0.386566 -1
183 | 0.260177 0.260171 -1
184 | 0.208071 0.208076 -1
185 | 0.634661 0.634646 1
186 | 0.354351 0.354351 -1
187 | 0.135384 0.135381 -1
188 | 0.216718 0.216748 -1
189 | 0.606084 0.606096 1
190 | 0.443809 0.443801 -1
191 | 0.480428 0.480418 -1
192 | 0.886987 0.886995 1
193 | 0.0126171 0.012603 -1
194 | 0.578502 0.578495 1
195 | 0.0664441 0.0664438 -1
196 | 0.292442 0.292432 -1
197 | 0.487013 0.487008 -1
198 | 0.176237 0.176234 -1
199 | 0.496052 0.496044 -1
200 | 0.62186 0.621853 1
201 |
--------------------------------------------------------------------------------
/hw5/Homework5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/hw5/Homework5.png
--------------------------------------------------------------------------------
/hw6/Homework6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/hw6/Homework6.png
--------------------------------------------------------------------------------
/hw6/hw2_adaboost_train.dat:
--------------------------------------------------------------------------------
1 | 0.757222 0.633831 -1
2 | 0.847382 0.281581 -1
3 | 0.24931 0.618635 +1
4 | 0.538526 0.144259 -1
5 | 0.474435 0.414558 -1
6 | 0.374151 0.0120482 1
7 | 0.847185 0.217572 1
8 | 0.983368 0.250496 1
9 | 0.645141 0.485816 1
10 | 0.172211 0.254331 -1
11 | 0.116866 0.378804 -1
12 | 0.55097 0.760426 -1
13 | 0.312109 0.442938 -1
14 | 0.304777 0.0529649 1
15 | 0.572727 0.370527 1
16 | 0.171491 0.50076 -1
17 | 0.644567 0.834055 -1
18 | 0.0529041 0.338461 -1
19 | 0.0323543 0.830701 -1
20 | 0.272193 0.587396 -1
21 | 0.123521 0.0516625 1
22 | 0.905544 0.247013 1
23 | 0.854276 0.559648 1
24 | 0.375914 0.505747 -1
25 | 0.160755 0.238718 -1
26 | 0.45893 0.227062 1
27 | 0.395407 0.791184 -1
28 | 0.742325 0.586444 1
29 | 0.43615 0.136922 1
30 | 0.954217 0.680325 1
31 | 0.916386 0.381431 1
32 | 0.953844 0.439266 1
33 | 0.328701 0.721918 -1
34 | 0.275732 0.43115 -1
35 | 0.892366 0.0136661 1
36 | 0.249529 0.0709084 1
37 | 0.124333 0.611515 -1
38 | 0.54449 0.423701 1
39 | 0.86019 0.93029 -1
40 | 0.432404 0.0901487 1
41 | 0.204973 0.406648 -1
42 | 0.0748025 0.568699 -1
43 | 0.936407 0.106094 1
44 | 0.572728 0.90924 -1
45 | 0.358618 0.651613 -1
46 | 0.631685 0.910141 -1
47 | 0.802581 0.599025 1
48 | 0.366818 0.0135169 1
49 | 0.708026 0.300654 1
50 | 0.243625 0.106277 1
51 | 0.960778 0.59799 1
52 | 0.726241 0.057674 1
53 | 0.158561 0.690295 -1
54 | 0.420638 0.503567 -1
55 | 0.651344 0.290269 1
56 | 0.933469 0.490516 1
57 | 0.502864 0.721677 -1
58 | 0.595151 0.82293 -1
59 | 0.696778 0.300018 1
60 | 0.927038 0.295737 1
61 | 0.145192 0.377728 -1
62 | 0.385435 0.68299 -1
63 | 0.296852 0.868018 -1
64 | 0.659204 0.77369 -1
65 | 0.896153 0.832046 1
66 | 0.466137 0.877674 -1
67 | 0.815532 0.164151 1
68 | 0.310117 0.857713 -1
69 | 0.522385 0.961609 -1
70 | 0.369345 0.781697 -1
71 | 0.901988 0.831265 1
72 | 0.692314 0.0640428 1
73 | 0.836977 0.614453 1
74 | 0.104584 0.357892 -1
75 | 0.265266 0.65833 -1
76 | 0.729254 0.885763 -1
77 | 0.205254 0.404956 -1
78 | 0.032359 0.778401 -1
79 | 0.464724 0.159682 1
80 | 0.940021 0.493738 1
81 | 0.248985 0.646083 -1
82 | 0.541258 0.728218 -1
83 | 0.391575 0.291076 1
84 | 0.0254967 0.300503 -1
85 | 0.475398 0.920203 -1
86 | 0.835664 0.584283 1
87 | 0.296033 0.0885163 1
88 | 0.0435908 0.646312 -1
89 | 0.284148 0.182427 1
90 | 0.627696 0.788116 -1
91 | 0.312939 0.871275 -1
92 | 0.676521 0.316903 1
93 | 0.0123539 0.178643 -1
94 | 0.682164 0.777194 -1
95 | 0.421563 0.302683 1
96 | 0.03183 0.289761 -1
97 | 0.435715 0.190071 1
98 | 0.730492 0.0655594 1
99 | 0.92527 0.524315 1
100 | 0.984815 0.383621 1
101 |
--------------------------------------------------------------------------------
/hw7/Homework7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/hw7/Homework7.png
--------------------------------------------------------------------------------
/hw7/hw3_train.dat:
--------------------------------------------------------------------------------
1 | 0.757222 0.633831 -1
2 | 0.847382 0.281581 -1
3 | 0.24931 0.618635 +1
4 | 0.538526 0.144259 -1
5 | 0.474435 0.414558 -1
6 | 0.374151 0.0120482 1
7 | 0.847185 0.217572 1
8 | 0.983368 0.250496 1
9 | 0.645141 0.485816 1
10 | 0.172211 0.254331 -1
11 | 0.116866 0.378804 -1
12 | 0.55097 0.760426 -1
13 | 0.312109 0.442938 -1
14 | 0.304777 0.0529649 1
15 | 0.572727 0.370527 1
16 | 0.171491 0.50076 -1
17 | 0.644567 0.834055 -1
18 | 0.0529041 0.338461 -1
19 | 0.0323543 0.830701 -1
20 | 0.272193 0.587396 -1
21 | 0.123521 0.0516625 1
22 | 0.905544 0.247013 1
23 | 0.854276 0.559648 1
24 | 0.375914 0.505747 -1
25 | 0.160755 0.238718 -1
26 | 0.45893 0.227062 1
27 | 0.395407 0.791184 -1
28 | 0.742325 0.586444 1
29 | 0.43615 0.136922 1
30 | 0.954217 0.680325 1
31 | 0.916386 0.381431 1
32 | 0.953844 0.439266 1
33 | 0.328701 0.721918 -1
34 | 0.275732 0.43115 -1
35 | 0.892366 0.0136661 1
36 | 0.249529 0.0709084 1
37 | 0.124333 0.611515 -1
38 | 0.54449 0.423701 1
39 | 0.86019 0.93029 -1
40 | 0.432404 0.0901487 1
41 | 0.204973 0.406648 -1
42 | 0.0748025 0.568699 -1
43 | 0.936407 0.106094 1
44 | 0.572728 0.90924 -1
45 | 0.358618 0.651613 -1
46 | 0.631685 0.910141 -1
47 | 0.802581 0.599025 1
48 | 0.366818 0.0135169 1
49 | 0.708026 0.300654 1
50 | 0.243625 0.106277 1
51 | 0.960778 0.59799 1
52 | 0.726241 0.057674 1
53 | 0.158561 0.690295 -1
54 | 0.420638 0.503567 -1
55 | 0.651344 0.290269 1
56 | 0.933469 0.490516 1
57 | 0.502864 0.721677 -1
58 | 0.595151 0.82293 -1
59 | 0.696778 0.300018 1
60 | 0.927038 0.295737 1
61 | 0.145192 0.377728 -1
62 | 0.385435 0.68299 -1
63 | 0.296852 0.868018 -1
64 | 0.659204 0.77369 -1
65 | 0.896153 0.832046 1
66 | 0.466137 0.877674 -1
67 | 0.815532 0.164151 1
68 | 0.310117 0.857713 -1
69 | 0.522385 0.961609 -1
70 | 0.369345 0.781697 -1
71 | 0.901988 0.831265 1
72 | 0.692314 0.0640428 1
73 | 0.836977 0.614453 1
74 | 0.104584 0.357892 -1
75 | 0.265266 0.65833 -1
76 | 0.729254 0.885763 -1
77 | 0.205254 0.404956 -1
78 | 0.032359 0.778401 -1
79 | 0.464724 0.159682 1
80 | 0.940021 0.493738 1
81 | 0.248985 0.646083 -1
82 | 0.541258 0.728218 -1
83 | 0.391575 0.291076 1
84 | 0.0254967 0.300503 -1
85 | 0.475398 0.920203 -1
86 | 0.835664 0.584283 1
87 | 0.296033 0.0885163 1
88 | 0.0435908 0.646312 -1
89 | 0.284148 0.182427 1
90 | 0.627696 0.788116 -1
91 | 0.312939 0.871275 -1
92 | 0.676521 0.316903 1
93 | 0.0123539 0.178643 -1
94 | 0.682164 0.777194 -1
95 | 0.421563 0.302683 1
96 | 0.03183 0.289761 -1
97 | 0.435715 0.190071 1
98 | 0.730492 0.0655594 1
99 | 0.92527 0.524315 1
100 | 0.984815 0.383621 1
101 |
--------------------------------------------------------------------------------
/hw8/Homework8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/hw8/Homework8.png
--------------------------------------------------------------------------------
/hw8/hw4_kmeans_train.dat:
--------------------------------------------------------------------------------
1 | 0.8105 -0.35 0.4769 0.4541 -0.9829 0.5252 0.3838 -0.3408 -0.4824
2 | -0.6273 -0.2097 0.9404 0.1143 0.3487 -0.5206 0.0061 0.5024 -0.6687
3 | 0.1624 -0.1173 0.426 -0.3607 -0.6632 0.4431 -0.8355 0.7206 -0.8977
4 | -1 0.7758 -0.267 -0.888 -0.1099 -0.9183 -0.4086 0.8962 0.5841
5 | 0.8464 0.1762 0.2729 0.2724 0.8155 0.6096 -0.2844 0.98 0.3302
6 | -0.0135 0.6193 0.7705 0.7195 0.7313 -0.3395 0.8012 -0.6773 -0.4433
7 | 0.0934 -0.8379 -0.2083 -0.6337 0.4346 -0.3928 0.9759 -0.8499 -0.4128
8 | 0.8923 -0.0018 -0.6837 0.6628 -0.2823 -0.9524 -0.6767 -0.4811 -0.6296
9 | -0.9028 0.701 -0.9063 -0.1111 -0.9328 0.5282 0.496 -0.9569 0.6784
10 | -0.9706 0.1392 0.6562 -0.6543 -0.198 -0.6261 -0.6067 0.1254 -0.1071
11 | -0.6891 -0.4157 0.1057 -0.5954 0.4732 0.1729 0.9328 -0.0308 0.216
12 | -0.0845 -0.5858 -0.0486 -0.4282 -0.2401 0.7534 -0.0543 0.1531 -0.1212
13 | -0.9596 -0.3929 0.9556 0.1461 0.0117 0.4288 -0.681 -0.0555 -0.602
14 | 0.9124 0.7287 -0.7506 -0.1363 -0.6995 0.0093 -0.3828 0.2462 -0.8376
15 | 0.7514 0.7608 -0.0175 0.7071 -0.0931 0.9942 0.1359 0.2259 -0.0613
16 | -0.1805 -0.2265 -0.9636 0.0689 0.6373 -0.6631 -0.9218 -0.7456 0.5831
17 | -0.3048 0.8819 -0.8509 0.6777 0.5889 0.056 0.6719 -0.2752 -0.7181
18 | -0.5873 -0.9376 -0.3226 -0.5509 0.1313 -0.6853 -0.214 0.2095 -0.4309
19 | 0.425 -0.535 -0.6683 0.5741 -0.8574 0.9207 -0.3699 0.8145 -0.3545
20 | 0.8587 -0.0571 -0.7906 -0.4638 0.392 0.3407 -0.1491 -0.822 -0.4498
21 | -0.8107 0.0089 -0.765 -0.479 -0.4171 -0.6223 -0.5583 0.213 -0.8078
22 | -0.8616 0.9386 -0.9095 -0.6522 -0.5252 0.4825 0.6886 0.3256 0.6605
23 | -1 -0.3258 -0.1998 -0.7559 0.1952 0.3832 -0.3782 0.6369 -0.4038
24 | -0.4212 -0.1462 -0.2603 -0.3308 0.2016 0.2144 -0.8483 -0.1099 -0.46
25 | 0.8112 0.377 -0.5551 -0.3885 0.6211 0.6401 0.9946 -0.7571 0.277
26 | -0.8868 0.0669 0.5703 -0.1472 0.7361 -0.2282 -0.9328 0.8879 0.662
27 | 0.6635 0.5312 0.5358 -0.8916 -0.8574 0.1569 0.7485 -0.8628 0.3998
28 | 0.7432 -0.8466 -0.9884 0.3135 0.0062 0.7477 -0.9147 0.0734 0.6355
29 | -0.3031 0.2371 -0.4132 -0.7674 0.3454 -0.2706 0.3895 0.0939 -0.1334
30 | -1 -0.1108 0.7883 -0.7978 -0.7973 -0.2055 0.9498 -0.712 0.8679
31 | 1 0.2703 -0.6408 -0.4365 0.5029 0.7046 0.2929 -0.1076 -0.2015
32 | 0.3891 0.1182 -0.0468 0.1774 0.3203 0.1559 0.9719 0.2702 0.4439
33 | -0.4895 0.7533 0.3229 -0.1304 -0.6832 -0.1742 -0.4258 0.6097 0.7182
34 | -0.6454 -0.0875 0.4457 0.3077 -0.91 -0.234 -0.5364 -0.9381 -1
35 | 0.4393 0.8004 -0.5783 -0.2378 -0.3299 -0.2615 0.588 0.2443 -0.6518
36 | 0.0337 0.2622 -0.4467 -0.5206 -0.4301 -0.3567 0.2454 0.0335 -0.2949
37 | -0.1583 0.767 0.6972 0.2634 -0.4708 -0.6327 -0.998 -0.8828 0.6116
38 | -0.8917 0.1634 -0.6017 -0.3384 0.6428 -0.0318 0.3049 -0.1118 -1
39 | -0.4864 0.1848 0.0375 -0.7892 -0.5517 0.5667 -0.4218 -0.5498 0.6839
40 | 0.5545 0.3762 -0.5996 0.9528 -0.9622 -0.9568 -0.0789 0.3427 -0.0686
41 | 0.1361 -0.5169 -0.3709 -0.8264 -0.306 0.0774 0.7403 0.2721 0.5276
42 | 0.7686 0.4347 -0.0279 -0.831 0.3875 0.0099 -0.7878 -0.6914 -0.6474
43 | 0.689 -0.767 -0.8421 -0.6819 -0.5934 -0.1481 0.3954 -0.8532 -0.876
44 | -0.153 0.8711 -0.0993 0.8191 -0.9599 -0.7117 -0.171 -0.7477 -0.4031
45 | -0.4384 0.3295 0.1583 -0.2805 0.6476 0.5649 0.5713 0.043 0.7117
46 | -0.2528 -0.9359 0.2564 0.6479 0.8832 0.2966 0.9362 -0.2878 0.5489
47 | 0.2867 0.3421 0.9149 -0.555 -0.9384 0.5625 -0.9901 0.6329 -0.3945
48 | -0.6103 0.3564 0.8529 0.6461 0.0044 0.7361 -0.0573 -0.0595 -0.5517
49 | -1 0.1217 -0.5353 0.9365 0.5667 -0.4737 0.4989 0.5765 -0.8408
50 | -0.5352 -0.3079 0.453 -0.6823 -0.6618 -0.5426 -0.9462 0.2809 0.3979
51 | 0.9667 0.2303 0.8283 -0.5686 0.1668 0.3949 -0.0423 -0.3343 -0.0286
52 | -0.2993 0.911 0.2642 -0.8462 -0.7713 0.6024 -0.3888 -0.7175 -0.1167
53 | 0.5873 0.5954 0.0947 0.4155 -0.9732 -0.7385 -0.1896 -0.0155 -0.0728
54 | -0.3765 0.4062 0.0545 0.8877 0.56 0.2833 0.4901 -0.8289 0.5658
55 | -0.1065 -0.3518 0.5746 0.9882 -0.9363 0.6014 -0.7503 -0.1259 -0.4141
56 | -0.9823 0.3309 -0.2012 0.0723 0.2186 -0.6412 -0.6445 -0.2913 -0.4701
57 | -0.749 0.0047 -0.5807 0.8256 -0.007 -0.517 0.4271 0.2427 0.3572
58 | -0.9071 0.3115 -0.9485 -0.1083 -0.6162 0.2701 0.2505 -0.2607 0.9788
59 | -0.7382 0.1835 -0.8231 -0.3189 0.0091 0.1698 0.1642 -0.5638 -0.5875
60 | 0.2551 0.2422 0.4373 0.3066 -0.8661 0.821 -0.4233 0.3844 -0.4397
61 | -0.2114 0.9172 0.3369 -0.0345 -0.4017 -0.654 -0.8647 0.7625 -0.2178
62 | 0.5056 -0.9265 0.6228 -0.0571 0.3801 0.7567 -0.2361 0.9569 0.1411
63 | -0.3013 -0.0825 0.8785 -0.9643 0.883 -0.5231 -0.6183 -0.9817 -0.7606
64 | -0.2241 0.4515 0.4151 -0.6012 -0.6056 -0.2047 -0.8445 0.1584 -0.2479
65 | 0.5637 0.7266 -0.689 0.4422 0.7623 -0.8061 0.9191 -0.856 -0.7878
66 | -0.9766 -0.5208 -0.8244 0.4386 -0.1221 -0.4299 -0.7662 0.0334 0.7284
67 | 0.644 0.496 -0.0344 0.955 -0.0618 -0.2722 -0.8511 -0.1426 -0.1281
68 | 0.8634 0.7211 -0.6378 -0.9609 0.1597 0.2401 -0.3909 0.3935 -0.7265
69 | 0.7875 -0.7259 -0.9684 -0.2469 -0.771 -0.0301 0.4809 -0.6221 0.8272
70 | -0.5843 0.7417 -0.738 -0.2221 0.7808 0.4217 -0.982 -0.6101 -0.1848
71 | 0.4305 0.0635 -0.9011 0.4622 0.8166 -0.6721 -0.5679 0.2975 -0.2941
72 | 0.6433 -0.4014 0.0649 0.9053 0.3765 -0.1543 0.3269 0.3946 0.2356
73 | 0.1617 -0.9885 -0.6974 0.2606 0.4737 -0.8808 0.5885 0.9057 0.4168
74 | 0.0624 -0.0892 0.8487 -0.8727 -0.184 0.2252 -0.0271 -0.857 -0.3802
75 | 0.4106 -0.2164 -0.1017 0.7132 -0.9558 -0.628 0.8325 0.6327 -0.7223
76 | 0.5663 -0.2714 -0.379 0.415 -0.1441 0.437 -0.3598 0.8288 0.58
77 | -0.5474 0.6195 -0.7293 0.3509 0.3328 -0.6851 0.7229 0.1652 0.9476
78 | -0.8465 -0.7029 -0.7304 -0.2255 0.712 0.1255 -0.7885 -0.6478 -0.0456
79 | 0.1437 0.6306 -0.1798 0.4145 -0.0185 -0.847 0.7294 -0.2956 0.3182
80 | 0.0927 0.3018 -0.2395 0.3623 -0.9236 -0.5275 -0.5121 -0.7121 -0.1753
81 | 0.6346 -0.1202 0.2456 -0.5452 -0.7057 -0.7729 -0.3923 -0.9763 -0.0685
82 | -0.878 -0.6548 -0.9133 -0.1175 0.7075 -0.837 0.355 -0.8046 -0.5491
83 | -0.7684 0.7061 0.1463 0.4771 -0.8391 0.4406 0.7042 -0.2314 0.4643
84 | 0.0571 -0.5249 -0.2373 0.1438 0.3575 -0.5297 0.3069 -0.2875 -0.3343
85 | -0.4453 0.7404 -0.9191 0.701 0.2175 -0.7582 0.1417 -0.0783 0.0104
86 | -0.8114 -0.1131 -0.4669 -0.0486 -0.9693 0.8906 0.4216 0.3376 -0.3969
87 | -0.2346 0.9384 -0.2555 -0.1536 0.6394 0.962 0.0882 -0.2189 -0.1162
88 | 0.8614 0.3468 0.158 -0.6056 -0.7018 0.1887 -0.715 0.7198 -0.4737
89 | 0.3875 -0.0368 -0.0563 -0.868 0.8095 -0.4169 -0.906 -0.1023 0.3642
90 | 0.6901 -0.339 0.2563 -0.152 0.0554 0.5544 -0.9633 0.3405 0.2742
91 | 0.1901 0.9995 -0.7577 -0.8662 -0.8685 -0.9482 -0.283 -0.7745 -0.0505
92 | -0.258 -0.6876 0.4063 0.9982 0.1604 -0.5383 0.5527 0.1971 0.8022
93 | 0.1874 0.1349 -0.3578 0.4296 0.2687 -0.2263 0.4814 0.9857 -0.0008
94 | 0.1218 0.6413 0.1371 -0.4719 0.6396 -0.7025 -0.0102 0.1922 0.4946
95 | 0.4655 0.1148 -0.6657 -0.8923 -0.4556 0.6031 -0.1186 -0.9741 0.5888
96 | -0.0921 0.9551 -0.8037 -0.9549 -0.5168 0.8359 -0.6574 0.4731 0.0281
97 | -0.7088 -0.4467 -0.9106 -0.3745 -0.339 -0.3662 -0.7714 0.5423 -0.3404
98 | -0.9721 -0.586 0.9048 -0.7758 -0.541 -0.6119 -0.9399 -0.1984 0.8611
99 | 0.1099 -0.9784 0.7673 0.1993 -0.3529 -0.5718 0.8331 -0.1243 0.9706
100 | 0.5588 -0.8062 0.3135 0.4636 -0.5819 0.7725 0.8517 -0.5218 -0.4259
101 |
--------------------------------------------------------------------------------
/hw8/hw4_knn_train.dat:
--------------------------------------------------------------------------------
1 | 0.8105 -0.35 0.4769 0.4541 -0.9829 0.5252 0.3838 -0.3408 -0.4824 -1
2 | -0.6273 -0.2097 0.9404 0.1143 0.3487 -0.5206 0.0061 0.5024 -0.6687 1
3 | 0.1624 -0.1173 0.426 -0.3607 -0.6632 0.4431 -0.8355 0.7206 -0.8977 1
4 | -1 0.7758 -0.267 -0.888 -0.1099 -0.9183 -0.4086 0.8962 0.5841 1
5 | 0.8464 0.1762 0.2729 0.2724 0.8155 0.6096 -0.2844 0.98 0.3302 -1
6 | -0.0135 0.6193 0.7705 0.7195 0.7313 -0.3395 0.8012 -0.6773 -0.4433 1
7 | 0.0934 -0.8379 -0.2083 -0.6337 0.4346 -0.3928 0.9759 -0.8499 -0.4128 1
8 | 0.8923 -0.0018 -0.6837 0.6628 -0.2823 -0.9524 -0.6767 -0.4811 -0.6296 1
9 | -0.9028 0.701 -0.9063 -0.1111 -0.9328 0.5282 0.496 -0.9569 0.6784 -1
10 | -0.9706 0.1392 0.6562 -0.6543 -0.198 -0.6261 -0.6067 0.1254 -0.1071 1
11 | -0.6891 -0.4157 0.1057 -0.5954 0.4732 0.1729 0.9328 -0.0308 0.216 1
12 | -0.0845 -0.5858 -0.0486 -0.4282 -0.2401 0.7534 -0.0543 0.1531 -0.1212 -1
13 | -0.9596 -0.3929 0.9556 0.1461 0.0117 0.4288 -0.681 -0.0555 -0.602 1
14 | 0.9124 0.7287 -0.7506 -0.1363 -0.6995 0.0093 -0.3828 0.2462 -0.8376 1
15 | 0.7514 0.7608 -0.0175 0.7071 -0.0931 0.9942 0.1359 0.2259 -0.0613 -1
16 | -0.1805 -0.2265 -0.9636 0.0689 0.6373 -0.6631 -0.9218 -0.7456 0.5831 -1
17 | -0.3048 0.8819 -0.8509 0.6777 0.5889 0.056 0.6719 -0.2752 -0.7181 -1
18 | -0.5873 -0.9376 -0.3226 -0.5509 0.1313 -0.6853 -0.214 0.2095 -0.4309 -1
19 | 0.425 -0.535 -0.6683 0.5741 -0.8574 0.9207 -0.3699 0.8145 -0.3545 -1
20 | 0.8587 -0.0571 -0.7906 -0.4638 0.392 0.3407 -0.1491 -0.822 -0.4498 1
21 | -0.8107 0.0089 -0.765 -0.479 -0.4171 -0.6223 -0.5583 0.213 -0.8078 1
22 | -0.8616 0.9386 -0.9095 -0.6522 -0.5252 0.4825 0.6886 0.3256 0.6605 -1
23 | -1 -0.3258 -0.1998 -0.7559 0.1952 0.3832 -0.3782 0.6369 -0.4038 1
24 | -0.4212 -0.1462 -0.2603 -0.3308 0.2016 0.2144 -0.8483 -0.1099 -0.46 1
25 | 0.8112 0.377 -0.5551 -0.3885 0.6211 0.6401 0.9946 -0.7571 0.277 -1
26 | -0.8868 0.0669 0.5703 -0.1472 0.7361 -0.2282 -0.9328 0.8879 0.662 1
27 | 0.6635 0.5312 0.5358 -0.8916 -0.8574 0.1569 0.7485 -0.8628 0.3998 1
28 | 0.7432 -0.8466 -0.9884 0.3135 0.0062 0.7477 -0.9147 0.0734 0.6355 -1
29 | -0.3031 0.2371 -0.4132 -0.7674 0.3454 -0.2706 0.3895 0.0939 -0.1334 1
30 | -1 -0.1108 0.7883 -0.7978 -0.7973 -0.2055 0.9498 -0.712 0.8679 1
31 | 1 0.2703 -0.6408 -0.4365 0.5029 0.7046 0.2929 -0.1076 -0.2015 -1
32 | 0.3891 0.1182 -0.0468 0.1774 0.3203 0.1559 0.9719 0.2702 0.4439 -1
33 | -0.4895 0.7533 0.3229 -0.1304 -0.6832 -0.1742 -0.4258 0.6097 0.7182 1
34 | -0.6454 -0.0875 0.4457 0.3077 -0.91 -0.234 -0.5364 -0.9381 -1 -1
35 | 0.4393 0.8004 -0.5783 -0.2378 -0.3299 -0.2615 0.588 0.2443 -0.6518 1
36 | 0.0337 0.2622 -0.4467 -0.5206 -0.4301 -0.3567 0.2454 0.0335 -0.2949 1
37 | -0.1583 0.767 0.6972 0.2634 -0.4708 -0.6327 -0.998 -0.8828 0.6116 1
38 | -0.8917 0.1634 -0.6017 -0.3384 0.6428 -0.0318 0.3049 -0.1118 -1 1
39 | -0.4864 0.1848 0.0375 -0.7892 -0.5517 0.5667 -0.4218 -0.5498 0.6839 -1
40 | 0.5545 0.3762 -0.5996 0.9528 -0.9622 -0.9568 -0.0789 0.3427 -0.0686 -1
41 | 0.1361 -0.5169 -0.3709 -0.8264 -0.306 0.0774 0.7403 0.2721 0.5276 -1
42 | 0.7686 0.4347 -0.0279 -0.831 0.3875 0.0099 -0.7878 -0.6914 -0.6474 1
43 | 0.689 -0.767 -0.8421 -0.6819 -0.5934 -0.1481 0.3954 -0.8532 -0.876 1
44 | -0.153 0.8711 -0.0993 0.8191 -0.9599 -0.7117 -0.171 -0.7477 -0.4031 1
45 | -0.4384 0.3295 0.1583 -0.2805 0.6476 0.5649 0.5713 0.043 0.7117 -1
46 | -0.2528 -0.9359 0.2564 0.6479 0.8832 0.2966 0.9362 -0.2878 0.5489 1
47 | 0.2867 0.3421 0.9149 -0.555 -0.9384 0.5625 -0.9901 0.6329 -0.3945 1
48 | -0.6103 0.3564 0.8529 0.6461 0.0044 0.7361 -0.0573 -0.0595 -0.5517 -1
49 | -1 0.1217 -0.5353 0.9365 0.5667 -0.4737 0.4989 0.5765 -0.8408 -1
50 | -0.5352 -0.3079 0.453 -0.6823 -0.6618 -0.5426 -0.9462 0.2809 0.3979 1
51 | 0.9667 0.2303 0.8283 -0.5686 0.1668 0.3949 -0.0423 -0.3343 -0.0286 1
52 | -0.2993 0.911 0.2642 -0.8462 -0.7713 0.6024 -0.3888 -0.7175 -0.1167 1
53 | 0.5873 0.5954 0.0947 0.4155 -0.9732 -0.7385 -0.1896 -0.0155 -0.0728 1
54 | -0.3765 0.4062 0.0545 0.8877 0.56 0.2833 0.4901 -0.8289 0.5658 -1
55 | -0.1065 -0.3518 0.5746 0.9882 -0.9363 0.6014 -0.7503 -0.1259 -0.4141 -1
56 | -0.9823 0.3309 -0.2012 0.0723 0.2186 -0.6412 -0.6445 -0.2913 -0.4701 1
57 | -0.749 0.0047 -0.5807 0.8256 -0.007 -0.517 0.4271 0.2427 0.3572 -1
58 | -0.9071 0.3115 -0.9485 -0.1083 -0.6162 0.2701 0.2505 -0.2607 0.9788 1
59 | -0.7382 0.1835 -0.8231 -0.3189 0.0091 0.1698 0.1642 -0.5638 -0.5875 1
60 | 0.2551 0.2422 0.4373 0.3066 -0.8661 0.821 -0.4233 0.3844 -0.4397 -1
61 | -0.2114 0.9172 0.3369 -0.0345 -0.4017 -0.654 -0.8647 0.7625 -0.2178 1
62 | 0.5056 -0.9265 0.6228 -0.0571 0.3801 0.7567 -0.2361 0.9569 0.1411 -1
63 | -0.3013 -0.0825 0.8785 -0.9643 0.883 -0.5231 -0.6183 -0.9817 -0.7606 1
64 | -0.2241 0.4515 0.4151 -0.6012 -0.6056 -0.2047 -0.8445 0.1584 -0.2479 1
65 | 0.5637 0.7266 -0.689 0.4422 0.7623 -0.8061 0.9191 -0.856 -0.7878 -1
66 | -0.9766 -0.5208 -0.8244 0.4386 -0.1221 -0.4299 -0.7662 0.0334 0.7284 -1
67 | 0.644 0.496 -0.0344 0.955 -0.0618 -0.2722 -0.8511 -0.1426 -0.1281 -1
68 | 0.8634 0.7211 -0.6378 -0.9609 0.1597 0.2401 -0.3909 0.3935 -0.7265 1
69 | 0.7875 -0.7259 -0.9684 -0.2469 -0.771 -0.0301 0.4809 -0.6221 0.8272 -1
70 | -0.5843 0.7417 -0.738 -0.2221 0.7808 0.4217 -0.982 -0.6101 -0.1848 1
71 | 0.4305 0.0635 -0.9011 0.4622 0.8166 -0.6721 -0.5679 0.2975 -0.2941 -1
72 | 0.6433 -0.4014 0.0649 0.9053 0.3765 -0.1543 0.3269 0.3946 0.2356 -1
73 | 0.1617 -0.9885 -0.6974 0.2606 0.4737 -0.8808 0.5885 0.9057 0.4168 -1
74 | 0.0624 -0.0892 0.8487 -0.8727 -0.184 0.2252 -0.0271 -0.857 -0.3802 1
75 | 0.4106 -0.2164 -0.1017 0.7132 -0.9558 -0.628 0.8325 0.6327 -0.7223 1
76 | 0.5663 -0.2714 -0.379 0.415 -0.1441 0.437 -0.3598 0.8288 0.58 -1
77 | -0.5474 0.6195 -0.7293 0.3509 0.3328 -0.6851 0.7229 0.1652 0.9476 -1
78 | -0.8465 -0.7029 -0.7304 -0.2255 0.712 0.1255 -0.7885 -0.6478 -0.0456 1
79 | 0.1437 0.6306 -0.1798 0.4145 -0.0185 -0.847 0.7294 -0.2956 0.3182 1
80 | 0.0927 0.3018 -0.2395 0.3623 -0.9236 -0.5275 -0.5121 -0.7121 -0.1753 1
81 | 0.6346 -0.1202 0.2456 -0.5452 -0.7057 -0.7729 -0.3923 -0.9763 -0.0685 1
82 | -0.878 -0.6548 -0.9133 -0.1175 0.7075 -0.837 0.355 -0.8046 -0.5491 1
83 | -0.7684 0.7061 0.1463 0.4771 -0.8391 0.4406 0.7042 -0.2314 0.4643 -1
84 | 0.0571 -0.5249 -0.2373 0.1438 0.3575 -0.5297 0.3069 -0.2875 -0.3343 1
85 | -0.4453 0.7404 -0.9191 0.701 0.2175 -0.7582 0.1417 -0.0783 0.0104 -1
86 | -0.8114 -0.1131 -0.4669 -0.0486 -0.9693 0.8906 0.4216 0.3376 -0.3969 -1
87 | -0.2346 0.9384 -0.2555 -0.1536 0.6394 0.962 0.0882 -0.2189 -0.1162 -1
88 | 0.8614 0.3468 0.158 -0.6056 -0.7018 0.1887 -0.715 0.7198 -0.4737 -1
89 | 0.3875 -0.0368 -0.0563 -0.868 0.8095 -0.4169 -0.906 -0.1023 0.3642 1
90 | 0.6901 -0.339 0.2563 -0.152 0.0554 0.5544 -0.9633 0.3405 0.2742 -1
91 | 0.1901 0.9995 -0.7577 -0.8662 -0.8685 -0.9482 -0.283 -0.7745 -0.0505 1
92 | -0.258 -0.6876 0.4063 0.9982 0.1604 -0.5383 0.5527 0.1971 0.8022 -1
93 | 0.1874 0.1349 -0.3578 0.4296 0.2687 -0.2263 0.4814 0.9857 -0.0008 -1
94 | 0.1218 0.6413 0.1371 -0.4719 0.6396 -0.7025 -0.0102 0.1922 0.4946 1
95 | 0.4655 0.1148 -0.6657 -0.8923 -0.4556 0.6031 -0.1186 -0.9741 0.5888 1
96 | -0.0921 0.9551 -0.8037 -0.9549 -0.5168 0.8359 -0.6574 0.4731 0.0281 1
97 | -0.7088 -0.4467 -0.9106 -0.3745 -0.339 -0.3662 -0.7714 0.5423 -0.3404 1
98 | -0.9721 -0.586 0.9048 -0.7758 -0.541 -0.6119 -0.9399 -0.1984 0.8611 1
99 | 0.1099 -0.9784 0.7673 0.1993 -0.3529 -0.5718 0.8331 -0.1243 0.9706 -1
100 | 0.5588 -0.8062 0.3135 0.4636 -0.5819 0.7725 0.8517 -0.5218 -0.4259 -1
101 |
--------------------------------------------------------------------------------
/hw8/hw4_nnet_test.dat:
--------------------------------------------------------------------------------
1 | -0.106006 -0.081467 -1
2 | 0.17793 -0.345951 -1
3 | 0.102162 0.718258 1
4 | 0.694078 0.623397 -1
5 | 0.0235411 0.727432 1
6 | -0.319728 -0.834114 -1
7 | -0.186744 0.538878 1
8 | -0.636967 0.152685 1
9 | -0.474463 0.854344 1
10 | -0.0356277 -0.271588 -1
11 | -0.148603 0.161762 -1
12 | -0.180652 -0.128739 -1
13 | -0.602411 0.925507 1
14 | 0.698081 0.794742 -1
15 | 0.881509 -0.201248 1
16 | -0.923849 0.386625 1
17 | -0.765713 -0.0112813 1
18 | 0.135592 0.0317051 -1
19 | -0.155151 -0.33142 -1
20 | 0.485175 0.299031 -1
21 | -0.6029 0.333234 1
22 | -0.572858 0.828352 1
23 | -0.6354 -0.474566 -1
24 | 0.909317 -0.784889 1
25 | 0.252105 -0.893937 1
26 | -0.517634 0.960444 1
27 | -0.385872 -0.31787 -1
28 | 0.823167 -0.127797 1
29 | 0.822486 -0.876843 1
30 | -0.503662 0.980274 1
31 | 0.533874 0.821234 -1
32 | -0.89497 -0.240115 1
33 | 0.342871 0.474977 -1
34 | 0.709289 0.562207 -1
35 | -1.00043 0.0604576 1
36 | 0.524284 0.735195 -1
37 | -0.56033 0.755838 1
38 | 0.697522 -0.67199 1
39 | 0.490423 0.785087 -1
40 | -0.326774 0.343372 1
41 | -0.00293421 -0.415182 -1
42 | -0.631239 0.352634 1
43 | 0.913881 0.593053 -1
44 | 0.218283 0.0396835 -1
45 | -0.616185 -0.886579 -1
46 | -0.528529 0.0286902 1
47 | -0.406523 1.04515 1
48 | -0.229795 0.0714251 -1
49 | -0.502121 0.833738 1
50 | -0.50808 0.79327 1
51 | -0.790678 0.187803 1
52 | -0.382511 0.824742 1
53 | 0.822328 0.401487 -1
54 | 0.985964 -0.329169 1
55 | -0.014047 -0.152387 -1
56 | -0.0541651 0.914285 1
57 | -1.07247 -0.720286 -1
58 | -0.242985 -1.04265 1
59 | -0.324486 -0.28318 -1
60 | 0.247749 -0.255656 -1
61 | -0.172211 -0.8494 1
62 | -0.417263 -0.393271 -1
63 | -0.347838 -0.573809 -1
64 | -0.851834 -0.722664 -1
65 | -0.725244 -0.373707 -1
66 | 0.345327 -0.0222718 -1
67 | 0.742421 0.740857 -1
68 | -0.137123 -0.347256 -1
69 | 0.105915 0.633788 1
70 | 0.332407 -0.565528 1
71 | -0.417541 0.948563 1
72 | -0.404889 -0.613469 -1
73 | -0.797158 0.90701 1
74 | 0.875921 0.360211 -1
75 | 0.54354 -0.181091 1
76 | 0.0753797 -0.511083 -1
77 | 0.564049 0.77191 -1
78 | 0.816991 0.525587 -1
79 | -0.376611 0.105994 1
80 | 0.436029 0.15046 -1
81 | 0.396919 -0.548933 1
82 | -0.274177 0.602238 1
83 | -0.989181 0.157649 1
84 | -0.516441 -0.824574 -1
85 | 0.980643 0.546138 -1
86 | 0.777557 -0.893465 1
87 | -0.259364 -0.64448 -1
88 | -0.237718 -0.906771 -1
89 | -0.603951 0.0881739 1
90 | -0.280116 -0.015129 -1
91 | -0.203737 0.797986 1
92 | -0.163726 0.436005 1
93 | 0.744445 0.410894 -1
94 | -0.332268 -0.458751 -1
95 | -0.0276649 -0.360801 -1
96 | 0.706967 0.754277 -1
97 | -0.823016 -0.302874 1
98 | -0.985381 -0.384415 1
99 | -0.490997 0.702803 1
100 | -0.522472 0.300006 1
101 | -0.56968 0.104462 1
102 | -0.323705 0.721715 1
103 | 0.919689 -0.325742 1
104 | 0.818177 0.349008 -1
105 | -0.712647 -0.491177 -1
106 | 0.536846 1.0477 -1
107 | 0.0488566 0.125227 -1
108 | 0.400446 -0.13359 -1
109 | 0.729577 -0.365942 1
110 | -0.846403 0.870347 1
111 | 0.830743 0.584719 -1
112 | 0.446426 0.283358 -1
113 | 0.635038 0.852612 -1
114 | 0.134707 0.840304 1
115 | -0.580763 -0.0146776 1
116 | -0.426553 0.491576 1
117 | -0.0309166 1.0816 1
118 | 0.487241 -0.75003 1
119 | -0.506428 -0.910393 -1
120 | 0.24941 0.383094 -1
121 | -0.443343 -0.763903 -1
122 | -0.104542 -0.883496 1
123 | -0.283878 -0.571733 -1
124 | 1.01211 0.371876 1
125 | 0.0618232 -0.607985 1
126 | 0.0871395 0.360893 -1
127 | 0.87155 0.413564 -1
128 | -0.422244 0.521393 1
129 | -0.529957 0.699439 1
130 | 0.590191 0.446972 -1
131 | 0.840474 -0.850343 1
132 | 0.091857 -0.231851 -1
133 | -0.0822101 -0.402158 -1
134 | 0.973161 0.641579 -1
135 | 0.435721 0.276361 -1
136 | 0.524528 -0.543545 1
137 | 0.554733 0.38298 -1
138 | 0.955988 -0.801573 1
139 | -0.770892 0.43342 1
140 | 0.889864 -0.531147 1
141 | -0.261993 0.066587 1
142 | 0.641572 -0.18225 1
143 | -0.415312 0.778326 1
144 | 0.467457 0.818292 -1
145 | 0.477891 -0.248978 1
146 | -0.826238 0.943979 1
147 | -0.945929 -0.428109 1
148 | 0.506624 -0.809274 1
149 | -0.536129 0.02724 1
150 | 0.41018 0.430307 -1
151 | -0.261622 -0.580314 -1
152 | 0.319113 0.215579 -1
153 | 0.147568 -0.814993 1
154 | 0.629898 0.699697 -1
155 | -0.954079 0.926492 1
156 | -0.244927 0.018247 -1
157 | 0.581089 0.316104 -1
158 | 0.0410187 -0.457155 -1
159 | 0.583963 0.682676 -1
160 | -0.487305 0.864632 1
161 | 0.825899 -0.0497919 1
162 | 0.521741 -0.889465 1
163 | 0.838825 -0.826463 1
164 | -0.629605 -0.148894 -1
165 | 0.852444 -1.05747 1
166 | 0.383238 0.678823 -1
167 | -0.0569873 0.60584 1
168 | 0.304129 -1.06668 1
169 | -0.861419 -0.226248 1
170 | -0.863202 0.202902 1
171 | 0.559463 0.153533 -1
172 | -0.417588 -0.326951 -1
173 | 0.122127 -0.248965 -1
174 | 0.74666 0.555462 -1
175 | 0.206152 0.57419 1
176 | -0.890985 0.498582 1
177 | -0.68593 -0.46911 -1
178 | -0.197807 -0.13616 -1
179 | 0.616974 0.129276 -1
180 | -0.793416 0.361406 1
181 | -1.01017 0.0784968 1
182 | -0.86176 -0.578659 -1
183 | -0.197205 0.275245 -1
184 | 0.575618 0.937045 1
185 | -0.613542 -0.941164 -1
186 | -0.551814 -0.268043 -1
187 | 0.160037 -0.341978 -1
188 | -0.397164 0.656517 1
189 | -0.57887 -0.873403 -1
190 | 0.831792 -0.0853375 1
191 | -0.298931 -0.428922 -1
192 | -0.149263 0.65394 1
193 | -0.744958 -0.719335 -1
194 | 0.155624 0.920676 1
195 | 0.528938 0.916654 -1
196 | 0.0828245 -0.627955 1
197 | -0.939876 -0.662727 -1
198 | 0.714181 -0.259774 1
199 | -0.0111258 -0.842756 1
200 | 0.542759 0.118396 -1
201 | 0.734106 -0.890722 1
202 | 0.378962 -0.116083 -1
203 | -0.166564 -0.410541 -1
204 | -0.782016 0.37673 1
205 | 0.271773 0.810967 1
206 | -0.867801 -0.67611 -1
207 | -0.181209 0.680912 1
208 | -0.0444729 0.000441472 -1
209 | 0.429315 0.828909 -1
210 | -0.837975 -0.0769672 1
211 | 0.699779 -0.208176 1
212 | 0.774033 0.51157 -1
213 | -0.688138 0.793107 1
214 | -0.425318 -0.850405 -1
215 | 0.443572 -0.242149 1
216 | -0.00218607 0.697437 1
217 | 0.325465 -0.185786 -1
218 | 0.271268 -0.852024 1
219 | 0.208343 -0.829311 1
220 | -0.338032 -0.894042 -1
221 | -0.0242602 -0.550912 -1
222 | 0.255416 -0.288361 -1
223 | -0.716619 0.00041986 1
224 | 0.131717 -0.459831 -1
225 | 0.344558 -0.128777 -1
226 | 0.0824384 -0.972626 1
227 | 0.533274 0.295129 -1
228 | -0.338819 0.919956 1
229 | 0.551242 -0.846343 1
230 | -0.410561 0.512252 1
231 | 0.462925 -0.735639 1
232 | 0.575689 -0.589523 1
233 | -0.632094 -0.980422 -1
234 | -0.167642 -0.529201 -1
235 | 0.720368 -1.04008 1
236 | 0.750029 -0.538094 1
237 | 0.25173 -0.960771 1
238 | -0.724598 0.0744037 1
239 | -0.720427 -0.557209 -1
240 | -0.953844 0.477419 1
241 | 0.71149 -0.990116 1
242 | 0.290836 -0.443204 1
243 | 0.319746 -0.40071 1
244 | 0.233538 0.636726 1
245 | -0.195969 -0.990105 1
246 | -0.437767 0.0116531 1
247 | -0.35483 0.81982 1
248 | 0.347045 -0.545084 1
249 | 0.836376 0.343831 -1
250 | -0.713851 -0.640575 -1
251 |
--------------------------------------------------------------------------------
/hw8/hw4_nnet_train.dat:
--------------------------------------------------------------------------------
1 | -0.77947 0.838221 1
2 | 0.155635 0.895377 1
3 | -0.0599077 -0.71778 1
4 | 0.207596 0.758933 1
5 | -0.195983 -0.375487 -1
6 | 0.588489 -0.842554 1
7 | 0.00719859 -0.548316 -1
8 | 0.738839 -0.603394 1
9 | 0.704648 -0.0204201 1
10 | 0.969927 0.641371 -1
11 | 0.435431 0.744773 -1
12 | -0.844258 0.742354 1
13 | 0.591425 -0.546021 1
14 | -0.0690931 0.03766 -1
15 | -0.951549 -0.733055 -1
16 | -0.129881 0.756761 1
17 | -0.495346 -0.566279 -1
18 | -0.903994 0.509221 1
19 | 0.292351 0.16089 -1
20 | 0.647986 -0.779338 1
21 | 0.375956 0.0782031 -1
22 | 0.24589 0.00451467 -1
23 | -0.457192 0.423905 1
24 | -0.441279 0.705719 1
25 | 0.507447 0.758726 -1
26 |
--------------------------------------------------------------------------------
/lecture/MLF1-1.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Machine Learning Foundations Homework 1 (Part 1)
3 | date: 2017-02-06 16:35:57
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 | Machine Learning Foundations Homework 1, Part 1: covers Questions 1-14
8 |
9 |
10 | ## Machine Learning Foundations Homework 1
11 |
12 | ### Problem 1
13 | Q1. Which of the following problems are well suited to machine learning?
14 | ① Classifying numbers into primes and non-primes
15 | ② Detecting potentially fraudulent credit-card transactions
16 | ③ Computing the time it takes an object to fall to the ground
17 | ④ Determining the optimal traffic-light cycle for a busy intersection
18 | ⑤ Determining the recommended age for taking a particular medication
19 |
20 | A1: Three criteria for judging whether a problem is suitable for machine learning:
21 | a. There exists some underlying pattern that could be learned
22 | b. The pattern is hard to define in a simple way, so programming it directly would be difficult
23 | c. Data related to the pattern (i.e., data about the problem itself) are available
24 | By these three criteria, the answer is clearly ②④⑤.
25 |
26 | ### Problem 2
27 |
28 | Q2~Q5: identify which type of machine learning problem each scenario belongs to; the key is to recognize the characteristics of the different problem types.
29 | Taxonomy 1: by the output space $\mathbb{Y}$ -- ① classification: $\mathbb{Y}=\{1,2,...,K\}$ ② regression: $\mathbb{Y}=\mathbb{R}$ ③ structured learning: $\mathbb{Y}$ is a set of structures
30 | Taxonomy 2: by the labels $y_n$ -- ① supervised learning: every training example comes with $y_n$ ② unsupervised learning: no training example has $y_n$ ③ semi-supervised learning: only part of the training data has $y_n$ ④ reinforcement learning: no explicit labels; think of it as controlling an agent in some environment and gradually improving its behaviour through interaction with that environment. For a more detailed definition see [reinforcement learning](https://www.zhihu.com/question/41775291)
31 | Taxonomy 3: by how the underlying target function $f$ is learned: ① batch learning: feed in all available data at once to obtain a model ② online learning: feed data in gradually and keep updating the model (when new data arrive, the model can be adjusted without retraining on the old data) ③ active learning: strategically choose which data to query, i.e., the learner can judge which data are worth labelling
32 |
33 | Q2: Improving one's game-playing by trying different move-selection strategies and using the results as feedback?
34 |
35 | A2: reinforcement learning
36 |
37 | Q3: Grouping books that carry no labels at all into categories?
38 |
39 | A3: unsupervised learning
40 |
41 | Q4: Using 1000 photos containing faces and 10000 photos without faces to decide whether other photos contain a face?
42 |
43 | A4: supervised learning
44 |
45 | Q5: Selectively scheduling experiments to quickly determine the efficacy of a cancer drug?
46 |
47 | A5: active learning -- the key hint is the selective scheduling of the experiments (the learner chooses which experiments to run)
48 |
49 | ### Problem 3
50 |
51 | Q6~Q8: these questions concern the error on data outside the training set.
52 | Let $\mathbb{X}=\{x_{1},x_2,...,x_N,x_{N+1},...,x_{N+L}\}$ and $\mathbb{Y}=\{-1,+1\}$. Assume the training set is $D=\{(x_n,y_n)\}_{n=1}^N$ with $y_n\in\mathbb{Y}$, and that the test inputs are $\{x_{N+l}\}_{l=1}^L$. Define the off-training-set (OTS) error as (where $f$ is the underlying target function and $g$ is the hypothesis):
53 | $$
54 | E_{OTS}(g,f)=\frac{1}{L}\sum_{l=1}^L[g(x_{N+l})\neq f(x_{N+l})]
55 | $$
56 | Q6: Suppose $f(x)=+1$ for all $x$, while $g(x)=+1$ when $x=x_k$ with $k$ odd and $g(x)=-1$ otherwise. What is $E_{OTS}(g,f)$?
57 |
58 | A6: The problem reduces to counting how many even indices lie in $N+1,\dots,N+L$. Trying small cases, or deriving it directly, gives the count $\lfloor\frac{N+L}{2}\rfloor-\lfloor\frac{N}{2}\rfloor$, so the answer is $\frac{1}{L}\times(\lfloor\frac{N+L}{2}\rfloor-\lfloor\frac{N}{2}\rfloor)$ (see the quick check below).
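A quick sanity check of this counting argument (an illustrative snippet written for these notes, not part of the original homework code) compares a brute-force count of the even indices in $N+1,\dots,N+L$ with the closed-form expression:

```python
def even_count_brute(N, L):
    # count the even indices k in {N+1, ..., N+L}
    return sum(1 for k in range(N + 1, N + L + 1) if k % 2 == 0)

def even_count_formula(N, L):
    # closed form from A6: floor((N+L)/2) - floor(N/2)
    return (N + L) // 2 - N // 2

for N in range(6):
    for L in range(1, 6):
        assert even_count_brute(N, L) == even_count_formula(N, L)
print("closed form matches the brute-force count")  # E_OTS equals this count divided by L
```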
59 |
60 | Q7: If there exists a target function $f$ with $f(x_n)=y_n$ for every training example $\{(x_n,y_n)\}_{n=1}^N\in D$, we call the setting noiseless. Among all possible $f: \mathbb{X}\to\mathbb{Y}$, how many distinct $f$ satisfy the noiseless condition on $D$? (Note: $f_1$ and $f_2$ are considered the same function if $f_1(x)=f_2(x)$ for all $x\in \mathbb{X}$.)
61 |
62 | A7: Clearly the remaining freedom in $f$ lies in $(x_{N+1},y_{N+1}),...,(x_{N+L},y_{N+L})$, so there are $2^L$ possible assignments of the outputs $y_{N+1},...,y_{N+L}$.
63 |
64 | Q8: A deterministic algorithm $\mathbb{A}$ takes the training set $D$ as input and outputs a hypothesis $g$. If every $f$ satisfying the condition of Q7 is equally likely, which equality holds for any two algorithms $\mathbb{A}_1$ and $\mathbb{A}_2$?
65 |
66 | A8: Whatever the algorithm $\mathbb{A}$, the hypothesis $g$ it produces assigns to the off-training-set points $(x_{N+1},...,x_{N+L})$ one particular labelling $(y_{N+1},...,y_{N+L})$ out of the $2^L$ possibilities (all $\pm1$ patterns). Since $E_{OTS}(g,f)$ depends only on the data outside $D$, we may regard $g$ as just one of the $2^L$ possible $f$ without changing the result. The question therefore reduces to comparing any two functions $f_1$ and $f_2$ with respect to $E_{OTS}$, and clearly $E_f\{E_{OTS}(f_1,f)\}=E_f\{E_{OTS}(f_2,f)\}$ holds.
67 | A simple example illustrates this: with $L=2$, suppose $f_1$ predicts $\{-1,-1\}$ and $f_2$ predicts $\{+1,+1\}$. There are four possible $f$; computing the expectation over them shows the equality holds. (One way to see it: starting from $f_1$, the possible $f$ differ from it in 0, 1, or 2 positions, contributing 0, 1, and 2 errors; exactly the same holds starting from $f_2$.) The snippet below checks this numerically.
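The sketch below (illustrative only, not from the original solution) enumerates all $2^L$ equally likely off-training-set labellings $f$ and shows that the average $E_{OTS}$ equals $1/2$ for every fixed prediction $g$:

```python
from itertools import product

def avg_ots_error(g, L):
    # average E_OTS(g, f) over all 2^L equally likely labellings f of the L test points
    errors = [sum(gi != fi for gi, fi in zip(g, f)) / L
              for f in product([-1, +1], repeat=L)]
    return sum(errors) / len(errors)

L = 3
for g in [(-1, -1, -1), (+1, +1, +1), (-1, +1, -1)]:
    print(g, avg_ots_error(g, L))   # every g gives 0.5
```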
68 |
69 | ### Problem 4
70 |
71 | Q9~Q12: these questions test understanding of the bin model. Consider a bin containing infinitely many marbles of just two colours, orange and green. Let the fraction of orange marbles in the bin be $\mu$. A sample of 10 marbles is drawn from the bin; let the fraction of orange marbles in the sample be $\nu$.
72 |
73 | Q9: If $\mu=0.5$, what is the probability that $\nu=\mu$?
74 |
75 | A9: Basic probability: $C_{10}^5 0.5^50.5^5=0.246\approx 0.24$
76 |
77 | Q10: If $\mu=0.9$, what is the probability that $\nu=\mu$?
78 |
79 | A10: Similarly, $C_{10}^9 0.9^90.1^1=0.387\approx 0.39$
80 |
81 | Q11: If $\mu=0.9$, what is the probability that $\nu\leq 0.1$?
82 |
83 | A11: $C_{10}^1 0.9^{1}0.1^9+C_{10}^0 0.9^{0} 0.1^{10}=9.1\times 10^{-9}$
84 |
85 | Q12: If $\mu=0.9$, what upper bound on the probability that $\nu\leq 0.1$ does the Hoeffding inequality give?
86 |
87 | A12: Hoeffding's inequality:
88 | $$
89 | P[|\nu-\mu|>\epsilon]\leq2\exp(-2\epsilon^2N)
90 | $$
91 | Here $\epsilon=0.8$, which gives $P[|\nu-\mu|>\epsilon]\leq5.52\times10^{-6}$. Comparing with Q11, this bound is much larger than the true probability ($9.1\times 10^{-9}$), i.e., the Hoeffding bound is quite loose. This foreshadows the discussion in later lectures about how acceptable it is to approximate the true $\mu$ by the sample $\nu$: the closeness guaranteed by the Hoeffding inequality understates how close $\nu$ and $\mu$ actually tend to be.
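For reference, the values in Q9-Q11 and the Hoeffding bound in Q12 can be reproduced with a few lines of Python (a standalone check, not part of the original notebook):

```python
from math import comb, exp

def binom_pmf(k, n, p):
    # probability of exactly k orange marbles in n independent draws
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(binom_pmf(5, 10, 0.5))                           # Q9:  ~0.246
print(binom_pmf(9, 10, 0.9))                           # Q10: ~0.387
print(binom_pmf(0, 10, 0.9) + binom_pmf(1, 10, 0.9))   # Q11: ~9.1e-9
print(2 * exp(-2 * 0.8**2 * 10))                       # Q12: Hoeffding bound, ~5.5e-6
```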
92 |
93 | ### Problem 5
94 |
95 | Q13~Q14: A bag contains infinitely many six-sided dice of four different types, each type being equally likely:
96 | a. all even faces are orange, all odd faces are green
97 | b. all even faces are green, all odd faces are orange
98 | c. faces 1-3 are orange, faces 4-6 are green
99 | d. faces 1-3 are green, faces 4-6 are orange
100 |
101 | Q13: If 5 dice are drawn from the bag, what is the probability that face 1 is orange on all 5 dice?
102 |
103 | A13: Face 1 is orange for types (b) and (c), so for a single die the probability is 0.5, hence $P=0.5^5=\frac{8}{256}$
104 |
105 | Q14: If 5 dice are drawn from the bag, what is the probability that some face is orange on all 5 dice?
106 |
107 | A14: The relevant type combinations are {(a)(c), (a)(d), (b)(c), (b)(d)}, which gives $P=0.5^5\times 4-0.25^5\times 4=\frac{31}{256}$ (the $0.25^5$ terms correspond to all five dice coming from a single type, which is double-counted across the combinations)
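Both dice probabilities can be verified by enumerating all $4^5$ equally likely assignments of types to the 5 dice (a quick check written for these notes, not part of the original solution), using the colourings a-d listed above:

```python
from itertools import product

# orange faces of each die type
ORANGE = {'a': {2, 4, 6}, 'b': {1, 3, 5}, 'c': {1, 2, 3}, 'd': {4, 5, 6}}

q13 = q14 = 0
for draw in product('abcd', repeat=5):                      # all 4^5 equally likely draws
    common = set.intersection(*(ORANGE[t] for t in draw))   # faces orange on every die
    q13 += 1 in common                                      # face 1 orange on all 5 dice
    q14 += len(common) > 0                                  # some face orange on all 5 dice
total = 4 ** 5
print(q13 / total, 8 / 256)     # 0.03125    0.03125
print(q14 / total, 31 / 256)    # 0.12109375 0.12109375
```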
108 |
109 |
--------------------------------------------------------------------------------
/lecture/MLF1-2.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Machine Learning Foundations Homework 1 (Part 2)
3 | date: 2017-02-06 20:50:09
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | Machine Learning Foundations Homework 1, Part 2: covers Questions 15-20
9 |
10 |
11 | ## Machine Learning Foundations Homework 1
12 |
13 | ## Problem 6
14 |
15 | Q15~Q20: these questions are about the perceptron learning algorithm (PLA) and the Pocket algorithm. An informal description of the two algorithms follows:
16 |
17 | ### Algorithm description
18 |
19 | PLA algorithm
20 |
21 | >Initialize $w=0$
22 | >For t=0,1,...
23 | > ① find an example $(x_{n(t)}, y_{n(t)})$ such that $sign(w_t^Tx_{n(t)})\neq y_{n(t)}$
24 | > ② correct $w$ with the update $w_{t+1}\leftarrow w_t+y_{n(t)}x_{n(t)} $
25 | >Repeat until no misclassified example remains
26 |
27 | A note on step ①: several strategies are possible, e.g. (a) scan the data in order $(1,...,N)$ from the start each time to find a misclassified example; (b) resume the scan from the position of the previous mistake; (c) randomly shuffle the data each round and scan in the shuffled order. Strategies (b) and (c) are usually faster than (a).
28 |
29 | PLA, however, only works for linearly separable data, which is why the Pocket algorithm is introduced.
30 |
31 | Pocket algorithm
32 |
33 | >Initialize $w=0$
34 | >For t=0,1,...
35 | > ① find an example $(x_{n(t)}, y_{n(t)})$ such that $sign(w_t^Tx_{n(t)})\neq y_{n(t)}$
36 | > ② (tentatively) correct the mistake with $w_{t+1}\leftarrow w_t+y_{n(t)}x_{n(t)} $
37 | > ③ if the error rate of $w_{t+1}$ is lower than that of the weights currently kept in the "pocket", set $w=w_{t+1}$
38 | >Repeat for a preset number of iterations
39 | >Return $w$
40 |
41 | For step ①, strategy (c) is used most of the time; in the implementation one simply finds all misclassified examples and picks one of them at random.
42 |
43 | ### 算法实现
44 |
45 | ```python
46 | import numpy as np   # assumed import; in the original notebook the imports sit in an earlier cell
47 | # Perceptron learning algorithm (PLA). A prevpos variable is kept so that each search for a
48 | # misclassified example resumes after the previous mistake, which is faster than rescanning from the start.
49 | def perceptron(X, Y, theta, eta=1):
50 | num = 0; prevpos = 0
51 | while(True):
52 | yhat = np.sign(X.dot(theta))
53 | yhat[np.where(yhat == 0)] = -1
54 | index = np.where(yhat != Y)[0]
55 |         if index.size == 0:   # no misclassified points left (index 0 is a valid position, so don't use .any())
56 | break
57 |         if index[index >= prevpos].size == 0:   # no mistake after prevpos, wrap around
58 | prevpos = 0
59 | pos = index[index >= prevpos][0]
60 | prevpos = pos
61 | theta += eta*Y[pos, 0]*X[pos:pos+1, :].T
62 | num += 1
63 | return theta, num
64 | ```
65 | ```python
66 | # Error-rate helper, used by the Pocket algorithm below
67 | def mistake(yhat, y):
68 | row, col = y.shape
69 | return np.sum(yhat != y)/row
70 | ```
71 | ```python
72 | # Pocket algorithm
73 | def pocket(X, Y, theta, iternum, eta = 1):
74 | yhat = np.sign(X.dot(theta))
75 | yhat[np.where(yhat == 0)] = -1
76 | errold = mistake(yhat, Y)
77 | thetabest = np.zeros(theta.shape)
78 | for t in range(iternum):
79 | index = np.where(yhat != Y)[0]
80 |         if index.size == 0:   # data already perfectly classified
81 | break
82 | pos = index[np.random.permutation(len(index))[0]]
83 | theta += eta * Y[pos, 0] * X[pos:pos + 1, :].T
84 | yhat = np.sign(X.dot(theta))
85 | yhat[np.where(yhat == 0)] = -1
86 | errnow = mistake(yhat, Y)
87 | if errnow < errold:
88 |             thetabest = theta.copy() # .copy() is essential: thetabest = theta would make both names alias the same array
89 | errold = errnow
90 | return thetabest, theta
91 | ```
92 | ### The problems
93 |
94 | Data-loading module
95 |
96 | ```python
97 | import pandas as pd   # data-loading helper below (import assumed from an earlier notebook cell)
98 | def loadData(filename):
99 |     data = pd.read_csv(filename, sep=r'\s+', header=None)
100 |     data = data.values   # as_matrix() has been removed from recent pandas versions
101 | col, row = data.shape
102 | X = np.c_[np.ones((col, 1)), data[:, 0: row-1]]
103 | Y = data[:, row-1:row]
104 | return X, Y
105 | ```
106 | **Data loading for Q15-Q17**
107 | ```python
108 | # Load the data for Q15-Q17
109 | X, Y = loadData('hw1_15_train.dat')
110 | col, row = X.shape
111 | theta = np.zeros((row, 1))
112 | print('First five rows of X:\n',X[0:5, :])
113 | print('First five rows of Y: \n',Y[0:5,:].T)
114 | ```
115 |
116 |     First five rows of X:
117 | [[ 1. 0.97681 0.10723 0.64385 0.29556 ]
118 | [ 1. 0.67194 0.2418 0.83075 0.42741 ]
119 | [ 1. 0.20619 0.23321 0.81004 0.98691 ]
120 | [ 1. 0.51583 0.055814 0.92274 0.75797 ]
121 | [ 1. 0.70893 0.10836 0.33951 0.77058 ]]
122 |     First five rows of Y: 
123 | [[ 1. 1. 1. 1. 1.]]
124 | Q15: number of updates made by the basic perceptron algorithm
125 |
126 | ```python
127 | # Q15 result
128 | theta, num = perceptron(X, Y, theta)
129 | print('Total number of theta updates:',num)
130 | ```
131 |
132 |     Total number of theta updates: 39
133 |
134 | Q16: average number of perceptron updates over random permutations of the data
135 |
136 | ```python
137 | # Q16 result
138 | total = 0
139 | for i in range(2000):
140 | theta = np.zeros((row, 1))
141 | randpos = np.random.permutation(col)
142 | Xrnd = X[randpos, :]
143 | Yrnd = Y[randpos, 0:1]
144 | _, num = perceptron(Xrnd, Yrnd, theta)
145 | total += num
146 | print('2000次平均每次更新theta的次数:',total/2000)
147 | ```
148 |
149 | 2000次平均每次更新theta的次数: 39.806
150 |
151 | Q17:不同$\eta$情况下的感知机平均更新次数
152 |
153 | ```python
154 | # Q17的结果
155 | total = 0
156 | for i in range(2000):
157 | theta = np.zeros((row, 1))
158 | randpos = np.random.permutation(col)
159 | Xrnd = X[randpos, :]
160 | Yrnd = Y[randpos, 0:1]
161 | _, num = perceptron(Xrnd, Yrnd, theta, 0.5)
162 | total += num
163 | print('2000次平均每次更新theta的次数:',total/2000)
164 | ```
165 |
166 | 2000次平均每次更新theta的次数: 39.758
167 |
168 | 这里需要说明一点:Q17和Q16的结果基本一致,原因在于初始$w=0$时,$\eta$只是对$w$做整体缩放,而整体缩放不改变$sign(w^Tx)$,因此每次选到的"错误数据"序列完全相同(细微差别仅来自随机打乱);但当初始$w\neq 0$时,两问的结果就会有一定差别
169 |
170 | **Q18-Q20数据导入**
171 |
172 | ```python
173 | # Q18-20导入数据
174 | X, Y = loadData('hw1_18_train.dat')
175 | Xtest, Ytest = loadData('hw1_18_test.dat')
176 | col, row = X.shape
177 | theta = np.zeros((row, 1))
178 | ```
179 | Q18:50次更新情况下的测试集错误率
180 |
181 | ```python
182 | # Q18
183 | total = 0
184 | for i in range(2000):
185 | theta = np.zeros((row, 1))
186 | randpos = np.random.permutation(col)
187 | Xrnd = X[randpos, :]
188 | Yrnd = Y[randpos, 0:1]
189 | theta, thetabad = pocket(Xrnd, Yrnd, theta, 50)
190 | yhat = np.sign(Xtest.dot(theta))
191 | yhat[np.where(yhat == 0)] = -1
192 | err = mistake(yhat, Ytest)
193 | total += err
194 | print('迭代次数为50时,theta_pocket情况下的测试集错误率:',total/2000)
195 | ```
196 |
197 | 迭代次数为50时,theta_pocket情况下的测试集错误率: 0.132035
198 | Q19:50次更新情况下,最后一次theta作为参数时测试集的错误率
199 |
200 | ```python
201 | # Q19
202 | total = 0
203 | for i in range(2000):
204 | theta = np.zeros((row, 1))
205 | randpos = np.random.permutation(col)
206 | Xrnd = X[randpos, :]
207 | Yrnd = Y[randpos, 0:1]
208 | theta, thetabad = pocket(Xrnd, Yrnd, theta, 50)
209 | yhat = np.sign(Xtest.dot(thetabad))
210 | yhat[np.where(yhat == 0)] = -1
211 | err = mistake(yhat, Ytest)
212 | total += err
213 | print('迭代次数为50时,theta_50情况下的测试集错误率:',total/2000)
214 | ```
215 |
216 | 迭代次数为50时,theta_50情况下的测试集错误率: 0.354342
217 |
218 | Q20:100次更新情况下的测试集错误率
219 |
220 | ```python
221 | # Q20
222 | total = 0
223 | for i in range(2000):
224 | theta = np.zeros((row, 1))
225 | randpos = np.random.permutation(col)
226 | Xrnd = X[randpos, :]
227 | Yrnd = Y[randpos, 0:1]
228 | theta, thetabad = pocket(Xrnd, Yrnd, theta, 100)
229 | yhat = np.sign(Xtest.dot(theta))
230 | yhat[np.where(yhat == 0)] = -1
231 | err = mistake(yhat, Ytest)
232 | total += err
233 | print('迭代次数为100时,theta_pocket情况下的测试集错误率:',total/2000)
234 | ```
235 |
236 | 迭代次数为100时,theta_pocket情况下的测试集错误率: 0.11616
237 |
238 |
--------------------------------------------------------------------------------
/lecture/MLF1.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习基石Lec1-Lec4
3 | date: 2017-02-11 15:05:19
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习基石Lec1-Lec4主要知识点:对应作业1
9 |
10 |
11 | ## 关于机器学习
12 |
13 | ### 定义及特点
14 |
15 | 基本定义:
16 |
17 | 
18 |
19 | 判别问题是否适合采用机器学习的准则:
20 |
21 | 
22 |
23 | ### 基本框架
24 |
25 | 
26 |
27 | ## 0/1判别问题
28 |
29 | ### 感知机算法
30 |
31 | 
32 |
33 | - 优点:实现简便,运行速度快
34 | - 缺点:只能针对线性可分的数据集
35 |
36 | ### 感知机可行性分析
37 |
38 | ① $w_t$随着迭代次数$t$的增加而逐渐向$w_f$“靠拢”
39 |
40 | 
41 |
42 | 后续推导用到的公式$w_f^Tw_{t+1}\ge w_f^Tw_t+min_ny_nw_f^Tx_n$(1)
43 |
44 | ② $w_t$的“靠拢”是逐渐的
45 |
46 | 
47 |
48 | 后续推导用到的公式$||w_{t+1}||^2\leq ||w_t||^2+max_n||x_n||^2$(2)
49 |
50 | ③ 证明迭代次数有限:
51 |
52 | 由(1)可知:$w_f^Tw_{T}\ge w_f^Tw_{T-1}+min_ny_nw_f^Tx_n\ge...\ge w_f^Tw_{0}+T\cdot min_ny_nw_f^Tx_n$
53 |
54 | 由(2)可知:$||w_{T}||^2\leq ||w_{T-1}||^2+max_n||x_n||^2\le...\leq ||w_{0}||^2+T\cdot max_n||x_n||^2\to ||w_T||\le\sqrt{T}max_n||x_n||$
55 |
56 | 从而可以得到下式:
57 | $$
58 | \frac{w_f^Tw_T}{||w_f||\cdot||w_T||}\ge\frac{w_f^Tw_{0}+T\cdot min_ny_nw_f^Tx_n}{||w_f||\cdot||w_T||}\ge\frac{w_f^Tw_{0}+T\cdot min_ny_nw_f^Tx_n}{||w_f||\cdot\sqrt{T}max_n||x_n||}
59 | $$
60 | 当$w_0=0$时:
61 | $$
62 | \frac{w_f^Tw_T}{||w_f||\cdot||w_T||}\ge\frac{T\cdot min_ny_nw_f^Tx_n}{||w_f||\cdot\sqrt{T}max_n||x_n||}\to T\le \frac{R^2}{\rho^2}
63 | $$
64 | 其中$R^2=max_n||x_n||^2,\ \rho=min_ny_n\frac{w_f^T}{||w_f||}x_n$
65 |
66 | 因此得证。(其中利用到上式的左边$\leq1$)
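
下面给出一个简单的数值验证示例(非原讲义内容,仅作演示):随机生成线性可分数据,其中目标权重`wf`、数据规模`N`均为假设取值,运行最朴素的PLA并检查更新次数$T$是否满足$T\le R^2/\rho^2$:

```python
import numpy as np

np.random.seed(0)
N = 100
wf = np.array([0.0, 1.0, -1.0])                       # 假设的目标权重(含偏置项)
X = np.c_[np.ones((N, 1)), np.random.uniform(-1, 1, (N, 2))]
y = np.sign(X.dot(wf)); y[y == 0] = 1                 # 构造线性可分标签

w, T = np.zeros(3), 0
while True:
    errs = np.where(np.sign(X.dot(w)) != y)[0]        # 当前所有分类错误的点
    if errs.size == 0:
        break
    w += y[errs[0]] * X[errs[0]]                      # PLA更新
    T += 1

R2 = np.max(np.sum(X**2, axis=1))                     # R^2 = max ||x_n||^2
rho = np.min(y * X.dot(wf)) / np.linalg.norm(wf)      # rho = min y_n wf^T x_n / ||wf||
print(T, R2 / rho**2)                                 # 应满足 T <= R^2/rho^2
```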
67 |
68 | ### Pocket算法
69 |
70 | 为了解决线性不可分问题而引入(更准确而言是针对带有杂讯的“原本”线性可分数据):
71 |
72 | 
73 |
74 | 缺点:效率相比PLA要慢很多(主要在于每次迭代后都要计算错误率)
75 |
76 | ## 不同种类的学习
77 |
78 | ### 根据输出空间$\mathcal{Y}$不同进行划分
79 |
80 | 
81 |
82 | ### 根据数据标签$y_n$不同进行划分
83 |
84 | 
85 |
86 | ### 根据学习原型$f\Rightarrow(\mathbb{x_n},y_n)$不同进行划分
87 |
88 | 
89 |
90 | ### 根据输入空间$\mathcal{X}$不同进行划分
91 |
92 | 
93 |
94 | ## 学习可行性分析
95 |
96 | ### 引入小球容器模型
97 |
98 | 利用样本概率分布近似整体概率分布:
99 |
100 | 
101 |
102 | 关于Hoeffding不等式的详细内容可见[Hoeffding's Inequality](https://www.wikiwand.com/en/Hoeffding's_inequality)
103 |
104 | ### 从容器模型到学习模型
105 |
106 | ① 容器模型$\to$学习模型
107 |
108 | 
109 |
110 | ② Hoeffding不等式在学习模型中的形式
111 |
112 | 
113 |
114 | 学习到的模型是否可行主要判据:①$E_{in}(h)$是否足够小 ②$E_{in}(h)\approx E_{out}(h)$是否成立
115 |
116 | ### 单个假设函数$\to$多个假设函数
117 |
118 | 通过单个假设函数情况可以发现,在$N$较大时,$E_{in}(h)\approx E_{out}(h)$的概率就比较大。但是由于可供挑选的假设函数太少(只有一个),因此几乎很难保证$E_{in}(h)$足够小。
119 |
120 | 坏的数据集定义:数据集$D$使得$E_{in}(h)$和$E_{out}(h)$相差甚远,由Hoeffding不等式可知,对于单个假设函数而言,样本为坏的数据集的概率较小。
121 |
122 | 那么对于一系列假设函数而言,拥有的数据集(样本)是坏的数据集的概率由下式表示:
123 |
124 | 
125 |
126 | (上式即union bound:$P[a\cup b\cup c]\le P[a]+P[b]+P[c]$,当$a,b,c$互不重叠时取等号)
127 |
128 | 从而不难看出,当假设函数数量很多时,上述"坏数据集"概率的上界会变得很大,Hoeffding保证随之失效!关于如何"克服"这方面问题见Lec5-Lec8
--------------------------------------------------------------------------------
/lecture/MLF1/pic1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic1.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic10.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic11.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic11.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic12.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic12.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic13.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic13.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic14.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic14.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic15.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic15.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic1_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic1_1.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic2.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic3.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic4.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic5.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic6.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic7.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic8.png
--------------------------------------------------------------------------------
/lecture/MLF1/pic9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF1/pic9.png
--------------------------------------------------------------------------------
/lecture/MLF2-1.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习基石作业2:part1
3 | date: 2017-02-07 20:22:43
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习基石课后作业2-1:对应题目1~题目15
9 |
10 |
11 | ## 机器学习基石作业2
12 |
13 | ### 问题1
14 |
15 | Q1~Q2:考虑目标中存在噪声的情况。
16 | 假设存在一个似然函数$h$能够近似一个确定的目标函数$f$,且输出错误(即$h(x)\neq f(x)$)的概率为$\mu$(注:$h$和$f$的输出均为$\{-1,+1\}$)。如果我们用这个$h$去近似一个含噪声的$f$(含噪输出用$y$表示),如下所示:
17 | $$
18 | P(x,y)=P(x)P(y|x) \\ P(y|x)=\begin{cases}\lambda, \ \ \ \ \ \ \ \ \ \ y=f(x)\\1-\lambda, \ \ \ \ otherwise\end{cases}
19 | $$
20 | Q1:用$h$近似带噪声的目标$y$的错误率时多少?
21 |
22 | A1:$h$和$y$通过$f$联系在一起,因此错误的情况分为两种:①$h=f, \ f\neq y$,这种的概率$P1=(1-\mu)(1-\lambda)$ ② $h\neq f,\ f=y$,这种的概率$P2=\mu \lambda$,所以最终的错误率$P=(1-\mu)(1-\lambda)+\mu\lambda$
23 |
24 | Q2:当$\lambda$取何值时,似然函数$h$与$\mu$无关?
25 |
26 | A2:对Q1中的结果进行化简:$P=(2\lambda-1)\mu-\lambda+1$,因此当$\lambda=1/2$时,上式与$\mu$无关。从中也可以看出,如果噪声强大到使得$f$的结果被搞得乱七八糟的,那么似然函数取什么将“没那么重要”。因此,噪声的多少和强度在分析问题时是重要的。
27 |
28 | ### 问题2
29 |
30 | Q3~Q5:根据一些数值情况,来加深对泛化误差的理解
31 | 主要涉及到的是VC Bound这条式子(具体的推导可见课件和教材):
32 | $$
33 | \mathbb{P}_D[|E_{in}(g)-E_{out}(g)|\gt\epsilon]\le 4(2N)^{d_{vc}}exp(-\frac{1}{8}\epsilon^2N)
34 | $$
35 | 上式中$m_H(2N)=(2N)^{d_{vc}}$,且常将上界记为$\delta=4(2N)^{d_{vc}}exp(-\frac{1}{8}\epsilon^2N)$
36 | 为了简单起见,直接令$m_H(N)=N^{d_{vc}}$,且假设$N\ge 2,d_{vc}\ge2$
37 |
38 | Q3:对于一个$d_{vc}=10$的似然函数集$H$,如果我们希望其泛化误差$\epsilon\le0.05$的概率大于$95\%$,则下述哪个数据量与该条件最接近?
39 | a. 500,000 b. 480,000 c. 420,000 d. 440,000 e. 460,000
40 |
41 | A3:在计算时做一步处理,将$(2N)^{d_{vc}}exp(-\frac{1}{8}\epsilon^2N)=exp(d_{vc}\cdot log(2N)-\frac{1}{8}\epsilon^2N)$,再将不同数据代入即可,最后可以发现$N=460,000$是最接近的。
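
下面给出一个简单的代入验证示例(非原答案内容,仅作演示),对各候选$N$计算坏事件概率上界$4(2N)^{d_{vc}}exp(-\frac{1}{8}\epsilon^2N)$(取对数避免数值溢出):

```python
import numpy as np

dvc, eps = 10, 0.05
for N in [420000, 440000, 460000, 480000, 500000]:
    log_delta = np.log(4) + dvc*np.log(2*N) - eps**2*N/8   # ln(4(2N)^dvc exp(-eps^2 N/8))
    print(N, np.exp(log_delta))
# N=460,000 时上界约为 0.0065,是候选中满足 <=0.05 的最小数据量
```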
42 |
43 | Q4&Q5:除了VC Bound之外,还有许多其他的Bounds准则,主要有以下5种:
44 | a. Original VC bound:$\epsilon\le\sqrt{\frac{8}{N}ln\frac{4m_H(2N)}{\delta}}$
45 | b. Rademacher Penalty Bound:$\epsilon\le\sqrt{\frac{2}{N}ln(2Nm_H(N))}+\sqrt{\frac{2}{N}ln\frac{1}{\delta}}+\frac{1}{N}$
46 | c. Parrondo and Van den Broek:$\epsilon\le\sqrt{\frac{1}{N}(2\epsilon+ln\frac{6m_H(2N)}{\delta})}$
47 | d. Devroye:$\epsilon\le\sqrt{\frac{1}{2N}(4\epsilon(1+\epsilon)+ln\frac{4m_H(N^2)}{\delta})}$
48 | e. Variant VC bound:$\epsilon\le\sqrt{\frac{16}{N}ln\frac{2m_H(N)}{\sqrt\delta}}$
49 | 假设$d_{vc}=50,\ \delta=0.05$,则对于不同$N$(10,000和5)的情况下,各Bounds中哪个值最小?
50 |
51 | A4&A5:可以通过程序来实现
52 |
53 | ```python
54 | n = np.arange(3, 10000)
55 | f1 = np.sqrt(8/n*(np.log(80)+50*np.log(2*n)))
56 | print('Original VC bound (10000) and (5): ', f1[-1], '\t', f1[2])
57 | f2 = np.sqrt(2/n*(np.log(2*n)+50*np.log(n)))+np.sqrt(2/n*math.log(20))+1/n
58 | print('Rademacher Penalty Bound (10000) and (5): ', f2[-1], '\t', f2[2])
59 | f3 = 1/n+np.sqrt(1/np.power(n, 2)+1/n*(np.log(120)+50*np.log(2*n)))
60 | print('Parrondo and Van den Broek (10000) and (5): ', f3[-1], '\t', f3[2])
61 | f4 = 1/(n-2)+np.sqrt(1/np.power(n-2, 2)+1/(2*n-4)*(np.log(80)+100*np.log(n)))
62 | print('Devroye (10000) and (5): ', f4[-1], '\t', f4[2])
63 | f5 = np.sqrt(16/n*(np.log(2/math.sqrt(0.05))+50*np.log(n)))
64 | print('Variant VC bound (10000) and (5): ', f5[-1], '\t', f5[2])
65 | plt.plot(n, f1, label='Original VC bound')
66 | plt.plot(n, f2, label='Rademacher Penalty Bound')
67 | plt.plot(n, f3, label='Parrondo and Van den Broek')
68 | plt.plot(n, f4, label='Devroye')
69 | plt.plot(n, f5, label='Variant VC bound')
70 | plt.legend()
71 | plt.show()
72 | ```
73 |
74 | Original VC bound (10000) and (5): 0.632203362312 13.828161485
75 | Rademacher Penalty Bound (10000) and (5): 0.33132369478 7.04877656418
76 | Parrondo and Van den Broek (10000) and (5): 0.223708366246 5.10136198199
77 | Devroye (10000) and (5): 0.215237656725 5.59312554318
78 | Variant VC bound (10000) and (5): 0.860464345894 16.264111061
79 |
80 | 
81 |
82 | 因此对于$N=10,000$的情况Devroye最小,对于$N=5$的情况Parrondo and Van den Broek最小
83 |
84 | ### 问题3
85 |
86 | Q6~Q11:主要考察growth function和VC维的问题
87 |
88 | Q6&Q7:求“positive-and-negative intervals on R”这种类型似然函数集:①在区间$[l, r]$之间为$+1$,其他地方为$-1$或者②在区间$[l, r]$之间为$-1$,其他为$+1$的growth function($m_H(N)$)和VC维?
89 |
90 | 
91 |
92 | A6&A7:上述问题可以看成N个数据(如上图9个数据,1代表第一个数据),数据之间有N-1个空隙,再加上最左侧的空隙共N个空隙;任选两个空隙有$C_N^2$种取法,配合区间内部取+1或-1两种符号,共$2C_N^2$种分法。但这样还少了全+1和全-1的情况,所以总共应该有$N(N-1)+2=N^2-N+2$种。而VC维则直接将$N=1,2,3,...$代入$m_H(N)$,直到其$\lt 2^N$为止,不难求得$d_{vc}=3$
93 |
94 | Q8:二维空间上,"甜甜圈"类型的似然函数集(如下图所示),在甜甜圈内部为+1($a^2\le x_1^2+x_2^2\leq b^2$),外部为-1,假设$0
--------------------------------------------------------------------------------
/lecture/MLF2-2.md:
--------------------------------------------------------------------------------
10 |
11 | ## 机器学习基石作业2
12 |
13 | ## 问题5
14 |
15 | Q16~Q20:主要考察“一刀切”式的“决策树桩”算法。以下给出单维和多维情况下的算法的“口语化”说明。其中单维对应的式子:
16 | $$
17 | h_{s,\theta}(x)=s\cdot sign(x-\theta)
18 | $$
19 | 多维情况对应的式子:
20 | $$
21 | h_{s,i,\theta}=s\cdot sign(x_i-\theta)
22 | $$
23 |
24 | ### 算法说明
25 |
26 | 
27 |
28 | 单维树桩算法
29 |
30 | >假定初始数据为$\{(x_1,y_1),(x_2,y_2),...,(x_N,y_N)\}$
31 | >① 预先设定N个阈值$\theta$(先对数据的$x$进行排序,将$\theta$设定为其间隙值,且取一个最小数左边的值)
32 | >② 计算每一个阈值$\theta$和$s=+1,-1$对应的$E_{in}$,找出其中对应最小$E_{in}$的$\theta,\ s$
33 | >返回$\theta,\ s,\ minE_{in}$
34 |
35 | 其中①中可以采用其他的策略来实现,但具体方式是相近的。
36 |
37 | 多维树桩算法
38 |
39 | >假定初始数据为$\{(x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),...,(x^{(N)},y^{(N)})\}$,其中$x^{(i)}\in\mathbb{R}^d$
40 | >①For i=1,2,...,d:
41 | > 寻找维度i情况下的$\theta,\ s,\ minE_{in}$(通过单维树桩的方式求得)
42 | >②寻找上述$d$个不同$minE_{in}$中最小的那个,以及对应的$\theta,\ s$(如果存在两个$minE_{in}$相同则任意取一个)
43 | >返回$\theta,\ s,\ minE_{in}$
44 |
45 | ### 算法实现
46 |
47 | ```python
48 | # 单维度决策树桩算法
49 | def decision_stump(X, Y):
50 | theta = np.sort(X)
51 | num = len(theta)
52 | Xtemp = np.tile(X, (num, 1))
53 | ttemp = np.tile(np.reshape(theta, (num, 1)), (1, num))
54 | ypred = np.sign(Xtemp - ttemp)
55 | ypred[ypred == 0] = -1
56 | err = np.sum(ypred != Y, axis=1)
57 | if np.min(err) <= num-np.max(err):
58 | return 1, theta[np.argmin(err)], np.min(err)/num
59 | else:
60 | return -1, theta[np.argmax(err)], (num-np.max(err))/num
61 | ```
62 | ```python
63 | # 多维度决策树桩算法
64 | def decision_stump_multi(X, Y):
65 | row, col = X.shape
66 | err = np.zeros((col,)); s = np.zeros((col,)); theta = np.zeros((col,))
67 | for i in range(col):
68 | s[i], theta[i], err[i] = decision_stump(X[:, i], Y[:, 0])
69 | pos = np.argmin(err)
70 | return pos, s[pos], theta[pos], err[pos]
71 | ```
72 | ### 具体问题
73 |
74 | 涉及到自己生成数据问题,生成的数据满足下述两个条件:
75 | (a). $x$产生于$[-1,+1]$上的均匀分布
76 | (b). $y=f(x)+noise$,其中$f(x)=sign(x)$,noise则为有$20\%$的概率翻转$f(x)$的结果
77 |
78 | 生成数据函数
79 |
80 | ```python
81 | # 生成数据函数
82 | def generateData():
83 | x = np.random.uniform(-1, 1, 20)
84 | y = np.sign(x)
85 | y[y == 0] = -1
86 | prop = np.random.uniform(0, 1, 20)
87 | y[prop >= 0.8] *= -1
88 | return x, y
89 | ```
90 | Q16:对于任意一个决策树桩函数$h_{s,\theta}\ \ \theta\in[-1,+1]$,其对应的$E_{out}(h_{s,\theta})$为以下哪一种函数?
91 | a. $0.3+0.5s(1-|\theta|)$ b. $0.3+0.5s(|\theta|-1)$ c. $0.5+0.3s(|\theta|-1)$ d.$0.5+0.3s(1-|\theta|)$
92 |
93 | A16:为简便起见,假设$s=1,\theta\gt0$,此时$h$预测情况:$[\theta,1]\to +1$,$[-1,\theta]\to-1$,$f$真实情况:$(p=0.8)[-1,0]\to-1$,$(p=0.2)[-1,0]\to+1$,$(p=0.8)[0,1]\to+1$,$(p=0.2)[0,1]\to-1$。从而可见错误出现在区间$[0,\theta]$错误概率为$0.8$,其他区域错误概率为$0.2$。因此$E_{out}=(0.2(2-\theta)+0.8\theta)/2=0.2+0.3\theta$,其他三种情况类似分析,最终可得答案为c
94 |
95 | Q17&Q18:根据规则随机生成包含20个样本的数据集,重复实验5,000次,求平均$E_{in}$和平均$E_{out}$(其中$E_{out}$由Q16中的答案来求解)?
96 |
97 | ```python
98 | # Q17和Q18
99 | totalin = 0; totalout = 0
100 | for i in range(5000):
101 | X, Y = generateData()
102 | theta = np.sort(X)
103 | s, theta, errin = decision_stump(X, Y)
104 | errout = 0.5+0.3*s*(math.fabs(theta)-1)
105 | totalin += errin
106 | totalout += errout
107 | print('训练集平均误差: ', totalin/5000)
108 | print('测试集平均误差: ', totalout/5000)
109 | ```
110 |
111 | 训练集平均误差: 0.17111
112 | 测试集平均误差: 0.2613195070168455
113 | **Q19&Q20**导入数据集函数
114 |
115 | ```python
116 | # 导入数据函数
117 | def loadData(filename):
118 | data = pd.read_csv(filename, sep='\s+', header=None)
119 |     data = data.values      # 新版pandas已移除as_matrix(),改用values
120 | col, row = data.shape
121 | X = np.c_[np.ones((col, 1)), data[:, 0: row-1]]
122 | Y = data[:, row-1:row]
123 | return X, Y
124 | ```
125 | Q19&Q20:求多维决策树桩在训练集和测试集上的误差$E_{in}$和$E_{out}$?
126 |
127 | ```python
128 | # Q19和Q20
129 | X, Y = loadData('hw2_train.dat')
130 | Xtest, Ytest = loadData('hw2_test.dat')
131 | pos, s, theta, err = decision_stump_multi(X, Y)
132 | print('训练集误差: ', err)
133 | ypred = s*np.sign(Xtest[:, pos]-theta)
134 | ypred[ypred == 0] = -1
135 | row, col = Ytest.shape
136 | errout = np.sum(ypred != Ytest.reshape(row,))/len(ypred)
137 | print('测试集误差: ', errout)
138 | ```
139 |
140 | 训练集误差: 0.25
141 | 测试集误差: 0.355
142 |
--------------------------------------------------------------------------------
/lecture/MLF2-2/Q6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF2-2/Q6.png
--------------------------------------------------------------------------------
/lecture/MLF2.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习基石Lec5-Lec8
3 | date: 2017-02-11 19:27:15
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习基石Lec5-Lec8主要知识点:对应作业2
9 |
10 |
11 | ## 训练vs测试
12 |
13 | ### 假设函数集数量的影响
14 |
15 | 机器学习能否可行的两个问题和假设函数集中假设函数的数量$M$的关系:
16 |
17 | 
18 |
19 | ### 有效假设函数
20 |
21 | 假设函数集中往往存在很多“冗余”(类似)假设函数,可以将相近的假设函数归为一类,当做一个假设函数,从而有效减少假设函数集中假设函数数量。
22 |
23 | - $\mathcal{H}(x_1,x_2,...,x_N)$:针对数据集$x_1,x_2,...,x_N$,$\mathcal{H}$中存在的有效假设函数集。用$|\mathcal{H}|$表示该有效假设函数集中假设函数数量
24 | - 增长函数$m_{\mathcal{H}}(N)$定义:$m_{\mathcal{H}}(N)=max_{x_1,x_2,...,x_N\in \mathcal{X}}|\mathcal{H}(x_1,x_2,...,x_N)|$
25 | - shatter的定义(此处只针对二元分类问题):如果$m_{\mathcal{H}}(N)=2^N$,则称这N个输入能够被shatter
26 | - 断点的定义:如果k个输入不能被假设函数集shatter,则称k为断点
27 |
28 | 从而可以将假设函数集对应的Hoeffding不等式表示为:
29 |
30 | 
31 |
32 | 如果我们能够获得$m_{\mathcal{H}}(N)=poly(N)$则当$N$足够大时能够有效
33 |
34 | ## 泛化理论
35 |
36 | ### 断点带来的约束
37 |
38 | 边界函数$B(N,k)$的定义:当断点为$k$时,对应的$m_{\mathcal{H}}(N)$的最大可能数目
39 |
40 | 由于存在断点,使得$m_{\mathcal{H}}(N)$随着$N$增大时,并不会急速增大:
41 |
42 | 
43 |
44 | ### 边界函数推导
45 |
46 | 对于边界函数$B(N,k)$,其任意$k$个数据均不能被shatter,因此显然$k$也是任意$N-1$个数据的断点,所以其$m(N-1)$最多有$B(N-1,k)$种假设函数(此处假设函数理解为$\{\times,\circ,...,\circ\}^2$中的一种),而这些假设函数又可以分为两大块:① 构成一块满足$B(N-1,k-1)$情况的假设函数集合 ②剩余部分的假设函数集合,即$B(N, k-1)-B(N-1,k-1)$。再增加一个数据从$N-1\to N$,则两大块情况不同:①中加入数据为$\times, \circ$皆可,因为本身断点为$k-1$,增加后最多断点变为$k$,②中则最多只能增加固定的一种可能(***具体证明还没想出来***) 。从而可以知道:
47 | $$
48 | B(N,k)=2\cdot B(N-1,k-1)+B(N-1,k)-B(N-1,k-1)\\
49 | =B(N-1,k-1)+B(N-1,k)=\sum_{i=0}^{k-1}\begin{pmatrix} N\\i \end{pmatrix}
50 | $$
51 | 证明:利用数学归纳法
52 |
53 | >① 当$N=1$时:若$k=1$,$B(1,1)=1=\begin{pmatrix}1\\0\end{pmatrix}$;若$k\ge2$,$B(1,k)=2=\begin{pmatrix}1\\0\end{pmatrix}+\begin{pmatrix}1\\1\end{pmatrix}$,均满足公式
54 | >② 假设$B(N,k)=\sum_{i=0}^{k-1}\begin{pmatrix} N\\i \end{pmatrix}$
55 | >③ 根据
56 | >
57 | >$$
58 | >B(N+1,k)=B(N,k)+B(N,k-1)=\sum_{i=0}^{k-1}\begin{pmatrix} N\\i \end{pmatrix}+\sum_{i=0}^{k-2}\begin{pmatrix} N\\i \end{pmatrix}=\sum_{i=0}^{k-1}\begin{pmatrix} N\\i \end{pmatrix}+\sum_{i=1}^{k-1}\begin{pmatrix} N\\i-1 \end{pmatrix}\\
59 | >=1+\sum_{i=1}^{k-1}(\begin{pmatrix} N\\i-1 \end{pmatrix}+\begin{pmatrix} N\\i \end{pmatrix})=1+\sum_{i=1}^{k-1}\begin{pmatrix} N+1\\i \end{pmatrix}=\sum_{i=0}^{k-1}\begin{pmatrix} N+1\\i \end{pmatrix}
60 | >$$
61 | >从而得证
62 |
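一个简单的数值核对示例(非原讲义内容,仅作演示):用递推式计算$B(N,k)$,并与$\sum_{i=0}^{k-1}\binom{N}{i}$比较:

```python
from math import comb
from functools import lru_cache

@lru_cache(None)
def B(N, k):
    if k == 1:
        return 1                          # 断点为1:任何一点都不能被shatter
    if N == 1:
        return 2                          # k>=2时,单个点可取+1/-1两种
    return B(N-1, k) + B(N-1, k-1)        # 递推式(此处取等号)

for N in range(1, 11):
    for k in range(1, 6):
        assert B(N, k) == sum(comb(N, i) for i in range(k))
print('B(N,k) = sum_{i=0}^{k-1} C(N,i) 在测试范围内成立')
```
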
63 | ### VC bound
64 |
65 | 首先直接给出最终结论:
66 |
67 | 
68 |
69 |
70 |
71 | 主要分以下三个步骤进行证明(具体参照LFD这本书,不感兴趣可以忽略不看)
72 |
73 | ------
74 |
75 | **步骤①** 用$E_{in}^\prime$替代$E_{out}$
76 |
77 | 因为之前的$m(N)$均是建立在$N$的基础上进行分析的,因此无法直接运用于$E_{out}$,所以考虑用一个由虚拟的数据集$D^\prime$对应的$E_{in}^\prime$来取代$E_{out}$,替换的结果为:
78 |
79 | 
80 |
81 | 直观性解释:
82 | 
83 |
84 | 红色区域为$|E_{in}-E_{out}|\ large$的区域,绿色区域为$|E_{in}^\prime-E_{in}|\ large$的区域,而$E_{in}^\prime$取到绿色的区域$\approx1/2$,从而可以知道$P[|E_{in}-E_{out}|\ large]\approx2P[|E_{in}^\prime-E_{in}|\ large]$
85 |
86 | 证明:大前提---假设$P[sup_{h\in\mathcal{H}}|E_{in}(h)-E_{out}(h)|\gt\epsilon]\gt0$
87 | ① 其中(A.1)运用了$A\ and\ B\subset A\to P(A)\ge P(A\ and\ B)$
88 | 
89 |
90 | ② 主要来考虑上式中的后一项,固定数据集$D$,令$h^*$表示任意一个使得$|E_{in}(h^*)-E_{out}(h^*)|\gt \epsilon$成立的假设函数,因为$E_{in}$是关于$D$的,而$E_{in}^\prime$是关于$D^\prime$的,所以$h^*$与$D$有关而与$D^\prime$无关。
91 | 
92 |
93 | A.2:当$|E_{in}(h^\star)-E_{in}^\prime(h^\star)|\gt \epsilon/2$成立时,显然有$sup_{h\in\mathcal{H}}|E_{in}(h)-E_{in}^\prime(h)|\gt \epsilon/2$
94 | A.3:由$|E_{in}(h^\star)-E_{out}(h^\star)|\gt \epsilon$和$|E_{in}^\prime(h^\star)-E_{out}(h^\star)|\le \epsilon/2$两个不等式可以导出$|E_{in}(h^\star)-E_{in}^\prime(h^\star)|\gt \epsilon/2$(即如果A.3成立,则A.2必然成立),背后的公式即:$|A|\gt a,\ |B|\le a/2\to |A-B|\ge|A|-|B|\gt a/2$
95 | A.4:由于$h^*$是某个固定$D$情况下满足条件的假设函数,因此条件$sup_{h\in\mathcal{H}}|E_{in}(h)-E_{out}(h)|\gt\epsilon$满足条件的全部$h$对于$|E_{in}^\prime(h^\star)-E_{out}(h^\star)|\le \epsilon/2$均成立。所以等价于$\sum P[h_i]P[|E_{in}^\prime(h_i)-E_{out}(h_i)|\le \epsilon/2]$。这些$\sum P[h_i]=1$。因此可以直接使用Hoeffding不等式,其中采用了$P[A
--------------------------------------------------------------------------------
/lecture/MLF3-1.md:
--------------------------------------------------------------------------------
10 |
11 | ## 机器学习基石作业3
12 |
13 | ### 问题1
14 |
15 | Q1~Q2:关于线性回归问题中$E_{in}$和$E_{out}$的理解
16 | 关于含有噪声的目标$y=w^T_fx+\epsilon$,其中的噪声$\epsilon$均值为0,方差为$\sigma^2$,且相互独立。根据PPT上的讲解可知,闭式解$w_{lin}$的$E_{in}$为:
17 | $$
18 | E_{in}(w_{lin})=\frac{1}{N}||\mathbb{y}-\hat{\mathbb{y}}||^2=\frac{1}{N}||(I-XX^{\dagger})\mathbb{y}||^2=\frac{1}{N}||(I-H)\mathbb{y}||^2
19 | $$
20 | 
21 |
22 | 从上图可知,$(I-H)\mathbb{y}=(I-H)noise$(这是基于只有$f(x)$含有噪声,$x$不含噪声的前提),从而问题转换为$E_{in}(w_{lin})=\frac{1}{N}||(I-H)noise||^2$。利用$I-H$的对称性和幂等性(即$(I-H)^T(I-H)=I-H$,证明见Q2)可得:
23 | $$
24 | ||(I-H)noise||^2=noise^T(I-H)^T(I-H)noise=noise^T(I-H)noise
25 | $$
26 | 再根据$H$的性质(具体证明见Q2)可得:
27 | $$
28 | trace((I-H)^T(I-H))=trace(I-H)=N-(d+1)
29 | $$
30 | 由于噪声各分量均值为0、方差为$\sigma^2$且相互独立,对噪声取期望有$\mathbb{E}[noise^T(I-H)noise]=\sigma^2\cdot trace(I-H)$。
31 | 所以,综上所述可得:
32 | $$
33 | \mathbb{E}_D[E_{in}(w_{lin})]=(1-\frac{d+1}{N})\sigma^2
34 | $$
35 | Q1:当$\sigma=0.1,\ d=8$时,使得$E_{in}$的数学期望$\ge0.008$的样本数是多少?(从下述选项中选择满足条件情况下最小的)
36 | a. 500 b. 25 c. 100 d. 1000 e. 10
37 |
38 | A1:$E_{in}$的数学期望为:
39 | $$
40 | \mathbb{E}_D[E_{in}(w_{lin})]=(1-\frac{d+1}{N})\sigma^2
41 | $$
42 | 即相当于$(1-9/N)\times0.01\ge 0.008\to N\ge45$,因此在给定选项中选择$N=100$
43 |
44 | Q2:针对hat matrix$H=X(X^TX)^{-1}X^T$性质的探究,以下哪些性质是$H$所具有的?
45 | a. $H$是一个对称矩阵 b. $H$是一个幂等矩阵 c. $H$是一个半正定矩阵 d. $H$总是可逆
46 | e. $H$存在大于1的特征值 f. $H$有d+1个特征值为1
47 |
48 | A2:首先给出结论:$H$是①对称 ②幂等性 ③半正定 ④有d+1个特征值为1
49 |
50 | 证明:
51 | ①对称性:$H^T=(X(X^TX)^{-1}X^T)^T=X((X^TX)^T)^{-1}X^T=X(X^TX)^{-1}X^T=H$(其中用到$(A^{-1})^T=(A^T)^{-1}$,这条式子以可逆作为先决条件)
52 | ②幂等性:$H^2=X(X^TX)^{-1}X^TX(X^TX)^{-1}X^T=X(X^TX)^{-1}X^T=H$
53 | ③半正定:假设存在特征值和特征向量$Hw=\lambda w$,则$\lambda w=Hw=H^2w=\lambda Hw=\lambda^2w$,从而$\lambda^2w=\lambda w\to \lambda=0\ or \ 1$,所以全部特征值均$\ge 0$
54 | ④存在d+1个特征值为1:$trace(H)=trace(X(X^TX)^{-1}X^T)=trace((X^TX)^{-1}X^TX)=trace(I_{d+1\times d+1})=d+1$,又根据$trace(H)=\sum\lambda_i$(该项需要半正定和对称性作为条件,具体证明可见PRML),从而可知$\lambda=1$对应有$d+1$个。
55 |
56 | 这些结论可以用来证明$trace(I-H)=N-(d+1)$具体可自行推导
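
下面给出一个简单的数值验证示例(非原答案内容,仅作演示),随机数据规模$N=50,\ d=8$均为假设取值,用numpy核对$H$的上述性质:

```python
import numpy as np

np.random.seed(0)
N, d = 50, 8                                        # 假设的数据规模
X = np.c_[np.ones((N, 1)), np.random.randn(N, d)]
H = X.dot(np.linalg.inv(X.T.dot(X))).dot(X.T)

print(np.allclose(H, H.T))                          # ① 对称
print(np.allclose(H.dot(H), H))                     # ② 幂等
eig = np.linalg.eigvalsh(H)
print(np.all(eig > -1e-8))                          # ③ 半正定(特征值均>=0)
print(int(np.sum(np.isclose(eig, 1))), d + 1)       # ④ 特征值为1的个数 = d+1
print(np.trace(np.eye(N) - H), N - (d + 1))         # trace(I-H) = N-(d+1)
```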
57 |
58 | ### 问题2
59 |
60 | Q3~Q5:主要考察损失判据和随机梯度下降
61 |
62 | 存在以下几种损失判据函数:
63 | a. $err(w)=(-yw^Tx)$
64 | b. $err(w)=(max(0,1-yw^Tx))^2$
65 | c. $err(w)=max(0,-yw^Tx)$
66 | d. $err(w)=\theta(-yw^Tx)$
67 |
68 | 
69 |
70 | Q3:上述损失判据中,哪个是0/1判据$sign(w^Tx)\ne y$的上界($y\in \{-1,+1\}$)?
71 |
72 | A3:通过上述图像易知,errb即$err(w)=(max(0,1-yw^Tx))^2$为0/1判据的上界
73 |
74 | Q4:上述损失判据中,哪个并不是处处可微分的?
75 |
76 | A4:从图中也容易看出,errc即$err(w)=max(0,-yw^Tx)$在0处不可微分
77 |
78 | Q5:对上述损失判据计算SGD(忽略不可微分的情况),哪个损失判据恰好是PLA中采用的(即其求梯度函数恰好为PLA中更新参数时用到的)?
79 |
80 | A5:由PPT2可知,PLA的参数更新方程为$w\gets w+yx\ \ if\ sign(w^Tx)\neq y$,因此可以等价为$yw^Tx\gt 0,\ \nabla E=0,\ \ \ \ yw^Tx\lt 0,\ \nabla E=-yx$,从而满足此情况的为errc即$err(w)=max(0,-yw^Tx)$
81 |
82 | ### 问题3
83 |
84 | Q6~Q10:主要考查二元情况下的导数和二阶泰勒展开
85 |
86 | 一阶导数:$\nabla f(x,y)=[\frac{\partial f}{\partial x},\frac{\partial f}{\partial y}]^T$
87 | 二阶导数:$\nabla^2 f(x,y)=[\frac{\partial^2 f}{\partial x^2},\frac{\partial^2 f}{\partial y\partial x}; \frac{\partial^2 f}{\partial y\partial x},\frac{\partial^2 f}{\partial y^2}]$
88 | 二阶泰勒展开:
89 | $$
90 | f(x+\Delta x, y+\Delta y)=f(x,y)+\Delta x\frac{\partial f(x,y)}{\partial x}+\Delta y\frac{\partial f(x,y)}{\partial y}\\
91 | +\frac{1}{2!}\big[(\Delta x)^2\frac{\partial^2 f(x,y)}{\partial x^2}+2\Delta x\Delta y\frac{\partial^2 f(x,y)}{\partial x\partial y}+(\Delta y)^2\frac{\partial^2 f(x,y)}{\partial y^2}\big]
92 | $$
93 | 主要针对表达式:
94 | $$
95 | E(u,v)=e^u+e^{2v}+e^{uv}+u^2-2uv+2v^2-3u-2v
96 | $$
97 | Q6:$\nabla E(u,v)$在$(u,v)=(0,0)$处的值时多少?
98 |
99 | A6:根据一阶导数可得:
100 | $$
101 | \frac{\partial E(u,v)}{\partial u}=e^u+ve^{uv}+2u-2v-3\\
102 | \frac{\partial E(u,v)}{\partial v}=2e^{2v}+ue^{uv}-2u+4v-2
103 | $$
104 | 从而将$(u,v)=(0,0)$代入可得$\nabla E(0,0)=(-2,0)$
105 |
106 | Q7:根据梯度下降算法(如下式所示),对参数进行迭代更新,求$\eta=0.01,(u_0,v_0)=(0,0)$经过五次迭代后的结果$(u_5,v_5)$和$E(u_5,v_5)$?
107 | $$
108 | (u_{t+1},v_{t+1})=(u_t,v_t)-\eta \nabla E(u_t,v_t)
109 | $$
110 | A7:直接通过简单的循环程序可解得$u= 0.094,v= 0.00179,E=2.8250$(代码见下)
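
下面给出与上述结果对应的一个最简梯度下降示例(非原答案代码,仅作演示):

```python
import numpy as np

def E(u, v):
    return np.exp(u) + np.exp(2*v) + np.exp(u*v) + u**2 - 2*u*v + 2*v**2 - 3*u - 2*v

def gradE(u, v):
    du = np.exp(u) + v*np.exp(u*v) + 2*u - 2*v - 3
    dv = 2*np.exp(2*v) + u*np.exp(u*v) - 2*u + 4*v - 2
    return np.array([du, dv])

u, v, eta = 0.0, 0.0, 0.01
for t in range(5):                      # 迭代5次
    du, dv = gradE(u, v)
    u, v = u - eta*du, v - eta*dv
print(u, v, E(u, v))                    # 约为 0.094, 0.0018, 2.825
```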
111 |
112 | Q8:如果我们采用二阶泰勒展开$\hat{E}_2(\Delta u,\Delta v)$来近似$E(u+\Delta u, v+\Delta v)$,求下述表达式中的参数在$(u,v)=(0,0)$处的值?
113 | $$
114 | \hat{E}_2(\Delta u,\Delta v)=b_{uu}(\Delta u)^2+b_{vv}(\Delta v)^2+b_{uv}(\Delta u)(\Delta v)+b_u\Delta u+b_v\Delta v+b
115 | $$
116 | A8:根据二阶导数的情况:
117 | $$
118 | \frac{\partial^2 E}{\partial u^2}=e^u+v^2e^{uv}+2\\
119 | \frac{\partial^2 E}{\partial v^2}=4e^{2v}+u^2e^{uv}+4\\
120 | \frac{\partial^2 E}{\partial u\partial v}=e^{uv}+vue^{uv}-2
121 | $$
122 | 将这些一阶和二阶导数代入二阶泰勒展开,最后可得参数为$(1.5,4,-1,-2,0,3)$
123 |
124 | Q9:将Hessian矩阵表示为$\nabla^2E(u,v)$,并假设该Hessian矩阵为正定的。以下哪个时最佳的$(\Delta u,\Delta v)$使得$\hat E_2$取到最小值?(这个方向称之为Newton Direction)
125 |
126 | A9:找使得$\hat E_2$最小的$(\Delta u,\Delta v)$,可以直接通过求导获得。分别对$\Delta u$和$\Delta v$求导,结果如下:
127 | $$
128 | \frac{\partial E_2}{\partial \Delta u}=\frac{\partial E}{\partial u}+\Delta u\frac{\partial^2 E}{\partial u^2}+\Delta v\frac{\partial^2 E}{\partial u\partial v}=0\\
129 | \frac{\partial E_2}{\partial \Delta v}=\frac{\partial E}{\partial v}+\Delta v\frac{\partial^2 E}{\partial v^2}+\Delta u\frac{\partial^2 E}{\partial u\partial v}=0
130 | $$
131 | 联立上述两式,并将$(\Delta u,\Delta v)$以向量形式提取出来可以化简为:
132 | $$
133 | (\Delta u,\Delta v)^T=-(\nabla^2E)^{-1}\nabla E
134 | $$
135 | Q10:对$(u_0,v_0)$利用Newton Direction(无$\eta$)进行参数更新,五轮更新后$(u_5,v_5)$和$E$结果为?
136 |
137 | A10:通过简单的程序可得$u= 0.6118,v= 0.0705,E=2.3608$
138 | 从该结果可见,采用Newton Direction进行参数更新收敛更快,但代价是需要计算Hessian矩阵并求逆(示例见下)
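
下面给出对应的Newton Direction更新示例(非原答案代码,仅作演示;$E$、梯度与Hessian的表达式均来自上文):

```python
import numpy as np

E = lambda u, v: np.exp(u)+np.exp(2*v)+np.exp(u*v)+u**2-2*u*v+2*v**2-3*u-2*v
grad = lambda u, v: np.array([np.exp(u)+v*np.exp(u*v)+2*u-2*v-3,
                              2*np.exp(2*v)+u*np.exp(u*v)-2*u+4*v-2])
hess = lambda u, v: np.array([[np.exp(u)+v**2*np.exp(u*v)+2, np.exp(u*v)+u*v*np.exp(u*v)-2],
                              [np.exp(u*v)+u*v*np.exp(u*v)-2, 4*np.exp(2*v)+u**2*np.exp(u*v)+4]])

w = np.zeros(2)
for t in range(5):                                   # 牛顿方向更新5轮
    w = w - np.linalg.solve(hess(*w), grad(*w))      # (du,dv) = -(Hessian)^{-1} grad
print(w, E(*w))                                      # 约为 (0.61, 0.07), E约为2.36
```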
139 |
140 | ### 问题4
141 |
142 | Q11~Q12:关于特征转换的问题
143 |
144 | Q11:考虑二维空间上的6个点:$x_1=(1,1),x_2=(1,-1),x_3=(-1,-1),x_4=(-1,1),x_5=(0,0),x_6=(1,0)$,采用"二次曲线与直线的并集"作为分类边界的hypotheses,最多能shatter其中几个点?
145 |
146 | A11:二次曲线与直线组合的"最强形态"对应特征转换$(1,x_1,x_2,x_1^2,x_1x_2,x_2^2)$,将这6个点转换到该6维空间上,可获得如下矩阵(每行对应一个点)。该矩阵为满秩矩阵,因此这6个点能被shatter。
147 |
148 | [[ 1 1 1 1 1 1]
149 | [ 1 1 -1 1 -1 1]
150 | [ 1 -1 -1 1 1 1]
151 | [ 1 -1 1 1 -1 1]
152 | [ 1 0 0 0 0 0]
153 | [ 1 1 0 1 0 0]]
154 |
155 | Q12:假设转换之前预先“偷看”了所有$N$个数据,并定义一种特殊的特征转换,将$x\in \mathbb{R}^d\to z\in\mathbb{R}^N$
156 | $$
157 | (\Phi(x))_n=z_n=[x=x_n]
158 | $$
159 | 利用线性分类器对转换后的特征进行处理,求$d_{vc}(H_{\Phi})$
160 |
161 | A12:这题的关键在于理解这种“奇葩”的特征转换,举个例子,如第1个数据$x_1$,根据上述规则则变为$[1,0,...,0]^T$,(矩阵大小$N\times1$)就是将第几个数对应的行置为1,其他行均为0。显然,不管多少数,其转换后的向量均是线性无关的,因此均可以被shatter,所以$d_{vc}(H_\Phi)=\infty$
162 |
163 |
--------------------------------------------------------------------------------
/lecture/MLF3-1/Q1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3-1/Q1.png
--------------------------------------------------------------------------------
/lecture/MLF3-1/Q3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3-1/Q3.png
--------------------------------------------------------------------------------
/lecture/MLF3-2.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习基石作业3:part2
3 | date: 2017-02-09 21:56:45
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习基石课后作业3-2:对应题目13~题目20
9 |
10 |
11 | ## 机器学习基石作业3
12 |
13 | ## 问题5
14 |
15 | Q13~Q15:主要关于线性回归问题和特征转换。
16 |
17 | 数据产生:数据集大小$N=1000$,且$\mathcal{X}=[-1,1]\times[-1,1]$,每个数据的$\mathbb{x}$均等概率的从$\mathcal{X}$中提取。而对应的$y$则根据$f(x_1,x_2)=sign(x_1^2+x_2^2-0.6)$来确定,且对数据集中的$10\%$的数据的$y$进行反转。
18 |
19 | 先对线性回归算法进行简单的说明
20 |
21 | ### 算法说明
22 |
23 | 函数集:$y=w^T\mathbb{x}$
24 |
25 | 损失函数:$E_{in}(w)=\frac{1}{N}\sum_{n=1}^N(w^T\mathbb{x}_n-y_n)^2$
26 |
27 | 梯度:$\nabla E_{in}(w)=\frac{2}{N}(X^TXw-X^T\mathbb{y})$
28 |
29 | “目的”:寻找$w$使得损失函数最小
30 |
31 | Linear Regression
32 |
33 | >①获得数据$(\mathbb{x}_1,y_1),...,(\mathbb{x}_N,y_N)$
34 | >②采用闭式解公式求出最佳$w$:$w_{lin}=(X^TX)^{-1}X^T\mathbb{y}$
35 | >③返回$w_{lin}$
36 | >
37 | >如果还有预测过程,直接$\hat{y}=w_{lin}^Tx$
38 |
39 | ### 算法实现
40 |
41 | ```python
42 | theta = lin.pinv(X.T.dot(X)).dot(X.T).dot(Y)
43 | ```
44 |
45 | ### 具体问题
46 |
47 | 数据产生函数:
48 |
49 | ```python
50 | # 数据生成函数
51 | def generateData(num):
52 | axeX = np.random.uniform(-1, 1, num)
53 | axeY = np.random.uniform(-1, 1, num)
54 | Xtemp = np.c_[axeX, axeY]
55 | X = np.c_[np.ones((num, 1)), Xtemp]
56 | Ytemp = np.sign(np.power(axeX, 2)+np.power(axeY, 2)-0.6)
57 | Ytemp[Ytemp == 0] = -1
58 | pos = np.random.permutation(num)
59 | Ytemp[pos[0: round(0.1*num)]] *= -1
60 | Y = Ytemp.reshape((num, 1))
61 | return X, Y
62 | ```
63 | Q13:不进行特征转换,只采用特征$(1, x_1,x_2)$,利用Linear Regression获得最佳的$w_{lin}$。将其直接运用到分类问题上面(利用$sign(w^Tx)$),再利用$0/1$判据来衡量训练样本误差$E_{in}$。进行1000次实验,取误差的平均。
64 |
65 | A13:通过下面的代码来实现:
66 |
67 | ```python
68 | totalerr = 0
69 | for i in range(1000):
70 | X, Y = generateData(1000)
71 | theta = lin.pinv(X.T.dot(X)).dot(X.T).dot(Y)
72 | ypred = np.sign(X.dot(theta))
73 | err = np.sum(ypred!=Y)/1000
74 | totalerr += err
75 | print('Ein: ', totalerr/1000)
76 | ```
77 |
78 | Ein: 0.503646
79 |
80 | 通过上面结果可知,直接利用Linear Regression(利用square error)再运用到分类问题上结果很差!
81 |
82 | Q14~Q15:将数据的特征进行转换,转换为$(1,x_1,x_2,x_1x_2,x_1^2,x_2^2)$这6项,再利用Linear Regression获得最佳的$w_{lin}$,求该$w_{lin}$以及将其运用到测试集上的测试误差$E_{out}$(衡量方式与Q13相同)
83 |
84 | A14~A15:特征转换函数如下
85 |
86 | ```python
87 | # 特征转换函数
88 | def transform(X):
89 | row, col = X.shape
90 | Xback = np.zeros((row, 6))
91 | Xback[:, 0:col] = X
92 | Xback[:, col] = X[:, 1]*X[:, 2]
93 | Xback[:, col+1] = X[:, 1]**2
94 | Xback[:, col+2] = X[:, 2]**2
95 | return Xback
96 | ```
97 | 问题的具体代码如下:
98 |
99 | ```python
100 | # Q14
101 | totalerr = 0
102 | for i in range(1000):
103 | X, Y = generateData(1000)
104 | Xtran = transform(X)
105 | theta = lin.pinv(Xtran.T.dot(Xtran)).dot(Xtran.T).dot(Y)
106 | Xtest, Ytest = generateData(1000)
107 | Xback = transform(Xtest)
108 | ypred = np.sign(Xback.dot(theta))
109 | err = np.sum(ypred!=Ytest)/1000
110 | totalerr += err
111 | print('theta: ', theta.T)
112 | print('Eout: ', totalerr/1000)
113 | ```
114 |
115 | theta: [[-1.01626639 0.07325707 0.02834912 -0.0155599 1.63387468 1.52477431]]
116 | Eout: 0.12608
117 | 需要指出的是,Q14中给出的选项中最接近的为:
118 | $$
119 | g(x_1,x_2)=sign(-1-0.05x_1+0.08x_2+0.13x_1x_2+1.5x_1^2+1.5x_2^2)
120 | $$
121 |
122 | ## 问题6
123 |
124 | Q16~Q17:关于多类别logistic regression问题。针对K类别分类问题,我们定义输出空间$\mathcal{Y}=\{1,2,...,K\}$,MLR的函数集可以视为由一系列(K个)权值向量$(w_1,...,w_K)$构成,其中每个权值向量均为$d+1$维。每种假设函数可以表示为:
125 | $$
126 | h_y(x)=\frac{exp(w^T_y\mathbb{x})}{\sum_{i=1}^Kexp(w_i^T\mathbb{x})}
127 | $$
128 | 且可以用来近似潜在的目标分布函数$P(y|\mathbb{x})$。MLR的"目标"就是从假设函数集中寻找使得似然函数最大的假设函数。
129 |
130 | Q16:类似Lec10中最小化$-log(likelihood)$一样,推导$E_{in}(w_1,...,w_K)$
131 |
132 | A16:采用同样的处理方式
133 | $$
134 | max\ \prod_{n=1}^Nh_{y_n}(\mathbb{x}_n)\to min\ -\frac{1}{N}\sum_{n=1}^Nln(h_{y_n}(\mathbb{x}_n))
135 | $$
136 | 将MLR的假设函数代入上式并化简可得:
137 | $$
138 | \frac{1}{N}\sum_{n=1}^N\big(ln(\sum_{i=1}^Kexp(w_i^T\mathbb{x}_n))-w^T_{y_n}\mathbb{x}_n\big)
139 | $$
140 | Q17:针对上述的$E_{in}$,它的一阶导数$\nabla E_{in}$可以表示为$\big(\frac{\partial E_{in}}{\partial w_1},\frac{\partial E_{in}}{\partial w_2,},...,\frac{\partial E_{in}}{\partial w_K}\big)$,求$\frac{\partial E_{in}}{\partial w_i}$。
141 |
142 | A17:直接对A16的答案的式子进行求导,就可以得到下式:
143 | $$
144 | \frac{1}{N}\sum_{n=1}^N\big(h_i(\mathbb{x}_n)-[y_n=i]\big)\mathbb{x}_n
145 | $$
146 |
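下面给出一个数值梯度核对示例(非原答案内容,仅作演示),其中数据规模$N=20,d=3,K=4$均为假设取值,用有限差分验证A16的$E_{in}$与A17的解析梯度一致:

```python
import numpy as np

np.random.seed(0)
N, d, K = 20, 3, 4                                     # 假设的小规模数据
X = np.c_[np.ones((N, 1)), np.random.randn(N, d)]
y = np.random.randint(0, K, N)
W = 0.1*np.random.randn(d+1, K)                        # 每一列对应一个 w_i

def Ein(W):
    S = X.dot(W)                                       # N x K 的打分矩阵
    return np.mean(np.log(np.exp(S).sum(axis=1)) - S[np.arange(N), y])

def gradEin(W):
    S = X.dot(W)
    H = np.exp(S) / np.exp(S).sum(axis=1, keepdims=True)   # h_i(x_n)
    H[np.arange(N), y] -= 1                                # 减去 [y_n = i]
    return X.T.dot(H) / N

num = np.zeros_like(W); eps = 1e-6
for p in range(W.size):                                # 有限差分数值梯度
    dW = np.zeros_like(W); dW.flat[p] = eps
    num.flat[p] = (Ein(W + dW) - Ein(W - dW)) / (2*eps)
print(np.max(np.abs(num - gradEin(W))))                # 应接近0
```
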
147 | ## 问题7
148 |
149 | Q18~Q20:关于logistic regression实现的问题
150 |
151 | ### 算法说明
152 |
153 | 函数集:$s=\sum_{i=0}^dw_ix_i$,$h(\mathbb{x})=\theta(s)=\frac{1}{1+e^{-s}}$
154 |
155 | 损失函数:$E_{in}(w)=\frac{1}{N}\sum_{n=1}^Nln(1+exp(-y_nw^T\mathbb{x}_n))$
156 |
157 | 梯度:$\nabla E_{in}=\frac{1}{N}\sum_{n=1}^N\theta\big(-y_nw^T\mathbb{x}_n\big)(-y_n\mathbb{x}_n)$
158 |
159 | “目的”:寻找一个最佳假设函数使得损失函数最小
160 |
161 | (注:用$h(\mathbb{x})$来近似$P(y|\mathbb{x})$,上述损失函数可通过cross-entropy推导出来)
162 |
163 | Logistic Regression:
164 |
165 | >初始化$w$
166 | >For t=0,1,...
167 | > ① 计算$\nabla E_{in}(w)$
168 | > ② 更新参数:$w\gets w-\eta\nabla E_{in}(w)$
169 | >返回$w$
170 |
171 | (上述$\eta$可以视为一个超参数,可以通过cross-validation来确定)
172 |
173 | ### 算法实现
174 |
175 | ```python
176 | # sigmoid函数
177 | def sigmoid(z):
178 | zback = 1/(1+np.exp(-1*z))
179 | return zback
180 | ```
181 | ```python
182 | # Logistic Regression
183 | def logisticReg(X, Y, eta, numiter, flag=0):
184 | row, col = X.shape
185 | theta = np.zeros((col, 1))
186 | num = 0
187 | for i in range(numiter):
188 | if flag == 0:
189 | derr = (-1*X*Y).T.dot(sigmoid(-1*X.dot(theta)*Y))/row
190 | else:
191 | if num >= row:
192 | num = 0
193 | derr = -Y[num, 0]*X[num: num+1, :].T*sigmoid(-1*X[num, :].dot(theta)[0]*Y[num, 0])
194 | num += 1
195 | theta -= eta*derr
196 | return theta
197 | ```
198 | ### 具体问题
199 |
200 | 数据导入模块
201 |
202 | ```python
203 | # 导入数据函数
204 | def loadData(filename):
205 | data = pd.read_csv(filename, sep='\s+', header=None)
206 |     data = data.values      # 新版pandas已移除as_matrix(),改用values
207 | col, row = data.shape
208 | X = np.c_[np.ones((col, 1)), data[:, 0: row-1]]
209 | Y = data[:, row-1:row]
210 | return X, Y
211 | ```
212 | 错误率计算模块
213 |
214 | ```python
215 | # 误差计算函数
216 | def mistake(X, Y, theta):
217 | yhat = X.dot(theta)
218 | yhat[yhat > 0] = 1
219 | yhat[yhat <= 0] = -1
220 | err = np.sum(yhat != Y)/len(Y)
221 | return err
222 | ```
223 | Q18:针对$\eta=0.001,\ T=2000$的情况,采用梯度下降法获得$w$后,在测试集上的错误率是多少?(利用0/1判据)
224 |
225 | A18:
226 |
227 | ```python
228 | # Q18
229 | X, Y = loadData('hw3_train.dat'); Xtest, Ytest = loadData('hw3_test.dat')   # 导入训练/测试数据(假设数据文件在当前目录)
eta = 0.001; T = 2000; flag = 0
230 | theta = logisticReg(X, Y, eta, T, flag)
231 | errin = mistake(X, Y, theta)
232 | errout = mistake(Xtest, Ytest, theta)
233 | print('Ein = ', errin,'Eout = ', errout)
234 | ```
235 |
236 | Ein = 0.466 Eout = 0.475
237 |
238 | Q19:针对$\eta=0.01,\ T=2000$的情况,采用梯度下降法获得$w$后,在测试集上的错误率是多少?(利用0/1判据)
239 |
240 | A19:
241 |
242 | ```python
243 | # Q19
244 | eta = 0.01; T = 2000; flag = 0
245 | theta = logisticReg(X, Y, eta, T, flag)
246 | errin = mistake(X, Y, theta)
247 | errout = mistake(Xtest, Ytest, theta)
248 | print('Ein = ', errin,'Eout = ', errout)
249 | ```
250 |
251 | Ein = 0.197 Eout = 0.22
252 |
253 | Q20:针对$\eta=0.001,\ T=2000$的情况,采用随机梯度下降法(此处采用按顺序每次选择元素,更通常的做法是随机选择元素)获得$w$后,在测试集上的错误率是多少?(利用0/1判据)
254 |
255 | A20:
256 |
257 | ```python
258 | # Q20
259 | eta = 0.001; T = 2000; flag = 1
260 | theta = logisticReg(X, Y, eta, T, flag)
261 | errin = mistake(X, Y, theta)
262 | errout = mistake(Xtest, Ytest, theta)
263 | print('Ein = ', errin,'Eout = ', errout)
264 | ```
265 |
266 | Ein = 0.464 Eout = 0.473
267 |
268 |
--------------------------------------------------------------------------------
/lecture/MLF3.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习基石Lec9-Lec12
3 | date: 2017-02-12 15:43:58
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习基石Lec9-Lec12主要知识点:对应作业3
9 |
10 |
11 | ## 线性回归
12 |
13 | ### 线性回归基础
14 |
15 | ① 误差定义:
16 | 
17 | ② 优化目标和其对应梯度
18 | 
19 | 往往直接通过$\nabla E_{in}=0$来获得最佳的参数$w_{lin}=(X^TX)^{-1}X^TY$
20 |
21 | ### 线性回归算法
22 |
23 | 
24 |
25 | ### 泛化能力分析
26 |
27 | 针对$E_{in}$和$E_{out}$的数学期望如下(具体推导见作业):
28 | 
29 | 可见,当$N\to\infty$时,两者都逼近$\sigma^2$
30 |
31 | ### 基于线性回归的二分类问题
32 |
33 | 易知$err_{0/1}\le err_{sqr}$,($err_{0/1}=[sign(w^Tx)\ne y]$,$err_{sqr}=(w^Tx-y)^2$)从而有下面的关系:
34 | 
35 | 可见,线性回归是二元分类问题的上界。但线性回归求参数$w$非常快(因为直接闭式解),从而往往将其作为分类器的初始值
36 |
37 | ## Logistic回归
38 |
39 | ### Logistic函数
40 |
41 | 令$s=w^Tx$:
42 | 
43 |
44 | ### Logistic回归基础
45 |
46 | 目标函数和梯度函数(目标函数通过cross-entropy获得)
47 | 
48 | 由于梯度闭式解不能直接求出,又由于$\nabla E_{in}$是关于$w$的凸函数,所以采用梯度下降法来更新参数
49 |
50 | ### Logistic回归算法
51 |
52 | 
53 | 其中参数$\eta$作为一个超参数,需自己设定(可以通过实验等手段来寻找最佳$\eta$)
54 |
55 | ### 梯度下降法介绍
56 |
57 | 梯度下降法的核心思想在于“一阶泰勒展开的运用”:具体的推导过程见下图
58 | 
59 |
60 | ## 基于线性模型的分类
61 |
62 | ### 三种线性模型比较
63 |
64 | ① 各自具体函数形式及对应的损失判据(其中$s=w^Tx$):
65 | 
66 | ② 对应的损失函数图:
67 | 
68 | 上述情况可见:当惩罚项$\Omega$在可接受范围内时,$E_{in}^{CE}\ small\to E_{out}^{CE}\ small\to E_{out}^{0/1}\ small$,从而说明利用Logistic回归进行分类是可行的。
69 |
70 | ### 随机梯度下降法
71 |
72 | 真实情况下的梯度为:$\frac{1}{N}\sum_{n=1}^N\nabla err(w,x_n,y_n)=\mathcal{E}_{n}\nabla err(w,x_n,y_n)$。为了增加运算速度,采用随机选取一个数据$m$,用$\nabla err(w,x_m,y_m)$来代替上述的梯度。
73 |
74 | - 优点:简单,速度更快。且在数据量很大和在线学习(online learning)中非常有用
75 | - 缺点:相比原本的梯度下降法,稳定性要差一些(这也是实践中更常用mini-batch SGD的原因);下面给出一个最简的SGD示例
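
一个最简的SGD示意(非原讲义内容,数据为随机生成的假设数据,仅作演示):每次随机抽取一个样本,用单点梯度代替平均梯度来更新logistic回归参数:

```python
import numpy as np

np.random.seed(0)
N, d = 200, 3                                           # 假设的数据规模
X = np.c_[np.ones((N, 1)), np.random.randn(N, d)]
y = np.sign(X.dot(np.array([0.5, 1.0, -1.0, 0.5])) + 0.1*np.random.randn(N))
y[y == 0] = 1

sigmoid = lambda s: 1/(1 + np.exp(-s))

w, eta = np.zeros(d + 1), 0.1
for t in range(2000):
    n = np.random.randint(N)                            # 随机抽取一个样本
    w += eta * sigmoid(-y[n]*X[n].dot(w)) * y[n]*X[n]   # 单点梯度的负方向
print(np.mean(np.sign(X.dot(w)) != y))                  # 训练0/1误差
```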
76 |
77 | ### 多类别分类问题
78 |
79 | **方式1:一对多**
80 | 算法:
81 | 
82 | 特点:
83 | 
84 |
85 | **方式2:一对一**
86 | 算法:
87 | 
88 | 特点:
89 | 
90 |
91 | ## 非线性转换
92 |
93 | ### 处理非线性可分问题的步骤
94 |
95 | 
96 |
97 | ### 非线性转换的代价
98 |
99 | - 增加了$d_{VC}$,即增加了模型复杂度
100 | - 增加了计算量和存储空间,因为“自由参数”增加
--------------------------------------------------------------------------------
/lecture/MLF3/pic1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic1.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic10.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic11.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic11.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic12.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic12.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic13.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic13.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic14.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic14.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic15.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic15.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic16.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic16.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic18.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic18.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic2.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic3.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic4.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic5.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic6.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic7.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic8.png
--------------------------------------------------------------------------------
/lecture/MLF3/pic9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF3/pic9.png
--------------------------------------------------------------------------------
/lecture/MLF4-1.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习基石作业4:part1
3 | date: 2017-02-10 16:47:09
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习基石课后作业4-1:对应题目1~题目12
9 |
10 |
11 | ## 机器学习基石作业4
12 |
13 | ### 问题1
14 |
15 | 关于过拟合和Deterministic noise的问题。假设目标量为$y=f(\mathbb{x})+\epsilon\sim Gaussian(f(\mathbb{x}),\sigma^2)$
16 |
17 | noise主要分为两大类:① Stochastic noise --- 来自于外界的“干扰”,属于不可消除性noise ② Deterministic noise --- 来自于潜在目标函数$f$与假设函数集中最优函数$h^{\star}\in \mathcal{H}$的差异(也可以简单的说为假设函数集能力不够,达不到目标函数)。
18 |
19 | 
20 |
21 | 关于过拟合问题主要可从上述三幅图来思考。图(2)(3)$Q_f$代表目标函数的最高次幂为多少(假设目标函数为多项式),颜色区域的值代表$E_{out}(g_{10})-E_{out}(g_2)$(理解为复杂假设函数集-简单假设函数集训练后的误差)
22 |
23 | Q1:Deterministic noise取决于假设函数集$\mathcal{H}$,假设$\mathcal{H}'\subset \mathcal{H}$,且$f$固定,则一般而言,采用$\mathcal{H}'$取代$\mathcal{H}$,deterministic noise将如何变化?
24 |
25 | A1:Deterministic noise可以理解为假设函数集"逼近"目标函数能力的欠缺(逼近能力越强,该noise越小)。$\mathcal{H}'$比$\mathcal{H}$小,逼近能力一般更弱,所以一般而言,deterministic noise在此种情况下会增加
26 |
27 | ### 问题2
28 |
29 | Q2:多项式模型可以视为线性模型在$\mathcal{Z}$空间上的运用,即采用非线性转换$\Phi:\mathcal{X}\to\mathcal{Z}$,此问题中将标量$x$转换为向量$z$,采用的转换规则为$z=(1,L_1(x),L_2(x),...,L_Q(x))$(关于$L_i(x)$的详细表达式见PPT)。
30 | 将假设函数集表示为$H_Q=\big\{h|h(x)=w^Tz=\sum_{q=0}^{Q}w_qL_q(x)\big\}$,其中$L_0(x)=1$。定义$\mathcal{H}(Q,c,Q_0)=\{h|h(x)=w^Tz\in H_Q;w_q=c\ for\ q\ge Q_0\}$,则下述表达式中哪条是正确的?
31 | a. $\mathcal{H}(10,0,3)\cup\mathcal{H}(10,1,4)=\mathcal{H}_3$
32 | b. $\mathcal{H}(10,1,3)\cup\mathcal{H}(10,1,4)=\mathcal{H}_1$
33 | c. $\mathcal{H}(10,0,3)\cap\mathcal{H}(10,0,4)=\mathcal{H}_2$
34 | d. $\mathcal{H}(10,0,3)\cup\mathcal{H}(10,0,4)=\mathcal{H}_4$
35 |
36 | A2:该问题中$c=0$才有比较实际的意义,因此主要验证含有$c=0$的选项:$\mathcal{H}(10,0,3)$表示$w_q=0\ (q\ge3)$,即$\mathcal{H}_2$;同理$\mathcal{H}(10,0,4)$即$\mathcal{H}_3$;两者的交集为$\mathcal{H}_2$,因此(c)正确
37 |
38 | ### 问题3
39 |
40 | Q3:含权值惩罚项的正则项,具体表达式如下所示(其中$\lambda\gt 0$)
41 | $$
42 | E_{aug}(w)=E_{in}(w)+\frac{\lambda}{N}w^Tw
43 | $$
44 | 如果我们希望通过梯度下降法来更新参数,具体的参数更新表达式是什么?
45 |
46 | A3:$\nabla E_{aug}$:$\nabla E_{in}+\frac{2\lambda}{N}w$,从而不难获得参数更新规则如下为
47 | $$
48 | w(t+1)\gets \big(1-\frac{2\lambda\eta}{N}\big)w(t)-\eta\nabla E_{in}(w(t))
49 | $$
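
下面给出一个简单的示例(非原答案内容,数据与$\lambda,\eta$均为假设取值,仅作演示):按上述更新式在线性回归上做梯度下降,并与岭回归闭式解比较:

```python
import numpy as np

np.random.seed(0)
N, d, lamb, eta = 50, 3, 1.0, 0.01                  # 假设的数据规模与超参数
X = np.random.randn(N, d)
y = X.dot(np.array([1.0, -2.0, 0.5])) + 0.1*np.random.randn(N)

w = np.zeros(d)
for t in range(5000):
    gradEin = 2*X.T.dot(X.dot(w) - y)/N             # 平方误差E_in的梯度
    w = (1 - 2*lamb*eta/N)*w - eta*gradEin          # 含权值衰减项的更新式
wreg = np.linalg.solve(X.T.dot(X) + lamb*np.eye(d), X.T.dot(y))   # 闭式解
print(np.max(np.abs(w - wreg)))                     # 收敛后应非常小
```
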
50 | Q4:$w_{lin}$为普通线性回归问题的最优解,$w_{reg}$是加入了正则项之后的最优解,则两者之间有什么关系?
51 |
52 | A4:根据课程可知,正则项表达式本质上是通过下述表达式“推导”而来的(两者等价)
53 | $$
54 | min\ E_{in}(w)=\frac{1}{N}(Zw-y)^T(Zw-y)\ \ \ s.t.\ \ w^Tw\le C
55 | $$
56 | 从上式显然可见,若$w_{lin}$满足$w^Tw\le C$这个条件,则$w_{lin}=w_{reg}$;否则$\|w_{lin}\|\ge \|w_{reg}\|$,且随着$C$的减小(对应$\lambda$增大),可行域满足$\{w:w^Tw\le C_{小}\}\subset \{w:w^Tw\le C_{大}\}$。所以$\|w_{reg}(\lambda)\|$是关于$\lambda$的非增函数。
57 |
58 | ### 问题4
59 |
60 | 留一交叉验证背后的思想:可以在一定程度上反映$E_{out}$:$E_{loocv}(\mathcal{H},A)$的数学期望能够近似$E_{out}(g^-)$的数学期望,推导表达式如下(其中①$\mathcal{E}$表示数学期望②$D_n=D_{train},D_n+(x_n,y_n)=D$③$e_n=err(g_n^-(x_n),y_n)$):
61 | $$
62 | \mathcal{E}_{D}E_{loocv}(\mathcal{H},A)=\mathcal{E}_D\frac{1}{N}\sum_{n=1}^Ne_n=\frac{1}{N}\sum_{n=1}^N\mathcal{E}_De_n=\frac{1}{N}\sum_{n=1}^N\mathcal{E}_{D_n}\mathcal{E}_{(x_n,y_n)}err(g_n^-(x_n),y_n)\\
63 | =\frac{1}{N}\sum_{n=1}^N\mathcal{E}_{D_n}E_{out}(g^-_n)=\frac{1}{N}\sum_{n=1}^N\bar{E}_{out}(g^-_n)=\bar{E}_{out}(g^-_n)=\bar{E}_{out}(N-1)
64 | $$
65 | 上面式子中$\mathcal{E}_{(x_n,y_n)}err(g_n^-(x_n),y_n)=E_{out}(g_n^-)$是根据$E_{out}$的定义来的。
66 | 上式说明了可以用leave-one-out来近似$E_{out}(g)$(因为整体而言少了一个数据,但当N很大时,几乎可以忽略差别),但代价是运算成本开销增大。
67 |
68 | Q5:有三个数据点$(-1,0),(1,0),(\rho,1),\rho\ge0$,则对于两个模型:①$h_0(x)=b_0$ ②$h_1(x)=a_1x+b_1$,在$\rho$取何值时,两者的$E_{loocv}(in)$相等?
69 |
70 | 
71 |
72 | A5:直接通过$E=\frac{1}{3}(e_1+e_2+e_3)$来进行计算,容易发现$e_1$两者等价,且对于情况①$e_2=e_3=1/4$,所以只需情况②中$e_2+e_3=1/2$便可,从而问题可以转换为(线性方程求解代入等过程略):
73 | $$
74 | \big(\frac{2}{\rho+1}\big)^2+\big(\frac{-2}{\rho-1}\big)^2=\frac{1}{2}\to\rho=\sqrt{9+4\sqrt{6}}
75 | $$
76 |
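一个简单的数值核对示例(非原答案代码,仅作演示):在$\rho=\sqrt{9+4\sqrt 6}$处分别计算两个模型的留一交叉验证误差,两者应相等:

```python
import numpy as np

rho = np.sqrt(9 + 4*np.sqrt(6))
pts = np.array([[-1.0, 0.0], [1.0, 0.0], [rho, 1.0]])

def loocv(fit):
    errs = []
    for i in range(3):
        train = np.delete(pts, i, axis=0)        # 留出第i个点
        h = fit(train)
        x, y = pts[i]
        errs.append((h(x) - y)**2)
    return np.mean(errs)

fit_const = lambda tr: (lambda x, b=tr[:, 1].mean(): b)   # h0(x)=b0
def fit_line(tr):                                         # h1(x)=a1*x+b1
    a, b = np.polyfit(tr[:, 0], tr[:, 1], 1)
    return lambda x: a*x + b

print(loocv(fit_const), loocv(fit_line))         # 两者应相等(约0.5)
```
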
77 | ### 问题5
78 |
79 | 关于学习的三大准则:
80 | a). Occam's Razor:能够拟合数据的模型中,越简单的模型越具有说服力
81 | b). Sampling Bias:如果训练样本和测试样本(以及后续的预测问题)的数据不是来自于同一分布,则训练的结果往往是不可信的
82 | c). Data Snooping(偷窥数据):要尽可能的避免或减少对数据的“偷窥”,在很多情况下,会有意无意的间接加入了“人的预处理”而对资料产生污染
83 |
84 | Q6:发件人连续5周、每周寄出一封预测当周一场棒球比赛胜负的邮件;邮件声称,如果你想获得第6周比赛的胜负预测,需支付1000元。下述哪条分析是对的?
85 | a). 如果要保证至少有一封邮件对5周棒球比赛预测结果是对的,至少要发出64封邮件(在5场比赛出结果之前)
86 | b). 在第一场比赛结果出来后,邮件发送者应该再发几封邮件给剩下的人
87 |
88 | A6:5场都没出结果时:$2^5=32$。1场出了结果:$2^4=16$,之后依次除以2即可
89 |
90 | Q7:如果发送一封邮件需10元,如果5周过后有一位收件人寄来1000元,则该发件人的这次骗局能赚多少钱?
91 |
92 | A7:总共花费$cost=10\times(2^5+2^4+2^3+2^2+2^1+1)=630$,所以$get=1000-630=370$
93 |
94 | ### 问题6
95 |
96 | Q8~Q10:银行一开始有一个非常简陋的“评判标准”$a(x)$来确定是否对某人$x$发放信用卡,对$N=10,000$个人采用$a(x)$标准决定是否发放信用卡,因此,现在手头的数据有$(x_1,y_1),...,(x_N,y_N)$。在你看到这些数据之前,厉害的你已经通过数学推导等手段建立了一个信用卡发放与否的评价函数$g$,然后你将其用到这些有的数据上,发现结果几乎完美。
97 |
98 | Q8:你的假设函数集大小$M$是多少?
99 |
100 | A8:由于只是事先建立好的一个固定评价函数$g$,因此假设函数集大小$M=1$
101 |
102 | Q9:在此$M$下,$N=10,000$情况下,通过Hoeffding bound计算误差大于$1\%$的概率为多少?
103 |
104 | A9:由于$M=1$,因此就等价于使用Hoeffding不等式:$P[|E_{in}-E_{out}|\gt\epsilon]\le2exp(-2\epsilon^2N)$,将具体数值代入可解得$P\leq0.271$
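
一个简单的代入计算(非原答案内容,仅作演示):

```python
import numpy as np
print(2*np.exp(-2 * 0.01**2 * 10000))   # 约0.2707,即P<=0.271
```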
105 |
106 | Q10:银行采用你的评价函数$g$,结果发现在后续人中效果很差,超过半数办理信用卡的人存在欺诈行为。则如何“搭配”$a(x)$和你的$g(x)$作为评价体系会有较好的效果?
107 |
108 | A10:$g(x)$效果很差的根本原因在于使用了“污染”的数据,因为$Data$全部都是来自于$a(x)$获得的标签$y$,并不是真实可信的。但是$g(x)$又能够很好的预测来自$a(x)$的数据,因此一种好的预测方式是,如果能通过$a(x)$的测试,则再进一步进行$g(x)$的测试(因为$g(x)$对$a(x)$的结果预测效果好)。因此$a(x)\ AND\ g(x)$,即两者都说“可信”的情况下,真实情况一般而言是好的。
109 |
110 | ### 问题7
111 |
112 | Q11~Q12:虚拟样本和正则化:考虑线性回归问题,如果我们加入K个虚拟样本$(\tilde{x}_1,\tilde{y}_1),...,(\tilde{x}_K,\tilde{y}_K)$,然后我们解下述方程:
113 | $$
114 | min_w \frac{1}{N+K}\big(\sum_{n=1}^N(y_n-w^Tx_n)^2+\sum_{k=1}^K(\tilde{y}_k-w^T\tilde{x}_k)^2\big)
115 | $$
116 | 这些虚拟样本可以视为一种“另类”的正则化。
117 |
118 | Q11:使得上述问题得到最优解的$w$是多少?
119 |
120 | A11:对上式求导,再令其导数为0求出$w$,求解的结果为$w=(X^TX+\tilde{X}^T\tilde{X})^{-1}(X^TY+\tilde{X}^T\tilde{Y})$
121 |
122 | Q12:当$\tilde{X},\tilde{Y}$取何值时,上述表达式获得的$w$与下述L2正则化(岭回归)的最优解完全一样?
123 | $$
124 | w_{reg}=argmin_w\frac{\lambda}{N}||w||^2+\frac{1}{N}||Xw-y||^2
125 | $$
126 | A12:简单的将A11的结果与上式对比就可知:$\tilde{X}=\sqrt{\lambda}I,\tilde{Y}=0$
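
下面给出一个简单的数值核对示例(非原答案代码,数据规模与$\lambda$均为假设取值,仅作演示):取$\tilde{X}=\sqrt{\lambda}I,\ \tilde{Y}=0$,比较A11的解与岭回归闭式解:

```python
import numpy as np

np.random.seed(0)
N, d, lamb = 30, 4, 2.0                     # 假设的数据规模与lambda
X = np.random.randn(N, d)
Y = np.random.randn(N)

Xt = np.sqrt(lamb)*np.eye(d)                # 虚拟样本 X~ = sqrt(lambda) I
Yt = np.zeros(d)                            # 虚拟样本 y~ = 0
w_virtual = np.linalg.solve(X.T.dot(X) + Xt.T.dot(Xt), X.T.dot(Y) + Xt.T.dot(Yt))
w_ridge = np.linalg.solve(X.T.dot(X) + lamb*np.eye(d), X.T.dot(Y))
print(np.max(np.abs(w_virtual - w_ridge)))  # 数值误差内应为0
```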
--------------------------------------------------------------------------------
/lecture/MLF4-1/Q4_1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4-1/Q4_1.png
--------------------------------------------------------------------------------
/lecture/MLF4-1/Q5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4-1/Q5.png
--------------------------------------------------------------------------------
/lecture/MLF4-2.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习基石作业4:part2
3 | date: 2017-02-10 21:22:26
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习基石课后作业4-2:对应题目13~题目20
9 |
10 |
11 | ## 机器学习基石作业4
12 |
13 | ## 问题8
14 |
15 | Q13~Q20:主要考虑了加入正则项后的线性回归问题,超参数的选择问题和交叉验证的方法。
16 |
17 | ### 算法说明
18 |
19 | 加入正则项的线性回归问题(基本原理与线性回归一致,也存在闭式解):
20 | $$
21 | w_{reg}=argmin_w\frac{\lambda}{N}||w||^2+\frac{1}{N}||Xw-Y||^2=(X^TX+\lambda I)^{-1}X^TY
22 | $$
23 | 
24 |
25 | 超参数的选择:
26 |
27 | >一系列超参数的集合$L=\lambda_1,...\lambda_K$
28 | >For i=1,2,...K:
29 | > ①计算每一种超参数下的$w_{reg}$
30 | > ②利用此$w_{reg}$求解$E_{in}$和$E_{out}$(有时还加上$E_{val}$)
31 | > ③根据一定评价标准(常用$E_{val}$来进行选择)
32 | >返回最佳的$\lambda$
33 |
34 | 交叉验证方法:
35 |
36 | >将数据划分为K份
37 | >For t=1,2,...,K:
38 | > ①将第t份作为验证集,其他的作为训练集
39 | > ②通过训练集训练求出$w_{reg}$
40 | > ③求解该$w_{reg}$情况下训练集和验证集的误差
41 | >将所有训练集和验证集的误差取均值
42 | >返回误差
43 |
44 | 交叉验证往往被用来作为超参数选择的策略。也是评价假设函数集好坏的有效方法。
45 |
46 | ### 具体问题
47 |
48 | ```python
49 | # 数据导入模块
50 | def loadData(filename):
51 | data = pd.read_csv(filename, sep='\s+', header=None)
52 |     data = data.values      # 新版pandas已移除as_matrix(),改用values
53 | col, row = data.shape
54 | X = np.c_[np.ones((col, 1)), data[:, 0: row-1]]
55 | Y = data[:, row-1:row]
56 | return X, Y
57 | ```
58 | ```python
59 | # 误差计算函数
60 | def mistake(X, Y, theta):
61 | yhat = np.sign(X.dot(theta))
62 | yhat[yhat == 0] = -1
63 | err = np.sum(yhat != Y)/len(Y)
64 | return err
65 | ```
66 | Q13:取$\lambda=10$的情况下时,对应的$E_{in}$和$E_{out}$分别是多少?(采用0/1判据)
67 |
68 | A13:结果如下所示
69 |
70 | ```python
71 | # Q13 --- lambda=10的情况
72 | X, Y = loadData('hw4_train.dat')
73 | Xtest, Ytest = loadData('hw4_test.dat')
74 | lamb = 10
75 | row, col = X.shape
76 | wreg = lin.pinv(X.T.dot(X)+lamb*np.eye(col)).dot(X.T).dot(Y)
77 | ein = mistake(X, Y, wreg)
78 | eout = mistake(Xtest, Ytest, wreg)
79 | print('Ein: ',ein,'Eout: ',eout)
80 | ```
81 |
82 | Ein: 0.05 Eout: 0.045
83 |
84 | Q14&Q15:分别采用$log_{10}\lambda=\{2,1,0,-1,...,-8,-9,-10\}$,对应最小$E_{in}$和最小$E_{out}$情况下的$\lambda$是多少?(如果存在多个相同答案时,选择相同情况下最大的$\lambda$)
85 |
86 | A14&A15:结果如下
87 |
88 | ```python
89 | # Q14和Q15 --- 不同lambda情况下选择最佳的lambda
90 | arr = np.arange(-10, 3, 1); num =len(arr)
91 | lamb = 10.0**arr
92 | ein = np.zeros((num,)); eout = np.zeros((num,)); evali = np.zeros((num,))
93 | for i in range(num):
94 | wreg = lin.pinv(X.T.dot(X) + lamb[i] * np.eye(col)).dot(X.T).dot(Y)
95 | ein[i] = mistake(X, Y, wreg)
96 | eout[i] = mistake(Xtest, Ytest, wreg)
97 | out = np.c_[np.c_[np.array(lamb),ein],eout]
98 | print('\tlambda\t\t Ein\t\t Eout')
99 | print(out)
100 | ```
101 |
102 | lambda Ein Eout
103 | [[ 1.00000000e-10 1.50000000e-02 2.00000000e-02]
104 | [ 1.00000000e-09 1.50000000e-02 2.00000000e-02]
105 | [ 1.00000000e-08 1.50000000e-02 2.00000000e-02]
106 | [ 1.00000000e-07 3.00000000e-02 1.50000000e-02]
107 | [ 1.00000000e-06 3.50000000e-02 1.60000000e-02]
108 | [ 1.00000000e-05 3.00000000e-02 1.60000000e-02]
109 | [ 1.00000000e-04 3.00000000e-02 1.60000000e-02]
110 | [ 1.00000000e-03 3.00000000e-02 1.60000000e-02]
111 | [ 1.00000000e-02 3.00000000e-02 1.60000000e-02]
112 | [ 1.00000000e-01 3.50000000e-02 1.60000000e-02]
113 | [ 1.00000000e+00 3.50000000e-02 2.00000000e-02]
114 | [ 1.00000000e+01 5.00000000e-02 4.50000000e-02]
115 | [ 1.00000000e+02 2.40000000e-01 2.61000000e-01]]
116 |
117 | Q16&Q17:将训练集划分为$D_{train}:120$和$D_{val}:80$。通过$D_{train}$获得$g_{\lambda}^-$,再通过$D_{val}$对其进行验证。求不同超参数$\lambda$情况下使得$E_{train}(g_\lambda^-)$最小的$\lambda$和使得$E_{val}(g_\lambda^-)$最小的$\lambda$
118 |
119 | A16&A17:结果如下
120 |
121 | ```python
122 | # Q16和Q17
123 | Xtrain = X[0:120, :]; Ytrain = Y[0:120, :]
124 | Xval = X[120:, :]; Yval = Y[120:, :]
125 | ein = np.zeros((num,)); eout = np.zeros((num,)); evali = np.zeros((num,))
126 | for i in range(num):
127 | wreg = lin.pinv(Xtrain.T.dot(Xtrain) + lamb[i] * np.eye(col)).dot(Xtrain.T).dot(Ytrain)
128 | ein[i] = mistake(Xtrain, Ytrain, wreg)
129 | eout[i] = mistake(Xtest, Ytest, wreg)
130 | evali[i] = mistake(Xval, Yval, wreg)
131 | out = np.c_[np.c_[np.c_[np.array(lamb),ein],evali],eout]
132 | print('\tlambda\t\t Ein\t\t Eval\t\t Eout')
133 | print(out)
134 | ```
135 |
136 | lambda Ein Eval Eout
137 | [[ 1.00000000e-10 8.33333333e-03 1.25000000e-01 4.00000000e-02]
138 | [ 1.00000000e-09 0.00000000e+00 1.00000000e-01 3.80000000e-02]
139 | [ 1.00000000e-08 0.00000000e+00 5.00000000e-02 2.50000000e-02]
140 | [ 1.00000000e-07 3.33333333e-02 3.75000000e-02 2.10000000e-02]
141 | [ 1.00000000e-06 3.33333333e-02 3.75000000e-02 2.10000000e-02]
142 | [ 1.00000000e-05 3.33333333e-02 3.75000000e-02 2.10000000e-02]
143 | [ 1.00000000e-04 3.33333333e-02 3.75000000e-02 2.10000000e-02]
144 | [ 1.00000000e-03 3.33333333e-02 3.75000000e-02 2.10000000e-02]
145 | [ 1.00000000e-02 3.33333333e-02 3.75000000e-02 2.10000000e-02]
146 | [ 1.00000000e-01 3.33333333e-02 3.75000000e-02 2.20000000e-02]
147 | [ 1.00000000e+00 3.33333333e-02 3.75000000e-02 2.80000000e-02]
148 | [ 1.00000000e+01 7.50000000e-02 1.25000000e-01 8.00000000e-02]
149 | [ 1.00000000e+02 3.41666667e-01 4.12500000e-01 4.14000000e-01]]
150 | Q18:将Q17中(对应$E_{val}$最小)的$\lambda$作用于整个数据集获得$g_\lambda$,在计算此时的$E_{in}(g_\lambda)$和$E_{out}(g_\lambda)$
151 |
152 | A18:结果如下
153 |
154 | ```python
155 | # Q18
156 | lambmin = lamb[np.where(evali == np.min(evali))[0][-1]]
157 | wreg = lin.pinv(X.T.dot(X) + lambmin * np.eye(col)).dot(X.T).dot(Y)
158 | errin = mistake(X, Y, wreg)
159 | errout = mistake(Xtest, Ytest, wreg)
160 | print('Ein: ',errin,'Eout: ',errout)
161 | ```
162 |
163 | Ein: 0.035 Eout: 0.02
164 |
165 | Q19:将数据划分为5份,进行交叉验证,对于不同的$\lambda$情况,求使得$E_{cv}$最小的的$\lambda$
166 |
167 | A19:结果如下
168 |
169 | ```python
170 | # Q19
171 | ein = np.zeros((num,)); eout = np.zeros((num,))
172 | for j in range(num):
173 | for i in range(5):
174 | Xval = X[40*i:40*(i+1), :]
175 | Yval = Y[40*i:40*(i+1), :]
176 | Xtrain = np.r_[X[0:40*i, :], X[40*(i+1):, :]]
177 | Ytrain = np.r_[Y[0:40*i, :], Y[40*(i+1):, :]]
178 | wreg = lin.pinv(Xtrain.T.dot(Xtrain) + lamb[j] * np.eye(col)).dot(Xtrain.T).dot(Ytrain)
179 | ein[j] += mistake(Xval, Yval, wreg)
180 | ein[j] /= 5
181 | out = np.c_[np.array(lamb), ein]
182 | print('\tlambda\t\t Ecv')
183 | print(out)
184 | ```
185 |
186 | lambda Ecv
187 | [[ 1.00000000e-10 5.00000000e-02]
188 | [ 1.00000000e-09 5.00000000e-02]
189 | [ 1.00000000e-08 3.00000000e-02]
190 | [ 1.00000000e-07 3.50000000e-02]
191 | [ 1.00000000e-06 3.50000000e-02]
192 | [ 1.00000000e-05 3.50000000e-02]
193 | [ 1.00000000e-04 3.50000000e-02]
194 | [ 1.00000000e-03 3.50000000e-02]
195 | [ 1.00000000e-02 3.50000000e-02]
196 | [ 1.00000000e-01 3.50000000e-02]
197 | [ 1.00000000e+00 3.50000000e-02]
198 | [ 1.00000000e+01 6.00000000e-02]
199 | [ 1.00000000e+02 2.90000000e-01]]
200 | Q20:求上述$\lambda$(Q19)对应的$E_{in}(g_\lambda)$和$E_{out}(g_\lambda)$
201 |
202 | A20:结果如下
203 |
204 | ```python
205 | # Q20
206 | lambmin = lamb[np.where(ein == np.min(ein))[0][-1]]
207 | wreg = lin.pinv(X.T.dot(X) + lambmin * np.eye(col)).dot(X.T).dot(Y)
208 | errin = mistake(X, Y, wreg)
209 | errout = mistake(Xtest, Ytest, wreg)
210 | print('Ein: ',errin,'Eout: ',errout)
211 | ```
212 |
213 | Ein: 0.015 Eout: 0.02
--------------------------------------------------------------------------------
/lecture/MLF4-2/hw4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4-2/hw4.png
--------------------------------------------------------------------------------
/lecture/MLF4.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习基石Lec13-Lec16
3 | date: 2017-02-12 17:46:39
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习基石Lec13-Lec16主要知识点:对应作业4
9 |
10 |
11 | ## 过拟合的危害
12 |
13 | ### 过拟合和欠拟合
14 |
15 | - 过拟合定义:$E_{in}$足够小,但是$E_{in}$和$E_{out}$相差甚远(常常发生在$d_{VC}\gt d_{VC}^\star$)
16 | - 欠拟合定义:$E_{in}$和$E_{out}$均过大(常常发生在$d_{VC}\lt d_{VC}^\star$)
17 |
18 | ### 导致过拟合的几大主要原因
19 |
20 | 
21 |
22 | ### Stochastic noise和deterministic noise
23 |
24 | - Stochastic noise:属于“不可控”因素,来自于外界对数据的污染
25 | - Deterministic noise:可以视为假设函数集中最佳函数$h^\star$与潜在目标函数$f$的差异。与假设函数集有关,因此属于“可控”因素
26 |
27 | 以下是针对$y=f(x)+\epsilon\sim Gaussian(\sum_{q=0}^{Q_f}\alpha_qx^q,\sigma^2)$为目标函数情况下的实验(其中坐标轴上每个点的数值表示$E_{out}(g_{10})-E_{out}(g_2)$的结果,$Q_f$是指"潜在的目标函数"的复杂度)
28 |
29 | 
30 |
31 | 从上述图像可知:并不是假设函数集越复杂越好,需要视实际拥有的数据量$N$等情况而定。
32 |
33 | ### 解决过拟合的几个方法
34 |
35 | - 数据修正:对错误的标签进行修正
36 | - 数据筛选:剔除一些错误或“不良”数据
37 | - 添加“伪数据”:通过旋转平移等手段来增加数据量
38 | - 正则化
39 | - .......
40 |
41 | ## 正则化
42 |
43 | ### 基本思想
44 |
45 | 通过对参数$w$加入一些约束,减少假设函数集中可供选择的假设函数数量,从而有效降低模型复杂度。
46 |
47 | 常见正则化的两种等价形式(详细推导见课件):
48 | 
49 |
50 | ($\lambda$越大$\longleftrightarrow$更偏向于小的$w$$\longleftrightarrow$更小的C)
51 |
52 | ### 正则化背后的VC理论
53 |
54 | 解释①:直接地减少了模型的复杂度$\Omega(\mathcal{H}(C))\lt \Omega(\mathcal{H})$
55 | 
56 | 解释②:间接地减少了模型的复杂度:
57 | 
58 | 将$\frac{\lambda}{N}w^Tw$视为“关于$w$的复杂度度量”$\frac{\lambda}{N}\Omega(w)$,从而上述两边结合等价于:
59 | $$
60 | E_{out}(w)\leq E_{aug}(w)-\frac{\lambda}{N}\Omega(w)+\Omega(\mathcal{H})
61 | $$
62 | 从而相比$E_{in}$,$E_{aug}$更加近似$E_{out}$
63 |
64 | ### 常见的正则项$\Omega(w)$
65 |
66 | 
67 |
68 | 关于$\lambda$的选择:$\lambda$可以调整正则项能力的强弱,$\lambda$越大,正则项作用越大。当噪声(无论deterministic noise或者stochastic noise)越大时,相对而言,$\lambda$可以调整得大一点。
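
下面给出一个极小的数值示意(非课件内容,数据为随机构造的假设数据):利用带$L_2$正则项线性回归(ridge regression)的解析解$w_{reg}=(Z^TZ+\lambda I)^{-1}Z^Ty$,观察$\lambda$对$\Vert w\Vert$的影响。

```python
import numpy as np

np.random.seed(0)
# 人造数据:目标近似为 y = x + 噪声,用7次多项式特征拟合(容易过拟合)
x = np.linspace(0, 1, 20)
y = x + 0.1 * np.random.randn(20)
Z = np.vander(x, 8)                                   # 多项式特征转换
for lamb in [0, 0.01, 1, 100]:
    w = np.linalg.solve(Z.T @ Z + lamb * np.eye(8), Z.T @ y)
    print(lamb, np.linalg.norm(w))                    # lambda越大,|w|越小
```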
69 |
70 | ## 验证方式
71 |
72 | ### 利用验证集选择模型
73 |
74 | **基本框架**($K$代表验证集含有的数据量):
75 | 
76 | 不等式①可由“复杂度项$\Omega(N,\mathcal{H},\delta)$与$N$有关,且$N$越大其值越小”得到。
77 | 不等式②则是将$E_{val}$看成$E_{in}$之后的VC bound。
78 |
79 | **存在的矛盾**
80 | 
81 | 当$K$过大时:$g^-$远差于$g$,因此不能够代表选择的$\mathcal{H}$是“最佳的”
82 | 当$K$过小时:$E_{val}(g^-)$与$E_{out}(g^-)$相差较大,则通过$E_{val}$的大小选出来的最佳$\mathcal{H}$,对测试集而言并非“最佳的”
83 |
84 | 实践中,常取$K=N/5$
85 |
86 | ### 常见的交叉验证的方式
87 |
88 | **留一交叉验证**
89 |
90 | 
91 |
92 | 优点:理论上$E_{loocv}$的数学期望可以代表$E_{out}(g^-)$的数学期望,证明见作业
93 | 缺点:计算开销大;不稳定
94 |
95 | **留一份交叉验证(V折交叉验证)**
96 |
97 | 
98 |
99 | 将数据划分为$V$份(实践中往往$V=5\ or\ 10$)
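
下面给出一个V折交叉验证的极简示意(非课件代码,仍以上面的ridge回归为例,假设已有特征矩阵`X`与标签`y`):

```python
import numpy as np

def e_cv(X, y, lamb, V=5):
    """V折交叉验证:轮流留一份做验证,返回平均平方误差"""
    folds = np.array_split(np.arange(len(y)), V)
    errs = []
    for v in range(V):
        val = folds[v]
        tr = np.hstack([folds[i] for i in range(V) if i != v])
        w = np.linalg.solve(X[tr].T @ X[tr] + lamb * np.eye(X.shape[1]),
                            X[tr].T @ y[tr])
        errs.append(np.mean((X[val] @ w - y[val]) ** 2))
    return np.mean(errs)

# 用法示意:在一组候选lambda中选出E_cv最小者
# best_lamb = min([1e-3, 1e-1, 1, 10], key=lambda l: e_cv(X, y, l))
```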
100 |
101 | ## 三个学习准则
102 |
103 | ### Occam's Razor for Learning
104 |
105 | 
106 |
107 | ### Sampling Bias
108 |
109 | 
110 |
111 | ### Data Snooping(别偷看数据)
112 |
113 | 
114 |
115 |
--------------------------------------------------------------------------------
/lecture/MLF4/pic1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic1.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic10.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic11.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic11.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic12.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic12.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic13.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic13.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic2.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic3.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic4.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic5.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic6.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic7.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic8.png
--------------------------------------------------------------------------------
/lecture/MLF4/pic9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLF4/pic9.png
--------------------------------------------------------------------------------
/lecture/MLT1-1/output_1_0.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1-1/output_1_0.png
--------------------------------------------------------------------------------
/lecture/MLT1-1/pic1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1-1/pic1.png
--------------------------------------------------------------------------------
/lecture/MLT1-1/pic2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1-1/pic2.png
--------------------------------------------------------------------------------
/lecture/MLT1-1/pic3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1-1/pic3.png
--------------------------------------------------------------------------------
/lecture/MLT1-2.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习技法作业1:part2
3 | date: 2017-02-14 15:15:09
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习技法课后作业1-2:对应题目15~题目20
9 |
10 |
11 | ## 机器学习技法作业1
12 |
13 | Q15~Q20主要考虑在一个实际数据集上进行实验。该数据集为手写邮政编码数字,输入为数字图像的密度和对称度两个特征,输出为对应的数字。我们主要考虑多分类问题中一对多(one-versus-all)的情况。
14 | 注意事项:在计算$E_{in},E_{out},E_{val}$时,采用$0/1$判据;此外不要对数据做额外处理(如缩放等)。
15 |
16 | ```python
17 | # Q15~Q20
18 | # 加载数据函数
19 | def loadData(filename):
20 | data = pd.read_csv(filename, sep='\s+', header=None)
21 | data = data.as_matrix()
22 | col, row = data.shape
23 | X = np.c_[np.ones((col, 1)), data[:, 1: row]]
24 | Y = data[:, 0]
25 | return X, Y
26 | ```
27 | ```python
28 | # 误差计算函数
29 | def mistake(yhat, y):
30 | err = np.sum(yhat != y)/len(y)
31 | return err
32 | ```
33 | ```python
34 | # 导入数据
35 | X, Y = loadData('features_train.dat')
36 | Xtest, Ytest = loadData('features_test.dat')
37 | row, col = X.shape
38 | ```
39 | ### 问题15
40 |
41 | Q15:采用线性kernel: $K(x_n,x_m)=x_n^Tx_m$,取$C=0.01$,解决$(0..VS..非0)$的二元分类问题,则其解得的$||w||$的值最接近多少?
42 |
43 | A15:以下结果可见,应该选择$0.6$
44 |
45 | ```python
46 | # Q15
47 | Ytemp = Y.copy()
48 | pos1 = Ytemp == 0; pos2 = Ytemp != 0
49 | Ytemp[pos1] = 1; Ytemp[pos2] = -1
50 | clf = SVC(C=0.01, kernel='linear', shrinking=False)
51 | clf.fit(X, Ytemp)
52 | print('w: ', clf.coef_, '\n |w|: ', np.linalg.norm(clf.coef_))
53 | ```
54 |
55 | w: [[ -1.82145965e-15 5.70727340e-01 2.59535779e-02]]
56 | |w|: 0.571317149084
57 | ### 问题16
58 |
59 | Q16:考虑采用多项式kernel: $K(x_n,x_m)=(1+x_n^Tx_m)^Q$,考虑参数$C=0.01,Q=2$的情况下,各种一对多情况的$E_{in}$
60 |
61 | A16:由下述结果可见,应该选择$8..vs..not.8$
62 |
63 | ```python
64 | # Q16~Q17
65 | Ein = np.zeros((10,))
66 | Salpha = np.zeros((10,))
67 | clf = SVC(C=0.01, kernel='poly', degree=2, gamma=1, coef0=1, shrinking=False)
68 | for i in range(10):
69 | Ytemp = Y.copy()
70 | pos1 = Ytemp == i; pos2 = Ytemp != i
71 | Ytemp[pos1] = 1; Ytemp[pos2] = -1
72 | clf.fit(X, Ytemp)
73 | Yhat = clf.predict(X)
74 | Ein[i] = mistake(Ytemp, Yhat)
75 | Salpha[i] = np.sum(np.abs(clf.dual_coef_))
76 | out = np.c_[Ein,Salpha]
77 | print('\tEin\t\t Sum_alpha')
78 | print(out)
79 | ```
80 |
81 | Ein Sum_alpha
82 | [[ 1.02455082e-01 2.14119479e+01]
83 | [ 1.44013167e-02 3.74000000e+00]
84 | [ 1.00260595e-01 1.46200000e+01]
85 | [ 9.02482513e-02 1.31600000e+01]
86 | [ 8.94253189e-02 1.30400000e+01]
87 | [ 7.62584008e-02 1.11200000e+01]
88 | [ 9.10711837e-02 1.32800000e+01]
89 | [ 8.84652311e-02 1.29000000e+01]
90 | [ 7.43382252e-02 1.08400000e+01]
91 | [ 8.83280757e-02 1.28800000e+01]]
92 |
93 | ### 问题17
94 |
95 | Q17:与Q16相同的情况下,求各自对应的$\sum\alpha_n$
96 |
97 | A17:从上述程序可见,应该选择$20.0$
98 |
99 | ### 问题18
100 |
101 | Q18:考虑高斯kernel: $K(x_n,x_m)=exp(-\gamma||x_n-x_m||^2)$。当$\gamma=100$,考虑$0..vs..not.0$问题,分别取$C=[0.001,0.01,0.1,1,10]$这五种情况时,以下对应的几种属性中哪个是随着$C$增大严格递减的?
102 | (a). $\mathcal{Z}$空间中,自由支持向量到超平面的距离
103 | (b). 支持向量的数量
104 | (c). $E_{out}$
105 | (d). $\sum_{n=1}^N\sum_{m=1}^NK(x_n,x_m)$
106 |
107 | A18:显然(d)项是一个定值,不会随$C$改变。(a)项:将$C$视为惩罚因子,$C$越大,容忍能力越差,从而margin会更小,因此(a)项正确。其他两项看下面的结果。
108 |
109 | ```python
110 | # Q18
111 | c = np.array([0.001, 0.01, 0.1, 1, 10])
112 | nsup = np.zeros((len(c),))
113 | eout = np.zeros((len(c),))
114 | Ytemp = Y.copy()
115 | pos1 = Ytemp == 0; pos2 = Ytemp != 0
116 | Ytemp[pos1] = 1; Ytemp[pos2] = -1
117 | Ytesttemp = Ytest.copy()
118 | pos1 = Ytesttemp == 0; pos2 = Ytesttemp != 0
119 | Ytesttemp[pos1] = 1; Ytesttemp[pos2] = -1
120 | for i in range(len(c)):
121 | clf = SVC(C=c[i], kernel='rbf', gamma=100, shrinking=False)
122 | clf.fit(X, Ytemp)
123 | nsup[i] = np.sum(clf.n_support_)
124 | yhat = clf.predict(Xtest)
125 | eout[i] = mistake(Ytesttemp, yhat)
126 | out = np.c_[np.c_[c,nsup],eout]
127 | print('\tC\t\t n_suport\t eout')
128 | print(out)
129 | ```
130 |
131 | C n_suport eout
132 | [[ 1.00000000e-03 2.39800000e+03 1.78873941e-01]
133 | [ 1.00000000e-02 2.52000000e+03 1.78873941e-01]
134 | [ 1.00000000e-01 2.28500000e+03 1.05132038e-01]
135 | [ 1.00000000e+00 1.77400000e+03 1.03637270e-01]
136 | [ 1.00000000e+01 1.67300000e+03 1.04633782e-01]]
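
补充:(a)项也可以数值验证。在$\mathcal{Z}$空间中,自由支持向量到超平面的距离恰为$1/\Vert w\Vert$,而$\Vert w\Vert^2=\sum_{i,j}(\alpha_iy_i)(\alpha_jy_j)K(x_i,x_j)$可由对偶系数算出。下面是一个示意(非题目要求的代码,沿用上面循环中最后一次训练好的`clf`与数据`X`):

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

sv = X[clf.support_]                    # 支持向量
coef = clf.dual_coef_                   # alpha_i * y_i,形状(1, n_SV)
K_sv = rbf_kernel(sv, sv, gamma=100)
w_norm = np.sqrt(coef @ K_sv @ coef.T)[0, 0]
print('自由支持向量到超平面的距离: ', 1.0 / w_norm)
```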
137 |
138 | ### 问题19
139 |
140 | Q19:内容同Q18,当$C=0.1$时,$\gamma=[1,10,100,1000,10000]$中哪个对应最小的$E_{out}$
141 |
142 | A19:从下述结果可见,选择$\gamma=10$
143 |
144 | ```python
145 | # Q19
146 | gamma1 = np.array([1, 10, 100, 1000, 10000])
147 | eout = np.zeros((len(gamma1),))
148 | Ytemp = Y.copy()
149 | pos1 = Ytemp == 0; pos2 = Ytemp != 0
150 | Ytemp[pos1] = 1; Ytemp[pos2] = -1
151 | Ytesttemp = Ytest.copy()
152 | pos1 = Ytesttemp == 0; pos2 = Ytesttemp != 0
153 | Ytesttemp[pos1] = 1; Ytesttemp[pos2] = -1
154 | for i in range(len(gamma1)):
155 | clf = SVC(C=0.1, kernel='rbf', gamma=gamma1[i], shrinking=False)
156 | clf.fit(X, Ytemp)
157 | yhat = clf.predict(Xtest)
158 | eout[i] = mistake(yhat, Ytesttemp)
159 | out = np.c_[gamma1, eout]
160 | print('\t gamma \t\t eout')
161 | print(out)
162 | ```
163 |
164 | gamma eout
165 | [[ 1.00000000e+00 1.07125062e-01]
166 | [ 1.00000000e+01 9.91529646e-02]
167 | [ 1.00000000e+02 1.05132038e-01]
168 | [ 1.00000000e+03 1.78873941e-01]
169 | [ 1.00000000e+04 1.78873941e-01]]
170 |
171 | ### 问题20
172 |
173 | Q20:内容同Q18,从数据集中随机选取1000个样本作为验证集、其余作为训练集;在$C=0.1$的情况下,不同的$\gamma=[1,10,100,1000,10000]$中,哪个对应的$E_{val}$最小(进行100次实验取平均)
174 |
175 | A20:从下述结果可见,$\gamma=10$时对应最小
176 |
177 | ```python
178 | # Q20
179 | evali = np.zeros((len(gamma1),))
180 | Ytemp = Y.copy()
181 | pos1 = Ytemp == 0; pos2 = Ytemp != 0
182 | Ytemp[pos1] = 1; Ytemp[pos2] = -1
183 | for i in range(len(gamma1)):
184 | for j in range(100):
185 | pos = np.random.permutation(row)
186 | Xval = X[pos[0:1000], :]; Yval = Ytemp[pos[0:1000]]
187 | Xtrain = X[pos[1000:], :]; Ytrain = Ytemp[pos[1000:]]
188 | clf = SVC(C=0.1, kernel='rbf', gamma=gamma1[i], shrinking=False)
189 | clf.fit(Xtrain, Ytrain)
190 | yhat = clf.predict(Xval)
191 | evali[i] += mistake(yhat, Yval)
192 | out = np.c_[gamma1, evali/100]
193 | print('\t gamma\t\t eval')
194 | print(out)
195 | ```
196 |
197 | gamma eval
198 | [[ 1.00000000e+00 1.05900000e-01]
199 | [ 1.00000000e+01 9.94500000e-02]
200 | [ 1.00000000e+02 1.00830000e-01]
201 | [ 1.00000000e+03 1.64670000e-01]
202 | [ 1.00000000e+04 1.62690000e-01]]
--------------------------------------------------------------------------------
/lecture/MLT1.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习技法Lec1-Lec4
3 | date: 2017-02-14 16:42:00
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习技法Lec1-Lec4主要知识点:对应作业1
9 |
10 |
11 | ## 线性支撑向量机(LSVM)
12 |
13 | ### SVM背后的本质
14 |
15 | 支撑向量机本质上就是寻找最大边界的分离超平面,以下给出支撑向量机的推导流程:
16 | 
17 | 上面的最后一幅图结果就是SVM的基本框架
18 |
19 | ### 基于QP solver求解支撑向量机
20 |
21 | SVM对应的表达式属于二次规划的问题,因此可以直接采用现成的求解方式求解:
22 | 
23 | 具体的算法流程如下所示:
24 | 
25 |
26 | ### SVM背后的依据
27 |
28 | 解释①:寻找最大边界超平面可以视为一种特殊情况的加入正则化过程
29 | 
30 | 解释②:使得$d_{vc}$依赖于数据,从而有效降低了$d_{vc}$,因此可以避免overfitting
31 | 
32 |
33 | ### 采用SVM的优势
34 |
35 | 由于SVM自带降低$d_{vc}$的“属性”,因此自带削弱“过拟合”的“光环”,将其和特征转换组合起来,既能获得非常复杂的分离超平面,又能避免过拟合。
36 | 
37 |
38 | ## 基于对偶形式的SVM
39 |
40 | SVM结合特征转换后存在的一个问题,当转换后的特征数目过大时,通过QP solver解决起来就存在困难(变量数目大于条件数目),因此引入SVM的对偶形式来解决这个问题
41 |
42 | ### 拉格朗日函数的引入和求解
43 |
44 | 将原问题转换为对偶拉格朗日函数形式:
45 | 
46 | 拉格朗日函数的简单解释:
47 | 
48 | 强对偶问题的“介入”:
49 | 
50 | KKT条件:
51 | 
52 | 注:常规的求解方式为
53 | ①对Lagrange dual中的min项对应“下标参数”求导,代入$\mathcal{L}$进行一定的“约简”
54 | ②结合原始条件+对偶问题中隐含的条件$\alpha^\star_ig_i(w^\star)=0$
55 | ③将全部结果代入$\mathcal{L}$,将结果化为只包含max项对应“下标参数”的情况
56 |
57 | 上述过程将常规SVM$\to$对偶形式的SVM:(其具体的对偶形式及求解如下)
58 | 
59 | 进一步求解最终的分离超平面:
60 | 
61 |
62 | ### 对偶SVM背后隐含的信息和存在的问题
63 |
64 | 对偶SVM与PLA的参数$w$的联系
65 | 
66 | 对偶SVM存在的问题:
67 | 
68 |
69 | ## 基于核函数的SVM
70 |
71 | ### 基于核函数的SVM算法
72 |
73 | 核函数本质上是把两个变量经特征转换后的内积,改用原始变量的运算直接计算出来,从而有效减少计算复杂度。一个简单的对应关系实例如二次多项式核函数:
74 | 
75 | 核函数的算法:
76 | 
77 |
78 | ### 三种常见核函数的对比
79 |
80 | ① 线性核函数
81 | 
82 | ② 多项式核
83 | 
84 | ③ 高斯核
85 | 
86 |
87 | 一个函数能否作为核函数的判别标准比较简单,满足Mercer's condition:
88 | ① 对称性
89 | ② 半正定性
90 |
91 | 具体通过已有核函数构造新的核函数的方法见作业1。
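
对于条件②,可以随机取若干点、检查其Gram矩阵的最小特征值是否非负来做数值上的简单检查(非课件代码,以高斯核为例):

```python
import numpy as np

def gram_gaussian(X, gamma=1.0):
    # K[i, j] = exp(-gamma * |x_i - x_j|^2)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq)

X = np.random.randn(50, 3)
K = gram_gaussian(X)
print('对称性: ', np.allclose(K, K.T))
print('最小特征值: ', np.linalg.eigvalsh(K).min())   # 应非负(数值误差内)
```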
92 |
93 | ## 柔性边界SVM
94 |
95 | ### 加入惩罚项的柔性边界SVM
96 |
97 | 加入惩罚因子后的基本形式和其对应的拉格朗日函数:
98 | 
99 | 利用强对偶情况的求解方式进行求解(与最大边界情况完全类似),可以获得其对偶形式如下:
100 | 
101 | 注:
102 | ①上述对偶形式与hard-margin形式的SVM除了$\alpha_n$的边界发生了变化,其他全部一样!
103 | ②上述情况中的free SV是指$0\lt\alpha_n\lt C$对应的数据项。
104 | ③ $\alpha_n=C$的情况代表“越界”的数据点(即被惩罚的数据)
--------------------------------------------------------------------------------
/lecture/MLT1/pic1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic1.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic10.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic11.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic11.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic12.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic12.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic13.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic13.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic14.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic14.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic15.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic15.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic16.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic16.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic17.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic17.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic18.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic18.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic19.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic19.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic2.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic20.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic20.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic21.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic21.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic3.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic4.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic5.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic6.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic7.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic8.png
--------------------------------------------------------------------------------
/lecture/MLT1/pic9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT1/pic9.png
--------------------------------------------------------------------------------
/lecture/MLT2-1.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习技法作业2:part1
3 | date: 2017-02-17 10:07:52
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习技法课后作业2-1:对应题目1~题目11
9 |
10 |
11 | ## 机器学习技法作业2
12 |
13 | ### 问题1
14 |
15 | Q1:“概率型”SVM主要优化的表达式如下:
16 | $$
17 | min_{A,B}F(A,B)=\frac{1}{N}\sum_{n=1}^Nln(1+exp(-y_n(A\cdot(w_{svm}^T\phi(x_n)+b_{svm})+B)))
18 | $$
19 | 采用梯度下降法来最小化$F(A,B)$时,我们需要计算一阶导数。令$z_n=w_{svm}^T\phi(x_n)+b_{svm}$,以及$p_n=\theta(-y_n(Az_n+B))$,其中$\theta(s)=exp(s)/(1+exp(s))$为常规的logistic函数。则对应的导数$\nabla F(A,B)$为多少?
20 |
21 | A1:先进行简单的变量替换$F(A,B)=\frac{1}{N}\sum ln(k)$,再求一阶偏导数:
22 | $$
23 | \frac{\partial F}{\partial A}=\frac{1}{N}\sum \frac{1}{k}\cdot\frac{\partial k}{\partial A}=\frac{1}{N}\sum_{n=1}^N-y_np_nz_n\\
24 | \frac{\partial F}{\partial B}=\frac{1}{N}\sum \frac{1}{k}\cdot\frac{\partial k}{\partial B}=\frac{1}{N}\sum_{n=1}^N-y_np_n
25 | $$
26 | 从而可知最终结果为:$\nabla F(A,B)=\frac{1}{N}\sum_{n=1}^N[-y_np_nz_n,-y_np_n]^T$
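
可以用数值差分简单验证上述梯度(非题目要求的代码,$z_n,y_n$为随机构造的数据):

```python
import numpy as np

def F(A, B, z, y):
    return np.mean(np.log(1 + np.exp(-y * (A * z + B))))

def gradF(A, B, z, y):
    p = 1 / (1 + np.exp(y * (A * z + B)))           # p_n = theta(-y_n(A z_n + B))
    return np.array([np.mean(-y * p * z), np.mean(-y * p)])

np.random.seed(0)
z, y = np.random.randn(100), np.sign(np.random.randn(100))
A, B, eps = 0.5, -0.3, 1e-6
num = np.array([(F(A + eps, B, z, y) - F(A - eps, B, z, y)) / (2 * eps),
                (F(A, B + eps, z, y) - F(A, B - eps, z, y)) / (2 * eps)])
print(gradF(A, B, z, y), num)                       # 两者应非常接近
```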
27 |
28 | ### 问题2
29 |
30 | Q2:当采用牛顿法来最小化$F(A,B)$时,我们需要计算$-(H(F))^{-1}\nabla F$,其中$H(F)$为$F$的Hessian矩阵,求其具体形式
31 |
32 | A2:Hessian矩阵具体可以通过$H(F)=[\frac{\partial^2 F}{\partial A^2},\frac{\partial^2 F}{\partial A\partial B};\frac{\partial^2 F}{\partial A\partial B},\frac{\partial^2 F}{\partial B^2}]$来获得,就是普通的求导,注意细节便可,最终结果如下:
33 | $$
34 | H(F)=\frac{1}{N}\sum_{n=1}^N\begin{bmatrix}z_n^2p_n(1-p_n)&z_np_n(1-p_n)\\z_np_n(1-p_n)&p_n(1-p_n)\end{bmatrix}
35 | $$
36 |
37 | ### 问题3
38 |
39 | Q3:对于$d$维的$N$个输入数据,其在kernel ridge regression中的求逆的矩阵的维数是多少?
40 |
41 | A3:在kernel ridge regression中,其参数$\beta$的求解:$\beta=(\lambda I+K)^{-1}Y$,从而可知其对应的求逆矩阵的维数与$K$的大小相等,而$K\to N\times N$,所以选择$N\times N$
42 |
43 | ### 问题4
44 |
45 | Q4:常规(课堂上所学习)的SVR模型通常可以转化为下述优化问题:
46 | $$
47 | (P_1)\quad min_{b,w,\xi^{\lor},\xi^{\land}}\frac{1}{2}w^Tw+C\sum_{n=1}^N(\xi_n^{\lor}+\xi_n^{\land})\\
48 | s.t.\quad -\epsilon-\xi_n^{\lor}\le y_n-w^T\phi(x_n)-b\le \epsilon+\xi_n^{\land}\\
49 | \xi_n^{\lor}\ge 0,\xi_n^{\land}\ge0
50 | $$
51 | 大多数情况下,采用线性惩罚的SVR。但另一种$l_2$惩罚的SVR也同样非常常见,含二次惩罚的SVR可以转换为下述优化问题:
52 | $$
53 | (P_2)\quad min_{b,w,\xi^{\lor},\xi^{\land}}\frac{1}{2}w^Tw+C\sum_{n=1}^N((\xi_n^{\lor})^2+(\xi_n^{\land})^2)\\
54 | s.t.\quad -\epsilon-\xi_n^{\lor}\le y_n-w^T\phi(x_n)-b\le \epsilon+\xi_n^{\land}
55 | $$
56 | 则与之对应的“无约束”形式的$(P_2)$的形式为?
57 |
58 | A4:联系$(P_1)$情况,其对应的“无约束”形式为:$min_{b,w}\frac{1}{2}w^Tw+C\sum_{n=1}^N max(0,|w^Tz_n+b-y_n|-\epsilon)$,即给“越界”情况加上线性惩罚。联系到$(P_2)$,自然而然的可以理解为,给“越界”行为加上二次方惩罚,从而不难知其“无约束”形式为:$min_{b,w}\frac{1}{2}w^Tw+C\sum_{n=1}^N max(0,|w^Tz_n+b-y_n|-\epsilon)^2$
59 |
60 | ### 问题5
61 |
62 | Q5:根据**表示定理(representer theorem)**可知,任意以$L_2$正则项作为惩罚项的线性模型的最优解均可表示为:$w_{\star}=\sum_{n=1}^N\beta_nz_n$。从而将其代入Q4获得的$(P_2)$中将其化为对应的对偶形式。其中用$K(x_n,x_m)=(\phi(x_n))^T(\phi(x_m))$来表示,且令$s_n=\sum_{m=1}^N \beta_mK(x_n,x_m)+b$,不难发现$F(b,\beta)$可以对$\beta$求导。求其导数$\frac{\partial F}{\partial\beta_m}$
63 |
64 | A5:首先根据A4的结果给出$(P_2)$的对偶形式,如下所示:
65 | $$
66 | min_{b,\beta}\frac{1}{2}\sum_{n=1}^N\sum_{m=1}^N\beta_n\beta_mK(x_n,x_m)+C\sum_{n=1}^N max(0,|\sum_{k=1}^N\beta_kK(x_k,x_n)+b-y_n|-\epsilon)^2
67 | $$
68 | 直接对上式对$\beta_m$求导可得:
69 | $$
70 | \sum_{n=1}^N\beta_nK(x_n,x_m)-2C\sum_{n=1}^N [|y_n-s_n|\ge\epsilon](|y_n-s_n|-\epsilon)sign(y_n-s_n)K(x_n,x_m)
71 | $$
72 | 其中需要注意: 当$|y_n-s_n|\gt\epsilon$时,后一项才不为$0$,此外,后一项求导时,注意正负号。
73 |
74 | ### 问题6
75 |
76 | Q6:考虑$T+1$个假设函数$g_0,g_1,...,g_T$,且令$g_0(x)=0,\forall x$。假设有一个潜在的测试集$\{(\hat{x}_m,\hat{y}_m)\}_{m=1}^M$,这个测试集中$\hat{x}_m$你已知,但$\hat{y}_m$你不知道。但是,假设你知道这个测试集对于每一个假设函数的平方误差为$E_{test}(g_t)=\frac{1}{M}\sum_{m=1}^M(g_t(\hat{x}_m)-\hat{y}_m)^2=e_t$,并假设$s_t=\frac{1}{M}\sum_{m=1}^M(g_t(\hat{x}_m))^2=s_t$。则$\sum_{m=1}^Mg_t(\hat{x}_m)\hat{y}_m$可以如何表示?
77 |
78 | A6:通过对$E_{test}$展开不难发现:
79 | $$
80 | e_t=\frac{1}{M}\sum_{m=1}^M(g_t(\hat{x}_m)^2-2g_t(\hat{x}_m)\hat{y}_m+\hat{y}_m^2)=s_t-2\frac{1}{M}\sum_{m=1}^Mg_t(\hat{x}_m)\hat{y}_m+e_0\\
81 | \sum_{m=1}^Mg_t(\hat{x}_m)\hat{y}_m=(s_t+e_0-e_t)M/2
82 | $$
83 | "虽然给出了答案,但隐藏在这个背后的核心思想还不是很了解?---是想求出cross-entropy?"
84 |
85 | ### 问题7
86 |
87 | Q7:考虑目标函数$f(x)=x^2:[0,1]\to R$,且输入为$[0,1]$上的均匀分布。假设训练集仅为两个样本,且不存在噪声,采用基于均方损失的线性回归函数$h(x)=w_1x+w_0$进行拟合。求所有假设函数集的数学期望?
88 |
89 | A7:由于训练样本是“随机产生”的,且来自均匀分布。假设两个训练样本为$(x_1,x_1^2),(x_2,x_2^2)$,则其对应的损失函数为$L=(w_1x_1+w_0-x_1^2)^2+(w_1x_2+w_0-x_2^2)^2$,对其求导求最优解可得:
90 | $$
91 | \frac{\partial L}{\partial w_1}=2(w_1x_1+w_0-x_1^2)x_1+2(w_1x_2+w_0-x_2^2)x_2=0\\
92 | \frac{\partial L}{\partial w_0}=2(w_1x_1+w_0-x_1^2)+2(w_1x_2+w_0-x_2^2)=0
93 | $$
94 | 联立两式可以求解的:$w_1=x_1+x_2,w_0=-x_1x_2$,从而最优函数$g=(x_1+x_2)x-x_1x_2$。根据$x_1,x_2$的选取的所有可能,从而可以获得:
95 | $$
96 | \bar{g}=lim_{T\to\infty}\frac{1}{T}\sum g_t=E(x_1+x_2)x-E(x_1x_2)=x-\frac{1}{4}
97 | $$
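
可以用简单的模拟来验证上述期望(非题目要求的代码):反复均匀抽取$x_1,x_2$并对得到的$g$取平均,斜率应趋近$1$,截距应趋近$-1/4$。

```python
import numpy as np

np.random.seed(0)
x1, x2 = np.random.rand(100000), np.random.rand(100000)
print((x1 + x2).mean())      # w1的平均,约为1
print((-x1 * x2).mean())     # w0的平均,约为-0.25
```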
98 |
99 | ### 问题8
100 |
101 | Q8:假设AdaBoost中采用的是线性回归函数(用于分类)。则我们可以将问题转化为下述带权值的优化问题:
102 | $$
103 | min_{w}E_{in}^u(w)=\frac{1}{N}\sum_{n=1}^Nu_n(y_n-w^Tx_n)^2
104 | $$
105 | 上述的优化问题其实可以视为采用“变形”数据集$\{(\bar{x}_n,\bar{y}_n)\}_{n=1}^N$的普通线性回归常规$E_{in}$的形式,则其“变形”数据$(\bar{x}_n,\bar{y}_n)$与原数据集的关系为?
106 |
107 | A8:显然直接将$u_n$放进平方项里面便可获得答案:$(\sqrt{u_n}x_n,\sqrt{u_n}y_n)$
108 |
109 | ### 问题9
110 |
111 | Q9:考虑将AdaBoost算法运用到一个$99\%$数据均为+1的分类问题的样本集上。正是因为有这么多+1的样本,获得第一个最佳假设函数为$g_1(x)=+1$。令$u_{+}^{(2)},u_{-}^{(2)}$分别为第二次迭代时正负样本前的参数,则对应的$u_{+}^{(2)}/u_{-}^{(2)}$的结果为多少?
112 |
113 | A9:根据Adaboost的“参数更新规则”,可知:
114 | $$
115 | \frac{u_{+}^{(2)}}{u_{-}^{(2)}}=\frac{\epsilon}{1-\epsilon}=\frac{1}{99}
116 | $$
117 |
118 | ### 问题10
119 |
120 | Q10:在非均匀投票构成的“集成学习”中,存在一个权值向量与下述每次获得的最佳假设函数集进行相乘:
121 | $$
122 | \phi(x)=(g_1(x),g_2(x),...,g_T(x))
123 | $$
124 | 在学习kernel模型时,kernel可以视为简单的内积运算:$\phi(x)^T\phi(x^\prime)$。在这个问题中,将这两个主题在决策树桩问题中融合起来。
125 |
126 | 假设输入变量$x$每一维只取$[L,R]$上的整数,定义下述决策树桩:
127 | $$
128 | g_{s,i,\theta}(x)=s\cdot sign(x_i-\theta)\\
129 | where\quad i\in\{1,2,...,d\},d为x的维数\\
130 | s\in\{-1,+1\},\theta\in\mathbb{R},sign(0)=+1
131 | $$
132 | 如果两个决策树桩在任意$x\in\mathcal{X}$上均有$g(x)=\hat{g}(x)$,则认为这两个决策树桩相同。下述哪些表述是正确的?
133 | a. 决策树桩的数量与$\mathcal{X}$的大小有关
134 | b. $g_{s,i,\theta}$和$g_{s,i,ceiling(\theta)}$相等。其中$ceiling(\theta)$是指$\ge\theta$的最小整数
135 | c. $\mathcal{X}$的大小为无穷
136 | d. $d=2,L=1,R=6$时,有24种不同的决策树桩
137 |
138 | A10:首先需指出,可以将$[L,R]$之间按整数划分为$R-L$段,且不同维度的情况互不干扰,所以有$2d(R-L)$种情况;又因为还有全正、全负两种情况,而这两种情况对全部维度是等价的(根据上述定义的等价性可知),从而总共有$total=2d(R-L)+2$种情况。由此可以知道$a,c,d$均错误。
139 | $g_{s,i,\theta}$和$g_{s,i,ceiling(\theta)}$对应的决策树桩对于全部$x$均是等价的,因此是相等的。
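
这个计数也可以用穷举验证(非题目要求的代码,取$d=2,L=1,R=6$,把在全体$\mathcal{X}$上输出相同的决策树桩视为同一个):

```python
import numpy as np
from itertools import product

L, R, d = 1, 6, 2
points = np.array(list(product(range(L, R + 1), repeat=d)))       # 全体X
patterns = set()
for s in (+1, -1):
    for i in range(d):
        for theta in np.arange(L - 0.5, R + 1, 1.0):               # 每个等价区间取一个阈值
            out = s * np.where(points[:, i] - theta >= 0, 1, -1)   # sign(0)=+1
            patterns.add(tuple(out))
print(len(patterns))    # 2d(R-L)+2 = 22
```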
140 |
141 | ### 问题11
142 |
143 | Q11:Q10的延伸,假设$\mathcal{G}=\{\mathcal{X}上全部可能的决策树桩\}$,并通过下标将其全部罗列出来如下:
144 | $$
145 | \phi_{ds}(x)=(g_1(x),g_2(x),...,g_t(x),...,g_{|\mathcal{G}|}(x))
146 | $$
147 | 则:$K_{ds}(x,x^\prime)=\phi_{ds}(x)^T\phi_{ds}(x^\prime)$的一种等价表示方式为什么?其中$||v||_1$表示一阶范数。
148 |
149 | A11:首先可以给出$K$的具体表达式如下所示:
150 | $$
151 | K_{ds}=sign(x_{i1}-\theta_1)sign(x_{i1}^\prime-\theta_1)+sign(x_{i2}-\theta_2)sign(x_{i2}^\prime-\theta_2)+...+sign(x_{i\mathcal{G}}-\theta_{\mathcal{G}})sign(x_{i\mathcal{G}}^\prime-\theta_\mathcal{G})
152 | $$
153 | 以上囊括了所有的$s,i,\theta$的可能,因此总共包含$2d(R-L)+2$项,现在进一步查看有多少项为$+1$,有多少项为$-1$。查看下述这种情况:
154 | 
155 | 对于上述含有$a$个整数的情况,总共有$2(a+1)$种假设函数使得其分类结果乘积为$-1$。因此对于给定的$x,x^\prime$($x,x^\prime$的每一维度上均为整数),总共包含的分类结果乘积为$-1$的数目为:$2||x-x^\prime||_1$,所以分类结果乘积为$+1$的数目为:$2d(R-L)+2-2||x-x^\prime||_1$。所以最终答案为+1项数目减去-1项数目:$2d(R-L)+2-4||x-x^\prime||_1$
156 |
157 |
--------------------------------------------------------------------------------
/lecture/MLT2-1/pic1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2-1/pic1.png
--------------------------------------------------------------------------------
/lecture/MLT2-2.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习技法作业2:part2
3 | date: 2017-02-17 19:28:21
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习技法课后作业2-2:对应题目12~题目20
9 |
10 |
11 | ## 机器学习技法作业2
12 |
13 | ## 问题12-18
14 |
15 | 通过PPT208介绍的adaboost算法,实现一个AdaBoost-Stump算法。并在给定的训练集和测试集上进行运行。
16 | 采用$T=300$次迭代(即采用300个决策树桩函数),最终的$E_{in}$和$E_{out}$采用$0/1$误差
17 |
18 | 决策树桩的实现可根据下述的步骤进行:
19 | ① 对于任意特征$i$,可以先对该维度数据$x_{n,i}$进行排序,排序后满足:$x_{|n|,i}\le x_{|n+1|,i}$
20 | ② 考虑截距从$-\infty$以及所有的中点$(x_{|n|,i}+x_{|n+1|,i})/2$,并结合$s=\{+1,-1\}$通过最小化该维度$i$对应的最小$E_{in}^u$来获得最佳的$(s,\theta)$
21 | ③ 针对所有的维度,选取最佳的$(s,i,\theta)$
22 |
23 | ```python
24 | # 导入库
25 | import numpy as np
26 | import pandas as pd
27 | import scipy.linalg as lin
28 | ```
29 | ```python
30 | # 加载数据函数
31 | def loadData(filename):
32 | data = pd.read_csv(filename, sep='\s+', header=None)
33 | data = data.as_matrix()
34 | col, row = data.shape
35 | X = data[:, 0: row-1]
36 | Y = data[:, row-1:row]
37 | return X, Y
38 | ```
39 | ```python
40 | # 决策树桩
41 | def decision_stump(X, Y, thres, U):
42 | row, col = X.shape
43 | r, c = thres.shape; besterr = 1
44 | btheta = 0; bs = 0; index = 0
45 | for i in range(col):
46 | Yhat1 = np.sign(np.tile(X[:, i:i+1], (1, r)).T-thres[:, i:i+1]).T
47 | err1 = (Yhat1!=Y).T.dot(U)
48 | err2 = (-1*Yhat1!=Y).T.dot(U)
49 | s = 1 if np.min(err1) < np.min(err2) else -1
50 | if s == 1 and np.min(err1) < besterr:
51 | besterr = np.min(err1); bs = 1
52 | index = i; btheta = thres[np.argmin(err1), i]
53 | if s == -1 and np.min(err2) < besterr:
54 | besterr = np.min(err2); bs = -1
55 | index = i; btheta = thres[np.argmin(err2), i]
56 | return besterr, btheta, bs, index
57 | ```
58 | ```python
59 | # AdaBoost---Stump 算法
60 | # 需要说明: 与PPT上有点不同,始终保证sum(U)=1
61 | def ada_boost(X, Y, T):
62 | row, col = X.shape
63 | U = np.ones((row, 1))/row
64 | Xsort = np.sort(X, 0)
65 | thres = (np.r_[Xsort[0:1, :] - 0.1, Xsort[0:row - 1, :]] + Xsort) / 2
66 | theta = np.zeros((T,)); s = np.zeros((T,));
67 | index = np.zeros((T,)).astype(int); alpha = np.zeros((T,))
68 | err = np.zeros((T,))
69 | for i in range(T):
70 | err[i], theta[i], s[i], index[i] = decision_stump(X, Y, thres, U)
71 | yhat = s[i]*np.sign(X[:, index[i]:index[i]+1]-theta[i])
72 | delta = np.sqrt((1-err[i])/err[i])
73 | U[yhat==Y] /= delta
74 | U[yhat!=Y] *= delta
75 | # Q14运行时,解除注释
76 | # if i == T-1:
77 | # print('sum(U): ', np.sum(U))
78 | alpha[i] = np.log(delta)
79 | U /= np.sum(U)
80 | # Q15运行时,解除注释
81 | # print('最小的epsilon: ', np.min(err))
82 | return theta, index, s, alpha
83 | ```
84 | ```python
85 | # 预测函数
86 | def predict(X, theta, index, s, alpha):
87 | row, col = X.shape
88 | num = len(theta)
89 | ytemp = np.tile(s.reshape((1, num)), (row, 1))*np.sign(X[:, index]-theta.reshape((1, num)))
90 | yhat = np.sign(ytemp.dot(alpha.reshape(num, 1)))
91 | return yhat
92 | ```
93 | ```python
94 | # 导入数据
95 | X, Y = loadData('hw2_adaboost_train.dat')
96 | Xtest, Ytest = loadData('hw2_adaboost_test.dat')
97 | row, col = X.shape
98 | r, c = Xtest.shape
99 | ```
100 | ### 问题12
101 |
102 | Q12:$E_{in}(g_1)$的结果为多少?
103 |
104 | A12:令T=1
105 |
106 | ```python
107 | # Q12
108 | theta, index, s, alpha = ada_boost(X, Y, 1)
109 | Ypred = predict(X, theta, index, s, alpha)
110 | print('Ein(g1):', np.sum(Ypred!=Y)/row)
111 | ```
112 |
113 | Ein(g1): 0.24
114 | ### 问题13
115 |
116 | Q13:$E_{in}(G)$的结果为多少?
117 |
118 | A13:T=300
119 |
120 | ```python
121 | # Q13
122 | theta, index, s, alpha = ada_boost(X, Y, 300)
123 | Ypred = predict(X, theta, index, s, alpha)
124 | print('Ein(G):', np.sum(Ypred!=Y)/row)
125 | ```
126 |
127 | Ein(G): 0.0
128 | ### 问题14
129 |
130 | Q14:令$U_t=\sum_{n=1}^Nu_n^{(t)}$,$U_2$的值是多少($U_1=1$)?
131 |
132 | A14:
133 |
134 | ```python
135 | # Q14 --- 打开上述注释项,在运行一次
136 | theta, index, s, alpha = ada_boost(X, Y, 1)
137 | ```
138 |
139 | sum(U): 0.854166260163
140 | ### 问题15
141 |
142 | Q15:$U_T$的值是多少?
143 |
144 | A15:该问题在将$U$做“和为1”归一化之后无法给出理想答案;但若取消“和为1”的处理,又会出现数值问题导致程序出错。希望知道如何改善的读者给点提示。
145 |
146 | ### 问题16
147 |
148 | Q16:对于$t=1,2,...,300$的所有$\epsilon_t$中,最小的值为多少?
149 |
150 | A16:
151 |
152 | ```python
153 | # Q16
154 | theta, index, s, alpha = ada_boost(X, Y, 300)
155 | ```
156 |
157 | 最小的epsilon: 0.178728070175
158 |
159 | ### 问题17
160 |
161 | Q17:对测试集计算$E_{out}$,对应的$E_{out}(g_1)$为多少?
162 |
163 | A17:
164 |
165 | ```python
166 | # Q17
167 | theta, index, s, alpha = ada_boost(X, Y, 1)
168 | Ypred = predict(Xtest, theta, index, s, alpha)
169 | print('Eout(g1):', np.sum(Ypred!=Ytest)/r)
170 | ```
171 |
172 | Eout(g1): 0.29
173 |
174 | ### 问题18
175 |
176 | Q18:对测试集计算$E_{out}$,对应的$E_{out}(G)$为多少?
177 |
178 | A18:
179 |
180 | ```python
181 | # Q18
182 | theta, index, s, alpha = ada_boost(X, Y, 300)
183 | Ypred = predict(Xtest, theta, index, s, alpha)
184 | print('Eout(G):', np.sum(Ypred!=Ytest)/r)
185 | ```
186 |
187 | Eout(G): 0.132
188 |
189 | ## 问题19-20
190 |
191 | 根据Lec206实现一个kernel ridge regression算法,并将其运用到分类问题中。根据给定的数据集,取其前400个样本作为训练集,剩余的样本作为测试集。利用$0/1$误差计算$E_{in}$和$E_{out}$。考虑采用高斯核$exp(-\gamma ||x-x^\prime||^2)$,并尝试下述所有的参数$\gamma\in\{32,2,0.125\}$和$\lambda\in\{0.001,1,1000\}$
192 |
193 | ```python
194 | # ----------- Q19-20 --------------
195 | # 获得对偶矩阵K
196 | def matK(X, X1, gamma):
197 | row, col =X.shape
198 | r, c = X1.shape
199 | K = np.zeros((row, r))
200 | for i in range(r):
201 | K[:, i] = np.sum((X-X1[i:i+1, :])**2, 1)
202 | K = np.exp(-gamma*K)
203 | return K
204 | ```
205 | ```python
206 | # 加载数据
207 | X, Y = loadData('hw2_lssvm_all.dat')
208 | Xtrain = X[0:400, :]; Ytrain = Y[0:400, :]
209 | Xtest = X[400:, :]; Ytest = Y[400:, :]
210 | row, col = Xtest.shape
211 | ```
212 | ### 问题19&问题20
213 |
214 | Q19&Q20:以上给出的所有参数组合中,最小的$E_{in}(g)$和最小的$E_{out}(g)$分别是多少?
215 |
216 | A19&A20:
217 |
218 | ```python
219 | # 测试
220 | gamma = [32, 2, 0.125]
221 | lamb = [0.001, 1, 1000]
222 | Ein = np.zeros((len(gamma), len(lamb)))
223 | Eout = np.zeros((len(gamma), len(lamb)))
224 | for i in range(len(gamma)):
225 | K = matK(Xtrain, Xtrain, gamma[i])
226 | K2 = matK(Xtrain, Xtest, gamma[i])
227 | for j in range(len(lamb)):
228 | beta = lin.pinv(lamb[j]*np.eye(400)+K).dot(Ytrain)
229 | yhat = np.sign(K.dot(beta))
230 | Ein[i, j] = np.sum(yhat != Ytrain)/400
231 | yhat2 = np.sign(K2.T.dot(beta))
232 | Eout[i, j] = np.sum(yhat2 != Ytest)/row
233 | print('最小的Ein: ', np.min(Ein))
234 | print('最小的Eout: ', np.min(Eout))
235 | ```
236 |
237 | 最小的Ein: 0.0
238 | 最小的Eout: 0.39
239 |
--------------------------------------------------------------------------------
/lecture/MLT2.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习技法Lec5-Lec8
3 | date: 2017-02-17 20:27:17
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习技法Lec5-Lec8主要知识点:对应作业2
9 |
10 |
11 | ## 基于核函数的Logistic回归
12 |
13 | ### 柔性SVM与含正则项模型
14 |
15 | 根据柔性SVM的定义,可以获得其等价的无约束情况:
16 | 
17 | 从无约束SVM的表达式可见,其非常“相似于”加入了$L_2$正则项的模型:
18 | 
19 | 由上可见,柔性SVM本质上就是一个含正则项的模型:它采用一种特殊的$\hat{err}$,并用参数$C$(替代$\lambda$)来控制对更小$w$的偏好。
20 |
21 | **结合上篇文章中指出的,不管是柔性SVM还是hard-margin SVM均可以视为含正则项的模型**:
22 | 
23 | 上述表格中包含一定的对应关系:
24 | ① 大的边界$\iff$更少可供选择的超平面$\iff$$L_2$正则化里将$w$限制的更小
25 | ② 柔性边界$\iff$特殊的损失函数$\hat{err}$
26 | ③ 大的“越界惩罚”$C$$\iff$更小的$\lambda\iff$“更弱”的正则化(其实就是更大的假设函数集)
27 |
28 | 将SVM看做含正则项的模型,这样更有助于理解和扩展
29 |
30 | ### 柔性SVM vs Logistic回归
31 |
32 | 柔性SVM的错误度量标准可以视为一种特殊的错误度量标准,将其与Logistic回归的错误度量标准进行比较:
33 | 
34 | 可以看出,当$ys\to-\infty$或$+\infty$时,$err_{SVM}\approx err_{SCE}$,所以SVM $\approx$ $L_2$正则化的Logistic回归
35 |
36 | ### SVM用于柔性二元分类
37 |
38 | 将SVM与Logistic回归结合起来,可以用于柔性二元分类(即以概率的形式给出结果),其具体的算法和优化目标如下所示:
39 | 
40 | 其中步骤①就是常规的SVM解法,步骤②可以采用梯度下降法来获得最优的$(A,B)$
41 |
42 | ### 表示理论(什么情况适用核函数)
43 |
44 | 首先给出表示理论的说明:
45 | 
46 | 给出解释:
47 | 
48 | 从而可见,对于任意$L_2$正则化的线性模型均可以采用核函数的方式!!!
49 |
50 | ### 基于核函数的logistic回归
51 |
52 | 将$w$最优解$w^\star=\sum_{n=1}^N\beta_nz_n$代入Logistic回归的目标函数,可以将其转换为核函数形式,且将求最优$w$的问题转变为求最优$\beta$的问题:
53 | 
54 | 从上述可见,目标函数变得非常类似于SVM。但需要注意的是,该目标函数解得的$\beta$并不像SVM求得的$\alpha$那样具有很强的稀疏性:核函数Logistic回归所求得的$\beta$大部分都是非零的。
55 | 可以如此理解:Logistic回归是将所有数据均考虑进来,并没有支撑向量这个概念;不管是否转换为核函数形式,每个数据在Logistic回归中自始至终都被使用着。
56 |
57 | ## SVR(Support Vector Regression)
58 |
59 | ### 基于核函数的Ridge回归问题(LLSVM)
60 |
61 | 根据上面的内容可知,根据表示规则,Ridge回归问题可以转变为等价的含核函数问题:将$w^\star=\sum_{n=1}^N\beta_nz_n$代入原目标函数即可:
62 | 
63 | 从而问题转变为求最优的$\beta$使得新的目标函数最小,对上式进行求导可得:
64 | $$
65 | \nabla E=\frac{2}{N}(\lambda K^TI\beta+K^TK\beta-K^TY)=\frac{2}{N}K^T((\lambda I+K)\beta-Y)\\
66 | \nabla E=0\to\beta=(\lambda I+K)^{-1}Y
67 | $$
68 |
69 | 从上式也不难发现,大部分$\beta$为非0,因此也为非稀疏的。
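
对应的极简实现示意如下(非课件代码,数据为随机构造,核取高斯核):

```python
import numpy as np

def gram(X1, X2, gamma=1.0):
    sq = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * sq)

np.random.seed(0)
X = np.random.randn(50, 2)
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(50)

lamb = 0.1
K = gram(X, X)
beta = np.linalg.solve(lamb * np.eye(len(y)) + K, y)   # beta = (lambda*I + K)^{-1} y
Xnew = np.random.randn(5, 2)
print(gram(Xnew, X) @ beta)                            # 预测: s(x) = sum_n beta_n K(x_n, x)
```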
70 |
71 | ### Tube Regression
72 |
73 | 类似于“柔性”SVM,tube回归在原有线性回归模型中增加一定的“容忍度”,在距离小于一定范围内,认为没有误差,而大于该范围,则产生误差(从而使得$\beta$为稀疏的)。定义损失标准函数如下:
74 | $$
75 | err(y,s)=max(0,|s-y|-\epsilon)
76 | $$
77 | 上述中容忍度为$\epsilon$。
78 |
79 | 从而带有$L_2$正则项的tube回归问题的目标函数为,以及为了更加与SVM形式对应,将$\lambda$替换为$C$:
80 | 
81 |
82 | ### Tube Regression$\to$SVR
83 |
84 | 类比于柔性SVM其含约束情况,可以将上述tube回归问题转化为含约束的情况:
85 | 
86 | 此时,后续的对偶化求解方式与SVM完全一样,可以自行推导获得其等价的对偶形式。其中$\alpha^{\land}_n$是约束$y_n-w^Tz_n-b\le\epsilon+\xi_n^{\land}$对应的拉格朗日乘子,$\alpha^{\lor}_n$是约束$y_n-w^Tz_n-b\ge-\epsilon-\xi_n^{\lor}$对应的拉格朗日乘子:
87 | 
88 | 此外,上述的$\beta$是稀疏的,且稀疏项对应关系如下所示:
89 | 
90 |
91 | ### 线性函数/核函数大集结
92 |
93 | 目前所有用到的线性模型以及其对应采用的核函数模型如下所示:
94 | 
95 | 需要说明几点:
96 | ① 第四行在LIBSVM中非常受欢迎
97 | ② 第三行很少使用,因为其对应的$\beta$为非稀疏的
98 | ③ 第二行在LIBLINEAR中非常受欢迎
99 | ④ 第一行很少很少使用,因为一般表现比较差
100 |
101 | ## 模型融合和Bagging方法
102 |
103 | ### 模型融合
104 |
105 | 模型融合的几种常见类型:
106 | 
107 | 模型融合的意义:
108 | 
109 | 从上述两种融合情况来看:融合或许能起到① feature transform的作用 ② regularization的作用
110 |
111 | ### 基于均匀混合回归模型解释融合背后的思想
112 |
113 | 下述情况中需说明:① avg是指针对多个假设函数的平均 ② $E..or..\mathcal{E}$是指在不同数据集上的数学期望
114 | 
115 | 从上述结果不难看出:一系列假设函数$g_t$的误差平均要大于一系列假设函数平均的误差。
116 |
117 | 下面根据上述情况的一种特例来进一步说明混合模型的优势:
118 | 
119 | 从上述特例可以很显著的看到均匀混合模型能够减少variance,从而提高模型的稳定性
120 |
121 | ### 混合模型$\iff$线性模型+特征转换
122 |
123 | 常见混合模型的目标函数:
124 | 
125 | 从上式不难发现:**线性混合模型=线性模型+特征转换(将假设函数视为特征转换)+约束条件**
126 |
127 | 比如线性融合的线性回归模型等价于线性回归+特征转换
128 | 
129 | 所以对于任意的混合模型可以写成其对应的特征转换模型:
130 | 
131 | 其中$\Phi(x)=(g_1(x),g_2(x),...,g_T(x))$
132 |
133 | ### Bagging(Bootstrap Aggregation)
134 |
135 | Bagging的定义:
136 | 
137 | 从理想的模型$\Longrightarrow$Bagging模型
138 | 
139 |
140 | ## Adaptive Boosting(AdaBoost)算法
141 |
142 | ### 调整权值来体现多样性
143 |
144 | 通过加入权值,可以起到类似于Bagging的作用:
145 | 
146 | 通过加入权值,可以间接增加假设函数集的多样性(获得不同的最优假设函数):
147 | 
148 |
149 | ### AdaBoost算法
150 |
151 | 
152 | 几点核心思想:1. “放大”上次分类错误的样本,“隐藏”上次分类正确的样本 2. 根据每种最佳函数的错误率来对每种权值情况下的最佳函数分配不同的权重
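
具体地,记第$t$轮的带权错误率为$\epsilon_t$,则上述算法中权值与投票系数的更新按惯用记号可写为:
$$
u_n^{(t+1)}=\begin{cases}u_n^{(t)}\cdot\sqrt{\frac{1-\epsilon_t}{\epsilon_t}}, & y_n\ne g_t(x_n)\\ u_n^{(t)}\Big/\sqrt{\frac{1-\epsilon_t}{\epsilon_t}}, & y_n=g_t(x_n)\end{cases},\qquad \alpha_t=\ln\sqrt{\frac{1-\epsilon_t}{\epsilon_t}}
$$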
153 |
154 | 上述算法的一个可改动之处:为了避免$u^{(t)}\to0$的情况出现,可以在②的最后加上$u/sum(u)$,使权值之和始终保持为1。
155 |
156 | ### AdaBoost背后的理论依据
157 |
158 | 可以根据VC bound理论给出下列结果:
159 | 
160 | 上述结果可以说明AdaBoost的可行性。
--------------------------------------------------------------------------------
/lecture/MLT2/pic1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic1.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic10.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic11.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic11.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic12.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic12.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic13.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic13.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic14.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic14.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic15.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic15.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic16.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic16.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic17.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic17.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic18.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic18.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic19.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic19.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic2.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic20.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic20.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic21.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic21.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic22.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic22.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic23.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic23.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic24.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic24.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic25.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic25.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic26.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic26.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic27.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic27.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic3.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic4.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic5.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic6.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic7.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic8.png
--------------------------------------------------------------------------------
/lecture/MLT2/pic9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT2/pic9.png
--------------------------------------------------------------------------------
/lecture/MLT3-1.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: 机器学习技法作业3:part1
3 | date: 2017-02-21 16:46:50
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | 机器学习技法课后作业3-1:对应题目1~题目12
9 |
10 |
11 | ## 机器学习技法作业3
12 |
13 | ### 问题1
14 |
15 | Q1~Q2:关于决策树的问题
16 | “纯度函数”在决策树建立分支时(进行划分)时扮演重要作用。对于二分类问题,令$\mu_{+}$代表样本中正样本的比例。则$\mu_{-}$代表负样本的比例
17 |
18 | Q1:根据Gini系数的定义,不难得到二分类问题的Gini系数为$1-\mu_{+}^2-\mu_{-}^2$。则当$\mu_{+}\in[0,1]$取何值时,Gini系数最大?
19 |
20 | A1:相当于一个“优化问题”,二分类问题中,Gini系数可以表示为$\mu_{+}$的函数:
21 | $$
22 | 1-\mu_{+}^2-\mu_{-}^2=1-\mu_{+}^2-(1-\mu_{+})^2=-2(\mu_{+}-0.5)^2+0.5
23 | $$
24 | 从而当$\mu_{+}=0.5$时取到最大值$0.5$
25 |
26 | Q2:存在下述四种“纯度函数”,为了能够建立比较,将他们除以各自的最大值归一化到$[0,1]$。经过归一化处理后,以下哪一种“纯度函数”与归一化了的Gini系数等价?
27 | (a). classification error: $min(\mu_{+},\mu_{-})$
28 | (b). the squared regression error: $\mu_{+}(1-(\mu_{+}-\mu_{-}))^2+\mu_{-}(-1-(\mu_{+}-\mu_{-}))^2$
29 | (c). the closeness: $1-|\mu_{+}-\mu_{-}|$
30 | (d). the entropy: $-\mu_{+}ln(\mu_{+})-\mu_{-}ln(\mu_{-}),with..0log0=0$
31 |
32 | A2:首先可以知道Gini系数的归一化后的表达式为(为了方便,令$x=\mu_{+}$):$4x-4x^2$
33 | (a). 归一化后:$2min(x,1-x)$
34 | (b). 原式=$x(2-2x)^2+(1-x)(2x)^2=4x(1-x)(1-x+x)=4x-4x^2$
35 | (c). 原式=$1-|x-(1-x)|=1-|2x-1|$
36 | (d). 原式=$-xlnx-(1-x)ln(1-x)$
37 | 显然不难知,(b)与归一化后的Gini系数相等
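
也可以做个简单的数值检查(非题目要求的代码,在$\mu_+$的网格上比较(b)归一化后与归一化Gini):

```python
import numpy as np

mu = np.linspace(0.01, 0.99, 99)
gini = (1 - mu**2 - (1 - mu)**2) / 0.5                           # 归一化Gini = 4x-4x^2
sqr = mu * (1 - (2*mu - 1))**2 + (1 - mu) * (-1 - (2*mu - 1))**2
print(np.allclose(gini, sqr / sqr.max()))                        # True,两者一致
```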
38 |
39 | ### 问题2
40 |
41 | Q3~Q5:关于随机森林
42 |
43 | Q3:如果从$N$个样本中bootstrapping出$N^\prime=pN$个数据出来,且假设$N$非常大,则估计有多少个样本没有被取到?
44 |
45 | A3:一个样本没被取到的概率为$1-1/N$,从而对其中任意一个样本,$N^\prime$次未被取到的概率为:
46 | $$
47 | lim_{N\to\infty}(1-\frac{1}{N})^{pN}=(lim_{N\to\infty}(1-\frac{1}{N})^{N})^p=e^{-p}
48 | $$
49 | 从而总的未被取到的样本为:$e^{-p}N$
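
该结论可用模拟快速验证(非题目要求的代码,取$p=1$,理论上约有$e^{-1}\approx0.368$比例的样本未被取到):

```python
import numpy as np

np.random.seed(0)
N, p = 10000, 1.0
ratios = []
for _ in range(20):
    picked = np.random.randint(0, N, size=int(p * N))   # 有放回地抽取pN次
    ratios.append(1 - len(np.unique(picked)) / N)        # 未被取到的比例
print(np.mean(ratios), np.exp(-p))                       # 两者应接近
```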
50 |
51 | Q4:假设随机森林的函数$G$由三个二元分类树$\{g_k\}_{k=1}^3$构成,且三个分类树的测试误差分别为$E_{out}(g_1)=0.1$,$E_{out}(g_2)=0.2$,$E_{out}(g_3)=0.3$。则$E_{out}(G)$的可能范围是多少?
52 |
53 | A4:两个极限情况,可以根据下述两幅图说明(黄色---$g_1$错误的区域,橙色---$g_2$错误的区域,绿色---$g_3$错误的区域),且图二中绿色恰好被其他两色挡住。从而可以知道$0\le E_{out}(G)\le 0.3$
54 | 
55 |
56 | Q5:考虑一个含有$K$个二元分类树$\{g_k\}_{k=1}^K$的随机森林$G$,其中$K$为奇数,且每个分类树在测试集上的误差分别为$E_{out}(g_k)=e_k$,求$E_{out}(G)$的上界?
57 |
58 | A5:假设整个测试集(全部数据集)离散化,一个点被随机森林最终错误的情况必须满足至少$(K+1)/2$个分类树都将其分错。从而一个错误点对应$\ge(K+1)/2$次错误。而总的错误次数为$\sum_{k=1}^Ke_k$,所以最极端情况下,存在$\frac{2}{K+1}\sum_{k=1}^Ke_k$个错误点。因此$E_{out}(G)\le\frac{2}{K+1}\sum_{k=1}^Ke_k$。当然也可以通过简单的实例来获得答案。
59 |
60 | ### 问题3
61 |
62 | Q6~Q8:关于Gradient Boosting,需要注意的是,Q7和Q8涉及的是回归问题,而不是分类问题。
63 |
64 | Q6:令$\epsilon_t$为AdaBoost算法中$g_t$关于加入权值情况下的$0/1$误差。且令$U_t=\sum_{n=1}^Nu_n^{(t)}$,则以下用$\epsilon_t$表示的哪条表达式等于$U_{T+1}$?
65 |
66 | A6:根据Lec11第7页可知,且不妨假设$[y_n\ne g_t(x_n)]$有$n1$个,$[y_n= g_t(x_n)]$有$n2$个:
67 | $$
68 | U_{T+1}=\sum_{n=1}^Nu_n^{(T+1)}=\sum_{n=1}^Nu_n^{(T)}\sqrt{\frac{1-\epsilon_T}{\epsilon_T}}^{-y_ng_T(x_n)}=\sum_{n_1=1}^{n1}u_{n_1}^{(T)}\sqrt{\frac{1-\epsilon_T}{\epsilon_T}}^{1}+\sum_{n_2=1}^{n2}u_{n_2}^{(T)}\sqrt{\frac{1-\epsilon_T}{\epsilon_T}}^{-1}\\
69 | =\sum_{n=1}^Nu_n^{(T)}\epsilon_T\sqrt{\frac{1-\epsilon_T}{\epsilon_T}}^{1}+\sum_{n=1}^Nu_n^{(T)}(1-\epsilon_T)\sqrt{\frac{1-\epsilon_T}{\epsilon_T}}^{-1}=2\sqrt{(1-\epsilon_T)\epsilon_T}\sum_{n=1}^Nu_n^{(T)}\\
70 | ...=\prod_{t=1}^T2\sqrt{(1-\epsilon_t)\epsilon_t}
71 | $$
72 |
73 | 上面的表达式中用到了一条(根据$\epsilon$定义而来的):$\sum_{n_1=1}^{n1}u_{n_1}^{(T)}=\epsilon_TU_T$
74 |
75 | Q7:对于gradient boosted决策树,如果树仅仅只有一个常数节点作为返回,且为$g_1(x)=2$。则经过第一次迭代之后,全部的$s_n$从$0$更新到一个新的数$\alpha_1g_1(x_n)$,求更新后的$s_n$?
76 |
77 | A7:直接可知$g_1(x_n)=2$,之后便是求解$\alpha_1$,根据Lec11p17可知,最佳的$\alpha_1\to\eta$:
78 | $$
79 | min_\eta\frac{1}{N}\sum_{n=1}^N((y_n-s_n^{(0)})-\eta g_1(x_n))^2\to \nabla E=0\to \eta=\frac{1}{2N}\sum_{n=1}^Ny_n
80 | $$
81 | 所以最终结果为$s_n^{(1)}=\alpha_1g_1(x_n)=2\eta=\frac{1}{N}\sum_{n=1}^Ny_n$
82 |
83 | Q8:对于gradient boosted决策树而言,利用下降最快的方向$\eta$作为$\alpha_t$经过$t$轮迭代后为$s_n^{(t)}$,求$\sum_{n=1}^Ns_n^{(t)}g_t(x_n)$为多少?
84 |
85 | A8:根据Lec11p19可知,对于第$t$次迭代,最优的下降方向$\eta$为:
86 | $$
87 | \alpha_t=\eta=\frac{\sum_{n=1}^Ng_t(x_n)(y_n-s_n^{(t-1)})}{\sum_{n=1}^Ng_t^2(x_n)}\\\to \alpha_t\sum_{n=1}^Ng_t^2(x_n)+\sum_{n=1}^Ng_t(x_n)s_n^{(t-1)}=\sum_{n=1}^Ng_t(x_n)y_n
88 | $$
89 | 根据$s_n$的更新规则可以获得下式:
90 | $$
91 | \sum_{n=1}^Ns_n^{(t)}g_t(x_n)=\sum_{n=1}^N(s_n^{(t-1)}+\alpha_tg_t(x_n))g_t(x_n)=\sum_{n=1}^Ng_t(x_n)y_n
92 | $$
93 |
94 | ### 问题4
95 |
96 | Q9~Q12:关于神经网络的问题
97 |
98 | Q9:考虑一种特殊的神经网络,以$sign(s)$作为神经元(或称为转移函数),即多层感知机模型,且我们认为$+1$代表True,$-1$代表False。假设下面式子中$x_i$为$+1$或$-1$,则下述哪个参数的感知机模型能够实现$OR(x_1,x_2,...x_d)$
99 | $$
100 | g_A(x)=sign(\sum_{i=0}^dw_ix_i)
101 | $$
102 | A9:$OR$遵循“只要有一个$+1$就输出$+1$,全为$-1$才输出$-1$”的原则。设输入中有$k$个$+1$,取下述参数时$\sum_{i=0}^dw_ix_i=(d-1)+(2k-d)=2k-1$:当$k\ge1$时为正,当$k=0$时为$-1$,恰好实现$OR$:
103 | $$
104 | (w_0,w_1,w_2,...,w_d)=(d-1,+1,+1,...,+1)
105 | $$
106 | 注:其中假设$sign(0)=+1$
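
可以穷举全部$x\in\{-1,+1\}^d$来验证这组参数(非题目要求的代码,取$d=5$):

```python
import numpy as np
from itertools import product

d = 5
w = np.array([d - 1] + [1] * d)              # (w_0, w_1, ..., w_d)
for x in product([-1, 1], repeat=d):
    s = w @ np.r_[1, x]                      # x_0 = 1
    out = 1 if s >= 0 else -1                # sign(0) = +1
    assert out == (1 if +1 in x else -1)     # 与OR(x_1,...,x_d)一致
print('验证通过')
```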
107 |
108 | Q10:根据Q9种同样的神经网络模型,则实现$XOR(x_1,x_2,x_3,x_4,x_5)$最小需要几个隐藏层神经元(采用的神经元模型为$(5-D-1)$)?
109 |
110 | A10:首先理解$XOR(x_1,x_2,x_3)=XOR(XOR(x_1,x_2),x_3)$
111 | 方法1:基于$XOR$的性质不难知存在四种情况①+1遇-1$\to$不变②+1遇+1$\to$变-1③-1遇-1$\to$不变④-1遇+1$\to$变+1。从而可知,当$x_1,...,x_5$中存在奇数个$+1$时输出结果为+1,其他情况输出结果为-1。从而中间层包含5个神经元,分别含义为①存在5个+1 ②至少4个+1 ③至少3个+1 ④至少2个+1 ⑤至少一个+1。所构建的网络如下参数如下所示:
112 | $$
113 | w_{i1}^{(1)}=[-5,1,1,1,1,1]\quad w_{i2}^{(1)}=[-4,1,1,1,1,1]\quad w_{i3}^{(1)}=[-3,1,1,1,1,1]\\
114 | w_{i4}^{(1)}=[-2,1,1,1,1,1]\quad w_{i5}^{(1)}=[-1,1,1,1,1,1] \\
115 | w_{i1}^{(2)}=[-6,6,-6,6,-6,6]
116 | $$
117 | 其中$w_{i1}^{(2)}$根据满秩所以必然有解。
118 |
119 | 方法2:在第一层转换中$s=w^Tx,not..include..w_0$,($w_0$留着当“阈值”)则相当于将其转换到与$x$相同维数的空间上的某条直线上去(过原点),则问题转换为寻找一条最佳过原点的直线,使得所需的“分割情况”最少。先以二维为例,总共结果情况有四种,则最佳直线为其对角线方向(四个点构成正方形),如下图所示,则划分等价直线情况需要两个阈值,所以第二层神经元数目为2。
120 | 
121 | 以及三维情况如下所示,则划分等价直线情况需要三个阈值,所以第二层神经元数目为3:
122 | 
123 | 从而不难归纳出,对于$N$维情况,则划分其等价直线情况需要$N$个阈值,对应第二层神经元数目为$N$
124 |
125 | Q11:对于包含至少一层隐藏层的神经网络,采用$tanh(s)$作为神经元(包含输入层),且初始化$w_{ij}^{(l)}=0$,则下述关于梯度成分的描述正确的是?
126 | (a). 只有参数$w_{j1}^{L},j\gt0$对应的梯度可能为非0,其他梯度必然为0
127 | (b). 只有参数$w_{01}^{L}$对应的梯度可能为非0,其他梯度必然为0
128 | (c). 全部的梯度均为0
129 | (d). 只有参数$w_{0j}^{l},j\gt0$对应的梯度可能为非0,其他梯度必然为0
130 |
131 | A11:根据神经网络中梯度部分的求解方式Lec12p14可知:
132 | $$
133 | \frac{\partial E}{\partial w_{ij}^{(l)}}=\delta_j^{(l)}(x_i^{(l-1)})\\
134 | \delta_1^{(L)}=-2(y_n-s_1^{(L)})\\
135 | \delta_j^{(l)}=\sum_k(\delta_k^{(l+1)})(w_{jk}^{l+1})(tanh^\prime(s_j^{(l)}))
136 | $$
137 | 根据前向传播的规则可知,$x_i^{(l)}=0,l\ge1$,从而不难发现$\delta_j^{(l)}=0,l\lt L$,$\delta_1^{(L)}\ne0, if..y_n\ne0$,$x_i^{(L-1)}=0,i\ne 0..and..x_0^{(L-1)}=1$。根据这些条件可以知道,唯一可能非零项为$w_{01}^{(L)}$对应的梯度部分
138 |
139 | Q12:对于包含一层隐藏层的神经网络,采用$tanh(s)$作为神经元(包含输入层),且初始化$w_{ij}^{(l)}=1$,则下述关于反向传播算法的描述正确的是?
140 | (a). $w_{ij}^{(l)}$对应的梯度部分为0
141 | (b). $w_{ij}^{(1)}=w_{(i+1)j}^{1},for..1\le i\le d^{(0)}-1..and..all..j$
142 | (c). $w_{ij}^{(1)}=w_{i(j+1)}^{1},for..1\le j\le d^{(1)}-1..and..all..i$
143 | (d). 全部$w_{j1}^{(2)},j\gt0$均不同
144 |
145 | A12:显然通过Q11中的解答可以直接看出,$\delta_{j}^{(l)}$对于不同的$j$相等,从而可知(c)项正确。(b)项错误是因为输入的$x$每一维度未必相等。
--------------------------------------------------------------------------------
/lecture/MLT3-1/pic1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3-1/pic1.png
--------------------------------------------------------------------------------
/lecture/MLT3-1/pic2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3-1/pic2.png
--------------------------------------------------------------------------------
/lecture/MLT3-1/pic3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3-1/pic3.png
--------------------------------------------------------------------------------
/lecture/MLT3-2.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Machine Learning Techniques Homework 3 (Part 2)
3 | date: 2017-02-23 11:35:47
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | Machine Learning Techniques Homework 3, Part 2: Questions 13~20
9 |
10 |
11 | ## Machine Learning Techniques Homework 3
12 |
13 | ### Problem 5
14 |
15 | Q13~Q15: Questions about decision trees.
16 |
17 | Implement the C&RT algorithm introduced in class (Gini index as the impurity measure, no pruning) and run it on the given data sets.
18 |
19 | ```python
20 | import numpy as np  # used by all code below; presumably imported once at the top of the notebook
21 | # Tree-node class used to build the decision tree
22 | class Node:
23 |     def __init__(self, theta, index, value=None):
24 |         self.theta = theta       # threshold of the split
25 |         self.index = index       # feature (dimension) used for the split
26 |         self.value = value       # prediction stored at a leaf node
27 |         self.leftNode = None
28 |         self.rightNode = None
29 | ```
30 | ```python
31 | # Gini index --- impurity measure used to score each subset
32 | def gini(Y):
33 |     l = Y.shape[0]
34 |     if l == 0:
35 |         return 1
36 |     return 1-(np.sum(Y==1)/l)**2-(np.sum(Y==-1)/l)**2
37 | ```
38 | ```python
39 | # For simplicity, search for the best threshold (and branch value) in each single dimension --- the price is running speed
40 | # Best stump in one dimension --- samples with value >= threshold go to the right branch
41 | def one_stump(X, Y, thres):
42 |     l = thres.shape[0]
43 |     mini = Y.shape[0]
44 |     for i in range(l):
45 |         Y1 = Y[X < thres[i]]
46 |         Y2 = Y[X >= thres[i]]
47 |         judge = Y1.shape[0]*gini(Y1)+Y2.shape[0]*gini(Y2)
48 |         if mini > judge:
49 |             mini = judge; b = thres[i]
50 |     return mini, b
51 | ```
52 | ```python
53 | # Stopping condition for the recursive splitting:
54 | # stop when all labels are equal, only one sample is left, or all samples are identical
55 | def stop_cond(X, Y):
56 |     if np.sum(Y!=Y[0])==0 or X.shape[0]==1 or np.sum(X!=X[0, :])==0:
57 |         return True
58 |     return False
59 | ```
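The `dTree` function below calls a `decision_stump(X, Y)` helper that is not shown in this file (it is presumably defined elsewhere in the notebook). A minimal sketch of what such a helper could look like, assuming it scans every dimension with `one_stump` and uses midpoints of the sorted values as candidate thresholds:

```python
# Sketch only --- the actual helper in the notebook may differ
def decision_stump(X, Y):
    row, col = X.shape
    best = Y.shape[0]; b = 0; index = 0
    for d in range(col):
        xs = np.sort(X[:, d])
        thres = (np.r_[xs[0] - 0.1, xs] + np.r_[xs, xs[-1] + 0.1]) / 2   # midpoints
        mini, theta = one_stump(X[:, d], Y, thres)
        if mini < best:
            best = mini; b = theta; index = d
    return b, index
```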
60 | ```python
61 | # Fully grown decision tree
62 | def dTree(X, Y):
63 |     if stop_cond(X, Y):
64 |         node = Node(None, None, Y[0])
65 |         return node
66 |     b, index = decision_stump(X, Y)
67 |     pos1 = X[:, index] < b; pos2 = X[:, index] >= b
68 |     leftX = X[pos1, :]; leftY = Y[pos1, 0:1]
69 |     rightX = X[pos2, :]; rightY = Y[pos2, 0:1]
70 |     node = Node(b, index)
71 |     node.leftNode = dTree(leftX, leftY)
72 |     node.rightNode = dTree(rightX, rightY)
73 |     return node
74 | ```
75 | ```python
76 | # Predict a single sample with the decision tree
77 | def predict_one(node, X):
78 |     if node.value is not None:
79 |         return node.value[0]
80 |     thre = node.theta; index = node.index
81 |     if X[index] < thre:
82 |         return predict_one(node.leftNode, X)
83 |     else:
84 |         return predict_one(node.rightNode, X)
85 | ```
86 | ```python
87 | # Predictions of a tree on a data set and the corresponding error rate
88 | def err_fun(X, Y, node):
89 |     row, col = X.shape
90 |     Yhat = np.zeros(Y.shape)
91 |     for i in range(row):
92 |         Yhat[i] = predict_one(node, X[i, :])
93 |     return Yhat, np.sum(Yhat!=Y)/row
94 | ```
95 | Q13: How many internal nodes (i.e. nodes that split on a threshold, excluding the leaves) does the resulting decision tree $G$ have?
96 |
97 | A13: We need a function that counts the internal nodes of the tree
98 |
99 | ```python
100 | # Q13
101 | # Count the nodes of the tree --- leaf nodes are not counted
102 | def internal_node(node):
103 |     if node is None:
104 |         return 0
105 |     if node.leftNode is None and node.rightNode is None:
106 |         return 0
107 |     l = 0; r = 0
108 |     if node.leftNode is not None:
109 |         l = internal_node(node.leftNode)
110 |     if node.rightNode is not None:
111 |         r = internal_node(node.rightNode)
112 |     return 1 + l + r
113 |
114 | node = dTree(X, Y)
115 | print('Number of internal nodes of the fully grown tree:', internal_node(node))
116 | ```
117 |
118 | Number of internal nodes of the fully grown tree: 10
119 |
120 | Q14 and Q15: What are the training and test error rates $E_{in},E_{out}$ under the $0/1$ error measure?
121 |
122 | A14 and A15:
123 |
124 | ```python
125 | # Q14 and Q15
126 | _, ein = err_fun(X, Y, node)
127 | _, eout = err_fun(Xtest, Ytest, node)
128 | print('Ein: ', ein, '\nEout: ', eout)
129 | ```
130 |
131 | Ein: 0.0
132 | Eout: 0.126
133 |
134 | ### Problem 6
135 |
136 | Q16~Q18: The random forest algorithm
137 |
138 | Use bagging with $N^\prime=N$ together with the decision tree implemented above to build a random forest. Perform $T=300$ rounds of bagging (i.e. grow 300 trees) per run, and repeat the whole experiment 100 times (50 times here to save time), reporting the averaged $E_{in},E_{out}$
139 |
140 | ```python
141 | # bagging: draw N' = N examples with replacement
142 | def bagging(X, Y):
143 |     row, col = X.shape
144 |     pos = np.random.randint(0, row, (row,))
145 |     return X[pos, :], Y[pos, :]
146 | ```
147 | ```python
148 | # Random forest --- without random feature selection
149 | def random_forest(X, Y, T):
150 |     nodeArr = []
151 |     for i in range(T):
152 |         Xtemp, Ytemp = bagging(X, Y)
153 |         node = dTree(Xtemp, Ytemp)
154 |         nodeArr.append(node)
155 |     return nodeArr
156 | ```
157 | Q16~Q18: Report the average error $E_{in}(g_t)$ over the 30000 trees (15000 here), as well as the averaged $E_{in}(G_{RF}),E_{out}(G_{RF})$
158 |
159 | ```python
160 | # Q16, Q17, Q18
161 | ein = 0; eout = 0; err = 0
162 | for j in range(50):
163 |     nodeArr = random_forest(X, Y, 300)
164 |     l = len(nodeArr)
165 |     yhat1 = np.zeros((Y.shape[0], l))
166 |     yhat2 = np.zeros((Ytest.shape[0], l))
167 |     for i in range(l):
168 |         yhat1[:, i:i+1], _ = err_fun(X, Y, nodeArr[i])
169 |         yhat2[:, i:i+1], _ = err_fun(Xtest, Ytest, nodeArr[i])
170 |     errg = np.sum(yhat1!=Y, 0)/Y.shape[0]
171 |     Yhat = np.sign(np.sum(yhat1, 1)).reshape(Y.shape)
172 |     Ytesthat = np.sign(np.sum(yhat2, 1)).reshape(Ytest.shape)
173 |     Yhat[Yhat == 0] = 1; Ytesthat[Ytesthat == 0] = 1
174 |     ein += np.sum(Yhat!=Y)/Y.shape[0]
175 |     eout += np.sum(Ytesthat!=Ytest)/Ytest.shape[0]
176 |     err += np.sum(errg)/l
177 | print('Average Ein(gt):', err/50)
178 | print('Ein(G): ', ein/50)
179 | print('Eout(G): ', eout/50)
180 | ```
181 |
182 | Average Ein(gt): 0.0518873333333
183 | Ein(G): 0.0
184 | Eout(G): 0.07452
185 |
186 | ### Problem 7
187 |
188 | Q19~Q20: Random forest with 'extreme' pruning
189 |
190 | Restrict every decision tree to a single split, i.e. each tree becomes a decision stump. Use the same bagging strategy as in Problem 6, repeat the experiment and report the averaged $E_{in}(G_{RS}),E_{out}(G_{RS})$
191 |
192 | ```python
193 | # Decision tree that performs only a single split (extreme pruning)
194 | def dTree_one(X, Y):
195 |     b, index = decision_stump(X, Y)
196 |     pos1 = X[:, index] < b; pos2 = X[:, index] >= b
197 |     node = Node(b, index)
198 |     value1 = 1 if np.sign(np.sum(Y[pos1]))>=0 else -1
199 |     value2 = 1 if np.sign(np.sum(Y[pos2]))>=0 else -1
200 |     node.leftNode = Node(None, None, np.array([value1]))
201 |     node.rightNode = Node(None, None, np.array([value2]))
202 |     return node
203 | ```
204 | ```python
205 | # Random forest built from the pruned (one-split) trees
206 | def random_forest_pruned(X, Y, T):
207 |     nodeArr = []
208 |     for i in range(T):
209 |         Xtemp, Ytemp = bagging(X, Y)
210 |         node = dTree_one(Xtemp, Ytemp)
211 |         nodeArr.append(node)
212 |     return nodeArr
213 | ```
214 | A19~A20:
215 |
216 | ```python
217 | # Q19, Q20
218 | ein = 0; eout = 0
219 | for j in range(50):
220 |     nodeArr = random_forest_pruned(X, Y, 300)
221 |     l = len(nodeArr)
222 |     yhat1 = np.zeros((Y.shape[0], l))
223 |     yhat2 = np.zeros((Ytest.shape[0], l))
224 |     for i in range(l):
225 |         yhat1[:, i:i + 1], _ = err_fun(X, Y, nodeArr[i])
226 |         yhat2[:, i:i + 1], _ = err_fun(Xtest, Ytest, nodeArr[i])
227 |     Yhat = np.sign(np.sum(yhat1, 1)).reshape(Y.shape)
228 |     Ytesthat = np.sign(np.sum(yhat2, 1)).reshape(Ytest.shape)
229 |     Yhat[Yhat == 0] = 1
230 |     Ytesthat[Ytesthat == 0] = 1
231 |     ein += np.sum(Yhat != Y) / Y.shape[0]
232 |     eout += np.sum(Ytesthat != Ytest) / Ytest.shape[0]
233 | print('Ein: ', ein/50)
234 | print('Eout: ', eout/50)
235 | ```
236 |
237 | Ein: 0.1106
238 | Eout: 0.15336
239 |
240 |
--------------------------------------------------------------------------------
/lecture/MLT3.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Machine Learning Techniques Lec9-Lec12
3 | date: 2017-02-23 21:06:27
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | Key points of Machine Learning Techniques Lec9-Lec12, corresponding to Homework 3
9 |
10 |
11 | ## Decision Trees
12 |
13 | ### Three types of aggregation models
14 |
15 | The three common basic aggregation models are shown in the table below (① blending: how the $g_t$ are combined after they have been obtained ② learning: the aggregation strategy is built into the learning algorithm while the $g_t$ are being learned)
16 | 
17 | Most of the aggregation algorithms that follow are further combinations or refinements of these.
18 |
19 | ### Basic framework of the decision tree model
20 |
21 | 
22 |
23 | ### The CART algorithm (one kind of decision tree)
24 |
25 | Choice of the impurity() function: regression error is recommended for regression problems, the Gini index for classification problems
26 | 
27 | Choice of the stopping condition:
28 | 
29 | Implementation of the CART algorithm:
30 | 
31 |
32 | ### Problems with CART and improvements
33 |
34 | ① Overfitting: a fully grown tree easily overfits ($E_{in}=0$)
35 | Add a regularizer, e.g. a penalty on the number of **leaf nodes**, so that the tree stops 'growing' at some point
36 |
37 | ② Missing features: e.g. a feature $x_i$ is missing in the data to be predicted
38 | When the best split during training uses feature $x_i$, fall back to a 'surrogate' (second-best or other available) feature for that split. The price is that these surrogate features and their split criteria also have to be stored
39 |
40 | Advantages of CART trees:
41 | 
42 |
43 | ## Random Forest
44 |
45 | ### Framework of the random forest algorithm
46 |
47 | Random forest = bagging + fully-grown CART: it combines the variance-reducing effect of bagging with the high variance of fully grown CART trees.
48 | 
49 | Ways to increase the 'diversity' of the forest (i.e. of the individual trees) --- this is where the 'randomness' of random forests comes from:
50 | ① bagging gives every tree a different data set, so each tree is built on different samples, which increases diversity
51 | ② in every $b(x)$ step only $d^\prime$ of the $d$ features are used ($d^\prime \ll d$), which yields even more diverse trees
52 |
53 | ### Practical tricks for random forests
54 |
55 | ① Self-validation of the random forest (OOB validation)
56 | In every round of bagging some examples are never drawn; these out-of-bag examples can be used as a validation set
57 | 
58 | Note that $G_n^{-}(x_n)$ is the average of only those trees whose bags do not contain $x_n$, not the average of all trees
59 |
60 | ② Feature selection
61 | Possible strategies:
62 |
63 | - Remove redundant features (e.g. age and birthday) and irrelevant features
64 |
65 | - Assign each feature an importance; in a random forest this can be measured by permutation (how much the error changes after the values of one feature are permuted):
66 | 
67 |
68 | ③ A possible problem of random forests
69 | When all the 'random' parts are very random, the results can be unstable, so a very large number of trees may be needed before the aggregate behaves well
70 |
71 | ## GBDT (Gradient Boosted Decision Tree)
72 |
73 | ### AdaBoost-DTree
74 |
75 | Applying a random-forest-like recipe to AdaBoost-DTree gives:
76 | 
77 | Problems with this 'direct extension':
78 | ① It breaks the decision tree: the tree-building algorithm would have to be redesigned to handle example weights
79 | Solution: express the weights through 'bagging': instead of sampling uniformly, draw the bootstrap samples with probability $\propto u^{(t)}$, which achieves the same effect as $E_{in}^{u}(h)=\frac{1}{N}\sum u_n\cdot err(y_n,h(x_n))$ (see the sketch right after point ②)
80 | ② A fully grown tree reaches zero training error:
81 | 
82 | Solution: prune the tree, e.g. limit its height or the number of leaf nodes.
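A small sketch of the 'sampling $\propto u^{(t)}$' idea mentioned in ① (my own illustration, not the course's code):

```python
import numpy as np

def weighted_bootstrap(X, Y, u, rng=np.random.default_rng(0)):
    # draw N examples with probability proportional to the AdaBoost weights u
    p = u / u.sum()
    idx = rng.choice(len(Y), size=len(Y), p=p)
    return X[idx], Y[idx]     # a tree grown on this sample approximates E_in^u(h)
```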
83 |
84 | ### AdaBoost from an optimization point of view
85 |
86 | AdaBoost updates $u_n^{(t+1)}$ with the following rule:
87 | 
88 |
89 | Let $s_n=\sum_{t=1}^T\alpha_tg_t(x_n)$. From an SVM point of view, a larger $y_ns_n$ means the point lies farther from the separating boundary; if $y_ns_n$ keeps growing with $T$ for all data points, the boundary becomes 'better and more robust':
90 | 
91 | Every AdaBoost update can therefore be understood as decreasing $\sum_{n=1}^N u_n^{(t)}$, so it can be viewed as the following optimization problem:
92 | 
93 | To connect this with familiar loss functions, set up the correspondence:
94 | $$
95 | E(w_t)=\frac{1}{N}\sum_{n=1}^N \exp(-y_nw_t),\quad w_t=\sum_{k=1}^{t-1}\alpha_kg_k(x_n)
96 | $$
97 | which corresponds to the exponential loss $err=\exp(-ys)$.
98 | Gradient descent can then be derived through a first-order Taylor expansion:
99 | 
100 | which leads to the following (keep in mind what $w_t$ stands for), where $v=h(x)$:
101 | 
102 | The derivation shows that the best direction $h=v$ found by gradient descent is exactly the $h$ that minimizes $E_{in}^{u(t)}(h)$ in AdaBoost. So the principle behind AdaBoost is gradient descent searching for the best $h$
103 |
104 | Once the best $g=h$ is found, the next step is to solve for $\eta$:
105 | 
106 | The solution is exactly the $\alpha_t$ used by AdaBoost.
107 |
108 | Hence AdaBoost can be viewed as one instance of GradientBoost, and GradientBoost is more 'extensible': changing the error measure $err$ yields different GradientBoost algorithms:
109 | 
110 |
111 | The usual recipe for solving GradientBoost:
112 | ① treat $\eta$ as a constant and solve for the best $h=g$
113 | ② plug the best $g$ in and solve for the best $\eta$
114 |
115 | ### GradientBoost with squared error
116 |
117 | The corresponding form of GradientBoost:
118 | 
119 |
120 | Step ①: find the best $h$:
121 | 
122 | Step ②: find the best $\eta$:
123 | 
124 | Step ③: combining everything and plugging in decision trees gives the final GBDT algorithm (a small sketch follows the figure below)
125 | 
126 |
127 | ### Overview of aggregation models
128 |
129 | 
130 | All of the models above are commonly used; the choice depends on the application
131 |
132 | ## Neural Networks
133 |
134 | ### Basic framework of neural networks
135 |
136 | 
137 | The corresponding error function:
138 | 
139 |
140 | ### The backpropagation algorithm
141 |
142 | 
143 | The corresponding algorithm:
144 | 
145 |
146 | ### VC dimension and regularization
147 |
148 | A rough estimate of the corresponding VC dimension:
149 | 
150 | Neural networks overfit easily; possible remedies:
151 | ① add a regularizer, e.g. the L2 regularizer $\sum(w_{ij}^{(l)})^2$
152 | ② early stopping: stop gradient descent after a certain amount of progress
--------------------------------------------------------------------------------
/lecture/MLT3/pic1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic1.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic10.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic11.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic11.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic12.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic12.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic13.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic13.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic14.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic14.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic15.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic15.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic16.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic16.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic17.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic17.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic18.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic18.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic19.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic19.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic2.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic20.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic20.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic21.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic21.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic22.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic22.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic23.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic23.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic24.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic24.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic25.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic25.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic26.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic26.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic27.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic27.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic28.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic28.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic3.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic4.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic5.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic6.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic7.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic8.png
--------------------------------------------------------------------------------
/lecture/MLT3/pic9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT3/pic9.png
--------------------------------------------------------------------------------
/lecture/MLT4-1.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Machine Learning Techniques Homework 4 (Part 1)
3 | date: 2017-02-25 09:56:51
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | Machine Learning Techniques Homework 4, Part 1: Questions 1~10
9 |
10 |
11 | ## Machine Learning Techniques Homework 4
12 |
13 | ### Problem 1
14 |
15 | Q1~Q3: Neural networks and deep learning
16 |
17 | Q1: Consider a fully connected network with three layers of sizes $d^{(0)}=5,d^{(1)}=3,d^{(2)}=1$. Suppose only products of the three forms $w_{ij}^{(l)}x_i^{(l-1)},w_{ij}^{(l+1)}\delta_j^{(l+1)},x_i^{(l-1)}\delta_j^{(l)}$ each count as one operation (where $x$ includes $x_0=1$). How many operations does one round of stochastic gradient descent (one update of all the weights) take?
18 |
19 | A1: Split the count into the forward pass and the backward pass:
20 | Forward pass: only $w_{ij}^{(l)}x_i^{(l-1)}$ products are needed, $n_1=(5+1)\times 3+(3+1)\times1=22$ in total
21 | Backward pass: computing $\delta_j^{(l)}$ for the earlier layers needs $w_{ij}^{(l+1)}\delta_j^{(l+1)}$ products, and the number of such products per layer equals the number of neurons in layer $l+1$; in addition, every weight needs one $x_i^{(l-1)}\delta_j^{(l)}$ product, so $n_2=22+3\times 1=25$
22 | So the total number of operations is $n=n_1+n_2=47$
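A small helper (my own, just to reproduce the arithmetic) that counts the three kinds of products for one SGD update of a fully connected network:

```python
def op_count(layers):                 # e.g. layers = [5, 3, 1]
    # forward: one w * x product per weight, including the bias terms x0
    fwd = sum((layers[i] + 1) * layers[i + 1] for i in range(len(layers) - 1))
    # backward deltas: one w * delta product per weight between hidden layers and above
    back_delta = sum(layers[i] * layers[i + 1] for i in range(1, len(layers) - 1))
    # gradients: one x * delta product per weight (same count as the forward pass)
    return fwd + back_delta + fwd

print(op_count([5, 3, 1]))            # 22 + 3 + 22 = 47
```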
23 |
24 | Q2~Q3: Consider a neural network without the bias terms $x_0^{(l)}$. The input layer has $d^{(0)}=10$ units, the hidden layers contain 36 neurons in total (the number of hidden layers is not fixed), the output layer has one neuron, and the network is fully connected. Over all such networks, what are the minimum and maximum possible numbers of weights $w$?
25 |
26 | A2~A3: Enumerate all possible hidden-layer configurations; see e.g. [this subset-sum question](http://stackoverflow.com/questions/4632322/finding-all-possible-combinations-of-numbers-to-reach-a-given-sum). The code:
27 |
28 | ```python
29 | # q2-q3
30 | hidden = []
31 | def subset_sum(numbers, target, partial=[]):
32 |     s = sum(partial)
33 |     # check if the partial sum equals the target
34 |     if s == target:
35 |         # print("sum(%s)=%s" % (partial, target))
36 |         hidden.append(partial)
37 |     if s >= target:
38 |         return  # if we reached the target there is no point in continuing
39 |
40 |     for i in range(len(numbers)):
41 |         n = numbers[i]
42 |         remaining = numbers[i:]
43 |         subset_sum(remaining, target, partial + [n])
44 | ```
45 |
46 |
47 | ```python
48 | subset_sum([i+1 for i in range(36)], 36)
49 | maxi = 0; mini = 1000
50 | for i in range(len(hidden)):
51 |     wnum = 0
52 |     hidden[i].append(1)
53 |     for j in range(len(hidden[i])-1):
54 |         wnum += hidden[i][j]*hidden[i][j+1]
55 |     wnum += 10*hidden[i][0]
56 |     maxi = wnum if wnum>maxi else maxi
57 |     mini = wnum if wnum<mini else mini
58 | ```
--------------------------------------------------------------------------------
/lecture/MLT4-2.md:
--------------------------------------------------------------------------------
10 |
11 | ## Machine Learning Techniques Homework 4
12 |
13 | ### Problem 5
14 |
15 | Q11~Q14: Backpropagation neural networks
16 |
17 | Implement the $d-M-1$ neural network from Lec12 p16 with the following details: ① the neurons use the **tanh** function (including the output layer) ② squared error is used as the error function ③ stochastic gradient descent ④ $T=50000$ parameter updates
18 |
19 | - $M$ is the number of hidden neurons
20 | - $w_{ij}^{(l)}$ is initialized uniformly on $(-r,r)$
21 | - $\eta$ is the learning rate
22 |
23 | ```python
24 | # initialize the weight matrices
25 | def inittheta(d, M, r):
26 |     theta1 = np.random.uniform(-r, r, (d, M))
27 |     theta2 = np.random.uniform(-r, r, (M+1, 1))
28 |     return theta1, theta2
29 | ```
30 | ```python
31 | # derivative of tanh
32 | def dertanh(s):
33 |     return 1-np.tanh(s)**2
34 | ```
35 | ```python
36 | # neural network --- parameters updated by backpropagation (SGD)
37 | def nnetwork(X, Y, M, r, eta, T):
38 |     row, col = X.shape
39 |     theta1, theta2 = inittheta(col, M, r)
40 |     for i in range(T):
41 |         # forward pass on one random example
42 |         randpos = np.random.randint(0, row)
43 |         xone = X[randpos: randpos+1, :]
44 |         yone = Y[randpos]
45 |         s1 = xone.dot(theta1)
46 |         x1 = np.tanh(s1)
47 |         x1 = np.c_[np.ones((1, 1)), x1]
48 |         s2 = x1.dot(theta2)
49 |         x2 = np.tanh(s2)[0][0]
50 |         delta2 = -2*(yone-x2)                        # backward pass
51 |         delta1 = delta2*theta2[1:, :].T*dertanh(s1)
52 |         theta2 -= eta*x1.T*delta2
53 |         theta1 -= eta*xone.T.dot(delta1)
54 |     return theta1, theta2
55 | ```
56 | ```python
57 | # error measure --- 0/1 error of the trained network
58 | def errfun(X, Y, theta):
59 |     row, col = X.shape
60 |     l = len(theta)
61 |     x = X
62 |     for i in range(l-1):
63 |         x = np.c_[np.ones((row, 1)), np.tanh(x.dot(theta[i]))]
64 |     x2 = np.tanh(x.dot(theta[l-1]))
65 |     Yhat = x2
66 |     Yhat[Yhat>=0] = 1
67 |     Yhat[Yhat<0] = -1
68 |     return np.sum(Yhat != Y)/row
69 | ```
70 | Q11: With $\eta=0.1,r=0.1$, consider $M\in\{1,6,11,16,21\}$ and repeat the experiment 500 times; which $M$ gives the smallest average $E_{out}$?
71 |
72 | A11:
73 |
74 | ```python
75 | # Q11
76 | M = [1, 6, 11, 16, 21]
77 | eout = np.zeros((len(M),))
78 | for i in range(500):
79 |     for j in range(len(M)):
80 |         theta1, theta2 = nnetwork(X, Y, M[j], 0.1, 0.1, 50000)
81 |         theta = [theta1, theta2]
82 |         eout[j] += errfun(Xtest, Ytest, theta)
83 | print(eout/500)
84 | ```
85 |
86 | M= 1 6 11 16 21
87 | eout= [ 0.307912 0.036136 0.036264 0.03644 0.036336]
88 | Q12: With $\eta=0.1,M=3$, consider $r\in\{0,0.1,10,100,1000\}$ and repeat the experiment 500 times (50 here); which $r$ gives the smallest average $E_{out}$?
89 |
90 | A12:
91 |
92 | ```python
93 | # Q12
94 | r = [0, 0.1, 10, 100, 1000]
95 | eout = np.zeros((len(r),))
96 | for i in range(50):
97 |     for j in range(len(r)):
98 |         theta1, theta2 = nnetwork(X, Y, 3, r[j], 0.1, 50000)
99 |         theta = [theta1, theta2]
100 |         eout[j] += errfun(Xtest, Ytest, theta)
101 | print(eout / 50)
102 | ```
103 |
104 | r= 0 0.1 10 100 1000
105 | eout= [ 0.49328 0.036 0.15016 0.40392 0.41504]
106 | Q13: With $r=0.1,M=3$, consider $\eta\in\{0.001,0.01,0.1,1,10\}$ and repeat the experiment 500 times (50 here); which $\eta$ gives the smallest average $E_{out}$?
107 |
108 | A13:
109 |
110 | ```python
111 | # Q13
112 | eta = [0.001, 0.01, 0.1, 1, 10]
113 | eout = np.zeros((len(eta),))
114 | for i in range(50):
115 |     for j in range(len(eta)):
116 |         theta1, theta2 = nnetwork(X, Y, 3, 0.1, eta[j], 50000)
117 |         theta = [theta1, theta2]
118 |         eout[j] += errfun(Xtest, Ytest, theta)
119 | print(eout / 50)
120 | ```
121 |
122 | eta= 0.001 0.01 0.1 1 10
123 | eout= [ 0.1044 0.03584 0.036 0.3788 0.47104]
124 | Q14: Extend the network to a $d-8-3-1$ architecture, everything else as before. With $r=0.1,\eta=0.01$, repeat the experiment 500 times (50 here); what is the average $E_{out}$?
125 |
126 | A14: Since the implementation above is not general enough, a separate function is written for the two-hidden-layer network:
127 |
128 | ```python
129 | # neural network with two hidden layers
130 | def nnetwork2hidden(X, Y, d1, d2, T):
131 |     row, col = X.shape
132 |     theta1 = np.random.uniform(-0.1, 0.1, (col, d1))
133 |     theta2 = np.random.uniform(-0.1, 0.1, (d1+1, d2))
134 |     theta3 = np.random.uniform(-0.1, 0.1, (d2+1, 1))
135 |     for i in range(T):
136 |         # forward pass on one random example
137 |         randpos = np.random.randint(0, row)
138 |         xone = X[randpos: randpos+1, :]
139 |         yone = Y[randpos]
140 |         s1 = xone.dot(theta1)
141 |         x1 = np.tanh(s1)
142 |         x1 = np.c_[np.ones((1, 1)), x1]
143 |         s2 = x1.dot(theta2)
144 |         x2 = np.tanh(s2)
145 |         x2 = np.c_[np.ones((1, 1)), x2]
146 |         s3 = x2.dot(theta3)
147 |         x3 = np.tanh(s3)[0][0]
148 |         delta3 = -2*(yone-x3)                        # backward pass
149 |         delta2 = delta3*theta3[1:, :].T*dertanh(s2)
150 |         delta1 = delta2.dot(theta2[1:, :].T)*dertanh(s1)
151 |         theta3 -= 0.01*x2.T*delta3
152 |         theta2 -= 0.01*x1.T*delta2
153 |         theta1 -= 0.01*xone.T.dot(delta1)
154 |     return theta1, theta2, theta3
155 | ```
156 | ```python
157 | # Q14
158 | eout = 0
159 | for i in range(50):
160 |     theta1, theta2, theta3 = nnetwork2hidden(X, Y, 8, 3, 50000)
161 |     theta = [theta1, theta2, theta3]
162 |     eout += errfun(Xtest, Ytest, theta)
163 | print(eout/50)
164 | ```
165 |
166 | eout = 0.036
167 |
168 | ### Problem 6
169 |
170 | Q15~Q18: the kNN algorithm
171 |
172 | ```python
173 | #---------kNN----------------
174 | def kNNeighbor(k, xpred, X, Y):
175 |     xmin = np.sum((xpred - X)**2, 1)
176 |     pos = np.argsort(xmin, 0)
177 |     Ypred = Y[pos[0:k]]
178 |     Ypred = np.sum(Ypred)
179 |     Ypred = 1 if Ypred>=0 else -1
180 |     return Ypred
181 | ```
182 | ```python
183 | # prediction over a whole data set
184 | def predict(Xtest, X, Y, k):
185 |     row, col = Xtest.shape
186 |     Ypred = np.zeros((row, 1))
187 |     for i in range(row):
188 |         Ypred[i] = kNNeighbor(k, Xtest[i, :], X, Y)
189 |     return Ypred
190 | ```
191 | Q15~Q16: With $k=1$, what are the corresponding $E_{in},E_{out}$?
192 |
193 | A15~A16: $E_{in}=0$ is immediate (every point is its own nearest neighbour).
194 |
195 | ```python
196 | # Q15-Q16
197 | Yhat = predict(Xtest, X, Y, 1)
198 | eout = np.sum(Yhat!=Ytest)/Ytest.shape[0]
199 | print(eout)
200 | ```
201 |
202 | eout = 0.344
203 |
204 | Q17~Q18: With $k=5$, what are the corresponding $E_{in},E_{out}$?
205 |
206 | ```python
207 | # Q17-Q18
208 | Yhat1 = predict(X, X, Y, 5)
209 | Yhat2 = predict(Xtest, X, Y, 5)
210 | ein = np.sum(Yhat1 != Y) / Y.shape[0]
211 | eout = np.sum(Yhat2 != Ytest) / Ytest.shape[0]
212 | print(ein, eout)
213 | ```
214 |
215 | ein eout
216 | 0.16 0.316
217 |
218 | ### Problem 7
219 |
220 | Q19~Q20: k-Means experiments
221 |
222 | The k-Means error used here is: $E_{in}=\frac{1}{N}\sum_{n=1}^N\sum_{m=1}^M [[x_n\in S_m]]\,||x_n-\mu_m||^2$
223 |
224 | ```python
225 | # -----------kMeans------------
226 | def kMean(k, X):
227 |     row, col = X.shape
228 |     pos = np.random.permutation(row)
229 |     mu = X[pos[0: k], :]                      # random initial centers
230 |     epsilon = 1e-5; simi = 1
231 |     while simi > epsilon:
232 |         S = np.zeros((row, k))
233 |         for i in range(k):
234 |             S[:, i] = np.sum((X-mu[i, :])**2, 1)
235 |         tempmu = mu.copy()
236 |         pos = np.argmin(S, 1)
237 |         for i in range(k):
238 |             mu[i, :] = np.mean(X[pos == i, :], 0)
239 |         simi = np.sum(np.abs(tempmu-mu))      # total movement of the centers (abs, so terms cannot cancel)
240 |     return mu
241 | ```
242 |
243 |
244 | ```python
245 | # error function --- average squared distance to the assigned center
246 | def errfun(X, mu):
247 |     row, col = X.shape
248 |     k = mu.shape[0]
249 |     err = 0
250 |     S = np.zeros((row, k))
251 |     for i in range(k):
252 |         S[:, i] = np.sum((X - mu[i, :]) ** 2, 1)
253 |     pos = np.argmin(S, 1)
254 |     for i in range(k):
255 |         err += np.sum((X[pos == i, :]-mu[i, :])**2)
256 |     return err/row
257 | ```
258 | Q19: With $k=2$, run 100 experiments and report the average $E_{in}$
259 |
260 | A19:
261 |
262 | ```python
263 | # Q19
264 | err = 0
265 | for i in range(100):
266 |     mu = kMean(2, X)
267 |     err += errfun(X, mu)
268 | print(err/100)
269 | ```
270 |
271 | ein = 2.71678714378
272 |
273 | Q20: With $k=10$, run 100 experiments and report the average $E_{in}$
274 |
275 | A20:
276 |
277 | ```python
278 | # Q20
279 | err = 0
280 | for i in range(100):
281 |     mu = kMean(10, X)
282 |     err += errfun(X, mu)
283 | print(err/100)
284 | ```
285 |
286 | ein = 1.79117604501
287 |
--------------------------------------------------------------------------------
/lecture/MLT4.md:
--------------------------------------------------------------------------------
1 | ---
2 | title: Machine Learning Techniques Lec13-Lec16
3 | date: 2017-02-25 19:37:31
4 | tags: MLF&MLT
5 | categories: ML
6 | ---
7 |
8 | Key points of Machine Learning Techniques Lec13-Lec16, corresponding to Homework 4
9 |
10 |
11 | ## Deep Learning
12 |
13 | ### Basic steps of deep learning
14 |
15 | 
16 | Pre-trained weights are used mainly to ① speed up the subsequent training ② reduce the chance of falling into bad local minima
17 |
18 | ### Challenges and techniques in deep learning
19 |
20 | 
21 | This part will be filled in after I learn more about deep learning
22 |
23 | ### Autoencoders (widely useful; can serve as a pre-training method)
24 |
25 | The essence of an autoencoder: keep the input information in 'another form' (usually more compact) from which the original information can easily be recovered with only a small loss.
26 |
27 | ① Form of the basic autoencoder:
28 | 
29 | where $d\to \hat{d}$ is called encoding ($\hat{d}\lt d$: it reduces the dimension) and $\hat{d}\to d$ is called decoding
30 |
31 | ② Properties of autoencoders
32 | 
33 | Commonly used autoencoders choose $w$ satisfying the condition above
34 |
35 | ③ Uses of autoencoders
36 | 
37 |
38 | ④ Denoising autoencoders
39 | 
40 |
41 | ### Linear autoencoders
42 |
43 | ① Form of the linear autoencoder:
44 | 
45 |
46 | ② Objective (loss) function of the linear autoencoder
47 | 
48 | Since $WW^T$ is a [normal matrix](https://www.wikiwand.com/zh-cn/%E6%AD%A3%E8%A7%84%E7%9F%A9%E9%98%B5), it can be decomposed as $WW^T=V\Gamma V^T$ (where $\Gamma$ is a diagonal matrix with only its first $\hat{d}$ entries non-zero, and $V$ is a [unitary matrix](https://www.wikiwand.com/zh-cn/%E9%85%89%E7%9F%A9%E9%98%B5), orthogonal in the real case). Hence $WW^Tx_n=V\Gamma V^Tx_n$
49 |
50 | - $V^Tx_n$: $V^T$ can be viewed as a [rotation matrix](https://www.wikiwand.com/zh-cn/%E6%97%8B%E8%BD%AC%E7%9F%A9%E9%98%B5) that rotates/reflects $x_n$ without changing its length, as in the 2-D picture below
51 | 
52 | - $\Gamma (V^Tx_n)$: sets the last $d-\hat{d}$ components of $V^Tx_n$ to 0 and scales the first $\hat{d}$ components
53 | - $V(\Gamma V^Tx_n)$: rotates the scaled vector back to the original coordinate system
54 |
55 | ③ The objective function then becomes:
56 | 
57 |
58 | 1. The optimal $\Gamma$:
59 | 
60 | 2. The optimal $V$:
61 | Substituting the optimal $\Gamma$ gives the optimization objective
62 | 
63 | which then gives
64 | 
65 |
66 | ④ Linear autoencoder $\to$ PCA
67 | Basic form of PCA:
68 | 
69 | The idea behind linear autoencoders and PCA:
70 | 
71 |
72 | ## RBF Networks (Radial Basis Function Networks)
73 |
74 | ### Basic framework of RBF networks
75 |
76 | 
77 | ① the bias $b$ is usually not placed here
78 | ② the RBF is a Gaussian, commonly of the form $exp(-\gamma ||x-x_n||^2)$
79 | ③ the idea behind RBF networks: the RBF measures the similarity of $x$ to each 'prototype center', and the next layer assigns a different importance to each center. (How these centers are obtained is flexible and may incorporate prior knowledge.)
80 |
81 | ### Fully connected RBF networks
82 |
83 | A fully connected RBF network treats every training example as a center:
84 | 
85 | This is convenient because no centers $\mu_m$ have to be chosen, but it brings storage and prediction-efficiency problems.
86 |
87 | ### Fully connected RBF network $\to$ k-nearest neighbours
88 |
89 | Setting $\beta_m=y_m$ and using the network for classification gives the nearest-neighbour algorithm:
90 | 
91 |
92 | ### Fully connected RBF network + squared error
93 |
94 | A fully connected RBF network can also be used for regression; its hypothesis set has the form:
95 | 
96 | The objective is $E=\sum_{n=1}^N(\beta^TZ_n-y_n)^2$; setting the derivative to zero gives the optimal $\beta$ shown above
97 |
98 | When the $N$ data points are all distinct, the (symmetric) matrix $Z$ is invertible, so the optimum is $\beta=Z^{-1}y$
99 |
100 | The optimal solution of the fully connected network forces $E_{in}=0$ and overfits easily, so a regularizer is needed:
101 | ① ridge regression: $\beta=(Z^TZ+\lambda I)^{-1}Z^TY$, i.e. add the penalty $\lambda \beta^T\beta$ to the objective
102 | ② limit the number of centers and of the $\beta$'s: compare with SVM, where only the SVs really matter
103 |
104 | ### The k-Means algorithm
105 |
106 | ① Core idea: put 'close (similar)' points into the same cluster
107 | ② Objective, where $[x_n\in S_m]=1$ when $x_n$ is assigned to cluster $m$ and 0 otherwise:
108 | 
109 | ③ The algorithm:
110 | 
111 |
112 | ### RBF networks + k-Means
113 |
114 | 
115 |
116 | ## Matrix Factorization
117 |
118 | ### Linear networks (with recommender systems as the example)
119 |
120 | ① Notation: $x_i=[0,0,...,1,0,...,0]^T$ denotes the $i$-th user, and $y_n=[r_{n1},r_{n2},...,r_{nm}]$ the $n$-th user's ratings of the $m$ movies
121 | ② Linear network:
122 | 
123 | so the hypothesis is $h(x)=W^TVx$
124 | For the $n$-th user: $h(x_n)=W^TVx_n=W^Tv_n$ ($v_n$ is the $n$-th column of $V$)
125 | The idea behind the linear network: the first layer $V$ can be viewed as a feature transform $\phi(x)$ and the second layer $W$ as a linear combination
126 | For the $m$-th movie the hypothesis is $h_m(x)=w_m^T\phi(x)$ ($w_m$: the $m$-th column), so $r_{nm}=w_m^Tv_n$
127 | ③ Objective function:
128 | 
129 | ④ Implementation of the algorithm:
130 | 
131 | ⑤ The point of matrix factorization: it learns latent user/movie features, so these features do not have to be constructed by hand
132 |
133 | ### Matrix factorization via stochastic gradient descent
134 |
135 | The objective of the linear network and the per-example error measure:
136 | 
137 | Hence $V,W$ can be solved for with stochastic gradient descent; the algorithm is:
138 | 
139 | Advantages of gradient descent here: simple and efficient to implement, and easier to extend (e.g. to other $err$ functions)
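A minimal sketch of one SGD pass for the squared-error case (my own illustration; `ratings` is assumed to be a list of `(n, m, r)` triples):

```python
import numpy as np

def mf_sgd(ratings, N, M, d=10, eta=0.01, T=100000, rng=np.random.default_rng(0)):
    V = rng.normal(scale=0.1, size=(d, N))        # user factors v_n
    W = rng.normal(scale=0.1, size=(d, M))        # movie factors w_m
    for _ in range(T):
        n, m, r = ratings[rng.integers(len(ratings))]
        vn, wm = V[:, n].copy(), W[:, m].copy()
        err = r - wm @ vn                         # residual of the current prediction
        V[:, n] += eta * err * wm                 # gradient step on v_n
        W[:, m] += eta * err * vn                 # gradient step on w_m
    return V, W
```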
140 |
141 | ### Extraction Models
142 |
143 | Extraction models: a feature transform $\Phi$ represented by hidden variables + a linear model on top
144 |
145 | Common models:
146 | 
147 | Pros and cons of extraction models:
148 | 
149 |
150 | ## Summary
151 |
152 | ### Techniques for exploiting features
153 |
154 | ① Exploiting large numbers of features through kernels
155 | The kernel trick expresses a large number of features through inner products.
156 | Common kernel types:
157 | 
158 | Common kernel-based algorithms:
159 | 
160 |
161 | ② Exploiting predictive features through aggregation (using predictions as features)
162 | Use the predictions $g_t(x)$ as the feature transform: $\phi_t(x)=g_t(x)$
163 | Models commonly used as $g_t(x)$:
164 | 
165 | Common aggregation algorithms:
166 | 
167 |
168 | ③ 'Creating' latent features through extraction
169 | Represent the latent feature transform $\phi$ with hidden variables
170 | Common extraction algorithms that exploit latent features:
171 | 
172 |
173 | ④ Exploiting low-dimensional features through compression
174 | Compress the original features into a lower-dimensional representation
175 | Common methods for extracting low-dimensional features:
176 | 
177 |
178 | ### Optimization techniques
179 |
180 | ① Optimization based on gradient descent
181 | Update the parameters using a first-order Taylor expansion: $\text{new variables}=\text{old variables}-\eta\nabla E$
182 | Common methods:
183 | 
184 |
185 | ② Optimizing an equivalent form of the objective
186 | Use a formulation equivalent to the original objective (e.g. one with added constraints)
187 | Common methods:
188 | 
189 |
190 | ③ Handling complex optimization problems by iteration
191 | Solve relatively simple 'sub-problems' repeatedly in order to tackle a complex problem
192 | Common methods:
193 | 
194 |
195 | ### Dealing with overfitting
196 |
197 | ① Reduce overfitting with regularization
198 | Common regularization methods:
199 | 
200 | ② Reduce overfitting with (cross-)validation
201 | 
202 |
203 |
--------------------------------------------------------------------------------
/lecture/MLT4/pic1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic1.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic10.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic11.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic11.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic12.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic12.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic13.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic13.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic14.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic14.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic15.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic15.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic16.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic16.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic17.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic17.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic18.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic18.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic19.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic19.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic2.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic20.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic20.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic21.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic21.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic22.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic22.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic23.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic23.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic24.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic24.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic25.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic25.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic26.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic26.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic27.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic27.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic28.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic28.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic29.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic29.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic3.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic30.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic30.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic31.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic31.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic32.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic32.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic33.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic33.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic34.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic34.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic35.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic35.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic36.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic36.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic37.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic37.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic38.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic38.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic39.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic39.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic4.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic40.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic40.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic5.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic6.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic7.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic8.png
--------------------------------------------------------------------------------
/lecture/MLT4/pic9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AceCoooool/MLF-MLT/ff41a473e3c3bce295e0e0f8718051747c2201ca/lecture/MLT4/pic9.png
--------------------------------------------------------------------------------