├── Algorithm_written_test ├── readme.md ├── zoubuti.py └── 【笔试题】:0.“没事儿走两步”问题.md ├── FClayer ├── README.md ├── picture │ ├── 1.png │ ├── 10.png │ ├── 2.png │ ├── 3.PNG │ ├── 4.png │ ├── 5.png │ ├── 6.png │ ├── 7.png │ ├── 8.PNG │ ├── 9.png │ └── README.md ├── 【反向传播】:全连接层.md └── 全连接层正向反向传播.ipynb ├── Image-mean-data ├── README.md ├── img │ ├── README.md │ ├── img0.png │ ├── img1.png │ ├── img2.png │ ├── img3.png │ ├── img4.png │ └── img5.png └── 图像减去均值.ipynb ├── README.md ├── Softmax-Cross Entropy ├── README.md └── 【交叉熵】:神经网络的Loss函数编写:Softmax+Cross Entropy.ipynb ├── caffe2 ├── 1. Intro Tutorial.ipynb ├── 1.Intro Tutorial.md ├── 2.Caffe2 的一些基本概念 - Workspaces&Operators & Nets & Nets 可视化.ipynb ├── 2.Caffe2 的一些基本概念 - Workspaces&Operators & Nets & Nets 可视化.md ├── 3.Brewing Models(快速构建模型).md ├── 4.Toy_Regression.ipynb ├── 4.Toy_Regression.md ├── 5.Models and Datasets.md ├── 6.Loading_Pretrained_Models.ipynb ├── 6.Loading_Pretrained_Models.md ├── 7.Image_Pre-Processing_Pipeline.ipynb ├── 7.Image_Pre-Processing_Pipeline.md ├── 8.MNIST.ipynb ├── 8.MNIST.md ├── 9.create_your_own_dataset.ipynb ├── 9.create_your_own_dataset.md ├── README.md ├── images │ ├── Cellsx128.png │ ├── Ducreux.jpg │ ├── Flower-id.png │ ├── Places-cnn-visual-example.png │ ├── README.md │ ├── aircraft-carrier.jpg │ ├── astronauts.jpg │ ├── cat.jpg │ ├── cell-tower.jpg │ ├── cowboy-hat.jpg │ ├── flower.jpg │ ├── imagenet-boat.png │ ├── imagenet-caffe2.png │ ├── imagenet-meme.jpg │ ├── imagenet-montage.jpg │ ├── lemon.jpg │ ├── mirror-image.jpg │ ├── orange.jpg │ ├── orangutan.jpg │ ├── pretzel.jpg │ └── sickle-cells.jpg ├── iris_test.minidb ├── iris_train.minidb └── markdown_img │ ├── 2.output_30_1.png │ ├── 2.output_46_0.png │ ├── 4.output_13_1.png │ ├── 4.output_15_0.png │ ├── 4.output_5_2.png │ ├── 5.1.png │ ├── 5.2.png │ ├── 6.output_7_3.png │ ├── 6.output_7_4.png │ ├── 6.output_7_5.png │ ├── 7.output_10_1.png │ ├── 7.output_12_1.png │ ├── 7.output_14_1.png │ ├── 7.output_17_2.png │ ├── 
7.output_19_1.png │ ├── 7.output_21_2.png │ ├── 7.output_21_3.png │ ├── 7.output_24_1.png │ ├── 7.output_24_2.png │ ├── 7.output_3_1.png │ ├── 7.output_5_1.png │ ├── 7.output_5_2.png │ ├── 7.output_7_1.png │ ├── 7.output_8_1.png │ ├── 8.output_22_0.png │ ├── 8.output_24_0.png │ ├── 8.output_28_2.png │ ├── 8.output_30_0.png │ ├── 8.output_30_1.png │ ├── 8.output_32_0.png │ ├── 8.output_34_1.png │ ├── 8.output_38_1.png │ ├── 8.output_38_2.png │ ├── 9.output_6_0.png │ ├── 9.output_6_1.png │ └── README.md ├── cs224n_note ├── readme.md └── 【NLP】cs224n课程笔记.md ├── jupyter-Pillow-inline ├── README.md ├── img.png └── pillow+plt.ipynb ├── python-numpy-sum ├── Python中Numpy库中的np.sum(array,axis=0,1,2...)怎么理解?.ipynb └── README.md └── text_classfier ├── readme.md ├── text_classfier.ipynb └── text_classfier.md /Algorithm_written_test/readme.md: -------------------------------------------------------------------------------- 1 | 這個文件夾是我的csdn筆記[0.【笔试题】:“没事儿走两步”问题.md](https://blog.csdn.net/weixin_37251044/article/details/83019020)中的代碼部分,詳情請看我的博客文章。 2 | -------------------------------------------------------------------------------- /Algorithm_written_test/zoubuti.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | Spyder Editor 4 | 5 | This is a temporary script file. 
6 | """ 7 | 8 | 9 | def is_x(x): 10 | n = 0 11 | i = 0 12 | while n < x: 13 | i += 1 14 | n = n + i 15 | if n == x: 16 | print ('need zhe me duo ci:'+str(i)) 17 | for j in range(i): 18 | print ('+' + str(j+1),end='') 19 | print ('=' + str(x)) 20 | elif (n - x) % 2 ==0: 21 | print ('need zhe me duo ci:'+str(i)) 22 | for j in range(i): 23 | if (n-x) / 2 == j+1: 24 | print ('-' ,end ='') 25 | else: 26 | print ('+',end ='') 27 | print (str(j+1),end ='') 28 | print ('=' + str(x)) 29 | 30 | elif (i + 1) % 2 == 0: 31 | print ('need zhe me duo ci:'+str(i+2)) 32 | for j in range(i+2): 33 | if (n-x - 1) / 2 == j+1: 34 | print ('-',end ='') 35 | elif j+1 == i+1: 36 | print ('+',end ='') 37 | elif j+1 == i+2: 38 | print ('-',end ='') 39 | else: 40 | print ('+',end ='') 41 | print (str(j+1),end ='') 42 | print ('=' + str(x)) 43 | 44 | elif (i + 1) % 2 != 0: 45 | print ('need zhe me duo ci:'+str(i+1)) 46 | for j in range(i+1): 47 | if (n-x + i+1 ) / 2 == j+1: 48 | print ('-',end ='') 49 | else: 50 | print ('+',end ='') 51 | print (str(j+1),end ='') 52 | print ('=' + str(x)) 53 | 54 | 55 | 56 | x_in = input('please enter your x (enter q to quit): ') 57 | while x_in != 'q': 58 | print ('your x is :',x_in) 59 | is_x(int(x_in)) 60 | x_in = input('please enter your x (enter q to quit): ') -------------------------------------------------------------------------------- /Algorithm_written_test/【笔试题】:0.“没事儿走两步”问题.md: -------------------------------------------------------------------------------- 1 |  2 | # 问题: 3 | 小明在单位为1的尺子上向目标点走,每次只能向前或向后走,第一次走1步,第二次走2步,第n次走n步,请问小明走到正前方x步最短需要走几次? 4 | 5 | 输入:x 6 | ``` 7 | 1 8 | 2 9 | 3 10 | 4 11 | ``` 12 | 输出:n 13 | ``` 14 | 1 15 | 3 16 | 2 17 | 4 18 | ``` 19 | 20 | 答案: 21 | 22 | 1.代码: 23 | https://github.com/JackKuo666/csdn/blob/master/Algorithm_written_test/zoubuti.py 24 | 25 | ``` 26 | # -*- coding: utf-8 -*- 27 | """ 28 | Spyder Editor 29 | 30 | This is a temporary script file.
31 | """ 32 | 33 | 34 | def is_x(x): 35 | n = 0 36 | i = 0 37 | while n < x: 38 | i += 1 39 | n = n + i 40 | if n == x: 41 | print ('need zhe me duo ci:'+str(i)) 42 | for j in range(i): 43 | print ('+' + str(j+1),end='') 44 | print ('=' + str(x)) 45 | elif (n - x) % 2 ==0: 46 | print ('need zhe me duo ci:'+str(i)) 47 | for j in range(i): 48 | if (n-x) / 2 == j+1: 49 | print ('-' ,end ='') 50 | else: 51 | print ('+',end ='') 52 | print (str(j+1),end ='') 53 | print ('=' + str(x)) 54 | 55 | elif (i + 1) % 2 == 0: 56 | print ('need zhe me duo ci:'+str(i+2)) 57 | for j in range(i+2): 58 | if (n-x - 1) / 2 == j+1: 59 | print ('-',end ='') 60 | elif j+1 == i+1: 61 | print ('+',end ='') 62 | elif j+1 == i+2: 63 | print ('-',end ='') 64 | else: 65 | print ('+',end ='') 66 | print (str(j+1),end ='') 67 | print ('=' + str(x)) 68 | 69 | elif (i + 1) % 2 != 0: 70 | print ('need zhe me duo ci:'+str(i+1)) 71 | for j in range(i+1): 72 | if (n-x + i+1 ) / 2 == j+1: 73 | print ('-',end ='') 74 | else: 75 | print ('+',end ='') 76 | print (str(j+1),end ='') 77 | print ('=' + str(x)) 78 | 79 | 80 | 81 | x_in = input('please enter your x (enter q to quit): ') 82 | while x_in != 'q': 83 | print ('your x is :',x_in) 84 | is_x(int(x_in)) 85 | x_in = input('please enter your x (enter q to quit): ') 90 | ``` 91 | 92 | 2.解析: 93 | 设小明走到正前方x步最少需要走m次,并设 $i$ 为使 $1+2+...+i\geq x$ 成立的最小次数,则m为: 95 | $$m = 96 | \left\{ 97 | \begin{array}{lr} 98 | i, & (1+2+...+i) - x 为偶数\\ 99 | i+1, & (1+2+...+i) - x 为奇数,且(i+1)为奇数\\ 100 | i+2, & (1+2+...+i) - x 为奇数,且(i+1)为偶数 \\ 101 | \end{array} 102 | \right.$$ 103 | 104 | 其实上边的公式是秦老师给出来的,让我们通过一些例子理解一下公式: 105 | 我们以表格的形式写出来:(第一列是x,表示小明离目标多少步;最后一列是m,表示小明需要走多少次;“展开”一列中每个数前边的“+-”号表示向前或向后) 106 | |x|第一步:确定i| 第二步:(1+...+i)-x 奇/偶|i+1 奇/偶|第三步:操作|展开 |m| 107 | |--|--|--|--|--|--|--| 108 | | 1 |1| (+1)-1=0 偶|-|0/2=0,符号不变,加到i| +1 |1| 109 | |2|2|(+1+2)-2=1 奇|3 奇|(1+3)/2=2,2前边添“-”;且加到i+1|+1-2+3|3| 110 | |3|2|(+1+2)-3=0 偶|-|0/2=0,符号不变,加到i|+1+2|2| 111 |
|4|3|(+1+2+3)-4=2 偶|-|2/2=1,1前边添“-”,加到i|-1+2+3|3|| 112 | |5|3|(+1+2+3)-5=1 奇|4 偶|(1+4+5)/2=5,5前边添“-”;且加到i+2|+1+2+3+4-5|5| 113 | |6|3|(+1+2+3)-6=0 偶|-|0/2=0,符号不变,加到i|+1+2+3|3| 114 | |7|4|(+1+2+3+4)-7=3 奇|5 奇|(3+5)/2=4,4前边添“-”;且加到i+1|+1+2+3-4+5|5| 115 | |8|4|(+1+2+3+4)-8=2 偶|-|2/2=1,1前边添“-”,加到i|-1+2+3+4|4| 116 | |9|4|(+1+2+3+4)-9=1 奇|5 奇|(1+5)/2=3,3前边添“-”;且加到i+1|+1+2-3+4+5|5| 117 | |10|4|(+1+2+3+4)-10=0 偶|-|0/2=0,符号不变,加到i|+1+2+3+4|4| 118 | |11|5|(+1+2+3+4+5)-11=4 偶|-|4/2=2,2前边添“-”,加到i|+1-2+3+4+5|5| 119 | 120 | 121 | 122 | 注:吴老师提供题目,秦老师提供解题思路,小郭敲代码,一次愉快的合作。 123 | -------------------------------------------------------------------------------- /FClayer/README.md: -------------------------------------------------------------------------------- 1 | 这是我的博客文章:[【反向传播】:全连接层](https://blog.csdn.net/weixin_37251044/article/details/81274479) 的代码,具体细节,请访问博客。 2 | -------------------------------------------------------------------------------- /FClayer/picture/1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/FClayer/picture/1.png -------------------------------------------------------------------------------- /FClayer/picture/10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/FClayer/picture/10.png -------------------------------------------------------------------------------- /FClayer/picture/2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/FClayer/picture/2.png -------------------------------------------------------------------------------- /FClayer/picture/3.PNG: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/FClayer/picture/3.PNG -------------------------------------------------------------------------------- /FClayer/picture/4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/FClayer/picture/4.png -------------------------------------------------------------------------------- /FClayer/picture/5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/FClayer/picture/5.png -------------------------------------------------------------------------------- /FClayer/picture/6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/FClayer/picture/6.png -------------------------------------------------------------------------------- /FClayer/picture/7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/FClayer/picture/7.png -------------------------------------------------------------------------------- /FClayer/picture/8.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/FClayer/picture/8.PNG -------------------------------------------------------------------------------- /FClayer/picture/9.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/FClayer/picture/9.png -------------------------------------------------------------------------------- /FClayer/picture/README.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /FClayer/【反向传播】:全连接层.md: -------------------------------------------------------------------------------- 1 | # 1.全连接层的推导 2 |   全连接层的每一个结点都与上一层的所有结点相连,用来把前边提取到的特征综合起来。由于其全相连的特性,一般全连接层的参数也是最多的。 3 | 4 | # 2.全连接层的前向计算 5 |   下图中连线最密集的2个地方就是全连接层,这很明显的可以看出全连接层的参数的确很多。在前向计算过程,也就是一个线性的加权求和的过程,全连接层的每一个输出都可以看成前一层的每一个结点乘以一个权重系数W,最后加上一个偏置值b得到。如下图中最后一个全连接层,输入有100个神经元结点,输出有10个结点,则一共需要100*10=1000个权值参数W和10个偏置参数b。 6 |    7 | ![这里写图片描述](https://img-blog.csdn.net/20180730112536285?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zNzI1MTA0NA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/0) 8 | 9 | # 2.1 推导过程 10 |   下面用一个简单的网络具体介绍一下推导过程 11 | 12 |
13 | ![这里写图片描述](https://img-blog.csdn.net/2018073011355483?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zNzI1MTA0NA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/0) 14 | 15 | 16 | 其中,$x_{1}$、$x_{2}$、$x_{3}$为全连接层的输入,$a_{1}$、$a_{2}$、$a_{3}$为输出,有 17 |
18 | ![这里写图片描述](https://img-blog.csdn.net/20180730113726936?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zNzI1MTA0NA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/0)   19 | 20 | 可以写成如下矩阵形式: 21 | 22 |
23 | ![这里写图片描述](https://img-blog.csdn.net/20180730113801851?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zNzI1MTA0NA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/0) 24 | 25 |  若我们的一次训练10张图片,即batch_size=10,则我们可以把前向传播计算转化为如下矩阵形式。 26 | 27 | ![这里写图片描述](https://img-blog.csdn.net/20180731152238252?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zNzI1MTA0NA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/0) 28 | 29 | ## 2.2 全连接层的前向计算代码 30 | ``` 31 | def forward(self, in_data): 32 | self.bottom_val = in_data 33 | self.top_val = in_data.dot(self.w) + self.b 34 | return self.top_val 35 | ``` 36 | 37 | 38 | # 3 全连接层的反向传播 39 | 40 |   以我们的最后一个全连接层为例,该层有100个输入结点和10个输出结点。 41 |
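在看下面三个偏导数的具体推导之前,可以先用 numpy 把这一层(100 个输入结点、10 个输出结点,batch_size=10)的正向与反向计算整体过一遍。下面是我补充的示意代码,其中 x、w、b、top_diff 均为随机构造的假设数据,并非真实训练数据:

```python
import numpy as np

np.random.seed(0)
batch_size, n_in, n_out = 10, 100, 10

x = np.random.randn(batch_size, n_in)    # 上一层的输出,即本层输入
w = 0.01 * np.random.randn(n_in, n_out)  # 权重 W,形状为 (输入结点数, 输出结点数)
b = np.zeros(n_out)                      # 偏置 b

# 正向:a = x·W + b
a = x.dot(w) + b                         # 形状 (10, 10)

# 反向:假设从后边传来的梯度为 top_diff,即 ∂loss/∂a
top_diff = np.random.randn(batch_size, n_out)
bottom_diff = top_diff.dot(w.T)          # ∂loss/∂x,传回前一层,形状 (10, 100)
w_diff = x.T.dot(top_diff)               # ∂loss/∂W,形状 (100, 10)
b_diff = top_diff.sum(axis=0)            # ∂loss/∂b,沿 batch 维求和,形状 (10,)
```

可以看到,三个梯度的形状分别与 x、W、b 一致,这与后文 3.1~3.3 推出的矩阵形式结论是对应的。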
42 | ![这里写图片描述](https://img-blog.csdn.net/20180730150609751?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zNzI1MTA0NA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/0) 43 | 44 |   由于需要对**W和b进行更新**,还要**向前传递梯度**,所以我们需要计算如下**三个偏导数**。 45 | 46 | 【注:一般情况下,每层反向传播都要计算这三个偏导数】 47 | 48 | ## 3.1 对上一层的输出(即当前层的输入)求导 49 | 50 |   若我们已知传递到该层的梯度$\frac{\partial loss}{\partial a}$,则我们可以通过链式法则求得loss对x的偏导数。 51 |   首先需要求得该层的输出$a_{i}$对输入$x_{j}$的偏导数:由$a_{i}=\sum_{j=1}^{100}w_{ij}x_{j}+b_{i}$可得$\frac{\partial a_{i}}{\partial x_{j}}=w_{ij}$ 52 | 53 |   再通过链式法则求得loss对x的偏导数:$\frac{\partial loss}{\partial x_{k}}=\sum_{j=1}^{10}\frac{\partial loss}{\partial a_{j}}\frac{\partial a_{j}}{\partial x_{k}}= \sum_{j=1}^{10}\frac{\partial loss}{\partial a_{j}}w_{jk}$ 54 | 【注意:这里的$x_{k}$是输入向量$x$的任意一个分量;对$j$求和遍历的是本层的10个输出结点】 55 | 56 |   上边求导的结果也印证了我前边那句话:若第$l$层的a节点通过权值W对第$l+1$层的b节点有贡献,则在反向传播过程中,梯度通过权值W从b节点传播回a节点。 57 | 58 |   若我们一次训练10张图片,即batch_size=10,则我们可以把计算转化为如下矩阵形式。 59 | 60 |
61 | ![这里写图片描述](https://img-blog.csdn.net/20180731105101514?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zNzI1MTA0NA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/0) 62 | 63 | ## 3.2 对权重系数W求导 64 | 65 | 我们前向计算的公式如下图 66 | 67 | 68 |
69 | ![这里写图片描述](https://img-blog.csdn.net/20180730113726936?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zNzI1MTA0NA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/0)   70 | 71 | 由图可知$\frac{\partial a_{i}}{\partial w_{ij}}=x_{j}$,所以:$\frac{\partial loss}{\partial w_{kj}}=\frac{\partial loss}{\partial a_{k}}\frac{\partial a_{k}}{\partial w_{kj}}= \frac{\partial loss}{\partial a_{k}}*x_{j}$ 72 | 73 | 当batch_size=10时,写成矩阵形式: 74 | 75 |
76 | ![这里写图片描述](https://img-blog.csdn.net/20180730183034560?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zNzI1MTA0NA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/0) 77 | 78 | 79 | ## 3.3 对偏置系数b求导 80 | 81 |   由上面前向推导公式可知,$\frac{\partial a_{i}}{\partial b_{i}}=1$ 82 | 83 |   即loss对偏置系数$b_{i}$的偏导数等于loss对本层输出$a_{i}$的偏导数:$\frac{\partial loss}{\partial b_{i}}=\frac{\partial loss}{\partial a_{i}}$。 84 | 85 |   当batch_size=10时,将不同batch对应的相同b的偏导相加即可,写成矩阵形式即为乘以一个全1的矩阵: 86 | 87 |
88 | ![这里写图片描述](https://img-blog.csdn.net/20180731151033235?watermark/2/text/aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zNzI1MTA0NA==/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/0) 89 | 90 | 91 | # 小结: 92 | 注意,这里也可以从矩阵求导的方面得出上边的结论: 93 | 0.正向传播矩阵计算公式是: 94 | $$Top\_data = Bottom\_data*W + c*Bias$$ 95 | >这里c = [1,1,1,1,1,1,1,1,1,1]的列向量,Bias是行向量。 96 | 97 | 1.对输入数据的求导: 98 | $$Bottom\_diff = Top\_diff*W^{T}$$ 99 | 2.对weight的求导: 100 | $$weight\_diff = Bottom\_data^{T}*Top\_diff$$ 101 | 3.对bias求导: 102 | $$bias\_diff = c^{T}*Top\_diff$$ 103 | 104 | **可以看到,对各个参数矩阵求导时,得到的是将参数前的系数矩阵转置,然后再乘以从后边传来的$Top\_diff$。** 105 | 106 | # 4.代码 107 | 具体代码请访问我的github仓库:[csdn/FClayer/](https://github.com/JackKuo666/csdn/tree/master/FClayer) 108 | 109 | 110 | ## 4.1.全连接层:正向传播,随机初始化w,b 111 | ``` 112 | # 全连接层:正向传播 113 | 114 | import numpy as np 115 | 116 | x = np.arange(1,11,1).reshape(2,5) 117 | 118 | print ("上层输出是2个batch,每个batch有5个向量:\n" + str(x)) 119 | 120 | std = 1e-4  # 权重初始化的标准差 121 | 122 | ww = std * np.random.randn(5,3) 123 | 124 | bb = np.zeros(3) 125 | 126 | print ("\n随机初始化w的参数的shape是:\n" + str(ww.shape)) 127 | 128 | print ("\n随机初始化w的参数是:\n" + str(ww)) 129 | 130 | print ("\n随机初始化b的参数的shape是:\n" + str(bb.shape)) 131 | 132 | print ("\n随机初始化b的参数是:\n" + str(bb)) 133 | 134 | a = x.dot(ww) + bb 135 | 136 | print ("\n全连接之后输出层的shape是:\n" + str(a.shape)) 137 | 138 | print ("\n全连接之后输出是:\n" + str(a)) 139 | ``` 140 | 输出: 141 | ``` 142 | 上层输出是2个batch,每个batch有5个向量: 143 | [[ 1 2 3 4 5] 144 | [ 6 7 8 9 10]] 145 | 146 | 随机初始化w的参数的shape是: 147 | (5, 3) 148 | 149 | 随机初始化w的参数是: 150 | [[ -2.02246258e-04 -2.86134686e-05 5.67536552e-05] 151 | [ -2.65147732e-05 3.46693399e-05 -4.84865213e-05] 152 | [ -8.56741462e-05 -1.36582254e-04 -1.63278178e-04] 153 | [ 7.67300884e-05 4.15051110e-05 3.05435127e-05] 154 | [ 2.03954211e-05 7.60062287e-05 -2.58280679e-04]] 155 | 156 | 随机初始化b的参数的shape是: 157 | (3,) 158 | 159 | 随机初始化b的参数是: 160 | [ 0. 0. 0.]
161 | 162 | 全连接之后输出层的shape是: 163 | (2, 3) 164 | 165 | 全连接之后输出是: 166 | [[-0.0001034 0.00017703 -0.00169928] 167 | [-0.00118995 0.00011195 -0.00361302]] 168 | 169 | ``` 170 | ## 4.2 全连接,反向传播 171 | ``` 172 | # 1.对上一层的输出(即当前层的输入)求导 173 | 174 | loss = np.arange(0,1.2,0.2).reshape(2,3) 175 | 176 | print ("假设下层传过来的loss是:\n" + str(loss)) 177 | 178 | residual_x = loss.dot(ww.T) 179 | 180 | print ("\n1.对上层的输出求导值的shape:\n" + str(residual_x.shape)) 181 | 182 | print ("\n对上层的输出求导:\n" + str(residual_x)) 183 | 184 | 185 | 186 | 187 | 188 | 189 | 190 | # 2.对权重系数W求导 191 | 192 | lr = 0.01 193 | 194 | reg = 0.75 195 | 196 | prev_grad_w = np.zeros_like(ww) 197 | 198 | ww -= lr * (x.T.dot(loss) + prev_grad_w * reg) 199 | 200 | prev_grad_w = ww 201 | 202 | 203 | 204 | print ("\n2.对权重系数W求导之后更新的W值的shape:\n" + str(prev_grad_w.shape)) 205 | 206 | print ("\n对权重系数W求导之后更新的W:\n" + str(prev_grad_w)) 207 | 208 | 209 | 210 | # 3.对偏置系数b求导 211 | 212 | prev_grad_b = np.zeros_like(bb) 213 | 214 | bb -= lr * (np.sum(loss, axis=0)) 215 | 216 | prev_grad_b = bb 217 | 218 | 219 | 220 | print ("\n3.对偏置系数b求导之后更新的b值的shape:\n" + str(prev_grad_b.shape)) 221 | 222 | print ("\n对偏置系数b求导之后更新的b:\n" + str(prev_grad_b)) 223 | ``` 224 | 输出: 225 | ``` 226 | 假设下层传过来的loss是: 227 | [[ 0. 0.2 0.4] 228 | [ 0.6 0.8 1.
]] 229 | 230 | 1.对上层的输出求导值的shape: 231 | (2, 5) 232 | 233 | 对上层的输出求导: 234 | [[-0.14238302 -0.17281246 -0.20329263 -0.23357948 -0.26408811] 235 | [-0.50248748 -0.60483666 -0.70752395 -0.80949021 -0.91218524]] 236 | 237 | 2.对权重系数W求导之后更新的W值的shape: 238 | (5, 3) 239 | 240 | 对权重系数W求导之后更新的W: 241 | [[-0.18020225 -0.25002861 -0.31994325] 242 | [-0.21002651 -0.29996533 -0.39004849] 243 | [-0.24008567 -0.35013658 -0.46016328] 244 | [-0.26992327 -0.39995849 -0.52996946] 245 | [-0.2999796 -0.44992399 -0.60025828]] 246 | 247 | 3.对偏置系数b求导之后更新的b值的shape: 248 | (3,) 249 | 250 | 对偏置系数b求导之后更新的b: 251 | [-0.03 -0.05 -0.07] 252 | ``` 253 | 254 | 255 | 256 | 257 | 参考:[深度学习笔记6:全连接层的实现](https://blog.csdn.net/l691899397/article/details/52267166) -------------------------------------------------------------------------------- /FClayer/全连接层正向反向传播.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 1.全连接层:正向传播,随机初始化w,b" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 44, 13 | "metadata": {}, 14 | "outputs": [ 15 | { 16 | "name": "stdout", 17 | "output_type": "stream", 18 | "text": [ 19 | "上层输出是2个batch,每个batch有5个向量:\n", 20 | "[[ 1 2 3 4 5]\n", 21 | " [ 6 7 8 9 10]]\n", 22 | "\n", 23 | "随机初始化w的参数的shape是:\n", 24 | "(5, 3)\n", 25 | "\n", 26 | "随机初始化w的参数是:\n", 27 | "[[ -2.02246258e-04 -2.86134686e-05 5.67536552e-05]\n", 28 | " [ -2.65147732e-05 3.46693399e-05 -4.84865213e-05]\n", 29 | " [ -8.56741462e-05 -1.36582254e-04 -1.63278178e-04]\n", 30 | " [ 7.67300884e-05 4.15051110e-05 3.05435127e-05]\n", 31 | " [ 2.03954211e-05 7.60062287e-05 -2.58280679e-04]]\n", 32 | "\n", 33 | "随机初始化b的参数的shape是:\n", 34 | "(3,)\n", 35 | "\n", 36 | "随机初始化b的参数是:\n", 37 | "[ 0. 0. 
0.]\n", 38 | "\n", 39 | "全连接之后输出层的shape是:\n", 40 | "(2, 3)\n", 41 | "\n", 42 | "全连接之后输出是:\n", 43 | "[[-0.0001034 0.00017703 -0.00169928]\n", 44 | " [-0.00118995 0.00011195 -0.00361302]]\n" 45 | ] 46 | } 47 | ], 48 | "source": [ 49 | "# 全连接层:正向传播\n", 50 | "import numpy as np\n", 51 | "x = np.arange(1,11,1).reshape(2,5)\n", 52 | "print (\"上层输出是2个batch,每个batch有5个向量:\\n\" + str(x))\n", 53 | "\n", 54 | "std = 1e-4  # 权重初始化的标准差\n", "ww = std * np.random.randn(5,3)\n", 55 | "bb = np.zeros(3)\n", 56 | "print (\"\\n随机初始化w的参数的shape是:\\n\" + str(ww.shape))\n", 57 | "print (\"\\n随机初始化w的参数是:\\n\" + str(ww))\n", 58 | "print (\"\\n随机初始化b的参数的shape是:\\n\" + str(bb.shape))\n", 59 | "print (\"\\n随机初始化b的参数是:\\n\" + str(bb))\n", 60 | "a = x.dot(ww) + bb\n", 61 | "print (\"\\n全连接之后输出层的shape是:\\n\" + str(a.shape))\n", 62 | "print (\"\\n全连接之后输出是:\\n\" + str(a))\n", 63 | "\n" 64 | ] 65 | }, 66 | { 67 | "cell_type": "markdown", 68 | "metadata": {}, 69 | "source": [ 70 | "# 2.全连接,反向传播" 71 | ] 72 | }, 73 | { 74 | "cell_type": "code", 75 | "execution_count": 62, 76 | "metadata": {}, 77 | "outputs": [ 78 | { 79 | "name": "stdout", 80 | "output_type": "stream", 81 | "text": [ 82 | "假设下层传过来的loss是:\n", 83 | "[[ 0. 0.2 0.4]\n", 84 | " [ 0.6 0.8 1.
]]\n", 85 | "\n", 86 | "1.对上层的输出求导值的shape:\n", 87 | "(2, 5)\n", 88 | "\n", 89 | "对上层的输出求导:\n", 90 | "[[-0.17798302 -0.21601246 -0.25409263 -0.29197948 -0.33008811]\n", 91 | " [-0.62808748 -0.75603666 -0.88432395 -1.01189021 -1.14018524]]\n", 92 | "\n", 93 | "2.对权重系数W求导之后更新的W值的shape:\n", 94 | "(5, 3)\n", 95 | "\n", 96 | "对权重系数W求导之后更新的W:\n", 97 | "[[-0.21620225 -0.30002861 -0.38394325]\n", 98 | " [-0.25202651 -0.35996533 -0.46804849]\n", 99 | " [-0.28808567 -0.42013658 -0.55216328]\n", 100 | " [-0.32392327 -0.47995849 -0.63596946]\n", 101 | " [-0.3599796 -0.53992399 -0.72025828]]\n", 102 | "\n", 103 | "3.对偏置系数b求导之后更新的b值的shape:\n", 104 | "(3,)\n", 105 | "\n", 106 | "对偏置系数b求导之后更新的b:\n", 107 | "[-0.036 -0.06 -0.084]\n" 108 | ] 109 | } 110 | ], 111 | "source": [ 112 | "# 1.对上一层的输出(即当前层的输入)求导\n", 113 | "loss = np.arange(0,1.2,0.2).reshape(2,3)\n", 114 | "print (\"假设下层传过来的loss是:\\n\" + str(loss))\n", 115 | "residual_x = loss.dot(ww.T)\n", 116 | "print (\"\\n1.对上层的输出求导值的shape:\\n\" + str(residual_x.shape))\n", 117 | "print (\"\\n对上层的输出求导:\\n\" + str(residual_x))\n", 118 | "\n", 119 | "\n", 120 | "\n", 121 | "# 2.对权重系数W求导\n", 122 | "lr = 0.01\n", 123 | "reg = 0.75\n", 124 | "prev_grad_w = np.zeros_like(ww)\n", 125 | "ww -= lr * (x.T.dot(loss) + prev_grad_w * reg)\n", 126 | "prev_grad_w = ww\n", 127 | "\n", 128 | "print (\"\\n2.对权重系数W求导之后更新的W值的shape:\\n\" + str(prev_grad_w.shape))\n", 129 | "print (\"\\n对权重系数W求导之后更新的W:\\n\" + str(prev_grad_w))\n", 130 | "\n", 131 | "# 3.对偏置系数b求导\n", 132 | "prev_grad_b = np.zeros_like(bb)\n", 133 | "bb -= lr * (np.sum(loss, axis=0))\n", 134 | "prev_grad_b = bb\n", 135 | "\n", 136 | "print (\"\\n3.对偏置系数b求导之后更新的b值的shape:\\n\" + str(prev_grad_b.shape))\n", 137 | "print (\"\\n对偏置系数b求导之后更新的b:\\n\" + str(prev_grad_b))" 138 | ] 139 | }, 140 | { 141 | "cell_type": "markdown", 142 | "metadata": {}, 143 | "source": [ 144 | "# 3.numpy 的flatten层:将每个batch里的每个channel安顺序按行展开" 145 | ] 146 | }, 147 | { 148 | "cell_type": "code", 149 | "execution_count": 63, 
150 | "metadata": {}, 151 | "outputs": [ 152 | { 153 | "name": "stdout", 154 | "output_type": "stream", 155 | "text": [ 156 | "[[[[ 1 2]\n", 157 | " [ 3 4]]\n", 158 | "\n", 159 | " [[ 5 6]\n", 160 | " [ 7 8]]\n", 161 | "\n", 162 | " [[ 9 10]\n", 163 | " [11 12]]]\n", 164 | "\n", 165 | "\n", 166 | " [[[13 14]\n", 167 | " [15 16]]\n", 168 | "\n", 169 | " [[17 18]\n", 170 | " [19 20]]\n", 171 | "\n", 172 | " [[21 22]\n", 173 | " [23 24]]]]\n", 174 | "[[ 1 2 3 4 5 6 7 8 9 10 11 12]\n", 175 | " [13 14 15 16 17 18 19 20 21 22 23 24]]\n" 176 | ] 177 | } 178 | ], 179 | "source": [ 180 | "# numpy 的flatten层:将每个batch里的每个channel安顺序按行展开\n", 181 | "import numpy as np\n", 182 | "q = np.arange(1,25,1).reshape(2,3,2,2)\n", 183 | "print (q)\n", 184 | "print (q.reshape(2,3*2*2))" 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": null, 190 | "metadata": {}, 191 | "outputs": [], 192 | "source": [] 193 | } 194 | ], 195 | "metadata": { 196 | "kernelspec": { 197 | "display_name": "Python 3", 198 | "language": "python", 199 | "name": "python3" 200 | }, 201 | "language_info": { 202 | "codemirror_mode": { 203 | "name": "ipython", 204 | "version": 3 205 | }, 206 | "file_extension": ".py", 207 | "mimetype": "text/x-python", 208 | "name": "python", 209 | "nbconvert_exporter": "python", 210 | "pygments_lexer": "ipython3", 211 | "version": "3.6.6" 212 | } 213 | }, 214 | "nbformat": 4, 215 | "nbformat_minor": 2 216 | } 217 | -------------------------------------------------------------------------------- /Image-mean-data/README.md: -------------------------------------------------------------------------------- 1 | # 图像数据去均值代码,细节请看我的博客:[【数据预处理】:图像去均值](https://blog.csdn.net/weixin_37251044/article/details/81157344) 2 | # 1.图片生成CIfar-10格式的数据 3 | # 2.图片去均值 4 | -------------------------------------------------------------------------------- /Image-mean-data/img/README.md: -------------------------------------------------------------------------------- 1 | 2 | 
-------------------------------------------------------------------------------- /Image-mean-data/img/img0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/Image-mean-data/img/img0.png -------------------------------------------------------------------------------- /Image-mean-data/img/img1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/Image-mean-data/img/img1.png -------------------------------------------------------------------------------- /Image-mean-data/img/img2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/Image-mean-data/img/img2.png -------------------------------------------------------------------------------- /Image-mean-data/img/img3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/Image-mean-data/img/img3.png -------------------------------------------------------------------------------- /Image-mean-data/img/img4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/Image-mean-data/img/img4.png -------------------------------------------------------------------------------- /Image-mean-data/img/img5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/Image-mean-data/img/img5.png 
-------------------------------------------------------------------------------- /Image-mean-data/图像减去均值.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 1.图片制作cifar-10格式的数据" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 52, 13 | "metadata": {}, 14 | "outputs": [ 15 | { 16 | "name": "stdout", 17 | "output_type": "stream", 18 | "text": [ 19 | "6\n", 20 | "(6, 3, 32, 32)\n", 21 | "[[ 62 61 60 ..., 64 82 62]\n", 22 | " [ 62 63 61 ..., 77 114 64]\n", 23 | " [ 67 78 115 ..., 100 119 63]\n", 24 | " ..., \n", 25 | " [161 159 159 ..., 152 157 156]\n", 26 | " [163 161 162 ..., 162 161 161]\n", 27 | " [169 167 167 ..., 167 167 167]]\n" 28 | ] 29 | } 30 | ], 31 | "source": [ 32 | "%matplotlib inline\n", 33 | "import matplotlib.pyplot as plt \n", 34 | "from PIL import Image\n", 35 | "import numpy as np\n", 36 | "from numpy import *\n", 37 | "import os\n", 38 | "\n", 39 | "img_dir='.\\img'\n", 40 | "img_list=os.listdir(img_dir)\n", 41 | "\n", 42 | "sum_rgb = []\n", 43 | "sum_img = []\n", 44 | "count=0\n", 45 | "\n", 46 | "for img_name in img_list:\n", 47 | " img_path=os.path.join(img_dir,img_name)\n", 48 | " img = Image.open(img_path, 'r')\n", 49 | " r,g,b = img.split() \n", 50 | " #print (np.array(r).shape)\n", 51 | " sum_rgb.append(np.array(r))\n", 52 | " sum_rgb.append(np.array(g))\n", 53 | " sum_rgb.append(np.array(b)) \n", 54 | " #print (np.array(sum_rgb).shape)\n", 55 | " sum_img.append(sum_rgb)\n", 56 | " #print (np.array(sum_img).shape)\n", 57 | " sum_rgb = []\n", 58 | " count = count +1\n", 59 | "\n", 60 | "print (count)\n", 61 | "print (np.array(sum_img).shape)\n", 62 | "print (np.array(sum_img)[0][0])\n" 63 | ] 64 | }, 65 | { 66 | "cell_type": "markdown", 67 | "metadata": {}, 68 | "source": [ 69 | "# 2.图像去均值(image mean)" 70 | ] 71 | }, 72 | { 73 | "cell_type": "code", 74 | "execution_count": 53, 75 | 
"metadata": {}, 76 | "outputs": [ 77 | { 78 | "name": "stdout", 79 | "output_type": "stream", 80 | "text": [ 81 | "减去均值之前,X_train的第一幅图像的RGB通道的第一个通道的图像数值32*32:\n", 82 | "[[ 62 61 60 ..., 64 82 62]\n", 83 | " [ 62 63 61 ..., 77 114 64]\n", 84 | " [ 67 78 115 ..., 100 119 63]\n", 85 | " ..., \n", 86 | " [161 159 159 ..., 152 157 156]\n", 87 | " [163 161 162 ..., 162 161 161]\n", 88 | " [169 167 167 ..., 167 167 167]]\n", 89 | "-----------------------------------------------\n", 90 | "mean_image的形状以及数值\n", 91 | "(3, 32, 32)\n", 92 | "[[ 121.33333333 114.66666667 113.83333333 ..., 134. 135.5\n", 93 | " 130.66666667]\n", 94 | " [ 112.33333333 111.5 110.33333333 ..., 134.16666667\n", 95 | " 136.16666667 125.16666667]\n", 96 | " [ 113.33333333 112.66666667 119.83333333 ..., 134.16666667 137.5\n", 97 | " 123.66666667]\n", 98 | " ..., \n", 99 | " [ 135.66666667 131.66666667 129.66666667 ..., 99.33333333 84. 86. ]\n", 100 | " [ 129.16666667 125.5 128.5 ..., 112.16666667\n", 101 | " 99.66666667 101. ]\n", 102 | " [ 129.83333333 125.66666667 127.66666667 ..., 122.16666667\n", 103 | " 112.33333333 109.66666667]]\n", 104 | "-----------------------------------------------\n", 105 | "减去均值之后,X_train的第一幅图像的RGB通道的第一个通道的图像数值32*32:\n", 106 | "[[-59.33333333 -53.66666667 -53.83333333 ..., -70. -53.5\n", 107 | " -68.66666667]\n", 108 | " [-50.33333333 -48.5 -49.33333333 ..., -57.16666667 -22.16666667\n", 109 | " -61.16666667]\n", 110 | " [-46.33333333 -34.66666667 -4.83333333 ..., -34.16666667 -18.5\n", 111 | " -60.66666667]\n", 112 | " ..., \n", 113 | " [ 25.33333333 27.33333333 29.33333333 ..., 52.66666667 73. 70. ]\n", 114 | " [ 33.83333333 35.5 33.5 ..., 49.83333333 61.33333333\n", 115 | " 60. 
]\n", 116 | " [ 39.16666667 41.33333333 39.33333333 ..., 44.83333333 54.66666667\n", 117 | " 57.33333333]]\n" 118 | ] 119 | } 120 | ], 121 | "source": [ 122 | "# Normalize the data: subtract the mean image\n", 123 | "X_train = sum_img\n", 124 | "print (\"减去均值之前,X_train的第一幅图像的RGB通道的第一个通道的图像数值32*32:\")\n", 125 | "print (X_train[0][0])\n", 126 | "mean_image = np.mean(X_train, axis=0) \n", 127 | "#shape=(3,32, 32) 这里axis=0表示按照列算均值,在这里是将所有图像的R图上的每个像素点的数值取平均,G,B通道同理,这里是image mean。\n", 128 | "X_train_m = X_train - mean_image\n", 129 | "\n", 130 | "\n", 131 | "\n", 132 | "print (\"-----------------------------------------------\")\n", 133 | "print (\"mean_image的形状以及数值\")\n", 134 | "print (mean_image.shape)\n", 135 | "print (mean_image[0])\n", 136 | "print (\"-----------------------------------------------\")\n", 137 | "print (\"减去均值之后,X_train的第一幅图像的RGB通道的第一个通道的图像数值32*32:\")\n", 138 | "print (X_train_m[0][0])\n" 139 | ] 140 | }, 141 | { 142 | "cell_type": "markdown", 143 | "metadata": {}, 144 | "source": [ 145 | "# 其它:像素均值(pixel mean)" 146 | ] 147 | }, 148 | { 149 | "cell_type": "code", 150 | "execution_count": 51, 151 | "metadata": {}, 152 | "outputs": [ 153 | { 154 | "name": "stdout", 155 | "output_type": "stream", 156 | "text": [ 157 | "[122.30835127019559, 115.90339671024662, 99.094251567814624]\n" 158 | ] 159 | } 160 | ], 161 | "source": [ 162 | "import os\n", 163 | "import cv2\n", 164 | "from numpy import *\n", 165 | "\n", 166 | "img_dir='.\\img'\n", 167 | "img_list=os.listdir(img_dir)\n", 168 | "img_size=224\n", 169 | "sum_r=0\n", 170 | "sum_g=0\n", 171 | "sum_b=0\n", 172 | "count=0\n", 173 | "\n", 174 | "for img_name in img_list:\n", 175 | " img_path=os.path.join(img_dir,img_name)\n", 176 | " img=cv2.imread(img_path)\n", 177 | " img=cv2.cvtColor(img,cv2.COLOR_BGR2RGB)\n", 178 | " img=cv2.resize(img,(img_size,img_size))\n", 179 | " sum_r=sum_r+img[:,:,0].mean()\n", 180 | " sum_g=sum_g+img[:,:,1].mean()\n", 181 | " sum_b=sum_b+img[:,:,2].mean()\n", 182 | " 
count=count+1\n", 183 | "\n", 184 | "sum_r=sum_r/count\n", 185 | "sum_g=sum_g/count\n", 186 | "sum_b=sum_b/count\n", 187 | "img_mean=[sum_r,sum_g,sum_b]\n", 188 | "print (img_mean)" 189 | ] 190 | }, 191 | { 192 | "cell_type": "code", 193 | "execution_count": null, 194 | "metadata": {}, 195 | "outputs": [], 196 | "source": [] 197 | } 198 | ], 199 | "metadata": { 200 | "kernelspec": { 201 | "display_name": "Python 3", 202 | "language": "python", 203 | "name": "python3" 204 | }, 205 | "language_info": { 206 | "codemirror_mode": { 207 | "name": "ipython", 208 | "version": 3 209 | }, 210 | "file_extension": ".py", 211 | "mimetype": "text/x-python", 212 | "name": "python", 213 | "nbconvert_exporter": "python", 214 | "pygments_lexer": "ipython3", 215 | "version": "3.6.6" 216 | } 217 | }, 218 | "nbformat": 4, 219 | "nbformat_minor": 2 220 | } 221 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # csdn_blog_code_implement 2 | This is the code for my CSDN blog posts: code implementations only, with no detailed explanations. For more information, please visit my CSDN blog ["Jack_Kuo"](https://blog.csdn.net/weixin_37251044) 3 | # csdn 4 | 这是我的csdn博客代码,只有代码实现,没有详细解释,欲了解更多信息,请访问我的csdn博客:["Jack_Kuo"](https://blog.csdn.net/weixin_37251044) 5 | 6 | 7 | ### 1.[Implementing a CNN in NumPy to classify CIFAR-10](https://blog.csdn.net/weixin_37251044/article/details/81290728)
8 | 9 | ### 2.[cs224n course notes](https://blog.csdn.net/weixin_37251044/article/details/83473874)<br>
10 | 11 | ### 3.[Caffe2 tutorial notes (Jupyter version)](https://github.com/JackKuo666/csdn_blog_code_implement/tree/master/caffe2)<br>
12 | 13 | ### 4.[NLP text classification (Jupyter version)](https://github.com/JackKuo666/csdn_blog_code_implement/tree/master/text_classfier)<br>
14 | -------------------------------------------------------------------------------- /Softmax-Cross Entropy/README.md: -------------------------------------------------------------------------------- 1 | # 说明: 2 | 这是我的博客:[神经网络的Loss函数编写:Softmax+Cross Entropy](https://blog.csdn.net/weixin_37251044/article/details/81180449) 的代码,原理请访问我的博客。 3 | -------------------------------------------------------------------------------- /Softmax-Cross Entropy/【交叉熵】:神经网络的Loss函数编写:Softmax+Cross Entropy.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# 1.softmax" 8 | ] 9 | }, 10 | { 11 | "cell_type": "code", 12 | "execution_count": 64, 13 | "metadata": {}, 14 | "outputs": [ 15 | { 16 | "name": "stdout", 17 | "output_type": "stream", 18 | "text": [ 19 | "数组a的值为:\n", 20 | "[[-0.69314718 -1.60943791 -1.2039728 ]\n", 21 | " [-0.22314355 -2.30258509 -2.30258509]]\n", 22 | "找出a中每行最大值:\n", 23 | "[[-0.69314718]\n", 24 | " [-0.22314355]]\n", 25 | "a中每行均减去本行最大值后的数组b:\n", 26 | "[[ 0. -0.91629073 -0.51082562]\n", 27 | " [ 0. 
-2.07944154 -2.07944154]]\n", 28 | "对数组a进行softmax:\n", 29 | "[[ 0.5 0.2 0.3]\n", 30 | " [ 0.8 0.1 0.1]]\n", 31 | "对减去最大值后的数组b进行softmax:\n", 32 | "[[ 0.5 0.2 0.3]\n", 33 | " [ 0.8 0.1 0.1]]\n" 34 | ] 35 | } 36 | ], 37 | "source": [ 38 | "import numpy as np\n", 39 | "a = np.array([[-0.69314718, -1.60943791, -1.2039728],[-0.22314355, -2.30258509, -2.30258509]])\n", 40 | "print (\"数组a的值为:\\n\" + str(a))\n", 41 | "print (\"找出a中每行最大值:\")\n", 42 | "print (np.max(a, axis=1).reshape(-1,1))\n", 43 | "b = a - np.max(a, axis=1).reshape(-1, 1)\n", 44 | "print (\"a中每行均减去本行最大值后的数组b:\")\n", 45 | "print (b)\n", 46 | "a_softmax = np.exp(a) / np.sum(np.exp(a), axis=1).reshape(-1, 1)\n", 47 | "print (\"对数组a进行softmax:\")\n", 48 | "print (a_softmax)\n", 49 | "b_softmax = np.exp(b) / np.sum(np.exp(b), axis=1).reshape(-1, 1)\n", 50 | "print (\"对减去最大值后的数组b进行softmax:\")\n", 51 | "print (b_softmax)\n" 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": {}, 57 | "source": [ 58 | "As the output shows, subtracting each row's maximum leaves the softmax result unchanged. The subtraction is still done before the softmax, not to remove noise, but for numerical stability: it makes every argument of np.exp non-positive, so the exponentials cannot overflow for large inputs." 59 | ] 60 | }, 61 | { 62 | "cell_type": "markdown", 63 | "metadata": {}, 64 | "source": [ 65 | "# 2.Cross Entropy" 66 | ] 67 | }, 68 | { 69 | "cell_type": "code", 70 | "execution_count": 68, 71 | "metadata": {}, 72 | "outputs": [ 73 | { 74 | "name": "stdout", 75 | "output_type": "stream", 76 | "text": [ 77 | "我们上次得到的softmax为:\n", 78 | "[[ 0.5 0.2 0.3]\n", 79 | " [ 0.8 0.1 0.1]]\n", 80 | "对softmax取log10:\n", 81 | "[[-0.30103 -0.69897 -0.52287874]\n", 82 | " [-0.09691001 -1. -1.
]]\n", 83 | "找出softmax中每行对应标签类别(这里两行的标签都是第0类)的概率,也就是第一行的第0个,第二行的第0个:\n", 84 | "[ 0.5 0.8]\n", 85 | "分别对这两个数取log10:\n", 86 | "[-0.30103 -0.09691001]\n", 87 | "最后,因为这两行是一个batch的两个样本,所以加和取平均,得到的就是Loss:\n", 88 | "0.198970004736\n" 89 | ] 90 | } 91 | ], 92 | "source": [ 93 | "print (\"我们上次得到的softmax为:\")\n", 94 | "print (b_softmax)\n", 95 | "d = np.log10(b_softmax) # np.log() is the natural log (base e); np.log10() is base 10. Cross entropy is normally defined with the natural log; using log10 here only rescales the loss by a constant factor.\n", 96 | "print (\"对softmax取log10:\")\n", 97 | "print (d)\n", 98 | "print (\"找出softmax中每行对应标签类别(这里两行的标签都是第0类)的概率,也就是第一行的第0个,第二行的第0个:\")\n", 99 | "print (b_softmax[range(2), list([0,0])])\n", 100 | "c = np.log10(b_softmax[range(2), list([0,0])])\n", 101 | "print (\"分别对这两个数取log10:\")\n", 102 | "print (c)\n", 103 | "print (\"最后,因为这两行是一个batch的两个样本,所以加和取平均,得到的就是Loss:\")\n", 104 | "print (-np.sum(np.log10(b_softmax[range(2), list([0,0])]))*(1/2))" 105 | ] 106 | }, 107 | { 108 | "cell_type": "code", 109 | "execution_count": null, 110 | "metadata": {}, 111 | "outputs": [], 112 | "source": [] 113 | } 114 | ], 115 | "metadata": { 116 | "kernelspec": { 117 | "display_name": "Python 3", 118 | "language": "python", 119 | "name": "python3" 120 | }, 121 | "language_info": { 122 | "codemirror_mode": { 123 | "name": "ipython", 124 | "version": 3 125 | }, 126 | "file_extension": ".py", 127 | "mimetype": "text/x-python", 128 | "name": "python", 129 | "nbconvert_exporter": "python", 130 | "pygments_lexer": "ipython3", 131 | "version": "3.6.6" 132 | } 133 | }, 134 | "nbformat": 4, 135 | "nbformat_minor": 2 136 | } 137 | -------------------------------------------------------------------------------- /caffe2/1.
Intro Tutorial.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Caffe2 Concepts\n" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "# Blobs and Workspace, Tensors\n" 15 | ] 16 | }, 17 | { 18 | "cell_type": "markdown", 19 | "metadata": {}, 20 | "source": [ 21 | "下面,我们通过搭建一个简单的网络,来简单了解caffe2的Blobs and Workspace, Tensors怎么用,以及之间的关系:\n", 22 | "我们这个网络有三层:\n", 23 | "\n", 24 | "1.一个全连接层 (FC)\n", 25 | "\n", 26 | "2.一个s型激活层,带有一个Softmax\n", 27 | "\n", 28 | "3.一个交叉熵loss层\n", 29 | "\n" 30 | ] 31 | }, 32 | { 33 | "cell_type": "markdown", 34 | "metadata": {}, 35 | "source": [ 36 | "# 1.定义数据,标签;并讲数据标签放入workspace中的“data”“label”中" 37 | ] 38 | }, 39 | { 40 | "cell_type": "code", 41 | "execution_count": 36, 42 | "metadata": {}, 43 | "outputs": [ 44 | { 45 | "data": { 46 | "text/plain": [ 47 | "True" 48 | ] 49 | }, 50 | "execution_count": 36, 51 | "metadata": {}, 52 | "output_type": "execute_result" 53 | } 54 | ], 55 | "source": [ 56 | "from caffe2.python import workspace, model_helper\n", 57 | "import numpy as np\n", 58 | "\n", 59 | "# Create the input data\n", 60 | "data = np.random.rand(16, 100).astype(np.float32)\n", 61 | "\n", 62 | "# Create labels for the data as integers [0, 9].\n", 63 | "label = (np.random.rand(16) * 10).astype(np.int32)\n", 64 | "\n", 65 | "workspace.FeedBlob(\"data\", data)\n", 66 | "workspace.FeedBlob(\"label\", label)\n" 67 | ] 68 | }, 69 | { 70 | "cell_type": "markdown", 71 | "metadata": {}, 72 | "source": [ 73 | "# 2.使用model_helper新建一个model" 74 | ] 75 | }, 76 | { 77 | "cell_type": "code", 78 | "execution_count": 37, 79 | "metadata": { 80 | "collapsed": true 81 | }, 82 | "outputs": [], 83 | "source": [ 84 | "# Create model using a model helper\n", 85 | "m = model_helper.ModelHelper(name=\"my_first_net\")" 86 | ] 87 | }, 88 | { 89 | "cell_type": "markdown", 90 | "metadata": 
{}, 91 | "source": [ 92 | "ModelHelper将会创建两个相关的net:\n", 93 | "\n", 94 | "1.初始化相关参数的网络(ref. init_net)\n", 95 | "\n", 96 | "2.执行实际训练的网络(ref. exec_net)\n" 97 | ] 98 | }, 99 | { 100 | "cell_type": "markdown", 101 | "metadata": {}, 102 | "source": [ 103 | "# 3.新建FC层" 104 | ] 105 | }, 106 | { 107 | "cell_type": "markdown", 108 | "metadata": {}, 109 | "source": [ 110 | "## 3.1 新建operators之前,先定义w,b" 111 | ] 112 | }, 113 | { 114 | "cell_type": "code", 115 | "execution_count": 38, 116 | "metadata": { 117 | "collapsed": true 118 | }, 119 | "outputs": [], 120 | "source": [ 121 | "weight = m.param_init_net.XavierFill([], 'fc_w', shape=[10, 100])\n", 122 | "bias = m.param_init_net.ConstantFill([], 'fc_b', shape=[10, ])" 123 | ] 124 | }, 125 | { 126 | "cell_type": "markdown", 127 | "metadata": {}, 128 | "source": [ 129 | "## 3.2 给m这个net新建一个FC的operators" 130 | ] 131 | }, 132 | { 133 | "cell_type": "code", 134 | "execution_count": 39, 135 | "metadata": { 136 | "collapsed": true 137 | }, 138 | "outputs": [], 139 | "source": [ 140 | "fc_1 = m.net.FC([\"data\", \"fc_w\", \"fc_b\"], \"fc1\")" 141 | ] 142 | }, 143 | { 144 | "cell_type": "markdown", 145 | "metadata": {}, 146 | "source": [ 147 | "# 4.新建激活层和softmax层" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": 40, 153 | "metadata": { 154 | "collapsed": true 155 | }, 156 | "outputs": [], 157 | "source": [ 158 | "pred = m.net.Sigmoid(fc_1, \"pred\")\n", 159 | "softmax, loss = m.net.SoftmaxWithLoss([pred, \"label\"], [\"softmax\", \"loss\"])" 160 | ] 161 | }, 162 | { 163 | "cell_type": "markdown", 164 | "metadata": {}, 165 | "source": [ 166 | "注意:\n", 167 | "\n", 168 | "1.我们这里batch_size = 16,也就是一次16个samples同时训练;\n", 169 | "\n", 170 | "2.我们这里只是创建模型的定义,下一步开始训练;\n", 171 | "\n", 172 | "3.model会存储在一个protobuf structure中,我们可以通过以下命令查看model:" 173 | ] 174 | }, 175 | { 176 | "cell_type": "code", 177 | "execution_count": 41, 178 | "metadata": {}, 179 | "outputs": [ 180 | { 181 | "name": "stdout", 182 | "output_type": 
"stream", 183 | "text": [ 184 | "name: \"my_first_net\"\n", 185 | "op {\n", 186 | " input: \"data\"\n", 187 | " input: \"fc_w\"\n", 188 | " input: \"fc_b\"\n", 189 | " output: \"fc1\"\n", 190 | " name: \"\"\n", 191 | " type: \"FC\"\n", 192 | "}\n", 193 | "op {\n", 194 | " input: \"fc1\"\n", 195 | " output: \"pred\"\n", 196 | " name: \"\"\n", 197 | " type: \"Sigmoid\"\n", 198 | "}\n", 199 | "op {\n", 200 | " input: \"pred\"\n", 201 | " input: \"label\"\n", 202 | " output: \"softmax\"\n", 203 | " output: \"loss\"\n", 204 | " name: \"\"\n", 205 | " type: \"SoftmaxWithLoss\"\n", 206 | "}\n", 207 | "external_input: \"data\"\n", 208 | "external_input: \"fc_w\"\n", 209 | "external_input: \"fc_b\"\n", 210 | "external_input: \"label\"\n", 211 | "\n" 212 | ] 213 | } 214 | ], 215 | "source": [ 216 | "print(m.net.Proto())" 217 | ] 218 | }, 219 | { 220 | "cell_type": "markdown", 221 | "metadata": {}, 222 | "source": [ 223 | "**查看初始化的参数:**" 224 | ] 225 | }, 226 | { 227 | "cell_type": "code", 228 | "execution_count": 42, 229 | "metadata": {}, 230 | "outputs": [ 231 | { 232 | "name": "stdout", 233 | "output_type": "stream", 234 | "text": [ 235 | "name: \"my_first_net_init\"\n", 236 | "op {\n", 237 | " output: \"fc_w\"\n", 238 | " name: \"\"\n", 239 | " type: \"XavierFill\"\n", 240 | " arg {\n", 241 | " name: \"shape\"\n", 242 | " ints: 10\n", 243 | " ints: 100\n", 244 | " }\n", 245 | "}\n", 246 | "op {\n", 247 | " output: \"fc_b\"\n", 248 | " name: \"\"\n", 249 | " type: \"ConstantFill\"\n", 250 | " arg {\n", 251 | " name: \"shape\"\n", 252 | " ints: 10\n", 253 | " }\n", 254 | "}\n", 255 | "\n" 256 | ] 257 | } 258 | ], 259 | "source": [ 260 | "print(m.param_init_net.Proto())" 261 | ] 262 | }, 263 | { 264 | "cell_type": "markdown", 265 | "metadata": {}, 266 | "source": [ 267 | "# 5.执行" 268 | ] 269 | }, 270 | { 271 | "cell_type": "markdown", 272 | "metadata": {}, 273 | "source": [ 274 | "## 5.1.运行一次参数初始化:" 275 | ] 276 | }, 277 | { 278 | "cell_type": "code", 279 | "execution_count": 
43, 280 | "metadata": {}, 281 | "outputs": [ 282 | { 283 | "data": { 284 | "text/plain": [ 285 | "True" 286 | ] 287 | }, 288 | "execution_count": 43, 289 | "metadata": {}, 290 | "output_type": "execute_result" 291 | } 292 | ], 293 | "source": [ 294 | "workspace.RunNetOnce(m.param_init_net)" 295 | ] 296 | }, 297 | { 298 | "cell_type": "markdown", 299 | "metadata": {}, 300 | "source": [ 301 | "## 5.2.创建训练网络" 302 | ] 303 | }, 304 | { 305 | "cell_type": "code", 306 | "execution_count": 44, 307 | "metadata": {}, 308 | "outputs": [ 309 | { 310 | "data": { 311 | "text/plain": [ 312 | "True" 313 | ] 314 | }, 315 | "execution_count": 44, 316 | "metadata": {}, 317 | "output_type": "execute_result" 318 | } 319 | ], 320 | "source": [ 321 | "workspace.CreateNet(m.net)" 322 | ] 323 | }, 324 | { 325 | "cell_type": "markdown", 326 | "metadata": {}, 327 | "source": [ 328 | "## 5.3.训练网络" 329 | ] 330 | }, 331 | { 332 | "cell_type": "code", 333 | "execution_count": 45, 334 | "metadata": {}, 335 | "outputs": [], 336 | "source": [ 337 | "# Run 100 x 10 iterations\n", 338 | "for _ in range(100):\n", 339 | " data = np.random.rand(16, 100).astype(np.float32)\n", 340 | " label = (np.random.rand(16) * 10).astype(np.int32)\n", 341 | "\n", 342 | " workspace.FeedBlob(\"data\", data)\n", 343 | " workspace.FeedBlob(\"label\", label)\n", 344 | "\n", 345 | " workspace.RunNet(m.name, 10) # run for 10 times" 346 | ] 347 | }, 348 | { 349 | "cell_type": "markdown", 350 | "metadata": {}, 351 | "source": [ 352 | "执行后,您可以检查存储在输出blob(包含张量,即numpy数组)中的结果:" 353 | ] 354 | }, 355 | { 356 | "cell_type": "code", 357 | "execution_count": 46, 358 | "metadata": {}, 359 | "outputs": [ 360 | { 361 | "name": "stdout", 362 | "output_type": "stream", 363 | "text": [ 364 | "[[0.0876148 0.07548364 0.09623323 0.12869014 0.08200206 0.10208801\n", 365 | " 0.12435256 0.11641801 0.09138598 0.09573159]\n", 366 | " [0.08656767 0.08965836 0.10828611 0.1141852 0.09495509 0.08777543\n", 367 | " 0.11474496 0.10817766 0.10231902 
0.09333058]\n", 368 | " [0.08370095 0.07577838 0.10283273 0.12051839 0.09019332 0.1001257\n", 369 | " 0.12431574 0.10195971 0.10198188 0.09859312]\n", 370 | " [0.08037042 0.08022888 0.1113458 0.12137544 0.09024613 0.09120934\n", 371 | " 0.11656947 0.10384342 0.10305068 0.10176046]\n", 372 | " [0.0878759 0.07796903 0.10136125 0.12322115 0.0864681 0.09394205\n", 373 | " 0.11819859 0.11712413 0.08619776 0.10764208]\n", 374 | " [0.08172415 0.07318036 0.0948612 0.13026348 0.09646249 0.10189744\n", 375 | " 0.10387536 0.1055788 0.08963021 0.12252647]\n", 376 | " [0.07917343 0.08171824 0.11740194 0.11786009 0.09542355 0.10089807\n", 377 | " 0.10691947 0.08718182 0.09749852 0.11592486]\n", 378 | " [0.09023243 0.08145688 0.09926067 0.11211696 0.09406143 0.09656373\n", 379 | " 0.10264882 0.11141657 0.09514029 0.11710224]\n", 380 | " [0.08539043 0.08547576 0.10477988 0.11539709 0.08580235 0.09573507\n", 381 | " 0.11900022 0.11075822 0.09151191 0.106149 ]\n", 382 | " [0.08888969 0.07497442 0.1067399 0.12012165 0.0906563 0.09574833\n", 383 | " 0.11331636 0.09551872 0.09023768 0.12379707]\n", 384 | " [0.07795423 0.07904412 0.10365619 0.1125275 0.08462506 0.09732373\n", 385 | " 0.12132311 0.10707201 0.09885976 0.11761425]\n", 386 | " [0.08336885 0.07465281 0.10840832 0.11558709 0.08637662 0.11698949\n", 387 | " 0.10753863 0.10991579 0.08486839 0.11229403]\n", 388 | " [0.0904911 0.08106896 0.10705142 0.11207567 0.1001352 0.08801412\n", 389 | " 0.10004795 0.10290649 0.09353973 0.1246694 ]\n", 390 | " [0.08747555 0.07783487 0.11689501 0.11241519 0.08192006 0.09484729\n", 391 | " 0.11507696 0.10664993 0.10004598 0.10683927]\n", 392 | " [0.08289554 0.08357728 0.10166691 0.11176678 0.08568645 0.10207526\n", 393 | " 0.1217039 0.10599326 0.09480698 0.10982765]\n", 394 | " [0.07501717 0.0773883 0.09515747 0.12306631 0.1001178 0.09227181\n", 395 | " 0.12529895 0.09527896 0.10404698 0.11235628]]\n", 396 | "2.3265471\n" 397 | ] 398 | } 399 | ], 400 | "source": [ 401 | 
"print(workspace.FetchBlob(\"softmax\"))\n", 402 | "print(workspace.FetchBlob(\"loss\"))" 403 | ] 404 | }, 405 | { 406 | "cell_type": "markdown", 407 | "metadata": {}, 408 | "source": [ 409 | "# 6.反向传播\n", 410 | "\n", 411 | "这个网络只包含前向传播,因此它不会学习任何东西。通过在正向传递中为每个运算符添加gradient operators来创建向后传递。" 412 | ] 413 | }, 414 | { 415 | "cell_type": "code", 416 | "execution_count": 50, 417 | "metadata": {}, 418 | "outputs": [ 419 | { 420 | "name": "stdout", 421 | "output_type": "stream", 422 | "text": [ 423 | "name: \"my_first_net_4\"\n", 424 | "op {\n", 425 | " input: \"data\"\n", 426 | " input: \"fc_w\"\n", 427 | " input: \"fc_b\"\n", 428 | " output: \"fc1\"\n", 429 | " name: \"\"\n", 430 | " type: \"FC\"\n", 431 | "}\n", 432 | "op {\n", 433 | " input: \"fc1\"\n", 434 | " output: \"pred\"\n", 435 | " name: \"\"\n", 436 | " type: \"Sigmoid\"\n", 437 | "}\n", 438 | "op {\n", 439 | " input: \"pred\"\n", 440 | " input: \"label\"\n", 441 | " output: \"softmax\"\n", 442 | " output: \"loss\"\n", 443 | " name: \"\"\n", 444 | " type: \"SoftmaxWithLoss\"\n", 445 | "}\n", 446 | "op {\n", 447 | " input: \"loss\"\n", 448 | " output: \"loss_autogen_grad\"\n", 449 | " name: \"\"\n", 450 | " type: \"ConstantFill\"\n", 451 | " arg {\n", 452 | " name: \"value\"\n", 453 | " f: 1.0\n", 454 | " }\n", 455 | "}\n", 456 | "op {\n", 457 | " input: \"pred\"\n", 458 | " input: \"label\"\n", 459 | " input: \"softmax\"\n", 460 | " input: \"loss_autogen_grad\"\n", 461 | " output: \"pred_grad\"\n", 462 | " name: \"\"\n", 463 | " type: \"SoftmaxWithLossGradient\"\n", 464 | " is_gradient_op: true\n", 465 | "}\n", 466 | "op {\n", 467 | " input: \"pred\"\n", 468 | " input: \"pred_grad\"\n", 469 | " output: \"fc1_grad\"\n", 470 | " name: \"\"\n", 471 | " type: \"SigmoidGradient\"\n", 472 | " is_gradient_op: true\n", 473 | "}\n", 474 | "op {\n", 475 | " input: \"data\"\n", 476 | " input: \"fc_w\"\n", 477 | " input: \"fc1_grad\"\n", 478 | " output: \"fc_w_grad\"\n", 479 | " output: \"fc_b_grad\"\n", 480 | " output: 
\"data_grad\"\n", 481 | " name: \"\"\n", 482 | " type: \"FCGradient\"\n", 483 | " is_gradient_op: true\n", 484 | "}\n", 485 | "external_input: \"data\"\n", 486 | "external_input: \"fc_w\"\n", 487 | "external_input: \"fc_b\"\n", 488 | "external_input: \"label\"\n", 489 | "\n", 490 | "[[0.11452593 0.10101821 0.09836449 0.0965292 0.10351954 0.11437476\n", 491 | " 0.0841739 0.08410054 0.1032382 0.10015523]\n", 492 | " [0.10163245 0.09582136 0.10728257 0.09061375 0.09165412 0.10958022\n", 493 | " 0.09790237 0.08632593 0.10816205 0.11102515]\n", 494 | " [0.11153644 0.09187157 0.08876271 0.10034752 0.10996686 0.10122422\n", 495 | " 0.09731211 0.07600702 0.10907148 0.11390015]\n", 496 | " [0.10677353 0.09250082 0.09395255 0.09013956 0.11205406 0.10125393\n", 497 | " 0.0998921 0.08900981 0.10824097 0.10618276]\n", 498 | " [0.10816263 0.09636337 0.09732535 0.09815469 0.10086755 0.10827011\n", 499 | " 0.09060381 0.07663068 0.10915532 0.11446647]\n", 500 | " [0.10561427 0.09925039 0.09791987 0.09596141 0.10207485 0.10818447\n", 501 | " 0.09482063 0.08366729 0.09857398 0.11393285]\n", 502 | " [0.09989694 0.10426529 0.10590892 0.09590165 0.10164034 0.10572004\n", 503 | " 0.0843733 0.07041462 0.11091355 0.12096539]\n", 504 | " [0.11713242 0.09610002 0.09875591 0.09168133 0.1124823 0.09706804\n", 505 | " 0.09423647 0.07736441 0.09833553 0.11684357]\n", 506 | " [0.11634789 0.10220942 0.09840939 0.08635401 0.11169235 0.09972424\n", 507 | " 0.0881576 0.07687011 0.11021466 0.11002035]\n", 508 | " [0.0986582 0.10288956 0.09780209 0.09394148 0.10023048 0.11133306\n", 509 | " 0.09275422 0.07090774 0.11865855 0.11282461]\n", 510 | " [0.11411306 0.09094301 0.0921051 0.09122376 0.10246258 0.09556931\n", 511 | " 0.10687889 0.0793411 0.10633816 0.12102505]\n", 512 | " [0.11857571 0.10367715 0.0916706 0.09522462 0.10553856 0.1097901\n", 513 | " 0.08682371 0.08298165 0.10331775 0.10240018]\n", 514 | " [0.10449859 0.09788019 0.08315773 0.09780625 0.10613973 0.10121298\n", 515 | " 0.09945863 
0.07901601 0.103154 0.12767594]\n", 516 | " [0.0995274 0.10280239 0.0894575 0.09834379 0.1094081 0.09976058\n", 517 | " 0.10002778 0.07689121 0.11048673 0.1132945 ]\n", 518 | " [0.10373392 0.09080214 0.09258449 0.09425651 0.12741047 0.10679753\n", 519 | " 0.09512977 0.073976 0.10132045 0.11398876]\n", 520 | " [0.09918068 0.10451347 0.08183451 0.11167922 0.11003976 0.09900602\n", 521 | " 0.08880362 0.08632898 0.11165588 0.10695779]]\n", 522 | "2.2832193\n" 523 | ] 524 | } 525 | ], 526 | "source": [ 527 | "from caffe2.python import workspace, model_helper\n", 528 | "import numpy as np\n", 529 | "\n", 530 | "# Create the input data\n", 531 | "data = np.random.rand(16, 100).astype(np.float32)\n", 532 | "\n", 533 | "# Create labels for the data as integers [0, 9].\n", 534 | "label = (np.random.rand(16) * 10).astype(np.int32)\n", 535 | "\n", 536 | "workspace.FeedBlob(\"data\", data)\n", 537 | "workspace.FeedBlob(\"label\", label)\n", 538 | "\n", 539 | "# Create model using a model helper\n", 540 | "m = model_helper.ModelHelper(name=\"my_first_net\")\n", 541 | "\n", 542 | "\n", 543 | "weight = m.param_init_net.XavierFill([], 'fc_w', shape=[10, 100])\n", 544 | "bias = m.param_init_net.ConstantFill([], 'fc_b', shape=[10, ])\n", 545 | "\n", 546 | "\n", 547 | "fc_1 = m.net.FC([\"data\", \"fc_w\", \"fc_b\"], \"fc1\")\n", 548 | "pred = m.net.Sigmoid(fc_1, \"pred\")\n", 549 | "softmax, loss = m.net.SoftmaxWithLoss([pred, \"label\"], [\"softmax\", \"loss\"])\n", 550 | "\n", 551 | "# 反向传播 add ================\n", 552 | "m.AddGradientOperators([loss])\n", 553 | "# ============================\n", 554 | "\n", 555 | "workspace.RunNetOnce(m.param_init_net)\n", 556 | "workspace.CreateNet(m.net)\n", 557 | "\n", 558 | "# 可以查看加了反向传播之后,net中多出来三个operators的反向传播层\n", 559 | "print(m.net.Proto())\n", 560 | "#=========================================================\n", 561 | "\n", 562 | "# Run 100 x 10 iterations\n", 563 | "for _ in range(100):\n", 564 | " data = np.random.rand(16, 
100).astype(np.float32)\n", 565 | " label = (np.random.rand(16) * 10).astype(np.int32)\n", 566 | "\n", 567 | " workspace.FeedBlob(\"data\", data)\n", 568 | " workspace.FeedBlob(\"label\", label)\n", 569 | "\n", 570 | " workspace.RunNet(m.name, 10) # run for 10 times\n", 571 | "\n", 572 | "print(workspace.FetchBlob(\"softmax\"))\n", 573 | "print(workspace.FetchBlob(\"loss\"))\n" 574 | ] 575 | }, 576 | { 577 | "cell_type": "code", 578 | "execution_count": null, 579 | "metadata": { 580 | "collapsed": true 581 | }, 582 | "outputs": [], 583 | "source": [] 584 | } 585 | ], 586 | "metadata": { 587 | "kernelspec": { 588 | "display_name": "Python 2", 589 | "language": "python", 590 | "name": "python2" 591 | }, 592 | "language_info": { 593 | "codemirror_mode": { 594 | "name": "ipython", 595 | "version": 2 596 | }, 597 | "file_extension": ".py", 598 | "mimetype": "text/x-python", 599 | "name": "python", 600 | "nbconvert_exporter": "python", 601 | "pygments_lexer": "ipython2", 602 | "version": "2.7.14" 603 | } 604 | }, 605 | "nbformat": 4, 606 | "nbformat_minor": 2 607 | } 608 | -------------------------------------------------------------------------------- /caffe2/1.Intro Tutorial.md: -------------------------------------------------------------------------------- 1 | 2 | # Caffe2 Concepts 3 | 4 | 5 | # Blobs and Workspace, Tensors 6 | 7 | 8 | 下面,我们通过搭建一个简单的网络,来简单了解caffe2的Blobs and Workspace, Tensors怎么用,以及之间的关系: 9 | 我们这个网络有三层: 10 | 11 | 1.一个全连接层 (FC) 12 | 13 | 2.一个s型激活层,带有一个Softmax 14 | 15 | 3.一个交叉熵loss层 16 | 17 | 18 | 19 | # 1.定义数据,标签;并讲数据标签放入workspace中的“data”“label”中 20 | 21 | 22 | ```python 23 | from caffe2.python import workspace, model_helper 24 | import numpy as np 25 | 26 | # Create the input data 27 | data = np.random.rand(16, 100).astype(np.float32) 28 | 29 | # Create labels for the data as integers [0, 9]. 
30 | label = (np.random.rand(16) * 10).astype(np.int32) 31 | 32 | workspace.FeedBlob("data", data) 33 | workspace.FeedBlob("label", label) 34 | 35 | ``` 36 | 37 | 38 | 39 | 40 | True 41 | 42 | 43 | 44 | # 2.使用model_helper新建一个model 45 | 46 | 47 | ```python 48 | # Create model using a model helper 49 | m = model_helper.ModelHelper(name="my_first_net") 50 | ``` 51 | 52 | ModelHelper将会创建两个相关的net: 53 | 54 | 1.初始化相关参数的网络(ref. init_net) 55 | 56 | 2.执行实际训练的网络(ref. exec_net) 57 | 58 | 59 | # 3.新建FC层 60 | 61 | ## 3.1 新建operators之前,先定义w,b 62 | 63 | 64 | ```python 65 | weight = m.param_init_net.XavierFill([], 'fc_w', shape=[10, 100]) 66 | bias = m.param_init_net.ConstantFill([], 'fc_b', shape=[10, ]) 67 | ``` 68 | 69 | ## 3.2 给m这个net新建一个FC的operators 70 | 71 | 72 | ```python 73 | fc_1 = m.net.FC(["data", "fc_w", "fc_b"], "fc1") 74 | ``` 75 | 76 | # 4.新建激活层和softmax层 77 | 78 | 79 | ```python 80 | pred = m.net.Sigmoid(fc_1, "pred") 81 | softmax, loss = m.net.SoftmaxWithLoss([pred, "label"], ["softmax", "loss"]) 82 | ``` 83 | 84 | 注意: 85 | 86 | 1.我们这里batch_size = 16,也就是一次16个samples同时训练; 87 | 88 | 2.我们这里只是创建模型的定义,下一步开始训练; 89 | 90 | 3.model会存储在一个protobuf structure中,我们可以通过以下命令查看model: 91 | 92 | 93 | ```python 94 | print(m.net.Proto()) 95 | ``` 96 | 97 | name: "my_first_net" 98 | op { 99 | input: "data" 100 | input: "fc_w" 101 | input: "fc_b" 102 | output: "fc1" 103 | name: "" 104 | type: "FC" 105 | } 106 | op { 107 | input: "fc1" 108 | output: "pred" 109 | name: "" 110 | type: "Sigmoid" 111 | } 112 | op { 113 | input: "pred" 114 | input: "label" 115 | output: "softmax" 116 | output: "loss" 117 | name: "" 118 | type: "SoftmaxWithLoss" 119 | } 120 | external_input: "data" 121 | external_input: "fc_w" 122 | external_input: "fc_b" 123 | external_input: "label" 124 | 125 | 126 | 127 | **查看初始化的参数:** 128 | 129 | 130 | ```python 131 | print(m.param_init_net.Proto()) 132 | ``` 133 | 134 | name: "my_first_net_init" 135 | op { 136 | output: "fc_w" 137 | name: "" 138 | type: "XavierFill" 139 | 
arg { 140 | name: "shape" 141 | ints: 10 142 | ints: 100 143 | } 144 | } 145 | op { 146 | output: "fc_b" 147 | name: "" 148 | type: "ConstantFill" 149 | arg { 150 | name: "shape" 151 | ints: 10 152 | } 153 | } 154 | 155 | 156 | 157 | # 5.执行 158 | 159 | ## 5.1.运行一次参数初始化: 160 | 161 | 162 | ```python 163 | workspace.RunNetOnce(m.param_init_net) 164 | ``` 165 | 166 | 167 | 168 | 169 | True 170 | 171 | 172 | 173 | ## 5.2.创建训练网络 174 | 175 | 176 | ```python 177 | workspace.CreateNet(m.net) 178 | ``` 179 | 180 | 181 | 182 | 183 | True 184 | 185 | 186 | 187 | ## 5.3.训练网络 188 | 189 | 190 | ```python 191 | # Run 100 x 10 iterations 192 | for _ in range(100): 193 | data = np.random.rand(16, 100).astype(np.float32) 194 | label = (np.random.rand(16) * 10).astype(np.int32) 195 | 196 | workspace.FeedBlob("data", data) 197 | workspace.FeedBlob("label", label) 198 | 199 | workspace.RunNet(m.name, 10) # run for 10 times 200 | ``` 201 | 202 | 执行后,您可以检查存储在输出blob(包含张量,即numpy数组)中的结果: 203 | 204 | 205 | ```python 206 | print(workspace.FetchBlob("softmax")) 207 | print(workspace.FetchBlob("loss")) 208 | ``` 209 | 210 | [[0.0876148 0.07548364 0.09623323 0.12869014 0.08200206 0.10208801 211 | 0.12435256 0.11641801 0.09138598 0.09573159] 212 | [0.08656767 0.08965836 0.10828611 0.1141852 0.09495509 0.08777543 213 | 0.11474496 0.10817766 0.10231902 0.09333058] 214 | [0.08370095 0.07577838 0.10283273 0.12051839 0.09019332 0.1001257 215 | 0.12431574 0.10195971 0.10198188 0.09859312] 216 | [0.08037042 0.08022888 0.1113458 0.12137544 0.09024613 0.09120934 217 | 0.11656947 0.10384342 0.10305068 0.10176046] 218 | [0.0878759 0.07796903 0.10136125 0.12322115 0.0864681 0.09394205 219 | 0.11819859 0.11712413 0.08619776 0.10764208] 220 | [0.08172415 0.07318036 0.0948612 0.13026348 0.09646249 0.10189744 221 | 0.10387536 0.1055788 0.08963021 0.12252647] 222 | [0.07917343 0.08171824 0.11740194 0.11786009 0.09542355 0.10089807 223 | 0.10691947 0.08718182 0.09749852 0.11592486] 224 | [0.09023243 0.08145688 
0.09926067 0.11211696 0.09406143 0.09656373 225 | 0.10264882 0.11141657 0.09514029 0.11710224] 226 | [0.08539043 0.08547576 0.10477988 0.11539709 0.08580235 0.09573507 227 | 0.11900022 0.11075822 0.09151191 0.106149 ] 228 | [0.08888969 0.07497442 0.1067399 0.12012165 0.0906563 0.09574833 229 | 0.11331636 0.09551872 0.09023768 0.12379707] 230 | [0.07795423 0.07904412 0.10365619 0.1125275 0.08462506 0.09732373 231 | 0.12132311 0.10707201 0.09885976 0.11761425] 232 | [0.08336885 0.07465281 0.10840832 0.11558709 0.08637662 0.11698949 233 | 0.10753863 0.10991579 0.08486839 0.11229403] 234 | [0.0904911 0.08106896 0.10705142 0.11207567 0.1001352 0.08801412 235 | 0.10004795 0.10290649 0.09353973 0.1246694 ] 236 | [0.08747555 0.07783487 0.11689501 0.11241519 0.08192006 0.09484729 237 | 0.11507696 0.10664993 0.10004598 0.10683927] 238 | [0.08289554 0.08357728 0.10166691 0.11176678 0.08568645 0.10207526 239 | 0.1217039 0.10599326 0.09480698 0.10982765] 240 | [0.07501717 0.0773883 0.09515747 0.12306631 0.1001178 0.09227181 241 | 0.12529895 0.09527896 0.10404698 0.11235628]] 242 | 2.3265471 243 | 244 | 245 | # 6.反向传播 246 | 247 | 这个网络只包含前向传播,因此它不会学习任何东西。通过在正向传递中为每个运算符添加gradient operators来创建向后传递。 248 | 249 | 250 | ```python 251 | from caffe2.python import workspace, model_helper 252 | import numpy as np 253 | 254 | # Create the input data 255 | data = np.random.rand(16, 100).astype(np.float32) 256 | 257 | # Create labels for the data as integers [0, 9]. 
258 | label = (np.random.rand(16) * 10).astype(np.int32) 259 | 260 | workspace.FeedBlob("data", data) 261 | workspace.FeedBlob("label", label) 262 | 263 | # Create model using a model helper 264 | m = model_helper.ModelHelper(name="my_first_net") 265 | 266 | 267 | weight = m.param_init_net.XavierFill([], 'fc_w', shape=[10, 100]) 268 | bias = m.param_init_net.ConstantFill([], 'fc_b', shape=[10, ]) 269 | 270 | 271 | fc_1 = m.net.FC(["data", "fc_w", "fc_b"], "fc1") 272 | pred = m.net.Sigmoid(fc_1, "pred") 273 | softmax, loss = m.net.SoftmaxWithLoss([pred, "label"], ["softmax", "loss"]) 274 | 275 | # 反向传播 add ================ 276 | m.AddGradientOperators([loss]) 277 | # ============================ 278 | 279 | workspace.RunNetOnce(m.param_init_net) 280 | workspace.CreateNet(m.net) 281 | 282 | # 可以查看加了反向传播之后,net中多出来三个operators的反向传播层 283 | print(m.net.Proto()) 284 | #========================================================= 285 | 286 | # Run 100 x 10 iterations 287 | for _ in range(100): 288 | data = np.random.rand(16, 100).astype(np.float32) 289 | label = (np.random.rand(16) * 10).astype(np.int32) 290 | 291 | workspace.FeedBlob("data", data) 292 | workspace.FeedBlob("label", label) 293 | 294 | workspace.RunNet(m.name, 10) # run for 10 times 295 | 296 | print(workspace.FetchBlob("softmax")) 297 | print(workspace.FetchBlob("loss")) 298 | 299 | ``` 300 | 301 | name: "my_first_net_4" 302 | op { 303 | input: "data" 304 | input: "fc_w" 305 | input: "fc_b" 306 | output: "fc1" 307 | name: "" 308 | type: "FC" 309 | } 310 | op { 311 | input: "fc1" 312 | output: "pred" 313 | name: "" 314 | type: "Sigmoid" 315 | } 316 | op { 317 | input: "pred" 318 | input: "label" 319 | output: "softmax" 320 | output: "loss" 321 | name: "" 322 | type: "SoftmaxWithLoss" 323 | } 324 | op { 325 | input: "loss" 326 | output: "loss_autogen_grad" 327 | name: "" 328 | type: "ConstantFill" 329 | arg { 330 | name: "value" 331 | f: 1.0 332 | } 333 | } 334 | op { 335 | input: "pred" 336 | input: "label" 337 
| input: "softmax" 338 | input: "loss_autogen_grad" 339 | output: "pred_grad" 340 | name: "" 341 | type: "SoftmaxWithLossGradient" 342 | is_gradient_op: true 343 | } 344 | op { 345 | input: "pred" 346 | input: "pred_grad" 347 | output: "fc1_grad" 348 | name: "" 349 | type: "SigmoidGradient" 350 | is_gradient_op: true 351 | } 352 | op { 353 | input: "data" 354 | input: "fc_w" 355 | input: "fc1_grad" 356 | output: "fc_w_grad" 357 | output: "fc_b_grad" 358 | output: "data_grad" 359 | name: "" 360 | type: "FCGradient" 361 | is_gradient_op: true 362 | } 363 | external_input: "data" 364 | external_input: "fc_w" 365 | external_input: "fc_b" 366 | external_input: "label" 367 | 368 | [[0.11452593 0.10101821 0.09836449 0.0965292 0.10351954 0.11437476 369 | 0.0841739 0.08410054 0.1032382 0.10015523] 370 | [0.10163245 0.09582136 0.10728257 0.09061375 0.09165412 0.10958022 371 | 0.09790237 0.08632593 0.10816205 0.11102515] 372 | [0.11153644 0.09187157 0.08876271 0.10034752 0.10996686 0.10122422 373 | 0.09731211 0.07600702 0.10907148 0.11390015] 374 | [0.10677353 0.09250082 0.09395255 0.09013956 0.11205406 0.10125393 375 | 0.0998921 0.08900981 0.10824097 0.10618276] 376 | [0.10816263 0.09636337 0.09732535 0.09815469 0.10086755 0.10827011 377 | 0.09060381 0.07663068 0.10915532 0.11446647] 378 | [0.10561427 0.09925039 0.09791987 0.09596141 0.10207485 0.10818447 379 | 0.09482063 0.08366729 0.09857398 0.11393285] 380 | [0.09989694 0.10426529 0.10590892 0.09590165 0.10164034 0.10572004 381 | 0.0843733 0.07041462 0.11091355 0.12096539] 382 | [0.11713242 0.09610002 0.09875591 0.09168133 0.1124823 0.09706804 383 | 0.09423647 0.07736441 0.09833553 0.11684357] 384 | [0.11634789 0.10220942 0.09840939 0.08635401 0.11169235 0.09972424 385 | 0.0881576 0.07687011 0.11021466 0.11002035] 386 | [0.0986582 0.10288956 0.09780209 0.09394148 0.10023048 0.11133306 387 | 0.09275422 0.07090774 0.11865855 0.11282461] 388 | [0.11411306 0.09094301 0.0921051 0.09122376 0.10246258 0.09556931 389 | 0.10687889 
0.0793411 0.10633816 0.12102505] 390 | [0.11857571 0.10367715 0.0916706 0.09522462 0.10553856 0.1097901 391 | 0.08682371 0.08298165 0.10331775 0.10240018] 392 | [0.10449859 0.09788019 0.08315773 0.09780625 0.10613973 0.10121298 393 | 0.09945863 0.07901601 0.103154 0.12767594] 394 | [0.0995274 0.10280239 0.0894575 0.09834379 0.1094081 0.09976058 395 | 0.10002778 0.07689121 0.11048673 0.1132945 ] 396 | [0.10373392 0.09080214 0.09258449 0.09425651 0.12741047 0.10679753 397 | 0.09512977 0.073976 0.10132045 0.11398876] 398 | [0.09918068 0.10451347 0.08183451 0.11167922 0.11003976 0.09900602 399 | 0.08880362 0.08632898 0.11165588 0.10695779]] 400 | 2.2832193 401 | 402 | 403 | 404 | ```python 405 | 406 | ``` 407 | -------------------------------------------------------------------------------- /caffe2/2.Caffe2 的一些基本概念 - Workspaces&Operators & Nets & Nets 可视化.md: -------------------------------------------------------------------------------- 1 | 2 | # 2.Caffe2 的一些基本概念 - Workspaces&Operators & Nets & Nets 可视化 3 | 4 | 这个部分我们将介绍一些基本概念,这包括: 怎样写 operators 和 nets 5 | 6 | 首先,让我们 import Caffe2. `core` 和 `workspace` 。如果你想查看Caffe2生成的 protocol buffers 文件 , 那么应该 7 | import `caffe2_pb2` from `caffe2.proto`. 8 | 9 | 10 | ```python 11 | from __future__ import absolute_import 12 | from __future__ import division 13 | from __future__ import print_function 14 | from __future__ import unicode_literals 15 | 16 | # We'll also import a few standard python libraries 17 | from matplotlib import pyplot 18 | import numpy as np 19 | import time 20 | 21 | # These are the droids you are looking for. 22 | from caffe2.python import core, workspace 23 | from caffe2.proto import caffe2_pb2 24 | 25 | # Let's show all plots inline. 
26 | %matplotlib inline 27 | ``` 28 | 29 | 您可能会看到一条警告,说caffe2没有GPU支持。这意味着您正在运行CPU的caffe2。不要惊慌,CPU仍然可以运行。 30 | 31 | ## 1.Workspaces 32 | 33 | 让我们首先介绍所有数据所在的工作空间Workspaces。 34 | 35 | 与Matlab类似,caff2工作区由您创建并存储在内存中的blob组成。现在,假设一个blob是一个n维张量(Tensor),类似于numpy的ndarray,但是是连续的。接下来,我们将向您展示blob实际上是一个可以存储任何类型的c++对象的类型化指针,但是张量(Tensor)仅仅是存储在blob中的最常见类型。让我们看看交互是什么样的。 36 | 37 | `Blobs()` prints out all existing blobs in the workspace. 38 | 39 | `HasBlob()` queries if a blob exists in the workspace. As of now, we don't have any. 40 | 41 | 42 | ```python 43 | print("Current blobs in the workspace: {}".format(workspace.Blobs())) 44 | print("Workspace has blob 'X'? {}".format(workspace.HasBlob("X"))) 45 | ``` 46 | 47 | Current blobs in the workspace: [u'W', u'X', u'Y', u'b'] 48 | Workspace has blob 'X'? True 49 | 50 | 51 | 我们可以使用`FeedBlob()`将 blobs 放进 workspace 中: 52 | 53 | 54 | ```python 55 | X = np.random.randn(2, 3).astype(np.float32) 56 | print("Generated X from numpy:\n{}".format(X)) 57 | workspace.FeedBlob("X", X) 58 | ``` 59 | 60 | Generated X from numpy: 61 | [[-0.02584119 1.2813169 0.49287322] 62 | [ 1.8800957 -1.1769409 0.12025763]] 63 | 64 | 65 | 66 | 67 | 68 | True 69 | 70 | 71 | 72 | 现在,让我们看看在 workspace存了什么 blobs . 73 | 74 | 75 | ```python 76 | print("Current blobs in the workspace: {}".format(workspace.Blobs())) 77 | print("Workspace has blob 'X'? {}".format(workspace.HasBlob("X"))) 78 | print("Fetched X:\n{}".format(workspace.FetchBlob("X"))) 79 | ``` 80 | 81 | Current blobs in the workspace: [u'W', u'X', u'Y', u'b'] 82 | Workspace has blob 'X'? 
True 83 | Fetched X: 84 | [[-0.02584119 1.2813169 0.49287322] 85 | [ 1.8800957 -1.1769409 0.12025763]] 86 | 87 | 88 | 我们可以看到这就是我们刚才存进去的数据: 89 | 90 | 91 | ```python 92 | np.testing.assert_array_equal(X, workspace.FetchBlob("X")) 93 | ``` 94 | 95 | 注意:如果我们想要从workspace中取出一个不存在的blob比如”invincible_pink_unicorn“,那么会报错: 96 | 97 | 98 | ```python 99 | try: 100 | workspace.FetchBlob("invincible_pink_unicorn") 101 | except RuntimeError as err: 102 | print(err) 103 | ``` 104 | 105 | [enforce fail at pybind_state.cc:175] ws->HasBlob(name). Can't find blob: invincible_pink_unicorn 106 | 107 | 108 | 109 | 您可能现在不会立即使用的一件事:您可以使用不同的名称在Python中拥有多个工作空间(Workspace),并在它们之间切换。不同工作空间(Workspace)中的Blob彼此分开。您可以使用`CurrentWorkspace`查询当前工作空间(Workspace)。让我们尝试按名称(例如:gutentag)切换工作区,如果不存在则创建一个新工作区。 110 | 111 | 112 | ```python 113 | print("Current workspace: {}".format(workspace.CurrentWorkspace())) 114 | print("Current blobs in the workspace: {}".format(workspace.Blobs())) 115 | 116 | # Switch the workspace. The second argument "True" means creating 117 | # the workspace if it is missing. 118 | workspace.SwitchWorkspace("gutentag", True) 119 | 120 | # Let's print the current workspace. Note that there is nothing in the 121 | # workspace yet. 
122 | print("Current workspace: {}".format(workspace.CurrentWorkspace())) 123 | print("Current blobs in the workspace: {}".format(workspace.Blobs())) 124 | ``` 125 | 126 | Current workspace: default 127 | Current blobs in the workspace: [u'W', u'X', u'Y', u'b'] 128 | Current workspace: gutentag 129 | Current blobs in the workspace: [] 130 | 131 | 132 | 让我们选回默认工作空间(Workspace): 133 | 134 | 135 | ```python 136 | workspace.SwitchWorkspace("default") 137 | print("Current workspace: {}".format(workspace.CurrentWorkspace())) 138 | print("Current blobs in the workspace: {}".format(workspace.Blobs())) 139 | ``` 140 | 141 | Current workspace: default 142 | Current blobs in the workspace: [u'W', u'X', u'Y', u'b'] 143 | 144 | 145 | 最后,`ResetWorkspace()`命令清除当前工作空间中的所有内容: 146 | 147 | 148 | ```python 149 | workspace.ResetWorkspace() 150 | print("Current blobs in the workspace after reset: {}".format(workspace.Blobs())) 151 | ``` 152 | 153 | Current blobs in the workspace after reset: [] 154 | 155 | 156 | ## 2.Operators 157 | 158 | Caffe2中的Operators是一种类似于函数的形式。 从C++的角度来看,它们都来自通用接口并按类型注册,因此我们可以在运行时调用不同的Operators。Operators的接口在`caffe2 / proto / caffe2.proto`中定义。 基本上,它需要一些输入并产生一些输出。 159 | 160 | 切记,当我们在caffe2 Python 中 使用"create an operator"函数时,仅仅是新建了一个protocol buffer 并定义上相应的operators。之后,当我们执行的时候才会送到c++ 后端去执行。 161 | 162 | 让我们看一个例子: 163 | 164 | 165 | 166 | ```python 167 | # Create an operator. 168 | op = core.CreateOperator( 169 | "Relu", # The type of operator that we want to run 170 | ["X"], # A list of input blobs by their names 171 | ["Y"], # A list of output blobs by their names 172 | ) 173 | # and we are done! 
174 | ```
175 | 
176 | 正如我们提到的,创建出来的op实际上只是一个protobuf对象。让我们来看看它的内容:
177 | 
178 | 
179 | ```python
180 | print("Type of the created op is: {}".format(type(op)))
181 | print("Content:\n")
182 | print(str(op))
183 | ```
184 | 
185 | Type of the created op is: <class 'caffe2.proto.caffe2_pb2.OperatorDef'>
186 | Content:
187 | 
188 | input: "X"
189 | output: "Y"
190 | name: ""
191 | type: "Relu"
192 | 
193 | 
194 | 
195 | 
196 | 好的,让我们运行(run)这个operator。首先将输入X提供给工作空间(Workspace),然后,运行运算符(operator)最简单的方法就是执行`workspace.RunOperatorOnce(operator)`:
197 | 
198 | 
199 | 
200 | ```python
201 | workspace.FeedBlob("X", np.random.randn(2, 3).astype(np.float32))
202 | workspace.RunOperatorOnce(op)
203 | ```
204 | 
205 | 
206 | 
207 | 
208 | True
209 | 
210 | 
211 | 
212 | 
213 | 执行(execution)之后,让我们看看这个运算符(operator)做了什么。
214 | 
215 | 这次我们定义的(operator)是神经网络中常用的激活函数,称为[ReLU](https://en.wikipedia.org/wiki/Rectifier_(neural_networks))
216 | ,全称是整流线性激活单元(Rectified Linear Unit activation)。ReLU激活有助于为神经网络分类器添加必要的非线性特征,定义如下:
217 | $$ReLU(x) = \max(0, x)$$
218 | 
219 | 
220 | ```python
221 | print("Current blobs in the workspace: {}\n".format(workspace.Blobs()))
222 | print("X:\n{}\n".format(workspace.FetchBlob("X")))
223 | print("Y:\n{}\n".format(workspace.FetchBlob("Y")))
224 | print("Expected:\n{}\n".format(np.maximum(workspace.FetchBlob("X"), 0)))
225 | ```
226 | 
227 | Current blobs in the workspace: [u'X', u'Y']
228 | 
229 | X:
230 | [[ 1.1209662 0.8112169 -1.5429683 ]
231 | [-1.1201754 0.46254316 -0.10579768]]
232 | 
233 | Y:
234 | [[1.1209662 0.8112169 0. ]
235 | [0. 0.46254316 0. ]]
236 | 
237 | Expected:
238 | [[1.1209662 0.8112169 0. ]
239 | [0. 0.46254316 0. ]]
240 | 
241 | 
242 | 
243 | 从上边的例子可以验证该运算符的输出与公式相符。
244 | 
245 | 如果需要,运算符(Operators)也可以带可选参数,它们被指定为键值对(key-value pairs)。让我们看一个简单的例子:创建一个张量(Tensor)并用高斯随机变量填充它。
246 | 
247 | 
248 | 
249 | ```python
250 | op = core.CreateOperator(
251 | "GaussianFill", # The type of operator that we want to run
252 | [], # GaussianFill does not need any parameters.
# A list of input blobs by their names
253 | ["Z"], # A list of output blobs by their names
254 | shape=[100, 100], # shape argument as a list of ints.
255 | mean=1.0, # mean as a single float
256 | std=1.0, # std as a single float
257 | )
258 | print("Content of op:\n")
259 | print(str(op))
260 | ```
261 | 
262 | Content of op:
263 | 
264 | output: "Z"
265 | name: ""
266 | type: "GaussianFill"
267 | arg {
268 | name: "std"
269 | f: 1.0
270 | }
271 | arg {
272 | name: "shape"
273 | ints: 100
274 | ints: 100
275 | }
276 | arg {
277 | name: "mean"
278 | f: 1.0
279 | }
280 | 
281 | 
282 | 
283 | 让我们运行一下,看一下结果:
284 | 
285 | 
286 | ```python
287 | workspace.RunOperatorOnce(op)
288 | temp = workspace.FetchBlob("Z")
289 | pyplot.hist(temp.flatten(), bins=50)
290 | pyplot.title("Distribution of Z")
291 | ```
292 | 
293 | 
294 | 
295 | 
296 | Text(0.5,1,u'Distribution of Z')
297 | 
298 | 
299 | 
300 | 
301 | ![png](./markdown_img/2.output_30_1.png)
302 | 
303 | 
304 | 如果你看到一条钟形曲线,就说明它正常工作了!
305 | 
306 | ## 3.Nets
307 | 
308 | 网络(Nets)本质上就是计算图。出于向后兼容的考虑,我们沿用了“Net”这个名字(同时也是向神经网络(neural nets)致敬)。网络由多个运算符(operators)组成,就像一段由一系列命令写成的程序。让我们来看看。
309 | 
310 | 
311 | 当我们谈论网络(nets)时,我们还会讨论BlobReference,它是一个包裹字符串的对象,有了它我们就可以方便地对运算符(operators)进行链式调用。
312 | 
313 | 让我们创建一个网络,它基本上等价于下面这段python数学运算:
314 | ```
315 | X = np.random.randn(2, 3)
316 | W = np.random.randn(5, 3)
317 | b = np.ones(5)
318 | Y = X * W^T + b
319 | ```
320 | 我们将逐步展示构建过程。Caffe2的`core.Net`是围绕NetDef协议缓冲区(protocol buffer)的包装类(wrapper class)。
321 | 
322 | 创建网络时,除了网络名称之外,其底层协议缓冲区基本上是空的。让我们创建一个网络,然后显示它的原型内容(proto content)。(注意:下面截取的输出来自反复运行过的notebook——所以网络名带上了后缀“_3”,而且其中已经包含了后文才添加的运算符;对一个刚创建的网络,这里应当只打印出name字段。)
323 | 
324 | 
325 | ```python
326 | my_net = core.Net("my_first_net")
327 | print("Current network proto:\n\n{}".format(my_net.Proto()))
328 | ```
329 | 
330 | Current network proto:
331 | 
332 | name: "my_first_net_3"
333 | op {
334 | output: "X"
335 | name: ""
336 | type: "GaussianFill"
337 | arg {
338 | name: "std"
339 | f: 1.0
340 | }
341 | arg {
342 | name: "run_once"
343 | i: 0
344 | }
345 | arg {
346 | name: "shape"
347 | ints: 2
348 | ints: 3
349 | }
350 | arg {
351 | name:
"mean" 352 | f: 0.0 353 | } 354 | } 355 | op { 356 | output: "W" 357 | name: "" 358 | type: "GaussianFill" 359 | arg { 360 | name: "std" 361 | f: 1.0 362 | } 363 | arg { 364 | name: "run_once" 365 | i: 0 366 | } 367 | arg { 368 | name: "shape" 369 | ints: 5 370 | ints: 3 371 | } 372 | arg { 373 | name: "mean" 374 | f: 0.0 375 | } 376 | } 377 | op { 378 | output: "b" 379 | name: "" 380 | type: "ConstantFill" 381 | arg { 382 | name: "run_once" 383 | i: 0 384 | } 385 | arg { 386 | name: "shape" 387 | ints: 5 388 | } 389 | arg { 390 | name: "value" 391 | f: 1.0 392 | } 393 | } 394 | op { 395 | input: "X" 396 | input: "W" 397 | input: "b" 398 | output: "Y" 399 | name: "" 400 | type: "FC" 401 | } 402 | 403 | 404 | 405 | 让我们创建一个名为X的blob,并使用GaussianFill用一些随机数据填充它。 406 | 407 | 408 | ```python 409 | X = my_net.GaussianFill([], ["X"], mean=0.0, std=1.0, shape=[2, 3], run_once=0) 410 | print("New network proto:\n\n{}".format(net.Proto())) 411 | ``` 412 | 413 | New network proto: 414 | 415 | name: "my_first_net_3" 416 | op { 417 | output: "X" 418 | name: "" 419 | type: "GaussianFill" 420 | arg { 421 | name: "std" 422 | f: 1.0 423 | } 424 | arg { 425 | name: "run_once" 426 | i: 0 427 | } 428 | arg { 429 | name: "shape" 430 | ints: 2 431 | ints: 3 432 | } 433 | arg { 434 | name: "mean" 435 | f: 0.0 436 | } 437 | } 438 | op { 439 | output: "W" 440 | name: "" 441 | type: "GaussianFill" 442 | arg { 443 | name: "std" 444 | f: 1.0 445 | } 446 | arg { 447 | name: "run_once" 448 | i: 0 449 | } 450 | arg { 451 | name: "shape" 452 | ints: 5 453 | ints: 3 454 | } 455 | arg { 456 | name: "mean" 457 | f: 0.0 458 | } 459 | } 460 | op { 461 | output: "b" 462 | name: "" 463 | type: "ConstantFill" 464 | arg { 465 | name: "run_once" 466 | i: 0 467 | } 468 | arg { 469 | name: "shape" 470 | ints: 5 471 | } 472 | arg { 473 | name: "value" 474 | f: 1.0 475 | } 476 | } 477 | op { 478 | input: "X" 479 | input: "W" 480 | input: "b" 481 | output: "Y" 482 | name: "" 483 | type: "FC" 484 | } 485 | 486 | 
487 | 
488 | 您可能已经注意到它与之前的`core.CreateOperator`调用有一些差异。基本上,当使用网络(net)时,您可以直接创建一个运算符(operator)并通过调用`net.SomeOp`将其添加到网络中,其中SomeOp是某个运算符的注册类型字符串(a registered type string of an operator)。下面两段代码是等价的:
489 | ```
490 | op = core.CreateOperator(
491 |     "GaussianFill",
492 |     [],
493 |     ["X"],
494 |     shape=[2, 3],
495 |     mean=0.0,
496 |     std=1.0,
497 |     run_once=0
498 | )
499 | net.Proto().op.append(op)
500 | ```
501 | 等价于:
502 | ```
503 | X = net.GaussianFill([], ["X"], mean=0.0, std=1.0, shape=[2, 3], run_once=0)
504 | ```
505 | 
506 | 此外,您可能想知道X是什么。X是一个`BlobReference`,它记录了两件事:
507 | 
508 | - blob的名称,用`str(X)`访问
509 | 
510 | - 它所在的网络,由内部变量`_from_net`记录
511 | 
512 | 我们来核实一下吧。另外,请记住,我们实际上还没有运行任何东西,所以X只是一个符号。不要指望现在能得到任何数值:)
513 | 
514 | 
515 | ```python
516 | print("Type of X is: {}".format(type(X)))
517 | print("The blob name is: {}".format(str(X)))
518 | ```
519 | 
520 | Type of X is: <class 'caffe2.python.core.BlobReference'>
521 | The blob name is: X
522 | 
523 | 
524 | 让我们继续创建W和b。
525 | 
526 | 
527 | ```python
528 | W = my_net.GaussianFill([], ["W"], mean=0.0, std=1.0, shape=[5, 3], run_once=0)
529 | b = my_net.ConstantFill([], ["b"], shape=[5,], value=1.0, run_once=0)
530 | ```
531 | 
532 | 现在介绍一个简单的语法糖:由于BlobReference对象知道自己是从哪个网络生成的,因此除了从net创建运算符(operators)之外,您还可以直接从BlobReference创建运算符(operators)。让我们以这种方式创建FC运算符。
533 | 
534 | 
535 | ```python
536 | Y = X.FC([W, b], ["Y"])
537 | ```
538 | 
539 | 在底层(under the hood),`X.FC(...)`只是把`X`插入为相应运算符的第一个输入,然后委托给`net.FC`,所以我们上面所做的相当于:
540 | ```
541 | Y = net.FC([X, W, b], ["Y"])
542 | ```
543 | 
544 | 让我们看一下当前的网络(net):
545 | 
546 | 
547 | ```python
548 | print("Current network proto:\n\n{}".format(my_net.Proto()))
549 | ```
550 | 
551 | Current network proto:
552 | 
553 | name: "my_first_net_5"
554 | op {
555 | output: "X"
556 | name: ""
557 | type: "GaussianFill"
558 | arg {
559 | name: "std"
560 | f: 1.0
561 | }
562 | arg {
563 | name: "run_once"
564 | i: 0
565 | }
566 | arg {
567 | name: "shape"
568 | ints: 2
569 | ints: 3
570 | }
571 | arg {
572 | name: "mean"
573 | f: 0.0
574 | }
575 | }
576 | op {
577 | output: "W" 578 | name: "" 579 | type: "GaussianFill" 580 | arg { 581 | name: "std" 582 | f: 1.0 583 | } 584 | arg { 585 | name: "run_once" 586 | i: 0 587 | } 588 | arg { 589 | name: "shape" 590 | ints: 5 591 | ints: 3 592 | } 593 | arg { 594 | name: "mean" 595 | f: 0.0 596 | } 597 | } 598 | op { 599 | output: "b" 600 | name: "" 601 | type: "ConstantFill" 602 | arg { 603 | name: "run_once" 604 | i: 0 605 | } 606 | arg { 607 | name: "shape" 608 | ints: 5 609 | } 610 | arg { 611 | name: "value" 612 | f: 1.0 613 | } 614 | } 615 | op { 616 | input: "X" 617 | input: "W" 618 | input: "b" 619 | output: "Y" 620 | name: "" 621 | type: "FC" 622 | } 623 | 624 | 625 | 626 | 太冗长了吧?让我们尝试将其可视化为图形。为此,Caffe2附带了一个非常小的图形可视化工具 627 | 628 | 629 | ```python 630 | from caffe2.python import net_drawer 631 | from IPython import display 632 | graph = net_drawer.GetPydotGraph(my_net, rankdir="LR") 633 | display.Image(graph.create_png(), width=800) 634 | ``` 635 | 636 | 637 | 638 | 639 | ![png](./markdown_img/2.output_46_0.png) 640 | 641 | 642 | 643 | 所以我们已经定义了一个网络,但还没有执行任何内容。请记住,上面的网络本质上是一个保存网络定义的protobuf。当我们真正运行网络时,幕后发生的事情是 644 | - 从protobuf中实例化一个 C++ net object 645 | - 调用实例化的net的Run()函数 646 | 647 | 在我们做任何事情之前,我们应该使用`ResetWorkspace()`清除任何早期的工作区变量。 648 | 649 | 然后有两种方法可以从Python运行网络。我们将在下面的示例中首先执行第一个选项。 650 | 651 | 1.调用`workspace.RunNetOnce()`,它实例化,运行并立即拆分网络 652 | 2.调用`workspace.CreateNet()`来创建工作空间(workspace)所拥有的C++net对象,然后调用workspace.RunNet(),将网络的名称传递给它。 653 | 654 | 655 | ```python 656 | workspace.ResetWorkspace() 657 | print("Current blobs in the workspace: {}".format(workspace.Blobs())) 658 | workspace.RunNetOnce(my_net) 659 | print("Blobs in the workspace after execution: {}".format(workspace.Blobs())) 660 | # Let's dump the contents of the blobs 661 | for name in workspace.Blobs(): 662 | print("{}:\n{}".format(name, workspace.FetchBlob(name))) 663 | ``` 664 | 665 | Current blobs in the workspace: [] 666 | Blobs in the workspace after execution: [u'W', u'X', u'Y', u'b'] 667 | W: 668 | 
[[-0.5135348 -1.6074936 -0.85531217] 669 | [ 0.55833554 0.55954295 -1.3781272 ] 670 | [-0.991518 -0.49916956 1.3225383 ] 671 | [ 0.84439176 0.07618229 -0.62683123] 672 | [-0.602707 -0.65742415 0.66565436]] 673 | X: 674 | [[-0.62176234 0.63317245 0.8652356 ] 675 | [-1.5274166 -0.09956307 -0.00647647]] 676 | Y: 677 | [[-0.43857074 -0.1852696 2.4447355 -0.01913118 1.5344255 ] 678 | [ 1.949968 0.10040456 2.5555944 -0.2932632 1.9817288 ]] 679 | b: 680 | [1. 1. 1. 1. 1.] 681 | 682 | 683 | 现在让我们尝试第二种方法来创建网络(net)并运行它。首先,使用`ResetWorkspace()`清除变量。然后使用我们之前使用`CreateNet(net_object)`创建的工作空间(workspace)的`net`对象创建网络(my_net)。最后,使用`RunNet(net_name)`运行网络。 684 | 685 | 686 | ```python 687 | workspace.ResetWorkspace() 688 | print("Current blobs in the workspace: {}".format(workspace.Blobs())) 689 | workspace.CreateNet(my_net) 690 | workspace.RunNet(my_net.Proto().name) 691 | print("Blobs in the workspace after execution: {}".format(workspace.Blobs())) 692 | for name in workspace.Blobs(): 693 | print("{}:\n{}".format(name, workspace.FetchBlob(name))) 694 | ``` 695 | 696 | Current blobs in the workspace: [] 697 | Blobs in the workspace after execution: [u'W', u'X', u'Y', u'b'] 698 | W: 699 | [[ 1.9866093e-03 6.1452633e-01 2.6723117e-01] 700 | [-3.7205499e-01 -1.6160715e+00 -1.4222487e+00] 701 | [ 1.1416672e+00 8.5878557e-01 -3.5418496e-01] 702 | [ 3.9810416e-01 1.7708173e-01 8.4679842e-01] 703 | [ 2.3121457e+00 3.7234628e-01 -1.5822794e-01]] 704 | X: 705 | [[-0.97019637 2.3577101 0.49893522] 706 | [ 0.78800434 -0.21854126 0.18668541]] 707 | Y: 708 | [[ 2.5802786 -3.1588717 1.7404106 1.4537657 -0.44429624] 709 | [ 0.91715425 0.79448426 1.6458375 1.4330931 2.711069 ]] 710 | b: 711 | [1. 1. 1. 1. 1.] 
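顺便验证一下:这个网络的 FC 层计算的正是前面给出的 `Y = X * W^T + b`。下面是一段独立的 numpy 示意代码(只用来说明 FC 算子的数学定义;这里的随机数是自己生成的,与上面 workspace 输出中的具体数值无关):

```python
import numpy as np

# 用纯 numpy 复现该网络的前向计算:Y = X * W^T + b
np.random.seed(0)          # 固定随机种子,便于复现
X = np.random.randn(2, 3)  # 对应 GaussianFill 生成的输入 X
W = np.random.randn(5, 3)  # 对应 GaussianFill 生成的权重 W
b = np.ones(5)             # 对应 ConstantFill 生成的偏置 b

Y = X.dot(W.T) + b         # FC 算子的数学定义
print(Y.shape)             # (2, 5):2 个样本、5 个输出神经元
```

把 `workspace.FetchBlob` 取出的 X、W、b 代入同样的公式,得到的结果应当与上面 blob Y 中的数值一致。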
712 | 
713 | 
714 | `RunNetOnce`和`RunNet`之间有一些差异,但主要区别在于计算开销。由于`RunNetOnce`每次都要把protobuf在Python和C++之间序列化传递,并重新实例化网络,因此运行可能需要更长的时间。让我们实际测一下,看看开销是多少。
715 | 
716 | 
717 | ```python
718 | # It seems that %timeit magic does not work well with
719 | # C++ extensions so we'll basically do for loops
720 | start = time.time()
721 | for i in range(1000):
722 |     workspace.RunNetOnce(my_net)
723 | end = time.time()
724 | print('Run time per RunNetOnce: {}'.format((end - start) / 1000))
725 | 
726 | start = time.time()
727 | for i in range(1000):
728 |     workspace.RunNet(my_net.Proto().name)
729 | end = time.time()
730 | print('Run time per RunNet: {}'.format((end - start) / 1000))
731 | ```
732 | 
733 | Run time per RunNetOnce: 0.000457355976105
734 | Run time per RunNet: 1.52060985565e-05
735 | 
736 | 
737 | 恭喜,您现在已经了解了Caffe2 Python API的许多关键组件!准备好进一步了解Caffe2了吗?查看其余的教程,了解各种有趣的用例!
738 | 
739 | 
740 | ```python
741 | 
742 | ```
743 | -------------------------------------------------------------------------------- /caffe2/3.Brewing Models(快速构建模型).md: -------------------------------------------------------------------------------- 1 | 本篇文章是我的【caffe2从头学】系列中的一篇,如果想看其他文章,请看目录:
2 | 
3 | ---
4 | 0.[目录](https://blog.csdn.net/weixin_37251044/article/details/82344428)
5 | 1.[快速开始](https://blog.csdn.net/weixin_37251044/article/details/82344481)
6 | > 1.1.[什么是caffe2 ?](https://blog.csdn.net/weixin_37251044/article/details/82344481)
7 | 1.2.[安装caffe2](https://blog.csdn.net/weixin_37251044/article/details/82259230)
8 | 
9 | 2.[学习caffe2](https://blog.csdn.net/weixin_37251044/article/details/82346301)
10 | 3.[caffe2官方教程的安装与使用](https://blog.csdn.net/weixin_37251044/article/details/82352962)
11 | >3.1.[Blobs and Workspace, Tensors,Net 概念](https://blog.csdn.net/weixin_37251044/article/details/82387868)
12 | >3.2.[Caffe2 的一些基本概念 - Workspaces&Operators & Nets & Nets 可视化](https://blog.csdn.net/weixin_37251044/article/details/82421521)
13 | >3.3.[Brewing
Models(快速构建模型)](https://blog.csdn.net/weixin_37251044/article/details/82425057) 14 | 15 | 4.参考 16 | 5.API 17 | 18 | 相关代码在我的github仓库:https://github.com/JackKuo666/csdn/tree/master/caffe2 19 | 20 | --- 21 | 22 | 本页来自:https://caffe2.ai/docs/brew 23 | 24 | # 1.Brewing Models 25 | 26 |   `brew`是Caffe2用于构建模型的新API。` CNNModelHelper`过去曾在caffe里担任过这个角色,但由于Caffe2的扩展远远超出了CNN的优势,因此提供更通用的`ModelHelper`对象是有意义的。您可能会注意到新的`ModelHelper`与`CNNModelHelper`具有许多相同的功能。 `brew`包装了新的`ModelHelper`,使得构造模型(model)比以前更容易。 27 | 28 | # 2.Model Building and Brew’s Helper Functions 29 |   在本概述中,我们将介绍`brew`,一个轻量级的辅助函数(`Helper Functions`)集合,可帮助您构建模型。 30 |   1.我们将首先解释`Ops`与`Helper Functions`的关键概念。 31 |   2.然后我们将展示`brew`使用情况,它如何充当`ModelHelper`对象的接口,以及`arg_scope`语法糖。 32 |   3.最后,我们讨论了引入`brew`的动机。 33 | 34 | # 3.Concepts: Ops vs Helper Functions 35 |   在我们深入研究`brew`之前,我们应该回顾caffe2中的一些约定以及神经网络层是如何表示的。caffe2的深度学习网络`net`是由操作符`operators`建立的。通常,这些操作符`operators`是用c++编写的,以获得最大的性能。caffe2还提供了一个Python API来包装这些c++操作符`operators`,因此您可以更灵活地进行实验和原型设计。在caffe2中,操作符`operators`总是以骆驼拼写法( CamelCase fashion)的形式出现,而具有类似名称的Python `helper functions`是小写的。下面是一些例子。 36 | 37 | ## 3.1.Ops 38 |   我们经常将操作符`operator`称为`Op`或运算符集合`operators`作为`Ops`。例如,`FC` `Op`代表一个完全连接的运算符,它与前一层中的每个神经元和下一层的每个神经元都有加权连接。例如,您可以使用以下命令创建`FC` `Op`: 39 | ``` 40 | model.net.FC([blob_in, weights, bias], blob_out) 41 | ``` 42 |   或者您可以创建一个`Copy` `Op`: 43 | ``` 44 | model.net.Copy(blob_in, blob_out) 45 | ``` 46 | >由`ModelHelper`处理的操作符`operators`列表在本文档的底部,目前包括最常用的29个。这是写这篇文章时的400+ `Ops` `caffe2`的一个子集。 47 | 48 |   还应注意,您还可以创建一个没有注释网络的`operator`。例如,就像我们创建`Copy` `Op`的前一个示例一样,我们可以使用以下代码在`model.net`上创建一个`Copy`运算符: 49 | ``` 50 | model.Copy(blob_in, blob_out) 51 | ``` 52 | ## 3.2.Helper Functions 53 |   仅仅使用单个`operator`来构建模型/网络可能很费劲,因为您必须自己完成参数初始化,设备/引擎选择(但这也是Caffe2如此之快的原因!)。例如,要构建`FC`层,您需要使用几行代码来准备权重和偏差,然后将其提供给`Op`。 54 | 55 | 这是更长的手动方式: 56 | ```python 57 | model = model_helper.ModelHelper(name="train") 58 | # initialize your weight 59 | weight = model.param_init_net.XavierFill( 
60 | [],
61 | blob_out + '_w',
62 | shape=[dim_out, dim_in],
63 | **kwargs # maybe indicating weight should be on GPU here
64 | )
65 | # initialize your bias
66 | bias = model.param_init_net.ConstantFill(
67 | [],
68 | blob_out + '_b',
69 | shape=[dim_out, ],
70 | **kwargs
71 | )
72 | # finally building FC
73 | model.net.FC([blob_in, weight, bias], blob_out, **kwargs)
74 | ```
75 |   幸运的是,Caffe2 的`helper functions`可以提供帮助。`helper functions`是包装函数,可为模型(model)创建完整的层(layer)。`helper functions`通常用于处理参数初始化、`operator`定义和引擎选择。Caffe2默认的`helper functions`按照Python PEP8函数命名约定命名。例如,借助python/helpers/fc.py,通过`helper function` `fc`实现`FC` `Op`要简单得多:
76 |   使用`helper functions`的更简单方法:
77 | ```
78 | fcLayer = fc(model, blob_in, blob_out, **kwargs) # returns a blob reference
79 | ```
80 | 
81 | >一些`helper functions`构建的`operator`不止一个。例如,python/rnn_cell.py中的LSTM函数可帮助您在网络中构建整个LSTM单元。
82 | 
83 | 查看[repo](https://github.com/pytorch/pytorch/tree/master/caffe2/python/helpers)以获得更多很酷的帮助函数!
84 | 
85 | # 4.brew
86 |   现在你已经了解了`Ops`和`Helper Functions`,让我们来看看`brew`如何让模型构建变得更加容易。`brew`是一个智能的辅助函数(`helper functions`)集合。只需导入一个`brew`模块,即可使用所有Caffe2强大的`helper functions`。您现在可以使用以下方法添加`FC`层:
87 | ```
88 | from caffe2.python import brew
89 | 
90 | brew.fc(model, blob_in, blob_out, ...)
91 | 
92 | ```
93 |   这与直接使用`helper functions`几乎相同,但是一旦模型变得更复杂,`brew`的优势就会显现出来。以下是从MNIST教程中提取的LeNet模型构建示例。
94 | ```
95 | from caffe2.python import brew
96 | 
97 | def AddLeNetModel(model, data):
98 |     conv1 = brew.conv(model, data, 'conv1', 1, 20, 5)
99 |     pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2)
100 |     conv2 = brew.conv(model, pool1, 'conv2', 20, 50, 5)
101 |     pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2)
102 |     fc3 = brew.fc(model, pool2, 'fc3', 50 * 4 * 4, 500)
103 |     fc3 = brew.relu(model, fc3, fc3)
104 |     pred = brew.fc(model, fc3, 'pred', 500, 10)
105 |     softmax = brew.softmax(model, pred, 'softmax')
106 | 
107 | ```
108 |   每一层都是用`brew`创建的,而`brew`又通过它的`operator` `hooks`来实例化每个`Op`。
109 | 
110 | ## 4.1.arg_scope
111 |   `arg_scope`是一种语法糖,可以在其上下文中设置默认的`helper functions`参数值。例如,假设您想在ResNet-150训练脚本中尝试不同的权重初始化。你可以:
112 | ```
113 | # change all weight_init here
114 | brew.conv(model, ..., weight_init=('XavierFill', {}),...)
115 | 
116 | # repeat 150 times
117 | 
118 | brew.conv(model, ..., weight_init=('XavierFill', {}),...)
119 | ```
120 | 
121 |   或者在`arg_scope`的帮助下,你可以:
122 | ```
123 | with brew.arg_scope([brew.conv], weight_init=('XavierFill', {})):
124 |     brew.conv(model, ...) # no weight_init needed here!
125 |     brew.conv(model, ...)
126 |     ...
127 | 
128 | ```
129 | 
130 | # 5.Custom Helper Function
131 |   当您越来越频繁地使用`brew`,并且发现需要实现`brew`目前尚未涵盖的`Op`时,您就需要编写自己的`Helper Function`了。您可以把自己的`Helper Function`注册到`brew`中,以享受统一管理和语法糖。
132 |   只需定义新的`Helper Function`,使用`brew.Register`函数将其注册到`brew`,然后就可以用`brew.new_helper_function`的方式调用它。
133 | 
134 | ```
135 | def my_super_layer(model, blob_in, blob_out, **kwargs):
136 |     """
137 |     100x faster, awesome code that you'll share one day.
138 | """ 139 | 140 | brew.Register(my_super_layer) 141 | brew.my_super_layer(model, blob_in, blob_out) 142 | ``` 143 | 144 | # 6.Caffe2 Default Helper Functions 145 | 请访问:https://caffe2.ai/docs/brew 146 | 147 | accuracy 148 | add_weight_decay 149 | average_pool 150 | concat 151 | conv 152 | conv_nd 153 | conv_transpose 154 | depth_concat 155 | dropout 156 | fc 157 | fc_decomp 158 | fc_prune 159 | fc_sparse 160 | group_conv 161 | group_conv_deprecated 162 | image_input 163 | instance_norm 164 | iter 165 | lrn 166 | max_pool 167 | max_pool_with_index 168 | packed_fc 169 | prelu 170 | softmax 171 | spatial_bn 172 | relu 173 | sum 174 | transpose 175 | video_input 176 | 177 | # 7.Motivation for brew 178 |   感谢您阅读有关`brew`的全部概述!恭喜你,你终于来了!简而言之,`我们希望将模型构建过程和模型存储分开`。在我们看来,`ModelHelper`类应该只包含网络定义和参数信息。 `brew`模块将具有构建网络和初始化参数的功能。 179 | 180 |   与之前同时进行`模型存储`和`模型构建`的巨型`CNNModelHelper`相比,模型构建的ModelHelper + brew方式更加模块化,更易于扩展。在命名方面,由于Caffe2系列支持各种网络,包括MLP,RNN和CNN,因此它也更不容易混淆。 我们希望本教程能够帮助您更快,更轻松地建立模型,同时更深入地了解Caffe2。python/brew_test.py中有一个`brew`使用的详细示例。如果您对`brew`有任何疑问,请随时与我们联系并在回购问题中提出问题。再次感谢您拥抱新的brew API。 -------------------------------------------------------------------------------- /caffe2/4.Toy_Regression.md: -------------------------------------------------------------------------------- 1 | 2 | # Tutorial 4. 
A Simple Toy Regression 3 | 4 | 这是一个快速示例,说明如何使用[教程2:基础知识](https://github.com/JackKuo666/csdn/tree/master/caffe2) 中介绍的概念进行回归(Regression)。本教程分为两部分。 **第一部分**是创建和训练多项式回归(Polynomial Regression)模型的更详细的例子,**第二部分**是一个简洁的线性回归(Linear Regression)实例。 5 | 6 | ## Part I: 多项式回归(Polynomial Regression) 7 | 8 | 我们正在处理的问题是一个相对简单的问题,涉及一维输入$x$和一维输出$y$。因为我们寻找二阶多项式(second order polynomial)作为回归模型(regression model),权重向量将包含两个权重($\beta_2$和$\beta_1$)并且会有一个偏差($\beta_0$)或截距(intercept)。所需的解决方案形式如下: 9 | $$y = \beta_2x^2 + \beta_1x + \beta_0$$ 10 | 11 | 在本教程中,我们将生成并格式化具有强二阶多项式关系的任意输入数据集。然后我们将构建模型,指定训练算法,执行训练,最后查看结果。 12 | 13 | 14 | ```python 15 | from __future__ import absolute_import 16 | from __future__ import division 17 | from __future__ import print_function 18 | from __future__ import unicode_literals 19 | from caffe2.python import workspace, brew, optimizer 20 | from caffe2.python.model_helper import ModelHelper 21 | import numpy as np 22 | %matplotlib inline 23 | import matplotlib.pyplot as plt 24 | import sklearn.datasets 25 | from sklearn.preprocessing import PolynomialFeatures 26 | ``` 27 | 28 | ### Inputs 29 | 30 | 在此指定回归模型的输入参数包括:输入数据中的样本数(num_samples),训练迭代次数(training_iters),SGD算法的学习率(learning_rate)以及模型的初始权重(initial_weights) 31 | 32 | 33 | 34 | ```python 35 | # Number of training sample to generate 36 | num_samples = 200 37 | # Learning Rate of SGD algorithm 38 | learning_rate = .05 39 | # Number of iterations to train 40 | training_iters = 100 41 | # Initial model weights 42 | initial_weights = [0.,0.] 
43 | ``` 44 | 45 | ### Create and Prepare the Dataset 46 | 47 | 48 | 现在,我们将创建并准备用于模型的数据集。注意,我们只是在这里构建numpy数组。可以使用任何其他数据,只要它在输入到模型之前被正确地整形。 49 | 50 | 51 | ```python 52 | # Create the original observations 53 | orig_X,_ = sklearn.datasets.make_regression(n_samples=num_samples,n_features=1,noise=5) 54 | 55 | poly = PolynomialFeatures(degree=2, include_bias=False) 56 | # Transform the features into second order polynomial features 57 | xx_ = poly.fit_transform(orig_X) 58 | 59 | # 查看origin_X和xx_的shape以及里边数据的情况====== 60 | #print (orig_X.shape) 61 | #print(orig_X[0]) 62 | #print(orig_X[0]**2) 63 | #print(xx_.shape) 64 | #print(xx_[0]) 65 | #============================================== 66 | 67 | 68 | # Extract the predictors and the values from the manufactured data 69 | # 从制造的数据中提取预测变量和值 70 | X = [i[0] for i in xx_] 71 | Y_gt = [i[1] for i in xx_] 72 | noise = np.random.uniform(size=(len(Y_gt))) 73 | # Add some noise to the ground truth values 74 | Y_gt += noise 75 | 76 | 77 | 78 | #查看X和Y_gt的shape和数据情况================== 79 | #print(len(X)) 80 | #print(X[0]) 81 | #print(len(Y_gt)) 82 | #print(noise[0]) 83 | #print(Y_gt[0]) 84 | #============================================ 85 | 86 | 87 | # Shape the ground truth values for use with the model 88 | Y_gt = np.reshape(Y_gt,(-1,1)) 89 | # Format the input features. Recall, we accomplish polynomial regression by 90 | # including the original and the polynomial version of the predictors 91 | # as features of the model 92 | X = np.hstack((np.array(X).reshape(-1,1),np.array(X).reshape(-1,1)**2)) 93 | 94 | # 查看更改shape之后的Y_gt和X的shape和数据情况 95 | #print(orig_X[0]**2)#存的是Y_gt[0]的真值,没有加噪声 96 | #print(len(X)) 97 | #print(X[0]) #第二列数据里边存的是Y_gt[0]的真值,没有加噪声 98 | #print(len(Y_gt)) 99 | #print(Y_gt[0]) 100 | # =========================================== 101 | 102 | # Print a sample of the input data. 
X is the list of 2-feature input observations 103 | # and Y is the list of ground truth values associated with each observation 104 | print("X Sample:\n{}".format(X[:5])) 105 | print("Y Sample:\n{}".format(Y_gt[:5])) 106 | 107 | # Plot the input data 108 | plt.scatter([i[0] for i in X],Y_gt,label="original data",color='b') 109 | plt.xlabel("x") 110 | plt.ylabel("y") 111 | plt.title("Input Training Data") 112 | ``` 113 | 114 | X Sample: 115 | [[ 0.09209892 0.00848221] 116 | [ 1.15269589 1.32870782] 117 | [-0.21286333 0.0453108 ] 118 | [ 0.27024664 0.07303325] 119 | [ 1.63132252 2.66121316]] 120 | Y Sample: 121 | [[0.59521509] 122 | [1.87600946] 123 | [0.2563054 ] 124 | [0.67216029] 125 | [3.59454641]] 126 | 127 | 128 | 129 | 130 | 131 | Text(0.5,1,u'Input Training Data') 132 | 133 | 134 | 135 | 136 | ![png](./markdown_img/4.output_5_2.png) 137 | 138 | 139 | ### Create the Model 140 | 141 | #### 定义模型体系结构 142 | 143 | 通过创建我们的训练数据和我们的二阶多项式假设,我们现在可以创建一个模型来学习回归线(regression line)。我们将使用“FC”层作为模型的主要组件。由于我们需要两个权重($\beta_2$和$\beta_1$),我们将输入维度设置为2,因为我们只期望单个定量结果,所以我们的输出维度为1.注意,当使用“FC”层时暗示存在偏差,我们将使用它作为$\beta_0$。 144 | 145 | 146 | 147 | 此外,在继续查看此步骤中创建的protobuf之前。第一个打印输出是'net',包含模型的体系结构。乍看之下,我们看到正如预期的那样,网络中有一个操作符`operator` 需要输入$X$,权重$a$和偏差$b$,并输出$y_{pred}$。在打印出'param_init_net'时我们看到这是权重和偏差的初始化存在的地方。这是一个重要的观察结果,可以深入了解如何构建和维护Caffe2中的模型。 148 | 149 | 150 | ```python 151 | # Create the model helper object we will use to create the regression model 152 | regression_model = ModelHelper(name="regression_model") 153 | 154 | # Add the FC layer, which is the main component of a linear regression model 155 | y_pred = brew.fc(regression_model,'X','y_pred', dim_in=2, dim_out=1) 156 | 157 | # Print the predict and init net to see what protobuf was created for this model 158 | print("************* Predict Net *************") 159 | print(regression_model.net.Proto()) 160 | print("\n************* Init Net *************") 161 | print(regression_model.param_init_net.Proto()) 162 | ``` 163 | 164 | 
************* Predict Net ************* 165 | name: "regression_model_2" 166 | op { 167 | input: "X" 168 | input: "y_pred_w" 169 | input: "y_pred_b" 170 | output: "y_pred" 171 | name: "" 172 | type: "FC" 173 | arg { 174 | name: "use_cudnn" 175 | i: 1 176 | } 177 | arg { 178 | name: "order" 179 | s: "NCHW" 180 | } 181 | arg { 182 | name: "cudnn_exhaustive_search" 183 | i: 0 184 | } 185 | } 186 | external_input: "X" 187 | external_input: "y_pred_w" 188 | external_input: "y_pred_b" 189 | 190 | 191 | ************* Init Net ************* 192 | name: "regression_model_init_2" 193 | op { 194 | output: "y_pred_w" 195 | name: "" 196 | type: "XavierFill" 197 | arg { 198 | name: "shape" 199 | ints: 1 200 | ints: 2 201 | } 202 | } 203 | op { 204 | output: "y_pred_b" 205 | name: "" 206 | type: "ConstantFill" 207 | arg { 208 | name: "shape" 209 | ints: 1 210 | } 211 | } 212 | 213 | 214 | 215 | #### Add the training operators and prime(填充) the workspace 216 | 217 | 218 | 在这个**非常重要的**步骤中,我们指定损失函数,设置SGD训练算法,填充和初始化工作空间,并初始化模型的权重和偏差。 219 | 220 | 221 | ```python 222 | # The loss function is computed by a squared L2 distance, 223 | # and then averaged over all items. 
224 | dist = regression_model.SquaredL2Distance(['Y_gt', y_pred], "dist") 225 | loss = regression_model.AveragedLoss(dist, "loss") 226 | 227 | # Add the gradient operators and setup the SGD algorithm 228 | regression_model.AddGradientOperators([loss]) 229 | optimizer.build_sgd(regression_model, base_learning_rate=learning_rate) 230 | 231 | # Prime the workspace with some data 232 | workspace.FeedBlob("Y_gt",Y_gt.astype(np.float32)) 233 | workspace.FeedBlob("X",X.astype(np.float32)) 234 | 235 | # Run the init net to prepare the workspace then create the net 236 | workspace.RunNetOnce(regression_model.param_init_net) 237 | workspace.CreateNet(regression_model.net) 238 | 239 | # Inject our desired initial weights and bias 240 | workspace.FeedBlob("y_pred_w",np.array([initial_weights]).astype(np.float32)) 241 | workspace.FeedBlob("y_pred_b",np.array([0.]).astype(np.float32)) 242 | ``` 243 | 244 | 245 | 246 | 247 | True 248 | 249 | 250 | 251 | #### Run the training 252 | 253 | 254 | ```python 255 | # Run the training for training_iters 256 | for i in range(training_iters): 257 | workspace.RunNet(regression_model.net) 258 | 259 | print("Training Complete") 260 | ``` 261 | 262 | Training Complete 263 | 264 | 265 | ### Extract Results(提取结果) 266 | 267 | 268 | 现在我们的模型已经过训练,我们可以提取在名为“y_pred_w”和“y_pred_b”的工作空间中作为blob存在的学习权重和偏差。 269 | 270 | 271 | ```python 272 | # Extract the learned coes and intercept from the workspace 273 | coes = workspace.FetchBlob("y_pred_w")[0] 274 | intercept = workspace.FetchBlob("y_pred_b") 275 | 276 | # 查看y_preb_w和y_pred_b中存储的参数======= 277 | #print(workspace.FetchBlob("y_pred_w")) 278 | #print(workspace.FetchBlob("y_pred_b")) 279 | #======================================== 280 | print(intercept[0]) 281 | print(coes) 282 | 283 | # Calculate the regression line for plotting 284 | x_vals = np.linspace(orig_X.min(), orig_X.max(),100) 285 | regression_result = intercept[0] + coes[0]*x_vals + coes[1]*(x_vals**2) 286 | print("Best Fit Line: {}*x^2 + {}*x + 
{}".format(round(coes[1],5), round(coes[0],5), round(intercept[0],5)))#round(x,5)返回5位精度值 287 | 288 | # Plot the results of the regression 289 | plt.scatter([i[0] for i in X],Y_gt,label="original data",color='b') 290 | plt.plot(x_vals,regression_result,label="regression result",color='r') 291 | #plt.plot(x_vals,(x_vals**2),label="result",color='g') 292 | plt.legend() 293 | plt.xlabel("x") 294 | plt.ylabel("y") 295 | plt.title("Polynomial Regression Fit: ${{{}}}x^2 + {{{}}}x + {{{}}}$".format(round(coes[1],5), round(coes[0],5), round(intercept[0],5))) 296 | plt.show() 297 | 298 | ``` 299 | 300 | 0.45223227 301 | [-0.00435348 1.025054 ] 302 | Best Fit Line: 1.02505*x^2 + -0.00435*x + 0.45223 303 | 304 | 305 | 306 | ![png](./markdown_img/4.output_13_1.png) 307 | 308 | 309 | ## Part II: 表达线性回归示例(Express Linear Regression Example) 310 | 311 | 312 | 上面的示例显示了如何创建一个易于调整以处理高阶多项式的多项式回归模型。现在,我们将考虑基本情况,我们需要一个简单的一阶模型(first order model),一维输入$x$,1-D输出$y$,和一个解决方案: 313 | 314 | $$y = \beta_1x + \beta_0$$ 315 | 316 | 317 | 第二部分的结构与第一部分类似。首先,我们将生成数据集,然后我们将构建模型并指定训练例程,最后我们将训练和提取我们的结果。 318 | 319 | 320 | ```python 321 | ##################################################################### 322 | # Initialize data 323 | ##################################################################### 324 | X,Y_gt = sklearn.datasets.make_regression(n_samples=100,n_features=1,noise=10) 325 | Y_gt = np.reshape(Y_gt,(-1,1)) 326 | Y_gt /= 100. 
327 | 328 | ##################################################################### 329 | # Create and train model 330 | ##################################################################### 331 | # Construct model with single FC layer 332 | regression_model = ModelHelper(name="regression_model") 333 | y_pred = brew.fc(regression_model,'X','y_pred', dim_in=1, dim_out=1) 334 | 335 | # Specify Loss function 336 | dist = regression_model.SquaredL2Distance(['Y_gt', y_pred], "dist") 337 | loss = regression_model.AveragedLoss(dist, "loss") 338 | 339 | # Get gradients for all the computations above. 340 | regression_model.AddGradientOperators([loss]) 341 | optimizer.build_sgd(regression_model, base_learning_rate=0.05) 342 | 343 | # Prime and prepare workspace for training 344 | workspace.FeedBlob("Y_gt",Y_gt.astype(np.float32)) 345 | workspace.FeedBlob("X",X.astype(np.float32)) 346 | workspace.RunNetOnce(regression_model.param_init_net) 347 | workspace.CreateNet(regression_model.net) 348 | 349 | # Set the initial weight and bias to 0 350 | workspace.FeedBlob("y_pred_w",np.array([[0.]]).astype(np.float32)) 351 | workspace.FeedBlob("y_pred_b",np.array([0.]).astype(np.float32)) 352 | 353 | # Train the model 354 | for i in range(100): 355 | workspace.RunNet(regression_model.net) 356 | 357 | ##################################################################### 358 | # Collect and format results 359 | ##################################################################### 360 | # Grab the learned weight and bias from workspace 361 | coe = workspace.FetchBlob("y_pred_w")[0] 362 | intercept = workspace.FetchBlob("y_pred_b") 363 | 364 | # Calculate the regression line for plotting 365 | x_vals = range(-3,4) 366 | regression_result = x_vals*coe + intercept 367 | 368 | # Plot the results 369 | plt.scatter(X,Y_gt,label="original data",color='b') 370 | plt.plot(x_vals,regression_result,label="regression result",color='r') 371 | plt.legend() 372 | plt.xlabel("x") 373 | plt.ylabel("y") 374 
| plt.title("Regression Line: ${{{}}}x + {{{}}}$".format(round(coe,5), round(intercept[0],5))) 375 | plt.show() 376 | 377 | ``` 378 | 379 | 380 | ![png](./markdown_img/4.output_15_0.png) 381 | 382 | 383 | 384 | ```python 385 | 386 | ``` 387 | -------------------------------------------------------------------------------- /caffe2/5.Models and Datasets.md: -------------------------------------------------------------------------------- 1 | 本篇文章是我的【caffe2从头学】系列中的一篇,如果想看其他文章,请看目录: 2 | 3 | --- 4 | 0.[目录](https://blog.csdn.net/weixin_37251044/article/details/82344428) 5 | 1.[快速开始](https://blog.csdn.net/weixin_37251044/article/details/82344481) 6 | > 1.1.[什么是caffe2 ?](https://blog.csdn.net/weixin_37251044/article/details/82344481) 7 | 1.2.[安装caffe2](https://blog.csdn.net/weixin_37251044/article/details/82259230) 8 | 9 | 2.[学习caffe2](https://blog.csdn.net/weixin_37251044/article/details/82346301) 10 | 3.[caffe2官方教程的安装与使用](https://blog.csdn.net/weixin_37251044/article/details/82352962) 11 | >3.1.[Blobs and Workspace, Tensors,Net 概念](https://blog.csdn.net/weixin_37251044/article/details/82387868) 12 | >3.2.[Caffe2 的一些基本概念 - Workspaces&Operators & Nets & Nets 可视化](https://blog.csdn.net/weixin_37251044/article/details/82421521) 13 | >3.3.[Brewing Models(快速构建模型)](https://blog.csdn.net/weixin_37251044/article/details/82425057) 14 | >3.4.[Toy_Regression](https://blog.csdn.net/weixin_37251044/article/details/82428606) 15 | >3.5.[Models and Datasets](https://blog.csdn.net/weixin_37251044/article/details/82455020) 16 | 17 | 4.参考 18 | 5.API 19 | 20 | 相关代码在我的github仓库:https://github.com/JackKuo666/csdn/tree/master/caffe2 21 | 22 | --- 23 | 24 | # 0.Caffe2, Models, and Datasets Overview 25 |   在本教程中,我们将尝试使用现有的Caffe模型。在其他教程中,您可以学习如何修改模型或创建自己的模型。您还可以了解如何生成或修改数据集。在这里,您将学习如何查找模型,涉及哪些文件以及如何使用数据集测试模型。 26 | 27 | # 1.Models vs Datasets 28 |   
让我们先确保您分清什么是模型、什么是数据集。先从`数据集`说起:它是一组数据,内容可以是任何东西,但通常围绕某个主题,例如一批鲜花图像。要组成数据集,您还需要相应的标签:标签通常是一个文件,逐条描述每张图像。标签可以是属和种,可以是通用名,也可以是描述其外观、触感或气味(或它们的某种组合)的词。在下面的例子中,Mukane & Kendule 提出了一种从图像中提取花朵的方法:先用**图像分割**`image segmentation`和**特征提取**`feature extraction`从训练图像中分割出主要的花朵,然后他们的分类器再使用**纹理特征**`texture features`进行匹配`matching`。 29 |   
![这里写图片描述](./markdown_img/5.1.png)
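上面“数据集 = 数据 + 标签文件”的说法,可以用一个极小的 Python 草图来示意。注意:下面的文件名、标签以及“路径 标签”的行格式都是假设的示例,只用来说明概念,并不是某个真实数据集的固定格式:

```python
from collections import Counter

# 一个虚构的小"数据集":每个样本是 (图像路径, 标签) 对
dataset = [
    ("img/rose_001.jpg",  "rose"),
    ("img/rose_002.jpg",  "rose"),
    ("img/daisy_001.jpg", "daisy"),
    ("img/tulip_001.jpg", "tulip"),
]

# 把标签写成 "路径 标签" 形式的文本行(很多数据管线采用类似约定)
lines = ["{} {}".format(path, label) for path, label in dataset]
print("\n".join(lines))

# 顺便统计每个类别的样本数,检查数据集是否均衡
counts = Counter(label for _, label in dataset)
print(counts["rose"])   # 2
```

真实项目中标签文件的格式取决于你选用的工具链,这里只演示“图像 + 描述”这一对应关系。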
30 |   `模型`是把数据集送入神经网络训练之后得到的产物。这个过程称为训练(training):神经网络和运行它的计算机从数据集中“学习”,尽可能找出一切有助于识别图像中显著对象的特征,依据是这些特征与数据集中其他样本的相似性、标签之间的共性等。神经网络有很多类型,各自为特定目的而设计,有的类型能训练出比其他类型更准确的模型。以鲜花为例,要得到一个能准确识别它们的模型,我们会选择卷积神经网络;识别地点照片时也是如此。请看下面的交互式示例,其中显示了各网络共有的提取区域,以及这些区域如何跨网络层链接在一起。 31 |   
![这里写图片描述](./markdown_img/5.2.png)
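上文说到,“模型”是在数据集上训练得到的,之后还要用准确率(accuracy)来评估。下面用一个与框架无关的 numpy 小草图把这一流程串起来:“模型”被简化成每个类别的特征均值(最近均值分类器),数据也是随机生成的玩具数据;它只用来说明训练/测试划分与 accuracy 的含义,并不是 Caffe2 的真实训练 API:

```python
import numpy as np

# 生成 100 个二维样本:前 50 个属于类别 0,后 50 个属于类别 1
rng = np.random.RandomState(0)
X = np.concatenate([rng.normal(0.0, 1.0, (50, 2)),
                    rng.normal(3.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# 80%:20% 随机划分:80 个样本训练,20 个样本测试
perm = rng.permutation(len(X))
train_idx, test_idx = perm[:80], perm[80:]

# "训练":计算每个类别在训练集上的特征均值
means = np.stack([X[train_idx][y[train_idx] == c].mean(axis=0)
                  for c in (0, 1)])

# "测试":把每个测试样本归到最近的类别均值,统计准确率
dists = np.linalg.norm(X[test_idx][:, None, :] - means[None, :, :], axis=2)
pred = dists.argmin(axis=1)
accuracy = float((pred == y[test_idx]).mean())
print("test accuracy: {:.2f}".format(accuracy))
```

真实的 Caffe2 训练流程(ModelHelper、brew、optimizer、workspace 等)见前面 Toy_Regression 一节。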
32 | # 2.Evaluating a Model’s Performance 33 |   创建模型后,常见做法是用两个指标来评估它的性能:`准确率`(accuracy)和`损失`(loss)。可以这样理解: 34 | 35 | 1. accuracy: how often is it right versus wrong(判断正确与错误的比例) 36 | 37 | 2. loss: how often did it fail to recognize anything when it should have(应当识别出目标却什么都没识别出来的频率) 38 | 39 |   每个用例对这两个指标的容忍度不同。如果您在编写`鲜花识别应用`,92%的准确率已经非常棒了;即使损失偏大,您也可以指望用户换几个角度重拍,直到识别成功。 40 | 41 |   如果您要找的是肿瘤,92%的准确率也算不错,但如果损失非常高,您可能就得在模型上多下些功夫,因为医疗成像本身非常昂贵,一旦模型漏检,很难再要求补拍更多图像或更换角度。评估这两个指标的做法是把数据集分成两部分: 42 | 43 | 44 | 1. 第一部分要**大**得多,用于训练 45 | 2. 第二个较**小**的部分用于测试 46 | 47 | # 3.Splitting the Dataset(拆分数据集) 48 |   如何拆分数据、如何处理标签是另一个话题。简单来说,可以把它想象成 80%:20% 的划分:在 80% 上训练,用 20% 测试。如果模型在这 20% 上表现良好,那么您就得到了一个可用的模型!“表现良好”是主观的,取决于您自己。您还可以做各种优化,例如调整数据集大小、标签、神经网络及其组件,以期影响训练速度、检测速度和准确率,以及其他您可能关心(或不关心)的指标。 49 | 50 |   许多神经网络和深度学习教程使用 MNIST 手写数字数据集。下载该数据集时,它通常已经分好了训练和测试两部分,每部分都带有图像和标签: 51 | ## MNIST Training Dataset: 52 | [train-images-idx3-ubyte.gz](https://github.com/caffe2/models/blob/master/mnist/train-images-idx3-ubyte.gz): training set images (9912422 bytes) 53 | [train-labels-idx1-ubyte.gz](https://github.com/caffe2/models/blob/master/mnist/train-labels-idx1-ubyte.gz): training set labels (28881 bytes) 54 | 55 | ## MNIST Test Dataset: 56 | [t10k-images-idx3-ubyte.gz](https://github.com/caffe2/models/blob/master/mnist/t10k-images-idx3-ubyte.gz): test set images (1648877 bytes) 57 | [t10k-labels-idx1-ubyte.gz](https://github.com/caffe2/models/blob/master/mnist/t10k-labels-idx1-ubyte.gz): test set labels (4542 bytes) 58 | 59 |   该数据集按 60,000:10,000 划分:60,000 张训练图像和 10,000 张测试图像。解压后不要试图直接打开这些文件:它们不是人类可读的格式,需要先解析才能查看。有关数据如何收集和格式化的更多信息,请访问此[研究站点](http://yann.lecun.com/exdb/mnist/)。 60 | 61 |   您可以在[MNIST教程](https://caffe2.ai/docs/tutorial-MNIST.html)中使用此数据集创建CNN。 62 | 63 | 64 | # 4.Caffe Model Zoo 65 | 66 |   Caffe 和 Caffe2 最棒的一点就是模型库(Model Zoo)。这是开源社区贡献的项目集合,其中描述了模型的创建方式、所用的数据集以及模型本身。有了它,您实际上不需要自己做任何训练,直接下载模型即可。您还可以下载训练数据和测试数据来了解其工作原理,并用提供的测试数据自行验证模型的准确率。 67 | 68 | # 5.Custom Datasets 69 | 70 |   
但是,测试自己的数据有点棘手。一旦掌握了提供的模型及其数据集,我们将在另一个教程中介绍如何测试自己的数据。在您尝试这些时,最好注意您可以将数据集,样本/子集与其标签组合在一起。您可能决定要在标签上显示更少的信息,或者更多。您可能还没有在某些训练数据中包含标签。这有一个有趣的副作用,即在某些情况下通过让网络在训练期间做出一些猜测来实际改善模型性能【也就是我们说的泛化能力】。我们对特征进行分类和注释的方式并不总是映射到计算机的神经网络如何做到这一点。 “过度拟合”数据可能会导致网络性能下降。 71 | 72 | # 6.Caffe Model Files 73 |   现在让我们概述一下,让我们跳到一个具体的例子。您将看到一小组文件,这些文件将用于运行模型并查看其工作原理。 74 | 75 | 1. .caffemodel和.pb:这些是model; 它们是二进制文件,通常是大文件。 76 | > .caffemodel:来自Caffe1.0 77 | > .pb:来自Caffe2,一般都有init和predict 78 | 79 | 80 | 2. .pbtxt:Caffe2 pb文件的可读形式 81 | 3. deploy.prototxt:描述部署【使用时】(而非训练)的网络结构(net)。 82 | 4. solver.prototxt:描述训练期间使用的变量【超参数】,包括学习率,正则化等。 83 | 5. train_val.prototxt:描述训练(和验证)的网络架构(net)。 84 | 85 | 86 | 87 | -------------------------------------------------------------------------------- /caffe2/6.Loading_Pretrained_Models.md: -------------------------------------------------------------------------------- 1 | 2 | # Loading Pre-Trained Models 3 | 4 | ## Description 5 | 6 | 7 | 8 | 在本教程中,我们将使用[ModelZoo](https://github.com/caffe2/caffe2/wiki/Model-Zoo) 中预先训练的`squeezenet`模型对我们自己的图像进行分类。作为输入,我们将为要分类的图像提供路径(或URL)。了解图像的[ImageNet目标代码](https://gist.githubusercontent.com/aaronmarkham/cd3a6b6ac071eca6f7b4a6e40e6038aa/raw/9edb4038a37da6b5a44c3b5bc52e448ff09bfe5b/alexnet_codes) 也很有帮助,这样我们就可以验证结果。 '目标代码'只不过是训练期间使用的类的整数标签,例如“985”是类“daisy”(雏菊)的代码。请注意,虽然我们在这里使用squeezenet,但本教程可作为在预训练模型上运行推理的一种通用方法。 9 | 10 | 11 | 12 | 如果您来自[图像预处理教程](https://caffe2.ai/docs/tutorial-image-pre-processing.html) , 您将看到我们正在使用重新缩放和裁剪功能来准备图像,以及将图像重新格式化为CHW,BGR,最后是NCHW。我们还通过使用提供的npy文件中的计算平均值或直接减128作为减平均值来校正图像均值。 13 | 14 | 15 | 16 | 希望你会发现加载预训练模型很简单,语法简洁。从较高的层面来看,这些是在预训练模型上运行推理所需的三个步骤: 17 | 18 | 1. 读取预训练模型的init_net.pb和参数predict_net.pb文件 19 | 20 | with open("init_net.pb", "rb") as f: 21 | init_net = f.read() 22 | with open("predict_net.pb", "rb") as f: 23 | predict_net = f.read() 24 | 25 | 2. 使用protobufs中的blob在工作区(workspace)中初始化预测器 26 | 27 | p = workspace.Predictor(init_net, predict_net) 28 | 29 | 3. 
Run the net on some data and get the (softmax) results! 30 | 31 | results = p.run({'data': img}) 32 | 33 | 34 | 注意,假设网络的最后一层是softmax层,结果返回为多维数组的概率,其长度等于模型训练的类的数量。概率可以由目标代码(整数类型)索引,因此如果您知道目标代码,则可以在该索引处索引结果数组,以查看网络对输入图像属于该类的置信度。 35 | 36 | **Model Download Options** 37 | 38 | 39 | 虽然我们将在这里使用`squeezenet`,但您可以查看[Model Zoo for pre-trained models](https://github.com/caffe2/caffe2/wiki/Model-Zoo)来浏览/下载各种预训练模型,或者您可以使用Caffe2的`caffe2.python.models.download`模块轻松地从[Github caffe2/models](http://github.com/caffe2/models)获取预训练模型。 40 | 41 | 42 | 为了我们的目的,我们将使用`models.download`模块使用以下命令将`squeezenet`下载到我们本地Caffe2安装的`/caffe2/python/models`文件夹中: 43 | 44 | ``` 45 | python -m caffe2.python.models.download -i squeezenet 46 | ``` 47 | 48 | 49 | 如果完成上面的下载工作,那么你的`/caffe2/python/models`文件夹中应该有一个名为squeezenet的目录,其中包含`init_net.pb`和`predict_net.pb`。注意,如果你不使用`-i`标志,模型将下载到你的本地文件夹,但它仍然是一个名为squeezenet的目录,包含两个protobuf文件。或者,如果您希望下载所有模型,可以使用以下方法克隆整个仓库: 50 | 51 | ``` 52 | git clone https://github.com/caffe2/models 53 | ``` 54 | 55 | ## Code 56 | 57 | 在开始之前,让我们来处理所需的imports。 58 | 59 | 60 | ```python 61 | from __future__ import absolute_import 62 | from __future__ import division 63 | from __future__ import print_function 64 | from __future__ import unicode_literals 65 | %matplotlib inline 66 | from caffe2.proto import caffe2_pb2 67 | import numpy as np 68 | import skimage.io 69 | import skimage.transform 70 | from matplotlib import pyplot 71 | import os 72 | from caffe2.python import core, workspace, models 73 | import urllib2 74 | import operator 75 | print("Required modules imported.") 76 | ``` 77 | 78 | WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode. 79 | 80 | 81 | Required modules imported. 82 | 83 | 84 | ### Inputs 85 | 86 | 在这里,我们将指定用于此次运行的输入(imports),包括输入图像,模型位置,image_mean文件(可选),图像所需的大小以及标签映射文件的位置。 87 | 88 | 89 | ```python 90 | # Configuration --- Change to your setup and preferences! 
91 | # This directory should contain the models downloaded from the model zoo. To run this 92 | # tutorial, make sure there is a 'squeezenet' directory at this location that 93 | # contains both the 'init_net.pb' and 'predict_net.pb' 94 | 95 | #CAFFE_MODELS = "~/caffe2/caffe2/python/models" 96 | CAFFE_MODELS = "/home/jack/code_jack/caffe2/caffe2_model" 97 | 98 | # Some sample images you can try, or use any URL to a regular image. 99 | # IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/thumb/f/f8/Whole-Lemon.jpg/1235px-Whole-Lemon.jpg" 100 | # IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/7/7b/Orange-Whole-%26-Split.jpg" 101 | # IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/a/ac/Pretzel.jpg" 102 | # IMAGE_LOCATION = "https://cdn.pixabay.com/photo/2015/02/10/21/28/flower-631765_1280.jpg" 103 | IMAGE_LOCATION = "images/flower.jpg" 104 | 105 | # What model are we using? 106 | # Format below is the model's: 107 | # You can switch 'squeezenet' out with 'bvlc_alexnet', 'bvlc_googlenet' or others that you have downloaded 108 | MODEL = 'squeezenet', 'init_net.pb', 'predict_net.pb', 'ilsvrc_2012_mean.npy', 227 109 | 110 | # codes - these help decypher the output and source from a list from ImageNet's object codes 111 | # to provide an result like "tabby cat" or "lemon" depending on what's in the picture 112 | # you submit to the CNN. 113 | codes = "https://gist.githubusercontent.com/aaronmarkham/cd3a6b6ac071eca6f7b4a6e40e6038aa/raw/9edb4038a37da6b5a44c3b5bc52e448ff09bfe5b/alexnet_codes" 114 | print("Config set!") 115 | ``` 116 | 117 | Config set! 
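上面 codes 指向的是 ImageNet 目标代码到类名的映射文件。为了直观理解它的用途,下面用一行假设的样本(格式仿照该文件,例如 `985: 'daisy',`)演示如何把目标代码解析成可读标签;解析方式与后文 Process Results 一节中的做法一致,真实内容请以从 codes URL 下载到的文件为准:

```python
# 一行虚构的样本,格式仿照 alexnet_codes 文件中的条目
sample_line = "985: 'daisy',"

# 以第一个冒号为界拆出代码和类名,再去掉引号和多余的逗号
code, _, result = sample_line.partition(":")
label = result.replace("'", "").split(",")[0].strip()
print(code.strip(), label)   # 985 daisy
```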
118 | 119 | 120 | ### Setup paths 121 | 122 | 123 | 设置配置后,我们现在可以加载image_mean文件(如果存在),以及预测网络和init网络。 124 | 125 | 126 | ```python 127 | # set paths and variables from model choice and prep image 128 | CAFFE_MODELS = os.path.expanduser(CAFFE_MODELS) 129 | 130 | # mean can be 128 or custom based on the model 131 | # gives better results to remove the colors found in all of the training images 132 | MEAN_FILE = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[3]) 133 | if not os.path.exists(MEAN_FILE): 134 | print("No mean file found!") 135 | mean = 128 136 | else: 137 | print ("Mean file found!") 138 | mean = np.load(MEAN_FILE).mean(1).mean(1) 139 | mean = mean[:, np.newaxis, np.newaxis] 140 | print("mean was set to: ", mean) 141 | 142 | # some models were trained with different image sizes, this helps you calibrate(校准) your image 143 | INPUT_IMAGE_SIZE = MODEL[4] 144 | 145 | # make sure all of the files are around... 146 | INIT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[1]) 147 | PREDICT_NET = os.path.join(CAFFE_MODELS, MODEL[0], MODEL[2]) 148 | 149 | # Check to see if the files exist 150 | if not os.path.exists(INIT_NET): 151 | print("WARNING: " + INIT_NET + " not found!") 152 | else: 153 | if not os.path.exists(PREDICT_NET): 154 | print("WARNING: " + PREDICT_NET + " not found!") 155 | else: 156 | print("All needed files found!") 157 | 158 | ``` 159 | 160 | No mean file found! 161 | mean was set to: 128 162 | All needed files found! 163 | 164 | 165 | ### Image Preprocessing 166 | 167 | 168 | 现在我们已经指定了输入并验证了输入网络的存在,我们可以加载图像并预处理图像以供提取到Caffe2卷积神经网络中!这是非常重要的一步,因为训练有素的CNN需要特定大小的输入图像,其值来自特定分布。 169 | 170 | 171 | ```python 172 | # Function to crop the center cropX x cropY pixels from the input image 173 | def crop_center(img,cropx,cropy): 174 | y,x,c = img.shape 175 | startx = x//2-(cropx//2) 176 | starty = y//2-(cropy//2) 177 | return img[starty:starty+cropy,startx:startx+cropx] 178 | 179 | # Function to rescale the input image to the desired height and/or width. 
This function will preserve 180 | # the aspect ratio of the original image while making the image the correct scale so we can retrieve 181 | # a good center crop. This function is best used with center crop to resize any size input images into 182 | # specific sized images that our model can use. 183 | def rescale(img, input_height, input_width): 184 | # Get original aspect ratio 185 | aspect = img.shape[1]/float(img.shape[0]) 186 | if(aspect>1): 187 | # landscape orientation - wide image 188 | res = int(aspect * input_height) 189 | imgScaled = skimage.transform.resize(img, (input_width, res)) 190 | if(aspect<1): 191 | # portrait orientation - tall image 192 | res = int(input_width/aspect) 193 | imgScaled = skimage.transform.resize(img, (res, input_height)) 194 | if(aspect == 1): 195 | imgScaled = skimage.transform.resize(img, (input_width, input_height)) 196 | return imgScaled 197 | 198 | # Load the image as a 32-bit float 199 | # Note: skimage.io.imread returns a HWC ordered RGB image of some size 200 | img = skimage.img_as_float(skimage.io.imread(IMAGE_LOCATION)).astype(np.float32) 201 | print("Original Image Shape: " , img.shape) 202 | 203 | # Rescale the image to comply with our desired input size. This will not make the image 227x227 204 | # but it will make either the height or width 227 so we can get the ideal center crop. 
205 | img = rescale(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE) 206 | print("Image Shape after rescaling: " , img.shape) 207 | pyplot.figure() 208 | pyplot.imshow(img) 209 | pyplot.title('Rescaled image') 210 | 211 | # Crop the center 227x227 pixels of the image so we can feed it to our model 212 | img = crop_center(img, INPUT_IMAGE_SIZE, INPUT_IMAGE_SIZE) 213 | print("Image Shape after cropping: " , img.shape) 214 | pyplot.figure() 215 | pyplot.imshow(img) 216 | pyplot.title('Center Cropped') 217 | 218 | # switch to CHW (HWC --> CHW) 219 | img = img.swapaxes(1, 2).swapaxes(0, 1) 220 | print("CHW Image Shape: " , img.shape) 221 | 222 | pyplot.figure() 223 | for i in range(3): 224 | # For some reason, pyplot subplot follows Matlab's indexing 225 | # convention (starting with 1). Well, we'll just follow it... 226 | pyplot.subplot(1, 3, i+1) 227 | pyplot.imshow(img[i]) 228 | pyplot.axis('off') 229 | pyplot.title('RGB channel %d' % (i+1)) 230 | 231 | # switch to BGR (RGB --> BGR) 232 | img = img[(2, 1, 0), :, :] 233 | 234 | # remove mean for better results 235 | img = img * 255 - mean 236 | 237 | # add batch size axis which completes the formation of the NCHW shaped input that we want 238 | img = img[np.newaxis, :, :, :].astype(np.float32) 239 | 240 | print("NCHW image (ready to be used as input): ", img.shape) 241 | ``` 242 | 243 | Original Image Shape: (751, 1280, 3) 244 | Image Shape after rescaling: (227, 386, 3) 245 | 246 | 247 | /home/jack/anaconda2/envs/caffe2/lib/python2.7/site-packages/skimage/transform/_warps.py:105: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage 0.15. 248 | warn("The default mode, 'constant', will be changed to 'reflect' in " 249 | /home/jack/anaconda2/envs/caffe2/lib/python2.7/site-packages/skimage/transform/_warps.py:110: UserWarning: Anti-aliasing will be enabled by default in skimage 0.15 to avoid aliasing artifacts when down-sampling images. 
250 | warn("Anti-aliasing will be enabled by default in skimage 0.15 to " 251 | 252 | 253 | Image Shape after cropping: (227, 227, 3) 254 | CHW Image Shape: (3, 227, 227) 255 | NCHW image (ready to be used as input): (1, 3, 227, 227) 256 | 257 | 258 | 259 | ![png](./markdown_img/6.output_7_3.png) 260 | 261 | 262 | 263 | ![png](./markdown_img/6.output_7_4.png) 264 | 265 | 266 | 267 | ![png](./markdown_img/6.output_7_5.png) 268 | 269 | 270 | ### Prepare the CNN and run the net! 271 | 272 | 273 | 既然图像已准备好送入CNN,让我们打开protobufs文件,将它们加载到工作区中,然后运行网络。 274 | 275 | 276 | ```python 277 | # Read the contents of the input protobufs into local variables 278 | #将输入的protobufs的内容读入局部变量 279 | with open(INIT_NET, "rb") as f: 280 | init_net = f.read() 281 | with open(PREDICT_NET, "rb") as f: 282 | predict_net = f.read() 283 | 284 | # Initialize the predictor from the input protobufs 285 | p = workspace.Predictor(init_net, predict_net) 286 | 287 | # Run the net and return prediction 288 | results = p.run({'data': img}) 289 | 290 | # Turn it into something we can play with and examine which is in a multi-dimensional array 291 | #把结果转成我们可以方便处理的多维数组,再来检查它 292 | results = np.asarray(results) 293 | print("results shape: ", results.shape) 294 | 295 | # Quick way to get the top-1 prediction result 296 | # Squeeze out the unnecessary axis.
This returns a 1-D array of length 1000 297 | # 挤出不必要的轴。这将返回长度为1000的1-D数组 298 | preds = np.squeeze(results) 299 | # Get the prediction and the confidence by finding the maximum value and index of maximum value in preds array 300 | #通过在preds数组中找到最大值和最大值索引来获得预测和置信度 301 | curr_pred, curr_conf = max(enumerate(preds), key=operator.itemgetter(1)) 302 | print("Prediction: ", curr_pred) 303 | print("Confidence: ", curr_conf) 304 | ``` 305 | 306 | results shape: (1, 1, 1000, 1, 1) 307 | Prediction: 985 308 | Confidence: 0.982227 309 | 310 | 311 | ### Process Results 312 | 313 | 314 | 召回ImageNet是一个1000类数据集,并观察到第三个结果轴长度为1000并非巧合。该轴保留了预训练模型中每个类别的概率。因此,当您在特定索引处查看结果数组时,该数字可以解释为输入属于与该索引对应的类的概率。现在我们已经运行了预测器并收集了结果,我们可以通过将它们与相应的英文标签相匹配来解释它们。 315 | 316 | 317 | ```python 318 | # the rest of this is digging through the results 319 | # 剩下的就是挖掘结果 320 | results = np.delete(results, 1) 321 | index = 0 322 | highest = 0 323 | arr = np.empty((0,2), dtype=object) 324 | arr[:,0] = int(10) 325 | arr[:,1:] = float(10) 326 | for i, r in enumerate(results): 327 | # imagenet index begins with 1! 
328 | i=i+1 329 | arr = np.append(arr, np.array([[i,r]]), axis=0) 330 | if (r > highest): 331 | highest = r 332 | index = i 333 | 334 | # top N results 335 | N = 5 336 | topN = sorted(arr, key=lambda x: x[1], reverse=True)[:N] 337 | print("Raw top {} results: {}".format(N,topN)) 338 | 339 | # Isolate the indexes of the top-N most likely classes 340 | # 隔离前N个最有可能的类的索引 341 | topN_inds = [int(x[0]) for x in topN] 342 | print("Top {} classes in order: {}".format(N,topN_inds)) 343 | 344 | # Now we can grab the code list and create a class Look Up Table 345 | # 现在我们可以获取代码列表并创建一个查找表类 346 | response = urllib2.urlopen(codes) 347 | class_LUT = [] 348 | for line in response: 349 | code, result = line.partition(":")[::2] 350 | code = code.strip() 351 | result = result.replace("'", "") 352 | if code.isdigit(): 353 | class_LUT.append(result.split(",")[0][1:]) 354 | 355 | # For each of the top-N results, associate the integer result with an actual class 356 | # 对于每个前N个结果,将整数结果与实际类相关联 357 | for n in topN: 358 | print("Model predicts '{}' with {}% confidence".format(class_LUT[int(n[0])],float("{0:.2f}".format(n[1]*100)))) 359 | 360 | ``` 361 | 362 | Raw top 5 results: [array([985.0, 0.9822270274162292], dtype=object), array([309.0, 0.011943656019866467], dtype=object), array([946.0, 0.004810133948922157], dtype=object), array([325.0, 0.00034070576657541096], dtype=object), array([944.0, 0.00023906580463517457], dtype=object)] 363 | Top 5 classes in order: [985, 309, 946, 325, 944] 364 | Model predicts 'daisy' with 98.22% confidence 365 | Model predicts 'bee' with 1.19% confidence 366 | Model predicts 'cardoon' with 0.48% confidence 367 | Model predicts 'sulphur butterfly' with 0.03% confidence 368 | Model predicts 'artichoke' with 0.02% confidence 369 | 370 | 371 | ### Feeding Larger Batches(喂养更大的批次) 372 | 373 | 374 | 以上是如何一个批次(batch)送入一个图像的示例。如果我们在一个批次中一次提供多个图像,我们可以实现更高的吞吐量。回想一下,输入分类器的数据是'NCHW'顺序,因此为了提供多个图像,我们将扩展'N'轴。 375 | 376 | 377 | ```python 378 | # List of input images to be 
fed 379 | images = ["images/cowboy-hat.jpg", 380 | "images/cell-tower.jpg", 381 | "images/Ducreux.jpg", 382 | "images/pretzel.jpg", 383 | "images/orangutan.jpg", 384 | "images/aircraft-carrier.jpg", 385 | "images/cat.jpg"] 386 | 387 | # Allocate space for the batch of formatted images 388 | # 为批量格式化的图像分配空间 389 | NCHW_batch = np.zeros((len(images),3,227,227)) 390 | print ("Batch Shape: ",NCHW_batch.shape) 391 | 392 | # For each of the images in the list, format it and place it in the batch 393 | # 对于列表中的每个图像,对其进行格式化并将其放入批处理中 394 | for i,curr_img in enumerate(images): 395 | img = skimage.img_as_float(skimage.io.imread(curr_img)).astype(np.float32) 396 | img = rescale(img, 227, 227) 397 | img = crop_center(img, 227, 227) 398 | img = img.swapaxes(1, 2).swapaxes(0, 1) 399 | img = img[(2, 1, 0), :, :] 400 | img = img * 255 - mean 401 | NCHW_batch[i] = img 402 | 403 | print("NCHW image (ready to be used as input): ", NCHW_batch.shape) 404 | 405 | # Run the net on the batch 406 | results = p.run([NCHW_batch.astype(np.float32)]) 407 | 408 | # Turn it into something we can play with and examine which is in a multi-dimensional array 409 | #把它变成我们可以观察的东西,并检查哪个是多维数组 410 | results = np.asarray(results) 411 | 412 | # Squeeze out the unnecessary axis 413 | # 挤出不必要的轴 414 | preds = np.squeeze(results) 415 | print("Squeezed Predictions Shape, with batch size {}: {}".format(len(images),preds.shape)) 416 | 417 | # Describe the results 418 | for i,pred in enumerate(preds): 419 | print("Results for: '{}'".format(images[i])) 420 | # Get the prediction and the confidence by finding the maximum value and index of maximum value in preds array 421 | # 通过在preds数组中找到最大值和最大值索引来获得预测和置信度 422 | curr_pred, curr_conf = max(enumerate(pred), key=operator.itemgetter(1)) 423 | print("\tPrediction: ", curr_pred) 424 | print("\tClass Name: ", class_LUT[int(curr_pred)]) 425 | print("\tConfidence: ", curr_conf) 426 | ``` 427 | 428 | Batch Shape: (7, 3, 227, 227) 429 | NCHW image (ready to be used as input): 
(7, 3, 227, 227) 430 | Squeezed Predictions Shape, with batch size 7: (7, 1000) 431 | Results for: 'images/cowboy-hat.jpg' 432 | Prediction: 515 433 | Class Name: cowboy hat 434 | Confidence: 0.8500917 435 | Results for: 'images/cell-tower.jpg' 436 | Prediction: 645 437 | Class Name: maypole 438 | Confidence: 0.18584356 439 | Results for: 'images/Ducreux.jpg' 440 | Prediction: 568 441 | Class Name: fur coat 442 | Confidence: 0.10253135 443 | Results for: 'images/pretzel.jpg' 444 | Prediction: 932 445 | Class Name: pretzel 446 | Confidence: 0.99962187 447 | Results for: 'images/orangutan.jpg' 448 | Prediction: 365 449 | Class Name: orangutan 450 | Confidence: 0.9920053 451 | Results for: 'images/aircraft-carrier.jpg' 452 | Prediction: 403 453 | Class Name: aircraft carrier 454 | Confidence: 0.9998778 455 | Results for: 'images/cat.jpg' 456 | Prediction: 281 457 | Class Name: tabby 458 | Confidence: 0.5133174 459 | 460 | 461 | 462 | ```python 463 | 464 | ``` 465 | -------------------------------------------------------------------------------- /caffe2/7.Image_Pre-Processing_Pipeline.md: -------------------------------------------------------------------------------- 1 | 2 | # Image Loading and Preprocessing 3 | 4 | 5 | 在本教程中,我们将了解如何从本地文件或URL加载图像,然后您可以在其他教程或示例中使用这些文件。此外,我们将深入探讨将Caffe2与图像一起使用所需的预处理类型。 6 | 7 | #### Mac OSx Prerequisites 8 | 9 | 10 | 如果您还没有安装这些Python模块,那么现在就需要这样做。 11 | ``` 12 | sudo pip install scikit-image scipy matplotlib 13 | ``` 14 | 15 | 16 | ```python 17 | from __future__ import absolute_import 18 | from __future__ import division 19 | from __future__ import print_function 20 | from __future__ import unicode_literals 21 | 22 | %matplotlib inline 23 | import skimage 24 | import skimage.io as io 25 | import skimage.transform 26 | import sys 27 | import numpy as np 28 | import math 29 | from matplotlib import pyplot 30 | import matplotlib.image as mpimg 31 | print("Required modules imported.") 32 | ``` 33 | 34 | Required modules imported. 
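导入完成后,先用一个 2x2 的“玩具图像”预演本教程后文要做的两个关键变换:RGB→BGR(交换颜色通道)和 HWC→CHW(交换内存布局)。这只是一个独立的 numpy 小示例,便于在处理真实图像之前确认这两个变换各自改变了什么:

```python
import numpy as np

# 构造一个 2x2、三通道的玩具图像,布局为 H x W x C
img = np.zeros((2, 2, 3), dtype=np.float32)
img[:, :, 0] = 1.0   # 只把 R 通道置 1,便于观察通道交换

bgr = img[:, :, (2, 1, 0)]                 # RGB -> BGR:R 值跑到了最后一个通道
chw = bgr.swapaxes(1, 2).swapaxes(0, 1)    # HWC -> CHW,与后文真实代码一致

print(img.shape, chw.shape)   # (2, 2, 3) (3, 2, 2)
print(chw[2, 0, 0])           # 1.0,即原来的 R 通道
```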
35 | 36 | 37 | ## Test an Image 38 | 39 | 40 | 在下面的代码块中使用`IMAGE_LOCATION`来加载您想要测试的内容。只需更改注释标记即可完成每轮教程。通过这种方式,您将了解各种图像格式会发生什么变化,以及有关如何预处理它们的一些提示。如果要尝试自己的图像,请将其放在images文件夹中或使用远程URL。当您选择远程URL时,请找自己能够轻松获得的并尝试查找指向常见图像文件类型和扩展名的URL,而不是某些长标识符或查询字符串,这可能会影响下一步。 41 | 42 | ## Color Issues 43 | 44 | 45 | 从智能手机相机加载图像时,请记住可能会遇到颜色格式问题。下面我们将展示RGB和BGR之间的翻转如何影响图像的示例。如果图片格式搞错显然会使您的模型中的检测失效。确保您传递的图像数据是您认为的! 46 | 47 | ### Caffe Uses BGR Order 48 | 49 | 50 | 由于Caffe依赖OpenCV的传统支持以及opencv处理蓝绿红(BGR)顺序的图像而不是更常用的红-绿-蓝(RGB)顺序,Caffe2也是** BGR **顺序。从很多方面来说,这个决定从长远来看有助于您使用不同的计算机视觉实用程序和库,但它也可能是混乱的根源。 51 | 52 | 53 | ```python 54 | # You can load either local IMAGE_FILE or remote URL 55 | # For Round 1 of this tutorial, try a local image. 56 | IMAGE_LOCATION = 'images/cat.jpg' 57 | 58 | # For Round 2 of this tutorial, try a URL image with a flower: 59 | # IMAGE_LOCATION = "https://cdn.pixabay.com/photo/2015/02/10/21/28/flower-631765_1280.jpg" 60 | # IMAGE_LOCATION = "images/flower.jpg" 61 | 62 | # For Round 3 of this tutorial, try another URL image with lots of people: 63 | # IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/1/18/NASA_Astronaut_Group_15.jpg" 64 | # IMAGE_LOCATION = "images/astronauts.jpg" 65 | 66 | # For Round 4 of this tutorial, try a URL image with a portrait! 
67 | # IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/9/9a/Ducreux1.jpg" 68 | # IMAGE_LOCATION = "images/Ducreux.jpg" 69 | 70 | img = skimage.img_as_float(skimage.io.imread(IMAGE_LOCATION)).astype(np.float32) 71 | 72 | # test color reading 73 | # show the original image 74 | pyplot.figure() 75 | pyplot.subplot(1,2,1) 76 | pyplot.imshow(img) 77 | pyplot.axis('on') 78 | pyplot.title('Original image = RGB') 79 | 80 | # show the image in BGR - just doing RGB->BGR temporarily for display 81 | imgBGR = img[:, :, (2, 1, 0)] 82 | #pyplot.figure() 83 | pyplot.subplot(1,2,2) 84 | pyplot.imshow(imgBGR) 85 | pyplot.axis('on') 86 | pyplot.title('OpenCV, Caffe2 = BGR') 87 | ``` 88 | 89 | 90 | 91 | 92 | Text(0.5,1,u'OpenCV, Caffe2 = BGR') 93 | 94 | 95 | 96 | 97 | ![png](./markdown_img/7.output_3_1.png) 98 | 99 | 100 | 正如您在上面的示例中所看到的,记住顺序的差异非常重要。在下面的代码块中,我们将拍摄图像并转换为BGR顺序,以便Caffe适当地处理它。 101 | 102 | 但等等,有更多的色彩乐趣...... 103 | 104 | ### Caffe Prefers CHW Order 105 | 106 | 怎么办!?你问什么是CHW?那么,还有HWC!这两种格式都出现在图像处理中: 107 | 108 | - H: Height 109 | - W: Width 110 | - C: Channel (as in color) 111 | 112 | 深入研究图像数据的存储方式是内存分配顺序。您可能已经注意到,当我们第一次加载图像时,我们强制它通过一些有趣的转换。这些是数据转换,让我们可以像使用立方体一样使用图像。我们看到的是在立方体的顶部,操纵下面的层可以改变我们所看到的。我们可以修补它的基本属性,如上所述,很容易交换颜色。 113 | 114 | 对于GPU处理,这是Caffe2擅长的,这个订单需要是CHW。对于CPU处理,此顺序通常为HWC。基本上,您将要使用CHW并确保步骤包含在图像管道(image pipeline)中。将RGB调整为BGR,将其封装为此“C”有效负载,然后调整HWC,“C”是您刚刚切换的相同颜色。 115 | 116 | 117 | 你可能会问为什么!原因指向cuDNN,这有助于加速GPU的处理。它只使用CHW,我们总结说它更快。 118 | 119 | 120 | 鉴于这两种转变,您可能认为这已足够,但事实并非如此。我们仍然需要调整大小和、或裁剪(resize and/or crop),并可能会查看方向(旋转)和镜像(orientation (rotation) and mirroring)等内容。 121 | 122 | ## Rotation and Mirroring(旋转和镜像) 123 | 124 | 125 | 本主题通常保留用于来自智能手机的图像。一般来说,手机拍摄的照片很棒,但在拍摄照片的方式以及应该采用的方向方面做得非常糟糕。然后是用户用手机的相机在阳光下做所有事情,让他们做的事情是设计师永远不会做的事情预期。**相机**-对,因为手机通常有两个相机,这两个相机在像素数和纵横比上都采用不同大小的图片,不仅如此,它们有时会将它们镜像,有时它们会以纵向和横向模式拍摄,有时也会他们懒得知道他们在哪个模式。 126 | 127 | 128 | 129 | 在许多方面,这是您需要在管道(in your pipeline)中评估的第一件事,然后查看大小调整(sizing)(如下所述),然后找出颜色情况(the color 
situation)。如果你正在为iOS开发,那么你很幸运,它会相对容易。如果你是一个超级黑客向导开发人员,带有铅衬短裤并为Android开发,那么至少你有铅衬短裤。 130 | 131 | 132 | Android市场的变化是美妙而可怕的。在理想的世界中,您可以依赖来自任何相机的图片中的EXIF数据并使用它来决定方向和镜像,并且您将拥有一个简单的案例功能来处理您的转换。没有这样的运气,但你并不孤单。许多人来到你面前,为你而受苦。 133 | 134 | 135 | ```python 136 | # Image came in sideways - it should be a portait image! 137 | # How you detect this depends on the platform 138 | # Could be a flag from the camera object 139 | # Could be in the EXIF data 140 | # ROTATED_IMAGE = "https://upload.wikimedia.org/wikipedia/commons/8/87/Cell_Phone_Tower_in_Ladakh_India_with_Buddhist_Prayer_Flags.jpg" 141 | ROTATED_IMAGE = "images/cell-tower.jpg" 142 | imgRotated = skimage.img_as_float(skimage.io.imread(ROTATED_IMAGE)).astype(np.float32) 143 | pyplot.figure() 144 | pyplot.imshow(imgRotated) 145 | pyplot.axis('on') 146 | pyplot.title('Rotated image') 147 | 148 | # Image came in flipped or mirrored - text is backwards! 149 | # Again detection depends on the platform 150 | # This one is intended to be read by drivers in their rear-view mirror 151 | # MIRROR_IMAGE = "https://upload.wikimedia.org/wikipedia/commons/2/27/Mirror_image_sign_to_be_read_by_drivers_who_are_backing_up_-b.JPG" 152 | MIRROR_IMAGE = "images/mirror-image.jpg" 153 | imgMirror = skimage.img_as_float(skimage.io.imread(MIRROR_IMAGE)).astype(np.float32) 154 | pyplot.figure() 155 | pyplot.imshow(imgMirror) 156 | pyplot.axis('on') 157 | pyplot.title('Mirror image') 158 | ``` 159 | 160 | 161 | 162 | 163 | Text(0.5,1,u'Mirror image') 164 | 165 | 166 | 167 | 168 | ![png](./markdown_img/7.output_5_1.png) 169 | 170 | 171 | 172 | ![png](./markdown_img/7.output_5_2.png) 173 | 174 | 175 | 176 | 177 | 所以你可以看到我们遇到了一些问题。如果我们正在探测地点,地标或物体(detecting places, landmarks, or objects),那么侧身的蜂窝塔就不好了。如果我们检测文本(detecting text)并进行自动语言翻译(automatic language translation),那么图片中是镜像文本就不好了。但是,嘿,也许你想制作一个可以通过两种方式检测英语的模型。这将是非常棒的,但不适用于本教程! 
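顺带一提,上面这类"方向不明"的图片很适合用穷举来补救。下面是一个基于 numpy 的示意性草图(`all_orientations` 为自拟的函数名,并非本教程或 Caffe2 的 API),它枚举一张 HWC 图像的全部 8 种旋转/镜像组合,供"逐个方向送入模型、取置信度最高者"这一思路使用:

```python
import numpy as np

def all_orientations(img):
    """枚举一张 HWC 图像的全部 8 种方向:4 个旋转角 x 是否左右镜像。"""
    variants = []
    for k in range(4):                       # 依次旋转 0/90/180/270 度
        rotated = np.rot90(img, k)
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # 每个旋转角再附带一个镜像版本
    return variants

# 用随机数据代替真实图像做演示;实际使用时传入 skimage 读出的数组即可
demo = np.random.rand(360, 480, 3).astype(np.float32)
for v in all_orientations(demo):
    print(v.shape)   # 旋转 90/270 度时 H 与 W 互换
```

把每个变体依次送入模型并保留置信度最高的结果,就相当于在缺少 EXIF 信息时"猜"出了正确方向——代价是推理次数变为 8 倍。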
178 | 179 | 180 | 181 | 让我们将这些图片变成Caffe2可测,我们周围的标准检测模型可以检测到。此外,这个小技巧可能会拯救你,例如,你真的必须检测蜂窝塔,但没有找到EXIF数据:那么你将循环每一次旋转,每次翻转,产生这张照片的许多衍生物并运行它们。当检测的置信度百分比足够高时,Bam !你找到了你需要的方向和隐藏其中的蜂窝塔。 182 | 183 | 无论如何,对于示例代码: 184 | 185 | 186 | ```python 187 | # Run me to flip the image back and forth 188 | imgMirror = np.fliplr(imgMirror) 189 | pyplot.figure() 190 | pyplot.imshow(imgMirror) 191 | pyplot.axis('off') 192 | pyplot.title('Mirror image') 193 | ``` 194 | 195 | 196 | 197 | 198 | Text(0.5,1,u'Mirror image') 199 | 200 | 201 | 202 | 203 | ![png](./markdown_img/7.output_7_1.png) 204 | 205 | 206 | 207 | ```python 208 | # Run me to rotate the image 90 degrees 209 | imgRotated = np.rot90(imgRotated, 3) 210 | pyplot.figure() 211 | pyplot.imshow(imgRotated) 212 | pyplot.axis('off') 213 | pyplot.title('Rotated image') 214 | ``` 215 | 216 | 217 | 218 | 219 | Text(0.5,1,u'Rotated image') 220 | 221 | 222 | 223 | 224 | ![png](./markdown_img/7.output_8_1.png) 225 | 226 | 227 | ## Sizing 228 | 229 | 230 | 231 | 预处理的一部分是调整大小。由于我们不会进入这里的原因,Caffe2管道(pipeline)中的图像应该是方形的。此外,为了提高性能,它们应调整到标准高度和宽度,通常小于原始来源。在下面的示例中,我们将调整为256 x 256像素,但您可能会注意到`input_height`和`input_width`设置为224 x 224,然后用于指定裁剪。这是几个基于图像的模型所期望的。他们接受了大小为224 x 224的图像训练,为了使模型能够正确识别您投射的可疑图像,这些图像也应该是224 x 224。 232 | 233 | ** 确保仔细检查您正在使用的model的输入数据尺寸(input size)!** 234 | 235 | 236 | ```python 237 | # Model is expecting 224 x 224, so resize/crop needed. 
238 | # First, let's resize the image to 256*256 239 | orig_h, orig_w, _ = img.shape 240 | print("Original image's shape is {}x{}".format(orig_h, orig_w)) 241 | input_height, input_width = 224, 224 242 | print("Model's input shape is {}x{}".format(input_height, input_width)) 243 | img256 = skimage.transform.resize(img, (256, 256)) 244 | 245 | # Plot original and resized images for comparison 246 | f, axarr = pyplot.subplots(1,2) 247 | axarr[0].imshow(img) 248 | axarr[0].set_title("Original Image (" + str(orig_h) + "x" + str(orig_w) + ")") 249 | axarr[0].axis('on') 250 | axarr[1].imshow(img256) 251 | axarr[1].axis('on') 252 | axarr[1].set_title('Resized image to 256x256') 253 | pyplot.tight_layout() 254 | 255 | print("New image shape:" + str(img256.shape)) 256 | ``` 257 | 258 | Original image's shape is 360x480 259 | Model's input shape is 224x224 260 | New image shape:(256, 256, 3) 261 | 262 | 263 | 264 | ![png](./markdown_img/7.output_10_1.png) 265 | 266 | 267 | 268 | 请注意,调整大小会使图像失真一点。在处理过程中认识到resizing非常重要,因为它会对模型的结果产生影响。花和动物可能会有一点拉伸或挤压,但面部特征可能不会。 269 | 270 | 当原始图像的尺寸与所需尺寸不成比例时,可能会发生这种情况。在这个特定的例子中,最好只调整大小为224x224,而不是麻烦裁剪(cropping)。让我们尝试另一种重新调整图像和保持宽高比的策略。 271 | 272 | ### Rescaling(重新缩放) 273 | 274 | 275 | 如果你想象肖像(肖像)图像与风景(Landscape)图像,你就会知道有很多东西可以通过调整大小来搞砸。重新缩放假设您正在锁定宽高比以防止图像失真。在这种情况下,我们将图像缩小到与模型输入大小匹配的最短边。 276 | 277 | 278 | 在我们这里的示例中,模型大小为224 x 224.当您在1920x1080中查看显示器时,它的宽度比高度更长,如果将其缩小到224,则在用完之前就会超出高度宽度,所以...... 
279 | - Landscape(风景): limit resize by the height 280 | - Portrait(肖像): limit resize by the width 281 | 282 | 283 | ```python 284 | print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!") 285 | print("Model's input shape is {}x{}".format(input_height, input_width)) 286 | aspect = img.shape[1]/float(img.shape[0]) 287 | print("Orginal aspect ratio: " + str(aspect)) 288 | if(aspect>1): 289 | # landscape orientation - wide image 290 | res = int(aspect * input_height) 291 | imgScaled = skimage.transform.resize(img, (input_height, res)) 292 | if(aspect<1): 293 | # portrait orientation - tall image 294 | res = int(input_width/aspect) 295 | imgScaled = skimage.transform.resize(img, (res, input_width)) 296 | if(aspect == 1): 297 | imgScaled = skimage.transform.resize(img, (input_height, input_width)) 298 | pyplot.figure() 299 | pyplot.imshow(imgScaled) 300 | pyplot.axis('on') 301 | pyplot.title('Rescaled image') 302 | print("New image shape:" + str(imgScaled.shape) + " in HWC") 303 | ``` 304 | 305 | Original image shape:(360, 480, 3) and remember it should be in H, W, C! 306 | Model's input shape is 224x224 307 | Orginal aspect ratio: 1.33333333333 308 | New image shape:(224, 298, 3) in HWC 309 | 310 | 311 | 312 | ![png](./markdown_img/7.output_12_1.png) 313 | 314 | 315 | 316 | 此时,只有一个维度设置为模型输入所需的维度。我们仍然需要裁剪(crop)一边做一个正方形。 317 | 318 | ### Cropping(裁剪) 319 | 320 | 321 | 322 | 我们可以利用各种策略。事实上,我们可以倒退并决定做一个中心裁剪(center crop)。所以我们不是缩小到最小的,我们可以至少在一边,我们从中间拿出一大块。如果我们在没有缩放的情况下完成了这项操作,那么我们最终只能使用花朵踏板(flower pedal)的一部分,因此我们仍需要对图像进行一些调整。 323 | 324 | 325 | 下面我们将尝试一些裁剪(cropping)策略: 326 | 1. Just grab the exact dimensions you need from the middle!只需从中间抓住您需要的确切尺寸! 327 | 2. Resize to a square that's pretty close then grab from the middle.调整到一个非常接近的正方形然后从中间抓取。 328 | 3. 
Use the rescaled image and grab the middle.使用重新缩放的图像并抓住中间。 329 | 330 | 331 | ```python 332 | # Compare the images and cropping strategies 333 | # Try a center crop on the original for giggles 334 | print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!") 335 | def crop_center(img,cropx,cropy): 336 | y,x,c = img.shape 337 | startx = x//2-(cropx//2) 338 | starty = y//2-(cropy//2) 339 | return img[starty:starty+cropy,startx:startx+cropx] 340 | # yes, the function above should match resize and take a tuple... 341 | 342 | pyplot.figure() 343 | # Original image 344 | imgCenter = crop_center(img,224,224) 345 | pyplot.subplot(1,3,1) 346 | pyplot.imshow(imgCenter) 347 | pyplot.axis('on') 348 | pyplot.title('Original') 349 | 350 | # Now let's see what this does on the distorted image 351 | img256Center = crop_center(img256,224,224) 352 | pyplot.subplot(1,3,2) 353 | pyplot.imshow(img256Center) 354 | pyplot.axis('on') 355 | pyplot.title('Squeezed') 356 | 357 | # Scaled image 358 | imgScaledCenter = crop_center(imgScaled,224,224) 359 | pyplot.subplot(1,3,3) 360 | pyplot.imshow(imgScaledCenter) 361 | pyplot.axis('on') 362 | pyplot.title('Scaled') 363 | 364 | pyplot.tight_layout() 365 | ``` 366 | 367 | Original image shape:(360, 480, 3) and remember it should be in H, W, C! 368 | 369 | 370 | 371 | ![png](./markdown_img/7.output_14_1.png) 372 | 373 | 374 | 375 | 你可以看到,除了可能是最后一个之外,它没有那么好用。中间的一个也可能没问题,但是在你尝试模型并测试很多候选图像之前你不会知道。 376 | 在这一点上,我们可以看看我们的差异,将其分成两半并从每一侧移除一些像素。然而,这确实有一个缺点,因为偏离中心的主题会被削减。 377 | 378 | 379 | 如果您现在已经运行过几次本教程并且在第3轮,那么您会发现一个非常大的问题。你错过了宇航员!你仍然可以看到第二轮花的问题。裁剪后缺少的东西可能会导致问题。可以这样想:如果您不知道您正在使用的模型是如何准备的,那么您不知道如何使图像符合要求,因此请注意测试结果!如果模型使用了很多不同的宽高比图像并且只是挤压它们以符合正方形那么很有可能随着时间的推移和大量的样本它“学会”了什么东西看起来像挤压并且可以匹配。但是,如果您正在寻找面部特征和地标等细节,或者任何图像中真正细致入微的元素,这可能是危险且容易出错的。 380 | 381 | #### Further Strategies(进一步战略)? 382 | 383 | 另一种策略是使用真实数据重新调整到最佳尺寸,然后使用您可以在模型中安全忽略的信息填充图像的其余部分。因为你在这里经历了足够的经验,我们将把它保存到另一个教程中! 
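这种"填充"策略可以用一个很小的草图来说明(假设性示例,非教程原代码:`pad_to_square` 为自拟函数名,填充值 0 也只是演示用的假设,实际填充内容应与模型训练时的处理保持一致)。思路是:先按宽高比把长边缩放到目标尺寸,再用常数把短边补齐成正方形;这里用随机数据只演示填充这一步:

```python
import numpy as np

def pad_to_square(img, size, fill=0.0):
    """把一张 HWC 图像居中放入 size x size 的画布,空余部分用常数 fill 填充。"""
    h, w, c = img.shape
    out = np.full((size, size, c), fill, dtype=img.dtype)
    y0 = (size - h) // 2
    x0 = (size - w) // 2
    out[y0:y0 + h, x0:x0 + w] = img
    return out

# 假设已按宽高比把一张竖图缩放到 224x150(此处用随机数据代替)
tall = np.random.rand(224, 150, 3).astype(np.float32)
padded = pad_to_square(tall, 224)
print(padded.shape)   # (224, 224, 3)
```

这样主体既不会像中心裁剪那样被切掉,也不会像直接 resize 那样被拉伸;代价是模型必须能够忽略填充区域。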
384 | 385 | ### Upscaling(倍增) 386 | 387 | 当您想要运行的图像“微小”时,您会怎么做?在我们的例子中,我们一直在准备输入图像,规格为224x224。请考虑下面的128x128图像。 388 | ![cells at 128x128](images/Cellsx128.png) 389 | 390 | 现在我们不是在讨论超分辨率或CSI效应(CSI-effect),我们可以拍摄模糊的ATM照片,并将纹身识别为perp的脖子。尽管如此,深度学习已经提供了[一些进展](https://github.com/david-gpu/srez),如果你及时阅读(在2017/3/1日之前),去[看看这个](https://developer.nvidia.com/zoom-enhance-magic-image-upscaling-using-deep-learning)。我们想要做的很简单,但是,就像裁剪一样,它确实有各种你应该考虑的策略。 391 | 392 | 393 | 最基本的方法是从一个小方块到一个更大的方块,并使用defauls skimage为您提供。这个`resize`方法默认插值顺序参数为1,如果你甚至关心,它恰好是双线性的,但值得一提的是因为这些可能是稍后需要修改问题的微调旋钮,例如奇怪的视觉伪像,可以在升级图像中引入。 394 | 395 | 396 | ```python 397 | imgTiny = "images/Cellsx128.png" 398 | imgTiny = skimage.img_as_float(skimage.io.imread(imgTiny)).astype(np.float32) 399 | print("Original image shape: ", imgTiny.shape) 400 | imgTiny224 = skimage.transform.resize(imgTiny, (224, 224)) 401 | print("Upscaled image shape: ", imgTiny224.shape) 402 | # Plot original 403 | pyplot.figure() 404 | pyplot.subplot(1, 2, 1) 405 | pyplot.imshow(imgTiny) 406 | pyplot.axis('on') 407 | pyplot.title('128x128') 408 | # Plot upscaled 409 | pyplot.subplot(1, 2, 2) 410 | pyplot.imshow(imgTiny224) 411 | pyplot.axis('on') 412 | pyplot.title('224x224') 413 | ``` 414 | 415 | Original image shape: (128, 128, 4) 416 | Upscaled image shape: (224, 224, 4) 417 | 418 | 419 | 420 | 421 | 422 | Text(0.5,1,u'224x224') 423 | 424 | 425 | 426 | 427 | ![png](./markdown_img/7.output_17_2.png) 428 | 429 | 430 | 很棒,它有效!您可以在形状输出中看到(128,128,4)并且您已收到(224,224,4)。等一下! 
4?到目前为止,在每个例子中,形状的最后一个值是3!当我们使用png文件时,我们进入了一个新的现实;一个可以透明的地方。第4个值描述不透明度或透明度,具体取决于您是否为玻璃半空类型。无论如何,我们可以处理它,但要注意这个数字。 431 | 432 | 433 | 重要的是要知道,在我们对图像进行进一步操作之前,它是数据顺序(data order),以及它的整体有效负载(overall payload),如果您以当前格式对图像进行简单的重采样,则可能会弄乱数据和图像。请记住,它目前是一个数据立方体(a cube of data),现在还有更多的数据,而不仅仅是红色,绿色和蓝色(以及不透明度(opacity))。根据您决定调整大小的时间,您必须考虑额外的数据。 434 | 435 | 436 | 我们打破一下吧!将图像切换为CHW后尝试升级图像。 437 | 438 | 439 | ```python 440 | imgTiny = "images/Cellsx128.png" 441 | imgTiny = skimage.img_as_float(skimage.io.imread(imgTiny)).astype(np.float32) 442 | print("Image shape before HWC --> CHW conversion: ", imgTiny.shape) 443 | # swapping the axes to go from HWC to CHW 444 | # uncomment the next line and run this block! 445 | imgTiny = imgTiny.swapaxes(1, 2).swapaxes(0, 1) 446 | print("Image shape after HWC --> CHW conversion: ", imgTiny.shape) 447 | imgTiny224 = skimage.transform.resize(imgTiny, (224, 224)) 448 | print("Image shape after resize: ", imgTiny224.shape) 449 | # we know this is going to go wrong, so... 450 | try: 451 | # Plot original 452 | pyplot.figure() 453 | pyplot.subplot(1, 2, 1) 454 | pyplot.imshow(imgTiny) 455 | pyplot.axis('on') 456 | pyplot.title('128x128') 457 | except: 458 | print("Here come bad things!") 459 | # hands up if you want to see the error (uncomment next line) 460 | #raise 461 | ``` 462 | 463 | Image shape before HWC --> CHW conversion: (128, 128, 4) 464 | Image shape after HWC --> CHW conversion: (4, 128, 128) 465 | Image shape after resize: (224, 224, 128) 466 | Here come bad things! 
467 | 468 | 469 | 470 | ![png](./markdown_img/7.output_19_1.png) 471 | 472 | 473 | 失败了吧?如果你让上面的代码块交换轴,然后调整图像大小,你会看到这个输出: 474 | 475 | `Image shape after resize: (224, 224, 128)` 476 | 477 | 478 | 现在你有128个你应该还有4。糟糕,让我们在下面的代码块中恢复并尝试其他方法。我们将展示一个示例,其中图像小于您的输入规范,而不是正方形。就像它可能来自一个只能在矩形带中拍摄图像的新显微镜。 479 | 480 | 481 | ```python 482 | imgTiny = "images/Cellsx128.png" 483 | imgTiny = skimage.img_as_float(skimage.io.imread(imgTiny)).astype(np.float32) 484 | imgTinySlice = crop_center(imgTiny, 128, 56) 485 | # Plot original 486 | pyplot.figure() 487 | pyplot.subplot(2, 1, 1) 488 | pyplot.imshow(imgTiny) 489 | pyplot.axis('on') 490 | pyplot.title('Original') 491 | # Plot slice 492 | pyplot.figure() 493 | pyplot.subplot(2, 2, 1) 494 | pyplot.imshow(imgTinySlice) 495 | pyplot.axis('on') 496 | pyplot.title('128x56') 497 | # Upscale? 498 | print("Slice image shape: ", imgTinySlice.shape) 499 | imgTiny224 = skimage.transform.resize(imgTinySlice, (224, 224)) 500 | print("Upscaled slice image shape: ", imgTiny224.shape) 501 | # Plot upscaled 502 | pyplot.subplot(2, 2, 2) 503 | pyplot.imshow(imgTiny224) 504 | pyplot.axis('on') 505 | pyplot.title('224x224') 506 | ``` 507 | 508 | Slice image shape: (56, 128, 4) 509 | Upscaled slice image shape: (224, 224, 4) 510 | 511 | 512 | 513 | 514 | 515 | Text(0.5,1,u'224x224') 516 | 517 | 518 | 519 | 520 | ![png](./markdown_img/7.output_21_2.png) 521 | 522 | 523 | 524 | ![png](./markdown_img/7.output_21_3.png) 525 | 526 | 527 | 528 | 好吧,这对于upscaling如何失败的一个例子来说有点紧张。得到它?伸展(Stretch)?然而,这可能是一种生死攸关的失败。如果正常细胞是圆形的并且患病的细胞被拉长并弯曲怎么办?镰状细胞性贫血例如: 529 | ![sickle cells example](images/sickle-cells.jpg) 530 | 531 | 在这种情况下,你做什么?这实际上取决于模型以及它是如何训练的。在某些情况下,可以将图像的其余部分填充为白色,或者可能是黑色,或者可能是噪声,或者甚至可以使用png和透明度并为图像设置遮罩,以便模型忽略透明区域。看看你能想出多少有趣的事情,你也可以取得医学上的突破! 532 | 533 | 534 | 让我们继续我们已经提到的最后一步,即将图像输入调整为BGR顺序。Caffe2还有另一个功能,即`batch term`。我们已经谈过CHW了。对于NCHW中的图像数量,这是N. 
535 | 536 | ### Final Preprocessing and the Batch Term(最终预处理与批处理维度) 537 | 538 | 539 | 在下面的最后一步中,我们将把图像的颜色通道顺序切换为BGR,再把维度顺序重排为GPU处理所需的CHW(HWC -> CHW),最后为图像添加第四个维度(N)来记录图像的数量。理论上你可以继续为数据添加维度,但Caffe2需要N这一维,因为它告诉Caffe2本批次中预期有多少张图像。我们将其设置为一(1),表示本批次只有一张图像送入Caffe2。请注意,在最后检查`img.shape`时,顺序已完全不同:我们加上了表示图像数量的N,顺序变为`N, C, H, W`。 540 | 541 | 542 | ```python 543 | # This next line helps with being able to rerun this section 544 | # if you want to try the outputs of the different crop strategies above 545 | # swap out imgScaled with img (original) or img256 (squeezed) 546 | imgCropped = crop_center(imgScaled,224,224) 547 | print("Image shape before HWC --> CHW conversion: ", imgCropped.shape) 548 | # (1) Since Caffe expects CHW order and the current image is HWC, 549 | # we will need to change the order. 550 | imgCropped = imgCropped.swapaxes(1, 2).swapaxes(0, 1) 551 | print("Image shape after HWC --> CHW conversion: ", imgCropped.shape) 552 | 553 | pyplot.figure() 554 | for i in range(3): 555 | # For some reason, pyplot subplot follows Matlab's indexing 556 | # convention (starting with 1). Well, we'll just follow it... 557 | pyplot.subplot(1, 3, i+1) 558 | pyplot.imshow(imgCropped[i], cmap=pyplot.cm.gray) 559 | pyplot.axis('off') 560 | pyplot.title('RGB channel %d' % (i+1)) 561 | 562 | # (2) Caffe uses a BGR order due to legacy OpenCV issues, so we 563 | # will change RGB to BGR. 564 | imgCropped = imgCropped[(2, 1, 0), :, :] 565 | print("Image shape after BGR conversion: ", imgCropped.shape) 566 | 567 | # for discussion later - not helpful at this point 568 | # (3) (Optional) We will subtract the mean image. Note that skimage loads 569 | # image in the [0, 1] range so we multiply the pixel values 570 | # first to get them into [0, 255]. 
571 | #mean_file = os.path.join(CAFFE_ROOT, 'python/caffe/imagenet/ilsvrc_2012_mean.npy') 572 | #mean = np.load(mean_file).mean(1).mean(1) 573 | #img = img * 255 - mean[:, np.newaxis, np.newaxis] 574 | 575 | pyplot.figure() 576 | for i in range(3): 577 | # For some reason, pyplot subplot follows Matlab's indexing 578 | # convention (starting with 1). Well, we'll just follow it... 579 | pyplot.subplot(1, 3, i+1) 580 | pyplot.imshow(imgCropped[i], cmap=pyplot.cm.gray) 581 | pyplot.axis('off') 582 | pyplot.title('BGR channel %d' % (i+1)) 583 | # (4) Finally, since caffe2 expect the input to have a batch term 584 | # so we can feed in multiple images, we will simply prepend a 585 | # batch dimension of size 1. Also, we will make sure image is 586 | # of type np.float32. 587 | imgCropped = imgCropped[np.newaxis, :, :, :].astype(np.float32) 588 | print('Final input shape is:', imgCropped.shape) 589 | ``` 590 | 591 | Image shape before HWC --> CHW conversion: (224, 224, 3) 592 | Image shape after HWC --> CHW conversion: (3, 224, 224) 593 | Image shape after BGR conversion: (3, 224, 224) 594 | Final input shape is: (1, 3, 224, 224) 595 | 596 | 597 | 598 | ![png](./markdown_img/7.output_24_1.png) 599 | 600 | 601 | 602 | ![png](./markdown_img/7.output_24_2.png) 603 | 604 | 605 | 在上面的输出中,您应该注意这些更改: 606 | 1. HWC到CHW之前和之后的变化。3,这是移动到开头的颜色通道的数量。 607 | 2. 在上面的图片中,您可以看到颜色顺序也已切换。 RGB成为BGR。蓝色和红色切换位置。 608 | 3. 最终的输入形状,意味着对图像的最后一次更改是将批处理字段添加到开头,所以现在你有(1,3,224,224): 609 | - 1 image in the batch, 610 | - 3 color channels (in BGR), 611 | - 224 height, 612 | - 224 width. 613 | 614 | 615 | ```python 616 | 617 | ``` 618 | -------------------------------------------------------------------------------- /caffe2/9.create_your_own_dataset.md: -------------------------------------------------------------------------------- 1 | 2 | # How do I create my own dataset? 
3 | 4 | 5 | Caffe2使用二进制DB格式来存储我们想要用来训练模型的数据。Caffe2 DB本质上就是一个键值存储(key-value store)的别称,其中键通常是随机生成的,这样每个批次就近似满足独立同分布(i.i.d.);值才是真正的内容:它们是序列化字符串,按你希望训练算法读取的数据格式编码。因此,存储后的数据库(在语义上)看起来像这样: 6 | 7 | key1 value1 8 | key2 value2 9 | key3 value3 10 | ... 11 | 12 | 13 | 对DB而言,键和值都只是字符串,但你往往需要结构化的内容。一种方法是使用TensorProtos协议缓冲区(protocol buffer):它本质上封装了张量(即多维数组),以及张量的数据类型和形状信息。之后就可以通过TensorProtosDBInput运算符把数据载入SGD训练流程。 14 | 15 | 16 | 在这里,我们将展示一个如何创建自己的数据集的示例。为此,我们将使用UCI Iris数据集——这是一个非常流行的经典分类数据集,用于对鸢尾花进行分类。它包含4个表示花朵尺寸的实值特征,并把样本分为3种鸢尾花。数据集可以在[这里](https://archive.ics.uci.edu/ml/datasets/Iris)下载。 17 | 18 | 19 | ```python 20 | # First let's import some necessities 21 | from __future__ import absolute_import 22 | from __future__ import division 23 | from __future__ import print_function 24 | from __future__ import unicode_literals 25 | 26 | %matplotlib inline 27 | import urllib2 # for downloading the dataset from the web. 28 | import numpy as np 29 | from matplotlib import pyplot 30 | from StringIO import StringIO 31 | from caffe2.python import core, utils, workspace 32 | from caffe2.proto import caffe2_pb2 33 | ``` 34 | 35 | WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode. 36 | 37 | 38 | 39 | ```python 40 | f = urllib2.urlopen('https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data') 41 | raw_data = f.read() 42 | print('Raw data looks like this:') 43 | print(raw_data[:100] + '...') 44 | ``` 45 | 46 | Raw data looks like this: 47 | 5.1,3.5,1.4,0.2,Iris-setosa 48 | 4.9,3.0,1.4,0.2,Iris-setosa 49 | 4.7,3.2,1.3,0.2,Iris-setosa 50 | 4.6,3.1,1.5,0.2,... 51 | 52 | 53 | 54 | ```python 55 | # load the features to a feature matrix. 
56 | features = np.loadtxt(StringIO(raw_data), dtype=np.float32, delimiter=',', usecols=(0, 1, 2, 3)) 57 | # load the labels to a feature matrix 58 | label_converter = lambda s : {'Iris-setosa':0, 'Iris-versicolor':1, 'Iris-virginica':2}[s] 59 | labels = np.loadtxt(StringIO(raw_data), dtype=np.int, delimiter=',', usecols=(4,), converters={4: label_converter}) 60 | ``` 61 | 62 | 在我们进行训练之前,通常有益的一件事是将数据集分成训练和测试。在这种情况下,让我们随机抽取数据,使用前100个数据点进行训练,剩下的50个进行测试。对于更复杂的方法,您可以使用例如交叉验证将您的数据集分成多个训练集和测试集拆分。阅读更多关于交叉验证的信息[这里](http://scikit-learn.org/stable/modules/cross_validation.html)。 63 | 64 | 65 | ```python 66 | random_index = np.random.permutation(150) 67 | features = features[random_index] 68 | labels = labels[random_index] 69 | 70 | train_features = features[:100] 71 | train_labels = labels[:100] 72 | test_features = features[100:] 73 | test_labels = labels[100:] 74 | ``` 75 | 76 | 77 | ```python 78 | # Let's plot the first two features together with the label. 79 | # Remember, while we are plotting the testing feature distribution 80 | # here too, you might not be supposed to do so in real research, 81 | # because one should not peek into the testing data. 82 | legend = ['rx', 'b+', 'go'] 83 | pyplot.title("Training data distribution, feature 0 and 1") 84 | for i in range(3): 85 | pyplot.plot(train_features[train_labels==i, 0], train_features[train_labels==i, 1], legend[i]) 86 | pyplot.figure() 87 | pyplot.title("Testing data distribution, feature 0 and 1") 88 | for i in range(3): 89 | pyplot.plot(test_features[test_labels==i, 0], test_features[test_labels==i, 1], legend[i]) 90 | ``` 91 | 92 | 93 | ![png](./markdown_img/9.output_6_0.png) 94 | 95 | 96 | 97 | ![png](./markdown_img/9.output_6_1.png) 98 | 99 | 100 | 101 | 现在,正如所承诺的那样,让我们把东西放到Caffe2数据库中。在这个DB中,会发生的是我们将使用“train_xxx”作为键,并使用TensorProtos对象为每个数据点存储两个张量:一个作为特征,一个作为标签。我们将使用Caffe2的Python DB接口来实现。 102 | 103 | 104 | ```python 105 | # First, let's see how one can construct a TensorProtos protocol buffer from numpy arrays. 
106 | feature_and_label = caffe2_pb2.TensorProtos() 107 | feature_and_label.protos.extend([ 108 | utils.NumpyArrayToCaffe2Tensor(features[0]), 109 | utils.NumpyArrayToCaffe2Tensor(labels[0])]) 110 | print('This is what the tensor proto looks like for a feature and its label:') 111 | print(str(feature_and_label)) 112 | print('This is the compact string that gets written into the db:') 113 | #print(feature_and_label.SerializeToString()) 114 | ``` 115 | 116 | This is what the tensor proto looks like for a feature and its label: 117 | protos { 118 | dims: 4 119 | data_type: FLOAT 120 | float_data: 4.80000019073 121 | float_data: 3.0 122 | float_data: 1.39999997616 123 | float_data: 0.10000000149 124 | } 125 | protos { 126 | data_type: INT32 127 | int32_data: 0 128 | } 129 | 130 | This is the compact string that gets written into the db: 131 | 132 | 133 | 134 | ```python 135 | # Now, actually write the db. 136 | 137 | def write_db(db_type, db_name, features, labels): 138 | db = core.C.create_db(db_type, db_name, core.C.Mode.write) 139 | transaction = db.new_transaction() 140 | for i in range(features.shape[0]): 141 | feature_and_label = caffe2_pb2.TensorProtos() 142 | feature_and_label.protos.extend([ 143 | utils.NumpyArrayToCaffe2Tensor(features[i]), 144 | utils.NumpyArrayToCaffe2Tensor(labels[i])]) 145 | transaction.put( 146 | 'train_%03d' % i,  # %-formatting gives each record a unique key 147 | feature_and_label.SerializeToString()) 148 | # Close the transaction, and then close the db. 
149 | del transaction 150 | del db 151 | 152 | write_db("minidb", "iris_train.minidb", train_features, train_labels) 153 | write_db("minidb", "iris_test.minidb", test_features, test_labels) 154 | ``` 155 | 156 | 157 | 现在,让我们创建一个非常简单的网络,它只包含一个TensorProtosDBInput运算符,以展示我们如何从我们创建的数据库加载数据。对于训练,您可能希望执行更复杂的操作:创建网络,训练网络,获取模型以及运行预测服务。为此,您可以查看MNIST教程以获取详细信息。 158 | 159 | 160 | ```python 161 | net_proto = core.Net("example_reader") 162 | dbreader = net_proto.CreateDB([], "dbreader", db="iris_train.minidb", db_type="minidb") 163 | net_proto.TensorProtosDBInput([dbreader], ["X", "Y"], batch_size=16) 164 | 165 | print("The net looks like this:") 166 | print(str(net_proto.Proto())) 167 | ``` 168 | 169 | The net looks like this: 170 | name: "example_reader" 171 | op { 172 | output: "dbreader" 173 | name: "" 174 | type: "CreateDB" 175 | arg { 176 | name: "db_type" 177 | s: "minidb" 178 | } 179 | arg { 180 | name: "db" 181 | s: "iris_train.minidb" 182 | } 183 | } 184 | op { 185 | input: "dbreader" 186 | output: "X" 187 | output: "Y" 188 | name: "" 189 | type: "TensorProtosDBInput" 190 | arg { 191 | name: "batch_size" 192 | i: 16 193 | } 194 | } 195 | 196 | 197 | 198 | 199 | ```python 200 | workspace.CreateNet(net_proto) 201 | ``` 202 | 203 | 204 | 205 | 206 | True 207 | 208 | 209 | 210 | 211 | ```python 212 | # Let's run it to get batches of features. 213 | workspace.RunNet(net_proto.Proto().name) 214 | print("The first batch of feature is:") 215 | print(workspace.FetchBlob("X")) 216 | print("The first batch of label is:") 217 | print(workspace.FetchBlob("Y")) 218 | 219 | # Let's run again. 220 | workspace.RunNet(net_proto.Proto().name) 221 | print("The second batch of feature is:") 222 | print(workspace.FetchBlob("X")) 223 | print("The second batch of label is:") 224 | print(workspace.FetchBlob("Y")) 225 | ``` 226 | 227 | The first batch of feature is: 228 | [[4.8 3. 1.4 0.1] 229 | [6.5 3. 5.8 2.2] 230 | [6.7 3. 5. 
1.7] 231 | [6.4 2.8 5.6 2.1] 232 | [5.4 3.9 1.3 0.4] 233 | [5.7 4.4 1.5 0.4] 234 | [5.5 2.4 3.7 1. ] 235 | [5.4 3. 4.5 1.5] 236 | [5. 3.3 1.4 0.2] 237 | [5.1 3.5 1.4 0.3] 238 | [5.5 2.5 4. 1.3] 239 | [5. 3. 1.6 0.2] 240 | [5. 3.5 1.6 0.6] 241 | [5.2 3.4 1.4 0.2] 242 | [4.9 3. 1.4 0.2] 243 | [4.3 3. 1.1 0.1]] 244 | The first batch of label is: 245 | [0 2 1 2 0 0 1 1 0 0 1 0 0 0 0 0] 246 | The second batch of feature is: 247 | [[4.8 3.4 1.6 0.2] 248 | [6. 2.7 5.1 1.6] 249 | [6.2 3.4 5.4 2.3] 250 | [4.7 3.2 1.6 0.2] 251 | [5.5 2.4 3.8 1.1] 252 | [6.7 3.1 4.7 1.5] 253 | [5.7 2.8 4.5 1.3] 254 | [6.9 3.1 5.1 2.3] 255 | [6.5 3. 5.2 2. ] 256 | [6.3 3.3 6. 2.5] 257 | [5.4 3.7 1.5 0.2] 258 | [4.8 3. 1.4 0.3] 259 | [5.9 3. 5.1 1.8] 260 | [7.3 2.9 6.3 1.8] 261 | [6.2 2.2 4.5 1.5] 262 | [7.7 2.6 6.9 2.3]] 263 | The second batch of label is: 264 | [0 1 2 0 1 1 1 2 2 2 0 0 2 2 1 2] 265 | 266 | 267 | 268 | ```python 269 | 270 | ``` 271 | 272 | 273 | ```python 274 | 275 | ``` 276 | -------------------------------------------------------------------------------- /caffe2/README.md: -------------------------------------------------------------------------------- 1 | 这是我的csdn博客【caffe2从头学】系列的代码部分,包括但不限于caffe2官方的tutorials解读与改善。想了解更多细节请访问我的CSDN博客:[【caffe2从头学】:0.目录](https://blog.csdn.net/weixin_37251044/article/details/82344428) 2 | 3 | This is the code part of my CSDN blog series 【caffe2 from the beginning】, including but not limited to interpretations and improvements of the official caffe2 tutorials. 
For more details please visit my CSDN blog:[【caffe2从头学】:0.目录](https://blog.csdn.net/weixin_37251044/article/details/82344428) 4 | --- 5 | 6 | 0.[目录](https://blog.csdn.net/weixin_37251044/article/details/82344428) 7 | 8 | 1.[快速开始](https://blog.csdn.net/weixin_37251044/article/details/82344481) 9 | 10 | > 1.1.[什么是caffe2 ?](https://blog.csdn.net/weixin_37251044/article/details/82344481) 11 | 12 | > 1.2.[安装caffe2](https://blog.csdn.net/weixin_37251044/article/details/82259230) 13 | 14 | 2.[学习caffe2](https://blog.csdn.net/weixin_37251044/article/details/82346301) 15 | 16 | 3.[caffe2官方教程的安装与使用](https://blog.csdn.net/weixin_37251044/article/details/82352962) 17 | 18 | >3.1. [Blobs and Workspace, Tensors,Net 概念](https://blog.csdn.net/weixin_37251044/article/details/82387868) 19 | 20 | >3.2.[Caffe2 的一些基本概念 - Workspaces&Operators & Nets & Nets 可视化](https://blog.csdn.net/weixin_37251044/article/details/82421521) 21 | 22 | 4.参考 23 | 24 | 5.API 25 | 26 | --- 27 | -------------------------------------------------------------------------------- /caffe2/images/Cellsx128.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/Cellsx128.png -------------------------------------------------------------------------------- /caffe2/images/Ducreux.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/Ducreux.jpg -------------------------------------------------------------------------------- /caffe2/images/Flower-id.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/Flower-id.png 
-------------------------------------------------------------------------------- /caffe2/images/Places-cnn-visual-example.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/Places-cnn-visual-example.png -------------------------------------------------------------------------------- /caffe2/images/README.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /caffe2/images/aircraft-carrier.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/aircraft-carrier.jpg -------------------------------------------------------------------------------- /caffe2/images/astronauts.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/astronauts.jpg -------------------------------------------------------------------------------- /caffe2/images/cat.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/cat.jpg -------------------------------------------------------------------------------- /caffe2/images/cell-tower.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/cell-tower.jpg -------------------------------------------------------------------------------- 
/caffe2/images/cowboy-hat.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/cowboy-hat.jpg -------------------------------------------------------------------------------- /caffe2/images/flower.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/flower.jpg -------------------------------------------------------------------------------- /caffe2/images/imagenet-boat.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/imagenet-boat.png -------------------------------------------------------------------------------- /caffe2/images/imagenet-caffe2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/imagenet-caffe2.png -------------------------------------------------------------------------------- /caffe2/images/imagenet-meme.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/imagenet-meme.jpg -------------------------------------------------------------------------------- /caffe2/images/imagenet-montage.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/imagenet-montage.jpg 
-------------------------------------------------------------------------------- /caffe2/images/lemon.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/lemon.jpg -------------------------------------------------------------------------------- /caffe2/images/mirror-image.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/mirror-image.jpg -------------------------------------------------------------------------------- /caffe2/images/orange.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/orange.jpg -------------------------------------------------------------------------------- /caffe2/images/orangutan.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/orangutan.jpg -------------------------------------------------------------------------------- /caffe2/images/pretzel.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/pretzel.jpg -------------------------------------------------------------------------------- /caffe2/images/sickle-cells.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/images/sickle-cells.jpg 
-------------------------------------------------------------------------------- /caffe2/iris_test.minidb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/iris_test.minidb -------------------------------------------------------------------------------- /caffe2/iris_train.minidb: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/iris_train.minidb -------------------------------------------------------------------------------- /caffe2/markdown_img/2.output_30_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/2.output_30_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/2.output_46_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/2.output_46_0.png -------------------------------------------------------------------------------- /caffe2/markdown_img/4.output_13_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/4.output_13_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/4.output_15_0.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/4.output_15_0.png -------------------------------------------------------------------------------- /caffe2/markdown_img/4.output_5_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/4.output_5_2.png -------------------------------------------------------------------------------- /caffe2/markdown_img/5.1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/5.1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/5.2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/5.2.png -------------------------------------------------------------------------------- /caffe2/markdown_img/6.output_7_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/6.output_7_3.png -------------------------------------------------------------------------------- /caffe2/markdown_img/6.output_7_4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/6.output_7_4.png -------------------------------------------------------------------------------- /caffe2/markdown_img/6.output_7_5.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/6.output_7_5.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_10_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_10_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_12_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_12_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_14_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_14_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_17_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_17_2.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_19_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_19_1.png 
-------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_21_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_21_2.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_21_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_21_3.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_24_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_24_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_24_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_24_2.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_3_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_3_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_5_1.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_5_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_5_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_5_2.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_7_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_7_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/7.output_8_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/7.output_8_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/8.output_22_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/8.output_22_0.png -------------------------------------------------------------------------------- /caffe2/markdown_img/8.output_24_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/8.output_24_0.png -------------------------------------------------------------------------------- 
/caffe2/markdown_img/8.output_28_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/8.output_28_2.png -------------------------------------------------------------------------------- /caffe2/markdown_img/8.output_30_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/8.output_30_0.png -------------------------------------------------------------------------------- /caffe2/markdown_img/8.output_30_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/8.output_30_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/8.output_32_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/8.output_32_0.png -------------------------------------------------------------------------------- /caffe2/markdown_img/8.output_34_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/8.output_34_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/8.output_38_1.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/8.output_38_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/8.output_38_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/8.output_38_2.png -------------------------------------------------------------------------------- /caffe2/markdown_img/9.output_6_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/9.output_6_0.png -------------------------------------------------------------------------------- /caffe2/markdown_img/9.output_6_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/caffe2/markdown_img/9.output_6_1.png -------------------------------------------------------------------------------- /caffe2/markdown_img/README.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /cs224n_note/readme.md: -------------------------------------------------------------------------------- 1 | This is an archive of my blog post [【NLP】cs224n课程笔记](https://blog.csdn.net/weixin_37251044/article/details/83473874); for details, please visit my blog. 2 | -------------------------------------------------------------------------------- /cs224n_note/【NLP】cs224n课程笔记.md: -------------------------------------------------------------------------------- 1 | 1. Preface

Natural language is a crystallization of human intelligence, and natural language processing is one of the hardest problems in artificial intelligence, as well as one of the most fascinating and challenging to study. Through the classic Stanford cs224n course, let's learn natural language processing together, and may everyone accomplish something in the NLP field!


2. Prerequisites (you can also come back and review these as questions come up during the course)

- Basic Python
- Calculus, probability theory, and linear algebra
- Basic machine learning algorithms: gradient descent, linear regression, logistic regression, Softmax, SVM, PCA (prerequisite courses: Stanford cs229, or Zhou Zhihua's "watermelon book")
- English at roughly CET-4 level (deep learning materials and papers are almost all in English; read the originals and you will improve much faster!)

All of the above can be found in the "Knowledge tools" section at the bottom.
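Softmax shows up both in the prerequisites above and later in Assignment 1.1. As a quick warm-up, here is a minimal, numerically stable NumPy sketch (the function name and test values are illustrative, not taken from the course starter code):

```python
import numpy as np

def softmax(x):
    """Row-wise softmax; subtracting the row max keeps exp() from overflowing."""
    x = np.atleast_2d(x)
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

probs = softmax(np.array([1.0, 2.0, 3.0]))
# Each row sums to 1, and larger logits get larger probabilities.
```

The max-subtraction trick does not change the result (it cancels in the ratio) but avoids overflow for large logits.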


3. Weekly study schedule

Each week's study time is divided into 4 parts:

- Part 1: Monday to Tuesday
- Part 2: Thursday to Friday
- Part 3: Sunday
- Part 4 (homework): any free time during the week; submit screenshots of your homework runs on Sunday evening
- Wednesday and Saturday are rest days ^_^


(Some of the links below do not display correctly on mobile; please copy them into a desktop browser.) 2 |



Course materials: 3 |

Course homepage: https://web.stanford.edu/class/cs224n/

Chinese notes: http://www.hankcs.com/nlp/cs224n-introduction-to-nlp-and-deep-learning.html 4 | 5 | http://www.hankcs.com/tag/cs224n/

Course videos: https://www.bilibili.com/video/av30326868/?spm_id_from=333.788.videocard.0

Linux or Mac is recommended for the lab environment; either of the setups below works:

· Docker environment setup: https://github.com/ufoym/deepo

· Local environment setup: https://github.com/learning511/cs224n-learning-camp/blob/master/environment.md

Register a GitHub account: github.com

Later projects and exercises will be published under this GitHub repo:

 https://github.com/learning511/cs224n-learning-camp

Important resources:

Stanford deep learning tutorial (UFLDL): http://deeplearning.stanford.edu/wiki/index.php/UFLDL%E6%95%99%E7%A8%8B

Liao Xuefeng's Python 3 tutorial: https://www.liaoxuefeng.com/article/001432619295115c918a094d8954bd493037b03d27bf9a9000

GitHub tutorial: https://www.liaoxuefeng.com/wiki/0013739516305929606dd18361248578c67b8067c8c017b000

Morvan's machine learning tutorials: http://morvanzhou.github.io/tutorials/

Classic deep learning papers: https://github.com/floodsung/Deep-Learning-Papers-Reading-Roadmap

Stanford cs229 code (machine learning algorithms implemented from scratch in Python): https://github.com/nsoojin/coursera-ml-py

My blog: https://blog.csdn.net/dukuku5038/article/details/82253966

Knowledge tools

To help everyone gradually get used to reading English, the review materials come in both Chinese and English versions, but the English ones are recommended.

Math tools

Stanford materials:

- Linear algebra (link: http://web.stanford.edu/class/cs224n/readings/cs229-linalg.pdf )
- Probability theory (link: http://101.96.10.44/web.stanford.edu/class/cs224n/readings/cs229-prob.pdf )
- Convex optimization (link: http://101.96.10.43/web.stanford.edu/class/cs224n/readings/cs229-cvxopt.pdf )
- Stochastic gradient descent (link: http://cs231n.github.io/optimization-1/ )
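To make the gradient-descent reading above concrete, here is a tiny plain gradient-descent loop minimizing f(w) = (w - 3)^2; the learning rate and iteration count are arbitrary illustrative choices:

```python
# Minimize f(w) = (w - 3)**2 with plain gradient descent.
# The gradient is f'(w) = 2 * (w - 3).
w = 0.0
lr = 0.1          # learning rate (illustrative choice)
for _ in range(100):
    grad = 2 * (w - 3)
    w -= lr * grad
# With this step size the error shrinks by a factor of 0.8 per step,
# so w converges toward the minimizer w* = 3.
```

Stochastic gradient descent works the same way, except each step uses the gradient of one (or a few) randomly sampled training examples instead of the full objective.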

Chinese materials:

- Basic math for machine learning (link: https://www.cnblogs.com/steven-yang/p/6348112.html )
- Li Hang's "Statistical Learning Methods" (link: http://vdisk.weibo.com/s/vfFpMc1YgPOr )
- Your old university math textbooks (dig them out of the pile ^_^)

Programming tools

Stanford materials:

- Python review (link: http://web.stanford.edu/class/cs224n/lectures/python-review.pdf )
- TensorFlow tutorial (link: https://github.com/open-source-for-science/TensorFlow-Course#why-use-tensorflow )

Chinese materials:

- Liao Xuefeng's Python 3 tutorial (link: https://www.liaoxuefeng.com/article/001432619295115c918a094d8954bd493037b03d27bf9a9000 )
- Morvan's TensorFlow tutorial (link: https://morvanzhou.github.io/tutorials/machine-learning/tensorflow/ ) 6 | 7 | Homework reference answers: http://www.hankcs.com/nlp/cs224n-assignment-1.html 8 | 达观杯 competition: https://github.com/MLjian/TextClassificationImplement 9 | # Week 1 10 | 11 | ## Part 1 tasks: 12 |

(1) Watch the course study introduction to get an overview of deep learning, its applications, and the camp's upcoming study plan

Schedule: 10/23—10/28

Intro video: https://m.weike.fm/lecture/10194068

(2) Introduction to NLP and deep learning: go through slides lecture01, video 1, and the notes

Schedule: 10/23

- Slides: lecture01 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture1.pdf )
- Watch video 1 (link: https://www.bilibili.com/video/av30326868/?spm_id_from=333.788.videocard.0 )
- Notes: introduction to NLP and deep learning (link: http://www.hankcs.com/nlp/cs224n-introduction-to-nlp-and-deep-learning.html ) 13 | 14 |

## Part 2 tasks: 15 |
(1) Word vector representations 1: go through slides lecture02, video 2, and the notes
Schedule: 10/25—10/26
Slides: lecture02 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture2.pdf )
Watch video 2 (link: https://www.bilibili.com/video/av30326868/?p=2 )
Notes: word vectors (link: http://www.hankcs.com/nlp/word-vector-representations-word2vec.html )
16 | 17 | 18 | ## Part 3 tasks: 19 |
(1) Paper reading: a simple but tough-to-beat baseline method for sentence embeddings
Schedule: 10/28
Paper: paper (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/A%20Simple%20but%20Tough-to-beat%20Baseline%20for%20Sentence%20Embeddings.pdf )
Highlights: highlight (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture1-highlight.pdf )
Paper notes: Sentence Embedding (link: http://www.hankcs.com/nlp/cs224n-sentence-embeddings.html )
20 | 21 | ## Part 4 homework: 22 | Assignment 1.1-1.2 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/Assignmnet.md )
1.1 Softmax
1.2 Neural Network Basics: implement neural-network fundamentals 23 | 24 | # Week 2 25 | 26 | ## Part 1 tasks: 27 |
(1) Advanced word vector representations (word2vec 2): go through slides lecture03, video 3, and the notes
Schedule: 10/29—10/30
Slides: lecture03 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture3.pdf )
Watch video 3 (link: https://www.bilibili.com/video/av30326868/?p=3 )
Notes: word2vec 2 (link: http://www.hankcs.com/nlp/cs224n-advanced-word-vector-representations.html )
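The word2vec lectures above revolve around the skip-gram softmax, P(o|c) = exp(u_o·v_c) / Σ_w exp(u_w·v_c). Here is a minimal NumPy sketch with toy random vectors; the vocabulary size, embedding dimension, and all names are illustrative, not from the course assignments:

```python
import numpy as np

# Toy embedding matrices: 5 words, 4-dimensional vectors (illustrative sizes).
rng = np.random.default_rng(0)
V = rng.normal(size=(5, 4))   # center-word vectors v_c
U = rng.normal(size=(5, 4))   # context-word vectors u_o

def skipgram_prob(center, context):
    """P(context | center) under the skip-gram softmax."""
    scores = U @ V[center]        # dot product of v_center with every u_w
    scores -= scores.max()        # numerical stability
    p = np.exp(scores) / np.exp(scores).sum()
    return p[context]

p = skipgram_prob(center=1, context=3)
# Summing over all possible context words gives 1, as a probability should.
```

Training word2vec then amounts to maximizing this probability (or a negative-sampling approximation of it) over observed (center, context) pairs by gradient descent.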
28 | 29 | 30 | ## Part 2 tasks: 31 |
(1) Word window classification and neural networks: go through slides lecture04, video 4, and the notes
Schedule: 11/1—11/2
Slides: lecture04 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture4.pdf )
Watch video 4 (link: https://www.bilibili.com/video/av30326868/?p=4 )
Notes: word window classification and neural networks (link: http://www.hankcs.com/nlp/cs224n-word-window-classification-and-neural-networks.html ) 32 | 33 | 34 | 35 | ##
Part 3 tasks: 36 |
(1) Paper reading: the linear algebraic structure of word senses, with applications to polysemy and word-sense disambiguation
Schedule: 11/4
Paper: paper (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Linear%20Algebraic%20Structure%20of%20Word%20Senses%2C%20with%20Applications%20to%20Polysemy.pdf )
Highlights: highlight (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture2-highlight.pdf )
Paper notes: word senses (link: http://www.hankcs.com/nlp/cs224n-word-senses.html ) 37 | 38 | ## Part 4 homework: 39 | Assignment 1.3-1.4 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/Assignmnet.md )
1.3 word2vec implementation
1.4 Sentiment Analysis 40 | 41 | # Week 3 42 | (One week, 11/4-11/10, was skipped, so this week starts on 11/12.) 43 | ## Part 1 tasks: 44 | (1) Backpropagation and project advice: go through slides lecture05, video 5, and the notes
Schedule: 11/12—11/13
Slides: lecture05 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture5.pdf )
Watch video 5 (link: https://www.bilibili.com/video/av30326868/?p=5 )
Notes: backpropagation and project advice (link: http://www.hankcs.com/nlp/cs224n-backpropagation-and-project-advice.html ) 45 | 46 | 47 | ## Part 2 tasks: 48 |
(1) Dependency parsing: go through slides lecture06, video 6, and the notes
Schedule: 11/15—11/16
Slides: lecture06 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture3.pdf )
Watch video 6 (link: https://www.bilibili.com/video/av30326868/?p=6 )
Notes: syntactic analysis and dependency parsing (link: http://www.hankcs.com/nlp/cs224n-dependency-parsing.html ) 49 | 50 | ## Part 3 content 51 |
(1) Paper reading: efficient text classification
Schedule: 11/18
Paper: paper (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Bag%20of%20Tricks%20for%20Efficient%20Text%20Classification.pdf )
Highlights: highlight (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture3-highlight.pdf )
Paper notes: efficient text classification (link: http://www.hankcs.com/nlp/cs224n-bag-of-tricks-for-efficient-text-classification.html ) 52 | 53 | ## Part 4 homework: 54 | Assignment 2.2 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/Assignmnet.md )
Neural Transition-Based Dependency Parsing 55 | # Week 4 56 | ## Part 1 tasks: 57 | (1) Getting started with TensorFlow: go through slides lecture07, the video, and the notes
Schedule: 11/19—11/20
Slides: lecture07 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture7-tensorflow.pdf )
Watch video 7 (link: https://www.bilibili.com/video/av30326868/?p=7 )
Notes: TensorFlow (link: http://www.hankcs.com/nlp/cs224n-tensorflow.html ) 58 | ## Part 2 tasks: 59 |
(1) RNNs and language models: go through slides lecture08, the video, and the notes
Schedule: 11/22—11/23
Slides: lecture08 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture8.pdf )
Watch video 8 (link: https://www.bilibili.com/video/av30326868/?p=8 )
Notes: RNNs and language models (link: http://www.hankcs.com/nlp/cs224n-rnn-and-language-models.html ) 60 | ## Part 3 tasks: 61 | (1) Paper reading: improving distributional similarity with lessons learned from word embeddings
Schedule: 11/25
Paper: paper (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Impoving%20distributional%20similarly%20with%20lessons%20learned%20from%20word%20embeddings.pdf )
Highlights: highlight (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture4-highlight.pdf )
Paper notes: improving distributional similarity with lessons from word embeddings (link: http://www.hankcs.com/nlp/cs224n-improve-word-embeddings.html ) 62 | 63 | ## Part 4 homework: 64 | Assignment 2.1, 2.2 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/Assignmnet.md )
2.1 TensorFlow Softmax: softmax classification in TensorFlow
2.2 Neural Transition-Based Dependency Parsing 65 | 66 | # Week 5 67 | ## Part 1 tasks: 68 | (1) Advanced LSTMs and GRUs: go through slides lecture09, the video, and the notes

Schedule: 11/26—11/27

- Slides: lecture09 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture9.pdf )
- Watch video 9 (link: https://www.bilibili.com/video/av30326868/?p=9 )
- Notes: advanced LSTMs and GRUs (link: http://www.hankcs.com/nlp/cs224n-mt-lstm-gru.html ) 69 | ## Part 2 tasks: 70 | (1) Midterm review: go through the slides and video and review what you have learned so far
Schedule: 11/29—11/30
Slides: (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-midterm-review.pdf )
Watch the video (link: https://www.youtube.com/watch?v=2DYxT4OMAmw&list=PL3FW7Lu3i5Jsnh1rnUwq_TcylNr7EkRe6&index=10 ) 71 | ## Part 3 tasks: 72 | (1) Paper reading: structured training for neural network transition-based parsing

Schedule: 12/2

- Paper: paper (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Structured%20Training%20for%20Neural%20Network%20Transition-Based%20Parsing.pdf )
- Highlights: highlight (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture6-highlight.pdf )
- Paper notes: structured training for neural network transition-based parsing (link: http://www.hankcs.com/nlp/cs224n-syntaxnet.html ) 73 | ## Part 4 homework: 74 | Assignment 2.3 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/Assignmnet.md )
Recurrent Neural Networks: Language Modeling 75 | 76 | # Week 6 77 | ## Part 1 tasks: 78 | (1) Machine translation, seq2seq, and attention: go through slides lecture10, the video, and the notes
Schedule: 12/3—12/4
Slides: lecture10 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture10.pdf )
Watch video 10 (link: https://www.bilibili.com/video/av30326868/?p=10 )
Notes: machine translation, seq2seq, and attention (link: http://www.hankcs.com/nlp/cs224n-9-nmt-models-with-attention.html ) 79 | 80 | 81 | ## Part 2 tasks: 82 | (1) Going further with GRUs and NMT: go through slides lecture11, the video, and the notes
Schedule: 12/6—12/7
Slides: lecture11 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture11.pdf )
Watch video 11 (link: https://www.bilibili.com/video/av30326868/?p=11 )
Notes: going further with GRUs and NMT (link: http://www.hankcs.com/nlp/cs224n-gru-nmt.html ) 83 | 84 | ## Part 3 tasks: 85 | (1) Paper reading: Google's multilingual neural machine translation system
Schedule: 12/9
Paper: paper (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Google%E2%80%99s%20Multilingual%20Neural%20Machine%20Translation%20System_%20Enabling%20Zero-Shot%20Translation.pdf )
Highlights: highlight (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture8-highlight.pdf )
Paper notes: Google's multilingual NMT system (link: http://www.hankcs.com/nlp/cs224n-google-nmt.html ) 86 | ## Part 4 homework: 87 | Assignment 3.1 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/Assignmnet.md )
A window into named entity recognition (NER): window-based NER 88 | 89 | 90 | # Week 7 91 | ## Part 1 tasks: 92 | (1) End-to-end models for speech recognition: go through slides lecture12, the video, and the notes
Schedule: 12/10—12/11
Slides: lecture12 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture12.pdf )
Watch video 12 (link: https://www.bilibili.com/video/av30326868/?p=12 )
Notes: end-to-end models for speech recognition (link: http://www.hankcs.com/nlp/cs224n-end-to-end-asr.html ) 93 | 94 | ## Part 2 tasks: 95 | (1) Convolutional neural networks (CNNs): go through slides lecture13, the video, and the notes
Schedule: 12/13—12/14
Slides: lecture13 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture13.pdf )
Watch video 13 (link: https://www.bilibili.com/video/av30326868/?p=13 )
Notes: convolutional neural networks (link: http://www.hankcs.com/nlp/cs224n-convolutional-neural-networks.html ) 96 | 97 | ## Part 3 tasks: 98 | (1) Paper reading: lip reading
Schedule: 12/16
Paper: paper (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Lip%20Reading%20Sentences%20in%20the%20Wild.pdf )
Highlights: highlight (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture9-highlight.pdf )
Paper notes: lip reading (link: http://www.hankcs.com/nlp/cs224n-lip-reading.html ) 99 | 100 | ## Part 4 homework: 101 | Assignment 3.2 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/Assignmnet.md )
Recurrent neural nets for named entity recognition (NER): RNN-based NER 102 | 103 | # Week 8 104 | ## Part 1 tasks: 105 | (1) Tree RNNs and constituency parsing: go through slides lecture14, the video, and the notes
Schedule: 12/17—12/18
Slides: lecture14 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture14.pdf )
Watch video 14 (link: https://www.bilibili.com/video/av30326868/?p=14 )
Notes: tree RNNs and constituency parsing (link: http://www.hankcs.com/nlp/cs224n-tree-recursive-neural-networks-and-constituency-parsing.html ) 106 | 107 | ## Part 2 tasks: 108 | (1) Coreference resolution: go through slides lecture15, the video, and the notes
Schedule: 12/20—12/21
Slides: lecture15 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture15.pdf )
Watch video 15 (link: https://www.bilibili.com/video/av30326868/?p=15 )
Notes: coreference resolution (link: http://www.hankcs.com/nlp/cs224n-coreference-resolution.html ) 109 | 110 | ## Part 3 tasks: 111 | (1) Paper reading: character-aware neural language models
Schedule: 12/23
Paper: paper (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Character-Aware%20Neural%20Language%20Models.pdf )
Highlights: highlight (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture10-highlight.pdf )
Paper notes: character-aware neural language models (link: http://www.hankcs.com/nlp/cs224n-character-aware-neural-language-models.html ) 112 | ## Part 4 homework: 113 | Assignment 3.3 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/Assignmnet.md )
3.3 Grooving with GRUs (NER): GRU-based named entity recognition 114 | 115 | # Week 9 116 | ## Part 1 tasks: 117 | DMNs and question answering: go through slides lecture16, the video, and the notes

Schedule: 12/24—12/25

- Slides: lecture16 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture16.pdf )
- Watch video 16 (link: https://www.bilibili.com/video/av30326868/?p=16 )
- Notes: DMNs and question answering (link: http://www.hankcs.com/nlp/cs224n-dmn-question-answering.html ) 118 | 119 | ## Part 2 tasks: 120 | Open problems in NLP and future architectures: go through slides lecture17, the video, and the notes
Schedule: 12/27—12/28
Slides: lecture17 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture17.pdf )
Watch video 17 (link: https://www.bilibili.com/video/av30326868/?p=17 )
Notes: open problems in NLP and future architectures (link: http://www.hankcs.com/nlp/cs224n-nlp-issues-architectures.html ) 121 | 122 | ## Part 3 tasks: 123 | (1) Paper reading: learning program embeddings to propagate feedback on student code
Schedule: 12/30
Paper: paper (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Learning%20Program%20Embeddings%20to%20Propagate%20Feedback%20on%20Student%20Code.pdf )
Highlights: highlight (link: https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture12-highlight.pdf )
Paper notes: learning the semantics of code (link: http://www.hankcs.com/nlp/cs224n-program-embeddings.html ) 124 | ## Part 4 homework: 125 | Assignment 3.3 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/Assignmnet.md )
3.3 Grooving with GRUs (NER): GRU-based named entity recognition 126 | 127 | # Week 10 128 | ## Part 1 tasks: 129 | Tackling the limits of deep learning for NLP: go through slides lecture18, the video, and the notes

Schedule: 12/31—1/1

- Slides: lecture18 (link: https://github.com/learning511/cs224n-learning-camp/blob/master/lecture-notes/cs224n-2017-lecture18.pdf )
- Watch video 18 (link: https://www.bilibili.com/video/av30326868/?p=18 )
- Notes: tackling the limits of deep learning for NLP (link: http://www.hankcs.com/nlp/cs224n-tackling-the-limits-of-dl-for-nlp.html ) 130 | 131 | ## Parts 2 and 3 tasks: 132 | (1) Paper reading: neural-turing-machines
Schedule: 1/3—1/6
Paper: paper ( https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Deep%20Reinforcement%20Learning%20for%20Dialogue%20Generation.pdf )
Highlights: highlight ( https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture14-highlight.pdf )
Paper notes: neural-turing-machines ( http://www.hankcs.com/nlp/cs224n-neural-turing-machines.html )

(2) Paper reading: deep reinforcement learning for dialogue generation
Schedule: 1/3—1/6
Paper: paper ( https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Deep%20Reinforcement%20Learning%20for%20Dialogue%20Generation.pdf )
Highlights: highlight ( https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture11-highlight.pdf )
Paper notes: deep reinforcement learning for dialogue generation ( http://www.hankcs.com/nlp/cs224n-deep-reinforcement-learning-for-dialogue-generation.html )


133 | 134 | # Week 11 135 | Learning content by section

(1) Paper reading: visual dialog

Schedule: 1/7—1/13

- Paper: paper ( https://github.com/learning511/cs224n-learning-camp/blob/master/paper/Visual%20Dialog.pdf )
- Highlights: highlight ( https://github.com/learning511/cs224n-learning-camp/blob/master/paper/highlight/cs224n-2017-lecture5-highlight.pdf )
- Paper notes: visual dialog ( http://www.hankcs.com/nlp/cs224n-visual-dialog.html )

(2) Competition retrospective: review the earlier competition

(3) Course wrap-up: write up your own notes 136 | 137 | --- 138 | 139 | # 达观杯 competition 140 | ## 1. Watch the 达观杯 NLP competition registration-guide PDF and the beginner-guidance video 141 |
Schedule: 10/25—11/4
Finish an AI competition in 1 hour starting from zero
Beginner guidance for the 达观杯 text intelligence challenge (video below; if it is unclear you can also watch it on Lizhi Weike at https://m.weike.fm/lecture/10195400 , password 011220)

[02零基础1小时完成一场AI比赛.pdf](http://p2.dcsapi.com/c2ZtYnVqd2ZRYnVpJD4kMzEyOTAyMzA0MjBOVWh5TmtOeU9rRnhOa0ozT3tGeS9pdW5tJCckd2pmeCQ+JDMxMjkwMjMwNDIwTlVoeU5rTnlPa0Z4TmtKM097RnkvaXVubSQnJHVqbmYkPiQyNjU3OjY5MzEzMTY4JCckdXpxZiQ+JDI1)
2018.10.22
03 达观杯文本智能挑战赛.mp4 2018.10.22 142 | 143 | 144 | ## 2. Watch the 达观杯 NLP competition advanced-guidance video 145 | Schedule: 11/19—12/2
Advanced guidance for the 达观杯 text intelligence challenge (video below; if it is unclear you can also watch it on Lizhi Weike at https://m.weike.fm/lecture/10726829 , password 011220)
04达观杯之文本分类任务解析与代码使用(进阶指导).mp4 2018.11.18 146 | 147 | 148 | 149 | -------------------------------------------------------------------------------- /jupyter-Pillow-inline/README.md: -------------------------------------------------------------------------------- 1 | detail please visit :[【jupyter】:使用Pillow包显示图像时inline显示](https://blog.csdn.net/weixin_37251044/article/details/81137726) 2 | -------------------------------------------------------------------------------- /jupyter-Pillow-inline/img.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/JackKuo666/csdn_blog_code_implement/9996a052cc636286829585426cbdf7fe3d546f8e/jupyter-Pillow-inline/img.png -------------------------------------------------------------------------------- /python-numpy-sum/Python中Numpy库中的np.sum(array,axis=0,1,2...)怎么理解?.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "参考:https://segmentfault.com/q/1010000010111006" 8 | ] 9 | }, 10 | { 11 | "cell_type": "markdown", 12 | "metadata": {}, 13 | "source": [ 14 | "# 1.Python中Numpy库中的np.sum(array,axis=0,1,2...)怎么理解?" 
15 | ] 16 | }, 17 | { 18 | "cell_type": "code", 19 | "execution_count": 1, 20 | "metadata": {}, 21 | "outputs": [ 22 | { 23 | "name": "stdout", 24 | "output_type": "stream", 25 | "text": [ 26 | "构造一个[4 3 2]的三维数组:\n", 27 | "[[[ 0 1]\n", 28 | " [ 2 3]\n", 29 | " [ 4 5]]\n", 30 | "\n", 31 | " [[ 6 7]\n", 32 | " [ 8 9]\n", 33 | " [10 11]]\n", 34 | "\n", 35 | " [[12 13]\n", 36 | " [14 15]\n", 37 | " [16 17]]\n", 38 | "\n", 39 | " [[18 19]\n", 40 | " [20 21]\n", 41 | " [22 23]]]\n", 42 | "\n", 43 | "1.数组的第一维相加之和是:\n", 44 | "[[36 40]\n", 45 | " [44 48]\n", 46 | " [52 56]]\n", 47 | "我们看到36是0+6+12+18得到的,40是1+7+13+19得到的。所以可以总结为:axis = 0 时,是4个[3 2]二维数组对应位置相加。\n", 48 | "\n", 49 | "2.数组的第二维相加之和是:\n", 50 | "[[ 6 9]\n", 51 | " [24 27]\n", 52 | " [42 45]\n", 53 | " [60 63]]\n", 54 | "我们看到6是0+2+4得到的,9是1+3+5得到的。所以可以总结为:axis = 1时,是[4 3 2]中第二维的3个3个相加,在这里我们可以理解为0 2 4是三行,那么就是3行相加。\n", 55 | "\n", 56 | "3.数组的第三维相加之和是:\n", 57 | "[[ 1 5 9]\n", 58 | " [13 17 21]\n", 59 | " [25 29 33]\n", 60 | " [37 41 45]]\n", 61 | "我们看到1是0+1得到的,5是2+3得到的。所以可以总结为:axis = 2时,是[4 3 2]中的第三维的2个2个相加,在这里我们可以理解为0 1 是两列,那么就是2列相加\n" 62 | ] 63 | } 64 | ], 65 | "source": [ 66 | "import numpy as np\n", 67 | "abc = np.arange(0,24,1).reshape(4,3,2)\n", 68 | "print (\"构造一个[4 3 2]的三维数组:\")\n", 69 | "print (abc)\n", 70 | "\n", 71 | "print (\"\\n1.数组的第一维相加之和是:\")\n", 72 | "print (np.sum(abc, axis=(0, )))\n", 73 | "print (\"我们看到36是0+6+12+18得到的,40是1+7+13+19得到的。所以可以总结为:axis = 0 时,是4个[3 2]二维数组对应位置相加。\")\n", 74 | "\n", 75 | "print (\"\\n2.数组的第二维相加之和是:\")\n", 76 | "print (np.sum(abc, axis = (1,)))\n", 77 | "print (\"我们看到6是0+2+4得到的,9是1+3+5得到的。所以可以总结为:axis = 1时,是[4 3 2]中第二维的3个3个相加,在这里我们可以理解为0 2 4是三行,那么就是3行相加。\")\n", 78 | "\n", 79 | "print (\"\\n3.数组的第三维相加之和是:\")\n", 80 | "print (np.sum(abc, axis = (2,)))\n", 81 | "print (\"我们看到1是0+1得到的,5是2+3得到的。所以可以总结为:axis = 2时,是[4 3 2]中的第三维的2个2个相加,在这里我们可以理解为0 1 是两列,那么就是2列相加\")" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": {}, 87 | "source": [ 88 | " # 
2.Python中Numpy库中的np.sum(array,axis=(0,1,2))怎么理解?" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 9, 94 | "metadata": {}, 95 | "outputs": [ 96 | { 97 | "name": "stdout", 98 | "output_type": "stream", 99 | "text": [ 100 | "构造一个[4 3 2]的三维数组A是:\n", 101 | "[[[ 0 1]\n", 102 | " [ 2 3]\n", 103 | " [ 4 5]]\n", 104 | "\n", 105 | " [[ 6 7]\n", 106 | " [ 8 9]\n", 107 | " [10 11]]\n", 108 | "\n", 109 | " [[12 13]\n", 110 | " [14 15]\n", 111 | " [16 17]]\n", 112 | "\n", 113 | " [[18 19]\n", 114 | " [20 21]\n", 115 | " [22 23]]]\n", 116 | "\n", 117 | "1.数组的第一维相加之和B是:\n", 118 | "[[36 40]\n", 119 | " [44 48]\n", 120 | " [52 56]]\n", 121 | "\n", 122 | "2.数组B的第二维相加之和C是:\n", 123 | "[ 76 92 108]\n", 124 | "\n", 125 | "3.数组A的第一维先相加,之后再第三维相加D:\n", 126 | "[ 76 92 108]\n", 127 | "\n", 128 | "对比说明:axis=(0,2)表示将数组A[4 3 2]的第一维先相加,相加之后,变成数组B[3,2],A的第三维是大小是2,在B里边变成第二维了,所以B[3 2]第二维相加变成C[3,]。这里我们分步求和得到的C和一步求和的到的D是一致的,证明我们猜想正确。\n" 129 | ] 130 | } 131 | ], 132 | "source": [ 133 | "import numpy as np\n", 134 | "abc = np.arange(0,24,1).reshape(4,3,2)\n", 135 | "print (\"构造一个[4 3 2]的三维数组A是:\")\n", 136 | "print (abc)\n", 137 | "print (\"\\n1.数组的第一维相加之和B是:\")\n", 138 | "d = np.sum(abc, axis=(0, ))\n", 139 | "print (d)\n", 140 | "e = np.sum(d, axis=(1, ))\n", 141 | "print (\"\\n2.数组B的第二维相加之和C是:\")\n", 142 | "print (e)\n", 143 | "print(\"\\n3.数组A的第一维先相加,之后再第三维相加D:\")\n", 144 | "print (np.sum(abc, axis=(0, 2)))\n", 145 | "print (\"\\n对比说明:axis=(0,2)表示将数组A[4 3 2]的第一维先相加,\\\n", 146 | "相加之后,变成数组B[3,2],A的第三维是大小是2,在B里边变成第二维了,所以B[3 2]第二维相加变成C[3,]。\\\n", 147 | "这里我们分步求和得到的C和一步求和的到的D是一致的,证明我们猜想正确。\")" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": null, 153 | "metadata": {}, 154 | "outputs": [], 155 | "source": [] 156 | }, 157 | { 158 | "cell_type": "code", 159 | "execution_count": null, 160 | "metadata": {}, 161 | "outputs": [], 162 | "source": [] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": null, 167 | "metadata": {}, 168 | 
"outputs": [], 169 | "source": [] 170 | } 171 | ], 172 | "metadata": { 173 | "kernelspec": { 174 | "display_name": "Python 3", 175 | "language": "python", 176 | "name": "python3" 177 | }, 178 | "language_info": { 179 | "codemirror_mode": { 180 | "name": "ipython", 181 | "version": 3 182 | }, 183 | "file_extension": ".py", 184 | "mimetype": "text/x-python", 185 | "name": "python", 186 | "nbconvert_exporter": "python", 187 | "pygments_lexer": "ipython3", 188 | "version": "3.6.6" 189 | } 190 | }, 191 | "nbformat": 4, 192 | "nbformat_minor": 2 193 | } 194 | -------------------------------------------------------------------------------- /python-numpy-sum/README.md: -------------------------------------------------------------------------------- 1 | # 这是我的博客[【python笔记】:2.Python中Numpy库中的np.sum(array,axis=0,1,2...)怎么理解?](https://blog.csdn.net/weixin_37251044/article/details/81911079)的代码。详情请看我的博客。 2 | -------------------------------------------------------------------------------- /text_classfier/readme.md: -------------------------------------------------------------------------------- 1 | 这个是我的博客:[【NLP】:1.文本分类](https://blog.csdn.net/weixin_37251044/article/details/85866483)的代码部分,详情可以参考我的博客。 2 | -------------------------------------------------------------------------------- /text_classfier/text_classfier.md: -------------------------------------------------------------------------------- 1 | 2 | # 引言 3 | 4 | 文本分类是商业问题中常见的自然语言处理任务,目标是自动将文本文件分到一个或多个已定义好的类别中。文本分类的一些例子如下: 5 | 6 | 7 | 分析社交媒体中的大众情感 8 | 9 | 鉴别垃圾邮件和非垃圾邮件 10 | 11 | 自动标注客户问询 12 | 13 | 将新闻文章按主题分类 14 | 15 | 16 | ```python 17 | #导入数据集预处理、特征工程和模型训练所需的库 18 | 19 | from sklearn import model_selection, preprocessing, linear_model, naive_bayes, metrics, svm 20 | 21 | from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer 22 | 23 | from sklearn import decomposition, ensemble 24 | 25 | 26 | import pandas, xgboost, numpy, textblob, string 27 | 28 | from keras.preprocessing import text, sequence 29 | 30 | 
from keras import layers, models, optimizers
```

Using TensorFlow backend.


1. Preparing the dataset


In this article I use an Amazon reviews dataset, which can be downloaded from this link:


https://gist.github.com/kunalj101/ad1d9c58d338e20d09ff26bcc06c4235


The full dataset contains 3.6M text reviews and their labels; we only use a small portion of it. First, we load the downloaded data into a pandas DataFrame with two columns, text and label.


```python
# Load the dataset
data = open('data/corpus').read()
labels, texts = [], []
for i, line in enumerate(data.split("\n")):
    content = line.split()
    # print(content)
    # print(content[1])
    labels.append(content[0])
    texts.append(content[1])

# Create a DataFrame with columns named text and label
trainDF = pandas.DataFrame()
trainDF['text'] = texts
trainDF['label'] = labels

# print texts[:10], labels[:10]
```

Each `content` looks like this:
['__label__2', 'Stuning', 'even', 'for', 'the', 'non-gamer:', 'This', 'sound', 'track', 'was', 'beautiful!', 'It', 'paints', 'the', 'senery', 'in', 'your', 'mind', 'so', 'well', 'I', 'would', 'recomend', 'it', 'even', 'to', 'people', 'who', 'hate', 'vid.', 'game', 'music!', 'I', 'have', 'played', 'the', 'game', 'Chrono', 'Cross', 'but', 'out', 'of', 'all', 'of', 'the', 'games', 'I', 'have', 'ever', 'played', 'it', 'has', 'the', 'best', 'music!', 'It', 'backs', 'away', 'from', 'crude', 'keyboarding', 'and', 'takes', 'a', 'fresher', 'step', 'with', 'grate', 'guitars', 'and', 'soulful', 'orchestras.', 'It', 'would', 'impress', 'anyone', 'who', 'cares', 'to', 'listen!', '^_^']

> Note: as written, `texts` ends up holding only the first word of each sample (`content[1]`); it is unclear whether the original author intended this or it is an oversight.
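The note above flags a real issue: `content[1]` keeps only the first token after the label. If the intent is to classify whole reviews, the loading loop can join everything after the label instead. A minimal self-contained sketch of that fix (the two-line sample corpus here is invented for illustration; in the tutorial, `data` would come from `data/corpus`):

```python
# Sample input in the same "__label__X word1 word2 ..." format as data/corpus
data = "__label__2 Stuning even for the non-gamer\n__label__1 Buyer beware"

labels, texts = [], []
for line in data.split("\n"):
    content = line.split()
    if len(content) < 2:  # skip empty or malformed lines
        continue
    labels.append(content[0])            # e.g. "__label__2"
    texts.append(" ".join(content[1:]))  # keep the whole review, not just the first word

print(texts)
```

Joining with a single space loses the original whitespace, which is harmless for the bag-of-words and TF-IDF features used later.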
`texts[:10]` and `labels[:10]` look like this:
['Stuning', 'The', 'Amazing!:', 'Excellent', 'Remember,', 'an', 'Buyer', 'Glorious', 'A', 'Whispers']
['__label__2', '__label__2', '__label__2', '__label__2', '__label__2', '__label__2', '__label__1', '__label__2', '__label__2', '__label__2']


Next, we split the dataset into a training set and a validation set so that we can train and evaluate the classifier. We also encode the target column so that it can be used by the machine-learning models:


```python
# Split the dataset into training and validation sets
train_x, valid_x, train_y, valid_y = model_selection.train_test_split(trainDF['text'], trainDF['label'])
# Shuffles and splits the 10000 samples and labels of trainDF into
# train_x (7500 samples) and valid_x (2500 samples)

# Encode the labels as the target variable
encoder = preprocessing.LabelEncoder()
train_y = encoder.fit_transform(train_y)
# Maps labels such as "__label__1"/"__label__2" to integers 0/1
valid_y = encoder.transform(valid_y)  # reuse the encoder fitted on train_y
```

2. Feature engineering


Next comes feature engineering. In this step the raw data is transformed into feature vectors, and new features are created from the existing data. To extract relevant features from the dataset, there are several approaches:


1. Count vectors as features

2. TF-IDF vectors as features

2.1 Word level

2.2 N-gram level (multiple words)

2.3 Character level

3. Word embeddings as features

4. Text/NLP-based features

5. Topic models as features

2.1 Count vectors as features


A count vector is a matrix representation of the dataset in which every row represents a document from the corpus, every column represents a term from the corpus, and every cell holds the frequency count of a particular term in a particular document:


```python
# Create a count-vectorizer object
count_vect = CountVectorizer(analyzer='word', token_pattern=r'\w{1,}')
count_vect.fit(trainDF['text'])
```

    CountVectorizer(analyzer='word', binary=False, decode_error=u'strict',
            dtype=<type 'numpy.int64'>, encoding=u'utf-8', input=u'content',
            lowercase=True, max_df=1.0, max_features=None, min_df=1,
            ngram_range=(1, 1), preprocessor=None, stop_words=None,
            strip_accents=None, token_pattern='\\w{1,}', tokenizer=None,
            vocabulary=None)


```python
# Transform the training and validation sets with the count vectorizer
xtrain_count
= count_vect.transform(train_x)
xvalid_count = count_vect.transform(valid_x)
```

2.2 TF-IDF vectors as features


The TF-IDF score represents the relative importance of a term within a document and across the whole corpus. It consists of two parts: the first computes the normalized term frequency (TF); the second is the inverse document frequency (IDF), obtained by dividing the total number of documents in the corpus by the number of documents containing the term, and then taking the logarithm.


TF(t) = (number of times the term appears in a document) / (total number of terms in the document)

IDF(t) = log_e(total number of documents / number of documents containing the term)

TF-IDF vectors can be generated at different levels of tokenization (single words, characters, multiple words (n-grams)):


Word-level TF-IDF: the matrix represents the TF-IDF score of every term in each document.

N-gram-level TF-IDF: n-grams are combinations of several words; this matrix holds the TF-IDF scores of n-grams.

Character-level TF-IDF: the matrix represents the TF-IDF scores of character n-grams in the corpus. (The code below uses `analyzer='char'`; the Chinese original mislabels this as 词性级/part-of-speech level.)


```python
# Word-level TF-IDF
tfidf_vect = TfidfVectorizer(analyzer='word', token_pattern=r'\w{1,}', max_features=5000)
tfidf_vect.fit(trainDF['text'])
xtrain_tfidf = tfidf_vect.transform(train_x)
xvalid_tfidf = tfidf_vect.transform(valid_x)

# N-gram-level TF-IDF
tfidf_vect_ngram = TfidfVectorizer(analyzer='word', token_pattern=r'\w{1,}', ngram_range=(2,3), max_features=5000)
tfidf_vect_ngram.fit(trainDF['text'])
xtrain_tfidf_ngram = tfidf_vect_ngram.transform(train_x)
xvalid_tfidf_ngram = tfidf_vect_ngram.transform(valid_x)

# Character-level TF-IDF (token_pattern is ignored when analyzer='char')
tfidf_vect_ngram_chars = TfidfVectorizer(analyzer='char', token_pattern=r'\w{1,}', ngram_range=(2,3), max_features=5000)
tfidf_vect_ngram_chars.fit(trainDF['text'])
xtrain_tfidf_ngram_chars = tfidf_vect_ngram_chars.transform(train_x)
xvalid_tfidf_ngram_chars = tfidf_vect_ngram_chars.transform(valid_x)
```

2.3 Word embeddings


A word embedding represents words and documents as dense vectors. The position of a word in the vector space is learned from the contexts in which the word appears. Embeddings can be trained on the input corpus itself, or generated with pre-trained models such as GloVe, FastText, and Word2Vec, all of which can be downloaded and used via transfer learning. To learn more about word embeddings, visit:


https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/


Next we show how to use a pre-trained word-embedding model, in four main steps:


1.
Load the pre-trained word-embedding model

2. Create a tokenizer object

3. Convert the text documents to token sequences and pad them

4. Create a mapping from tokens to their embeddings


```python
# Load the pre-trained embedding vectors
embeddings_index = {}
for i, line in enumerate(open('/home/kuo/data/wiki-news-300d-1M.vec')):
    values = line.split()
    if i == 0:
        continue  # the first line of a .vec file is a header: ['999994', '300']
    embeddings_index[values[0]] = numpy.asarray(values[1:], dtype='float32')

# Create a tokenizer
token = text.Tokenizer()
token.fit_on_texts(trainDF['text'])
word_index = token.word_index

# Convert the texts to token sequences and pad them to a common length
train_seq_x = sequence.pad_sequences(token.texts_to_sequences(train_x), maxlen=70)
valid_seq_x = sequence.pad_sequences(token.texts_to_sequences(valid_x), maxlen=70)

# Create the token-to-embedding matrix
embedding_matrix = numpy.zeros((len(word_index) + 1, 300))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector
```

print len(train_seq_x[0])

70

print (train_seq_x[0])


[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 220]

print len(embedding_matrix[0])

300

2.4 Text/NLP-based features


Creating additional text-based features can sometimes improve the model. Some examples:


Word count of the document: total number of words in the document

Character count of the document: total number of characters in the document

Average word density: average length of the words used in the document

Punctuation count: total number of punctuation marks in the document

Upper-case count: number of upper-case words in the document

Title word count: number of title-case words in the document

Frequency distribution of part-of-speech tags:

noun count

verb count

adjective count

adverb count

pronoun count


These features are highly experimental and should be evaluated problem by problem.


```python
trainDF['char_count'] =
trainDF['text'].apply(len)

trainDF['word_count'] = trainDF['text'].apply(lambda x: len(x.split()))

trainDF['word_density'] = trainDF['char_count'] / (trainDF['word_count']+1)

trainDF['punctuation_count'] = trainDF['text'].apply(lambda x: len("".join(_ for _ in x if _ in string.punctuation)))

trainDF['title_word_count'] = trainDF['text'].apply(lambda x: len([wrd for wrd in x.split() if wrd.istitle()]))

trainDF['upper_case_word_count'] = trainDF['text'].apply(lambda x: len([wrd for wrd in x.split() if wrd.isupper()]))

pos_family = {
    'noun' : ['NN','NNS','NNP','NNPS'],
    'pron' : ['PRP','PRP$','WP','WP$'],
    'verb' : ['VB','VBD','VBG','VBN','VBP','VBZ'],
    'adj' :  ['JJ','JJR','JJS'],
    'adv' : ['RB','RBR','RBS','WRB']
}


# Count how many words in a sentence carry a POS tag from the given family

def check_pos_tag(x, flag):
    cnt = 0
    try:
        wiki = textblob.TextBlob(x)
        for tup in wiki.tags:
            ppo = list(tup)[1]
            if ppo in pos_family[flag]:
                cnt += 1
    except:
        pass
    return cnt


trainDF['noun_count'] = trainDF['text'].apply(lambda x: check_pos_tag(x, 'noun'))

trainDF['verb_count'] = trainDF['text'].apply(lambda x: check_pos_tag(x, 'verb'))

trainDF['adj_count'] = trainDF['text'].apply(lambda x:
check_pos_tag(x, 'adj'))

trainDF['adv_count'] = trainDF['text'].apply(lambda x: check_pos_tag(x, 'adv'))

trainDF['pron_count'] = trainDF['text'].apply(lambda x: check_pos_tag(x, 'pron'))
```

2.5 Topic models as features


Topic modeling identifies groups of words (topics) that carry the most information in a collection of documents. Here I use LDA to generate topic-model features. LDA is an iterative model that starts from a fixed number of topics; each topic is a distribution over words, and each document is a distribution over topics. Although the tokens themselves carry no meaning, the probability distributions over words expressed by the topics convey the ideas in the documents. To learn more about topic modeling, visit:


https://www.analyticsvidhya.com/blog/2016/08/beginners-guide-to-topic-modeling-in-python/


Let's look at the topic model in action:


```python
# Train the topic model
lda_model = decomposition.LatentDirichletAllocation(n_components=20, learning_method='online', max_iter=20)
X_topics = lda_model.fit_transform(xtrain_count)
topic_word = lda_model.components_
vocab = count_vect.get_feature_names()

# Inspect the topic model: the top 10 words of each topic
n_top_words = 10
topic_summaries = []
for i, topic_dist in enumerate(topic_word):
    topic_words = numpy.array(vocab)[numpy.argsort(topic_dist)][:-(n_top_words+1):-1]
    topic_summaries.append(' '.join(topic_words))
```

3. Modeling


The final step of the text-classification framework is to train a classifier on the features created earlier. There are many machine-learning models to choose from; we will use the following classifiers:


Naive Bayes

Linear classifier

Support Vector Machine (SVM)

Bagging models

Boosting models

Shallow neural networks

Deep neural networks

Convolutional neural networks (CNN)

LSTM

GRU

Bidirectional RNN

Recurrent Convolutional Neural Network (RCNN)

Other variants of deep neural networks


Next we introduce and apply these models. The function below is a generic training helper: its inputs are a classifier, the feature vectors of the training data, the training labels, and the feature vectors of the validation data. It fits the model on these inputs and computes the accuracy.


```python
def train_model(classifier, feature_vector_train, label, feature_vector_valid, is_neural_net=False):
    # fit the training dataset on the classifier
    classifier.fit(feature_vector_train, label)

    # predict the labels on validation dataset
    predictions = classifier.predict(feature_vector_valid)

    if is_neural_net:
        predictions = predictions.argmax(axis=-1)

    return metrics.accuracy_score(predictions, valid_y)
```

3.1 Naive Bayes


We implement a Naive Bayes model on the different feature sets using the sklearn framework.


Naive Bayes is a classification technique based on Bayes' theorem with an assumption of independence among predictors: a Naive Bayes classifier assumes that a particular feature in a class is unrelated to the presence of any other feature.


For the details of the Naive Bayes algorithm, see:
https://www.analyticsvidhya.com/blog/2017/09/naive-bayes-explained/


```python
# Naive Bayes on count vectors
accuracy = train_model(naive_bayes.MultinomialNB(), xtrain_count, train_y, xvalid_count)
print "NB, Count Vectors: ", accuracy

# Naive Bayes on word-level TF-IDF vectors
accuracy = train_model(naive_bayes.MultinomialNB(), xtrain_tfidf, train_y, xvalid_tfidf)
print "NB, WordLevel TF-IDF: ", accuracy

# Naive Bayes on n-gram TF-IDF vectors
accuracy = train_model(naive_bayes.MultinomialNB(), xtrain_tfidf_ngram, train_y, xvalid_tfidf_ngram)
print "NB, N-Gram Vectors: ", accuracy

# Naive Bayes on character-level TF-IDF vectors
accuracy = train_model(naive_bayes.MultinomialNB(), xtrain_tfidf_ngram_chars, train_y, xvalid_tfidf_ngram_chars)
print "NB, CharLevel Vectors: ", accuracy
```

    NB, Count Vectors:  0.6996
    NB, WordLevel TF-IDF:  0.6976
    NB, N-Gram Vectors:  0.498
    NB, CharLevel Vectors:  0.6704


3.2 Linear classifier


We implement a linear classifier (logistic regression): logistic regression measures the relationship between the categorical dependent variable and one or more independent variables by estimating probabilities with the logistic/sigmoid function. To learn more about logistic regression, visit:


https://www.analyticsvidhya.com/blog/2015/10/basics-logistic-regression/


```python
# Linear classifier on count vectors
accuracy = train_model(linear_model.LogisticRegression(), xtrain_count, train_y, xvalid_count)
print "LR, Count Vectors: ", accuracy

# Linear classifier on word-level TF-IDF vectors
accuracy = train_model(linear_model.LogisticRegression(), xtrain_tfidf, train_y, xvalid_tfidf)
print "LR, WordLevel
TF-IDF: ", accuracy

# Linear classifier on n-gram TF-IDF vectors
accuracy = train_model(linear_model.LogisticRegression(), xtrain_tfidf_ngram, train_y, xvalid_tfidf_ngram)
print "LR, N-Gram Vectors: ", accuracy

# Linear classifier on character-level TF-IDF vectors
accuracy = train_model(linear_model.LogisticRegression(), xtrain_tfidf_ngram_chars, train_y, xvalid_tfidf_ngram_chars)
print "LR, CharLevel Vectors: ", accuracy
```

    LR, Count Vectors:  0.7012
    LR, WordLevel TF-IDF:  0.6988
    LR, N-Gram Vectors:  0.4992
    LR, CharLevel Vectors:  0.698


3.3 Support Vector Machine


The Support Vector Machine (SVM) is a supervised learning algorithm that can be used for classification or regression. The model finds the best hyperplane (or line) separating the two classes. To learn more about SVMs, visit:


https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/


```python
# SVM on n-gram TF-IDF vectors
accuracy = train_model(svm.SVC(), xtrain_tfidf_ngram, train_y, xvalid_tfidf_ngram)
print "SVM, N-Gram Vectors: ", accuracy
```

    SVM, N-Gram Vectors:  0.496


3.4 Bagging model


We implement a random forest model: random forest is an ensemble model, more precisely a bagging model, from the tree-based model family. To learn more about random forests, visit:


https://www.analyticsvidhya.com/blog/2014/06/introduction-random-forest-simplified/


```python
# RF on count vectors
accuracy = train_model(ensemble.RandomForestClassifier(), xtrain_count, train_y, xvalid_count)
print "RF, Count Vectors: ", accuracy

# RF on word-level TF-IDF vectors
accuracy = train_model(ensemble.RandomForestClassifier(), xtrain_tfidf, train_y, xvalid_tfidf)
print "RF, WordLevel TF-IDF: ", accuracy
```

    RF, Count Vectors:  0.702
    RF, WordLevel TF-IDF:  0.6972


3.5 Boosting model


We implement an XGBoost model: boosting models are another tree-based ensemble family. Boosting is an ensemble meta-algorithm used mainly to reduce model bias; it is a family of machine-learning algorithms that turn weak learners into strong ones, where a weak learner is a classifier only slightly correlated with the true labels (a little better than random guessing). To learn more, visit:

https://www.analyticsvidhya.com/blog/2016/01/xgboost-algorithm-easy-steps/


```python
# XGBoost on count vectors
accuracy = train_model(xgboost.XGBClassifier(), xtrain_count.tocsc(), train_y, xvalid_count.tocsc())
print "Xgb, Count Vectors: ", accuracy

# XGBoost on word-level TF-IDF vectors
accuracy = train_model(xgboost.XGBClassifier(), xtrain_tfidf.tocsc(), train_y, xvalid_tfidf.tocsc())
print "Xgb, WordLevel TF-IDF: ", accuracy

# XGBoost on character-level TF-IDF vectors
accuracy = train_model(xgboost.XGBClassifier(), xtrain_tfidf_ngram_chars.tocsc(), train_y, xvalid_tfidf_ngram_chars.tocsc())
print "Xgb, CharLevel Vectors: ", accuracy
```

    Xgb, Count Vectors:  0.6268
    Xgb, WordLevel TF-IDF:  0.63
    Xgb, CharLevel Vectors:  0.6592


3.6 Shallow neural network


Neural networks are mathematical models designed to loosely resemble biological neurons and the nervous system; they are used to discover complex patterns and relationships in labeled data. A shallow neural network contains mainly three layers of neurons: an input layer, a hidden layer, and an output layer. To learn more about shallow neural networks, visit:


https://www.analyticsvidhya.com/blog/2017/05/neural-network-from-scratch-in-python-and-r/


```python
def create_model_architecture(input_size):
    # create input layer
    input_layer = layers.Input((input_size, ), sparse=True)

    # create hidden layer
    hidden_layer = layers.Dense(100, activation="relu")(input_layer)

    # create output layer
    output_layer = layers.Dense(1, activation="sigmoid")(hidden_layer)

    classifier = models.Model(inputs = input_layer, outputs = output_layer)
    classifier.compile(optimizer=optimizers.Adam(), loss='binary_crossentropy')
    return classifier

classifier = create_model_architecture(xtrain_tfidf_ngram.shape[1])
accuracy = train_model(classifier, xtrain_tfidf_ngram, train_y, xvalid_tfidf_ngram, is_neural_net=True)
print "NN, Ngram Level TF IDF Vectors", accuracy
```

    Epoch 1/1
    7500/7500 [==============================] - 8s - loss: 0.6913
    NN, Ngram
Level TF IDF Vectors 0.496


3.7 Deep neural networks


Deep neural networks are more complex neural networks in which the hidden layers perform more complex operations than simple sigmoid or ReLU activations. Different types of deep-learning models can be applied to text classification.

*(figure omitted: deep neural network architecture; the original notebook attachment did not survive export)*


Convolutional neural network


In a convolutional neural network, the output is computed by convolutions over the input layer. The result is locally connected: each input unit is connected to output neurons. Every layer applies different filters and combines their results.

*(figure omitted: CNN architecture; the original notebook attachment did not survive export)*

To learn more about convolutional neural networks, visit:


https://www.analyticsvidhya.com/blog/2017/06/architecture-of-convolutional-neural-networks-simplified-demystified/


```python
def create_cnn():
    # Add an Input Layer
    input_layer = layers.Input((70, ))

    # Add the word embedding Layer
    embedding_layer = layers.Embedding(len(word_index) + 1, 300, weights=[embedding_matrix], trainable=False)(input_layer)
    embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)

    # Add the convolutional Layer
    conv_layer = layers.Convolution1D(100, 3, activation="relu")(embedding_layer)

    # Add the pooling Layer
    pooling_layer = layers.GlobalMaxPool1D()(conv_layer)

    # Add the output Layers
    output_layer1 = layers.Dense(50, activation="relu")(pooling_layer)
    output_layer1 = layers.Dropout(0.25)(output_layer1)
    output_layer2 = layers.Dense(1, activation="sigmoid")(output_layer1)

    # Compile the model
    model = models.Model(inputs=input_layer, outputs=output_layer2)
    model.compile(optimizer=optimizers.Adam(), loss='binary_crossentropy')

    return model

classifier = create_cnn()
accuracy = train_model(classifier, train_seq_x, train_y, valid_seq_x, is_neural_net=True)
print "CNN, Word Embeddings", accuracy
```

    Epoch 1/1
    7500/7500 [==============================] - 22s - loss: 0.6930
    CNN, Word Embeddings 0.496


Recurrent neural network: LSTM

Unlike feed-forward networks, where activations propagate in only one direction, in a recurrent neural network the activation outputs propagate in both directions (from inputs to outputs and from outputs back to inputs). This creates loops in the architecture that act as a "memory state" for the neurons, allowing the network to remember what it has learned so far. This memory state gives RNNs an advantage over traditional networks, but it also introduces the vanishing-gradient problem: with many layers, it becomes very hard to learn and tune the parameters of the earlier layers. To solve this problem, a new type of RNN called LSTM (Long Short-Term Memory) was developed:
*(figure omitted: LSTM cell; the original notebook attachment did not survive export)*
To learn more about LSTM, visit:


https://www.analyticsvidhya.com/blog/2017/12/fundamentals-of-deep-learning-introduction-to-lstm/


```python
def create_rnn_lstm():
    # Add an Input Layer
    input_layer = layers.Input((70, ))

    # Add the word embedding Layer
    embedding_layer = layers.Embedding(len(word_index) + 1, 300, weights=[embedding_matrix], trainable=False)(input_layer)
    embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)

    # Add the LSTM Layer
    lstm_layer = layers.LSTM(100)(embedding_layer)

    # Add the output Layers
    output_layer1 = layers.Dense(50, activation="relu")(lstm_layer)
    output_layer1 = layers.Dropout(0.25)(output_layer1)
    output_layer2 = layers.Dense(1, activation="sigmoid")(output_layer1)

    # Compile the model
    model = models.Model(inputs=input_layer, outputs=output_layer2)
    model.compile(optimizer=optimizers.Adam(), loss='binary_crossentropy')

    return model

classifier = create_rnn_lstm()
accuracy = train_model(classifier, train_seq_x, train_y, valid_seq_x, is_neural_net=True)
print "RNN-LSTM, Word Embeddings", accuracy
```

    Epoch 1/1
    7500/7500 [==============================] - 56s - loss: 0.6931
    RNN-LSTM, Word Embeddings 0.496


Recurrent neural network: GRU


Gated recurrent units are another form of recurrent neural network; here we add a GRU layer in place of the LSTM.


```python
def create_rnn_gru():
    # Add an Input Layer
    input_layer = layers.Input((70, ))

    # Add the word embedding Layer
    embedding_layer =
layers.Embedding(len(word_index) + 1, 300, weights=[embedding_matrix], trainable=False)(input_layer)
    embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)

    # Add the GRU Layer
    lstm_layer = layers.GRU(100)(embedding_layer)

    # Add the output Layers
    output_layer1 = layers.Dense(50, activation="relu")(lstm_layer)
    output_layer1 = layers.Dropout(0.25)(output_layer1)
    output_layer2 = layers.Dense(1, activation="sigmoid")(output_layer1)

    # Compile the model
    model = models.Model(inputs=input_layer, outputs=output_layer2)
    model.compile(optimizer=optimizers.Adam(), loss='binary_crossentropy')

    return model

classifier = create_rnn_gru()
accuracy = train_model(classifier, train_seq_x, train_y, valid_seq_x, is_neural_net=True)
print "RNN-GRU, Word Embeddings", accuracy
```

    Epoch 1/1
    7500/7500 [==============================] - 35s - loss: 0.6930
    RNN-GRU, Word Embeddings 0.496


Bidirectional RNN


RNN layers can also be wrapped in a bidirectional layer; here we wrap the GRU layer in a bidirectional wrapper.


```python
def create_bidirectional_rnn():
    # Add an Input Layer
    input_layer = layers.Input((70, ))

    # Add the word embedding Layer
    embedding_layer = layers.Embedding(len(word_index) + 1, 300, weights=[embedding_matrix], trainable=False)(input_layer)
    embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)

    # Add the bidirectional GRU Layer
    lstm_layer = layers.Bidirectional(layers.GRU(100))(embedding_layer)

    # Add the output Layers
    output_layer1 = layers.Dense(50, activation="relu")(lstm_layer)
    output_layer1 = layers.Dropout(0.25)(output_layer1)
    output_layer2 = layers.Dense(1, activation="sigmoid")(output_layer1)

    # Compile the model
    model = models.Model(inputs=input_layer, outputs=output_layer2)
    model.compile(optimizer=optimizers.Adam(),
loss='binary_crossentropy')

    return model

classifier = create_bidirectional_rnn()
accuracy = train_model(classifier, train_seq_x, train_y, valid_seq_x, is_neural_net=True)
print "RNN-Bidirectional, Word Embeddings", accuracy
```

    Epoch 1/1
    7500/7500 [==============================] - 55s - loss: 0.6932
    RNN-Bidirectional, Word Embeddings 0.496


Recurrent convolutional neural network


Once the basic architectures have been tried, you can experiment with variants of these layers, such as the recurrent convolutional neural network (RCNN), as well as other variants such as:


Hierarchical Attention Networks

Sequence-to-sequence models with attention

Bidirectional recurrent convolutional neural networks

CNNs and RNNs with more layers


```python
def create_rcnn():
    # Add an Input Layer
    input_layer = layers.Input((70, ))

    # Add the word embedding Layer
    embedding_layer = layers.Embedding(len(word_index) + 1, 300, weights=[embedding_matrix], trainable=False)(input_layer)
    embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)

    # Add the recurrent layer
    rnn_layer = layers.Bidirectional(layers.GRU(50, return_sequences=True))(embedding_layer)

    # Add the convolutional Layer
    # NOTE: as written, rnn_layer above is never used; for a true RCNN the
    # convolution should take rnn_layer rather than embedding_layer as input.
    conv_layer = layers.Convolution1D(100, 3, activation="relu")(embedding_layer)

    # Add the pooling Layer
    pooling_layer = layers.GlobalMaxPool1D()(conv_layer)

    # Add the output Layers
    output_layer1 = layers.Dense(50, activation="relu")(pooling_layer)
    output_layer1 = layers.Dropout(0.25)(output_layer1)
    output_layer2 = layers.Dense(1, activation="sigmoid")(output_layer1)

    # Compile the model
    model = models.Model(inputs=input_layer, outputs=output_layer2)
    model.compile(optimizer=optimizers.Adam(), loss='binary_crossentropy')

    return model

classifier = create_rcnn()
accuracy = train_model(classifier, train_seq_x, train_y, valid_seq_x,
is_neural_net=True)
print "CNN, Word Embeddings", accuracy
```

    Epoch 1/1
    7500/7500 [==============================] - 18s - loss: 0.6930
    CNN, Word Embeddings 0.496


Further improving the performance of text-classification models


Although the framework above can be applied to many text-classification problems, a few refinements to it can push accuracy higher. Here are some tips for improving the models and the framework:


1. Text cleaning: cleaning the text helps reduce the noise in the data, such as stop words, punctuation, and suffix variations. This article helps with the implementation:


https://www.analyticsvidhya.com/blog/2014/11/text-data-cleaning-steps-python/


2. Combining text feature vectors with text/NLP features: in the feature-engineering stage, combining the generated feature vectors may improve the classifier's accuracy.


3. Hyperparameter tuning: tuning is an important step; many parameters, such as tree depth, number of leaf nodes, and network parameters, can be tuned to obtain the best-fitting model.


4. Ensembling: stacking different models and blending their outputs can further improve the results. To learn more about model ensembling, visit:


https://www.analyticsvidhya.com/blog/2015/08/introduction-ensemble-learning/

Closing remarks


This article discussed how to prepare a text dataset, including cleaning and creating training and validation sets; how to apply different kinds of feature engineering, such as count vectors, TF-IDF, word embeddings, topic models, and basic text features; and how to train a variety of classifiers, including Naive Bayes, logistic regression, SVM, MLP, LSTM, and GRU. Finally, it discussed several ways to improve the performance of a text classifier.


Did you find this article useful? Share your views and opinions in the comments below.


Original article: https://www.analyticsvidhya.com/blog/2018/04/a-comprehensive-guide-to-understand-and-implement-text-classification-in-python/




```python

```
--------------------------------------------------------------------------------
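To close, the core of the tutorial's framework (load, split, encode labels, vectorize with TF-IDF, train a linear classifier, measure accuracy) can be condensed into a few lines. A minimal self-contained sketch, substituting an invented four-review toy corpus for the Amazon data and using a print function so it also runs on Python 3; `test_size` and `random_state` are illustrative choices, not values from the tutorial:

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy corpus in place of the Amazon reviews: (label, text) pairs,
# repeated so that both classes survive the train/validation split
corpus = [
    ("__label__2", "great sound track beautiful music loved it"),
    ("__label__2", "excellent product works great highly recommend"),
    ("__label__1", "terrible waste of money broke immediately"),
    ("__label__1", "awful quality very disappointed do not buy"),
] * 10

labels = [lab for lab, _ in corpus]
texts = [txt for _, txt in corpus]

# Split, then encode labels with an encoder fitted only on the training labels
train_x, valid_x, train_y, valid_y = train_test_split(texts, labels, test_size=0.25, random_state=0)
encoder = LabelEncoder()
train_y = encoder.fit_transform(train_y)
valid_y = encoder.transform(valid_y)

# Word-level TF-IDF: fit on training texts, transform (not refit) validation texts
tfidf = TfidfVectorizer(analyzer="word", token_pattern=r"\w{1,}")
xtrain = tfidf.fit_transform(train_x)
xvalid = tfidf.transform(valid_x)

clf = LogisticRegression()
clf.fit(xtrain, train_y)
acc = accuracy_score(valid_y, clf.predict(xvalid))
print("LR, WordLevel TF-IDF:", acc)
```

On this trivially separable toy corpus the accuracy is uninformative; the point is the shape of the pipeline, which matches the steps used throughout the tutorial.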