├── README.md
├── README.txt
├── SRC
    ├── cnn_ts_ts.py
    ├── cnn_ts_tt.py
    ├── cnn_tst_tt.py
    ├── cnn_tt_tt.py
    ├── generateMNIST_SandMNIST_T.py
    ├── generate_data.py
    └── temp.py
└── result
    ├── ts_ts.txt
    ├── ts_tt.txt
    ├── tst_tt1.txt
    └── tt_tt.txt


/README.md:
--------------------------------------------------------------------------------
 1 | # domain_adaptation
 2 | This project implements domain adaptation as described in "Unsupervised Domain Adaptation by Backpropagation" (http://machinelearning.wustl.edu/mlpapers/paper_files/icml2015_ganin15.pdf). It is implemented on MNIST.
 3 | 
 4 | Domain adaptation implementation
 5 | =============================
 6 | 
 7 | This project implements domain adaptation on top of a CNN.
 8 | It broadly follows the paper "Unsupervised Domain Adaptation by Backpropagation":
 9 | Download: http://machinelearning.wustl.edu/mlpapers/paper_files/icml2015_ganin15.pdf
10 | 
11 | =============================
12 | The implementation is based on a CNN.
13 | Blog post explaining the CNN code: http://blog.csdn.net/u012162613/article/details/43225445
14 | Source of the CNN code: https://github.com/wepe/MachineLearning/blob/master/DeepLearning%20Tutorials/cnn_LeNet/convolutional_mlp_commentate.py
15 | 
16 | =============================
17 | The project directory contains only four folders: SRC, data, result, reference
18 | 
19 | =============================
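20 | SRC holds all of the project's source code.
21 | Run order: generate_data.py -> generateMNIST_SandMNIST_T.py -> cnn_~~~.py
22 | 
23 | generate_data.py: generates the target-domain datasets target~.pkl and st~.pkl from mnist.pkl.gz (stored under data/).
24 | You can change theta in generate_data.py (the factor applied when taking the difference between source-domain and target-domain images, which keeps the two domains from drifting too far apart).
25 | 
26 | generateMNIST_SandMNIST_T.py: renders a sample of images from source.pkl, target~.pkl and st~.pkl for display.
27 | Only a subset is rendered: 10 images each for train, valid and test.
28 | 
29 | cnn_ts_ts.py: train on the source domain, test on the source domain.
30 | 
31 | cnn_ts_tt.py: train on the source domain, test on the target domain (the validation set is target-domain data).
32 | 
33 | cnn_tt_tt.py: train on the target domain, test on the target domain.
34 | 
35 | cnn_tst_tt.py: train on both the source and target domains, test on the target domain (i.e. with the domain adaptation mechanism; the validation set is target-domain data).
36 | You can tune lmbda there (the lambda in Equation 4 of the paper).
37 | 
38 | These four scripts write their results to txt files under result/.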
39 | 
40 | =============================
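41 | data holds the data the project needs.
42 | Before running anything, the following must be present: the BSR folder, the imageMNIST_T folder, the imageMNIST_S folder, mnist.pkl.gz, and source.pkl
43 | 
44 | The BSR folder:
45 | Download: http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/BSR/BSR_bsds500.tgz
46 | Dataset description: https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/resources.html#bsds500
47 | After downloading, extract BSR_bsds500.tgz. It contains a BSR folder; place that folder under data.
48 | 
49 | imageMNIST_S and imageMNIST_T: these two folders hold the images produced by generateMNIST_SandMNIST_T.py.
50 | imageMNIST_S holds the source-domain images.
51 | imageMNIST_T holds the target-domain images.
52 | Both can be empty before running.
53 | Images with the same name in imageMNIST_S and imageMNIST_T correspond to each other.
54 | 
55 | mnist.pkl.gz: the MNIST dataset; download it before running.
56 | Source: http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz
57 | 
58 | source.pkl: the decompressed mnist.pkl.gz; decompress it before running.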
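A minimal sketch of the decompression step, assuming source.pkl is simply the gunzipped mnist.pkl.gz as described above. The helper name `decompress_gzip` is hypothetical and is not part of the project's scripts.

```python
import gzip
import shutil


def decompress_gzip(src_path, dst_path):
    """Write the decompressed contents of the gzip file src_path to dst_path."""
    with gzip.open(src_path, 'rb') as f_in:
        with open(dst_path, 'wb') as f_out:
            # Stream in chunks so the whole pickle never has to sit in memory.
            shutil.copyfileobj(f_in, f_out)


# Example call, assuming the current directory is data/:
# decompress_gzip('mnist.pkl.gz', 'source.pkl')
```

The call is left commented out so that the snippet can be imported without touching any files.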
59 | 
60 | 
61 | Running the scripts produces st~.pkl and target~.pkl under data,
62 | where the ~ part is the value of theta.
63 | 
64 | =============================
65 | result holds the run output.
66 | Result files are named after the corresponding code files.
67 | 
68 | =============================
69 | reference holds the paper the project is based on:
70 | Unsupervised Domain Adaptation by Backpropagation
71 | 
72 | =============================
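73 | Main parameters tuned in this project:
74 | theta: the factor applied to the noise-domain image when taking the per-pixel grayscale difference between a source-domain image and a noise-domain image.
75 | In the image generation formula from Section 4.1 (Results, MNIST -> MNIST-M) of the paper, I multiply I2 by theta when taking the difference between I1 and I2, which controls the gap between the source and target domains.
76 | theta lies between 0 and 1: the smaller it is, the smaller the gap; the larger it is, the larger the gap.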
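The role of theta can be sketched as follows. This is an illustration based only on the description above, not the code in generate_data.py: it assumes grayscale values in [0, 1] and an absolute difference as in the paper's MNIST-M formula, and the names `blend_pixel` and `blend_image` are hypothetical.

```python
def blend_pixel(digit, background, theta):
    """Return |digit - theta * background| for two grayscale values."""
    return abs(digit - theta * background)


def blend_image(digit_img, background_img, theta):
    """Apply blend_pixel element-wise to two equally sized 2D lists."""
    return [
        [blend_pixel(d, b, theta) for d, b in zip(d_row, b_row)]
        for d_row, b_row in zip(digit_img, background_img)
    ]


# A 1x2 "image": one black and one white digit pixel over a mid-gray background.
digit = [[0.0, 1.0]]
background = [[0.5, 0.5]]
unchanged = blend_image(digit, background, 0.0)  # theta = 0 keeps the source image
shifted = blend_image(digit, background, 1.0)    # theta = 1 uses the full difference
```

With theta = 0 the target image equals the source image; raising theta mixes in more of the background patch and so widens the gap between the domains.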
77 | 
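78 | lmbda: the lambda in Equation 4 of the paper. It controls how strongly the domain classifier's loss modifies the feature extractor's parameters during backpropagation.
79 | lmbda lies between 0 and 1: the smaller lmbda is, the less the domain classifier's loss changes the feature extractor's parameters.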
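As a rough illustration of what lmbda does, the sketch below applies one SGD step to the feature extractor's parameters in the form the paper describes: the label predictor's gradient is followed as usual, while the domain classifier's gradient enters with the opposite sign, scaled by lmbda. The function name and the plain-list representation of parameters and gradients are assumptions made for this example; it is not the project's Theano code.

```python
def feature_extractor_step(params, grad_label, grad_domain, lmbda, learning_rate):
    """One SGD step with a reversed, lmbda-scaled domain gradient.

    params, grad_label and grad_domain are lists of floats of equal length:
    the feature extractor's parameters, the gradient of the label loss and
    the gradient of the domain loss with respect to those parameters.
    """
    return [
        p - learning_rate * (g_y - lmbda * g_d)
        for p, g_y, g_d in zip(params, grad_label, grad_domain)
    ]


# lmbda = 0 ignores the domain classifier; lmbda = 1 applies its full reversed gradient.
no_adaptation = feature_extractor_step([1.0], [0.5], [0.25], 0.0, 0.5)
full_adaptation = feature_extractor_step([1.0], [0.5], [0.25], 1.0, 0.5)
```

With lmbda = 0 this reduces to ordinary SGD on the label loss, which matches the statement above that a smaller lmbda means less influence from the domain classifier.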
80 | 


--------------------------------------------------------------------------------
/README.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/shucunt/domain_adaptation/f4f283bdf731b2c507532dd7f7bde5f920551434/README.txt


--------------------------------------------------------------------------------
/SRC/cnn_ts_ts.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | @author: wepon
  4 | @blog: http://blog.csdn.net/u012162613/article/details/43225445
  5 | 
  6 | This program is based on LeNet5 with some simplifications:
  7 | - location-specific gain and bias parameters are not implemented
  8 | - max pooling is used instead of average pooling
  9 | - the classifier is softmax, whereas LeNet5 uses RBF
 10 | - the second layer of LeNet5 is not fully connected; here it is
 11 | For details see:
 12 | - Y. LeCun, L. Bottou, Y. Bengio and P. Haffner:
 13 |    Gradient-Based Learning Applied to Document
 14 |    Recognition, Proceedings of the IEEE, 86(11):2278-2324, November 1998.
 15 |    http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf
 16 | 
 17 | """
 18 | import cPickle
 19 | import gzip
 20 | import os
 21 | import sys
 22 | import time
 23 | 
 24 | import numpy
 25 | 
 26 | import theano
 27 | import theano.tensor as T
 28 | from theano.tensor.signal import downsample
 29 | from theano.tensor.nnet import conv
 30 | 
 31 | 
 32 | resultSource = os.path.join(os.path.abspath('..'), 'result/ts_ts.txt')
 33 | file_w = open(resultSource, 'a')
 34 | try:
 35 |     file_w.write("modify the time result\n")
 36 | finally:
 37 |     file_w.close()
 38 | 
 39 | """
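 40 | LeNetConvPoolLayer combines convolution and downsampling (pooling) in a single layer.
 41 | W is the weight tensor, i.e. the convolution kernels; its shape is filter_shape.
 42 | rng: random number generator used to initialize W
 43 | input: 4D tensor, theano.tensor.dtensor4
 44 | filter_shape: (number of filters, num input feature maps, filter height, filter width), e.g. 5*5 filters
 45 |              num input feature maps: the number of input feature maps
 46 |              number of filters: the number of filters, which is also the number of output feature maps
 47 | image_shape: (batch size, num input feature maps, image height, image width), the shape of the input
 48 |             batch size: the number of images in one batch
 49 | poolsize: (#rows, #cols), the size of the pooling window; this layer takes the max over each window
 50 | Trainable parameters: each filter has num input feature maps * filter height * filter width weights plus one bias;
 51 |                         e.g. 6 filters of size 5*5 on a single input map give 6*(5*5+1) parameters
 52 | num output feature maps: equal to number of filters; pooling shrinks each map but does not change the count.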
 53 | """
 54 | class LeNetConvPoolLayer(object):
 55 |     def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
 56 |   
 57 | # assert: execution continues if the condition is True and aborts if it is False.
 58 | # image_shape[1] and filter_shape[1] are both num input feature maps, so they must match.
 59 |         assert image_shape[1] == filter_shape[1]
 60 |         self.input = input
 61 | 
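 62 | # Each hidden unit (i.e. output pixel) has num input feature maps * filter height * filter width incoming connections.
 63 | # This is computed as numpy.prod(filter_shape[1:]); prod multiplies all the entries.
 64 | # It is the conv layer's fan-in from the previous layer.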
 65 |         fan_in = numpy.prod(filter_shape[1:])
 66 | 
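 67 | # Each unit in the lower layer receives gradients from "num output feature maps * filter height * filter width" / pooling size units.
 68 | # This is the fan-out from the conv layer to the pooling layer.
 69 |         fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /
 70 |                    numpy.prod(poolsize))   # author's open question: should this be multiplied by 2, since each filter has only one w and one b?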
 71 |                    
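 72 | # Plug fan_in and fan_out into the formula below to randomly initialize W, the linear convolution kernel.
 73 | # W starts small: with tanh, values are drawn uniformly from -sqrt(6./(fan_in+fan_out)) to sqrt(6./(fan_in+fan_out)).
 74 | # rng.uniform(low, high, size) samples from the half-open interval [low, high) and returns an array of shape size.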
 75 |         W_bound = numpy.sqrt(6. / (fan_in + fan_out))  
 76 |         self.W = theano.shared(
 77 |             numpy.asarray(
 78 |                 rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
 79 |                 dtype=theano.config.floatX
 80 |             ),
 81 |             borrow=True
 82 |         )
 83 | 
 84 | # the bias is a 1D tensor -- one bias per output feature map
 85 | # The bias b is a 1D vector with one entry per output feature map, initialized to zeros.
 86 | # The number of output feature maps equals the number of filters, hence filter_shape[0].
 87 |         b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
 88 |         self.b = theano.shared(value=b_values, borrow=True)
 89 | 
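 90 | # Convolve the input images with the filters using conv.conv2d.
 91 | # The bias and the nonlinearity are not applied right after the convolution; they are applied after pooling, which is a simplification.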
 92 |         conv_out = conv.conv2d(
 93 |             input=input,
 94 |             filters=self.W,
 95 |             filter_shape=filter_shape,
 96 |             image_shape=image_shape
 97 |         )
 98 | 
 99 | # Max pooling (downsampling)
100 |         pooled_out = downsample.max_pool_2d(
101 |             input=conv_out,
102 |             ds=poolsize,
103 |             ignore_border=True
104 |         )
105 | 
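106 | # Add the bias, then apply tanh to get the final output of the conv+pool layer.
107 | # b is a 1D vector, so dimshuffle is used to make it broadcastable. For example, if b has shape (10,),
108 | # b.dimshuffle('x', 0, 'x', 'x') has shape (1, 10, 1, 1).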
109 |         self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))
110 | # Parameters of the conv+pool layer
111 |         self.params = [self.W, self.b]
112 | 
113 | 
114 | 
115 | 
116 | 
117 | """
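118 | Notes:
119 | This class defines the hidden layer. Its input is `input` and its output has one value per hidden unit. The input layer and the hidden layer are fully connected.
120 | If the input is an n_in-dimensional vector (i.e. n_in units) and the hidden layer has n_out units, full connectivity means
121 | there are n_in*n_out weights, so W has shape (n_in, n_out): n_in rows and n_out columns, each column holding the incoming weights of one hidden unit.
122 | b is the bias; the hidden layer has n_out units, so b is an n_out-dimensional vector.
123 | rng is the random number generator (numpy.random.RandomState) used to initialize W.
124 | input is the whole batch of inputs fed to the model, not a single MLP input layer: the input layer has n_in units,
125 |     whereas input has shape (n_example, n_in), one sample per row, and each row acts as the MLP's input layer.
126 | activation: the activation function, tanh by default.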
127 | """
128 | class HiddenLayer(object):
129 |     def __init__(self, rng, input, n_in, n_out, W=None, b=None, activation=T.tanh):
130 |          self.input = input   # store the input passed to HiddenLayer
131 | 
132 |          """
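133 |          Notes:
134 |          For GPU compatibility, use dtype=theano.config.floatX and store the values in a theano.shared variable.
135 |          W is initialized as follows: with tanh, values are drawn uniformly from
136 |          -sqrt(6./(n_in+n_out)) to sqrt(6./(n_in+n_out)); with sigmoid, that range is multiplied by 4.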
137 |          """
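138 |          # If W was not supplied, initialize it as described above.
139 |          # The check exists because W can also be initialized from previously trained parameters (see the author's earlier blog post).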
140 |          if W is None:
141 |             W_values = numpy.asarray(
142 |                 rng.uniform(
143 |                     low=-numpy.sqrt(6. / (n_in + n_out)),
144 |                     high=numpy.sqrt(6. / (n_in + n_out)),
145 |                     size=(n_in, n_out)
146 |                 ),
147 |                 dtype=theano.config.floatX
148 |             )
149 |             if activation == theano.tensor.nnet.sigmoid:
150 |                 W_values *= 4
151 |             W = theano.shared(value=W_values, name='W', borrow=True)
152 | 
153 |          if b is None:
154 |             b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
155 |             b = theano.shared(value=b_values, name='b', borrow=True)
156 | 
157 |          # Store W and b on the HiddenLayer instance
158 |          self.W = W
159 |          self.b = b
160 | 
161 |         # Output of the hidden layer
162 |          lin_output = T.dot(input, self.W) + self.b
163 |          self.output = (
164 |             lin_output if activation is None
165 |             else activation(lin_output)
166 |          )
167 | 
168 |         # Parameters of the hidden layer
169 |          self.params = [self.W, self.b]
170 | 
171 | 
172 | 
173 | """
174 | Defines the classification layer LogisticRegression, i.e. softmax regression.
175 | The Deep Learning Tutorials treat LogisticRegression as softmax directly;
176 | the familiar two-class logistic regression is the n_out=2 case.
177 | """
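178 | # Parameters:
179 | # input: shape (n_example, n_in), where n_example is the size of one batch;
180 | # it is defined this way because training uses minibatch SGD
181 | # n_in: the size of the previous (hidden) layer's output
182 | # n_out: the number of output classes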
183 | class LogisticRegression(object):
184 |     def __init__(self, input, n_in, n_out):
185 | 
186 | # W has n_in rows and n_out columns; b is an n_out-dimensional vector. Each output corresponds to one column of W and one element of b.
187 |         self.W = theano.shared(
188 |             value=numpy.zeros(
189 |                 (n_in, n_out),
190 |                 dtype=theano.config.floatX
191 |             ),
192 |             name='W',
193 |             borrow=True
194 |         )
195 | 
196 |         self.b = theano.shared(
197 |             value=numpy.zeros(
198 |                 (n_out,),
199 |                 dtype=theano.config.floatX
200 |             ),
201 |             name='b',
202 |             borrow=True
203 |         )
204 | 
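205 | # input is (n_example, n_in) and W is (n_in, n_out); their matrix product is (n_example, n_out). Adding the bias b
206 | # and applying T.nnet.softmax gives p_y_given_x,
207 | # so each row of p_y_given_x holds one sample's estimated probability for each class.
208 | # Note: b is an n_out-dimensional vector added to an (n_example, n_out) matrix; it is broadcast,
209 | # i.e. b is added to every row of the matrix.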
210 |         self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
211 | 
212 | # argmax with axis=1 returns, for each row, the index of the largest value; for MNIST that index is the class label itself.
213 |         self.y_pred = T.argmax(self.p_y_given_x, axis=1)
214 | 
215 | # Parameters of LogisticRegression
216 |         self.params = [self.W, self.b]
217 | 
218 |     def negative_log_likelihood(self, y):  # mean negative log-probability of the correct labels y
219 |         return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
220 | 
221 |     def errors(self, y):
222 |         if y.ndim != self.y_pred.ndim:
223 |             raise TypeError(
224 |                 'y should have the same shape as self.y_pred',
225 |                 ('y', y.type, 'y_pred', self.y_pred.type)
226 |             )
227 |         if y.dtype.startswith('int'):
228 |             return T.mean(T.neq(self.y_pred, y))
229 |         else:
230 |             raise NotImplementedError()
231 | 
232 | 
233 | """
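234 | load_data(): load the MNIST dataset.
235 | mnist.pkl splits the 60000 training samples into 50000 training samples and 10000 validation samples;
236 | each set consists of an image array and a label array; each row of the image array holds the pixels of one image, 784 values (28x28).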
237 | """
238 | def load_data(dataset):
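239 |     # dataset is the path to the dataset. The function first checks whether MNIST exists at that path and downloads it if not.
240 |     # This part is not explained further; it is unrelated to the learning algorithm.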
241 |     data_dir, data_file = os.path.split(dataset)
242 |     if data_dir == "" and not os.path.isfile(dataset):
243 |         # Check if dataset is in the data directory.
244 |         new_path = os.path.join(
245 |             os.path.split(__file__)[0],
246 |             "..",
247 |             "data",
248 |             dataset
249 |         )
250 |         if os.path.isfile(new_path) or data_file == 'mnist.pkl.gz':
251 |             dataset = new_path
252 | 
253 |     if (not os.path.isfile(dataset)) and data_file == 'mnist.pkl.gz':
254 |         import urllib
255 |         origin = (
256 |             'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'
257 |         )
258 |         print 'Downloading data from %s' % origin
259 |         
260 |         urllib.urlretrieve(origin, dataset)
261 | 
262 |     print '... loading data'
263 |     global resultSource
264 |     file_write = open(resultSource, 'a')
265 |     try:
266 |         file_write.write("... loading data\n" )
267 |     finally :
268 |         file_write.close()
269 |     
270 | # The code above locates and, if needed, downloads mnist.pkl.gz. The actual loading starts below.
271 |     
272 | # Load train_set, valid_set and test_set from "mnist.pkl.gz"; each includes labels.
273 | # This relies on gzip.open() and cPickle.load().
274 | # 'rb' opens the file for reading in binary mode.
275 |     f = gzip.open(dataset, 'rb')
276 |     train_set, valid_set, test_set = cPickle.load(f)
277 |     f.close()
278 |    
279 | 
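280 | # Store the data in shared variables, mainly for GPU acceleration: only shared variables can live in GPU memory.
281 | # Data on the GPU must be float, but data_y holds class labels, so it is cast back to int on return.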
282 |     def shared_dataset(data_xy, borrow=True):
283 |         data_x, data_y = data_xy
284 |         shared_x = theano.shared(numpy.asarray(data_x,
285 |                                                dtype=theano.config.floatX),
286 |                                  borrow=borrow)
287 |         shared_y = theano.shared(numpy.asarray(data_y,
288 |                                                dtype=theano.config.floatX),
289 |                                  borrow=borrow)
290 |         return shared_x, T.cast(shared_y, 'int32')
291 | 
292 | 
293 |     test_set_x, test_set_y = shared_dataset(test_set)
294 |     valid_set_x, valid_set_y = shared_dataset(valid_set)
295 |     train_set_x, train_set_y = shared_dataset(train_set)
296 | 
297 |     rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
298 |             (test_set_x, test_set_y)]
299 |     return rval
300 | 
301 | 
302 | """
303 | Implements LeNet5.
304 | This LeNet5 has two convolutional layers: the first has 20 kernels, the second has 50.
305 | """
306 | def evaluate_lenet5(learning_rate=0.1, n_epochs=200,
307 |                     dataset='/home/zhangshu/faq/shucunt/temp/domainAdaption/data/mnist.pkl.gz',
308 |                     nkerns=[20, 50], batch_size=500):
309 |     """ 
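310 |  learning_rate: the learning rate, i.e. the factor applied to the stochastic gradient.
311 |  n_epochs: the number of training epochs; each epoch goes through every batch, i.e. every sample.
312 |  batch_size: set to 500 here, so the gradient is computed and the parameters updated once per 500 samples.
313 |  nkerns=[20, 50]: the number of kernels in each LeNetConvPoolLayer; the first has
314 |  20 kernels and the second has 50.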
315 |     """
316 | 
317 |     rng = numpy.random.RandomState(23455)
318 | 
319 |     # Load the data
320 |     datasets = load_data(dataset)
321 |     train_set_x, train_set_y = datasets[0]
322 |     valid_set_x, valid_set_y = datasets[1]
323 |     test_set_x, test_set_y = datasets[2]
324 | 
325 |     # Compute the number of batches
326 |     n_train_batches = train_set_x.get_value(borrow=True).shape[0]
327 |     n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
328 |     n_test_batches = test_set_x.get_value(borrow=True).shape[0]
329 |     n_train_batches /= batch_size
330 |     n_valid_batches /= batch_size
331 |     n_test_batches /= batch_size
332 | 
333 |     # Symbolic variables: index is the batch index, x is the input data, y holds the corresponding labels
334 |     index = T.lscalar()  
335 |     x = T.matrix('x')   
336 |     y = T.ivector('y') 
337 | 
338 |     ######################
339 |     # BUILD ACTUAL MODEL #
340 |     ######################
341 |     print '... building the model'
342 |     global resultSource
343 |     file_write = open(resultSource, 'a')
344 |     try:
345 |         file_write.write("... building the model\n" )
346 |     finally :
347 |         file_write.close()
348 | 
349 | 
350 | # A loaded batch has shape (batch_size, 28 * 28), but LeNetConvPoolLayer takes 4D input, so reshape it.
351 |     layer0_input = x.reshape((batch_size, 1, 28, 28))
352 | 
353 | # layer0 is the first LeNetConvPoolLayer.
354 | # A single (28, 28) input image becomes (28-5+1, 28-5+1) = (24, 24) after the convolution
355 | # and (24/2, 24/2) = (12, 12) after max pooling.
356 | # Each batch has batch_size images and the first LeNetConvPoolLayer has nkerns[0] kernels,
357 | # so layer0's output has shape (batch_size, nkerns[0], 12, 12).
358 |     layer0 = LeNetConvPoolLayer(
359 |         rng,
360 |         input=layer0_input,
361 |         image_shape=(batch_size, 1, 28, 28),
362 |         filter_shape=(nkerns[0], 1, 5, 5),
363 |         poolsize=(2, 2)
364 |     )
365 | 
366 | 
367 | # layer1 is the second LeNetConvPoolLayer.
368 | # Its input is layer0's output: each (12, 12) feature map becomes (12-5+1, 12-5+1) = (8, 8) after the convolution
369 | # and (8/2, 8/2) = (4, 4) after max pooling.
370 | # Each batch has batch_size images and the second LeNetConvPoolLayer has nkerns[1] kernels,
371 | # so layer1's output has shape (batch_size, nkerns[1], 4, 4).
372 |     layer1 = LeNetConvPoolLayer(
373 |         rng,
374 |         input=layer0.output,
375 |         image_shape=(batch_size, nkerns[0], 12, 12),  # nkerns[0] input feature maps, i.e. the nkerns[0] maps output by layer0
376 |         filter_shape=(nkerns[1], nkerns[0], 5, 5),
377 |         poolsize=(2, 2)
378 |     )
379 | # At this point each input image has been reduced to nkerns[1] feature maps of size 4*4,
380 | # so the fully connected layer sees each image as a vector of length nkerns[1]*4*4: the input of HiddenLayer.
381 | 
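382 | # With the two LeNetConvPoolLayers (layer0 and layer1) defined, layer1 feeds layer2, a fully connected layer equivalent to the hidden layer of an MLP.
383 | # layer2 can therefore be built from the HiddenLayer class. Its input is 2D, (batch_size, num_pixels),
384 | # so the feature maps that the different kernels produce for one image must be concatenated into a single vector:
385 | # layer1's output (batch_size, nkerns[1], 4, 4) is flattened to (batch_size, nkerns[1]*4*4) = (500, 800) and used as layer2's input.
386 | # The hidden layer's W is 800*500 (n_in*n_out).
387 | # (500, 800) means 500 samples, one per row. layer2's output has shape (batch_size, n_out) = (500, 500).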
388 |     layer2_input = layer1.output.flatten(2)
389 |     layer2 = HiddenLayer(
390 |         rng,
391 |         input=layer2_input,
392 |         n_in=nkerns[1] * 4 * 4,
393 |         n_out=500,
394 |         activation=T.tanh
395 |     )
396 | 
397 | # The last layer, layer3, is the classification layer, built from the LogisticRegression class.
398 | # Its input is layer2's output (500, 500); its class probabilities have shape (batch_size, n_out) = (500, 10).
399 |     layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)
400 | 
401 | # Cost function: negative log-likelihood (NLL)
402 |     cost = layer3.negative_log_likelihood(y)
403 | 
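404 | # test_model computes the test error. x and y are bound to the batch selected by index, and the graph evaluates layer3,
405 | # which in turn depends on layer2, layer1 and layer0, so test_model covers the whole CNN.
406 | # test_model takes index as input and returns layer3.errors(y), i.e. the error rate.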
407 |     test_model = theano.function(
408 |         [index],
409 |         layer3.errors(y),
410 |         givens={
411 |             x: test_set_x[index * batch_size: (index + 1) * batch_size],
412 |             y: test_set_y[index * batch_size: (index + 1) * batch_size]
413 |         }
414 |     )
415 | # validate_model evaluates on the validation set; same structure as above.
416 |     validate_model = theano.function(
417 |         [index],
418 |         layer3.errors(y),
419 |         givens={
420 |             x: valid_set_x[index * batch_size: (index + 1) * batch_size],
421 |             y: valid_set_y[index * batch_size: (index + 1) * batch_size]
422 |         }
423 |     )
424 | 
425 | # train_model follows. It involves the optimizer (SGD), so gradients must be computed and parameters updated.
426 |     # All parameters
427 |     params = layer3.params + layer2.params + layer1.params + layer0.params
428 | 
429 |     # Gradients of the cost with respect to each parameter
430 |     grads = T.grad(cost, params)
431 | 
432 | # Writing out the update rule for every parameter by hand would be tedious, so the list comprehension below generates the (param_i, param_i - learning_rate * grad_i) pairs.
433 |     updates = [
434 |         (param_i, param_i - learning_rate * grad_i)
435 |         for param_i, grad_i in zip(params, grads)
436 |     ]
437 | 
438 | # train_model is built like test_model and validate_model, with the addition of the updates rule.
439 |     train_model = theano.function(
440 |         [index],
441 |         cost,
442 |         updates=updates,
443 |         givens={
444 |             x: train_set_x[index * batch_size: (index + 1) * batch_size],
445 |             y: train_set_y[index * batch_size: (index + 1) * batch_size]
446 |         }
447 |     )
448 | 
449 | 
450 |     ###############
451 |     # TRAIN MODEL #
452 |     ###############
453 |     print '... training'
454 |     global resultSource
455 |     file_write = open(resultSource, 'a')
456 |     try:
457 |         file_write.write("... training\n" )
458 |     finally :
459 |         file_write.close()
460 |     patience = 10000  
461 |     patience_increase = 2  
462 |     improvement_threshold = 0.995 
463 |                                    
464 |     validation_frequency = min(n_train_batches, patience / 2)
465 |  # Setting validation_frequency this way ensures the validation set is evaluated at least once per epoch.
466 | 
467 |     best_validation_loss = numpy.inf   # best (lowest) validation loss so far
468 |     best_iter = 0                      # iteration, counted in batches, at which best_validation_loss was reached; e.g. best_iter=10000 means after the 10000th batch
469 |     test_score = 0.
470 |     start_time = time.time()
471 | 
472 |     epoch = 0
473 |     done_looping = False
474 | 
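475 | # The training loop. The while loop runs over epochs; one epoch goes through every batch, i.e. every image.
476 | # The for loop walks the batches one at a time and calls train_model(minibatch_index) to train on each;
477 | # the updates in train_model modify the parameters.
478 | # iter counts the batches trained so far; whenever iter + 1 is a multiple of validation_frequency, the model is evaluated on the validation set.
479 | # If this_validation_loss is lower than the best loss so far, best_validation_loss,
480 | # then best_validation_loss and best_iter are updated and the model is evaluated on the test set.
481 | # If this_validation_loss is also below best_validation_loss*improvement_threshold, patience is extended.
482 | # Training stops when n_epochs is reached or when patience <= iter.
483 | # iter is the number of batches trained so far.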
484 |     while (epoch < n_epochs) and (not done_looping):
485 |         epoch = epoch + 1
486 |         for minibatch_index in xrange(n_train_batches):
487 | 
488 |             iter = (epoch - 1) * n_train_batches + minibatch_index
489 | 
490 |             if iter % 100 == 0:
491 |                 print 'training @ iter = ', iter
492 |                 global resultSource
493 |                 file_write = open(resultSource, 'a')
494 |                 try:
495 |                     file_write.write("training @ iter = " + str(iter) + '\n')
496 |                 finally :
497 |                     file_write.close()
498 |             cost_ij = train_model(minibatch_index)  
499 | # cost_ij is not used afterwards; train_model is called for its parameter updates and happens to return the cost.
500 |             if (iter + 1) % validation_frequency == 0:
501 | 
502 |                 # compute zero-one loss on validation set
503 |                 validation_losses = [validate_model(i) for i
504 |                                      in xrange(n_valid_batches)]
505 |                 this_validation_loss = numpy.mean(validation_losses)
506 |                 print('epoch %i, minibatch %i/%i, validation error %f %%' %
507 |                       (epoch, minibatch_index + 1, n_train_batches,
508 |                        this_validation_loss * 100.))
509 |                 global resultSource
510 |                 file_write = open(resultSource, 'a')
511 |                 try:
512 |                     file_write.write(str('epoch %i, minibatch %i/%i, validation error %f %%' %
513 |                       (epoch, minibatch_index + 1, n_train_batches,
514 |                        this_validation_loss * 100.))  + '\n')
515 |                 finally :
516 |                     file_write.close()
517 | 
518 |  
519 |                 if this_validation_loss < best_validation_loss:
520 | 
521 |                     
522 |                     if this_validation_loss < best_validation_loss *  \
523 |                        improvement_threshold:
524 |                         patience = max(patience, iter * patience_increase)
525 | 
526 |                     
527 |                     best_validation_loss = this_validation_loss
528 |                     best_iter = iter
529 | 
530 |                    
531 |                     test_losses = [
532 |                         test_model(i)
533 |                         for i in xrange(n_test_batches)
534 |                     ]
535 |                     test_score = numpy.mean(test_losses)
536 |                     print(('     epoch %i, minibatch %i/%i, test error of '
537 |                            'best model %f %%') %
538 |                           (epoch, minibatch_index + 1, n_train_batches,
539 |                            test_score * 100.))
540 |                     global resultSource
541 |                     file_write = open(resultSource, 'a')
542 |                     try:
543 |                         file_write.write(str(('     epoch %i, minibatch %i/%i, test error of '
544 |                            'best model %f %%') %
545 |                           (epoch, minibatch_index + 1, n_train_batches,
546 |                            test_score * 100.)) + '\n')
547 |                     finally :
548 |                         file_write.close()
549 | 
550 |             if patience <= iter:
551 |                 done_looping = True
552 |                 break
553 | 
554 |     end_time = time.time()
555 |     print('Optimization complete.')
556 |     print('Best validation score of %f %% obtained at iteration %i, '
557 |           'with test performance %f %%' %
558 |           (best_validation_loss * 100., best_iter + 1, test_score * 100.))
559 |     print >> sys.stderr, ('The code for file ' +
560 |                           os.path.split(__file__)[1] +
561 |                           ' ran for %.2fm' % ((end_time - start_time) / 60.))
562 |     global resultSource
563 |     file_write = open(resultSource, 'a')
564 |     try:
565 |         file_write.write(str('Optimization complete.\n'))
566 |         file_write.write(str('Best validation score of %f %% obtained at iteration %i, '
567 |           'with test performance %f %%' %
568 |           (best_validation_loss * 100., best_iter + 1, test_score * 100.)) + '\n')
569 |         file_write.write(str('The code for file ' +
570 |                           os.path.split(__file__)[1] +
571 |                           ' ran for %.2fm' % ((end_time - start_time) / 60.)) + '\n')
572 |     finally :
573 |         file_write.close()
574 | 
575 | if __name__ == '__main__':
576 |     evaluate_lenet5()
577 | 
578 | 
579 | def experiment(state, channel):
580 |     evaluate_lenet5(state.learning_rate, dataset=state.dataset)
581 | 
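The layer shapes quoted in the comments of this script can be checked with a few lines of arithmetic. This is a sketch assuming a 'valid' convolution (no padding, stride 1) followed by non-overlapping pooling that drops any incomplete border, which is how the comments above describe the layers. `conv_pool_output_size` is a hypothetical helper and is not part of the script.

```python
def conv_pool_output_size(size, filter_size, pool_size):
    """Side length after a 'valid' convolution followed by non-overlapping pooling."""
    conv_size = size - filter_size + 1   # 'valid' convolution, stride 1
    return conv_size // pool_size        # incomplete border windows are dropped


nkerns = [20, 50]
layer0_size = conv_pool_output_size(28, 5, 2)            # 28 -> 24 -> 12
layer1_size = conv_pool_output_size(layer0_size, 5, 2)   # 12 -> 8 -> 4
layer2_n_in = nkerns[1] * layer1_size * layer1_size      # 50 * 4 * 4 = 800
```

The value of layer2_n_in is the n_in that the script passes to HiddenLayer as nkerns[1] * 4 * 4.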


--------------------------------------------------------------------------------
/SRC/cnn_ts_tt.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | @author: wepon
  4 | @blog: http://blog.csdn.net/u012162613/article/details/43225445
  5 | 
  6 | This program is based on LeNet5 with some simplifications:
  7 | - location-specific gain and bias parameters are not implemented
  8 | - max pooling is used instead of average pooling
  9 | - the classifier is softmax, whereas LeNet5 uses RBF
 10 | - the second layer of LeNet5 is not fully connected; here it is
 11 | For details see:
 12 | - Y. LeCun, L. Bottou, Y. Bengio and P. Haffner:
 13 |    Gradient-Based Learning Applied to Document
 14 |    Recognition, Proceedings of the IEEE, 86(11):2278-2324, November 1998.
 15 |    http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf
 16 | 
 17 | """
 18 | import cPickle
 19 | import gzip
 20 | import os
 21 | import sys
 22 | import time
 23 | import random
 24 | import numpy
 25 | 
 26 | import theano
 27 | import theano.tensor as T
 28 | from theano.tensor.signal import downsample
 29 | from theano.tensor.nnet import conv
 30 | from PIL import Image
 31 | from scipy.misc import imsave
 32 | 
 33 | 
 34 | resultSource = os.path.join(os.path.abspath('..'), 'result/ts_tt.txt')
 35 | file_w = open(resultSource, 'a')
 36 | try:
 37 |     file_w.write("theta is 0.7\n")
 38 |     file_w.write("train data is from source;valid data is from target,test data is from target\n")
 39 |     file_w.write("modify the time result\n")
 40 | finally:
 41 |     file_w.close()
 42 | 
 43 | """
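 44 | LeNetConvPoolLayer combines convolution and downsampling (pooling) in a single layer.
 45 | W is the weight tensor, i.e. the convolution kernels; its shape is filter_shape.
 46 | rng: random number generator used to initialize W
 47 | input: 4D tensor, theano.tensor.dtensor4
 48 | filter_shape: (number of filters, num input feature maps, filter height, filter width), e.g. 5*5 filters
 49 |              num input feature maps: the number of input feature maps
 50 |              number of filters: the number of filters, which is also the number of output feature maps
 51 | image_shape: (batch size, num input feature maps, image height, image width), the shape of the input
 52 |             batch size: the number of images in one batch
 53 | poolsize: (#rows, #cols), the size of the pooling window; this layer takes the max over each window
 54 | Trainable parameters: each filter has num input feature maps * filter height * filter width weights plus one bias;
 55 |                         e.g. 6 filters of size 5*5 on a single input map give 6*(5*5+1) parameters
 56 | num output feature maps: equal to number of filters; pooling shrinks each map but does not change the count.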
 57 | """
 58 | class LeNetConvPoolLayer(object):
 59 |     def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
 60 |   
 61 | # assert: execution continues if the condition is True and aborts if it is False.
 62 | # image_shape[1] and filter_shape[1] are both num input feature maps, so they must match.
 63 |         assert image_shape[1] == filter_shape[1]
 64 |         self.input = input
 65 | 
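 66 | # Each hidden unit (i.e. output pixel) has num input feature maps * filter height * filter width incoming connections.
 67 | # This is computed as numpy.prod(filter_shape[1:]); prod multiplies all the entries.
 68 | # It is the conv layer's fan-in from the previous layer.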
 69 |         fan_in = numpy.prod(filter_shape[1:])
 70 | 
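 71 | # Each unit in the lower layer receives gradients from "num output feature maps * filter height * filter width" / pooling size units.
 72 | # This is the fan-out from the conv layer to the pooling layer.
 73 |         fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /
 74 |                    numpy.prod(poolsize))   # author's open question: should this be multiplied by 2, since each filter has only one w and one b?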
 75 |                    
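 76 | # Plug fan_in and fan_out into the formula below to randomly initialize W, the linear convolution kernel.
 77 | # W starts small: with tanh, values are drawn uniformly from -sqrt(6./(fan_in+fan_out)) to sqrt(6./(fan_in+fan_out)).
 78 | # rng.uniform(low, high, size) samples from the half-open interval [low, high) and returns an array of shape size.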
 79 |         W_bound = numpy.sqrt(6. / (fan_in + fan_out))  
 80 |         self.W = theano.shared(
 81 |             numpy.asarray(
 82 |                 rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
 83 |                 dtype=theano.config.floatX
 84 |             ),
 85 |             borrow=True
 86 |         )
 87 | 
 88 | # the bias is a 1D tensor -- one bias per output feature map
 89 | # The bias b is a 1D vector with one entry per output feature map, initialized to zeros.
 90 | # The number of output feature maps equals the number of filters, hence filter_shape[0].
 91 |         b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
 92 |         self.b = theano.shared(value=b_values, borrow=True)
 93 | 
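 94 | # Convolve the input images with the filters using conv.conv2d.
 95 | # The bias and the nonlinearity are not applied right after the convolution; they are applied after pooling, which is a simplification.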
 96 |         conv_out = conv.conv2d(
 97 |             input=input,
 98 |             filters=self.W,
 99 |             filter_shape=filter_shape,
100 |             image_shape=image_shape
101 |         )
102 | 
103 | # Max pooling (downsampling)
104 |         pooled_out = downsample.max_pool_2d(
105 |             input=conv_out,
106 |             ds=poolsize,
107 |             ignore_border=True
108 |         )
109 | 
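110 | # Add the bias, then apply tanh to get the final output of the conv+pool layer.
111 | # b is a 1D vector, so dimshuffle is used to make it broadcastable. For example, if b has shape (10,),
112 | # b.dimshuffle('x', 0, 'x', 'x') has shape (1, 10, 1, 1).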
113 |         self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))
114 | # Parameters of the conv + pool layer
115 |         self.params = [self.W, self.b]
116 | 
117 | 
118 | 
119 | 
120 | 
121 | """
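122 | Notes:
123 | This class defines a hidden layer. Its input is `input`, and its output has one entry per hidden unit. The input layer and the hidden layer are fully connected.
124 | If the input is an n_in-dimensional vector (i.e. n_in units) and the hidden layer has n_out units, full connectivity gives
125 | n_in*n_out weights, so W has shape (n_in, n_out): n_in rows and n_out columns, with each column holding the incoming weights of one hidden unit.
126 | b is the bias; the hidden layer has n_out units, so b is an n_out-dimensional vector.
127 | rng is the random number generator, a numpy.random.RandomState, used to initialize W.
128 | input holds all the inputs fed to the model, not a single MLP input layer. The MLP input layer has n_in units,
129 |     whereas input here has shape (n_example, n_in): one example per row, and each row acts as the MLP input layer.
130 | activation: the activation function, tanh by default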
131 | """
132 | class HiddenLayer(object):
133 |     def __init__(self, rng, input, n_in, n_out, W=None, b=None, activation=T.tanh):
134 |          self.input = input   # keep a reference to the input that was passed in
135 | 
136 |          """
137 |          Notes:
138 |          To run on the GPU, the values must use dtype=theano.config.floatX and be stored in a theano.shared variable.
139 |          Initialization rule for W: with tanh, sample uniformly from [-sqrt(6./(n_in+n_out)), sqrt(6./(n_in+n_out))];
140 |          with sigmoid, multiply those values by 4.
141 |          """
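142 |          # If W is not supplied, initialize it with the rule above.
143 |          # The check exists because W can also be initialized from previously trained parameters (see the original author's previous blog post).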
144 |          if W is None:
145 |             W_values = numpy.asarray(
146 |                 rng.uniform(
147 |                     low=-numpy.sqrt(6. / (n_in + n_out)),
148 |                     high=numpy.sqrt(6. / (n_in + n_out)),
149 |                     size=(n_in, n_out)
150 |                 ),
151 |                 dtype=theano.config.floatX
152 |             )
153 |             if activation == theano.tensor.nnet.sigmoid:
154 |                 W_values *= 4
155 |             W = theano.shared(value=W_values, name='W', borrow=True)
156 | 
157 |          if b is None:
158 |             b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
159 |             b = theano.shared(value=b_values, name='b', borrow=True)
160 | 
161 |          # Store W and b (passed in or created above) on the HiddenLayer instance
162 |          self.W = W
163 |          self.b = b
164 | 
165 |         # Output of the hidden layer
166 |          lin_output = T.dot(input, self.W) + self.b
167 |          self.output = (
168 |             lin_output if activation is None
169 |             else activation(lin_output)
170 |          )
171 | 
172 |         # Parameters of the hidden layer
173 |          self.params = [self.W, self.b]
174 | 
175 | 
176 | 
177 | """
178 | Classification layer LogisticRegression, i.e. softmax regression.
179 | The Deep Learning Tutorials treat LogisticRegression as softmax regression;
180 | the familiar two-class logistic regression is the special case n_out=2.
181 | """
182 | # Parameters:
183 | # input: shape (n_example, n_in), where n_example is the size of one batch,
184 | # because training uses minibatch SGD
185 | # n_in: size of the output of the previous (hidden) layer
186 | # n_out: number of output classes
187 | class LogisticRegression(object):
188 |     def __init__(self, input, n_in, n_out):
189 | 
190 | # W has n_in rows and n_out columns; b is an n_out-dimensional vector. Each output unit corresponds to one column of W and one entry of b.
191 |         self.W = theano.shared(
192 |             value=numpy.zeros(
193 |                 (n_in, n_out),
194 |                 dtype=theano.config.floatX
195 |             ),
196 |             name='W',
197 |             borrow=True
198 |         )
199 | 
200 |         self.b = theano.shared(
201 |             value=numpy.zeros(
202 |                 (n_out,),
203 |                 dtype=theano.config.floatX
204 |             ),
205 |             name='b',
206 |             borrow=True
207 |         )
208 | 
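209 | # input is (n_example, n_in) and W is (n_in, n_out); their matrix product is (n_example, n_out), to which the bias b is added.
210 | # The result is passed to T.nnet.softmax to obtain p_y_given_x,
211 | # so each row of p_y_given_x holds one example's estimated probability for every class.
212 | # Note: b is an n_out-dimensional vector added to an (n_example, n_out) matrix; broadcasting effectively repeats b n_example times,
213 | # so b is added to every row of the matrix.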
214 |         self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
215 | 
216 | # argmax returns the index of the largest value; for MNIST that index is the class label. axis=1 takes the argmax within each row, giving one prediction per example.
217 |         self.y_pred = T.argmax(self.p_y_given_x, axis=1)
218 | 
219 | # Parameters of LogisticRegression
220 |         self.params = [self.W, self.b]
221 | 
222 |     def negative_log_likelihood(self, y):  # mean over the batch of -log p(y_i | x_i); the [arange, y] index picks each example's true-class log-probability
223 |         return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
224 | 
225 |     def errors(self, y):
226 |         if y.ndim != self.y_pred.ndim:
227 |             raise TypeError(
228 |                 'y should have the same shape as self.y_pred',
229 |                 ('y', y.type, 'y_pred', self.y_pred.type)
230 |             )
231 |         if y.dtype.startswith('int'):
232 |             return T.mean(T.neq(self.y_pred, y))
233 |         else:
234 |             raise NotImplementedError()
235 | 
236 | 
237 | """
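238 | load_data(): load the MNIST-based datasets.
239 | mnist.pkl splits the 60000 training examples into 50000 training examples and 10000 validation examples;
240 | each set consists of two parts, an image array and a label array. Each row of the image array holds the pixels of one image, 784 values (28×28).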
241 | """
242 | def load_data(dataset_s, dataset_t):
243 |     print '... loading data'
244 |     global resultSource
245 |     file_write = open(resultSource, 'a')
246 |     try:
247 |         file_write.write("... loading data\n" )
248 |     finally :
249 |         file_write.close()
250 |     
251 |     
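252 | # Load train_set, valid_set and test_set (all of them include labels) from the source and target pickle files.
253 | # The original tutorial read "mnist.pkl.gz" through gzip.open(); the files here are opened with plain open() and read with cPickle.load().
254 | # 'rb' opens the file for reading in binary mode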
255 |     f = open(dataset_s, 'rb')
256 |     train_set_source, valid_set_source, test_set_source = cPickle.load(f)
257 |     f.close()
258 |     f = open(dataset_t, 'rb')
259 |     train_set_target, valid_set_target, test_set_target = cPickle.load(f)
260 |     f.close()
261 |    
262 | 
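263 | # Store the data in shared variables, mainly for GPU speed: only shared variables can live in GPU memory.
264 | # Data on the GPU must be float, but data_y holds class labels, so it is cast back to int before being returned.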
265 |     def shared_dataset(data_xy, borrow=True):
266 |         data_x, data_y = data_xy
267 |         shared_x = theano.shared(numpy.asarray(data_x,
268 |                                                dtype=theano.config.floatX),
269 |                                  borrow=borrow)
270 |         shared_y = theano.shared(numpy.asarray(data_y,
271 |                                                dtype=theano.config.floatX),
272 |                                  borrow=borrow)
273 |         return shared_x, T.cast(shared_y, 'int32')
274 | 
275 | 
276 |     def shared_target_dataset(data_xyz, borrow = True):
277 |         data_x, data_y, data_z = data_xyz
278 | 
279 |         shared_x = theano.shared(numpy.asarray(data_x,
280 |                                                dtype=theano.config.floatX),
281 |                                  borrow=borrow)
282 |         shared_y = theano.shared(numpy.asarray(data_y,
283 |                                                dtype=theano.config.floatX),
284 |                                  borrow=borrow)
285 |         return shared_x, T.cast(shared_y, 'int32')
286 | 
287 |     train_set_x, train_set_y = shared_dataset(train_set_source)
288 |     test_target_set_x, test_target_set_y = shared_target_dataset(test_set_target)
289 |     valid_target_set_x, valid_target_set_y = shared_target_dataset(valid_set_target)
290 |     #train_target_set_x, train_target_set_y = shared_target_dataset(target_train_set)
291 | 
292 |     rval = [(train_set_x, train_set_y),
293 |             (valid_target_set_x, valid_target_set_y), (test_target_set_x, test_target_set_y)]
294 |     return rval
295 | 
296 | 
297 | """
298 | LeNet5 implementation.
299 | The network has two convolutional layers: the first has 20 kernels and the second has 50.
300 | """
301 | def evaluate_lenet5(learning_rate=0.1, n_epochs=200,
302 |                     dataset_s=os.path.join(os.path.abspath('..'), 'data/source.pkl'),
303 |                     dataset_t=os.path.join(os.path.abspath('..'), 'data/target0.7.pkl'),
304 |                     nkerns=[20, 50], batch_size=500):
305 |     """ 
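306 |  learning_rate: learning rate, the factor applied to the stochastic gradient.
307 |  n_epochs: number of training epochs; each epoch goes through every batch, i.e. every example.
308 |  batch_size: 500 here, so the gradient is computed and the parameters updated once per 500 examples.
309 |  nkerns=[20, 50]: number of kernels in each LeNetConvPoolLayer; the first LeNetConvPoolLayer has
310 |  20 kernels and the second has 50.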
311 |     """
312 | 
313 |     rng = numpy.random.RandomState(23455)
314 | 
315 |     # Load the data
316 |     datasets = load_data(dataset_s, dataset_t)
317 |     train_set_x, train_set_y = datasets[0]
318 |     valid_target_set_x, valid_target_set_y = datasets[1]
319 |     test_target_set_x, test_target_set_y = datasets[2]
320 | 
321 |     # Compute the number of batches
322 |     n_train_batches = train_set_x.get_value(borrow=True).shape[0]
323 |     n_train_batches /= batch_size
324 |     n_valid_target_batches = valid_target_set_x.get_value(borrow=True).shape[0]
325 |     n_test_target_batches = test_target_set_x.get_value(borrow=True).shape[0]
326 |     n_valid_target_batches /= batch_size
327 |     n_test_target_batches /= batch_size
328 | 
329 | 
330 |     # Symbolic variables: index is the batch index, x the input images, y the corresponding labels
331 |     index = T.lscalar()  
332 |     x = T.matrix('x')   
333 |     y = T.ivector('y') 
334 | 
335 |     ######################
336 |     # BUILD ACTUAL MODEL #
337 |     ######################
338 |     print '... building the model'
339 |     global resultSource
340 |     file_write = open(resultSource, 'a')
341 |     try:
342 |         file_write.write("... building the model\n" )
343 |     finally :
344 |         file_write.close()
345 | 
346 | 
347 | # A loaded batch has shape (batch_size, 28 * 28), but LeNetConvPoolLayer expects 4D input, so reshape it.
348 |     layer0_input = x.reshape((batch_size, 1, 28, 28))
349 | 
350 | # layer0 is the first LeNetConvPoolLayer.
351 | # Each input image is (28, 28); the convolution gives (28-5+1, 28-5+1) = (24, 24),
352 | # and max-pooling gives (24/2, 24/2) = (12, 12).
353 | # Each batch has batch_size images and the first LeNetConvPoolLayer has nkerns[0] kernels,
354 | # so layer0's output is (batch_size, nkerns[0], 12, 12).
355 |     layer0 = LeNetConvPoolLayer(
356 |         rng,
357 |         input=layer0_input,
358 |         image_shape=(batch_size, 1, 28, 28),
359 |         filter_shape=(nkerns[0], 1, 5, 5),
360 |         poolsize=(2, 2)
361 |     )
362 | 
363 | 
364 | # layer1 is the second LeNetConvPoolLayer.
365 | # Its input is layer0's output. Each feature map is (12, 12); the convolution gives (12-5+1, 12-5+1) = (8, 8),
366 | # and max-pooling gives (8/2, 8/2) = (4, 4).
367 | # Each batch has batch_size images and the second LeNetConvPoolLayer has nkerns[1] kernels,
368 | # so layer1's output is (batch_size, nkerns[1], 4, 4).
369 |     layer1 = LeNetConvPoolLayer(
370 |         rng,
371 |         input=layer0.output,
372 |         image_shape=(batch_size, nkerns[0], 12, 12),  # nkerns[0] input feature maps, i.e. the nkerns[0] feature maps output by layer0
373 |         filter_shape=(nkerns[1], nkerns[0], 5, 5),
374 |         poolsize=(2, 2)
375 |     )
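376 | # At this point each feature map of an image has been reduced to 4*4, and each image has nkerns[1] such feature maps,
377 | # so for the fully connected layer an image can be viewed as nkerns[1]*4*4 values: the input of HiddenLayer.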
378 | 
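379 | # With the two LeNetConvPoolLayers (layer0 and layer1) defined, layer1 is followed by layer2, a fully connected layer equivalent to the hidden layer of an MLP,
380 | # so layer2 can be built with the HiddenLayer class defined for the MLP. layer2 takes 2D input of shape (batch_size, num_pixels),
381 | # so the feature maps that different kernels produce for the same image must be concatenated into a single vector:
382 | # layer1's output (batch_size, nkerns[1], 4, 4) is flattened to (batch_size, nkerns[1]*4*4) = (500, 800) and fed to layer2.
383 | # The hidden layer's W is 800*500 (n_in*n_out).
384 | # (500, 800) means 500 examples, one per row. layer2's output has shape (batch_size, n_out) = (500, 500).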
385 |     layer2_input = layer1.output.flatten(2)
386 |     layer2 = HiddenLayer(
387 |         rng,
388 |         input=layer2_input,
389 |         n_in=nkerns[1] * 4 * 4,
390 |         n_out=500,
391 |         activation=T.tanh
392 |     )
393 | 
394 | # The last layer, layer3, is the classifier: the LogisticRegression class defined above.
395 | # Its input is layer2's output (500, 500), and its output is (batch_size, n_out) = (500, 10).
396 |     layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)
397 | 
398 | # Cost function: negative log-likelihood (NLL)
399 |     cost = layer3.negative_log_likelihood(y)
400 | 
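401 | # test_model computes the test error. x and y are bound to concrete data through the given index, then layer3 is evaluated;
402 | # layer3 in turn depends on layer2, layer1 and layer0, so test_model covers the whole CNN.
403 | # test_model takes the batch index as input (x and y are supplied through givens) and returns layer3.errors(y), i.e. the error rate.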
404 |     test_model = theano.function(
405 |         [index],
406 |         layer3.errors(y),
407 |         givens={
408 |             x: test_target_set_x[index * batch_size: (index + 1) * batch_size],
409 |             y: test_target_set_y[index * batch_size: (index + 1) * batch_size]
410 |         }
411 |     )
412 | # validate_model evaluates the model on the validation set; same analysis as above.
413 |     validate_model = theano.function(
414 |         [index],
415 |         layer3.errors(y),
416 |         givens={
417 |             x: valid_target_set_x[index * batch_size: (index + 1) * batch_size],
418 |             y: valid_target_set_y[index * batch_size: (index + 1) * batch_size]
419 |         }
420 |     )
421 | 
422 | # Next comes train_model, which involves the optimizer (SGD): it needs gradients and parameter updates.
423 |     # All parameters of the model
424 |     params = layer3.params + layer2.params + layer1.params + layer0.params
425 | 
426 |     # Gradient of the cost with respect to each parameter
427 |     grads = T.grad(cost, params)
428 | 
429 | # Writing an update rule by hand for every parameter would be tedious, so the comprehension below builds the pairs (param_i, param_i - learning_rate * grad_i) automatically.
430 |     updates = [
431 |         (param_i, param_i - learning_rate * grad_i)
432 |         for param_i, grad_i in zip(params, grads)
433 |     ]
434 | 
435 | # train_model: same analysis as test_model, except that it also applies the updates rules, which test_model and validate_model do not have.
436 |     train_model = theano.function(
437 |         [index],
438 |         cost,
439 |         updates=updates,
440 |         givens={
441 |             x: train_set_x[index * batch_size: (index + 1) * batch_size],
442 |             y: train_set_y[index * batch_size: (index + 1) * batch_size]
443 |         }
444 |     )
445 | 
446 | 
447 |     ###############
448 |     # TRAIN MODEL #
449 |     ###############
450 |     print '... training'
451 |     global resultSource
452 |     file_write = open(resultSource, 'a')
453 |     try:
454 |         file_write.write("... training\n" )
455 |     finally :
456 |         file_write.close()
457 |     patience = 10000  
458 |     patience_increase = 2  
459 |     improvement_threshold = 0.995 
460 |                                    
461 |     validation_frequency = min(n_train_batches, patience / 2)
462 |  # Setting validation_frequency this way guarantees that the validation set is evaluated in every epoch.
463 | 
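464 |     best_validation_loss = numpy.inf   # best (i.e. lowest) loss seen on the validation set
465 |     best_iter = 0                      # iteration, counted in batches, at which best_validation_loss was reached; e.g. best_iter=10000 means it was reached after training on the 10000th batch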
466 |     test_score = 0.
467 |     start_time = time.time()
468 | 
469 |     epoch = 0
470 |     done_looping = False
471 | 
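472 | # The training loop. The while loop steps through epochs; one epoch goes through every batch, i.e. every image.
473 | # The for loop goes through the batches one at a time, calling train_model(minibatch_index) on each;
474 | # the updates inside train_model update every parameter.
475 | # iter counts the batches trained so far. Whenever iter + 1 is a multiple of validation_frequency, the model is evaluated on the validation set.
476 | # If the validation loss this_validation_loss is lower than the best so far, best_validation_loss,
477 | # best_validation_loss and best_iter are updated and the model is also evaluated on the test set.
478 | # If this_validation_loss is lower than best_validation_loss*improvement_threshold, patience is increased as well.
479 | # Training stops when n_epochs is reached or when patience <= iter.
480 | # iter is the number of batches trained so far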
481 |     while (epoch < n_epochs) and (not done_looping):
482 |         epoch = epoch + 1
483 |         for minibatch_index in xrange(n_train_batches):
484 | 
485 |             iter = (epoch - 1) * n_train_batches + minibatch_index
486 | 
487 |             if iter % 100 == 0:
488 |                 print 'training @ iter = ', iter
489 |                 global resultSource
490 |                 file_write = open(resultSource, 'a')
491 |                 try:
492 |                     file_write.write("training @ iter = " + str(iter) + '\n')
493 |                 finally :
494 |                     file_write.close()
495 |             cost_ij = train_model(minibatch_index)  
496 | # cost_ij is not used afterwards; it only receives the value that train_model returns.
497 |             if (iter + 1) % validation_frequency == 0:
498 | 
499 |                 # compute zero-one loss on validation set
500 |                 validation_losses = [validate_model(i) for i
501 |                                      in xrange(n_valid_target_batches)]
502 |                 this_validation_loss = numpy.mean(validation_losses)
503 |                 print('epoch %i, minibatch %i/%i, validation error %f %%' %
504 |                       (epoch, minibatch_index + 1, n_train_batches,
505 |                        this_validation_loss * 100.))
506 |                 global resultSource
507 |                 file_write = open(resultSource, 'a')
508 |                 try:
509 |                     file_write.write(str('epoch %i, minibatch %i/%i, validation error %f %%' %
510 |                       (epoch, minibatch_index + 1, n_train_batches,
511 |                        this_validation_loss * 100.))  + '\n')
512 |                 finally :
513 |                     file_write.close()
514 | 
515 |  
516 |                 if this_validation_loss < best_validation_loss:
517 | 
518 |                     
519 |                     if this_validation_loss < best_validation_loss *  \
520 |                        improvement_threshold:
521 |                         patience = max(patience, iter * patience_increase)
522 | 
523 |                     
524 |                     best_validation_loss = this_validation_loss
525 |                     best_iter = iter
526 | 
527 |                    
528 |                     test_losses = [
529 |                         test_model(i)
530 |                         for i in xrange(n_test_target_batches)
531 |                     ]
532 |                     test_score = numpy.mean(test_losses)
533 |                     print(('     epoch %i, minibatch %i/%i, test error of '
534 |                            'best model %f %%') %
535 |                           (epoch, minibatch_index + 1, n_train_batches,
536 |                            test_score * 100.))
537 |                     global resultSource
538 |                     file_write = open(resultSource, 'a')
539 |                     try:
540 |                         file_write.write(str(('     epoch %i, minibatch %i/%i, test error of '
541 |                            'best model %f %%') %
542 |                           (epoch, minibatch_index + 1, n_train_batches,
543 |                            test_score * 100.)) + '\n')
544 |                     finally :
545 |                         file_write.close()
546 | 
547 |             if patience <= iter:
548 |                 done_looping = True
549 |                 break
550 | 
551 |     end_time = time.time()
552 |     print('Optimization complete.')
553 |     print('Best validation score of %f %% obtained at iteration %i, '
554 |           'with test performance %f %%' %
555 |           (best_validation_loss * 100., best_iter + 1, test_score * 100.))
556 |     print >> sys.stderr, ('The code for file ' +
557 |                           os.path.split(__file__)[1] +
558 |                           ' ran for %.2fm' % ((end_time - start_time) / 60.))
559 |     global resultSource
560 |     file_write = open(resultSource, 'a')
561 |     try:
562 |         file_write.write(str('Optimization complete.\n'))
563 |         file_write.write(str('Best validation score of %f %% obtained at iteration %i, '
564 |           'with test performance %f %%' %
565 |           (best_validation_loss * 100., best_iter + 1, test_score * 100.)) + '\n')
566 |         file_write.write(str('The code for file ' +
567 |                           os.path.split(__file__)[1] +
568 |                           ' ran for %.2fm' % ((end_time - start_time) / 60.)) + '\n')
569 |     finally :
570 |         file_write.close()
571 | 
572 | if __name__ == '__main__':
573 |     evaluate_lenet5()
574 | 
575 | 
576 | def experiment(state, channel):
577 |     evaluate_lenet5(state.learning_rate, dataset_s=state.dataset_s, dataset_t=state.dataset_t)
578 | 
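The layer-shape arithmetic described in the comments of evaluate_lenet5 (28 -> 24 -> 12 -> 8 -> 4, then a flattened input of 800 values) can be checked with a short standalone sketch. It assumes 5x5 filters, a "valid" convolution and non-overlapping 2x2 pooling with ignore_border=True, as in the code above. The helper names conv_pool_shape and lenet_shapes are hypothetical and are not part of this repository; the sketch uses plain Python and does not depend on Theano.

```python
def conv_pool_shape(size, filter_size=5, pool=2):
    """Side length after a 'valid' convolution followed by non-overlapping max-pooling."""
    conv = size - filter_size + 1      # e.g. 28 - 5 + 1 = 24
    return conv // pool                # ignore_border=True drops any remainder


def lenet_shapes(image=28, nkerns=(20, 50), batch_size=500):
    """Return the tensor shapes produced by layer0, layer1 and the flatten step."""
    s0 = conv_pool_shape(image)        # side of each layer0 feature map
    s1 = conv_pool_shape(s0)           # side of each layer1 feature map
    flat = nkerns[1] * s1 * s1         # n_in of the fully connected hidden layer
    return {
        "layer0": (batch_size, nkerns[0], s0, s0),
        "layer1": (batch_size, nkerns[1], s1, s1),
        "layer2_input": (batch_size, flat),
    }


if __name__ == "__main__":
    for name, shape in sorted(lenet_shapes().items()):
        print(name, shape)
```

With the default arguments the flattened width matches the n_in=nkerns[1] * 4 * 4 passed to HiddenLayer; changing the image size or filter size in the sketch shows how n_in would have to change with it.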


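The patience-based early stopping used in the training loop above can be replayed on a list of precomputed validation losses, which makes the stopping rule easier to follow than inside the full Theano loop. This is only a sketch under stated assumptions: the function early_stopping and its return tuple are hypothetical, the losses are supplied by the caller instead of being measured, and the default values simply mirror the constants in the code above.

```python
def early_stopping(validation_losses, patience=10000, patience_increase=2,
                   improvement_threshold=0.995, validation_frequency=100):
    """Replay the patience rule on precomputed validation losses.

    validation_losses[k] is the loss of the k-th validation pass, which
    happens every validation_frequency iterations (batches).
    Returns (best_loss, best_iter, last_iter).
    """
    best_loss = float("inf")
    best_iter = 0
    n_iters = len(validation_losses) * validation_frequency
    for it in range(n_iters):
        if (it + 1) % validation_frequency == 0:
            loss = validation_losses[(it + 1) // validation_frequency - 1]
            if loss < best_loss:
                # A sufficiently large improvement extends the patience.
                if loss < best_loss * improvement_threshold:
                    patience = max(patience, it * patience_increase)
                best_loss = loss
                best_iter = it
        if patience <= it:
            return best_loss, best_iter, it   # stopped early
    return best_loss, best_iter, n_iters - 1  # ran through all iterations


if __name__ == "__main__":
    print(early_stopping([0.5, 0.4, 0.4, 0.4, 0.4, 0.4], patience=250))
```

In the example call, the improvement at iteration 199 raises patience to 398, and since no later validation pass improves on it, the loop stops at iteration 398 instead of running all 600 iterations.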
--------------------------------------------------------------------------------
/SRC/cnn_tst_tt.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
  3 | @author：wepon
  4 | @blog：http://blog.csdn.net/u012162613/article/details/43225445
  5 | 
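  6 | This program is based on LeNet5 with the following simplifications:
  7 | - location-specific gain and bias parameters are not implemented
  8 | - max-pooling is used instead of average pooling
  9 | - the classifier is softmax, whereas LeNet5 uses RBF
 10 | - the second layer of LeNet5 is not fully connected; here it is
 11 | Reference: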
 12 | - Y. LeCun, L. Bottou, Y. Bengio and P. Haffner:
 13 |    Gradient-Based Learning Applied to Document
 14 |    Recognition, Proceedings of the IEEE, 86(11):2278-2324, November 1998.
 15 |    http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf
 16 | 
 17 | """
 18 | import cPickle
 19 | import gzip
 20 | import os
 21 | import sys
 22 | import time
 23 | import random
 24 | import numpy
 25 | 
 26 | import theano
 27 | import theano.tensor as T
 28 | from theano.tensor.signal import downsample
 29 | from theano.tensor.nnet import conv
 30 | from PIL import Image
 31 | from scipy.misc import imsave
 32 | 
 33 | 
 34 | resultSource = os.path.join(os.path.abspath('..'), 'result/tst_tt1.txt')
 35 | file_w = open(resultSource, 'a')
 36 | try:
 37 |     file_w.write("lmbda is 0.25\ntheta is 0.7\n")
 38 |     file_w.write("modify the time result\n")
 39 | finally:
 40 |     file_w.close()
 41 | 
 42 | """
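 43 | LeNetConvPoolLayer: a convolution layer and a subsampling (pooling) layer combined into one layer.
 44 | W is the weight tensor, i.e. the convolution kernels; its shape is filter_shape.
 45 | rng: random number generator used to initialize W
 46 | input: 4D tensor, theano.tensor.dtensor4
 47 | filter_shape: (number of filters, num input feature maps, filter height, filter width), e.g. 5*5 filters
 48 |              num input feature maps: number of feature maps coming into the layer
 49 |              number of filters: number of filters, which is also the number of output feature maps
 50 | image_shape: (batch size, num input feature maps, image height, image width), the shape of the input
 51 |             batch size: number of images in one batch
 52 | poolsize: (#rows, #cols), the pooling window; max-pooling keeps the largest value in each window
 53 | Trainable parameters from the previous layer to the convolutional layer: number of filters * (weights per filter + 1 bias),
 54 |                         e.g. 6 filters of size 5*5 on a single input map give 6*(5*5+1) parameters.
 55 | num output feature maps: the number of feature maps passed from the convolutional layer to the subsampling layer, equal to the number of filters.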
 56 | """
 57 | class LeNetConvPoolLayer(object):
 58 |     def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
 59 |   
 60 | # assert condition: execution continues if the condition is True and stops with an AssertionError if it is False.
 61 | # image_shape[1] and filter_shape[1] are both num input feature maps, so they must be equal.
 62 |         assert image_shape[1] == filter_shape[1]
 63 |         self.input = input
 64 | 
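 65 | # Each hidden unit (i.e. each output pixel) is connected to num input feature maps * filter height * filter width units of the previous layer.
 66 | # This count is numpy.prod(filter_shape[1:]); prod multiplies all the entries together.
 67 | # fan_in: number of connections from the previous layer into the convolutional layer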
 68 |         fan_in = numpy.prod(filter_shape[1:])
 69 | 
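 70 | # Each unit in the lower layer receives gradient from "num output feature maps * filter height * filter width" / pooling size units.
 71 | # fan_out: number of connections between the convolutional layer and the subsampling layer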
 72 |         fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /  
 73 |                    numpy.prod(poolsize))   # open question from the author: should this be multiplied by 2, since each filter has a single W and a single b?
 74 |                    
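 75 | # With fan_in and fan_out computed, plug them into the formula below to randomly initialize W, the linear convolution kernel.
 76 | # W starts out small. The rule for tanh units is to sample uniformly from [-sqrt(6./(fan_in+fan_out)), sqrt(6./(fan_in+fan_out))].
 77 | # numpy.random.uniform(low, high, size) samples from the half-open interval [low, high) and returns an array of shape size.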
 78 |         W_bound = numpy.sqrt(6. / (fan_in + fan_out))  
 79 |         self.W = theano.shared(
 80 |             numpy.asarray(
 81 |                 rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
 82 |                 dtype=theano.config.floatX
 83 |             ),
 84 |             borrow=True
 85 |         )
 86 | 
 87 | # the bias is a 1D tensor -- one bias per output feature map
 88 | # b is a 1D vector with one bias per output feature map, initialized to zeros.
 89 | # The number of output feature maps equals the number of filters, hence the length filter_shape[0].
 90 |         b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
 91 |         self.b = theano.shared(value=b_values, borrow=True)
 92 | 
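 93 | # Convolve the input images with the filters using conv.conv2d.
 94 | # Simplification: the bias and the nonlinearity are not applied right after the convolution; they are applied after pooling instead.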
 95 |         conv_out = conv.conv2d(
 96 |             input=input,
 97 |             filters=self.W,
 98 |             filter_shape=filter_shape,
 99 |             image_shape=image_shape
100 |         )
101 | 
102 | # Max-pooling (the subsampling step)
103 |         pooled_out = downsample.max_pool_2d(
104 |             input=conv_out,
105 |             ds=poolsize,
106 |             ignore_border=True
107 |         )
108 | 
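109 | # Add the bias, then apply tanh to get the final output of the conv + pool layer.
110 | # b is a 1D vector, so dimshuffle is used to make it broadcastable. For example, if b has shape (10,),
111 | # b.dimshuffle('x', 0, 'x', 'x') gives it shape (1, 10, 1, 1).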
112 |         self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))
113 | # Parameters of the conv + pool layer
114 |         self.params = [self.W, self.b]
115 | 
116 | 
117 | 
118 | 
119 | 
120 | """
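121 | Notes:
122 | This class defines a hidden layer. Its input is `input`, and its output has one entry per hidden unit. The input layer and the hidden layer are fully connected.
123 | If the input is an n_in-dimensional vector (i.e. n_in units) and the hidden layer has n_out units, full connectivity gives
124 | n_in*n_out weights, so W has shape (n_in, n_out): n_in rows and n_out columns, with each column holding the incoming weights of one hidden unit.
125 | b is the bias; the hidden layer has n_out units, so b is an n_out-dimensional vector.
126 | rng is the random number generator, a numpy.random.RandomState, used to initialize W.
127 | input holds all the inputs fed to the model, not a single MLP input layer. The MLP input layer has n_in units,
128 |     whereas input here has shape (n_example, n_in): one example per row, and each row acts as the MLP input layer.
129 | activation: the activation function, tanh by default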
130 | """
131 | class HiddenLayer(object):
132 |     def __init__(self, rng, input, n_in, n_out, W=None, b=None, activation=T.tanh):
133 |          self.input = input   # keep a reference to the input that was passed in
134 | 
135 |          """
136 |          Notes:
137 |          To run on the GPU, the values must use dtype=theano.config.floatX and be stored in a theano.shared variable.
138 |          Initialization rule for W: with tanh, sample uniformly from [-sqrt(6./(n_in+n_out)), sqrt(6./(n_in+n_out))];
139 |          with sigmoid, multiply those values by 4.
140 |          """
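141 |          # If W is not supplied, initialize it with the rule above.
142 |          # The check exists because W can also be initialized from previously trained parameters (see the original author's previous blog post).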
143 |          if W is None:
144 |             W_values = numpy.asarray(
145 |                 rng.uniform(
146 |                     low=-numpy.sqrt(6. / (n_in + n_out)),
147 |                     high=numpy.sqrt(6. / (n_in + n_out)),
148 |                     size=(n_in, n_out)
149 |                 ),
150 |                 dtype=theano.config.floatX
151 |             )
152 |             if activation == theano.tensor.nnet.sigmoid:
153 |                 W_values *= 4
154 |             W = theano.shared(value=W_values, name='W', borrow=True)
155 | 
156 |          if b is None:
157 |             b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
158 |             b = theano.shared(value=b_values, name='b', borrow=True)
159 | 
160 |          # Store W and b (passed in or created above) on the HiddenLayer instance
161 |          self.W = W
162 |          self.b = b
163 | 
164 |         # Output of the hidden layer
165 |          lin_output = T.dot(input, self.W) + self.b
166 |          self.output = (
167 |             lin_output if activation is None
168 |             else activation(lin_output)
169 |          )
170 | 
171 |         # Parameters of the hidden layer
172 |          self.params = [self.W, self.b]
173 | 
174 | 
175 | 
176 | """
177 | Classification layer LogisticRegression, i.e. softmax regression.
178 | The Deep Learning Tutorials treat LogisticRegression as softmax regression;
179 | the familiar two-class logistic regression is the special case n_out=2.
180 | """
181 | # Parameters:
182 | # input: shape (n_example, n_in), where n_example is the size of one batch,
183 | # because training uses minibatch SGD
184 | # n_in: size of the output of the previous (hidden) layer
185 | # n_out: number of output classes
186 | class LogisticRegression(object):
187 |     def __init__(self, input, n_in, n_out):
188 | 
189 | # W has n_in rows and n_out columns; b is an n_out-dimensional vector. Each output unit corresponds to one column of W and one entry of b.
190 |         self.W = theano.shared(
191 |             value=numpy.zeros(
192 |                 (n_in, n_out),
193 |                 dtype=theano.config.floatX
194 |             ),
195 |             name='W',
196 |             borrow=True
197 |         )
198 | 
199 |         self.b = theano.shared(
200 |             value=numpy.zeros(
201 |                 (n_out,),
202 |                 dtype=theano.config.floatX
203 |             ),
204 |             name='b',
205 |             borrow=True
206 |         )
207 | 
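208 | # input is (n_example, n_in) and W is (n_in, n_out); their matrix product is (n_example, n_out), to which the bias b is added.
209 | # The result is passed to T.nnet.softmax to obtain p_y_given_x,
210 | # so each row of p_y_given_x holds one example's estimated probability for every class.
211 | # Note: b is an n_out-dimensional vector added to an (n_example, n_out) matrix; broadcasting effectively repeats b n_example times,
212 | # so b is added to every row of the matrix.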
213 |         self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
214 | 
215 | # argmax returns the index of the largest value; for MNIST that index is the class label. axis=1 takes the argmax within each row, giving one prediction per example.
216 |         self.y_pred = T.argmax(self.p_y_given_x, axis=1)
217 | 
218 | # Parameters of LogisticRegression
219 |         self.params = [self.W, self.b]
220 | 
221 |     def negative_log_likelihood(self, y):  # mean over the batch of -log p(y_i | x_i); the [arange, y] index picks each example's true-class log-probability
222 |         return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
223 | 
224 |     def errors(self, y):
225 |         if y.ndim != self.y_pred.ndim:
226 |             raise TypeError(
227 |                 'y should have the same shape as self.y_pred',
228 |                 ('y', y.type, 'y_pred', self.y_pred.type)
229 |             )
230 |         if y.dtype.startswith('int'):
231 |             return T.mean(T.neq(self.y_pred, y))
232 |         else:
233 |             raise NotImplementedError()
234 | 
235 | 
236 | """
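237 | load_data(): load the MNIST-based datasets.
238 | mnist.pkl splits the 60000 training examples into 50000 training examples and 10000 validation examples;
239 | each set consists of two parts, an image array and a label array. Each row of the image array holds the pixels of one image, 784 values (28×28).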
240 | """
241 | def load_data(dataset_s, dataset_t, dataset_st):
242 |     print '... loading data'
243 |     global resultSource
244 |     file_write = open(resultSource, 'a')
245 |     try:
246 |         file_write.write("... loading data\n" )
247 |     finally :
248 |         file_write.close()
249 |     
250 |     
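251 | # Load train_set, valid_set and test_set (all of them include labels) from the source, target and st pickle files.
252 | # The original tutorial read "mnist.pkl.gz" through gzip.open(); the files here are opened with plain open() and read with cPickle.load().
253 | # 'rb' opens the file for reading in binary mode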
254 |     f = open(dataset_s, 'rb')
255 |     train_set_source, valid_set_source, test_set_source = cPickle.load(f)
256 |     f.close()
257 |     f = open(dataset_t, 'rb')
258 |     train_set_target, valid_set_target, test_set_target = cPickle.load(f)
259 |     f.close()
260 |     f = open(dataset_st, 'rb')
261 |     train_set_st, valid_set_st, test_set_st = cPickle.load(f)
262 |     f.close()
263 |    
264 | 
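265 | # Store the data in shared variables, mainly for GPU speed: only shared variables can live in GPU memory.
266 | # Data on the GPU must be float, but data_y holds class labels, so it is cast back to int before being returned.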
267 |     def shared_dataset(data_xy, borrow=True):
268 |         data_x, data_y = data_xy
269 |         shared_x = theano.shared(numpy.asarray(data_x,
270 |                                                dtype=theano.config.floatX),
271 |                                  borrow=borrow)
272 |         shared_y = theano.shared(numpy.asarray(data_y,
273 |                                                dtype=theano.config.floatX),
274 |                                  borrow=borrow)
275 |         return shared_x, T.cast(shared_y, 'int32')  
276 | 
277 |     def shared_target_dataset(data_xyz, borrow = True):
278 |         data_x, data_y, data_z = data_xyz
279 | 
280 |         shared_x = theano.shared(numpy.asarray(data_x,
281 |                                                dtype=theano.config.floatX),
282 |                                  borrow=borrow)
283 |         shared_y = theano.shared(numpy.asarray(data_y,
284 |                                                dtype=theano.config.floatX),
285 |                                  borrow=borrow)
286 |         return shared_x, T.cast(shared_y, 'int32')
287 | 
288 |     def shared_st_dataset(data_xyz, borrow = True):
289 |         data_x, data_y, data_z = data_xyz
290 | 
291 |         shared_x = theano.shared(numpy.asarray(data_x,
292 |                                                dtype=theano.config.floatX),
293 |                                  borrow=borrow)
294 |         shared_y = theano.shared(numpy.asarray(data_y,
295 |                                                dtype=theano.config.floatX),
296 |                                  borrow=borrow)
297 |         shared_z = theano.shared(numpy.asarray(data_z,
298 |                                                dtype=theano.config.floatX),
299 |                                  borrow=borrow)
300 |         return shared_x, T.cast(shared_y, 'int32'), T.cast(shared_z, 'int32')
301 | 
302 | 
303 |     train_set_x, train_set_y = shared_dataset(train_set_source)
304 |     test_target_set_x, test_target_set_y = shared_target_dataset(test_set_target)
305 |     valid_target_set_x, valid_target_set_y = shared_target_dataset(valid_set_target)
306 |     train_st_set_x, train_st_set_y, train_st_set_z = shared_st_dataset(train_set_st)
307 | 
308 |     rval = [(train_set_x, train_set_y), 
309 |             (valid_target_set_x, valid_target_set_y), 
310 |             (test_target_set_x, test_target_set_y),
311 |             (train_st_set_x, train_st_set_y, train_st_set_z)]
312 |     return rval
313 | 
314 | 
315 | """
316 | Implementation of LeNet5.
317 | This variant has two convolutional layers: the first has 20 kernels and the second has 50.
318 | """
319 | def evaluate_lenet5(learning_rate=0.1, n_epochs=200,
320 |                     dataset_s=os.path.join(os.path.abspath('..'), 'data/source.pkl'),
321 |                     dataset_t=os.path.join(os.path.abspath('..'), 'data/target0.7.pkl'),
322 |                     dataset_st=os.path.join(os.path.abspath('..'), 'data/st0.7.pkl'),
323 |                     nkerns=[20, 50], batch_size=500, lmbda=0.25):
324 |     """ 
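325 |  learning_rate: learning rate, the coefficient applied to the stochastic gradient.
326 |  n_epochs: number of training epochs; each epoch iterates over every batch, i.e. all samples.
327 |  batch_size: set to 500 here, so the gradient is computed and the parameters updated once per 500 samples.
328 |  nkerns=[20, 50]: number of kernels in each LeNetConvPoolLayer; the first LeNetConvPoolLayer has
329 |  20 kernels and the second has 50.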
330 |     """
331 | 
332 |     rng = numpy.random.RandomState(23455)
333 | 
334 |     # load the data
335 |     datasets = load_data(dataset_s, dataset_t, dataset_st)
336 |     train_set_x, train_set_y = datasets[0]
337 |     valid_target_set_x, valid_target_set_y = datasets[1]
338 |     test_target_set_x, test_target_set_y = datasets[2]
339 |     train_st_set_x, train_st_set_y, train_st_set_z = datasets[3]
340 | 
341 |     # compute the number of batches
342 |     n_train_batches = train_set_x.get_value(borrow=True).shape[0]
343 |     n_valid_batches = valid_target_set_x.get_value(borrow=True).shape[0]
344 |     n_test_batches = test_target_set_x.get_value(borrow=True).shape[0]
345 |     n_train_batches //= batch_size
346 |     n_valid_batches //= batch_size
347 |     n_test_batches //= batch_size
348 | 
349 | 
350 |     # define the symbolic variables: index is the batch index, x the input data, y the class labels, z the domain labels
351 |     index = T.lscalar()  
352 |     x = T.matrix('x')   
353 |     y = T.ivector('y') 
354 |     z = T.ivector('z') 
355 | 
356 |     ######################
357 |     # BUILD ACTUAL MODEL #
358 |     ######################
359 |     print '... building the model'
360 |     global resultSource
361 |     file_write = open(resultSource, 'a')
362 |     try:
363 |         file_write.write("... building the model\n" )
364 |     finally :
365 |         file_write.close()
366 | 
367 | 
368 | # each loaded batch has shape (batch_size, 28 * 28), but LeNetConvPoolLayer expects a 4D input, so reshape it
369 |     layer0_input = x.reshape((batch_size, 1, 28, 28))
370 | 
371 | # layer0 is the first LeNetConvPoolLayer
372 | # each input image is (28, 28); the convolution gives (28-5+1, 28-5+1) = (24, 24),
373 | # and max pooling gives (24/2, 24/2) = (12, 12)
374 | # each batch holds batch_size images and the first LeNetConvPoolLayer has nkerns[0] kernels,
375 | # so the output of layer0 is (batch_size, nkerns[0], 12, 12)
376 |     layer0 = LeNetConvPoolLayer(
377 |         rng,
378 |         input=layer0_input,
379 |         image_shape=(batch_size, 1, 28, 28),
380 |         filter_shape=(nkerns[0], 1, 5, 5),
381 |         poolsize=(2, 2)
382 |     )
383 | 
384 | 
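385 | # layer1 is the second LeNetConvPoolLayer
386 | # its input is the output of layer0; each feature map is (12, 12), and the convolution gives (12-5+1, 12-5+1) = (8, 8),
387 | # and max pooling gives (8/2, 8/2) = (4, 4)
388 | # each batch holds batch_size images (feature maps) and the second LeNetConvPoolLayer has nkerns[1] kernels,
389 | # so the output of layer1 is (batch_size, nkerns[1], 4, 4)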
390 |     layer1 = LeNetConvPoolLayer(
391 |         rng,
392 |         input=layer0.output,
393 |         image_shape=(batch_size, nkerns[0], 12, 12),  # nkerns[0] input feature maps, i.e. the nkerns[0] maps output by layer0
394 |         filter_shape=(nkerns[1], nkerns[0], 5, 5),
395 |         poolsize=(2, 2)
396 |     )
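397 | # at this point each feature of an image has been reduced to a 4*4 map, and each image has nkerns[1] such maps,
398 | # so the fully connected layer sees each image as nkerns[1]*4*4 values: the input of HiddenLayer
399 | 
400 | # with the two LeNetConvPoolLayers (layer0 and layer1) defined, layer1 is followed by layer2, a fully connected layer equivalent to the hidden layer of an MLP,
401 | # so layer2 can be built with the HiddenLayer class from the MLP. Its input is 2D, (batch_size, num_pixels),
402 | # so the feature maps that different kernels produce for the same image must be merged into one vector:
403 | # the output of layer1, (batch_size, nkerns[1], 4, 4), is flattened to (batch_size, nkerns[1]*4*4) = (500, 800) as the input of layer2.
404 | # the hidden layer's W is 800*500 (n_in*n_out)
405 | # (500, 800) means 500 samples, one per row. The output of layer2 is (batch_size, n_out) = (500, 500)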
406 |     layer2_input = layer1.output.flatten(2)
407 |     layer2 = HiddenLayer(
408 |         rng,
409 |         input=layer2_input,
410 |         n_in=nkerns[1] * 4 * 4,
411 |         n_out=500,
412 |         activation=T.tanh
413 |     )
414 | 
415 | # layer3 is the label classification layer, built with the LogisticRegression class;
416 | # its input is the output of layer2, (500, 500), and its output is (batch_size, n_out) = (500, 10)
417 |     layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)
418 | # layer4 is the domain classification layer, also built with the LogisticRegression class;
419 | # its input is the output of layer2, (500, 500), and its output is (batch_size, n_out) = (500, 2)
420 |     layer4 = LogisticRegression(input=layer2.output, n_in=500, n_out=2)
421 | 
422 | # cost functions (NLL): cost_L for the label classifier, cost_D for the domain classifier
423 |     cost_L = layer3.negative_log_likelihood(y)
424 |     cost_D = layer4.negative_log_likelihood(z)
425 | 
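426 | # test_model computes the test error. x and y are bound to the batch selected by index, and layer3 is evaluated;
427 | # layer3 in turn depends on layer2, layer1 and layer0, so test_model covers the whole CNN.
428 | # test_model takes the batch index as input and returns layer3.errors(y), i.e. the error rate.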
429 |     test_model = theano.function(
430 |         [index],
431 |         layer3.errors(y),
432 |         givens={
433 |             x: test_target_set_x[index * batch_size: (index + 1) * batch_size],
434 |             y: test_target_set_y[index * batch_size: (index + 1) * batch_size]
435 |         }
436 |     )
437 | # validate_model evaluates the model on the validation set; the analysis is the same as above.
438 |     validate_model = theano.function(
439 |         [index],
440 |         layer3.errors(y),
441 |         givens={
442 |             x: valid_target_set_x[index * batch_size: (index + 1) * batch_size],
443 |             y: valid_target_set_y[index * batch_size: (index + 1) * batch_size]
444 |         }
445 |     )
446 | 
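447 | # next come the training functions, which involve the optimizer (SGD): compute the gradients and update the parameters
448 |     # parameter sets: params_L for the label loss, params_D2D for the domain classifier, params_D2F for the feature layers under the domain loss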
449 |     params_L = layer3.params + layer2.params + layer1.params + layer0.params
450 |     params_D2D = layer4.params 
451 |     params_D2F = layer2.params + layer1.params + layer0.params
452 | 
453 |     # gradients with respect to each parameter
454 |     grads_L = T.grad(cost_L, params_L)
455 |     grads_D2D = T.grad(cost_D, params_D2D)
456 |     grads_D2F = T.grad(cost_D, params_D2F)
457 | 
458 | # there are too many parameters to write each update rule by hand, so the loops below build the (param_i, param_i - learning_rate * grad_i) pairs automatically
459 |     updates_L = [
460 |         (param_i, param_i - learning_rate * grad_i)
461 |         for param_i, grad_i in zip(params_L, grads_L)
462 |     ]
463 |     updates_D = []
464 |     for param_i, grad_i in zip(params_D2D, grads_D2D):
465 |         updates_D.append((param_i, param_i - learning_rate * grad_i))  # domain classifier: descend on cost_D
466 |     for param_i, grad_i in zip(params_D2F, grads_D2F):
467 |         updates_D.append((param_i, param_i + learning_rate * grad_i * lmbda))  # feature layers: ascend on cost_D, scaled by lmbda (gradient reversal)
468 | 
469 | # train_model_L and train_model_D: same analysis as test_model, except that the training functions also apply the update rules
470 |     train_model_L = theano.function(
471 |         [index],
472 |         cost_L,
473 |         updates=updates_L,
474 |         givens={
475 |             x: train_set_x[index * batch_size: (index + 1) * batch_size],
476 |             y: train_set_y[index * batch_size: (index + 1) * batch_size]
477 |         }
478 |     )
479 | 
480 |     train_model_D = theano.function(
481 |         [index],
482 |         cost_D,
483 |         updates=updates_D,
484 |         givens={
485 |             x: train_st_set_x[index * batch_size: (index + 1) * batch_size],
486 |             z: train_st_set_z[index * batch_size: (index + 1) * batch_size],
487 |         }
488 |     )
489 | 
490 | 
491 |     ###############
492 |     # TRAIN MODEL #
493 |     ###############
494 |     print '... training'
495 |     global resultSource
496 |     file_write = open(resultSource, 'a')
497 |     try:
498 |         file_write.write("... training\n" )
499 |     finally :
500 |         file_write.close()
501 |     patience = 10000  
502 |     patience_increase = 2  
503 |     improvement_threshold = 0.995 
504 |                                    
505 |     validation_frequency = min(n_train_batches, patience / 2)
506 |  # setting validation_frequency this way guarantees that the validation set is checked in every epoch.
507 | 
508 |     best_validation_loss = numpy.inf   # best (i.e. lowest) loss on the validation set
509 |     best_iter = 0                      # iteration with the best loss, counted in batches; e.g. best_iter=10000 means best_validation_loss was reached after training batch 10000
510 |     test_score = 0.
511 |     start_time = time.time()
512 | 
513 |     epoch = 0
514 |     done_looping = False
515 | 
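516 | # the training loop: the while loop counts epochs, and one epoch iterates over every batch, i.e. all images.
517 | # the for loop walks through the batches and trains on one batch at a time by calling train_model_L and train_model_D with minibatch_index;
518 | # the updates in those functions modify the parameters.
519 | # the for loop tracks the number of trained batches in iter; whenever iter + 1 is a multiple of validation_frequency the model is evaluated on the validation set.
520 | # if the validation loss this_validation_loss is lower than the best loss so far, best_validation_loss,
521 | # then best_validation_loss and best_iter are updated and the model is evaluated on the test set.
522 | # if this_validation_loss is also lower than best_validation_loss*improvement_threshold, patience is extended.
523 | # training ends when the maximum number of epochs n_epochs is reached or when patience <= iter
524 | # iter is the number of batches trained so far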
525 |     while (epoch < n_epochs) and (not done_looping):
526 |         epoch = epoch + 1
527 |         for minibatch_index in xrange(n_train_batches):
528 | 
529 |             iter = (epoch - 1) * n_train_batches + minibatch_index
530 | 
531 |             if iter % 100 == 0:
532 |                 print 'training @ iter = ', iter
533 |                 global resultSource
534 |                 file_write = open(resultSource, 'a')
535 |                 try:
536 |                     file_write.write("training @ iter = " + str(iter) + '\n')
537 |                 finally :
538 |                     file_write.close()
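539 |             cost_ij = train_model_L(minibatch_index)  # label loss on one source batch
540 |             cost_ij = train_model_D(minibatch_index)  # domain loss on one batch of the st set
541 |             cost_ij = train_model_D(minibatch_index + n_train_batches)  # domain loss on a batch from the second half of the st set (assumes it holds at least 2 * n_train_batches batches)
542 | # cost_ij is not used afterwards; it only holds the return value of the training functions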
543 |             if (iter + 1) % validation_frequency == 0:
544 | 
545 |                 # compute zero-one loss on validation set
546 |                 validation_losses = [validate_model(i) for i
547 |                                      in xrange(n_valid_batches)]
548 |                 this_validation_loss = numpy.mean(validation_losses)
549 |                 print('epoch %i, minibatch %i/%i, validation error %f %%' %
550 |                       (epoch, minibatch_index + 1, n_train_batches,
551 |                        this_validation_loss * 100.))
552 |                 global resultSource
553 |                 file_write = open(resultSource, 'a')
554 |                 try:
555 |                     file_write.write(str('epoch %i, minibatch %i/%i, validation error %f %%' %
556 |                       (epoch, minibatch_index + 1, n_train_batches,
557 |                        this_validation_loss * 100.))  + '\n')
558 |                 finally :
559 |                     file_write.close()
560 | 
561 |  
562 |                 if this_validation_loss < best_validation_loss:
563 | 
564 |                     
565 |                     if this_validation_loss < best_validation_loss *  \
566 |                        improvement_threshold:
567 |                         patience = max(patience, iter * patience_increase)
568 | 
569 |                     
570 |                     best_validation_loss = this_validation_loss
571 |                     best_iter = iter
572 | 
573 |                    
574 |                     test_losses = [
575 |                         test_model(i)
576 |                         for i in xrange(n_test_batches)
577 |                     ]
578 |                     test_score = numpy.mean(test_losses)
579 |                     print(('     epoch %i, minibatch %i/%i, test error of '
580 |                            'best model %f %%') %
581 |                           (epoch, minibatch_index + 1, n_train_batches,
582 |                            test_score * 100.))
583 |                     global resultSource
584 |                     file_write = open(resultSource, 'a')
585 |                     try:
586 |                         file_write.write(str(('     epoch %i, minibatch %i/%i, test error of '
587 |                            'best model %f %%') %
588 |                           (epoch, minibatch_index + 1, n_train_batches,
589 |                            test_score * 100.)) + '\n')
590 |                     finally :
591 |                         file_write.close()
592 | 
593 |             if patience <= iter:
594 |                 done_looping = True
595 |                 break
596 | 
597 |     end_time = time.time()
598 |     print('Optimization complete.')
599 |     print('Best validation score of %f %% obtained at iteration %i, '
600 |           'with test performance %f %%' %
601 |           (best_validation_loss * 100., best_iter + 1, test_score * 100.))
602 |     print >> sys.stderr, ('The code for file ' +
603 |                           os.path.split(__file__)[1] +
604 |                           ' ran for %.2fm' % ((end_time - start_time) / 60.))
605 |     global resultSource
606 |     file_write = open(resultSource, 'a')
607 |     try:
608 |         file_write.write(str('Optimization complete.\n'))
609 |         file_write.write(str('Best validation score of %f %% obtained at iteration %i, '
610 |           'with test performance %f %%' %
611 |           (best_validation_loss * 100., best_iter + 1, test_score * 100.)) + '\n')
612 |         file_write.write(str('The code for file ' +
613 |                           os.path.split(__file__)[1] +
614 |                           ' ran for %.2fm' % ((end_time - start_time) / 60.)) + '\n')
615 |     finally :
616 |         file_write.close()
617 | 
618 | if __name__ == '__main__':
619 |     evaluate_lenet5()
620 | 
621 | 
622 | def experiment(state, channel):
623 |     evaluate_lenet5(state.learning_rate, dataset_s=state.dataset_s, dataset_t=state.dataset_t, dataset_st=state.dataset_st)
624 | 
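
The domain-adversarial update rules in `evaluate_lenet5` above can be read in isolation: the domain classifier takes an ordinary SGD step on `cost_D`, while the feature layers step in the opposite direction, scaled by `lmbda`. A minimal NumPy sketch of that sign convention follows, assuming the gradients are already computed. The function name `domain_updates` and the toy values are hypothetical and not part of the repository; the real code builds these updates symbolically with Theano.

```python
import numpy as np


def domain_updates(params_d, grads_d, params_f, grads_f,
                   learning_rate=0.1, lmbda=0.25):
    """Return updated copies of the domain-classifier and feature parameters.

    The domain classifier descends on the domain loss, while the feature
    parameters move in the opposite direction, scaled by lmbda.
    """
    new_d = [p - learning_rate * g for p, g in zip(params_d, grads_d)]
    new_f = [p + learning_rate * lmbda * g for p, g in zip(params_f, grads_f)]
    return new_d, new_f


# toy parameters and gradients, chosen only for illustration
w_d = [np.array([1.0, -2.0])]
g_d = [np.array([0.5, 0.5])]
w_f = [np.array([[1.0, 0.0], [0.0, 1.0]])]
g_f = [np.array([[4.0, 0.0], [0.0, -4.0]])]

new_d, new_f = domain_updates(w_d, g_d, w_f, g_f)
print(new_d[0])
print(new_f[0])
```

With `lmbda=0` the feature layers ignore the domain loss entirely, which reduces the procedure to plain source-only training of the label classifier.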

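
The shape comments in `evaluate_lenet5` follow from simple arithmetic, sketched here as a quick check. It assumes a "valid" convolution with 5x5 filters and non-overlapping 2x2 max pooling that drops border remainders, as the comments describe; the helper name `conv_pool_size` is hypothetical.

```python
def conv_pool_size(size, filter_size=5, pool=2):
    """Side length after a 'valid' convolution followed by non-overlapping pooling."""
    conv = size - filter_size + 1   # a valid convolution shrinks each side by filter_size - 1
    return conv // pool             # pooling that ignores the border floors the division


nkerns = [20, 50]
side0 = conv_pool_size(28)          # layer0: 28 -> 24 -> 12
side1 = conv_pool_size(side0)       # layer1: 12 -> 8 -> 4
n_in = nkerns[1] * side1 * side1    # flattened input width of the hidden layer
print(side0, side1, n_in)
```

The last value is the `n_in` passed to `HiddenLayer`, so changing the filter or pool size means recomputing it the same way.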

--------------------------------------------------------------------------------
/SRC/cnn_tt_tt.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
  2 | """
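  3 | @author: wepon
  4 | @blog: http://blog.csdn.net/u012162613/article/details/43225445
  5 | 
  6 | This program is based on LeNet5 with the following simplifications:
  7 | - location-specific gain and bias parameters are not implemented
  8 | - max pooling is used instead of average pooling
  9 | - the classifier is softmax, whereas LeNet5 uses RBF
 10 | - the second layer of LeNet5 is not fully connected, but this implementation is
 11 | For details, see: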
 12 | - Y. LeCun, L. Bottou, Y. Bengio and P. Haffner:
 13 |    Gradient-Based Learning Applied to Document
 14 |    Recognition, Proceedings of the IEEE, 86(11):2278-2324, November 1998.
 15 |    http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf
 16 | 
 17 | """
 18 | import cPickle
 19 | import gzip
 20 | import os
 21 | import sys
 22 | import time
 23 | import random
 24 | import numpy
 25 | 
 26 | import theano
 27 | import theano.tensor as T
 28 | from theano.tensor.signal import downsample
 29 | from theano.tensor.nnet import conv
 30 | from PIL import Image
 31 | from scipy.misc import imsave
 32 | 
 33 | 
 34 | resultSource = os.path.join(os.path.abspath('..'), 'result/tt_tt.txt')
 35 | file_w = open(resultSource, 'a')
 36 | try:
 37 |     file_w.write("modify the time result\n")
 38 | finally:
 39 |     file_w.close()
 40 | 
 41 | """
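 42 | LeNetConvPoolLayer combines a convolution and a downsampling (pooling) step into one layer.
 43 | W is the weight tensor, i.e. the convolution kernels; it has the shape given by filter_shape.
 44 | rng: random number generator used to initialize W
 45 | input: a 4D tensor (theano.tensor.dtensor4)
 46 | filter_shape: (number of filters, num input feature maps, filter height, filter width), e.g. 5*5 filters
 47 |              num input feature maps: number of feature maps (channels) fed into the layer
 48 |              number of filters: number of kernels, which is also the number of output feature maps
 49 | image_shape: (batch size, num input feature maps, image height, image width), the shape of the input
 50 |             batch size: number of images in one batch
 51 | poolsize: (#rows, #cols), the downsampling window; the maximum of each window is kept
 52 | Trainable parameters: number of filters * (num input feature maps * filter height * filter width + 1 bias).
 53 |                         For example, 6 filters of size 5*5 on one input map give 6*(5*5+1) parameters.
 54 | num output feature maps: equals the number of filters; pooling does not change the number of maps.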
 55 | """
 56 | class LeNetConvPoolLayer(object):
 57 |     def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
 58 |   
 59 | # assert: execution continues if the condition is True and is aborted if it is False
 60 | # image_shape[1] and filter_shape[1] are both the number of input feature maps, so they must be equal.
 61 |         assert image_shape[1] == filter_shape[1]
 62 |         self.input = input
 63 | 
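 64 | # each hidden unit (i.e. pixel) has num input feature maps * filter height * filter width connections to the previous layer,
 65 | # which numpy.prod(filter_shape[1:]) computes; prod multiplies all the elements together.
 66 | # number of connections between the convolutional layer and the previous layer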
 67 |         fan_in = numpy.prod(filter_shape[1:])
 68 | 
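 69 | # each unit in the lower layer receives gradients from "num output feature maps * filter height * filter width" / pooling size units
 70 | # number of connections between the convolutional layer and the downsampling layer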
 71 |         fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /  
 72 |                    numpy.prod(poolsize))   # author's open question: should this be multiplied by 2, since each filter has only one W and one b?
 73 |                    
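 74 | # with fan_in and fan_out computed, plug them into the formula below to randomly initialize W, the linear convolution kernels
 75 | # W starts with small values. The rule for tanh units is to sample uniformly between -sqrt(6./(fan_in+fan_out)) and sqrt(6./(fan_in+fan_out))
 76 | # numpy.random.uniform(low, high, size) samples from the half-open interval [low, high) and returns an array of shape size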
 77 |         W_bound = numpy.sqrt(6. / (fan_in + fan_out))  
 78 |         self.W = theano.shared(
 79 |             numpy.asarray(
 80 |                 rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
 81 |                 dtype=theano.config.floatX
 82 |             ),
 83 |             borrow=True
 84 |         )
 85 | 
 86 | # the bias is a 1D tensor -- one bias per output feature map
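 87 | # the bias b is a 1D vector with one entry per output feature map, initialized to zeros
 88 | # the number of output feature maps equals the number of filters, so filter_shape[0] (number of filters) sets its length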
 89 |         b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
 90 |         self.b = theano.shared(value=b_values, borrow=True)
 91 | 
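 92 | # convolve the input images with the filters using conv.conv2d
 93 | # after the convolution, b is not added and no sigmoid is applied at this point; this is one simplification.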
 94 |         conv_out = conv.conv2d(
 95 |             input=input,
 96 |             filters=self.W,
 97 |             filter_shape=filter_shape,
 98 |             image_shape=image_shape
 99 |         )
100 | 
101 | # max pooling, i.e. the downsampling step
102 |         pooled_out = downsample.max_pool_2d(
103 |             input=conv_out,
104 |             ds=poolsize,
105 |             ignore_border=True
106 |         )
107 | 
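108 | # add the bias and apply tanh to get the final output of the convolution + pooling layer
109 | # b is a 1D vector, so dimshuffle is used to reshape it. For example, if b has shape (10,),
110 | # then b.dimshuffle('x', 0, 'x', 'x') reshapes it to (1, 10, 1, 1) so that it broadcasts over the batch and spatial axes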
111 |         self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))
112 | # parameters of the convolution + pooling layer
113 |         self.params = [self.W, self.b]
114 | 
115 | 
116 | 
117 | 
118 | 
119 | """
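120 | Notes:
121 | This class defines the hidden layer. Its input is `input` and its output has one value per hidden unit. The input layer and the hidden layer are fully connected.
122 | Suppose the input is an n_in-dimensional vector (n_in units) and the hidden layer has n_out units. Because the layers are fully connected,
123 | there are n_in*n_out weights, so W has shape (n_in, n_out): n_in rows and n_out columns, where each column holds the incoming weights of one hidden unit.
124 | b is the bias; the hidden layer has n_out units, so b is an n_out-dimensional vector.
125 | rng is the random number generator, a numpy.random.RandomState, used to initialize W.
126 | input holds all the inputs used to train the model; it is not the MLP input layer itself, which has n_in units.
127 |     The input argument has shape (n_example, n_in) with one sample per row, and each row is fed to the MLP input layer.
128 | activation: the activation function, tanh by default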
129 | """
130 | class HiddenLayer(object):
131 |     def __init__(self, rng, input, n_in, n_out, W=None, b=None, activation=T.tanh):
132 |          self.input = input   # the input of HiddenLayer is the input passed in
133 | 
134 |          """
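135 |          Notes:
136 |          To run on a GPU, the code must use dtype=theano.config.floatX and store the values in theano.shared variables.
137 |          The initialization rule for W: with tanh, sample uniformly between -sqrt(6./(n_in+n_out)) and sqrt(6./(n_in+n_out));
138 |          with sigmoid, multiply those bounds by 4.
139 |          """
140 |          # if W is not given, initialize it with the rule above.
141 |          # the check exists because W can sometimes be initialized from already trained parameters (see the author's previous blog post).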
142 |          if W is None:
143 |             W_values = numpy.asarray(
144 |                 rng.uniform(
145 |                     low=-numpy.sqrt(6. / (n_in + n_out)),
146 |                     high=numpy.sqrt(6. / (n_in + n_out)),
147 |                     size=(n_in, n_out)
148 |                 ),
149 |                 dtype=theano.config.floatX
150 |             )
151 |             if activation == theano.tensor.nnet.sigmoid:
152 |                 W_values *= 4
153 |             W = theano.shared(value=W_values, name='W', borrow=True)
154 | 
155 |          if b is None:
156 |             b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
157 |             b = theano.shared(value=b_values, name='b', borrow=True)
158 | 
159 |          # use the W and b defined above as the W and b of HiddenLayer
160 |          self.W = W
161 |          self.b = b
162 | 
163 |         # output of the hidden layer
164 |          lin_output = T.dot(input, self.W) + self.b
165 |          self.output = (
166 |             lin_output if activation is None
167 |             else activation(lin_output)
168 |          )
169 | 
170 |         # parameters of the hidden layer
171 |          self.params = [self.W, self.b]
172 | 
173 | 
174 | 
175 | """
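176 | This defines the classification layer LogisticRegression, i.e. softmax regression.
177 | The deep learning tutorial treats LogisticRegression as softmax directly;
178 | the familiar two-class logistic regression is LogisticRegression with n_out=2.
179 | """
180 | # parameters:
181 | # input: shape (n_example, n_in), where n_example is the size of one batch;
182 | # input is defined this way because training uses minibatch SGD
183 | # n_in: size of the output of the previous (hidden) layer
184 | # n_out: number of output classes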
185 | class LogisticRegression(object):
186 |     def __init__(self, input, n_in, n_out):
187 | 
188 | # W has n_in rows and n_out columns and b is an n_out-dimensional vector: each output corresponds to one column of W and one element of b.
189 |         self.W = theano.shared(
190 |             value=numpy.zeros(
191 |                 (n_in, n_out),
192 |                 dtype=theano.config.floatX
193 |             ),
194 |             name='W',
195 |             borrow=True
196 |         )
197 | 
198 |         self.b = theano.shared(
199 |             value=numpy.zeros(
200 |                 (n_out,),
201 |                 dtype=theano.config.floatX
202 |             ),
203 |             name='b',
204 |             borrow=True
205 |         )
206 | 
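207 | # input is (n_example, n_in) and W is (n_in, n_out); their dot product is (n_example, n_out). Adding the bias b
208 | # and passing the result to T.nnet.softmax gives p_y_given_x,
209 | # so each row of p_y_given_x holds the estimated class probabilities of one sample
210 | # note: b is an n_out-dimensional vector added to an (n_example, n_out) matrix; it is broadcast,
211 | # so b is added to every row of the (n_example, n_out) matrix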
212 |         self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
213 | 
214 | # argmax returns the index of the largest value; for MNIST that index is the class itself. axis=1 takes the argmax within each row, i.e. per sample.
215 |         self.y_pred = T.argmax(self.p_y_given_x, axis=1)
216 | 
217 | # params: the parameters of LogisticRegression
218 |         self.params = [self.W, self.b]
219 | 
220 |     def negative_log_likelihood(self, y):  # mean negative log-probability assigned to the true labels y
221 |         return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
222 | 
223 |     def errors(self, y):
224 |         if y.ndim != self.y_pred.ndim:
225 |             raise TypeError(
226 |                 'y should have the same shape as self.y_pred',
227 |                 ('y', y.type, 'y_pred', self.y_pred.type)
228 |             )
229 |         if y.dtype.startswith('int'):
230 |             return T.mean(T.neq(self.y_pred, y))
231 |         else:
232 |             raise NotImplementedError()
233 | 
234 | 
235 | """
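236 | load_data() loads the MNIST-based dataset.
237 | mnist.pkl splits the 60000 training samples into 50000 training samples and 10000 validation samples;
238 | each set holds an image array and a label array (the pickles used here also carry a third array, unpacked as data_z); each row of the image array holds the pixels of one image, 784 values (28x28)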
239 | """
240 | def load_data(dataset):
241 |     print '... loading data'
242 |     global resultSource
243 |     file_write = open(resultSource, 'a')
244 |     try:
245 |         file_write.write("... loading data\n" )
246 |     finally :
247 |         file_write.close()
248 |     
249 |     
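250 | # load train_set, valid_set and test_set from the pickle file; each of them includes labels
251 | # the file is read with open() and cPickle.load() (the original mnist.pkl.gz would need gzip.open()).
252 | # 'rb' opens the file for reading in binary mode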
253 |     f = open(dataset, 'rb')
254 |     train_set, valid_set, test_set = cPickle.load(f)
255 |     f.close()
256 |    
257 | 
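258 | # store the data in shared variables, mainly for GPU acceleration: only shared variables can be kept in GPU memory
259 | # data on the GPU must be float, but data_y holds class labels, so it is cast back to int before being returned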
260 |     def shared_dataset(data_xyz, borrow=True):
261 |         data_x, data_y, data_z = data_xyz
262 |         shared_x = theano.shared(numpy.asarray(data_x,
263 |                                                dtype=theano.config.floatX),
264 |                                  borrow=borrow)
265 |         shared_y = theano.shared(numpy.asarray(data_y,
266 |                                                dtype=theano.config.floatX),
267 |                                  borrow=borrow)
268 |         return shared_x, T.cast(shared_y, 'int32')
269 | 
270 |     test_target_set_x, test_target_set_y = shared_dataset(test_set)
271 |     valid_target_set_x, valid_target_set_y = shared_dataset(valid_set)
272 |     train_target_set_x, train_target_set_y = shared_dataset(train_set)
273 | 
274 |     rval = [(train_target_set_x, train_target_set_y),
275 |             (valid_target_set_x, valid_target_set_y), (test_target_set_x, test_target_set_y)]
276 |     return rval
277 | 
278 | 
279 | """
280 | Implementation of LeNet5.
281 | This variant has two convolutional layers: the first has 20 kernels and the second has 50.
282 | """
283 | def evaluate_lenet5(learning_rate=0.1, n_epochs=200,
284 |                     dataset=os.path.join(os.path.abspath('..'), 'data/target0.7.pkl'),
285 |                     nkerns=[20, 50], batch_size=500):
286 |     """ 
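287 |  learning_rate: learning rate, the coefficient applied to the stochastic gradient.
288 |  n_epochs: number of training epochs; each epoch iterates over every batch, i.e. all samples.
289 |  batch_size: set to 500 here, so the gradient is computed and the parameters updated once per 500 samples.
290 |  nkerns=[20, 50]: number of kernels in each LeNetConvPoolLayer; the first LeNetConvPoolLayer has
291 |  20 kernels and the second has 50.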
292 |     """
293 | 
294 |     rng = numpy.random.RandomState(23455)
295 | 
296 |     # load the data
297 |     datasets = load_data(dataset)
298 |     train_target_set_x, train_target_set_y = datasets[0]
299 |     valid_target_set_x, valid_target_set_y = datasets[1]
300 |     test_target_set_x, test_target_set_y = datasets[2]
301 | 
302 |     # compute the number of batches
303 |     n_train_target_batches = train_target_set_x.get_value(borrow=True).shape[0]
304 |     n_valid_target_batches = valid_target_set_x.get_value(borrow=True).shape[0]
305 |     n_test_target_batches = test_target_set_x.get_value(borrow=True).shape[0]
306 |     n_train_target_batches //= batch_size
307 |     n_valid_target_batches //= batch_size
308 |     n_test_target_batches //= batch_size
309 | 
310 | 
311 |     # define the symbolic variables: index is the batch index, x the input data, y the corresponding labels
312 |     index = T.lscalar()  
313 |     x = T.matrix('x')   
314 |     y = T.ivector('y') 
315 | 
316 |     ######################
317 |     # BUILD ACTUAL MODEL #
318 |     ######################
319 |     print '... building the model'
320 |     global resultSource
321 |     file_write = open(resultSource, 'a')
322 |     try:
323 |         file_write.write("... building the model\n" )
324 |     finally :
325 |         file_write.close()
326 | 
327 | 
328 | # each loaded batch has shape (batch_size, 28 * 28), but LeNetConvPoolLayer expects a 4D input, so reshape it
329 |     layer0_input = x.reshape((batch_size, 1, 28, 28))
330 | 
331 | # layer0 is the first LeNetConvPoolLayer
332 | # each input image is (28, 28); the convolution gives (28-5+1, 28-5+1) = (24, 24),
333 | # and max pooling gives (24/2, 24/2) = (12, 12)
334 | # each batch holds batch_size images and the first LeNetConvPoolLayer has nkerns[0] kernels,
335 | # so the output of layer0 is (batch_size, nkerns[0], 12, 12)
336 |     layer0 = LeNetConvPoolLayer(
337 |         rng,
338 |         input=layer0_input,
339 |         image_shape=(batch_size, 1, 28, 28),
340 |         filter_shape=(nkerns[0], 1, 5, 5),
341 |         poolsize=(2, 2)
342 |     )
343 | 
344 | 
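345 | # layer1 is the second LeNetConvPoolLayer
346 | # its input is the output of layer0; each feature map is (12, 12), and the convolution gives (12-5+1, 12-5+1) = (8, 8),
347 | # and max pooling gives (8/2, 8/2) = (4, 4)
348 | # each batch holds batch_size images (feature maps) and the second LeNetConvPoolLayer has nkerns[1] kernels,
349 | # so the output of layer1 is (batch_size, nkerns[1], 4, 4)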
350 |     layer1 = LeNetConvPoolLayer(
351 |         rng,
352 |         input=layer0.output,
353 |         image_shape=(batch_size, nkerns[0], 12, 12),  # nkerns[0] input feature maps, i.e. the nkerns[0] maps output by layer0
354 |         filter_shape=(nkerns[1], nkerns[0], 5, 5),
355 |         poolsize=(2, 2)
356 |     )
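357 | # at this point each feature of an image has been reduced to a 4*4 map, and each image has nkerns[1] such maps,
358 | # so the fully connected layer sees each image as nkerns[1]*4*4 values: the input of HiddenLayer
359 | 
360 | # with the two LeNetConvPoolLayers (layer0 and layer1) defined, layer1 is followed by layer2, a fully connected layer equivalent to the hidden layer of an MLP,
361 | # so layer2 can be built with the HiddenLayer class from the MLP. Its input is 2D, (batch_size, num_pixels),
362 | # so the feature maps that different kernels produce for the same image must be merged into one vector:
363 | # the output of layer1, (batch_size, nkerns[1], 4, 4), is flattened to (batch_size, nkerns[1]*4*4) = (500, 800) as the input of layer2.
364 | # the hidden layer's W is 800*500 (n_in*n_out)
365 | # (500, 800) means 500 samples, one per row. The output of layer2 is (batch_size, n_out) = (500, 500)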
366 |     layer2_input = layer1.output.flatten(2)
367 |     layer2 = HiddenLayer(
368 |         rng,
369 |         input=layer2_input,
370 |         n_in=nkerns[1] * 4 * 4,
371 |         n_out=500,
372 |         activation=T.tanh
373 |     )
374 | 
375 | # the last layer, layer3, is the classification layer, built with the LogisticRegression class;
376 | # its input is the output of layer2, (500, 500), and its output is (batch_size, n_out) = (500, 10)
377 |     layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)
378 | 
379 | # cost function: negative log-likelihood (NLL)
380 |     cost = layer3.negative_log_likelihood(y)
381 | 
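382 | # test_model computes the test error. x and y are bound to the batch selected by index, and layer3 is evaluated;
383 | # layer3 in turn depends on layer2, layer1 and layer0, so test_model covers the whole CNN.
384 | # test_model takes the batch index as input and returns layer3.errors(y), i.e. the error rate.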
385 |     test_model = theano.function(
386 |         [index],
387 |         layer3.errors(y),
388 |         givens={
389 |             x: test_target_set_x[index * batch_size: (index + 1) * batch_size],
390 |             y: test_target_set_y[index * batch_size: (index + 1) * batch_size]
391 |         }
392 |     )
393 | # validate_model evaluates the model on the validation set; the analysis is the same as above.
394 |     validate_model = theano.function(
395 |         [index],
396 |         layer3.errors(y),
397 |         givens={
398 |             x: valid_target_set_x[index * batch_size: (index + 1) * batch_size],
399 |             y: valid_target_set_y[index * batch_size: (index + 1) * batch_size]
400 |         }
401 |     )
402 | 
403 | # next comes train_model, which involves the optimizer (SGD): compute the gradients and update the parameters
404 |     # parameter set
405 |     params = layer3.params + layer2.params + layer1.params + layer0.params
406 | 
407 |     # gradients with respect to each parameter
408 |     grads = T.grad(cost, params)
409 | 
410 | # there are too many parameters to write each update rule by hand, so the comprehension below builds the (param_i, param_i - learning_rate * grad_i) pairs automatically
411 |     updates = [
412 |         (param_i, param_i - learning_rate * grad_i)
413 |         for param_i, grad_i in zip(params, grads)
414 |     ]
415 | 
416 | # train_model: same analysis as test_model, except that train_model also applies the update rules
417 |     train_model = theano.function(
418 |         [index],
419 |         cost,
420 |         updates=updates,
421 |         givens={
422 |             x: train_target_set_x[index * batch_size: (index + 1) * batch_size],
423 |             y: train_target_set_y[index * batch_size: (index + 1) * batch_size]
424 |         }
425 |     )
426 | 
427 | 
428 |     ###############
429 |     # TRAIN MODEL #
430 |     ###############
431 |     print '... training'
432 |     global resultSource
433 |     file_write = open(resultSource, 'a')
434 |     try:
435 |         file_write.write("... training\n" )
436 |     finally :
437 |         file_write.close()
438 |     patience = 10000  
439 |     patience_increase = 2  
440 |     improvement_threshold = 0.995 
441 |                                    
442 |     validation_frequency = min(n_train_target_batches, patience / 2)
443 |  # setting validation_frequency this way guarantees that the validation set is checked in every epoch.
444 | 
445 |     best_validation_loss = numpy.inf   # best (i.e. lowest) loss on the validation set
446 |     best_iter = 0                      # iteration with the best loss, counted in batches; e.g. best_iter=10000 means best_validation_loss was reached after training batch 10000
447 |     test_score = 0.
448 |     start_time = time.time()
449 | 
450 |     epoch = 0
451 |     done_looping = False
452 | 
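453 | # the training loop: the while loop counts epochs, and one epoch iterates over every batch, i.e. all images.
454 | # the for loop walks through the batches and trains on one batch at a time by calling train_model(minibatch_index);
455 | # the updates in train_model modify the parameters.
456 | # the for loop tracks the number of trained batches in iter; whenever iter + 1 is a multiple of validation_frequency the model is evaluated on the validation set.
457 | # if the validation loss this_validation_loss is lower than the best loss so far, best_validation_loss,
458 | # then best_validation_loss and best_iter are updated and the model is evaluated on the test set.
459 | # if this_validation_loss is also lower than best_validation_loss*improvement_threshold, patience is extended.
460 | # training ends when the maximum number of epochs n_epochs is reached or when patience <= iter
461 | # iter is the number of batches trained so far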
462 |     while (epoch < n_epochs) and (not done_looping):
463 |         epoch = epoch + 1
464 |         for minibatch_index in xrange(n_train_target_batches):
465 | 
466 |             iter = (epoch - 1) * n_train_target_batches + minibatch_index
467 | 
468 |             if iter % 100 == 0:
469 |                 print 'training @ iter = ', iter
470 |                 global resultSource
471 |                 file_write = open(resultSource, 'a')
472 |                 try:
473 |                     file_write.write("training @ iter = " + str(iter) + '\n')
474 |                 finally :
475 |                     file_write.close()
476 |             cost_ij = train_model(minibatch_index)  
477 | # cost_ij is never used afterwards; train_model is called for its updates and simply happens to return the cost
478 |             if (iter + 1) % validation_frequency == 0:
479 | 
480 |                 # compute zero-one loss on validation set
481 |                 validation_losses = [validate_model(i) for i
482 |                                      in xrange(n_valid_target_batches)]
483 |                 this_validation_loss = numpy.mean(validation_losses)
484 |                 print('epoch %i, minibatch %i/%i, validation error %f %%' %
485 |                       (epoch, minibatch_index + 1, n_train_target_batches,
486 |                        this_validation_loss * 100.))
487 |                 global resultSource
488 |                 file_write = open(resultSource, 'a')
489 |                 try:
490 |                     file_write.write(str('epoch %i, minibatch %i/%i, validation error %f %%' %
491 |                       (epoch, minibatch_index + 1, n_train_target_batches,
492 |                        this_validation_loss * 100.))  + '\n')
493 |                 finally :
494 |                     file_write.close()
495 | 
496 |  
497 |                 if this_validation_loss < best_validation_loss:
498 | 
499 |                     
500 |                     if this_validation_loss < best_validation_loss *  \
501 |                        improvement_threshold:
502 |                         patience = max(patience, iter * patience_increase)
503 | 
504 |                     
505 |                     best_validation_loss = this_validation_loss
506 |                     best_iter = iter
507 | 
508 |                    
509 |                     test_losses = [
510 |                         test_model(i)
511 |                         for i in xrange(n_test_target_batches)
512 |                     ]
513 |                     test_score = numpy.mean(test_losses)
514 |                     print(('     epoch %i, minibatch %i/%i, test error of '
515 |                            'best model %f %%') %
516 |                           (epoch, minibatch_index + 1, n_train_target_batches,
517 |                            test_score * 100.))
518 |                     global resultSource
519 |                     file_write = open(resultSource, 'a')
520 |                     try:
521 |                         file_write.write(str(('     epoch %i, minibatch %i/%i, test error of '
522 |                            'best model %f %%') %
523 |                           (epoch, minibatch_index + 1, n_train_target_batches,
524 |                            test_score * 100.)) + '\n')
525 |                     finally :
526 |                         file_write.close()
527 | 
528 |             if patience <= iter:
529 |                 done_looping = True
530 |                 break
531 | 
532 |     end_time = time.time()
533 |     print('Optimization complete.')
534 |     print('Best validation score of %f %% obtained at iteration %i, '
535 |           'with test performance %f %%' %
536 |           (best_validation_loss * 100., best_iter + 1, test_score * 100.))
537 |     print >> sys.stderr, ('The code for file ' +
538 |                           os.path.split(__file__)[1] +
539 |                           ' ran for %.2fm' % ((end_time - start_time) / 60.))
540 |     global resultSource
541 |     file_write = open(resultSource, 'a')
542 |     try:
543 |         file_write.write(str('Optimization complete.\n'))
544 |         file_write.write(str('Best validation score of %f %% obtained at iteration %i, '
545 |           'with test performance %f %%' %
546 |           (best_validation_loss * 100., best_iter + 1, test_score * 100.)) + '\n')
547 |         file_write.write(str('The code for file ' +
548 |                           os.path.split(__file__)[1] +
549 |                           ' ran for %.2fm' % ((end_time - start_time) / 60.)) + '\n')
550 |     finally :
551 |         file_write.close()
552 | 
553 | if __name__ == '__main__':
554 |     evaluate_lenet5()
555 | 
556 | 
557 | def experiment(state, channel):
558 |     evaluate_lenet5(state.learning_rate, dataset=state.dataset)
559 | 
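
The patience rule used in the training loop above can be replayed in isolation. The following is a minimal sketch, not the repository's code: it assumes one validation pass at the end of every epoch, checks patience only at that point (the original checks after every minibatch), and takes a precomputed list of validation losses instead of running a Theano model. The name `early_stopping_run` and its arguments are hypothetical. It is written to run on Python 3 with only numpy installed.

```python
import numpy


def early_stopping_run(validation_losses, n_batches_per_epoch,
                       patience=10000, patience_increase=2,
                       improvement_threshold=0.995):
    """Replay the patience logic on per-epoch validation losses.

    validation_losses[e] is a hypothetical loss measured after epoch e + 1.
    Returns (best_loss, best_iter, epochs_run).
    """
    best_loss = numpy.inf
    best_iter = 0
    epochs_run = 0
    for epoch, this_loss in enumerate(validation_losses, start=1):
        epochs_run = epoch
        # Zero-based index of the last minibatch of this epoch.
        it = epoch * n_batches_per_epoch - 1
        if this_loss < best_loss:
            # Only a sufficiently large improvement extends the patience.
            if this_loss < best_loss * improvement_threshold:
                patience = max(patience, it * patience_increase)
            best_loss = this_loss
            best_iter = it
        # Stop once the iteration counter has caught up with the patience.
        if patience <= it:
            break
    return best_loss, best_iter, epochs_run
```

For example, `early_stopping_run([0.5, 0.4, 0.4, 0.4, 0.4], n_batches_per_epoch=10, patience=20)` returns `(0.4, 19, 4)`: the improvement in epoch 2 raises patience to 38, and the run stops after epoch 4 when the iteration counter reaches 39.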


--------------------------------------------------------------------------------
/SRC/generateMNIST_SandMNIST_T.py:
--------------------------------------------------------------------------------
 1 | # -*- coding: utf-8 -*-
 2 | """
 3 | Created on Tue Apr 18 13:44:06 2017
 4 | 
 5 | @author: shucun Tian
 6 | """
 7 | 
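 8 | '''
 9 | Generate sample images for the source (MNIST) and target (MNIST-M style) datasets.
10 | Images are saved under ../data/imageMNIST_S (source)
11 |                    and ../data/imageMNIST_T (target).
12 | 
13 | Only the first 10 images of each of the train, valid and test sets are written; source and target images with the same index correspond to each other.
14 | 
15 | '''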
16 | import cPickle
17 | import gzip
18 | import os
19 | import sys
20 | import time
21 | import random
22 | import numpy
23 | 
24 | 
25 | import theano
26 | import theano.tensor as T
27 | from theano.tensor.nnet import conv
28 | from PIL import Image
29 | from scipy.misc import imsave
30 | 
31 | 
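32 | """
33 | Load the MNIST-derived datasets
34 | mnist.pkl splits the 60000 training samples into 50000 training samples and 10000 validation samples;
35 | each set consists of an image array and a label array; every row of the image array holds the pixels of one image, 784 values (28x28)
36 | """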
37 | def to_image(dataset_s, dataset_t):
38 | 
39 |     print '... loading data'
40 |     
41 |     
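42 |     # Load train_set, valid_set and test_set (labels included) from the source and target pickle files.
43 |     # These files are plain pickles, so open() and cPickle.load() are enough (no gzip here).
44 |     # 'rb' opens the file for reading in binary mode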
45 |     f = open(dataset_s, 'rb')
46 |     train_set_source, valid_set_source, test_set_source = cPickle.load(f)
47 |     f.close()
48 | 
49 |     f = open(dataset_t, 'rb')
50 |     train_set_target, valid_set_target, test_set_target = cPickle.load(f)
51 |     f.close()
52 | 
53 | 
54 |     def to_image_source(data_xy, image_num, set_class):
55 |         data_x, data_y = data_xy
56 |         
57 |         file_name = 0
58 |         for k in range(image_num):
59 |             temp = [([0] * 28) for i in range(28)]
60 |             for i in range(28):
61 |                 for j in range(28):
62 |                     temp[i][j] = data_x[k][i * 28 + j]
63 |             imsave(os.path.join(os.path.abspath('..') ,'data/imageMNIST_S/' + set_class + str(file_name) + '.png'), temp)
64 |             file_name += 1 # index used to name the next output file
65 | 
66 |     def to_image_target(data_xyz, image_num, set_class):
67 |         data_x, data_y, data_z = data_xyz
68 |         file_name = 0
69 |         for k in range(image_num):
70 |             temp = [([0] * 28) for i in range(28)]
71 |             for i in range(28):
72 |                 for j in range(28):
73 |                     temp[i][j] = data_x[k][i * 28 + j]
74 |             imsave(os.path.join(os.path.abspath('..') ,'data/imageMNIST_T/' + set_class + str(file_name) + '.png'), temp)
75 |             file_name += 1 # index used to name the next output file
76 | 
77 | 
78 |     to_image_source(train_set_source, 10, 'train_')
79 |     to_image_source(test_set_source, 10, 'test_')
80 |     to_image_source(valid_set_source, 10, 'valid_')
81 |     to_image_target(train_set_target, 10, 'train_')
82 |     to_image_target(test_set_target, 10, 'test_')
83 |     to_image_target(valid_set_target, 10, 'valid_')
84 | 
85 | 
86 | 
87 | to_image(os.path.join(os.path.abspath('..'), 'data/source.pkl'), os.path.join(os.path.abspath('..'), 'data/target0.7.pkl'))
88 | 
89 | 
90 | 
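
The nested loops in `to_image_source` and `to_image_target` copy one flat row of 784 values into a 28x28 grid in row-major order. A minimal sketch of the same copy with `numpy.reshape` follows; `flat_to_image` is a hypothetical helper name, not part of the repository, and the sketch assumes the input is a flat sequence of exactly `side * side` values.

```python
import numpy


def flat_to_image(flat_pixels, side=28):
    """Reshape a flat, row-major pixel vector into a side x side array."""
    pixels = numpy.asarray(flat_pixels, dtype=numpy.float64)
    # Element (i, j) of the result is flat_pixels[i * side + j],
    # which is what the nested loops compute one value at a time.
    return pixels.reshape(side, side)
```

The returned array can be handed to whichever image writer is available in place of the nested list `temp`.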


--------------------------------------------------------------------------------
/SRC/generate_data.py:
--------------------------------------------------------------------------------
  1 | # -*- coding: utf-8 -*-
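  2 | '''
  3 | Store the target data and the st data in files
  4 | 
  5 | ======================================
  6 | The target data is built from the difference between the source images and BSDS500 images.
  7 | A pixel that belongs to the digit is left unchanged;
  8 | a pixel that belongs to the background is processed with the formula
  9 | target = |source - bsds500 * theta|
 10 | 
 11 | target contains three groups: train (50000), valid (10000) and test (10000),
 12 | each holding the number of samples given in parentheses.
 13 | Every sample has 3 parts. The first is the image, 28*28 values in total.
 14 | The second is the class label, a value from 0 to 9 giving the digit in the image.
 15 | The third is the domain label, 0 for the source domain and 1 for the target domain; every domain label in the target set is 1.
 16 | 
 17 | ======================================
 18 | The st data is a mixture of source and target.
 19 | Source and target samples are stored alternately, one of each in turn.
 20 | The st set contains three groups: train (100000), valid (20000) and test (20000),
 21 | and each sample has exactly the same structure as a target sample.
 22 | 
 23 | '''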
 24 | 
 25 | import cPickle
 26 | import gzip
 27 | import os
 28 | import sys
 29 | import time
 30 | import random
 31 | import numpy
 32 | 
 33 | import theano
 34 | import theano.tensor as T
 35 | from theano.tensor.signal import downsample
 36 | from theano.tensor.nnet import conv
 37 | from PIL import Image
 38 | from scipy.misc import imsave
 39 | 
 40 | theta = 0.7
 41 | 
 42 | def load_data(dataset, theta):
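 43 |     # dataset is the path to the dataset; the code first checks whether MNIST exists at that path and downloads it if it does not
 44 |     # This part is not explained further; it is unrelated to the softmax regression algorithm.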
 45 |     data_dir, data_file = os.path.split(dataset)
 46 |     if data_dir == "" and not os.path.isfile(dataset):
 47 |         # Check if dataset is in the data directory.
 48 |         new_path = os.path.join(
 49 |             os.path.split(__file__)[0],
 50 |             "..",
 51 |             "data",
 52 |             dataset
 53 |         )
 54 |         if os.path.isfile(new_path) or data_file == 'mnist.pkl.gz':
 55 |             dataset = new_path
 56 | 
 57 |     if (not os.path.isfile(dataset)) and data_file == 'mnist.pkl.gz':
 58 |         import urllib
 59 |         origin = (
 60 |             'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'
 61 |         )
 62 |         print 'Downloading data from %s' % origin
 63 |         
 64 |         urllib.urlretrieve(origin, dataset)
 65 | 
 66 |     print '... loading data'
 67 |     
 68 | # The code above only checks for and downloads mnist.pkl.gz. The actual loading starts below.
 69 |     
 70 | # Load train_set, valid_set and test_set (labels included) from "mnist.pkl.gz",
 71 | # using gzip.open() and cPickle.load().
 72 | # 'rb' opens the file for reading in binary mode
 73 |     f = gzip.open(dataset, 'rb')
 74 |     train_set, valid_set, test_set = cPickle.load(f)
 75 |     f.close()
 76 | 
 77 |     def cpickle_dataset(valid_set, train_set, test_set, theta, borrow = True):
 78 |         #loc1 is the location of t_data
 79 |         #loc2 is the location of st_data
 80 |         print '...train data'
 81 |         train_source_x, train_source_y = train_set
 82 |         
 83 |         train_target_x = []
 84 |         train_source_label = numpy.zeros(train_source_y.size)
 85 |         train_target_label = numpy.ones(train_source_y.size)
 86 |         fileList = []
 87 |         fileList = os.listdir(os.path.join(os.path.abspath('..'),'data/BSR/BSDS500/data/images/train/'))
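 88 |         if "Thumbs.db" in fileList: fileList.remove("Thumbs.db")  # avoid ValueError when the file is absent
 89 |         
 90 |         '''
 91 |         Pick a random image from BSDS500, convert it to grayscale,
 92 |         subtract its scaled gray values from the background pixels at the same positions (taking the absolute value), and store the result in train_target_x.
 93 |         The result is meant for training; the training code loads both source-domain and target-domain images.
 94 |         '''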
 95 |         for k in range(train_source_x.shape[0]):
 96 |             if k % 10000 == 0:
 97 |                 print k
 98 |             indx = random.randint(0, len(fileList) - 1)
 99 |             img = Image.open(os.path.join(os.path.abspath('..'),'data/BSR/BSDS500/data/images/train/' + fileList[indx]))
100 |             img1 = img.convert("L")
101 |             temp = [0] * 28 * 28
102 |             for i in range(28):
103 |                 for j in range(28):
104 |                     temp[i * 28 + j] = train_source_x[k][i * 28 + j]
105 |                     if(train_source_x[k][i * 28 + j] < 0.1 ):
106 |                         temp[i * 28 + j] -= ((img1.getpixel((i,j)) / 255.0) * theta)
107 |                         temp[i * 28 + j] = abs(temp[i * 28 + j])
108 |             train_target_x.append(temp)
109 |         train_target_x_np = numpy.array(train_target_x)
110 |         train_target = []
111 |         train_target.append(train_target_x_np)
112 |         train_target.append(train_source_y)
113 |         train_target.append(train_target_label)
114 |         train_source_y_list = train_source_y.tolist()
115 | 
116 |         
117 |         print '...valid data'
118 |         valid_source_x, valid_source_y = valid_set
119 |         
120 |         valid_target_x = []
121 |         valid_source_label = numpy.zeros(valid_source_y.size)
122 |         valid_target_label = numpy.ones(valid_source_y.size)
123 |         fileList = []
124 |         fileList = os.listdir(os.path.join(os.path.abspath('..'),'data/BSR/BSDS500/data/images/train/'))
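125 |         if "Thumbs.db" in fileList: fileList.remove("Thumbs.db")  # avoid ValueError when the file is absent
126 |         
127 |         '''
128 |         Pick a random image from BSDS500, convert it to grayscale,
129 |         subtract its scaled gray values from the background pixels at the same positions (taking the absolute value), and store the result in valid_target_x.
130 |         The result is meant for training; the training code loads both source-domain and target-domain images.
131 |         '''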
132 |         for k in range(valid_source_x.shape[0]):
133 |             if k % 10000 == 0:
134 |                 print k
135 |             indx = random.randint(0, len(fileList) - 1)
136 |             img = Image.open(os.path.join(os.path.abspath('..'),'data/BSR/BSDS500/data/images/train/' + fileList[indx]))
137 |             img1 = img.convert("L")
138 |             temp = [0] * 28 * 28
139 |             for i in range(28):
140 |                 for j in range(28):
141 |                     temp[i * 28 + j] = valid_source_x[k][i * 28 + j]
142 |                     if(valid_source_x[k][i * 28 + j] < 0.1 ):
143 |                         temp[i * 28 + j] -= ((img1.getpixel((i,j)) / 255.0) * theta)
144 |                         temp[i * 28 + j] = abs(temp[i * 28 + j])
145 |             valid_target_x.append(temp)
146 |         valid_target_x_np = numpy.array(valid_target_x)
147 |         valid_target = []
148 |         valid_target.append(valid_target_x_np)
149 |         valid_target.append(valid_source_y)
150 |         valid_target.append(valid_target_label)
151 |         valid_source_y_list = valid_source_y.tolist()
152 | 
153 |         print '...test data'
154 |         test_source_x, test_source_y = test_set
155 |         
156 |         test_target_x = []
157 |         test_source_label = numpy.zeros(test_source_y.size)
158 |         test_target_label = numpy.ones(test_source_y.size)
159 |         fileList = []
160 |         fileList = os.listdir(os.path.join(os.path.abspath('..'),'data/BSR/BSDS500/data/images/train/'))
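161 |         if "Thumbs.db" in fileList: fileList.remove("Thumbs.db")  # avoid ValueError when the file is absent
162 |         
163 |         '''
164 |         Pick a random image from BSDS500, convert it to grayscale,
165 |         subtract its scaled gray values from the background pixels at the same positions (taking the absolute value), and store the result in test_target_x.
166 |         The result is meant for training; the training code loads both source-domain and target-domain images.
167 |         '''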
168 |         for k in range(test_source_x.shape[0]):
169 |             if k % 10000 == 0:
170 |                 print k
171 |             indx = random.randint(0, len(fileList) - 1)
172 |             img = Image.open(os.path.join(os.path.abspath('..'),'data/BSR/BSDS500/data/images/train/' + fileList[indx]))
173 |             img1 = img.convert("L")
174 |             temp = [0] * 28 * 28
175 |             for i in range(28):
176 |                 for j in range(28):
177 |                     temp[i * 28 + j] = test_source_x[k][i * 28 + j]
178 |                     if(test_source_x[k][i * 28 + j] < 0.1 ):  # test the test image itself, not valid_source_x
179 |                         temp[i * 28 + j] -= ((img1.getpixel((i,j)) / 255.0) * theta)
180 |                         temp[i * 28 + j] = abs(temp[i * 28 + j])
181 |             test_target_x.append(temp)
182 |         '''
183 |         f = open(os.path.abspath(), 'w')
184 |         for i in range(len(test_target_x)):
185 |             for j in range(28 * 28):
186 |                 f.write(str(test_target_x[i][j]))
187 |                 f.write('\t')
188 |             f.write('\n')
189 |         f.close()
190 |         '''
191 |         test_target_x_np = numpy.array(test_target_x)
192 |         test_target = []
193 |         test_target.append(test_target_x_np)
194 |         test_target.append(test_source_y)
195 |         test_target.append(test_target_label)
196 |         test_source_y_list = test_source_y.tolist()
197 | 
198 |         target = []
199 |         target.append(train_target)
200 |         target.append(valid_target)
201 |         target.append(test_target)
202 |         cPickle.dump(target, open(os.path.join(os.path.abspath('..'), 'data/target' + str(theta) + '.pkl'), 'wb'))
203 | 
204 |         st = []
205 |         train_st = []
206 |         valid_st = []
207 |         test_st = []
208 | 
209 |         train_st_x = []
210 |         train_st_y = []
211 |         train_st_label = []
212 |         for i in range(len(train_target_x)):
213 |             train_st_x.append(train_target_x[i])
214 |             train_st_x.append(train_source_x[i])
215 |             train_st_y.append(train_source_y_list[i])
216 |             train_st_y.append(train_source_y_list[i])
217 |             train_st_label.append(float(1.0))
218 |             train_st_label.append(float(0.0))
219 |         train_st_x_np = numpy.array(train_st_x)
220 |         train_st_y_np = numpy.array(train_st_y)
221 |         train_st_label_np = numpy.array(train_st_label)
222 |         print train_st_x_np.shape
223 |         print train_st_y_np.shape
224 |         print train_st_label_np.shape
225 |         train_st.append(train_st_x_np)
226 |         train_st.append(train_st_y_np)
227 |         train_st.append(train_st_label_np)
228 | 
229 |         valid_st_x = []
230 |         valid_st_y = []
231 |         valid_st_label = []
232 |         for i in range(len(valid_target_x)):
233 |             valid_st_x.append(valid_target_x[i])
234 |             valid_st_x.append(valid_source_x[i])
235 |             valid_st_y.append(valid_source_y_list[i])
236 |             valid_st_y.append(valid_source_y_list[i])
237 |             valid_st_label.append(float(1.0))
238 |             valid_st_label.append(float(0.0))
239 |         valid_st_x_np = numpy.array(valid_st_x)
240 |         valid_st_y_np = numpy.array(valid_st_y)
241 |         valid_st_label_np = numpy.array(valid_st_label)
242 |         print valid_st_x_np.shape
243 |         print valid_st_y_np.shape
244 |         print valid_st_label_np.shape
245 |         valid_st.append(valid_st_x_np)
246 |         valid_st.append(valid_st_y_np)
247 |         valid_st.append(valid_st_label_np)
248 | 
249 |         test_st_x = []
250 |         test_st_y = []
251 |         test_st_label = []
252 |         for i in range(len(test_target_x)):
253 |             test_st_x.append(test_target_x[i])
254 |             test_st_x.append(test_source_x[i])
255 |             test_st_y.append(test_source_y_list[i])
256 |             test_st_y.append(test_source_y_list[i])
257 |             test_st_label.append(float(1.0))
258 |             test_st_label.append(float(0.0))
259 |         '''
260 |         f = open(os.path.abspath(), 'w')
261 |         for i in range(len(test_target_x)):
262 |             for j in range(28 * 28):
263 |                 f.write(str(test_st_x[i][j]))
264 |                 f.write('\t')
265 |             f.write('\n')
266 |         f.close()
267 |         '''
268 |         test_st_x_np = numpy.array(test_st_x)
269 |         test_st_y_np = numpy.array(test_st_y)
270 |         test_st_label_np = numpy.array(test_st_label)
271 |         print test_st_x_np.shape
272 |         print test_st_y_np.shape
273 |         print test_st_label_np.shape
274 |         test_st.append(test_st_x_np)
275 |         test_st.append(test_st_y_np)
276 |         test_st.append(test_st_label_np)
277 | 
278 |         st.append(train_st)
279 |         st.append(valid_st)
280 |         st.append(test_st)
281 | 
282 |         cPickle.dump(st, open(os.path.join(os.path.abspath('..'), 'data/st' + str(theta) + '.pkl'), 'wb'))
283 | 
284 | 
285 | 
286 | 
287 |     cpickle_dataset(valid_set, train_set, test_set, theta, borrow = True)
288 | 
289 | dataset=os.path.join(os.path.abspath('..'), 'data/mnist.pkl.gz')
290 | load_data(dataset, theta)
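
The per-pixel rule `target = |source - bsds500 * theta|` on background pixels, and the alternating source/target layout of the st set, can both be written without explicit pixel loops. The following is a minimal numpy sketch under stated assumptions, not the script's implementation: `blend_background` and `interleave` are hypothetical helper names, the background patch is assumed to be a 28x28 array already scaled to [0, 1] and indexed [row, column] (the script reads pixels with `getpixel((i, j))` instead), and the 0.1 threshold and `theta = 0.7` are taken from the script above.

```python
import numpy


def blend_background(source_flat, background, theta=0.7, threshold=0.1):
    """Return |source - background * theta| on background pixels only.

    source_flat: flat array of 784 values in [0, 1] (one MNIST digit).
    background:  28x28 array of grayscale values in [0, 1].
    """
    source = numpy.asarray(source_flat, dtype=numpy.float64)
    bg = numpy.asarray(background, dtype=numpy.float64).reshape(-1)
    # Pixels darker than the threshold count as background, not digit.
    is_background = source < threshold
    blended = numpy.abs(source - bg * theta)
    # Digit pixels keep their original value.
    return numpy.where(is_background, blended, source)


def interleave(target_x, source_x):
    """Alternate rows: target, source, target, source, ..."""
    target_x = numpy.asarray(target_x)
    source_x = numpy.asarray(source_x)
    mixed = numpy.empty((2 * len(target_x),) + target_x.shape[1:],
                        dtype=numpy.result_type(target_x, source_x))
    mixed[0::2] = target_x  # even rows hold target samples (domain label 1)
    mixed[1::2] = source_x  # odd rows hold source samples (domain label 0)
    return mixed
```

The same `interleave` call works for the class labels (pass the source labels twice) and for the domain labels (an array of ones followed by an array of zeros), which mirrors the order in which the script appends to its `*_st_x`, `*_st_y` and `*_st_label` lists.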


--------------------------------------------------------------------------------
/SRC/temp.py:
--------------------------------------------------------------------------------
1 | temp = [[1],[2],[3]]
2 | a, b, c = temp
3 | print a, b, c


--------------------------------------------------------------------------------
/result/ts_ts.txt:
--------------------------------------------------------------------------------
  1 | modify the time result... loading data
  2 | ... building the model
  3 | ... training
  4 | training @ iter = 0
  5 | epoch 1, minibatch 100/100, validation error 9.230000 %
  6 |      epoch 1, minibatch 100/100, test error of best model 9.520000 %
  7 | training @ iter = 100
  8 | epoch 2, minibatch 100/100, validation error 6.180000 %
  9 |      epoch 2, minibatch 100/100, test error of best model 6.500000 %
 10 | training @ iter = 200
 11 | epoch 3, minibatch 100/100, validation error 4.640000 %
 12 |      epoch 3, minibatch 100/100, test error of best model 4.850000 %
 13 | training @ iter = 300
 14 | epoch 4, minibatch 100/100, validation error 3.500000 %
 15 |      epoch 4, minibatch 100/100, test error of best model 3.910000 %
 16 | training @ iter = 400
 17 | epoch 5, minibatch 100/100, validation error 3.020000 %
 18 |      epoch 5, minibatch 100/100, test error of best model 3.260000 %
 19 | training @ iter = 500
 20 | epoch 6, minibatch 100/100, validation error 2.780000 %
 21 |      epoch 6, minibatch 100/100, test error of best model 2.800000 %
 22 | training @ iter = 600
 23 | epoch 7, minibatch 100/100, validation error 2.480000 %
 24 |      epoch 7, minibatch 100/100, test error of best model 2.500000 %
 25 | training @ iter = 700
 26 | epoch 8, minibatch 100/100, validation error 2.290000 %
 27 |      epoch 8, minibatch 100/100, test error of best model 2.220000 %
 28 | training @ iter = 800
 29 | epoch 9, minibatch 100/100, validation error 2.160000 %
 30 |      epoch 9, minibatch 100/100, test error of best model 2.010000 %
 31 | training @ iter = 900
 32 | epoch 10, minibatch 100/100, validation error 1.970000 %
 33 |      epoch 10, minibatch 100/100, test error of best model 1.880000 %
 34 | training @ iter = 1000
 35 | epoch 11, minibatch 100/100, validation error 1.880000 %
 36 |      epoch 11, minibatch 100/100, test error of best model 1.790000 %
 37 | training @ iter = 1100
 38 | epoch 12, minibatch 100/100, validation error 1.790000 %
 39 |      epoch 12, minibatch 100/100, test error of best model 1.660000 %
 40 | training @ iter = 1200
 41 | epoch 13, minibatch 100/100, validation error 1.760000 %
 42 |      epoch 13, minibatch 100/100, test error of best model 1.580000 %
 43 | training @ iter = 1300
 44 | epoch 14, minibatch 100/100, validation error 1.710000 %
 45 |      epoch 14, minibatch 100/100, test error of best model 1.550000 %
 46 | training @ iter = 1400
 47 | epoch 15, minibatch 100/100, validation error 1.680000 %
 48 |      epoch 15, minibatch 100/100, test error of best model 1.500000 %
 49 | training @ iter = 1500
 50 | epoch 16, minibatch 100/100, validation error 1.620000 %
 51 |      epoch 16, minibatch 100/100, test error of best model 1.440000 %
 52 | training @ iter = 1600
 53 | epoch 17, minibatch 100/100, validation error 1.590000 %
 54 |      epoch 17, minibatch 100/100, test error of best model 1.410000 %
 55 | training @ iter = 1700
 56 | epoch 18, minibatch 100/100, validation error 1.560000 %
 57 |      epoch 18, minibatch 100/100, test error of best model 1.410000 %
 58 | training @ iter = 1800
 59 | epoch 19, minibatch 100/100, validation error 1.530000 %
 60 |      epoch 19, minibatch 100/100, test error of best model 1.380000 %
 61 | training @ iter = 1900
 62 | epoch 20, minibatch 100/100, validation error 1.520000 %
 63 |      epoch 20, minibatch 100/100, test error of best model 1.330000 %
 64 | training @ iter = 2000
 65 | epoch 21, minibatch 100/100, validation error 1.490000 %
 66 |      epoch 21, minibatch 100/100, test error of best model 1.270000 %
 67 | training @ iter = 2100
 68 | epoch 22, minibatch 100/100, validation error 1.460000 %
 69 |      epoch 22, minibatch 100/100, test error of best model 1.270000 %
 70 | training @ iter = 2200
 71 | epoch 23, minibatch 100/100, validation error 1.430000 %
 72 |      epoch 23, minibatch 100/100, test error of best model 1.240000 %
 73 | training @ iter = 2300
 74 | epoch 24, minibatch 100/100, validation error 1.410000 %
 75 |      epoch 24, minibatch 100/100, test error of best model 1.230000 %
 76 | training @ iter = 2400
 77 | epoch 25, minibatch 100/100, validation error 1.380000 %
 78 |      epoch 25, minibatch 100/100, test error of best model 1.190000 %
 79 | training @ iter = 2500
 80 | epoch 26, minibatch 100/100, validation error 1.340000 %
 81 |      epoch 26, minibatch 100/100, test error of best model 1.150000 %
 82 | training @ iter = 2600
 83 | epoch 27, minibatch 100/100, validation error 1.320000 %
 84 |      epoch 27, minibatch 100/100, test error of best model 1.150000 %
 85 | training @ iter = 2700
 86 | epoch 28, minibatch 100/100, validation error 1.300000 %
 87 |      epoch 28, minibatch 100/100, test error of best model 1.120000 %
 88 | training @ iter = 2800
 89 | epoch 29, minibatch 100/100, validation error 1.270000 %
 90 |      epoch 29, minibatch 100/100, test error of best model 1.120000 %
 91 | training @ iter = 2900
 92 | epoch 30, minibatch 100/100, validation error 1.260000 %
 93 |      epoch 30, minibatch 100/100, test error of best model 1.110000 %
 94 | training @ iter = 3000
 95 | epoch 31, minibatch 100/100, validation error 1.260000 %
 96 | training @ iter = 3100
 97 | epoch 32, minibatch 100/100, validation error 1.250000 %
 98 |      epoch 32, minibatch 100/100, test error of best model 1.100000 %
 99 | training @ iter = 3200
100 | epoch 33, minibatch 100/100, validation error 1.250000 %
101 | training @ iter = 3300
102 | epoch 34, minibatch 100/100, validation error 1.220000 %
103 |      epoch 34, minibatch 100/100, test error of best model 1.070000 %
104 | training @ iter = 3400
105 | epoch 35, minibatch 100/100, validation error 1.220000 %
106 | training @ iter = 3500
107 | epoch 36, minibatch 100/100, validation error 1.190000 %
108 |      epoch 36, minibatch 100/100, test error of best model 1.050000 %
109 | training @ iter = 3600
110 | epoch 37, minibatch 100/100, validation error 1.190000 %
111 | training @ iter = 3700
112 | epoch 38, minibatch 100/100, validation error 1.180000 %
113 |      epoch 38, minibatch 100/100, test error of best model 1.070000 %
114 | training @ iter = 3800
115 | epoch 39, minibatch 100/100, validation error 1.180000 %
116 | training @ iter = 3900
117 | epoch 40, minibatch 100/100, validation error 1.170000 %
118 |      epoch 40, minibatch 100/100, test error of best model 1.070000 %
119 | training @ iter = 4000
120 | epoch 41, minibatch 100/100, validation error 1.150000 %
121 |      epoch 41, minibatch 100/100, test error of best model 1.080000 %
122 | training @ iter = 4100
123 | epoch 42, minibatch 100/100, validation error 1.150000 %
124 | training @ iter = 4200
125 | epoch 43, minibatch 100/100, validation error 1.140000 %
126 |      epoch 43, minibatch 100/100, test error of best model 1.060000 %
127 | training @ iter = 4300
128 | epoch 44, minibatch 100/100, validation error 1.130000 %
129 |      epoch 44, minibatch 100/100, test error of best model 1.050000 %
130 | training @ iter = 4400
131 | epoch 45, minibatch 100/100, validation error 1.130000 %
132 | training @ iter = 4500
133 | epoch 46, minibatch 100/100, validation error 1.120000 %
134 |      epoch 46, minibatch 100/100, test error of best model 1.050000 %
135 | training @ iter = 4600
136 | epoch 47, minibatch 100/100, validation error 1.110000 %
137 |      epoch 47, minibatch 100/100, test error of best model 1.040000 %
138 | training @ iter = 4700
139 | epoch 48, minibatch 100/100, validation error 1.090000 %
140 |      epoch 48, minibatch 100/100, test error of best model 1.050000 %
141 | training @ iter = 4800
142 | epoch 49, minibatch 100/100, validation error 1.090000 %
143 | training @ iter = 4900
144 | epoch 50, minibatch 100/100, validation error 1.090000 %
145 | training @ iter = 5000
146 | epoch 51, minibatch 100/100, validation error 1.100000 %
147 | training @ iter = 5100
148 | epoch 52, minibatch 100/100, validation error 1.090000 %
149 | training @ iter = 5200
150 | epoch 53, minibatch 100/100, validation error 1.080000 %
151 |      epoch 53, minibatch 100/100, test error of best model 1.030000 %
152 | training @ iter = 5300
153 | epoch 54, minibatch 100/100, validation error 1.070000 %
154 |      epoch 54, minibatch 100/100, test error of best model 1.030000 %
155 | training @ iter = 5400
156 | epoch 55, minibatch 100/100, validation error 1.070000 %
157 | training @ iter = 5500
158 | epoch 56, minibatch 100/100, validation error 1.080000 %
159 | training @ iter = 5600
160 | epoch 57, minibatch 100/100, validation error 1.080000 %
161 | training @ iter = 5700
162 | epoch 58, minibatch 100/100, validation error 1.070000 %
163 | training @ iter = 5800
164 | epoch 59, minibatch 100/100, validation error 1.060000 %
165 |      epoch 59, minibatch 100/100, test error of best model 0.990000 %
166 | training @ iter = 5900
167 | epoch 60, minibatch 100/100, validation error 1.070000 %
168 | training @ iter = 6000
169 | epoch 61, minibatch 100/100, validation error 1.070000 %
170 | training @ iter = 6100
171 | epoch 62, minibatch 100/100, validation error 1.070000 %
172 | training @ iter = 6200
173 | epoch 63, minibatch 100/100, validation error 1.070000 %
174 | training @ iter = 6300
175 | epoch 64, minibatch 100/100, validation error 1.070000 %
176 | training @ iter = 6400
177 | epoch 65, minibatch 100/100, validation error 1.070000 %
178 | training @ iter = 6500
179 | epoch 66, minibatch 100/100, validation error 1.060000 %
180 | training @ iter = 6600
181 | epoch 67, minibatch 100/100, validation error 1.060000 %
182 | training @ iter = 6700
183 | epoch 68, minibatch 100/100, validation error 1.070000 %
184 | training @ iter = 6800
185 | epoch 69, minibatch 100/100, validation error 1.070000 %
186 | training @ iter = 6900
187 | epoch 70, minibatch 100/100, validation error 1.050000 %
188 |      epoch 70, minibatch 100/100, test error of best model 0.970000 %
189 | training @ iter = 7000
190 | epoch 71, minibatch 100/100, validation error 1.030000 %
191 |      epoch 71, minibatch 100/100, test error of best model 0.970000 %
192 | training @ iter = 7100
193 | epoch 72, minibatch 100/100, validation error 1.030000 %
194 | training @ iter = 7200
195 | epoch 73, minibatch 100/100, validation error 1.020000 %
196 |      epoch 73, minibatch 100/100, test error of best model 0.970000 %
197 | training @ iter = 7300
198 | epoch 74, minibatch 100/100, validation error 1.000000 %
199 |      epoch 74, minibatch 100/100, test error of best model 0.960000 %
200 | training @ iter = 7400
201 | epoch 75, minibatch 100/100, validation error 1.000000 %
202 | training @ iter = 7500
203 | epoch 76, minibatch 100/100, validation error 0.980000 %
204 |      epoch 76, minibatch 100/100, test error of best model 0.970000 %
205 | training @ iter = 7600
206 | epoch 77, minibatch 100/100, validation error 0.970000 %
207 |      epoch 77, minibatch 100/100, test error of best model 0.970000 %
208 | training @ iter = 7700
209 | epoch 78, minibatch 100/100, validation error 0.970000 %
210 | training @ iter = 7800
211 | epoch 79, minibatch 100/100, validation error 0.990000 %
212 | training @ iter = 7900
213 | epoch 80, minibatch 100/100, validation error 0.990000 %
214 | training @ iter = 8000
215 | epoch 81, minibatch 100/100, validation error 1.000000 %
216 | training @ iter = 8100
217 | epoch 82, minibatch 100/100, validation error 1.000000 %
218 | training @ iter = 8200
219 | epoch 83, minibatch 100/100, validation error 0.990000 %
220 | training @ iter = 8300
221 | epoch 84, minibatch 100/100, validation error 0.990000 %
222 | training @ iter = 8400
223 | epoch 85, minibatch 100/100, validation error 0.980000 %
224 | training @ iter = 8500
225 | epoch 86, minibatch 100/100, validation error 0.980000 %
226 | training @ iter = 8600
227 | epoch 87, minibatch 100/100, validation error 0.980000 %
228 | training @ iter = 8700
229 | epoch 88, minibatch 100/100, validation error 0.970000 %
230 | training @ iter = 8800
231 | epoch 89, minibatch 100/100, validation error 0.960000 %
232 |      epoch 89, minibatch 100/100, test error of best model 0.970000 %
233 | training @ iter = 8900
234 | epoch 90, minibatch 100/100, validation error 0.970000 %
235 | training @ iter = 9000
236 | epoch 91, minibatch 100/100, validation error 0.970000 %
237 | training @ iter = 9100
238 | epoch 92, minibatch 100/100, validation error 0.950000 %
239 |      epoch 92, minibatch 100/100, test error of best model 0.970000 %
240 | training @ iter = 9200
241 | epoch 93, minibatch 100/100, validation error 0.940000 %
242 |      epoch 93, minibatch 100/100, test error of best model 0.980000 %
243 | training @ iter = 9300
244 | epoch 94, minibatch 100/100, validation error 0.940000 %
245 | training @ iter = 9400
246 | epoch 95, minibatch 100/100, validation error 0.940000 %
247 | training @ iter = 9500
248 | epoch 96, minibatch 100/100, validation error 0.940000 %
249 | training @ iter = 9600
250 | epoch 97, minibatch 100/100, validation error 0.930000 %
251 |      epoch 97, minibatch 100/100, test error of best model 0.960000 %
252 | training @ iter = 9700
253 | epoch 98, minibatch 100/100, validation error 0.930000 %
254 | training @ iter = 9800
255 | epoch 99, minibatch 100/100, validation error 0.940000 %
256 | training @ iter = 9900
257 | epoch 100, minibatch 100/100, validation error 0.940000 %
258 | training @ iter = 10000
259 | epoch 101, minibatch 100/100, validation error 0.940000 %
260 | training @ iter = 10100
261 | epoch 102, minibatch 100/100, validation error 0.940000 %
262 | training @ iter = 10200
263 | epoch 103, minibatch 100/100, validation error 0.940000 %
264 | training @ iter = 10300
265 | epoch 104, minibatch 100/100, validation error 0.940000 %
266 | training @ iter = 10400
267 | epoch 105, minibatch 100/100, validation error 0.950000 %
268 | training @ iter = 10500
269 | epoch 106, minibatch 100/100, validation error 0.950000 %
270 | training @ iter = 10600
271 | epoch 107, minibatch 100/100, validation error 0.950000 %
272 | training @ iter = 10700
273 | epoch 108, minibatch 100/100, validation error 0.950000 %
274 | training @ iter = 10800
275 | epoch 109, minibatch 100/100, validation error 0.950000 %
276 | training @ iter = 10900
277 | epoch 110, minibatch 100/100, validation error 0.950000 %
278 | training @ iter = 11000
279 | epoch 111, minibatch 100/100, validation error 0.950000 %
280 | training @ iter = 11100
281 | epoch 112, minibatch 100/100, validation error 0.950000 %
282 | training @ iter = 11200
283 | epoch 113, minibatch 100/100, validation error 0.940000 %
284 | training @ iter = 11300
285 | epoch 114, minibatch 100/100, validation error 0.940000 %
286 | training @ iter = 11400
287 | epoch 115, minibatch 100/100, validation error 0.940000 %
288 | training @ iter = 11500
289 | epoch 116, minibatch 100/100, validation error 0.940000 %
290 | training @ iter = 11600
291 | epoch 117, minibatch 100/100, validation error 0.940000 %
292 | training @ iter = 11700
293 | epoch 118, minibatch 100/100, validation error 0.940000 %
294 | training @ iter = 11800
295 | epoch 119, minibatch 100/100, validation error 0.940000 %
296 | training @ iter = 11900
297 | epoch 120, minibatch 100/100, validation error 0.940000 %
298 | training @ iter = 12000
299 | epoch 121, minibatch 100/100, validation error 0.940000 %
300 | training @ iter = 12100
301 | epoch 122, minibatch 100/100, validation error 0.930000 %
302 | training @ iter = 12200
303 | epoch 123, minibatch 100/100, validation error 0.930000 %
304 | training @ iter = 12300
305 | epoch 124, minibatch 100/100, validation error 0.930000 %
306 | training @ iter = 12400
307 | epoch 125, minibatch 100/100, validation error 0.930000 %
308 | training @ iter = 12500
309 | epoch 126, minibatch 100/100, validation error 0.930000 %
310 | training @ iter = 12600
311 | epoch 127, minibatch 100/100, validation error 0.930000 %
312 | training @ iter = 12700
313 | epoch 128, minibatch 100/100, validation error 0.930000 %
314 | training @ iter = 12800
315 | epoch 129, minibatch 100/100, validation error 0.940000 %
316 | training @ iter = 12900
317 | epoch 130, minibatch 100/100, validation error 0.950000 %
318 | training @ iter = 13000
319 | epoch 131, minibatch 100/100, validation error 0.950000 %
320 | training @ iter = 13100
321 | epoch 132, minibatch 100/100, validation error 0.950000 %
322 | training @ iter = 13200
323 | epoch 133, minibatch 100/100, validation error 0.950000 %
324 | training @ iter = 13300
325 | epoch 134, minibatch 100/100, validation error 0.950000 %
326 | training @ iter = 13400
327 | epoch 135, minibatch 100/100, validation error 0.950000 %
328 | training @ iter = 13500
329 | epoch 136, minibatch 100/100, validation error 0.950000 %
330 | training @ iter = 13600
331 | epoch 137, minibatch 100/100, validation error 0.950000 %
332 | training @ iter = 13700
333 | epoch 138, minibatch 100/100, validation error 0.950000 %
334 | training @ iter = 13800
335 | epoch 139, minibatch 100/100, validation error 0.950000 %
336 | training @ iter = 13900
337 | epoch 140, minibatch 100/100, validation error 0.940000 %
338 | training @ iter = 14000
339 | epoch 141, minibatch 100/100, validation error 0.940000 %
340 | training @ iter = 14100
341 | epoch 142, minibatch 100/100, validation error 0.940000 %
342 | training @ iter = 14200
343 | epoch 143, minibatch 100/100, validation error 0.940000 %
344 | training @ iter = 14300
345 | epoch 144, minibatch 100/100, validation error 0.940000 %
346 | training @ iter = 14400
347 | epoch 145, minibatch 100/100, validation error 0.940000 %
348 | training @ iter = 14500
349 | epoch 146, minibatch 100/100, validation error 0.940000 %
350 | training @ iter = 14600
351 | epoch 147, minibatch 100/100, validation error 0.940000 %
352 | training @ iter = 14700
353 | epoch 148, minibatch 100/100, validation error 0.940000 %
354 | training @ iter = 14800
355 | epoch 149, minibatch 100/100, validation error 0.940000 %
356 | training @ iter = 14900
357 | epoch 150, minibatch 100/100, validation error 0.940000 %
358 | training @ iter = 15000
359 | epoch 151, minibatch 100/100, validation error 0.940000 %
360 | training @ iter = 15100
361 | epoch 152, minibatch 100/100, validation error 0.940000 %
362 | training @ iter = 15200
363 | epoch 153, minibatch 100/100, validation error 0.940000 %
364 | training @ iter = 15300
365 | epoch 154, minibatch 100/100, validation error 0.940000 %
366 | training @ iter = 15400
367 | epoch 155, minibatch 100/100, validation error 0.940000 %
368 | training @ iter = 15500
369 | epoch 156, minibatch 100/100, validation error 0.940000 %
370 | training @ iter = 15600
371 | epoch 157, minibatch 100/100, validation error 0.930000 %
372 | training @ iter = 15700
373 | epoch 158, minibatch 100/100, validation error 0.930000 %
374 | training @ iter = 15800
375 | epoch 159, minibatch 100/100, validation error 0.930000 %
376 | training @ iter = 15900
377 | epoch 160, minibatch 100/100, validation error 0.930000 %
378 | training @ iter = 16000
379 | epoch 161, minibatch 100/100, validation error 0.920000 %
380 |      epoch 161, minibatch 100/100, test error of best model 0.930000 %
381 | training @ iter = 16100
382 | epoch 162, minibatch 100/100, validation error 0.920000 %
383 | training @ iter = 16200
384 | epoch 163, minibatch 100/100, validation error 0.920000 %
385 | training @ iter = 16300
386 | epoch 164, minibatch 100/100, validation error 0.920000 %
387 | training @ iter = 16400
388 | epoch 165, minibatch 100/100, validation error 0.920000 %
389 | training @ iter = 16500
390 | epoch 166, minibatch 100/100, validation error 0.920000 %
391 | training @ iter = 16600
392 | epoch 167, minibatch 100/100, validation error 0.920000 %
393 | training @ iter = 16700
394 | epoch 168, minibatch 100/100, validation error 0.920000 %
395 | training @ iter = 16800
396 | epoch 169, minibatch 100/100, validation error 0.920000 %
397 | training @ iter = 16900
398 | epoch 170, minibatch 100/100, validation error 0.920000 %
399 | training @ iter = 17000
400 | epoch 171, minibatch 100/100, validation error 0.920000 %
401 | training @ iter = 17100
402 | epoch 172, minibatch 100/100, validation error 0.920000 %
403 | training @ iter = 17200
404 | epoch 173, minibatch 100/100, validation error 0.920000 %
405 | training @ iter = 17300
406 | epoch 174, minibatch 100/100, validation error 0.920000 %
407 | training @ iter = 17400
408 | epoch 175, minibatch 100/100, validation error 0.920000 %
409 | training @ iter = 17500
410 | epoch 176, minibatch 100/100, validation error 0.920000 %
411 | training @ iter = 17600
412 | epoch 177, minibatch 100/100, validation error 0.920000 %
413 | training @ iter = 17700
414 | epoch 178, minibatch 100/100, validation error 0.920000 %
415 | training @ iter = 17800
416 | epoch 179, minibatch 100/100, validation error 0.920000 %
417 | training @ iter = 17900
418 | epoch 180, minibatch 100/100, validation error 0.920000 %
419 | training @ iter = 18000
420 | epoch 181, minibatch 100/100, validation error 0.920000 %
421 | training @ iter = 18100
422 | epoch 182, minibatch 100/100, validation error 0.920000 %
423 | training @ iter = 18200
424 | epoch 183, minibatch 100/100, validation error 0.910000 %
425 |      epoch 183, minibatch 100/100, test error of best model 0.920000 %
426 | training @ iter = 18300
427 | epoch 184, minibatch 100/100, validation error 0.910000 %
428 | training @ iter = 18400
429 | epoch 185, minibatch 100/100, validation error 0.910000 %
430 | training @ iter = 18500
431 | epoch 186, minibatch 100/100, validation error 0.910000 %
432 | training @ iter = 18600
433 | epoch 187, minibatch 100/100, validation error 0.910000 %
434 | training @ iter = 18700
435 | epoch 188, minibatch 100/100, validation error 0.910000 %
436 | training @ iter = 18800
437 | epoch 189, minibatch 100/100, validation error 0.910000 %
438 | training @ iter = 18900
439 | epoch 190, minibatch 100/100, validation error 0.910000 %
440 | training @ iter = 19000
441 | epoch 191, minibatch 100/100, validation error 0.910000 %
442 | training @ iter = 19100
443 | epoch 192, minibatch 100/100, validation error 0.910000 %
444 | training @ iter = 19200
445 | epoch 193, minibatch 100/100, validation error 0.910000 %
446 | training @ iter = 19300
447 | epoch 194, minibatch 100/100, validation error 0.910000 %
448 | training @ iter = 19400
449 | epoch 195, minibatch 100/100, validation error 0.910000 %
450 | training @ iter = 19500
451 | epoch 196, minibatch 100/100, validation error 0.910000 %
452 | training @ iter = 19600
453 | epoch 197, minibatch 100/100, validation error 0.910000 %
454 | training @ iter = 19700
455 | epoch 198, minibatch 100/100, validation error 0.910000 %
456 | training @ iter = 19800
457 | epoch 199, minibatch 100/100, validation error 0.910000 %
458 | training @ iter = 19900
459 | epoch 200, minibatch 100/100, validation error 0.910000 %
460 | Optimization complete.
461 | Best validation score of 0.910000 % obtained at iteration 18300, with test performance 0.920000 %
462 | The code for file cnn_ts_ts.py ran for 1084.74m
463 | 


--------------------------------------------------------------------------------
/result/ts_tt.txt:
--------------------------------------------------------------------------------
  1 | theta is 0.7
  2 | train data is from source;valid data is from target,test data is from target
  3 | modify the time result
  4 | ... loading data
  5 | ... building the model
  6 | ... training
  7 | training @ iter = 0
  8 | epoch 1, minibatch 100/100, validation error 53.760000 %
  9 |      epoch 1, minibatch 100/100, test error of best model 39.140000 %
 10 | training @ iter = 100
 11 | epoch 2, minibatch 100/100, validation error 55.120000 %
 12 | training @ iter = 200
 13 | epoch 3, minibatch 100/100, validation error 56.030000 %
 14 | training @ iter = 300
 15 | epoch 4, minibatch 100/100, validation error 56.860000 %
 16 | training @ iter = 400
 17 | epoch 5, minibatch 100/100, validation error 57.470000 %
 18 | training @ iter = 500
 19 | epoch 6, minibatch 100/100, validation error 57.510000 %
 20 | training @ iter = 600
 21 | epoch 7, minibatch 100/100, validation error 57.210000 %
 22 | training @ iter = 700
 23 | epoch 8, minibatch 100/100, validation error 56.830000 %
 24 | training @ iter = 800
 25 | epoch 9, minibatch 100/100, validation error 56.200000 %
 26 | training @ iter = 900
 27 | epoch 10, minibatch 100/100, validation error 55.600000 %
 28 | training @ iter = 1000
 29 | epoch 11, minibatch 100/100, validation error 55.160000 %
 30 | training @ iter = 1100
 31 | epoch 12, minibatch 100/100, validation error 54.570000 %
 32 | training @ iter = 1200
 33 | epoch 13, minibatch 100/100, validation error 54.070000 %
 34 | training @ iter = 1300
 35 | epoch 14, minibatch 100/100, validation error 53.470000 %
 36 |      epoch 14, minibatch 100/100, test error of best model 45.410000 %
 37 | training @ iter = 1400
 38 | epoch 15, minibatch 100/100, validation error 53.000000 %
 39 |      epoch 15, minibatch 100/100, test error of best model 45.150000 %
 40 | training @ iter = 1500
 41 | epoch 16, minibatch 100/100, validation error 52.640000 %
 42 |      epoch 16, minibatch 100/100, test error of best model 44.860000 %
 43 | training @ iter = 1600
 44 | epoch 17, minibatch 100/100, validation error 52.140000 %
 45 |      epoch 17, minibatch 100/100, test error of best model 44.500000 %
 46 | training @ iter = 1700
 47 | epoch 18, minibatch 100/100, validation error 51.760000 %
 48 |      epoch 18, minibatch 100/100, test error of best model 44.290000 %
 49 | training @ iter = 1800
 50 | epoch 19, minibatch 100/100, validation error 51.490000 %
 51 |      epoch 19, minibatch 100/100, test error of best model 43.970000 %
 52 | training @ iter = 1900
 53 | epoch 20, minibatch 100/100, validation error 51.210000 %
 54 |      epoch 20, minibatch 100/100, test error of best model 43.750000 %
 55 | training @ iter = 2000
 56 | epoch 21, minibatch 100/100, validation error 50.950000 %
 57 |      epoch 21, minibatch 100/100, test error of best model 43.640000 %
 58 | training @ iter = 2100
 59 | epoch 22, minibatch 100/100, validation error 50.570000 %
 60 |      epoch 22, minibatch 100/100, test error of best model 43.480000 %
 61 | training @ iter = 2200
 62 | epoch 23, minibatch 100/100, validation error 50.360000 %
 63 |      epoch 23, minibatch 100/100, test error of best model 43.360000 %
 64 | training @ iter = 2300
 65 | epoch 24, minibatch 100/100, validation error 50.210000 %
 66 |      epoch 24, minibatch 100/100, test error of best model 43.190000 %
 67 | training @ iter = 2400
 68 | epoch 25, minibatch 100/100, validation error 50.010000 %
 69 |      epoch 25, minibatch 100/100, test error of best model 43.150000 %
 70 | training @ iter = 2500
 71 | epoch 26, minibatch 100/100, validation error 49.810000 %
 72 |      epoch 26, minibatch 100/100, test error of best model 43.100000 %
 73 | training @ iter = 2600
 74 | epoch 27, minibatch 100/100, validation error 49.640000 %
 75 |      epoch 27, minibatch 100/100, test error of best model 43.000000 %
 76 | training @ iter = 2700
 77 | epoch 28, minibatch 100/100, validation error 49.510000 %
 78 |      epoch 28, minibatch 100/100, test error of best model 42.960000 %
 79 | training @ iter = 2800
 80 | epoch 29, minibatch 100/100, validation error 49.390000 %
 81 |      epoch 29, minibatch 100/100, test error of best model 42.890000 %
 82 | training @ iter = 2900
 83 | epoch 30, minibatch 100/100, validation error 49.200000 %
 84 |      epoch 30, minibatch 100/100, test error of best model 42.860000 %
 85 | training @ iter = 3000
 86 | epoch 31, minibatch 100/100, validation error 49.030000 %
 87 |      epoch 31, minibatch 100/100, test error of best model 42.790000 %
 88 | training @ iter = 3100
 89 | epoch 32, minibatch 100/100, validation error 48.900000 %
 90 |      epoch 32, minibatch 100/100, test error of best model 42.690000 %
 91 | training @ iter = 3200
 92 | epoch 33, minibatch 100/100, validation error 48.760000 %
 93 |      epoch 33, minibatch 100/100, test error of best model 42.590000 %
 94 | training @ iter = 3300
 95 | epoch 34, minibatch 100/100, validation error 48.680000 %
 96 |      epoch 34, minibatch 100/100, test error of best model 42.520000 %
 97 | training @ iter = 3400
 98 | epoch 35, minibatch 100/100, validation error 48.620000 %
 99 |      epoch 35, minibatch 100/100, test error of best model 42.460000 %
100 | training @ iter = 3500
101 | epoch 36, minibatch 100/100, validation error 48.510000 %
102 |      epoch 36, minibatch 100/100, test error of best model 42.430000 %
103 | training @ iter = 3600
104 | epoch 37, minibatch 100/100, validation error 48.400000 %
105 |      epoch 37, minibatch 100/100, test error of best model 42.360000 %
106 | training @ iter = 3700
107 | epoch 38, minibatch 100/100, validation error 48.360000 %
108 |      epoch 38, minibatch 100/100, test error of best model 42.410000 %
109 | training @ iter = 3800
110 | epoch 39, minibatch 100/100, validation error 48.340000 %
111 |      epoch 39, minibatch 100/100, test error of best model 42.430000 %
112 | training @ iter = 3900
113 | epoch 40, minibatch 100/100, validation error 48.330000 %
114 |      epoch 40, minibatch 100/100, test error of best model 42.380000 %
115 | training @ iter = 4000
116 | epoch 41, minibatch 100/100, validation error 48.280000 %
117 |      epoch 41, minibatch 100/100, test error of best model 42.350000 %
118 | training @ iter = 4100
119 | epoch 42, minibatch 100/100, validation error 48.250000 %
120 |      epoch 42, minibatch 100/100, test error of best model 42.300000 %
121 | training @ iter = 4200
122 | epoch 43, minibatch 100/100, validation error 48.200000 %
123 |      epoch 43, minibatch 100/100, test error of best model 42.290000 %
124 | training @ iter = 4300
125 | epoch 44, minibatch 100/100, validation error 48.180000 %
126 |      epoch 44, minibatch 100/100, test error of best model 42.260000 %
127 | training @ iter = 4400
128 | epoch 45, minibatch 100/100, validation error 48.120000 %
129 |      epoch 45, minibatch 100/100, test error of best model 42.230000 %
130 | training @ iter = 4500
131 | epoch 46, minibatch 100/100, validation error 48.080000 %
132 |      epoch 46, minibatch 100/100, test error of best model 42.230000 %
133 | training @ iter = 4600
134 | epoch 47, minibatch 100/100, validation error 48.040000 %
135 |      epoch 47, minibatch 100/100, test error of best model 42.170000 %
136 | training @ iter = 4700
137 | epoch 48, minibatch 100/100, validation error 47.990000 %
138 |      epoch 48, minibatch 100/100, test error of best model 42.180000 %
139 | training @ iter = 4800
140 | epoch 49, minibatch 100/100, validation error 47.930000 %
141 |      epoch 49, minibatch 100/100, test error of best model 42.160000 %
142 | training @ iter = 4900
143 | epoch 50, minibatch 100/100, validation error 47.950000 %
144 | training @ iter = 5000
145 | epoch 51, minibatch 100/100, validation error 47.940000 %
146 | training @ iter = 5100
147 | epoch 52, minibatch 100/100, validation error 47.940000 %
148 | training @ iter = 5200
149 | epoch 53, minibatch 100/100, validation error 47.930000 %
150 |      epoch 53, minibatch 100/100, test error of best model 42.190000 %
151 | training @ iter = 5300
152 | epoch 54, minibatch 100/100, validation error 47.940000 %
153 | training @ iter = 5400
154 | epoch 55, minibatch 100/100, validation error 47.940000 %
155 | training @ iter = 5500
156 | epoch 56, minibatch 100/100, validation error 47.960000 %
157 | training @ iter = 5600
158 | epoch 57, minibatch 100/100, validation error 47.890000 %
159 |      epoch 57, minibatch 100/100, test error of best model 42.200000 %
160 | training @ iter = 5700
161 | epoch 58, minibatch 100/100, validation error 47.900000 %
162 | training @ iter = 5800
163 | epoch 59, minibatch 100/100, validation error 47.920000 %
164 | training @ iter = 5900
165 | epoch 60, minibatch 100/100, validation error 47.940000 %
166 | training @ iter = 6000
167 | epoch 61, minibatch 100/100, validation error 47.960000 %
168 | training @ iter = 6100
169 | epoch 62, minibatch 100/100, validation error 47.960000 %
170 | training @ iter = 6200
171 | epoch 63, minibatch 100/100, validation error 47.970000 %
172 | training @ iter = 6300
173 | epoch 64, minibatch 100/100, validation error 47.950000 %
174 | training @ iter = 6400
175 | epoch 65, minibatch 100/100, validation error 47.960000 %
176 | training @ iter = 6500
177 | epoch 66, minibatch 100/100, validation error 47.960000 %
178 | training @ iter = 6600
179 | epoch 67, minibatch 100/100, validation error 47.920000 %
180 | training @ iter = 6700
181 | epoch 68, minibatch 100/100, validation error 47.900000 %
182 | training @ iter = 6800
183 | epoch 69, minibatch 100/100, validation error 47.900000 %
184 | training @ iter = 6900
185 | epoch 70, minibatch 100/100, validation error 47.910000 %
186 | training @ iter = 7000
187 | epoch 71, minibatch 100/100, validation error 47.960000 %
188 | training @ iter = 7100
189 | epoch 72, minibatch 100/100, validation error 47.980000 %
190 | training @ iter = 7200
191 | epoch 73, minibatch 100/100, validation error 47.970000 %
192 | training @ iter = 7300
193 | epoch 74, minibatch 100/100, validation error 47.990000 %
194 | training @ iter = 7400
195 | epoch 75, minibatch 100/100, validation error 48.050000 %
196 | training @ iter = 7500
197 | epoch 76, minibatch 100/100, validation error 48.050000 %
198 | training @ iter = 7600
199 | epoch 77, minibatch 100/100, validation error 48.080000 %
200 | training @ iter = 7700
201 | epoch 78, minibatch 100/100, validation error 48.090000 %
202 | training @ iter = 7800
203 | epoch 79, minibatch 100/100, validation error 48.090000 %
204 | training @ iter = 7900
205 | epoch 80, minibatch 100/100, validation error 48.160000 %
206 | training @ iter = 8000
207 | epoch 81, minibatch 100/100, validation error 48.140000 %
208 | training @ iter = 8100
209 | epoch 82, minibatch 100/100, validation error 48.160000 %
210 | training @ iter = 8200
211 | epoch 83, minibatch 100/100, validation error 48.210000 %
212 | training @ iter = 8300
213 | epoch 84, minibatch 100/100, validation error 48.200000 %
214 | training @ iter = 8400
215 | epoch 85, minibatch 100/100, validation error 48.200000 %
216 | training @ iter = 8500
217 | epoch 86, minibatch 100/100, validation error 48.210000 %
218 | training @ iter = 8600
219 | epoch 87, minibatch 100/100, validation error 48.220000 %
220 | training @ iter = 8700
221 | epoch 88, minibatch 100/100, validation error 48.240000 %
222 | training @ iter = 8800
223 | epoch 89, minibatch 100/100, validation error 48.250000 %
224 | training @ iter = 8900
225 | epoch 90, minibatch 100/100, validation error 48.260000 %
226 | training @ iter = 9000
227 | epoch 91, minibatch 100/100, validation error 48.300000 %
228 | training @ iter = 9100
229 | epoch 92, minibatch 100/100, validation error 48.300000 %
230 | training @ iter = 9200
231 | epoch 93, minibatch 100/100, validation error 48.340000 %
232 | training @ iter = 9300
233 | epoch 94, minibatch 100/100, validation error 48.380000 %
234 | training @ iter = 9400
235 | epoch 95, minibatch 100/100, validation error 48.360000 %
236 | training @ iter = 9500
237 | epoch 96, minibatch 100/100, validation error 48.400000 %
238 | training @ iter = 9600
239 | epoch 97, minibatch 100/100, validation error 48.440000 %
240 | training @ iter = 9700
241 | epoch 98, minibatch 100/100, validation error 48.430000 %
242 | training @ iter = 9800
243 | epoch 99, minibatch 100/100, validation error 48.430000 %
244 | training @ iter = 9900
245 | epoch 100, minibatch 100/100, validation error 48.440000 %
246 | training @ iter = 10000
247 | Optimization complete.
248 | Best validation score of 47.890000 % obtained at iteration 5700, with test performance 42.200000 %
249 | The code for file cnn_ts_tt.py ran for 589.79m
250 | 


--------------------------------------------------------------------------------
/result/tst_tt1.txt:
--------------------------------------------------------------------------------
  1 | lmbda is 0.25
  2 | theta is 0.7
  3 | modify the time result... loading data
  4 | ... building the model
  5 | ... training
  6 | training @ iter = 0
  7 | epoch 1, minibatch 100/100, validation error 15.040000 %
  8 |      epoch 1, minibatch 100/100, test error of best model 23.300000 %
  9 | training @ iter = 100
 10 | epoch 2, minibatch 100/100, validation error 20.740000 %
 11 | training @ iter = 200
 12 | epoch 3, minibatch 100/100, validation error 84.510000 %
 13 | training @ iter = 300
 14 | epoch 4, minibatch 100/100, validation error 85.060000 %
 15 | training @ iter = 400
 16 | epoch 5, minibatch 100/100, validation error 79.310000 %
 17 | training @ iter = 500
 18 | epoch 6, minibatch 100/100, validation error 74.090000 %
 19 | training @ iter = 600
 20 | epoch 7, minibatch 100/100, validation error 65.390000 %
 21 | training @ iter = 700
 22 | epoch 8, minibatch 100/100, validation error 56.530000 %
 23 | training @ iter = 800
 24 | epoch 9, minibatch 100/100, validation error 55.560000 %
 25 | training @ iter = 900
 26 | epoch 10, minibatch 100/100, validation error 57.830000 %
 27 | training @ iter = 1000
 28 | epoch 11, minibatch 100/100, validation error 53.360000 %
 29 | training @ iter = 1100
 30 | epoch 12, minibatch 100/100, validation error 48.590000 %
 31 | training @ iter = 1200
 32 | epoch 13, minibatch 100/100, validation error 51.390000 %
 33 | training @ iter = 1300
 34 | epoch 14, minibatch 100/100, validation error 48.310000 %
 35 | training @ iter = 1400
 36 | epoch 15, minibatch 100/100, validation error 53.040000 %
 37 | training @ iter = 1500
 38 | epoch 16, minibatch 100/100, validation error 48.720000 %
 39 | training @ iter = 1600
 40 | epoch 17, minibatch 100/100, validation error 48.630000 %
 41 | training @ iter = 1700
 42 | epoch 18, minibatch 100/100, validation error 46.220000 %
 43 | training @ iter = 1800
 44 | epoch 19, minibatch 100/100, validation error 57.500000 %
 45 | training @ iter = 1900
 46 | epoch 20, minibatch 100/100, validation error 46.050000 %
 47 | training @ iter = 2000
 48 | epoch 21, minibatch 100/100, validation error 51.950000 %
 49 | training @ iter = 2100
 50 | epoch 22, minibatch 100/100, validation error 60.480000 %
 51 | training @ iter = 2200
 52 | epoch 23, minibatch 100/100, validation error 54.880000 %
 53 | training @ iter = 2300
 54 | epoch 24, minibatch 100/100, validation error 44.520000 %
 55 | training @ iter = 2400
 56 | epoch 25, minibatch 100/100, validation error 48.350000 %
 57 | training @ iter = 2500
 58 | epoch 26, minibatch 100/100, validation error 55.230000 %
 59 | training @ iter = 2600
 60 | epoch 27, minibatch 100/100, validation error 58.350000 %
 61 | training @ iter = 2700
 62 | epoch 28, minibatch 100/100, validation error 46.160000 %
 63 | training @ iter = 2800
 64 | epoch 29, minibatch 100/100, validation error 46.560000 %
 65 | training @ iter = 2900
 66 | epoch 30, minibatch 100/100, validation error 45.000000 %
 67 | training @ iter = 3000
 68 | epoch 31, minibatch 100/100, validation error 45.740000 %
 69 | training @ iter = 3100
 70 | epoch 32, minibatch 100/100, validation error 48.940000 %
 71 | training @ iter = 3200
 72 | epoch 33, minibatch 100/100, validation error 65.480000 %
 73 | training @ iter = 3300
 74 | epoch 34, minibatch 100/100, validation error 48.640000 %
 75 | training @ iter = 3400
 76 | epoch 35, minibatch 100/100, validation error 45.640000 %
 77 | training @ iter = 3500
 78 | epoch 36, minibatch 100/100, validation error 47.950000 %
 79 | training @ iter = 3600
 80 | epoch 37, minibatch 100/100, validation error 46.280000 %
 81 | training @ iter = 3700
 82 | epoch 38, minibatch 100/100, validation error 51.160000 %
 83 | training @ iter = 3800
 84 | epoch 39, minibatch 100/100, validation error 46.940000 %
 85 | training @ iter = 3900
 86 | epoch 40, minibatch 100/100, validation error 56.900000 %
 87 | training @ iter = 4000
 88 | epoch 41, minibatch 100/100, validation error 60.910000 %
 89 | training @ iter = 4100
 90 | epoch 42, minibatch 100/100, validation error 64.030000 %
 91 | training @ iter = 4200
 92 | epoch 43, minibatch 100/100, validation error 52.140000 %
 93 | training @ iter = 4300
 94 | epoch 44, minibatch 100/100, validation error 54.460000 %
 95 | training @ iter = 4400
 96 | epoch 45, minibatch 100/100, validation error 47.080000 %
 97 | training @ iter = 4500
 98 | epoch 46, minibatch 100/100, validation error 48.580000 %
 99 | training @ iter = 4600
100 | epoch 47, minibatch 100/100, validation error 58.820000 %
101 | training @ iter = 4700
102 | epoch 48, minibatch 100/100, validation error 56.960000 %
103 | training @ iter = 4800
104 | epoch 49, minibatch 100/100, validation error 51.970000 %
105 | training @ iter = 4900
106 | epoch 50, minibatch 100/100, validation error 64.340000 %
107 | training @ iter = 5000
108 | epoch 51, minibatch 100/100, validation error 47.480000 %
109 | training @ iter = 5100
110 | epoch 52, minibatch 100/100, validation error 57.460000 %
111 | training @ iter = 5200
112 | epoch 53, minibatch 100/100, validation error 53.270000 %
113 | training @ iter = 5300
114 | epoch 54, minibatch 100/100, validation error 62.710000 %
115 | training @ iter = 5400
116 | epoch 55, minibatch 100/100, validation error 56.630000 %
117 | training @ iter = 5500
118 | epoch 56, minibatch 100/100, validation error 53.700000 %
119 | training @ iter = 5600
120 | epoch 57, minibatch 100/100, validation error 56.430000 %
121 | training @ iter = 5700
122 | epoch 58, minibatch 100/100, validation error 54.820000 %
123 | training @ iter = 5800
124 | epoch 59, minibatch 100/100, validation error 56.430000 %
125 | training @ iter = 5900
126 | epoch 60, minibatch 100/100, validation error 60.100000 %
127 | training @ iter = 6000
128 | epoch 61, minibatch 100/100, validation error 61.550000 %
129 | training @ iter = 6100
130 | epoch 62, minibatch 100/100, validation error 56.200000 %
131 | training @ iter = 6200
132 | epoch 63, minibatch 100/100, validation error 42.440000 %
133 | training @ iter = 6300
134 | epoch 64, minibatch 100/100, validation error 56.970000 %
135 | training @ iter = 6400
136 | epoch 65, minibatch 100/100, validation error 47.960000 %
137 | training @ iter = 6500
138 | epoch 66, minibatch 100/100, validation error 38.920000 %
139 | training @ iter = 6600
140 | epoch 67, minibatch 100/100, validation error 57.850000 %
141 | training @ iter = 6700
142 | epoch 68, minibatch 100/100, validation error 59.520000 %
143 | training @ iter = 6800
144 | epoch 69, minibatch 100/100, validation error 51.910000 %
145 | training @ iter = 6900
146 | epoch 70, minibatch 100/100, validation error 42.260000 %
147 | training @ iter = 7000
148 | epoch 71, minibatch 100/100, validation error 56.140000 %
149 | training @ iter = 7100
150 | epoch 72, minibatch 100/100, validation error 49.320000 %
151 | training @ iter = 7200
152 | epoch 73, minibatch 100/100, validation error 51.700000 %
153 | training @ iter = 7300
154 | epoch 74, minibatch 100/100, validation error 57.520000 %
155 | training @ iter = 7400
156 | epoch 75, minibatch 100/100, validation error 40.560000 %
157 | training @ iter = 7500
158 | epoch 76, minibatch 100/100, validation error 47.240000 %
159 | training @ iter = 7600
160 | epoch 77, minibatch 100/100, validation error 47.910000 %
161 | training @ iter = 7700
162 | epoch 78, minibatch 100/100, validation error 45.970000 %
163 | training @ iter = 7800
164 | epoch 79, minibatch 100/100, validation error 45.210000 %
165 | training @ iter = 7900
166 | epoch 80, minibatch 100/100, validation error 49.060000 %
167 | training @ iter = 8000
168 | epoch 81, minibatch 100/100, validation error 46.220000 %
169 | training @ iter = 8100
170 | epoch 82, minibatch 100/100, validation error 53.250000 %
171 | training @ iter = 8200
172 | epoch 83, minibatch 100/100, validation error 42.050000 %
173 | training @ iter = 8300
174 | epoch 84, minibatch 100/100, validation error 44.110000 %
175 | training @ iter = 8400
176 | epoch 85, minibatch 100/100, validation error 49.760000 %
177 | training @ iter = 8500
178 | epoch 86, minibatch 100/100, validation error 49.140000 %
179 | training @ iter = 8600
180 | epoch 87, minibatch 100/100, validation error 55.550000 %
181 | training @ iter = 8700
182 | epoch 88, minibatch 100/100, validation error 56.680000 %
183 | training @ iter = 8800
184 | epoch 89, minibatch 100/100, validation error 51.920000 %
185 | training @ iter = 8900
186 | epoch 90, minibatch 100/100, validation error 47.320000 %
187 | training @ iter = 9000
188 | epoch 91, minibatch 100/100, validation error 60.880000 %
189 | training @ iter = 9100
190 | epoch 92, minibatch 100/100, validation error 41.790000 %
191 | training @ iter = 9200
192 | epoch 93, minibatch 100/100, validation error 53.510000 %
193 | training @ iter = 9300
194 | epoch 94, minibatch 100/100, validation error 55.060000 %
195 | training @ iter = 9400
196 | epoch 95, minibatch 100/100, validation error 43.550000 %
197 | training @ iter = 9500
198 | epoch 96, minibatch 100/100, validation error 57.860000 %
199 | training @ iter = 9600
200 | epoch 97, minibatch 100/100, validation error 40.360000 %
201 | training @ iter = 9700
202 | epoch 98, minibatch 100/100, validation error 46.960000 %
203 | training @ iter = 9800
204 | epoch 99, minibatch 100/100, validation error 44.740000 %
205 | training @ iter = 9900
206 | epoch 100, minibatch 100/100, validation error 53.180000 %
207 | training @ iter = 10000
208 | Optimization complete.
209 | Best validation score of 15.040000 % obtained at iteration 100, with test performance 23.300000 %
210 | The code for file cnn_tst_tt.py ran for 1683.00m
211 | 


--------------------------------------------------------------------------------
/result/tt_tt.txt:
--------------------------------------------------------------------------------
  1 | modify the time result... loading data
  2 | ... building the model
  3 | ... training
  4 | training @ iter = 0
  5 | epoch 1, minibatch 100/100, validation error 12.450000 %
  6 |      epoch 1, minibatch 100/100, test error of best model 21.300000 %
  7 | training @ iter = 100
  8 | epoch 2, minibatch 100/100, validation error 8.120000 %
  9 |      epoch 2, minibatch 100/100, test error of best model 16.990000 %
 10 | training @ iter = 200
 11 | epoch 3, minibatch 100/100, validation error 6.380000 %
 12 |      epoch 3, minibatch 100/100, test error of best model 14.950000 %
 13 | training @ iter = 300
 14 | epoch 4, minibatch 100/100, validation error 5.260000 %
 15 |      epoch 4, minibatch 100/100, test error of best model 14.220000 %
 16 | training @ iter = 400
 17 | epoch 5, minibatch 100/100, validation error 5.990000 %
 18 | training @ iter = 500
 19 | epoch 6, minibatch 100/100, validation error 4.840000 %
 20 |      epoch 6, minibatch 100/100, test error of best model 14.270000 %
 21 | training @ iter = 600
 22 | epoch 7, minibatch 100/100, validation error 3.850000 %
 23 |      epoch 7, minibatch 100/100, test error of best model 13.110000 %
 24 | training @ iter = 700
 25 | epoch 8, minibatch 100/100, validation error 3.280000 %
 26 |      epoch 8, minibatch 100/100, test error of best model 12.630000 %
 27 | training @ iter = 800
 28 | epoch 9, minibatch 100/100, validation error 2.910000 %
 29 |      epoch 9, minibatch 100/100, test error of best model 12.250000 %
 30 | training @ iter = 900
 31 | epoch 10, minibatch 100/100, validation error 2.610000 %
 32 |      epoch 10, minibatch 100/100, test error of best model 11.990000 %
 33 | training @ iter = 1000
 34 | epoch 11, minibatch 100/100, validation error 2.510000 %
 35 |      epoch 11, minibatch 100/100, test error of best model 11.660000 %
 36 | training @ iter = 1100
 37 | epoch 12, minibatch 100/100, validation error 2.350000 %
 38 |      epoch 12, minibatch 100/100, test error of best model 11.660000 %
 39 | training @ iter = 1200
 40 | epoch 13, minibatch 100/100, validation error 2.160000 %
 41 |      epoch 13, minibatch 100/100, test error of best model 11.440000 %
 42 | training @ iter = 1300
 43 | epoch 14, minibatch 100/100, validation error 2.040000 %
 44 |      epoch 14, minibatch 100/100, test error of best model 11.370000 %
 45 | training @ iter = 1400
 46 | epoch 15, minibatch 100/100, validation error 1.980000 %
 47 |      epoch 15, minibatch 100/100, test error of best model 11.270000 %
 48 | training @ iter = 1500
 49 | epoch 16, minibatch 100/100, validation error 1.940000 %
 50 |      epoch 16, minibatch 100/100, test error of best model 11.250000 %
 51 | training @ iter = 1600
 52 | epoch 17, minibatch 100/100, validation error 1.910000 %
 53 |      epoch 17, minibatch 100/100, test error of best model 11.240000 %
 54 | training @ iter = 1700
 55 | epoch 18, minibatch 100/100, validation error 1.830000 %
 56 |      epoch 18, minibatch 100/100, test error of best model 11.300000 %
 57 | training @ iter = 1800
 58 | epoch 19, minibatch 100/100, validation error 1.790000 %
 59 |      epoch 19, minibatch 100/100, test error of best model 11.350000 %
 60 | training @ iter = 1900
 61 | epoch 20, minibatch 100/100, validation error 1.740000 %
 62 |      epoch 20, minibatch 100/100, test error of best model 11.460000 %
 63 | training @ iter = 2000
 64 | epoch 21, minibatch 100/100, validation error 1.730000 %
 65 |      epoch 21, minibatch 100/100, test error of best model 11.370000 %
 66 | training @ iter = 2100
 67 | epoch 22, minibatch 100/100, validation error 1.690000 %
 68 |      epoch 22, minibatch 100/100, test error of best model 11.410000 %
 69 | training @ iter = 2200
 70 | epoch 23, minibatch 100/100, validation error 1.660000 %
 71 |      epoch 23, minibatch 100/100, test error of best model 11.380000 %
 72 | training @ iter = 2300
 73 | epoch 24, minibatch 100/100, validation error 1.630000 %
 74 |      epoch 24, minibatch 100/100, test error of best model 11.420000 %
 75 | training @ iter = 2400
 76 | epoch 25, minibatch 100/100, validation error 1.630000 %
 77 | training @ iter = 2500
 78 | epoch 26, minibatch 100/100, validation error 1.620000 %
 79 |      epoch 26, minibatch 100/100, test error of best model 11.460000 %
 80 | training @ iter = 2600
 81 | epoch 27, minibatch 100/100, validation error 1.620000 %
 82 | training @ iter = 2700
 83 | epoch 28, minibatch 100/100, validation error 1.600000 %
 84 |      epoch 28, minibatch 100/100, test error of best model 11.440000 %
 85 | training @ iter = 2800
 86 | epoch 29, minibatch 100/100, validation error 1.560000 %
 87 |      epoch 29, minibatch 100/100, test error of best model 11.410000 %
 88 | training @ iter = 2900
 89 | epoch 30, minibatch 100/100, validation error 1.550000 %
 90 |      epoch 30, minibatch 100/100, test error of best model 11.440000 %
 91 | training @ iter = 3000
 92 | epoch 31, minibatch 100/100, validation error 1.520000 %
 93 |      epoch 31, minibatch 100/100, test error of best model 11.480000 %
 94 | training @ iter = 3100
 95 | epoch 32, minibatch 100/100, validation error 1.500000 %
 96 |      epoch 32, minibatch 100/100, test error of best model 11.450000 %
 97 | training @ iter = 3200
 98 | epoch 33, minibatch 100/100, validation error 1.500000 %
 99 | training @ iter = 3300
100 | epoch 34, minibatch 100/100, validation error 1.500000 %
101 | training @ iter = 3400
102 | epoch 35, minibatch 100/100, validation error 1.500000 %
103 | training @ iter = 3500
104 | epoch 36, minibatch 100/100, validation error 1.490000 %
105 |      epoch 36, minibatch 100/100, test error of best model 11.450000 %
106 | training @ iter = 3600
107 | epoch 37, minibatch 100/100, validation error 1.480000 %
108 |      epoch 37, minibatch 100/100, test error of best model 11.420000 %
109 | training @ iter = 3700
110 | epoch 38, minibatch 100/100, validation error 1.480000 %
111 | training @ iter = 3800
112 | epoch 39, minibatch 100/100, validation error 1.480000 %
113 | training @ iter = 3900
114 | epoch 40, minibatch 100/100, validation error 1.470000 %
115 |      epoch 40, minibatch 100/100, test error of best model 11.390000 %
116 | training @ iter = 4000
117 | epoch 41, minibatch 100/100, validation error 1.450000 %
118 |      epoch 41, minibatch 100/100, test error of best model 11.360000 %
119 | training @ iter = 4100
120 | epoch 42, minibatch 100/100, validation error 1.450000 %
121 | training @ iter = 4200
122 | epoch 43, minibatch 100/100, validation error 1.450000 %
123 | training @ iter = 4300
124 | epoch 44, minibatch 100/100, validation error 1.450000 %
125 | training @ iter = 4400
126 | epoch 45, minibatch 100/100, validation error 1.430000 %
127 |      epoch 45, minibatch 100/100, test error of best model 11.450000 %
128 | training @ iter = 4500
129 | epoch 46, minibatch 100/100, validation error 1.420000 %
130 |      epoch 46, minibatch 100/100, test error of best model 11.460000 %
131 | training @ iter = 4600
132 | epoch 47, minibatch 100/100, validation error 1.410000 %
133 |      epoch 47, minibatch 100/100, test error of best model 11.460000 %
134 | training @ iter = 4700
135 | epoch 48, minibatch 100/100, validation error 1.400000 %
136 |      epoch 48, minibatch 100/100, test error of best model 11.430000 %
137 | training @ iter = 4800
138 | epoch 49, minibatch 100/100, validation error 1.420000 %
139 | training @ iter = 4900
140 | epoch 50, minibatch 100/100, validation error 1.410000 %
141 | training @ iter = 5000
142 | epoch 51, minibatch 100/100, validation error 1.390000 %
143 |      epoch 51, minibatch 100/100, test error of best model 11.400000 %
144 | training @ iter = 5100
145 | epoch 52, minibatch 100/100, validation error 1.400000 %
146 | training @ iter = 5200
147 | epoch 53, minibatch 100/100, validation error 1.390000 %
148 | training @ iter = 5300
149 | epoch 54, minibatch 100/100, validation error 1.380000 %
150 |      epoch 54, minibatch 100/100, test error of best model 11.430000 %
151 | training @ iter = 5400
152 | epoch 55, minibatch 100/100, validation error 1.390000 %
153 | training @ iter = 5500
154 | epoch 56, minibatch 100/100, validation error 1.390000 %
155 | training @ iter = 5600
156 | epoch 57, minibatch 100/100, validation error 1.360000 %
157 |      epoch 57, minibatch 100/100, test error of best model 11.450000 %
158 | training @ iter = 5700
159 | epoch 58, minibatch 100/100, validation error 1.370000 %
160 | training @ iter = 5800
161 | epoch 59, minibatch 100/100, validation error 1.380000 %
162 | training @ iter = 5900
163 | epoch 60, minibatch 100/100, validation error 1.370000 %
164 | training @ iter = 6000
165 | epoch 61, minibatch 100/100, validation error 1.390000 %
166 | training @ iter = 6100
167 | epoch 62, minibatch 100/100, validation error 1.380000 %
168 | training @ iter = 6200
169 | epoch 63, minibatch 100/100, validation error 1.370000 %
170 | training @ iter = 6300
171 | epoch 64, minibatch 100/100, validation error 1.360000 %
172 | training @ iter = 6400
173 | epoch 65, minibatch 100/100, validation error 1.360000 %
174 | training @ iter = 6500
175 | epoch 66, minibatch 100/100, validation error 1.360000 %
176 | training @ iter = 6600
177 | epoch 67, minibatch 100/100, validation error 1.360000 %
178 | training @ iter = 6700
179 | epoch 68, minibatch 100/100, validation error 1.360000 %
180 | training @ iter = 6800
181 | 


--------------------------------------------------------------------------------