├── README.md
├── TT.png
├── experiments
│   └── cifar-10
│       ├── FC-Tensorizing-Neural-Networks
│       │   ├── 2-layer-tt
│       │   │   ├── input_data.py
│       │   │   ├── net.py
│       │   │   ├── results
│       │   │   │   ├── res_01-04-2016_22_24
│       │   │   │   ├── res_02-04-2016_02_33
│       │   │   │   ├── res_02-04-2016_07_20
│       │   │   │   ├── res_02-04-2016_12_53
│       │   │   │   ├── res_02-04-2016_19_30
│       │   │   │   ├── res_03-04-2016_14_56
│       │   │   │   ├── res_03-04-2016_23_30
│       │   │   │   ├── res_04-04-2016_09_43
│       │   │   │   ├── res_08-04-2016_00_08
│       │   │   │   ├── res_08-04-2016_00_14
│       │   │   │   ├── res_08-04-2016_01_46
│       │   │   │   ├── res_08-04-2016_05_03
│       │   │   │   ├── res_08-04-2016_14_59
│       │   │   │   ├── res_10-04-2016_20_38
│       │   │   │   ├── res_10-04-2016_21_16
│       │   │   │   ├── res_12-04-2016_10_54
│       │   │   │   ├── res_12-04-2016_11_01
│       │   │   │   └── res_30-03-2016_17_12
│       │   │   └── train_cifar.py
│       │   ├── FC-net
│       │   │   ├── input_data.py
│       │   │   ├── net.py
│       │   │   ├── results
│       │   │   │   └── res_30-03-2016_16_58
│       │   │   └── train_cifar.py
│       │   └── README.md
│       ├── conv-Ultimate-Tensorization
│       │   ├── README.md
│       │   ├── eval.py
│       │   ├── input_data.py
│       │   ├── nets
│       │   │   ├── TT-conv-TT-fc.py
│       │   │   ├── TT-conv-fc.py
│       │   │   ├── TT-conv.py
│       │   │   ├── conv-fc.py
│       │   │   └── conv.py
│       │   ├── train.py
│       │   └── train_with_pretrained_convs.py
│       └── data
│           └── prepare_data.py
├── paper.md
├── tensornet
│   ├── __init__.py
│   ├── layers
│   │   ├── __init__.py
│   │   ├── aux.py
│   │   ├── batch_normalization.py
│   │   ├── conv.py
│   │   ├── linear.py
│   │   ├── tt.py
│   │   ├── tt_conv.py
│   │   ├── tt_conv1d_full.py
│   │   ├── tt_conv_direct.py
│   │   └── tt_conv_full.py
│   └── tt
│       ├── __init__.py
│       ├── matrix_svd.py
│       ├── max_ranks.py
│       └── svd.py
├── tests
│   └── python
│       ├── test_matrix_svd.py
│       ├── test_tt.py
│       ├── test_tt_conv.py
│       └── test_tt_conv_full.py
└── ultimate_tensorization_poster.pdf
/README.md:
--------------------------------------------------------------------------------
1 | # TensorNet
2 |
3 | This is a TensorFlow implementation of the Tensor Train compression method for neural networks. It supports the _TT-FC_ layer [1] and the _TT-conv_ layer [2], which act as fully-connected and convolutional layers respectively, but are much more compact. The TT-FC layer is also faster than its uncompressed analog and makes it possible to use hundreds of thousands of hidden units. The ```experiments``` folder contains the code to reproduce the experiments from the papers.
4 |
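To get a feel for the API, here is a minimal sketch of a TT-FC layer call, using the mode factorizations and TT-ranks from `experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/net.py`; treat it as illustrative rather than a stable public API:

```python
import numpy as np
import tensorflow as tf
import tensornet

# Batch of flattened 32x32x3 CIFAR-10 images: 3072 features each.
x = tf.placeholder(tf.float32, shape=(None, 3072))
inp_modes = np.array([4, 4, 4, 4, 4, 3], dtype='int32')  # 4*4*4*4*4*3 = 3072 inputs
out_modes = np.array([8, 8, 8, 8, 8, 8], dtype='int32')  # 8**6 = 262144 hidden units
ranks = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')   # TT-ranks control compression
# TT-FC layer: acts like a 3072 x 262144 fully-connected layer stored in TT-format.
h = tensornet.layers.tt(x, inp_modes, out_modes, ranks,
                        scope='tt_1', biases_initializer=None)
```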
5 |
6 | [1] _Tensorizing Neural Networks_
7 | Alexander Novikov, Dmitry Podoprikhin, Anton Osokin, Dmitry Vetrov; In _Advances in Neural Information Processing Systems 28_ (NIPS-2015) [[arXiv](http://arxiv.org/abs/1509.06569)].
8 |
9 | [2] _Ultimate tensorization: compressing convolutional and FC layers alike_
10 | Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, Dmitry Vetrov; _Learning with Tensors: Why Now and How?_, NIPS-2016 workshop [[arXiv](https://arxiv.org/abs/1611.03214)].
11 |
12 |
13 | Please cite our work if you write a scientific paper using this code.
14 | In BibTeX format:
15 | ```latex
16 | @incollection{novikov15tensornet,
17 | author = {Novikov, Alexander and Podoprikhin, Dmitry and Osokin, Anton and Vetrov, Dmitry},
18 | title = {Tensorizing Neural Networks},
19 | booktitle = {Advances in Neural Information Processing Systems 28 (NIPS)},
20 | year = {2015},
21 | }
22 | @article{garipov16ttconv,
23 | author = {Garipov, Timur and Podoprikhin, Dmitry and Novikov, Alexander and Vetrov, Dmitry},
24 | title = {Ultimate tensorization: compressing convolutional and {FC} layers alike},
25 | journal = {arXiv preprint arXiv:1611.03214},
26 | year = {2016}
27 | }
28 | ```
29 |
30 | # Prerequisites
31 | * [TensorFlow](https://www.tensorflow.org/) (tested with v. 1.1.0)
32 | * [NumPy](http://www.numpy.org/)
33 |
34 | # MATLAB and Theano
35 | We also published a [MATLAB and Theano+Lasagne implementation](https://github.com/Bihaqo/TensorNet) in a separate repository.
36 |
37 | # FAQ
38 | ### What is _tensor_ anyway?
39 | It's just a synonym for a multidimensional array. For example, a matrix is a 2-dimensional tensor.
40 |
41 | ### But in the fully-connected case you work with matrices, why do you need tensor decompositions?
42 | Good point. Actually, the Tensor Train format coincides with the matrix low-rank format when applied to matrices. For this reason, there is a special _matrix Tensor Train format_, which basically does two things: it reshapes the matrix into a tensor (say 10-dimensional) and permutes its dimensions in a special way, and then applies the tensor decomposition to the resulting tensor. This approach proved to be more efficient than the matrix low-rank format for the matrix of a fully-connected layer.
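A toy NumPy sketch of the reshape-and-permute step (illustrative only; the exact mode ordering used inside `tensornet` may differ):

```python
import numpy as np

# Factor each matrix dimension into "modes", reshape, and interleave the modes
# so that each TT-core later handles one (input_mode, output_mode) pair.
W = np.random.randn(6, 8)                    # a small 6 x 8 "weight matrix"
inp_modes, out_modes = [2, 3], [2, 4]        # 6 = 2*3, 8 = 2*4
W_tensor = W.reshape(inp_modes + out_modes)  # shape (2, 3, 2, 4)
W_tensor = W_tensor.transpose(0, 2, 1, 3)    # shape (2, 2, 3, 4): paired modes
# The TT decomposition is then applied to this 4-dimensional tensor instead of
# a plain low-rank factorization of the original matrix.
```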
43 |
44 | ### Where can I read more about this _Tensor Train_ format?
45 | Look at the original paper: Ivan Oseledets, Tensor-Train decomposition, 2011 [[pdf](https://dl.dropboxusercontent.com/content_link/5aBmG8Em2oDCji5AJsviXqKVZWSiVYt4lKkMs2icjskQM79YRCnOoTf2wDP1N3Dh/file?dl=1)]. You can also check out my (Alexander Novikov's) [slides](http://www.slideshare.net/AlexanderNovikov8/tensor-train-decomposition-in-machine-learning), from slide 3 to 14.
46 |
47 | By the way, **train** here means an actual train, with wheels. The name comes from pictures like the one below, which illustrate the Tensor Train format and naturally look like a train (at least that's what they say).
48 |
49 |
50 | ![Tensor Train format illustration](TT.png)
51 |
56 |
57 | ### Are TensorFlow, MATLAB, and Theano implementations compatible?
58 | Unfortunately not (at least not yet).
59 |
60 |
61 | ### I want to implement this in Caffe (or other library without autodiff). Any tips on doing the backward pass?
62 | Great! Write us when you're done or if you have questions along the way.
63 | The MATLAB version of the code has the [backward pass implementation](https://github.com/Bihaqo/TensorNet/blob/master/src/matlab/vl_nntt_backward.m) for TT-FC layer. But note that the forward pass in MATLAB and TensorFlow versions is implemented differently.
64 |
65 | ### Have you tried other tensor decompositions, like CP-decomposition?
66 | We haven't, but this paper uses the CP-decomposition to compress the kernel of a convolutional layer: Lebedev V., Ganin Y. et al., Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition [[arXiv](https://arxiv.org/abs/1412.6553)] [[code](https://github.com/vadim-v-lebedev/cp-decomposition)]. They got nice compression results, but were not able to train CP-conv layers from scratch; they could only take a network with regular convolutional layers, represent them in the CP format, and then fine-tune the rest of the network. Even _fine-tuning_ a CP-conv layer often diverges.
67 |
--------------------------------------------------------------------------------
/TT.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/timgaripov/TensorNet-TF/76299ad4726370bb5e75589017208d7eae7d8666/TT.png
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/input_data.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 |
4 | class DataSet(object):
5 | def __init__(self, images, labels):
6 | """Construct a DataSet.
7 | """
8 | assert images.shape[0] == labels.shape[0], ('images.shape: %s labels.shape: %s' %
9 | (images.shape, labels.shape))
10 | self._num_examples = images.shape[0]
11 | self._images = images
12 | self._labels = labels
13 | self._epochs_completed = 0
14 | self._index_in_epoch = 0
15 |
16 | @property
17 | def images(self):
18 | return self._images
19 |
20 | @property
21 | def labels(self):
22 | return self._labels
23 |
24 | @property
25 | def num_examples(self):
26 | return self._num_examples
27 |
28 | @property
29 | def epochs_completed(self):
30 | return self._epochs_completed
31 |
32 | def next_batch(self, batch_size):
33 | start = self._index_in_epoch
34 | self._index_in_epoch += batch_size
35 | if self._index_in_epoch > self._num_examples:
36 | # Finished epoch
37 | self._epochs_completed += 1
38 | # Shuffle the data
39 | perm = np.arange(self._num_examples)
40 | np.random.shuffle(perm)
41 | self._images = self._images[perm]
42 | self._labels = self._labels[perm]
43 | # Start next epoch
44 | start = 0
45 | self._index_in_epoch = batch_size
46 | assert batch_size <= self._num_examples
47 | end = self._index_in_epoch
48 | return self._images[start:end], self._labels[start:end]
49 |
50 |
51 | def read_data_sets(data_dir):
52 | f = np.load(data_dir + '/cifar.npz')
53 | train_images = f['train_images'].astype('float32')
54 | train_labels = f['train_labels']
55 |
56 | validation_images = f['validation_images'].astype('float32')
57 | validation_labels = f['validation_labels']
58 |
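# Normalize by the per-pixel mean and std computed on the training set
# (the same statistics are applied to the validation set).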
59 | mean = np.mean(train_images, axis=0)[np.newaxis, :]
60 | std = np.std(train_images, axis=0)[np.newaxis, :]
61 |
62 | train_images = (train_images - mean) / std
63 | validation_images = (validation_images - mean) / std
64 |
65 | #train_images = np.reshape(train_images, (-1, 32, 32, 3))
66 | #validation_images = np.reshape(validation_images, (-1, 32, 32, 3))
67 | #train_reshaped = np.empty((train_images.shape[0], 0), dtype=np.float32)
68 | #validation_reshaped = np.empty((validation_images.shape[0], 0), dtype=np.float32)
69 |
70 | #for i in range(4):
71 | # for j in range(4):
72 | # p = np.reshape(train_images[:, 8*i:8*(i+1), 8*j:8*(j+1), :], (-1, 192))
73 | # train_reshaped = np.hstack((train_reshaped, p))
74 | # p = np.reshape(validation_images[:, 8*i:8*(i+1), 8*j:8*(j+1), :], (-1, 192))
75 | # validation_reshaped = np.hstack((validation_reshaped, p))
76 | train = DataSet(train_images, train_labels)
77 | validation = DataSet(validation_images, validation_labels)
78 | return train, validation
79 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/net.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import math
3 | import numpy as np
4 | import sys
5 |
6 |
7 | sys.path.append('../../../../')
8 | import tensornet
9 |
10 | NUM_CLASSES = 10
11 | IMAGE_SIZE = 32
12 | IMAGE_DEPTH = 3
13 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
14 |
15 | opts = {}
16 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
17 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32')
18 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
19 |
20 | opts['inp_modes_2'] = opts['out_modes_1']
21 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
22 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
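# inp_modes_1 factorizes the input: 4*4*4*4*4*3 = 3072 (= IMAGE_PIXELS);
# out_modes_1 gives 8**6 = 262144 hidden units and out_modes_2 gives 4**6 = 4096.
# The ranks_* arrays are the TT-ranks of the two layers (larger ranks = less compression).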
23 |
24 |
25 | opts['use_dropout'] = True
26 | opts['learning_rate_init'] = 0.06
27 | opts['learning_rate_decay_steps'] = 2000
28 | opts['learning_rate_decay_weight'] = 0.64
29 |
30 | def placeholder_inputs():
31 | """Generate placeholder variables to represent the input tensors.
32 |
33 | Returns:
34 | images_ph: Images placeholder.
35 | labels_ph: Labels placeholder.
36 | train_phase_ph: Train phase indicator placeholder.
37 | """
38 | # Note that the shapes of the placeholders match the shapes of the full
39 | # image and label tensors, except the first dimension is now batch_size
40 | # rather than the full size of the train or test data sets.
41 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
42 | labels_ph = tf.placeholder(tf.int32, shape=(None,), name='placeholder/labels')
43 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
44 | return images_ph, labels_ph, train_phase_ph
45 |
46 | def inference(images, train_phase):
47 | """Build the model up to where it may be used for inference.
48 | Args:
49 | images: Images placeholder.
50 | train_phase: Train phase placeholder
51 | Returns:
52 | logits: Output tensor with the computed logits.
53 | """
54 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
55 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
56 |
57 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
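# dropout_rate(p) is the keep probability: p during training when dropout is
# enabled, and 1.0 (i.e. no dropout) at evaluation time.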
58 |
59 |
60 | layers = []
61 | layers.append(images)
62 |
63 |
64 | layers.append(tensornet.layers.tt(layers[-1],
65 | opts['inp_modes_1'],
66 | opts['out_modes_1'],
67 | opts['ranks_1'],
68 | scope='tt_' + str(len(layers)),
69 | biases_initializer=None))
70 |
71 | layers.append(tensornet.layers.batch_normalization(layers[-1],
72 | train_phase,
73 | scope='BN_' + str(len(layers)),
74 | ema_decay=0.8))
75 |
76 | layers.append(tf.nn.relu(layers[-1],
77 | name='relu_' + str(len(layers))))
78 | layers.append(tf.nn.dropout(layers[-1],
79 | dropout_rate(0.6),
80 | name='dropout_' + str(len(layers))))
81 |
82 |
83 | ##########################################
84 | layers.append(tensornet.layers.tt(layers[-1],
85 | opts['inp_modes_2'],
86 | opts['out_modes_2'],
87 | opts['ranks_2'],
88 | scope='tt_' + str(len(layers)),
89 | biases_initializer=None))
90 |
91 | layers.append(tensornet.layers.batch_normalization(layers[-1],
92 | train_phase,
93 | scope='BN_' + str(len(layers)),
94 | ema_decay=0.8))
95 |
96 | layers.append(tf.nn.relu(layers[-1],
97 | name='relu_' + str(len(layers))))
98 |
99 | layers.append(tf.nn.dropout(layers[-1],
100 | dropout_rate(0.6),
101 | name='dropout_' + str(len(layers))))
102 |
103 | ##########################################
104 |
105 | layers.append(tensornet.layers.linear(layers[-1],
106 | NUM_CLASSES,
107 | scope='linear_' + str(len(layers))))
108 |
109 | return layers[-1]
110 |
111 | def loss(logits, labels):
112 | """Calculates the loss from the logits and the labels.
113 | Args:
114 | logits: input tensor, float - [batch_size, NUM_CLASSES].
115 | labels: Labels tensor, int32 - [batch_size].
116 | Returns:
117 | loss: Loss tensor of type float.
118 | """
119 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
120 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
121 | # each with NUM_CLASSES values, all of which are 0.0 except there will
122 | # be a 1.0 in the entry corresponding to the label).
123 | batch_size = tf.size(labels)
124 | labels = tf.expand_dims(labels, 1)
125 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
126 | concated = tf.concat([indices, labels], 1)
127 | onehot_labels = tf.sparse_to_dense(concated,
128 | tf.shape(logits), 1.0, 0.0)
129 |
130 |
131 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits,
132 | labels=onehot_labels,
133 | name='xentropy')
134 | loss = tf.reduce_mean(cross_entropy, name='loss')
135 | tf.summary.scalar('summary/loss', loss)
136 | return loss
137 |
138 | def training(loss):
139 | """Sets up the training Ops.
140 | Creates an optimizer and applies the gradients to all trainable variables.
141 | The Op returned by this function is what must be passed to the
142 | `sess.run()` call to cause the model to train.
143 | Args:
144 | loss: Loss tensor, from loss().
145 | Returns:
146 | train_op: The Op for training.
147 | """
148 | # Create a variable to track the global step.
149 | global_step = tf.Variable(0, name='global_step', trainable=False)
150 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
151 | global_step,
152 | opts['learning_rate_decay_steps'],
153 | opts['learning_rate_decay_weight'],
154 | staircase=True,
155 | name='learning_rate')
156 | tf.summary.scalar('summary/learning_rate', learning_rate)
157 | # Create the gradient descent optimizer with the given learning rate.
158 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
159 |
160 | grads_and_vars = optimizer.compute_gradients(loss)
161 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
162 | return train_op
163 |
164 | def evaluation(logits, labels):
165 | """Evaluate the quality of the logits at predicting the label.
166 | Args:
167 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
168 | labels: Labels tensor, int32 - [batch_size], with values in the
169 | range [0, NUM_CLASSES).
170 | Returns:
171 | A scalar int32 tensor with the number of examples (out of batch_size)
172 | that were predicted correctly.
173 | """
174 | # For a classifier model, we can use the in_top_k Op.
175 | # It returns a bool tensor with shape [batch_size] that is true for
176 | # the examples where the label was in the top k (here k=1)
177 | # of all logits for that example.
178 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
179 | # Return the number of true entries.
180 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
181 | return correct_count
182 |
183 |
184 | def build(new_opts={}):
185 | """ Build graph
186 | Args:
187 | new_opts: dict with additional opts, which will be added to the opts dict.
188 | """
189 | opts.update(new_opts)
190 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
191 | logits = inference(images_ph, train_phase_ph)
192 | loss_out = loss(logits, labels_ph)
193 | train = training(loss_out)
194 | eval_out = evaluation(logits, labels_ph)
195 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_01-04-2016_22_24:
--------------------------------------------------------------------------------
1 | Iterations: 40000
2 | Learning time: 248.66 minutes
3 | Train precision: 0.74416
4 | Train loss: 0.75565
5 | Validation precision: 0.69510
6 | Validation loss: 0.87159
7 | Extra opts: {'ranks_2': array([1, 2, 2, 2, 2, 2, 1], dtype=int32), 'ranks_1': array([1, 2, 2, 2, 2, 2, 1], dtype=int32)}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | [np.prod(opts['out_modes_1'])],
82 | train_phase,
83 | scope='BN_' + str(len(layers)),
84 | ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | [np.prod(opts['out_modes_2'])],
104 | train_phase,
105 | scope='BN_' + str(len(layers)),
106 | ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label's is was in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be added to opts dict/
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_02-04-2016_02_33:
--------------------------------------------------------------------------------
1 | Iterations: 40000
2 | Learning time: 286.55 minutes
3 | Train precision: 0.74570
4 | Train loss: 0.74619
5 | Validation precision: 0.69760
6 | Validation loss: 0.86426
7 | Extra opts: {'ranks_2': array([1, 4, 4, 4, 4, 4, 1], dtype=int32), 'ranks_1': array([1, 4, 4, 4, 4, 4, 1], dtype=int32)}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | [np.prod(opts['out_modes_1'])],
82 | train_phase,
83 | scope='BN_' + str(len(layers)),
84 | ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | [np.prod(opts['out_modes_2'])],
104 | train_phase,
105 | scope='BN_' + str(len(layers)),
106 | ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label's is was in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be added to opts dict/
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_02-04-2016_07_20:
--------------------------------------------------------------------------------
1 | Iterations: 40000
2 | Learning time: 333.25 minutes
3 | Train precision: 0.77144
4 | Train loss: 0.67462
5 | Validation precision: 0.70550
6 | Validation loss: 0.83444
7 | Extra opts: {'ranks_2': array([1, 6, 6, 6, 6, 6, 1], dtype=int32), 'ranks_1': array([1, 6, 6, 6, 6, 6, 1], dtype=int32)}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | [np.prod(opts['out_modes_1'])],
82 | train_phase,
83 | scope='BN_' + str(len(layers)),
84 | ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | [np.prod(opts['out_modes_2'])],
104 | train_phase,
105 | scope='BN_' + str(len(layers)),
106 | ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label's is was in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be added to opts dict/
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_02-04-2016_12_53:
--------------------------------------------------------------------------------
1 | Iterations: 40000
2 | Learning time: 396.60 minutes
3 | Train precision: 0.78978
4 | Train loss: 0.62019
5 | Validation precision: 0.70740
6 | Validation loss: 0.84421
7 | Extra opts: {'ranks_2': array([ 1, 6, 10, 10, 10, 6, 1], dtype=int32), 'ranks_1': array([ 1, 6, 10, 10, 10, 6, 1], dtype=int32)}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | [np.prod(opts['out_modes_1'])],
82 | train_phase,
83 | scope='BN_' + str(len(layers)),
84 | ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | [np.prod(opts['out_modes_2'])],
104 | train_phase,
105 | scope='BN_' + str(len(layers)),
106 | ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label's is was in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be added to opts dict/
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_02-04-2016_19_30:
--------------------------------------------------------------------------------
1 | Iterations: 40000
2 | Learning time: 462.50 minutes
3 | Train precision: 0.80798
4 | Train loss: 0.56969
5 | Validation precision: 0.71340
6 | Validation loss: 0.81786
7 | Extra opts: {'ranks_2': array([ 1, 10, 10, 10, 10, 10, 1], dtype=int32), 'ranks_1': array([ 1, 10, 10, 10, 10, 10, 1], dtype=int32)}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | [np.prod(opts['out_modes_1'])],
82 | train_phase,
83 | scope='BN_' + str(len(layers)),
84 | ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | [np.prod(opts['out_modes_2'])],
104 | train_phase,
105 | scope='BN_' + str(len(layers)),
106 | ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label's is was in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be added to opts dict/
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
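A note on the loss() in the listing above: it assembles dense one-hot targets by hand with tf.expand_dims, tf.concat, tf.pack and tf.sparse_to_dense, the TensorFlow 0.x idiom these runs were executed with. A minimal sketch of the same computation against the TF 1.x-style API (not the version that produced these results; names are illustrative):

```python
import tensorflow as tf

NUM_CLASSES = 10

def onehot_loss(logits, labels):
    # labels: int32 [batch_size]; logits: float [batch_size, NUM_CLASSES]
    onehot_labels = tf.one_hot(labels, depth=NUM_CLASSES, on_value=1.0, off_value=0.0)
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=onehot_labels,
                                                            logits=logits,
                                                            name='xentropy')
    return tf.reduce_mean(cross_entropy, name='loss')
```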
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_08-04-2016_00_08:
--------------------------------------------------------------------------------
1 | Iterations: 1000
2 | Learning time: 3.31 minutes
3 | Train precision: 0.54084
4 | Train loss: 1.28820
5 | Validation precision: 0.52470
6 | Validation loss: 1.32532
7 | Extra opts: {'out_modes_1': array([6, 6, 6, 6, 6, 6], dtype=int32), 'inp_modes_2': array([6, 6, 6, 6, 6, 6], dtype=int32)}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 10, 10, 10, 10, 10, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 10, 10, 10, 10, 10, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | [np.prod(opts['out_modes_1'])],
82 | train_phase,
83 | scope='BN_' + str(len(layers)),
84 | ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | [np.prod(opts['out_modes_2'])],
104 | train_phase,
105 | scope='BN_' + str(len(layers)),
106 | ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label is in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be merged into the opts dict.
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_08-04-2016_00_14:
--------------------------------------------------------------------------------
1 | Iterations: 30000
2 | Learning time: 92.21 minutes
3 | Train precision: 0.77768
4 | Train loss: 0.65437
5 | Validation precision: 0.69740
6 | Validation loss: 0.84584
7 | Extra opts: {'out_modes_1': array([6, 6, 6, 6, 6, 6], dtype=int32), 'inp_modes_2': array([6, 6, 6, 6, 6, 6], dtype=int32)}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 10, 10, 10, 10, 10, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 10, 10, 10, 10, 10, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | [np.prod(opts['out_modes_1'])],
82 | train_phase,
83 | scope='BN_' + str(len(layers)),
84 | ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | [np.prod(opts['out_modes_2'])],
104 | train_phase,
105 | scope='BN_' + str(len(layers)),
106 | ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label is in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be merged into the opts dict.
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
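The dropout_rate lambda in these scripts is terse: it computes the keep probability handed to tf.nn.dropout, which works out to p during training (when use_dropout is set) and to 1.0 otherwise, so dropout is silently disabled at evaluation time. A plain-Python sketch of the same arithmetic (function and argument names are illustrative, not from the repository):

```python
def keep_prob(p, use_dropout, train_phase):
    # Mirrors (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
    return (use_dropout * (p - 1.0)) * float(train_phase) + 1.0

print(keep_prob(0.6, True, True))    # 0.6 -> dropout active while training
print(keep_prob(0.6, True, False))   # 1.0 -> no dropout at evaluation time
print(keep_prob(0.6, False, True))   # 1.0 -> dropout disabled entirely
```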
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_10-04-2016_20_38:
--------------------------------------------------------------------------------
1 | Iterations: 30000
2 | Learning time: 32.98 minutes
3 | Train precision: 0.68852
4 | Train loss: 0.90400
5 | Validation precision: 0.64600
6 | Validation loss: 0.99333
7 | Extra opts: {}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([5, 5, 5, 5, 5, 5], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | [np.prod(opts['out_modes_1'])],
82 | train_phase,
83 | scope='BN_' + str(len(layers)),
84 | ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | [np.prod(opts['out_modes_2'])],
104 | train_phase,
105 | scope='BN_' + str(len(layers)),
106 | ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label is in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be merged into the opts dict.
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
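For the configuration recorded above (inp_modes_1 = [4, 4, 4, 4, 4, 3], out_modes_1 = [5, 5, 5, 5, 5, 5], ranks_1 = [1, 4, 4, 4, 4, 4, 1]), a back-of-the-envelope count shows why the TT-FC layer is compact. Assuming the standard TT-matrix parameterization from the papers, where core k has shape r_{k-1} × n_k × m_k × r_k (the storage layout inside tensornet.layers.tt may differ, but the total is the same), the first TT layer holds roughly 1.4K weights versus 48M for the equivalent dense 3072 × 15625 matrix:

```python
import numpy as np

inp_modes = np.array([4, 4, 4, 4, 4, 3])   # factorization of the 3072 input pixels
out_modes = np.array([5, 5, 5, 5, 5, 5])   # factorization of the 15625 hidden units
ranks = np.array([1, 4, 4, 4, 4, 4, 1])

# TT-matrix weights: sum over cores of r_{k-1} * n_k * m_k * r_k.
tt_params = int(np.sum(ranks[:-1] * inp_modes * out_modes * ranks[1:]))
dense_params = int(np.prod(inp_modes)) * int(np.prod(out_modes))

print(tt_params)     # 1420
print(dense_params)  # 48000000 (3072 * 15625)
```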
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_10-04-2016_21_16:
--------------------------------------------------------------------------------
1 | Iterations: 30000
2 | Learning time: 28.52 minutes
3 | Train precision: 0.64264
4 | Train loss: 1.05579
5 | Validation precision: 0.60860
6 | Validation loss: 1.13930
7 | Extra opts: {}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([5, 5, 5, 5, 5, 5], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | #layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | # [np.prod(opts['out_modes_1'])],
82 | # train_phase,
83 | # scope='BN_' + str(len(layers)),
84 | # ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | #layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | # [np.prod(opts['out_modes_2'])],
104 | # train_phase,
105 | # scope='BN_' + str(len(layers)),
106 | # ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label is in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be merged into the opts dict.
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
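All of these runs share the same schedule (learning_rate_init = 0.06, learning_rate_decay_steps = 2000, learning_rate_decay_weight = 0.64, staircase=True), so tf.train.exponential_decay multiplies the rate by 0.64 once every 2000 steps. A quick plain-Python sanity check of the resulting values (not part of the original code):

```python
def staircase_lr(step, init=0.06, decay_steps=2000, decay=0.64):
    # Same rule as tf.train.exponential_decay(..., staircase=True).
    return init * decay ** (step // decay_steps)

print(staircase_lr(0))        # 0.06
print(staircase_lr(2000))     # ~0.0384 (0.06 * 0.64)
print(staircase_lr(30000))    # ~7.4e-05 by the end of a 30000-iteration run
```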
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_12-04-2016_10_54:
--------------------------------------------------------------------------------
1 | Iterations: 2000
2 | Learning time: 2.97 minutes
3 | Train precision: 0.52590
4 | Train loss: 1.36872
5 | Validation precision: 0.51480
6 | Validation loss: 1.39315
7 | Extra opts: {}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([5, 5, 5, 5, 5, 5], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | #layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | # [np.prod(opts['out_modes_1'])],
82 | # train_phase,
83 | # scope='BN_' + str(len(layers)),
84 | # ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | #layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | # [np.prod(opts['out_modes_2'])],
104 | # train_phase,
105 | # scope='BN_' + str(len(layers)),
106 | # ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label is in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be merged into the opts dict.
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_12-04-2016_11_01:
--------------------------------------------------------------------------------
1 | Iterations: 30000
2 | Learning time: 42.49 minutes
3 | Train precision: 0.71704
4 | Train loss: 0.82338
5 | Validation precision: 0.67280
6 | Validation loss: 0.91675
7 | Extra opts: {}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([5, 5, 5, 5, 5, 5], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | [np.prod(opts['out_modes_1'])],
82 | train_phase,
83 | scope='BN_' + str(len(layers)),
84 | ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | [np.prod(opts['out_modes_2'])],
104 | train_phase,
105 | scope='BN_' + str(len(layers)),
106 | ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label is in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be merged into the opts dict.
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_30-03-2016_17_12:
--------------------------------------------------------------------------------
1 | Iterations: 40000
2 | Learning time: 274.66 minutes
3 | Train precision: 0.73638
4 | Train loss: 0.77769
5 | Validation precision: 0.68290
6 | Validation loss: 0.90713
7 | Extra opts: {}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32')
25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32')
26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
27 |
28 | opts['inp_modes_2'] = opts['out_modes_1']
29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32')
30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32')
31 |
32 |
33 | opts['use_dropout'] = True
34 | opts['learning_rate_init'] = 0.06
35 | opts['learning_rate_decay_steps'] = 2000
36 | opts['learning_rate_decay_weight'] = 0.64
37 |
38 | def placeholder_inputs():
39 | """Generate placeholder variables to represent the input tensors.
40 |
41 | Returns:
42 | images_ph: Images placeholder.
43 | labels_ph: Labels placeholder.
44 | train_phase_ph: Train phase indicator placeholder.
45 | """
46 | # Note that the shapes of the placeholders match the shapes of the full
47 | # image and label tensors, except the first dimension is now batch_size
48 | # rather than the full size of the train or test data sets.
49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
52 | return images_ph, labels_ph, train_phase_ph
53 |
54 | def inference(images, train_phase):
55 | """Build the model up to where it may be used for inference.
56 | Args:
57 | images: Images placeholder.
58 | train_phase: Train phase placeholder
59 | Returns:
60 | logits: Output tensor with the computed logits.
61 | """
62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
64 |
65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
66 |
67 |
68 | layers = []
69 | layers.append(images)
70 |
71 |
72 | layers.append(tensornet.layers.tt(layers[-1],
73 | opts['inp_modes_1'],
74 | opts['out_modes_1'],
75 | opts['ranks_1'],
76 | 3.0, #0.1
77 | 'tt_' + str(len(layers)),
78 | use_biases=False))
79 |
80 | layers.append(tensornet.layers.batch_normalization(layers[-1],
81 | [np.prod(opts['out_modes_1'])],
82 | train_phase,
83 | scope='BN_' + str(len(layers)),
84 | ema_decay=0.8))
85 |
86 | layers.append(tf.nn.relu(layers[-1],
87 | name='relu_' + str(len(layers))))
88 | layers.append(tf.nn.dropout(layers[-1],
89 | dropout_rate(0.6),
90 | name='dropout_' + str(len(layers))))
91 |
92 |
93 | ##########################################
94 | layers.append(tensornet.layers.tt(layers[-1],
95 | opts['inp_modes_2'],
96 | opts['out_modes_2'],
97 | opts['ranks_2'],
98 | 3.0, #0.07
99 | 'tt_' + str(len(layers)),
100 | use_biases=False))
101 |
102 | layers.append(tensornet.layers.batch_normalization(layers[-1],
103 | [np.prod(opts['out_modes_2'])],
104 | train_phase,
105 | scope='BN_' + str(len(layers)),
106 | ema_decay=0.8))
107 |
108 | layers.append(tf.nn.relu(layers[-1],
109 | name='relu_' + str(len(layers))))
110 |
111 | layers.append(tf.nn.dropout(layers[-1],
112 | dropout_rate(0.6),
113 | name='dropout_' + str(len(layers))))
114 |
115 | ##########################################
116 |
117 | layers.append(tensornet.layers.linear(layers[-1],
118 | np.prod(opts['out_modes_2']),
119 | NUM_CLASSES,
120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])),
121 | scope='linear_' + str(len(layers))))
122 |
123 | return layers[-1]
124 |
125 | def loss(logits, labels):
126 | """Calculates the loss from the logits and the labels.
127 | Args:
128 | logits: input tensor, float - [batch_size, NUM_CLASSES].
129 | labels: Labels tensor, int32 - [batch_size].
130 | Returns:
131 | loss: Loss tensor of type float.
132 | """
133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
134 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
135 | # each with NUM_CLASSES values, all of which are 0.0 except there will
136 | # be a 1.0 in the entry corresponding to the label).
137 | batch_size = tf.size(labels)
138 | labels = tf.expand_dims(labels, 1)
139 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
140 | concated = tf.concat(1, [indices, labels])
141 | onehot_labels = tf.sparse_to_dense(concated,
142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
143 |
144 |
145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
146 | onehot_labels,
147 | name='xentropy')
148 | loss = tf.reduce_mean(cross_entropy, name='loss')
149 | tf.scalar_summary('loss', loss, name='summary/loss')
150 | return loss
151 |
152 | def training(loss):
153 | """Sets up the training Ops.
154 | Creates an optimizer and applies the gradients to all trainable variables.
155 | The Op returned by this function is what must be passed to the
156 | `sess.run()` call to cause the model to train.
157 | Args:
158 | loss: Loss tensor, from loss().
159 | Returns:
160 | train_op: The Op for training.
161 | """
162 | # Create a variable to track the global step.
163 | global_step = tf.Variable(0, name='global_step', trainable=False)
164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
165 | global_step,
166 | opts['learning_rate_decay_steps'],
167 | opts['learning_rate_decay_weight'],
168 | staircase=True,
169 | name='learning_rate')
170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
171 | # Create the gradient descent optimizer with the given learning rate.
172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
173 |
174 | grads_and_vars = optimizer.compute_gradients(loss)
175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
176 | return train_op
177 |
178 | def evaluation(logits, labels):
179 | """Evaluate the quality of the logits at predicting the label.
180 | Args:
181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size], with values in the
183 | range [0, NUM_CLASSES).
184 | Returns:
185 | A scalar int32 tensor with the number of examples (out of batch_size)
186 | that were predicted correctly.
187 | """
188 | # For a classifier model, we can use the in_top_k Op.
189 | # It returns a bool tensor with shape [batch_size] that is true for
190 | # the examples where the label is in the top k (here k=1)
191 | # of all logits for that example.
192 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
193 | # Return the number of true entries.
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
195 | return correct_count
196 |
197 |
198 | def build(new_opts={}):
199 | """ Build graph
200 | Args:
201 | new_opts: dict with additional opts, which will be merged into the opts dict.
202 | """
203 | opts.update(new_opts)
204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
205 | logits = inference(images_ph, train_phase_ph)
206 | loss_out = loss(logits, labels_ph)
207 | train = training(loss_out)
208 | eval_out = evaluation(logits, labels_ph)
209 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/FC-net/input_data.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 |
4 | class DataSet(object):
5 | def __init__(self, images, labels):
6 | """Construct a DataSet.
7 | """
8 | assert images.shape[0] == labels.shape[0], ('images.shape: %s labels.shape: %s' %
9 | (images.shape, labels.shape))
10 | self._num_examples = images.shape[0]
11 | self._images = images
12 | self._labels = labels
13 | self._epochs_completed = 0
14 | self._index_in_epoch = 0
15 |
16 | @property
17 | def images(self):
18 | return self._images
19 |
20 | @property
21 | def labels(self):
22 | return self._labels
23 |
24 | @property
25 | def num_examples(self):
26 | return self._num_examples
27 |
28 | @property
29 | def epochs_completed(self):
30 | return self._epochs_completed
31 |
32 | def next_batch(self, batch_size):
33 | start = self._index_in_epoch
34 | self._index_in_epoch += batch_size
35 | if self._index_in_epoch > self._num_examples:
36 | # Finished epoch
37 | self._epochs_completed += 1
38 | # Shuffle the data
39 | perm = np.arange(self._num_examples)
40 | np.random.shuffle(perm)
41 | self._images = self._images[perm]
42 | self._labels = self._labels[perm]
43 | # Start next epoch
44 | start = 0
45 | self._index_in_epoch = batch_size
46 | assert batch_size <= self._num_examples
47 | end = self._index_in_epoch
48 | return self._images[start:end], self._labels[start:end]
49 |
50 |
51 | def read_data_sets(data_dir):
52 | f = np.load(data_dir + '/cifar.npz')
53 | train_images = f['train_images'].astype('float32')
54 | train_labels = f['train_labels']
55 |
56 | validation_images = f['validation_images'].astype('float32')
57 | validation_labels = f['validation_labels']
58 |
59 | mean = np.mean(train_images, axis=0)[np.newaxis, :]
60 | std = np.std(train_images, axis=0)[np.newaxis, :]
61 |
62 |     train_images = (train_images - mean) / std
63 |     validation_images = (validation_images - mean) / std
64 |
65 | train = DataSet(train_images, train_labels)
66 | validation = DataSet(validation_images, validation_labels)
67 | return train, validation
68 |
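A minimal usage sketch for the helpers above (not part of the original file): it assumes `cifar.npz` has already been produced by `experiments/cifar-10/data/prepare_data.py`, and the data path and batch size below are placeholders.

```python
# Usage sketch for DataSet / read_data_sets above (run from the FC-net directory).
# Assumption: cifar.npz was created by prepare_data.py; adjust the path as needed.
import input_data

train, validation = input_data.read_data_sets('../../data')
print(train.num_examples, 'train /', validation.num_examples, 'validation examples')

for step in range(100):
    batch_images, batch_labels = train.next_batch(128)
    # feed batch_images / batch_labels into the model here
print('epochs completed:', train.epochs_completed)
```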
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/FC-net/net.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import math
3 | import numpy as np
4 | import sys
5 |
6 |
7 | sys.path.append('../../../../')
8 | import tensornet
9 |
10 | NUM_CLASSES = 10
11 | IMAGE_SIZE = 32
12 | IMAGE_DEPTH = 3
13 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
14 |
15 | opts = {}
16 |
17 | opts['hidden_units'] = [4096, 4096, 4096, 4096]
18 | opts['use_dropout'] = True
19 | opts['learning_rate_init'] = 1.0
20 | opts['learning_rate_decay_steps'] = 2000
21 | opts['learning_rate_decay_weight'] = 0.64
22 |
23 | def placeholder_inputs():
24 | """Generate placeholder variables to represent the input tensors.
25 |
26 | Returns:
27 | images_ph: Images placeholder.
28 | labels_ph: Labels placeholder.
29 | train_phase_ph: Train phase indicator placeholder.
30 | """
31 | # Note that the shapes of the placeholders match the shapes of the full
32 | # image and label tensors, except the first dimension is now batch_size
33 | # rather than the full size of the train or test data sets.
34 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
35 |     labels_ph = tf.placeholder(tf.int32, shape=[None], name='placeholder/labels')
36 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
37 | return images_ph, labels_ph, train_phase_ph
38 |
39 | def inference(images, train_phase):
40 | """Build the model up to where it may be used for inference.
41 | Args:
42 | images: Images placeholder.
43 | train_phase: Train phase placeholder
44 | Returns:
45 | logits: Output tensor with the computed logits.
46 | """
47 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
48 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
49 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
50 |
51 |
52 |
53 | layers = []
54 | layers.append(images)
55 |
56 | cnt = len(opts['hidden_units'])
57 | for i in range(cnt):
58 | n_out = opts['hidden_units'][i]
59 |
60 | layers.append(tensornet.layers.linear(layers[-1],
61 | n_out,
62 | scope='linear_' + str(len(layers)),
63 | biases_initializer=None))
64 |
65 | layers.append(tensornet.layers.batch_normalization(layers[-1],
66 | train_phase,
67 | scope='BN_' + str(len(layers)),
68 | ema_decay=0.8))
69 | layers.append(tf.nn.relu(layers[-1],
70 | name='relu_' + str(len(layers))))
71 |
72 | layers.append(tf.nn.dropout(layers[-1],
73 | dropout_rate(0.77),
74 | name='dropout_' + str(len(layers))))
75 |
76 | layers.append(tensornet.layers.linear(layers[-1],
77 | NUM_CLASSES,
78 | scope='linear_' + str(len(layers))))
79 |
80 | return layers[-1]
81 |
82 | def loss(logits, labels):
83 | """Calculates the loss from the logits and the labels.
84 | Args:
85 | logits: input tensor, float - [batch_size, NUM_CLASSES].
86 | labels: Labels tensor, int32 - [batch_size].
87 | Returns:
88 | loss: Loss tensor of type float.
89 | """
90 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
91 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
92 | # each with NUM_CLASSES values, all of which are 0.0 except there will
93 | # be a 1.0 in the entry corresponding to the label).
94 | batch_size = tf.size(labels)
95 | labels = tf.expand_dims(labels, 1)
96 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
97 | concated = tf.concat([indices, labels], 1)
98 | onehot_labels = tf.sparse_to_dense(concated,
99 | tf.shape(logits), 1.0, 0.0)
100 |
101 |
102 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits,
103 | labels=onehot_labels,
104 | name='xentropy')
105 | loss = tf.reduce_mean(cross_entropy, name='loss')
106 | tf.summary.scalar('summary/loss', loss)
107 | return loss
108 |
109 | def training(loss):
110 | """Sets up the training Ops.
111 | Creates an optimizer and applies the gradients to all trainable variables.
112 | The Op returned by this function is what must be passed to the
113 | `sess.run()` call to cause the model to train.
114 | Args:
115 | loss: Loss tensor, from loss().
116 | Returns:
117 | train_op: The Op for training.
118 | """
119 | # Create a variable to track the global step.
120 | global_step = tf.Variable(0, name='global_step', trainable=False)
121 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
122 | global_step,
123 | opts['learning_rate_decay_steps'],
124 | opts['learning_rate_decay_weight'],
125 | staircase=True,
126 | name='learning_rate')
127 | tf.summary.scalar('summary/learning_rate', learning_rate)
128 | # Create the gradient descent optimizer with the given learning rate.
129 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
130 |
131 | grads_and_vars = optimizer.compute_gradients(loss)
132 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
133 | return train_op
134 |
135 | def evaluation(logits, labels):
136 | """Evaluate the quality of the logits at predicting the label.
137 | Args:
138 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
139 | labels: Labels tensor, int32 - [batch_size], with values in the
140 | range [0, NUM_CLASSES).
141 | Returns:
142 | A scalar int32 tensor with the number of examples (out of batch_size)
143 | that were predicted correctly.
144 | """
145 | # For a classifier model, we can use the in_top_k Op.
146 | # It returns a bool tensor with shape [batch_size] that is true for
147 |     # the examples where the label is in the top k (here k=1)
148 | # of all logits for that example.
149 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
150 | # Return the number of true entries.
151 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
152 | return correct_count
153 |
154 |
155 | def build(new_opts={}):
156 |     """Builds the model graph.
157 |     Args:
158 |         new_opts: dict with additional opts, merged into the module-level opts dict.
159 | """
160 | opts.update(new_opts)
161 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
162 | logits = inference(images_ph, train_phase_ph)
163 | loss_out = loss(logits, labels_ph)
164 | train = training(loss_out)
165 | eval_out = evaluation(logits, labels_ph)
166 |
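For reference, a minimal driver for the functions above; this is an illustrative sketch, not the repository's `train_cifar.py`, and the data path, batch size and step count are placeholders.

```python
# Illustrative training loop wired from net.py above (TensorFlow 1.x).
# Run from the FC-net directory so net.py can locate the tensornet package.
import tensorflow as tf
import net
import input_data

train_set, _ = input_data.read_data_sets('../../data')  # placeholder path

images_ph, labels_ph, train_phase_ph = net.placeholder_inputs()
logits = net.inference(images_ph, train_phase_ph)
loss_op = net.loss(logits, labels_ph)
train_op = net.training(loss_op)
correct_count = net.evaluation(logits, labels_ph)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1000):
        images, labels = train_set.next_batch(128)
        _, loss_val = sess.run([train_op, loss_op],
                               feed_dict={images_ph: images,
                                          labels_ph: labels,
                                          train_phase_ph: True})
        if step % 100 == 0:
            print('step %d: loss = %.4f' % (step, loss_val))
```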
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/FC-net/results/res_30-03-2016_16_58:
--------------------------------------------------------------------------------
1 | Iterations: 40000
2 | Learning time: 43.06 minutes
3 | Train precision: 0.94912
4 | Train loss: 0.25110
5 | Validation precision: 0.59250
6 | Validation loss: 1.38745
7 | Extra opts: {}
8 | Code:
9 | import tensorflow as tf
10 | import math
11 | import numpy as np
12 | import sys
13 |
14 |
15 | sys.path.append('../../../')
16 | import tensornet
17 |
18 | NUM_CLASSES = 10
19 | IMAGE_SIZE = 32
20 | IMAGE_DEPTH = 3
21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH
22 |
23 | opts = {}
24 |
25 | opts['hidden_units'] = [3072, 4096, 4096, 4096, 4096, 10]
26 | opts['use_dropout'] = True
27 | opts['learning_rate_init'] = 1.0
28 | opts['learning_rate_decay_steps'] = 2000
29 | opts['learning_rate_decay_weight'] = 0.64
30 |
31 | def placeholder_inputs():
32 | """Generate placeholder variables to represent the input tensors.
33 |
34 | Returns:
35 | images_ph: Images placeholder.
36 | labels_ph: Labels placeholder.
37 | train_phase_ph: Train phase indicator placeholder.
38 | """
39 | # Note that the shapes of the placeholders match the shapes of the full
40 | # image and label tensors, except the first dimension is now batch_size
41 | # rather than the full size of the train or test data sets.
42 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images')
43 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels')
44 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase')
45 | return images_ph, labels_ph, train_phase_ph
46 |
47 | def inference(images, train_phase):
48 | """Build the model up to where it may be used for inference.
49 | Args:
50 | images: Images placeholder.
51 | train_phase: Train phase placeholder
52 | Returns:
53 | logits: Output tensor with the computed logits.
54 | """
55 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
56 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
57 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
58 |
59 |
60 |
61 | layers = []
62 | layers.append(images)
63 |
64 | cnt = len(opts['hidden_units'])
65 | for i in range(cnt - 1):
66 | n_in = opts['hidden_units'][i]
67 | n_out = opts['hidden_units'][i + 1]
68 |
69 | layers.append(tensornet.layers.linear(layers[-1],
70 | n_in,
71 | n_out,
72 | init=tn_init(2.0 / n_in),
73 | scope='linear_' + str(len(layers)),
74 | use_biases=(i==cnt-2)))
75 | if (i < cnt - 1):
76 | layers.append(tensornet.layers.batch_normalization(layers[-1],
77 | [n_out],
78 | train_phase,
79 | scope='BN_' + str(len(layers)),
80 | ema_decay=0.8))
81 | layers.append(tf.nn.relu(layers[-1],
82 | name='relu_' + str(len(layers))))
83 |
84 | layers.append(tf.nn.dropout(layers[-1],
85 | dropout_rate(0.77),
86 | name='dropout_' + str(len(layers))))
87 |
88 | return layers[-1]
89 |
90 | def loss(logits, labels):
91 | """Calculates the loss from the logits and the labels.
92 | Args:
93 | logits: input tensor, float - [batch_size, NUM_CLASSES].
94 | labels: Labels tensor, int32 - [batch_size].
95 | Returns:
96 | loss: Loss tensor of type float.
97 | """
98 | # Convert from sparse integer labels in the range [0, NUM_CLASSES)
99 | # to 1-hot dense float vectors (that is we will have batch_size vectors,
100 | # each with NUM_CLASSES values, all of which are 0.0 except there will
101 | # be a 1.0 in the entry corresponding to the label).
102 | batch_size = tf.size(labels)
103 | labels = tf.expand_dims(labels, 1)
104 | indices = tf.expand_dims(tf.range(0, batch_size), 1)
105 | concated = tf.concat(1, [indices, labels])
106 | onehot_labels = tf.sparse_to_dense(concated,
107 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
108 |
109 |
110 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
111 | onehot_labels,
112 | name='xentropy')
113 | loss = tf.reduce_mean(cross_entropy, name='loss')
114 | tf.scalar_summary('loss', loss, name='summary/loss')
115 | return loss
116 |
117 | def training(loss):
118 | """Sets up the training Ops.
119 | Creates an optimizer and applies the gradients to all trainable variables.
120 | The Op returned by this function is what must be passed to the
121 | `sess.run()` call to cause the model to train.
122 | Args:
123 | loss: Loss tensor, from loss().
124 | Returns:
125 | train_op: The Op for training.
126 | """
127 | # Create a variable to track the global step.
128 | global_step = tf.Variable(0, name='global_step', trainable=False)
129 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'],
130 | global_step,
131 | opts['learning_rate_decay_steps'],
132 | opts['learning_rate_decay_weight'],
133 | staircase=True,
134 | name='learning_rate')
135 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate')
136 | # Create the gradient descent optimizer with the given learning rate.
137 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer')
138 |
139 | grads_and_vars = optimizer.compute_gradients(loss)
140 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op')
141 | return train_op
142 |
143 | def evaluation(logits, labels):
144 | """Evaluate the quality of the logits at predicting the label.
145 | Args:
146 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
147 | labels: Labels tensor, int32 - [batch_size], with values in the
148 | range [0, NUM_CLASSES).
149 | Returns:
150 | A scalar int32 tensor with the number of examples (out of batch_size)
151 | that were predicted correctly.
152 | """
153 | # For a classifier model, we can use the in_top_k Op.
154 | # It returns a bool tensor with shape [batch_size] that is true for
155 | # the examples where the label's is was in the top k (here k=1)
156 | # of all logits for that example.
157 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
158 | # Return the number of true entries.
159 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count')
160 | return correct_count
161 |
162 |
163 | def build(new_opts={}):
164 | """ Build graph
165 | Args:
166 | new_opts: dict with additional opts, which will be added to opts dict/
167 | """
168 | opts.update(new_opts)
169 | images_ph, labels_ph, train_phase_ph = placeholder_inputs()
170 | logits = inference(images_ph, train_phase_ph)
171 | loss_out = loss(logits, labels_ph)
172 | train = training(loss_out)
173 | eval_out = evaluation(logits, labels_ph)
174 |
--------------------------------------------------------------------------------
/experiments/cifar-10/FC-Tensorizing-Neural-Networks/README.md:
--------------------------------------------------------------------------------
1 | # Experiments with TT-FC layer
2 |
3 | This folder contains the code to reproduce the experiments on the CIFAR-10 dataset for the paper
4 |
5 | _Tensorizing Neural Networks_
6 | Alexander Novikov, Dmitry Podoprikhin, Anton Osokin, Dmitry Vetrov; In _Advances in Neural Information Processing Systems 28_ (NIPS-2015) [[arXiv](http://arxiv.org/abs/1509.06569)].
7 |
--------------------------------------------------------------------------------
/experiments/cifar-10/conv-Ultimate-Tensorization/README.md:
--------------------------------------------------------------------------------
1 | # Experiments with TT-conv layer
2 |
3 | This folder contains the framework we used to conduct experiments on the CIFAR-10 dataset for the paper
4 |
5 | _Ultimate tensorization: compressing convolutional and FC layers alike_
6 | Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, Dmitry Vetrov; _Learning with Tensors: Why Now and How?_, NIPS-2016 workshop [[arXiv](https://arxiv.org/abs/1611.03214)].
7 |
8 | ## Training
9 |
10 | The following command runs the training procedure:
11 |
12 | ```bash
13 | python3 train.py --net_module= \
14 | --log_dir= \
15 | --data_dir= \
16 | --num_gpus=
17 | ```
18 |
19 | where
20 | * ```net_module``` is the path to a Python file with the network description (e.g. ```./nets/conv.py```);
21 |
22 |
23 | * ```log_dir``` is the path to the directory where summaries and checkpoints are saved (e.g. ```./log/conv```);
24 |
25 | * ```data_dir``` is the path to the directory with the data (e.g. ```../data/```);
26 |
27 |
28 | * ```num_gpus``` is the number of GPUs to use for training.
29 |
30 | ### Training with pretrained convolutional part initialization
31 |
32 | There is an auxiliary script for training a network whose convolutional part is initialized with pretrained weights:
33 |
34 | ```bash
35 | python3 train_with_pretrained_convs.py --net_module=\
36 | --log_dir= \
37 | --num_gpus= \
38 | --data_dir= \
39 | --pretrained_ckpt=
40 | ```
41 |
42 | where ```pretrained_ckpt``` is the path to the checkpoint file with pretrained weights.
43 |
44 | ## Evaluation
45 |
46 | The following command runs evaluation of a trained network:
47 |
48 | ```bash
49 | python3 eval.py --net_module= \
50 | --log_dir= \
51 | --data_dir=
52 | ```
53 |
--------------------------------------------------------------------------------
/experiments/cifar-10/conv-Ultimate-Tensorization/eval.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 | from __future__ import division
3 | from __future__ import print_function
4 | import os
5 | import os.path
6 | import datetime
7 | import shutil
8 | import imp
9 | import time
10 | import tensorflow.python.platform
11 | import numpy as np
12 | from six.moves import xrange # pylint: disable=redefined-builtin
13 | import tensorflow as tf
14 | import re
15 | import input_data
16 | import sys
17 |
18 | import shutil
19 |
20 | net = None
21 |
22 | tf.set_random_seed(12345)
23 | np.random.seed(12345)
24 |
25 | # Basic model parameters as external flags.
26 | flags = tf.app.flags
27 | FLAGS = flags.FLAGS
28 |
29 |
30 | flags.DEFINE_string('net_module', None, 'Module with architecture description.')
31 | flags.DEFINE_string('log_dir', None, 'Directory with log files.')
32 | flags.DEFINE_integer('batch_size', 100, 'Batch size. '
33 | 'Must divide evenly into the dataset sizes.')
34 | flags.DEFINE_string('data_dir', '../data/', 'Directory to put the training data.')
35 |
36 | flags.DEFINE_boolean('log_device_placement', False, """Whether to log device placement.""")
37 |
38 | def tower_loss_and_eval(images, labels, train_phase, cpu_variables=False):
39 | with tf.variable_scope('inference', reuse=False):
40 | logits = net.inference(images, train_phase, cpu_variables=cpu_variables)
41 | losses = net.losses(logits, labels)
42 | total_loss = tf.add_n(losses, name='total_loss')
43 | evaluation = net.evaluation(logits, labels)
44 | return total_loss, evaluation
45 |
46 | def evaluate(sess,
47 | loss,
48 | evaluation,
49 | train_or_val,
50 | images_ph,
51 | images,
52 | labels_ph,
53 | labels):
54 | fmt_str = 'Evaluation [%s]. Batch %d/%d (%d%%). Speed = %.2f sec/b, %.2f img/sec. Batch_loss = %.2f. Batch_precision = %.2f'
55 |
56 | num_batches = labels.size // FLAGS.batch_size
57 | assert labels.size % FLAGS.batch_size == 0, 'Batch size must divide evenly into the dataset sizes.'
58 | assert images.shape[0] == labels.size, 'Images count must be equal to labels count'
59 |
60 | sum_loss = 0.0
61 | sum_correct = 0.0
62 |
63 | w = os.get_terminal_size().columns
64 | sys.stdout.write(('=' * w + '\n') * 2)
65 | sys.stdout.write('\n')
66 | sys.stdout.write('Evaluation [%s]' % train_or_val)
67 |
68 | cum_t = 0.0
69 | for bid in range(num_batches):
70 | b_images = images[bid * FLAGS.batch_size:(bid + 1) * FLAGS.batch_size]
71 | b_labels = labels[bid * FLAGS.batch_size:(bid + 1) * FLAGS.batch_size]
72 | start_time = time.time()
73 | loss_val, eval_val = sess.run([loss, evaluation], feed_dict={images_ph: b_images, labels_ph: b_labels})
74 | duration = time.time() - start_time
75 |
76 | cum_t += duration
77 | sec_per_batch = duration
78 | img_per_sec = FLAGS.batch_size / duration
79 |
80 |
81 | sum_loss += loss_val * FLAGS.batch_size
82 | sum_correct += np.sum(eval_val)
83 |
84 | if cum_t > 0.5:
85 | sys.stdout.write('\r' + fmt_str % (
86 | train_or_val,
87 | bid + 1,
88 | num_batches,
89 | int((bid + 1) * 100.0 / num_batches),
90 | sec_per_batch,
91 | img_per_sec,
92 | loss_val,
93 | np.mean(eval_val) * 100.0
94 | ))
95 | sys.stdout.flush()
96 | cum_t = 0.0
97 |
98 | sys.stdout.write(('\r' + fmt_str + '\n') % (
99 | train_or_val,
100 | num_batches,
101 | num_batches,
102 | int(100.0),
103 | sec_per_batch,
104 | img_per_sec,
105 | loss_val,
106 | np.mean(eval_val) * 100.0
107 | ))
108 |
109 | sys.stdout.write('%s loss = %.2f. %s precision = %.2f.\n\n' % (
110 | train_or_val,
111 | sum_loss / labels.size,
112 | train_or_val,
113 | sum_correct / labels.size * 100.0
114 | ))
115 |
116 | def run_eval(chkpt):
117 | global net
118 | net = imp.load_source('net', FLAGS.net_module)
119 | with tf.Graph().as_default(), tf.device('/cpu:0'):
120 | train_phase = tf.constant(False, name='train_phase', dtype=tf.bool)
121 |
122 | t_images, t_labels = input_data.get_train_data(FLAGS.data_dir)
123 | aux = {
124 | 'mean': np.mean(t_images, axis=0),
125 | 'std': np.std(t_images, axis=0)
126 | }
127 | v_images, v_labels = input_data.get_validation_data(FLAGS.data_dir)
128 |
129 | images_ph = tf.placeholder(tf.float32, shape=[None] + list(t_images.shape[1:]), name='images_ph')
130 | labels_ph = tf.placeholder(tf.int32, shape=[None], name='labels_ph')
131 |
132 | images = net.aug_eval(images_ph, aux)
133 | with tf.device('/gpu:0'):
134 | with tf.name_scope('tower_0') as scope:
135 | loss, evaluation = tower_loss_and_eval(images, labels_ph, train_phase)
136 |
137 |
138 | variable_averages = tf.train.ExponentialMovingAverage(0.999)
139 | variables_averages_op = variable_averages.apply(tf.trainable_variables())
140 |
141 | saver = tf.train.Saver(tf.global_variables())
142 | ema_saver = tf.train.Saver(variable_averages.variables_to_restore())
143 |
144 | sess = tf.Session(config=tf.ConfigProto(
145 | allow_soft_placement=True,
146 | log_device_placement=FLAGS.log_device_placement))
147 |
148 |
149 |
150 | saver.restore(sess, chkpt)
151 | ema_saver.restore(sess, chkpt)
152 | sys.stdout.write('Checkpoint "%s" restored.\n' % (chkpt))
153 | evaluate(sess, loss, evaluation, 'Train', images_ph, t_images, labels_ph, t_labels)
154 | evaluate(sess, loss, evaluation, 'Validation', images_ph, v_images, labels_ph, v_labels)
155 |
156 | def main(_):
157 | latest_chkpt = tf.train.latest_checkpoint(FLAGS.log_dir)
158 | if latest_chkpt is not None:
159 | sys.stdout.write('Checkpoint "%s" found.\n' % latest_chkpt)
160 | run_eval(latest_chkpt)
161 | else:
162 | sys.stdout.write('Checkpoint not found.\n')
163 |
164 | if __name__ == '__main__':
165 | tf.app.run()
166 |
--------------------------------------------------------------------------------
/experiments/cifar-10/conv-Ultimate-Tensorization/input_data.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import tensorflow as tf
3 |
4 |
5 | def get_train_data(data_dir):
6 | f = np.load(data_dir + '/cifar.npz')
7 | images = np.reshape(f['train_images'], [-1, 32, 32, 3])
8 | labels = f['train_labels']
9 | return images, labels
10 |
11 | def get_validation_data(data_dir):
12 | f = np.load(data_dir + '/cifar.npz')
13 | images = np.reshape(f['validation_images'], [-1, 32, 32, 3])
14 | labels = f['validation_labels']
15 | return images, labels
16 |
17 |
18 | def get_input_data(FLAGS):
19 | t_images, t_labels = get_train_data(FLAGS.data_dir)
20 | t_cnt = t_images.shape[0]
21 | train_images_ph = tf.placeholder(dtype=tf.float32, shape=[t_cnt, 32, 32, 3], name='train_images_ph')
22 | train_labels_ph = tf.placeholder(dtype=tf.int32, shape=[t_cnt], name='train_labels_ph')
23 | train_images = tf.Variable(train_images_ph, trainable=False, collections=[], name='train_images')
24 | train_labels = tf.Variable(train_labels_ph, trainable=False, collections=[], name='train_labels')
25 |
26 | train_image_input, train_label_input = tf.train.slice_input_producer([train_images, train_labels],
27 | shuffle=True,
28 | capacity=FLAGS.num_gpus * FLAGS.batch_size + 20,
29 | name='train_input')
30 |
31 |
32 |
33 | v_images, v_labels = get_validation_data(FLAGS.data_dir)
34 | v_cnt = v_images.shape[0]
35 | validation_images_ph = tf.placeholder(dtype=tf.float32, shape=[v_cnt, 32, 32, 3], name='validation_images_ph')
36 | validation_labels_ph = tf.placeholder(dtype=tf.int32, shape=[v_cnt], name='validation_labels_ph')
37 | validation_images = tf.Variable(validation_images_ph, trainable=False, collections=[], name='validation_images')
38 | validation_labels = tf.Variable(validation_labels_ph, trainable=False, collections=[], name='validation_labels')
39 |
40 | validation_image_input, validation_label_input = tf.train.slice_input_producer([validation_images, validation_labels],
41 | shuffle=False,
42 | capacity=FLAGS.batch_size + 20,
43 | name='validation_input')
44 |
45 | result = {}
46 | result['train'] = {
47 | 'images': t_images,
48 | 'labels': t_labels,
49 | 'image_input': train_image_input,
50 | 'label_input': train_label_input
51 | }
52 | result['validation'] = {
53 | 'images': v_images,
54 | 'labels': v_labels,
55 | 'image_input': validation_image_input,
56 | 'label_input': validation_label_input
57 | }
58 | result['initializer'] = [
59 | train_images.initializer,
60 | train_labels.initializer,
61 | validation_images.initializer,
62 | validation_labels.initializer
63 | ]
64 |
65 | result['init_feed'] = {
66 | train_images_ph: t_images,
67 | train_labels_ph: t_labels,
68 | validation_images_ph: v_images,
69 | validation_labels_ph: v_labels
70 | }
71 |
72 | result['aux'] = {
73 | 'mean': np.mean(t_images, axis=0),
74 | 'std': np.std(t_images, axis=0)
75 | }
76 |
77 |
78 | return result
79 |
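The training scripts themselves are not reproduced in this listing; the sketch below shows one plausible way to consume the dictionary returned by `get_input_data` with a TF 1.x queue pipeline. The `FLAGS` stand-in and all values are placeholders, not the repository's actual configuration.

```python
# Illustrative consumer of get_input_data() (not the actual train.py).
import tensorflow as tf
import input_data

class FLAGS(object):       # stand-in for the tf.app.flags configuration
    data_dir = '../data/'
    batch_size = 128
    num_gpus = 1

data = input_data.get_input_data(FLAGS)

# Group the per-example slices produced by slice_input_producer into batches.
images, labels = tf.train.batch(
    [data['train']['image_input'], data['train']['label_input']],
    batch_size=FLAGS.batch_size)

with tf.Session() as sess:
    # The dataset variables are created with collections=[], so they are not
    # covered by global_variables_initializer() and must be initialized here.
    sess.run(data['initializer'], feed_dict=data['init_feed'])
    sess.run(tf.global_variables_initializer())  # model variables, if any
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    batch_images, batch_labels = sess.run([images, labels])
    coord.request_stop()
    coord.join(threads)
```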
--------------------------------------------------------------------------------
/experiments/cifar-10/conv-Ultimate-Tensorization/nets/conv.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import math
3 | import numpy as np
4 | import sys
5 |
6 |
7 | sys.path.append('../../../')
8 | import tensornet
9 |
10 | NUM_CLASSES = 10
11 |
12 |
13 |
14 | opts = {}
15 | opts['use_dropout'] = True
16 | opts['initial_learning_rate'] = 0.1
17 | opts['num_epochs_per_decay'] = 30.0
18 | opts['learning_rate_decay_factor'] = 0.1
19 |
20 | def aug_train(image, aux):
21 | aug_image = tf.pad(image, [[4, 4], [4, 4], [0, 0]])
22 | aug_image = tf.random_crop(aug_image, [32, 32, 3])
23 | aug_image = tf.image.random_flip_left_right(aug_image)
24 | aug_image = tf.image.random_contrast(aug_image, 0.75, 1.25)
25 | aug_image = (aug_image - aux['mean']) / aux['std']
26 | return aug_image
27 |
28 | def aug_eval(image, aux):
29 | aug_image = (image - aux['mean']) / aux['std']
30 | return aug_image
31 |
32 | def inference(images, train_phase, reuse=None, cpu_variables=False):
33 | """Build the model up to where it may be used for inference.
34 | Args:
35 | images: Images placeholder.
36 | train_phase: Train phase placeholder
37 | Returns:
38 | logits: Output tensor with the computed logits.
39 | """
40 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev)
41 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound)
42 |
43 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0
44 |
45 |
46 |
47 | layers = []
48 | layers.append(images)
49 |
50 | layers.append(tensornet.layers.conv(layers[-1],
51 | 64,
52 | [3, 3],
53 | cpu_variables=cpu_variables,
54 | biases_initializer=tf.zeros_initializer(),
55 | scope='conv1.1'))
56 |
57 |
58 | layers.append(tensornet.layers.batch_normalization(layers[-1],
59 | train_phase,
60 | cpu_variables=cpu_variables,
61 | scope='bn1.1'))
62 |
63 | layers.append(tf.nn.relu(layers[-1],
64 | name='relu1.1'))
65 |
66 | layers.append(tensornet.layers.conv(layers[-1],
67 | 64,
68 | [3, 3],
69 | cpu_variables=cpu_variables,
70 | biases_initializer=tf.zeros_initializer(),
71 | scope='conv1.2'))
72 |
73 | layers.append(tensornet.layers.batch_normalization(layers[-1],
74 | train_phase,
75 | cpu_variables=cpu_variables,
76 | scope='bn1.2'))
77 |
78 | layers.append(tf.nn.relu(layers[-1],
79 | name='relu1.2'))
80 |
81 |
82 | layers.append(tf.nn.max_pool(layers[-1],
83 | [1, 3, 3, 1],
84 | [1, 2, 2, 1],
85 | 'SAME',
86 | name='max_pool1'))
87 |
88 | layers.append(tensornet.layers.conv(layers[-1],
89 | 128,
90 | [3, 3],
91 | cpu_variables=cpu_variables,
92 | biases_initializer=tf.zeros_initializer(),
93 | scope='conv2.1'))
94 |
95 | layers.append(tensornet.layers.batch_normalization(layers[-1],
96 | train_phase,
97 | cpu_variables=cpu_variables,
98 | scope='bn2.1'))
99 |
100 | layers.append(tf.nn.relu(layers[-1],
101 | name='relu2.1'))
102 |
103 | layers.append(tensornet.layers.conv(layers[-1],
104 | 128,
105 | [3, 3],
106 | cpu_variables=cpu_variables,
107 | biases_initializer=tf.zeros_initializer(),
108 | scope='conv2.2'))
109 |
110 | layers.append(tensornet.layers.batch_normalization(layers[-1],
111 | train_phase,
112 | cpu_variables=cpu_variables,
113 | scope='bn2.2'))
114 |
115 | layers.append(tf.nn.relu(layers[-1],
116 | name='relu2.2'))
117 |
118 |
119 | layers.append(tf.nn.max_pool(layers[-1],
120 | [1, 3, 3, 1],
121 | [1, 2, 2, 1],
122 | 'SAME',
123 | name='max_pool2'))
124 |
125 | layers.append(tensornet.layers.conv(layers[-1],
126 | 128,
127 | [3, 3],
128 | padding='VALID',
129 | cpu_variables=cpu_variables,
130 | biases_initializer=tf.zeros_initializer(),
131 | scope='conv3.1'))
132 |
133 | layers.append(tensornet.layers.batch_normalization(layers[-1],
134 | train_phase,
135 | cpu_variables=cpu_variables,
136 | scope='bn3.1'))
137 |
138 | layers.append(tf.nn.relu(layers[-1],
139 | name='relu3.1'))
140 |
141 |
142 | layers.append(tensornet.layers.conv(layers[-1],
143 | 128,
144 | [3, 3],
145 | padding='VALID',
146 | cpu_variables=cpu_variables,
147 | biases_initializer=tf.zeros_initializer(),
148 | scope='conv3.2'))
149 |
150 | layers.append(tensornet.layers.batch_normalization(layers[-1],
151 | train_phase,
152 | cpu_variables=cpu_variables,
153 | scope='bn3.2'))
154 |
155 | layers.append(tf.nn.relu(layers[-1],
156 | name='relu3.2'))
157 |
158 |
159 |
160 |
161 | layers.append(tf.nn.avg_pool(layers[-1],
162 | [1,4,4,1],
163 | [1,4,4,1],
164 | 'SAME',
165 | name='avg_pool_full'))
166 |
167 |
168 | sz = np.prod(layers[-1].get_shape().as_list()[1:])
169 |
170 | layers.append(tensornet.layers.linear(tf.reshape(layers[-1], [-1, sz]),
171 | NUM_CLASSES,
172 | cpu_variables=cpu_variables,
173 | biases_initializer=None,
174 | scope='linear4.1'))
175 |
176 | return layers[-1]
177 |
178 | def losses(logits, labels):
179 | """Calculates losses from the logits and the labels.
180 | Args:
181 | logits: input tensor, float - [batch_size, NUM_CLASSES].
182 | labels: Labels tensor, int32 - [batch_size].
183 | Returns:
184 | losses: list of loss tensors of type float.
185 | """
186 | xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels, name='xentropy')
187 | loss = tf.reduce_mean(xentropy, name='loss')
188 | return [loss]
189 |
190 | def evaluation(logits, labels):
191 | """Evaluate the quality of the logits at predicting the label.
192 | Args:
193 | logits: Logits tensor, float - [batch_size, NUM_CLASSES].
194 | labels: Labels tensor, int32 - [batch_size], with values in the
195 | range [0, NUM_CLASSES).
196 | Returns:
197 |       An int32 tensor of shape [batch_size] with value 1 for every example
198 |       that was predicted correctly and 0 otherwise.
199 | """
200 | # For a classifier model, we can use the in_top_k Op.
201 | # It returns a bool tensor with shape [batch_size] that is true for
202 |     # the examples where the label is in the top k (here k=1)
203 | # of all logits for that example.
204 | correct_flags = tf.nn.in_top_k(logits, labels, 1)
205 |     # Return the per-example correctness flags as int32.
206 | return tf.cast(correct_flags, tf.int32)
207 |
--------------------------------------------------------------------------------
/experiments/cifar-10/data/prepare_data.py:
--------------------------------------------------------------------------------
1 | ################################################################
2 | # Load and unpack CIFAR-10 python version #
3 | # from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz #
4 | # #
5 | # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! #
6 | #              Run this script with python2 only               #
7 | # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! #
8 | ################################################################
9 |
10 | import pickle
11 | import numpy as np
12 |
13 | batches_dir = 'cifar-10-batches-py'
14 |
15 | def unpickle(fname):
16 | fo = open(fname, 'rb')
17 | d = pickle.load(fo)
18 | fo.close()
19 | data = np.reshape(d['data'], [-1, 32, 32, 3], order='F')
20 | data = np.transpose(data, [0, 2, 1, 3])
21 | data = np.reshape(data, [-1, 32*32*3])
22 | labels = np.array(d['labels'], dtype='int8')
23 | return data, labels
24 |
25 | for x in range(1, 6):
26 | fname = batches_dir + '/data_batch_' + str(x)
27 | data, labels = unpickle(fname)
28 | if x == 1:
29 | train_images = data
30 | train_labels = labels
31 | else:
32 | train_images = np.vstack((train_images, data))
33 | train_labels = np.concatenate((train_labels, labels))
34 |
35 | validation_images, validation_labels = unpickle(batches_dir + '/test_batch')
36 |
37 | print(train_images.shape, validation_images.shape)
38 | print(train_labels.shape, validation_labels.shape)
39 | np.savez_compressed('cifar', train_images=train_images, validation_images=validation_images,
40 | train_labels=train_labels, validation_labels=validation_labels)
41 |
42 |
43 |
44 |
45 |
46 |
47 |
48 |
49 |
50 |
51 |
--------------------------------------------------------------------------------
/paper.md:
--------------------------------------------------------------------------------
1 | # Ultimate tensorization: compressing convolutional and FC layers alike
2 | Links: [[arXiv](https://arxiv.org/abs/1611.03214)] [[poster pdf](https://github.com/timgaripov/TensorNet-TF/raw/master/ultimate_tensorization_poster.pdf)]
3 |
4 |
5 | Convolutional neural networks excel in image recognition tasks, but this comes at the cost of high computational and memory complexity. To tackle this problem, [1] developed a tensor factorization framework to compress fully-connected layers. In this paper, we focus on compressing convolutional layers. We show that while the direct application of the tensor framework [1] to the 4-dimensional kernel of convolution does compress the layer, we can do better. We reshape the convolutional kernel into a tensor of higher order and factorize it. We combine the proposed approach with the previous work to compress both convolutional and fully-connected layers of a network and achieve an 80x network compression rate with a 1.1% accuracy drop on the CIFAR-10 dataset.
6 |
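To make the idea concrete (this example is not from the paper; the channel modes and TT-ranks are arbitrary placeholders), a 3x3 convolution mapping 64 to 128 channels can be expressed with the TT-conv layer from this repository roughly as follows:

```python
# Minimal TT-conv sketch (TensorFlow 1.x). Modes and ranks are placeholders.
import sys
import numpy as np
import tensorflow as tf

sys.path.append('path/to/TensorNet')   # repository root (placeholder)
import tensornet

images = tf.placeholder(tf.float32, [None, 32, 32, 64])

# Dense 3x3 convolution, 64 -> 128 channels, for comparison.
dense = tensornet.layers.conv(images, 128, [3, 3],
                              biases_initializer=tf.zeros_initializer(),
                              scope='conv_dense')

# TT-conv: factor the 64 input channels as 4*4*4 and the 128 output channels
# as 4*4*8, and decompose the reshaped kernel with (placeholder) TT-ranks.
tt_out = tensornet.layers.tt_conv_full(images,
                                       [3, 3],
                                       np.array([4, 4, 4], dtype=np.int32),
                                       np.array([4, 4, 8], dtype=np.int32),
                                       np.array([16, 16, 16, 1], dtype=np.int32),
                                       biases_initializer=tf.zeros_initializer(),
                                       scope='conv_tt')
```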
--------------------------------------------------------------------------------
/tensornet/__init__.py:
--------------------------------------------------------------------------------
1 | from . import layers
2 | from . import tt
3 |
--------------------------------------------------------------------------------
/tensornet/layers/__init__.py:
--------------------------------------------------------------------------------
1 | from .linear import *
2 | from .batch_normalization import *
3 | from .tt import *
4 | from .conv import *
5 | from .tt_conv import *
6 | from .tt_conv_full import *
7 | from .tt_conv_direct import *
8 |
--------------------------------------------------------------------------------
/tensornet/layers/aux.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 |
4 | def get_var_wrap(name,
5 | shape,
6 | initializer,
7 | regularizer,
8 | trainable,
9 | cpu_variable):
10 | if cpu_variable:
11 | with tf.device('/cpu:0'):
12 | return tf.get_variable(name,
13 | shape=shape,
14 | initializer=initializer,
15 | regularizer=regularizer,
16 | trainable=trainable)
17 | return tf.get_variable(name,
18 | shape=shape,
19 | initializer=initializer,
20 | regularizer=regularizer,
21 | trainable=trainable)
22 |
--------------------------------------------------------------------------------
/tensornet/layers/batch_normalization.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from .aux import get_var_wrap
3 |
4 | def batch_normalization(inp,
5 | train_phase,
6 | ema_decay=0.99,
7 | eps=1e-3,
8 | use_scale=True,
9 | use_shift=True,
10 | trainable=True,
11 | cpu_variables=False,
12 | scope=None):
13 | """Batch normalization layer
14 | Args:
15 | inp: input tensor [batch_el, ...] with 2 or 4 dimensions
16 |         train_phase: tensor [1] of bool, train phase indicator
17 |         ema_decay: moving average decay
18 |         eps: number added to the variance to avoid division by zero
19 |         use_scale: bool, whether to apply the scale transform
20 |         use_shift: bool, whether to apply the shift transform
21 |         trainable: trainable variables flag, bool
22 |         cpu_variables: cpu variables flag, bool
23 |         scope: layer variable scope name, string
24 |     Returns:
25 |         out: normalized tensor of the same shape as inp
26 | """
27 |
28 | reuse = tf.get_variable_scope().reuse
29 | with tf.variable_scope(scope):
30 |
31 | shape = inp.get_shape().as_list()
32 | assert len(shape) in [2, 4]
33 | n_out = shape[-1]
34 |
35 | if len(shape) == 2:
36 | batch_mean, batch_variance = tf.nn.moments(inp, [0], name='moments')
37 | else:
38 | batch_mean, batch_variance = tf.nn.moments(inp, [0, 1, 2], name='moments')
39 | ema = tf.train.ExponentialMovingAverage(decay=ema_decay, zero_debias=True)
40 | if not reuse:
41 | def mean_variance_with_update():
42 | with tf.control_dependencies([ema.apply([batch_mean, batch_variance])]):
43 | return (tf.identity(batch_mean),
44 | tf.identity(batch_variance))
45 |
46 | mean, variance = tf.cond(train_phase,
47 | mean_variance_with_update,
48 | lambda: (ema.average(batch_mean),
49 | ema.average(batch_variance)))
50 | else:
51 |             print("At scope %s reuse is turned on! Using previously created ema variables." % tf.get_variable_scope().name)
52 |
53 | #It's a kind of workaround
54 | vars = tf.get_variable_scope().global_variables()
55 | transform = lambda s: '/'.join(s.split('/')[-5:])
56 |
57 | mean_name = transform(ema.average_name(batch_mean))
58 | variance_name = transform(ema.average_name(batch_variance))
59 |
60 | existed = {}
61 | for v in vars:
62 | if (transform(v.op.name) == mean_name):
63 | existed['mean'] = v
64 | if (transform(v.op.name) == variance_name):
65 | existed['variance'] = v
66 |
67 | print('Using:')
68 | print('\t' + existed['mean'].op.name)
69 | print('\t' + existed['variance'].op.name)
70 |
71 |
72 | mean, variance = tf.cond(train_phase,
73 | lambda: (batch_mean,
74 | batch_variance),
75 | lambda: (existed['mean'],
76 | existed['variance']))
77 |
78 | std = tf.sqrt(variance + eps, name='std')
79 | out = (inp - mean) / std
80 | if use_scale:
81 | weights = get_var_wrap('weights',
82 | shape=[n_out],
83 | initializer=tf.ones_initializer,
84 | trainable=trainable,
85 | regularizer=None,
86 | cpu_variable=cpu_variables)
87 |
88 | out = tf.multiply(out, weights)
89 | if use_shift:
90 | biases = get_var_wrap('biases',
91 | shape=[n_out],
92 | initializer=tf.zeros_initializer,
93 | trainable=trainable,
94 | regularizer=None,
95 | cpu_variable=cpu_variables)
96 |
97 | out = tf.add(out, biases)
98 | return out
99 |
--------------------------------------------------------------------------------
/tensornet/layers/conv.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from .aux import get_var_wrap
3 |
4 | def conv(inp,
5 | out_ch,
6 | window_size,
7 | strides=[1, 1],
8 | padding='SAME',
9 | filters_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
10 | filters_regularizer=None,
11 | biases_initializer=tf.zeros_initializer,
12 | biases_regularizer=None,
13 | trainable=True,
14 | cpu_variables=False,
15 | scope=None):
16 | """ convolutional layer
17 | Args:
18 | inp: input tensor, float - [batch_size, H, W, C]
19 |         out_ch: output channels count, int
20 | window_size: convolution window size, list [wH, wW]
21 | strides: strides, list [sx, sy]
22 | padding: 'SAME' or 'VALID', string
23 | filters_initializer: filters init function
24 | filters_regularizer: filters regularizer function
25 | biases_initializer: biases init function (if None then no biases will be used)
26 | biases_regularizer: biases regularizer function
27 | trainable: trainable variables flag, bool
28 | cpu_variables: cpu variables flag, bool
29 | scope: layer variable scope name, string
30 | Returns:
31 | out: output tensor, float - [batch_size, H', W', out_ch]
32 | """
33 |
34 | with tf.variable_scope(scope):
35 | shape = inp.get_shape().as_list()
36 | assert len(shape) == 4, "Not 4D input tensor"
37 | in_ch = shape[-1]
38 |
39 | filters = get_var_wrap('filters',
40 | shape=window_size + [in_ch, out_ch],
41 | initializer=filters_initializer,
42 | regularizer=filters_regularizer,
43 | trainable=trainable,
44 | cpu_variable=cpu_variables)
45 |
46 | out = tf.nn.conv2d(inp, filters, [1] + strides + [1], padding, name='conv2d')
47 |
48 | if biases_initializer is not None:
49 | biases = get_var_wrap('biases',
50 | shape=[out_ch],
51 | initializer=biases_initializer,
52 | regularizer=biases_regularizer,
53 | trainable=trainable,
54 | cpu_variable=cpu_variables)
55 | out = tf.add(out, biases, name='out')
56 | else:
57 | out = tf.identity(out, name='out')
58 | return out
59 |
--------------------------------------------------------------------------------
/tensornet/layers/linear.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | from .aux import get_var_wrap
3 |
4 | def linear(inp,
5 | out_size,
6 | weights_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
7 | weights_regularizer=None,
8 | biases_initializer=tf.zeros_initializer,
9 | biases_regularizer=None,
10 | trainable=True,
11 | cpu_variables=False,
12 | scope=None):
13 | """ linear layer
14 | Args:
15 | inp: input tensor, float - [batch_size, inp_size]
16 | out_size: layer units count, int
17 | weights_initializer: weights init function
18 | weights_regularizer: weights regularizer function
19 | biases_initializer: biases init function (if None then no biases will be used)
20 | biases_regularizer: biases regularizer function
21 | trainable: trainable variables flag, bool
22 | cpu_variables: cpu variables flag, bool
23 | scope: layer variable scope name, string
24 | Returns:
25 | out: output tensor, float - [batch_size, out_size]
26 | """
27 | with tf.variable_scope(scope):
28 | shape = inp.get_shape().as_list()
29 | assert len(shape) == 2, 'Not 2D input tensor'
30 | inp_size = shape[-1]
31 |
32 | weights = get_var_wrap('weights',
33 | shape=[inp_size, out_size],
34 | initializer=weights_initializer,
35 | regularizer=weights_regularizer,
36 | trainable=trainable,
37 | cpu_variable=cpu_variables)
38 |
39 | if biases_initializer is not None:
40 | biases = get_var_wrap('biases',
41 | shape=[out_size],
42 | initializer=biases_initializer,
43 | regularizer=biases_regularizer,
44 | trainable=trainable,
45 | cpu_variable=cpu_variables)
46 |
47 | out = tf.add(tf.matmul(inp, weights, name='matmul'), biases, name='out')
48 | else:
49 | out = tf.matmul(inp, weights, name='out')
50 | return out
51 |
--------------------------------------------------------------------------------
/tensornet/layers/tt.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | from .aux import get_var_wrap
4 |
5 | def tt(inp,
6 | inp_modes,
7 | out_modes,
8 | mat_ranks,
9 | cores_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
10 | cores_regularizer=None,
11 | biases_initializer=tf.zeros_initializer,
12 | biases_regularizer=None,
13 | trainable=True,
14 | cpu_variables=False,
15 | scope=None):
16 | """ tt-layer (tt-matrix by full tensor product)
17 | Args:
18 | inp: input tensor, float - [batch_size, prod(inp_modes)]
19 | inp_modes: input tensor modes
20 | out_modes: output tensor modes
21 | mat_ranks: tt-matrix ranks
22 | cores_initializer: cores init function, could be a list of functions for specifying different function for each core
23 | cores_regularizer: cores regularizer function, could be a list of functions for specifying different function for each core
24 | biases_initializer: biases init function (if None then no biases will be used)
25 | biases_regularizer: biases regularizer function
26 | trainable: trainable variables flag, bool
27 | cpu_variables: cpu variables flag, bool
28 | scope: layer variable scope name, string
29 | Returns:
30 | out: output tensor, float - [batch_size, prod(out_modes)]
31 | """
32 | with tf.variable_scope(scope):
33 | dim = inp_modes.size
34 |
35 | mat_cores = []
36 |
37 | for i in range(dim):
38 | if type(cores_initializer) == list:
39 | cinit = cores_initializer[i]
40 | else:
41 | cinit = cores_initializer
42 |
43 | if type(cores_regularizer) == list:
44 | creg = cores_regularizer[i]
45 | else:
46 | creg = cores_regularizer
47 |
48 | mat_cores.append(get_var_wrap('mat_core_%d' % (i + 1),
49 | shape=[out_modes[i] * mat_ranks[i + 1], mat_ranks[i] * inp_modes[i]],
50 | initializer=cinit,
51 | regularizer=creg,
52 | trainable=trainable,
53 | cpu_variable=cpu_variables))
54 |
55 |
56 |
57 | out = tf.reshape(inp, [-1, np.prod(inp_modes)])
58 | out = tf.transpose(out, [1, 0])
59 |
60 | for i in range(dim):
61 | out = tf.reshape(out, [mat_ranks[i] * inp_modes[i], -1])
62 |
63 | out = tf.matmul(mat_cores[i], out)
64 | out = tf.reshape(out, [out_modes[i], -1])
65 | out = tf.transpose(out, [1, 0])
66 |
67 | if biases_initializer is not None:
68 |
69 | biases = get_var_wrap('biases',
70 | shape=[np.prod(out_modes)],
71 | initializer=biases_initializer,
72 | regularizer=biases_regularizer,
73 | trainable=trainable,
74 | cpu_variable=cpu_variables)
75 |
76 | out = tf.add(tf.reshape(out, [-1, np.prod(out_modes)]), biases, name="out")
77 | else:
78 | out = tf.reshape(out, [-1, np.prod(out_modes)], name="out")
79 |
80 | return out
81 |
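A minimal usage sketch for the TT-FC layer above (the modes and ranks are arbitrary placeholders): a dense 1024 x 625 weight matrix is replaced by four small TT-cores.

```python
# TT-FC layer sketch (TensorFlow 1.x); modes and ranks are placeholders.
import sys
import numpy as np
import tensorflow as tf

sys.path.append('path/to/TensorNet')   # repository root (placeholder)
import tensornet

inp_modes = np.array([4, 4, 8, 8], dtype=np.int32)     # 4*4*8*8 = 1024 inputs
out_modes = np.array([5, 5, 5, 5], dtype=np.int32)     # 5^4 = 625 outputs
mat_ranks = np.array([1, 4, 4, 4, 1], dtype=np.int32)  # boundary ranks must be 1

x = tf.placeholder(tf.float32, [None, int(np.prod(inp_modes))])
y = tensornet.layers.tt(x, inp_modes, out_modes, mat_ranks,
                        biases_initializer=tf.zeros_initializer(),
                        scope='tt_fc_1')
# y has shape [batch_size, 625]; with these ranks the layer stores about
# 1.2K core parameters plus 625 biases instead of a 1024 x 625 dense matrix.
```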
--------------------------------------------------------------------------------
/tensornet/layers/tt_conv.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | from .aux import get_var_wrap
5 |
6 | def tt_conv(inp,
7 | window,
8 | inp_ch_modes,
9 | out_ch_modes,
10 | ranks,
11 | strides=[1, 1],
12 | padding='SAME',
13 | filters_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
14 | filters_regularizer=None,
15 | cores_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
16 | cores_regularizer=None,
17 | biases_initializer=tf.zeros_initializer,
18 | biases_regularizer=None,
19 | trainable=True,
20 | cpu_variables=False,
21 | scope=None):
22 | """ tt-conv-layer (convolution of full input tensor with tt-filters (core by core))
23 | Args:
24 | inp: input tensor, float - [batch_size, H, W, C]
25 | window: convolution window size, list [wH, wW]
26 | inp_ch_modes: input channels modes, np.array (int32) of size d
27 | out_ch_modes: output channels modes, np.array (int32) of size d
28 | ranks: tt-filters ranks, np.array (int32) of size (d + 1)
29 | strides: strides, list of 2 ints - [sx, sy]
30 | padding: 'SAME' or 'VALID', string
31 | filters_initializer: filters init function
32 | filters_regularizer: filters regularizer function
33 | cores_initializer: cores init function, could be a list of functions for specifying different function for each core
34 | cores_regularizer: cores regularizer function, could be a list of functions for specifying different function for each core
35 | biases_initializer: biases init function (if None then no biases will be used)
36 | biases_regularizer: biases regularizer function
37 | trainable: trainable variables flag, bool
38 | cpu_variables: cpu variables flag, bool
39 | scope: layer variable scope name, string
40 | Returns:
41 |         out: output tensor, float - [batch_size, H', W', prod(out_ch_modes)]
42 | """
43 |
44 | with tf.variable_scope(scope):
45 | inp_shape = inp.get_shape().as_list()[1:]
46 | inp_h, inp_w, inp_ch = inp_shape[0:3]
47 | tmp = tf.reshape(inp, [-1, inp_h, inp_w, inp_ch])
48 | tmp = tf.transpose(tmp, [0, 3, 1, 2])
49 | tmp = tf.reshape(tmp, [-1, inp_h, inp_w, 1])
50 |
51 | filters_shape = [window[0], window[1], 1, ranks[0]]
52 | if (window[0] * window[1] * 1 * ranks[0] == 1):
53 | filters = get_var_wrap('filters',
54 | shape=filters_shape,
55 | initializer=tf.ones_initializer,
56 | regularizer=None,
57 | trainable=False,
58 | cpu_variable=cpu_variables)
59 | else:
60 | filters = get_var_wrap('filters',
61 | shape=filters_shape,
62 | initializer=filters_initializer,
63 | regularizer=filters_regularizer,
64 | trainable=trainable,
65 | cpu_variable=cpu_variables)
66 |
67 | tmp = tf.nn.conv2d(tmp, filters, [1] + strides + [1], padding)
68 |
69 | #tmp shape = [batch_size * inp_ch, h, w, r]
70 | h, w = tmp.get_shape().as_list()[1:3]
71 | tmp = tf.reshape(tmp, [-1, inp_ch, h, w, ranks[0]])
72 | tmp = tf.transpose(tmp, [4, 1, 0, 2, 3])
73 | #tmp shape = [r, c, b, h, w]
74 |
75 | d = inp_ch_modes.size
76 |
77 | cores = []
78 | for i in range(d):
79 |
80 | if type(cores_initializer) == list:
81 | cinit = cores_initializer[i]
82 | else:
83 | cinit = cores_initializer
84 |
85 | if type(cores_regularizer) == list:
86 | creg = cores_regularizer[i]
87 | else:
88 | creg = cores_regularizer
89 |
90 | cores.append(get_var_wrap('core_%d' % (i + 1),
91 | shape=[out_ch_modes[i] * ranks[i + 1], ranks[i] * inp_ch_modes[i]],
92 | initializer=cinit,
93 | regularizer=creg,
94 | trainable=trainable,
95 | cpu_variable=cpu_variables))
96 |
97 | for i in range(d):
98 | tmp = tf.reshape(tmp, [ranks[i] * inp_ch_modes[i], -1])
99 | tmp = tf.matmul(cores[i], tmp)
100 | tmp = tf.reshape(tmp, [out_ch_modes[i], -1])
101 | tmp = tf.transpose(tmp, [1, 0])
102 | out_ch = np.prod(out_ch_modes)
103 |
104 | if biases_initializer is not None:
105 | biases = get_var_wrap('biases',
106 | shape=[out_ch],
107 | initializer=biases_initializer,
108 | regularizer=biases_regularizer,
109 | trainable=trainable,
110 | cpu_variable=cpu_variables)
111 |
112 | out = tf.reshape(tmp, [-1, h, w, out_ch])
113 | out = tf.add(out, biases, name='out')
114 | else:
115 | out = tf.reshape(tmp, [-1, h, w, out_ch], name='out')
116 |
117 | return out
118 |
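For a rough sense of the savings, here is a small helper (not part of the library) that counts the parameters `tt_conv` creates, mirroring the variable shapes used above, and compares them with a dense convolution kernel; the example modes and ranks are placeholders.

```python
# Parameter counting sketch for tt_conv vs. a dense convolution.
import numpy as np

def tt_conv_params(window, inp_ch_modes, out_ch_modes, ranks, use_biases=True):
    # spatial filters variable: [wH, wW, 1, ranks[0]]
    count = window[0] * window[1] * ranks[0]
    # core i variable: [out_ch_modes[i] * ranks[i + 1], ranks[i] * inp_ch_modes[i]]
    for i in range(len(inp_ch_modes)):
        count += out_ch_modes[i] * ranks[i + 1] * ranks[i] * inp_ch_modes[i]
    if use_biases:
        count += int(np.prod(out_ch_modes))
    return count

def dense_conv_params(window, in_ch, out_ch, use_biases=True):
    count = window[0] * window[1] * in_ch * out_ch
    return count + (out_ch if use_biases else 0)

# Placeholder example: 3x3 convolution, 64 -> 128 channels.
print(dense_conv_params([3, 3], 64, 128))                             # 73856
print(tt_conv_params([3, 3], [4, 4, 4], [4, 4, 8], [16, 16, 16, 1]))  # 8976
```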
--------------------------------------------------------------------------------
/tensornet/layers/tt_conv1d_full.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | from .aux import get_var_wrap
5 | from .tt_conv_full import tt_conv_full
6 |
7 | def tt_conv1d_full(inp,
8 | window,
9 | inp_ch_modes,
10 | out_ch_modes,
11 | ranks,
12 | strides=[1, 1],
13 | padding='SAME',
14 | filters_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
15 | filters_regularizer=None,
16 | cores_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
17 | cores_regularizer=None,
18 | biases_initializer=tf.zeros_initializer,
19 | biases_regularizer=None,
20 | trainable=True,
21 | cpu_variables=False,
22 | scope=None):
23 | """
24 |     conv1d wrapper around the 2D TT-convolution. Internally TensorFlow implements its vanilla
25 |     conv1d via conv2d; the same trick is applied here: the input is expanded along a height dimension of 1 and the output is squeezed back.
26 |
27 | Note: window should be [1, w] where you insert your width
28 | strides should be [1, stride_width]
29 |
30 |
31 | tt-conv-layer (convolution of full input tensor with tt-filters (make tt full then use conv2d))
32 | Args:
33 | inp: input tensor, float - [batch_size, W, C]
34 | window: convolution window size, list [wH, wW]
35 | inp_ch_modes: input channels modes, np.array (int32) of size d
36 | out_ch_modes: output channels modes, np.array (int32) of size d
37 | ranks: tt-filters ranks, np.array (int32) of size (d + 1)
38 | strides: strides, list of 2 ints - [sx, sy]
39 | padding: 'SAME' or 'VALID', string
40 | filters_initializer: filters init function
41 | filters_regularizer: filters regularizer function
42 | cores_initializer: cores init function, could be a list of functions for specifying different function for each core
43 | cores_regularizer: cores regularizer function, could be a list of functions for specifying different function for each core
44 | biases_initializer: biases init function (if None then no biases will be used)
45 | biases_regularizer: biases regularizer function
46 | trainable: trainable variables flag, bool
47 | cpu_variables: cpu variables flag, bool
48 | scope: layer variable scope name, string
49 | Returns:
50 |         out: output tensor, float - [batch_size, W', prod(out_ch_modes)]
51 | """
52 |     inp_expanded = tf.expand_dims(inp, axis=1)  # expand on height dim
53 |
54 |     conv2d_output = tt_conv_full(inp_expanded,
55 | window,
56 | inp_ch_modes,
57 | out_ch_modes,
58 | ranks,
59 | strides=strides,
60 | padding=padding,
61 | filters_initializer=filters_initializer,
62 | filters_regularizer=filters_regularizer,
63 | cores_initializer=cores_initializer,
64 | cores_regularizer=cores_regularizer,
65 | biases_initializer=biases_initializer,
66 | biases_regularizer=biases_regularizer,
67 | trainable=trainable,
68 | cpu_variables=cpu_variables,
69 | scope=scope)
70 |
71 |     return tf.squeeze(conv2d_output, axis=[1])  # remove the height dimension
72 |
--------------------------------------------------------------------------------
/tensornet/layers/tt_conv_direct.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | from .aux import get_var_wrap
5 |
6 | def tt_conv_direct(inp,
7 | window,
8 | out_ch,
9 | ranks,
10 | strides=[1, 1],
11 | padding='SAME',
12 | cores_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
13 | cores_regularizer=None,
14 | biases_initializer=tf.zeros_initializer,
15 | biases_regularizer=None,
16 | trainable=True,
17 | cpu_variables=False,
18 | scope=None):
19 | """ tt-conv-layer (convolution of full input tensor with straightforward decomposed tt-filters (make tt full then use conv2d))
20 | Args:
21 | inp: input tensor, float - [batch_size, H, W, C]
22 | window: convolution window size, list [wH, wW]
23 |         out_ch: output channels count, int
24 |         ranks: tt-filters ranks, np.array (int32) of size 5
25 |                (one rank per boundary of the modes [wH, wW, C, out_ch])
26 |         strides: strides, list of 2 ints - [sx, sy]
27 |         padding: 'SAME' or 'VALID', string
28 |         cores_initializer: cores init function, could be a list of functions
29 |                            for specifying different function for each core
30 |         cores_regularizer: cores regularizer function, could be a list of functions
31 |                            for specifying different function for each core
32 |         biases_initializer: biases init function (if None then no biases will be used)
33 |         biases_regularizer: biases regularizer function
34 | trainable: trainable variables flag, bool
35 | cpu_variables: cpu variables flag, bool
36 | scope: layer variable scope name, string
37 | Returns:
38 |         out: output tensor, float - [batch_size, H', W', out_ch]
39 | """
40 |
41 | with tf.variable_scope(scope):
42 | inp_shape = inp.get_shape().as_list()[1:]
43 | inp_h, inp_w, inp_ch = inp_shape[0:3]
44 | tmp = tf.reshape(inp, [-1, inp_h, inp_w, inp_ch])
45 |
46 | modes = np.array([window[0], window[1], inp_ch, out_ch])
47 |
48 | cores = []
49 | for i in range(4):
50 |
51 | sz = modes[i] * ranks[i] * ranks[i + 1]
52 | if (sz == 1):
53 | cinit = tf.ones_initializer
54 | elif type(cores_initializer) == list:
55 | cinit = cores_initializer[i]
56 | else:
57 | cinit = cores_initializer
58 |
59 | if type(cores_regularizer) == list:
60 | creg = cores_regularizer[i]
61 | else:
62 | creg = cores_regularizer
63 |
64 | cores.append(get_var_wrap('core_%d' % (i + 1),
65 | shape=[ranks[i], modes[i] * ranks[i + 1]],
66 | initializer=cinit,
67 | regularizer=creg,
68 | trainable=trainable and (sz > 1),
69 | cpu_variable=cpu_variables))
70 |
71 | full = cores[0]
72 |
73 | for i in range(1, 4):
74 | full = tf.reshape(full, [-1, ranks[i]])
75 | full = tf.matmul(full, cores[i])
76 |
77 | full = tf.reshape(full, [window[0], window[1], inp_ch, out_ch])
78 |
79 |
80 | tmp = tf.nn.conv2d(tmp,
81 | full,
82 | [1] + strides + [1],
83 | padding,
84 | name='conv2d')
85 |
86 | if biases_initializer is not None:
87 | biases = get_var_wrap('biases',
88 | shape=[out_ch],
89 | initializer=biases_initializer,
90 | regularizer=biases_regularizer,
91 | trainable=trainable,
92 | cpu_variable=cpu_variables)
93 |
94 | out = tf.add(tmp, biases, name='out')
95 | else:
96 | out = tf.identity(tmp, name='out')
97 |
98 | return out
99 |
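For reference, a minimal usage sketch of the `tt_conv_direct` layer above. The shapes, rank values and scope name are hypothetical (not taken from the experiments); it assumes TensorFlow 1.x and that the repository root is on the Python path.

```python
import numpy as np
import tensorflow as tf
from tensornet.layers.tt_conv_direct import tt_conv_direct

# Input feature map in NHWC layout; the 3x3x48x64 filter is stored as four
# TT-cores with boundary ranks [1, 3, 3, 3, 1] (first and last rank must be 1).
images = tf.placeholder(tf.float32, [None, 32, 32, 48])
out = tt_conv_direct(images,
                     window=[3, 3],
                     out_ch=64,
                     ranks=np.array([1, 3, 3, 3, 1], dtype=np.int32),
                     scope='tt_conv_direct_1')
# With the default 'SAME' padding and unit strides, `out` is [batch, 32, 32, 64].
```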
--------------------------------------------------------------------------------
/tensornet/layers/tt_conv_full.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | import math
4 | from .aux import get_var_wrap
5 |
6 | def tt_conv_full(inp,
7 | window,
8 | inp_ch_modes,
9 | out_ch_modes,
10 | ranks,
11 | strides=[1, 1],
12 | padding='SAME',
13 | filters_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
14 | filters_regularizer=None,
15 | cores_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
16 | cores_regularizer=None,
17 | biases_initializer=tf.zeros_initializer,
18 | biases_regularizer=None,
19 | trainable=True,
20 | cpu_variables=False,
21 | scope=None):
22 | """ tt-conv-layer (convolution of full input tensor with tt-filters (make tt full then use conv2d))
23 | Args:
24 | inp: input tensor, float - [batch_size, H, W, C]
25 | window: convolution window size, list [wH, wW]
26 | inp_ch_modes: input channels modes, np.array (int32) of size d
27 | out_ch_modes: output channels modes, np.array (int32) of size d
28 | ranks: tt-filters ranks, np.array (int32) of size (d + 1)
29 | strides: strides, list of 2 ints - [sx, sy]
30 | padding: 'SAME' or 'VALID', string
31 | filters_initializer: filters init function
32 | filters_regularizer: filters regularizer function
33 | cores_initializer: cores init function, could be a list of functions specifying a different initializer for each core
34 | cores_regularizer: cores regularizer function, could be a list of functions specifying a different regularizer for each core
35 | biases_initializer: biases init function (if None then no biases will be used)
36 | biases_regularizer: biases regularizer function
37 | trainable: trainable variables flag, bool
38 | cpu_variables: cpu variables flag, bool
39 | scope: layer variable scope name, string
40 | Returns:
41 | out: output tensor, float - [batch_size, H', W', prod(out_ch_modes)]
42 | """
43 |
44 | with tf.variable_scope(scope):
45 | inp_shape = inp.get_shape().as_list()[1:]
46 | inp_h, inp_w, inp_ch = inp_shape[0:3]
47 | tmp = tf.reshape(inp, [-1, inp_h, inp_w, inp_ch])
48 |
49 | filters_shape = [window[0], window[1], 1, ranks[0]]
50 | if (window[0] * window[1] * 1 * ranks[0] == 1):
51 | filters = get_var_wrap('filters',
52 | shape=filters_shape,
53 | initializer=tf.ones_initializer,
54 | regularizer=None,
55 | trainable=False,
56 | cpu_variable=cpu_variables)
57 | else:
58 | filters = get_var_wrap('filters',
59 | shape=filters_shape,
60 | initializer=filters_initializer,
61 | regularizer=filters_regularizer,
62 | trainable=trainable,
63 | cpu_variable=cpu_variables)
64 | d = inp_ch_modes.size
65 |
66 | cores = []
67 | for i in range(d):
68 |
69 | if type(cores_initializer) == list:
70 | cinit = cores_initializer[i]
71 | else:
72 | cinit = cores_initializer
73 |
74 | if type(cores_regularizer) == list:
75 | creg = cores_regularizer[i]
76 | else:
77 | creg = cores_regularizer
78 |
79 | cores.append(get_var_wrap('core_%d' % (i + 1),
80 | shape=[out_ch_modes[i] * ranks[i + 1], ranks[i] * inp_ch_modes[i]],
81 | initializer=cinit,
82 | regularizer=creg,
83 | trainable=trainable,
84 | cpu_variable=cpu_variables))
85 |
86 | full = filters
87 |
88 | for i in range(d):
89 | full = tf.reshape(full, [-1, ranks[i]])
90 | core = tf.transpose(cores[i], [1, 0])
91 | core = tf.reshape(core, [ranks[i], -1])
92 | full = tf.matmul(full, core)
93 |
94 | out_ch = np.prod(out_ch_modes)
95 |
96 | fshape = [window[0], window[1]]
97 | order = [0, 1]
98 | inord = []
99 | outord = []
100 | for i in range(d):
101 | fshape.append(inp_ch_modes[i])
102 | inord.append(2 + 2 * i)
103 | fshape.append(out_ch_modes[i])
104 | outord.append(2 + 2 * i + 1)
105 | order += inord + outord
106 | full = tf.reshape(full, fshape)
107 | full = tf.transpose(full, order)
108 | full = tf.reshape(full, [window[0], window[1], inp_ch, out_ch])
109 |
110 |
111 | tmp = tf.nn.conv2d(tmp,
112 | full,
113 | [1] + strides + [1],
114 | padding,
115 | name='conv2d')
116 |
117 | if biases_initializer is not None:
118 | biases = get_var_wrap('biases',
119 | shape=[out_ch],
120 | initializer=biases_initializer,
121 | regularizer=biases_regularizer,
122 | trainable=trainable,
123 | cpu_variable=cpu_variables)
124 |
125 | out = tf.add(tmp, biases, name='out')
126 | else:
127 | out = tf.identity(tmp, name='out')
128 |
129 | return out
130 |
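Likewise, a minimal usage sketch of `tt_conv_full` (the `tensornet.layers.tt_conv_full` call path is the one exercised in the tests below). The channel factorizations and ranks here are hypothetical: 48 input and 64 output channels are split into d = 3 modes each, so `ranks` has d + 1 = 4 entries and its last entry must be 1.

```python
import numpy as np
import tensorflow as tf
import tensornet

images = tf.placeholder(tf.float32, [None, 32, 32, 48])  # [batch, H, W, C]
out = tensornet.layers.tt_conv_full(images,
                                    [3, 3],                               # window [wH, wW]
                                    np.array([4, 4, 3], dtype=np.int32),  # 4 * 4 * 3 = 48 input channels
                                    np.array([4, 4, 4], dtype=np.int32),  # 4 * 4 * 4 = 64 output channels
                                    np.array([4, 4, 4, 1], dtype=np.int32),
                                    scope='tt_conv_1')
```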
--------------------------------------------------------------------------------
/tensornet/tt/__init__.py:
--------------------------------------------------------------------------------
1 | from .svd import *
2 | from .max_ranks import *
3 | from .matrix_svd import *
4 |
--------------------------------------------------------------------------------
/tensornet/tt/matrix_svd.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from .svd import svd
3 |
4 | def matrix_svd(X, left_modes, right_modes, ranks):
5 | """ TT-SVD for matrix
6 | Args:
7 | X: input matrix, numpy array float32
8 | left_modes: tt-left-modes, numpy array int32
9 | right_modes: tt-right-modes, numpy array int32
10 | ranks: tt-ranks, numpy array int32
11 | Returns:
12 | core: tt-cores array, numpy 1D array float32
13 | """
14 | c = X.copy()
15 | d = left_modes.size
16 | c = np.reshape(c, np.concatenate((left_modes, right_modes)))
17 | order = np.repeat(np.arange(0, d), 2) + np.tile([0, d], d)
18 | c = np.transpose(c, axes=order)
19 | c = np.reshape(c, left_modes * right_modes)
20 | return svd(c, left_modes * right_modes, ranks)
21 |
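A short sketch of how `matrix_svd` might be used, with hypothetical mode factorizations: a dense weight matrix is decomposed into a flat array of TT-cores, packed one after another with lengths ranks[k] * modes[k] * ranks[k + 1]. With maximal ranks the decomposition is exact; compression comes from choosing smaller ranks.

```python
import numpy as np
import tensornet

left_modes = np.array([4, 4, 4], dtype=np.int32)   # 64 rows, factorized as 4 * 4 * 4
right_modes = np.array([4, 4, 2], dtype=np.int32)  # 32 columns, factorized as 4 * 4 * 2
W = np.random.randn(64, 32).astype(np.float32)

modes = left_modes * right_modes                    # joint modes of the TT-matrix
ranks = tensornet.tt.max_ranks(modes)               # maximal ranks -> exact decomposition
cores = tensornet.tt.matrix_svd(W, left_modes, right_modes, ranks)

# The flat core array length matches the packing formula used throughout the package.
assert cores.size == np.sum(ranks[:-1] * modes * ranks[1:])
```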
--------------------------------------------------------------------------------
/tensornet/tt/max_ranks.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | def max_ranks(modes):
4 | """ Computation of maximal ranks for TT-SVD
5 | Args:
6 | modes: tt-modes, numpy array int32
7 | Returns:
8 | ranks: maximal tt-ranks, numpy array int32
9 | """
10 | d = modes.size
11 | ranks = np.zeros(d + 1, dtype='int32')
12 | ranks[0] = 1
13 | prod = np.prod(modes)
14 | for i in range(d):
15 | m = ranks[i] * modes[i]
16 | ranks[i + 1] = min(m, prod // m)
17 | prod = prod // m * ranks[i + 1]
18 | ranks[d] = 1
19 | return ranks
20 |
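A quick check of what `max_ranks` returns (assuming the repository root is on the Python path): the k-th rank is bounded by the sizes of the two unfoldings it separates.

```python
import numpy as np
import tensornet

modes = np.array([2, 3, 4], dtype=np.int32)
print(tensornet.tt.max_ranks(modes))
# -> [1 2 4 1]: r_1 = min(2, 3 * 4) = 2, r_2 = min(2 * 3, 4) = 4,
#    and the boundary ranks are fixed to 1.
```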
--------------------------------------------------------------------------------
/tensornet/tt/svd.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | def svd(X, modes, ranks):
4 | """ TT-SVD
5 | Args:
6 | X: input array, numpy array float32
7 | modes: tt-modes, numpy array int32
8 | ranks: tt-ranks, numpy array int32
9 | Returns:
10 | core: tt-cores array, numpy 1D array float32
11 | """
12 | c = X.copy()
13 | d = modes.size
14 | core = np.zeros(np.sum(ranks[:-1] * modes * ranks[1:]), dtype='float32')
15 | pos = 0
16 | for i in range(0, d-1):
17 | m = ranks[i] * modes[i]
18 | c = np.reshape(c, [m, -1])
19 | u, s, v = np.linalg.svd(c, full_matrices=False)
20 | u = u[:, 0:ranks[i + 1]]
21 | s = s[0:ranks[i + 1]]
22 | v = v[0:ranks[i + 1], :]
23 | core[pos:pos + ranks[i] * modes[i] * ranks[i + 1]] = u.ravel()
24 | pos += ranks[i] * modes[i] * ranks[i + 1]
25 | c = np.dot(np.diag(s), v)
26 | core[pos:pos + ranks[d - 1] * modes[d - 1] * ranks[d]] = c.ravel()
27 | return core
28 |
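A round-trip sketch for `svd` with hypothetical modes: decompose a random tensor at maximal ranks, contract the packed cores back, and compare with the original. At maximal ranks the reconstruction error should sit at float32 precision; the contraction below mirrors the one used in the tests.

```python
import numpy as np
import tensornet

modes = np.array([3, 4, 5], dtype=np.int32)
ranks = tensornet.tt.max_ranks(modes)
X = np.random.randn(*modes).astype(np.float32)

cores = tensornet.tt.svd(X, modes, ranks)

# Offsets of the packed cores inside the flat array.
ps = np.cumsum(np.concatenate(([0], ranks[:-1] * modes * ranks[1:])))

# Contract the cores back into a full tensor (ranks[0] = ranks[-1] = 1).
full = np.reshape(cores[ps[0]:ps[1]], [modes[0], ranks[1]])
for i in range(1, modes.size):
    core = np.reshape(cores[ps[i]:ps[i + 1]], [ranks[i], modes[i] * ranks[i + 1]])
    full = np.reshape(np.dot(full, core), [-1, ranks[i + 1]])
full = np.reshape(full, modes)

print('max reconstruction error:', np.max(np.abs(full - X)))
```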
--------------------------------------------------------------------------------
/tests/python/test_matrix_svd.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import sys
3 | sys.path.append('../../')
4 | import tensornet
5 |
6 |
7 | def run_test(left_modes = np.array([4, 6, 8, 3], dtype=np.int32),
8 | right_modes = np.array([5, 2, 7, 4], dtype=np.int32),
9 | test_num=10,
10 | tol=1e-5):
11 | print('*' * 80)
12 | print('*' + ' ' * 28 + 'Testing matrix TT-SVD' + ' ' * 29 + '*')
13 | print('*' * 80)
14 | d = left_modes.size
15 | L = np.prod(left_modes)
16 | R = np.prod(right_modes)
17 | ranks = tensornet.tt.max_ranks(left_modes * right_modes)
18 | ps = np.cumsum(np.concatenate(([0], ranks[:-1] * left_modes * right_modes * ranks[1:])))
19 | for test in range(test_num):
20 | W = np.random.normal(0.0, 1.0, size=(L, R))
21 | T = tensornet.tt.matrix_svd(W, left_modes, right_modes, ranks)
22 | w = np.reshape(T[ps[0]:ps[1]], [left_modes[0] * right_modes[0], ranks[1]])
23 | for i in range(1, d):
24 | core = np.reshape(T[ps[i]:ps[i + 1]], [ranks[i], left_modes[i] * right_modes[i] * ranks[i + 1]])
25 | w = np.dot(w, core)
26 | w = np.reshape(w, [-1, ranks[i + 1]])
27 | w = np.reshape(w, w.shape[:-1])
28 | shape = np.hstack((left_modes.reshape([-1, 1]), right_modes.reshape([-1, 1]))).ravel()
29 | w = np.reshape(w, shape)
30 | order = np.concatenate((np.arange(0, 2 * d, 2), np.arange(1, 2 * d, 2)))
31 | w = np.reshape(np.transpose(w, axes=order), [L, R])
32 | result = np.max(np.abs(W - w))
33 | print('Test #{0:02d}. Error: {1:0.2g}'.format(test + 1, result))
34 | assert result <= tol, 'Error = {0:0.2g} is bigger than tol = {1:0.2g}'.format(result, tol)
35 |
36 |
37 | if __name__ == '__main__':
38 | run_test()
39 |
--------------------------------------------------------------------------------
/tests/python/test_tt.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import tensorflow as tf
3 | import sys
4 |
5 | sys.path.append('../../')
6 |
7 | import tensornet
8 |
9 | def run_test(batch_size=100, test_num=10,
10 | inp_modes=np.array([3, 8, 9, 5], dtype='int32'),
11 | out_modes=np.array([5, 6, 10, 6], dtype='int32'),
12 | mat_ranks=np.array([1, 3, 6, 4, 1], dtype='int32'),
13 | tol=1e-5):
14 | print('*' * 80)
15 | print('*' + ' ' * 31 + 'Testing tt layer' + ' ' * 31 + '*')
16 | print('*' * 80)
17 |
18 | graph = tf.Graph()
19 | with graph.as_default():
20 |
21 | d = inp_modes.size
22 |
23 | INP_SIZE = np.prod(inp_modes)
24 | OUT_SIZE = np.prod(out_modes)
25 |
26 |
27 |
28 | inp = tf.placeholder('float', shape=[None, INP_SIZE])
29 | out = tensornet.layers.tt(inp,
30 | inp_modes,
31 | out_modes,
32 | mat_ranks,
33 | biases_initializer=None,
34 | scope='tt')
35 |
36 |
37 | sess = tf.Session()
38 | init_op = tf.initialize_all_variables()
39 | sess.run(init_op)
40 |
41 | for test in range(test_num):
42 | mat_cores = []
43 | for i in range(d):
44 | mat_cores.append(graph.get_tensor_by_name('tt/mat_core_%d:0' % (i + 1)))
45 | mat_cores[-1] = sess.run(mat_cores[-1])
46 |
47 |
48 | w = np.reshape(mat_cores[0], [out_modes[0] * mat_ranks[1], mat_ranks[0] * inp_modes[0]])
49 | w = np.transpose(w, [1, 0])
50 | w = np.reshape(w, [-1, mat_ranks[1]])
51 | for i in range(1, d):
52 | core = np.reshape(mat_cores[i], [out_modes[i] * mat_ranks[i + 1], mat_ranks[i] * inp_modes[i]])
53 | core = np.transpose(core, [1, 0])
54 | core = np.reshape(core, [mat_ranks[i], -1])
55 | w = np.dot(w, core)
56 | w = np.reshape(w, [-1, mat_ranks[i + 1]])
57 | w = np.reshape(w, w.shape[:-1])
58 | shape = np.hstack((inp_modes.reshape([-1, 1]), out_modes.reshape([-1, 1]))).ravel()
59 | w = np.reshape(w, shape)
60 | order = np.concatenate((np.arange(0, 2 * d, 2), np.arange(1, 2 * d, 2)))
61 | w = np.reshape(np.transpose(w, axes=order), [INP_SIZE, OUT_SIZE])
62 |
63 |
64 |
65 | X = np.random.normal(0.0, 0.2, size=(batch_size, np.prod(inp_modes)))
66 | feed_dict = {inp: X}
67 | y = sess.run(out, feed_dict=feed_dict)
68 | Y = np.dot(X, w)
69 | result = np.max(np.abs(Y - y))
70 | print('Test #{0:02d}. Error: {1:0.2g}'.format(test + 1, result))
71 | assert result <= tol, 'Error = {0:0.2g} is bigger than tol = {1:0.2g}'.format(result, tol)
72 | sess.close()
73 |
74 | if __name__ == '__main__':
75 | run_test()
76 |
77 |
--------------------------------------------------------------------------------
/tests/python/test_tt_conv.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import time
3 | import tensorflow as tf
4 | import sys
5 |
6 | sys.path.append('../../')
7 |
8 | import tensorflow as tf
9 | import tensornet
10 |
11 | def run_test(batch_size=30, test_num=10, tol=1e-5):
12 | print('*' * 80)
13 | print('*' + ' ' * 31 + 'Testing tt conv' + ' ' * 32 + '*')
14 | print('*' * 80)
15 |
16 | in_h = 32
17 | in_w = 32
18 |
19 | padding = 'SAME'
20 |
21 |
22 | inp_ch_modes = np.array([4, 4, 4, 3], dtype=np.int32)
23 | in_c = np.prod(inp_ch_modes)
24 | out_ch_modes = np.array([5, 2, 5, 5], dtype=np.int32)
25 | out_c = np.prod(out_ch_modes)
26 | ranks = np.array([3, 2, 2, 3, 1], dtype=np.int32)
27 |
28 |
29 | inp = tf.placeholder(tf.float32, [None, in_h, in_w, in_c])
30 |
31 |
32 | wh = 5
33 | ww = 5
34 |
35 |
36 | w_ph = tf.placeholder(tf.float32, [wh, ww, in_c, out_c])
37 |
38 | s = [1, 1]
39 |
40 | corr = tf.nn.conv2d(inp, w_ph, [1] + s + [1], padding)
41 |
42 |
43 | out = tensornet.layers.tt_conv(inp,
44 | [wh, ww],
45 | inp_ch_modes,
46 | out_ch_modes,
47 | ranks,
48 | s,
49 | padding,
50 | biases_initializer=None,
51 | scope='tt_conv')
52 |
53 |
54 |
55 |
56 | sess = tf.Session()
57 | graph = tf.get_default_graph()
58 | init_op = tf.initialize_all_variables()
59 |
60 | d = inp_ch_modes.size
61 |
62 | filters_t = graph.get_tensor_by_name('tt_conv/filters:0')
63 |
64 | cores_t = []
65 | for i in range(d):
66 | cores_t.append(graph.get_tensor_by_name('tt_conv/core_%d:0' % (i + 1)))
67 |
68 | for test in range(test_num):
69 | sess.run(init_op)
70 |
71 |
72 |
73 |
74 | filters = sess.run([filters_t])
75 | cores = sess.run(cores_t)
76 |
77 | w = np.reshape(filters.copy(), [wh, ww, ranks[0]])
78 |
79 |
80 |
81 | #mat = np.reshape(inp_cores[inp_ps[0]:inp_ps[1]], [inp_ch_ranks[0], inp_ch_modes[0], inp_ch_ranks[1]])
82 |
83 | for i in range(0, d):
84 | core = cores[i].copy()
85 | #[out_ch_modes[i] * ranks[i + 1], ranks[i] * inp_ch_modes[i]]
86 | core = np.transpose(core, [1, 0])
87 | core = np.reshape(core, [ranks[i], inp_ch_modes[i] * out_ch_modes[i] * ranks[i + 1]])
88 |
89 | w = np.reshape(w, [-1, ranks[i]])
90 | w = np.dot(w, core)
91 |
92 | #w = np.dot(w, np.reshape(mat, [inp_ch_ranks[0], -1]))
93 |
94 | L = []
95 | for i in range(d):
96 | L.append(inp_ch_modes[i])
97 | L.append(out_ch_modes[i])
98 |
99 | w = np.reshape(w, [-1] + L)
100 | w = np.transpose(w, [0] + list(range(1, 2 * d + 1, 2)) + list(range(2, 2 * d + 1, 2)))
101 |
102 | w = np.reshape(w, [wh, ww, in_c, out_c])
103 |
104 | X = np.random.normal(0.0, 0.2, size=(batch_size, in_h, in_w, in_c))
105 |
106 | t1 = time.clock()
107 | correct = sess.run(corr, feed_dict={w_ph: w, inp: X})
108 | t2 = time.clock()
109 | y = sess.run(out, feed_dict={w_ph: w, inp: X})
110 | t3 = time.clock()
111 |
112 |
113 |
114 | err = np.max(np.abs(correct - y))
115 | print('Test #{0:02d}. Error: {1:0.2g}'.format(test + 1, err))
116 | print('TT-conv time: {0:.2f} sec. conv time: {1:.2f} sec.'.format(t3 - t2, t2 - t1))
117 | assert err <= tol, 'Error = {0:0.2g} is bigger than tol = {1:0.2g}'.format(err, tol)
118 |
119 |
120 | if __name__ == '__main__':
121 | run_test()
122 |
123 |
--------------------------------------------------------------------------------
/tests/python/test_tt_conv_full.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import time
3 | import tensorflow as tf
4 | import sys
5 |
6 | sys.path.append('../../')
7 |
8 | import tensorflow as tf
9 | import tensornet
10 |
11 | def run_test(batch_size=30, test_num=10, tol=1e-5):
12 | print('*' * 80)
13 | print('*' + ' ' * 29 + 'Testing tt conv full' + ' ' * 29 + '*')
14 | print('*' * 80)
15 |
16 | in_h = 32
17 | in_w = 32
18 |
19 | padding = 'SAME'
20 |
21 |
22 | inp_ch_modes = np.array([4, 4, 4, 3], dtype=np.int32)
23 | in_c = np.prod(inp_ch_modes)
24 | out_ch_modes = np.array([5, 2, 5, 5], dtype=np.int32)
25 | out_c = np.prod(out_ch_modes)
26 | ranks = np.array([3, 2, 2, 3, 1], dtype=np.int32)
27 |
28 |
29 | inp = tf.placeholder(tf.float32, [None, in_h, in_w, in_c])
30 |
31 |
32 | wh = 5
33 | ww = 5
34 |
35 |
36 | w_ph = tf.placeholder(tf.float32, [wh, ww, in_c, out_c])
37 |
38 | s = [1, 1]
39 |
40 | corr = tf.nn.conv2d(inp, w_ph, [1] + s + [1], padding)
41 |
42 |
43 | out = tensornet.layers.tt_conv_full(inp,
44 | [wh, ww],
45 | inp_ch_modes,
46 | out_ch_modes,
47 | ranks,
48 | s,
49 | padding,
50 | biases_initializer=None,
51 | scope='tt_conv')
52 |
53 | sess = tf.Session()
54 | graph = tf.get_default_graph()
55 | init_op = tf.initialize_all_variables()
56 |
57 | d = inp_ch_modes.size
58 |
59 | filters_t = graph.get_tensor_by_name('tt_conv/filters:0')
60 |
61 | cores_t = []
62 | for i in range(d):
63 | cores_t.append(graph.get_tensor_by_name('tt_conv/core_%d:0' % (i + 1)))
64 |
65 | for test in range(test_num):
66 | sess.run(init_op)
67 |
68 |
69 | filters = sess.run([filters_t])
70 | cores = sess.run(cores_t)
71 |
72 | w = np.reshape(filters.copy(), [wh, ww, ranks[0]])
73 |
74 |
75 |
76 | #mat = np.reshape(inp_cores[inp_ps[0]:inp_ps[1]], [inp_ch_ranks[0], inp_ch_modes[0], inp_ch_ranks[1]])
77 |
78 | for i in range(0, d):
79 | core = cores[i].copy()
80 | #[out_ch_modes[i] * ranks[i + 1], ranks[i] * inp_ch_modes[i]]
81 | core = np.transpose(core, [1, 0])
82 | core = np.reshape(core, [ranks[i], inp_ch_modes[i] * out_ch_modes[i] * ranks[i + 1]])
83 |
84 | w = np.reshape(w, [-1, ranks[i]])
85 | w = np.dot(w, core)
86 |
87 | #w = np.dot(w, np.reshape(mat, [inp_ch_ranks[0], -1]))
88 |
89 | L = []
90 | for i in range(d):
91 | L.append(inp_ch_modes[i])
92 | L.append(out_ch_modes[i])
93 |
94 | w = np.reshape(w, [-1] + L)
95 | w = np.transpose(w, [0] + list(range(1, 2 * d + 1, 2)) + list(range(2, 2 * d + 1, 2)))
96 |
97 | w = np.reshape(w, [wh, ww, in_c, out_c])
98 |
99 | X = np.random.normal(0.0, 0.2, size=(batch_size, in_h, in_w, in_c))
100 |
101 | t1 = time.clock()
102 | correct = sess.run(corr, feed_dict={w_ph: w, inp: X})
103 | t2 = time.clock()
104 | y = sess.run(out, feed_dict={w_ph: w, inp: X})
105 | t3 = time.clock()
106 |
107 |
108 |
109 | err = np.max(np.abs(correct - y))
110 | print('Test #{0:02d}. Error: {1:0.2g}'.format(test + 1, err))
111 | print('TT-conv time: {0:.2f} sec. conv time: {1:.2f} sec.'.format(t3 - t2, t2 - t1))
112 | assert err <= tol, 'Error = {0:0.2g} is bigger than tol = {1:0.2g}'.format(err, tol)
113 |
114 |
115 | if __name__ == '__main__':
116 | run_test()
117 |
118 |
--------------------------------------------------------------------------------
/ultimate_tensorization_poster.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/timgaripov/TensorNet-TF/76299ad4726370bb5e75589017208d7eae7d8666/ultimate_tensorization_poster.pdf
--------------------------------------------------------------------------------