├── README.md ├── TT.png ├── experiments └── cifar-10 │ ├── FC-Tensorizing-Neural-Networks │ ├── 2-layer-tt │ │ ├── input_data.py │ │ ├── net.py │ │ ├── results │ │ │ ├── res_01-04-2016_22_24 │ │ │ ├── res_02-04-2016_02_33 │ │ │ ├── res_02-04-2016_07_20 │ │ │ ├── res_02-04-2016_12_53 │ │ │ ├── res_02-04-2016_19_30 │ │ │ ├── res_03-04-2016_14_56 │ │ │ ├── res_03-04-2016_23_30 │ │ │ ├── res_04-04-2016_09_43 │ │ │ ├── res_08-04-2016_00_08 │ │ │ ├── res_08-04-2016_00_14 │ │ │ ├── res_08-04-2016_01_46 │ │ │ ├── res_08-04-2016_05_03 │ │ │ ├── res_08-04-2016_14_59 │ │ │ ├── res_10-04-2016_20_38 │ │ │ ├── res_10-04-2016_21_16 │ │ │ ├── res_12-04-2016_10_54 │ │ │ ├── res_12-04-2016_11_01 │ │ │ └── res_30-03-2016_17_12 │ │ └── train_cifar.py │ ├── FC-net │ │ ├── input_data.py │ │ ├── net.py │ │ ├── results │ │ │ └── res_30-03-2016_16_58 │ │ └── train_cifar.py │ └── README.md │ ├── conv-Ultimate-Tensorization │ ├── README.md │ ├── eval.py │ ├── input_data.py │ ├── nets │ │ ├── TT-conv-TT-fc.py │ │ ├── TT-conv-fc.py │ │ ├── TT-conv.py │ │ ├── conv-fc.py │ │ └── conv.py │ ├── train.py │ └── train_with_pretrained_convs.py │ └── data │ └── prepare_data.py ├── paper.md ├── tensornet ├── __init__.py ├── layers │ ├── __init__.py │ ├── aux.py │ ├── batch_normalization.py │ ├── conv.py │ ├── linear.py │ ├── tt.py │ ├── tt_conv.py │ ├── tt_conv1d_full.py │ ├── tt_conv_direct.py │ └── tt_conv_full.py └── tt │ ├── __init__.py │ ├── matrix_svd.py │ ├── max_ranks.py │ └── svd.py ├── tests └── python │ ├── test_matrix_svd.py │ ├── test_tt.py │ ├── test_tt_conv.py │ └── test_tt_conv_full.py └── ultimate_tensorization_poster.pdf /README.md: -------------------------------------------------------------------------------- 1 | # TensorNet 2 | 3 | This is a TensorFlow implementation of the Tensor Train compression method for neural networks. It supports the _TT-FC_ layer [1] and the _TT-conv_ layer [2], which act as fully-connected and convolutional layers respectively, but are much more compact. The TT-FC layer is also faster than its uncompressed analog and makes it possible to use hundreds of thousands of hidden units. The ```experiments``` folder contains the code to reproduce the experiments from the papers. 4 | 5 | 6 | [1] _Tensorizing Neural Networks_ 7 | Alexander Novikov, Dmitry Podoprikhin, Anton Osokin, Dmitry Vetrov; In _Advances in Neural Information Processing Systems 28_ (NIPS-2015) [[arXiv](http://arxiv.org/abs/1509.06569)]. 8 | 9 | [2] _Ultimate tensorization: compressing convolutional and FC layers alike_ 10 | Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, Dmitry Vetrov; _Learning with Tensors: Why Now and How?_, NIPS-2016 workshop [[arXiv](https://arxiv.org/abs/1611.03214)]. 11 | 12 | 13 | Please cite our work if you write a scientific paper using this code. 14 | In BibTeX format: 15 | ```latex 16 | @incollection{novikov15tensornet, 17 | author = {Novikov, Alexander and Podoprikhin, Dmitry and Osokin, Anton and Vetrov, Dmitry}, 18 | title = {Tensorizing Neural Networks}, 19 | booktitle = {Advances in Neural Information Processing Systems 28 (NIPS)}, 20 | year = {2015}, 21 | } 22 | @article{garipov16ttconv, 23 | author = {Garipov, Timur and Podoprikhin, Dmitry and Novikov, Alexander and Vetrov, Dmitry}, 24 | title = {Ultimate tensorization: compressing convolutional and {FC} layers alike}, 25 | journal = {arXiv preprint arXiv:1611.03214}, 26 | year = {2016} 27 | } 28 | ``` 29 | 30 | # Prerequisites 31 | * [TensorFlow](https://www.tensorflow.org/) (tested with v.
1.1.0) 32 | * [NumPy](http://www.numpy.org/) 33 | 34 | # MATLAB and Theano 35 | We also published a [MATLAB and Theano+Lasagne implementation](https://github.com/Bihaqo/TensorNet) in a separate repository. 36 | 37 | # FAQ 38 | ### What is a _tensor_ anyway? 39 | It's just a synonym for a multidimensional array. For example, a matrix is a 2-dimensional tensor. 40 | 41 | ### But in the fully-connected case you work with matrices, why do you need tensor decompositions? 42 | Good point. Actually, the Tensor Train format coincides with the matrix low-rank format when applied to matrices. For this reason, there is a special _matrix Tensor Train format_, which does two things: it reshapes the matrix into a tensor (say, 10-dimensional) and permutes its dimensions in a special way, and then applies the tensor decomposition to the resulting tensor (see the NumPy sketch at the end of this README for a concrete example). This approach proved to be more efficient than the matrix low-rank format for the weight matrix of a fully-connected layer. 43 | 44 | ### Where can I read more about this _Tensor Train_ format? 45 | Look at the original paper: Ivan Oseledets, Tensor-Train decomposition, 2011 [[pdf](https://dl.dropboxusercontent.com/content_link/5aBmG8Em2oDCji5AJsviXqKVZWSiVYt4lKkMs2icjskQM79YRCnOoTf2wDP1N3Dh/file?dl=1)]. You can also check out my (Alexander Novikov's) [slides](http://www.slideshare.net/AlexanderNovikov8/tensor-train-decomposition-in-machine-learning), slides 3 to 14. 46 | 47 | By the way, **train** here means an actual train, with wheels. The name comes from pictures like the one below, which illustrate the Tensor Train format and naturally look like a train (at least that is the claim). 48 | 49 | ![Tensor Train format](TT.png) 50 | 51 | 56 | 57 | ### Are the TensorFlow, MATLAB, and Theano implementations compatible? 58 | Unfortunately not (at least not yet). 59 | 60 | 61 | ### I want to implement this in Caffe (or another library without autodiff). Any tips on doing the backward pass? 62 | Great! Write us when you're done or if you have questions along the way. 63 | The MATLAB version of the code has a [backward pass implementation](https://github.com/Bihaqo/TensorNet/blob/master/src/matlab/vl_nntt_backward.m) for the TT-FC layer. But note that the forward pass is implemented differently in the MATLAB and TensorFlow versions. 64 | 65 | ### Have you tried other tensor decompositions, like the CP-decomposition? 66 | We haven't, but this paper uses the CP-decomposition to compress the kernels of convolutional layers: Lebedev V., Ganin Y. et al., Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition [[arXiv](https://arxiv.org/abs/1412.6553)] [[code](https://github.com/vadim-v-lebedev/cp-decomposition)]. They got nice compression results, but were not able to train CP-conv layers from scratch; they could only train a network with regular convolutional layers, represent them in the CP format, and then fine-tune the rest of the network. Even _fine-tuning_ a CP-conv layer often diverges.
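### What does the matrix TT format look like in code?
Below is a minimal, self-contained NumPy sketch of the idea (an illustration only, not the library's implementation; the 64x64 matrix size, the 4x4x4 mode factorizations, and the TT-ranks (1, 3, 3, 1) are made-up example values). It builds random TT-cores, contracts them back into the dense matrix they represent, and compares the parameter counts of the two representations:

```python
import numpy as np

# Illustrative sketch of the matrix Tensor Train (TT-matrix) format.
# A 64 x 64 matrix is represented through factorizations of its row and
# column dimensions (64 = 4 * 4 * 4) and a chain of small TT-cores.
inp_modes = [4, 4, 4]   # factorization of the number of rows
out_modes = [4, 4, 4]   # factorization of the number of columns
ranks = [1, 3, 3, 1]    # TT-ranks; the first and last are always 1

# One 4-dimensional core per pair of modes. Pairing a row mode with a
# column mode inside each core is the "special permutation of dimensions"
# mentioned in the FAQ above.
cores = [np.random.randn(ranks[k], inp_modes[k], out_modes[k], ranks[k + 1])
         for k in range(len(inp_modes))]


def tt_matrix_to_full(cores):
    """Contract the TT-cores back into the dense matrix they represent."""
    full = np.ones((1, 1, 1))  # shape: [rows so far, columns so far, rank]
    for core in cores:
        # Sum over the shared rank index, then merge the core's modes
        # into the row and column dimensions accumulated so far.
        full = np.einsum('abr,rmns->ambns', full, core)
        rows, m, cols, n, r_next = full.shape
        full = full.reshape(rows * m, cols * n, r_next)
    return full[:, :, 0]  # the last TT-rank equals 1


W = tt_matrix_to_full(cores)
print(W.shape)  # (64, 64)
print(sum(c.size for c in cores), 'TT parameters vs', W.size, 'dense entries')
```

The TT-FC layers in `experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/net.py` use the same construction at a larger scale, e.g. input modes `[4, 4, 4, 4, 4, 3]`, output modes `[8, 8, 8, 8, 8, 8]`, and TT-ranks `[1, 3, 3, 3, 3, 3, 1]` for the first layer; only the small cores are stored and trained, never the dense matrix.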
67 | -------------------------------------------------------------------------------- /TT.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/timgaripov/TensorNet-TF/76299ad4726370bb5e75589017208d7eae7d8666/TT.png -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/input_data.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | 4 | class DataSet(object): 5 | def __init__(self, images, labels): 6 | """Construct a DataSet. 7 | """ 8 | assert images.shape[0] == labels.shape[0], ('images.shape: %s labels.shape: %s' % 9 | (images.shape, labels.shape)) 10 | self._num_examples = images.shape[0] 11 | self._images = images 12 | self._labels = labels 13 | self._epochs_completed = 0 14 | self._index_in_epoch = 0 15 | 16 | @property 17 | def images(self): 18 | return self._images 19 | 20 | @property 21 | def labels(self): 22 | return self._labels 23 | 24 | @property 25 | def num_examples(self): 26 | return self._num_examples 27 | 28 | @property 29 | def epochs_completed(self): 30 | return self._epochs_completed 31 | 32 | def next_batch(self, batch_size): 33 | start = self._index_in_epoch 34 | self._index_in_epoch += batch_size 35 | if self._index_in_epoch > self._num_examples: 36 | # Finished epoch 37 | self._epochs_completed += 1 38 | # Shuffle the data 39 | perm = np.arange(self._num_examples) 40 | np.random.shuffle(perm) 41 | self._images = self._images[perm] 42 | self._labels = self._labels[perm] 43 | # Start next epoch 44 | start = 0 45 | self._index_in_epoch = batch_size 46 | assert batch_size <= self._num_examples 47 | end = self._index_in_epoch 48 | return self._images[start:end], self._labels[start:end] 49 | 50 | 51 | def read_data_sets(data_dir): 52 | f = np.load(data_dir + '/cifar.npz') 53 | train_images = f['train_images'].astype('float32') 54 | train_labels = f['train_labels'] 55 | 56 | validation_images = f['validation_images'].astype('float32') 57 | validation_labels = f['validation_labels'] 58 | 59 | mean = np.mean(train_images, axis=0)[np.newaxis, :] 60 | std = np.std(train_images, axis=0)[np.newaxis, :] 61 | 62 | train_images = (train_images - mean) / std; 63 | validation_images = (validation_images - mean) / std; 64 | 65 | #train_images = np.reshape(train_images, (-1, 32, 32, 3)) 66 | #validation_images = np.reshape(validation_images, (-1, 32, 32, 3)) 67 | #train_reshaped = np.empty((train_images.shape[0], 0), dtype=np.float32) 68 | #validation_reshaped = np.empty((validation_images.shape[0], 0), dtype=np.float32) 69 | 70 | #for i in range(4): 71 | # for j in range(4): 72 | # p = np.reshape(train_images[:, 8*i:8*(i+1), 8*j:8*(j+1), :], (-1, 192)) 73 | # train_reshaped = np.hstack((train_reshaped, p)) 74 | # p = np.reshape(validation_images[:, 8*i:8*(i+1), 8*j:8*(j+1), :], (-1, 192)) 75 | # validation_reshaped = np.hstack((validation_reshaped, p)) 76 | train = DataSet(train_images, train_labels) 77 | validation = DataSet(validation_images, validation_labels) 78 | return train, validation 79 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/net.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import math 3 | import numpy as np 4 | import sys 5 | 6 | 7 | 
sys.path.append('../../../../') 8 | import tensornet 9 | 10 | NUM_CLASSES = 10 11 | IMAGE_SIZE = 32 12 | IMAGE_DEPTH = 3 13 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 14 | 15 | opts = {} 16 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 17 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32') 18 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 19 | 20 | opts['inp_modes_2'] = opts['out_modes_1'] 21 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 22 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 23 | 24 | 25 | opts['use_dropout'] = True 26 | opts['learning_rate_init'] = 0.06 27 | opts['learning_rate_decay_steps'] = 2000 28 | opts['learning_rate_decay_weight'] = 0.64 29 | 30 | def placeholder_inputs(): 31 | """Generate placeholder variables to represent the input tensors. 32 | 33 | Returns: 34 | images_ph: Images placeholder. 35 | labels_ph: Labels placeholder. 36 | train_phase_ph: Train phase indicator placeholder. 37 | """ 38 | # Note that the shapes of the placeholders match the shapes of the full 39 | # image and label tensors, except the first dimension is now batch_size 40 | # rather than the full size of the train or test data sets. 41 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 42 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 43 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 44 | return images_ph, labels_ph, train_phase_ph 45 | 46 | def inference(images, train_phase): 47 | """Build the model up to where it may be used for inference. 48 | Args: 49 | images: Images placeholder. 50 | train_phase: Train phase placeholder 51 | Returns: 52 | logits: Output tensor with the computed logits. 
53 | """ 54 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 55 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 56 | 57 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 58 | 59 | 60 | layers = [] 61 | layers.append(images) 62 | 63 | 64 | layers.append(tensornet.layers.tt(layers[-1], 65 | opts['inp_modes_1'], 66 | opts['out_modes_1'], 67 | opts['ranks_1'], 68 | scope='tt_' + str(len(layers)), 69 | biases_initializer=None)) 70 | 71 | layers.append(tensornet.layers.batch_normalization(layers[-1], 72 | train_phase, 73 | scope='BN_' + str(len(layers)), 74 | ema_decay=0.8)) 75 | 76 | layers.append(tf.nn.relu(layers[-1], 77 | name='relu_' + str(len(layers)))) 78 | layers.append(tf.nn.dropout(layers[-1], 79 | dropout_rate(0.6), 80 | name='dropout_' + str(len(layers)))) 81 | 82 | 83 | ########################################## 84 | layers.append(tensornet.layers.tt(layers[-1], 85 | opts['inp_modes_2'], 86 | opts['out_modes_2'], 87 | opts['ranks_2'], 88 | scope='tt_' + str(len(layers)), 89 | biases_initializer=None)) 90 | 91 | layers.append(tensornet.layers.batch_normalization(layers[-1], 92 | train_phase, 93 | scope='BN_' + str(len(layers)), 94 | ema_decay=0.8)) 95 | 96 | layers.append(tf.nn.relu(layers[-1], 97 | name='relu_' + str(len(layers)))) 98 | 99 | layers.append(tf.nn.dropout(layers[-1], 100 | dropout_rate(0.6), 101 | name='dropout_' + str(len(layers)))) 102 | 103 | ########################################## 104 | 105 | layers.append(tensornet.layers.linear(layers[-1], 106 | NUM_CLASSES, 107 | scope='linear_' + str(len(layers)))) 108 | 109 | return layers[-1] 110 | 111 | def loss(logits, labels): 112 | """Calculates the loss from the logits and the labels. 113 | Args: 114 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 115 | labels: Labels tensor, int32 - [batch_size]. 116 | Returns: 117 | loss: Loss tensor of type float. 118 | """ 119 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 120 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 121 | # each with NUM_CLASSES values, all of which are 0.0 except there will 122 | # be a 1.0 in the entry corresponding to the label). 123 | batch_size = tf.size(labels) 124 | labels = tf.expand_dims(labels, 1) 125 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 126 | concated = tf.concat([indices, labels], 1) 127 | onehot_labels = tf.sparse_to_dense(concated, 128 | tf.shape(logits), 1.0, 0.0) 129 | 130 | 131 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, 132 | labels=onehot_labels, 133 | name='xentropy') 134 | loss = tf.reduce_mean(cross_entropy, name='loss') 135 | tf.summary.scalar('summary/loss', loss) 136 | return loss 137 | 138 | def training(loss): 139 | """Sets up the training Ops. 140 | Creates an optimizer and applies the gradients to all trainable variables. 141 | The Op returned by this function is what must be passed to the 142 | `sess.run()` call to cause the model to train. 143 | Args: 144 | loss: Loss tensor, from loss(). 145 | Returns: 146 | train_op: The Op for training. 147 | """ 148 | # Create a variable to track the global step. 
149 | global_step = tf.Variable(0, name='global_step', trainable=False) 150 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 151 | global_step, 152 | opts['learning_rate_decay_steps'], 153 | opts['learning_rate_decay_weight'], 154 | staircase=True, 155 | name='learning_rate') 156 | tf.summary.scalar('summary/learning_rate', learning_rate) 157 | # Create the gradient descent optimizer with the given learning rate. 158 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 159 | 160 | grads_and_vars = optimizer.compute_gradients(loss) 161 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 162 | return train_op 163 | 164 | def evaluation(logits, labels): 165 | """Evaluate the quality of the logits at predicting the label. 166 | Args: 167 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 168 | labels: Labels tensor, int32 - [batch_size], with values in the 169 | range [0, NUM_CLASSES). 170 | Returns: 171 | A scalar int32 tensor with the number of examples (out of batch_size) 172 | that were predicted correctly. 173 | """ 174 | # For a classifier model, we can use the in_top_k Op. 175 | # It returns a bool tensor with shape [batch_size] that is true for 176 | # the examples where the label's is was in the top k (here k=1) 177 | # of all logits for that example. 178 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 179 | # Return the number of true entries. 180 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 181 | return correct_count 182 | 183 | 184 | def build(new_opts={}): 185 | """ Build graph 186 | Args: 187 | new_opts: dict with additional opts, which will be added to opts dict/ 188 | """ 189 | opts.update(new_opts) 190 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 191 | logits = inference(images_ph, train_phase_ph) 192 | loss_out = loss(logits, labels_ph) 193 | train = training(loss_out) 194 | eval_out = evaluation(logits, labels_ph) 195 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_01-04-2016_22_24: -------------------------------------------------------------------------------- 1 | Iterations: 40000 2 | Learning time: 248.66 minutes 3 | Train precision: 0.74416 4 | Train loss: 0.75565 5 | Validation precision: 0.69510 6 | Validation loss: 0.87159 7 | Extra opts: {'ranks_2': array([1, 2, 2, 2, 2, 2, 1], dtype=int32), 'ranks_1': array([1, 2, 2, 2, 2, 2, 1], dtype=int32)} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | """Generate placeholder variables to 
represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | [np.prod(opts['out_modes_1'])], 82 | train_phase, 83 | scope='BN_' + str(len(layers)), 84 | ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | [np.prod(opts['out_modes_2'])], 104 | train_phase, 105 | scope='BN_' + str(len(layers)), 106 | ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 
137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_02-04-2016_02_33: -------------------------------------------------------------------------------- 1 | Iterations: 40000 2 | Learning time: 286.55 minutes 3 | Train precision: 0.74570 4 | Train loss: 0.74619 5 | Validation precision: 0.69760 6 | Validation loss: 0.86426 7 | Extra opts: {'ranks_2': array([1, 4, 4, 4, 4, 4, 1], dtype=int32), 'ranks_1': array([1, 4, 4, 4, 4, 4, 1], dtype=int32)} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | """Generate placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 
61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | [np.prod(opts['out_modes_1'])], 82 | train_phase, 83 | scope='BN_' + str(len(layers)), 84 | ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | [np.prod(opts['out_modes_2'])], 104 | train_phase, 105 | scope='BN_' + str(len(layers)), 106 | ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 
163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_02-04-2016_07_20: -------------------------------------------------------------------------------- 1 | Iterations: 40000 2 | Learning time: 333.25 minutes 3 | Train precision: 0.77144 4 | Train loss: 0.67462 5 | Validation precision: 0.70550 6 | Validation loss: 0.83444 7 | Extra opts: {'ranks_2': array([1, 6, 6, 6, 6, 6, 1], dtype=int32), 'ranks_1': array([1, 6, 6, 6, 6, 6, 1], dtype=int32)} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | """Generate 
placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | [np.prod(opts['out_modes_1'])], 82 | train_phase, 83 | scope='BN_' + str(len(layers)), 84 | ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | [np.prod(opts['out_modes_2'])], 104 | train_phase, 105 | scope='BN_' + str(len(layers)), 106 | ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 
137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_02-04-2016_12_53: -------------------------------------------------------------------------------- 1 | Iterations: 40000 2 | Learning time: 396.60 minutes 3 | Train precision: 0.78978 4 | Train loss: 0.62019 5 | Validation precision: 0.70740 6 | Validation loss: 0.84421 7 | Extra opts: {'ranks_2': array([ 1, 6, 10, 10, 10, 6, 1], dtype=int32), 'ranks_1': array([ 1, 6, 10, 10, 10, 6, 1], dtype=int32)} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | """Generate placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 
61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | [np.prod(opts['out_modes_1'])], 82 | train_phase, 83 | scope='BN_' + str(len(layers)), 84 | ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | [np.prod(opts['out_modes_2'])], 104 | train_phase, 105 | scope='BN_' + str(len(layers)), 106 | ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 
163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_02-04-2016_19_30: -------------------------------------------------------------------------------- 1 | Iterations: 40000 2 | Learning time: 462.50 minutes 3 | Train precision: 0.80798 4 | Train loss: 0.56969 5 | Validation precision: 0.71340 6 | Validation loss: 0.81786 7 | Extra opts: {'ranks_2': array([ 1, 10, 10, 10, 10, 10, 1], dtype=int32), 'ranks_1': array([ 1, 10, 10, 10, 10, 10, 1], dtype=int32)} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | 
"""Generate placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | [np.prod(opts['out_modes_1'])], 82 | train_phase, 83 | scope='BN_' + str(len(layers)), 84 | ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | [np.prod(opts['out_modes_2'])], 104 | train_phase, 105 | scope='BN_' + str(len(layers)), 106 | ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 
137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_08-04-2016_00_08: -------------------------------------------------------------------------------- 1 | Iterations: 1000 2 | Learning time: 3.31 minutes 3 | Train precision: 0.54084 4 | Train loss: 1.28820 5 | Validation precision: 0.52470 6 | Validation loss: 1.32532 7 | Extra opts: {'out_modes_1': array([6, 6, 6, 6, 6, 6], dtype=int32), 'inp_modes_2': array([6, 6, 6, 6, 6, 6], dtype=int32)} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 10, 10, 10, 10, 10, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 10, 10, 10, 10, 10, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | """Generate placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 
61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | [np.prod(opts['out_modes_1'])], 82 | train_phase, 83 | scope='BN_' + str(len(layers)), 84 | ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | [np.prod(opts['out_modes_2'])], 104 | train_phase, 105 | scope='BN_' + str(len(layers)), 106 | ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 
163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_08-04-2016_00_14: -------------------------------------------------------------------------------- 1 | Iterations: 30000 2 | Learning time: 92.21 minutes 3 | Train precision: 0.77768 4 | Train loss: 0.65437 5 | Validation precision: 0.69740 6 | Validation loss: 0.84584 7 | Extra opts: {'out_modes_1': array([6, 6, 6, 6, 6, 6], dtype=int32), 'inp_modes_2': array([6, 6, 6, 6, 6, 6], dtype=int32)} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 10, 10, 10, 10, 10, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 10, 10, 10, 10, 10, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | 
"""Generate placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | [np.prod(opts['out_modes_1'])], 82 | train_phase, 83 | scope='BN_' + str(len(layers)), 84 | ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | [np.prod(opts['out_modes_2'])], 104 | train_phase, 105 | scope='BN_' + str(len(layers)), 106 | ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 
137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_10-04-2016_20_38: -------------------------------------------------------------------------------- 1 | Iterations: 30000 2 | Learning time: 32.98 minutes 3 | Train precision: 0.68852 4 | Train loss: 0.90400 5 | Validation precision: 0.64600 6 | Validation loss: 0.99333 7 | Extra opts: {} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([5, 5, 5, 5, 5, 5], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | """Generate placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 
61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | [np.prod(opts['out_modes_1'])], 82 | train_phase, 83 | scope='BN_' + str(len(layers)), 84 | ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | [np.prod(opts['out_modes_2'])], 104 | train_phase, 105 | scope='BN_' + str(len(layers)), 106 | ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 
163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_10-04-2016_21_16: -------------------------------------------------------------------------------- 1 | Iterations: 30000 2 | Learning time: 28.52 minutes 3 | Train precision: 0.64264 4 | Train loss: 1.05579 5 | Validation precision: 0.60860 6 | Validation loss: 1.13930 7 | Extra opts: {} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([5, 5, 5, 5, 5, 5], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | """Generate placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 
43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | #layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | # [np.prod(opts['out_modes_1'])], 82 | # train_phase, 83 | # scope='BN_' + str(len(layers)), 84 | # ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | #layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | # [np.prod(opts['out_modes_2'])], 104 | # train_phase, 105 | # scope='BN_' + str(len(layers)), 106 | # ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 
137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 
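# For k = 1 the same count can be obtained (up to tie-breaking) from the
# arg-max prediction; a sketch assuming the TF 1.x API, not used here:
#     predictions = tf.argmax(logits, 1)
#     correct_count = tf.reduce_sum(
#         tf.cast(tf.equal(predictions, tf.cast(labels, tf.int64)), tf.int32))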
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_12-04-2016_10_54: -------------------------------------------------------------------------------- 1 | Iterations: 2000 2 | Learning time: 2.97 minutes 3 | Train precision: 0.52590 4 | Train loss: 1.36872 5 | Validation precision: 0.51480 6 | Validation loss: 1.39315 7 | Extra opts: {} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([5, 5, 5, 5, 5, 5], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | """Generate placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 
61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | #layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | # [np.prod(opts['out_modes_1'])], 82 | # train_phase, 83 | # scope='BN_' + str(len(layers)), 84 | # ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | #layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | # [np.prod(opts['out_modes_2'])], 104 | # train_phase, 105 | # scope='BN_' + str(len(layers)), 106 | # ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 
163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_12-04-2016_11_01: -------------------------------------------------------------------------------- 1 | Iterations: 30000 2 | Learning time: 42.49 minutes 3 | Train precision: 0.71704 4 | Train loss: 0.82338 5 | Validation precision: 0.67280 6 | Validation loss: 0.91675 7 | Extra opts: {} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([5, 5, 5, 5, 5, 5], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 4, 4, 4, 4, 4, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | """Generate placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 
43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | [np.prod(opts['out_modes_1'])], 82 | train_phase, 83 | scope='BN_' + str(len(layers)), 84 | ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | [np.prod(opts['out_modes_2'])], 104 | train_phase, 105 | scope='BN_' + str(len(layers)), 106 | ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 
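# The ops just below use the pre-1.0 TensorFlow API: tf.concat(axis, values),
# tf.pack and tf.scalar_summary became tf.concat(values, axis), tf.stack and
# tf.summary.scalar in TF 1.0 (the FC-net/net.py listed later in this dump
# already uses the updated argument order and tf.summary.scalar).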
137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 
194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/2-layer-tt/results/res_30-03-2016_17_12: -------------------------------------------------------------------------------- 1 | Iterations: 40000 2 | Learning time: 274.66 minutes 3 | Train precision: 0.73638 4 | Train loss: 0.77769 5 | Validation precision: 0.68290 6 | Validation loss: 0.90713 7 | Extra opts: {} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | opts['inp_modes_1'] = np.array([4, 4, 4, 4, 4, 3], dtype='int32') 25 | opts['out_modes_1'] = np.array([8, 8, 8, 8, 8, 8], dtype='int32') 26 | opts['ranks_1'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 27 | 28 | opts['inp_modes_2'] = opts['out_modes_1'] 29 | opts['out_modes_2'] = np.array([4, 4, 4, 4, 4, 4], dtype='int32') 30 | opts['ranks_2'] = np.array([1, 3, 3, 3, 3, 3, 1], dtype='int32') 31 | 32 | 33 | opts['use_dropout'] = True 34 | opts['learning_rate_init'] = 0.06 35 | opts['learning_rate_decay_steps'] = 2000 36 | opts['learning_rate_decay_weight'] = 0.64 37 | 38 | def placeholder_inputs(): 39 | """Generate placeholder variables to represent the input tensors. 40 | 41 | Returns: 42 | images_ph: Images placeholder. 43 | labels_ph: Labels placeholder. 44 | train_phase_ph: Train phase indicator placeholder. 45 | """ 46 | # Note that the shapes of the placeholders match the shapes of the full 47 | # image and label tensors, except the first dimension is now batch_size 48 | # rather than the full size of the train or test data sets. 49 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 50 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 51 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 52 | return images_ph, labels_ph, train_phase_ph 53 | 54 | def inference(images, train_phase): 55 | """Build the model up to where it may be used for inference. 56 | Args: 57 | images: Images placeholder. 58 | train_phase: Train phase placeholder 59 | Returns: 60 | logits: Output tensor with the computed logits. 
61 | """ 62 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 63 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 64 | 65 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 66 | 67 | 68 | layers = [] 69 | layers.append(images) 70 | 71 | 72 | layers.append(tensornet.layers.tt(layers[-1], 73 | opts['inp_modes_1'], 74 | opts['out_modes_1'], 75 | opts['ranks_1'], 76 | 3.0, #0.1 77 | 'tt_' + str(len(layers)), 78 | use_biases=False)) 79 | 80 | layers.append(tensornet.layers.batch_normalization(layers[-1], 81 | [np.prod(opts['out_modes_1'])], 82 | train_phase, 83 | scope='BN_' + str(len(layers)), 84 | ema_decay=0.8)) 85 | 86 | layers.append(tf.nn.relu(layers[-1], 87 | name='relu_' + str(len(layers)))) 88 | layers.append(tf.nn.dropout(layers[-1], 89 | dropout_rate(0.6), 90 | name='dropout_' + str(len(layers)))) 91 | 92 | 93 | ########################################## 94 | layers.append(tensornet.layers.tt(layers[-1], 95 | opts['inp_modes_2'], 96 | opts['out_modes_2'], 97 | opts['ranks_2'], 98 | 3.0, #0.07 99 | 'tt_' + str(len(layers)), 100 | use_biases=False)) 101 | 102 | layers.append(tensornet.layers.batch_normalization(layers[-1], 103 | [np.prod(opts['out_modes_2'])], 104 | train_phase, 105 | scope='BN_' + str(len(layers)), 106 | ema_decay=0.8)) 107 | 108 | layers.append(tf.nn.relu(layers[-1], 109 | name='relu_' + str(len(layers)))) 110 | 111 | layers.append(tf.nn.dropout(layers[-1], 112 | dropout_rate(0.6), 113 | name='dropout_' + str(len(layers)))) 114 | 115 | ########################################## 116 | 117 | layers.append(tensornet.layers.linear(layers[-1], 118 | np.prod(opts['out_modes_2']), 119 | NUM_CLASSES, 120 | init=tn_init(2.0 / np.prod(opts['out_modes_2'])), 121 | scope='linear_' + str(len(layers)))) 122 | 123 | return layers[-1] 124 | 125 | def loss(logits, labels): 126 | """Calculates the loss from the logits and the labels. 127 | Args: 128 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 129 | labels: Labels tensor, int32 - [batch_size]. 130 | Returns: 131 | loss: Loss tensor of type float. 132 | """ 133 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 134 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 135 | # each with NUM_CLASSES values, all of which are 0.0 except there will 136 | # be a 1.0 in the entry corresponding to the label). 137 | batch_size = tf.size(labels) 138 | labels = tf.expand_dims(labels, 1) 139 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 140 | concated = tf.concat(1, [indices, labels]) 141 | onehot_labels = tf.sparse_to_dense(concated, 142 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 143 | 144 | 145 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 146 | onehot_labels, 147 | name='xentropy') 148 | loss = tf.reduce_mean(cross_entropy, name='loss') 149 | tf.scalar_summary('loss', loss, name='summary/loss') 150 | return loss 151 | 152 | def training(loss): 153 | """Sets up the training Ops. 154 | Creates an optimizer and applies the gradients to all trainable variables. 155 | The Op returned by this function is what must be passed to the 156 | `sess.run()` call to cause the model to train. 157 | Args: 158 | loss: Loss tensor, from loss(). 159 | Returns: 160 | train_op: The Op for training. 161 | """ 162 | # Create a variable to track the global step. 
163 | global_step = tf.Variable(0, name='global_step', trainable=False) 164 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 165 | global_step, 166 | opts['learning_rate_decay_steps'], 167 | opts['learning_rate_decay_weight'], 168 | staircase=True, 169 | name='learning_rate') 170 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 171 | # Create the gradient descent optimizer with the given learning rate. 172 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 173 | 174 | grads_and_vars = optimizer.compute_gradients(loss) 175 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 176 | return train_op 177 | 178 | def evaluation(logits, labels): 179 | """Evaluate the quality of the logits at predicting the label. 180 | Args: 181 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size], with values in the 183 | range [0, NUM_CLASSES). 184 | Returns: 185 | A scalar int32 tensor with the number of examples (out of batch_size) 186 | that were predicted correctly. 187 | """ 188 | # For a classifier model, we can use the in_top_k Op. 189 | # It returns a bool tensor with shape [batch_size] that is true for 190 | # the examples where the label's is was in the top k (here k=1) 191 | # of all logits for that example. 192 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 193 | # Return the number of true entries. 194 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 195 | return correct_count 196 | 197 | 198 | def build(new_opts={}): 199 | """ Build graph 200 | Args: 201 | new_opts: dict with additional opts, which will be added to opts dict/ 202 | """ 203 | opts.update(new_opts) 204 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 205 | logits = inference(images_ph, train_phase_ph) 206 | loss_out = loss(logits, labels_ph) 207 | train = training(loss_out) 208 | eval_out = evaluation(logits, labels_ph) 209 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/FC-net/input_data.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | 4 | class DataSet(object): 5 | def __init__(self, images, labels): 6 | """Construct a DataSet. 
7 | """ 8 | assert images.shape[0] == labels.shape[0], ('images.shape: %s labels.shape: %s' % 9 | (images.shape, labels.shape)) 10 | self._num_examples = images.shape[0] 11 | self._images = images 12 | self._labels = labels 13 | self._epochs_completed = 0 14 | self._index_in_epoch = 0 15 | 16 | @property 17 | def images(self): 18 | return self._images 19 | 20 | @property 21 | def labels(self): 22 | return self._labels 23 | 24 | @property 25 | def num_examples(self): 26 | return self._num_examples 27 | 28 | @property 29 | def epochs_completed(self): 30 | return self._epochs_completed 31 | 32 | def next_batch(self, batch_size): 33 | start = self._index_in_epoch 34 | self._index_in_epoch += batch_size 35 | if self._index_in_epoch > self._num_examples: 36 | # Finished epoch 37 | self._epochs_completed += 1 38 | # Shuffle the data 39 | perm = np.arange(self._num_examples) 40 | np.random.shuffle(perm) 41 | self._images = self._images[perm] 42 | self._labels = self._labels[perm] 43 | # Start next epoch 44 | start = 0 45 | self._index_in_epoch = batch_size 46 | assert batch_size <= self._num_examples 47 | end = self._index_in_epoch 48 | return self._images[start:end], self._labels[start:end] 49 | 50 | 51 | def read_data_sets(data_dir): 52 | f = np.load(data_dir + '/cifar.npz') 53 | train_images = f['train_images'].astype('float32') 54 | train_labels = f['train_labels'] 55 | 56 | validation_images = f['validation_images'].astype('float32') 57 | validation_labels = f['validation_labels'] 58 | 59 | mean = np.mean(train_images, axis=0)[np.newaxis, :] 60 | std = np.std(train_images, axis=0)[np.newaxis, :] 61 | 62 | train_images = (train_images - mean) / std; 63 | validation_images = (validation_images - mean) / std; 64 | 65 | train = DataSet(train_images, train_labels) 66 | validation = DataSet(validation_images, validation_labels) 67 | return train, validation 68 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/FC-net/net.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import math 3 | import numpy as np 4 | import sys 5 | 6 | 7 | sys.path.append('../../../../') 8 | import tensornet 9 | 10 | NUM_CLASSES = 10 11 | IMAGE_SIZE = 32 12 | IMAGE_DEPTH = 3 13 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 14 | 15 | opts = {} 16 | 17 | opts['hidden_units'] = [4096, 4096, 4096, 4096] 18 | opts['use_dropout'] = True 19 | opts['learning_rate_init'] = 1.0 20 | opts['learning_rate_decay_steps'] = 2000 21 | opts['learning_rate_decay_weight'] = 0.64 22 | 23 | def placeholder_inputs(): 24 | """Generate placeholder variables to represent the input tensors. 25 | 26 | Returns: 27 | images_ph: Images placeholder. 28 | labels_ph: Labels placeholder. 29 | train_phase_ph: Train phase indicator placeholder. 30 | """ 31 | # Note that the shapes of the placeholders match the shapes of the full 32 | # image and label tensors, except the first dimension is now batch_size 33 | # rather than the full size of the train or test data sets. 34 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 35 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 36 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 37 | return images_ph, labels_ph, train_phase_ph 38 | 39 | def inference(images, train_phase): 40 | """Build the model up to where it may be used for inference. 
41 | Args: 42 | images: Images placeholder. 43 | train_phase: Train phase placeholder 44 | Returns: 45 | logits: Output tensor with the computed logits. 46 | """ 47 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 48 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 49 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 50 | 51 | 52 | 53 | layers = [] 54 | layers.append(images) 55 | 56 | cnt = len(opts['hidden_units']) 57 | for i in range(cnt): 58 | n_out = opts['hidden_units'][i] 59 | 60 | layers.append(tensornet.layers.linear(layers[-1], 61 | n_out, 62 | scope='linear_' + str(len(layers)), 63 | biases_initializer=None)) 64 | 65 | layers.append(tensornet.layers.batch_normalization(layers[-1], 66 | train_phase, 67 | scope='BN_' + str(len(layers)), 68 | ema_decay=0.8)) 69 | layers.append(tf.nn.relu(layers[-1], 70 | name='relu_' + str(len(layers)))) 71 | 72 | layers.append(tf.nn.dropout(layers[-1], 73 | dropout_rate(0.77), 74 | name='dropout_' + str(len(layers)))) 75 | 76 | layers.append(tensornet.layers.linear(layers[-1], 77 | NUM_CLASSES, 78 | scope='linear_' + str(len(layers)))) 79 | 80 | return layers[-1] 81 | 82 | def loss(logits, labels): 83 | """Calculates the loss from the logits and the labels. 84 | Args: 85 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 86 | labels: Labels tensor, int32 - [batch_size]. 87 | Returns: 88 | loss: Loss tensor of type float. 89 | """ 90 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 91 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 92 | # each with NUM_CLASSES values, all of which are 0.0 except there will 93 | # be a 1.0 in the entry corresponding to the label). 94 | batch_size = tf.size(labels) 95 | labels = tf.expand_dims(labels, 1) 96 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 97 | concated = tf.concat([indices, labels], 1) 98 | onehot_labels = tf.sparse_to_dense(concated, 99 | tf.shape(logits), 1.0, 0.0) 100 | 101 | 102 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, 103 | labels=onehot_labels, 104 | name='xentropy') 105 | loss = tf.reduce_mean(cross_entropy, name='loss') 106 | tf.summary.scalar('summary/loss', loss) 107 | return loss 108 | 109 | def training(loss): 110 | """Sets up the training Ops. 111 | Creates an optimizer and applies the gradients to all trainable variables. 112 | The Op returned by this function is what must be passed to the 113 | `sess.run()` call to cause the model to train. 114 | Args: 115 | loss: Loss tensor, from loss(). 116 | Returns: 117 | train_op: The Op for training. 118 | """ 119 | # Create a variable to track the global step. 120 | global_step = tf.Variable(0, name='global_step', trainable=False) 121 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 122 | global_step, 123 | opts['learning_rate_decay_steps'], 124 | opts['learning_rate_decay_weight'], 125 | staircase=True, 126 | name='learning_rate') 127 | tf.summary.scalar('summary/learning_rate', learning_rate) 128 | # Create the gradient descent optimizer with the given learning rate. 
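    # For example, with the defaults above (learning_rate_init=1.0,
    # learning_rate_decay_steps=2000, learning_rate_decay_weight=0.64) and
    # staircase=True, the rate passed to the optimizer below is
    # 1.0 * 0.64 ** floor(global_step / 2000): 1.0 for steps 0-1999, 0.64 for
    # steps 2000-3999, 0.4096 for steps 4000-5999, and so on.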
129 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 130 | 131 | grads_and_vars = optimizer.compute_gradients(loss) 132 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 133 | return train_op 134 | 135 | def evaluation(logits, labels): 136 | """Evaluate the quality of the logits at predicting the label. 137 | Args: 138 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 139 | labels: Labels tensor, int32 - [batch_size], with values in the 140 | range [0, NUM_CLASSES). 141 | Returns: 142 | A scalar int32 tensor with the number of examples (out of batch_size) 143 | that were predicted correctly. 144 | """ 145 | # For a classifier model, we can use the in_top_k Op. 146 | # It returns a bool tensor with shape [batch_size] that is true for 147 | # the examples where the label's is was in the top k (here k=1) 148 | # of all logits for that example. 149 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 150 | # Return the number of true entries. 151 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 152 | return correct_count 153 | 154 | 155 | def build(new_opts={}): 156 | """ Build graph 157 | Args: 158 | new_opts: dict with additional opts, which will be added to opts dict/ 159 | """ 160 | opts.update(new_opts) 161 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 162 | logits = inference(images_ph, train_phase_ph) 163 | loss_out = loss(logits, labels_ph) 164 | train = training(loss_out) 165 | eval_out = evaluation(logits, labels_ph) 166 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/FC-net/results/res_30-03-2016_16_58: -------------------------------------------------------------------------------- 1 | Iterations: 40000 2 | Learning time: 43.06 minutes 3 | Train precision: 0.94912 4 | Train loss: 0.25110 5 | Validation precision: 0.59250 6 | Validation loss: 1.38745 7 | Extra opts: {} 8 | Code: 9 | import tensorflow as tf 10 | import math 11 | import numpy as np 12 | import sys 13 | 14 | 15 | sys.path.append('../../../') 16 | import tensornet 17 | 18 | NUM_CLASSES = 10 19 | IMAGE_SIZE = 32 20 | IMAGE_DEPTH = 3 21 | IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE * IMAGE_DEPTH 22 | 23 | opts = {} 24 | 25 | opts['hidden_units'] = [3072, 4096, 4096, 4096, 4096, 10] 26 | opts['use_dropout'] = True 27 | opts['learning_rate_init'] = 1.0 28 | opts['learning_rate_decay_steps'] = 2000 29 | opts['learning_rate_decay_weight'] = 0.64 30 | 31 | def placeholder_inputs(): 32 | """Generate placeholder variables to represent the input tensors. 33 | 34 | Returns: 35 | images_ph: Images placeholder. 36 | labels_ph: Labels placeholder. 37 | train_phase_ph: Train phase indicator placeholder. 38 | """ 39 | # Note that the shapes of the placeholders match the shapes of the full 40 | # image and label tensors, except the first dimension is now batch_size 41 | # rather than the full size of the train or test data sets. 42 | images_ph = tf.placeholder(tf.float32, shape=(None, IMAGE_PIXELS), name='placeholder/images') 43 | labels_ph = tf.placeholder(tf.int32, shape=(None), name='placeholder/labels') 44 | train_phase_ph = tf.placeholder(tf.bool, name='placeholder/train_phase') 45 | return images_ph, labels_ph, train_phase_ph 46 | 47 | def inference(images, train_phase): 48 | """Build the model up to where it may be used for inference. 49 | Args: 50 | images: Images placeholder. 
51 | train_phase: Train phase placeholder 52 | Returns: 53 | logits: Output tensor with the computed logits. 54 | """ 55 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 56 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 57 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 58 | 59 | 60 | 61 | layers = [] 62 | layers.append(images) 63 | 64 | cnt = len(opts['hidden_units']) 65 | for i in range(cnt - 1): 66 | n_in = opts['hidden_units'][i] 67 | n_out = opts['hidden_units'][i + 1] 68 | 69 | layers.append(tensornet.layers.linear(layers[-1], 70 | n_in, 71 | n_out, 72 | init=tn_init(2.0 / n_in), 73 | scope='linear_' + str(len(layers)), 74 | use_biases=(i==cnt-2))) 75 | if (i < cnt - 1): 76 | layers.append(tensornet.layers.batch_normalization(layers[-1], 77 | [n_out], 78 | train_phase, 79 | scope='BN_' + str(len(layers)), 80 | ema_decay=0.8)) 81 | layers.append(tf.nn.relu(layers[-1], 82 | name='relu_' + str(len(layers)))) 83 | 84 | layers.append(tf.nn.dropout(layers[-1], 85 | dropout_rate(0.77), 86 | name='dropout_' + str(len(layers)))) 87 | 88 | return layers[-1] 89 | 90 | def loss(logits, labels): 91 | """Calculates the loss from the logits and the labels. 92 | Args: 93 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 94 | labels: Labels tensor, int32 - [batch_size]. 95 | Returns: 96 | loss: Loss tensor of type float. 97 | """ 98 | # Convert from sparse integer labels in the range [0, NUM_CLASSES) 99 | # to 1-hot dense float vectors (that is we will have batch_size vectors, 100 | # each with NUM_CLASSES values, all of which are 0.0 except there will 101 | # be a 1.0 in the entry corresponding to the label). 102 | batch_size = tf.size(labels) 103 | labels = tf.expand_dims(labels, 1) 104 | indices = tf.expand_dims(tf.range(0, batch_size), 1) 105 | concated = tf.concat(1, [indices, labels]) 106 | onehot_labels = tf.sparse_to_dense(concated, 107 | tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) 108 | 109 | 110 | cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, 111 | onehot_labels, 112 | name='xentropy') 113 | loss = tf.reduce_mean(cross_entropy, name='loss') 114 | tf.scalar_summary('loss', loss, name='summary/loss') 115 | return loss 116 | 117 | def training(loss): 118 | """Sets up the training Ops. 119 | Creates an optimizer and applies the gradients to all trainable variables. 120 | The Op returned by this function is what must be passed to the 121 | `sess.run()` call to cause the model to train. 122 | Args: 123 | loss: Loss tensor, from loss(). 124 | Returns: 125 | train_op: The Op for training. 126 | """ 127 | # Create a variable to track the global step. 128 | global_step = tf.Variable(0, name='global_step', trainable=False) 129 | learning_rate = tf.train.exponential_decay(opts['learning_rate_init'], 130 | global_step, 131 | opts['learning_rate_decay_steps'], 132 | opts['learning_rate_decay_weight'], 133 | staircase=True, 134 | name='learning_rate') 135 | tf.scalar_summary('learning_rate', learning_rate, name='summary/learning_rate') 136 | # Create the gradient descent optimizer with the given learning rate. 
137 | optimizer = tf.train.GradientDescentOptimizer(learning_rate, name='optimizer') 138 | 139 | grads_and_vars = optimizer.compute_gradients(loss) 140 | train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name='train_op') 141 | return train_op 142 | 143 | def evaluation(logits, labels): 144 | """Evaluate the quality of the logits at predicting the label. 145 | Args: 146 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 147 | labels: Labels tensor, int32 - [batch_size], with values in the 148 | range [0, NUM_CLASSES). 149 | Returns: 150 | A scalar int32 tensor with the number of examples (out of batch_size) 151 | that were predicted correctly. 152 | """ 153 | # For a classifier model, we can use the in_top_k Op. 154 | # It returns a bool tensor with shape [batch_size] that is true for 155 | # the examples where the label's is was in the top k (here k=1) 156 | # of all logits for that example. 157 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 158 | # Return the number of true entries. 159 | correct_count = tf.reduce_sum(tf.cast(correct_flags, tf.int32), name='correct_count') 160 | return correct_count 161 | 162 | 163 | def build(new_opts={}): 164 | """ Build graph 165 | Args: 166 | new_opts: dict with additional opts, which will be added to opts dict/ 167 | """ 168 | opts.update(new_opts) 169 | images_ph, labels_ph, train_phase_ph = placeholder_inputs() 170 | logits = inference(images_ph, train_phase_ph) 171 | loss_out = loss(logits, labels_ph) 172 | train = training(loss_out) 173 | eval_out = evaluation(logits, labels_ph) 174 | -------------------------------------------------------------------------------- /experiments/cifar-10/FC-Tensorizing-Neural-Networks/README.md: -------------------------------------------------------------------------------- 1 | # Experiments with TT-FC layer 2 | 3 | This folder contains the code to reproduce the experiments on the CIFAR-10 dataset for the paper 4 | 5 | _Tensorizing Neural Networks_ 6 | Alexander Novikov, Dmitry Podoprikhin, Anton Osokin, Dmitry Vetrov; In _Advances in Neural Information Processing Systems 28_ (NIPS-2015) [[arXiv](http://arxiv.org/abs/1509.06569)]. 7 | -------------------------------------------------------------------------------- /experiments/cifar-10/conv-Ultimate-Tensorization/README.md: -------------------------------------------------------------------------------- 1 | # Experiments with TT-conv layer 2 | 3 | This folder contains the framework we used to conduct experiments on the CIFAR-10 dataset for the paper 4 | 5 | _Ultimate tensorization: compressing convolutional and FC layers alike_ 6 | Timur Garipov, Dmitry Podoprikhin, Alexander Novikov, Dmitry Vetrov; _Learning with Tensors: Why Now and How?_, NIPS-2016 workshop (NIPS-2015) [[arXiv](https://arxiv.org/abs/1611.03214)]. 7 | 8 | ## Training 9 | 10 | The following command runs the training procedure: 11 | 12 | ```bash 13 | python3 train.py --net_module= \ 14 | --log_dir= \ 15 | --data_dir= \ 16 | --num_gpus= 17 | ``` 18 | 19 | where 20 | * ```net_module``` is a path to a python-file with network description (e.g. ```./nets/conv.py```); 21 | 22 | 23 | * ```log_dir``` is a path to directory where summaries and checkpoints should be saved (e.g. ```./log/conv```); 24 | 25 | * ```data_dir``` is a path to directory with data (e.g. ```../data/```); 26 | 27 | 28 | * ```num_gpus``` is a number of gpu's that will be used for training. 
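For example, assuming the data have already been prepared in ```../data/``` (see ```prepare_data.py```), a complete single-GPU invocation for the plain convolutional baseline might look like:

```bash
python3 train.py --net_module=./nets/conv.py \
                 --log_dir=./log/conv \
                 --data_dir=../data/ \
                 --num_gpus=1
```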
29 | 30 | ### Training with pretrained convolutional part initialization 31 | 32 | There is auxiliary scipt for training a network with convolutional part initialized with pretrained weights: 33 | 34 | ```bash 35 | python3 train_with_pretrained_convs.py --net_module=\ 36 | --log_dir= \ 37 | --num_gpus= \ 38 | --data_dir= \ 39 | --pretrained_ckpt= 40 | ``` 41 | 42 | where ```pretrained_ckpt``` is the path to the checkpoint file with pretrained weights. 43 | 44 | ## Evaluation 45 | 46 | The following command runs the evaluation process of a trained network: 47 | 48 | ```bash 49 | python3 eval.py --net_module= \ 50 | --log_dir= \ 51 | --data_dir= 52 | ``` 53 | -------------------------------------------------------------------------------- /experiments/cifar-10/conv-Ultimate-Tensorization/eval.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | import os 5 | import os.path 6 | import datetime 7 | import shutil 8 | import imp 9 | import time 10 | import tensorflow.python.platform 11 | import numpy as np 12 | from six.moves import xrange # pylint: disable=redefined-builtin 13 | import tensorflow as tf 14 | import re 15 | import input_data 16 | import sys 17 | 18 | import shutil 19 | 20 | net = None 21 | 22 | tf.set_random_seed(12345) 23 | np.random.seed(12345) 24 | 25 | # Basic model parameters as external flags. 26 | flags = tf.app.flags 27 | FLAGS = flags.FLAGS 28 | 29 | 30 | flags.DEFINE_string('net_module', None, 'Module with architecture description.') 31 | flags.DEFINE_string('log_dir', None, 'Directory with log files.') 32 | flags.DEFINE_integer('batch_size', 100, 'Batch size. ' 33 | 'Must divide evenly into the dataset sizes.') 34 | flags.DEFINE_string('data_dir', '../data/', 'Directory to put the training data.') 35 | 36 | flags.DEFINE_boolean('log_device_placement', False, """Whether to log device placement.""") 37 | 38 | def tower_loss_and_eval(images, labels, train_phase, cpu_variables=False): 39 | with tf.variable_scope('inference', reuse=False): 40 | logits = net.inference(images, train_phase, cpu_variables=cpu_variables) 41 | losses = net.losses(logits, labels) 42 | total_loss = tf.add_n(losses, name='total_loss') 43 | evaluation = net.evaluation(logits, labels) 44 | return total_loss, evaluation 45 | 46 | def evaluate(sess, 47 | loss, 48 | evaluation, 49 | train_or_val, 50 | images_ph, 51 | images, 52 | labels_ph, 53 | labels): 54 | fmt_str = 'Evaluation [%s]. Batch %d/%d (%d%%). Speed = %.2f sec/b, %.2f img/sec. Batch_loss = %.2f. Batch_precision = %.2f' 55 | 56 | num_batches = labels.size // FLAGS.batch_size 57 | assert labels.size % FLAGS.batch_size == 0, 'Batch size must divide evenly into the dataset sizes.' 
58 | assert images.shape[0] == labels.size, 'Images count must be equal to labels count' 59 | 60 | sum_loss = 0.0 61 | sum_correct = 0.0 62 | 63 | w = os.get_terminal_size().columns 64 | sys.stdout.write(('=' * w + '\n') * 2) 65 | sys.stdout.write('\n') 66 | sys.stdout.write('Evaluation [%s]' % train_or_val) 67 | 68 | cum_t = 0.0 69 | for bid in range(num_batches): 70 | b_images = images[bid * FLAGS.batch_size:(bid + 1) * FLAGS.batch_size] 71 | b_labels = labels[bid * FLAGS.batch_size:(bid + 1) * FLAGS.batch_size] 72 | start_time = time.time() 73 | loss_val, eval_val = sess.run([loss, evaluation], feed_dict={images_ph: b_images, labels_ph: b_labels}) 74 | duration = time.time() - start_time 75 | 76 | cum_t += duration 77 | sec_per_batch = duration 78 | img_per_sec = FLAGS.batch_size / duration 79 | 80 | 81 | sum_loss += loss_val * FLAGS.batch_size 82 | sum_correct += np.sum(eval_val) 83 | 84 | if cum_t > 0.5: 85 | sys.stdout.write('\r' + fmt_str % ( 86 | train_or_val, 87 | bid + 1, 88 | num_batches, 89 | int((bid + 1) * 100.0 / num_batches), 90 | sec_per_batch, 91 | img_per_sec, 92 | loss_val, 93 | np.mean(eval_val) * 100.0 94 | )) 95 | sys.stdout.flush() 96 | cum_t = 0.0 97 | 98 | sys.stdout.write(('\r' + fmt_str + '\n') % ( 99 | train_or_val, 100 | num_batches, 101 | num_batches, 102 | int(100.0), 103 | sec_per_batch, 104 | img_per_sec, 105 | loss_val, 106 | np.mean(eval_val) * 100.0 107 | )) 108 | 109 | sys.stdout.write('%s loss = %.2f. %s precision = %.2f.\n\n' % ( 110 | train_or_val, 111 | sum_loss / labels.size, 112 | train_or_val, 113 | sum_correct / labels.size * 100.0 114 | )) 115 | 116 | def run_eval(chkpt): 117 | global net 118 | net = imp.load_source('net', FLAGS.net_module) 119 | with tf.Graph().as_default(), tf.device('/cpu:0'): 120 | train_phase = tf.constant(False, name='train_phase', dtype=tf.bool) 121 | 122 | t_images, t_labels = input_data.get_train_data(FLAGS.data_dir) 123 | aux = { 124 | 'mean': np.mean(t_images, axis=0), 125 | 'std': np.std(t_images, axis=0) 126 | } 127 | v_images, v_labels = input_data.get_validation_data(FLAGS.data_dir) 128 | 129 | images_ph = tf.placeholder(tf.float32, shape=[None] + list(t_images.shape[1:]), name='images_ph') 130 | labels_ph = tf.placeholder(tf.int32, shape=[None], name='labels_ph') 131 | 132 | images = net.aug_eval(images_ph, aux) 133 | with tf.device('/gpu:0'): 134 | with tf.name_scope('tower_0') as scope: 135 | loss, evaluation = tower_loss_and_eval(images, labels_ph, train_phase) 136 | 137 | 138 | variable_averages = tf.train.ExponentialMovingAverage(0.999) 139 | variables_averages_op = variable_averages.apply(tf.trainable_variables()) 140 | 141 | saver = tf.train.Saver(tf.global_variables()) 142 | ema_saver = tf.train.Saver(variable_averages.variables_to_restore()) 143 | 144 | sess = tf.Session(config=tf.ConfigProto( 145 | allow_soft_placement=True, 146 | log_device_placement=FLAGS.log_device_placement)) 147 | 148 | 149 | 150 | saver.restore(sess, chkpt) 151 | ema_saver.restore(sess, chkpt) 152 | sys.stdout.write('Checkpoint "%s" restored.\n' % (chkpt)) 153 | evaluate(sess, loss, evaluation, 'Train', images_ph, t_images, labels_ph, t_labels) 154 | evaluate(sess, loss, evaluation, 'Validation', images_ph, v_images, labels_ph, v_labels) 155 | 156 | def main(_): 157 | latest_chkpt = tf.train.latest_checkpoint(FLAGS.log_dir) 158 | if latest_chkpt is not None: 159 | sys.stdout.write('Checkpoint "%s" found.\n' % latest_chkpt) 160 | run_eval(latest_chkpt) 161 | else: 162 | sys.stdout.write('Checkpoint not found.\n') 163 | 164 | if 
__name__ == '__main__': 165 | tf.app.run() 166 | -------------------------------------------------------------------------------- /experiments/cifar-10/conv-Ultimate-Tensorization/input_data.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import tensorflow as tf 3 | 4 | 5 | def get_train_data(data_dir): 6 | f = np.load(data_dir + '/cifar.npz') 7 | images = np.reshape(f['train_images'], [-1, 32, 32, 3]) 8 | labels = f['train_labels'] 9 | return images, labels 10 | 11 | def get_validation_data(data_dir): 12 | f = np.load(data_dir + '/cifar.npz') 13 | images = np.reshape(f['validation_images'], [-1, 32, 32, 3]) 14 | labels = f['validation_labels'] 15 | return images, labels 16 | 17 | 18 | def get_input_data(FLAGS): 19 | t_images, t_labels = get_train_data(FLAGS.data_dir) 20 | t_cnt = t_images.shape[0] 21 | train_images_ph = tf.placeholder(dtype=tf.float32, shape=[t_cnt, 32, 32, 3], name='train_images_ph') 22 | train_labels_ph = tf.placeholder(dtype=tf.int32, shape=[t_cnt], name='train_labels_ph') 23 | train_images = tf.Variable(train_images_ph, trainable=False, collections=[], name='train_images') 24 | train_labels = tf.Variable(train_labels_ph, trainable=False, collections=[], name='train_labels') 25 | 26 | train_image_input, train_label_input = tf.train.slice_input_producer([train_images, train_labels], 27 | shuffle=True, 28 | capacity=FLAGS.num_gpus * FLAGS.batch_size + 20, 29 | name='train_input') 30 | 31 | 32 | 33 | v_images, v_labels = get_validation_data(FLAGS.data_dir) 34 | v_cnt = v_images.shape[0] 35 | validation_images_ph = tf.placeholder(dtype=tf.float32, shape=[v_cnt, 32, 32, 3], name='validation_images_ph') 36 | validation_labels_ph = tf.placeholder(dtype=tf.int32, shape=[v_cnt], name='validation_labels_ph') 37 | validation_images = tf.Variable(validation_images_ph, trainable=False, collections=[], name='validation_images') 38 | validation_labels = tf.Variable(validation_labels_ph, trainable=False, collections=[], name='validation_labels') 39 | 40 | validation_image_input, validation_label_input = tf.train.slice_input_producer([validation_images, validation_labels], 41 | shuffle=False, 42 | capacity=FLAGS.batch_size + 20, 43 | name='validation_input') 44 | 45 | result = {} 46 | result['train'] = { 47 | 'images': t_images, 48 | 'labels': t_labels, 49 | 'image_input': train_image_input, 50 | 'label_input': train_label_input 51 | } 52 | result['validation'] = { 53 | 'images': v_images, 54 | 'labels': v_labels, 55 | 'image_input': validation_image_input, 56 | 'label_input': validation_label_input 57 | } 58 | result['initializer'] = [ 59 | train_images.initializer, 60 | train_labels.initializer, 61 | validation_images.initializer, 62 | validation_labels.initializer 63 | ] 64 | 65 | result['init_feed'] = { 66 | train_images_ph: t_images, 67 | train_labels_ph: t_labels, 68 | validation_images_ph: v_images, 69 | validation_labels_ph: v_labels 70 | } 71 | 72 | result['aux'] = { 73 | 'mean': np.mean(t_images, axis=0), 74 | 'std': np.std(t_images, axis=0) 75 | } 76 | 77 | 78 | return result 79 | -------------------------------------------------------------------------------- /experiments/cifar-10/conv-Ultimate-Tensorization/nets/conv.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import math 3 | import numpy as np 4 | import sys 5 | 6 | 7 | sys.path.append('../../../') 8 | import tensornet 9 | 10 | NUM_CLASSES = 10 11 | 12 | 13 | 14 | opts = {} 15 | 
opts['use_dropout'] = True 16 | opts['initial_learning_rate'] = 0.1 17 | opts['num_epochs_per_decay'] = 30.0 18 | opts['learning_rate_decay_factor'] = 0.1 19 | 20 | def aug_train(image, aux): 21 | aug_image = tf.pad(image, [[4, 4], [4, 4], [0, 0]]) 22 | aug_image = tf.random_crop(aug_image, [32, 32, 3]) 23 | aug_image = tf.image.random_flip_left_right(aug_image) 24 | aug_image = tf.image.random_contrast(aug_image, 0.75, 1.25) 25 | aug_image = (aug_image - aux['mean']) / aux['std'] 26 | return aug_image 27 | 28 | def aug_eval(image, aux): 29 | aug_image = (image - aux['mean']) / aux['std'] 30 | return aug_image 31 | 32 | def inference(images, train_phase, reuse=None, cpu_variables=False): 33 | """Build the model up to where it may be used for inference. 34 | Args: 35 | images: Images placeholder. 36 | train_phase: Train phase placeholder 37 | Returns: 38 | logits: Output tensor with the computed logits. 39 | """ 40 | tn_init = lambda dev: lambda shape: tf.truncated_normal(shape, stddev=dev) 41 | tu_init = lambda bound: lambda shape: tf.random_uniform(shape, minval = -bound, maxval = bound) 42 | 43 | dropout_rate = lambda p: (opts['use_dropout'] * (p - 1.0)) * tf.to_float(train_phase) + 1.0 44 | 45 | 46 | 47 | layers = [] 48 | layers.append(images) 49 | 50 | layers.append(tensornet.layers.conv(layers[-1], 51 | 64, 52 | [3, 3], 53 | cpu_variables=cpu_variables, 54 | biases_initializer=tf.zeros_initializer(), 55 | scope='conv1.1')) 56 | 57 | 58 | layers.append(tensornet.layers.batch_normalization(layers[-1], 59 | train_phase, 60 | cpu_variables=cpu_variables, 61 | scope='bn1.1')) 62 | 63 | layers.append(tf.nn.relu(layers[-1], 64 | name='relu1.1')) 65 | 66 | layers.append(tensornet.layers.conv(layers[-1], 67 | 64, 68 | [3, 3], 69 | cpu_variables=cpu_variables, 70 | biases_initializer=tf.zeros_initializer(), 71 | scope='conv1.2')) 72 | 73 | layers.append(tensornet.layers.batch_normalization(layers[-1], 74 | train_phase, 75 | cpu_variables=cpu_variables, 76 | scope='bn1.2')) 77 | 78 | layers.append(tf.nn.relu(layers[-1], 79 | name='relu1.2')) 80 | 81 | 82 | layers.append(tf.nn.max_pool(layers[-1], 83 | [1, 3, 3, 1], 84 | [1, 2, 2, 1], 85 | 'SAME', 86 | name='max_pool1')) 87 | 88 | layers.append(tensornet.layers.conv(layers[-1], 89 | 128, 90 | [3, 3], 91 | cpu_variables=cpu_variables, 92 | biases_initializer=tf.zeros_initializer(), 93 | scope='conv2.1')) 94 | 95 | layers.append(tensornet.layers.batch_normalization(layers[-1], 96 | train_phase, 97 | cpu_variables=cpu_variables, 98 | scope='bn2.1')) 99 | 100 | layers.append(tf.nn.relu(layers[-1], 101 | name='relu2.1')) 102 | 103 | layers.append(tensornet.layers.conv(layers[-1], 104 | 128, 105 | [3, 3], 106 | cpu_variables=cpu_variables, 107 | biases_initializer=tf.zeros_initializer(), 108 | scope='conv2.2')) 109 | 110 | layers.append(tensornet.layers.batch_normalization(layers[-1], 111 | train_phase, 112 | cpu_variables=cpu_variables, 113 | scope='bn2.2')) 114 | 115 | layers.append(tf.nn.relu(layers[-1], 116 | name='relu2.2')) 117 | 118 | 119 | layers.append(tf.nn.max_pool(layers[-1], 120 | [1, 3, 3, 1], 121 | [1, 2, 2, 1], 122 | 'SAME', 123 | name='max_pool2')) 124 | 125 | layers.append(tensornet.layers.conv(layers[-1], 126 | 128, 127 | [3, 3], 128 | padding='VALID', 129 | cpu_variables=cpu_variables, 130 | biases_initializer=tf.zeros_initializer(), 131 | scope='conv3.1')) 132 | 133 | layers.append(tensornet.layers.batch_normalization(layers[-1], 134 | train_phase, 135 | cpu_variables=cpu_variables, 136 | scope='bn3.1')) 137 | 138 | 
layers.append(tf.nn.relu(layers[-1], 139 | name='relu3.1')) 140 | 141 | 142 | layers.append(tensornet.layers.conv(layers[-1], 143 | 128, 144 | [3, 3], 145 | padding='VALID', 146 | cpu_variables=cpu_variables, 147 | biases_initializer=tf.zeros_initializer(), 148 | scope='conv3.2')) 149 | 150 | layers.append(tensornet.layers.batch_normalization(layers[-1], 151 | train_phase, 152 | cpu_variables=cpu_variables, 153 | scope='bn3.2')) 154 | 155 | layers.append(tf.nn.relu(layers[-1], 156 | name='relu3.2')) 157 | 158 | 159 | 160 | 161 | layers.append(tf.nn.avg_pool(layers[-1], 162 | [1,4,4,1], 163 | [1,4,4,1], 164 | 'SAME', 165 | name='avg_pool_full')) 166 | 167 | 168 | sz = np.prod(layers[-1].get_shape().as_list()[1:]) 169 | 170 | layers.append(tensornet.layers.linear(tf.reshape(layers[-1], [-1, sz]), 171 | NUM_CLASSES, 172 | cpu_variables=cpu_variables, 173 | biases_initializer=None, 174 | scope='linear4.1')) 175 | 176 | return layers[-1] 177 | 178 | def losses(logits, labels): 179 | """Calculates losses from the logits and the labels. 180 | Args: 181 | logits: input tensor, float - [batch_size, NUM_CLASSES]. 182 | labels: Labels tensor, int32 - [batch_size]. 183 | Returns: 184 | losses: list of loss tensors of type float. 185 | """ 186 | xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels, name='xentropy') 187 | loss = tf.reduce_mean(xentropy, name='loss') 188 | return [loss] 189 | 190 | def evaluation(logits, labels): 191 | """Evaluate the quality of the logits at predicting the label. 192 | Args: 193 | logits: Logits tensor, float - [batch_size, NUM_CLASSES]. 194 | labels: Labels tensor, int32 - [batch_size], with values in the 195 | range [0, NUM_CLASSES). 196 | Returns: 197 | A scalar int32 tensor with the number of examples (out of batch_size) 198 | that were predicted correctly. 199 | """ 200 | # For a classifier model, we can use the in_top_k Op. 201 | # It returns a bool tensor with shape [batch_size] that is true for 202 | # the examples where the label's is was in the top k (here k=1) 203 | # of all logits for that example. 204 | correct_flags = tf.nn.in_top_k(logits, labels, 1) 205 | # Return the number of true entries. 206 | return tf.cast(correct_flags, tf.int32) 207 | -------------------------------------------------------------------------------- /experiments/cifar-10/data/prepare_data.py: -------------------------------------------------------------------------------- 1 | ################################################################ 2 | # Load and unpack CIFAR-10 python version # 3 | # from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz # 4 | # # 5 | # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! # 6 | # Run this scipt with python2 only # 7 | # !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! 
# 8 | ################################################################ 9 | 10 | import pickle 11 | import numpy as np 12 | 13 | batches_dir = 'cifar-10-batches-py' 14 | 15 | def unpickle(fname): 16 | fo = open(fname, 'rb') 17 | d = pickle.load(fo) 18 | fo.close() 19 | data = np.reshape(d['data'], [-1, 32, 32, 3], order='F') 20 | data = np.transpose(data, [0, 2, 1, 3]) 21 | data = np.reshape(data, [-1, 32*32*3]) 22 | labels = np.array(d['labels'], dtype='int8') 23 | return data, labels 24 | 25 | for x in range(1, 6): 26 | fname = batches_dir + '/data_batch_' + str(x) 27 | data, labels = unpickle(fname) 28 | if x == 1: 29 | train_images = data 30 | train_labels = labels 31 | else: 32 | train_images = np.vstack((train_images, data)) 33 | train_labels = np.concatenate((train_labels, labels)) 34 | 35 | validation_images, validation_labels = unpickle(batches_dir + '/test_batch') 36 | 37 | print(train_images.shape, validation_images.shape) 38 | print(train_labels.shape, validation_labels.shape) 39 | np.savez_compressed('cifar', train_images=train_images, validation_images=validation_images, 40 | train_labels=train_labels, validation_labels=validation_labels) 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | -------------------------------------------------------------------------------- /paper.md: -------------------------------------------------------------------------------- 1 | # Ultimate tensorization: compressing convolutional and FC layers alike 2 | Links: [[arXiv](https://arxiv.org/abs/1611.03214)] [[poster pdf](https://github.com/timgaripov/TensorNet-TF/raw/master/ultimate_tensorization_poster.pdf)] 3 | 4 | 5 | Convolutional neural networks excel in image recognition tasks, but this comes at the cost of high computational and memory complexity. To tackle this problem, [1] developed a tensor factorization framework to compress fully-connected layers. In this paper, we focus on compressing convolutional layers. We show that while the direct application of the tensor framework [1] to the 4-dimensional kernel of convolution does compress the layer, we can do better. We reshape the convolutional kernel into a tensor of higher order and factorize it. We combine the proposed approach with the previous work to compress both convolutional and fully-connected layers of a network and achieve 80x network compression rate with 1.1% accuracy drop on the CIFAR-10 dataset 6 | -------------------------------------------------------------------------------- /tensornet/__init__.py: -------------------------------------------------------------------------------- 1 | from . import layers 2 | from . 
import tt 3 | -------------------------------------------------------------------------------- /tensornet/layers/__init__.py: -------------------------------------------------------------------------------- 1 | from .linear import * 2 | from .batch_normalization import * 3 | from .tt import * 4 | from .conv import * 5 | from .tt_conv import * 6 | from .tt_conv_full import * 7 | from .tt_conv_direct import * 8 | -------------------------------------------------------------------------------- /tensornet/layers/aux.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | 3 | 4 | def get_var_wrap(name, 5 | shape, 6 | initializer, 7 | regularizer, 8 | trainable, 9 | cpu_variable): 10 | if cpu_variable: 11 | with tf.device('/cpu:0'): 12 | return tf.get_variable(name, 13 | shape=shape, 14 | initializer=initializer, 15 | regularizer=regularizer, 16 | trainable=trainable) 17 | return tf.get_variable(name, 18 | shape=shape, 19 | initializer=initializer, 20 | regularizer=regularizer, 21 | trainable=trainable) 22 | -------------------------------------------------------------------------------- /tensornet/layers/batch_normalization.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from .aux import get_var_wrap 3 | 4 | def batch_normalization(inp, 5 | train_phase, 6 | ema_decay=0.99, 7 | eps=1e-3, 8 | use_scale=True, 9 | use_shift=True, 10 | trainable=True, 11 | cpu_variables=False, 12 | scope=None): 13 | """Batch normalization layer 14 | Args: 15 | inp: input tensor [batch_el, ...] with 2 or 4 dimensions 16 | train_phase: tensor [1] of bool, train pahse indicator 17 | ema_decay: moving average decay 18 | eps: number added to variance, to exclude division by zero 19 | use_scale: bool flag of scale transform applying 20 | use_shift: bool flag of shift transform applying 21 | trainable: trainable variables flag, bool 22 | cpu_variables: cpu variables flag, bool 23 | scope: layer variable scope name, string 24 | Reutrns: 25 | out: normalizaed tensor of the same shape as inp 26 | """ 27 | 28 | reuse = tf.get_variable_scope().reuse 29 | with tf.variable_scope(scope): 30 | 31 | shape = inp.get_shape().as_list() 32 | assert len(shape) in [2, 4] 33 | n_out = shape[-1] 34 | 35 | if len(shape) == 2: 36 | batch_mean, batch_variance = tf.nn.moments(inp, [0], name='moments') 37 | else: 38 | batch_mean, batch_variance = tf.nn.moments(inp, [0, 1, 2], name='moments') 39 | ema = tf.train.ExponentialMovingAverage(decay=ema_decay, zero_debias=True) 40 | if not reuse: 41 | def mean_variance_with_update(): 42 | with tf.control_dependencies([ema.apply([batch_mean, batch_variance])]): 43 | return (tf.identity(batch_mean), 44 | tf.identity(batch_variance)) 45 | 46 | mean, variance = tf.cond(train_phase, 47 | mean_variance_with_update, 48 | lambda: (ema.average(batch_mean), 49 | ema.average(batch_variance))) 50 | else: 51 | print("At scope %s reuse is truned on! Using previously created ema variables." 
% tf.get_variable_scope().name) 52 | 53 | #It's a kind of workaround 54 | vars = tf.get_variable_scope().global_variables() 55 | transform = lambda s: '/'.join(s.split('/')[-5:]) 56 | 57 | mean_name = transform(ema.average_name(batch_mean)) 58 | variance_name = transform(ema.average_name(batch_variance)) 59 | 60 | existed = {} 61 | for v in vars: 62 | if (transform(v.op.name) == mean_name): 63 | existed['mean'] = v 64 | if (transform(v.op.name) == variance_name): 65 | existed['variance'] = v 66 | 67 | print('Using:') 68 | print('\t' + existed['mean'].op.name) 69 | print('\t' + existed['variance'].op.name) 70 | 71 | 72 | mean, variance = tf.cond(train_phase, 73 | lambda: (batch_mean, 74 | batch_variance), 75 | lambda: (existed['mean'], 76 | existed['variance'])) 77 | 78 | std = tf.sqrt(variance + eps, name='std') 79 | out = (inp - mean) / std 80 | if use_scale: 81 | weights = get_var_wrap('weights', 82 | shape=[n_out], 83 | initializer=tf.ones_initializer, 84 | trainable=trainable, 85 | regularizer=None, 86 | cpu_variable=cpu_variables) 87 | 88 | out = tf.multiply(out, weights) 89 | if use_shift: 90 | biases = get_var_wrap('biases', 91 | shape=[n_out], 92 | initializer=tf.zeros_initializer, 93 | trainable=trainable, 94 | regularizer=None, 95 | cpu_variable=cpu_variables) 96 | 97 | out = tf.add(out, biases) 98 | return out 99 | -------------------------------------------------------------------------------- /tensornet/layers/conv.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from .aux import get_var_wrap 3 | 4 | def conv(inp, 5 | out_ch, 6 | window_size, 7 | strides=[1, 1], 8 | padding='SAME', 9 | filters_initializer=tf.contrib.layers.xavier_initializer(uniform=False), 10 | filters_regularizer=None, 11 | biases_initializer=tf.zeros_initializer, 12 | biases_regularizer=None, 13 | trainable=True, 14 | cpu_variables=False, 15 | scope=None): 16 | """ convolutional layer 17 | Args: 18 | inp: input tensor, float - [batch_size, H, W, C] 19 | out_ch: output channels count count, int 20 | window_size: convolution window size, list [wH, wW] 21 | strides: strides, list [sx, sy] 22 | padding: 'SAME' or 'VALID', string 23 | filters_initializer: filters init function 24 | filters_regularizer: filters regularizer function 25 | biases_initializer: biases init function (if None then no biases will be used) 26 | biases_regularizer: biases regularizer function 27 | trainable: trainable variables flag, bool 28 | cpu_variables: cpu variables flag, bool 29 | scope: layer variable scope name, string 30 | Returns: 31 | out: output tensor, float - [batch_size, H', W', out_ch] 32 | """ 33 | 34 | with tf.variable_scope(scope): 35 | shape = inp.get_shape().as_list() 36 | assert len(shape) == 4, "Not 4D input tensor" 37 | in_ch = shape[-1] 38 | 39 | filters = get_var_wrap('filters', 40 | shape=window_size + [in_ch, out_ch], 41 | initializer=filters_initializer, 42 | regularizer=filters_regularizer, 43 | trainable=trainable, 44 | cpu_variable=cpu_variables) 45 | 46 | out = tf.nn.conv2d(inp, filters, [1] + strides + [1], padding, name='conv2d') 47 | 48 | if biases_initializer is not None: 49 | biases = get_var_wrap('biases', 50 | shape=[out_ch], 51 | initializer=biases_initializer, 52 | regularizer=biases_regularizer, 53 | trainable=trainable, 54 | cpu_variable=cpu_variables) 55 | out = tf.add(out, biases, name='out') 56 | else: 57 | out = tf.identity(out, name='out') 58 | return out 59 | 
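Below is a minimal usage sketch (hypothetical, not a file from the repository) of the wrappers defined in this module: it stacks the plain ```conv``` layer above with ```batch_normalization``` and a ReLU, the same block pattern used in ```nets/conv.py```, assuming TensorFlow 1.x and that ```tensornet``` is importable.

```python
import sys
import tensorflow as tf

sys.path.append('../../')  # hypothetical path; adjust so that `tensornet` is importable
import tensornet

# Placeholders for a batch of CIFAR-10-sized images and the train-phase flag.
images = tf.placeholder(tf.float32, shape=[None, 32, 32, 3], name='images')
train_phase = tf.placeholder(tf.bool, name='train_phase')

# conv -> batch normalization -> ReLU, mirroring the blocks in nets/conv.py.
net = tensornet.layers.conv(images, 64, [3, 3],
                            biases_initializer=tf.zeros_initializer(),
                            scope='conv1')
net = tensornet.layers.batch_normalization(net, train_phase, scope='bn1')
net = tf.nn.relu(net, name='relu1')
```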
-------------------------------------------------------------------------------- /tensornet/layers/linear.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | from .aux import get_var_wrap 3 | 4 | def linear(inp, 5 | out_size, 6 | weights_initializer=tf.contrib.layers.xavier_initializer(uniform=False), 7 | weights_regularizer=None, 8 | biases_initializer=tf.zeros_initializer, 9 | biases_regularizer=None, 10 | trainable=True, 11 | cpu_variables=False, 12 | scope=None): 13 | """ linear layer 14 | Args: 15 | inp: input tensor, float - [batch_size, inp_size] 16 | out_size: layer units count, int 17 | weights_initializer: weights init function 18 | weights_regularizer: weights regularizer function 19 | biases_initializer: biases init function (if None then no biases will be used) 20 | biases_regularizer: biases regularizer function 21 | trainable: trainable variables flag, bool 22 | cpu_variables: cpu variables flag, bool 23 | scope: layer variable scope name, string 24 | Returns: 25 | out: output tensor, float - [batch_size, out_size] 26 | """ 27 | with tf.variable_scope(scope): 28 | shape = inp.get_shape().as_list() 29 | assert len(shape) == 2, 'Not 2D input tensor' 30 | inp_size = shape[-1] 31 | 32 | weights = get_var_wrap('weights', 33 | shape=[inp_size, out_size], 34 | initializer=weights_initializer, 35 | regularizer=weights_regularizer, 36 | trainable=trainable, 37 | cpu_variable=cpu_variables) 38 | 39 | if biases_initializer is not None: 40 | biases = get_var_wrap('biases', 41 | shape=[out_size], 42 | initializer=biases_initializer, 43 | regularizer=biases_regularizer, 44 | trainable=trainable, 45 | cpu_variable=cpu_variables) 46 | 47 | out = tf.add(tf.matmul(inp, weights, name='matmul'), biases, name='out') 48 | else: 49 | out = tf.matmul(inp, weights, name='out') 50 | return out 51 | -------------------------------------------------------------------------------- /tensornet/layers/tt.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | from .aux import get_var_wrap 4 | 5 | def tt(inp, 6 | inp_modes, 7 | out_modes, 8 | mat_ranks, 9 | cores_initializer=tf.contrib.layers.xavier_initializer(uniform=False), 10 | cores_regularizer=None, 11 | biases_initializer=tf.zeros_initializer, 12 | biases_regularizer=None, 13 | trainable=True, 14 | cpu_variables=False, 15 | scope=None): 16 | """ tt-layer (tt-matrix by full tensor product) 17 | Args: 18 | inp: input tensor, float - [batch_size, prod(inp_modes)] 19 | inp_modes: input tensor modes 20 | out_modes: output tensor modes 21 | mat_ranks: tt-matrix ranks 22 | cores_initializer: cores init function, could be a list of functions for specifying different function for each core 23 | cores_regularizer: cores regularizer function, could be a list of functions for specifying different function for each core 24 | biases_initializer: biases init function (if None then no biases will be used) 25 | biases_regularizer: biases regularizer function 26 | trainable: trainable variables flag, bool 27 | cpu_variables: cpu variables flag, bool 28 | scope: layer variable scope name, string 29 | Returns: 30 | out: output tensor, float - [batch_size, prod(out_modes)] 31 | """ 32 | with tf.variable_scope(scope): 33 | dim = inp_modes.size 34 | 35 | mat_cores = [] 36 | 37 | for i in range(dim): 38 | if type(cores_initializer) == list: 39 | cinit = cores_initializer[i] 40 | else: 41 | cinit = cores_initializer 42 | 43 | if 
type(cores_regularizer) == list: 44 | creg = cores_regularizer[i] 45 | else: 46 | creg = cores_regularizer 47 | 48 | mat_cores.append(get_var_wrap('mat_core_%d' % (i + 1), 49 | shape=[out_modes[i] * mat_ranks[i + 1], mat_ranks[i] * inp_modes[i]], 50 | initializer=cinit, 51 | regularizer=creg, 52 | trainable=trainable, 53 | cpu_variable=cpu_variables)) 54 | 55 | 56 | 57 | out = tf.reshape(inp, [-1, np.prod(inp_modes)]) 58 | out = tf.transpose(out, [1, 0]) 59 | 60 | for i in range(dim): 61 | out = tf.reshape(out, [mat_ranks[i] * inp_modes[i], -1]) 62 | 63 | out = tf.matmul(mat_cores[i], out) 64 | out = tf.reshape(out, [out_modes[i], -1]) 65 | out = tf.transpose(out, [1, 0]) 66 | 67 | if biases_initializer is not None: 68 | 69 | biases = get_var_wrap('biases', 70 | shape=[np.prod(out_modes)], 71 | initializer=biases_initializer, 72 | regularizer=biases_regularizer, 73 | trainable=trainable, 74 | cpu_variable=cpu_variables) 75 | 76 | out = tf.add(tf.reshape(out, [-1, np.prod(out_modes)]), biases, name="out") 77 | else: 78 | out = tf.reshape(out, [-1, np.prod(out_modes)], name="out") 79 | 80 | return out 81 | -------------------------------------------------------------------------------- /tensornet/layers/tt_conv.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | import math 4 | from .aux import get_var_wrap 5 | 6 | def tt_conv(inp, 7 | window, 8 | inp_ch_modes, 9 | out_ch_modes, 10 | ranks, 11 | strides=[1, 1], 12 | padding='SAME', 13 | filters_initializer=tf.contrib.layers.xavier_initializer(uniform=False), 14 | filters_regularizer=None, 15 | cores_initializer=tf.contrib.layers.xavier_initializer(uniform=False), 16 | cores_regularizer=None, 17 | biases_initializer=tf.zeros_initializer, 18 | biases_regularizer=None, 19 | trainable=True, 20 | cpu_variables=False, 21 | scope=None): 22 | """ tt-conv-layer (convolution of full input tensor with tt-filters (core by core)) 23 | Args: 24 | inp: input tensor, float - [batch_size, H, W, C] 25 | window: convolution window size, list [wH, wW] 26 | inp_ch_modes: input channels modes, np.array (int32) of size d 27 | out_ch_modes: output channels modes, np.array (int32) of size d 28 | ranks: tt-filters ranks, np.array (int32) of size (d + 1) 29 | strides: strides, list of 2 ints - [sx, sy] 30 | padding: 'SAME' or 'VALID', string 31 | filters_initializer: filters init function 32 | filters_regularizer: filters regularizer function 33 | cores_initializer: cores init function, could be a list of functions for specifying different function for each core 34 | cores_regularizer: cores regularizer function, could be a list of functions for specifying different function for each core 35 | biases_initializer: biases init function (if None then no biases will be used) 36 | biases_regularizer: biases regularizer function 37 | trainable: trainable variables flag, bool 38 | cpu_variables: cpu variables flag, bool 39 | scope: layer variable scope name, string 40 | Returns: 41 | out: output tensor, float - [batch_size, prod(out_modes)] 42 | """ 43 | 44 | with tf.variable_scope(scope): 45 | inp_shape = inp.get_shape().as_list()[1:] 46 | inp_h, inp_w, inp_ch = inp_shape[0:3] 47 | tmp = tf.reshape(inp, [-1, inp_h, inp_w, inp_ch]) 48 | tmp = tf.transpose(tmp, [0, 3, 1, 2]) 49 | tmp = tf.reshape(tmp, [-1, inp_h, inp_w, 1]) 50 | 51 | filters_shape = [window[0], window[1], 1, ranks[0]] 52 | if (window[0] * window[1] * 1 * ranks[0] == 1): 53 | filters = get_var_wrap('filters', 54 | 
shape=filters_shape, 55 | initializer=tf.ones_initializer, 56 | regularizer=None, 57 | trainable=False, 58 | cpu_variable=cpu_variables) 59 | else: 60 | filters = get_var_wrap('filters', 61 | shape=filters_shape, 62 | initializer=filters_initializer, 63 | regularizer=filters_regularizer, 64 | trainable=trainable, 65 | cpu_variable=cpu_variables) 66 | 67 | tmp = tf.nn.conv2d(tmp, filters, [1] + strides + [1], padding) 68 | 69 | #tmp shape = [batch_size * inp_ch, h, w, r] 70 | h, w = tmp.get_shape().as_list()[1:3] 71 | tmp = tf.reshape(tmp, [-1, inp_ch, h, w, ranks[0]]) 72 | tmp = tf.transpose(tmp, [4, 1, 0, 2, 3]) 73 | #tmp shape = [r, c, b, h, w] 74 | 75 | d = inp_ch_modes.size 76 | 77 | cores = [] 78 | for i in range(d): 79 | 80 | if type(cores_initializer) == list: 81 | cinit = cores_initializer[i] 82 | else: 83 | cinit = cores_initializer 84 | 85 | if type(cores_regularizer) == list: 86 | creg = cores_regularizer[i] 87 | else: 88 | creg = cores_regularizer 89 | 90 | cores.append(get_var_wrap('core_%d' % (i + 1), 91 | shape=[out_ch_modes[i] * ranks[i + 1], ranks[i] * inp_ch_modes[i]], 92 | initializer=cinit, 93 | regularizer=creg, 94 | trainable=trainable, 95 | cpu_variable=cpu_variables)) 96 | 97 | for i in range(d): 98 | tmp = tf.reshape(tmp, [ranks[i] * inp_ch_modes[i], -1]) 99 | tmp = tf.matmul(cores[i], tmp) 100 | tmp = tf.reshape(tmp, [out_ch_modes[i], -1]) 101 | tmp = tf.transpose(tmp, [1, 0]) 102 | out_ch = np.prod(out_ch_modes) 103 | 104 | if biases_initializer is not None: 105 | biases = get_var_wrap('biases', 106 | shape=[out_ch], 107 | initializer=biases_initializer, 108 | regularizer=biases_regularizer, 109 | trainable=trainable, 110 | cpu_variable=cpu_variables) 111 | 112 | out = tf.reshape(tmp, [-1, h, w, out_ch]) 113 | out = tf.add(out, biases, name='out') 114 | else: 115 | out = tf.reshape(tmp, [-1, h, w, out_ch], name='out') 116 | 117 | return out 118 | -------------------------------------------------------------------------------- /tensornet/layers/tt_conv1d_full.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | import math 4 | from .aux import get_var_wrap 5 | import tt_conv_full 6 | 7 | def tt_conv1d_full(inp, 8 | window, 9 | inp_ch_modes, 10 | out_ch_modes, 11 | ranks, 12 | strides=[1, 1], 13 | padding='SAME', 14 | filters_initializer=tf.contrib.layers.xavier_initializer(uniform=False), 15 | filters_regularizer=None, 16 | cores_initializer=tf.contrib.layers.xavier_initializer(uniform=False), 17 | cores_regularizer=None, 18 | biases_initializer=tf.zeros_initializer, 19 | biases_regularizer=None, 20 | trainable=True, 21 | cpu_variables=False, 22 | scope=None): 23 | """ 24 | conv1d wrapper for conv2d function. Internally tensorflow does a conv2d for its vanilla 25 | conv1d. Similarly, this process is applied here. Input is expanded by dim 1 and then output is simply squeezed. 
26 | 27 | Note: window should be [1, w] where you insert your width 28 | strides should be [1, stride_width] 29 | 30 | 31 | tt-conv-layer (convolution of full input tensor with tt-filters (make tt full then use conv2d)) 32 | Args: 33 | inp: input tensor, float - [batch_size, W, C] 34 | window: convolution window size, list [wH, wW] 35 | inp_ch_modes: input channels modes, np.array (int32) of size d 36 | out_ch_modes: output channels modes, np.array (int32) of size d 37 | ranks: tt-filters ranks, np.array (int32) of size (d + 1) 38 | strides: strides, list of 2 ints - [sx, sy] 39 | padding: 'SAME' or 'VALID', string 40 | filters_initializer: filters init function 41 | filters_regularizer: filters regularizer function 42 | cores_initializer: cores init function, could be a list of functions for specifying different function for each core 43 | cores_regularizer: cores regularizer function, could be a list of functions for specifying different function for each core 44 | biases_initializer: biases init function (if None then no biases will be used) 45 | biases_regularizer: biases regularizer function 46 | trainable: trainable variables flag, bool 47 | cpu_variables: cpu variables flag, bool 48 | scope: layer variable scope name, string 49 | Returns: 50 | out: output tensor, float - [batch_size, W, prod(out_modes)] 51 | """ 52 | inp_expanded = tf.expand_dims(inp, dim = 1) # expand on height dim 53 | 54 | conv2d_output = tt_conv_full(inp, 55 | window, 56 | inp_ch_modes, 57 | out_ch_modes, 58 | ranks, 59 | strides=strides, 60 | padding=padding, 61 | filters_initializer=filters_initializer, 62 | filters_regularizer=filters_regularizer, 63 | cores_initializer=cores_initializer, 64 | cores_regularizer=cores_regularizer, 65 | biases_initializer=biases_initializer, 66 | biases_regularizer=biases_regularizer, 67 | trainable=trainable, 68 | cpu_variables=cpu_variables, 69 | scope=scope) 70 | 71 | return tf.squeeze(conv2d_output) # get rid of height dimension 72 | -------------------------------------------------------------------------------- /tensornet/layers/tt_conv_direct.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | import math 4 | from .aux import get_var_wrap 5 | 6 | def tt_conv_direct(inp, 7 | window, 8 | out_ch, 9 | ranks, 10 | strides=[1, 1], 11 | padding='SAME', 12 | cores_initializer=tf.contrib.layers.xavier_initializer(uniform=False), 13 | cores_regularizer=None, 14 | biases_initializer=tf.zeros_initializer, 15 | biases_regularizer=None, 16 | trainable=True, 17 | cpu_variables=False, 18 | scope=None): 19 | """ tt-conv-layer (convolution of full input tensor with straightforward decomposed tt-filters (make tt full then use conv2d)) 20 | Args: 21 | inp: input tensor, float - [batch_size, H, W, C] 22 | window: convolution window size, list [wH, wW] 23 | inp_ch_modes: input channels modes, np.array (int32) of size d 24 | out_ch_modes: output channels modes, np.array (int32) of size d 25 | ranks: tt-filters ranks, np.array (int32) of size (d + 1) 26 | strides: strides, list of 2 ints - [sx, sy] 27 | padding: 'SAME' or 'VALID', string 28 | filters_initializer: filters init function 29 | filters_regularizer: filters regularizer function 30 | cores_initializer: cores init function, could be a list of functions for specifying different function for each core 31 | cores_regularizer: cores regularizer function, could be a list of functions for specifying different function for each core 32 | 
biases_initializer: biases init function (if None then no biases will be used) 33 | biases_regularizer: biases regularizer function 34 | trainable: trainable variables flag, bool 35 | cpu_variables: cpu variables flag, bool 36 | scope: layer variable scope name, string 37 | Returns: 38 | out: output tensor, float - [batch_size, prod(out_modes)] 39 | """ 40 | 41 | with tf.variable_scope(scope): 42 | inp_shape = inp.get_shape().as_list()[1:] 43 | inp_h, inp_w, inp_ch = inp_shape[0:3] 44 | tmp = tf.reshape(inp, [-1, inp_h, inp_w, inp_ch]) 45 | 46 | modes = np.array([window[0], window[1], inp_ch, out_ch]) 47 | 48 | cores = [] 49 | for i in range(4): 50 | 51 | sz = modes[i] * ranks[i] * ranks[i + 1] 52 | if (sz == 1): 53 | cinit = tf.ones_initializer 54 | elif type(cores_initializer) == list: 55 | cinit = cores_initializer[i] 56 | else: 57 | cinit = cores_initializer 58 | 59 | if type(cores_regularizer) == list: 60 | creg = cores_regularizer[i] 61 | else: 62 | creg = cores_regularizer 63 | 64 | cores.append(get_var_wrap('core_%d' % (i + 1), 65 | shape=[ranks[i], modes[i] * ranks[i + 1]], 66 | initializer=cinit, 67 | regularizer=creg, 68 | trainable=trainable and (sz > 1), 69 | cpu_variable=cpu_variables)) 70 | 71 | full = cores[0] 72 | 73 | for i in range(1, 4): 74 | full = tf.reshape(full, [-1, ranks[i]]) 75 | full = tf.matmul(full, cores[i]) 76 | 77 | full = tf.reshape(full, [window[0], window[1], inp_ch, out_ch]) 78 | 79 | 80 | tmp = tf.nn.conv2d(tmp, 81 | full, 82 | [1] + strides + [1], 83 | padding, 84 | name='conv2d') 85 | 86 | if biases_initializer is not None: 87 | biases = get_var_wrap('biases', 88 | shape=[out_ch], 89 | initializer=biases_initializer, 90 | regularizer=biases_regularizer, 91 | trainable=trainable, 92 | cpu_variable=cpu_variables) 93 | 94 | out = tf.add(tmp, biases, name='out') 95 | else: 96 | out = tf.identity(tmp, name='out') 97 | 98 | return out 99 | -------------------------------------------------------------------------------- /tensornet/layers/tt_conv_full.py: -------------------------------------------------------------------------------- 1 | import tensorflow as tf 2 | import numpy as np 3 | import math 4 | from .aux import get_var_wrap 5 | 6 | def tt_conv_full(inp, 7 | window, 8 | inp_ch_modes, 9 | out_ch_modes, 10 | ranks, 11 | strides=[1, 1], 12 | padding='SAME', 13 | filters_initializer=tf.contrib.layers.xavier_initializer(uniform=False), 14 | filters_regularizer=None, 15 | cores_initializer=tf.contrib.layers.xavier_initializer(uniform=False), 16 | cores_regularizer=None, 17 | biases_initializer=tf.zeros_initializer, 18 | biases_regularizer=None, 19 | trainable=True, 20 | cpu_variables=False, 21 | scope=None): 22 | """ tt-conv-layer (convolution of full input tensor with tt-filters (make tt full then use conv2d)) 23 | Args: 24 | inp: input tensor, float - [batch_size, H, W, C] 25 | window: convolution window size, list [wH, wW] 26 | inp_ch_modes: input channels modes, np.array (int32) of size d 27 | out_ch_modes: output channels modes, np.array (int32) of size d 28 | ranks: tt-filters ranks, np.array (int32) of size (d + 1) 29 | strides: strides, list of 2 ints - [sx, sy] 30 | padding: 'SAME' or 'VALID', string 31 | filters_initializer: filters init function 32 | filters_regularizer: filters regularizer function 33 | cores_initializer: cores init function, could be a list of functions for specifying different function for each core 34 | cores_regularizer: cores regularizer function, could be a list of functions for specifying different function for each 
core 35 | biases_initializer: biases init function (if None then no biases will be used) 36 | biases_regularizer: biases regularizer function 37 | trainable: trainable variables flag, bool 38 | cpu_variables: cpu variables flag, bool 39 | scope: layer variable scope name, string 40 | Returns: 41 | out: output tensor, float - [batch_size, prod(out_modes)] 42 | """ 43 | 44 | with tf.variable_scope(scope): 45 | inp_shape = inp.get_shape().as_list()[1:] 46 | inp_h, inp_w, inp_ch = inp_shape[0:3] 47 | tmp = tf.reshape(inp, [-1, inp_h, inp_w, inp_ch]) 48 | 49 | filters_shape = [window[0], window[1], 1, ranks[0]] 50 | if (window[0] * window[1] * 1 * ranks[0] == 1): 51 | filters = get_var_wrap('filters', 52 | shape=filters_shape, 53 | initializer=tf.ones_initializer, 54 | regularizer=None, 55 | trainable=False, 56 | cpu_variable=cpu_variables) 57 | else: 58 | filters = get_var_wrap('filters', 59 | shape=filters_shape, 60 | initializer=filters_initializer, 61 | regularizer=filters_regularizer, 62 | trainable=trainable, 63 | cpu_variable=cpu_variables) 64 | d = inp_ch_modes.size 65 | 66 | cores = [] 67 | for i in range(d): 68 | 69 | if type(cores_initializer) == list: 70 | cinit = cores_initializer[i] 71 | else: 72 | cinit = cores_initializer 73 | 74 | if type(cores_regularizer) == list: 75 | creg = cores_regularizer[i] 76 | else: 77 | creg = cores_regularizer 78 | 79 | cores.append(get_var_wrap('core_%d' % (i + 1), 80 | shape=[out_ch_modes[i] * ranks[i + 1], ranks[i] * inp_ch_modes[i]], 81 | initializer=cinit, 82 | regularizer=creg, 83 | trainable=trainable, 84 | cpu_variable=cpu_variables)) 85 | 86 | full = filters 87 | 88 | for i in range(d): 89 | full = tf.reshape(full, [-1, ranks[i]]) 90 | core = tf.transpose(cores[i], [1, 0]) 91 | core = tf.reshape(core, [ranks[i], -1]) 92 | full = tf.matmul(full, core) 93 | 94 | out_ch = np.prod(out_ch_modes) 95 | 96 | fshape = [window[0], window[1]] 97 | order = [0, 1] 98 | inord = [] 99 | outord = [] 100 | for i in range(d): 101 | fshape.append(inp_ch_modes[i]) 102 | inord.append(2 + 2 * i) 103 | fshape.append(out_ch_modes[i]) 104 | outord.append(2 + 2 * i + 1) 105 | order += inord + outord 106 | full = tf.reshape(full, fshape) 107 | full = tf.transpose(full, order) 108 | full = tf.reshape(full, [window[0], window[1], inp_ch, out_ch]) 109 | 110 | 111 | tmp = tf.nn.conv2d(tmp, 112 | full, 113 | [1] + strides + [1], 114 | padding, 115 | name='conv2d') 116 | 117 | if biases_initializer is not None: 118 | biases = get_var_wrap('biases', 119 | shape=[out_ch], 120 | initializer=biases_initializer, 121 | regularizer=biases_regularizer, 122 | trainable=trainable, 123 | cpu_variable=cpu_variables) 124 | 125 | out = tf.add(tmp, biases, name='out') 126 | else: 127 | out = tf.identity(tmp, name='out') 128 | 129 | return out 130 | -------------------------------------------------------------------------------- /tensornet/tt/__init__.py: -------------------------------------------------------------------------------- 1 | from .svd import * 2 | from .max_ranks import * 3 | from .matrix_svd import * 4 | -------------------------------------------------------------------------------- /tensornet/tt/matrix_svd.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | from .svd import svd 3 | 4 | def matrix_svd(X, left_modes, right_modes, ranks): 5 | """ TT-SVD for matrix 6 | Args: 7 | X: input matrix, numpy array float32 8 | left_modes: tt-left-modes, numpy array int32 9 | right_modes: tt-right-modes, numpy array int32 10 
--------------------------------------------------------------------------------
/tensornet/tt/matrix_svd.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from .svd import svd
3 | 
4 | def matrix_svd(X, left_modes, right_modes, ranks):
5 |     """ TT-SVD for matrix
6 |     Args:
7 |         X: input matrix, numpy array float32
8 |         left_modes: tt-left-modes, numpy array int32
9 |         right_modes: tt-right-modes, numpy array int32
10 |         ranks: tt-ranks, numpy array int32
11 |     Returns:
12 |         core: tt-cores array, numpy 1D array float32
13 |     """
14 |     c = X.copy()
15 |     d = left_modes.size
16 |     c = np.reshape(c, np.concatenate((left_modes, right_modes)))
17 |     order = np.repeat(np.arange(0, d), 2) + np.tile([0, d], d)
18 |     c = np.transpose(c, axes=order)
19 |     c = np.reshape(c, left_modes * right_modes)
20 |     return svd(c, left_modes * right_modes, ranks)
21 | 
--------------------------------------------------------------------------------
/tensornet/tt/max_ranks.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | 
3 | def max_ranks(modes):
4 |     """ Computation of maximal ranks for TT-SVD
5 |     Args:
6 |         modes: tt-modes, numpy array int32
7 |     Returns:
8 |         ranks: maximal tt-ranks, numpy array int32
9 |     """
10 |     d = modes.size
11 |     ranks = np.zeros(d + 1, dtype='int32')
12 |     ranks[0] = 1
13 |     prod = np.prod(modes)
14 |     for i in range(d):
15 |         m = ranks[i] * modes[i]
16 |         ranks[i + 1] = min(m, prod // m)
17 |         prod = prod // m * ranks[i + 1]
18 |     ranks[d] = 1
19 |     return ranks
20 | 
--------------------------------------------------------------------------------
/tensornet/tt/svd.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | 
3 | def svd(X, modes, ranks):
4 |     """ TT-SVD
5 |     Args:
6 |         X: input array, numpy array float32
7 |         modes: tt-modes, numpy array int32
8 |         ranks: tt-ranks, numpy array int32
9 |     Returns:
10 |         core: tt-cores array, numpy 1D array float32
11 |     """
12 |     c = X.copy()
13 |     d = modes.size
14 |     core = np.zeros(np.sum(ranks[:-1] * modes * ranks[1:]), dtype='float32')
15 |     pos = 0
16 |     for i in range(0, d-1):
17 |         m = ranks[i] * modes[i]
18 |         c = np.reshape(c, [m, -1])
19 |         u, s, v = np.linalg.svd(c, full_matrices=False)
20 |         u = u[:, 0:ranks[i + 1]]
21 |         s = s[0:ranks[i + 1]]
22 |         v = v[0:ranks[i + 1], :]
23 |         core[pos:pos + ranks[i] * modes[i] * ranks[i + 1]] = u.ravel()
24 |         pos += ranks[i] * modes[i] * ranks[i + 1]
25 |         c = np.dot(np.diag(s), v)
26 |     core[pos:pos + ranks[d - 1] * modes[d - 1] * ranks[d]] = c.ravel()
27 |     return core
28 | 
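These three helpers are used together: `max_ranks` picks TT-ranks large enough for an exact decomposition, and `matrix_svd` reshapes a matrix into the matrix TT-format and delegates to `svd`. A minimal sketch of the round trip follows; the matrix size and mode factorizations are illustrative.

```python
import numpy as np
import tensornet

# Decompose a 24 x 30 matrix with left modes (4, 6) and right modes (5, 6).
left_modes = np.array([4, 6], dtype=np.int32)
right_modes = np.array([5, 6], dtype=np.int32)

W = np.random.normal(size=(np.prod(left_modes), np.prod(right_modes))).astype(np.float32)

# Maximal ranks make the TT-SVD exact up to floating-point round-off.
ranks = tensornet.tt.max_ranks(left_modes * right_modes)

# matrix_svd returns every TT-core flattened into one 1D float32 array;
# core i occupies ranks[i] * left_modes[i] * right_modes[i] * ranks[i + 1] entries.
cores = tensornet.tt.matrix_svd(W, left_modes, right_modes, ranks)
print(cores.shape)
```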
--------------------------------------------------------------------------------
/tests/python/test_matrix_svd.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import sys
3 | sys.path.append('../../')
4 | import tensornet
5 | 
6 | 
7 | def run_test(left_modes = np.array([4, 6, 8, 3], dtype=np.int32),
8 |              right_modes = np.array([5, 2, 7, 4], dtype=np.int32),
9 |              test_num=10,
10 |              tol=1e-5):
11 |     print('*' * 80)
12 |     print('*' + ' ' * 28 + 'Testing matrix TT-SVD' + ' ' * 29 + '*')
13 |     print('*' * 80)
14 |     d = left_modes.size
15 |     L = np.prod(left_modes)
16 |     R = np.prod(right_modes)
17 |     ranks = tensornet.tt.max_ranks(left_modes * right_modes)
18 |     ps = np.cumsum(np.concatenate(([0], ranks[:-1] * left_modes * right_modes * ranks[1:])))
19 |     for test in range(test_num):
20 |         W = np.random.normal(0.0, 1.0, size=(L, R))
21 |         T = tensornet.tt.matrix_svd(W, left_modes, right_modes, ranks)
22 |         w = np.reshape(T[ps[0]:ps[1]], [left_modes[0] * right_modes[0], ranks[1]])
23 |         for i in range(1, d):
24 |             core = np.reshape(T[ps[i]:ps[i + 1]], [ranks[i], left_modes[i] * right_modes[i] * ranks[i + 1]])
25 |             w = np.dot(w, core)
26 |             w = np.reshape(w, [-1, ranks[i + 1]])
27 |         w = np.reshape(w, w.shape[:-1])
28 |         shape = np.hstack((left_modes.reshape([-1, 1]), right_modes.reshape([-1, 1]))).ravel()
29 |         w = np.reshape(w, shape)
30 |         order = np.concatenate((np.arange(0, 2 * d, 2), np.arange(1, 2 * d, 2)))
31 |         w = np.reshape(np.transpose(w, axes=order), [L, R])
32 |         result = np.max(np.abs(W - w))
33 |         print('Test #{0:02d}. Error: {1:0.2g}'.format(test + 1, result))
34 |         assert result <= tol, 'Error = {0:0.2g} is bigger than tol = {1:0.2g}'.format(result, tol)
35 | 
36 | 
37 | if __name__ == '__main__':
38 |     run_test()
39 | 
--------------------------------------------------------------------------------
/tests/python/test_tt.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import tensorflow as tf
3 | import sys
4 | 
5 | sys.path.append('../../')
6 | 
7 | import tensornet
8 | 
9 | def run_test(batch_size=100, test_num=10,
10 |              inp_modes=np.array([3, 8, 9, 5], dtype='int32'),
11 |              out_modes=np.array([5, 6, 10, 6], dtype='int32'),
12 |              mat_ranks=np.array([1, 3, 6, 4, 1], dtype='int32'),
13 |              tol=1e-5):
14 |     print('*' * 80)
15 |     print('*' + ' ' * 31 + 'Testing tt layer' + ' ' * 31 + '*')
16 |     print('*' * 80)
17 | 
18 |     graph = tf.Graph()
19 |     with graph.as_default():
20 | 
21 |         d = inp_modes.size
22 | 
23 |         INP_SIZE = np.prod(inp_modes)
24 |         OUT_SIZE = np.prod(out_modes)
25 | 
26 | 
27 | 
28 |         inp = tf.placeholder('float', shape=[None, INP_SIZE])
29 |         out = tensornet.layers.tt(inp,
30 |                                   inp_modes,
31 |                                   out_modes,
32 |                                   mat_ranks,
33 |                                   biases_initializer=None,
34 |                                   scope='tt')
35 | 
36 | 
37 |         sess = tf.Session()
38 |         init_op = tf.initialize_all_variables()
39 |         sess.run(init_op)
40 | 
41 |         for test in range(test_num):
42 |             mat_cores = []
43 |             for i in range(d):
44 |                 mat_cores.append(graph.get_tensor_by_name('tt/mat_core_%d:0' % (i + 1)))
45 |                 mat_cores[-1] = sess.run(mat_cores[-1])
46 | 
47 | 
48 |             w = np.reshape(mat_cores[0], [out_modes[0] * mat_ranks[1], mat_ranks[0] * inp_modes[0]])
49 |             w = np.transpose(w, [1, 0])
50 |             w = np.reshape(w, [-1, mat_ranks[1]])
51 |             for i in range(1, d):
52 |                 core = np.reshape(mat_cores[i], [out_modes[i] * mat_ranks[i + 1], mat_ranks[i] * inp_modes[i]])
53 |                 core = np.transpose(core, [1, 0])
54 |                 core = np.reshape(core, [mat_ranks[i], -1])
55 |                 w = np.dot(w, core)
56 |                 w = np.reshape(w, [-1, mat_ranks[i + 1]])
57 |             w = np.reshape(w, w.shape[:-1])
58 |             shape = np.hstack((inp_modes.reshape([-1, 1]), out_modes.reshape([-1, 1]))).ravel()
59 |             w = np.reshape(w, shape)
60 |             order = np.concatenate((np.arange(0, 2 * d, 2), np.arange(1, 2 * d, 2)))
61 |             w = np.reshape(np.transpose(w, axes=order), [INP_SIZE, OUT_SIZE])
62 | 
63 | 
64 | 
65 |             X = np.random.normal(0.0, 0.2, size=(batch_size, np.prod(inp_modes)))
66 |             feed_dict = {inp: X}
67 |             y = sess.run(out, feed_dict=feed_dict)
68 |             Y = np.dot(X, w)
69 |             result = np.max(np.abs(Y - y))
70 |             print('Test #{0:02d}. Error: {1:0.2g}'.format(test + 1, result))
71 |             assert result <= tol, 'Error = {0:0.2g} is bigger than tol = {1:0.2g}'.format(result, tol)
72 |     sess.close()
73 | 
74 | if __name__ == '__main__':
75 |     run_test()
76 | 
77 | 
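The test above drives the TT-FC layer through its reconstruction check; for completeness, a minimal sketch of calling the layer directly. It mirrors the `tensornet.layers.tt` call in `tests/python/test_tt.py`; the mode factorizations, TT-ranks, scope name, and the ReLU on top are illustrative choices, not part of the library.

```python
import numpy as np
import tensorflow as tf
import tensornet

# A fully-connected layer mapping 3*8*9*5 = 1080 inputs to 5*6*10*6 = 1800 outputs,
# with the weight matrix stored in the matrix TT-format.
inp_modes = np.array([3, 8, 9, 5], dtype='int32')
out_modes = np.array([5, 6, 10, 6], dtype='int32')
mat_ranks = np.array([1, 3, 6, 4, 1], dtype='int32')  # TT-ranks of the weight matrix

x = tf.placeholder(tf.float32, [None, int(np.prod(inp_modes))])
h = tensornet.layers.tt(x, inp_modes, out_modes, mat_ranks,
                        biases_initializer=None,  # no biases, as in the test above
                        scope='tt_fc_1')
y = tf.nn.relu(h)
```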
--------------------------------------------------------------------------------
/tests/python/test_tt_conv.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import time
3 | import tensorflow as tf
4 | import sys
5 | 
6 | sys.path.append('../../')
7 | 
8 | import tensorflow as tf
9 | import tensornet
10 | 
11 | def run_test(batch_size=30, test_num=10, tol=1e-5):
12 |     print('*' * 80)
13 |     print('*' + ' ' * 31 + 'Testing tt conv' + ' ' * 32 + '*')
14 |     print('*' * 80)
15 | 
16 |     in_h = 32
17 |     in_w = 32
18 | 
19 |     padding = 'SAME'
20 | 
21 | 
22 |     inp_ch_modes = np.array([4, 4, 4, 3], dtype=np.int32)
23 |     in_c = np.prod(inp_ch_modes)
24 |     out_ch_modes = np.array([5, 2, 5, 5], dtype=np.int32)
25 |     out_c = np.prod(out_ch_modes)
26 |     ranks = np.array([3, 2, 2, 3, 1], dtype=np.int32)
27 | 
28 | 
29 |     inp = tf.placeholder(tf.float32, [None, in_h, in_w, in_c])
30 | 
31 | 
32 |     wh = 5
33 |     ww = 5
34 | 
35 | 
36 |     w_ph = tf.placeholder(tf.float32, [wh, ww, in_c, out_c])
37 | 
38 |     s = [1, 1]
39 | 
40 |     corr = tf.nn.conv2d(inp, w_ph, [1] + s + [1], padding)
41 | 
42 | 
43 |     out = tensornet.layers.tt_conv(inp,
44 |                                    [wh, ww],
45 |                                    inp_ch_modes,
46 |                                    out_ch_modes,
47 |                                    ranks,
48 |                                    s,
49 |                                    padding,
50 |                                    biases_initializer=None,
51 |                                    scope='tt_conv')
52 | 
53 | 
54 | 
55 | 
56 |     sess = tf.Session()
57 |     graph = tf.get_default_graph()
58 |     init_op = tf.initialize_all_variables()
59 | 
60 |     d = inp_ch_modes.size
61 | 
62 |     filters_t = graph.get_tensor_by_name('tt_conv/filters:0')
63 | 
64 |     cores_t = []
65 |     for i in range(d):
66 |         cores_t.append(graph.get_tensor_by_name('tt_conv/core_%d:0' % (i + 1)))
67 | 
68 |     for test in range(test_num):
69 |         sess.run(init_op)
70 | 
71 | 
72 | 
73 | 
74 |         filters = sess.run([filters_t])
75 |         cores = sess.run(cores_t)
76 | 
77 |         w = np.reshape(filters.copy(), [wh, ww, ranks[0]])
78 | 
79 | 
80 | 
81 |         #mat = np.reshape(inp_cores[inp_ps[0]:inp_ps[1]], [inp_ch_ranks[0], inp_ch_modes[0], inp_ch_ranks[1]])
82 | 
83 |         for i in range(0, d):
84 |             core = cores[i].copy()
85 |             #[out_ch_modes[i] * ranks[i + 1], ranks[i] * inp_ch_modes[i]]
86 |             core = np.transpose(core, [1, 0])
87 |             core = np.reshape(core, [ranks[i], inp_ch_modes[i] * out_ch_modes[i] * ranks[i + 1]])
88 | 
89 |             w = np.reshape(w, [-1, ranks[i]])
90 |             w = np.dot(w, core)
91 | 
92 |         #w = np.dot(w, np.reshape(mat, [inp_ch_ranks[0], -1]))
93 | 
94 |         L = []
95 |         for i in range(d):
96 |             L.append(inp_ch_modes[i])
97 |             L.append(out_ch_modes[i])
98 | 
99 |         w = np.reshape(w, [-1] + L)
100 |         w = np.transpose(w, [0] + list(range(1, 2 * d + 1, 2)) + list(range(2, 2 * d + 1, 2)))
101 | 
102 |         w = np.reshape(w, [wh, ww, in_c, out_c])
103 | 
104 |         X = np.random.normal(0.0, 0.2, size=(batch_size, in_h, in_w, in_c))
105 | 
106 |         t1 = time.clock()
107 |         correct = sess.run(corr, feed_dict={w_ph: w, inp: X})
108 |         t2 = time.clock()
109 |         y = sess.run(out, feed_dict={w_ph: w, inp: X})
110 |         t3 = time.clock()
111 | 
112 | 
113 | 
114 |         err = np.max(np.abs(correct - y))
115 |         print('Test #{0:02d}. Error: {1:0.2g}'.format(test + 1, err))
116 |         print('TT-conv time: {0:.2f} sec. conv time: {1:.2f} sec.'.format(t3 - t2, t2 - t1))
117 |         assert err <= tol, 'Error = {0:0.2g} is bigger than tol = {1:0.2g}'.format(err, tol)
118 | 
119 | 
120 | if __name__ == '__main__':
121 |     run_test()
122 | 
123 | 
--------------------------------------------------------------------------------
/tests/python/test_tt_conv_full.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import time
3 | import tensorflow as tf
4 | import sys
5 | 
6 | sys.path.append('../../')
7 | 
8 | import tensorflow as tf
9 | import tensornet
10 | 
11 | def run_test(batch_size=30, test_num=10, tol=1e-5):
12 |     print('*' * 80)
13 |     print('*' + ' ' * 29 + 'Testing tt conv full' + ' ' * 29 + '*')
14 |     print('*' * 80)
15 | 
16 |     in_h = 32
17 |     in_w = 32
18 | 
19 |     padding = 'SAME'
20 | 
21 | 
22 |     inp_ch_modes = np.array([4, 4, 4, 3], dtype=np.int32)
23 |     in_c = np.prod(inp_ch_modes)
24 |     out_ch_modes = np.array([5, 2, 5, 5], dtype=np.int32)
25 |     out_c = np.prod(out_ch_modes)
26 |     ranks = np.array([3, 2, 2, 3, 1], dtype=np.int32)
27 | 
28 | 
29 |     inp = tf.placeholder(tf.float32, [None, in_h, in_w, in_c])
30 | 
31 | 
32 |     wh = 5
33 |     ww = 5
34 | 
35 | 
36 |     w_ph = tf.placeholder(tf.float32, [wh, ww, in_c, out_c])
37 | 
38 |     s = [1, 1]
39 | 
40 |     corr = tf.nn.conv2d(inp, w_ph, [1] + s + [1], padding)
41 | 
42 | 
43 |     out = tensornet.layers.tt_conv_full(inp,
44 |                                         [wh, ww],
45 |                                         inp_ch_modes,
46 |                                         out_ch_modes,
47 |                                         ranks,
48 |                                         s,
49 |                                         padding,
50 |                                         biases_initializer=None,
51 |                                         scope='tt_conv')
52 | 
53 |     sess = tf.Session()
54 |     graph = tf.get_default_graph()
55 |     init_op = tf.initialize_all_variables()
56 | 
57 |     d = inp_ch_modes.size
58 | 
59 |     filters_t = graph.get_tensor_by_name('tt_conv/filters:0')
60 | 
61 |     cores_t = []
62 |     for i in range(d):
63 |         cores_t.append(graph.get_tensor_by_name('tt_conv/core_%d:0' % (i + 1)))
64 | 
65 |     for test in range(test_num):
66 |         sess.run(init_op)
67 | 
68 | 
69 |         filters = sess.run([filters_t])
70 |         cores = sess.run(cores_t)
71 | 
72 |         w = np.reshape(filters.copy(), [wh, ww, ranks[0]])
73 | 
74 | 
75 | 
76 |         #mat = np.reshape(inp_cores[inp_ps[0]:inp_ps[1]], [inp_ch_ranks[0], inp_ch_modes[0], inp_ch_ranks[1]])
77 | 
78 |         for i in range(0, d):
79 |             core = cores[i].copy()
80 |             #[out_ch_modes[i] * ranks[i + 1], ranks[i] * inp_ch_modes[i]]
81 |             core = np.transpose(core, [1, 0])
82 |             core = np.reshape(core, [ranks[i], inp_ch_modes[i] * out_ch_modes[i] * ranks[i + 1]])
83 | 
84 |             w = np.reshape(w, [-1, ranks[i]])
85 |             w = np.dot(w, core)
86 | 
87 |         #w = np.dot(w, np.reshape(mat, [inp_ch_ranks[0], -1]))
88 | 
89 |         L = []
90 |         for i in range(d):
91 |             L.append(inp_ch_modes[i])
92 |             L.append(out_ch_modes[i])
93 | 
94 |         w = np.reshape(w, [-1] + L)
95 |         w = np.transpose(w, [0] + list(range(1, 2 * d + 1, 2)) + list(range(2, 2 * d + 1, 2)))
96 | 
97 |         w = np.reshape(w, [wh, ww, in_c, out_c])
98 | 
99 |         X = np.random.normal(0.0, 0.2, size=(batch_size, in_h, in_w, in_c))
100 | 
101 |         t1 = time.clock()
102 |         correct = sess.run(corr, feed_dict={w_ph: w, inp: X})
103 |         t2 = time.clock()
104 |         y = sess.run(out, feed_dict={w_ph: w, inp: X})
105 |         t3 = time.clock()
106 | 
107 | 
108 | 
109 |         err = np.max(np.abs(correct - y))
110 |         print('Test #{0:02d}. Error: {1:0.2g}'.format(test + 1, err))
111 |         print('TT-conv time: {0:.2f} sec. conv time: {1:.2f} sec.'.format(t3 - t2, t2 - t1))
112 |         assert err <= tol, 'Error = {0:0.2g} is bigger than tol = {1:0.2g}'.format(err, tol)
113 | 
114 | 
115 | if __name__ == '__main__':
116 |     run_test()
117 | 
118 | 
--------------------------------------------------------------------------------
/ultimate_tensorization_poster.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/timgaripov/TensorNet-TF/76299ad4726370bb5e75589017208d7eae7d8666/ultimate_tensorization_poster.pdf
--------------------------------------------------------------------------------
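To make the compression concrete, here is a small sketch that counts the parameters a TT-conv layer stores (the spatial `filters` tensor plus the channel cores, using the variable shapes from the layer code earlier in this repository) and compares them with a dense convolution kernel of the same size. The window, mode, and rank values reuse the ones from the tests and are illustrative only; biases are ignored.

```python
import numpy as np

def tt_conv_params(window, inp_ch_modes, out_ch_modes, ranks):
    # filters variable: [window[0], window[1], 1, ranks[0]]
    count = window[0] * window[1] * 1 * ranks[0]
    # core i: [out_ch_modes[i] * ranks[i + 1], ranks[i] * inp_ch_modes[i]]
    for i in range(len(inp_ch_modes)):
        count += out_ch_modes[i] * ranks[i + 1] * ranks[i] * inp_ch_modes[i]
    return count

window = [5, 5]
inp_ch_modes = np.array([4, 4, 4, 3])
out_ch_modes = np.array([5, 2, 5, 5])
ranks = np.array([3, 2, 2, 3, 1])

tt_count = tt_conv_params(window, inp_ch_modes, out_ch_modes, ranks)
full_count = window[0] * window[1] * np.prod(inp_ch_modes) * np.prod(out_ch_modes)
print('TT-conv kernel parameters: %d' % tt_count)
print('Dense conv kernel parameters: %d (%.0fx larger)' % (full_count, full_count / float(tt_count)))
```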