├── FileWriters-TensorBoard.md ├── Implementing-Regression.md ├── Initial-Steps.md ├── MNIST-advanced.md ├── README.md ├── TensorBoard.md ├── TensorRanks-TensorShapes.md ├── Training-and-evaluating-the-model.md ├── Wide-Deep-Learning.md ├── mnist-advanced.py ├── mnist-summary.py ├── mnist_softmax.py ├── neural-network-classifer-iris.py ├── tf-contrib-learn.md ├── weights_accuracy_during_training-tensorboard.md └── wide-n-deep.py /FileWriters-TensorBoard.md: -------------------------------------------------------------------------------- 1 | ## After we've initialized the FileWriters, we have to add summaries to the FileWriters as we train and test the model. Use the following: 2 | 3 | def feed_dict(train): 4 | """Make a TensorFlow feed_dict: maps data onto Tensor placeholders.""" 5 | if train or FLAGS.fake_data: 6 | xs, ys = mnist.train.next_batch(100, fake_data=FLAGS.fake_data) 7 | k = FLAGS.dropout 8 | else: 9 | xs, ys = mnist.test.images, mnist.test.labels 10 | k = 1.0 11 | return {x: xs, y_: ys, keep_prob: k} 12 | for i in range(FLAGS.max_steps): 13 | if i % 10 == 0: 14 | summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False)) 15 | test_writer.add_summary(summary, i) 16 | print('Accuracy at step %s: %s' % (i, acc)) 17 | else: 18 | summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True)) 19 | train_writer.add_summary(summary, i) 20 | 21 | 22 | ## Use this and try building the graph :) 23 | -------------------------------------------------------------------------------- /Implementing-Regression.md: -------------------------------------------------------------------------------- 1 | 2 | # Implementing Regression: 3 | 4 | ### Analysis 5 | 6 | 1. Import TensorFlow : 7 | 8 | import tensorflow as tf 9 | 10 | 2. We describe these interacting operations by manipulating symbolic variables. Let's create one: 11 | 12 | x = tf.placeholder(tf.float32, [None, 784]) 13 | 14 | x isn't a specific value. It's a placeholder, a value that we'll input when we ask TensorFlow to run a computation. We want to be able to input any number of MNIST images, each flattened into a 784-dimensional vector. We represent this as a 2-D tensor of floating-point numbers, with a shape [None, 784]. (Here None means that a dimension can be of any length.) 15 | 16 | We also need weights and biases for our model which is taken care in Step 3. 17 | 18 | 3. Create weights and biases: 19 | 20 | W = tf.Variable(tf.zeros([784, 10])) 21 | b = tf.Variable(tf.zeros([10])) 22 | 23 | We create these Variables by giving tf.Variable the initial value of the Variable: in this case, we initialize both W and b as tensors full of zeros. Since we are going to learn W and b, it doesn't matter very much what they initially are. 24 | 25 | Notice that W has a shape of [784, 10] because we want to multiply the 784-dimensional image vectors by it to produce 10-dimensional vectors of evidence for the difference classes. b has a shape of [10] so we can add it to the output. 26 | 27 | We can now implement our model. It only takes one line to define it: 28 | 29 | y = tf.nn.softmax(tf.matmul(x, W) + b) 30 | 31 | First, we multiply x by W with the expression tf.matmul(x, W). This is flipped from when we multiplied them in our equation, where we had Wx, as a small trick to deal with x being a 2D tensor with multiple inputs. We then add b, and finally apply tf.nn.softmax. 32 | 33 | That's it! 
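Putting steps 1-3 together (nothing new here, just the snippets above plus the input pipeline from Initial-Steps.md), a minimal end-to-end sketch of the model definition is:

    import tensorflow as tf
    from tensorflow.examples.tutorials.mnist import input_data

    # Load the data as described in Initial-Steps.md
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

    # Placeholder for any number of flattened 28x28 images
    x = tf.placeholder(tf.float32, [None, 784])

    # Weights and biases, initialized to zeros
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))

    # The softmax regression model
    y = tf.nn.softmax(tf.matmul(x, W) + b)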
:) 34 | 35 | 36 | -------------------------------------------------------------------------------- /Initial-Steps.md: -------------------------------------------------------------------------------- 1 | 2 | # Initial Steps: 3 | 1. Analyze 'mnist_softmax.py' file 4 | 5 | A. The MNIST database is hosted on http://yann.lecun.com/exdb/mnist/ [Download] 6 | OR 7 | B. Run the following code to download and read in the data automatically: 8 | 9 | from tensorflow.examples.tutorials.mnist import input_data 10 | mnist = input_data.read_data_sets("MNIST_data/", one_hot=True) 11 | 12 | ## PLEASE NOTE: 13 | 14 | The MNIST data is split into three parts: 15 | a. 55,000 data points of Training data - mnist.train 16 | b. 10,000 data points of Test data - mnist.test 17 | c. 5,000 data points of Validation data - mnist.validation 18 | 19 | Every MNIST data point has two parts: an image of a handwritten digit and a corresponding label. We'll call the images "x" and the labels "y". 20 | Both the training set and test set contain images and their corresponding labels. 21 | For example the training images are mnist.train.images and the training labels are mnist.train.labels. 22 | -------------------------------------------------------------------------------- /MNIST-advanced.md: -------------------------------------------------------------------------------- 1 | # MNIST Advanced - Building a multilayer convolutional Network 2 | 3 | ##### 1. Load MNIST Data: 4 | 5 | from tensorflow.examples.tutorials.mnist import input_data 6 | mnist = input_data.read_data_sets('MNIST_data', one_hot=True) 7 | 8 | 9 | ##### 2. Start TensorFlow InteractiveSession: 10 | 11 | import tensorflow as tf 12 | sess = tf.InteractiveSession() 13 | 14 | 15 | ##### 3. We start building the computation graph by creating nodes for the input images and target output classes: 16 | 17 | x = tf.placeholder(tf.float32, shape=[None, 784]) 18 | y_ = tf.placeholder(tf.float32, shape=[None, 10]) 19 | 20 | Here _*x*_ and _*y*_ aren't specific values. Rather, they are each a placeholder *_-- a_* value that we'll input when we ask TensorFlow to run a computation. 21 | 22 | 23 | ##### 4. We now define the weights _*W*_ and biases _*b*_ for our model. We could imagine treating these like additional inputs, but TensorFlow has an even better way to handle them: Variable. A Variable is a value that lives in TensorFlow's computation graph. It can be used and even modified by the computation. In machine learning applications, one generally has the model parameters be Variables. 24 | 25 | W = tf.Variable(tf.zeros([784,10])) 26 | b = tf.Variable(tf.zeros([10])) 27 | 28 | 29 | ##### 5. We can now implement our regression model. It only takes one line! We multiply the vectorized input images _*x*_ by the weight matrix _*W*_, add the bias _*b*_. 30 | 31 | y = tf.matmul(x,W) + b 32 | 33 | We can specify a loss function just as easily. Loss indicates how bad the model's prediction was on a single example; we try to minimize that while training across all the examples. Here, our loss function is the cross-entropy between the target and the softmax activation function applied to the model's prediction. 
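Concretely, for a single example with one-hot target y′ and predicted distribution y, this cross-entropy is

$$H_{y'}(y) = -\sum_i y'_i \log(y_i),$$

the same −∑ y′ log(y) quantity written out in Training-and-evaluating-the-model.md.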
As in the beginners tutorial, we use the stable formulation: 34 | 35 | cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)) 36 | 37 | Note that _*tf.nn.softmax_cross_entropy_with_logits*_ internally applies the softmax to the model's unnormalized prediction and sums across all classes, and tf.reduce_mean takes the average over these sums. 38 | 39 | 40 | ##### 6. Now that we have defined our model and training loss function, it is straightforward to train using TensorFlow. Because TensorFlow knows the entire computation graph, it can use automatic differentiation to find the gradients of the loss with respect to each of the variables. TensorFlow has a variety of built-in optimization algorithms. For this example, we will use steepest gradient descent, with a step length of 0.5, to descend the cross entropy. 41 | 42 | train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy) 43 | 44 | What TensorFlow actually did in that single line was to add new operations to the computation graph. These operations included ones to compute gradients, compute parameter update steps, and apply update steps to the parameters. 45 | 46 | Before training, we first initialize all the variables we created: 47 | 48 | tf.global_variables_initializer().run() 49 | 50 | The returned operation _*train_step*_, when run, will apply the gradient descent updates to the parameters. Training the model can therefore be accomplished by repeatedly running _*train_step*_. 51 | 52 | for i in range(1000): 53 | batch = mnist.train.next_batch(100) 54 | train_step.run(feed_dict={x: batch[0], y_: batch[1]}) 55 | 56 | We load 100 training examples in each training iteration. We then run the train_step operation, using _*feed_dict*_ to replace the placeholder tensors _*x*_ and _*y_ *_ with the training examples. Note that you can replace any tensor in your computation graph using feed_dict -- _it's not restricted to just placeholders_. 57 | 58 | 59 | ##### 7. How well did our model do? First we'll figure out where we predicted the correct label. _*tf.argmax*_ is an extremely useful function which gives you the index of the highest entry in a tensor along some axis. For example, _*tf.argmax(y,1)*_ is the label our model thinks is most likely for each input, while _*tf.argmax(y_,1)*_ is the true label. We can use _*tf.equal*_ to check if our prediction matches the truth. 60 | 61 | correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1)) 62 | 63 | That gives us a list of booleans. To determine what fraction are correct, we cast to floating point numbers and then take the mean. For example, _*[True, False, True, True]*_ would become _*[1,0,1,1]*_ which would become _*0.75*_. 64 | 65 | accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 66 | 67 | Finally, we can evaluate our accuracy on the test data. This should be about _*92%*_ correct. 68 | 69 | print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels})) 70 | 71 | 72 | ##### 8. *Building a Multilayer Convolutional Network* 73 | 74 | ##### Getting *92%* accuracy on MNIST is bad. Now, let's fix that and get a higher accuracy rate. 75 | 76 | ###### A. Weight Initialization: 77 | 78 | To create this model, we're going to need to create a lot of weights and biases. One should generally initialize weights with a small amount of noise for symmetry breaking, and to prevent 0 gradients. Since we're using *ReLU* neurons, it is also good practice to initialize them with a slightly positive initial bias to avoid "dead neurons".
Instead of doing this repeatedly while we build the model, let's create two handy functions to do it for us. 79 | 80 | def weight_variable(shape): 81 | initial = tf.truncated_normal(shape, stddev=0.1) 82 | return tf.Variable(initial) 83 | 84 | def bias_variable(shape): 85 | initial = tf.constant(0.1, shape=shape) 86 | return tf.Variable(initial) 87 | 88 | ###### B. Convolution and Pooling: 89 | 90 | TensorFlow also gives us a lot of flexibility in convolution and pooling operations. How do we handle the boundaries? What is our stride size? In this example, we're always going to choose the vanilla version. Our convolutions use a stride of one and are zero-padded so that the output is the same size as the input. Our pooling is plain old max pooling over 2x2 blocks. To keep our code cleaner, let's also abstract those operations into functions. 91 | 92 | def conv2d(x, W): 93 | return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME') 94 | 95 | def max_pool_2x2(x): 96 | return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') 97 | 98 | ###### C. First Convolution Layer: 99 | 100 | We can now implement our first layer. It will consist of convolution, followed by max pooling. The convolution will compute 32 features for each 5x5 patch. Its weight tensor will have a shape of [5, 5, 1, 32]. The first two dimensions are the patch size, the next is the number of input channels, and the last is the number of output channels. We will also have a bias vector with a component for each output channel. 101 | 102 | W_conv1 = weight_variable([5, 5, 1, 32]) 103 | b_conv1 = bias_variable([32]) 104 | 105 | To apply the layer, we first reshape *_x_* to a 4d tensor, with the second and third dimensions corresponding to image width and height, and the final dimension corresponding to the number of color channels. 106 | 107 | x_image = tf.reshape(x, [-1,28,28,1]) 108 | 109 | We then convolve _*x_image*_ with the weight tensor, add the bias, apply the ReLU function, and finally max pool. The _*max_pool_2x2*_ method will reduce the image size to _*14x14*_. 110 | 111 | h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1) 112 | h_pool1 = max_pool_2x2(h_conv1) 113 | 114 | ###### D. Second Convolution Layer: 115 | 116 | In order to build a deep network, we stack several layers of this type. The second layer will have 64 features for each 5x5 patch. 117 | 118 | W_conv2 = weight_variable([5, 5, 32, 64]) 119 | b_conv2 = bias_variable([64]) 120 | 121 | h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2) 122 | h_pool2 = max_pool_2x2(h_conv2) 123 | 124 | ###### E. Densely Connected Layer: 125 | 126 | Now that the image size has been reduced to 7x7, we add a fully-connected layer with 1024 neurons to allow processing on the entire image. We reshape the tensor from the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply a ReLU. 127 | 128 | W_fc1 = weight_variable([7 * 7 * 64, 1024]) 129 | b_fc1 = bias_variable([1024]) 130 | 131 | h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64]) 132 | h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1) 133 | 134 | ###### F. Dropout: 135 | 136 | To reduce overfitting, we will apply *dropout* before the readout layer. We create a *_placeholder_* for the probability that a neuron's output is kept during dropout. This allows us to turn dropout on during training, and turn it off during testing.
TensorFlow's *_tf.nn.dropout op automatically handles scaling neuron outputs in addition to masking them, so dropout just works without any additional scaling. 137 | 138 | keep_prob = tf.placeholder(tf.float32) 139 | h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob) 140 | 141 | ###### G. Readout: 142 | 143 | Finally, we add a layer, just like for the one layer softmax regression above. 144 | 145 | W_fc2 = weight_variable([1024, 10]) 146 | b_fc2 = bias_variable([10]) 147 | 148 | y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2 149 | 150 | 151 | ##### 9. Train and Evaluate the Model: 152 | 153 | How well does this model do? To train and evaluate it we will use code that is nearly identical to that for the simple one layer SoftMax network above. 154 | 155 | The differences are that: 156 | We will replace the steepest gradient descent optimizer with the more sophisticated ADAM optimizer. 157 | We will include the additional parameter keep_prob in feed_dict to control the dropout rate. 158 | We will add logging to every 100th iteration in the training process. 159 | 160 | cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv)) 161 | train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy) 162 | correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1)) 163 | accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 164 | sess.run(tf.global_variables_initializer()) 165 | for i in range(20000): 166 | batch = mnist.train.next_batch(50) 167 | if i%100 == 0: 168 | train_accuracy = accuracy.eval(feed_dict={ 169 | x:batch[0], y_: batch[1], keep_prob: 1.0}) 170 | print("step %d, training accuracy %g"%(i, train_accuracy)) 171 | train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5}) 172 | 173 | print("test accuracy %g"%accuracy.eval(feed_dict={ 174 | x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})) 175 | 176 | ##### The final test set accuracy after running this code should be approximately *_99.2%_*. 177 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Resource page for Machine Learning Study Jam 2017 (Google Developers' Group Bangalore) 2 | 3 | This is the resource page used for GDG Bangalore's TensorFlow Study Jam 2017 4 | -------------------------------------------------------------------------------- /TensorBoard.md: -------------------------------------------------------------------------------- 1 | # TensorBoard: Hands-On 2 | #### The computations you'll use TensorFlow for - like training a massive deep neural network - can be complex and confusing. TensorBoard is a suite of Visualization tools that makes it easier to understand, debug, and optimize TensorFlow programs. 3 | 4 | #### You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data like images that pass through it. Configure TensorBoard and see how it looks :) 5 | 6 | ## Serializing the data: 7 | 8 | ##### TensorBoard operates by reading TensorFlow events files, which contain summary data that you can generate when running TensorFlow. 
Here's the general lifecycle for summary data within TensorBoard: 9 | 10 | ###### First, create the TensorFlow graph that you'd like to collect summary data from, and decide which nodes you would like to annotate with summary operations [Link: https://www.tensorflow.org/versions/master/api_docs/python/summary/]. 11 | 12 | ###### For example, suppose you are training a convolutional neural network for recognizing MNIST digits. You'd like to record how the learning rate varies over time, and how the objective function is changing. Collect these by attaching scalar_summary ops to the nodes that output the learning rate and loss respectively. Then, give each scalar_summary a meaningful tag, like 'learning rate' or 'loss function'. 13 | 14 | ###### Perhaps you'd also like to visualize the distributions of activations coming off a particular layer, or the distribution of gradients or weights. Collect this data by attaching histogram_summary ops to the gradient outputs and to the variable that holds your weights, respectively. 15 | 16 | ###### Operations in TensorFlow don't do anything until you run them, or an op that depends on their output. And the summary nodes that we've just created are peripheral to your graph: none of the ops you are currently running depend on them. So, to generate summaries, we need to run all of these summary nodes. Managing them by hand would be tedious, so use tf.summary.merge_all to combine them into a single op that generates all the summary data. 17 | 18 | ###### Then, you can just run the merged summary op, which will generate a serialized Summary protobuf object with all of your summary data at a given step. Finally, to write this summary data to disk, pass the summary protobuf to a tf.summary.FileWriter. 19 | 20 | ###### The FileWriter takes a logdir in its constructor - this logdir is quite important, it's the directory where all of the events will be written out. Also, the FileWriter can optionally take a Graph in its constructor. If it receives a Graph object, then TensorBoard will visualize your graph along with tensor shape information. This will give you a much better sense of what flows through the graph: see Tensor shape information [Link: https://www.tensorflow.org/versions/master/how_tos/graph_viz/#tensor_shape_information] 21 | 22 | ###### Now that you've modified your graph and have a FileWriter, you're ready to start running your network! If you want, you could run the merged summary op every single step, and record a ton of training data. That's likely to be more data than you need, though. Instead, consider running the merged summary op every n steps. 23 | 24 | ###### The code example below is a modification of the simple MNIST tutorial, in which we have added some summary ops, and run them every ten steps. 
If you run this and then launch tensorboard --logdir=/tmp/mnist_logs, you'll be able to visualize statistics, such as how the weights or accuracy varied during training - https://github.com/sujayVittal/Machine-Learning-with-TensorFlow-Study-Jam-2017/blob/master/weights_accuracy_during_training-tensorboard.md 25 | 26 | ###### After we've initialized the FileWriters, we have to add summaries to the FileWriters as we train and test the model - https://github.com/sujayVittal/Machine-Learning-with-TensorFlow-Study-Jam-2017/blob/master/FileWriters-TensorBoard.md 27 | 28 | 29 | ## Execution: 30 | 31 | ##### To run TensorBoard, use the following command (alternatively python -m tensorflow.tensorboard) where logdir points to the directory where the FileWriter serialized its data. If this logdir directory contains subdirectories which contain serialized data from separate runs, then TensorBoard will visualize the data from all of those runs. Once TensorBoard is running, navigate your web browser to localhost:6006 to view the TensorBoard.: 32 | 33 | tensorboard --logdir=path/to/log-directory 34 | 35 | ##### When looking at TensorBoard, you will see the navigation tabs in the top right corner. Each tab represents a set of serialized data that can be visualized. 36 | 37 | #### Play around :) 38 | 39 | ##### 40 | -------------------------------------------------------------------------------- /TensorRanks-TensorShapes.md: -------------------------------------------------------------------------------- 1 | 2 | # Tensor Ranks and Tensor Shapes: 3 | 4 | TensorFlow programs use a tensor data structure to represent all data. 5 | You can think of a TensorFlow tensor as an n-dimensional array or list. A tensor has a static type and dynamic dimensions. 6 | 7 | 8 | ### Rank 9 | In the TensorFlow system, tensors are described by a unit of dimensionality known as Rank. 10 | Tensor rank is the number of dimensions of the tensor. For example, consider the following table: 11 | 12 | --------------------------------------------------------------------------------------------------------- 13 | | Rank | Math Entity | Python Example | 14 | | ----- | -------------------------------- | ------------------------------------------------------------ | 15 | | 0 | Scalar (magnitude only) | s = 483 | 16 | | 1 | Vector (magnitude and direction) | v = [1.1, 2.2, 3.3] | 17 | | 2 | Matrix (table of numbers) | m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] | 18 | | 3 | 3-Tensor (cube of numbers) | t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]] | 19 | | n | n-Tensor (you get the idea) | .... | 20 | ---------------------------------------------------------------------------------------------------------- 21 | 22 | 23 | ### Shape 24 | The TensorFlow documentation uses three notational conventions to describe tensor dimensionality: rank, shape, and dimension number. 25 | The following table shows how these relate to one another: 26 | 27 | -------------------------------------------------------------------------------------------- 28 | | Rank | Shape | Dimension Number | Example | 29 | | ----- | ------------------ | ------------------ | --------------------------------------- | 30 | | 0 | [ ] | 0-D | A 0-D tensor. A scalar. | 31 | | 1 | [D0] | 1-D | A 1-D tensor with shape [5] | 32 | | 2 | [D1, D2] | 2-D | A 2-D tensor with shape [3, 4] | 33 | | 3 | [D1, D2, D3] | 3-D | A 3-D tensor with shape [1, 2, 3] | 34 | | n | [D0, D1, ... Dn-1] | n-D | A tensor with shape [D0, D1, ... 
Dn-1] | 35 | -------------------------------------------------------------------------------------------- 36 | -------------------------------------------------------------------------------- /Training-and-evaluating-the-model.md: -------------------------------------------------------------------------------- 1 | # Training 2 | 3 | #### 1. To implement cross-entropy we need to first add a new placeholder to input the correct answers: 4 | 5 | y_ = tf.placeholder(tf.float32, [None, 10]) 6 | 7 | #### 2. Then we can implement the cross-entropy function, −∑y′log⁡(y): 8 | 9 | cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1])) 10 | 11 | ###### First, tf.log computes the logarithm of each element of y. Next, we multiply each element of y_ with the corresponding element of tf.log(y). Then tf.reduce_sum adds the elements in the second dimension of y, due to the reduction_indices=[1] parameter. Finally, tf.reduce_mean computes the mean over all the examples in the batch. 12 | 13 | ###### Note that in the source code, we don't use this formulation, because it is numerically unstable. Instead, we apply tf.nn.softmax_cross_entropy_with_logits on the unnormalized logits (e.g., we call softmax_cross_entropy_with_logits on tf.matmul(x, W) + b), because this more numerically stable function internally computes the softmax activation. In your code, consider using tf.nn.softmax_cross_entropy_with_logits instead. 14 | 15 | ###### Now that we know what we want our model to do, it's very easy to have TensorFlow train it to do so. Because TensorFlow knows the entire graph of your computations, it can automatically use the backpropagation algorithm to efficiently determine how your variables affect the loss you ask it to minimize. Then it can apply your choice of optimization algorithm to modify the variables and reduce the loss. 16 | 17 | train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy) 18 | 19 | ###### In this case, we ask TensorFlow to minimize cross_entropy using the gradient descent algorithm with a learning rate of 0.5. Gradient descent is a simple procedure, where TensorFlow simply shifts each variable a little bit in the direction that reduces the cost. But TensorFlow also provides many other optimization algorithms: using one is as simple as tweaking one line. 20 | 21 | ###### What TensorFlow actually does here, behind the scenes, is to add new operations to your graph which implement backpropagation and gradient descent. Then it gives you back a single operation which, when run, does a step of gradient descent training, slightly tweaking your variables to reduce the loss. 22 | 23 | 24 | #### 3. We can now launch the model in an InteractiveSession: 25 | 26 | sess = tf.InteractiveSession() 27 | 28 | #### 4. We first have to create an operation to initialize the variables we created: 29 | 30 | tf.global_variables_initializer().run() 31 | 32 | #### 5. Let's train -- we'll run the training step 1000 times! 33 | 34 | for _ in range(1000): 35 | batch_xs, batch_ys = mnist.train.next_batch(100) 36 | sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}) 37 | 38 | ###### Each step of the loop, we get a "batch" of one hundred random data points from our training set. We run train_step feeding in the batches data to replace the placeholders. 39 | 40 | ###### Using small batches of random data is called stochastic training -- in this case, stochastic gradient descent. 
Ideally, we'd like to use all our data for every step of training because that would give us a better sense of what we should be doing, but that's expensive. So, instead, we use a different subset every time. Doing this is cheap and has much of the same benefit. 41 | 42 | 43 | # Evaluating our Model 44 | 45 | ## How well does our model do? 46 | 47 | #### 1. First let's figure out where we predicted the correct label. tf.argmax is an extremely useful function which gives you the index of the highest entry in a tensor along some axis. For example, tf.argmax(y,1) is the label our model thinks is most likely for each input, while tf.argmax(y_,1) is the correct label. We can use tf.equal to check if our prediction matches the truth: 48 | 49 | correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1)) 50 | 51 | #### 2. That gives us a list of booleans. To determine what fraction are correct, we cast to floating point numbers and then take the mean. For example, [True, False, True, True] would become [1,0,1,1] which would become 0.75. 52 | 53 | accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 54 | 55 | #### 3. Finally, we ask for our accuracy on our test data. 56 | 57 | print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})) 58 | 59 | 60 | ### This should be about **_92%_**. 61 | -------------------------------------------------------------------------------- /Wide-Deep-Learning.md: -------------------------------------------------------------------------------- 1 | # Wide and Deep Learning using TensorFlow 2 | 3 | ### A. Setup 4 | 5 | 1. Download *wide-n-deep.py* [link: https://github.com/sujayVittal/Machine-Learning-with-TensorFlow-Study-Jam-2017/blob/master/wide-n-deep.py] 6 | 7 | 2. Install the pandas data analysis library. tf.learn doesn't require pandas, but it does support it, and this tutorial uses pandas. To install pandas: 8 | - Use *pip* to install pandas: 9 | 10 | shell $ sudo pip install pandas 11 | 12 | 3. Execute the tutorial code to train the linear model: 13 | 14 | shell $ python wide-n-deep.py --model_type=wide_n_deep 15 | 16 | 17 | ### B. Define Base Feature Columns 18 | 19 | First, let's define the base categorical and continuous feature columns that we'll use. These base columns will be the building blocks used by both the wide part and the deep part of the model. 20 | 21 | import tensorflow as tf 22 | 23 | # Categorical base columns. 24 | gender = tf.contrib.layers.sparse_column_with_keys(column_name="gender", keys=["Female", "Male"]) 25 | race = tf.contrib.layers.sparse_column_with_keys(column_name="race", keys=[ 26 | "Amer-Indian-Eskimo", "Asian-Pac-Islander", "Black", "Other", "White"]) 27 | education = tf.contrib.layers.sparse_column_with_hash_bucket("education", hash_bucket_size=1000) 28 | relationship = tf.contrib.layers.sparse_column_with_hash_bucket("relationship", hash_bucket_size=100) 29 | workclass = tf.contrib.layers.sparse_column_with_hash_bucket("workclass", hash_bucket_size=100) 30 | occupation = tf.contrib.layers.sparse_column_with_hash_bucket("occupation", hash_bucket_size=1000) 31 | native_country = tf.contrib.layers.sparse_column_with_hash_bucket("native_country", hash_bucket_size=1000) 32 | 33 | # Continuous base columns. 
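    # real_valued_column wraps a dense numeric feature as-is, while the
    # bucketized_column defined just below discretizes "age" at the listed
    # boundaries so that the wide (linear) part of the model can learn a
    # separate weight for each age range.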
34 | age = tf.contrib.layers.real_valued_column("age") 35 | age_buckets = tf.contrib.layers.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) 36 | education_num = tf.contrib.layers.real_valued_column("education_num") 37 | capital_gain = tf.contrib.layers.real_valued_column("capital_gain") 38 | capital_loss = tf.contrib.layers.real_valued_column("capital_loss") 39 | hours_per_week = tf.contrib.layers.real_valued_column("hours_per_week") 40 | 41 | 42 | ### C. The Wide Model: Linear Model with Crossed Feature Columns 43 | 44 | The wide model is a linear model with a wide set of sparse and crossed feature columns: 45 | 46 | wide_columns = [ 47 | gender, native_country, education, occupation, workclass, relationship, age_buckets, 48 | tf.contrib.layers.crossed_column([education, occupation], hash_bucket_size=int(1e4)), 49 | tf.contrib.layers.crossed_column([native_country, occupation], hash_bucket_size=int(1e4)), 50 | tf.contrib.layers.crossed_column([age_buckets, education, occupation], hash_bucket_size=int(1e6)) ] 51 | 52 | Wide models with crossed feature columns can memorize sparse interactions between features effectively. That being said, one limitation of crossed feature columns is that they do not generalize to feature combinations that have not appeared in the training data. Let's add a deep model with embeddings to fix that. 53 | 54 | 55 | ### D. The Deep Model: Neural Network with Embeddings 56 | 57 | The deep model is a feed-forward neural network as shown in the slide deck (refer presentation slide number 12). Each of the sparse, high-dimensional categorical features are first converted into a low-dimensional and dense real-valued vector, often referred to as an embedding vector. These low-dimensional dense embedding vectors are concatenated with the continuous features, and then fed into the hidden layers of a neural network in the forward pass. The embedding values are initialized randomly, and are trained along with all other model parameters to minimize the training loss. 58 | 59 | We'll configure the embeddings for the categorical columns using embedding_column, and concatenate them with the continuous columns: 60 | 61 | deep_columns = [ 62 | tf.contrib.layers.embedding_column(workclass, dimension=8), 63 | tf.contrib.layers.embedding_column(education, dimension=8), 64 | tf.contrib.layers.embedding_column(gender, dimension=8), 65 | tf.contrib.layers.embedding_column(relationship, dimension=8), 66 | tf.contrib.layers.embedding_column(native_country, dimension=8), 67 | tf.contrib.layers.embedding_column(occupation, dimension=8), 68 | age, education_num, capital_gain, capital_loss, hours_per_week] 69 | 70 | The higher the dimension of the embedding is, the more degrees of freedom the model will have to learn the representations of the features. 71 | 72 | 73 | ### E. Combining Wide and Deep Models into One 74 | 75 | The wide models and deep models are combined by summing up their final output log odds as the prediction, then feeding the prediction to a logistic loss function. All the graph definition and variable allocations have already been handled for you under the hood, so you simply need to create a DNNLinearCombinedClassifier: 76 | 77 | import tempfile 78 | model_dir = tempfile.mkdtemp() 79 | m = tf.contrib.learn.DNNLinearCombinedClassifier( 80 | model_dir=model_dir, 81 | linear_feature_columns=wide_columns, 82 | dnn_feature_columns=deep_columns, 83 | dnn_hidden_units=[100, 50]) 84 | 85 | ### F. 
Training and Evaluating The Model 86 | 87 | Before we train the model, let's read in the Census dataset. The code for input data processing is provided here again for your convenience: 88 | 89 | import pandas as pd 90 | import urllib 91 | 92 | 93 | COLUMNS = ["age", "workclass", "fnlwgt", "education", "education_num", 94 | "marital_status", "occupation", "relationship", "race", "gender", 95 | "capital_gain", "capital_loss", "hours_per_week", "native_country", "income_bracket"] 96 | LABEL_COLUMN = 'label' 97 | CATEGORICAL_COLUMNS = ["workclass", "education", "marital_status", "occupation", 98 | "relationship", "race", "gender", "native_country"] 99 | CONTINUOUS_COLUMNS = ["age", "education_num", "capital_gain", "capital_loss", 100 | "hours_per_week"] 101 | 102 | 103 | train_file = tempfile.NamedTemporaryFile() 104 | test_file = tempfile.NamedTemporaryFile() 105 | urllib.urlretrieve("http://mlr.cs.umass.edu/ml/machine-learning-databases/adult/adult.data", train_file.name) 106 | urllib.urlretrieve("http://mlr.cs.umass.edu/ml/machine-learning-databases/adult/adult.test", test_file.name) 107 | 108 | 109 | df_train = pd.read_csv(train_file, names=COLUMNS, skipinitialspace=True) 110 | df_test = pd.read_csv(test_file, names=COLUMNS, skipinitialspace=True, skiprows=1) 111 | df_train[LABEL_COLUMN] = (df_train['income_bracket'].apply(lambda x: '>50K' in x)).astype(int) 112 | df_test[LABEL_COLUMN] = (df_test['income_bracket'].apply(lambda x: '>50K' in x)).astype(int) 113 | 114 | def input_fn(df): 115 | 116 | continuous_cols = {k: tf.constant(df[k].values) 117 | for k in CONTINUOUS_COLUMNS} 118 | 119 | categorical_cols = {k: tf.SparseTensor( 120 | indices=[[i, 0] for i in range(df[k].size)], 121 | values=df[k].values, 122 | shape=[df[k].size, 1]) 123 | for k in CATEGORICAL_COLUMNS} 124 | 125 | feature_cols = dict(continuous_cols.items() + categorical_cols.items()) 126 | 127 | label = tf.constant(df[LABEL_COLUMN].values) 128 | 129 | return feature_cols, label 130 | 131 | def train_input_fn(): 132 | return input_fn(df_train) 133 | 134 | def eval_input_fn(): 135 | return input_fn(df_test) 136 | 137 | 138 | After reading the data, you can train and evaluate the model: 139 | 140 | m.fit(input_fn=train_input_fn, steps=200) 141 | results = m.evaluate(input_fn=eval_input_fn, steps=1) 142 | for key in sorted(results): 143 | print "%s: %s" % (key, results[key]) 144 | 145 | 146 | -------------------------------------------------------------------------------- /mnist-advanced.py: -------------------------------------------------------------------------------- 1 | 2 | # 3 | # Developed by sujayVittal; Sat Mar 11 01:05:34 IST 2017 4 | # 5 | ############### 6 | 7 | from __future__ import absolute_import 8 | from __future__ import division 9 | from __future__ import print_function 10 | 11 | import argparse 12 | import sys 13 | 14 | from tensorflow.examples.tutorials.mnist import input_data 15 | 16 | import tensorflow as tf 17 | 18 | FLAGS = None 19 | 20 | def weight_variable(shape): 21 | initial = tf.truncated_normal(shape, stddev=0.1) 22 | return tf.Variable(initial) 23 | 24 | def bias_variable(shape): 25 | initial = tf.constant(0.1, shape=shape) 26 | return tf.Variable(initial) 27 | 28 | def conv2d(x, W): 29 | return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME') 30 | 31 | def max_pool_2x2(x): 32 | return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], 33 | strides=[1, 2, 2, 1], padding='SAME') 34 | 35 | def main(_): 36 | # Import data 37 | mnist = input_data.read_data_sets('MNIST_data', one_hot=True) 38 | 39 | 40 | 
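  # The dataset is split into mnist.train (55,000 examples), mnist.validation
  # (5,000 examples) and mnist.test (10,000 examples); each image arrives
  # flattened into a 784-dimensional float vector and, because one_hot=True,
  # each label is a 10-dimensional one-hot vector.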
41 | # Create the model 42 | x = tf.placeholder(tf.float32, shape=[None, 784]) 43 | y_ = tf.placeholder(tf.float32, shape=[None, 10]) 44 | W = tf.Variable(tf.zeros([784,10])) 45 | b = tf.Variable(tf.zeros([10])) 46 | 47 | 48 | # Define loss and optimizer 49 | y = tf.matmul(x,W) + b 50 | 51 | sess = tf.InteractiveSession() 52 | tf.global_variables_initializer().run() 53 | 54 | 55 | # Train 56 | 57 | #layer 1 of CNN: We can now implement our first layer. It will consist of convolution, followed by max pooling. The convolution will compute 32 features for each 5x5 patch. Its weight tensor will have a shape of [5, 5, 1, 32]. The first two dimensions are the patch size, the next is the number of input channels, and the last is the number of output channels. We will also have a bias vector with a component for each output channel. 58 | W_conv1 = weight_variable([5, 5, 1, 32]) 59 | b_conv1 = bias_variable([32]) 60 | 61 | x_image = tf.reshape(x, [-1,28,28,1]) 62 | 63 | h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1) 64 | h_pool1 = max_pool_2x2(h_conv1) 65 | 66 | #layer 2 CNN: In order to build a deep network, we stack several layers of this type. The second layer will have 64 features for each 5x5 patch. 67 | W_conv2 = weight_variable([5, 5, 32, 64]) 68 | b_conv2 = bias_variable([64]) 69 | 70 | h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2) 71 | h_pool2 = max_pool_2x2(h_conv2) 72 | 73 | 74 | # densely connected layer : Now that the image size has been reduced to 7x7, we add a fully-connected layer with 1024 neurons to allow processing on the entire image. We reshape the tensor from the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply a ReLU. 75 | W_fc1 = weight_variable([7 * 7 * 64, 1024]) 76 | b_fc1 = bias_variable([1024]) 77 | 78 | h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64]) 79 | h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1) 80 | 81 | # dropout 82 | 83 | keep_prob = tf.placeholder(tf.float32) 84 | h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob) 85 | 86 | # Readout layer 87 | 88 | W_fc2 = weight_variable([1024, 10]) 89 | b_fc2 = bias_variable([10]) 90 | 91 | y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2 92 | 93 | 94 | cross_entropy = tf.reduce_mean( 95 | tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv)) 96 | train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy) 97 | correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1)) 98 | accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 99 | sess.run(tf.global_variables_initializer()) 100 | for i in range(20000): 101 | batch = mnist.train.next_batch(50) 102 | if i%100 == 0: 103 | train_accuracy = accuracy.eval(feed_dict={ 104 | x:batch[0], y_: batch[1], keep_prob: 1.0}) 105 | print("step %d, training accuracy %g"%(i, train_accuracy)) 106 | train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5}) 107 | 108 | print("test accuracy %g"%accuracy.eval(feed_dict={ 109 | x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})) 110 | 111 | # Test trained model 112 | correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1)) 113 | accuracy_prob = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 114 | accuracy = accuracy_prob*100 115 | print(sess.run(accuracy, feed_dict={x: mnist.test.images, 116 | y_: mnist.test.labels})) 117 | 118 | if __name__ == '__main__': 119 | parser = argparse.ArgumentParser() 120 | parser.add_argument('--data_dir', type=str, default='/tmp/tensorflow/mnist/input_data', 121 | 
help='Directory for storing input data') 122 | FLAGS, unparsed = parser.parse_known_args() 123 | tf.app.run(main=main, argv=[sys.argv[0]] + unparsed) 124 | -------------------------------------------------------------------------------- /mnist-summary.py: -------------------------------------------------------------------------------- 1 | """A simple MNIST classifier which displays summaries in TensorBoard. 2 | 3 | This is an unimpressive MNIST model, but it is a good example of using 4 | tf.name_scope to make a graph legible in the TensorBoard graph explorer, and of 5 | naming summary tags so that they are grouped meaningfully in TensorBoard. 6 | 7 | It demonstrates the functionality of every TensorBoard dashboard. 8 | """ 9 | from __future__ import absolute_import 10 | from __future__ import division 11 | from __future__ import print_function 12 | 13 | import argparse 14 | import sys 15 | 16 | import tensorflow as tf 17 | 18 | from tensorflow.examples.tutorials.mnist import input_data 19 | 20 | FLAGS = None 21 | 22 | 23 | def train(): 24 | # Import data 25 | mnist = input_data.read_data_sets(FLAGS.data_dir, 26 | one_hot=True, 27 | fake_data=FLAGS.fake_data) 28 | 29 | sess = tf.InteractiveSession() 30 | # Create a multilayer model. 31 | 32 | # Input placeholders 33 | with tf.name_scope('input'): 34 | x = tf.placeholder(tf.float32, [None, 784], name='x-input') 35 | y_ = tf.placeholder(tf.float32, [None, 10], name='y-input') 36 | 37 | with tf.name_scope('input_reshape'): 38 | image_shaped_input = tf.reshape(x, [-1, 28, 28, 1]) 39 | tf.summary.image('input', image_shaped_input, 10) 40 | 41 | # We can't initialize these variables to 0 - the network will get stuck. 42 | def weight_variable(shape): 43 | """Create a weight variable with appropriate initialization.""" 44 | initial = tf.truncated_normal(shape, stddev=0.1) 45 | return tf.Variable(initial) 46 | 47 | def bias_variable(shape): 48 | """Create a bias variable with appropriate initialization.""" 49 | initial = tf.constant(0.1, shape=shape) 50 | return tf.Variable(initial) 51 | 52 | def variable_summaries(var): 53 | """Attach a lot of summaries to a Tensor (for TensorBoard visualization).""" 54 | with tf.name_scope('summaries'): 55 | mean = tf.reduce_mean(var) 56 | tf.summary.scalar('mean', mean) 57 | with tf.name_scope('stddev'): 58 | stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean))) 59 | tf.summary.scalar('stddev', stddev) 60 | tf.summary.scalar('max', tf.reduce_max(var)) 61 | tf.summary.scalar('min', tf.reduce_min(var)) 62 | tf.summary.histogram('histogram', var) 63 | 64 | def nn_layer(input_tensor, input_dim, output_dim, layer_name, act=tf.nn.relu): 65 | """Reusable code for making a simple neural net layer. 66 | 67 | It does a matrix multiply, bias add, and then uses relu to nonlinearize. 68 | It also sets up name scoping so that the resultant graph is easy to read, 69 | and adds a number of summary ops. 70 | """ 71 | # Adding a name scope ensures logical grouping of the layers in the graph. 
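    # variable_summaries() below attaches mean/stddev/min/max scalars and a
    # histogram to both the weights and the biases, and the pre-activation and
    # post-activation values are recorded as histograms as well.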
72 | with tf.name_scope(layer_name): 73 | # This Variable will hold the state of the weights for the layer 74 | with tf.name_scope('weights'): 75 | weights = weight_variable([input_dim, output_dim]) 76 | variable_summaries(weights) 77 | with tf.name_scope('biases'): 78 | biases = bias_variable([output_dim]) 79 | variable_summaries(biases) 80 | with tf.name_scope('Wx_plus_b'): 81 | preactivate = tf.matmul(input_tensor, weights) + biases 82 | tf.summary.histogram('pre_activations', preactivate) 83 | activations = act(preactivate, name='activation') 84 | tf.summary.histogram('activations', activations) 85 | return activations 86 | 87 | hidden1 = nn_layer(x, 784, 500, 'layer1') 88 | 89 | with tf.name_scope('dropout'): 90 | keep_prob = tf.placeholder(tf.float32) 91 | tf.summary.scalar('dropout_keep_probability', keep_prob) 92 | dropped = tf.nn.dropout(hidden1, keep_prob) 93 | 94 | # Do not apply softmax activation yet, see below. 95 | y = nn_layer(dropped, 500, 10, 'layer2', act=tf.identity) 96 | 97 | with tf.name_scope('cross_entropy'): 98 | # The raw formulation of cross-entropy, 99 | # 100 | # tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.softmax(y)), 101 | # reduction_indices=[1])) 102 | # 103 | # can be numerically unstable. 104 | # 105 | # So here we use tf.nn.softmax_cross_entropy_with_logits on the 106 | # raw outputs of the nn_layer above, and then average across 107 | # the batch. 108 | diff = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y) 109 | with tf.name_scope('total'): 110 | cross_entropy = tf.reduce_mean(diff) 111 | tf.summary.scalar('cross_entropy', cross_entropy) 112 | 113 | with tf.name_scope('train'): 114 | train_step = tf.train.AdamOptimizer(FLAGS.learning_rate).minimize( 115 | cross_entropy) 116 | 117 | with tf.name_scope('accuracy'): 118 | with tf.name_scope('correct_prediction'): 119 | correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1)) 120 | with tf.name_scope('accuracy'): 121 | accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 122 | tf.summary.scalar('accuracy', accuracy) 123 | 124 | # Merge all the summaries and write them out to /tmp/tensorflow/mnist/logs/mnist_with_summaries (by default) 125 | merged = tf.summary.merge_all() 126 | train_writer = tf.summary.FileWriter(FLAGS.log_dir + '/train', sess.graph) 127 | test_writer = tf.summary.FileWriter(FLAGS.log_dir + '/test') 128 | tf.global_variables_initializer().run() 129 | 130 | # Train the model, and also write summaries. 
131 | # Every 10th step, measure test-set accuracy, and write test summaries 132 | # All other steps, run train_step on training data, & add training summaries 133 | 134 | def feed_dict(train): 135 | """Make a TensorFlow feed_dict: maps data onto Tensor placeholders.""" 136 | if train or FLAGS.fake_data: 137 | xs, ys = mnist.train.next_batch(100, fake_data=FLAGS.fake_data) 138 | k = FLAGS.dropout 139 | else: 140 | xs, ys = mnist.test.images, mnist.test.labels 141 | k = 1.0 142 | return {x: xs, y_: ys, keep_prob: k} 143 | 144 | for i in range(FLAGS.max_steps): 145 | if i % 10 == 0: # Record summaries and test-set accuracy 146 | summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False)) 147 | test_writer.add_summary(summary, i) 148 | print('Accuracy at step %s: %s' % (i, acc)) 149 | else: # Record train set summaries, and train 150 | if i % 100 == 99: # Record execution stats 151 | run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE) 152 | run_metadata = tf.RunMetadata() 153 | summary, _ = sess.run([merged, train_step], 154 | feed_dict=feed_dict(True), 155 | options=run_options, 156 | run_metadata=run_metadata) 157 | train_writer.add_run_metadata(run_metadata, 'step%03d' % i) 158 | train_writer.add_summary(summary, i) 159 | print('Adding run metadata for', i) 160 | else: # Record a summary 161 | summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True)) 162 | train_writer.add_summary(summary, i) 163 | train_writer.close() 164 | test_writer.close() 165 | 166 | 167 | def main(_): 168 | if tf.gfile.Exists(FLAGS.log_dir): 169 | tf.gfile.DeleteRecursively(FLAGS.log_dir) 170 | tf.gfile.MakeDirs(FLAGS.log_dir) 171 | train() 172 | 173 | 174 | if __name__ == '__main__': 175 | parser = argparse.ArgumentParser() 176 | parser.add_argument('--fake_data', nargs='?', const=True, type=bool, 177 | default=False, 178 | help='If true, uses fake data for unit testing.') 179 | parser.add_argument('--max_steps', type=int, default=1000, 180 | help='Number of steps to run trainer.') 181 | parser.add_argument('--learning_rate', type=float, default=0.001, 182 | help='Initial learning rate') 183 | parser.add_argument('--dropout', type=float, default=0.9, 184 | help='Keep probability for training dropout.') 185 | parser.add_argument('--data_dir', type=str, default='/tmp/tensorflow/mnist/input_data', 186 | help='Directory for storing input data') 187 | parser.add_argument('--log_dir', type=str, default='/tmp/tensorflow/mnist/logs/mnist_with_summaries', 188 | help='Summaries log directory') 189 | FLAGS, unparsed = parser.parse_known_args() 190 | tf.app.run(main=main, argv=[sys.argv[0]] + unparsed) 191 | -------------------------------------------------------------------------------- /mnist_softmax.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | import argparse 6 | import sys 7 | 8 | from tensorflow.examples.tutorials.mnist import input_data 9 | 10 | import tensorflow as tf 11 | 12 | FLAGS = None 13 | 14 | 15 | def main(_): 16 | # Import data 17 | mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True) 18 | 19 | # Create the model 20 | x = tf.placeholder(tf.float32, [None, 784]) 21 | W = tf.Variable(tf.zeros([784, 10])) 22 | b = tf.Variable(tf.zeros([10])) 23 | y = tf.matmul(x, W) + b 24 | 25 | # Define loss and optimizer 26 | y_ = tf.placeholder(tf.float32, [None, 10]) 27 | 28 | # The raw formulation of 
cross-entropy, 29 | # 30 | # tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)), 31 | # reduction_indices=[1])) 32 | # 33 | # can be numerically unstable. 34 | # 35 | # So here we use tf.nn.softmax_cross_entropy_with_logits on the raw 36 | # outputs of 'y', and then average across the batch. 37 | cross_entropy = tf.reduce_mean( 38 | tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)) 39 | train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy) 40 | 41 | sess = tf.InteractiveSession() 42 | tf.global_variables_initializer().run() 43 | # Train 44 | for _ in range(1000): 45 | batch_xs, batch_ys = mnist.train.next_batch(100) 46 | sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}) 47 | 48 | # Test trained model 49 | correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1)) 50 | accuracy_prob = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) 51 | accuracy = accuracy_prob*100 52 | print(sess.run(accuracy, feed_dict={x: mnist.test.images, 53 | y_: mnist.test.labels})) 54 | 55 | if __name__ == '__main__': 56 | parser = argparse.ArgumentParser() 57 | parser.add_argument('--data_dir', type=str, default='/tmp/tensorflow/mnist/input_data', 58 | help='Directory for storing input data') 59 | FLAGS, unparsed = parser.parse_known_args() 60 | tf.app.run(main=main, argv=[sys.argv[0]] + unparsed) 61 | -------------------------------------------------------------------------------- /neural-network-classifer-iris.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | import tensorflow as tf 6 | import numpy as np 7 | 8 | # Data sets 9 | IRIS_TRAINING = "iris_training.csv" 10 | IRIS_TEST = "iris_test.csv" 11 | 12 | # Load datasets. 13 | training_set = tf.contrib.learn.datasets.base.load_csv_with_header( 14 | filename=IRIS_TRAINING, 15 | target_dtype=np.int, 16 | features_dtype=np.float32) 17 | test_set = tf.contrib.learn.datasets.base.load_csv_with_header( 18 | filename=IRIS_TEST, 19 | target_dtype=np.int, 20 | features_dtype=np.float32) 21 | 22 | # Specify that all features have real-value data 23 | feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)] 24 | 25 | # Build 3 layer DNN with 10, 20, 10 units respectively. 26 | classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns, 27 | hidden_units=[10, 20, 10], 28 | n_classes=3, 29 | model_dir="/tmp/iris_model") 30 | 31 | # Fit model. 32 | classifier.fit(x=training_set.data, 33 | y=training_set.target, 34 | steps=2000) 35 | 36 | # Evaluate accuracy. 37 | accuracy_score = classifier.evaluate(x=test_set.data, 38 | y=test_set.target)["accuracy"] 39 | print('Accuracy: {0:f}'.format(accuracy_score*100)) 40 | 41 | # Classify two new flower samples. 42 | new_samples = np.array( 43 | [[6.4, 3.2, 4.5, 1.5], [5.8, 3.1, 5.0, 1.7]], dtype=float) 44 | y = list(classifier.predict(new_samples, as_iterable=True)) 45 | print('Predictions: {}'.format(str(y))) 46 | -------------------------------------------------------------------------------- /tf-contrib-learn.md: -------------------------------------------------------------------------------- 1 | # _tf.contrib.learn_ hands-on 2 | 3 | ### The following code walk through is with respect to **_neural-network-classifer-iris.py_**. Please make sure you have the file. 
You can find the resouce file at: 4 | 5 | https://github.com/sujayVittal/Machine-Learning-with-TensorFlow-Study-Jam-2017/blob/master/neural-network-classifer-iris.py 6 | 7 | ### Source of data set 8 | 1. A training set of 120 examples - http://download.tensorflow.org/data/iris_training.csv 9 | 2. A test set of 30 samples - http://download.tensorflow.org/data/iris_test.csv 10 | 11 | Place these files in the **_same_** directory as your Python code! 12 | 13 | #### 1. To get started, first import TensorFlow and NumPy: 14 | 15 | from __future__ import absolute_import 16 | from __future__ import division 17 | from __future__ import print_function 18 | 19 | import tensorflow as tf 20 | import numpy as np 21 | 22 | ### Load the Iris CSV data to TensorFlow: 23 | 24 | #### 2. Next, load the training and test sets into Datasets using the load_csv_with_header() method in learn.datasets.base. The load_csv_with_header() method takes three required arguments: 25 | 26 | a. **filename**, which takes the filepath to the CSV file 27 | 28 | b. **target_dtype**, which takes the numpy datatype of the dataset's target value. 29 | 30 | c. **features_dtype**, which takes the numpy datatype of the dataset's feature values. 31 | 32 | #### Here, the target (the value you're training the model to predict) is flower species, which is an integer from 0–2, so the appropriate numpy datatype is np.int: 33 | 34 | 35 | IRIS_TRAINING = "iris_training.csv" 36 | IRIS_TEST = "iris_test.csv" 37 | 38 | training_set = tf.contrib.learn.datasets.base.load_csv_with_header( 39 | filename=IRIS_TRAINING, 40 | target_dtype=np.int, 41 | features_dtype=np.float32) 42 | test_set = tf.contrib.learn.datasets.base.load_csv_with_header( 43 | filename=IRIS_TEST, 44 | target_dtype=np.int, 45 | features_dtype=np.float32) 46 | 47 | 48 | #### Datasets in tf.contrib.learn are named tuples; you can access feature data and target values via the data and target fields. Here, training_set.data and training_set.target contain the feature data and target values for the training set, respectively, and test_set.data and test_set.target contain feature data and target values for the test set. 49 | 50 | 51 | ### Construct a Deep Neural Network Classifier: 52 | 53 | #### 3. f.contrib.learn offers a variety of predefined models, called Estimators, which you can use "out of the box" to run training and evaluation operations on your data. Here, you'll configure a Deep Neural Network Classifier model to fit the Iris data. Using tf.contrib.learn, you can instantiate your tf.contrib.learn.DNNClassifier with just a couple lines of code: 54 | 55 | 56 | feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)] 57 | 58 | 59 | classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns, 60 | hidden_units=[10, 20, 10], 61 | n_classes=3, 62 | model_dir="/tmp/iris_model") 63 | 64 | 65 | #### 4. The code above first defines the model's feature columns, which specify the data type for the features in the data set. All the feature data is continuous, so tf.contrib.layers.real_valued_column is the appropriate function to use to construct the feature columns. There are four features in the data set (sepal width, sepal height, petal width, and petal height), so accordingly dimension must be set to 4 to hold all the data. 66 | 67 | #### Then, the code creates a DNNClassifier model using the following arguments: 68 | 69 | a. feature_columns=feature_columns. The set of feature columns defined above. 70 | b. hidden_units=[10, 20, 10]. 
Three hidden layers, containing 10, 20, and 10 neurons, respectively. 71 | c. n_classes=3. Three target classes, representing the three Iris species. 72 | d. model_dir=/tmp/iris_model. The directory in which TensorFlow will save checkpoint data during model training. 73 | 74 | 75 | ### Fit the DNNClassifier to the Iris Training Data 76 | 77 | #### 5. Now that you've configured your DNN classifier model, you can fit it to the Iris training data using the fit method. Pass as arguments your feature data (training_set.data), target values (training_set.target), and the number of steps to train (here, 2000): 78 | 79 | classifier.fit(x=training_set.data, y=training_set.target, steps=2000) 80 | 81 | #### 6. The state of the model is preserved in the classifier, which means you can train iteratively if you like. For example, the above is equivalent to the following: 82 | 83 | classifier.fit(x=training_set.data, y=training_set.target, steps=1000) 84 | classifier.fit(x=training_set.data, y=training_set.target, steps=1000) 85 | 86 | #### However, if you're looking to track the model while it trains, you'll likely want to instead use a TensorFlow monitor to perform logging operations. 87 | 88 | 89 | ### Evaluate Model Accuracy 90 | 91 | #### 7. You've fit your DNNClassifier model on the Iris training data; now, you can check its accuracy on the Iris test data using the evaluate method. Like fit, evaluate takes feature data and target values as arguments, and returns a dict with the evaluation results. The following code passes the Iris test data—test_set.data and test_set.target—to evaluate and prints the accuracy from the results: 92 | 93 | accuracy_score = classifier.evaluate(x=test_set.data, y=test_set.target)["accuracy"] 94 | print('Accuracy: {0:f}'.format(accuracy_score)) 95 | 96 | #### Run the script to check the accuracy results: 97 | 98 | Accuracy: 0.966667 99 | 100 | 101 | 102 | -------------------------------------------------------------------------------- /weights_accuracy_during_training-tensorboard.md: -------------------------------------------------------------------------------- 1 | ### The code example below is a modification of the simple MNIST tutorial, in which we have added some summary ops, and run them every ten steps. If you run this and then launch tensorboard --logdir=/tmp/mnist_logs, you'll be able to visualize statistics, such as how the weights or accuracy varied during training. 
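### Note: this excerpt assumes the surrounding definitions from a script like mnist-summary.py (the placeholders x and y_, the weight_variable and bias_variable helpers, a FLAGS object providing learning_rate and summaries_dir, and an active InteractiveSession), so it is meant to be read alongside that script rather than run on its own.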
--------------------------------------------------------------------------------
/weights_accuracy_during_training-tensorboard.md:
--------------------------------------------------------------------------------
1 | ### The code example below is a modification of the simple MNIST tutorial, in which we have added some summary ops and run them every ten steps. If you run this and then launch tensorboard --logdir=/tmp/mnist_logs, you'll be able to visualize statistics, such as how the weights or accuracy varied during training.
2 | 
3 | ### The code excerpt is as follows:
4 | 
5 | def variable_summaries(var):
6 |   """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
7 |   with tf.name_scope('summaries'):
8 |     mean = tf.reduce_mean(var)
9 |     tf.summary.scalar('mean', mean)
10 |     with tf.name_scope('stddev'):
11 |       stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
12 |     tf.summary.scalar('stddev', stddev)
13 |     tf.summary.scalar('max', tf.reduce_max(var))
14 |     tf.summary.scalar('min', tf.reduce_min(var))
15 |     tf.summary.histogram('histogram', var)
16 | 
17 | def nn_layer(input_tensor, input_dim, output_dim, layer_name, act=tf.nn.relu):
18 | 
19 |   with tf.name_scope(layer_name):
20 | 
21 |     with tf.name_scope('weights'):
22 |       weights = weight_variable([input_dim, output_dim])
23 |       variable_summaries(weights)
24 |     with tf.name_scope('biases'):
25 |       biases = bias_variable([output_dim])
26 |       variable_summaries(biases)
27 |     with tf.name_scope('Wx_plus_b'):
28 |       preactivate = tf.matmul(input_tensor, weights) + biases
29 |       tf.summary.histogram('pre_activations', preactivate)
30 |     activations = act(preactivate, name='activation')
31 |     tf.summary.histogram('activations', activations)
32 |     return activations
33 | 
34 | hidden1 = nn_layer(x, 784, 500, 'layer1')
35 | 
36 | with tf.name_scope('dropout'):
37 |   keep_prob = tf.placeholder(tf.float32)
38 |   tf.summary.scalar('dropout_keep_probability', keep_prob)
39 |   dropped = tf.nn.dropout(hidden1, keep_prob)
40 | 
41 | 
42 | y = nn_layer(dropped, 500, 10, 'layer2', act=tf.identity)
43 | 
44 | with tf.name_scope('cross_entropy'):
45 | 
46 |   diff = tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y)
47 |   with tf.name_scope('total'):
48 |     cross_entropy = tf.reduce_mean(diff)
49 | tf.summary.scalar('cross_entropy', cross_entropy)
50 | 
51 | with tf.name_scope('train'):
52 |   train_step = tf.train.AdamOptimizer(FLAGS.learning_rate).minimize(
53 |       cross_entropy)
54 | 
55 | with tf.name_scope('accuracy'):
56 |   with tf.name_scope('correct_prediction'):
57 |     correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
58 |   with tf.name_scope('accuracy'):
59 |     accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
60 | tf.summary.scalar('accuracy', accuracy)
61 | 
62 | 
63 | merged = tf.summary.merge_all()
64 | train_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/train',
65 |                                      sess.graph)
66 | test_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/test')
67 | tf.global_variables_initializer().run()
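
Note that nn_layer above calls weight_variable and bias_variable, which are not defined in this excerpt; they come from the full MNIST-with-summaries tutorial. A typical definition (a sketch consistent with that tutorial: small random weights and a slightly positive bias) looks like this:

    def weight_variable(shape):
      """Create a weight variable, initialized with small truncated-normal noise."""
      initial = tf.truncated_normal(shape, stddev=0.1)
      return tf.Variable(initial)

    def bias_variable(shape):
      """Create a bias variable, initialized to a small positive constant."""
      initial = tf.constant(0.1, shape=shape)
      return tf.Variable(initial)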
--------------------------------------------------------------------------------
/wide-n-deep.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 | from __future__ import division
3 | from __future__ import print_function
4 | 
5 | import argparse
6 | import sys
7 | import tempfile
8 | 
9 | from six.moves import urllib
10 | 
11 | import pandas as pd
12 | import tensorflow as tf
13 | 
14 | 
15 | COLUMNS = ["age", "workclass", "fnlwgt", "education", "education_num",
16 |            "marital_status", "occupation", "relationship", "race", "gender",
17 |            "capital_gain", "capital_loss", "hours_per_week", "native_country",
18 |            "income_bracket"]
19 | LABEL_COLUMN = "label"
20 | CATEGORICAL_COLUMNS = ["workclass", "education", "marital_status", "occupation",
21 |                        "relationship", "race", "gender", "native_country"]
22 | CONTINUOUS_COLUMNS = ["age", "education_num", "capital_gain", "capital_loss",
23 |                       "hours_per_week"]
24 | 
25 | 
26 | def maybe_download(train_data, test_data):
27 |   """Maybe downloads training data and returns train and test file names."""
28 |   if train_data:
29 |     train_file_name = train_data
30 |   else:
31 |     train_file = tempfile.NamedTemporaryFile(delete=False)
32 |     urllib.request.urlretrieve("http://mlr.cs.umass.edu/ml/machine-learning-databases/adult/adult.data", train_file.name)  # pylint: disable=line-too-long
33 |     train_file_name = train_file.name
34 |     train_file.close()
35 |     print("Training data is downloaded to %s" % train_file_name)
36 | 
37 |   if test_data:
38 |     test_file_name = test_data
39 |   else:
40 |     test_file = tempfile.NamedTemporaryFile(delete=False)
41 |     urllib.request.urlretrieve("http://mlr.cs.umass.edu/ml/machine-learning-databases/adult/adult.test", test_file.name)  # pylint: disable=line-too-long
42 |     test_file_name = test_file.name
43 |     test_file.close()
44 |     print("Test data is downloaded to %s" % test_file_name)
45 | 
46 |   return train_file_name, test_file_name
47 | 
48 | 
49 | def build_estimator(model_dir, model_type):
50 |   """Build an estimator."""
51 |   # Sparse base columns.
52 |   gender = tf.contrib.layers.sparse_column_with_keys(column_name="gender",
53 |                                                      keys=["female", "male"])
54 |   education = tf.contrib.layers.sparse_column_with_hash_bucket(
55 |       "education", hash_bucket_size=1000)
56 |   relationship = tf.contrib.layers.sparse_column_with_hash_bucket(
57 |       "relationship", hash_bucket_size=100)
58 |   workclass = tf.contrib.layers.sparse_column_with_hash_bucket(
59 |       "workclass", hash_bucket_size=100)
60 |   occupation = tf.contrib.layers.sparse_column_with_hash_bucket(
61 |       "occupation", hash_bucket_size=1000)
62 |   native_country = tf.contrib.layers.sparse_column_with_hash_bucket(
63 |       "native_country", hash_bucket_size=1000)
64 | 
65 |   # Continuous base columns.
66 |   age = tf.contrib.layers.real_valued_column("age")
67 |   education_num = tf.contrib.layers.real_valued_column("education_num")
68 |   capital_gain = tf.contrib.layers.real_valued_column("capital_gain")
69 |   capital_loss = tf.contrib.layers.real_valued_column("capital_loss")
70 |   hours_per_week = tf.contrib.layers.real_valued_column("hours_per_week")
71 | 
72 |   # Transformations.
73 |   age_buckets = tf.contrib.layers.bucketized_column(age,
74 |                                                     boundaries=[
75 |                                                         18, 25, 30, 35, 40, 45,
76 |                                                         50, 55, 60, 65
77 |                                                     ])
78 | 
79 |   # Wide columns and deep columns.
80 |   wide_columns = [gender, native_country, education, occupation, workclass,
81 |                   relationship, age_buckets,
82 |                   tf.contrib.layers.crossed_column([education, occupation],
83 |                                                    hash_bucket_size=int(1e4)),
84 |                   tf.contrib.layers.crossed_column(
85 |                       [age_buckets, education, occupation],
86 |                       hash_bucket_size=int(1e6)),
87 |                   tf.contrib.layers.crossed_column([native_country, occupation],
88 |                                                    hash_bucket_size=int(1e4))]
89 |   deep_columns = [
90 |       tf.contrib.layers.embedding_column(workclass, dimension=8),
91 |       tf.contrib.layers.embedding_column(education, dimension=8),
92 |       tf.contrib.layers.embedding_column(gender, dimension=8),
93 |       tf.contrib.layers.embedding_column(relationship, dimension=8),
94 |       tf.contrib.layers.embedding_column(native_country,
95 |                                          dimension=8),
96 |       tf.contrib.layers.embedding_column(occupation, dimension=8),
97 |       age,
98 |       education_num,
99 |       capital_gain,
100 |       capital_loss,
101 |       hours_per_week,
102 |   ]
103 | 
104 |   if model_type == "wide":
105 |     m = tf.contrib.learn.LinearClassifier(model_dir=model_dir,
106 |                                           feature_columns=wide_columns)
107 |   elif model_type == "deep":
108 |     m = tf.contrib.learn.DNNClassifier(model_dir=model_dir,
109 |                                        feature_columns=deep_columns,
110 |                                        hidden_units=[100, 50])
111 |   else:
112 |     m = tf.contrib.learn.DNNLinearCombinedClassifier(
113 |         model_dir=model_dir,
114 |         linear_feature_columns=wide_columns,
115 |         dnn_feature_columns=deep_columns,
116 |         dnn_hidden_units=[100, 50])
117 |   return m
118 | 
119 | 
120 | def input_fn(df):
121 |   """Input builder function."""
122 |   # Creates a dictionary mapping from each continuous feature column name (k) to
123 |   # the values of that column stored in a constant Tensor.
124 |   continuous_cols = {k: tf.constant(df[k].values) for k in CONTINUOUS_COLUMNS}
125 |   # Creates a dictionary mapping from each categorical feature column name (k)
126 |   # to the values of that column stored in a tf.SparseTensor.
127 |   categorical_cols = {
128 |       k: tf.SparseTensor(
129 |           indices=[[i, 0] for i in range(df[k].size)],
130 |           values=df[k].values,
131 |           dense_shape=[df[k].size, 1])
132 |       for k in CATEGORICAL_COLUMNS}
133 |   # Merges the two dictionaries into one.
134 |   feature_cols = dict(continuous_cols)
135 |   feature_cols.update(categorical_cols)
136 |   # Converts the label column into a constant Tensor.
137 |   label = tf.constant(df[LABEL_COLUMN].values)
138 |   # Returns the feature columns and the label.
139 |   return feature_cols, label
140 | 
141 | 
142 | def train_and_eval(model_dir, model_type, train_steps, train_data, test_data):
143 |   """Train and evaluate the model."""
144 |   train_file_name, test_file_name = maybe_download(train_data, test_data)
145 |   df_train = pd.read_csv(
146 |       tf.gfile.Open(train_file_name),
147 |       names=COLUMNS,
148 |       skipinitialspace=True,
149 |       engine="python")
150 |   df_test = pd.read_csv(
151 |       tf.gfile.Open(test_file_name),
152 |       names=COLUMNS,
153 |       skipinitialspace=True,
154 |       skiprows=1,
155 |       engine="python")
156 | 
157 |   # remove NaN elements
158 |   df_train = df_train.dropna(how='any', axis=0)
159 |   df_test = df_test.dropna(how='any', axis=0)
160 | 
161 |   df_train[LABEL_COLUMN] = (
162 |       df_train["income_bracket"].apply(lambda x: ">50K" in x)).astype(int)
163 |   df_test[LABEL_COLUMN] = (
164 |       df_test["income_bracket"].apply(lambda x: ">50K" in x)).astype(int)
165 | 
166 |   model_dir = tempfile.mkdtemp() if not model_dir else model_dir
167 |   print("model directory = %s" % model_dir)
168 | 
169 |   m = build_estimator(model_dir, model_type)
170 |   m.fit(input_fn=lambda: input_fn(df_train), steps=train_steps)
171 |   results = m.evaluate(input_fn=lambda: input_fn(df_test), steps=1)
172 |   for key in sorted(results):
173 |     print("%s: %s" % (key, results[key]))
174 | 
175 | 
176 | FLAGS = None
177 | 
178 | 
179 | def main(_):
180 |   train_and_eval(FLAGS.model_dir, FLAGS.model_type, FLAGS.train_steps,
181 |                  FLAGS.train_data, FLAGS.test_data)
182 | 
183 | 
184 | if __name__ == "__main__":
185 |   parser = argparse.ArgumentParser()
186 |   parser.register("type", "bool", lambda v: v.lower() == "true")
187 |   parser.add_argument(
188 |       "--model_dir",
189 |       type=str,
190 |       default="",
191 |       help="Base directory for output models."
192 |   )
193 |   parser.add_argument(
194 |       "--model_type",
195 |       type=str,
196 |       default="wide_n_deep",
197 |       help="Valid model types: {'wide', 'deep', 'wide_n_deep'}."
198 |   )
199 |   parser.add_argument(
200 |       "--train_steps",
201 |       type=int,
202 |       default=200,
203 |       help="Number of training steps."
204 |   )
205 |   parser.add_argument(
206 |       "--train_data",
207 |       type=str,
208 |       default="",
209 |       help="Path to the training data."
210 |   )
211 |   parser.add_argument(
212 |       "--test_data",
213 |       type=str,
214 |       default="",
215 |       help="Path to the test data."
216 |   )
217 |   FLAGS, unparsed = parser.parse_known_args()
218 |   tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
219 | 
--------------------------------------------------------------------------------
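
For reference, wide-n-deep.py is normally run from the command line; the flags map directly to the argparse arguments defined at the bottom of the script, and --train_data/--test_data can be omitted because maybe_download() fetches the census data automatically (the model directory below is just an example path):

    python wide-n-deep.py --model_type=wide_n_deep --train_steps=200 --model_dir=/tmp/census_model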