├── .gitattributes ├── .github └── workflows │ └── manual.yml ├── .gitignore ├── CODEOWNERS ├── LICENSE ├── README.md ├── create_bottlenecks.sh ├── feature_extraction.py ├── feature_extraction_solution.py ├── inception-cifar10-bottleneck-data.zip ├── inception-traffic-bottleneck-data.zip ├── resnet-cifar10-bottleneck-data.zip ├── resnet-traffic-bottleneck-data.zip ├── run_bottleneck.py ├── shrink.py ├── vgg-cifar10-bottleneck-data.zip └── vgg-traffic-bottleneck-data.zip /.gitattributes: -------------------------------------------------------------------------------- 1 | *.zip filter=lfs diff=lfs merge=lfs -text 2 | -------------------------------------------------------------------------------- /.github/workflows/manual.yml: -------------------------------------------------------------------------------- 1 | # Workflow to ensure whenever a Github PR is submitted, 2 | # a JIRA ticket gets created automatically. 3 | name: Manual Workflow 4 | 5 | # Controls when the action will run. 6 | on: 7 | # Triggers the workflow on pull request events but only for the master branch 8 | pull_request_target: 9 | types: [opened, reopened] 10 | 11 | # Allows you to run this workflow manually from the Actions tab 12 | workflow_dispatch: 13 | 14 | jobs: 15 | test-transition-issue: 16 | name: Convert Github Issue to Jira Issue 17 | runs-on: ubuntu-latest 18 | steps: 19 | - name: Checkout 20 | uses: actions/checkout@master 21 | 22 | - name: Login 23 | uses: atlassian/gajira-login@master 24 | env: 25 | JIRA_BASE_URL: ${{ secrets.JIRA_BASE_URL }} 26 | JIRA_USER_EMAIL: ${{ secrets.JIRA_USER_EMAIL }} 27 | JIRA_API_TOKEN: ${{ secrets.JIRA_API_TOKEN }} 28 | 29 | - name: Create NEW JIRA ticket 30 | id: create 31 | uses: atlassian/gajira-create@master 32 | with: 33 | project: CONUPDATE 34 | issuetype: Task 35 | summary: | 36 | Github PR [Assign the ND component] | Repo: ${{ github.repository }} | PR# ${{github.event.number}} 37 | description: | 38 | Repo link: https://github.com/${{ 
github.repository }} 39 | PR no. ${{ github.event.pull_request.number }} 40 | PR title: ${{ github.event.pull_request.title }} 41 | PR description: ${{ github.event.pull_request.body }} 42 | In addition, please resolve other issues, if any. 43 | fields: '{"components": [{"name":"nd013 - Self Driving Car Engineer ND"}], "customfield_16449":"https://classroom.udacity.com/", "customfield_16450":"Resolve the PR", "labels": ["github"], "priority":{"id": "4"}}' 44 | 45 | - name: Log created issue 46 | run: echo "Issue ${{ steps.create.outputs.issue }} was created" 47 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | data 2 | models 3 | .ipynb_checkpoints 4 | bottlenecks -------------------------------------------------------------------------------- /CODEOWNERS: -------------------------------------------------------------------------------- 1 | * @domluna 2 | 3 | * @udacity/active-public-content -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2016-2018 Udacity, Inc. 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Deprecated Repo 2 | This repository is deprecated. Currently enrolled learners, if any, can: 3 | - Utilize the https://knowledge.udacity.com/ forum to seek help on content-specific issues. 4 | - [Submit a support ticket](https://udacity.zendesk.com/hc/en-us/requests/new) if (learners are) blocked due to other reasons. 5 | 6 | 7 | # Transfer Learning Lab with VGG, Inception and ResNet 8 | [![Udacity - Self-Driving Car NanoDegree](https://s3.amazonaws.com/udacity-sdc/github/shield-carnd.svg)](http://www.udacity.com/drive) 9 | 10 | In this lab, you will continue exploring transfer learning. You've already explored feature extraction with AlexNet and TensorFlow. Next, you will use Keras to explore feature extraction with the VGG, Inception and ResNet architectures. The models you will use were trained for days or weeks on the [ImageNet dataset](http://www.image-net.org/). Thus, the weights encapsulate higher-level features learned from training on thousands of classes. 11 | 12 | We'll use two datasets in this lab: 13 | 14 | 1. German Traffic Sign Dataset 15 | 2. Cifar10 16 | 17 | Unless you have a powerful GPU, running feature extraction on these models will take a significant amount of time. 
To make things easier, we precomputed **bottleneck features** for each (network, dataset) pair; this will allow you to experiment with feature extraction even on a modest CPU. You can think of bottleneck features as feature extraction with caching. Because the base network weights are frozen during feature extraction, the output for a given image will always be the same. Thus, once an image has been passed through the network, we can cache and reuse the output. 18 | 19 | The files are named as follows: 20 | 21 | - {network}_{dataset}_bottleneck_features_train.p 22 | - {network}_{dataset}_bottleneck_features_validation.p 23 | 24 | network can be one of 'vgg', 'inception', or 'resnet' 25 | 26 | dataset can be one of 'cifar10' or 'traffic' 27 | 28 | How will the pretrained model perform on the new datasets? 29 | -------------------------------------------------------------------------------- /create_bottlenecks.sh: -------------------------------------------------------------------------------- 1 | # echo "Running Inception Cifar10" 2 | # python run_bottleneck.py --network inception --batch_size 32 --dataset cifar10 3 | # echo "Running Inception Traffic" 4 | # python run_bottleneck.py --network inception --batch_size 32 --dataset traffic 5 | # echo "Running ResNet Cifar10" 6 | # python run_bottleneck.py --network resnet --batch_size 32 --dataset cifar10 7 | # echo "Running ResNet Traffic" 8 | # python run_bottleneck.py --network resnet --batch_size 32 --dataset traffic 9 | # echo "Running VGG Cifar10" 10 | # python run_bottleneck.py --network vgg --batch_size 16 --dataset cifar10 11 | # echo "Running VGG Traffic" 12 | # python run_bottleneck.py --network vgg --batch_size 16 --dataset traffic 13 | python shrink.py --network vgg --dataset traffic --size 100 14 | python shrink.py --network vgg --dataset cifar10 --size 100 15 | python shrink.py --network resnet --dataset traffic --size 100 16 | python shrink.py --network resnet --dataset cifar10 --size 100 17 | python 
shrink.py --network inception --dataset traffic --size 100 18 | python shrink.py --network inception --dataset cifar10 --size 100 -------------------------------------------------------------------------------- /feature_extraction.py: -------------------------------------------------------------------------------- 1 | import pickle 2 | import tensorflow as tf 3 | # TODO: import Keras layers you need here 4 | 5 | flags = tf.app.flags 6 | FLAGS = flags.FLAGS 7 | 8 | # command line flags 9 | flags.DEFINE_string('training_file', '', "Bottleneck features training file (.p)") 10 | flags.DEFINE_string('validation_file', '', "Bottleneck features validation file (.p)") 11 | 12 | 13 | def load_bottleneck_data(training_file, validation_file): 14 | """ 15 | Utility function to load bottleneck features. 16 | 17 | Arguments: 18 | training_file - String 19 | validation_file - String 20 | """ 21 | print("Training file", training_file) 22 | print("Validation file", validation_file) 23 | 24 | with open(training_file, 'rb') as f: 25 | train_data = pickle.load(f) 26 | with open(validation_file, 'rb') as f: 27 | validation_data = pickle.load(f) 28 | 29 | X_train = train_data['features'] 30 | y_train = train_data['labels'] 31 | X_val = validation_data['features'] 32 | y_val = validation_data['labels'] 33 | 34 | return X_train, y_train, X_val, y_val 35 | 36 | 37 | def main(_): 38 | # load bottleneck data 39 | X_train, y_train, X_val, y_val = load_bottleneck_data(FLAGS.training_file, FLAGS.validation_file) 40 | 41 | print(X_train.shape, y_train.shape) 42 | print(X_val.shape, y_val.shape) 43 | 44 | # TODO: define your model and hyperparams here 45 | # make sure to adjust the number of classes based on 46 | # the dataset 47 | # 10 for cifar10 48 | # 43 for traffic 49 | 50 | # TODO: train your model here 51 | 52 | 53 | # parses flags and calls the `main` function above 54 | if __name__ == '__main__': 55 | tf.app.run() 56 | 
-------------------------------------------------------------------------------- /feature_extraction_solution.py: -------------------------------------------------------------------------------- 1 | import pickle 2 | import tensorflow as tf 3 | import numpy as np 4 | from keras.layers import Input, Flatten, Dense 5 | from keras.models import Model 6 | 7 | flags = tf.app.flags 8 | FLAGS = flags.FLAGS 9 | 10 | # command line flags 11 | flags.DEFINE_string('training_file', '', "Bottleneck features training file (.p)") 12 | flags.DEFINE_string('validation_file', '', "Bottleneck features validation file (.p)") 13 | flags.DEFINE_integer('epochs', 50, "The number of epochs.") 14 | flags.DEFINE_integer('batch_size', 256, "The batch size.") 15 | 16 | 17 | def load_bottleneck_data(training_file, validation_file): 18 | """ 19 | Utility function to load bottleneck features. 20 | 21 | Arguments: 22 | training_file - String 23 | validation_file - String 24 | """ 25 | print("Training file", training_file) 26 | print("Validation file", validation_file) 27 | 28 | with open(training_file, 'rb') as f: 29 | train_data = pickle.load(f) 30 | with open(validation_file, 'rb') as f: 31 | validation_data = pickle.load(f) 32 | 33 | X_train = train_data['features'] 34 | y_train = train_data['labels'] 35 | X_val = validation_data['features'] 36 | y_val = validation_data['labels'] 37 | 38 | return X_train, y_train, X_val, y_val 39 | 40 | 41 | def main(_): 42 | # load bottleneck data 43 | X_train, y_train, X_val, y_val = load_bottleneck_data(FLAGS.training_file, FLAGS.validation_file) 44 | 45 | print(X_train.shape, y_train.shape) 46 | print(X_val.shape, y_val.shape) 47 | 48 | nb_classes = len(np.unique(y_train)) 49 | 50 | # define model 51 | input_shape = X_train.shape[1:] 52 | inp = Input(shape=input_shape) 53 | x = Flatten()(inp) 54 | x = Dense(nb_classes, activation='softmax')(x) 55 | model = Model(inp, x) 56 | model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', 
metrics=['accuracy']) 57 | 58 | # train model 59 | model.fit(X_train, y_train, FLAGS.batch_size, FLAGS.epochs, validation_data=(X_val, y_val), shuffle=True) 60 | 61 | 62 | # parses flags and calls the `main` function above 63 | if __name__ == '__main__': 64 | tf.app.run() 65 | -------------------------------------------------------------------------------- /inception-cifar10-bottleneck-data.zip: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:09c25f5b9a56ebbee29981d7e4783f83670b8e8710244d50759c2492e63dc582 3 | size 379422349 4 | -------------------------------------------------------------------------------- /inception-traffic-bottleneck-data.zip: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:abc340c0d415222881cb807a185287a534098b70ee17b93bd2b55ba43500a8dd 3 | size 297257393 4 | -------------------------------------------------------------------------------- /resnet-cifar10-bottleneck-data.zip: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:12efd07d329c5c4b4e2e054dd8f9dd432db54c7418c9075547ea6426e4b1c06c 3 | size 361499542 4 | -------------------------------------------------------------------------------- /resnet-traffic-bottleneck-data.zip: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:75d3adee0af71ba85d92ef5029ab58ddf0f0230b0adbc401756a70500b877cc1 3 | size 282217560 4 | -------------------------------------------------------------------------------- /run_bottleneck.py: -------------------------------------------------------------------------------- 1 | from keras.applications.resnet50 import ResNet50, preprocess_input 2 | from keras.applications.inception_v3 import InceptionV3 3 | 
from keras.applications.vgg16 import VGG16 4 | from keras.layers import Input, AveragePooling2D 5 | from sklearn.model_selection import train_test_split 6 | from keras.models import Model 7 | from keras.datasets import cifar10 8 | import pickle 9 | import tensorflow as tf 10 | import keras.backend as K 11 | 12 | flags = tf.app.flags 13 | FLAGS = flags.FLAGS 14 | 15 | flags.DEFINE_string('dataset', 'cifar10', "Make bottleneck features for this dataset, one of 'cifar10' or 'traffic'") 16 | flags.DEFINE_string('network', 'resnet', "The model to bottleneck, one of 'vgg', 'inception', or 'resnet'") 17 | flags.DEFINE_integer('batch_size', 16, 'The batch size for the generator') 18 | 19 | batch_size = FLAGS.batch_size 20 | 21 | 22 | h, w, ch = 224, 224, 3 23 | if FLAGS.network == 'inception': 24 | h, w, ch = 299, 299, 3 25 | from keras.applications.inception_v3 import preprocess_input 26 | 27 | img_placeholder = tf.placeholder("uint8", (None, 32, 32, 3)) 28 | resize_op = tf.image.resize_images(img_placeholder, (h, w), method=0) 29 | 30 | 31 | def gen(session, data, labels, batch_size): 32 | def _f(): 33 | start = 0 34 | end = start + batch_size 35 | n = data.shape[0] 36 | 37 | while True: 38 | X_batch = session.run(resize_op, {img_placeholder: data[start:end]}) 39 | X_batch = preprocess_input(X_batch) 40 | y_batch = labels[start:end] 41 | start += batch_size 42 | end += batch_size 43 | if start >= n: 44 | start = 0 45 | end = batch_size 46 | 47 | print(start, end) 48 | yield (X_batch, y_batch) 49 | 50 | return _f 51 | 52 | 53 | def create_model(): 54 | input_tensor = Input(shape=(h, w, ch)) 55 | if FLAGS.network == 'vgg': 56 | model = VGG16(input_tensor=input_tensor, include_top=False) 57 | x = model.output 58 | x = AveragePooling2D((7, 7))(x) 59 | model = Model(model.input, x) 60 | elif FLAGS.network == 'inception': 61 | model = InceptionV3(input_tensor=input_tensor, include_top=False) 62 | x = model.output 63 | x = AveragePooling2D((8, 8), strides=(8, 8))(x) 64 | model 
= Model(model.input, x) 65 | else: 66 | model = ResNet50(input_tensor=input_tensor, include_top=False) 67 | return model 68 | 69 | 70 | def main(_): 71 | 72 | if FLAGS.dataset == 'cifar10': 73 | (X_train, y_train), (_, _) = cifar10.load_data() 74 | X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=0) 75 | else: 76 | with open('data/train.p', mode='rb') as f: 77 | train = pickle.load(f) 78 | X_train, X_val, y_train, y_val = train_test_split(train['features'], train['labels'], test_size=0.33, random_state=0) 79 | 80 | train_output_file = "{}_{}_{}.p".format(FLAGS.network, FLAGS.dataset, 'bottleneck_features_train') 81 | validation_output_file = "{}_{}_{}.p".format(FLAGS.network, FLAGS.dataset, 'bottleneck_features_validation') 82 | 83 | print("Resizing to", (w, h, ch)) 84 | print("Saving to ...") 85 | print(train_output_file) 86 | print(validation_output_file) 87 | 88 | with tf.Session() as sess: 89 | K.set_session(sess) 90 | K.set_learning_phase(1) 91 | 92 | model = create_model() 93 | 94 | print('Bottleneck training') 95 | train_gen = gen(sess, X_train, y_train, batch_size) 96 | bottleneck_features_train = model.predict_generator(train_gen(), X_train.shape[0]) 97 | data = {'features': bottleneck_features_train, 'labels': y_train} 98 | pickle.dump(data, open(train_output_file, 'wb')) 99 | 100 | print('Bottleneck validation') 101 | val_gen = gen(sess, X_val, y_val, batch_size) 102 | bottleneck_features_validation = model.predict_generator(val_gen(), X_val.shape[0]) 103 | data = {'features': bottleneck_features_validation, 'labels': y_val} 104 | pickle.dump(data, open(validation_output_file, 'wb')) 105 | 106 | if __name__ == '__main__': 107 | tf.app.run() 108 | -------------------------------------------------------------------------------- /shrink.py: -------------------------------------------------------------------------------- 1 | import pickle 2 | import tensorflow as tf 3 | from collections import Counter 4 | from 
sklearn.utils import shuffle 5 | 6 | flags = tf.app.flags 7 | FLAGS = flags.FLAGS 8 | 9 | # command line flags 10 | flags.DEFINE_string('training_file', '', "Bottleneck features training file (.p)") 11 | flags.DEFINE_string('output_file', '', "Name of the output file with reduced number of examples.") 12 | flags.DEFINE_integer('size', 100, 'Number of examples per class to keep') 13 | 14 | 15 | def main(_): 16 | # load bottleneck data 17 | with open(FLAGS.training_file, 'rb') as f: 18 | train_data = pickle.load(f) 19 | X_train = train_data['features'] 20 | y_train = train_data['labels'] 21 | 22 | print(X_train.shape, y_train.shape) 23 | 24 | X_train, y_train = shuffle(X_train, y_train, random_state=0) 25 | keep_indices = [] 26 | keep_counter = Counter() 27 | 28 | for i, label in enumerate(y_train.reshape(-1)): 29 | if keep_counter[label] < FLAGS.size: 30 | keep_counter[label] += 1 31 | keep_indices.append(i) 32 | 33 | X_train_small = X_train[keep_indices] 34 | y_train_small = y_train[keep_indices] 35 | 36 | print(X_train_small.shape, y_train_small.shape) 37 | 38 | print("Writing to {}".format(FLAGS.output_file)) 39 | data = {'features': X_train_small, 'labels': y_train_small} 40 | pickle.dump(data, open(FLAGS.output_file, 'wb')) 41 | 42 | # parses flags and calls the `main` function above 43 | if __name__ == '__main__': 44 | tf.app.run() 45 | -------------------------------------------------------------------------------- /vgg-cifar10-bottleneck-data.zip: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid sha256:e9b29d8074f0ddf6dceef37589017cc372a29d23187b194b841eb4a74f9fdeaf 3 | size 79337072 4 | -------------------------------------------------------------------------------- /vgg-traffic-bottleneck-data.zip: -------------------------------------------------------------------------------- 1 | version https://git-lfs.github.com/spec/v1 2 | oid 
sha256:656dcf9aafc9974eae1af09e628a77dcbc7dd814676e80f31dea3ba19870bfa3 3 | size 58281808 4 | --------------------------------------------------------------------------------
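The README describes bottleneck features as feature extraction with caching: because the base network is frozen, its output for a given image never changes, so it can be computed once and reused. The sketch below illustrates that idea with the standard library only. It is not the code used to build the zip archives in this repo. `frozen_extractor` is a hypothetical stand-in for a pretrained network, and `bottleneck_filename` and `load_or_compute` are illustrative helpers that do not exist in the repo. Only the file-name pattern and the `features`/`labels` dictionary keys are taken from the files above.

```python
import os
import pickle
import tempfile


def bottleneck_filename(network, dataset, split):
    """Build a name following the README pattern (hypothetical helper)."""
    return "{}_{}_bottleneck_features_{}.p".format(network, dataset, split)


def frozen_extractor(image):
    """Hypothetical stand-in for a frozen pretrained network.

    It is a pure function of its input, which is the property that makes
    caching safe: the same image always yields the same features.
    """
    return [sum(image) / len(image), max(image), min(image)]


def load_or_compute(path, images, labels):
    """Return (data, cache_hit). Compute and pickle the features only once."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f), True
    data = {
        "features": [frozen_extractor(img) for img in images],
        "labels": list(labels),
    }
    with open(path, "wb") as f:
        pickle.dump(data, f)
    return data, False


# Toy "images": flat lists of pixel values, with made-up labels.
images = [[0, 128, 255], [10, 20, 30]]
labels = [3, 7]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, bottleneck_filename("vgg", "traffic", "train"))
    first, first_hit = load_or_compute(path, images, labels)    # computes, then writes
    second, second_hit = load_or_compute(path, images, labels)  # reads the cached file

print(first_hit, second_hit, first == second)
```

The second call never touches the extractor, which is why training a small classifier on cached features is feasible on a modest CPU. The cache is only valid while the extractor stays frozen; if its weights changed, the pickled features would have to be regenerated.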