├── Data
│   ├── readme.md
│   └── deploy_changed_net.png
├── CAFFE_ARCH.png
├── Deep-Neural-Network-with-Caffe
│   ├── output_15_1.png
│   ├── output_22_1.png
│   ├── output_38_0.png
│   ├── output_40_0.png
│   ├── output_42_0.png
│   ├── output_44_0.png
│   ├── output_46_1.png
│   ├── output_48_2.png
│   ├── output_68_1.png
│   ├── output_69_1.png
│   ├── output_73_1.png
│   ├── output_76_1.png
│   ├── output_78_0.png
│   ├── output_78_1.png
│   ├── output_78_10.png
│   ├── output_78_11.png
│   ├── output_78_12.png
│   ├── output_78_13.png
│   ├── output_78_14.png
│   ├── output_78_15.png
│   ├── output_78_2.png
│   ├── output_78_3.png
│   ├── output_78_4.png
│   ├── output_78_5.png
│   ├── output_78_6.png
│   ├── output_78_7.png
│   ├── output_78_8.png
│   ├── output_78_9.png
│   ├── output_80_0.png
│   ├── output_80_1.png
│   ├── output_80_10.png
│   ├── output_80_11.png
│   ├── output_80_12.png
│   ├── output_80_13.png
│   ├── output_80_14.png
│   ├── output_80_15.png
│   ├── output_80_2.png
│   ├── output_80_3.png
│   ├── output_80_4.png
│   ├── output_80_5.png
│   ├── output_80_6.png
│   ├── output_80_7.png
│   ├── output_80_8.png
│   ├── output_80_9.png
│   ├── output_82_2.png
│   ├── readme.md
│   └── Deep Neural Network with Caffe.md
├── README.md
├── How to train in Caffe.md
├── Imagenet
│   └── How-to-properly-set-up-Imagenet-Dataset.md
├── Installation_Instructions
│   ├── Caffe-Installation-Script_or_Bare-Instructions.md
│   └── Caffe Installation Instructions.md
├── Caffe_Things_to_know.md
└── CaffeClassificationExample.ipynb
/Data/readme.md: -------------------------------------------------------------------------------- 1 | **Misc Data storage.** 2 | 3 | * Sample Images 4 | * Sample Codes 5 | -------------------------------------------------------------------------------- /CAFFE_ARCH.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/CAFFE_ARCH.png -------------------------------------------------------------------------------- /Data/deploy_changed_net.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Data/deploy_changed_net.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_15_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_15_1.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_22_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_22_1.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_38_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_38_0.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_40_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_40_0.png --------------------------------------------------------------------------------
/Deep-Neural-Network-with-Caffe/output_42_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_42_0.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_44_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_44_0.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_46_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_46_1.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_48_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_48_2.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_68_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_68_1.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_69_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_69_1.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_73_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_73_1.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_76_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_76_1.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_0.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_1.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_10.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_10.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_11.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_11.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_12.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_12.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_13.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_14.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_14.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_15.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_15.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_2.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_3.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_4.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_5.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_6.png -------------------------------------------------------------------------------- 
/Deep-Neural-Network-with-Caffe/output_78_7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_7.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_8.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_78_9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_78_9.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_0.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_1.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_10.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_10.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_11.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_11.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_12.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_12.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_13.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_13.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_14.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_14.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_15.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_15.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_2.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_3.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_4.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_5.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_6.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_7.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_8.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_8.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_80_9.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_80_9.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/output_82_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/arundasan91/Deep-Learning-with-Caffe/HEAD/Deep-Neural-Network-with-Caffe/output_82_2.png -------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/readme.md: -------------------------------------------------------------------------------- 1 | ## DNN with Caffe 2 | 3 | **EXAMPLES EXTRACTED FROM CAFFE'S OWN REPO, EDITED AND EXPLAINED.** 4 | 5 | 1. CLASSIFICATION EXAMPLE 6 | 2.
LENET TRAINING EXAMPLE 7 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Caffe 2 | My tests and experiments on Caffe, the deep learning framework by the Berkeley Vision and Learning Center (BVLC) and its contributors. 3 | 4 | Please start with the [Deep Neural Network with Caffe](https://github.com/arundasan91/Deep-Learning-in-Caffe/blob/master/Deep-Neural-Network-with-Caffe/Deep%20Neural%20Network%20with%20Caffe.md) tutorial. It is an in-depth explanation of Caffe's own notebook examples. 5 | More information on Caffe can be found in these files: 6 | 7 | 1. [How to train your own network in Caffe](https://github.com/arundasan91/Deep-Learning-in-Caffe/blob/master/How%20to%20train%20in%20Caffe.md) 8 | 9 | 2. [Things to know while training your first net](https://github.com/arundasan91/Deep-Learning-in-Caffe/blob/master/Caffe_Things_to_know.md) 10 | 11 | 3. [How to properly set up the ImageNet dataset for training the ImageNet model in Caffe](https://github.com/arundasan91/Deep-Learning-in-Caffe/blob/master/Imagenet/How-to-properly-set-up-Imagenet-Dataset.md) 12 | -------------------------------------------------------------------------------- /How to train in Caffe.md: -------------------------------------------------------------------------------- 1 | 2 | # How to train your own network in Caffe 3 | 4 | The main files required to train your network, apart from the dataset, are the model definition and the solver definition. These files are saved in Google's Protobuf format as .prototxt files, a human-readable format similar to YAML. 5 | 6 | The model definition file defines the architecture of your neural net: the layers and their descriptions are written there. The solver definition file is where you specify the learning rate, momentum, snapshot interval and a whole host of other key parameters required for training and testing your neural network. Please view the [Caffe: Things to know to train your network](https://github.com/arundasan91/Caffe/blob/master/Caffe_Things_to_know.md) file for more info. 7 | 8 | Data is as important as the algorithm and the model, and must be preprocessed into one of the formats recognized by Caffe. The LMDB format works well with Caffe, which also supports HDF5-formatted data and plain image files. If the model is to be trained on a data format Caffe does not recognize, you can write your own class for the respective type and include the proper layer. 9 | 10 | Once you have the data, ModelParameter and SolverParameter files, you can start training by going into the Caffe root directory and executing the following command: 11 | 12 | ``` 13 | ./build/tools/caffe train --solver=/path/to/solver.prototxt 14 | ``` 15 | 16 | You can do the same thing from your own Python code by importing the caffe library (a minimal sketch follows the summary below). You can even write functions that create the prototxt files for you according to the parameters you pass in. If you have Jupyter Notebook installed on your system, this becomes a fairly easy task. Also, many of Caffe's examples are provided in notebook format so that you can run them on your own system and learn on the go. 17 | 18 | There are other ways as well, which I am still learning/exploring. As of now, to sum up: 19 | 20 | 1. Define your network in a prototxt format, by writing it yourself or using Python code. 21 | 2. Define the solver parameters in a prototxt format. 22 | 3. Define, preprocess and ready your data. 23 | 4. Initialize training by specifying your two prototxt files.
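A minimal pycaffe sketch of step 4 — a hedged illustration, not part of the original instructions: the solver path is a placeholder, and it assumes `caffe` is already importable on your Python path.

```
import caffe

caffe.set_mode_cpu()  # or caffe.set_mode_gpu() on a CUDA-equipped machine

# load the solver definition (placeholder path)
solver = caffe.SGDSolver('/path/to/solver.prototxt')

# run the full training loop defined by max_iter in solver.prototxt
solver.solve()

# or step through training manually, e.g. 100 iterations at a time:
# solver.step(100)
```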
24 | 25 | Note that you can always resume your training from the snapshotted caffemodel files. To do this, you have to specify the solverstate file you want to use while training. The solverstate file is generated along with the caffemodel file when snapshotting the trained neural network. An example: 26 | 27 | ``` 28 | ./build/tools/caffe train --solver=/path/to/solver.prototxt --snapshot=/path/to/caffe_n_iter.solverstate 29 | ``` 30 | 31 | Learn from mistakes. 32 | Happy Coding! 33 | -------------------------------------------------------------------------------- /Imagenet/How-to-properly-set-up-Imagenet-Dataset.md: -------------------------------------------------------------------------------- 1 | # How to properly set up the ImageNet dataset for training in Caffe! 2 | 3 | Assuming that you have the training and testing compressed files downloaded, let's move ahead and untar them properly. 4 | Make folders for the training, testing and validation images. Make these wherever you want to store the dataset. If you are making these directories in a separate partition, make sure that it has enough space to accommodate the files. 5 | 6 | ``` 7 | mkdir path/to/new/train/images/folder 8 | mkdir path/to/new/test/images/folder 9 | mkdir path/to/new/val/images/folder 10 | ``` 11 | 12 | Once you have the directories ready, we will start extracting the files. 13 | Please note that extracting `ILSVRC2012_img_test.tar` and `ILSVRC2012_img_val.tar` will give you images with the `.JPEG` extension. However, extracting the `ILSVRC2012_img_train.tar` and `ILSVRC2012_img_train_t3.tar` files will give you many more `.tar` files, which you can extract according to your needs. 14 | 15 | To extract the tar files, simply do these steps: 16 | 17 | ``` 18 | tar -xf ILSVRC2012_img_test.tar -C path/to/new/test/images/folder 19 | tar -xf ILSVRC2012_img_val.tar -C path/to/new/val/images/folder 20 | tar -xf ILSVRC2012_img_train.tar -C path/to/new/train/images/folder 21 | tar -xf ILSVRC2012_img_train_t3.tar -C path/to/new/train_t3/images/folder 22 | ``` 23 | If you want to see which files are being extracted, use the `-v` flag along with `-xf`; that would be `tar -xvf file.tar -C <destination>`. 24 | 25 | Once you are done with the extraction, let us discuss the training images. Each `.tar` file in the training images folder corresponds to a different set of images (one synset). You can train your network with any number of images from any number of these subsets. Here, however, I will explain how to extract every `.tar` file in the train folder so that we can work with the entire dataset. 26 | 27 | To extract all the `.tar` files in the train folder, make a script file inside the train folder. The following script will extract all the `.tar` files into their respective directories and will remove the `.tar` files after extraction completes. If you do not want to remove the compressed files, comment out the `rm` line. 28 | 29 | Make sure you are in the train folder. 30 | ``` 31 | cd path/to/new/train/images/folder 32 | ``` 33 | 34 | Copy the following and paste it into a file (e.g. `tar_extract_script.sh`). 35 | 36 | ``` 37 | #!/bin/bash 38 | for f in *.tar; do 39 | d=`basename $f .tar` 40 | mkdir $d 41 | (cd $d && tar xf ../$f) 42 | done 43 | rm *.tar # comment out this line to keep the compressed files 44 | ``` 45 | 46 | Give it executable permissions. 47 | ``` 48 | chmod 700 tar_extract_script.sh 49 | ``` 50 | 51 | Run the script. 52 | ``` 53 | ./tar_extract_script.sh 54 | ``` 55 | 56 | It will take a long time to extract all the files. If you want to see which files are being extracted, use `tar xvf` instead of `tar xf` in the script.
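As a quick sanity check once extraction finishes, a short Python sketch (using the same placeholder path as above) can count the extracted synset folders and images; for the full ILSVRC2012 training set you should end up with 1000 synsets and roughly 1.28 million images:

```
import os

train_dir = 'path/to/new/train/images/folder'   # same placeholder path as above

# every extracted tar becomes one synset directory full of JPEGs
synsets = [d for d in os.listdir(train_dir)
           if os.path.isdir(os.path.join(train_dir, d))]
n_images = sum(len(os.listdir(os.path.join(train_dir, d))) for d in synsets)

print '%d synsets, %d images extracted' % (len(synsets), n_images)
```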
57 | 58 | Let's do some deep learning now! 59 | -------------------------------------------------------------------------------- /Installation_Instructions/Caffe-Installation-Script_or_Bare-Instructions.md: -------------------------------------------------------------------------------- 1 | ## Caffe Installation Script/Bare Instructions 2 | ### Go one step at a time, understand the script and then make it your own. 3 | #### Tested on Ubuntu 14.04 4 | #### Full explanations are available in my [Caffe Installation Tutorial](../master/Caffe Installation Instructions.md). 5 | 6 | ##### Add the required repositories, then update and upgrade the package database. 7 | ``` 8 | echo 'deb http://archive.ubuntu.com/ubuntu trusty main restricted universe multiverse' >>/tmp/multiverse.list 9 | sudo cp /tmp/multiverse.list /etc/apt/sources.list.d/ 10 | rm /tmp/multiverse.list 11 | sudo apt-get -y install software-properties-common 12 | sudo add-apt-repository ppa:mc3man/trusty-media 13 | sudo apt-get update && sudo apt-get upgrade && sudo apt-get dist-upgrade && sudo apt-get autoremove 14 | ``` 15 | ##### Remove older versions and install new versions of the relevant packages. 16 | ``` 17 | sudo apt-get -y remove ffmpeg x264 libx264-dev 18 | ``` 19 | ``` 20 | sudo apt-get -y install libopenblas-dev libboost-all-dev libfaac-dev ffmpeg gstreamer0.10-ffmpeg build-essential cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev unzip python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev libtiff4-dev libopenexr-dev libeigen2-dev yasm libopencore-amrnb-dev libtheora-dev libvorbis-dev libxvidcore-dev python-tk libeigen3-dev libx264-dev libqt4-dev libqt4-opengl-dev sphinx-common texlive-latex-extra libv4l-dev default-jdk ant libvtk5-qt4-dev python-pip python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler python-pydot zip 21 | ``` 22 | ``` 23 | sudo apt-get -y install --fix-missing libboost-all-dev 24 | ``` 25 | ##### Install OpenCV 26 | ``` 27 | cd ~ 28 | wget http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.4.9/opencv-2.4.9.zip 29 | unzip opencv-2.4.9.zip 30 | cd opencv-2.4.9 31 | mkdir build 32 | cd build 33 | cmake -D WITH_TBB=ON -D BUILD_NEW_PYTHON_SUPPORT=ON -D WITH_V4L=ON -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON -D BUILD_EXAMPLES=ON -D WITH_QT=ON -D WITH_OPENGL=ON -D WITH_VTK=ON ..
34 | sudo make -j10 35 | sudo make install 36 | sudo sh -c 'echo "/usr/local/lib" > /etc/ld.so.conf.d/opencv.conf' 37 | sudo ldconfig 38 | ``` 39 | **Export the following in your .bashrc file** 40 | ``` 41 | export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/opencv/lib 42 | ``` 43 | ##### Install Cython 44 | ``` 45 | cd ~ 46 | wget https://pypi.python.org/packages/0a/b6/fd142319fd0fe83dc266cfe83bfd095bc200fae5190fce0a2482560acb55/Cython-0.23.4.zip#md5=84c8c764ffbeae5f4a513d25fda4fd5e 47 | unzip Cython-0.23.4.zip 48 | cd Cython-0.23.4 49 | sudo python setup.py install 50 | sudo pip install protobuf scikit-image scikit-learn 51 | ``` 52 | ##### Make/Install Caffe 53 | ``` 54 | cd ~ 55 | git clone https://github.com/BVLC/caffe 56 | cd caffe 57 | cp Makefile.config.example Makefile.config 58 | ``` 59 | **Make the necessary changes in Makefile.config (e.g., uncomment `CPU_ONLY := 1` for a CPU-only build)** 60 | ``` 61 | cd ~/caffe/python 62 | sudo pip install -r requirements.txt 63 | cd ~/caffe 64 | sudo make all -j10 65 | sudo make pycaffe 66 | sudo make distribute 67 | sudo make test 68 | sudo make runtest 69 | ``` 70 | **Update .bashrc with the following (replace `<username>` with your own user name)** 71 | ``` 72 | #Caffe Root 73 | 74 | export CAFFE_ROOT=/home/<username>/caffe/ 75 | export PYTHONPATH=/home/<username>/caffe/distribute/python:$PYTHONPATH 76 | export PYTHONPATH=/home/<username>/caffe/python:$PYTHONPATH 77 | ``` 78 | **REBOOT the system** 79 | ``` 80 | sudo reboot 81 | ``` 82 | -------------------------------------------------------------------------------- /Caffe_Things_to_know.md: -------------------------------------------------------------------------------- 1 | # Caffe: Things to know to train your network 2 | 1. **Data Ingestion and Preprocessing** 3 | * **Data ingestion formats** 4 | * LevelDB or LMDB database. 5 | * Data in memory (C++ or Python). 6 | * HDF5-formatted data. 7 | * Image files. 8 | * **Preprocessing Tools** 9 | `~ Can be found at $CAFFE_ROOT/tools` 10 | * LevelDB or LMDB creation from raw images. `($CAFFE_ROOT/tools/convert_imageset.cpp)` 11 | * Training and validation set creation with shuffling algorithms. 12 | * Mean image generation. 13 | * **Data Transformation Tools** 14 | * Image cropping, scaling, resizing and mirroring. 15 | * Mean subtraction. 16 | 17 | 2. **Model Definition** 18 | * Defined in a prototxt format. 19 | * Prototxt is a human-readable format developed by Google. 20 | * It auto-generates and checks the corresponding Caffe code. 21 | * Used to define Caffe's network architecture and training parameters. 22 | * Can be written by hand without any coding. 23 | * Can also be written as a Python function, which auto-generates the prototxt file for you.
24 | * Example Model Definition prototxt file: 25 | ``` 26 | name: "CaffeNet" 27 | layer { 28 | name: "data" 29 | type: "Input" 30 | top: "data" 31 | input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } } 32 | } 33 | layer { 34 | name: "conv1" 35 | type: "Convolution" 36 | bottom: "data" 37 | top: "conv1" 38 | convolution_param { 39 | num_output: 96 40 | kernel_size: 11 41 | stride: 4 42 | } 43 | } 44 | layer { 45 | name: "relu1" 46 | type: "ReLU" 47 | bottom: "conv1" 48 | top: "conv1" 49 | } 50 | layer { 51 | name: "pool1" 52 | type: "Pooling" 53 | bottom: "conv1" 54 | top: "pool1" 55 | pooling_param { 56 | pool: MAX 57 | kernel_size: 3 58 | stride: 2 59 | } 60 | } 61 | layer { 62 | name: "norm1" 63 | type: "LRN" 64 | bottom: "pool1" 65 | top: "norm1" 66 | lrn_param { 67 | local_size: 5 68 | alpha: 0.0001 69 | beta: 0.75 70 | } 71 | } 72 | 73 | # add more layers by simply mentioning them in the same format 74 | 75 | layer { 76 | name: "conv2" 77 | type: "Convolution" 78 | bottom: "norm1" 79 | top: "conv2" 80 | convolution_param { 81 | num_output: 256 82 | pad: 2 83 | kernel_size: 5 84 | group: 2 85 | } 86 | } 87 | layer { 88 | name: "relu2" 89 | type: "ReLU" 90 | bottom: "conv2" 91 | top: "conv2" 92 | } 93 | layer { 94 | name: "pool2" 95 | type: "Pooling" 96 | bottom: "conv2" 97 | top: "pool2" 98 | pooling_param { 99 | pool: MAX 100 | kernel_size: 3 101 | stride: 2 102 | } 103 | } 104 | layer { 105 | name: "norm2" 106 | type: "LRN" 107 | bottom: "pool2" 108 | top: "norm2" 109 | lrn_param { 110 | local_size: 5 111 | alpha: 0.0001 112 | beta: 0.75 113 | } 114 | } 115 | 116 | # end the prototxt file after mentioning all the layers 117 | 118 | layer { 119 | name: "fc6" 120 | type: "InnerProduct" 121 | bottom: "pool5" 122 | top: "fc6" 123 | inner_product_param { 124 | num_output: 4096 125 | } 126 | } 127 | layer { 128 | name: "relu6" 129 | type: "ReLU" 130 | bottom: "fc6" 131 | top: "fc6" 132 | } 133 | layer { 134 | name: "drop6" 135 | type: "Dropout" 136 | bottom: "fc6" 137 | top: "fc6" 138 | dropout_param { 139 | dropout_ratio: 0.5 140 | } 141 | } 142 | layer { 143 | name: "fc7" 144 | type: "InnerProduct" 145 | bottom: "fc6" 146 | top: "fc7" 147 | inner_product_param { 148 | num_output: 4096 149 | } 150 | } 151 | layer { 152 | name: "relu7" 153 | type: "ReLU" 154 | bottom: "fc7" 155 | top: "fc7" 156 | } 157 | layer { 158 | name: "drop7" 159 | type: "Dropout" 160 | bottom: "fc7" 161 | top: "fc7" 162 | dropout_param { 163 | dropout_ratio: 0.5 164 | } 165 | } 166 | layer { 167 | name: "fc8" 168 | type: "InnerProduct" 169 | bottom: "fc7" 170 | top: "fc8" 171 | inner_product_param { 172 | num_output: 1000 173 | } 174 | } 175 | layer { 176 | name: "prob" 177 | type: "Softmax" 178 | bottom: "fc8" 179 | top: "prob" 180 | } 181 | ``` 182 | * Example Python code to define the model 183 | 184 | ``` 185 | import caffe 186 | from caffe import layers as L, params as P 187 | def lenet(lmdb, batch_size): 188 | # our version of LeNet: a series of linear and simple nonlinear transformations 189 | n = caffe.NetSpec() 190 | 191 | n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb, 192 | transform_param=dict(scale=1./255), ntop=2) 193 | 194 | n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier')) 195 | n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX) 196 | n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier')) 197 | n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2,
pool=P.Pooling.MAX) 198 | n.fc1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier')) 199 | n.relu1 = L.ReLU(n.fc1, in_place=True) 200 | n.score = L.InnerProduct(n.relu1, num_output=10, weight_filler=dict(type='xavier')) 201 | n.loss = L.SoftmaxWithLoss(n.score, n.label) 202 | 203 | return n.to_proto() 204 | ``` 205 | 206 | 2.1 **Different functions and layers** 207 | * **Loss Functions** 208 | * Classification 209 | * Softmax 210 | * Hinge loss 211 | * Linear Regression 212 | * Euclidean loss 213 | * Attributes/Multi-label classification 214 | * Sigmoid cross-entropy loss 215 | * **Layers** 216 | * Convolution 217 | * Pooling 218 | * Normalization 219 | * **Activation Functions** 220 | * ReLU 221 | * Sigmoid 222 | * Tanh 223 | 224 | 3. **Network Training - Solver Files** 225 | * Prototxt file that lists the parameters of the neural net's training algorithm 226 | * Example solver file: 227 | 228 | ``` 229 | # The train/test net protocol buffer definition 230 | train_net: "mnist/lenet_auto_train.prototxt" 231 | test_net: "mnist/lenet_auto_test.prototxt" 232 | # test_iter specifies how many forward passes the test should carry out. 233 | # In the case of MNIST, we have test batch size 100 and 100 test iterations, 234 | # covering the full 10,000 testing images. 235 | test_iter: 100 236 | # Carry out testing every 500 training iterations. 237 | test_interval: 500 238 | # The base learning rate, momentum and the weight decay of the network. 239 | base_lr: 0.01 240 | momentum: 0.9 241 | weight_decay: 0.0005 242 | # The learning rate policy 243 | lr_policy: "inv" 244 | gamma: 0.0001 245 | power: 0.75 246 | # Display every 100 iterations 247 | display: 100 248 | # The maximum number of iterations 249 | max_iter: 10000 250 | # snapshot intermediate results 251 | snapshot: 5000 252 | snapshot_prefix: "mnist/lenet" 253 | ``` 254 | 4. Optimization Algorithms 255 | * SGD + momentum 256 | * ADAGRAD 257 | * NAG 258 | 259 | 5. After training 260 | * Caffe training produces a binary file with the extension *.caffemodel*. 261 | * This is a machine-readable file, generally a few hundred megabytes. 262 | * This model can be reused for further training and can be shared as well. 263 | * The caffemodel file can be used to deploy the trained neural net. 264 | * Integrate the model into the data pipeline using the Caffe command line or Matlab/Python. 265 | * Deploy the model across hardware or OS environments with Caffe installed. 266 | 267 | 6. Share to ModelZoo 268 | * ModelZoo is a project/model sharing community. 269 | * Normally, when sharing a caffemodel, the following should be present: 270 | * Solver 271 | * Model prototxt files 272 | * A readme.md file which describes: 273 | * The Caffe version. 274 | * The URL and SHA1 of the *.caffemodel* file. 275 | * The license. 276 | * A description of the training data. 277 | 278 | ### Caffe: Extensible Code 279 | Caffe's inbuilt data types, layer types or loss functions may not always fit our neural network architecture. In those cases, we can write our own data type, layer type or loss function as a specific Python or C++ class. This way we can extend Caffe's capabilities to meet our needs. The new layer or loss function class can then be used in the required prototxt file, as illustrated below. 280 |
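As a concrete illustration of such an extension — a hedged sketch adapted from Caffe's own `examples/pyloss.py`, so the layer API is the real one, but treat the module and file names as placeholders — here is the Euclidean loss re-implemented as a Python layer:

```
import caffe
import numpy as np

class EuclideanLossLayer(caffe.Layer):
    """Euclidean loss as a Python layer (adapted from Caffe's examples/pyloss.py)."""

    def setup(self, bottom, top):
        # this loss compares exactly two input blobs
        if len(bottom) != 2:
            raise Exception("Need two inputs to compute distance.")

    def reshape(self, bottom, top):
        if bottom[0].count != bottom[1].count:
            raise Exception("Inputs must have the same dimension.")
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        top[0].reshape(1)  # the loss output is a scalar

    def forward(self, bottom, top):
        self.diff[...] = bottom[0].data - bottom[1].data
        top[0].data[...] = np.sum(self.diff ** 2) / bottom[0].num / 2.

    def backward(self, top, propagate_down, bottom):
        for i in range(2):
            if not propagate_down[i]:
                continue
            sign = 1 if i == 0 else -1
            bottom[i].diff[...] = sign * self.diff / bottom[i].num
```

Assuming Caffe was built with `WITH_PYTHON_LAYER := 1` and the file above (say `pyloss.py`) is on your PYTHONPATH, the layer is referenced from a prototxt like this:

```
layer {
  name: "loss"
  type: "Python"
  bottom: "score"
  bottom: "label"
  top: "loss"
  python_param {
    module: "pyloss"             # the .py file above
    layer: "EuclideanLossLayer"  # the class name
  }
  loss_weight: 1
}
```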
281 | ### Reference 282 | 1. nVIDIA QwikLab sessions, which can be found [here](https://nvidia.qwiklab.com/). 283 | 2. Caffe's examples in $CAFFE_ROOT/examples/ 284 | -------------------------------------------------------------------------------- /CaffeClassificationExample.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# Caffe Classification Example\n", 8 | "### Extracted and enhanced from [Caffe's Documentation](https://github.com/BVLC/caffe/blob/master/examples/00-classification.ipynb)\n", 9 | "\n", 10 | "\n", 11 | "Caffe's [Github repo provides examples](https://github.com/BVLC/caffe/tree/master/examples) on Classification using Caffe.\n", 12 | "This notebook walks through Caffe's native tutorial; further, my own code and enhancements are included to better understand the workings of Caffe." 13 | ] 14 | }, 15 | { 16 | "cell_type": "markdown", 17 | "metadata": {}, 18 | "source": [ 19 | "First let us import some libraries required to visualize the trained neural net. These include NumPy for numerical routines and for handling images as arrays, and Matplotlib for plotting figures and graphs.\n", 20 | "We also tune the plot parameters as mentioned below." 21 | ] 22 | }, 23 | { 24 | "cell_type": "code", 25 | "execution_count": null, 26 | "metadata": { 27 | "collapsed": true 28 | }, 29 | "outputs": [], 30 | "source": [ 31 | "# set up Python environment: numpy for numerical routines, and matplotlib for plotting\n", 32 | "import numpy as np\n", 33 | "import matplotlib.pyplot as plt\n", 34 | "# display plots in this notebook\n", 35 | "%matplotlib inline\n", 36 | "\n", 37 | "# set display defaults\n", 38 | "# these are for the matplotlib figures.\n", 39 | "plt.rcParams['figure.figsize'] = (10, 10) # large images\n", 40 | "plt.rcParams['image.interpolation'] = 'nearest' # don't interpolate: show square pixels\n", 41 | "plt.rcParams['image.cmap'] = 'gray' # use grayscale output rather than a (potentially misleading) color heatmap" 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "metadata": {}, 47 | "source": [ 48 | "Caffe's examples are hosted in the examples directory inside its root directory. We should run the following Python code from the examples directory to make sure that the scripts work. We include the *sys* and *os* modules to work with file paths and the working directory. Caffe's python folder must be added to the Python path." 49 | ] 50 | }, 51 | { 52 | "cell_type": "code", 53 | "execution_count": null, 54 | "metadata": { 55 | "collapsed": false 56 | }, 57 | "outputs": [], 58 | "source": [ 59 | "# The caffe module needs to be on the Python path;\n", 60 | "import sys\n", 61 | "import os\n", 62 | "\n", 63 | "# set the working directory to Caffe's examples directory.\n", 64 | "os.chdir('/root/caffe/examples/')\n", 65 | "\n", 66 | "caffe_root = '../'\n", 67 | "#caffe_root = '/root/caffe/' # The caffe_root is changed to reflect the actual folder in the server.\n", 68 | "sys.path.insert(0, caffe_root + 'python') # Correct the python path\n", 69 | "\n", 70 | "import caffe\n", 71 | "# Successfully imported Caffe !\n" 72 | ] 73 | }, 74 | { 75 | "cell_type": "markdown", 76 | "metadata": {}, 77 | "source": [ 78 | "Training a model is computationally expensive and time-consuming. For this example, let us stick to a pre-trained network bundled with Caffe. We will search for the caffemodel and start from there. The caffemodel is the trained model.
During the training phase, at set intervals of iterations, Caffe saves a caffemodel file which captures the state of the net at that particular time. For example, if we have a total of 10000 iterations to perform and we explicitly mention that we need to save the state of the net at intervals of 2000 iterations, Caffe will generate 5 caffemodel files and 5 solverstate files, which save the respective state of the net at iterations 2000, 4000, 6000, 8000 and 10000." 79 | ] 80 | }, 81 | { 82 | "cell_type": "code", 83 | "execution_count": null, 84 | "metadata": { 85 | "collapsed": false 86 | }, 87 | "outputs": [], 88 | "source": [ 89 | "# Print out the current working directory.\n", 90 | "print os.getcwd()\n", 91 | "\n", 92 | "# If the reference model is not found, download it from the Caffe repo.\n", 93 | "if os.path.isfile(caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'):\n", 94 | "    print 'CaffeNet found.'\n", 95 | "else:\n", 96 | "    print 'Downloading pre-trained CaffeNet model...'\n", 97 | "    !../scripts/download_model_binary.py ../models/bvlc_reference_caffenet" 98 | ] 99 | }, 100 | { 101 | "cell_type": "markdown", 102 | "metadata": {}, 103 | "source": [ 104 | "Since we explicitly fixed the working directory, it is not required to have the notebook in the same directory as the example. We define the model definitions and weights of the pre-trained network by including the correct paths. The neural net is defined using the definitions and weights saved earlier. Since the network was already trained on a huge dataset, we can choose Caffe's test mode and not perform dropout while defining the net. More info on dropout [here](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf). " 105 | ] 106 | }, 107 | { 108 | "cell_type": "code", 109 | "execution_count": null, 110 | "metadata": { 111 | "collapsed": false 112 | }, 113 | "outputs": [], 114 | "source": [ 115 | "# set Caffe mode as CPU-only. This is done because the OCI servers are not equipped with GPUs yet.\n", 116 | "caffe.set_mode_cpu()\n", 117 | "\n", 118 | "# set the model definitions since we are using a pretrained network here.\n", 119 | "# these prototype definitions can be changed to make significant changes to the learning method.\n", 120 | "model_def = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'\n", 121 | "model_weights = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'\n", 122 | "\n", 123 | "net = caffe.Net(model_def, # defines the structure of the model\n", 124 | " model_weights, # contains the trained weights\n", 125 | " caffe.TEST) # use test mode (e.g., don't perform dropout)" 126 | ] 127 | }, 128 | { 129 | "cell_type": "markdown", 130 | "metadata": {}, 131 | "source": [ 132 | "Now we can create a transformer to feed data into our net. The input data here are images. The mean of the images in the dataset under consideration is to be set in the transformer. Mean subtraction is a way of preprocessing the image: the mean is subtracted across every individual feature in the dataset. This can be interpreted as centering a cloud of data around the origin along every dimension. With our input data being images, this amounts to subtracting the mean from each of the pixels, separately across the three channels. More on it [here](http://cs231n.github.io/neural-networks-2/).\n", 133 | "\n", 134 | "These are the steps usually carried out in each transformer:\n", 135 | "1. 
Transpose the data from (height, width, channels) to (channels, height, width)\n", 136 | "2. Swap the color channels from RGB to BGR\n", 137 | "3. Subtract the mean pixel value of the training dataset (unless you disable that feature).\n", 138 | "\n", 139 | "More information on these [here](https://groups.google.com/forum/#!topic/digits-users/FIh6VyU1XqQ), [here](https://github.com/NVIDIA/DIGITS/issues/59) and [here](https://github.com/NVIDIA/DIGITS/blob/v1.1.0/digits/model/tasks/caffe_train.py#L938-L961)." 140 | ] 141 | }, 142 | { 143 | "cell_type": "code", 144 | "execution_count": null, 145 | "metadata": { 146 | "collapsed": false 147 | }, 148 | "outputs": [], 149 | "source": [ 150 | "# load the mean ImageNet image (as distributed with Caffe) for subtraction\n", 151 | "mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')\n", 152 | "mu = mu.mean(1).mean(1) # average over pixels to obtain the mean (BGR) pixel values\n", 153 | "print 'mean-subtracted values:', zip('BGR', mu)\n", 154 | "\n", 155 | "# create transformer for the input called 'data'\n", 156 | "transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})\n", 157 | "\n", 158 | "transformer.set_transpose('data', (2,0,1)) # move image channels to outermost dimension\n", 159 | "transformer.set_mean('data', mu) # subtract the dataset-mean value in each channel\n", 160 | "transformer.set_raw_scale('data', 255) # rescale from [0, 1] to [0, 255]\n", 161 | "transformer.set_channel_swap('data', (2,1,0)) # swap channels from RGB to BGR" 162 | ] 163 | }, 164 | { 165 | "cell_type": "markdown", 166 | "metadata": {}, 167 | "source": [ 168 | "If needed, we can reshape the data to meet our specifications. In this particular example the batch size, number of channels and image size are explicitly specified below." 169 | ] 170 | }, 171 | { 172 | "cell_type": "code", 173 | "execution_count": null, 174 | "metadata": { 175 | "collapsed": true 176 | }, 177 | "outputs": [], 178 | "source": [ 179 | "# set the size of the input (we can skip this if we're happy\n", 180 | "# with the default; we can also change it later, e.g., for different batch sizes)\n", 181 | "net.blobs['data'].reshape(50, # batch size\n", 182 | " 3, # 3-channel (BGR) images\n", 183 | " 227, 227) # image size is 227x227" 184 | ] 185 | }, 186 | { 187 | "cell_type": "markdown", 188 | "metadata": {}, 189 | "source": [ 190 | "Any image can now be loaded into Caffe. For simplicity let us stick with Caffe's example image ($CAFFE_ROOT/examples/images/cat.jpg). The image is then transformed as mentioned above using the transformer that we defined. Finally, the image is plotted using matplotlib.pyplot, imported as plt." 191 | ] 192 | }, 193 | { 194 | "cell_type": "code", 195 | "execution_count": null, 196 | "metadata": { 197 | "collapsed": false 198 | }, 199 | "outputs": [], 200 | "source": [ 201 | "image = caffe.io.load_image(caffe_root + 'examples/images/cat.jpg')\n", 202 | "transformed_image = transformer.preprocess('data', image)\n", 203 | "plt.imshow(image)" 204 | ] 205 | }, 206 | { 207 | "cell_type": "markdown", 208 | "metadata": {}, 209 | "source": [ 210 | "Great! Now our net is ready, and so is the image that we need to classify. Remember that data in Caffe is interpreted using blobs. 
Quoting from [Caffe's Documentation](http://caffe.berkeleyvision.org/tutorial/net_layer_blob.html): *As data and derivatives flow through the network in the forward and backward passes Caffe stores, communicates, and manipulates the information as blobs: the blob is the standard array and unified memory interface for the framework.* \n", 211 | "\n", 212 | "For Caffe to get information from the image, it needs to be copied into the memory allocated by Caffe. \n", 213 | "\n", 214 | "Once the image is loaded into memory, we can perform classification with it. To start classification, we call ***net.forward()*** and redirect its output to a variable named `output` (the name can be anything, obviously). The output probabilities are saved in vector form. Since we set a batch size of 50, there will be 50 input images at once; the probabilities for our image are saved at location **[0]**. The output probability can be extracted by indexing properly. Finally, the predicted class of the image can be extracted using argmax, which returns the indices of the maximum values along an axis. " 215 | ] 216 | }, 217 | { 218 | "cell_type": "code", 219 | "execution_count": null, 220 | "metadata": { 221 | "collapsed": false 222 | }, 223 | "outputs": [], 224 | "source": [ 225 | "# copy the image data into the memory allocated for the net\n", 226 | "net.blobs['data'].data[...] = transformed_image\n", 227 | "\n", 228 | "### perform classification\n", 229 | "output = net.forward()\n", 230 | "\n", 231 | "output_prob = output['prob'][0] # the output probability vector for the first image in the batch\n", 232 | "\n", 233 | "print 'predicted class is:', output_prob.argmax()" 234 | ] 235 | }, 236 | { 237 | "cell_type": "markdown", 238 | "metadata": {}, 239 | "source": [ 240 | "In our case, for the cute cat, the predicted class is 281. Make sure that you are getting the same (just in case).\n", 241 | "\n", 242 | "To our eyes the image is a cute cat, agreed. To see what our net thinks it is, let us fetch the label of the predicted/classified image. Load the labels file from the dataset and output the specific label." 243 | ] 244 | }, 245 | { 246 | "cell_type": "code", 247 | "execution_count": null, 248 | "metadata": { 249 | "collapsed": false 250 | }, 251 | "outputs": [], 252 | "source": [ 253 | "# load ImageNet labels\n", 254 | "\n", 255 | "labels_file = caffe_root + 'data/ilsvrc12/synset_words.txt'\n", 256 | "\n", 257 | "if not os.path.exists(labels_file):\n", 258 | "    !/root/caffe/data/ilsvrc12/get_ilsvrc_aux.sh\n", 259 | "    \n", 260 | "labels = np.loadtxt(labels_file, str, delimiter='\\t')\n", 261 | "\n", 262 | "print 'output label:', labels[output_prob.argmax()]" 263 | ] 264 | }, 265 | { 266 | "cell_type": "markdown", 267 | "metadata": {}, 268 | "source": [ 269 | "What do you think about the prediction? Fair? Let us see a quantitative result. We will output the top five predictions from the output layer (softmax layer)."
270 | ] 271 | }, 272 | { 273 | "cell_type": "code", 274 | "execution_count": null, 275 | "metadata": { 276 | "collapsed": false 277 | }, 278 | "outputs": [], 279 | "source": [ 280 | "# sort top five predictions from softmax output\n", 281 | "top_inds = output_prob.argsort()[::-1][:5] # reverse sort and take five largest items\n", 282 | "\n", 283 | "print 'probabilities and labels:'\n", 284 | "zip(output_prob[top_inds], labels[top_inds])" 285 | ] 286 | }, 287 | { 288 | "cell_type": "markdown", 289 | "metadata": {}, 290 | "source": [ 291 | "To find the time a forward pass (i.e., one classification) of the network takes for this particular input, let us use the timeit magic." 292 | ] 293 | }, 294 | { 295 | "cell_type": "code", 296 | "execution_count": null, 297 | "metadata": { 298 | "collapsed": false 299 | }, 300 | "outputs": [], 301 | "source": [ 302 | "# time a single forward pass through the network\n", 303 | "%timeit net.forward()" 304 | ] 305 | }, 306 | { 307 | "cell_type": "markdown", 308 | "metadata": {}, 309 | "source": [ 310 | "**blob.data.shape** can be used to find the output shape of each layer in your net. Loop across net.blobs to get the shape of every layer." 311 | ] 312 | }, 313 | { 314 | "cell_type": "code", 315 | "execution_count": null, 316 | "metadata": { 317 | "collapsed": false 318 | }, 319 | "outputs": [], 320 | "source": [ 321 | "# for each layer, show the output shape\n", 322 | "for layer_name, blob in net.blobs.iteritems():\n", 323 | "    print layer_name + '\\t' + str(blob.data.shape)" 324 | ] 325 | }, 326 | { 327 | "cell_type": "code", 328 | "execution_count": null, 329 | "metadata": { 330 | "collapsed": false 331 | }, 332 | "outputs": [], 333 | "source": [ 334 | "for layer_name, param in net.params.iteritems():\n", 335 | "    print layer_name + '\\t' + str(param[0].data.shape), str(param[1].data.shape)" 336 | ] 337 | }, 338 | { 339 | "cell_type": "markdown", 340 | "metadata": {}, 341 | "source": [ 342 | "The net can be viewed diagrammatically by exporting it to an image file. This can be done with a script available in the /python folder in Caffe's root directory." 343 | ] 344 | }, 345 | { 346 | "cell_type": "code", 347 | "execution_count": null, 348 | "metadata": { 349 | "collapsed": false 350 | }, 351 | "outputs": [], 352 | "source": [ 353 | "! $CAFFE_ROOT/python/draw_net.py $CAFFE_ROOT/models/bvlc_reference_caffenet/deploy.prototxt caffenet.png" 354 | ] 355 | }, 356 | { 357 | "cell_type": "code", 358 | "execution_count": null, 359 | "metadata": { 360 | "collapsed": true 361 | }, 362 | "outputs": [], 363 | "source": [ 364 | "def vis_square(data):\n", 365 | "    \"\"\"Take an array of shape (n, height, width) or (n, height, width, 3)\n", 366 | "    and visualize each (height, width) thing in a grid of size approx. 
sqrt(n) by sqrt(n)\"\"\"\n", 367 | "    \n", 368 | "    # normalize data for display\n", 369 | "    data = (data - data.min()) / (data.max() - data.min())\n", 370 | "    \n", 371 | "    # force the number of filters to be square\n", 372 | "    n = int(np.ceil(np.sqrt(data.shape[0])))\n", 373 | "    padding = (((0, n ** 2 - data.shape[0]),\n", 374 | "               (0, 1), (0, 1)) # add some space between filters\n", 375 | "               + ((0, 0),) * (data.ndim - 3)) # don't pad the last dimension (if there is one)\n", 376 | "    data = np.pad(data, padding, mode='constant', constant_values=1) # pad with ones (white)\n", 377 | "    \n", 378 | "    # tile the filters into an image\n", 379 | "    data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1)))\n", 380 | "    data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:])\n", 381 | "    \n", 382 | "    plt.imshow(data); plt.axis('off')" 383 | ] 384 | }, 385 | { 386 | "cell_type": "markdown", 387 | "metadata": {}, 388 | "source": [ 389 | "We can pull out the weights and biases of each layer to visualize the changes happening in it. This is a powerful way of analyzing the net, as it gives intuition into what is happening inside. The image drawn above (the layer-by-layer representation), together with the visualization of what happens in each layer, will help us understand the net in more depth. Here we use the **conv1** layer for the same." 390 | ] 391 | }, 392 | { 393 | "cell_type": "code", 394 | "execution_count": null, 395 | "metadata": { 396 | "collapsed": false 397 | }, 398 | "outputs": [], 399 | "source": [ 400 | "# the parameters are a list of [weights, biases]\n", 401 | "filters = net.params['conv1'][0].data\n", 402 | "filters.shape" 403 | ] 404 | }, 405 | { 406 | "cell_type": "markdown", 407 | "metadata": {}, 408 | "source": [ 409 | "Notice that the shape of the filters differs from what the vis_square function expects, so we need to transpose the array accordingly before passing it into the function to visualize the layers." 410 | ] 411 | }, 412 | { 413 | "cell_type": "code", 414 | "execution_count": null, 415 | "metadata": { 416 | "collapsed": false 417 | }, 418 | "outputs": [], 419 | "source": [ 420 | "vis_square(filters.transpose(0, 2, 3, 1))" 421 | ] 422 | }, 423 | { 424 | "cell_type": "markdown", 425 | "metadata": {}, 426 | "source": [ 427 | "To visualize the data itself, we can use net.blobs instead of net.params. This gives us a visual clue about what the data looks like at that stage of the net. We do it here for the **conv1** layer." 428 | ] 429 | }, 430 | { 431 | "cell_type": "code", 432 | "execution_count": null, 433 | "metadata": { 434 | "collapsed": false 435 | }, 436 | "outputs": [], 437 | "source": [ 438 | "feat = net.blobs['conv1'].data[0, :36]\n", 439 | "vis_square(feat)" 440 | ] 441 | }, 442 | { 443 | "cell_type": "markdown", 444 | "metadata": {}, 445 | "source": [ 446 | "Similarly for the **pool5** layer." 447 | ] 448 | }, 449 | { 450 | "cell_type": "code", 451 | "execution_count": null, 452 | "metadata": { 453 | "collapsed": false 454 | }, 455 | "outputs": [], 456 | "source": [ 457 | "feat = net.blobs['pool5'].data[0]\n", 458 | "vis_square(feat)" 459 | ] 460 | }, 461 | { 462 | "cell_type": "markdown", 463 | "metadata": {}, 464 | "source": [ 465 | "We can plot graphs using the various data saved in the layers. The fully connected layer fc6 will result in the following plot."
466 | ] 467 | }, 468 | { 469 | "cell_type": "code", 470 | "execution_count": null, 471 | "metadata": { 472 | "collapsed": false 473 | }, 474 | "outputs": [], 475 | "source": [ 476 | "feat = net.blobs['fc6'].data[0]\n", 477 | "plt.subplot(2, 1, 1)\n", 478 | "plt.plot(feat.flat)\n", 479 | "plt.subplot(2, 1, 2)\n", 480 | "_ = plt.hist(feat.flat[feat.flat > 0], bins=100)" 481 | ] 482 | }, 483 | { 484 | "cell_type": "markdown", 485 | "metadata": {}, 486 | "source": [ 487 | "The predicted class probabilities for the particular image we classified can be plotted as well. The x-axis is the class label index and the y-axis is the predicted probability." 488 | ] 489 | }, 490 | { 491 | "cell_type": "code", 492 | "execution_count": null, 493 | "metadata": { 494 | "collapsed": false 495 | }, 496 | "outputs": [], 497 | "source": [ 498 | "feat = net.blobs['prob'].data[0]\n", 499 | "plt.figure(figsize=(15, 3))\n", 500 | "plt.plot(feat.flat)" 501 | ] 502 | }, 503 | { 504 | "cell_type": "markdown", 505 | "metadata": {}, 506 | "source": [ 507 | "Now, let us download an image of our own and try to classify it. Here, an HTTP link is used to download the image, which is then loaded into Caffe and preprocessed using the transformer we defined earlier.\n", 508 | "\n", 509 | "Once we are done with the preprocessing, we have a formatted image in memory that is ready to be classified. Perform the classification by running **net.forward()**. The output probabilities can be found just like earlier; the top 5 probabilities are found, the image is displayed, and the 5 probabilities are printed out." 510 | ] 511 | }, 512 | { 513 | "cell_type": "code", 514 | "execution_count": null, 515 | "metadata": { 516 | "collapsed": false 517 | }, 518 | "outputs": [], 519 | "source": [ 520 | "# download an image\n", 521 | "# for example:\n", 522 | "# my_image_url = \"https://upload.wikimedia.org/wikipedia/commons/b/be/Orang_Utan%2C_Semenggok_Forest_Reserve%2C_Sarawak%2C_Borneo%2C_Malaysia.JPG\"\n", 523 | "#my_image_url = \"https://www.petfinder.com/wp-content/uploads/2012/11/140272627-grooming-needs-senior-cat-632x475.jpg\" # paste your URL here\n", 524 | "#my_image_url =\"http://kids.nationalgeographic.com/content/dam/kids/photos/animals/Mammals/H-P/lion-male-roar.jpg\"\n", 525 | "\n", 526 | "my_image_url =\"http://www.depositagift.com/img/bank_assets/Band-Aid.jpg\"\n", 527 | "\n", 528 | "!wget -O image.jpg $my_image_url\n", 529 | "\n", 530 | "# transform it and copy it into the net\n", 531 | "image = caffe.io.load_image('image.jpg')\n", 532 | "net.blobs['data'].data[...] 
= transformer.preprocess('data', image)\n", 533 | "\n", 534 | "# perform classification\n", 535 | "net.forward()\n", 536 | "\n", 537 | "# obtain the output probabilities\n", 538 | "output_prob = net.blobs['prob'].data[0]\n", 539 | "\n", 540 | "# sort top five predictions from softmax output\n", 541 | "top_inds = output_prob.argsort()[::-1][:5]\n", 542 | "\n", 543 | "plt.imshow(image)\n", 544 | "\n", 545 | "print 'probabilities and labels:'\n", 546 | "zip(output_prob[top_inds], labels[top_inds])" 547 | ] 548 | }, 549 | { 550 | "cell_type": "markdown", 551 | "metadata": {}, 552 | "source": [ 553 | "If you ever wonder what are the labels in the dataset, run the following:" 554 | ] 555 | }, 556 | { 557 | "cell_type": "code", 558 | "execution_count": null, 559 | "metadata": { 560 | "collapsed": false 561 | }, 562 | "outputs": [], 563 | "source": [ 564 | "zip(labels[:])" 565 | ] 566 | } 567 | ], 568 | "metadata": { 569 | "kernelspec": { 570 | "display_name": "Python 2", 571 | "language": "python", 572 | "name": "python2" 573 | }, 574 | "language_info": { 575 | "codemirror_mode": { 576 | "name": "ipython", 577 | "version": 2 578 | }, 579 | "file_extension": ".py", 580 | "mimetype": "text/x-python", 581 | "name": "python", 582 | "nbconvert_exporter": "python", 583 | "pygments_lexer": "ipython2", 584 | "version": "2.7.6" 585 | } 586 | }, 587 | "nbformat": 4, 588 | "nbformat_minor": 0 589 | } 590 | -------------------------------------------------------------------------------- /Installation_Instructions/Caffe Installation Instructions.md: -------------------------------------------------------------------------------- 1 | 2 | #
![Caffe](https://developer.nvidia.com/sites/default/files/akamai/cuda/images/deeplearning/caffe.png)
3 | ##
Freshly brewed !
4 | 
5 | With the availability of huge amounts of data for research and powerful machines to run your code on, Machine Learning and Neural Networks are regaining their footing and impacting us more than ever in our everyday lives. With huge players like Google opensourcing part of their Machine Learning systems, like the TensorFlow software library for numerical computation, there are many options for someone interested in starting off with Machine Learning/Neural Nets to choose from. Caffe, a deep learning framework developed by the Berkeley Vision and Learning Center (BVLC) and its contributors, comes to the party with a fresh cup of coffee. 
6 | 
7 | # Installation Instructions (Ubuntu 14 Trusty) - Pretty old instructions. Please request if you need a new version. 
8 | 
9 | The following section is divided into two parts. Caffe's [documentation]( http://caffe.berkeleyvision.org/installation.html "Caffe Installation Instruction") suggests installing the [Anaconda](https://www.continuum.io/why-anaconda "Anaconda") Python distribution to make sure that you've installed the necessary packages with ease. If you're someone who does not want to install Anaconda on your system for some reason, I've covered that too. So in the first part you'll find information on [how to install Caffe with Anaconda](#1.-Caffe-+-Anaconda) and in the second part you'll find the information for installing Caffe without Anaconda. 
10 | 
11 | Please note that the following instructions were tested on my local machine and in two Chameleon Cloud instances. However, I cannot guarantee success for anyone. Please be ready to see some errors on the way, but I hope you won't stumble into any if you follow the directions as is. 
12 | 
13 | My local machine and the instances I used are NOT equipped with GPUs. So the installation instructions are strictly for non-GPU based, or more clearly, CPU-only systems running Ubuntu 14 Trusty. However, to install it on a GPU-based system, you just have to install CUDA and the necessary drivers for your GPU. You can find the instructions on [Stack Overflow](http://stackoverflow.com/ "Stack Overflow") or from the always go-to friend [Google](https://www.google.com/ "GOOGLE SEARCH"). 
14 | 
15 | # For systems without GPUs (CPU_only) 
16 | 
17 | # 1. Caffe + Anaconda 
18 | 
19 | The Anaconda Python distribution includes scientific and analytic Python packages which are extremely useful. The complete list of packages can be found [here](http://docs.continuum.io/anaconda/pkg-docs "Anaconda Packages"). 
20 | 
21 | To install Anaconda, you have to first download the installer to your machine. Go to this [website](https://www.continuum.io/downloads "Scroll to the 'Anaconda for Linux' section") to download the installer. Scroll to the 'Anaconda for Linux' section and choose the installer to download depending on your system architecture. 
22 | 
23 | Once you have the installer on your machine, run the following code to install Anaconda. 
24 | 
25 | bash Anaconda2-2.5.0-Linux-x86_64.sh 
26 | 
27 | If you fail to read the few lines printed after installation, you'll waste a good amount of your productive time trying to figure out what went wrong. An important line reads: 
28 | 
29 | For this change to become active, you have to open a new terminal. 
30 | 
31 | So, once the Anaconda installation is over, please open a new terminal. Period. 
32 | 
33 | After opening a new terminal, verify the installation by typing: 
34 | 
35 | conda -V 
36 | 
37 | This should give you the current version of conda, thus verifying the installation. Now that's done ! 
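To double-check from Python itself that the Anaconda interpreter is the one being picked up, a quick sketch like the following helps (a minimal check, assuming a default Anaconda install path; nothing here is Caffe-specific):

```python
import sys

# should point inside your anaconda2 directory,
# e.g. /home/<username>/anaconda2/bin/python
print(sys.executable)

# numpy ships with Anaconda; importing it confirms the scientific stack works
import numpy as np
print(np.__version__)
```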
38 | 
39 | Now we will install OpenBLAS. 
40 | 
41 | sudo apt-get install libopenblas-dev 
42 | 
43 | Next, go ahead and install Boost. More info on Boost [here](http://www.boost.org/ "boost.org"). 
44 | 
45 | I faced a problem while installing Boost on all my machines. I fixed it by including the multiverse repository in the sources.list. Since playing with sources.list directly is not recommended, follow the steps below for a better alternative. 
46 | 
47 | 
48 | echo 'deb http://archive.ubuntu.com/ubuntu trusty main restricted universe multiverse' >>/tmp/multiverse.list 
49 | 
50 | sudo cp /tmp/multiverse.list /etc/apt/sources.list.d/ 
51 | 
52 | rm /tmp/multiverse.list 
53 | 
54 | The repo is saved to a temporary list named 'multiverse.list' in the /tmp folder. It is then copied to the /etc/apt/sources.list.d/ folder, and the file in /tmp is removed. I found this fix in a [Stack Exchange forum](http://stackoverflow.com/questions/1584066/append-to-etc-apt-sources-list). 
55 | 
56 | Now, to install Boost, run: 
57 | 
58 | sudo apt-get install libboost-all-dev 
59 | 
60 | Now, let us install OpenCV. Go ahead and run: 
61 | 
62 | conda install opencv 
63 | 
64 | sudo apt-get install libopencv-dev 
65 | 
66 | Now let us install some dependencies of Caffe. Run the following: 
67 | 
68 | sudo apt-get install libleveldb-dev libsnappy-dev libhdf5-serial-dev 
69 | 
70 | sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev 
71 | 
72 | sudo apt-get install protobuf-compiler libprotobuf-dev 
73 | 
74 | conda install -c https://conda.anaconda.org/anaconda protobuf 
75 | 
76 | Okay, that's it. Let us now download Caffe. If you don't have git installed on your system yet, run this code really quick: 
77 | 
78 | sudo apt-get install git 
79 | 
80 | We will clone the official [Caffe repository](https://github.com/BVLC/caffe "Caffe GitHub repo") from GitHub. 
81 | 
82 | git clone https://github.com/BVLC/caffe 
83 | 
84 | Once the repository is cloned, cd into the caffe folder. 
85 | 
86 | cd caffe 
87 | 
88 | We will edit the configuration file of Caffe now. We need to do this to specify that we are using a CPU-only system (i.e., tell the compiler to disable GPU, CUDA, etc.). For this, make a copy of Makefile.config.example. 
89 | 
90 | cp Makefile.config.example Makefile.config 
91 | 
92 | Great ! Now go ahead and open the Makefile.config in your favourite text editor (vi or vim or gedit or ...). Change the following: 
93 | 
94 | 1. Uncomment (No space in the beginning): 
95 | CPU_ONLY := 1 
96 | 
97 | 2. Change: 
98 | BLAS := atlas to BLAS := open 
99 | 
100 | 3. Comment out: 
101 | PYTHON_INCLUDE := /usr/include/python2.7 \ 
102 | /usr/lib/python2.7/dist-packages/numpy/core/include 
103 | 
104 | 4. Uncomment: 
105 | ANACONDA_HOME := $(HOME)/anaconda2 
106 | 
107 | PYTHON_INCLUDE := $(ANACONDA_HOME)/include \ 
108 | $(ANACONDA_HOME)/include/python2.7 \ 
109 | $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include 
110 | 
111 | 5. Comment: 
112 | PYTHON_LIB := /usr/lib 
113 | 
114 | 6. Uncomment: 
115 | PYTHON_LIB := $(ANACONDA_HOME)/lib 
116 | 
117 | 7. Uncomment: 
118 | USE_PKG_CONFIG := 1 
119 | 
120 | Your Makefile.config should look something like this now: [Makefile.config](#Makefile.config) 
121 | 
122 | Now that that's done, let me share an error I came across. Our Makefile.config is okay, but while 'make'-ing / building the installation files, the hdf5 dependency gave me an error. This might not apply to you; I can't say for sure. The build required the two files libhdf5_hl.so.10 and libhdf5.so.10, but the files on the system were libhdf5_hl.so.7 and libhdf5.so.7. 
I fixed this by doing the following: 
123 | 
124 | cd /usr/lib/x86_64-linux-gnu/ 
125 | 
126 | sudo cp libhdf5_hl.so.7 libhdf5_hl.so.10 
127 | 
128 | sudo cp libhdf5.so.7 libhdf5.so.10 
129 | 
130 | We will now install the libraries listed in the requirements.txt file. 
131 | 
132 | cd ~/caffe/python 
133 | 
134 | sudo apt-get install python-pip && sudo pip install -r requirements.txt 
135 | 
136 | Now, we can safely build the files in the caffe directory. We will run the make process as 4 jobs by specifying -j4. More on it [here](http://www.tutorialspoint.com/unix_commands/make.htm " make command"). 
137 | 
138 | cd ~/caffe 
139 | 
140 | sudo make all -j4 
141 | 
142 | I hope the make process went well. If not, please see which package failed by checking the logs or from the terminal itself. Feel free to comment, and I will help to the best of my knowledge. You can also seek help from your go-to friends Google and Stack Exchange, as mentioned above. 
143 | 
144 | Provided that the make process was successful, continue with the rest of the installation process. 
145 | 
146 | We will now make the Pycaffe files. Pycaffe is the Python interface of Caffe which allows you to use Caffe inside Python. More on it [here](http://caffe.berkeleyvision.org/tutorial/interfaces.html). We will also make distribute. This is explained on the Caffe website. 
147 | 
148 | sudo make pycaffe 
149 | 
150 | sudo make distribute 
151 | 
152 | Awesome! We are almost there. We just need to test whether everything went fine. For that, make the files for testing and run the test. 
153 | 
154 | sudo make test 
155 | 
156 | sudo make runtest 
157 | 
158 | If you succeed in all the tests then you've successfully installed Caffe on your system ! One good reason to smile ! 
159 | 
160 | Finally, we need to add the correct path to our installed modules. Using your favourite text editor, add the following to the .bashrc file in your /home/<username>/ folder for Caffe to work properly. Please make sure you replace <username> with your system's username. 
161 | 
162 | ``` 
163 | # Anaconda, if not present already 
164 | export PATH=/home/<username>/anaconda2/bin:$PATH 
165 | # Caffe Root 
166 | export CAFFE_ROOT=/home/<username>/caffe/ 
167 | export PYTHONPATH=/home/<username>/caffe/distribute/python:$PYTHONPATH 
168 | export PYTHONPATH=/home/<username>/caffe/python:$PYTHONPATH 
169 | ``` 
170 | 
171 | CHEERS ! You're done ! Now let's test if it really works. 
172 | 
173 | Restart/reboot your system to ensure everything loads perfectly. 
174 | 
175 | sudo reboot 
176 | 
177 | Open Python and type: 
178 | 
179 | import caffe 
180 | 
181 | You should be able to successfully load caffe. 
182 | Now let's start coding :) 
183 | 
184 | # 2. Caffe without installing Anaconda 
185 | 
186 | If you prefer not to install Anaconda on your system, you can install Caffe by following the steps below. As mentioned earlier, installing all the dependencies can be difficult. If this tutorial does not work for you, please look into the errors and use our trusted friends. 
187 | 
188 | To start with, we will update and upgrade the packages on our system. Then we will have to install the dependencies one by one on the machine. Type the following to get started. 
189 | 
190 | sudo apt-get update && sudo apt-get upgrade && sudo apt-get dist-upgrade && sudo apt-get autoremove 
191 | 
192 | Now, let us install OpenBLAS. 
193 | 
194 | sudo apt-get -y install libopenblas-dev 
195 | 
196 | Next, go ahead and install Boost. More info on Boost [here](http://www.boost.org/ "boost.org"). 
197 | 
198 | I faced a problem while installing Boost on all my machines. 
I fixed it by including the multiverse repository in the sources.list. Since playing with sources.list directly is not recommended, follow the steps below for a better alternative. 
199 | 
200 | echo 'deb http://archive.ubuntu.com/ubuntu trusty main restricted universe multiverse' >>/tmp/multiverse.list 
201 | 
202 | sudo cp /tmp/multiverse.list /etc/apt/sources.list.d/ 
203 | 
204 | rm /tmp/multiverse.list 
205 | 
206 | The repo is saved to a temporary list named 'multiverse.list' in the /tmp folder. It is then copied to the /etc/apt/sources.list.d/ folder, and the file in /tmp is removed. I found this fix in a [Stack Exchange forum](http://stackoverflow.com/questions/1584066/append-to-etc-apt-sources-list). 
207 | 
208 | Now, to install Boost, run: 
209 | 
210 | sudo apt-get update && sudo apt-get install libboost-all-dev 
211 | 
212 | If later in the installation process you find that any of the Boost-related files are missing, run the following command. You can skip this one for now, but it won't hurt if you run it either. 
213 | 
214 | sudo apt-get -y install --fix-missing libboost-all-dev 
215 | 
216 | Go ahead and install the libfaac-dev package. 
217 | 
218 | sudo apt-get install libfaac-dev 
219 | 
220 | Now, we need to install ffmpeg. Let us also make sure that the ffmpeg version is one that OpenCV and Caffe approve of. We will remove any previous versions of ffmpeg and install a new one. 
221 | 
222 | The following code will remove ffmpeg and related packages: 
223 | 
224 | sudo apt-get -y remove ffmpeg x264 libx264-dev 
225 | 
226 | The mc3man repository hosts ffmpeg packages. I came to know about it from the Stack Exchange forums. To include the repo, type this: 
227 | 
228 | sudo add-apt-repository ppa:mc3man/trusty-media 
229 | 
230 | Update and install ffmpeg. 
231 | 
232 | sudo apt-get update && sudo apt-get install ffmpeg gstreamer0.10-ffmpeg 
233 | 
234 | Now, we can install OpenCV. First let us install the dependencies. Building OpenCV can be challenging at first, but if you have all the dependencies correct it will be done in no time. 
235 | 
236 | Go ahead and run the following lines: 
237 | 
238 | sudo apt-get install build-essential 
239 | 
240 | 'build-essential' ensures that we have the compilers ready. Now we will install some required packages. Run: 
241 | 
242 | sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev unzip 
243 | 
244 | We will install some optional packages as well. Run: 
245 | 
246 | sudo apt-get install python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev 
247 | 
248 | sudo apt-get install libtiff4-dev libopenexr-dev libeigen2-dev yasm libopencore-amrnb-dev libtheora-dev libvorbis-dev libxvidcore-dev 
249 | 
250 | sudo apt-get install python-tk libeigen3-dev libx264-dev libqt4-dev libqt4-opengl-dev sphinx-common texlive-latex-extra libv4l-dev default-jdk ant libvtk5-qt4-dev 
251 | 
252 | Now we can go ahead and download the OpenCV build files. Go to your home folder first. 
253 | 
254 | cd ~ 
255 | 
256 | Download the files: 
257 | 
258 | wget http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.4.9/opencv-2.4.9.zip 
259 | 
260 | Unzip the file by running: 
261 | 
262 | unzip opencv-2.4.9.zip 
263 | 
264 | Go to the opencv folder by running: 
265 | 
266 | cd opencv-2.4.9 
267 | 
268 | Make a build directory inside. 
269 | 
270 | mkdir build 
271 | 
272 | Go inside the build directory. 
273 | 
274 | cd build 
275 | 
276 | Generate the build files using cmake. 
277 | 278 | cmake -D WITH_TBB=ON -D BUILD_NEW_PYTHON_SUPPORT=ON -D WITH_V4L=ON -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON -D BUILD_EXAMPLES=ON -D WITH_QT=ON -D WITH_OPENGL=ON -D WITH_VTK=ON .. 279 | 280 | In the summary, make sure that FFMPEG is installed, also check whether the Python, Numpy, Java and OpenCL are properly installed and recognized. 281 | 282 | Now we will run the make process as 4 jobs by specifying it like -j4. More on it [here](http://www.tutorialspoint.com/unix_commands/make.htm " make command") 283 | 284 | sudo make -j4 285 | 286 | Go ahead and continue installation. 287 | 288 | sudo make install 289 | 290 | Once the installation is complete, do these steps to get OpenCV configured. 291 | 292 | sudo sh -c 'echo "/usr/local/lib" > /etc/ld.so.conf.d/opencv.conf' 293 | 294 | sudo ldconfig 295 | 296 | export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/opencv/lib 297 | 298 | Come out of the build folder if you haven't already by running: 299 | 300 | cd ~ 301 | 302 | Install python-pip: 303 | 304 | sudo apt-get install python-pip 305 | 306 | Now, we will install the Scipy and other scientific packages which are key Caffe dependencies. 307 | 308 | sudo apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose 309 | 310 | We will install Cython now. (I wanted it to install scikit-image properly) 311 | **Updated (6/24/16) the download link, old one does not work anymore** 312 | 313 | wget https://pypi.python.org/packages/0a/b6/fd142319fd0fe83dc266cfe83bfd095bc200fae5190fce0a2482560acb55/Cython-0.23.4.zip#md5=84c8c764ffbeae5f4a513d25fda4fd5e 314 | 315 | unzip Cython-0.23.4.zip 316 | 317 | cd Cython-0.23.4 318 | 319 | sudo python setup.py install 320 | 321 | cd ~ 322 | 323 | Now that we have Cython, go ahead and run the code below to install Scikit Image and Scikit Learn. 324 | 325 | sudo pip install scikit-image scikit-learn 326 | 327 | We will now install some more crucial dependencies of Caffe 328 | 329 | sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev 330 | 331 | sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler 332 | 333 | sudo pip install protobuf 334 | 335 | Installing Pydot will be beneficial to view our net by saving it off in an image file. 336 | 337 | sudo apt-get install python-pydot 338 | 339 | Now that all the dependencies are installed, we will go ahead and download the Caffe installation files. Go ahead and run: 340 | 341 | git clone https://github.com/BVLC/caffe 342 | 343 | 344 | Go into the caffe folder and copy and rename the Makefile.config.example file to Makefile.config. 345 | 346 | cd caffe 347 | cp Makefile.config.example Makefile.config 348 | 349 | Great ! Now go ahead and open the Makefile.config in your favourite text editor (vi or vim or gedit or ...). Change the following: 350 | ``` 351 | 1. Uncomment (No space in the beginning): 352 | CPU_ONLY := 1 353 | 354 | 2. Uncomment: 355 | USE_PKG_CONFIG := 1 356 | ``` 357 | 358 | We will install the packages listed in Caffe's requirements.txt file as well; just in case. 359 | 360 | cd ~/caffe/python 361 | sudo pip install -r requirements.txt 362 | 363 | Now, we can safely build the files in the caffe directory. We will run the make process as 4 jobs by specifying it like -j4. 
More on it [here](http://www.tutorialspoint.com/unix_commands/make.htm " make command"). 
364 | 
365 | cd ~/caffe 
366 | sudo make all -j4 
367 | 
368 | I hope the make process went well. If not, please see which package failed by checking the logs or from the terminal itself. Feel free to comment, and I will help to the best of my knowledge. You can also seek help from your go-to friends Google and Stack Exchange, as mentioned above. 
369 | 
370 | Provided that the make process was successful, continue with the rest of the installation process. 
371 | 
372 | We will now make the Pycaffe files. Pycaffe is the Python interface of Caffe which allows you to use Caffe inside Python. More on it [here](http://caffe.berkeleyvision.org/tutorial/interfaces.html). We will also make distribute. This is explained on the Caffe website. 
373 | 
374 | sudo make pycaffe 
375 | sudo make distribute 
376 | 
377 | Awesome! We are almost there. We just need to test whether everything went fine. For that, make the files for testing and run the test. 
378 | 
379 | sudo make test 
380 | sudo make runtest 
381 | 
382 | If you succeed in all the tests then you've successfully installed Caffe on your system ! One good reason to smile ! 
383 | 
384 | Finally, we need to add the correct path to our installed modules. Using your favourite text editor, add the following to the .bashrc file in your /home/<username>/ folder for Caffe to work properly. Please make sure you replace <username> with your system's username. 
385 | 
386 | ``` 
387 | # Caffe Root 
388 | export CAFFE_ROOT=/home/<username>/caffe/ 
389 | export PYTHONPATH=/home/<username>/caffe/distribute/python:$PYTHONPATH 
390 | export PYTHONPATH=/home/<username>/caffe/python:$PYTHONPATH 
391 | ``` 
392 | 
393 | CHEERS ! You're done ! Now let's test if it really works. 
394 | 
395 | Restart/reboot your system to ensure everything loads perfectly. 
396 | 
397 | sudo reboot 
398 | 
399 | Open Python and type: 
400 | 
401 | import caffe 
402 | 
403 | You should be able to successfully load caffe. Now let's start coding :) 
404 | 
405 | # Appendix 
406 | 
407 | ## Makefile.config 
408 | 
409 | ### For Caffe + Anaconda 
410 | 
411 | ## Refer to http://caffe.berkeleyvision.org/installation.html 
412 | # Contributions simplifying and improving our build system are welcome! 
413 | 
414 | # cuDNN acceleration switch (uncomment to build with cuDNN). 
415 | # USE_CUDNN := 1 
416 | 
417 | # CPU-only switch (uncomment to build without GPU support). 
418 | CPU_ONLY := 1 
419 | 
420 | # uncomment to disable IO dependencies and corresponding data layers 
421 | # USE_OPENCV := 0 
422 | # USE_LEVELDB := 0 
423 | # USE_LMDB := 0 
424 | 
425 | # uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary) 
426 | # You should not set this flag if you will be reading LMDBs with any 
427 | # possibility of simultaneous read and write 
428 | # ALLOW_LMDB_NOLOCK := 1 
429 | 
430 | # Uncomment if you're using OpenCV 3 
431 | # OPENCV_VERSION := 3 
432 | 
433 | # To customize your choice of compiler, uncomment and set the following. 
434 | # N.B. the default for Linux is g++ and the default for OSX is clang++ 
435 | # CUSTOM_CXX := g++ 
436 | 
437 | # CUDA directory contains bin/ and lib/ directories that we need. 
438 | CUDA_DIR := /usr/local/cuda 
439 | # On Ubuntu 14.04, if cuda tools are installed via 
440 | # "sudo apt-get install nvidia-cuda-toolkit" then use this instead: 
441 | # CUDA_DIR := /usr 
442 | 
443 | # CUDA architecture setting: going with all of them. 
444 | # For CUDA < 6.0, comment the *_50 lines for compatibility. 
445 | CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \ 446 | -gencode arch=compute_20,code=sm_21 \ 447 | -gencode arch=compute_30,code=sm_30 \ 448 | -gencode arch=compute_35,code=sm_35 \ 449 | -gencode arch=compute_50,code=sm_50 \ 450 | -gencode arch=compute_50,code=compute_50 451 | 452 | # BLAS choice: 453 | # atlas for ATLAS (default) 454 | # mkl for MKL 455 | # open for OpenBlas 456 | BLAS := open 457 | # Custom (MKL/ATLAS/OpenBLAS) include and lib directories. 458 | # Leave commented to accept the defaults for your choice of BLAS 459 | # (which should work)! 460 | # BLAS_INCLUDE := /path/to/your/blas 461 | # BLAS_LIB := /path/to/your/blas 462 | 463 | # Homebrew puts openblas in a directory that is not on the standard search path 464 | # BLAS_INCLUDE := $(shell brew --prefix openblas)/include 465 | # BLAS_LIB := $(shell brew --prefix openblas)/lib 466 | 467 | # This is required only if you will compile the matlab interface. 468 | # MATLAB directory should contain the mex binary in /bin. 469 | # MATLAB_DIR := /usr/local 470 | # MATLAB_DIR := /Applications/MATLAB_R2012b.app 471 | 472 | # NOTE: this is required only if you will compile the python interface. 473 | # We need to be able to find Python.h and numpy/arrayobject.h. 474 | #PYTHON_INCLUDE := /usr/include/python2.7 \ 475 | /usr/lib/python2.7/dist-packages/numpy/core/include 476 | # Anaconda Python distribution is quite popular. Include path: 477 | # Verify anaconda location, sometimes it's in root. 478 | ANACONDA_HOME := $(HOME)/anaconda2 479 | PYTHON_INCLUDE := $(ANACONDA_HOME)/include \ 480 | $(ANACONDA_HOME)/include/python2.7 \ 481 | $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include 482 | 483 | # Uncomment to use Python 3 (default is Python 2) 484 | # PYTHON_LIBRARIES := boost_python3 python3.5m 485 | # PYTHON_INCLUDE := /usr/include/python3.5m \ 486 | # /usr/lib/python3.5/dist-packages/numpy/core/include 487 | 488 | # We need to be able to find libpythonX.X.so or .dylib. 489 | #PYTHON_LIB := /usr/lib 490 | PYTHON_LIB := $(ANACONDA_HOME)/lib 491 | 492 | # Homebrew installs numpy in a non standard path (keg only) 493 | # PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include 494 | # PYTHON_LIB += $(shell brew --prefix numpy)/lib 495 | 496 | # Uncomment to support layers written in Python (will link against Python libs) 497 | # WITH_PYTHON_LAYER := 1 498 | 499 | # Whatever else you find you need goes here. 500 | INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include 501 | LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib 502 | 503 | # If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies 504 | # INCLUDE_DIRS += $(shell brew --prefix)/include 505 | # LIBRARY_DIRS += $(shell brew --prefix)/lib 506 | 507 | # Uncomment to use `pkg-config` to specify OpenCV library paths. 508 | # (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.) 509 | USE_PKG_CONFIG := 1 510 | 511 | BUILD_DIR := build 512 | DISTRIBUTE_DIR := distribute 513 | 514 | # Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171 515 | # DEBUG := 1 516 | 517 | # The ID of the GPU that 'make runtest' will use to run unit tests. 
518 | TEST_GPUID := 0 519 | 520 | # enable pretty build (comment to see full commands) 521 | Q ?= @ 522 | 523 | ### For Caffe without Anaconda 524 | 525 | ## Refer to http://caffe.berkeleyvision.org/installation.html 526 | # Contributions simplifying and improving our build system are welcome! 527 | 528 | # cuDNN acceleration switch (uncomment to build with cuDNN). 529 | # USE_CUDNN := 1 530 | 531 | # CPU-only switch (uncomment to build without GPU support). 532 | CPU_ONLY := 1 533 | 534 | # uncomment to disable IO dependencies and corresponding data layers 535 | # USE_OPENCV := 0 536 | # USE_LEVELDB := 0 537 | # USE_LMDB := 0 538 | 539 | # uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary) 540 | # You should not set this flag if you will be reading LMDBs with any 541 | # possibility of simultaneous read and write 542 | # ALLOW_LMDB_NOLOCK := 1 543 | 544 | # Uncomment if you're using OpenCV 3 545 | # OPENCV_VERSION := 3 546 | 547 | # To customize your choice of compiler, uncomment and set the following. 548 | # N.B. the default for Linux is g++ and the default for OSX is clang++ 549 | # CUSTOM_CXX := g++ 550 | 551 | # CUDA directory contains bin/ and lib/ directories that we need. 552 | CUDA_DIR := /usr/local/cuda 553 | # On Ubuntu 14.04, if cuda tools are installed via 554 | # "sudo apt-get install nvidia-cuda-toolkit" then use this instead: 555 | # CUDA_DIR := /usr 556 | 557 | # CUDA architecture setting: going with all of them. 558 | # For CUDA < 6.0, comment the *_50 lines for compatibility. 559 | CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \ 560 | -gencode arch=compute_20,code=sm_21 \ 561 | -gencode arch=compute_30,code=sm_30 \ 562 | -gencode arch=compute_35,code=sm_35 \ 563 | -gencode arch=compute_50,code=sm_50 \ 564 | -gencode arch=compute_50,code=compute_50 565 | 566 | # BLAS choice: 567 | # atlas for ATLAS (default) 568 | # mkl for MKL 569 | # open for OpenBlas 570 | BLAS := open 571 | # Custom (MKL/ATLAS/OpenBLAS) include and lib directories. 572 | # Leave commented to accept the defaults for your choice of BLAS 573 | # (which should work)! 574 | # BLAS_INCLUDE := /path/to/your/blas 575 | # BLAS_LIB := /path/to/your/blas 576 | 577 | # Homebrew puts openblas in a directory that is not on the standard search path 578 | # BLAS_INCLUDE := $(shell brew --prefix openblas)/include 579 | # BLAS_LIB := $(shell brew --prefix openblas)/lib 580 | 581 | # This is required only if you will compile the matlab interface. 582 | # MATLAB directory should contain the mex binary in /bin. 583 | # MATLAB_DIR := /usr/local 584 | # MATLAB_DIR := /Applications/MATLAB_R2012b.app 585 | 586 | # NOTE: this is required only if you will compile the python interface. 587 | # We need to be able to find Python.h and numpy/arrayobject.h. 588 | PYTHON_INCLUDE := /usr/include/python2.7 \ 589 | /usr/lib/python2.7/dist-packages/numpy/core/include 590 | # Anaconda Python distribution is quite popular. Include path: 591 | # Verify anaconda location, sometimes it's in root. 592 | # ANACONDA_HOME := $(HOME)/anaconda 593 | # PYTHON_INCLUDE := $(ANACONDA_HOME)/include \ 594 | # $(ANACONDA_HOME)/include/python2.7 \ 595 | # $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \ 596 | 597 | # Uncomment to use Python 3 (default is Python 2) 598 | # PYTHON_LIBRARIES := boost_python3 python3.5m 599 | # PYTHON_INCLUDE := /usr/include/python3.5m \ 600 | # /usr/lib/python3.5/dist-packages/numpy/core/include 601 | 602 | # We need to be able to find libpythonX.X.so or .dylib. 
603 | PYTHON_LIB := /usr/lib 
604 | # PYTHON_LIB := $(ANACONDA_HOME)/lib 
605 | 
606 | # Homebrew installs numpy in a non standard path (keg only) 
607 | # PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include 
608 | # PYTHON_LIB += $(shell brew --prefix numpy)/lib 
609 | 
610 | # Uncomment to support layers written in Python (will link against Python libs) 
611 | # WITH_PYTHON_LAYER := 1 
612 | 
613 | # Whatever else you find you need goes here. 
614 | INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include 
615 | LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib 
616 | 
617 | # If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies 
618 | # INCLUDE_DIRS += $(shell brew --prefix)/include 
619 | # LIBRARY_DIRS += $(shell brew --prefix)/lib 
620 | 
621 | # Uncomment to use `pkg-config` to specify OpenCV library paths. 
622 | # (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.) 
623 | # USE_PKG_CONFIG := 1 
624 | 
625 | BUILD_DIR := build 
626 | DISTRIBUTE_DIR := distribute 
627 | 
628 | # Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171 
629 | # DEBUG := 1 
630 | 
631 | # The ID of the GPU that 'make runtest' will use to run unit tests. 
632 | TEST_GPUID := 0 
633 | 
634 | # enable pretty build (comment to see full commands) 
635 | Q ?= @ 
636 | 
-------------------------------------------------------------------------------- /Deep-Neural-Network-with-Caffe/Deep Neural Network with Caffe.md: -------------------------------------------------------------------------------- 
1 | 
2 | # Deep Neural Network with Caffe 
3 | ### Extracted and enhanced from [Caffe's Documentation](https://github.com/BVLC/caffe/blob/master/examples/00-classification.ipynb) 
4 | 
5 | 
6 | Coding by hand a problem such as speech recognition is nearly impossible due to the sheer amount of variance in the data. The way our brain interprets these kinds of problems is complex, but it turns out they can be modeled with a substantial amount of accuracy, sometimes even beating humans. The whole concept of the artificial neural network started evolving in and after the 1950's. Artificial neurons gradually evolved from the simple Perceptron to sigmoid neurons and then to many other forms. Early neurons were only able to provide a binary output for the many inputs we feed them. Newer algorithms and activation functions allow an artificial neural network to make complex predictions by learning on its own. 
7 | 
8 | In an artificial neural network, a group of neurons is visualized as a layer. ANNs have multiple layers, each layer doing a specific task. The first layer is always the input layer, where we input our training or test data. The last layer is the output layer, where we get the output. Any layer in between is called a hidden layer, which does not take an input or give an output directly from us users. 
9 | 
10 | Whenever you have a large number of layers in your ANN, it is termed a Deep Neural Network (DNN). DNNs have brought complex speech recognition, natural language processing, computer vision and many other once Star-Trek-level sci-fi technologies into existence. Academic and industrial research is greatly improving the performance and architecture of DNNs, and it is an exciting field to work in. 
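To make the layer picture concrete, here is a minimal numpy sketch of a forward pass through one hidden layer (purely illustrative; the layer sizes and the sigmoid activation are arbitrary choices, not anything Caffe-specific):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.random.rand(4)                        # input layer: 4 features
W1, b1 = np.random.randn(5, 4), np.zeros(5)  # input -> hidden layer of 5 neurons
W2, b2 = np.random.randn(3, 5), np.zeros(3)  # hidden -> output layer of 3 neurons

hidden = sigmoid(W1.dot(x) + b1)             # hidden layer activations
output = sigmoid(W2.dot(hidden) + b2)        # network output
print(output)
```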
11 | 
12 | With huge players like Google opensourcing part of their Machine Learning systems like the TensorFlow software library for numerical computation, there are many options for someone interested in starting off with Machine Learning/Neural Nets to choose from. **Caffe**, a deep learning framework developed by the **Berkeley Vision and Learning Center (BVLC)** and its contributors, comes to the party with a fresh cup of coffee. This tutorial aims to provide an introduction to **Caffe**. The installation instructions can be found [here](https://github.com/arundasan91/Caffe/blob/master/Caffe%20Installation%20Instructions.md). 
13 | 
14 | We will first use a pre-trained model and figure out how Caffe handles a classification problem. The example is extracted from Caffe's own example section on their GitHub page. Caffe's [GitHub repo provides examples](https://github.com/BVLC/caffe/tree/master/examples) of many different algorithms and approaches. 
15 | I have added my own code and enhancements to help better understand the workings of Caffe. Once we are through our classification example using the pre-trained network, we will try to architect our own network by defining each and every layer. 
16 | 
17 | First, let us import some libraries that are required to visualize the trained neural net. These include the NumPy library for handling the trained images as arrays and Matplotlib for plotting various figures and graphs out of them. 
18 | We also tune the plot parameters as mentioned below. 
19 | 
20 | 
21 | ```python 
22 | # set up Python environment: numpy for numerical routines, and matplotlib for plotting 
23 | import numpy as np 
24 | import matplotlib.pyplot as plt 
25 | # display plots in this notebook 
26 | %matplotlib inline 
27 | 
28 | # set display defaults 
29 | # these are for the matplotlib figures. 
30 | plt.rcParams['figure.figsize'] = (10, 10)        # large images 
31 | plt.rcParams['image.interpolation'] = 'nearest'  # don't interpolate: show square pixels 
32 | plt.rcParams['image.cmap'] = 'gray'  # use grayscale output rather than a (potentially misleading) color heatmap 
33 | ``` 
34 | 
35 | Caffe's examples are hosted in an examples directory inside its root directory. We should run the following Python code from the examples directory to make sure that the scripts work. We include the *sys* and *os* modules to work with file paths and the working directory. Caffe's python folder must be added to the Python path. 
36 | 
37 | 
38 | ```python 
39 | # The caffe module needs to be on the Python path; 
40 | import sys 
41 | import os 
42 | 
43 | caffe_root = '/root/caffe/'  # The caffe_root is changed to reflect the actual folder in the server. 
44 | sys.path.insert(0, caffe_root + 'python')  # Correct the python path 
45 | 
46 | import caffe 
47 | # Successfully imported Caffe ! 
48 | 
49 | ``` 
50 | 
51 | /root/caffe/python/caffe/pycaffe.py:13: RuntimeWarning: to-Python converter for boost::shared_ptr > already registered; second conversion method ignored. 
52 | from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \ 
53 | /root/caffe/python/caffe/pycaffe.py:13: RuntimeWarning: to-Python converter for boost::shared_ptr > already registered; second conversion method ignored. 
54 | from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \ 
55 | /root/caffe/python/caffe/pycaffe.py:13: RuntimeWarning: to-Python converter for boost::shared_ptr > already registered; second conversion method ignored. 
56 | from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \ 
57 | 
58 | 
59 | Training a model is computationally expensive and time consuming, so for this example let us stick to a pre-trained network bundled with Caffe. We will search for the caffemodel file and start from there. A caffemodel is the trained model: during the training phase, at set iteration intervals, Caffe saves a caffemodel file which captures the state of the net at that particular time. For example, if we have a total of 10000 iterations to perform and we explicitly say that the state of the net should be saved every 2000 iterations, Caffe will generate 5 caffemodel files and 5 solver state files, saving the state of the net at iterations 2000, 4000, 6000, 8000 and 10000. 
60 | 
61 | Since we explicitly fixed the working directory, it is not required to have the notebook in the same directory as the example. We point to the model definition and weights of the pre-trained network by including the correct paths. The neural net is then constructed from the definition and weights saved earlier. Since the network was already trained on a huge dataset, we can choose the TEST mode in Caffe and not perform dropout while defining the net. More info on dropout [here](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf). 
62 | 
63 | 
64 | ```python 
65 | # set Caffe mode as CPU only. This is to be done because OCI servers are not equipped with GPU's yet. 
66 | caffe.set_mode_cpu() 
67 | 
68 | # set the model definitions since we are using a pretrained network here. 
69 | # these prototxt definitions can be changed to make significant changes in the learning method. 
70 | model_def = '/root/machineLearning/deepNeuralNet/caffe/caffemodels/bvlc/caffenet/deploy_changed_net.prototxt' 
71 | model_weights = '/root/machineLearning/deepNeuralNet/caffe/caffemodels/bvlc/caffenet/caffenet.caffemodel' 
72 | 
73 | net = caffe.Net(model_def,      # defines the structure of the model 
74 |                 model_weights,  # contains the trained weights 
75 |                 caffe.TEST)     # use test mode (e.g., don't perform dropout) 
76 | ``` 
77 | 
78 | We can visualize our network architecture by converting the model definition prototxt file into an image. The `draw_net.py` script will allow us to do that. The image rendered in the notebook is pretty small; you can view it better [here](https://raw.githubusercontent.com/arundasan91/Caffe/master/Data/deploy_changed_net.png). 
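A quick aside on the TEST-mode remark above: dropout randomly zeroes activations during training and is skipped at inference time. A minimal numpy sketch of the idea (purely illustrative, not Caffe's API; the 0.5 ratio simply mirrors the `dropout_ratio` we will see in the prototxt below):

```python
import numpy as np

def dropout(activations, ratio=0.5, train=True):
    if not train:                                        # TEST mode: dropout is a no-op
        return activations
    mask = np.random.rand(*activations.shape) >= ratio   # drop each unit with probability `ratio`
    return activations * mask / (1.0 - ratio)            # rescale to keep the expected value

print(dropout(np.ones(8), train=True))
print(dropout(np.ones(8), train=False))
```

Now, let us see the prototxt file and its visual interpretation.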
79 | 80 | 81 | ```python 82 | %%bash 83 | cat /root/machineLearning/deepNeuralNet/caffe/caffemodels/bvlc/caffenet/deploy_changed_net.prototxt 84 | python $CAFFE_ROOT/python/draw_net.py \ 85 | /root/machineLearning/deepNeuralNet/caffe/caffemodels/bvlc/caffenet/deploy_changed_net.prototxt \ 86 | /root/machineLearning/deepNeuralNet/caffe/caffemodels/bvlc/caffenet/deploy_changed_net.png 87 | ``` 88 | 89 | name: "CaffeNet" 90 | layer { 91 | name: "data" 92 | type: "Input" 93 | top: "data" 94 | input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } } 95 | } 96 | layer { 97 | name: "conv1" 98 | type: "Convolution" 99 | bottom: "data" 100 | top: "conv1" 101 | convolution_param { 102 | num_output: 96 103 | kernel_size: 11 104 | stride: 4 105 | } 106 | } 107 | layer { 108 | name: "relu1" 109 | type: "ReLU" 110 | bottom: "conv1" 111 | top: "conv1" 112 | } 113 | layer { 114 | name: "pool1" 115 | type: "Pooling" 116 | bottom: "conv1" 117 | top: "pool1" 118 | pooling_param { 119 | pool: MAX 120 | kernel_size: 3 121 | stride: 2 122 | } 123 | } 124 | layer { 125 | name: "norm1" 126 | type: "LRN" 127 | bottom: "pool1" 128 | top: "norm1" 129 | lrn_param { 130 | local_size: 5 131 | alpha: 0.0001 132 | beta: 0.75 133 | } 134 | } 135 | layer { 136 | name: "conv2" 137 | type: "Convolution" 138 | bottom: "norm1" 139 | top: "conv2" 140 | convolution_param { 141 | num_output: 256 142 | pad: 2 143 | kernel_size: 5 144 | group: 2 145 | } 146 | } 147 | layer { 148 | name: "relu2" 149 | type: "ReLU" 150 | bottom: "conv2" 151 | top: "conv2" 152 | } 153 | layer { 154 | name: "pool2" 155 | type: "Pooling" 156 | bottom: "conv2" 157 | top: "pool2" 158 | pooling_param { 159 | pool: MAX 160 | kernel_size: 3 161 | stride: 2 162 | } 163 | } 164 | layer { 165 | name: "norm2" 166 | type: "LRN" 167 | bottom: "pool2" 168 | top: "norm2" 169 | lrn_param { 170 | local_size: 5 171 | alpha: 0.0001 172 | beta: 0.75 173 | } 174 | } 175 | layer { 176 | name: "conv3" 177 | type: "Convolution" 178 | bottom: "norm2" 179 | top: "conv3" 180 | convolution_param { 181 | num_output: 384 182 | pad: 1 183 | kernel_size: 3 184 | } 185 | } 186 | layer { 187 | name: "relu3" 188 | type: "ReLU" 189 | bottom: "conv3" 190 | top: "conv3" 191 | } 192 | layer { 193 | name: "conv4" 194 | type: "Convolution" 195 | bottom: "conv3" 196 | top: "conv4" 197 | convolution_param { 198 | num_output: 384 199 | pad: 1 200 | kernel_size: 3 201 | group: 2 202 | } 203 | } 204 | layer { 205 | name: "relu4" 206 | type: "ReLU" 207 | bottom: "conv4" 208 | top: "conv4" 209 | } 210 | layer { 211 | name: "conv5" 212 | type: "Convolution" 213 | bottom: "conv4" 214 | top: "conv5" 215 | convolution_param { 216 | num_output: 256 217 | pad: 1 218 | kernel_size: 3 219 | group: 2 220 | } 221 | } 222 | layer { 223 | name: "relu5" 224 | type: "ReLU" 225 | bottom: "conv5" 226 | top: "conv5" 227 | } 228 | layer { 229 | name: "pool5" 230 | type: "Pooling" 231 | bottom: "conv5" 232 | top: "pool5" 233 | pooling_param { 234 | pool: MAX 235 | kernel_size: 3 236 | stride: 2 237 | } 238 | } 239 | layer { 240 | name: "fc6" 241 | type: "InnerProduct" 242 | bottom: "pool5" 243 | top: "fc6" 244 | inner_product_param { 245 | num_output: 4096 246 | } 247 | } 248 | layer { 249 | name: "relu6" 250 | type: "ReLU" 251 | bottom: "fc6" 252 | top: "fc6" 253 | } 254 | layer { 255 | name: "drop6" 256 | type: "Dropout" 257 | bottom: "fc6" 258 | top: "fc6" 259 | dropout_param { 260 | dropout_ratio: 0.5 261 | } 262 | } 263 | layer { 264 | name: "fc7" 265 | type: "InnerProduct" 266 | bottom: "fc6" 267 | top: 
"fc7" 268 | inner_product_param { 269 | num_output: 4096 270 | } 271 | } 272 | layer { 273 | name: "relu7" 274 | type: "ReLU" 275 | bottom: "fc7" 276 | top: "fc7" 277 | } 278 | layer { 279 | name: "drop7" 280 | type: "Dropout" 281 | bottom: "fc7" 282 | top: "fc7" 283 | dropout_param { 284 | dropout_ratio: 0.5 285 | } 286 | } 287 | layer { 288 | name: "fc8" 289 | type: "InnerProduct" 290 | bottom: "fc7" 291 | top: "fc8" 292 | inner_product_param { 293 | num_output: 1000 294 | } 295 | } 296 | layer { 297 | name: "prob" 298 | type: "Softmax" 299 | bottom: "fc8" 300 | top: "prob" 301 | } 302 | Drawing net to /root/machineLearning/deepNeuralNet/caffe/caffemodels/bvlc/caffenet/deploy_changed_net.png 303 | 304 | 305 | /root/caffe/python/caffe/pycaffe.py:13: RuntimeWarning: to-Python converter for boost::shared_ptr > already registered; second conversion method ignored. 306 | from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \ 307 | /root/caffe/python/caffe/pycaffe.py:13: RuntimeWarning: to-Python converter for boost::shared_ptr > already registered; second conversion method ignored. 308 | from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \ 309 | /root/caffe/python/caffe/pycaffe.py:13: RuntimeWarning: to-Python converter for boost::shared_ptr > already registered; second conversion method ignored. 310 | from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \ 311 | 312 | 313 | 314 | ```python 315 | image = caffe.io.load_image('/root/machineLearning/deepNeuralNet/caffe/caffemodels/bvlc/caffenet/deploy_changed_net.png') 316 | plt.imshow(image) 317 | ``` 318 | 319 | 320 | 321 | 322 | 323 | 324 | 325 | 326 | 327 | ![png](output_15_1.png) 328 | 329 | 330 | ***The image here in the notebook is pretty small, you can view it better [here](https://raw.githubusercontent.com/arundasan91/Caffe/master/Data/deploy_changed_net.png).*** 331 | 332 | Now we can create a transformer to input data into our net. Input data here are images. The subtracted-mean of the images in the dataset considered are to be set in the transformer. Mean subtraction is a way of preprocessing the image. The mean is subtracted across every individual feature in the dataset. This can be interpreted as the centering of a cloud of data around the origin along every dimension. With our input data fixed as images, this relates to subtracting the mean from each of the pixels, seperately across the three channels. More on it [here](http://cs231n.github.io/neural-networks-2/). 333 | 334 | These are the steps usually carried out in each transformers: 335 | 1. Transpose the data from (height, width, channels) to (channels, width, height) 336 | 2. Swap the color channels from RGB to BGR 337 | 3. Subtract the mean pixel value of the training dataset (unless you disable that feature). 338 | 339 | More information on these [here](https://groups.google.com/forum/#!topic/digits-users/FIh6VyU1XqQ), [here](https://github.com/NVIDIA/DIGITS/issues/59) and [here](https://github.com/NVIDIA/DIGITS/blob/v1.1.0/digits/model/tasks/caffe_train.py#L938-L961). 
340 | 
341 | 
342 | ```python 
343 | # load the mean ImageNet image (as distributed with Caffe) for subtraction 
344 | mu = np.load('/root/machineLearning/deepNeuralNet/caffe/datasets/ilsvrc12/ilsvrc_2012_mean.npy') 
345 | mu = mu.mean(1).mean(1)  # average over pixels to obtain the mean (BGR) pixel values 
346 | print 'mean-subtracted values:', zip('BGR', mu) 
347 | 
348 | # create transformer for the input called 'data' 
349 | transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape}) 
350 | 
351 | transformer.set_transpose('data', (2,0,1))  # move image channels to outermost dimension 
352 | transformer.set_mean('data', mu)  # subtract the dataset-mean value in each channel 
353 | transformer.set_raw_scale('data', 255)  # rescale from [0, 1] to [0, 255] 
354 | transformer.set_channel_swap('data', (2,1,0))  # swap channels from RGB to BGR 
355 | ``` 
356 | 
357 | mean-subtracted values: [('B', 104.0069879317889), ('G', 116.66876761696767), ('R', 122.6789143406786)] 
358 | 
359 | 
360 | If needed, we can reshape the data blob to meet our specifications. In this example we keep the network's default shape, but the batch size, number of channels and image size could be explicitly set as below. 
361 | 
362 | 
363 | ```python 
364 | # set the size of the input (we can skip this if we're happy 
365 | # with the default; we can also change it later, e.g., for different batch sizes) 
366 | #net.blobs['data'].reshape(50,  # batch size 
367 | #                          3,   # 3-channel (BGR) images 
368 | #                          227, 227)  # image size is 227x227 
369 | ``` 
370 | 
371 | Any image can now be loaded into Caffe. For simplicity, let us stick with Caffe's example image ($CAFFE_ROOT/examples/images/cat.jpg). The image is transformed as mentioned above using the transformer we defined, and finally plotted using matplotlib.pyplot, imported as plt. 
372 | 
373 | 
374 | ```python 
375 | image = caffe.io.load_image('/root/machineLearning/deepNeuralNet/caffe/datasets/images/samples/cat.jpg') 
376 | transformed_image = transformer.preprocess('data', image) 
377 | plt.imshow(image) 
378 | ``` 
379 | 
380 | 
381 | 
382 | 
383 | 
384 | 
385 | 
386 | 
387 | 
388 | ![png](output_22_1.png) 
389 | 
390 | 
391 | Great ! Now we have our net ready, and so is the image that we need to classify. Remember that data in Caffe is interpreted using blobs. Quoting from [Caffe's Documentation](http://caffe.berkeleyvision.org/tutorial/net_layer_blob.html): *As data and derivatives flow through the network in the forward and backward passes Caffe stores, communicates, and manipulates the information as blobs: the blob is the standard array and unified memory interface for the framework.* 
392 | 
393 | For Caffe to get information from the image, it needs to be copied into the memory allocated by Caffe. 
394 | 
395 | Once the image is loaded into memory, we can perform classification with it. To start classification, we call ***net.forward()*** and redirect its output to a variable named output (the name can be anything, obviously). The output probabilities are saved in vector format, one vector per image in the batch; our image is the first one in the batch, so its probabilities sit at the **[0]**th location and can be extracted by indexing accordingly. Finally, the predicted class of the image is obtained using argmax, which returns the index of the maximum value along an axis. 
396 | 
397 | 
398 | ```python 
399 | # copy the image data into the memory allocated for the net 
400 | net.blobs['data'].data[...] = transformed_image 
401 | 
402 | ### perform classification 
403 | output = net.forward() 
404 | 
405 | output_prob = output['prob'][0]  # the output probability vector for the first image in the batch 
406 | 
407 | print 'predicted class is:', output_prob.argmax() 
408 | ``` 
409 | 
410 | predicted class is: 281 
411 | 
412 | 
413 | In our case, for the cute cat, the predicted class is 281. Make sure that you are getting the same (just in case). 
414 | 
415 | To our eyes the image is a cute cat, agreed. To see what our net thinks it is, let us fetch the label of the predicted/classified image. Load the labels file from the dataset and output the specific label. 
416 | 
417 | 
418 | ```python 
419 | # load ImageNet labels 
420 | 
421 | labels_file = '/root/machineLearning/deepNeuralNet/caffe/datasets/ilsvrc12/synset_words.txt' 
422 | 
423 | if not os.path.exists(labels_file): 
424 |     !/root/caffe/data/ilsvrc12/get_ilsvrc_aux.sh 
425 | 
426 | labels = np.loadtxt(labels_file, str, delimiter='\t') 
427 | 
428 | print 'output label:', labels[output_prob.argmax()] 
429 | ``` 
430 | 
431 | output label: n02123045 tabby, tabby cat 
432 | 
433 | 
434 | What do you think about the prediction ? Fair ? Let us see a quantitative result. We will output the top five predictions from the output layer (softmax layer). 
435 | 
436 | 
437 | ```python 
438 | # sort top five predictions from softmax output 
439 | top_inds = output_prob.argsort()[::-1][:5]  # reverse sort and take five largest items 
440 | 
441 | print 'probabilities and labels:' 
442 | zip(output_prob[top_inds], labels[top_inds]) 
443 | ``` 
444 | 
445 | probabilities and labels: 
446 | 
447 | 
448 | 
449 | 
450 | 
451 | [(0.31243613, 'n02123045 tabby, tabby cat'), 
452 | (0.23797171, 'n02123159 tiger cat'), 
453 | (0.1238724, 'n02124075 Egyptian cat'), 
454 | (0.10075741, 'n02119022 red fox, Vulpes vulpes'), 
455 | (0.07095696, 'n02127052 lynx, catamount')] 
456 | 
457 | 
458 | 
459 | To measure how long a single forward pass through the network takes for this input (this is inference time, not training time), let us use the %timeit magic. 
460 | 
461 | 
462 | ```python 
463 | # find the time required for one forward pass through the network 
464 | %timeit net.forward() 
465 | ``` 
466 | 
467 | 1 loop, best of 3: 688 ms per loop 
468 | 
469 | 
470 | **blob.data.shape** can be used to find the shape of the data in the different layers of your net. Loop across net.blobs to get the shape of each layer. 
471 | 
472 | 
473 | ```python 
474 | # for each layer, show the output shape 
475 | for layer_name, blob in net.blobs.iteritems(): 
476 |     print layer_name + '\t' + str(blob.data.shape) 
477 | ``` 
478 | 
479 | data (10, 3, 227, 227) 
480 | conv1 (10, 96, 55, 55) 
481 | pool1 (10, 96, 27, 27) 
482 | norm1 (10, 96, 27, 27) 
483 | conv2 (10, 256, 27, 27) 
484 | pool2 (10, 256, 13, 13) 
485 | norm2 (10, 256, 13, 13) 
486 | conv3 (10, 384, 13, 13) 
487 | conv4 (10, 384, 13, 13) 
488 | conv5 (10, 256, 13, 13) 
489 | pool5 (10, 256, 6, 6) 
490 | fc6 (10, 4096) 
491 | fc7 (10, 4096) 
492 | fc8 (10, 1000) 
493 | prob (10, 1000) 
494 | 
495 | 
496 | 
497 | ```python 
498 | for layer_name, param in net.params.iteritems(): 
499 |     print layer_name + '\t' + str(param[0].data.shape), str(param[1].data.shape) 
500 | ``` 
501 | 
502 | conv1 (96, 3, 11, 11) (96,) 
503 | conv2 (256, 48, 5, 5) (256,) 
504 | conv3 (384, 256, 3, 3) (384,) 
505 | conv4 (384, 192, 3, 3) (384,) 
506 | conv5 (256, 192, 3, 3) (256,) 
507 | fc6 (4096, 9216) (4096,) 
508 | fc7 (4096, 4096) (4096,) 
509 | fc8 (1000, 4096) (1000,) 
510 | 
511 | 
512 | 
513 | ```python 
514 | def vis_square(data): 
515 |     """Take an array of shape (n, height, width) or (n, height, width, 3) 
516 |     and visualize each (height, width) thing in a grid of size approx. sqrt(n) by sqrt(n)""" 
517 | 
518 |     # normalize data for display 
519 |     data = (data - data.min()) / (data.max() - data.min()) 
520 | 
521 |     # force the number of filters to be square 
522 |     n = int(np.ceil(np.sqrt(data.shape[0]))) 
523 |     padding = (((0, n ** 2 - data.shape[0]), 
524 |                 (0, 1), (0, 1))  # add some space between filters 
525 |                + ((0, 0),) * (data.ndim - 3))  # don't pad the last dimension (if there is one) 
526 |     data = np.pad(data, padding, mode='constant', constant_values=1)  # pad with ones (white) 
527 | 
528 |     # tile the filters into an image 
529 |     data = data.reshape((n, n) + data.shape[1:]).transpose((0, 2, 1, 3) + tuple(range(4, data.ndim + 1))) 
530 |     data = data.reshape((n * data.shape[1], n * data.shape[3]) + data.shape[4:]) 
531 | 
532 |     plt.imshow(data); plt.axis('off') 
533 | ``` 
534 | 
535 | We can pull out the weights and biases of each layer to visualize the changes happening in it. This is a powerful way of analyzing the net, as it gives intuition into what is happening inside. The layer-by-layer image drawn above, together with the visualization of what is happening in each layer, will help us understand the net in more depth. Here we use the **conv1** layer for the same. 
536 | 
537 | 
538 | ```python 
539 | # the parameters are a list of [weights, biases] 
540 | filters = net.params['conv1'][0].data 
541 | filters.shape 
542 | ``` 
543 | 
544 | 
545 | 
546 | 
547 | (96, 3, 11, 11) 
548 | 
549 | 
550 | 
551 | If you noticed, the shape of the filters, (96, 3, 11, 11), is different from the (n, height, width, 3) layout that the vis_square function expects. So we need to transpose the array accordingly before passing it into the function to visualize the layer. We pass in the weights of the first convolution layer; the image below shows that this lowest layer works as an edge detector of sorts. 
552 | 
553 | 
554 | ```python 
555 | vis_square(filters.transpose(0, 2, 3, 1)) 
556 | ``` 
557 | 
558 | 
559 | ![png](output_38_0.png) 
560 | 
561 | 
562 | To visualize the data itself, we can use net.blobs instead of net.params. This gives us a visual clue of what the data looks like as it flows through the net. We do it for the **conv1** layer. 
563 | 
564 | 
565 | ```python 
566 | feat = net.blobs['conv1'].data[0, :36] 
567 | vis_square(feat) 
568 | ``` 
569 | 
570 | 
571 | ![png](output_40_0.png) 
572 | 
573 | 
574 | Similarly for the **pool5** layer. 
575 | 
576 | 
577 | ```python 
578 | feat = net.blobs['pool5'].data[0] 
579 | vis_square(feat) 
580 | ``` 
581 | 
582 | 
583 | ![png](output_42_0.png) 
584 | 
585 | 
586 | We can plot graphs using the various data saved in the layers. The fully connected layer fc6 results in the following plot. 
587 | 
588 | 
589 | ```python 
590 | feat = net.blobs['fc6'].data[0] 
591 | plt.subplot(2, 1, 1) 
592 | plt.plot(feat.flat) 
593 | plt.subplot(2, 1, 2) 
594 | _ = plt.hist(feat.flat[feat.flat > 0], bins=100) 
595 | ``` 
596 | 
597 | 
598 | ![png](output_44_0.png) 
599 | 
600 | 
601 | The probabilities predicted for the particular image we classified can be plotted as well. The X-axis is the class label number and the Y-axis is the predicted probability for that class. 
602 | 
603 | 
604 | ```python 
605 | feat = net.blobs['prob'].data[0] 
606 | plt.figure(figsize=(15, 3)) 
607 | plt.plot(feat.flat) 
608 | ``` 
609 | 
610 | 
611 | 
612 | 
613 | [] 
614 | 
615 | 
616 | 
617 | 
618 | ![png](output_46_1.png) 
619 | 
620 | 
621 | Now, let us download an image of our own and try to classify it. Here, an HTTP link is used to download the image, which is then loaded into Caffe and preprocessed using the transformer we defined earlier. 
622 | 
623 | Once we are done with the preprocessing, we have a formatted image in memory that is ready to be classified. Perform the classification by running **net.forward()**. The output probabilities can be found just like earlier. The top 5 probabilities are found, the image is displayed and the 5 probabilities are printed out. 
624 | 
625 | 
626 | ```python 
627 | # download an image 
628 | # for example: 
629 | # my_image_url = "https://upload.wikimedia.org/wikipedia/commons/b/be/Orang_Utan%2C_Semenggok_Forest_Reserve%2C_Sarawak%2C_Borneo%2C_Malaysia.JPG" 
630 | # my_image_url = "https://www.petfinder.com/wp-content/uploads/2012/11/140272627-grooming-needs-senior-cat-632x475.jpg" # paste your URL here 
631 | # my_image_url = "http://kids.nationalgeographic.com/content/dam/kids/photos/animals/Mammals/H-P/lion-male-roar.jpg" 
632 | 
633 | my_image_url = "http://www.depositagift.com/img/bank_assets/Band-Aid.jpg" 
634 | 
635 | !wget -O image.jpg $my_image_url 
636 | 
637 | # transform it and copy it into the net 
638 | image = caffe.io.load_image('image.jpg') 
639 | net.blobs['data'].data[...] = transformer.preprocess('data', image) 
640 | 
641 | # perform classification 
642 | net.forward() 
643 | 
644 | # obtain the output probabilities 
645 | output_prob = net.blobs['prob'].data[0] 
646 | 
647 | # sort top five predictions from softmax output 
648 | top_inds = output_prob.argsort()[::-1][:5] 
649 | 
650 | plt.imshow(image) 
651 | 
652 | print 'probabilities and labels:' 
653 | zip(output_prob[top_inds], labels[top_inds]) 
654 | ``` 
655 | 
656 | --2016-04-16 03:11:42-- http://www.depositagift.com/img/bank_assets/Band-Aid.jpg 
657 | Resolving www.depositagift.com (www.depositagift.com)... 50.28.4.115 
658 | Connecting to www.depositagift.com (www.depositagift.com)|50.28.4.115|:80... connected. 
659 | HTTP request sent, awaiting response... 
    Length: 34197 (33K) [image/jpeg]
    Saving to: 'image.jpg'

    100%[======================================>] 34,197      --.-K/s   in 0.1s

    2016-04-16 03:11:42 (239 KB/s) - 'image.jpg' saved [34197/34197]

    probabilities and labels:

    [(0.94379258, 'n02786058 Band Aid'),
     (0.0064510447, 'n03530642 honeycomb'),
     (0.0061318246, 'n07684084 French loaf'),
     (0.0045337547, 'n04476259 tray'),
     (0.0042723794, 'n03314780 face powder')]

![png](output_48_2.png)

That's well classified! A 94 percent probability is good indeed, though the image itself is a simple one, with a clean background and a decent size.

Next, we will look at how to create a network of our own by defining our own architecture and solver parameters. We will use Python code to create the prototxt files for us automatically.

# Solving in Python with LeNet

In this example, we'll explore learning with Caffe in Python, using the fully-exposed `Solver` interface. This example is hosted in Caffe's GitHub repository and is available in Caffe's examples folder.

The previous tutorial explained a classification example in Caffe in detail; the classification there was done with a network extracted from a pretrained one. In this tutorial, we will see how to create a neural net of our own by defining it from scratch.

Caffe's Python interface, `pycaffe`, needs to be built beforehand. If you followed the [installation instructions](https://github.com/arundasan91/Caffe/blob/master/Caffe%20Installation%20Instructions.md) in the previous tutorial, pycaffe is already built and active.

* We'll be using the provided LeNet example data and networks (make sure you've downloaded the data and created the databases, as below). More on LeNet [here](http://yann.lecun.com/exdb/lenet/).

```python
# run scripts from caffe root
os.chdir(caffe_root)
# Download data
!data/mnist/get_mnist.sh
# Prepare data
!examples/mnist/create_mnist.sh
# back to examples
os.chdir('examples')
```

    Downloading...
    Creating lmdb...
    Done.

### 2. Creating the net

Now let's make a variant of LeNet, the classic 1989 convnet architecture.

We'll need two external files to help out:
* the net `prototxt`, defining the architecture and pointing to the train/test data
* the solver `prototxt`, defining the learning parameters

We start by creating the net. We'll write the net in a succinct and natural way as Python code that serializes to Caffe's protobuf model format.

This network expects to read from pregenerated LMDBs, but reading directly from `ndarray`s is also possible using `MemoryDataLayer`.
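As an aside, here is a minimal sketch of that in-memory alternative. This path is not used anywhere below, and the 1×28×28 MNIST geometry and the exact label handling are assumptions on my part, so treat it as a starting point rather than a recipe:

```python
# a minimal sketch, not used in this tutorial: feed ndarrays via a MemoryData layer
import numpy as np
import caffe
from caffe import layers as L

n = caffe.NetSpec()
# a MemoryData layer must be told the input geometry up front
n.data, n.label = L.MemoryData(batch_size=64, channels=1, height=28, width=28, ntop=2)
# ... the rest of the net would be defined exactly as in lenet() below ...

# at run time, float32 arrays would be handed to the instantiated net, e.g.:
# net.set_input_arrays(images.astype(np.float32), image_labels.astype(np.float32))
```

Back to the LMDB-backed definition that we will actually use: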
```python
from caffe import layers as L, params as P

def lenet(lmdb, batch_size):
    # our version of LeNet: a series of linear and simple nonlinear transformations
    n = caffe.NetSpec()

    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                             transform_param=dict(scale=1./255), ntop=2)

    n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)
    n.fc1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.fc1, in_place=True)
    n.score = L.InnerProduct(n.relu1, num_output=10, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.score, n.label)

    return n.to_proto()

with open('mnist/lenet_auto_train.prototxt', 'w') as f:
    f.write(str(lenet('mnist/mnist_train_lmdb', 64)))

with open('mnist/lenet_auto_test.prototxt', 'w') as f:
    f.write(str(lenet('mnist/mnist_test_lmdb', 100)))
```

The neural net is defined as a function `lenet` with parameters `lmdb` and `batch_size`: `lmdb` points to the dataset and `batch_size` is the number of images fed in at once.

    n = caffe.NetSpec()

The `NetSpec` class builds the `NetParameter` messages that define each and every layer of the net. A formal definition can be found [here](https://github.com/BVLC/caffe/blob/master/python/caffe/net_spec.py) and reads:
***"A NetSpec contains a set of Tops (assigned directly as attributes). Calling NetSpec.to_proto generates a NetParameter containing all of the layers needed to produce all of the assigned Tops, using the assigned names."***

Once we have a `NetSpec()` instance, we can start defining our net. The input data layer is defined first.

    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                             transform_param=dict(scale=1./255), ntop=2)

The `batch_size` parameter is the one we passed to the function. The dataset contains both data values and corresponding labels, so the input layer has an `ntop` value of 2. The data source is an `lmdb` database, and the backend is accordingly set to `P.Data.LMDB`. The `transform_param` field specifies the feature scaling coefficient, which maps the [0, 255] pixel data to [0, 1].

    n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))
    n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)

The first stage consists of a convolutional layer and a pooling layer. The input to the `conv1` layer is the `n.data` layer. `kernel_size` and `num_output` determine the size of the filter kernel and the number of output feature maps, respectively. The weights are initially filled using the `xavier` algorithm. The short calculation below makes the resulting layer geometry concrete.
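This arithmetic (mine, not part of the original example) traces how an unpadded, stride-1 convolution and 2×2 stride-2 pooling shrink a 28×28 MNIST image; the numbers match the blob shapes we inspect when loading the solver later on.

```python
# output size of an unpadded conv/pool layer: (input - kernel) / stride + 1
def out_size(in_size, kernel, stride=1):
    return (in_size - kernel) // stride + 1

sz = out_size(28, 5)      # conv1: 28x28 -> 24x24
sz = out_size(sz, 2, 2)   # pool1: 24x24 -> 12x12
sz = out_size(sz, 5)      # conv2: 12x12 -> 8x8
sz = out_size(sz, 2, 2)   # pool2: 8x8   -> 4x4
print sz                  # 4, so fc1 sees 50 * 4 * 4 = 800 inputs
```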
Similarly, the second stage consists of a convolution and a pooling layer.

    n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))
    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)

The second stage feeds into a fully connected layer followed by a rectified linear unit (ReLU). Another fully connected layer then produces the class scores, and finally a softmax loss layer is attached.

    n.fc1 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))
    n.relu1 = L.ReLU(n.fc1, in_place=True)
    n.score = L.InnerProduct(n.relu1, num_output=10, weight_filler=dict(type='xavier'))
    n.loss = L.SoftmaxWithLoss(n.score, n.label)

If you found the term `xavier` in the `weight_filler` section to be odd, check these links [1](http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf), [2](http://andyljones.tumblr.com/post/110998971763/an-explanation-of-xavier-initialization) to get to know it better.

In short, the `xavier` algorithm is a weight filler that Caffe applies at initialization time. It scales the random initial weights to the fan-in of each layer (by default Caffe draws them uniformly from $[-\sqrt{3/n_{in}}, \sqrt{3/n_{in}}]$, where $n_{in}$ is the number of inputs to a neuron), which keeps activation magnitudes reasonable as signals pass through many deep layers.

The net has been written to disk in a more verbose but human-readable serialization format using Google's protobuf library. You can read, write, and modify this description directly. Let's take a look at the train net.

```python
!cat mnist/lenet_auto_train.prototxt
```

    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      transform_param {
        scale: 0.00392156862745
      }
      data_param {
        source: "mnist/mnist_train_lmdb"
        batch_size: 64
        backend: LMDB
      }
    }
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      convolution_param {
        num_output: 20
        kernel_size: 5
        weight_filler {
          type: "xavier"
        }
      }
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "conv1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "conv2"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2"
      convolution_param {
        num_output: 50
        kernel_size: 5
        weight_filler {
          type: "xavier"
        }
      }
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "conv2"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "fc1"
      type: "InnerProduct"
      bottom: "pool2"
      top: "fc1"
      inner_product_param {
        num_output: 500
        weight_filler {
          type: "xavier"
        }
      }
    }
    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "fc1"
      top: "fc1"
    }
    layer {
      name: "score"
      type: "InnerProduct"
      bottom: "fc1"
      top: "score"
      inner_product_param {
        num_output: 10
        weight_filler {
          type: "xavier"
        }
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "score"
      bottom: "label"
      top: "loss"
    }

```python
!cat mnist/lenet_auto_solver.prototxt
```

    # The train/test net protocol buffer definition
    train_net: "mnist/lenet_auto_train.prototxt"
    test_net: "mnist/lenet_auto_test.prototxt"
    # test_iter specifies how many forward passes the test should carry out.
    # In the case of MNIST, we have test batch size 100 and 100 test iterations,
    # covering the full 10,000 testing images.
    test_iter: 100
    # Carry out testing every 500 training iterations.
    test_interval: 500
    # The base learning rate, momentum and the weight decay of the network.
    base_lr: 0.01
    momentum: 0.9
    weight_decay: 0.0005
    # The learning rate policy
    lr_policy: "inv"
    gamma: 0.0001
    power: 0.75
    # Display every 100 iterations
    display: 100
    # The maximum number of iterations
    max_iter: 10000
    # snapshot intermediate results
    snapshot: 5000
    snapshot_prefix: "mnist/lenet"

### 3. Loading and checking the solver

* Let's pick a device and load the solver. We'll use SGD (with momentum), but other methods (such as Adagrad and Nesterov's accelerated gradient) are also available.

```python
from pylab import *
%matplotlib inline
```

```python
#caffe.set_device(0)
caffe.set_mode_cpu()

### load the solver and create train and test nets
solver = None  # ignore this workaround for lmdb data (can't instantiate two solvers on the same data)
solver = caffe.SGDSolver('mnist/lenet_auto_solver.prototxt')
```

* To get an idea of the architecture of our net, we can check the dimensions of the intermediate features (blobs) and parameters (these will also be useful to refer to when manipulating data later).

```python
# each output is (batch size, feature dim, spatial dim)
[(k, v.data.shape) for k, v in solver.net.blobs.items()]
```

    [('data', (64, 1, 28, 28)),
     ('label', (64,)),
     ('conv1', (64, 20, 24, 24)),
     ('pool1', (64, 20, 12, 12)),
     ('conv2', (64, 50, 8, 8)),
     ('pool2', (64, 50, 4, 4)),
     ('fc1', (64, 500)),
     ('score', (64, 10)),
     ('loss', ())]

```python
# just print the weight sizes (we'll omit the biases)
[(k, v[0].data.shape) for k, v in solver.net.params.items()]
```

    [('conv1', (20, 1, 5, 5)),
     ('conv2', (50, 20, 5, 5)),
     ('fc1', (500, 800)),
     ('score', (10, 500))]

* Before taking off, let's check that everything is loaded as we expect. We'll run a forward pass on the train and test nets and check that they contain our data.

```python
solver.net.forward()  # train net
solver.test_nets[0].forward()  # test net (there can be more than one)
```

    {'loss': array(2.3281469345092773, dtype=float32)}

```python
# we use a little trick to tile the first eight images
imshow(solver.net.blobs['data'].data[:8, 0].transpose(1, 0, 2).reshape(28, 8*28), cmap='gray'); axis('off')
print 'train labels:', solver.net.blobs['label'].data[:8]
```

    train labels: [ 5.  0.  4.  1.  9.  2.  1.  3.]

![png](output_68_1.png)

```python
imshow(solver.test_nets[0].blobs['data'].data[:8, 0].transpose(1, 0, 2).reshape(28, 8*28), cmap='gray'); axis('off')
print 'test labels:', solver.test_nets[0].blobs['label'].data[:8]
```

    test labels: [ 7.  2.  1.  0.  4.  1.  4.  9.]
![png](output_69_1.png)

### 4. Stepping the solver

Both train and test nets seem to be loading data, and to have correct labels.

* Let's take one step of (minibatch) SGD and see what happens.

```python
solver.step(1)
```

Do we have gradients propagating through our filters? Let's see the updates to the first layer, shown here as a $4 \times 5$ grid of $5 \times 5$ filters.

```python
imshow(solver.net.params['conv1'][0].diff[:, 0].reshape(4, 5, 5, 5)
       .transpose(0, 2, 1, 3).reshape(4*5, 5*5), cmap='gray'); axis('off')
```

    (-0.5, 24.5, 19.5, -0.5)

![png](output_73_1.png)

### 5. Writing a custom training loop

Something is happening. Let's run the net for a while, keeping track of a few things as it goes.
Note that this process will be the same as if training through the `caffe` binary. In particular:
* logging will continue to happen as normal
* snapshots will be taken at the interval specified in the solver prototxt (here, every 5000 iterations)
* testing will happen at the interval specified (here, every 500 iterations)

Since we have control of the loop in Python, we're free to compute additional things as we go, as we show below. We can do many other things as well, for example:
* write a custom stopping criterion
* change the solving process by updating the net in the loop

```python
%%time
niter = 200
test_interval = 25
# losses will also be stored in the log
train_loss = zeros(niter)
test_acc = zeros(int(np.ceil(niter / test_interval)))
output = zeros((niter, 8, 10))

# the main solver loop
for it in range(niter):
    solver.step(1)  # SGD by Caffe

    # store the train loss
    train_loss[it] = solver.net.blobs['loss'].data

    # store the output on the first test batch
    # (start the forward pass at conv1 to avoid loading new data)
    solver.test_nets[0].forward(start='conv1')
    output[it] = solver.test_nets[0].blobs['score'].data[:8]

    # run a full test every so often
    # (Caffe can also do this for us and write to a log, but we show here
    #  how to do it directly in Python, where more complicated things are easier.)
    if it % test_interval == 0:
        print 'Iteration', it, 'testing...'
        correct = 0
        for test_it in range(100):
            solver.test_nets[0].forward()
            correct += sum(solver.test_nets[0].blobs['score'].data.argmax(1)
                           == solver.test_nets[0].blobs['label'].data)
        test_acc[it // test_interval] = correct / 1e4
```

    Iteration 0 testing...
    Iteration 25 testing...
    Iteration 50 testing...
    Iteration 75 testing...
    Iteration 100 testing...
    Iteration 125 testing...
    Iteration 150 testing...
    Iteration 175 testing...
    CPU times: user 8min 43s, sys: 22min 18s, total: 31min 1s
    Wall time: 1min 17s

```python
_, ax1 = subplots()
ax2 = ax1.twinx()
ax1.plot(arange(niter), train_loss)
ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
ax2.set_ylabel('test accuracy')
ax2.set_title('Test Accuracy: {:.2f}'.format(test_acc[-1]))
```

![png](output_76_1.png)

The loss seems to have dropped quickly and converged (apart from stochastic fluctuations), while the accuracy rose correspondingly. Hooray!

* Since we saved the results on the first test batch, we can watch how our prediction scores evolved. We'll plot time on the $x$ axis and each possible label on the $y$, with lightness indicating confidence.

```python
for i in range(8):
    figure(figsize=(2, 2))
    imshow(solver.test_nets[0].blobs['data'].data[i, 0], cmap='gray')
    figure(figsize=(10, 2))
    imshow(output[:50, i].T, interpolation='nearest', cmap='gray')
    xlabel('iteration')
    ylabel('label')
```

![png](output_78_0.png)

![png](output_78_1.png)

![png](output_78_2.png)

![png](output_78_3.png)

![png](output_78_4.png)

![png](output_78_5.png)

![png](output_78_6.png)

![png](output_78_7.png)

![png](output_78_8.png)

![png](output_78_9.png)

![png](output_78_10.png)

![png](output_78_11.png)

![png](output_78_12.png)

![png](output_78_13.png)

![png](output_78_14.png)

![png](output_78_15.png)

We started with little idea about any of these digits, and ended up with correct classifications for each. If you've been following along, you'll see the last digit is the most difficult, a slanted "9" that's (understandably) most confused with "4".

* Note that these are the "raw" output scores rather than the softmax-computed probability vectors. The latter, shown below, makes it easier to see the confidence of our net (but harder to see the scores for less likely digits).
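For reference, the softmax that converts a vector of raw scores $z$ into probabilities is

$$p_i = \frac{e^{z_i}}{\sum_j e^{z_j}},$$

which is exactly what the `exp(...) / exp(...).sum(0)` expression in the next cell computes.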
```python
for i in range(8):
    figure(figsize=(2, 2))
    imshow(solver.test_nets[0].blobs['data'].data[i, 0], cmap='gray')
    figure(figsize=(10, 2))
    imshow(exp(output[:50, i].T) / exp(output[:50, i].T).sum(0), interpolation='nearest', cmap='gray')
    xlabel('iteration')
    ylabel('label')
```

![png](output_80_0.png)

![png](output_80_1.png)

![png](output_80_2.png)

![png](output_80_3.png)

![png](output_80_4.png)

![png](output_80_5.png)

![png](output_80_6.png)

![png](output_80_7.png)

![png](output_80_8.png)

![png](output_80_9.png)

![png](output_80_10.png)

![png](output_80_11.png)

![png](output_80_12.png)

![png](output_80_13.png)

![png](output_80_14.png)

![png](output_80_15.png)

### 6. Experiment with architecture and optimization

Now that we've defined, trained, and tested LeNet, there are many possible next steps:

- Define new architectures for comparison
- Tune optimization by setting `base_lr` and the like, or simply train longer
- Switch the solver type from `SGD` to an adaptive method like `AdaDelta` or `Adam`

**Changes that I made to the architecture:**

1. Switched the nonlinearity from `ReLU` to `ELU`.
2. Added an extra fully connected layer (`fc1`) taking the `pool1` output (see the comments in the code below).
3. Switched the solver type to `Nesterov`.

```python
train_net_path = 'mnist/custom_auto_train.prototxt'
test_net_path = 'mnist/custom_auto_test.prototxt'
solver_config_path = 'mnist/custom_auto_solver.prototxt'

### define net
def custom_net(lmdb, batch_size):
    # define your own net!
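    # A sketch of the topology assembled below (my reading of the changes above):
    #
    #   data -> conv1 -> pool1 -> conv2 -> pool2 -> fc2 -> ELU -> score -> loss
    #                      \-> fc1 (a dead-end branch: no later layer consumes it)
    #
    # The ELU is applied in place on fc2, so score reads the activated values.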
    n = caffe.NetSpec()

    # keep this data layer for all networks
    n.data, n.label = L.Data(batch_size=batch_size, backend=P.Data.LMDB, source=lmdb,
                             transform_param=dict(scale=1./255), ntop=2)

    # EDIT HERE this is the LeNet variant we have already tried
    n.conv1 = L.Convolution(n.data, kernel_size=5, num_output=20, weight_filler=dict(type='xavier'))

    n.pool1 = L.Pooling(n.conv1, kernel_size=2, stride=2, pool=P.Pooling.MAX)

    # the extra fully connected layer; note that it branches off pool1,
    # while conv2 below also reads pool1 directly
    n.fc1 = L.InnerProduct(n.pool1, num_output=500, weight_filler=dict(type='xavier'))

    n.conv2 = L.Convolution(n.pool1, kernel_size=5, num_output=50, weight_filler=dict(type='xavier'))

    n.pool2 = L.Pooling(n.conv2, kernel_size=2, stride=2, pool=P.Pooling.MAX)

    n.fc2 = L.InnerProduct(n.pool2, num_output=500, weight_filler=dict(type='xavier'))

    # EDIT HERE consider L.ELU or L.Sigmoid for the nonlinearity
    # (in_place=True means fc2's blob itself holds the activated values)
    n.relu1 = L.ELU(n.fc2, in_place=True)

    n.score = L.InnerProduct(n.fc2, num_output=10, weight_filler=dict(type='xavier'))

    # keep this loss layer for all networks
    n.loss = L.SoftmaxWithLoss(n.score, n.label)

    return n.to_proto()

with open(train_net_path, 'w') as f:
    f.write(str(custom_net('mnist/mnist_train_lmdb', 64)))
with open(test_net_path, 'w') as f:
    f.write(str(custom_net('mnist/mnist_test_lmdb', 100)))

### define solver
from caffe.proto import caffe_pb2
s = caffe_pb2.SolverParameter()

# Set a seed for reproducible experiments:
# this controls for randomization in training.
s.random_seed = 0xCAFFE

# Specify locations of the train and (maybe) test networks.
s.train_net = train_net_path
s.test_net.append(test_net_path)
s.test_interval = 500  # Test after every 500 training iterations.
s.test_iter.append(100)  # Test on 100 batches each time we test.

s.max_iter = 10000  # no. of times to update the net (training iterations)

# EDIT HERE to try different solvers
# solver types include "SGD", "Adam", and "Nesterov" among others.
#s.type = "SGD"
s.type = "Nesterov"

# Set the initial learning rate for SGD.
s.base_lr = 0.01  # EDIT HERE to try different learning rates
# Set momentum to accelerate learning by
# taking weighted average of current and previous updates.
s.momentum = 0.9
# Set weight decay to regularize and prevent overfitting
s.weight_decay = 5e-4

# Set `lr_policy` to define how the learning rate changes during training.
# Our default LeNet used 'inv'; here we use 'fixed', the simplest policy,
# which keeps the learning rate constant. EDIT HERE to compare the two
# (gamma and power below only take effect under 'inv').
#s.lr_policy = 'inv'
s.lr_policy = 'fixed'
s.gamma = 0.0001
s.power = 0.75

# Display the current training loss every 1000 iterations.
s.display = 1000

# Snapshots are files used to store networks we've trained.
# We'll snapshot every 5K iterations -- twice during training.
s.snapshot = 5000
s.snapshot_prefix = 'mnist/custom_net'

# Solve on the CPU (set this to caffe_pb2.SolverParameter.GPU to train on the GPU).
s.solver_mode = caffe_pb2.SolverParameter.CPU

# Write the solver description to the configured path.
with open(solver_config_path, 'w') as f:
    f.write(str(s))

### load the solver and create train and test nets
solver = None  # ignore this workaround for lmdb data (can't instantiate two solvers on the same data)
solver = caffe.get_solver(solver_config_path)

### solve
niter = 250  # EDIT HERE increase to train for longer
test_interval = niter / 10
# losses will also be stored in the log
train_loss = zeros(niter)
test_acc = zeros(int(np.ceil(niter / test_interval)))

# the main solver loop
for it in range(niter):
    solver.step(1)  # SGD by Caffe

    # store the train loss
    train_loss[it] = solver.net.blobs['loss'].data

    # run a full test every so often
    # (Caffe can also do this for us and write to a log, but we show here
    #  how to do it directly in Python, where more complicated things are easier.)
    if it % test_interval == 0:
        print 'Iteration', it, 'testing...'
        correct = 0
        for test_it in range(100):
            solver.test_nets[0].forward()
            correct += sum(solver.test_nets[0].blobs['score'].data.argmax(1)
                           == solver.test_nets[0].blobs['label'].data)
        test_acc[it // test_interval] = correct / 1e4

_, ax1 = subplots()
ax2 = ax1.twinx()
ax1.plot(arange(niter), train_loss)
ax2.plot(test_interval * arange(len(test_acc)), test_acc, 'r')
ax1.set_xlabel('iteration')
ax1.set_ylabel('train loss')
ax2.set_ylabel('test accuracy')
ax2.set_title('Custom Test Accuracy: {:.2f}'.format(test_acc[-1]))
```

    Iteration 0 testing...
    Iteration 25 testing...
    Iteration 50 testing...
    Iteration 75 testing...
    Iteration 100 testing...
    Iteration 125 testing...
    Iteration 150 testing...
    Iteration 175 testing...
    Iteration 200 testing...
    Iteration 225 testing...

![png](output_82_2.png)

We successfully improved the accuracy from 94 percent to 96 percent by making the changes listed above. Experimenting with the architecture and solver like this is a good way to go.

Caffe's examples are a great source of information. Next stop: finding a solution to a problem of our own.

Learn from mistakes. Happy coding!