├── LICENSE ├── README.md ├── matlab ├── +dagnn │ └── Combine.m ├── DepthMapPrediction.m ├── error_metrics.m ├── evaluateMake3D.m ├── evaluateNYU.m └── setupMatConvNet.m └── tensorflow ├── models ├── __init__.py ├── fcrn.py └── network.py └── predict.py /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2016, Iro Laina 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | * Redistributions of source code must retain the above copyright notice, this 8 | list of conditions and the following disclaimer. 9 | 10 | * Redistributions in binary form must reproduce the above copyright notice, 11 | this list of conditions and the following disclaimer in the documentation 12 | and/or other materials provided with the distribution. 13 | 14 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 15 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 16 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 17 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 18 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 19 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 20 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 21 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 22 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 23 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 24 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Deeper Depth Prediction with Fully Convolutional Residual Networks 2 | 3 | By [Iro Laina](http://campar.in.tum.de/Main/IroLaina), [Christian Rupprecht](http://campar.in.tum.de/Main/ChristianRupprecht), [Vasileios Belagiannis](http://www.robots.ox.ac.uk/~vb/), [Federico Tombari](http://campar.in.tum.de/Main/FedericoTombari), [Nassir Navab](http://campar.in.tum.de/Main/NassirNavab). 4 | 5 | ## Contents 6 | 0. [Introduction](#introduction) 7 | 0. [Quick Guide](#quick-guide) 8 | 0. [Models](#models) 9 | 0. [Results](#results) 10 | 0. [Citation](#citation) 11 | 0. [License](#license) 12 | 13 | 14 | ## Introduction 15 | 16 | This repository contains the CNN models trained for depth prediction from a single RGB image, as described in the paper "[Deeper Depth Prediction with Fully Convolutional Residual Networks](https://arxiv.org/abs/1606.00373)". The provided models are those used to obtain the results reported in the paper on the benchmark datasets NYU Depth v2 (indoor scenes) and Make3D (outdoor scenes). Moreover, the provided code can be used for inference on arbitrary images. 17 | 18 | 19 | ## Quick Guide 20 | 21 | The trained models are currently provided in two frameworks, MatConvNet and TensorFlow. Please read below for more information on how to get started. 22 | 23 | ### TensorFlow 24 | The code provided in the *tensorflow* folder requires a working installation of the [TensorFlow](https://www.tensorflow.org/) library (any platform). 25 | The model's graph is constructed in ```fcrn.py``` and the corresponding weights can be downloaded using the link below.
The implementation is based on [ethereon's](https://github.com/ethereon/caffe-tensorflow) Caffe-to-TensorFlow conversion tool. 26 | ```predict.py``` provides sample code for using the network to predict the depth map of an input image. Use ```python predict.py NYU_FCRN.ckpt yourimage.jpg``` to try the code. 27 | 28 | ### MatConvNet 29 | 30 | **Prerequisites** 31 | 32 | The code provided in the *matlab* folder requires the [MatConvNet toolbox](http://www.vlfeat.org/matconvnet/) for CNNs. A version of the library equal to or newer than 1.0-beta20 must be successfully compiled, either with or without GPU support. 33 | Furthermore, the user should modify ``` matconvnet_path = '../matconvnet-1.0-beta20' ``` within `evaluateNYU.m` and `evaluateMake3D.m` so that it points to the directory where the library is stored. 34 | 35 | **How-to** 36 | 37 | To acquire the predicted depth maps and run the evaluation on the NYU or Make3D *test sets*, simply run `evaluateNYU.m` or `evaluateMake3D.m` respectively. Please note that all required data and models will then be downloaded automatically (if they do not already exist) and no further user intervention is needed, except for setting the options `opts` and `netOpts` as preferred. Make sure that you have enough free disk space (up to 5 GB). The predictions will eventually be saved in a .mat file in the specified directory. 38 | 39 | Alternatively, one can run `DepthMapPrediction.m` to manually use a trained model in test mode and predict the depth maps of arbitrary images. 40 | 41 | ## Models 42 | 43 | The models are fully convolutional and apply the residual learning idea also to the upsampling CNN layers. Here we provide the fastest variant, in which interleaving of feature maps is used for upsampling. For this reason, a custom layer `+dagnn/Combine.m` is provided. 44 | 45 | The trained models - namely **ResNet-UpProj** in the paper - can also be downloaded here: 46 | 47 | - NYU Depth v2: [MatConvNet model](http://campar.in.tum.de/files/rupprecht/depthpred/NYU_ResNet-UpProj.zip), [TensorFlow model (.npy)](http://campar.in.tum.de/files/rupprecht/depthpred/NYU_ResNet-UpProj.npy), [TensorFlow model (.ckpt)](http://campar.in.tum.de/files/rupprecht/depthpred/NYU_FCRN-checkpoint.zip) 48 | - Make3D: [MatConvNet model](http://campar.in.tum.de/files/rupprecht/depthpred/Make3D_ResNet-UpProj.zip), TensorFlow model (soon) 49 | 50 | 51 | ## Results 52 | 53 | **NEW!** The predictions for the validation set of the NYU Depth v2 dataset can also be downloaded [here](http://campar.in.tum.de/files/rupprecht/depthpred/predictions_NYUval.mat) (.mat). 54 | 55 | In the following tables, we report the results that should be obtained after evaluation and also compare against other recent methods for depth prediction from a single image.
56 | - Error metrics on NYU Depth v2: 57 | 58 | | State of the art on NYU | rel | rms | log10 | 59 | |-----------------------------|:-----:|:-----:|:-----:| 60 | | [Roy & Todorovic](http://web.engr.oregonstate.edu/~sinisa/research/publications/cvpr16_NRF.pdf) (_CVPR 2016_) | 0.187 | 0.744 | 0.078 | 61 | | [Eigen & Fergus](http://cs.nyu.edu/~deigen/dnl/) (_ICCV 2015_) | 0.158 | 0.641 | - | 62 | | **Ours** | **0.127** | **0.573** | **0.055** | 63 | 64 | - Error metrics on Make3D: 65 | 66 | | State of the art on Make3D | rel | rms | log10 | 67 | |-----------------------------|:-----:|:-----:|:-----:| 68 | | [Liu et al.](https://bitbucket.org/fayao/dcnf-fcsp) (_CVPR 2015_) | 0.314 | 8.60 | 0.119 | 69 | | [Li et al.](http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Li_Depth_and_Surface_2015_CVPR_paper.pdf) (_CVPR 2015_) | 0.278 | 7.19 | 0.092 | 70 | | **Ours** | **0.175** | **4.45** | **0.072** | 71 | 72 | - Qualitative results: 73 | ![Results](http://campar.in.tum.de/files/rupprecht/depthpred/images.jpg) 74 | 75 | ## Citation 76 | 77 | If you use this method in your research, please cite: 78 | 79 | @inproceedings{laina2016deeper, 80 | title={Deeper depth prediction with fully convolutional residual networks}, 81 | author={Laina, Iro and Rupprecht, Christian and Belagiannis, Vasileios and Tombari, Federico and Navab, Nassir}, 82 | booktitle={3D Vision (3DV), 2016 Fourth International Conference on}, 83 | pages={239--248}, 84 | year={2016}, 85 | organization={IEEE} 86 | } 87 | 88 | ## License 89 | 90 | Simplified BSD License 91 | 92 | Copyright (c) 2016, Iro Laina 93 | All rights reserved. 94 | 95 | Redistribution and use in source and binary forms, with or without 96 | modification, are permitted provided that the following conditions are met: 97 | 98 | * Redistributions of source code must retain the above copyright notice, this 99 | list of conditions and the following disclaimer. 100 | 101 | * Redistributions in binary form must reproduce the above copyright notice, 102 | this list of conditions and the following disclaimer in the documentation 103 | and/or other materials provided with the distribution. 104 | 105 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 106 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 107 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 108 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 109 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 110 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 111 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 112 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 113 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 114 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
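As context for the custom `+dagnn/Combine.m` layer listed below: it doubles the spatial resolution of feature maps by interleaving the responses of four convolutions, which is the core of the fast up-projection blocks. A minimal NumPy sketch of the same interleaving (array names are illustrative, not part of the repository):

```python
import numpy as np

def combine(a, c, b, d):
    # Mirror the forward pass of +dagnn/Combine.m: interleave four
    # H x W x C maps into one 2H x 2W x C map. In MATLAB terms,
    # inputs{1..4} = A, C, B, D fill the odd/even row-column grids.
    h, w, ch = a.shape
    y = np.zeros((2 * h, 2 * w, ch), dtype=a.dtype)
    y[0::2, 0::2] = a  # odd rows, odd cols   (Y(1:2:end, 1:2:end))
    y[1::2, 0::2] = c  # even rows, odd cols  (Y(2:2:end, 1:2:end))
    y[0::2, 1::2] = b  # odd rows, even cols  (Y(1:2:end, 2:2:end))
    y[1::2, 1::2] = d  # even rows, even cols (Y(2:2:end, 2:2:end))
    return y
```

The backward pass is simply the inverse: the four strided slices of the output gradient are routed back to the corresponding inputs.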
115 | -------------------------------------------------------------------------------- /matlab/+dagnn/Combine.m: -------------------------------------------------------------------------------- 1 | classdef Combine < dagnn.ElementWise 2 | 3 | methods 4 | function outputs = forward(self, inputs, params) 5 | % double the size of the feature maps by combining the four responses 6 | Y = zeros(size(inputs{1},1)*2, size(inputs{1},2)*2, size(inputs{1},3), size(inputs{1},4), 'like', inputs{1}); 7 | Y(1:2:end, 1:2:end, :, :) = inputs{1}; %A 8 | Y(2:2:end, 1:2:end, :, :) = inputs{2}; %C 9 | Y(1:2:end, 2:2:end, :, :) = inputs{3}; %B 10 | Y(2:2:end, 2:2:end, :, :) = inputs{4}; %D 11 | outputs{1} = Y; 12 | end 13 | 14 | function [derInputs, derParams] = backward(self, inputs, params, derOutputs) 15 | % split the feature map into four feature maps of half size 16 | derInputs{1} = derOutputs{1}(1:2:end, 1:2:end, :, :); 17 | derInputs{2} = derOutputs{1}(2:2:end, 1:2:end, :, :); 18 | derInputs{3} = derOutputs{1}(1:2:end, 2:2:end, :, :); 19 | derInputs{4} = derOutputs{1}(2:2:end, 2:2:end, :, :); 20 | derParams = {} ; 21 | end 22 | 23 | function outputSizes = getOutputSizes(obj, inputSizes) 24 | outputSizes{1}(1) = 2*inputSizes{1}(1); 25 | outputSizes{1}(2) = 2*inputSizes{1}(2); 26 | outputSizes{1}(3) = inputSizes{1}(3); 27 | outputSizes{1}(4) = inputSizes{1}(4); 28 | end 29 | 30 | function obj = Combine(varargin) 31 | obj.load(varargin) ; 32 | end 33 | end 34 | end 35 | -------------------------------------------------------------------------------- /matlab/DepthMapPrediction.m: -------------------------------------------------------------------------------- 1 | function pred = DepthMapPrediction(imdb, net, varargin) 2 | 3 | % Depth prediction (inference) using a trained model. 4 | % Inputs (imdb) can come from either the NYU Depth v2 or the Make3D dataset, 5 | % along with the corresponding trained model (net). Additionally, the 6 | % evaluation can be run for any single image. The MatConvNet library has to 7 | % be set up already for this function to work properly. 8 | % ------------------------------------------------------------------------- 9 | % Inputs: 10 | % - imdb: a structure with fields 'images' and 'depths' in the case of 11 | % the benchmark datasets with known ground truth. imdb could 12 | % alternatively be any single RGB image of size NxMx3 in [0,255] 13 | % or a tensor of D input images NxMx3xD. 14 | % - net: a trained model of type struct (suitable to be converted to a 15 | % DagNN object and successively processed using the DagNN 16 | % wrapper). For testing on arbitrary images, use the NYU model for 17 | % indoor and the Make3D model for outdoor scenes, respectively.
18 | % ------------------------------------------------------------------------- 19 | 20 | opts.gpu = false; % Set to true (false) for GPU (CPU only) support 21 | opts.plot = false; % Set to true to visualize the predictions during inference 22 | opts = vl_argparse(opts, varargin); 23 | 24 | % Set network properties 25 | net = dagnn.DagNN.loadobj(net); 26 | net.mode = 'test'; 27 | out = net.getVarIndex('prediction'); 28 | if opts.gpu 29 | net.move('gpu'); 30 | end 31 | 32 | % Check input 33 | if isa(imdb, 'struct') 34 | % case of benchmark datasets (NYU, Make3D) 35 | images = imdb.images; 36 | groundTruth = imdb.depths; 37 | else 38 | % case of arbitrary image(s) 39 | images = imdb; 40 | images = imresize(images, net.meta.normalization.imageSize(1:2)); 41 | groundTruth = []; 42 | end 43 | 44 | % Get output size for initialization 45 | varSizes = net.getVarSizes({'data', net.meta.normalization.imageSize}); % get variable sizes 46 | pred = zeros(varSizes{out}(1), varSizes{out}(2), varSizes{out}(3), size(images, 4)); % initialize 47 | 48 | if opts.plot, figure(); end 49 | 50 | fprintf('predicting...\n'); 51 | for i = 1:size(images, 4) 52 | % get input image 53 | im = single(images(:,:,:,i)); 54 | if opts.gpu 55 | im = gpuArray(im); 56 | end 57 | 58 | % run the CNN 59 | inputs = {'data', im}; 60 | net.eval(inputs) ; 61 | 62 | % obtain prediction 63 | pred(:,:,i) = gather(net.vars(out).value); 64 | 65 | % visualize results 66 | if opts.plot 67 | colormap jet 68 | if ~isempty(groundTruth) 69 | subplot(1,3,1), imagesc(uint8(images(:,:,:,i))), title('RGB Input'), axis off 70 | subplot(1,3,2), imagesc(groundTruth(:,:,i)), title('Depth Ground Truth'), axis off 71 | subplot(1,3,3), imagesc(pred(:,:,i)), title('Depth Prediction'), axis off 72 | else 73 | subplot(1,2,1), imagesc(uint8(images(:,:,:,i))), title('RGB Input'), axis off 74 | subplot(1,2,2), imagesc(pred(:,:,i)), title('Depth Prediction'), axis off 75 | end 76 | drawnow; 77 | end 78 | end 79 | -------------------------------------------------------------------------------- /matlab/error_metrics.m: -------------------------------------------------------------------------------- 1 | function results = error_metrics(pred, gt, mask) 2 | 3 | % Compute error metrics on benchmark datasets 4 | % ------------------------------------------------------------------------- 5 | 6 | % make sure predictions and ground truth have the same dimensions 7 | if ~isequal(size(pred), size(gt)) 8 | pred = imresize(pred, [size(gt,1), size(gt,2)], 'bilinear'); 9 | end 10 | 11 | if isempty(mask) 12 | n_pxls = numel(gt); 13 | else 14 | n_pxls = sum(mask(:)); % average over valid pixels only 15 | end 16 | 17 | fprintf('\n Errors computed over the entire test set \n'); 18 | fprintf('------------------------------------------\n'); 19 | 20 | % Mean Absolute Relative Error 21 | rel = abs(gt(:) - pred(:)) ./ gt(:); % compute errors 22 | rel(~mask) = 0; % mask out invalid ground truth pixels 23 | rel = sum(rel) / n_pxls; % average over all pixels 24 | fprintf('Mean Absolute Relative Error: %4f\n', rel); 25 | 26 | % Root Mean Squared Error 27 | rms = (gt(:) - pred(:)).^2; 28 | rms(~mask) = 0; 29 | rms = sqrt(sum(rms) / n_pxls); 30 | fprintf('Root Mean Squared Error: %4f\n', rms); 31 | 32 | % LOG10 Error 33 | lg10 = abs(log10(gt(:)) - log10(pred(:))); 34 | lg10(~mask) = 0; 35 | lg10 = sum(lg10) / n_pxls ; 36 | fprintf('Mean Log10 Error: %4f\n', lg10); 37 | 38 | results.rel = rel; 39 | results.rms = rms; 40 | results.log10 = lg10; 41 |
--------------------------------------------------------------------------------
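For cross-checking outside MATLAB, the same three metrics can also be computed with NumPy. A minimal sketch (not part of the repository), assuming `pred` and `gt` are equal-shaped arrays and `mask` is a boolean array of valid pixels:

```python
import numpy as np

def error_metrics(pred, gt, mask=None):
    # Mirror matlab/error_metrics.m: rel / rms / log10 errors,
    # averaged over valid (masked) pixels only.
    if mask is None:
        mask = np.ones(gt.shape, dtype=bool)
    pred, gt = pred[mask], gt[mask]
    rel = np.mean(np.abs(gt - pred) / gt)                  # mean absolute relative error
    rms = np.sqrt(np.mean((gt - pred) ** 2))               # root mean squared error
    lg10 = np.mean(np.abs(np.log10(gt) - np.log10(pred)))  # mean log10 error
    return {'rel': rel, 'rms': rms, 'log10': lg10}
```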
/matlab/evaluateMake3D.m: -------------------------------------------------------------------------------- 1 | function evaluateMake3D 2 | 3 | % Evaluation of depth prediction on the Make3D dataset. 4 | 5 | % ------------------------------------------------------------------------- 6 | % Setup MatConvNet 7 | % ------------------------------------------------------------------------- 8 | 9 | % Set your matconvnet path here: 10 | matconvnet_path = '../../matconvnet-1.0-beta20'; 11 | setupMatConvNet(matconvnet_path); 12 | 13 | % ------------------------------------------------------------------------- 14 | % Options 15 | % ------------------------------------------------------------------------- 16 | 17 | opts.dataDir = fullfile(pwd, 'Make3D'); % working directory 18 | opts.interp = 'nearest'; % interpolation method applied during resizing 19 | opts.imageSize = [460,345]; % desired image size for evaluation 20 | 21 | netOpts.gpu = true; % set to true to enable GPU support 22 | netOpts.plot = true; % set to true to visualize the predictions during inference 23 | 24 | % ------------------------------------------------------------------------- 25 | % Prepare data 26 | % ------------------------------------------------------------------------- 27 | 28 | imdb = get_Make3D(opts); 29 | net = get_model(opts); 30 | 31 | % Test set 32 | testSet.images = imdb.images(:,:,:, imdb.set == 2); 33 | testSet.depths = imdb.depths(:,:, imdb.set == 2); 34 | 35 | % resize images to the input resolution (equal to round(opts.imageSize/2)) 36 | testSet.images = imresize(testSet.images, net.meta.normalization.imageSize(1:2), opts.interp); 37 | % resize depth to opts.imageSize resolution 38 | testSet.depths = imresize(testSet.depths, opts.imageSize, opts.interp); 39 | 40 | % ------------------------------------------------------------------------- 41 | % Evaluate network 42 | % ------------------------------------------------------------------------- 43 | 44 | % Get predictions 45 | predictions = DepthMapPrediction(testSet, net, netOpts); 46 | predictions = squeeze(predictions); % remove singleton dimensions 47 | predictions = imresize(predictions, [size(testSet.depths,1), size(testSet.depths,2)], 'bilinear'); % rescale 48 | 49 | % Error calculation 50 | c1_mask = testSet.depths > 0 & testSet.depths < 70; 51 | errors = error_metrics(predictions, testSet.depths, c1_mask); 52 | 53 | % Save results 54 | fprintf('\nsaving predictions...'); 55 | save(fullfile(opts.dataDir, 'results.mat'), 'predictions', 'errors', '-v7.3'); 56 | fprintf('done!\n'); 57 | 58 | 59 | function imdb = get_Make3D(opts) 60 | % ------------------------------------------------------------------------- 61 | % Download required data (test only) 62 | % ------------------------------------------------------------------------- 63 | 64 | opts.dataDirImages = fullfile(opts.dataDir, 'data', 'Test134'); 65 | opts.dataDirDepths = fullfile(opts.dataDir, 'data', 'Gridlaserdata'); 66 | 67 | % Download test set 68 | if ~exist(opts.dataDirImages, 'dir') 69 | fprintf('downloading Make3D testing images (~190 MB)...'); 70 | mkdir(opts.dataDirImages); 71 | untar('http://www.cs.cornell.edu/~asaxena/learningdepth/Test134.tar.gz', fileparts(opts.dataDirImages)); 72 | fprintf('done.\n'); 73 | end 74 | 75 | if ~exist(opts.dataDirDepths, 'dir') 76 | fprintf('downloading Make3D testing depth maps (~22 MB)...'); 77 | mkdir(opts.dataDirDepths); 78 |
untar('http://www.cs.cornell.edu/~asaxena/learningdepth/Test134Depth.tar.gz', fileparts(opts.dataDirDepths)); 79 | fprintf('done.\n'); 80 | end 81 | 82 | fprintf('preparing testing data...'); 83 | img_files = dir(fullfile(opts.dataDirImages, 'img-*.jpg')); 84 | depth_files = dir(fullfile(opts.dataDirDepths, 'depth_sph_corr-*.mat')); 85 | 86 | % Verify that the correct number of files has been found 87 | assert(numel(img_files)==134, 'Incorrect number of Make3D test images. \n'); 88 | assert(numel(depth_files)==134, 'Incorrect number of Make3D test depths. \n'); 89 | 90 | % Read dataset files and store necessary information to imdb structure 91 | for i = 1:numel(img_files) 92 | imdb.images(:,:,:,i) = single(imread(fullfile(opts.dataDirImages, img_files(i).name))); % get RGB image 93 | gt = load(fullfile(opts.dataDirDepths, depth_files(i).name)); 94 | imdb.depths(:,:,i) = single(gt.Position3DGrid(:,:,4)); % get depth channel 95 | imdb.set(i) = 2; 96 | end 97 | fprintf(' done!\n'); 98 | 99 | 100 | 101 | function net = get_model(opts) 102 | % ------------------------------------------------------------------------- 103 | % Download trained models 104 | % ------------------------------------------------------------------------- 105 | 106 | opts.dataDir = fullfile(opts.dataDir, 'models'); 107 | if ~exist(opts.dataDir, 'dir'), mkdir(opts.dataDir); end 108 | 109 | filename = fullfile(opts.dataDir, 'Make3D_ResNet-UpProj.mat'); 110 | if ~exist(filename, 'file') 111 | url = 'http://campar.in.tum.de/files/rupprecht/depthpred/Make3D_ResNet-UpProj.zip'; 112 | fprintf('downloading trained model: %s\n', url); 113 | unzip(url, opts.dataDir); 114 | end 115 | 116 | net = load(filename); 117 | 118 | -------------------------------------------------------------------------------- /matlab/evaluateNYU.m: -------------------------------------------------------------------------------- 1 | function evaluateNYU 2 | 3 | % Evaluation of depth prediction on the NYU Depth v2 dataset. 4 | 5 | % ------------------------------------------------------------------------- 6 | % Setup MatConvNet 7 | % ------------------------------------------------------------------------- 8 | 9 | % Set your matconvnet path here: 10 | matconvnet_path = '../../matconvnet-1.0-beta20'; 11 | setupMatConvNet(matconvnet_path); 12 | 13 | % ------------------------------------------------------------------------- 14 | % Options 15 | % ------------------------------------------------------------------------- 16 | 17 | opts.dataDir = fullfile(pwd, 'NYU'); % working directory 18 | opts.interp = 'nearest'; % interpolation method applied during resizing 19 | 20 | netOpts.gpu = true; % set to true to enable GPU support 21 | netOpts.plot = true; % set to true to visualize the predictions during inference 22 | 23 | % ------------------------------------------------------------------------- 24 | % Prepare data 25 | % ------------------------------------------------------------------------- 26 | 27 | imdb = get_NYUDepth_v2(opts); 28 | net = get_model(opts); 29 | 30 | % Test set 31 | testSet.images = imdb.images(:,:,:, imdb.set == 2); 32 | testSet.depths = imdb.depths(:,:, imdb.set == 2); 33 | 34 | % Prepare input for evaluation through the network, in accordance with the 35 | % way the model was trained for the NYU dataset. No processing is applied 36 | % to the ground truth.
37 | meta = net.meta.normalization; % information about input 38 | res = meta.imageSize(1:2) + 2*meta.border; 39 | testSet.images = imresize(testSet.images, res, opts.interp); % resize 40 | testSet.images = testSet.images(1+meta.border(1):end-meta.border(1), 1+meta.border(2):end-meta.border(2), :, :); % center crop 41 | 42 | % ------------------------------------------------------------------------- 43 | % Evaluate network 44 | % ------------------------------------------------------------------------- 45 | 46 | % Get predictions 47 | predictions = DepthMapPrediction(testSet, net, netOpts); 48 | predictions = squeeze(predictions); % remove singleton dimensions 49 | predictions = imresize(predictions, [size(testSet.depths,1), size(testSet.depths,2)], 'bilinear'); %rescale 50 | 51 | % Error calculation 52 | errors = error_metrics(predictions, testSet.depths, []); 53 | 54 | % Save results 55 | fprintf('\nsaving predictions...'); 56 | save(fullfile(opts.dataDir, 'results.mat'), 'predictions', 'errors', '-v7.3'); 57 | fprintf('done!\n'); 58 | 59 | 60 | 61 | function imdb = get_NYUDepth_v2(opts) 62 | % ------------------------------------------------------------------------- 63 | % Download required data 64 | % ------------------------------------------------------------------------- 65 | 66 | opts.dataDir = fullfile(opts.dataDir, 'data'); 67 | if ~exist(opts.dataDir, 'dir'), mkdir(opts.dataDir); end 68 | 69 | % Download dataset 70 | filename = fullfile(opts.dataDir, 'nyu_depth_v2_labeled.mat'); 71 | if ~exist(filename, 'file') 72 | url = 'http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_labeled.mat'; 73 | fprintf('downloading dataset (~2.8 GB): %s\n', url); 74 | websave(filename, url); 75 | end 76 | 77 | % Download official train/test split 78 | filename_splits = fullfile(opts.dataDir, 'splits.mat'); 79 | if ~exist(filename_splits, 'file') 80 | url_split = 'http://horatio.cs.nyu.edu/mit/silberman/indoor_seg_sup/splits.mat'; 81 | fprintf('downloading train/test split: %s\n', url_split); 82 | websave(filename_splits, url_split); 83 | end 84 | 85 | % Load dataset and splits 86 | fprintf('loading data to workspace...'); 87 | data = load(filename); 88 | splits = load(filename_splits); 89 | 90 | % Store necessary information to imdb structure 91 | imdb.images = single(data.images); %(no mean subtraction has been performed) 92 | imdb.depths = single(data.depths); %depth filled-in values 93 | imdb.set(splits.trainNdxs) = 1; %training indices (ignored for inference) 94 | imdb.set(splits.testNdxs) = 2; %testing indices (on which evaluation is performed) 95 | fprintf(' done!\n'); 96 | 97 | 98 | 99 | function net = get_model(opts) 100 | % ------------------------------------------------------------------------- 101 | % Download trained models 102 | % ------------------------------------------------------------------------- 103 | 104 | opts.dataDir = fullfile(opts.dataDir, 'models'); 105 | if ~exist(opts.dataDir, 'dir'), mkdir(opts.dataDir); end 106 | 107 | filename = fullfile(opts.dataDir, 'NYU_ResNet-UpProj.mat'); 108 | if ~exist(filename, 'file') 109 | url = 'http://campar.in.tum.de/files/rupprecht/depthpred/NYU_ResNet-UpProj.zip'; 110 | fprintf('downloading trained model: %s\n', url); 111 | unzip(url, opts.dataDir); 112 | end 113 | 114 | net = load(filename); 115 | 116 | -------------------------------------------------------------------------------- /matlab/setupMatConvNet.m: -------------------------------------------------------------------------------- 1 | function 
setupMatConvNet(matconvnet_path) 2 | 3 | % check path 4 | if ~exist(fullfile(matconvnet_path, 'matlab', 'vl_setupnn.m'), 'file') 5 | error('Could not find MatConvNet in "%s"!\nPlease point matconvnet_path to the correct directory.\n', matconvnet_path); 6 | end 7 | 8 | % check if it is the right version (beta-20) 9 | mcnv = getMatConvNetVersion(matconvnet_path); 10 | if ~strcmp(mcnv, '1.0-beta20') 11 | error('Your MatConvNet version (%s) is not the required version 1.0-beta20. Please download and compile the right version.', mcnv); 12 | end 13 | 14 | % if everything is fine, then set up 15 | run(fullfile(matconvnet_path, 'matlab', 'vl_setupnn.m')); 16 | 17 | 18 | 19 | function versionName = getMatConvNetVersion(matconvnet_path) 20 | 21 | fid = fopen(fullfile(matconvnet_path, 'Makefile'), 'rt'); 22 | s = textscan(fid, '%s', 'delimiter', '\n'); 23 | fclose(fid); 24 | idxs = find(~cellfun(@isempty,strfind(s{1}, 'VER = '))); 25 | mcnVersion = s{1}(idxs(1)); 26 | versionName = mcnVersion{1}(7:end); -------------------------------------------------------------------------------- /tensorflow/models/__init__.py: -------------------------------------------------------------------------------- 1 | from .fcrn import ResNet50UpProj 2 | -------------------------------------------------------------------------------- /tensorflow/models/fcrn.py: -------------------------------------------------------------------------------- 1 | from .network import Network 2 | 3 | class ResNet50UpProj(Network): 4 | def setup(self): 5 | (self.feed('data') 6 | .conv(7, 7, 64, 2, 2, relu=False, name='conv1') 7 | .batch_normalization(relu=True, name='bn_conv1') 8 | .max_pool(3, 3, 2, 2, name='pool1') 9 | .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2a_branch1') 10 | .batch_normalization(name='bn2a_branch1')) 11 | 12 | (self.feed('pool1') 13 | .conv(1, 1, 64, 1, 1, biased=False, relu=False, name='res2a_branch2a') 14 | .batch_normalization(relu=True, name='bn2a_branch2a') 15 | .conv(3, 3, 64, 1, 1, biased=False, relu=False, name='res2a_branch2b') 16 | .batch_normalization(relu=True, name='bn2a_branch2b') 17 | .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2a_branch2c') 18 | .batch_normalization(name='bn2a_branch2c')) 19 | 20 | (self.feed('bn2a_branch1', 21 | 'bn2a_branch2c') 22 | .add(name='res2a') 23 | .relu(name='res2a_relu') 24 | .conv(1, 1, 64, 1, 1, biased=False, relu=False, name='res2b_branch2a') 25 | .batch_normalization(relu=True, name='bn2b_branch2a') 26 | .conv(3, 3, 64, 1, 1, biased=False, relu=False, name='res2b_branch2b') 27 | .batch_normalization(relu=True, name='bn2b_branch2b') 28 | .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2b_branch2c') 29 | .batch_normalization(name='bn2b_branch2c')) 30 | 31 | (self.feed('res2a_relu', 32 | 'bn2b_branch2c') 33 | .add(name='res2b') 34 | .relu(name='res2b_relu') 35 | .conv(1, 1, 64, 1, 1, biased=False, relu=False, name='res2c_branch2a') 36 | .batch_normalization(relu=True, name='bn2c_branch2a') 37 | .conv(3, 3, 64, 1, 1, biased=False, relu=False, name='res2c_branch2b') 38 | .batch_normalization(relu=True, name='bn2c_branch2b') 39 | .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res2c_branch2c') 40 | .batch_normalization(name='bn2c_branch2c')) 41 | 42 | (self.feed('res2b_relu', 43 | 'bn2c_branch2c') 44 | .add(name='res2c') 45 | .relu(name='res2c_relu') 46 | .conv(1, 1, 512, 2, 2, biased=False, relu=False, name='res3a_branch1') 47 | .batch_normalization(name='bn3a_branch1')) 48 | 49 | (self.feed('res2c_relu') 50 | .conv(1, 1, 128, 2,
2, biased=False, relu=False, name='res3a_branch2a') 51 | .batch_normalization(relu=True, name='bn3a_branch2a') 52 | .conv(3, 3, 128, 1, 1, biased=False, relu=False, name='res3a_branch2b') 53 | .batch_normalization(relu=True, name='bn3a_branch2b') 54 | .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res3a_branch2c') 55 | .batch_normalization(name='bn3a_branch2c')) 56 | 57 | (self.feed('bn3a_branch1', 58 | 'bn3a_branch2c') 59 | .add(name='res3a') 60 | .relu(name='res3a_relu') 61 | .conv(1, 1, 128, 1, 1, biased=False, relu=False, name='res3b_branch2a') 62 | .batch_normalization(relu=True, name='bn3b_branch2a') 63 | .conv(3, 3, 128, 1, 1, biased=False, relu=False, name='res3b_branch2b') 64 | .batch_normalization(relu=True, name='bn3b_branch2b') 65 | .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res3b_branch2c') 66 | .batch_normalization(name='bn3b_branch2c')) 67 | 68 | (self.feed('res3a_relu', 69 | 'bn3b_branch2c') 70 | .add(name='res3b') 71 | .relu(name='res3b_relu') 72 | .conv(1, 1, 128, 1, 1, biased=False, relu=False, name='res3c_branch2a') 73 | .batch_normalization(relu=True, name='bn3c_branch2a') 74 | .conv(3, 3, 128, 1, 1, biased=False, relu=False, name='res3c_branch2b') 75 | .batch_normalization(relu=True, name='bn3c_branch2b') 76 | .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res3c_branch2c') 77 | .batch_normalization(name='bn3c_branch2c')) 78 | 79 | (self.feed('res3b_relu', 80 | 'bn3c_branch2c') 81 | .add(name='res3c') 82 | .relu(name='res3c_relu') 83 | .conv(1, 1, 128, 1, 1, biased=False, relu=False, name='res3d_branch2a') 84 | .batch_normalization(relu=True, name='bn3d_branch2a') 85 | .conv(3, 3, 128, 1, 1, biased=False, relu=False, name='res3d_branch2b') 86 | .batch_normalization(relu=True, name='bn3d_branch2b') 87 | .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res3d_branch2c') 88 | .batch_normalization(name='bn3d_branch2c')) 89 | 90 | (self.feed('res3c_relu', 91 | 'bn3d_branch2c') 92 | .add(name='res3d') 93 | .relu(name='res3d_relu') 94 | .conv(1, 1, 1024, 2, 2, biased=False, relu=False, name='res4a_branch1') 95 | .batch_normalization(name='bn4a_branch1')) 96 | 97 | (self.feed('res3d_relu') 98 | .conv(1, 1, 256, 2, 2, biased=False, relu=False, name='res4a_branch2a') 99 | .batch_normalization(relu=True, name='bn4a_branch2a') 100 | .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4a_branch2b') 101 | .batch_normalization(relu=True, name='bn4a_branch2b') 102 | .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4a_branch2c') 103 | .batch_normalization(name='bn4a_branch2c')) 104 | 105 | (self.feed('bn4a_branch1', 106 | 'bn4a_branch2c') 107 | .add(name='res4a') 108 | .relu(name='res4a_relu') 109 | .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4b_branch2a') 110 | .batch_normalization(relu=True, name='bn4b_branch2a') 111 | .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4b_branch2b') 112 | .batch_normalization(relu=True, name='bn4b_branch2b') 113 | .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4b_branch2c') 114 | .batch_normalization(name='bn4b_branch2c')) 115 | 116 | (self.feed('res4a_relu', 117 | 'bn4b_branch2c') 118 | .add(name='res4b') 119 | .relu(name='res4b_relu') 120 | .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4c_branch2a') 121 | .batch_normalization(relu=True, name='bn4c_branch2a') 122 | .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4c_branch2b') 123 | .batch_normalization(relu=True, name='bn4c_branch2b') 124 | .conv(1, 1, 1024, 1, 1, biased=False, relu=False, 
name='res4c_branch2c') 125 | .batch_normalization(name='bn4c_branch2c')) 126 | 127 | (self.feed('res4b_relu', 128 | 'bn4c_branch2c') 129 | .add(name='res4c') 130 | .relu(name='res4c_relu') 131 | .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4d_branch2a') 132 | .batch_normalization(relu=True, name='bn4d_branch2a') 133 | .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4d_branch2b') 134 | .batch_normalization(relu=True, name='bn4d_branch2b') 135 | .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4d_branch2c') 136 | .batch_normalization(name='bn4d_branch2c')) 137 | 138 | (self.feed('res4c_relu', 139 | 'bn4d_branch2c') 140 | .add(name='res4d') 141 | .relu(name='res4d_relu') 142 | .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4e_branch2a') 143 | .batch_normalization(relu=True, name='bn4e_branch2a') 144 | .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4e_branch2b') 145 | .batch_normalization(relu=True, name='bn4e_branch2b') 146 | .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4e_branch2c') 147 | .batch_normalization(name='bn4e_branch2c')) 148 | 149 | (self.feed('res4d_relu', 150 | 'bn4e_branch2c') 151 | .add(name='res4e') 152 | .relu(name='res4e_relu') 153 | .conv(1, 1, 256, 1, 1, biased=False, relu=False, name='res4f_branch2a') 154 | .batch_normalization(relu=True, name='bn4f_branch2a') 155 | .conv(3, 3, 256, 1, 1, biased=False, relu=False, name='res4f_branch2b') 156 | .batch_normalization(relu=True, name='bn4f_branch2b') 157 | .conv(1, 1, 1024, 1, 1, biased=False, relu=False, name='res4f_branch2c') 158 | .batch_normalization(name='bn4f_branch2c')) 159 | 160 | (self.feed('res4e_relu', 161 | 'bn4f_branch2c') 162 | .add(name='res4f') 163 | .relu(name='res4f_relu') 164 | .conv(1, 1, 2048, 2, 2, biased=False, relu=False, name='res5a_branch1') 165 | .batch_normalization(name='bn5a_branch1')) 166 | 167 | (self.feed('res4f_relu') 168 | .conv(1, 1, 512, 2, 2, biased=False, relu=False, name='res5a_branch2a') 169 | .batch_normalization(relu=True, name='bn5a_branch2a') 170 | .conv(3, 3, 512, 1, 1, biased=False, relu=False, name='res5a_branch2b') 171 | .batch_normalization(relu=True, name='bn5a_branch2b') 172 | .conv(1, 1, 2048, 1, 1, biased=False, relu=False, name='res5a_branch2c') 173 | .batch_normalization(name='bn5a_branch2c')) 174 | 175 | (self.feed('bn5a_branch1', 176 | 'bn5a_branch2c') 177 | .add(name='res5a') 178 | .relu(name='res5a_relu') 179 | .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res5b_branch2a') 180 | .batch_normalization(relu=True, name='bn5b_branch2a') 181 | .conv(3, 3, 512, 1, 1, biased=False, relu=False, name='res5b_branch2b') 182 | .batch_normalization(relu=True, name='bn5b_branch2b') 183 | .conv(1, 1, 2048, 1, 1, biased=False, relu=False, name='res5b_branch2c') 184 | .batch_normalization(name='bn5b_branch2c')) 185 | 186 | (self.feed('res5a_relu', 187 | 'bn5b_branch2c') 188 | .add(name='res5b') 189 | .relu(name='res5b_relu') 190 | .conv(1, 1, 512, 1, 1, biased=False, relu=False, name='res5c_branch2a') 191 | .batch_normalization(relu=True, name='bn5c_branch2a') 192 | .conv(3, 3, 512, 1, 1, biased=False, relu=False, name='res5c_branch2b') 193 | .batch_normalization(relu=True, name='bn5c_branch2b') 194 | .conv(1, 1, 2048, 1, 1, biased=False, relu=False, name='res5c_branch2c') 195 | .batch_normalization(name='bn5c_branch2c')) 196 | 197 | (self.feed('res5b_relu', 198 | 'bn5c_branch2c') 199 | .add(name='res5c') 200 | .relu(name='res5c_relu') 201 | .conv(1, 1, 1024, 1, 1, biased=True, relu=False, name='layer1') 202 
| .batch_normalization(relu=False, name='layer1_BN') 203 | .up_project([3, 3, 1024, 512], id = '2x', stride = 1, BN=True) 204 | .up_project([3, 3, 512, 256], id = '4x', stride = 1, BN=True) 205 | .up_project([3, 3, 256, 128], id = '8x', stride = 1, BN=True) 206 | .up_project([3, 3, 128, 64], id = '16x', stride = 1, BN=True) 207 | .dropout(name = 'drop', keep_prob = 1.) 208 | .conv(3, 3, 1, 1, 1, name = 'ConvPred')) 209 | -------------------------------------------------------------------------------- /tensorflow/models/network.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import tensorflow as tf 3 | 4 | # ---------------------------------------------------------------------------------- 5 | # Commonly used layers and operations based on ethereon's implementation 6 | # https://github.com/ethereon/caffe-tensorflow 7 | # Slight modifications may apply. FCRN-specific operations have also been appended. 8 | # ---------------------------------------------------------------------------------- 9 | # Thanks to *Helisa Dhamo* for the model conversion and integration into TensorFlow. 10 | # ---------------------------------------------------------------------------------- 11 | 12 | DEFAULT_PADDING = 'SAME' 13 | 14 | 15 | def get_incoming_shape(incoming): 16 | """ Returns the incoming data shape """ 17 | if isinstance(incoming, tf.Tensor): 18 | return incoming.get_shape().as_list() 19 | elif isinstance(incoming, (np.ndarray, list, tuple)): 20 | return np.shape(incoming) 21 | else: 22 | raise Exception("Invalid incoming layer.") 23 | 24 | 25 | def interleave(tensors, axis): 26 | old_shape = get_incoming_shape(tensors[0])[1:] 27 | new_shape = [-1] + old_shape 28 | new_shape[axis] *= len(tensors) 29 | return tf.reshape(tf.stack(tensors, axis + 1), new_shape) 30 | 31 | def layer(op): 32 | '''Decorator for composable network layers.''' 33 | 34 | def layer_decorated(self, *args, **kwargs): 35 | # Automatically set a name if not provided. 36 | name = kwargs.setdefault('name', self.get_unique_name(op.__name__)) 37 | 38 | # Figure out the layer inputs. 39 | if len(self.terminals) == 0: 40 | raise RuntimeError('No input variables found for layer %s.' % name) 41 | elif len(self.terminals) == 1: 42 | layer_input = self.terminals[0] 43 | else: 44 | layer_input = list(self.terminals) 45 | # Perform the operation and get the output. 46 | layer_output = op(self, layer_input, *args, **kwargs) 47 | # Add to layer LUT. 48 | self.layers[name] = layer_output 49 | # This output is now the input for the next layer. 50 | self.feed(layer_output) 51 | # Return self for chained calls. 52 | return self 53 | 54 | return layer_decorated 55 | 56 | 57 | class Network(object): 58 | 59 | def __init__(self, inputs, batch, keep_prob, is_training, trainable = True): 60 | # The input nodes for this network 61 | self.inputs = inputs 62 | # The current list of terminal nodes 63 | self.terminals = [] 64 | # Mapping from layer names to layers 65 | self.layers = dict(inputs) 66 | # If true, the resulting variables are set as trainable 67 | self.trainable = trainable 68 | self.batch_size = batch 69 | self.keep_prob = keep_prob 70 | self.is_training = is_training 71 | self.setup() 72 | 73 | 74 | def setup(self): 75 | '''Construct the network. ''' 76 | raise NotImplementedError('Must be implemented by the subclass.') 77 | 78 | def load(self, data_path, session, ignore_missing=False): 79 | '''Load network weights.
80 | data_path: The path to the numpy-serialized network weights 81 | session: The current TensorFlow session 82 | ignore_missing: If true, serialized weights for missing layers are ignored. 83 | ''' 84 | data_dict = np.load(data_path, encoding='latin1').item() 85 | for op_name in data_dict: 86 | with tf.variable_scope(op_name, reuse=True): 87 | for param_name, data in iter(data_dict[op_name].items()): 88 | try: 89 | var = tf.get_variable(param_name) 90 | session.run(var.assign(data)) 91 | 92 | except ValueError: 93 | if not ignore_missing: 94 | raise 95 | 96 | def feed(self, *args): 97 | '''Set the input(s) for the next operation by replacing the terminal nodes. 98 | The arguments can be either layer names or the actual layers. 99 | ''' 100 | assert len(args) != 0 101 | self.terminals = [] 102 | for fed_layer in args: 103 | if isinstance(fed_layer, str): 104 | try: 105 | fed_layer = self.layers[fed_layer] 106 | except KeyError: 107 | raise KeyError('Unknown layer name fed: %s' % fed_layer) 108 | self.terminals.append(fed_layer) 109 | return self 110 | 111 | def get_output(self): 112 | '''Returns the current network output.''' 113 | return self.terminals[-1] 114 | 115 | def get_layer_output(self, name): 116 | return self.layers[name] 117 | 118 | def get_unique_name(self, prefix): 119 | '''Returns an index-suffixed unique name for the given prefix. 120 | This is used for auto-generating layer names based on the type-prefix. 121 | ''' 122 | ident = sum(t.startswith(prefix) for t, _ in self.layers.items()) + 1 123 | return '%s_%d' % (prefix, ident) 124 | 125 | def make_var(self, name, shape): 126 | '''Creates a new TensorFlow variable.''' 127 | return tf.get_variable(name, shape, dtype = 'float32', trainable=self.trainable) 128 | 129 | def validate_padding(self, padding): 130 | '''Verifies that the padding is one of the supported ones.''' 131 | assert padding in ('SAME', 'VALID') 132 | 133 | @layer 134 | def conv(self, 135 | input_data, 136 | k_h, 137 | k_w, 138 | c_o, 139 | s_h, 140 | s_w, 141 | name, 142 | relu=True, 143 | padding=DEFAULT_PADDING, 144 | group=1, 145 | biased=True): 146 | 147 | # Verify that the padding is acceptable 148 | self.validate_padding(padding) 149 | # Get the number of channels in the input 150 | c_i = input_data.get_shape()[-1] 151 | 152 | if (padding == 'SAME'): 153 | input_data = tf.pad(input_data, [[0, 0], [(k_h - 1)//2, (k_h - 1)//2], [(k_w - 1)//2, (k_w - 1)//2], [0, 0]], "CONSTANT") 154 | 155 | # Verify that the grouping parameter is valid 156 | assert c_i % group == 0 157 | assert c_o % group == 0 158 | # Convolution for a given input and kernel 159 | convolve = lambda i, k: tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding='VALID') 160 | 161 | with tf.variable_scope(name) as scope: 162 | kernel = self.make_var('weights', shape=[k_h, k_w, c_i // group, c_o]) 163 | 164 | if group == 1: 165 | # This is the common-case. Convolve the input without any further complications. 
166 | output = convolve(input_data, kernel) 167 | else: 168 | # Split the input into groups and then convolve each of them independently 169 | 170 | input_groups = tf.split(3, group, input_data) 171 | kernel_groups = tf.split(3, group, kernel) 172 | output_groups = [convolve(i, k) for i, k in zip(input_groups, kernel_groups)] 173 | # Concatenate the groups 174 | output = tf.concat(3, output_groups) 175 | 176 | # Add the biases 177 | if biased: 178 | biases = self.make_var('biases', [c_o]) 179 | output = tf.nn.bias_add(output, biases) 180 | if relu: 181 | # ReLU non-linearity 182 | output = tf.nn.relu(output, name=scope.name) 183 | 184 | return output 185 | 186 | @layer 187 | def relu(self, input_data, name): 188 | return tf.nn.relu(input_data, name=name) 189 | 190 | @layer 191 | def max_pool(self, input_data, k_h, k_w, s_h, s_w, name, padding=DEFAULT_PADDING): 192 | self.validate_padding(padding) 193 | return tf.nn.max_pool(input_data, 194 | ksize=[1, k_h, k_w, 1], 195 | strides=[1, s_h, s_w, 1], 196 | padding=padding, 197 | name=name) 198 | 199 | @layer 200 | def avg_pool(self, input_data, k_h, k_w, s_h, s_w, name, padding=DEFAULT_PADDING): 201 | self.validate_padding(padding) 202 | return tf.nn.avg_pool(input_data, 203 | ksize=[1, k_h, k_w, 1], 204 | strides=[1, s_h, s_w, 1], 205 | padding=padding, 206 | name=name) 207 | 208 | @layer 209 | def lrn(self, input_data, radius, alpha, beta, name, bias=1.0): 210 | return tf.nn.local_response_normalization(input_data, 211 | depth_radius=radius, 212 | alpha=alpha, 213 | beta=beta, 214 | bias=bias, 215 | name=name) 216 | 217 | @layer 218 | def concat(self, inputs, axis, name): 219 | return tf.concat(concat_dim=axis, values=inputs, name=name) 220 | 221 | @layer 222 | def add(self, inputs, name): 223 | return tf.add_n(inputs, name=name) 224 | 225 | @layer 226 | def fc(self, input_data, num_out, name, relu=True): 227 | with tf.variable_scope(name) as scope: 228 | input_shape = input_data.get_shape() 229 | if input_shape.ndims == 4: 230 | # The input is spatial. Vectorize it first. 231 | dim = 1 232 | for d in input_shape[1:].as_list(): 233 | dim *= d 234 | feed_in = tf.reshape(input_data, [-1, dim]) 235 | else: 236 | feed_in, dim = (input_data, input_shape[-1].value) 237 | weights = self.make_var('weights', shape=[dim, num_out]) 238 | biases = self.make_var('biases', [num_out]) 239 | op = tf.nn.relu_layer if relu else tf.nn.xw_plus_b 240 | fc = op(feed_in, weights, biases, name=scope.name) 241 | return fc 242 | 243 | @layer 244 | def softmax(self, input_data, name): 245 | input_shape = map(lambda v: v.value, input_data.get_shape()) 246 | if len(input_shape) > 2: 247 | # For certain models (like NiN), the singleton spatial dimensions 248 | # need to be explicitly squeezed, since they're not broadcast-able 249 | # in TensorFlow's NHWC ordering (unlike Caffe's NCHW). 
250 | if input_shape[1] == 1 and input_shape[2] == 1: 251 | input_data = tf.squeeze(input_data, squeeze_dims=[1, 2]) 252 | else: 253 | raise ValueError('Rank 2 tensor input expected for softmax!') 254 | return tf.nn.softmax(input_data, name) 255 | 256 | @layer 257 | def batch_normalization(self, input_data, name, scale_offset=True, relu=False): 258 | 259 | with tf.variable_scope(name) as scope: 260 | shape = [input_data.get_shape()[-1]] 261 | pop_mean = tf.get_variable("mean", shape, initializer = tf.constant_initializer(0.0), trainable=False) 262 | pop_var = tf.get_variable("variance", shape, initializer = tf.constant_initializer(1.0), trainable=False) 263 | epsilon = 1e-4 264 | decay = 0.999 265 | if scale_offset: 266 | scale = tf.get_variable("scale", shape, initializer = tf.constant_initializer(1.0)) 267 | offset = tf.get_variable("offset", shape, initializer = tf.constant_initializer(0.0)) 268 | else: 269 | scale, offset = (None, None) 270 | if self.is_training: 271 | batch_mean, batch_var = tf.nn.moments(input_data, [0, 1, 2]) 272 | 273 | train_mean = tf.assign(pop_mean, 274 | pop_mean * decay + batch_mean * (1 - decay)) 275 | train_var = tf.assign(pop_var, 276 | pop_var * decay + batch_var * (1 - decay)) 277 | with tf.control_dependencies([train_mean, train_var]): 278 | output = tf.nn.batch_normalization(input_data, 279 | batch_mean, batch_var, offset, scale, epsilon, name = name) 280 | else: 281 | output = tf.nn.batch_normalization(input_data, 282 | pop_mean, pop_var, offset, scale, epsilon, name = name) 283 | 284 | if relu: 285 | output = tf.nn.relu(output) 286 | 287 | return output 288 | 289 | @layer 290 | def dropout(self, input_data, keep_prob, name): 291 | return tf.nn.dropout(input_data, keep_prob, name=name) 292 | 293 | 294 | def unpool_as_conv(self, size, input_data, id, stride = 1, ReLU = False, BN = True): 295 | 296 | # Model upconvolutions (unpooling + convolution) as interleaving feature 297 | # maps of four convolutions (A,B,C,D). Building block for up-projections. 
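        # Note: the four branches below implement the "fast up-convolution"
        # from the paper: 2x unpooling followed by a 5x5 convolution is
        # equivalent to four smaller convolutions (3x3, 2x3, 3x2, 2x2)
        # applied directly to the low-resolution input, because the unpooled
        # map is zero at three out of every four locations. The asymmetric
        # tf.pad calls align branches B, C and D with their output positions;
        # interleave() then reassembles the full-resolution map.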
298 | 299 | 300 | # Convolution A (3x3) 301 | # -------------------------------------------------- 302 | layerName = "layer%s_ConvA" % (id) 303 | self.feed(input_data) 304 | self.conv( 3, 3, size[3], stride, stride, name = layerName, padding = 'SAME', relu = False) 305 | outputA = self.get_output() 306 | 307 | # Convolution B (2x3) 308 | # -------------------------------------------------- 309 | layerName = "layer%s_ConvB" % (id) 310 | padded_input_B = tf.pad(input_data, [[0, 0], [1, 0], [1, 1], [0, 0]], "CONSTANT") 311 | self.feed(padded_input_B) 312 | self.conv(2, 3, size[3], stride, stride, name = layerName, padding = 'VALID', relu = False) 313 | outputB = self.get_output() 314 | 315 | # Convolution C (3x2) 316 | # -------------------------------------------------- 317 | layerName = "layer%s_ConvC" % (id) 318 | padded_input_C = tf.pad(input_data, [[0, 0], [1, 1], [1, 0], [0, 0]], "CONSTANT") 319 | self.feed(padded_input_C) 320 | self.conv(3, 2, size[3], stride, stride, name = layerName, padding = 'VALID', relu = False) 321 | outputC = self.get_output() 322 | 323 | # Convolution D (2x2) 324 | # -------------------------------------------------- 325 | layerName = "layer%s_ConvD" % (id) 326 | padded_input_D = tf.pad(input_data, [[0, 0], [1, 0], [1, 0], [0, 0]], "CONSTANT") 327 | self.feed(padded_input_D) 328 | self.conv(2, 2, size[3], stride, stride, name = layerName, padding = 'VALID', relu = False) 329 | outputD = self.get_output() 330 | 331 | # Interleaving elements of the four feature maps 332 | # -------------------------------------------------- 333 | left = interleave([outputA, outputB], axis=1) # columns 334 | right = interleave([outputC, outputD], axis=1) # columns 335 | Y = interleave([left, right], axis=2) # rows 336 | 337 | if BN: 338 | layerName = "layer%s_BN" % (id) 339 | self.feed(Y) 340 | self.batch_normalization(name = layerName, scale_offset = True, relu = False) 341 | Y = self.get_output() 342 | 343 | if ReLU: 344 | Y = tf.nn.relu(Y, name = layerName) 345 | 346 | return Y 347 | 348 | 349 | def up_project(self, size, id, stride = 1, BN = True): 350 | 351 | # Create residual upsampling layer (UpProjection) 352 | 353 | input_data = self.get_output() 354 | 355 | # Branch 1 356 | id_br1 = "%s_br1" % (id) 357 | 358 | # Interleaving Convs of 1st branch 359 | out = self.unpool_as_conv(size, input_data, id_br1, stride, ReLU=True, BN=True) 360 | 361 | # Convolution following the upProjection on the 1st branch 362 | layerName = "layer%s_Conv" % (id) 363 | self.feed(out) 364 | self.conv(size[0], size[1], size[3], stride, stride, name = layerName, relu = False) 365 | 366 | if BN: 367 | layerName = "layer%s_BN" % (id) 368 | self.batch_normalization(name = layerName, scale_offset=True, relu = False) 369 | 370 | # Output of 1st branch 371 | branch1_output = self.get_output() 372 | 373 | 374 | # Branch 2 375 | id_br2 = "%s_br2" % (id) 376 | # Interleaving convolutions and output of 2nd branch 377 | branch2_output = self.unpool_as_conv(size, input_data, id_br2, stride, ReLU=False) 378 | 379 | 380 | # sum branches 381 | layerName = "layer%s_Sum" % (id) 382 | output = tf.add_n([branch1_output, branch2_output], name = layerName) 383 | # ReLU 384 | layerName = "layer%s_ReLU" % (id) 385 | output = tf.nn.relu(output, name=layerName) 386 | 387 | self.feed(output) 388 | return self 389 | -------------------------------------------------------------------------------- /tensorflow/predict.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | 
import os 3 | import numpy as np 4 | import tensorflow as tf 5 | from matplotlib import pyplot as plt 6 | from PIL import Image 7 | 8 | import models 9 | 10 | def predict(model_data_path, image_path): 11 | 12 | 13 | # Default input size 14 | height = 228 15 | width = 304 16 | channels = 3 17 | batch_size = 1 18 | 19 | # Read image 20 | img = Image.open(image_path) 21 | img = img.resize([width,height], Image.ANTIALIAS) 22 | img = np.array(img).astype('float32') 23 | img = np.expand_dims(np.asarray(img), axis = 0) 24 | 25 | # Create a placeholder for the input image 26 | input_node = tf.placeholder(tf.float32, shape=(None, height, width, channels)) 27 | 28 | # Construct the network 29 | net = models.ResNet50UpProj({'data': input_node}, batch_size, 1, False) 30 | 31 | with tf.Session() as sess: 32 | 33 | # Load the converted parameters 34 | print('Loading the model') 35 | 36 | # Use to load from ckpt file 37 | saver = tf.train.Saver() 38 | saver.restore(sess, model_data_path) 39 | 40 | # Use to load from npy file 41 | #net.load(model_data_path, sess) 42 | 43 | # Evaluate the network for the given image 44 | pred = sess.run(net.get_output(), feed_dict={input_node: img}) 45 | 46 | # Plot result 47 | fig = plt.figure() 48 | ii = plt.imshow(pred[0,:,:,0], interpolation='nearest') 49 | fig.colorbar(ii) 50 | plt.show() 51 | 52 | return pred 53 | 54 | 55 | def main(): 56 | # Parse arguments 57 | parser = argparse.ArgumentParser() 58 | parser.add_argument('model_path', help='Converted parameters for the model') 59 | parser.add_argument('image_path', help='Path to the input image') 60 | args = parser.parse_args() 61 | 62 | # Predict the image 63 | pred = predict(args.model_path, args.image_path) 64 | 65 | os._exit(0) 66 | 67 | if __name__ == '__main__': 68 | main() 69 | 70 | 71 | 72 | 73 | 74 | --------------------------------------------------------------------------------
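The script above can also be adapted for programmatic, multi-image use. A minimal sketch (not part of the repository; the file names are placeholders and the `.npy` branch assumes the converted NYU weights from the Models section of the README):

```python
import numpy as np
import tensorflow as tf
from PIL import Image

import models

# Build the graph once and reuse it for several images.
input_node = tf.placeholder(tf.float32, shape=(None, 228, 304, 3))
net = models.ResNet50UpProj({'data': input_node}, batch=1, keep_prob=1, is_training=False)

with tf.Session() as sess:
    # Load the converted .npy weights instead of a .ckpt checkpoint.
    net.load('NYU_ResNet-UpProj.npy', sess)     # placeholder path
    for path in ['image1.jpg', 'image2.jpg']:   # placeholder image files
        img = np.asarray(Image.open(path).resize((304, 228)), dtype=np.float32)
        depth = sess.run(net.get_output(), feed_dict={input_node: img[None]})
        np.save(path + '_depth.npy', depth[0, :, :, 0])  # save the predicted depth map
```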