├── LICENSE
├── README.md
├── _config.yml
├── caffe
│   ├── include
│   │   └── caffe
│   │       ├── filler.hpp
│   │       └── layers
│   │           ├── abs_loss_layer.hpp
│   │           ├── geometry_transformation.hpp
│   │           ├── inverse_warping_layer.hpp
│   │           └── pin_hole_layer.hpp
│   ├── python
│   │   └── pygeometry.py
│   └── src
│       └── caffe
│           └── layers
│               ├── abs_loss_layer.cpp
│               ├── abs_loss_layer.cu
│               ├── geometry_transformation.cpp
│               ├── geometry_transformation.cu
│               ├── inverse_warping_layer.cpp
│               ├── inverse_warping_layer.cu
│               ├── pin_hole_layer.cpp
│               └── pin_hole_layer.cu
├── data
│   ├── README.md
│   ├── dataset_builder.py
│   ├── depth_evaluation
│   │   └── kitti_eigen
│   │       └── test_files_eigen.txt
│   └── odometry_evaluation
│       └── poses
│           ├── 00.txt
│           ├── 01.txt
│           ├── 02.txt
│           ├── 03.txt
│           ├── 04.txt
│           ├── 05.txt
│           ├── 06.txt
│           ├── 07.txt
│           ├── 08.txt
│           ├── 09.txt
│           └── 10.txt
├── experiments
│   ├── depth
│   │   ├── solver.prototxt
│   │   ├── train.prototxt
│   │   └── train.sh
│   ├── depth_feature
│   │   ├── solver.prototxt
│   │   ├── train.prototxt
│   │   └── train.sh
│   ├── depth_odometry
│   │   ├── solver.prototxt
│   │   ├── train.prototxt
│   │   └── train.sh
│   ├── depth_odometry_feature
│   │   ├── solver.prototxt
│   │   ├── train.prototxt
│   │   └── train.sh
│   └── networks
│       ├── depth_deploy.prototxt
│       └── odometry_deploy.prototxt
└── tools
    ├── eval_depth.py
    ├── eval_depth_utils.py
    ├── evaluation_tools.py
    └── sfmlearner_odometry_tool
        ├── get_sfmlearner_result.py
        └── pose_evaluation_utils.py

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction (Depth-VO-Feat) for non-commercial purposes

Copyright (c) 2018, Huangying Zhan
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
  list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Introduction

This repo implements the system described in the CVPR 2018 paper:

[**Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction**](https://arxiv.org/abs/1803.03893)

Huangying Zhan, Ravi Garg, Chamara Saroj Weerasekera, Kejie Li, Harsh Agarwal, Ian Reid

```
@InProceedings{Zhan_2018_CVPR,
author = {Zhan, Huangying and Garg, Ravi and Saroj Weerasekera, Chamara and Li, Kejie and Agarwal, Harsh and Reid, Ian},
title = {Unsupervised Learning of Monocular Depth Estimation and Visual Odometry With Deep Feature Reconstruction},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}
```

This repo includes (1) the training procedure of our models; (2) evaluation scripts for the results; (3) trained models and results.


### Contents
1. [Requirements](#part-1-requirements)
2. [Download dataset and models](#part-2-download-dataset-and-models)
3. [Depth](#part-3-depth)
4. [Depth and odometry](#part-4-depth-and-odometry)
5. [Feature Reconstruction Loss for Depth](#part-5-feature-reconstruction-loss-for-depth)
6. [Depth, odometry and feature](#part-6-depth-odometry-and-feature)
7. [Result evaluation](#part-7-result-evaluation)


### Part 1. Requirements

This code was tested with Python 2.7, CUDA 8.0 and Ubuntu 14.04 using [Caffe](http://caffe.berkeleyvision.org/).

Caffe: add the required layers in `./caffe` to your own Caffe build. Remember to enable Python layers in the Caffe configuration.

Most of the required models, trained models and results can be downloaded from [here](https://www.dropbox.com/sh/qxfqflrrzzwupua/AAAPA1mF0QaKwwR2Ds0jtDhYa?dl=0). The instructions below also include specific links to the individual items.

### Part 2. Download dataset and models

The main dataset used in this project is the [KITTI Driving Dataset](http://www.cvlibs.net/datasets/kitti/raw_data.php). Please follow the instructions in `./data/README.md` to prepare the required dataset.

For our trained models and prerequisite models, please visit [here](https://www.dropbox.com/sh/60onn52jm9g2ygu/AADUkDRkwycS1STazstG5XOpa?dl=0) to download the models and put them into the directory `./models`.

### Part 3. Depth

This part introduces the training of the single-view depth estimation network from stereo pairs. The photometric loss is used as the main supervision signal, and only stereo pairs are used in this experiment.

1. Update `$YOUR_CAFFE_DIR` in `./experiments/depth/train.sh`.
2. Run `bash ./experiments/depth/train.sh`.

The trained models are saved in `./snapshots/depth`.

### Part 4. Depth and odometry

This part introduces the joint training of the depth estimation network and the visual odometry network.
Photometric losses for spatial pairs and temporal pairs are used as the main supervision signal, so both spatial (stereo) pairs and temporal pairs (i.e. stereo sequences) are used in this experiment; a minimal sketch of this warping-based loss follows the steps below.

To facilitate the training, the model trained in the Depth experiment is used as an initialization.
1. Update `$YOUR_CAFFE_DIR` in `./experiments/depth_odometry/train.sh`.
2. Run `bash ./experiments/depth_odometry/train.sh`.

The trained models are saved in `./snapshots/depth_odometry`.
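To make the supervision signal concrete, here is a minimal NumPy sketch of the inverse-warping photometric loss for a rectified stereo pair: each left-view pixel samples the right view at `x - d(x)` and the mean absolute difference is the loss. This is illustrative only (not code from this repo); the bilinear sampling details and the constant-disparity toy check are assumptions.

```
import numpy as np

def photometric_l1(left, right, disparity):
    # Inverse-warp `right` into the left view: left pixel (x, y) samples
    # right pixel (x - d, y) with bilinear interpolation, then take the
    # mean L1 photometric error.
    h, w = left.shape
    xs = np.arange(w)[None, :] - disparity          # sampling x-coords, (h, w)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    frac = np.clip(xs - x0, 0.0, 1.0)
    rows = np.arange(h)[:, None]
    warped = (1 - frac) * right[rows, x0] + frac * right[rows, x0 + 1]
    return np.abs(left - warped).mean()

# toy check: a horizontally shifted image is explained by a constant disparity
img = np.random.rand(8, 32)
right = np.roll(img, -2, axis=1)                    # right view shifted by 2 px
print(photometric_l1(img, right, np.full((8, 32), 2.0)))
# small value; only the clipped border columns contribute
```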
### Part 5. Feature Reconstruction Loss for Depth

This part introduces the training of the single-view depth estimation network from stereo pairs, using both the photometric loss and the feature reconstruction loss as the main supervision signals. Only stereo pairs are used in this experiment. We have tried several features for this experiment; currently only the example using **KITTI Feat.** is shown here. More details on using other features will be added later.

To facilitate the training, the model trained in the Depth experiment is used as an initialization.
1. Update `$YOUR_CAFFE_DIR` in `./experiments/depth_feature/train.sh`.
2. Run `bash ./experiments/depth_feature/train.sh`.

The trained models are saved in `./snapshots/depth_feature`.

### Part 6. Depth, odometry and feature

This part shows the training with the feature reconstruction loss included.
Stereo sequences are used in this experiment.

With the feature extractor proposed in [Weerasekera et al.](https://arxiv.org/abs/1711.05919), we can finetune the trained depth model and/or odometry model with our proposed deep feature reconstruction loss.

1. Update `$YOUR_CAFFE_DIR` in `./experiments/depth_odometry_feature/train.sh`.
2. Run `bash ./experiments/depth_odometry_feature/train.sh`.

**NOTE:** The link to download the feature extractor proposed in [Weerasekera et al.](https://arxiv.org/abs/1711.05919) will be released soon.

### Part 7. Result evaluation

Note that the evaluation script provided here uses a different image interpolation for resizing input images (i.e. Python's interpolation vs. Caffe's interpolation), so the quantitative results can differ slightly from the published results.

#### Depth estimation

Using the test set (697 image-depth pairs from 28 scenes) of the Eigen split is a common protocol for evaluating depth estimation results.

We basically use the evaluation script provided by [monodepth](https://github.com/mrharicot/monodepth) to evaluate depth estimation results.

In order to run the evaluation, an `.npy` file is required to store the predicted depths. Then run the script to evaluate the performance.

1. Update `caffe_root` in `./tools/evaluation_tools.py`.
2. Generate the depth predictions and save them in an `.npy` file:
```
python ./tools/evaluation_tools.py --func generate_depth_npy --dataset kitti_eigen --depth_net_def ./experiments/networks/depth_deploy.prototxt --model models/trained_models/eigen_split/Baseline.caffemodel --npy_dir ./result/depth/inv_depths_baseline.npy
```

3. Evaluate the predictions:
```
python ./tools/eval_depth.py --split eigen --predicted_inv_depth_path ./result/depth/inv_depths_baseline.npy --gt_path data/kitti_raw_data/ --min_depth 1 --max_depth 50 --garg_crop
```

Some of our results (inverse depths) are released and can be downloaded from [here](https://www.dropbox.com/sh/1f6nkd4ezx0qfw4/AADmGuFLIxImtikz2UJrHeTOa?dl=0).
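For reference, the numbers reported by the evaluation are the standard Eigen-split measures. Below is a minimal NumPy sketch of how they are computed; it is illustrative, not the repo's `eval_depth.py`, and the synthetic arrays and the `1e-6` floor are assumptions. Since the released results store *inverse* depths, predictions are inverted before scoring.

```
import numpy as np

def depth_metrics(gt, pred, min_depth=1.0, max_depth=50.0):
    # evaluate only pixels with valid ground truth inside the depth range
    mask = (gt > min_depth) & (gt < max_depth)
    gt, pred = gt[mask], np.clip(pred[mask], min_depth, max_depth)
    thresh = np.maximum(gt / pred, pred / gt)
    return {
        "abs_rel": np.mean(np.abs(gt - pred) / gt),   # absolute relative error
        "sq_rel":  np.mean((gt - pred) ** 2 / gt),    # squared relative error
        "rmse":    np.sqrt(np.mean((gt - pred) ** 2)),
        "a1":      np.mean(thresh < 1.25),            # accuracy under threshold
    }

# synthetic stand-in for np.load('./result/depth/inv_depths_baseline.npy')
inv_depth = np.random.uniform(0.02, 1.0, size=(697, 128, 416))
pred_depth = 1.0 / np.maximum(inv_depth, 1e-6)        # invert before scoring
gt_depth = pred_depth * np.random.uniform(0.9, 1.1, size=pred_depth.shape)
print(depth_metrics(gt_depth, pred_depth))
```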
#### Visual Odometry

The [KITTI Odometry benchmark](http://www.cvlibs.net/datasets/kitti/eval_odometry.php) contains 22 stereo sequences, 11 of which are provided with ground truth. These 11 sequences are used for evaluating or training visual odometry.

1. Update `caffe_root` in `./tools/evaluation_tools.py`.
2. Generate the odometry predictions (relative camera motions) by running the following script:

```
python ./tools/evaluation_tools.py --func generate_odom_result --model models/trained_models/odometry_split/Temporal.caffemodel --odom_net_def ./experiments/networks/odometry_deploy.prototxt --odom_result_dir ./result/odom_result
```

3. After getting the odometry predictions, we can evaluate the performance by comparing them with the ground truth poses:

```
python ./tools/evaluation_tools.py --func eval_odom --odom_result_dir ./result/odom_result
```


Our odometry results are released and can be downloaded from [here](https://www.dropbox.com/sh/qsb54kdpsp4i3wd/AAAht6__ssw3LlN168DsEqxca?dl=0).
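Since the network predicts relative camera motions, a full trajectory is recovered by chaining the 4x4 relative transforms. A minimal NumPy sketch of that accumulation (illustrative only; the constant-motion toy example is an assumption, not the repo's tooling):

```
import numpy as np

def accumulate_poses(rel_poses):
    # Chain per-frame relative 4x4 transforms T_{t -> t+1} into absolute
    # poses, starting from the identity at the first frame.
    poses = [np.eye(4)]
    for rel in rel_poses:
        poses.append(np.dot(poses[-1], rel))  # compose with the previous pose
    return poses

# toy example: constant forward motion of 0.5 m per frame
step = np.eye(4)
step[2, 3] = 0.5
traj = accumulate_poses([step] * 10)
print(traj[-1][2, 3])  # total forward translation: 5.0
```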
### License
For academic usage, the code is released under the permissive BSD license. For any commercial purpose, please contact the authors.

--------------------------------------------------------------------------------
/_config.yml:
--------------------------------------------------------------------------------
theme: jekyll-theme-slate

--------------------------------------------------------------------------------
/caffe/include/caffe/filler.hpp:
--------------------------------------------------------------------------------
// Fillers are random number generators that fill a blob using the specified
// algorithm. The expectation is that they are only going to be used during
// initialization time and will not involve any GPUs.

#ifndef CAFFE_FILLER_HPP
#define CAFFE_FILLER_HPP

#include <string>

#include "caffe/blob.hpp"
#include "caffe/proto/caffe.pb.h"
#include "caffe/syncedmem.hpp"
#include "caffe/util/math_functions.hpp"
#include <cmath>
#include <iostream>

namespace caffe {

/// @brief Fills a Blob with constant or randomly-generated data.
template <typename Dtype>
class Filler {
 public:
  explicit Filler(const FillerParameter& param) : filler_param_(param) {}
  virtual ~Filler() {}
  virtual void Fill(Blob<Dtype>* blob) = 0;
 protected:
  FillerParameter filler_param_;
};  // class Filler


/// @brief Fills a Blob with the constant value given by the filler parameter.
template <typename Dtype>
class ConstantFiller : public Filler<Dtype> {
 public:
  explicit ConstantFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    Dtype* data = blob->mutable_cpu_data();
    const int count = blob->count();
    const Dtype value = this->filler_param_.value();
    CHECK(count);
    for (int i = 0; i < count; ++i) {
      data[i] = value;
    }
    CHECK_EQ(this->filler_param_.sparse(), -1)
        << "Sparsity not supported by this Filler.";
  }
};

/// @brief Fills a Blob with uniformly distributed values @f$ x \sim U(a, b) @f$.
template <typename Dtype>
class UniformFiller : public Filler<Dtype> {
 public:
  explicit UniformFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    CHECK(blob->count());
    caffe_rng_uniform<Dtype>(blob->count(), Dtype(this->filler_param_.min()),
        Dtype(this->filler_param_.max()), blob->mutable_cpu_data());
    CHECK_EQ(this->filler_param_.sparse(), -1)
        << "Sparsity not supported by this Filler.";
  }
};

/// @brief Fills a Blob with Gaussian-distributed values @f$ x \sim N(\mu, \sigma^2) @f$.
template <typename Dtype>
class GaussianFiller : public Filler<Dtype> {
 public:
  explicit GaussianFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    Dtype* data = blob->mutable_cpu_data();
    CHECK(blob->count());
    caffe_rng_gaussian<Dtype>(blob->count(), Dtype(this->filler_param_.mean()),
        Dtype(this->filler_param_.std()), blob->mutable_cpu_data());
    int sparse = this->filler_param_.sparse();
    CHECK_GE(sparse, -1);
    if (sparse >= 0) {
      // Sparse initialization is implemented for "weight" blobs; i.e. matrices.
      // These have num == channels == 1; width is number of inputs; height is
      // number of outputs. The 'sparse' variable specifies the mean number
      // of non-zero input weights for a given output.
      CHECK_GE(blob->num_axes(), 1);
      const int num_outputs = blob->shape(0);
      Dtype non_zero_probability = Dtype(sparse) / Dtype(num_outputs);
      rand_vec_.reset(new SyncedMemory(blob->count() * sizeof(int)));
      int* mask = reinterpret_cast<int*>(rand_vec_->mutable_cpu_data());
      caffe_rng_bernoulli(blob->count(), non_zero_probability, mask);
      for (int i = 0; i < blob->count(); ++i) {
        data[i] *= mask[i];
      }
    }
  }

 protected:
  shared_ptr<SyncedMemory> rand_vec_;
};

/** @brief Fills a Blob with values @f$ x \in [0, 1] @f$
 *         such that @f$ \forall i \sum_j x_{ij} = 1 @f$.
 */
template <typename Dtype>
class PositiveUnitballFiller : public Filler<Dtype> {
 public:
  explicit PositiveUnitballFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    Dtype* data = blob->mutable_cpu_data();
    DCHECK(blob->count());
    caffe_rng_uniform<Dtype>(blob->count(), 0, 1, blob->mutable_cpu_data());
    // We expect the filler to not be called very frequently, so we will
    // just use a simple implementation
    int dim = blob->count() / blob->num();
    CHECK(dim);
    for (int i = 0; i < blob->num(); ++i) {
      Dtype sum = 0;
      for (int j = 0; j < dim; ++j) {
        sum += data[i * dim + j];
      }
      for (int j = 0; j < dim; ++j) {
        data[i * dim + j] /= sum;
      }
    }
    CHECK_EQ(this->filler_param_.sparse(), -1)
        << "Sparsity not supported by this Filler.";
  }
};

/**
 * @brief Fills a Blob with values @f$ x \sim U(-a, +a) @f$ where @f$ a @f$ is
 *        set inversely proportional to the number of incoming nodes, outgoing
 *        nodes, or their average.
 *
 * A Filler based on the paper [Bengio and Glorot 2010]: Understanding
 * the difficulty of training deep feedforward neural networks.
 *
 * It fills the incoming matrix by randomly sampling uniform data from [-scale,
 * scale] where scale = sqrt(3 / n) where n is the fan_in, fan_out, or their
 * average, depending on the variance_norm option. You should make sure the
 * input blob has shape (num, a, b, c) where a * b * c = fan_in and num * b * c
 * = fan_out. Note that this is currently not the case for inner product layers.
 *
 * TODO(dox): make notation in above comment consistent with rest & use LaTeX.
 */
template <typename Dtype>
class XavierFiller : public Filler<Dtype> {
 public:
  explicit XavierFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    CHECK(blob->count());
    int fan_in = blob->count() / blob->num();
    int fan_out = blob->count() / blob->channels();
    Dtype n = fan_in;  // default to fan_in
    if (this->filler_param_.variance_norm() ==
        FillerParameter_VarianceNorm_AVERAGE) {
      n = (fan_in + fan_out) / Dtype(2);
    } else if (this->filler_param_.variance_norm() ==
        FillerParameter_VarianceNorm_FAN_OUT) {
      n = fan_out;
    }
    Dtype scale = sqrt(Dtype(3) / n);
    caffe_rng_uniform<Dtype>(blob->count(), -scale, scale,
        blob->mutable_cpu_data());
    CHECK_EQ(this->filler_param_.sparse(), -1)
        << "Sparsity not supported by this Filler.";
  }
};

/**
 * @brief Fills a Blob with values @f$ x \sim N(0, \sigma^2) @f$ where
 *        @f$ \sigma^2 @f$ is set inversely proportional to the number of
 *        incoming nodes, outgoing nodes, or their average.
 *
 * A Filler based on the paper [He, Zhang, Ren and Sun 2015]: Specifically
 * accounts for ReLU nonlinearities.
 *
 * Aside: for another perspective on the scaling factor, see the derivation of
 * [Saxe, McClelland, and Ganguli 2013 (v3)].
 *
 * It fills the incoming matrix by randomly sampling Gaussian data with std =
 * sqrt(2 / n) where n is the fan_in, fan_out, or their average, depending on
 * the variance_norm option. You should make sure the input blob has shape (num,
 * a, b, c) where a * b * c = fan_in and num * b * c = fan_out. Note that this
 * is currently not the case for inner product layers.
 */
template <typename Dtype>
class MSRAFiller : public Filler<Dtype> {
 public:
  explicit MSRAFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    CHECK(blob->count());
    int fan_in = blob->count() / blob->num();
    int fan_out = blob->count() / blob->channels();
    Dtype n = fan_in;  // default to fan_in
    if (this->filler_param_.variance_norm() ==
        FillerParameter_VarianceNorm_AVERAGE) {
      n = (fan_in + fan_out) / Dtype(2);
    } else if (this->filler_param_.variance_norm() ==
        FillerParameter_VarianceNorm_FAN_OUT) {
      n = fan_out;
    }
    Dtype std = sqrt(Dtype(2) / n);
    caffe_rng_gaussian<Dtype>(blob->count(), Dtype(0), std,
        blob->mutable_cpu_data());
    CHECK_EQ(this->filler_param_.sparse(), -1)
        << "Sparsity not supported by this Filler.";
  }
};

/*!
@brief Fills a Blob with coefficients for bilinear interpolation.

A common use case is with the DeconvolutionLayer acting as upsampling.
You can upsample a feature map with shape of (B, C, H, W) by any integer factor
using the following proto.
\code
layer {
  name: "upsample", type: "Deconvolution"
  bottom: "{{bottom_name}}" top: "{{top_name}}"
  convolution_param {
    kernel_size: {{2 * factor - factor % 2}} stride: {{factor}}
    num_output: {{C}} group: {{C}}
    pad: {{ceil((factor - 1) / 2.)}}
    weight_filler: { type: "bilinear" } bias_term: false
  }
  param { lr_mult: 0 decay_mult: 0 }
}
\endcode
Please use this by replacing `{{}}` with your values. By specifying
`num_output: {{C}} group: {{C}}`, it behaves as
channel-wise convolution.
The filter shape of this deconvolution layer will be
(C, 1, K, K) where K is `kernel_size`, and this filler will set a (K, K)
interpolation kernel for every channel of the filter identically. The resulting
shape of the top feature map will be (B, C, factor * H, factor * W).
Note that the learning rate and the
weight decay are set to 0 in order to keep coefficient values of bilinear
interpolation unchanged during training. If you apply this to an image, this
operation is equivalent to the following call in Python with Scikit.Image.
\code{.py}
out = skimage.transform.rescale(img, factor, mode='constant', cval=0)
\endcode
 */
template <typename Dtype>
class BilinearFiller : public Filler<Dtype> {
 public:
  explicit BilinearFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    CHECK_EQ(blob->num_axes(), 4) << "Blob must be 4 dim.";
    CHECK_EQ(blob->width(), blob->height()) << "Filter must be square";
    Dtype* data = blob->mutable_cpu_data();
    int f = ceil(blob->width() / 2.);
    float c = (2 * f - 1 - f % 2) / (2. * f);
    for (int i = 0; i < blob->count(); ++i) {
      float x = i % blob->width();
      float y = (i / blob->width()) % blob->height();
      data[i] = (1 - fabs(x / f - c)) * (1 - fabs(y / f - c));
    }
    CHECK_EQ(this->filler_param_.sparse(), -1)
        << "Sparsity not supported by this Filler.";
  }
};


/// @brief Fills a 3x3 filter with a fixed central-difference edge kernel.
template <typename Dtype>
class EdgeXFiller : public Filler<Dtype> {
 public:
  explicit EdgeXFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    CHECK_EQ(blob->num_axes(), 4) << "Blob must be 4 dim.";
    // CHECK_EQ(blob->width(), blob->height()) << "Filter must be square";
    CHECK_EQ(blob->width(), 3) << "Filter must have kernel size equal to 3";
    CHECK_EQ(blob->height(), 3) << "Filter must have kernel size equal to 3";
    Dtype* data = blob->mutable_cpu_data();
    // Earlier (unscaled and Sobel-like) variants kept for reference:
    // float X[9] = {0.0f, -2.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 2.0f, 0.0f};
    // float X[9] = {1.0f/8, 2.0f/8, 1.0f/8, 0.0f/8, 0.0f/8, 0.0f/8, -1.0f/8, -2.0f/8, -1.0f/8};
    float X[9] = {0.0f, -0.5f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.5f, 0.0f};
    for (int i = 0; i < blob->count(); ++i) {
      int j = i % 9;
      data[i] = X[j];
    }
    CHECK_EQ(this->filler_param_.sparse(), -1)
        << "Sparsity not supported by this Filler.";
  }
};


/// @brief Fills a 3x3 filter with a fixed central-difference edge kernel
/// in the orthogonal direction.
template <typename Dtype>
class EdgeYFiller : public Filler<Dtype> {
 public:
  explicit EdgeYFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    CHECK_EQ(blob->num_axes(), 4) << "Blob must be 4 dim.";
    // CHECK_EQ(blob->width(), blob->height()) << "Filter must be square";
    CHECK_EQ(blob->width(), 3) << "Filter must have kernel size equal to 3";
    CHECK_EQ(blob->height(), 3) << "Filter must have kernel size equal to 3";
    std::cout << "kernel_size \t = " << blob->width() << "\t" << blob->height() << "\n";
    Dtype* data = blob->mutable_cpu_data();
    // Earlier (unscaled and Sobel-like) variants kept for reference:
    // float X[9] = {0.0f, 0.0f, 0.0f, -2.0f, 0.0f, 2.0f, 0.0f, 0.0f, 0.0f};
    // float X[9] = {1.0f/8, 0.0f/8, -1.0f/8, 2.0f/8, 0.0f/8, -2.0f/8, 1.0f/8, 0.0f/8, -1.0f/8};
    float X[9] = {0.0f, 0.0f, 0.0f, -0.5f, 0.0f, 0.5f, 0.0f, 0.0f, 0.0f};
    std::cout << "filter: ";
    for (int i = 0; i < blob->count(); ++i) {
      int j = i % 9;
      data[i] = X[j];
      std::cout << data[i] << " ";
    }
    CHECK_EQ(this->filler_param_.sparse(), -1)
        <<
"Sparsity not supported by this Filler."; 314 | } 315 | }; 316 | 317 | 318 | template 319 | class LaplacianFiller : public Filler { 320 | public: 321 | explicit LaplacianFiller(const FillerParameter& param) 322 | : Filler(param) {} 323 | virtual void Fill(Blob* blob) { 324 | CHECK_EQ(blob->num_axes(), 4) << "Blob must be 4 dim."; 325 | //CHECK_EQ(blob->width(), blob->height()) << "Filter must be square"; 326 | CHECK_EQ( blob->width(),3) << "Filter must have kernel size equals 3"; 327 | CHECK_EQ(blob->height() ,3 ) << "Filter must have kernel size equals 3"; 328 | Dtype* data = blob->mutable_cpu_data(); 329 | Dtype X[9] = {1.0f/6.0f, 2.0f/3.0f, 1.0f/6.0f, 2.0f/3.0f, -10.0f/3.0f ,2.0f/3.0f, 1.0f/6.0f, 2.0f/3.0f, 1.0f/6.0f }; 330 | for (int i = 0; i < blob->count(); ++i) { 331 | int j = i%9; 332 | data[i] = X[j]; 333 | } 334 | CHECK_EQ(this->filler_param_.sparse(), -1) 335 | << "Sparsity not supported by this Filler."; 336 | } 337 | }; 338 | 339 | template 340 | class LoGFiller : public Filler { 341 | public: 342 | explicit LoGFiller(const FillerParameter& param) 343 | : Filler(param) {} 344 | virtual void Fill(Blob* blob) { 345 | CHECK_EQ(blob->num_axes(), 4) << "Blob must be 4 dim."; 346 | //CHECK_EQ(blob->width(), blob->height()) << "Filter must be square"; 347 | CHECK_EQ( blob->width()%2,1) << "Filter must have odd kernel size"; 348 | CHECK_EQ(blob->height()%2,1) << "Filter must have odd kernel size"; 349 | CHECK_EQ(blob->height(),blob->width()) << "Filter must be square"; 350 | Dtype* data = blob->mutable_cpu_data(); 351 | Dtype sigma = Dtype(this->filler_param_.std()); 352 | float HalfWidth = (blob->height() - 1)/2; 353 | //float FilSize = (2*HalfWidth+1)*(2*HalfWidth+1); 354 | Dtype pi = Dtype(3.1415); 355 | for (int ii = 0; ii < blob->count(); ++ii) { 356 | 357 | // i,j is the x,y index of the filter 358 | float i = ii % blob->width(); 359 | float j = (ii / blob->width()) % blob->height(); 360 | // if thr was group then ind will give subindex of the filter inside group 361 | //int ind = ii%FilSize; 362 | //float i = floor(ind/(2*HalfWidth+1)); 363 | //float j = ind%(2*HalfWidth+1); 364 | // center is zero 365 | Dtype dist = (pow(i - HalfWidth ,2) + pow(j - HalfWidth ,2)) / (2*pow(sigma,2)); 366 | // Finally the value 367 | data[ii] = - exp(-dist) * (1-dist) / (pi * pow(sigma,4)); 368 | } 369 | CHECK_EQ(this->filler_param_.sparse(), -1) 370 | << "Sparsity not supported by this Filler."; 371 | } 372 | }; 373 | 374 | template 375 | class CentralDifferenceFiller : public Filler { 376 | public: 377 | explicit CentralDifferenceFiller(const FillerParameter& param) 378 | : Filler(param) {} 379 | virtual void Fill(Blob* blob) { 380 | CHECK_EQ(blob->num_axes(), 4) << "Blob must be 4 dim."; 381 | CHECK_EQ(blob->channels(), 4) << "Filter must have 4 channels"; 382 | CHECK_EQ( blob->width(),3) << "Filter must have kernel size equals 3"; 383 | CHECK_EQ(blob->height() ,3 ) << "Filter must have kernel size equals 3"; 384 | Dtype* data = blob->mutable_cpu_data(); 385 | float X[9] = {0.0f ,0.0f ,0.0f, -0.5f, 0.0f, 0.5f, 0.0f, 0.0f, 0.0f}; 386 | float Y[9] = {0.0f ,-0.5f ,0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.5f, 0.0f}; 387 | float DiagRL[9] = {0.0f , 0.0f ,0.5f, 0.0f, 0.0f, 0.0f, 0.5f, 0.0f, 0.0f}; 388 | float DiagLR[9] = {0.5f , 0.0f ,0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.5f}; 389 | 390 | for (int i=0; icount();++i){ 391 | int j = i%9; 392 | int c = i/9%4; 393 | switch (c){ 394 | case 0: 395 | data[i] = X[j]; 396 | case 1: 397 | data[i] = Y[j]; 398 | case 2: 399 | data[i] = DiagRL[j]; 400 | case 3: 401 | 
/// @brief Fills a 4-channel 3x3 filter bank with central-difference kernels
/// along x, y and the two diagonals.
template <typename Dtype>
class CentralDifferenceFiller : public Filler<Dtype> {
 public:
  explicit CentralDifferenceFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    CHECK_EQ(blob->num_axes(), 4) << "Blob must be 4 dim.";
    CHECK_EQ(blob->channels(), 4) << "Filter must have 4 channels";
    CHECK_EQ(blob->width(), 3) << "Filter must have kernel size equal to 3";
    CHECK_EQ(blob->height(), 3) << "Filter must have kernel size equal to 3";
    Dtype* data = blob->mutable_cpu_data();
    float X[9]      = {0.0f, 0.0f, 0.0f, -0.5f, 0.0f, 0.5f, 0.0f, 0.0f, 0.0f};
    float Y[9]      = {0.0f, -0.5f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.5f, 0.0f};
    float DiagRL[9] = {0.0f, 0.0f, 0.5f, 0.0f, 0.0f, 0.0f, 0.5f, 0.0f, 0.0f};
    float DiagLR[9] = {0.5f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.5f};

    for (int i = 0; i < blob->count(); ++i) {
      int j = i % 9;      // position inside the 3x3 kernel
      int c = i / 9 % 4;  // which of the 4 direction channels
      switch (c) {
        case 0:
          data[i] = X[j];
          break;
        case 1:
          data[i] = Y[j];
          break;
        case 2:
          data[i] = DiagRL[j];
          break;
        case 3:
          data[i] = DiagLR[j];
          break;
      }
    }

    CHECK_EQ(this->filler_param_.sparse(), -1)
        << "Sparsity not supported by this Filler.";
  }
};

/// @brief Fills an 8-channel 3x3 filter bank with forward-difference kernels
/// towards each of the 8 neighbours.
template <typename Dtype>
class ForwardDifferenceFiller : public Filler<Dtype> {
 public:
  explicit ForwardDifferenceFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    CHECK_EQ(blob->num_axes(), 4) << "Blob must be 4 dim.";
    // CHECK_EQ(blob->channels(), 8) << "Filter must have 8 channels";
    CHECK_EQ(blob->width(), 3) << "Filter must have kernel size equal to 3";
    CHECK_EQ(blob->height(), 3) << "Filter must have kernel size equal to 3";
    Dtype* data = blob->mutable_cpu_data();
    float TopRight[9]    = {0.0f, 0.0f, 1.0f, 0.0f, -1.0f, 0.0f, 0.0f, 0.0f, 0.0f};
    float Right[9]       = {0.0f, 0.0f, 0.0f, 0.0f, -1.0f, 1.0f, 0.0f, 0.0f, 0.0f};
    float BottomRight[9] = {0.0f, 0.0f, 0.0f, 0.0f, -1.0f, 0.0f, 0.0f, 0.0f, 1.0f};
    float Bottom[9]      = {0.0f, 0.0f, 0.0f, 0.0f, -1.0f, 0.0f, 0.0f, 1.0f, 0.0f};
    float BottomLeft[9]  = {0.0f, 0.0f, 0.0f, 0.0f, -1.0f, 0.0f, 1.0f, 0.0f, 0.0f};
    float Left[9]        = {0.0f, 0.0f, 0.0f, 1.0f, -1.0f, 0.0f, 0.0f, 0.0f, 0.0f};
    float TopLeft[9]     = {1.0f, 0.0f, 0.0f, 0.0f, -1.0f, 0.0f, 0.0f, 0.0f, 0.0f};
    float Top[9]         = {0.0f, 1.0f, 0.0f, 0.0f, -1.0f, 0.0f, 0.0f, 0.0f, 0.0f};

    for (int i = 0; i < blob->count(); ++i) {
      int j = i % 9;      // position inside the 3x3 kernel
      int c = i / 9 % 8;  // which of the 8 direction channels
      switch (c) {
        case 0:
          data[i] = Right[j];
          break;
        case 1:
          data[i] = BottomRight[j];
          break;
        case 2:
          data[i] = Bottom[j];
          break;
        case 3:
          data[i] = BottomLeft[j];
          break;
        case 4:
          data[i] = Left[j];
          break;
        case 5:
          data[i] = TopLeft[j];
          break;
        case 6:
          data[i] = Top[j];
          break;
        case 7:
          data[i] = TopRight[j];
          break;
      }
    }

    CHECK_EQ(this->filler_param_.sparse(), -1)
        << "Sparsity not supported by this Filler.";
  }
};


/*
// An earlier LoGFiller variant with a fixed 5x5 kernel, kept for reference.
template <typename Dtype>
class LoGFiller : public Filler<Dtype> {
 public:
  explicit LoGFiller(const FillerParameter& param)
      : Filler<Dtype>(param) {}
  virtual void Fill(Blob<Dtype>* blob) {
    CHECK_EQ(blob->num_axes(), 4) << "Blob must be 4 dim.";
    // CHECK_EQ(blob->width(), blob->height()) << "Filter must be square";
    CHECK_EQ(blob->width(), 5) << "Filter must have kernel size equal to 5";
    CHECK_EQ(blob->height(), 5) << "Filter must have kernel size equal to 5";
    Dtype* data = blob->mutable_cpu_data();
    Dtype X[25] = {0.0448,0.0468,0.0564,0.0468,0.0448,
                   0.0468,0.3167,0.7146,0.3167,0.0468,
                   0.0564,0.7146,-4.9048,0.7146,0.0564,
                   0.0468,0.3167,0.7146,0.3167,0.0468,
                   0.0448,0.0468,0.0564,0.0468,0.0448};
    for (int i = 0; i < blob->count(); ++i) {
      int j = i%25;
      data[i] = X[j];
    }
    CHECK_EQ(this->filler_param_.sparse(), -1)
        << "Sparsity not supported by this Filler.";
  }
};
*/


/**
 * @brief Get a specific filler from the specification given in FillerParameter.
 *
 * Ideally this would be replaced by a factory pattern, but we will leave it
 * this way for now.
 */
template <typename Dtype>
Filler<Dtype>* GetFiller(const FillerParameter& param) {
  const std::string& type = param.type();
  if (type == "constant") {
    return new ConstantFiller<Dtype>(param);
  } else if (type == "gaussian") {
    return new GaussianFiller<Dtype>(param);
  } else if (type == "positive_unitball") {
    return new PositiveUnitballFiller<Dtype>(param);
  } else if (type == "uniform") {
    return new UniformFiller<Dtype>(param);
  } else if (type == "xavier") {
    return new XavierFiller<Dtype>(param);
  } else if (type == "msra") {
    return new MSRAFiller<Dtype>(param);
  } else if (type == "bilinear") {
    return new BilinearFiller<Dtype>(param);
  } else if (type == "EdgeX") {
    return new EdgeXFiller<Dtype>(param);
  } else if (type == "EdgeY") {
    return new EdgeYFiller<Dtype>(param);
  } else if (type == "Laplacian") {
    return new LaplacianFiller<Dtype>(param);
  } else if (type == "LoG") {
    return new LoGFiller<Dtype>(param);
  } else if (type == "ForwardDiff") {
    return new ForwardDifferenceFiller<Dtype>(param);
  } else if (type == "CentralDiff") {
    return new CentralDifferenceFiller<Dtype>(param);
  } else {
    CHECK(false) << "Unknown filler name: " << param.type();
  }
  return (Filler<Dtype>*)(NULL);
}

}  // namespace caffe

#endif  // CAFFE_FILLER_HPP_
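For intuition, the upsampling coefficients produced by `BilinearFiller` above can be reproduced in a few lines of NumPy. This is an illustrative sketch, not part of the repo; the helper name is hypothetical.

```
import numpy as np

def bilinear_kernel(k):
    # Reproduce the (k, k) kernel BilinearFiller writes into each channel:
    # data = (1 - |x/f - c|) * (1 - |y/f - c|), which is separable in x and y.
    f = int(np.ceil(k / 2.0))
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    x = np.arange(k)
    wx = 1 - np.abs(x / float(f) - c)
    return np.outer(wx, wx)

print(bilinear_kernel(4))  # kernel_size for factor-2 upsampling (2*2 - 2%2)
```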
--------------------------------------------------------------------------------
/caffe/include/caffe/layers/abs_loss_layer.hpp:
--------------------------------------------------------------------------------
#ifndef CAFFE_ABS_LOSS_LAYER_HPP_
#define CAFFE_ABS_LOSS_LAYER_HPP_

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

#include "caffe/layers/loss_layer.hpp"

namespace caffe {

/**
 * @brief Computes the L1 loss @f$
 *          E = \frac{1}{N} \sum\limits_{n=1}^N \left\| \hat{y}_n - y_n
 *          \right\|_1 @f$ for real-valued regression tasks.
 *
 * @param bottom input Blob vector (length 2)
 *   -# @f$ (N \times C \times H \times W) @f$
 *      the predictions @f$ \hat{y} \in [-\infty, +\infty]@f$
 *   -# @f$ (N \times C \times H \times W) @f$
 *      the targets @f$ y \in [-\infty, +\infty]@f$
 * @param top output Blob vector (length 1)
 *   -# @f$ (1 \times 1 \times 1 \times 1) @f$
 *      the computed L1 loss: @f$ E =
 *          \frac{1}{N} \sum\limits_{n=1}^N \left\| \hat{y}_n - y_n
 *          \right\|_1 @f$
 *
 * This can be used for regression tasks. (The following note is inherited
 * from the EuclideanLossLayer documentation:) An InnerProductLayer input to
 * an EuclideanLossLayer exactly formulates a linear least squares regression
 * problem. With non-zero weight decay the problem becomes one of ridge
 * regression -- see src/caffe/test/test_sgd_solver.cpp for a concrete example
 * wherein we check that the gradients computed for a Net with exactly this
 * structure match hand-computed gradient formulas for ridge regression.
 *
 * (Note: Caffe, and SGD in general, is certainly \b not the best way to solve
 * linear least squares problems! We use it only as an instructive example.)
 */
template <typename Dtype>
class AbsLossLayer : public LossLayer<Dtype> {
 public:
  explicit AbsLossLayer(const LayerParameter& param)
      : LossLayer<Dtype>(param), diff_() {}
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  virtual inline const char* type() const { return "AbsLoss"; }
  /**
   * Unlike most loss layers, the AbsLossLayer can backpropagate
   * to both inputs -- override to return true and always allow force_backward.
   */
  virtual inline bool AllowForceBackward(const int bottom_index) const {
    return true;
  }

 protected:
  /// @copydoc AbsLossLayer
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  /**
   * @brief Computes the L1 error gradient w.r.t. the inputs.
   *
   * Unlike other children of LossLayer, AbsLossLayer \b can compute
   * gradients with respect to the label inputs bottom[1] (but still only will
   * if propagate_down[1] is set, due to being produced by learnable parameters
   * or if force_backward is set). In fact, this layer is "commutative" -- the
   * result is the same regardless of the order of the two bottoms.
   *
   * @param top output Blob vector (length 1), providing the error gradient with
   *      respect to the outputs
   *   -# @f$ (1 \times 1 \times 1 \times 1) @f$
   *      This Blob's diff will simply contain the loss_weight* @f$ \lambda @f$,
   *      as @f$ \lambda @f$ is the coefficient of this layer's output
   *      @f$\ell_i@f$ in the overall Net loss
   *      @f$ E = \lambda_i \ell_i + \mbox{other loss terms}@f$; hence
   *      @f$ \frac{\partial E}{\partial \ell_i} = \lambda_i @f$.
   *      (*Assuming that this top Blob is not used as a bottom (input) by any
   *      other layer of the Net.)
   * @param propagate_down see Layer::Backward.
   * @param bottom input Blob vector (length 2)
   *   -# @f$ (N \times C \times H \times W) @f$
   *      the predictions @f$\hat{y}@f$; Backward fills their diff with
   *      gradients @f$
   *        \frac{\partial E}{\partial \hat{y}_n} =
   *          \frac{1}{N} \, \mathrm{sign}(\hat{y}_n - y_n)
   *      @f$ if propagate_down[0]
   *   -# @f$ (N \times C \times H \times W) @f$
   *      the targets @f$y@f$; Backward fills their diff with gradients
   *      @f$ \frac{\partial E}{\partial y_n} =
   *          \frac{1}{N} \, \mathrm{sign}(y_n - \hat{y}_n)
   *      @f$ if propagate_down[1]
   */
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);

  Blob<Dtype> diff_;
};

}  // namespace caffe

#endif  // CAFFE_ABS_LOSS_LAYER_HPP_

--------------------------------------------------------------------------------
/caffe/include/caffe/layers/geometry_transformation.hpp:
--------------------------------------------------------------------------------
#ifndef CAFFE_GEOTRANSFORM_LAYER_HPP_
#define CAFFE_GEOTRANSFORM_LAYER_HPP_

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

namespace caffe {

/**
 * @brief Back-projects a depth map into 3D points using the camera
 *        intrinsics and applies a rigid-body transformation to them.
 */
template <typename Dtype>
class GeoTransformLayer : public Layer<Dtype> {
 public:
  explicit GeoTransformLayer(const LayerParameter& param)
      : Layer<Dtype>(param) {}
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  virtual inline const char* type() const { return "GeoTransform"; }

  virtual inline bool AllowForceBackward(const int bottom_index) const {
    return bottom_index == 0;
  }
  virtual inline int ExactNumBottomBlobs() const { return -1; }
  virtual inline int ExactNumTopBlobs() const { return -1; }

 protected:
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
};

}  // namespace caffe

#endif  // CAFFE_GEOTRANSFORM_LAYER_HPP_
--------------------------------------------------------------------------------
/caffe/include/caffe/layers/inverse_warping_layer.hpp:
--------------------------------------------------------------------------------
#ifndef CAFFE_INVERSE_WARPING_LAYER_HPP_
#define CAFFE_INVERSE_WARPING_LAYER_HPP_

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

namespace caffe {
/**
 * @brief Computes a warped image given an input image and per-pixel
 *        projection coordinates (inverse warping).
 *
 * TODO(dox): thorough documentation for Forward, Backward, and proto params.
 */
template <typename Dtype>
class InverseWarpingLayer : public Layer<Dtype> {
 public:
  explicit InverseWarpingLayer(const LayerParameter& param)
      : Layer<Dtype>(param) {}
  virtual void LayerSetUp(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  // The InverseWarping layer outputs a warped image given an input image and
  // projection coordinates.
  virtual inline const char* type() const { return "InverseWarping"; }
  // virtual inline int MinBottomBlobs() const { return 3; }
  // virtual inline int ExactNumTopBlobs() const { return 1; }
  // If there are 2 top blobs, the external occlusion mask is also returned:
  // virtual inline int MinTopBlobs() const { return 1; }
  // virtual inline int MaxTopBlobs() const { return 2; }

 protected:
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);

  // int kernel_h_, kernel_w_;
  // int stride_h_, stride_w_;
  // int pad_h_, pad_w_;
  // int channels_;
  // int height_, width_;

  // to use top_mask
  // bool mask_flag;
  // Blob<Dtype> ext_occ;
  // int ext_occ_panelty = 0.01;

  // Blob<Dtype> warp_image;

  /* int pooled_height_, pooled_width_;
  bool global_pooling_;
  Blob<Dtype> rand_idx_;
  Blob<int> max_idx_; */
};

}  // namespace caffe

#endif  // CAFFE_INVERSE_WARPING_LAYER_HPP_

--------------------------------------------------------------------------------
/caffe/include/caffe/layers/pin_hole_layer.hpp:
--------------------------------------------------------------------------------
#ifndef CAFFE_PINHOLE_LAYER_HPP_
#define CAFFE_PINHOLE_LAYER_HPP_

#include <vector>

#include "caffe/blob.hpp"
#include "caffe/layer.hpp"
#include "caffe/proto/caffe.pb.h"

namespace caffe {

template <typename Dtype>
class PinHoleLayer : public Layer<Dtype> {
 public:
  explicit PinHoleLayer(const LayerParameter& param)
      : Layer<Dtype>(param) {}
  virtual void Reshape(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  virtual inline const char* type() const { return "PinHole"; }

  virtual inline bool AllowForceBackward(const int bottom_index) const {
    return bottom_index == 0;
  }
  virtual inline int ExactNumBottomBlobs() const { return -1; }
  virtual inline int ExactNumTopBlobs() const { return -1; }

 protected:
  virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);
  virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
      const vector<Blob<Dtype>*>& top);

  virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
  virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
      const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom);
};

}  // namespace caffe

#endif  // CAFFE_PINHOLE_LAYER_HPP_
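To see how these geometry layers fit together, here is a NumPy sketch of the math they implement: back-projection of a depth map into 3D with the intrinsics, a 4x4 rigid motion (the GeoTransform step, mirroring the formulas in geometry_transformation.cu), and a pinhole projection back to pixel coordinates that drive the inverse warping. The sketch is illustrative only; the function names, toy intrinsics and the `1e-6` guard are assumptions.

```
import numpy as np

def geo_transform(depth, T, fx, fy, cx, cy):
    # Back-project depth (H, W) to 3D and apply the 4x4 transform T.
    h, w = depth.shape
    y, x = np.mgrid[0:h, 0:w]
    X = (x - cx) / fx * depth           # camera-frame coordinates
    Y = (y - cy) / fy * depth
    pts = np.stack([X, Y, depth, np.ones_like(depth)], axis=0).reshape(4, -1)
    return np.dot(T, pts)[:3]           # transformed 3D points, (3, H*W)

def pinhole_project(pts, fx, fy, cx, cy):
    # Project 3D points back to pixel coordinates with the pinhole model.
    Z = np.maximum(pts[2], 1e-6)        # guard against division by zero
    return fx * pts[0] / Z + cx, fy * pts[1] / Z + cy

# identity motion maps every pixel back to itself
depth = np.ones((4, 5))
u, v = pinhole_project(geo_transform(depth, np.eye(4), 10, 10, 2.0, 1.5),
                       10, 10, 2.0, 1.5)
print(u.reshape(4, 5)[0], v.reshape(4, 5)[:, 0])  # [0 1 2 3 4], [0 1 2 3]
```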
--------------------------------------------------------------------------------
/caffe/python/pygeometry.py:
--------------------------------------------------------------------------------
import caffe
import numpy as np
import time


class SE3_Generator_KITTI(caffe.Layer):
    """
    SE3_Generator takes 6 transformation parameters (se3) and generates the
    corresponding 4x4 transformation matrix.
    Input:
        bottom[0] | se3 | shape is (batchsize, 6, 1, 1)
    Output:
        top[0]    | SE3 | shape is (batchsize, 1, 4, 4)
    """

    def setup(self, bottom, top):
        # check input pair
        if len(bottom) != 1:
            raise Exception("Need one input to compute transformation matrix.")

        # Define variables
        self.batchsize = bottom[0].num
        self.threshold = 1e-12

    def reshape(self, bottom, top):
        # check input dimension
        if bottom[0].count % 6 != 0:  # bottom shape is (batchsize, 6, 1, 1)
            raise Exception("Inputs must have the correct dimension.")
        # Output is a 4x4 transformation matrix
        top[0].reshape(bottom[0].num, 1, 4, 4)

    def forward(self, bottom, top):
        # Define the skew-symmetric matrix of so3, shape = (batchsize, 1, 3, 3)
        self.uw = bottom[0].data[:, :3]

        self.uw_x = np.zeros((self.batchsize, 1, 3, 3))
        self.uw_x[:, 0, 0, 1] = -self.uw[:, 2, 0, 0]
        self.uw_x[:, 0, 0, 2] = self.uw[:, 1, 0, 0]
        self.uw_x[:, 0, 1, 0] = self.uw[:, 2, 0, 0]
        self.uw_x[:, 0, 1, 2] = -self.uw[:, 0, 0, 0]
        self.uw_x[:, 0, 2, 0] = -self.uw[:, 1, 0, 0]
        self.uw_x[:, 0, 2, 1] = self.uw[:, 0, 0, 0]

        # Get the translation lie algebra
        self.ut = bottom[0].data[:, 3:]
        self.ut = np.reshape(self.ut, (self.batchsize, 1, 3, 1))
        # Calculate SO3 and T, i.e. rotation matrix (batchsize,1,3,3) and
        # translation (batchsize,1,3,1), via the Rodrigues formula
        self.R = np.zeros((self.batchsize, 1, 3, 3))
        self.R[:, 0] = np.eye(3)
        self.theta = np.linalg.norm(self.uw, axis=1)  # one rotation angle per batch element
        for i in range(self.batchsize):
            if self.theta[i]**2 < self.threshold:
                # first-order approximation for very small rotations
                self.R[i, 0] += self.uw_x[i, 0]
                # self.V[i,0] += 0.5 * self.uw_x[i,0]
                continue
            else:
                c1 = np.sin(self.theta[i])/self.theta[i]
                c2 = 2*np.sin(self.theta[i]/2)**2/self.theta[i]**2
                c3 = ((self.theta[i] - np.sin(self.theta[i]))/self.theta[i]**3)**2
                self.R[i, 0] += c1*self.uw_x[i, 0] + c2*np.dot(self.uw_x[i, 0], self.uw_x[i, 0])
                # self.V[i,0] += c2*self.uw_x[i,0] + c3*np.dot(self.uw_x[i,0],self.uw_x[i,0])

        # Calculate output
        top[0].data[:, :, :3, :3] = self.R
        # top[0].data[:,:,:3,3] = np.matmul(self.V, self.ut)[:,:,:,0]
        # Rt implementation: t = R * ut
        top[0].data[:, :, :3, 3] = np.matmul(self.R, self.ut)[:, :, :, 0]
        top[0].data[:, :, 3, 3] = 1
    def backward(self, top, propagate_down, bottom):
        if propagate_down[0]:
            # top[0].diff has shape (batchsize, 1, 4, 4)
            dLdT = top[0].diff[:, :, :3, 3].copy()  # (batchsize, 1, 3)
            dLdT = dLdT[:, np.newaxis]

            # Rt implementation: dLdut is dLdT x R
            # dLdut = np.matmul(dLdT, self.V)
            dLdut = np.matmul(dLdT, self.R)
            bottom[0].diff[:, 3:, 0, 0] = dLdut[:, 0, 0]
            # Gradient correction for dLdR: since R also affects T (t = R*ut),
            # dLdR needs an extra outer-product term dLdT^T * ut^T
            grad_corr = np.matmul(np.swapaxes(dLdT, 2, 3), np.swapaxes(self.ut, 2, 3))

            # dLduw
            dLdR = top[0].diff[:, :, :3, :3]
            dLdR += grad_corr
            dLduw = np.zeros((self.batchsize, 3))
            # generators of so(3), used when theta is below the threshold
            generators = np.zeros((3, 3, 3))
            generators[0] = np.array([[0, 0, 0], [0, 0, 1], [0, -1, 0]])
            generators[1] = np.array([[0, 0, -1], [0, 0, 0], [1, 0, 0]])
            generators[2] = np.array([[0, 1, 0], [-1, 0, 0], [0, 0, 0]])
            for index in range(3):
                I3 = np.zeros((self.batchsize, 1, 3, 3))
                I3[:, 0] = np.eye(3)
                ei = np.zeros((self.batchsize, 1, 3, 1))
                ei[:, 0, index] = 1
                cross_term = np.matmul(self.uw_x, np.matmul(I3 - self.R, ei))
                cross = np.zeros((self.batchsize, 1, 3, 3))
                cross[:, 0, 0, 1] = -cross_term[:, 0, 2, 0]
                cross[:, 0, 0, 2] = cross_term[:, 0, 1, 0]
                cross[:, 0, 1, 0] = cross_term[:, 0, 2, 0]
                cross[:, 0, 1, 2] = -cross_term[:, 0, 0, 0]
                cross[:, 0, 2, 0] = -cross_term[:, 0, 1, 0]
                cross[:, 0, 2, 1] = cross_term[:, 0, 0, 0]
                self.dRduw_i = np.zeros((self.batchsize, 1, 3, 3))
                for j in range(self.batchsize):
                    if self.theta[j]**2 < self.threshold:
                        self.dRduw_i[j] = generators[index]
                    else:
                        self.dRduw_i[j, 0] = np.matmul((self.uw[j, index]*self.uw_x[j, 0] + cross[j, 0])/(self.theta[j]**2), self.R[j, 0])
                dLduw[:, index] = np.sum(np.sum(dLdR*self.dRduw_i, axis=2), axis=2)[:, 0]

            bottom[0].diff[:, :3, 0, 0] = dLduw
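As a sanity check on the forward pass above, the rotation block can be reproduced for a single se(3) vector with a standalone Rodrigues-style function. This NumPy sketch is not part of the repo; it only mirrors the `R = I + c1*[w]_x + c2*[w]_x^2` and `t = R*ut` ("Rt") convention used by `SE3_Generator_KITTI`.

```
import numpy as np

def se3_to_SE3(xi):
    # xi = (w, t): rotation part w and translation lie algebra t
    w, t = xi[:3], xi[3:]
    wx = np.array([[0, -w[2], w[1]],
                   [w[2], 0, -w[0]],
                   [-w[1], w[0], 0]])
    theta = np.linalg.norm(w)
    if theta ** 2 < 1e-12:
        R = np.eye(3) + wx                    # small-angle approximation
    else:
        c1 = np.sin(theta) / theta
        c2 = 2 * np.sin(theta / 2) ** 2 / theta ** 2
        R = np.eye(3) + c1 * wx + c2 * np.dot(wx, wx)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.dot(R, t)                   # 'Rt' translation convention
    return T

T = se3_to_SE3(np.array([0.0, 0.0, np.pi / 2, 1.0, 0.0, 0.0]))
print(np.round(T, 3))  # 90-degree yaw; translation rotated into the new frame
```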
--------------------------------------------------------------------------------
/caffe/src/caffe/layers/abs_loss_layer.cpp:
--------------------------------------------------------------------------------
#include <vector>

#include "caffe/layers/abs_loss_layer.hpp"
#include "caffe/util/math_functions.hpp"

namespace caffe {

template <typename Dtype>
void AbsLossLayer<Dtype>::Reshape(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  LossLayer<Dtype>::Reshape(bottom, top);
  CHECK_EQ(bottom[0]->count(1), bottom[1]->count(1))
      << "Inputs must have the same dimension.";
  diff_.ReshapeLike(*bottom[1]);
}

template <typename Dtype>
void AbsLossLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  int count = bottom[0]->count();
  // compute the difference and store it for backprop
  caffe_sub(
      count,
      bottom[0]->cpu_data(),
      bottom[1]->cpu_data(),
      diff_.mutable_cpu_data());

  Dtype loss = caffe_cpu_asum(count, diff_.cpu_data()) / bottom[0]->num();

  // Dtype dot = caffe_cpu_dot(count, diff_.cpu_data(), diff_.cpu_data());
  // Dtype loss = dot / bottom[0]->num() / Dtype(2);
  top[0]->mutable_cpu_data()[0] = loss;
}

/* GOOD-to-know comment (Ravi):
For loss layers, there is no next layer, and so the top diff blob is
technically undefined and unused - but Caffe is using this preallocated space
to store unrelated data: Caffe supports multiplying loss layers with a
user-defined weight (loss_weight in the prototxt); this information (a single
scalar floating point number) is stored in the first element of the diff array
of the top blob. That's why you'll see in every loss layer that they multiply
by that amount to support that functionality. This is explained in Caffe's
tutorial about the loss layer.

This weight is usually used to add auxiliary losses to the network. You can
read more about it in Google's Going Deeper with Convolutions or in
Deeply-Supervised Nets.
*/

template <typename Dtype>
void AbsLossLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  for (int i = 0; i < 2; ++i) {
    if (propagate_down[i]) {
      int count = bottom[i]->count();
      const Dtype sign = (i == 0) ? 1 : -1;
      const Dtype alpha = sign * top[0]->cpu_diff()[0] / bottom[i]->num();
      Dtype* bottom_diff = bottom[i]->mutable_cpu_diff();
      const Dtype* diff = diff_.cpu_data();
      // the gradient of |x| is the sign of x (zero mapped to -1 here)
      for (int j = 0; j < count; ++j) {
        bottom_diff[j] = alpha * (diff[j] > 0) - alpha * (diff[j] <= 0);
      }

      /* Alternative implementations kept for reference:
      caffe_cpu_sign(count, diff_.cpu_data(), bottom[i]->mutable_cpu_diff());
      caffe_scal(count, alpha, bottom[i]->mutable_gpu_diff(), 1);
      caffe_cpu_axpby(
          count,                           // count
          alpha,                           // alpha
          bottom[i]->mutable_cpu_diff(),   // a
          Dtype(0),                        // beta
          bottom[i]->mutable_cpu_diff());  // b

      const Dtype sign = (i == 0) ? 1 : -1;
      const Dtype alpha = sign * top[0]->cpu_diff()[0] / bottom[i]->num();
      caffe_cpu_axpby(
          bottom[i]->count(),              // count
          alpha,                           // alpha
          diff_.cpu_data(),                // a
          Dtype(0),                        // beta
          bottom[i]->mutable_cpu_diff());  // b
      */
    }
  }
}

#ifdef CPU_ONLY
STUB_GPU(AbsLossLayer);
#endif

INSTANTIATE_CLASS(AbsLossLayer);
REGISTER_LAYER_CLASS(AbsLoss);

}  // namespace caffe
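The forward/backward pair above reduces to a few lines of NumPy. This illustrative sketch (not part of the repo; it normalizes by the batch size N in the same way) can be used to sanity-check the layer:

```
import numpy as np

def abs_loss_forward(pred, target):
    # loss = sum |pred - target| / N, with N the batch size (first axis)
    diff = pred - target
    return np.abs(diff).sum() / pred.shape[0], diff

def abs_loss_backward(diff, loss_weight=1.0):
    # d loss / d pred = loss_weight * sign(diff) / N, with zero treated as
    # negative, matching the (diff > 0) - (diff <= 0) expression in the layer
    n = diff.shape[0]
    sign = np.where(diff > 0, 1.0, -1.0)
    return loss_weight * sign / n, -loss_weight * sign / n

pred = np.array([[1.0, -2.0], [0.5, 0.0]])
target = np.zeros((2, 2))
loss, diff = abs_loss_forward(pred, target)
print(loss)                       # (1 + 2 + 0.5 + 0) / 2 = 1.75
print(abs_loss_backward(diff)[0])
```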
--------------------------------------------------------------------------------
/caffe/src/caffe/layers/abs_loss_layer.cu:
--------------------------------------------------------------------------------
#include <vector>

#include "caffe/layers/abs_loss_layer.hpp"
#include "caffe/util/math_functions.hpp"

namespace caffe {

template <typename Dtype>
void AbsLossLayer<Dtype>::Forward_gpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  int count = bottom[0]->count();
  caffe_gpu_sub(
      count,
      bottom[0]->gpu_data(),
      bottom[1]->gpu_data(),
      diff_.mutable_gpu_data());
  Dtype loss;
  caffe_gpu_asum(count, diff_.gpu_data(), &loss);
  loss = loss / bottom[0]->num();
  // Dtype dot;
  // caffe_gpu_dot(count, diff_.gpu_data(), diff_.gpu_data(), &dot);
  // Dtype loss = dot / bottom[0]->num() / Dtype(2);
  top[0]->mutable_cpu_data()[0] = loss;
}

template <typename Dtype>
__global__ void AbsLossBackward(const int n,
    const Dtype* diff_data, Dtype* out_diff, const Dtype alpha) {
  CUDA_KERNEL_LOOP(index, n) {
    // gradient of |x|: +alpha where the stored difference is positive,
    // -alpha otherwise
    out_diff[index] = alpha * ((diff_data[index] > 0) - (diff_data[index] <= 0));
  }
}

template <typename Dtype>
void AbsLossLayer<Dtype>::Backward_gpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  for (int i = 0; i < 2; ++i) {
    if (propagate_down[i]) {
      int count = bottom[i]->count();
      const Dtype sign = (i == 0) ? 1 : -1;
      const Dtype alpha = sign * top[0]->cpu_diff()[0] / bottom[i]->num();

      Dtype* bottom_diff = bottom[i]->mutable_gpu_diff();
      const Dtype* diff = diff_.gpu_data();
      AbsLossBackward<Dtype><<<CAFFE_GET_BLOCKS(count), CAFFE_CUDA_NUM_THREADS>>>(
          count, diff, bottom_diff, alpha);
      CUDA_POST_KERNEL_CHECK;
      /* Alternative implementations kept for reference:
      for (int j = 0; j < count; ++j) {
        bottom_diff[j] = alpha * (diff[j] > 0) - alpha * (diff[j] < 0);
      }
      caffe_gpu_sign(count, diff_.gpu_data(), bottom[i]->mutable_gpu_diff());
      caffe_gpu_scal(count, alpha, bottom[i]->mutable_gpu_diff(), 1);
      caffe_gpu_axpby(
          bottom[i]->count(),              // count
          alpha,                           // alpha
          diff_.gpu_data(),                // a
          Dtype(0),                        // beta
          bottom[i]->mutable_gpu_diff());  // b
      */
    }
  }
}

INSTANTIATE_LAYER_GPU_FUNCS(AbsLossLayer);

}  // namespace caffe

--------------------------------------------------------------------------------
/caffe/src/caffe/layers/geometry_transformation.cpp:
--------------------------------------------------------------------------------
#include <iostream>
#include <vector>

#include "caffe/layers/geometry_transformation.hpp"
#include "caffe/util/io.hpp"
#include "caffe/util/math_functions.hpp"

// #define LEVELS 256  // cost volume levels
// #define MINID 1e-5
// #define MAXID 4

namespace caffe {

template <typename Dtype>
void GeoTransformLayer<Dtype>::Reshape(
    const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top) {
  top[0]->Reshape(bottom[0]->num(), 3, bottom[0]->height(), bottom[0]->width());
}

template <typename Dtype>
void GeoTransformLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // No CPU implementation is provided; use the GPU path instead.
  std::cout << "GeoTransformLayer Warning: Forward_cpu() is not implemented; use Forward_gpu()" << std::endl;
  // Forward_gpu(bottom, top);
}

template <typename Dtype>
void GeoTransformLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  // No CPU implementation is provided; use the GPU path instead.
  std::cout << "GeoTransformLayer Warning: Backward_cpu() is not implemented; use Backward_gpu()" << std::endl;
  // Backward_gpu(top, propagate_down, bottom);
}

#ifdef CPU_ONLY
STUB_GPU(GeoTransformLayer);
#endif

INSTANTIATE_CLASS(GeoTransformLayer);
REGISTER_LAYER_CLASS(GeoTransform);

}  // namespace caffe
--------------------------------------------------------------------------------
/caffe/src/caffe/layers/geometry_transformation.cu:
--------------------------------------------------------------------------------
#include <algorithm>
#include <vector>

#include "caffe/layers/geometry_transformation.hpp"
#include "caffe/util/math_functions.hpp"
#include "caffe/util/gpu_util.cuh"

namespace caffe {

template <typename Dtype>
__global__ void GeoTransform(const int nthreads, const int H, const int W,
    const int top_channel, const Dtype* const depths,
    const Dtype* const transformation, const Dtype* const camIntrinsic,
    Dtype* const transformed_points) {
  // transformation (shape: N,1,4,4)
  // camIntrinsic (shape: N,4,1,1; fx, fy, cx, cy)
  CUDA_KERNEL_LOOP(index, nthreads) {
    const int x = index % W;
    const int y = (index / W) % H;
    const int n = index / W / H;
    const int offset = (n * H + y) * W + x;
    const Dtype* const trans_off = transformation + n*16;
    const Dtype* const camIntrin_off = camIntrinsic + n*4;

    const float fx = camIntrin_off[0];
    const float fy = camIntrin_off[1];
    const float cx = camIntrin_off[2];
    const float cy = camIntrin_off[3];

    const Dtype d = depths[offset];
    // FIXME: disparity-input variant kept for reference:
    // Dtype disp = depths[offset];
    // Dtype d = fx*0.54/(disp+1e-3);

    // back-project pixel (x, y) with depth d into camera coordinates
    const Dtype X = (x-cx)/fx*d;
    const Dtype Y = (y-cy)/fy*d;

    // apply the rigid-body transform row by row
    int x_offset = ((n * top_channel + 0) * H + y) * W + x;
    transformed_points[x_offset] = trans_off[0]*X + trans_off[1]*Y + trans_off[2]*d + trans_off[3];
    int y_offset = ((n * top_channel + 1) * H + y) * W + x;
    transformed_points[y_offset] = trans_off[4]*X + trans_off[5]*Y + trans_off[6]*d + trans_off[7];
    int z_offset = ((n * top_channel + 2) * H + y) * W + x;
    transformed_points[z_offset] = trans_off[8]*X + trans_off[9]*Y + trans_off[10]*d + trans_off[11];
  }
}

template <typename Dtype>
void GeoTransformLayer<Dtype>::Forward_gpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  int num = bottom[0]->num();
  int top_channel = top[0]->channels();
  int height = bottom[0]->height();
  int width = bottom[0]->width();
  int n_threads = num * height * width;

  // bottom[0] --> DepthMap (N,1,H,W)
  // bottom[1] --> Transformation matrix (N,1,4,4)
  // bottom[2] --> camera intrinsic coefficients (N,4,1,1)
  // top[0]    --> transformed 3D points (N,3,H,W)

  GeoTransform<Dtype><<<CAFFE_GET_BLOCKS(n_threads), CAFFE_CUDA_NUM_THREADS>>>(
      n_threads, height, width, top_channel, bottom[0]->gpu_data(),
      bottom[1]->gpu_data(), bottom[2]->gpu_data(),
      top[0]->mutable_gpu_data());
}
base_X + trans_off[5] * base_Y + trans_off[6]; 116 | depth_diff_val += top_diff[y_offset] * dY_dD; 117 | 118 | 119 | float dZ_dD = trans_off[8] * base_X + trans_off[9] * base_Y + trans_off[10]; 120 | depth_diff_val += top_diff[z_offset] * dZ_dD; 121 | 122 | depths_diff[depth_offset] = depth_diff_val; 123 | } 124 | 125 | 126 | if (propagate_down_1){ 127 | // bottom[1] dLdT 128 | caffe_gpu_atomic_add( top_diff[x_offset] * base_X * depth, transformation_diff + n*16 + 0); 129 | caffe_gpu_atomic_add( top_diff[x_offset] * base_Y * depth, transformation_diff + n*16 + 1); 130 | caffe_gpu_atomic_add( top_diff[x_offset] * depth, transformation_diff + n*16 + 2); 131 | caffe_gpu_atomic_add( top_diff[x_offset] * Dtype(1), transformation_diff + n*16 + 3); 132 | 133 | caffe_gpu_atomic_add( top_diff[y_offset] * base_X * depth, transformation_diff + n*16 + 4); 134 | caffe_gpu_atomic_add( top_diff[y_offset] * base_Y * depth, transformation_diff + n*16 + 5); 135 | caffe_gpu_atomic_add( top_diff[y_offset] * depth, transformation_diff + n*16 + 6); 136 | caffe_gpu_atomic_add( top_diff[y_offset] * Dtype(1), transformation_diff + n*16 + 7); 137 | 138 | caffe_gpu_atomic_add( top_diff[z_offset] * base_X * depth, transformation_diff + n*16 + 8); 139 | caffe_gpu_atomic_add( top_diff[z_offset] * base_Y * depth, transformation_diff + n*16 + 9); 140 | caffe_gpu_atomic_add( top_diff[z_offset] * depth, transformation_diff + n*16 + 10); 141 | caffe_gpu_atomic_add( top_diff[z_offset] * Dtype(1), transformation_diff + n*16 + 11); 142 | } 143 | 144 | if (propagate_down_2){ 145 | // bottom[2] dLdK 146 | Dtype cx_diff_val(0); 147 | cx_diff_val += top_diff[x_offset] * trans_off[0]; 148 | cx_diff_val += top_diff[y_offset] * trans_off[4]; 149 | cx_diff_val += top_diff[z_offset] * trans_off[8]; 150 | cx_diff_val *= -depth/fx; 151 | caffe_gpu_atomic_add( cx_diff_val, camIntrinsic_diff + n*4 + 2); 152 | 153 | Dtype cy_diff_val(0); 154 | cy_diff_val += top_diff[x_offset] * trans_off[1]; 155 | cy_diff_val += top_diff[y_offset] * trans_off[5]; 156 | cy_diff_val += top_diff[z_offset] * trans_off[9]; 157 | cy_diff_val *= -depth/fy; 158 | caffe_gpu_atomic_add( cy_diff_val, camIntrinsic_diff + n*4 + 3); 159 | 160 | Dtype fx_diff_val(0); 161 | fx_diff_val += top_diff[x_offset] * trans_off[0]; 162 | fx_diff_val += top_diff[y_offset] * trans_off[4]; 163 | fx_diff_val += top_diff[z_offset] * trans_off[8]; 164 | fx_diff_val *= - base_X / fx * depth; 165 | caffe_gpu_atomic_add( fx_diff_val, camIntrinsic_diff + n*4 + 0); 166 | 167 | Dtype fy_diff_val(0); 168 | fy_diff_val += top_diff[x_offset] * trans_off[1]; 169 | fy_diff_val += top_diff[y_offset] * trans_off[5]; 170 | fy_diff_val += top_diff[z_offset] * trans_off[9]; 171 | fy_diff_val *= - base_Y / fy * depth; 172 | caffe_gpu_atomic_add( fy_diff_val, camIntrinsic_diff + n*4 + 1); 173 | } 174 | 175 | } 176 | } 177 | 178 | template 179 | void GeoTransformLayer::Backward_gpu(const vector*>& top, 180 | const vector& propagate_down, const vector*>& bottom) 181 | { 182 | const Dtype* top_diff = top[0]->gpu_diff(); 183 | int num = bottom[0]->num(); 184 | int height = bottom[0]->height(); 185 | int width = bottom[0]->width(); 186 | int top_channel = top[0]->channels(); 187 | int n_threads = num * height * width; 188 | 189 | const Dtype* depths = bottom[0]->gpu_data(); 190 | const Dtype* transformation = bottom[1]->gpu_data(); 191 | const Dtype* camIntrinsic = bottom[2]->gpu_data(); 192 | 193 | Dtype* depths_diff = bottom[0]->mutable_gpu_diff(); 194 | Dtype* transformation_diff = 
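For reference, the following is a minimal NumPy sketch of what the `GeoTransform` forward kernel computes for one batch item: back-project each pixel with the pinhole intrinsics, then apply the 4x4 rigid transform. The function name and array shapes are illustrative only, not part of the layer's API:

``` python
import numpy as np

def geo_transform(depth, T, intrinsics):
    """depth: (H,W); T: (4,4) transformation; intrinsics: [fx, fy, cx, cy]."""
    fx, fy, cx, cy = intrinsics
    H, W = depth.shape
    x, y = np.meshgrid(np.arange(W), np.arange(H))   # pixel grid
    X = (x - cx) / fx * depth                        # back-project to 3D
    Y = (y - cy) / fy * depth
    pts = np.stack([X, Y, depth]).reshape(3, -1)
    pts = T[:3, :3].dot(pts) + T[:3, 3:4]            # rotate + translate
    return pts.reshape(3, H, W)
```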
/caffe/src/caffe/layers/inverse_warping_layer.cpp:
--------------------------------------------------------------------------------
1 | #include <algorithm>
2 | #include <cfloat>
3 | #include <vector>
4 | 
5 | 
6 | #include "caffe/common.hpp"
7 | #include "caffe/layer.hpp"
8 | #include "caffe/syncedmem.hpp"
9 | #include "caffe/util/math_functions.hpp"
10 | #include "caffe/layers/inverse_warping_layer.hpp"
11 | 
12 | namespace caffe {
13 | template <typename Dtype>
14 | void InverseWarpingLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
15 |       const vector<Blob<Dtype>*>& top) {
16 |   // validate number of bottom and top blobs
17 |   // CHECK_EQ(3,bottom.size())<< "We need 3 bottoms: Image, horizontal flow and vertical flow";
18 |   // CHECK_EQ(1,top.size())<< "Output is only warp image";
19 |   // removed previous check if I want external occlusion mask
20 |   // CHECK_EQ(bottom[0]->num(),bottom[1]->num())<< " Input image and the flow should have same number of instances";
21 |   // CHECK_EQ(bottom[0]->height(),bottom[1]->height())<< " Input image and the flow should have same height";
22 |   // CHECK_EQ(bottom[0]->width(),bottom[1]->width())<< " Input image and the flow should have same width";
23 |   // CHECK_EQ(bottom[2]->num(),bottom[1]->num())<< " horizontal and vertical flow should have same number of instances";
24 |   // CHECK_EQ(bottom[2]->height(),bottom[1]->height())<< " horizontal and vertical flow should have same height";
25 |   // CHECK_EQ(bottom[2]->width(),bottom[1]->width())<< " horizontal and vertical flow should have same width";
26 | }
27 | 
28 | template <typename Dtype>
29 | void InverseWarpingLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
30 |       const vector<Blob<Dtype>*>& top) {
31 |   CHECK_EQ(4, bottom[0]->num_axes()) << "Input must have 4 axes, "
32 |       << "corresponding to (num, channels, height, width)";
33 |   // channels_ = bottom[0]->channels();
34 |   // height_ = bottom[0]->height();
35 |   // width_ = bottom[0]->width();
36 |   top[0]->Reshape(bottom[0]->num(), bottom[0]->channels(), bottom[0]->height(), bottom[0]->width());
37 |   // use_top_mask = top.size() > 1;
38 |   // std::cout<< "came here with topsize , use_top_mask"<< use_top_mask<<"\n \n \n ";
39 |   // if (use_top_mask) {
40 |   //   top[1]->ReshapeLike(*top[0]);
41 |   //   std::cout<< "came here with topsize 2 \n \n";
42 |   // }
43 |   // else
44 |   // {
45 |   //   ext_occ.ReshapeLike(*top[0]);
46 |   // }
47 |   // warp_image.Reshape(bottom[0]->num(), channels_,height_,width_);
48 | }
49 | 
50 | 
51 | template <typename Dtype>
52 | void InverseWarpingLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
53 |       const vector<Blob<Dtype>*>& top) {
54 |   /*const Dtype* Image2Warp = bottom[0]->cpu_data();
55 |   const Dtype* u = bottom[1]->cpu_data();
56 |   const Dtype* v = bottom[2]->cpu_data();
57 | 
58 |   Dtype* top_data = top[0]->mutable_cpu_data();
59 |   const int top_count = top[0]->count();
60 |   */
61 |   // For this moment we only have GPU
62 |   
CHECK_EQ(1,0) << "Layer does not exist on CPU";
63 | }
64 | 
65 | 
66 | template <typename Dtype>
67 | void InverseWarpingLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
68 |       const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
69 |   if (!propagate_down[0]) {
70 |     return;
71 |   }
72 | 
73 |   // For this moment we only have GPU
74 |   CHECK_EQ(1,0) << "Layer does not exist on CPU";
75 | 
76 | }
77 | 
78 | 
79 | 
80 | #ifdef CPU_ONLY
81 | STUB_GPU(InverseWarpingLayer);
82 | #endif
83 | 
84 | INSTANTIATE_CLASS(InverseWarpingLayer);
85 | REGISTER_LAYER_CLASS(InverseWarping);
86 | }  // namespace caffe
87 | 
88 | 
89 | 
90 | 
91 | 
--------------------------------------------------------------------------------
/caffe/src/caffe/layers/inverse_warping_layer.cu:
--------------------------------------------------------------------------------
1 | #include <algorithm>
2 | #include <vector>
3 | 
4 | #include "caffe/layers/inverse_warping_layer.hpp"
5 | #include "caffe/util/math_functions.hpp"
6 | #include "caffe/util/gpu_util.cuh"
7 | 
8 | namespace caffe {
9 | 
10 | template <typename Dtype>
11 | __global__ void InverseWarping(const int nthreads,
12 |     const int top_channel, const int height, const int width,
13 |     const Dtype* const U, const Dtype* const proj_xy, Dtype* const top_data) {
14 |   CUDA_KERNEL_LOOP(index, nthreads) {
15 |     const int x = index % width;
16 |     const int y = (index / width) % height;
17 |     const int n = index / width / height ;
18 | 
19 |     const int proj_x_offset = ((n * 2 + 0) * height + y) * width + x;
20 |     const int proj_y_offset = ((n * 2 + 1) * height + y) * width + x;
21 | 
22 |     float xx = proj_xy[proj_x_offset];
23 |     float yy = proj_xy[proj_y_offset];
24 | 
25 |     int x1 = floorf(xx);
26 |     int x2 = x1+1;
27 | 
28 |     int y1 = floorf(yy);
29 |     int y2 = y1+1;
30 | 
31 |     float wx2 = xx - float(x1);
32 |     float wx1 = float(x2)-xx;
33 |     float wy2 = yy - float(y1);
34 |     float wy1 = float(y2)-yy;
35 | 
36 | 
37 | 
38 |     for (int cc = 0; cc < top_channel; cc++)
39 |     {
40 |       const int off = (n * top_channel + cc) * height * width;
41 | 
42 |       if(x1 >= 0 && x1 <= width-1 && y1 >= 0 && y1 <= height-1 )
43 |         top_data[x + y * width + off] += wx1 * wy1 * U[x1 + y1 * width + off] ;
44 |       if(x1 >= 0 && x1 <= width-1 && y2 >= 0 && y2 <= height-1 )
45 |         top_data[x + y * width + off] += wx1 * wy2 * U[x1 + y2 * width + off] ;
46 |       if(x2 >= 0 && x2 <= width-1 && y1 >= 0 && y1 <= height-1 )
47 |         top_data[x + y * width + off] += wx2 * wy1 * U[x2 + y1 * width + off] ;
48 |       if(x2 >= 0 && x2 <= width-1 && y2 >= 0 && y2 <= height-1 )
49 |         top_data[x + y * width + off] += wx2 * wy2 * U[x2 + y2 * width + off] ;
50 |     }
51 |   }
52 | }
53 | 
54 | 
55 | 
56 | template <typename Dtype>
57 | void InverseWarpingLayer<Dtype>::Forward_gpu(const vector<Blob<Dtype>*>& bottom,
58 |     const vector<Blob<Dtype>*>& top) {
59 |   int count = bottom[0]->count();
60 |   int num = bottom[0]->num();
61 | 
62 |   int top_channel = top[0]->channels();
63 |   int height = bottom[0]->height();
64 |   int width = bottom[0]->width();
65 |   int n_threads = num * height * width;
66 | 
67 |   // bottom[0] --> Image (N,C,H,W)
68 |   // bottom[1] --> projection coordinate (N,K,H,W)
69 |   // top[0] --> Warped image (N,C,H,W)
70 | 
71 | 
72 |   InverseWarping<Dtype> <<<CAFFE_GET_BLOCKS(n_threads), CAFFE_CUDA_NUM_THREADS>>>(
73 |       n_threads, top_channel, height, width, bottom[0]->gpu_data(), bottom[1]->gpu_data(),
74 |       top[0]->mutable_gpu_data());
75 | 
76 | 
77 | 
78 | }
79 | 
80 | 
81 | 
82 | 
83 | 
84 | template <typename Dtype>
85 | __global__ void GetGradient(
86 |     const int nthreads, const int top_channel, const int height, const int width,
87 |     const Dtype* const U, const Dtype* const proj_xy,
88 |     const Dtype* const top_diff,
89 |     Dtype* const U_diff, Dtype* const proj_xy_diff,
90 |     const bool propagate_down_0, const bool propagate_down_1) {
91 |   CUDA_KERNEL_LOOP(index, nthreads) {
92 | 
93 |     const int x = index % width;
94 |     const int y = (index / width) % height;
95 |     const int n = index / width / height ;
96 | 
97 |     const int proj_x_offset = ((n * 2 + 0) * height + y) * width + x;
98 |     const int proj_y_offset = ((n * 2 + 1) * height + y) * width + x;
99 | 
100 |     float xx = proj_xy[proj_x_offset];
101 |     float yy = proj_xy[proj_y_offset];
102 | 
103 |     int x1 = floorf(xx);
104 |     int x2 = x1+1;
105 |     int y1 = floorf(yy);
106 |     int y2 = y1+1;
107 | 
108 |     float wx2 = xx - float(x1);
109 |     float wx1 = float(x2)-xx;
110 |     float wy2 = yy - float(y1);
111 |     float wy1 = float(y2)-yy;
112 | 
113 |     Dtype topLeftProduct(0);
114 |     Dtype topRightProduct(0);
115 |     Dtype bottomLeftProduct(0);
116 |     Dtype bottomRightProduct(0);
117 | 
118 | 
119 | 
120 | 
121 |     // if (index==0){printf("Propagate_down_0: %d\n", propagate_down_0);}
122 | 
123 | 
124 |     for (int cc = 0; cc < top_channel; cc++) {
125 |       const int off = (n * top_channel + cc) * height * width;
126 |       const Dtype top_diff_this = top_diff[x + y * width + off];
127 |       if(x1 >= 0 && x1 <= width-1 && y1 >= 0 && y1 <= height-1 ) {
128 |         // dLdU
129 |         if (propagate_down_0){ caffe_gpu_atomic_add(top_diff_this*wx1*wy1, U_diff + x1 + y1 * width + off ); }
130 |         // for dLd[x,y] future use
131 |         topLeftProduct += top_diff_this * U[off + width * y1 + x1];
132 | 
133 |       }
134 |       if(x1 >= 0 && x1 <= width-1 && y2 >= 0 && y2 <= height-1 ) {
135 |         // dLdU
136 |         if (propagate_down_0){ caffe_gpu_atomic_add(top_diff_this*wx1*wy2, U_diff + x1 + y2 * width + off );}
137 |         // for dLd[x,y] future use
138 |         bottomLeftProduct += top_diff_this * U[off + width * y2 + x1];
139 | 
140 |       }
141 | 
142 |       if(x2 >= 0 && x2 <= width-1 && y1 >= 0 && y1 <= height-1 ) {
143 |         // dLdU
144 |         if (propagate_down_0){ caffe_gpu_atomic_add(top_diff_this*wx2*wy1, U_diff + x2 + y1 * width + off );}
145 |         // for dLd[x,y] future use
146 |         topRightProduct += top_diff_this * U[off + width * y1 + x2];
147 | 
148 |       }
149 | 
150 |       if(x2 >= 0 && x2 <= width-1 && y2 >= 0 && y2 <= height-1 ) {
151 |         // dLdU
152 |         if (propagate_down_0){ caffe_gpu_atomic_add(top_diff_this*wx2*wy2, U_diff + x2 + y2 * width + off );}
153 |         // for dLd[x,y] future use
154 |         bottomRightProduct += top_diff_this * U[off + width * y2 + x2];
155 |       }
156 |     }
157 | 
158 | 
159 |     // dLd[x,y]
160 |     if (propagate_down_1){
161 |       // if (index==0){printf("Propagate_down_1: %d\n", propagate_down_1);}
162 |       proj_xy_diff[proj_x_offset] = (topRightProduct - topLeftProduct) * wy1 + (bottomRightProduct - bottomLeftProduct) * wy2;
163 |       proj_xy_diff[proj_y_offset] = (bottomLeftProduct - topLeftProduct) * wx1 + (bottomRightProduct - topRightProduct) * wx2;
164 |       // if (index==0){printf("grad : %f\n", proj_xy_diff[proj_y_offset]);}
165 |     }
166 | 
167 |   }
168 | 
169 | }
170 | 
171 | 
172 | 
173 | 
174 | 
175 | template <typename Dtype>
176 | void InverseWarpingLayer<Dtype>::Backward_gpu(const vector<Blob<Dtype>*>& top,
177 |     const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
178 | 
179 | 
180 |   const Dtype* top_diff = top[0]->gpu_diff();
181 |   int num = bottom[0]->num();
182 |   int height = bottom[0]->height();
183 |   int width = bottom[0]->width();
184 |   int top_channel = top[0]->channels();
185 |   int n_threads = num * height * width;
186 | 
187 |   const Dtype* U = bottom[0]->gpu_data();
188 |   const Dtype* proj_xy = bottom[1]->gpu_data();
189 | 
190 |   bool propagate_down_0 = propagate_down[0];
191 |   bool propagate_down_1 = propagate_down[1];
192 | 
193 |   Dtype* U_diff = bottom[0]->mutable_gpu_diff();
194 |   Dtype* proj_xy_diff = bottom[1]->mutable_gpu_diff();
195 | 
196 |   caffe_gpu_set(bottom[0]->count(), (Dtype)0., U_diff);
197 |   caffe_gpu_set(bottom[1]->count(), (Dtype)0., proj_xy_diff);
198 | 
199 |   GetGradient<Dtype><<<CAFFE_GET_BLOCKS(n_threads), CAFFE_CUDA_NUM_THREADS>>>(
200 |       n_threads, top_channel, height, width,
201 |       U, proj_xy,
202 |       top_diff, U_diff, proj_xy_diff,
203 |       propagate_down_0, propagate_down_1);
204 | 
205 |   CUDA_POST_KERNEL_CHECK;
206 | 
207 | 
208 | 
209 | }
210 | 
211 | 
212 | INSTANTIATE_LAYER_GPU_FUNCS(InverseWarpingLayer);
213 | }
214 | 
215 | 
--------------------------------------------------------------------------------
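The warping above is standard bilinear sampling: each output pixel gathers the four integer neighbours of its projected coordinate, weighted by the fractional offsets, and out-of-bounds corners contribute zero. A minimal NumPy sketch of the forward pass for one batch item (names and shapes are illustrative, not the layer's API):

``` python
import numpy as np

def inverse_warp(U, proj_xy):
    """U: source image (C,H,W); proj_xy: sampling coordinates (2,H,W) in pixels."""
    C, H, W = U.shape
    out = np.zeros_like(U)
    xx, yy = proj_xy[0], proj_xy[1]
    x1 = np.floor(xx).astype(int); x2 = x1 + 1
    y1 = np.floor(yy).astype(int); y2 = y1 + 1
    wx2 = xx - x1; wx1 = x2 - xx        # fractional bilinear weights
    wy2 = yy - y1; wy1 = y2 - yy
    for xi, yi, w in [(x1, y1, wx1 * wy1), (x1, y2, wx1 * wy2),
                      (x2, y1, wx2 * wy1), (x2, y2, wx2 * wy2)]:
        valid = (xi >= 0) & (xi <= W - 1) & (yi >= 0) & (yi <= H - 1)
        xc = np.clip(xi, 0, W - 1); yc = np.clip(yi, 0, H - 1)
        out += w * valid * U[:, yc, xc]  # out-of-bounds corners contribute 0
    return out
```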
/caffe/src/caffe/layers/pin_hole_layer.cpp:
--------------------------------------------------------------------------------
1 | #include <vector>
2 | 
3 | #include "caffe/layers/pin_hole_layer.hpp"
4 | #include "caffe/util/io.hpp"
5 | #include "caffe/util/math_functions.hpp"
6 | 
7 | namespace caffe
8 | {
9 | 
10 | template <typename Dtype>
11 | void PinHoleLayer<Dtype>::Reshape(
12 |     const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top)
13 | {
14 |   top[0]->Reshape(bottom[0]->num(), 2, bottom[0]->height(), bottom[0]->width());
15 |   top[1]->Reshape(bottom[0]->num(), 2, bottom[0]->height(), bottom[0]->width());
16 | }
17 | 
18 | template <typename Dtype>
19 | void PinHoleLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom, const vector<Blob<Dtype>*>& top)
20 | {
21 |   std::cout << "PinHoleLayer Warning: Running Forward_gpu() code instead" << std::endl;
22 | 
23 |   // Forward_gpu(bottom, top);
24 | }
25 | 
26 | template <typename Dtype>
27 | void PinHoleLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
28 |     const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom)
29 | {
30 |   std::cout << "PinHoleLayer Warning: Running Backward_gpu() code instead" << std::endl;
31 | 
32 |   // Backward_gpu(top, propagate_down, bottom);
33 | }
34 | 
35 | #ifdef CPU_ONLY
36 | STUB_GPU(PinHoleLayer);
37 | #endif
38 | 
39 | INSTANTIATE_CLASS(PinHoleLayer);
40 | REGISTER_LAYER_CLASS(PinHole);
41 | 
42 | }  // namespace caffe
43 | 
--------------------------------------------------------------------------------
/caffe/src/caffe/layers/pin_hole_layer.cu:
--------------------------------------------------------------------------------
1 | #include <algorithm>
2 | #include <vector>
3 | 
4 | #include "caffe/layers/pin_hole_layer.hpp"
5 | #include "caffe/util/math_functions.hpp"
6 | #include "caffe/util/gpu_util.cuh"
7 | 
8 | namespace caffe {
9 | 
10 | template <typename Dtype>
11 | __global__ void PinHoleProjection(const int nthreads, const int H, const int W, const int bottom_channel, const int top_channel, const Dtype* const pts3D, const Dtype* const camIntrinsic, Dtype* const flows, Dtype* const proj_coords)
12 | {
13 | 
14 |   // pts3D (shape: N,3,H,W)
15 |   // flows: (shape: N,2,H,W)
16 |   // proj_coords: (shape: N,2,H,W)
17 |   CUDA_KERNEL_LOOP(index, nthreads)
18 |   {
19 |     const int x = index % W;
20 |     const int y = (index / W) % H;
21 |     const int n = index / W / H;
22 |     // const int bottom_channel = 4;
23 |     // const int top_channel = 2;
24 |     const int X_offset = ((n * bottom_channel + 0) * H + y) * W + x;
25 |     const int Y_offset = ((n * bottom_channel + 1) * H + y) * W + x;
26 |     const int Z_offset = ((n * bottom_channel + 2) * H + y) * W + x;
27 | 
28 |     const Dtype* const camIntrin_off = camIntrinsic + n*4;
29 | 
30 | 
31 |     const float fx = camIntrin_off[0];
32 |     const float fy = camIntrin_off[1];
33 |     const float cx = camIntrin_off[2];
34 |     const float cy = camIntrin_off[3];
35 | 
36 |     const Dtype X = pts3D[X_offset];
37 |     const Dtype Y = pts3D[Y_offset];
38 |     const Dtype Z = pts3D[Z_offset];
39 | 
40 | 
41 |     int x_offset = ((n * top_channel + 0) * H + y) * W + x;
42 |     int y_offset = ((n * top_channel + 1) * H + y) * W + x;
43 | 
44 |     flows[x_offset] = fx * X / (Z+1e-12) + cx - x;
45 |     flows[y_offset] = fy * Y / (Z+1e-12) + cy - y;
46 | 
47 |     proj_coords[x_offset] = fx * X / (Z+1e-12) + cx ;
48 |     proj_coords[y_offset] = fy * Y / (Z+1e-12) + cy ;
49 |   }
50 | }
51 | 
52 | template <typename Dtype>
53 | void PinHoleLayer<Dtype>::Forward_gpu(const vector<Blob<Dtype>*>& bottom,
54 |     const vector<Blob<Dtype>*>& top)
55 | {
56 |   int count = bottom[0]->count();
57 |   int num = bottom[0]->num();
58 |   int height = bottom[0]->height();
59 |   int bottom_channel = bottom[0]->channels();
60 |   int top_channel = top[0]->channels();
61 |   int width = bottom[0]->width();
62 |   int n_threads = num * height * width;
63 | 
64 |   // bottom[0] --> 3D points (N,C,H,W)
65 |   // bottom[1] --> camera intrinsic (N,4,1,1)
66 |   // top[0] --> flows (N,2,H,W) [u,v]
67 |   // top[1] --> projection coordinate (N,2,H,W) [xp,yp]
68 | 
69 |   PinHoleProjection<Dtype> <<<CAFFE_GET_BLOCKS(n_threads), CAFFE_CUDA_NUM_THREADS>>>(
70 |       n_threads, height, width, bottom_channel, top_channel, bottom[0]->gpu_data(), bottom[1]->gpu_data(),
71 |       top[0]->mutable_gpu_data(), top[1]->mutable_gpu_data());
72 | }
73 | 
74 | 
75 | 
76 | template <typename Dtype>
77 | __global__ void GetGradient(const int nthreads,
78 |     const Dtype* const top_flows_diff, const Dtype* const top_coords_diff, const Dtype* const pts3D, const Dtype* const camIntrinsic,
79 |     const int height, const int width, const int bottom_channel, const int top_channel, Dtype* const pts3D_diff, Dtype* const camIntrinsic_diff){
80 | 
81 |   CUDA_KERNEL_LOOP(index, nthreads) {
82 |     const int x = index % width;
83 |     const int y = (index / width) % height;
84 |     // const int bottom_channel = 4;
85 |     // const int top_channel = 2;
86 |     const int n = index / width / height ;
87 | 
88 |     const Dtype* const camIntrin_off = camIntrinsic + n*4;
89 |     const int X_offset = ((n * bottom_channel + 0) * height + y) * width + x;
90 |     const int Y_offset = ((n * bottom_channel + 1) * height + y) * width + x;
91 |     const int Z_offset = ((n * bottom_channel + 2) * height + y) * width + x;
92 | 
93 |     const int x_offset = ((n * top_channel + 0) * height + y) * width + x;
94 |     const int y_offset = ((n * top_channel + 1) * height + y) * width + x;
95 | 
96 |     const float fx = camIntrin_off[0];
97 |     const float fy = camIntrin_off[1];
98 |     // const float cx = camIntrin_off[2];
99 |     // const float cy = camIntrin_off[3];
100 | 
101 |     // Setup base grid
102 |     const Dtype X = pts3D[X_offset];
103 |     const Dtype Y = pts3D[Y_offset];
104 |     const Dtype Z = pts3D[Z_offset];
105 | 
106 |     // bottom[0] dLd[X,Y,Z]
107 |     Dtype X_diff_val(0);
108 |     X_diff_val += top_flows_diff[x_offset] * fx / (Z+1e-12);
109 |     X_diff_val += top_coords_diff[x_offset] * fx / (Z+1e-12);
110 |     pts3D_diff[X_offset] = X_diff_val;
111 | 
112 |     Dtype Y_diff_val(0);
113 |     Y_diff_val += top_flows_diff[y_offset] * fy / (Z+1e-12);
114 |     Y_diff_val += top_coords_diff[y_offset] * fy / (Z+1e-12);
115 |     pts3D_diff[Y_offset] = Y_diff_val;
116 | 
117 |     Dtype Z_diff_val(0);
118 |     Z_diff_val += - top_flows_diff[x_offset] * fx * X / (Z*Z+1e-12);
119 |     Z_diff_val += - top_coords_diff[x_offset] * fx * X / (Z*Z+1e-12);
120 |     Z_diff_val += - top_flows_diff[y_offset] * fy * Y / (Z*Z+1e-12);
121 |     Z_diff_val += - top_coords_diff[y_offset] * fy * Y / (Z*Z+1e-12);
122 |     pts3D_diff[Z_offset] = Z_diff_val;
123 | 
124 |     // bottom[1] dLdK
125 |     Dtype cx_diff_val(0);
126 |     cx_diff_val += top_flows_diff[x_offset];
127 |     cx_diff_val += top_coords_diff[x_offset];
128 |     caffe_gpu_atomic_add( cx_diff_val, camIntrinsic_diff + n*4 + 2);
129 | 
130 |     Dtype cy_diff_val(0);
131 |     cy_diff_val += top_flows_diff[y_offset];
132 |     cy_diff_val += top_coords_diff[y_offset];
133 |     caffe_gpu_atomic_add( cy_diff_val, camIntrinsic_diff + n*4 + 3);
134 | 
135 |     Dtype fx_diff_val(0);
136 |     fx_diff_val += top_flows_diff[x_offset] * X / (Z+1e-12);
137 |     fx_diff_val += top_coords_diff[x_offset] * X / (Z+1e-12);
138 |     caffe_gpu_atomic_add( fx_diff_val, camIntrinsic_diff + n*4 + 0);
139 | 
140 |     Dtype fy_diff_val(0);
141 |     fy_diff_val += top_flows_diff[y_offset] * Y / (Z+1e-12);
142 |     fy_diff_val += top_coords_diff[y_offset] * Y / (Z+1e-12);
143 |     caffe_gpu_atomic_add( fy_diff_val, camIntrinsic_diff + n*4 + 1);
144 | 
145 |   }
146 | }
147 | 
148 | template <typename Dtype>
149 | void PinHoleLayer<Dtype>::Backward_gpu(const vector<Blob<Dtype>*>& top,
150 |     const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom)
151 | {
152 |   const Dtype* top_flows_diff = top[0]->gpu_diff();
153 |   const Dtype* top_coords_diff = top[1]->gpu_diff();
154 |   int num = bottom[0]->num();
155 |   int height = bottom[0]->height();
156 |   int width = bottom[0]->width();
157 |   int bottom_channel = bottom[0]->channels();
158 |   int top_channel = top[0]->channels();
159 |   int n_threads = num * height * width;
160 | 
161 |   const Dtype* pts3D = bottom[0]->gpu_data();
162 |   const Dtype* camIntrinsic = bottom[1]->gpu_data();
163 | 
164 |   Dtype* pts3D_diff = bottom[0]->mutable_gpu_diff();
165 |   Dtype* camIntrinsic_diff = bottom[1]->mutable_gpu_diff();
166 | 
167 |   caffe_gpu_set(bottom[0]->count(), (Dtype)0., pts3D_diff);
168 |   caffe_gpu_set(bottom[1]->count(), (Dtype)0., camIntrinsic_diff);
169 | 
170 |   GetGradient<Dtype><<<CAFFE_GET_BLOCKS(n_threads), CAFFE_CUDA_NUM_THREADS>>>(
171 |       n_threads, top_flows_diff, top_coords_diff, pts3D, camIntrinsic,
172 |       height, width, bottom_channel, top_channel, pts3D_diff, camIntrinsic_diff);
173 |   CUDA_POST_KERNEL_CHECK;
174 | }
175 | 
176 | INSTANTIATE_LAYER_GPU_FUNCS(PinHoleLayer);
177 | 
178 | }  // namespace caffe
179 | 
--------------------------------------------------------------------------------
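A NumPy sketch of the projection implemented by `PinHoleProjection` for one batch item, with the same `1e-12` guard against division by zero; the function name and shapes are illustrative, not part of the layer's API:

``` python
import numpy as np

def pin_hole(pts3D, intrinsics):
    """pts3D: (3,H,W) camera-frame points; intrinsics: [fx, fy, cx, cy]."""
    fx, fy, cx, cy = intrinsics
    X, Y, Z = pts3D
    H, W = Z.shape
    x, y = np.meshgrid(np.arange(W), np.arange(H))
    xp = fx * X / (Z + 1e-12) + cx          # projected pixel coordinates
    yp = fy * Y / (Z + 1e-12) + cy
    flows = np.stack([xp - x, yp - y])      # top[0]: flow field [u,v]
    proj_coords = np.stack([xp, yp])        # top[1]: sampling coordinates
    return flows, proj_coords
```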
/data/README.md:
--------------------------------------------------------------------------------
1 | ## Download dataset
2 | 
3 | The main dataset used in this project is [KITTI Driving Dataset](http://www.cvlibs.net/datasets/kitti/raw_data.php). The videos in three categories (City, Residential, Road) are used in our experiments.
4 | 
5 | 
6 | 
7 | ## Dataset structure
8 | 
9 | After getting the dataset, in order to use our provided `dataset_builder.py`, the dataset should be arranged in the following structure. Let's call `$TOP` the home directory of this repo. `kitti_raw_data` should be placed in the directory `$TOP/data`, or a softlink should be created in that directory.
10 | 
11 | ```
12 | |-- $TOP
13 |     |-- data
14 |         |-- kitti_raw_data
15 |             |-- city
16 |                 |-- 2011_09_26_drive_0001
17 |                     |-- image_02
18 |                         |-- data             # contains left images
19 |                     |-- image_03
20 |                         |-- data             # contains right images
21 |                     |-- calib
22 |                         |-- calib_cam_to_cam.txt  # contains camera intrinsic and extrinsic parameters
23 |                     |-- velodyne_points     # contains laser readings for depth evaluation
24 |                     |-- oxts
25 |                 |-- 2011_09_26_drive_0002
26 |                 |-- ...
27 |             |-- residential
28 |                 |-- 2011_09_26_drive_0019
29 |                 |-- ...
30 |             |-- road
31 |                 |-- 2011_09_26_drive_0015
32 |                 |-- ...
33 | ```
34 | 
35 | ## Build Dataset
36 | 
37 | ### Train Set
38 | 
39 | We provide `dataset_builder.py`, which builds the training set from the raw KITTI data. To use the script, please see the following example for creating a dataset using the [Eigen Split](https://arxiv.org/abs/1406.2283).
40 | 
41 | ``` Shell
42 | cd $TOP
43 | python data/dataset_builder.py --builder kitti_eigen --train_frame_distance 1 --raw_data_dir ./data/kitti_raw_data --dataset_dir ./data/dataset/kitti_eigen --image_size [160,608]
44 | ```
45 | 
46 | For other optional arguments/functions, please refer to the script. **NOTE**: if you have built a dataset and want to replace it, remember to delete the files in the original directory first (especially the **LMDB** folders; new LMDB entries will be appended to the existing database if you forget to do so).
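For example, a dataset built with the default arguments can be cleared like this before re-running the builder (the LMDB folder names match those created by `dataset_builder.py`; adjust `dataset_dir` to your setup):

``` python
import os
import shutil

dataset_dir = "./data/dataset/kitti_eigen"  # adjust to your --dataset_dir
for name in ["train_K", "train_T_R2L", "val_K", "val_T_R2L"]:
    lmdb_path = os.path.join(dataset_dir, name)
    if os.path.isdir(lmdb_path):
        shutil.rmtree(lmdb_path)  # otherwise new entries are appended to the old DB
```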
47 | 
48 | ### Evaluation Set (Depth estimation)
49 | 
50 | The Eigen Split is commonly used for single view depth estimation benchmarking. 697 image-depth pairs are used for evaluation. The list of images is saved at `./data/depth_evaluation/kitti_eigen/test_files_eigen.txt`. We also provide the images, which can be downloaded [here](https://www.dropbox.com/sh/n4uvg4rhdi4fzuk/AABWfmvc_WECj6h9X87M2d5Oa?dl=0) and placed in `./data/depth_evaluation/kitti_eigen`.
51 | 
52 | ### Evaluation Set (Visual odometry)
53 | 
54 | [KITTI Odometry benchmark](http://www.cvlibs.net/datasets/kitti/eval_odometry.php) contains 22 stereo sequences, of which 11 sequences are provided with ground truth. The 11 sequences are used for evaluation or training of visual odometry. For the details, please refer to our paper.
55 | 
56 | The ground truth files can be downloaded [here](http://www.cvlibs.net/download.php?file=data_odometry_poses.zip). The files should be arranged in the following structure.
57 | 
58 | 
59 | ```
60 | |-- $TOP
61 |     |-- data
62 |         |-- kitti_raw_data
63 |         |-- depth_evaluation
64 |         |-- odometry_evaluation
65 |             |-- poses
66 |                 |-- 00.txt
67 |                 |-- 01.txt
68 |                 |-- ...
69 |                 |-- 10.txt
70 | ```
71 | 
72 | 
--------------------------------------------------------------------------------
/data/dataset_builder.py:
--------------------------------------------------------------------------------
1 | import os, os.path
2 | import random
3 | import scipy.io as sio
4 | import numpy as np
5 | import lmdb
6 | from shutil import copyfile
7 | import cv2
8 | import json
9 | import argparse
10 | 
11 | import sys
12 | from os.path import expanduser
13 | home = expanduser("~")
14 | caffe_root = home + '/caffe/'  # this file should be run from {caffe_root}/examples (otherwise change this line)
15 | sys.path.insert(0, caffe_root + 'python')
16 | import caffe
17 | 
18 | class kittiEigenBuilder():
19 |     def __init__(self):
20 |         self.train_scenes = [
21 |             'residential/2011_09_30_drive_0033',
22 |             'residential/2011_09_26_drive_0087',
23 |             'residential/2011_09_30_drive_0020',
24 |             'residential/2011_09_26_drive_0039',
25 |             'residential/2011_09_30_drive_0028',
26 |             'city/2011_09_26_drive_0018',
27 |             'residential/2011_09_26_drive_0035',
28 |             'city/2011_09_26_drive_0057',
29 |             'road/2011_10_03_drive_0042',
30 |             'residential/2011_09_26_drive_0022',
31 |             'road/2011_09_26_drive_0028',
32 |             'residential/2011_10_03_drive_0034',
33 |             'road/2011_09_29_drive_0004',
34 |             'road/2011_09_26_drive_0070',
35 |             'residential/2011_09_26_drive_0061',
36 |             'city/2011_09_26_drive_0091',
37 |             'city/2011_09_29_drive_0026',
38 |             'city/2011_09_26_drive_0014',
39 |             'city/2011_09_26_drive_0104',
40 |             'city/2011_09_26_drive_0001',
41 |             'city/2011_09_26_drive_0017',
42 |             'city/2011_09_26_drive_0051',
43 |             'residential/2011_09_30_drive_0034',
44 |             'city/2011_09_26_drive_0095',
45 |             'city/2011_09_26_drive_0060',
46 |             'residential/2011_09_26_drive_0079',
47 |             'road/2011_09_26_drive_0015',
48 |             'residential/2011_09_26_drive_0019',
49 |             'city/2011_09_26_drive_0005',
50 |             'city/2011_09_26_drive_0011',
51 |             'road/2011_09_26_drive_0032',
52 |             'city/2011_09_28_drive_0001',
53 |             'city/2011_09_26_drive_0113']
54 | 
55 |     def setup(self,setup_opt):
56 |         self.train_frame_distance = setup_opt['train_frame_distance']
57 |         self.raw_data_dir = 
setup_opt['raw_data_dir'] 58 | self.dataset_dir = setup_opt['dataset_dir'] 59 | self.image_size = setup_opt['image_size'] 60 | 61 | if not os.path.exists(self.dataset_dir): 62 | os.makedirs(self.dataset_dir) 63 | 64 | def getData(self,isTrain): 65 | # ---------------------------------------------------------------------- 66 | # Get dataset: image pairs (L1,L2,R1,R2 & K &T(L2R)) 67 | # ---------------------------------------------------------------------- 68 | self.L1_set = [] 69 | self.R1_set = [] 70 | self.L2_set = [] 71 | self.R2_set = [] 72 | self.K = [] 73 | self.T = [] 74 | 75 | scenes = self.train_scenes 76 | 77 | for cnt, scene in enumerate(scenes): 78 | print "Getting data. [Scene: ", cnt, "/" , len(scenes), "]" 79 | seq_path = "/".join([self.raw_data_dir, scene,"image_02", "data"]) 80 | seq_end = len(os.listdir(seq_path))-1 81 | for i in xrange(0, seq_end - self.train_frame_distance + 1): 82 | L1 = "/".join([self.raw_data_dir, scene, "image_02", "data", '{:010}'.format(i)]) + ".png" 83 | L2 = "/".join([self.raw_data_dir, scene, "image_02", "data", '{:010}'.format(i+self.train_frame_distance)]) + ".png" 84 | R1 = "/".join([self.raw_data_dir, scene, "image_03", "data", '{:010}'.format(i)]) + ".png" 85 | R2 = "/".join([self.raw_data_dir, scene, "image_03", "data", '{:010}'.format(i+self.train_frame_distance)]) + ".png" 86 | 87 | self.L1_set.append(L1) 88 | self.L2_set.append(L2) 89 | self.R1_set.append(R1) 90 | self.R2_set.append(R2) 91 | 92 | kt_scene = "/".join([self.raw_data_dir, scene]) 93 | 94 | KT = self.getKT(kt_scene) #Get K and T(right-to-left) 95 | self.K.append(KT[:4]) 96 | self.T.append(KT[4]) 97 | 98 | def getKT(self,scene): 99 | # ---------------------------------------------------------------------- 100 | # Get K (camera intrinsic) and T (camera extrinsic) 101 | # ---------------------------------------------------------------------- 102 | new_image_size = [float(self.image_size[0]), float(self.image_size[1])] #[height,width] 103 | 104 | # ---------------------------------------------------------------------- 105 | # Get original K 106 | # ---------------------------------------------------------------------- 107 | f = open(scene+"/calib/calib_cam_to_cam.txt", 'r') 108 | camTxt = f.readlines() 109 | f.close() 110 | K_dict = {} 111 | for line in camTxt: 112 | line_split = line.split(":") 113 | K_dict[line_split[0]] = line_split[1] 114 | 115 | # ---------------------------------------------------------------------- 116 | # original K02 117 | # ---------------------------------------------------------------------- 118 | P_split = K_dict["P_rect_02"].split(" ") 119 | S_split = K_dict["S_rect_02"].split(" ") 120 | ref_img_size = [float(S_split[2]), float(S_split[1])] # height, width 121 | 122 | 123 | # ---------------------------------------------------------------------- 124 | # Get new K & position 125 | # ---------------------------------------------------------------------- 126 | W_ratio = new_image_size[1] / ref_img_size[1] 127 | H_ratio = new_image_size[0] / ref_img_size[0] 128 | fx = float(P_split[1]) * W_ratio 129 | fy = float(P_split[6]) * H_ratio 130 | cx = float(P_split[3]) * W_ratio 131 | cy = float(P_split[7]) * H_ratio 132 | 133 | tx_L = float(P_split[4]) / float(P_split[1]) 134 | # ty_L = float(P_split[8]) / float(P_split[6]) 135 | 136 | # ---------------------------------------------------------------------- 137 | # original K03 138 | # ---------------------------------------------------------------------- 139 | P_split = K_dict["P_rect_03"].split(" ") 140 | 
S_split = K_dict["S_rect_03"].split(" ") 141 | 142 | tx_R = float(P_split[4]) / float(P_split[1]) 143 | # ty_R = float(P_split[8]) / float(P_split[6]) 144 | 145 | # ---------------------------------------------------------------------- 146 | # Get position of Right camera w.r.t Left 147 | # ---------------------------------------------------------------------- 148 | Tx = np.abs(tx_R - tx_L) 149 | # Ty = np.abs(tx_R - tx_L) 150 | 151 | se3 = [0,0,0,Tx,0,0] 152 | 153 | return [fx,fy,cx,cy,se3] 154 | 155 | def shuffleDataset(self): 156 | list_ = list(zip(self.L1_set, self.L2_set, self.R1_set, self.R2_set, self.K, self.T)) 157 | random.shuffle(list_) 158 | self.L1_set, self.L2_set, self.R1_set, self.R2_set, self.K, self.T = zip(*list_) 159 | 160 | def saveDataset(self, isTrain, with_val): 161 | if isTrain: 162 | start_idx = 0 163 | end_idx = len(self.L1_set) 164 | if with_val: 165 | end_idx = 22600 166 | 167 | txt_to_save = "/".join([self.dataset_dir,"train_left_1.txt"]) 168 | self.saveTxt(txt_to_save, self.L1_set[start_idx:end_idx]) 169 | 170 | txt_to_save = "/".join([self.dataset_dir,"train_right_1.txt"]) 171 | self.saveTxt(txt_to_save, self.R1_set[start_idx:end_idx]) 172 | 173 | txt_to_save = "/".join([self.dataset_dir,"train_left_2.txt"]) 174 | self.saveTxt(txt_to_save, self.L2_set[start_idx:end_idx]) 175 | 176 | txt_to_save = "/".join([self.dataset_dir,"train_right_2.txt"]) 177 | self.saveTxt(txt_to_save, self.R2_set[start_idx:end_idx]) 178 | 179 | lmdb_to_save = "/".join([self.dataset_dir,"train_K"]) 180 | self.saveLmdb(lmdb_to_save, np.expand_dims(np.expand_dims(np.asarray(self.K[start_idx:end_idx]),3),4)) 181 | 182 | lmdb_to_save = "/".join([self.dataset_dir,"train_T_R2L"]) 183 | self.saveLmdb(lmdb_to_save, np.expand_dims(np.expand_dims(np.asarray(self.T[start_idx:end_idx]),3),4)) 184 | print "Dataset built! Number of training instances: ", len(self.L1_set[start_idx:end_idx]) 185 | else: 186 | start_idx = 22600 187 | txt_to_save = "/".join([self.dataset_dir,"val_left_1.txt"]) 188 | self.saveTxt(txt_to_save, self.L1_set[start_idx:]) 189 | 190 | txt_to_save = "/".join([self.dataset_dir,"val_right_1.txt"]) 191 | self.saveTxt(txt_to_save, self.R1_set[start_idx:]) 192 | 193 | txt_to_save = "/".join([self.dataset_dir,"val_left_2.txt"]) 194 | self.saveTxt(txt_to_save, self.L2_set[start_idx:]) 195 | 196 | txt_to_save = "/".join([self.dataset_dir,"val_right_2.txt"]) 197 | self.saveTxt(txt_to_save, self.R2_set[start_idx:]) 198 | 199 | lmdb_to_save = "/".join([self.dataset_dir,"val_K"]) 200 | self.saveLmdb(lmdb_to_save, np.expand_dims(np.expand_dims(np.asarray(self.K[start_idx:]),3),4)) 201 | 202 | lmdb_to_save = "/".join([self.dataset_dir,"val_T_R2L"]) 203 | self.saveLmdb(lmdb_to_save, np.expand_dims(np.expand_dims(np.asarray(self.T[start_idx:]),3),4)) 204 | 205 | print "Dataset built! 
Number of validation instances: ", len(self.L1_set[start_idx:])
206 | 
207 |     def saveTxt(self, path, img_list):
208 |         f = open(path, 'w')
209 |         for line in img_list:
210 |             f.writelines(line+"\n")
211 |         f.close()
212 | 
213 |     def saveLmdb(self, path, np_arr):
214 |         # input: np_arr: shape = (N,C,H,W)
215 |         N = np_arr.shape[0]
216 |         map_size = np_arr.nbytes * 10
217 |         env = lmdb.open(path, map_size=map_size)
218 | 
219 |         with env.begin(write=True) as txn:
220 |             # txn is a Transaction object
221 |             for i in range(N):
222 |                 datum = caffe.proto.caffe_pb2.Datum()
223 |                 datum.channels = np_arr.shape[1]
224 |                 datum.height = np_arr.shape[2]
225 |                 datum.width = np_arr.shape[3]
226 |                 datum = caffe.io.array_to_datum(np_arr[i])
227 |                 str_id = '{:08}'.format(i)
228 |                 # The encode is only essential in Python 3
229 |                 txn.put(str_id.encode('ascii'), datum.SerializeToString())
230 | 
231 | 
232 | class kittiOdomBuilder():
233 |     def __init__(self):
234 |         self.train_scenes = ["residential/2011_10_03_drive_0027",
235 |             "road/2011_10_03_drive_0042",
236 |             "residential/2011_10_03_drive_0034",
237 |             # "residential/2011_09_26_drive_0067",
238 |             "road/2011_09_30_drive_0016",
239 |             "residential/2011_09_30_drive_0018",
240 |             "residential/2011_09_30_drive_0020",
241 |             "residential/2011_09_30_drive_0027",
242 |             "residential/2011_09_30_drive_0028"
243 |             # test sets
244 |             # ,"residential/2011_09_30_drive_0033",
245 |             # "residential/2011_09_30_drive_0034"
246 |             ]
247 |         self.train_seqs = [[0,4540],
248 |             [0,1100],
249 |             [0,4660],
250 |             # [0,800],
251 |             [0,270],
252 |             [0,2760],
253 |             [0,1100],
254 |             [0,1100],
255 |             [1100,5170]
256 | 
257 |             # test sets
258 |             # ,[0,1590],
259 |             # [0,1200]
260 |             ]
261 | 
262 |         self.test_scenes = ["residential/2011_09_30_drive_0033",
263 |             "residential/2011_09_30_drive_0034"]
264 | 
265 |         self.test_seqs = [[0,1590],[0,1200]]
266 | 
267 |         # self.test_scenes = self.train_scenes
268 |         # self.test_seqs = self.train_seqs
269 | 
270 |     def setup(self, setupOpt):
271 |         self.train_frame_distance = setupOpt['train_frame_distance']
272 |         self.test_frame_distance = setupOpt['test_frame_distance']
273 |         self.dataset_dir = setupOpt['raw_data_dir']
274 |         self.resource_dir = setupOpt['dataset_dir']
275 |         self.image_size = setupOpt['image_size']  # fixed: previously referenced the global setup_opt
276 | 
277 |     def getTrainData(self):
278 |         # ----------------------------------------------------------------------
279 |         # Get Training dataset: image pairs (L1,L2,R1,R2 & K & T(L2R))
280 |         # ----------------------------------------------------------------------
281 |         self.train_L1 = []
282 |         self.train_R1 = []
283 |         self.train_L2 = []
284 |         self.train_R2 = []
285 |         self.K = []
286 |         self.T = []
287 | 
288 |         for cnt, train_scene in enumerate(self.train_scenes):
289 |             seq_start = self.train_seqs[cnt][0]
290 |             seq_end = self.train_seqs[cnt][1]
291 |             for i in xrange(seq_start,seq_end-self.train_frame_distance+1):
292 |                 L1 = "/".join([self.dataset_dir, train_scene, "image_02", "data", '{:010}'.format(i)]) + ".png"
293 |                 L2 = "/".join([self.dataset_dir, train_scene, "image_02", "data", '{:010}'.format(i+self.train_frame_distance)]) + ".png"
294 |                 R1 = "/".join([self.dataset_dir, train_scene, "image_03", "data", '{:010}'.format(i)]) + ".png"
295 |                 R2 = "/".join([self.dataset_dir, train_scene, "image_03", "data", '{:010}'.format(i+self.train_frame_distance)]) + ".png"
296 | 
297 |                 self.train_L1.append(L1)
298 |                 self.train_L2.append(L2)
299 |                 self.train_R1.append(R1)
300 |                 self.train_R2.append(R2)
301 | 
302 |                 # kt_file_path = "/".join([self.dataset_dir, train_scene, "KT.txt"])
303 |                 kt_scene = "/".join([self.dataset_dir, train_scene])
304 |                 KT = self.getKT(kt_scene)
305 |                 self.K.append(KT[:4])
306 |                 self.T.append(KT[4])
307 | 
308 |     def getTestData(self):
309 |         # ----------------------------------------------------------------------
310 |         # Get test dataset: image pairs (L1,L2,R1,R2 & K & T(L2R))
311 |         # ----------------------------------------------------------------------
312 |         self.test_L1 = []
313 |         self.test_R1 = []
314 |         self.test_L2 = []
315 |         self.test_R2 = []
316 |         self.test_K = []
317 |         self.test_T = []
318 | 
319 |         for cnt, test_scene in enumerate(self.test_scenes):
320 |             seq_start = self.test_seqs[cnt][0]
321 |             seq_end = self.test_seqs[cnt][1]
322 |             for i in xrange(seq_start,seq_end-self.test_frame_distance+1):
323 |                 L1 = "/".join([self.dataset_dir, test_scene, "image_02", "data", '{:010}'.format(i)]) + ".png"
324 |                 L2 = "/".join([self.dataset_dir, test_scene, "image_02", "data", '{:010}'.format(i+self.test_frame_distance)]) + ".png"
325 |                 R1 = "/".join([self.dataset_dir, test_scene, "image_03", "data", '{:010}'.format(i)]) + ".png"
326 |                 R2 = "/".join([self.dataset_dir, test_scene, "image_03", "data", '{:010}'.format(i+self.test_frame_distance)]) + ".png"
327 | 
328 |                 self.test_L1.append(L1)
329 |                 self.test_L2.append(L2)
330 |                 self.test_R1.append(R1)
331 |                 self.test_R2.append(R2)
332 | 
333 |                 kt_scene = "/".join([self.dataset_dir, test_scene])  # fixed: getKT() expects the scene directory, not a KT.txt path
334 |                 KT = self.getKT(kt_scene)
335 |                 self.test_K.append(KT[:4])
336 |                 self.test_T.append(KT[4])
337 | 
338 |     def getKT(self,scene):
339 |         # ----------------------------------------------------------------------
340 |         # Get K (camera intrinsic) and T (camera extrinsic)
341 |         # ----------------------------------------------------------------------
342 |         new_image_size = [float(self.image_size[0]), float(self.image_size[1])] #[height,width]
343 | 
344 |         # ----------------------------------------------------------------------
345 |         # Get original K
346 |         # ----------------------------------------------------------------------
347 |         f = open(scene+"/calib/calib_cam_to_cam.txt", 'r')
348 |         camTxt = f.readlines()
349 |         f.close()
350 |         K_dict = {}
351 |         for line in camTxt:
352 |             line_split = line.split(":")
353 |             K_dict[line_split[0]] = line_split[1]
354 | 
355 |         # ----------------------------------------------------------------------
356 |         # original K02
357 |         # ----------------------------------------------------------------------
358 |         P_split = K_dict["P_rect_02"].split(" ")
359 |         S_split = K_dict["S_rect_02"].split(" ")
360 |         ref_img_size = [float(S_split[2]), float(S_split[1])] # height, width
361 | 
362 | 
363 |         # ----------------------------------------------------------------------
364 |         # Get new K & position
365 |         # ----------------------------------------------------------------------
366 |         W_ratio = new_image_size[1] / ref_img_size[1]
367 |         H_ratio = new_image_size[0] / ref_img_size[0]
368 |         fx = float(P_split[1]) * W_ratio
369 |         fy = float(P_split[6]) * H_ratio
370 |         cx = float(P_split[3]) * W_ratio
371 |         cy = float(P_split[7]) * H_ratio
372 | 
373 |         tx_L = float(P_split[4]) / float(P_split[1])
374 |         # ty_L = float(P_split[8]) / float(P_split[6])
375 | 
376 |         # ----------------------------------------------------------------------
377 |         # original K03
378 |         # ----------------------------------------------------------------------
379 |         P_split = K_dict["P_rect_03"].split(" ")
380 |         S_split = K_dict["S_rect_03"].split(" ")
381 | 
382 |         tx_R = float(P_split[4]) / float(P_split[1])
383 |         # ty_R = float(P_split[8]) / float(P_split[6])
384 | 
385 |         # ----------------------------------------------------------------------
386 |         # Get position of Right camera w.r.t Left
387 |         # ----------------------------------------------------------------------
388 |         Tx = np.abs(tx_R - tx_L)
389 |         # Ty = np.abs(tx_R - tx_L)
390 | 
391 |         se3 = [0,0,0,Tx,0,0]
392 | 
393 |         return [fx,fy,cx,cy,se3]
394 | 
395 | 
396 |     def shuffleDataset(self):
397 |         # Shuffle the following lists together so they stay aligned:
398 |         # 1. self.train_L1/L2, self.train_R1/R2: image paths
399 |         # 2. self.K: camera intrinsics
400 |         # 3. self.T: right-to-left transformation (se3)
401 |         list_ = list(zip(self.train_L1, self.train_L2, self.train_R1, self.train_R2, self.K, self.T))
402 |         random.shuffle(list_)
403 |         self.train_L1, self.train_L2, self.train_R1, self.train_R2, self.K, self.T = zip(*list_)
404 | 
405 |     def saveTrainset(self):
406 |         # ----------------------------------------------------------------------
407 |         #
408 |         # ----------------------------------------------------------------------
409 |         txt_to_save = "/".join([self.resource_dir,"left_1.txt"])
410 |         self.saveTxt(txt_to_save, self.train_L1)
411 | 
412 |         txt_to_save = "/".join([self.resource_dir,"right_1.txt"])
413 |         self.saveTxt(txt_to_save, self.train_R1)
414 | 
415 |         txt_to_save = "/".join([self.resource_dir,"left_2.txt"])
416 |         self.saveTxt(txt_to_save, self.train_L2)
417 | 
418 |         txt_to_save = "/".join([self.resource_dir,"right_2.txt"])
419 |         self.saveTxt(txt_to_save, self.train_R2)
420 | 
421 |         lmdb_to_save = "/".join([self.resource_dir,"K"])
422 |         # self.saveLmdb(lmdb_to_save, np.expand_dims(np.expand_dims(np.asarray(self.K),3),4))
423 | 
424 |         lmdb_to_save = "/".join([self.resource_dir,"T_R2L"])
425 |         # self.saveLmdb(lmdb_to_save, np.expand_dims(np.expand_dims(np.asarray(self.T),3),4))
426 | 
427 |     def saveTestset(self):
428 |         txt_to_save = "/".join([self.resource_dir,"left_1.txt"])
429 |         self.saveTxt(txt_to_save, self.test_L1)
430 | 
431 |         txt_to_save = "/".join([self.resource_dir,"right_1.txt"])
432 |         self.saveTxt(txt_to_save, self.test_R1)
433 | 
434 |         txt_to_save = "/".join([self.resource_dir,"left_2.txt"])
435 |         self.saveTxt(txt_to_save, self.test_L2)
436 | 
437 |         txt_to_save = "/".join([self.resource_dir,"right_2.txt"])
438 |         self.saveTxt(txt_to_save, self.test_R2)
439 | 
440 |         lmdb_to_save = "/".join([self.resource_dir,"K"])
441 |         self.saveLmdb(lmdb_to_save, np.expand_dims(np.expand_dims(np.asarray(self.test_K),3),4))
442 | 
443 |         lmdb_to_save = "/".join([self.resource_dir,"T_R2L"])
444 |         self.saveLmdb(lmdb_to_save, np.expand_dims(np.expand_dims(np.asarray(self.test_T),3),4))
445 | 
446 |     def saveTxt(self, path, img_list):
447 |         f = open(path, 'w')
448 |         for line in img_list:
449 |             f.writelines(line+"\n")
450 |         f.close()
451 | 
452 |     def saveLmdb(self, path, np_arr):
453 |         # input: np_arr: shape = (N,C,H,W)
454 |         N = np_arr.shape[0]
455 |         map_size = np_arr.nbytes * 10
456 |         env = lmdb.open(path, map_size=map_size)
457 |         print np_arr.shape
458 | 
459 |         with env.begin(write=True) as txn:
460 |             # txn is a Transaction object
461 |             for i in range(N):
462 |                 datum = caffe.proto.caffe_pb2.Datum()
463 |                 datum.channels = np_arr.shape[1]
464 |                 datum.height = np_arr.shape[2]
465 |                 datum.width = np_arr.shape[3]
466 |                 datum = caffe.io.array_to_datum(np_arr[i])  # overwrites the fields set above
467 |                 str_id = '{:08}'.format(i)
468 |                 # The encode is only essential in Python 3
469 |                 txn.put(str_id.encode('ascii'), datum.SerializeToString())
470 | 
471 | 
472 | 
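A quick way to sanity-check a built LMDB is to read an entry back with the same Caffe datum utilities used in `saveLmdb` above; the path below is an assumed example:

``` python
import lmdb
import caffe

env = lmdb.open("./data/dataset/kitti_eigen/train_K", readonly=True)
with env.begin() as txn:
    datum = caffe.proto.caffe_pb2.Datum()
    datum.ParseFromString(txn.get('{:08}'.format(0).encode('ascii')))
    K = caffe.io.datum_to_array(datum)  # shape (4,1,1): fx, fy, cx, cy
    print(K.flatten())
```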
473 | parser = argparse.ArgumentParser(description='Dataset builder')
474 | parser.add_argument('--builder', type=str, default='kitti_eigen', help='Select builder (kitti_eigen; kitti_odom)')
475 | parser.add_argument('--train_frame_distance', type=int, default=1, help='Frame distance between training instances')
476 | parser.add_argument('--raw_data_dir', type=str, default='./data/kitti_raw_data', help='Directory path storing the raw KITTI dataset')
477 | parser.add_argument('--dataset_dir', type=str, default='./data/dataset/kitti_eigen', help='Directory path storing the created dataset')
478 | parser.add_argument('--image_size', type=list, default="[160, 608]", help='Image size for the dataset [height, width]')
479 | parser.add_argument('--with_val', type=bool, default=False, help='Build validation set as well (note: any non-empty string parses as True)')
480 | 
481 | args = parser.parse_args()
482 | args.image_size = [int("".join(args.image_size).split(",")[0][1:]), int("".join(args.image_size).split(",")[1][:-1])]  # parse "[H,W]" into two ints
483 | 
484 | if args.builder == "kitti_eigen":
485 |     builder = kittiEigenBuilder()
486 | 
487 |     # ----------------------------------------------------------------------
488 |     # Setup options
489 |     # ----------------------------------------------------------------------
490 |     setup_opt = {}
491 |     setup_opt['train_frame_distance'] = args.train_frame_distance
492 |     setup_opt['raw_data_dir'] = args.raw_data_dir
493 |     setup_opt['dataset_dir'] = args.dataset_dir
494 |     setup_opt['image_size'] = args.image_size
495 | 
496 |     builder.setup(setup_opt)
497 |     # ----------------------------------------------------------------------
498 |     # Training set
499 |     # ----------------------------------------------------------------------
500 |     builder.getData(isTrain=True)
501 |     builder.shuffleDataset()
502 |     builder.saveDataset(isTrain=True, with_val=args.with_val)
503 |     # ----------------------------------------------------------------------
504 |     # Validation set
505 |     # ----------------------------------------------------------------------
506 |     if args.with_val:
507 |         builder.saveDataset(isTrain=False, with_val=True)  # fixed: with_val was missing and raised a TypeError
508 | 
509 | elif args.builder == "kitti_odom":
510 |     builder = kittiOdomBuilder()
511 |     setup_opt = {}
512 |     setup_opt['train_frame_distance'] = args.train_frame_distance
513 |     setup_opt['test_frame_distance'] = 1
514 |     setup_opt['raw_data_dir'] = args.raw_data_dir
515 |     setup_opt['dataset_dir'] = args.dataset_dir
516 |     setup_opt['image_size'] = args.image_size
517 |     builder.setup(setup_opt)
518 |     builder.getTrainData()
519 |     builder.shuffleDataset()
520 |     builder.saveTrainset()
--------------------------------------------------------------------------------
/data/depth_evaluation/kitti_eigen/test_files_eigen.txt:
--------------------------------------------------------------------------------
1 | city/2011_09_26_drive_0002/image_02/data/0000000069.png
2 | city/2011_09_26_drive_0002/image_02/data/0000000054.png
3 | city/2011_09_26_drive_0002/image_02/data/0000000042.png
4 | city/2011_09_26_drive_0002/image_02/data/0000000057.png
5 | city/2011_09_26_drive_0002/image_02/data/0000000030.png
6 | city/2011_09_26_drive_0002/image_02/data/0000000027.png
7 | city/2011_09_26_drive_0002/image_02/data/0000000012.png
8 | city/2011_09_26_drive_0002/image_02/data/0000000075.png
9 | city/2011_09_26_drive_0002/image_02/data/0000000036.png
10 | city/2011_09_26_drive_0002/image_02/data/0000000033.png
11 | city/2011_09_26_drive_0002/image_02/data/0000000015.png
12 | city/2011_09_26_drive_0002/image_02/data/0000000072.png
13 | 
city/2011_09_26_drive_0002/image_02/data/0000000003.png 14 | city/2011_09_26_drive_0002/image_02/data/0000000039.png 15 | city/2011_09_26_drive_0002/image_02/data/0000000009.png 16 | city/2011_09_26_drive_0002/image_02/data/0000000051.png 17 | city/2011_09_26_drive_0002/image_02/data/0000000060.png 18 | city/2011_09_26_drive_0002/image_02/data/0000000021.png 19 | city/2011_09_26_drive_0002/image_02/data/0000000000.png 20 | city/2011_09_26_drive_0002/image_02/data/0000000024.png 21 | city/2011_09_26_drive_0002/image_02/data/0000000045.png 22 | city/2011_09_26_drive_0002/image_02/data/0000000018.png 23 | city/2011_09_26_drive_0002/image_02/data/0000000048.png 24 | city/2011_09_26_drive_0002/image_02/data/0000000006.png 25 | city/2011_09_26_drive_0002/image_02/data/0000000063.png 26 | city/2011_09_26_drive_0009/image_02/data/0000000000.png 27 | city/2011_09_26_drive_0009/image_02/data/0000000016.png 28 | city/2011_09_26_drive_0009/image_02/data/0000000032.png 29 | city/2011_09_26_drive_0009/image_02/data/0000000048.png 30 | city/2011_09_26_drive_0009/image_02/data/0000000064.png 31 | city/2011_09_26_drive_0009/image_02/data/0000000080.png 32 | city/2011_09_26_drive_0009/image_02/data/0000000096.png 33 | city/2011_09_26_drive_0009/image_02/data/0000000112.png 34 | city/2011_09_26_drive_0009/image_02/data/0000000128.png 35 | city/2011_09_26_drive_0009/image_02/data/0000000144.png 36 | city/2011_09_26_drive_0009/image_02/data/0000000160.png 37 | city/2011_09_26_drive_0009/image_02/data/0000000176.png 38 | city/2011_09_26_drive_0009/image_02/data/0000000196.png 39 | city/2011_09_26_drive_0009/image_02/data/0000000212.png 40 | city/2011_09_26_drive_0009/image_02/data/0000000228.png 41 | city/2011_09_26_drive_0009/image_02/data/0000000244.png 42 | city/2011_09_26_drive_0009/image_02/data/0000000260.png 43 | city/2011_09_26_drive_0009/image_02/data/0000000276.png 44 | city/2011_09_26_drive_0009/image_02/data/0000000292.png 45 | city/2011_09_26_drive_0009/image_02/data/0000000308.png 46 | city/2011_09_26_drive_0009/image_02/data/0000000324.png 47 | city/2011_09_26_drive_0009/image_02/data/0000000340.png 48 | city/2011_09_26_drive_0009/image_02/data/0000000356.png 49 | city/2011_09_26_drive_0009/image_02/data/0000000372.png 50 | city/2011_09_26_drive_0009/image_02/data/0000000388.png 51 | city/2011_09_26_drive_0013/image_02/data/0000000090.png 52 | city/2011_09_26_drive_0013/image_02/data/0000000050.png 53 | city/2011_09_26_drive_0013/image_02/data/0000000110.png 54 | city/2011_09_26_drive_0013/image_02/data/0000000115.png 55 | city/2011_09_26_drive_0013/image_02/data/0000000060.png 56 | city/2011_09_26_drive_0013/image_02/data/0000000105.png 57 | city/2011_09_26_drive_0013/image_02/data/0000000125.png 58 | city/2011_09_26_drive_0013/image_02/data/0000000020.png 59 | city/2011_09_26_drive_0013/image_02/data/0000000140.png 60 | city/2011_09_26_drive_0013/image_02/data/0000000085.png 61 | city/2011_09_26_drive_0013/image_02/data/0000000070.png 62 | city/2011_09_26_drive_0013/image_02/data/0000000080.png 63 | city/2011_09_26_drive_0013/image_02/data/0000000065.png 64 | city/2011_09_26_drive_0013/image_02/data/0000000095.png 65 | city/2011_09_26_drive_0013/image_02/data/0000000130.png 66 | city/2011_09_26_drive_0013/image_02/data/0000000100.png 67 | city/2011_09_26_drive_0013/image_02/data/0000000010.png 68 | city/2011_09_26_drive_0013/image_02/data/0000000030.png 69 | city/2011_09_26_drive_0013/image_02/data/0000000000.png 70 | city/2011_09_26_drive_0013/image_02/data/0000000135.png 71 | 
city/2011_09_26_drive_0013/image_02/data/0000000040.png 72 | city/2011_09_26_drive_0013/image_02/data/0000000005.png 73 | city/2011_09_26_drive_0013/image_02/data/0000000120.png 74 | city/2011_09_26_drive_0013/image_02/data/0000000045.png 75 | city/2011_09_26_drive_0013/image_02/data/0000000035.png 76 | residential/2011_09_26_drive_0020/image_02/data/0000000003.png 77 | residential/2011_09_26_drive_0020/image_02/data/0000000069.png 78 | residential/2011_09_26_drive_0020/image_02/data/0000000057.png 79 | residential/2011_09_26_drive_0020/image_02/data/0000000012.png 80 | residential/2011_09_26_drive_0020/image_02/data/0000000072.png 81 | residential/2011_09_26_drive_0020/image_02/data/0000000018.png 82 | residential/2011_09_26_drive_0020/image_02/data/0000000063.png 83 | residential/2011_09_26_drive_0020/image_02/data/0000000000.png 84 | residential/2011_09_26_drive_0020/image_02/data/0000000084.png 85 | residential/2011_09_26_drive_0020/image_02/data/0000000015.png 86 | residential/2011_09_26_drive_0020/image_02/data/0000000066.png 87 | residential/2011_09_26_drive_0020/image_02/data/0000000006.png 88 | residential/2011_09_26_drive_0020/image_02/data/0000000048.png 89 | residential/2011_09_26_drive_0020/image_02/data/0000000060.png 90 | residential/2011_09_26_drive_0020/image_02/data/0000000009.png 91 | residential/2011_09_26_drive_0020/image_02/data/0000000033.png 92 | residential/2011_09_26_drive_0020/image_02/data/0000000021.png 93 | residential/2011_09_26_drive_0020/image_02/data/0000000075.png 94 | residential/2011_09_26_drive_0020/image_02/data/0000000027.png 95 | residential/2011_09_26_drive_0020/image_02/data/0000000045.png 96 | residential/2011_09_26_drive_0020/image_02/data/0000000078.png 97 | residential/2011_09_26_drive_0020/image_02/data/0000000036.png 98 | residential/2011_09_26_drive_0020/image_02/data/0000000051.png 99 | residential/2011_09_26_drive_0020/image_02/data/0000000054.png 100 | residential/2011_09_26_drive_0020/image_02/data/0000000042.png 101 | residential/2011_09_26_drive_0023/image_02/data/0000000018.png 102 | residential/2011_09_26_drive_0023/image_02/data/0000000090.png 103 | residential/2011_09_26_drive_0023/image_02/data/0000000126.png 104 | residential/2011_09_26_drive_0023/image_02/data/0000000378.png 105 | residential/2011_09_26_drive_0023/image_02/data/0000000036.png 106 | residential/2011_09_26_drive_0023/image_02/data/0000000288.png 107 | residential/2011_09_26_drive_0023/image_02/data/0000000198.png 108 | residential/2011_09_26_drive_0023/image_02/data/0000000450.png 109 | residential/2011_09_26_drive_0023/image_02/data/0000000144.png 110 | residential/2011_09_26_drive_0023/image_02/data/0000000072.png 111 | residential/2011_09_26_drive_0023/image_02/data/0000000252.png 112 | residential/2011_09_26_drive_0023/image_02/data/0000000180.png 113 | residential/2011_09_26_drive_0023/image_02/data/0000000432.png 114 | residential/2011_09_26_drive_0023/image_02/data/0000000396.png 115 | residential/2011_09_26_drive_0023/image_02/data/0000000054.png 116 | residential/2011_09_26_drive_0023/image_02/data/0000000468.png 117 | residential/2011_09_26_drive_0023/image_02/data/0000000306.png 118 | residential/2011_09_26_drive_0023/image_02/data/0000000108.png 119 | residential/2011_09_26_drive_0023/image_02/data/0000000162.png 120 | residential/2011_09_26_drive_0023/image_02/data/0000000342.png 121 | residential/2011_09_26_drive_0023/image_02/data/0000000270.png 122 | residential/2011_09_26_drive_0023/image_02/data/0000000414.png 123 | 
residential/2011_09_26_drive_0023/image_02/data/0000000216.png 124 | residential/2011_09_26_drive_0023/image_02/data/0000000360.png 125 | residential/2011_09_26_drive_0023/image_02/data/0000000324.png 126 | road/2011_09_26_drive_0027/image_02/data/0000000077.png 127 | road/2011_09_26_drive_0027/image_02/data/0000000035.png 128 | road/2011_09_26_drive_0027/image_02/data/0000000091.png 129 | road/2011_09_26_drive_0027/image_02/data/0000000112.png 130 | road/2011_09_26_drive_0027/image_02/data/0000000007.png 131 | road/2011_09_26_drive_0027/image_02/data/0000000175.png 132 | road/2011_09_26_drive_0027/image_02/data/0000000042.png 133 | road/2011_09_26_drive_0027/image_02/data/0000000098.png 134 | road/2011_09_26_drive_0027/image_02/data/0000000133.png 135 | road/2011_09_26_drive_0027/image_02/data/0000000161.png 136 | road/2011_09_26_drive_0027/image_02/data/0000000014.png 137 | road/2011_09_26_drive_0027/image_02/data/0000000126.png 138 | road/2011_09_26_drive_0027/image_02/data/0000000168.png 139 | road/2011_09_26_drive_0027/image_02/data/0000000070.png 140 | road/2011_09_26_drive_0027/image_02/data/0000000084.png 141 | road/2011_09_26_drive_0027/image_02/data/0000000140.png 142 | road/2011_09_26_drive_0027/image_02/data/0000000049.png 143 | road/2011_09_26_drive_0027/image_02/data/0000000000.png 144 | road/2011_09_26_drive_0027/image_02/data/0000000182.png 145 | road/2011_09_26_drive_0027/image_02/data/0000000147.png 146 | road/2011_09_26_drive_0027/image_02/data/0000000056.png 147 | road/2011_09_26_drive_0027/image_02/data/0000000063.png 148 | road/2011_09_26_drive_0027/image_02/data/0000000021.png 149 | road/2011_09_26_drive_0027/image_02/data/0000000119.png 150 | road/2011_09_26_drive_0027/image_02/data/0000000028.png 151 | road/2011_09_26_drive_0029/image_02/data/0000000380.png 152 | road/2011_09_26_drive_0029/image_02/data/0000000394.png 153 | road/2011_09_26_drive_0029/image_02/data/0000000324.png 154 | road/2011_09_26_drive_0029/image_02/data/0000000000.png 155 | road/2011_09_26_drive_0029/image_02/data/0000000268.png 156 | road/2011_09_26_drive_0029/image_02/data/0000000366.png 157 | road/2011_09_26_drive_0029/image_02/data/0000000296.png 158 | road/2011_09_26_drive_0029/image_02/data/0000000014.png 159 | road/2011_09_26_drive_0029/image_02/data/0000000028.png 160 | road/2011_09_26_drive_0029/image_02/data/0000000182.png 161 | road/2011_09_26_drive_0029/image_02/data/0000000168.png 162 | road/2011_09_26_drive_0029/image_02/data/0000000196.png 163 | road/2011_09_26_drive_0029/image_02/data/0000000140.png 164 | road/2011_09_26_drive_0029/image_02/data/0000000084.png 165 | road/2011_09_26_drive_0029/image_02/data/0000000056.png 166 | road/2011_09_26_drive_0029/image_02/data/0000000112.png 167 | road/2011_09_26_drive_0029/image_02/data/0000000352.png 168 | road/2011_09_26_drive_0029/image_02/data/0000000126.png 169 | road/2011_09_26_drive_0029/image_02/data/0000000070.png 170 | road/2011_09_26_drive_0029/image_02/data/0000000310.png 171 | road/2011_09_26_drive_0029/image_02/data/0000000154.png 172 | road/2011_09_26_drive_0029/image_02/data/0000000098.png 173 | road/2011_09_26_drive_0029/image_02/data/0000000408.png 174 | road/2011_09_26_drive_0029/image_02/data/0000000042.png 175 | road/2011_09_26_drive_0029/image_02/data/0000000338.png 176 | residential/2011_09_26_drive_0036/image_02/data/0000000000.png 177 | residential/2011_09_26_drive_0036/image_02/data/0000000128.png 178 | residential/2011_09_26_drive_0036/image_02/data/0000000192.png 179 | 
residential/2011_09_26_drive_0036/image_02/data/0000000032.png 180 | residential/2011_09_26_drive_0036/image_02/data/0000000352.png 181 | residential/2011_09_26_drive_0036/image_02/data/0000000608.png 182 | residential/2011_09_26_drive_0036/image_02/data/0000000224.png 183 | residential/2011_09_26_drive_0036/image_02/data/0000000576.png 184 | residential/2011_09_26_drive_0036/image_02/data/0000000672.png 185 | residential/2011_09_26_drive_0036/image_02/data/0000000064.png 186 | residential/2011_09_26_drive_0036/image_02/data/0000000448.png 187 | residential/2011_09_26_drive_0036/image_02/data/0000000704.png 188 | residential/2011_09_26_drive_0036/image_02/data/0000000640.png 189 | residential/2011_09_26_drive_0036/image_02/data/0000000512.png 190 | residential/2011_09_26_drive_0036/image_02/data/0000000768.png 191 | residential/2011_09_26_drive_0036/image_02/data/0000000160.png 192 | residential/2011_09_26_drive_0036/image_02/data/0000000416.png 193 | residential/2011_09_26_drive_0036/image_02/data/0000000480.png 194 | residential/2011_09_26_drive_0036/image_02/data/0000000800.png 195 | residential/2011_09_26_drive_0036/image_02/data/0000000288.png 196 | residential/2011_09_26_drive_0036/image_02/data/0000000544.png 197 | residential/2011_09_26_drive_0036/image_02/data/0000000096.png 198 | residential/2011_09_26_drive_0036/image_02/data/0000000384.png 199 | residential/2011_09_26_drive_0036/image_02/data/0000000256.png 200 | residential/2011_09_26_drive_0036/image_02/data/0000000320.png 201 | residential/2011_09_26_drive_0046/image_02/data/0000000000.png 202 | residential/2011_09_26_drive_0046/image_02/data/0000000005.png 203 | residential/2011_09_26_drive_0046/image_02/data/0000000010.png 204 | residential/2011_09_26_drive_0046/image_02/data/0000000015.png 205 | residential/2011_09_26_drive_0046/image_02/data/0000000020.png 206 | residential/2011_09_26_drive_0046/image_02/data/0000000025.png 207 | residential/2011_09_26_drive_0046/image_02/data/0000000030.png 208 | residential/2011_09_26_drive_0046/image_02/data/0000000035.png 209 | residential/2011_09_26_drive_0046/image_02/data/0000000040.png 210 | residential/2011_09_26_drive_0046/image_02/data/0000000045.png 211 | residential/2011_09_26_drive_0046/image_02/data/0000000050.png 212 | residential/2011_09_26_drive_0046/image_02/data/0000000055.png 213 | residential/2011_09_26_drive_0046/image_02/data/0000000060.png 214 | residential/2011_09_26_drive_0046/image_02/data/0000000065.png 215 | residential/2011_09_26_drive_0046/image_02/data/0000000070.png 216 | residential/2011_09_26_drive_0046/image_02/data/0000000075.png 217 | residential/2011_09_26_drive_0046/image_02/data/0000000080.png 218 | residential/2011_09_26_drive_0046/image_02/data/0000000085.png 219 | residential/2011_09_26_drive_0046/image_02/data/0000000090.png 220 | residential/2011_09_26_drive_0046/image_02/data/0000000095.png 221 | residential/2011_09_26_drive_0046/image_02/data/0000000100.png 222 | residential/2011_09_26_drive_0046/image_02/data/0000000105.png 223 | residential/2011_09_26_drive_0046/image_02/data/0000000110.png 224 | residential/2011_09_26_drive_0046/image_02/data/0000000115.png 225 | residential/2011_09_26_drive_0046/image_02/data/0000000120.png 226 | city/2011_09_26_drive_0048/image_02/data/0000000000.png 227 | city/2011_09_26_drive_0048/image_02/data/0000000001.png 228 | city/2011_09_26_drive_0048/image_02/data/0000000002.png 229 | city/2011_09_26_drive_0048/image_02/data/0000000003.png 230 | city/2011_09_26_drive_0048/image_02/data/0000000004.png 231 | 
city/2011_09_26_drive_0048/image_02/data/0000000005.png 232 | city/2011_09_26_drive_0048/image_02/data/0000000006.png 233 | city/2011_09_26_drive_0048/image_02/data/0000000007.png 234 | city/2011_09_26_drive_0048/image_02/data/0000000008.png 235 | city/2011_09_26_drive_0048/image_02/data/0000000009.png 236 | city/2011_09_26_drive_0048/image_02/data/0000000010.png 237 | city/2011_09_26_drive_0048/image_02/data/0000000011.png 238 | city/2011_09_26_drive_0048/image_02/data/0000000012.png 239 | city/2011_09_26_drive_0048/image_02/data/0000000013.png 240 | city/2011_09_26_drive_0048/image_02/data/0000000014.png 241 | city/2011_09_26_drive_0048/image_02/data/0000000015.png 242 | city/2011_09_26_drive_0048/image_02/data/0000000016.png 243 | city/2011_09_26_drive_0048/image_02/data/0000000017.png 244 | city/2011_09_26_drive_0048/image_02/data/0000000018.png 245 | city/2011_09_26_drive_0048/image_02/data/0000000019.png 246 | city/2011_09_26_drive_0048/image_02/data/0000000020.png 247 | city/2011_09_26_drive_0048/image_02/data/0000000021.png 248 | road/2011_09_26_drive_0052/image_02/data/0000000046.png 249 | road/2011_09_26_drive_0052/image_02/data/0000000014.png 250 | road/2011_09_26_drive_0052/image_02/data/0000000036.png 251 | road/2011_09_26_drive_0052/image_02/data/0000000028.png 252 | road/2011_09_26_drive_0052/image_02/data/0000000026.png 253 | road/2011_09_26_drive_0052/image_02/data/0000000050.png 254 | road/2011_09_26_drive_0052/image_02/data/0000000040.png 255 | road/2011_09_26_drive_0052/image_02/data/0000000008.png 256 | road/2011_09_26_drive_0052/image_02/data/0000000016.png 257 | road/2011_09_26_drive_0052/image_02/data/0000000044.png 258 | road/2011_09_26_drive_0052/image_02/data/0000000018.png 259 | road/2011_09_26_drive_0052/image_02/data/0000000032.png 260 | road/2011_09_26_drive_0052/image_02/data/0000000042.png 261 | road/2011_09_26_drive_0052/image_02/data/0000000010.png 262 | road/2011_09_26_drive_0052/image_02/data/0000000020.png 263 | road/2011_09_26_drive_0052/image_02/data/0000000048.png 264 | road/2011_09_26_drive_0052/image_02/data/0000000052.png 265 | road/2011_09_26_drive_0052/image_02/data/0000000006.png 266 | road/2011_09_26_drive_0052/image_02/data/0000000030.png 267 | road/2011_09_26_drive_0052/image_02/data/0000000012.png 268 | road/2011_09_26_drive_0052/image_02/data/0000000038.png 269 | road/2011_09_26_drive_0052/image_02/data/0000000000.png 270 | road/2011_09_26_drive_0052/image_02/data/0000000002.png 271 | road/2011_09_26_drive_0052/image_02/data/0000000004.png 272 | road/2011_09_26_drive_0052/image_02/data/0000000022.png 273 | city/2011_09_26_drive_0056/image_02/data/0000000011.png 274 | city/2011_09_26_drive_0056/image_02/data/0000000033.png 275 | city/2011_09_26_drive_0056/image_02/data/0000000242.png 276 | city/2011_09_26_drive_0056/image_02/data/0000000253.png 277 | city/2011_09_26_drive_0056/image_02/data/0000000286.png 278 | city/2011_09_26_drive_0056/image_02/data/0000000154.png 279 | city/2011_09_26_drive_0056/image_02/data/0000000099.png 280 | city/2011_09_26_drive_0056/image_02/data/0000000220.png 281 | city/2011_09_26_drive_0056/image_02/data/0000000022.png 282 | city/2011_09_26_drive_0056/image_02/data/0000000077.png 283 | city/2011_09_26_drive_0056/image_02/data/0000000187.png 284 | city/2011_09_26_drive_0056/image_02/data/0000000143.png 285 | city/2011_09_26_drive_0056/image_02/data/0000000066.png 286 | city/2011_09_26_drive_0056/image_02/data/0000000176.png 287 | city/2011_09_26_drive_0056/image_02/data/0000000110.png 288 | 
city/2011_09_26_drive_0056/image_02/data/0000000275.png 289 | city/2011_09_26_drive_0056/image_02/data/0000000264.png 290 | city/2011_09_26_drive_0056/image_02/data/0000000198.png 291 | city/2011_09_26_drive_0056/image_02/data/0000000055.png 292 | city/2011_09_26_drive_0056/image_02/data/0000000088.png 293 | city/2011_09_26_drive_0056/image_02/data/0000000121.png 294 | city/2011_09_26_drive_0056/image_02/data/0000000209.png 295 | city/2011_09_26_drive_0056/image_02/data/0000000165.png 296 | city/2011_09_26_drive_0056/image_02/data/0000000231.png 297 | city/2011_09_26_drive_0056/image_02/data/0000000044.png 298 | city/2011_09_26_drive_0059/image_02/data/0000000056.png 299 | city/2011_09_26_drive_0059/image_02/data/0000000000.png 300 | city/2011_09_26_drive_0059/image_02/data/0000000344.png 301 | city/2011_09_26_drive_0059/image_02/data/0000000358.png 302 | city/2011_09_26_drive_0059/image_02/data/0000000316.png 303 | city/2011_09_26_drive_0059/image_02/data/0000000238.png 304 | city/2011_09_26_drive_0059/image_02/data/0000000098.png 305 | city/2011_09_26_drive_0059/image_02/data/0000000112.png 306 | city/2011_09_26_drive_0059/image_02/data/0000000028.png 307 | city/2011_09_26_drive_0059/image_02/data/0000000014.png 308 | city/2011_09_26_drive_0059/image_02/data/0000000330.png 309 | city/2011_09_26_drive_0059/image_02/data/0000000154.png 310 | city/2011_09_26_drive_0059/image_02/data/0000000042.png 311 | city/2011_09_26_drive_0059/image_02/data/0000000302.png 312 | city/2011_09_26_drive_0059/image_02/data/0000000182.png 313 | city/2011_09_26_drive_0059/image_02/data/0000000288.png 314 | city/2011_09_26_drive_0059/image_02/data/0000000140.png 315 | city/2011_09_26_drive_0059/image_02/data/0000000274.png 316 | city/2011_09_26_drive_0059/image_02/data/0000000224.png 317 | city/2011_09_26_drive_0059/image_02/data/0000000372.png 318 | city/2011_09_26_drive_0059/image_02/data/0000000196.png 319 | city/2011_09_26_drive_0059/image_02/data/0000000126.png 320 | city/2011_09_26_drive_0059/image_02/data/0000000084.png 321 | city/2011_09_26_drive_0059/image_02/data/0000000210.png 322 | city/2011_09_26_drive_0059/image_02/data/0000000070.png 323 | residential/2011_09_26_drive_0064/image_02/data/0000000528.png 324 | residential/2011_09_26_drive_0064/image_02/data/0000000308.png 325 | residential/2011_09_26_drive_0064/image_02/data/0000000044.png 326 | residential/2011_09_26_drive_0064/image_02/data/0000000352.png 327 | residential/2011_09_26_drive_0064/image_02/data/0000000066.png 328 | residential/2011_09_26_drive_0064/image_02/data/0000000000.png 329 | residential/2011_09_26_drive_0064/image_02/data/0000000506.png 330 | residential/2011_09_26_drive_0064/image_02/data/0000000176.png 331 | residential/2011_09_26_drive_0064/image_02/data/0000000022.png 332 | residential/2011_09_26_drive_0064/image_02/data/0000000242.png 333 | residential/2011_09_26_drive_0064/image_02/data/0000000462.png 334 | residential/2011_09_26_drive_0064/image_02/data/0000000418.png 335 | residential/2011_09_26_drive_0064/image_02/data/0000000110.png 336 | residential/2011_09_26_drive_0064/image_02/data/0000000440.png 337 | residential/2011_09_26_drive_0064/image_02/data/0000000396.png 338 | residential/2011_09_26_drive_0064/image_02/data/0000000154.png 339 | residential/2011_09_26_drive_0064/image_02/data/0000000374.png 340 | residential/2011_09_26_drive_0064/image_02/data/0000000088.png 341 | residential/2011_09_26_drive_0064/image_02/data/0000000286.png 342 | residential/2011_09_26_drive_0064/image_02/data/0000000550.png 343 | 
residential/2011_09_26_drive_0064/image_02/data/0000000264.png 344 | residential/2011_09_26_drive_0064/image_02/data/0000000220.png 345 | residential/2011_09_26_drive_0064/image_02/data/0000000330.png 346 | residential/2011_09_26_drive_0064/image_02/data/0000000484.png 347 | residential/2011_09_26_drive_0064/image_02/data/0000000198.png 348 | city/2011_09_26_drive_0084/image_02/data/0000000283.png 349 | city/2011_09_26_drive_0084/image_02/data/0000000361.png 350 | city/2011_09_26_drive_0084/image_02/data/0000000270.png 351 | city/2011_09_26_drive_0084/image_02/data/0000000127.png 352 | city/2011_09_26_drive_0084/image_02/data/0000000205.png 353 | city/2011_09_26_drive_0084/image_02/data/0000000218.png 354 | city/2011_09_26_drive_0084/image_02/data/0000000153.png 355 | city/2011_09_26_drive_0084/image_02/data/0000000335.png 356 | city/2011_09_26_drive_0084/image_02/data/0000000192.png 357 | city/2011_09_26_drive_0084/image_02/data/0000000348.png 358 | city/2011_09_26_drive_0084/image_02/data/0000000101.png 359 | city/2011_09_26_drive_0084/image_02/data/0000000049.png 360 | city/2011_09_26_drive_0084/image_02/data/0000000179.png 361 | city/2011_09_26_drive_0084/image_02/data/0000000140.png 362 | city/2011_09_26_drive_0084/image_02/data/0000000374.png 363 | city/2011_09_26_drive_0084/image_02/data/0000000322.png 364 | city/2011_09_26_drive_0084/image_02/data/0000000309.png 365 | city/2011_09_26_drive_0084/image_02/data/0000000244.png 366 | city/2011_09_26_drive_0084/image_02/data/0000000062.png 367 | city/2011_09_26_drive_0084/image_02/data/0000000257.png 368 | city/2011_09_26_drive_0084/image_02/data/0000000088.png 369 | city/2011_09_26_drive_0084/image_02/data/0000000114.png 370 | city/2011_09_26_drive_0084/image_02/data/0000000075.png 371 | city/2011_09_26_drive_0084/image_02/data/0000000296.png 372 | city/2011_09_26_drive_0084/image_02/data/0000000231.png 373 | residential/2011_09_26_drive_0086/image_02/data/0000000007.png 374 | residential/2011_09_26_drive_0086/image_02/data/0000000196.png 375 | residential/2011_09_26_drive_0086/image_02/data/0000000439.png 376 | residential/2011_09_26_drive_0086/image_02/data/0000000169.png 377 | residential/2011_09_26_drive_0086/image_02/data/0000000115.png 378 | residential/2011_09_26_drive_0086/image_02/data/0000000034.png 379 | residential/2011_09_26_drive_0086/image_02/data/0000000304.png 380 | residential/2011_09_26_drive_0086/image_02/data/0000000331.png 381 | residential/2011_09_26_drive_0086/image_02/data/0000000277.png 382 | residential/2011_09_26_drive_0086/image_02/data/0000000520.png 383 | residential/2011_09_26_drive_0086/image_02/data/0000000682.png 384 | residential/2011_09_26_drive_0086/image_02/data/0000000628.png 385 | residential/2011_09_26_drive_0086/image_02/data/0000000088.png 386 | residential/2011_09_26_drive_0086/image_02/data/0000000601.png 387 | residential/2011_09_26_drive_0086/image_02/data/0000000574.png 388 | residential/2011_09_26_drive_0086/image_02/data/0000000223.png 389 | residential/2011_09_26_drive_0086/image_02/data/0000000655.png 390 | residential/2011_09_26_drive_0086/image_02/data/0000000358.png 391 | residential/2011_09_26_drive_0086/image_02/data/0000000412.png 392 | residential/2011_09_26_drive_0086/image_02/data/0000000142.png 393 | residential/2011_09_26_drive_0086/image_02/data/0000000385.png 394 | residential/2011_09_26_drive_0086/image_02/data/0000000061.png 395 | residential/2011_09_26_drive_0086/image_02/data/0000000493.png 396 | residential/2011_09_26_drive_0086/image_02/data/0000000466.png 397 | 
residential/2011_09_26_drive_0086/image_02/data/0000000250.png 398 | city/2011_09_26_drive_0093/image_02/data/0000000000.png 399 | city/2011_09_26_drive_0093/image_02/data/0000000016.png 400 | city/2011_09_26_drive_0093/image_02/data/0000000032.png 401 | city/2011_09_26_drive_0093/image_02/data/0000000048.png 402 | city/2011_09_26_drive_0093/image_02/data/0000000064.png 403 | city/2011_09_26_drive_0093/image_02/data/0000000080.png 404 | city/2011_09_26_drive_0093/image_02/data/0000000096.png 405 | city/2011_09_26_drive_0093/image_02/data/0000000112.png 406 | city/2011_09_26_drive_0093/image_02/data/0000000128.png 407 | city/2011_09_26_drive_0093/image_02/data/0000000144.png 408 | city/2011_09_26_drive_0093/image_02/data/0000000160.png 409 | city/2011_09_26_drive_0093/image_02/data/0000000176.png 410 | city/2011_09_26_drive_0093/image_02/data/0000000192.png 411 | city/2011_09_26_drive_0093/image_02/data/0000000208.png 412 | city/2011_09_26_drive_0093/image_02/data/0000000224.png 413 | city/2011_09_26_drive_0093/image_02/data/0000000240.png 414 | city/2011_09_26_drive_0093/image_02/data/0000000256.png 415 | city/2011_09_26_drive_0093/image_02/data/0000000305.png 416 | city/2011_09_26_drive_0093/image_02/data/0000000321.png 417 | city/2011_09_26_drive_0093/image_02/data/0000000337.png 418 | city/2011_09_26_drive_0093/image_02/data/0000000353.png 419 | city/2011_09_26_drive_0093/image_02/data/0000000369.png 420 | city/2011_09_26_drive_0093/image_02/data/0000000385.png 421 | city/2011_09_26_drive_0093/image_02/data/0000000401.png 422 | city/2011_09_26_drive_0093/image_02/data/0000000417.png 423 | city/2011_09_26_drive_0096/image_02/data/0000000000.png 424 | city/2011_09_26_drive_0096/image_02/data/0000000019.png 425 | city/2011_09_26_drive_0096/image_02/data/0000000038.png 426 | city/2011_09_26_drive_0096/image_02/data/0000000057.png 427 | city/2011_09_26_drive_0096/image_02/data/0000000076.png 428 | city/2011_09_26_drive_0096/image_02/data/0000000095.png 429 | city/2011_09_26_drive_0096/image_02/data/0000000114.png 430 | city/2011_09_26_drive_0096/image_02/data/0000000133.png 431 | city/2011_09_26_drive_0096/image_02/data/0000000152.png 432 | city/2011_09_26_drive_0096/image_02/data/0000000171.png 433 | city/2011_09_26_drive_0096/image_02/data/0000000190.png 434 | city/2011_09_26_drive_0096/image_02/data/0000000209.png 435 | city/2011_09_26_drive_0096/image_02/data/0000000228.png 436 | city/2011_09_26_drive_0096/image_02/data/0000000247.png 437 | city/2011_09_26_drive_0096/image_02/data/0000000266.png 438 | city/2011_09_26_drive_0096/image_02/data/0000000285.png 439 | city/2011_09_26_drive_0096/image_02/data/0000000304.png 440 | city/2011_09_26_drive_0096/image_02/data/0000000323.png 441 | city/2011_09_26_drive_0096/image_02/data/0000000342.png 442 | city/2011_09_26_drive_0096/image_02/data/0000000361.png 443 | city/2011_09_26_drive_0096/image_02/data/0000000380.png 444 | city/2011_09_26_drive_0096/image_02/data/0000000399.png 445 | city/2011_09_26_drive_0096/image_02/data/0000000418.png 446 | city/2011_09_26_drive_0096/image_02/data/0000000437.png 447 | city/2011_09_26_drive_0096/image_02/data/0000000456.png 448 | road/2011_09_26_drive_0101/image_02/data/0000000692.png 449 | road/2011_09_26_drive_0101/image_02/data/0000000930.png 450 | road/2011_09_26_drive_0101/image_02/data/0000000760.png 451 | road/2011_09_26_drive_0101/image_02/data/0000000896.png 452 | road/2011_09_26_drive_0101/image_02/data/0000000284.png 453 | road/2011_09_26_drive_0101/image_02/data/0000000148.png 454 | 
road/2011_09_26_drive_0101/image_02/data/0000000522.png 455 | road/2011_09_26_drive_0101/image_02/data/0000000794.png 456 | road/2011_09_26_drive_0101/image_02/data/0000000624.png 457 | road/2011_09_26_drive_0101/image_02/data/0000000726.png 458 | road/2011_09_26_drive_0101/image_02/data/0000000216.png 459 | road/2011_09_26_drive_0101/image_02/data/0000000318.png 460 | road/2011_09_26_drive_0101/image_02/data/0000000488.png 461 | road/2011_09_26_drive_0101/image_02/data/0000000590.png 462 | road/2011_09_26_drive_0101/image_02/data/0000000454.png 463 | road/2011_09_26_drive_0101/image_02/data/0000000862.png 464 | road/2011_09_26_drive_0101/image_02/data/0000000386.png 465 | road/2011_09_26_drive_0101/image_02/data/0000000352.png 466 | road/2011_09_26_drive_0101/image_02/data/0000000420.png 467 | road/2011_09_26_drive_0101/image_02/data/0000000658.png 468 | road/2011_09_26_drive_0101/image_02/data/0000000828.png 469 | road/2011_09_26_drive_0101/image_02/data/0000000556.png 470 | road/2011_09_26_drive_0101/image_02/data/0000000114.png 471 | road/2011_09_26_drive_0101/image_02/data/0000000182.png 472 | road/2011_09_26_drive_0101/image_02/data/0000000080.png 473 | city/2011_09_26_drive_0106/image_02/data/0000000015.png 474 | city/2011_09_26_drive_0106/image_02/data/0000000035.png 475 | city/2011_09_26_drive_0106/image_02/data/0000000043.png 476 | city/2011_09_26_drive_0106/image_02/data/0000000051.png 477 | city/2011_09_26_drive_0106/image_02/data/0000000059.png 478 | city/2011_09_26_drive_0106/image_02/data/0000000067.png 479 | city/2011_09_26_drive_0106/image_02/data/0000000075.png 480 | city/2011_09_26_drive_0106/image_02/data/0000000083.png 481 | city/2011_09_26_drive_0106/image_02/data/0000000091.png 482 | city/2011_09_26_drive_0106/image_02/data/0000000099.png 483 | city/2011_09_26_drive_0106/image_02/data/0000000107.png 484 | city/2011_09_26_drive_0106/image_02/data/0000000115.png 485 | city/2011_09_26_drive_0106/image_02/data/0000000123.png 486 | city/2011_09_26_drive_0106/image_02/data/0000000131.png 487 | city/2011_09_26_drive_0106/image_02/data/0000000139.png 488 | city/2011_09_26_drive_0106/image_02/data/0000000147.png 489 | city/2011_09_26_drive_0106/image_02/data/0000000155.png 490 | city/2011_09_26_drive_0106/image_02/data/0000000163.png 491 | city/2011_09_26_drive_0106/image_02/data/0000000171.png 492 | city/2011_09_26_drive_0106/image_02/data/0000000179.png 493 | city/2011_09_26_drive_0106/image_02/data/0000000187.png 494 | city/2011_09_26_drive_0106/image_02/data/0000000195.png 495 | city/2011_09_26_drive_0106/image_02/data/0000000203.png 496 | city/2011_09_26_drive_0106/image_02/data/0000000211.png 497 | city/2011_09_26_drive_0106/image_02/data/0000000219.png 498 | city/2011_09_26_drive_0117/image_02/data/0000000312.png 499 | city/2011_09_26_drive_0117/image_02/data/0000000494.png 500 | city/2011_09_26_drive_0117/image_02/data/0000000104.png 501 | city/2011_09_26_drive_0117/image_02/data/0000000130.png 502 | city/2011_09_26_drive_0117/image_02/data/0000000156.png 503 | city/2011_09_26_drive_0117/image_02/data/0000000182.png 504 | city/2011_09_26_drive_0117/image_02/data/0000000598.png 505 | city/2011_09_26_drive_0117/image_02/data/0000000416.png 506 | city/2011_09_26_drive_0117/image_02/data/0000000364.png 507 | city/2011_09_26_drive_0117/image_02/data/0000000026.png 508 | city/2011_09_26_drive_0117/image_02/data/0000000078.png 509 | city/2011_09_26_drive_0117/image_02/data/0000000572.png 510 | city/2011_09_26_drive_0117/image_02/data/0000000468.png 511 | 
city/2011_09_26_drive_0117/image_02/data/0000000260.png 512 | city/2011_09_26_drive_0117/image_02/data/0000000624.png 513 | city/2011_09_26_drive_0117/image_02/data/0000000234.png 514 | city/2011_09_26_drive_0117/image_02/data/0000000442.png 515 | city/2011_09_26_drive_0117/image_02/data/0000000390.png 516 | city/2011_09_26_drive_0117/image_02/data/0000000546.png 517 | city/2011_09_26_drive_0117/image_02/data/0000000286.png 518 | city/2011_09_26_drive_0117/image_02/data/0000000000.png 519 | city/2011_09_26_drive_0117/image_02/data/0000000338.png 520 | city/2011_09_26_drive_0117/image_02/data/0000000208.png 521 | city/2011_09_26_drive_0117/image_02/data/0000000650.png 522 | city/2011_09_26_drive_0117/image_02/data/0000000052.png 523 | city/2011_09_28_drive_0002/image_02/data/0000000024.png 524 | city/2011_09_28_drive_0002/image_02/data/0000000021.png 525 | city/2011_09_28_drive_0002/image_02/data/0000000036.png 526 | city/2011_09_28_drive_0002/image_02/data/0000000000.png 527 | city/2011_09_28_drive_0002/image_02/data/0000000051.png 528 | city/2011_09_28_drive_0002/image_02/data/0000000018.png 529 | city/2011_09_28_drive_0002/image_02/data/0000000033.png 530 | city/2011_09_28_drive_0002/image_02/data/0000000090.png 531 | city/2011_09_28_drive_0002/image_02/data/0000000045.png 532 | city/2011_09_28_drive_0002/image_02/data/0000000054.png 533 | city/2011_09_28_drive_0002/image_02/data/0000000012.png 534 | city/2011_09_28_drive_0002/image_02/data/0000000039.png 535 | city/2011_09_28_drive_0002/image_02/data/0000000009.png 536 | city/2011_09_28_drive_0002/image_02/data/0000000003.png 537 | city/2011_09_28_drive_0002/image_02/data/0000000030.png 538 | city/2011_09_28_drive_0002/image_02/data/0000000078.png 539 | city/2011_09_28_drive_0002/image_02/data/0000000060.png 540 | city/2011_09_28_drive_0002/image_02/data/0000000048.png 541 | city/2011_09_28_drive_0002/image_02/data/0000000084.png 542 | city/2011_09_28_drive_0002/image_02/data/0000000081.png 543 | city/2011_09_28_drive_0002/image_02/data/0000000006.png 544 | city/2011_09_28_drive_0002/image_02/data/0000000057.png 545 | city/2011_09_28_drive_0002/image_02/data/0000000072.png 546 | city/2011_09_28_drive_0002/image_02/data/0000000087.png 547 | city/2011_09_28_drive_0002/image_02/data/0000000063.png 548 | city/2011_09_29_drive_0071/image_02/data/0000000252.png 549 | city/2011_09_29_drive_0071/image_02/data/0000000540.png 550 | city/2011_09_29_drive_0071/image_02/data/0000001054.png 551 | city/2011_09_29_drive_0071/image_02/data/0000000036.png 552 | city/2011_09_29_drive_0071/image_02/data/0000000360.png 553 | city/2011_09_29_drive_0071/image_02/data/0000000807.png 554 | city/2011_09_29_drive_0071/image_02/data/0000000879.png 555 | city/2011_09_29_drive_0071/image_02/data/0000000288.png 556 | city/2011_09_29_drive_0071/image_02/data/0000000771.png 557 | city/2011_09_29_drive_0071/image_02/data/0000000000.png 558 | city/2011_09_29_drive_0071/image_02/data/0000000216.png 559 | city/2011_09_29_drive_0071/image_02/data/0000000951.png 560 | city/2011_09_29_drive_0071/image_02/data/0000000324.png 561 | city/2011_09_29_drive_0071/image_02/data/0000000432.png 562 | city/2011_09_29_drive_0071/image_02/data/0000000504.png 563 | city/2011_09_29_drive_0071/image_02/data/0000000576.png 564 | city/2011_09_29_drive_0071/image_02/data/0000000108.png 565 | city/2011_09_29_drive_0071/image_02/data/0000000180.png 566 | city/2011_09_29_drive_0071/image_02/data/0000000072.png 567 | city/2011_09_29_drive_0071/image_02/data/0000000612.png 568 | 
city/2011_09_29_drive_0071/image_02/data/0000000915.png 569 | city/2011_09_29_drive_0071/image_02/data/0000000735.png 570 | city/2011_09_29_drive_0071/image_02/data/0000000144.png 571 | city/2011_09_29_drive_0071/image_02/data/0000000396.png 572 | city/2011_09_29_drive_0071/image_02/data/0000000468.png 573 | road/2011_09_30_drive_0016/image_02/data/0000000132.png 574 | road/2011_09_30_drive_0016/image_02/data/0000000011.png 575 | road/2011_09_30_drive_0016/image_02/data/0000000154.png 576 | road/2011_09_30_drive_0016/image_02/data/0000000022.png 577 | road/2011_09_30_drive_0016/image_02/data/0000000242.png 578 | road/2011_09_30_drive_0016/image_02/data/0000000198.png 579 | road/2011_09_30_drive_0016/image_02/data/0000000176.png 580 | road/2011_09_30_drive_0016/image_02/data/0000000231.png 581 | road/2011_09_30_drive_0016/image_02/data/0000000275.png 582 | road/2011_09_30_drive_0016/image_02/data/0000000220.png 583 | road/2011_09_30_drive_0016/image_02/data/0000000088.png 584 | road/2011_09_30_drive_0016/image_02/data/0000000143.png 585 | road/2011_09_30_drive_0016/image_02/data/0000000055.png 586 | road/2011_09_30_drive_0016/image_02/data/0000000033.png 587 | road/2011_09_30_drive_0016/image_02/data/0000000187.png 588 | road/2011_09_30_drive_0016/image_02/data/0000000110.png 589 | road/2011_09_30_drive_0016/image_02/data/0000000044.png 590 | road/2011_09_30_drive_0016/image_02/data/0000000077.png 591 | road/2011_09_30_drive_0016/image_02/data/0000000066.png 592 | road/2011_09_30_drive_0016/image_02/data/0000000000.png 593 | road/2011_09_30_drive_0016/image_02/data/0000000165.png 594 | road/2011_09_30_drive_0016/image_02/data/0000000264.png 595 | road/2011_09_30_drive_0016/image_02/data/0000000253.png 596 | road/2011_09_30_drive_0016/image_02/data/0000000209.png 597 | road/2011_09_30_drive_0016/image_02/data/0000000121.png 598 | residential/2011_09_30_drive_0018/image_02/data/0000000107.png 599 | residential/2011_09_30_drive_0018/image_02/data/0000002247.png 600 | residential/2011_09_30_drive_0018/image_02/data/0000001391.png 601 | residential/2011_09_30_drive_0018/image_02/data/0000000535.png 602 | residential/2011_09_30_drive_0018/image_02/data/0000001819.png 603 | residential/2011_09_30_drive_0018/image_02/data/0000001177.png 604 | residential/2011_09_30_drive_0018/image_02/data/0000000428.png 605 | residential/2011_09_30_drive_0018/image_02/data/0000001926.png 606 | residential/2011_09_30_drive_0018/image_02/data/0000000749.png 607 | residential/2011_09_30_drive_0018/image_02/data/0000001284.png 608 | residential/2011_09_30_drive_0018/image_02/data/0000002140.png 609 | residential/2011_09_30_drive_0018/image_02/data/0000001605.png 610 | residential/2011_09_30_drive_0018/image_02/data/0000001498.png 611 | residential/2011_09_30_drive_0018/image_02/data/0000000642.png 612 | residential/2011_09_30_drive_0018/image_02/data/0000002740.png 613 | residential/2011_09_30_drive_0018/image_02/data/0000002419.png 614 | residential/2011_09_30_drive_0018/image_02/data/0000000856.png 615 | residential/2011_09_30_drive_0018/image_02/data/0000002526.png 616 | residential/2011_09_30_drive_0018/image_02/data/0000001712.png 617 | residential/2011_09_30_drive_0018/image_02/data/0000001070.png 618 | residential/2011_09_30_drive_0018/image_02/data/0000000000.png 619 | residential/2011_09_30_drive_0018/image_02/data/0000002033.png 620 | residential/2011_09_30_drive_0018/image_02/data/0000000214.png 621 | residential/2011_09_30_drive_0018/image_02/data/0000000963.png 622 | 
residential/2011_09_30_drive_0018/image_02/data/0000002633.png 623 | residential/2011_09_30_drive_0027/image_02/data/0000000533.png 624 | residential/2011_09_30_drive_0027/image_02/data/0000001040.png 625 | residential/2011_09_30_drive_0027/image_02/data/0000000082.png 626 | residential/2011_09_30_drive_0027/image_02/data/0000000205.png 627 | residential/2011_09_30_drive_0027/image_02/data/0000000835.png 628 | residential/2011_09_30_drive_0027/image_02/data/0000000451.png 629 | residential/2011_09_30_drive_0027/image_02/data/0000000164.png 630 | residential/2011_09_30_drive_0027/image_02/data/0000000794.png 631 | residential/2011_09_30_drive_0027/image_02/data/0000000328.png 632 | residential/2011_09_30_drive_0027/image_02/data/0000000615.png 633 | residential/2011_09_30_drive_0027/image_02/data/0000000917.png 634 | residential/2011_09_30_drive_0027/image_02/data/0000000369.png 635 | residential/2011_09_30_drive_0027/image_02/data/0000000287.png 636 | residential/2011_09_30_drive_0027/image_02/data/0000000123.png 637 | residential/2011_09_30_drive_0027/image_02/data/0000000876.png 638 | residential/2011_09_30_drive_0027/image_02/data/0000000410.png 639 | residential/2011_09_30_drive_0027/image_02/data/0000000492.png 640 | residential/2011_09_30_drive_0027/image_02/data/0000000958.png 641 | residential/2011_09_30_drive_0027/image_02/data/0000000656.png 642 | residential/2011_09_30_drive_0027/image_02/data/0000000000.png 643 | residential/2011_09_30_drive_0027/image_02/data/0000000753.png 644 | residential/2011_09_30_drive_0027/image_02/data/0000000574.png 645 | residential/2011_09_30_drive_0027/image_02/data/0000001081.png 646 | residential/2011_09_30_drive_0027/image_02/data/0000000041.png 647 | residential/2011_09_30_drive_0027/image_02/data/0000000246.png 648 | residential/2011_10_03_drive_0027/image_02/data/0000002906.png 649 | residential/2011_10_03_drive_0027/image_02/data/0000002544.png 650 | residential/2011_10_03_drive_0027/image_02/data/0000000362.png 651 | residential/2011_10_03_drive_0027/image_02/data/0000004535.png 652 | residential/2011_10_03_drive_0027/image_02/data/0000000734.png 653 | residential/2011_10_03_drive_0027/image_02/data/0000001096.png 654 | residential/2011_10_03_drive_0027/image_02/data/0000004173.png 655 | residential/2011_10_03_drive_0027/image_02/data/0000000543.png 656 | residential/2011_10_03_drive_0027/image_02/data/0000001277.png 657 | residential/2011_10_03_drive_0027/image_02/data/0000004354.png 658 | residential/2011_10_03_drive_0027/image_02/data/0000001458.png 659 | residential/2011_10_03_drive_0027/image_02/data/0000001820.png 660 | residential/2011_10_03_drive_0027/image_02/data/0000003449.png 661 | residential/2011_10_03_drive_0027/image_02/data/0000003268.png 662 | residential/2011_10_03_drive_0027/image_02/data/0000000915.png 663 | residential/2011_10_03_drive_0027/image_02/data/0000002363.png 664 | residential/2011_10_03_drive_0027/image_02/data/0000002725.png 665 | residential/2011_10_03_drive_0027/image_02/data/0000000181.png 666 | residential/2011_10_03_drive_0027/image_02/data/0000001639.png 667 | residential/2011_10_03_drive_0027/image_02/data/0000003992.png 668 | residential/2011_10_03_drive_0027/image_02/data/0000003087.png 669 | residential/2011_10_03_drive_0027/image_02/data/0000002001.png 670 | residential/2011_10_03_drive_0027/image_02/data/0000003811.png 671 | residential/2011_10_03_drive_0027/image_02/data/0000003630.png 672 | residential/2011_10_03_drive_0027/image_02/data/0000000000.png 673 | 
road/2011_10_03_drive_0047/image_02/data/0000000096.png 674 | road/2011_10_03_drive_0047/image_02/data/0000000800.png 675 | road/2011_10_03_drive_0047/image_02/data/0000000320.png 676 | road/2011_10_03_drive_0047/image_02/data/0000000576.png 677 | road/2011_10_03_drive_0047/image_02/data/0000000000.png 678 | road/2011_10_03_drive_0047/image_02/data/0000000480.png 679 | road/2011_10_03_drive_0047/image_02/data/0000000640.png 680 | road/2011_10_03_drive_0047/image_02/data/0000000032.png 681 | road/2011_10_03_drive_0047/image_02/data/0000000384.png 682 | road/2011_10_03_drive_0047/image_02/data/0000000160.png 683 | road/2011_10_03_drive_0047/image_02/data/0000000704.png 684 | road/2011_10_03_drive_0047/image_02/data/0000000736.png 685 | road/2011_10_03_drive_0047/image_02/data/0000000672.png 686 | road/2011_10_03_drive_0047/image_02/data/0000000064.png 687 | road/2011_10_03_drive_0047/image_02/data/0000000288.png 688 | road/2011_10_03_drive_0047/image_02/data/0000000352.png 689 | road/2011_10_03_drive_0047/image_02/data/0000000512.png 690 | road/2011_10_03_drive_0047/image_02/data/0000000544.png 691 | road/2011_10_03_drive_0047/image_02/data/0000000608.png 692 | road/2011_10_03_drive_0047/image_02/data/0000000128.png 693 | road/2011_10_03_drive_0047/image_02/data/0000000224.png 694 | road/2011_10_03_drive_0047/image_02/data/0000000416.png 695 | road/2011_10_03_drive_0047/image_02/data/0000000192.png 696 | road/2011_10_03_drive_0047/image_02/data/0000000448.png 697 | road/2011_10_03_drive_0047/image_02/data/0000000768.png -------------------------------------------------------------------------------- /experiments/depth/solver.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "experiments/depth/train.prototxt" 2 | display: 20 3 | average_loss: 20 4 | lr_policy: "step" 5 | gamma: 0.1 # drop the learning rate by a factor of 10 6 | # (i.e., multiply it by a factor of gamma = 0.1) 7 | 8 | stepsize: 80000 # drop the learning rate every 80K iterations 9 | base_lr: 1e-3 10 | # high momentum 11 | momentum: 0.9 12 | momentum2: 0.999 13 | delta: 1e-8 14 | # no gradient accumulation 15 | iter_size: 1 16 | max_iter: 200000 17 | weight_decay: 0.0001 18 | snapshot: 10000 19 | snapshot_prefix: "snapshots/depth/train" 20 | test_initialization: false 21 | type: "Adam" 22 | solver_mode: GPU 23 | -------------------------------------------------------------------------------- /experiments/depth/train.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | 3 | # Create a folder for snapshots.
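# (Caffe writes snapshots into this folder as train_iter_<N>.caffemodel and
# train_iter_<N>.solverstate, following snapshot_prefix "snapshots/depth/train"
# in experiments/depth/solver.prototxt.)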
4 | mkdir -p snapshots/depth 5 | 6 | TOOLS=$YOUR_CAFFE_DIR/build/tools 7 | # TOOLS=/home/hyzhan/caffe/build/tools 8 | $TOOLS/caffe train\ 9 | --solver ./experiments/depth/solver.prototxt\ 10 | --weights models/resnet_50_1by2.caffemodel\ 11 | --gpu 0 -------------------------------------------------------------------------------- /experiments/depth_feature/solver.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "experiments/depth_feature/train.prototxt" 2 | display: 20 3 | average_loss: 20 4 | lr_policy: "step" 5 | gamma: 0.1 # drop the learning rate by a factor of 10 6 | # (i.e., multiply it by a factor of gamma = 0.1) 7 | 8 | stepsize: 80000 # drop the learning rate every 80K iterations 9 | base_lr: 1e-3 10 | # high momentum 11 | momentum: 0.9 12 | momentum2: 0.999 13 | delta: 1e-8 14 | # no gradient accumulation 15 | iter_size: 1 16 | max_iter: 200000 17 | weight_decay: 0.0001 18 | snapshot: 10000 19 | snapshot_prefix: "snapshots/depth_feature/train" 20 | test_initialization: false 21 | type: "Adam" 22 | solver_mode: GPU 23 | -------------------------------------------------------------------------------- /experiments/depth_feature/train.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | 3 | # Create a folder for snapshots. 4 | mkdir -p snapshots/depth_feature 5 | 6 | TOOLS=$YOUR_CAFFE_DIR/build/tools 7 | $TOOLS/caffe train\ 8 | --solver experiments/depth_feature/solver.prototxt\ 9 | --gpu 0\ 10 | --weights models/feature_extractor/feat_extractor_KITTI_Feat.caffemodel,snapshots/depth/train_iter_200000.caffemodel -------------------------------------------------------------------------------- /experiments/depth_odometry/solver.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "experiments/depth_odometry/train.prototxt" 2 | display: 20 3 | average_loss: 20 4 | lr_policy: "step" 5 | gamma: 0.1 # drop the learning rate by a factor of 10 6 | # (i.e., multiply it by a factor of gamma = 0.1) 7 | 8 | stepsize: 80000 # drop the learning rate every 80K iterations 9 | base_lr: 1e-3 10 | # high momentum 11 | momentum: 0.9 12 | momentum2: 0.999 13 | delta: 1e-8 14 | # no gradient accumulation 15 | iter_size: 1 16 | max_iter: 200000 17 | weight_decay: 0.0001 18 | snapshot: 10000 19 | snapshot_prefix: "snapshots/depth_odometry/train" 20 | test_initialization: false 21 | type: "Adam" 22 | solver_mode: GPU 23 | -------------------------------------------------------------------------------- /experiments/depth_odometry/train.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | 3 | # Create a folder for snapshots.
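# (This stage starts from the depth-only model: --weights below points at
# snapshots/depth/train_iter_200000.caffemodel, the final snapshot produced by
# the depth experiment above, i.e. max_iter 200000 of its solver.)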
4 | mkdir -p snapshots/depth_odometry 5 | 6 | TOOLS=$YOUR_CAFFE_DIR/build/tools 7 | $TOOLS/caffe train\ 8 | --solver experiments/depth_odometry/solver.prototxt\ 9 | --gpu 0\ 10 | --weights snapshots/depth/train_iter_200000.caffemodel -------------------------------------------------------------------------------- /experiments/depth_odometry_feature/solver.prototxt: -------------------------------------------------------------------------------- 1 | train_net: "experiments/depth_odometry_feature/train.prototxt" 2 | display: 20 3 | average_loss: 20 4 | lr_policy: "step" 5 | gamma: 0.1 # drop the learning rate by a factor of 10 6 | # (i.e., multiply it by a factor of gamma = 0.1) 7 | 8 | stepsize: 80000 # drop the learning rate every 80K iterations 9 | base_lr: 1e-3 10 | # high momentum 11 | momentum: 0.9 12 | momentum2: 0.999 13 | delta: 1e-8 14 | # no gradient accumulation 15 | iter_size: 1 16 | max_iter: 200000 17 | weight_decay: 0.0001 18 | snapshot: 10000 19 | snapshot_prefix: "snapshots/depth_odometry_feature/train" 20 | test_initialization: false 21 | type: "Adam" 22 | solver_mode: GPU 23 | -------------------------------------------------------------------------------- /experiments/depth_odometry_feature/train.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env sh 2 | 3 | # Create a folder for snapshots. 4 | mkdir -p snapshots/depth_odometry_feature 5 | 6 | TOOLS=$YOUR_CAFFE_DIR/build/tools 7 | $TOOLS/caffe train\ 8 | --solver experiments/depth_odometry_feature/solver.prototxt\ 9 | --gpu 0 10 | # --weights snapshots/depth_odometry/train_iter_200000.caffemodel,model/weerasekera_nyu.caffemodel -------------------------------------------------------------------------------- /experiments/networks/odometry_deploy.prototxt: -------------------------------------------------------------------------------- 1 | # Visual odometry net 2 | 3 | # ---------------------------------------------------------------------- 4 | # Data input 5 | # ---------------------------------------------------------------------- 6 | input: "imgs" 7 | input_dim: 1 8 | input_dim: 6 # first 3 channels: I2; last 3 channels: I1 9 | input_dim: 160 10 | input_dim: 608 11 | 12 | # ---------------------------------------------------------------------- 13 | # Pose Network 14 | # ---------------------------------------------------------------------- 15 | 16 | layer { 17 | name: "conv_0_pose" 18 | type: "Convolution" 19 | bottom: "imgs" 20 | top: "conv_0_pose" 21 | param { 22 | lr_mult: 1 23 | decay_mult: 1 24 | } 25 | param { 26 | lr_mult: 2 27 | decay_mult: 0 28 | } 29 | convolution_param { 30 | num_output: 16 31 | kernel_size: 7 32 | stride: 2 33 | pad: 3 34 | weight_filler { 35 | type: "xavier" 36 | } 37 | bias_filler { 38 | type: "constant" 39 | value: 0 40 | } 41 | } 42 | } 43 | 44 | layer { 45 | name: "relu_0_pose" 46 | type: "ReLU" 47 | bottom: "conv_0_pose" 48 | top: "conv_0_pose" 49 | } 50 | 51 | 52 | layer { 53 | name: "conv_1_pose" 54 | type: "Convolution" 55 | bottom: "conv_0_pose" 56 | top: "conv_1_pose" 57 | param { 58 | lr_mult: 1 59 | decay_mult: 1 60 | } 61 | param { 62 | lr_mult: 2 63 | decay_mult: 0 64 | } 65 | convolution_param { 66 | num_output: 32 67 | kernel_size: 5 68 | stride: 2 69 | pad: 2 70 | weight_filler { 71 | type: "xavier" 72 | } 73 | bias_filler { 74 | type: "constant" 75 | value: 0 76 | } 77 | } 78 | } 79 | 80 | layer { 81 | name: "relu_1_pose" 82 | type: "ReLU" 83 | bottom: "conv_1_pose" 84 | top: "conv_1_pose" 85 | } 86 | 87 | 88 | layer { 89 | 
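# conv_2_pose .. conv_5_pose below repeat the stride-2 pattern of conv_0/conv_1,
# halving the feature map resolution at each step while widening the channels
# (64 -> 128 -> 256 -> 256) before the fully connected pose regressor.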
name: "conv_2_pose" 90 | type: "Convolution" 91 | bottom: "conv_1_pose" 92 | top: "conv_2_pose" 93 | param { 94 | lr_mult: 1 95 | decay_mult: 1 96 | } 97 | param { 98 | lr_mult: 2 99 | decay_mult: 0 100 | } 101 | convolution_param { 102 | num_output: 64 103 | kernel_size: 3 104 | stride: 2 105 | pad: 1 106 | weight_filler { 107 | type: "xavier" 108 | } 109 | bias_filler { 110 | type: "constant" 111 | value: 0 112 | } 113 | } 114 | } 115 | 116 | layer { 117 | name: "relu_2_pose" 118 | type: "ReLU" 119 | bottom: "conv_2_pose" 120 | top: "conv_2_pose" 121 | } 122 | 123 | layer { 124 | name: "conv_3_pose" 125 | type: "Convolution" 126 | bottom: "conv_2_pose" 127 | top: "conv_3_pose" 128 | param { 129 | lr_mult: 1 130 | decay_mult: 1 131 | } 132 | param { 133 | lr_mult: 2 134 | decay_mult: 0 135 | } 136 | convolution_param { 137 | num_output: 128 138 | kernel_size: 3 139 | stride: 2 140 | pad: 1 141 | weight_filler { 142 | type: "xavier" 143 | } 144 | bias_filler { 145 | type: "constant" 146 | value: 0 147 | } 148 | } 149 | } 150 | 151 | layer { 152 | name: "relu_3_pose" 153 | type: "ReLU" 154 | bottom: "conv_3_pose" 155 | top: "conv_3_pose" 156 | } 157 | 158 | layer { 159 | name: "conv_4_pose" 160 | type: "Convolution" 161 | bottom: "conv_3_pose" 162 | top: "conv_4_pose" 163 | param { 164 | lr_mult: 1 165 | decay_mult: 1 166 | } 167 | param { 168 | lr_mult: 2 169 | decay_mult: 0 170 | } 171 | convolution_param { 172 | num_output: 256 173 | kernel_size: 3 174 | stride: 2 175 | pad: 1 176 | weight_filler { 177 | type: "xavier" 178 | } 179 | bias_filler { 180 | type: "constant" 181 | value: 0 182 | } 183 | } 184 | } 185 | 186 | layer { 187 | name: "relu_4_pose" 188 | type: "ReLU" 189 | bottom: "conv_4_pose" 190 | top: "conv_4_pose" 191 | } 192 | 193 | layer { 194 | name: "conv_5_pose" 195 | type: "Convolution" 196 | bottom: "conv_4_pose" 197 | top: "conv_5_pose" 198 | param { 199 | lr_mult: 1 200 | decay_mult: 1 201 | } 202 | param { 203 | lr_mult: 2 204 | decay_mult: 0 205 | } 206 | convolution_param { 207 | num_output: 256 208 | kernel_size: 3 209 | stride: 2 210 | pad: 1 211 | weight_filler { 212 | type: "xavier" 213 | } 214 | bias_filler { 215 | type: "constant" 216 | value: 0 217 | } 218 | } 219 | } 220 | 221 | layer { 222 | name: "relu_5_pose" 223 | type: "ReLU" 224 | bottom: "conv_5_pose" 225 | top: "conv_5_pose" 226 | } 227 | 228 | layer { 229 | name: "fc_0_pose" 230 | type: "InnerProduct" 231 | bottom: "conv_5_pose" 232 | top: "fc_0_pose" 233 | param { lr_mult: 1 decay_mult: 1 } 234 | param { lr_mult: 2 decay_mult: 0 } 235 | inner_product_param { 236 | num_output: 512 237 | weight_filler { 238 | type: "xavier" 239 | } 240 | bias_filler { 241 | type: "constant" 242 | value: 0 243 | } 244 | } 245 | } 246 | 247 | layer { 248 | name: "relu_fc_0_pose" 249 | type: "ReLU" 250 | bottom: "fc_0_pose" 251 | top: "fc_0_pose" 252 | } 253 | 254 | layer { 255 | name: "fc_1_pose" 256 | type: "InnerProduct" 257 | bottom: "fc_0_pose" 258 | top: "fc_1_pose" 259 | param { lr_mult: 1 decay_mult: 1 } 260 | param { lr_mult: 2 decay_mult: 0 } 261 | inner_product_param { 262 | num_output: 512 263 | weight_filler { 264 | type: "xavier" 265 | } 266 | bias_filler { 267 | type: "constant" 268 | value: 0 269 | } 270 | } 271 | } 272 | 273 | layer { 274 | name: "relu_fc_1_pose" 275 | type: "ReLU" 276 | bottom: "fc_1_pose" 277 | top: "fc_1_pose" 278 | } 279 | 280 | 281 | layer { 282 | name: "temporal_pose_0" 283 | type: "InnerProduct" 284 | bottom: "fc_1_pose" 285 | top: "temporal_pose_0" 286 | param { lr_mult: 0.1 
decay_mult: 1 } 287 | param { lr_mult: 0.2 decay_mult: 0 } 288 | inner_product_param { 289 | num_output: 6 290 | bias_filler { 291 | type: "constant" 292 | value: 0 293 | } 294 | } 295 | } 296 | 297 | 298 | layer { 299 | name: "temporal_pose" 300 | type: "Reshape" 301 | bottom: "temporal_pose_0" 302 | top: "T_2to1" # transformation from t2 to t1 303 | reshape_param { 304 | shape { 305 | dim: 0 # copy the dimension from the bottom blob 306 | dim: 0 307 | dim: 1 308 | dim: 1 # fixed singleton dimensions: (N,6) is reshaped to (N,6,1,1) 309 | } 310 | } 311 | } 312 | 313 | # ---------------------------------------------------------------------- 314 | # Geometry 315 | # ---------------------------------------------------------------------- 316 | 317 | layer { 318 | name: "SE3" 319 | type: "Python" 320 | bottom: "T_2to1" 321 | top: "SE3" 322 | python_param { 323 | module: "pygeometry" 324 | layer: "SE3_Generator_KITTI" 325 | } 326 | } 327 | -------------------------------------------------------------------------------- /tools/eval_depth.py: -------------------------------------------------------------------------------- 1 | # Mostly based on the code written by Clement Godard: 2 | # https://github.com/mrharicot/monodepth/blob/master/utils/evaluate_kitti.py 3 | 4 | import numpy as np 5 | import cv2 6 | import argparse 7 | from eval_depth_utils import * 8 | 9 | parser = argparse.ArgumentParser(description='Evaluation on the KITTI dataset') 10 | parser.add_argument('--split', type=str, help='data split, kitti or eigen', required=True) 11 | parser.add_argument('--predicted_inv_depth_path', type=str, help='path to estimated disparities', required=True) 12 | parser.add_argument('--gt_path', type=str, help='path to ground truth disparities', required=True) 13 | parser.add_argument('--min_depth', type=float, help='minimum depth for evaluation', default=1e-3) 14 | parser.add_argument('--max_depth', type=float, help='maximum depth for evaluation', default=80) 15 | parser.add_argument('--eigen_crop', help='if set, crops according to Eigen NIPS14', action='store_true') 16 | parser.add_argument('--garg_crop', help='if set, crops according to Garg ECCV16', action='store_true') 17 | 18 | args = parser.parse_args() 19 | 20 | if __name__ == '__main__': 21 | 22 | pred_disparities = np.load(args.predicted_inv_depth_path) 23 | 24 | if args.split == 'kitti': 25 | num_samples = 200 26 | 27 | gt_disparities = load_gt_disp_kitti(args.gt_path) 28 | gt_depths, pred_depths, pred_disparities_resized = convert_disps_to_depths_kitti(gt_disparities, pred_disparities) 29 | 30 | elif args.split == 'eigen': 31 | num_samples = 697 32 | test_files = read_text_lines('./data/depth_evaluation/kitti_eigen/test_files_eigen.txt') 33 | gt_files, gt_calib, im_sizes, im_files, cams = read_file_data(test_files, args.gt_path) 34 | 35 | num_test = len(im_files) 36 | gt_depths = [] 37 | pred_depths = [] 38 | 39 | print "Getting ground truth depths and predicted depths..."
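# For each Eigen test frame below: generate_depth_map (eval_depth_utils.py) projects
# the raw velodyne scan through the KITTI calibration to build a sparse ground-truth
# depth map, and the network's inverse-depth prediction is resized back to the
# original image resolution before being inverted into depth.
# A typical invocation of this script (example paths, assuming predictions were
# saved via tools/evaluation_tools.py):
#   python tools/eval_depth.py --split eigen --garg_crop \
#     --predicted_inv_depth_path ./result/depth/depths.npy \
#     --gt_path KITTI_RAW_DATA_DIR/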
40 | for t_id in range(num_samples): 41 | camera_id = cams[t_id] # 2 is left, 3 is right 42 | depth = generate_depth_map(gt_calib[t_id], gt_files[t_id], im_sizes[t_id], camera_id, False, True) 43 | gt_depths.append(depth.astype(np.float32)) 44 | 45 | inv_depth_pred = cv2.resize(pred_disparities[t_id], (im_sizes[t_id][1], im_sizes[t_id][0]), interpolation=cv2.INTER_LINEAR) 46 | 47 | 48 | # ---------------------------------------------------------------------- 49 | # Convert disparity into depth 50 | # ---------------------------------------------------------------------- 51 | # inv_depth_pred = inv_depth_pred * inv_depth_pred.shape[1] 52 | # focal_length, baseline = get_focal_length_baseline(gt_calib[t_id], camera_id) 53 | # depth_pred = (baseline * focal_length) / (inv_depth_pred+1e-4) 54 | 55 | # ---------------------------------------------------------------------- 56 | # Convert inverse depth to depth 57 | # ---------------------------------------------------------------------- 58 | depth_pred = 1.0 / (inv_depth_pred+1e-4) 59 | depth_pred[np.isinf(depth_pred)] = 0 60 | 61 | pred_depths.append(depth_pred) 62 | print "Getting ground truth depths and predicted depths... Done!" 63 | 64 | rms = np.zeros(num_samples, np.float32) 65 | log_rms = np.zeros(num_samples, np.float32) 66 | abs_rel = np.zeros(num_samples, np.float32) 67 | sq_rel = np.zeros(num_samples, np.float32) 68 | d1_all = np.zeros(num_samples, np.float32) 69 | a1 = np.zeros(num_samples, np.float32) 70 | a2 = np.zeros(num_samples, np.float32) 71 | a3 = np.zeros(num_samples, np.float32) 72 | 73 | for i in range(num_samples): 74 | 75 | gt_depth = gt_depths[i] 76 | pred_depth = pred_depths[i] 77 | 78 | pred_depth[pred_depth < args.min_depth] = args.min_depth 79 | pred_depth[pred_depth > args.max_depth] = args.max_depth 80 | 81 | if args.split == 'eigen': 82 | mask = np.logical_and(gt_depth > args.min_depth, gt_depth < args.max_depth) 83 | 84 | 85 | if args.garg_crop or args.eigen_crop: 86 | gt_height, gt_width = gt_depth.shape 87 | 88 | # crop used by Garg ECCV16 89 | # if used on gt_size 370x1224 produces a crop of [-218, -3, 44, 1180] 90 | if args.garg_crop: 91 | print "Evaluating (Garg Crop)...: ", i, "/ 697" 92 | crop = np.array([0.40810811 * gt_height, 0.99189189 * gt_height, 93 | 0.03594771 * gt_width, 0.96405229 * gt_width]).astype(np.int32) 94 | # crop we found by trial and error to reproduce Eigen NIPS14 results 95 | elif args.eigen_crop: 96 | print "Evaluating (Eigen Crop)...: ", i, "/ 697" 97 | crop = np.array([0.3324324 * gt_height, 0.91351351 * gt_height, 98 | 0.0359477 * gt_width, 0.96405229 * gt_width]).astype(np.int32) 99 | 100 | crop_mask = np.zeros(mask.shape) 101 | crop_mask[crop[0]:crop[1],crop[2]:crop[3]] = 1 102 | mask = np.logical_and(mask, crop_mask) 103 | 104 | if args.split == 'kitti': 105 | gt_disp = gt_disparities[i] 106 | mask = gt_disp > 0 107 | pred_disp = pred_disparities_resized[i] 108 | 109 | disp_diff = np.abs(gt_disp[mask] - pred_disp[mask]) 110 | bad_pixels = np.logical_and(disp_diff >= 3, (disp_diff / gt_disp[mask]) >= 0.05) 111 | d1_all[i] = 100.0 * bad_pixels.sum() / mask.sum() 112 | 113 | abs_rel[i], sq_rel[i], rms[i], log_rms[i], a1[i], a2[i], a3[i] = compute_errors(gt_depth[mask], pred_depth[mask]) 114 | 115 | print("{:>10}, {:>10}, {:>10}, {:>10}, {:>10}, {:>10}, {:>10}, {:>10}".format('abs_rel', 'sq_rel', 'rms', 'log_rms', 'd1_all', 'a1', 'a2', 'a3')) 116 | print("{:10.4f}, {:10.4f}, {:10.3f}, {:10.3f}, {:10.3f}, {:10.3f}, {:10.3f}, {:10.3f}".format(abs_rel.mean(), sq_rel.mean(), 
rms.mean(), log_rms.mean(), d1_all.mean(), a1.mean(), a2.mean(), a3.mean())) 117 | -------------------------------------------------------------------------------- /tools/eval_depth_utils.py: -------------------------------------------------------------------------------- 1 | # Mostly based on the code written by Clement Godard: 2 | # https://github.com/mrharicot/monodepth/blob/master/utils/evaluation_utils.py 3 | 4 | import numpy as np 5 | import os 6 | import cv2 7 | from collections import Counter 8 | from scipy.interpolate import LinearNDInterpolator # needed by lin_interp below 9 | 10 | def compute_errors(gt, pred): 11 | thresh = np.maximum((gt / pred), (pred / gt)) 12 | a1 = (thresh < 1.25 ).mean() 13 | a2 = (thresh < 1.25 ** 2).mean() 14 | a3 = (thresh < 1.25 ** 3).mean() 15 | 16 | rmse = (gt - pred) ** 2 17 | rmse = np.sqrt(rmse.mean()) 18 | 19 | rmse_log = (np.log(gt) - np.log(pred)) ** 2 20 | rmse_log = np.sqrt(rmse_log.mean()) 21 | 22 | abs_rel = np.mean(np.abs(gt - pred) / gt) 23 | 24 | sq_rel = np.mean(((gt - pred)**2) / gt) 25 | 26 | return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3 27 | 28 | ############################################################################### 29 | ####################### KITTI 30 | 31 | width_to_focal = dict() 32 | width_to_focal[1242] = 721.5377 33 | width_to_focal[1241] = 718.856 34 | width_to_focal[1224] = 707.0493 35 | width_to_focal[1238] = 718.3351 36 | 37 | def load_gt_disp_kitti(path): 38 | gt_disparities = [] 39 | for i in range(200): 40 | disp = cv2.imread(path + "/training/disp_noc_0/" + str(i).zfill(6) + "_10.png", -1) 41 | disp = disp.astype(np.float32) / 256 42 | gt_disparities.append(disp) 43 | return gt_disparities 44 | 45 | def convert_disps_to_depths_kitti(gt_disparities, pred_disparities): 46 | gt_depths = [] 47 | pred_depths = [] 48 | pred_disparities_resized = [] 49 | 50 | for i in range(len(gt_disparities)): 51 | gt_disp = gt_disparities[i] 52 | height, width = gt_disp.shape 53 | 54 | pred_disp = pred_disparities[i] 55 | pred_disp = width * cv2.resize(pred_disp, (width, height), interpolation=cv2.INTER_LINEAR) 56 | 57 | pred_disparities_resized.append(pred_disp) 58 | 59 | mask = gt_disp > 0 60 | 61 | gt_depth = width_to_focal[width] * 0.54 / (gt_disp + (1.0 - mask)) 62 | pred_depth = width_to_focal[width] * 0.54 / pred_disp 63 | 64 | gt_depths.append(gt_depth) 65 | pred_depths.append(pred_depth) 66 | return gt_depths, pred_depths, pred_disparities_resized 67 | 68 | 69 | ############################################################################### 70 | ####################### EIGEN 71 | 72 | def read_text_lines(file_path): 73 | f = open(file_path, 'r') 74 | lines = f.readlines() 75 | f.close() 76 | lines = [l.rstrip() for l in lines] 77 | return lines 78 | 79 | def read_file_data(files, data_root): 80 | gt_files = [] 81 | gt_calib = [] 82 | im_sizes = [] 83 | im_files = [] 84 | cams = [] 85 | num_probs = 0 86 | for filename in files: 87 | filename = filename.split()[0] 88 | splits = filename.split('/') 89 | # camera_id = filename[-1] # 2 is left, 3 is right 90 | scene_class = splits[0] 91 | scene = splits[1] 92 | im_id = splits[4][:10] 93 | file_root = '{}/{}' 94 | 95 | im = filename 96 | vel = '{}/{}/velodyne_points/data/{}.bin'.format(splits[0], splits[1], im_id) 97 | 98 | if os.path.isfile(data_root + im): 99 | gt_files.append(data_root + vel) 100 | gt_calib.append(data_root + scene_class + '/' + scene + "/calib/") 101 | im_sizes.append(cv2.imread(data_root + im).shape[:2]) 102 | im_files.append(data_root + im) 103 | cams.append(2) 104 | else: 105 | num_probs += 1 106 | print('{} 
missing'.format(data_root + im)) 107 | # print(num_probs, 'files missing') 108 | 109 | return gt_files, gt_calib, im_sizes, im_files, cams 110 | 111 | def load_velodyne_points(file_name): 112 | # adapted from https://github.com/hunse/kitti 113 | points = np.fromfile(file_name, dtype=np.float32).reshape(-1, 4) 114 | points[:, 3] = 1.0 # homogeneous 115 | return points 116 | 117 | 118 | def lin_interp(shape, xyd): 119 | # taken from https://github.com/hunse/kitti 120 | m, n = shape 121 | ij, d = xyd[:, 1::-1], xyd[:, 2] 122 | f = LinearNDInterpolator(ij, d, fill_value=0) 123 | J, I = np.meshgrid(np.arange(n), np.arange(m)) 124 | IJ = np.vstack([I.flatten(), J.flatten()]).T 125 | disparity = f(IJ).reshape(shape) 126 | return disparity 127 | 128 | 129 | def read_calib_file(path): 130 | # taken from https://github.com/hunse/kitti 131 | float_chars = set("0123456789.e+- ") 132 | data = {} 133 | with open(path, 'r') as f: 134 | for line in f.readlines(): 135 | key, value = line.split(':', 1) 136 | value = value.strip() 137 | data[key] = value 138 | if float_chars.issuperset(value): 139 | # try to cast to float array 140 | try: 141 | data[key] = np.array(map(float, value.split(' '))) 142 | except ValueError: 143 | # casting error: data[key] already eq. value, so pass 144 | pass 145 | 146 | return data 147 | 148 | 149 | def get_focal_length_baseline(calib_dir, cam=2): 150 | cam2cam = read_calib_file(calib_dir + 'calib_cam_to_cam.txt') 151 | P2_rect = cam2cam['P_rect_02'].reshape(3,4) 152 | P3_rect = cam2cam['P_rect_03'].reshape(3,4) 153 | 154 | # cam 2 is left of camera 0 -6cm 155 | # cam 3 is to the right +54cm 156 | b2 = P2_rect[0,3] / -P2_rect[0,0] 157 | b3 = P3_rect[0,3] / -P3_rect[0,0] 158 | baseline = b3-b2 159 | 160 | if cam==2: 161 | focal_length = P2_rect[0,0] 162 | elif cam==3: 163 | focal_length = P3_rect[0,0] 164 | 165 | return focal_length, baseline 166 | 167 | 168 | def sub2ind(matrixSize, rowSub, colSub): 169 | m, n = matrixSize 170 | return rowSub * (n-1) + colSub - 1 171 | 172 | def generate_depth_map(calib_dir, velo_file_name, im_shape, cam=2, interp=False, vel_depth=False): 173 | # load calibration files 174 | cam2cam = read_calib_file(calib_dir + 'calib_cam_to_cam.txt') 175 | velo2cam = read_calib_file(calib_dir + 'calib_velo_to_cam.txt') 176 | velo2cam = np.hstack((velo2cam['R'].reshape(3,3), velo2cam['T'][..., np.newaxis])) 177 | velo2cam = np.vstack((velo2cam, np.array([0, 0, 0, 1.0]))) 178 | 179 | # compute projection matrix velodyne->image plane 180 | R_cam2rect = np.eye(4) 181 | R_cam2rect[:3,:3] = cam2cam['R_rect_00'].reshape(3,3) 182 | P_rect = cam2cam['P_rect_0'+str(cam)].reshape(3,4) 183 | P_velo2im = np.dot(np.dot(P_rect, R_cam2rect), velo2cam) 184 | 185 | # load velodyne points and remove all behind image plane (approximation) 186 | # each row of the velodyne data is forward, left, up, reflectance 187 | velo = load_velodyne_points(velo_file_name) 188 | velo = velo[velo[:, 0] >= 0, :] 189 | 190 | # project the points to the camera 191 | velo_pts_im = np.dot(P_velo2im, velo.T).T 192 | velo_pts_im[:, :2] = velo_pts_im[:,:2] / velo_pts_im[:,2][..., np.newaxis] 193 | 194 | if vel_depth: 195 | velo_pts_im[:, 2] = velo[:, 0] 196 | 197 | # check if in bounds 198 | # use minus 1 to get the exact same value as KITTI matlab code 199 | velo_pts_im[:, 0] = np.round(velo_pts_im[:,0]) - 1 200 | velo_pts_im[:, 1] = np.round(velo_pts_im[:,1]) - 1 201 | val_inds = (velo_pts_im[:, 0] >= 0) & (velo_pts_im[:, 1] >= 0) 202 | val_inds = val_inds & (velo_pts_im[:,0] < im_shape[1]) & 
(velo_pts_im[:,1] < im_shape[0]) 203 | velo_pts_im = velo_pts_im[val_inds, :] 204 | 205 | # project to image 206 | depth = np.zeros((im_shape)) 207 | depth[velo_pts_im[:, 1].astype(np.int), velo_pts_im[:, 0].astype(np.int)] = velo_pts_im[:, 2] 208 | 209 | # find the duplicate points and choose the closest depth 210 | inds = sub2ind(depth.shape, velo_pts_im[:, 1], velo_pts_im[:, 0]) 211 | dupe_inds = [item for item, count in Counter(inds).iteritems() if count > 1] 212 | for dd in dupe_inds: 213 | pts = np.where(inds==dd)[0] 214 | x_loc = int(velo_pts_im[pts[0], 0]) 215 | y_loc = int(velo_pts_im[pts[0], 1]) 216 | depth[y_loc, x_loc] = velo_pts_im[pts, 2].min() 217 | depth[depth<0] = 0 218 | 219 | if interp: 220 | # interpolate the depth map to fill in holes 221 | depth_interp = lin_interp(im_shape, velo_pts_im) 222 | return depth, depth_interp 223 | else: 224 | return depth 225 | 226 | 227 | -------------------------------------------------------------------------------- /tools/evaluation_tools.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | import numpy as np 3 | import sys 4 | import math # used by the mat2euler/euler2quat helpers below 5 | from matplotlib import pyplot as plt 6 | 7 | caffe_root = '$YOUR_CAFFE_DIR' 8 | sys.path.insert(0, caffe_root + 'python') 9 | import caffe 10 | 11 | import h5py 12 | import os, os.path 13 | import cv2 14 | import argparse 15 | 16 | parser = argparse.ArgumentParser(description='Evaluation toolkit') 17 | parser.add_argument('--func', type=str, default='generate_depth_npy', help='Select function (generate_depth_npy; generate_odom_result; eval_odom)') 18 | parser.add_argument('--dataset', type=str, default='kitti_eigen', help='Select dataset (kitti_eigen)') 19 | parser.add_argument('--model', type=str, help='Depth caffemodel') 20 | 21 | 22 | parser.add_argument('--depth_net_def', type=str, default="experiments/networks/depth_deploy.prototxt", help='Depth network prototxt') 23 | parser.add_argument('--npy_dir', type=str, default='./result/depth/depths', help='Directory path storing the created npy file') 24 | 25 | parser.add_argument('--odom_net_def', type=str, default="experiments/networks/odometry_deploy.prototxt", help='Visual odometry network prototxt') 26 | parser.add_argument('--odom_result_dir', type=str, default='./result/depth_odometry/odom_result', help='Directory path storing the odometry results') 27 | 28 | 29 | global args 30 | args = parser.parse_args() 31 | 32 | # caffe.set_mode_cpu() 33 | caffe.set_mode_gpu() 34 | caffe.set_device(0) 35 | 36 | class kittiEigenGenerateDepthNpy(): 37 | def __init__(self): 38 | depth_net_def = args.depth_net_def 39 | caffe_model = args.model 40 | self.depth_net = caffe.Net(depth_net_def, caffe_model, caffe.TEST) 41 | self.image_width = self.depth_net.blobs['img'].data.shape[3] 42 | self.image_height = self.depth_net.blobs['img'].data.shape[2] 43 | 44 | # ---------------------------------------------------------------------- 45 | # Check that the evaluation set exists 46 | # ---------------------------------------------------------------------- 47 | self.dataset_path = "./data/depth_evaluation/kitti_eigen" 48 | assert os.path.exists(self.dataset_path) 49 | 50 | def getImage(self, img_path): 51 | # ---------------------------------------------------------------------- 52 | # Get and preprocess image 53 | # ---------------------------------------------------------------------- 54 | img = cv2.imread(img_path) 55 | if img is None: # cv2.imread returns None on failure 56 | print "img_path: ", img_path 57 | assert img is not None, "Image
reading error. Check whether your image path is correct or not." 58 | img = cv2.resize(img, (self.image_width, self.image_height)) 59 | img = img.transpose((2,0,1)) 60 | img = img.astype(np.float32) 61 | img[0] -= 104 # subtract per-channel BGR means (Caffe-style preprocessing) 62 | img[1] -= 117 63 | img[2] -= 123 64 | return img 65 | 66 | def getPredInvDepths(self): 67 | inv_depths = [] 68 | for cnt in xrange(697): 69 | print "Getting prediction: ", cnt, " / 697" 70 | img_path = self.dataset_path + "/left_rgb/" + str(cnt) + ".png" 71 | img = self.getImage(img_path) 72 | self.depth_net.blobs['img'].data[0] = img #dimension (3,H,W) 73 | self.depth_net.forward(); 74 | inv_depths.append(self.depth_net.blobs["inv_depth"].data[0,0].copy()) 75 | inv_depths = np.asarray(inv_depths) 76 | return inv_depths 77 | 78 | def saveNpy(self, inv_depths): 79 | npy_folder_dir = '/'.join(args.npy_dir.split('/')[:-1]) 80 | if not os.path.exists(npy_folder_dir): 81 | os.makedirs(npy_folder_dir) 82 | np.save(args.npy_dir, inv_depths) 83 | 84 | class kittiPredOdom(): 85 | def __init__(self): 86 | model_def = args.odom_net_def 87 | caffe_model = args.model 88 | self.odom_net = caffe.Net(model_def, caffe_model, caffe.TEST) 89 | self.image_width = self.odom_net.blobs['imgs'].data.shape[3] 90 | self.image_height = self.odom_net.blobs['imgs'].data.shape[2] 91 | 92 | self.result_path = args.odom_result_dir 93 | 94 | self.eval_seqs = ["00", "01", "02", "04", "05", "06", "07", "08", "09", "10"] 95 | self.eval_seqs_start_end = { 96 | "00": [0, 4540], 97 | "01": [0, 1100], 98 | "02": [0, 4660], 99 | "04": [0, 270], 100 | "05": [0, 2760], 101 | "06": [0, 1100], 102 | "07": [0, 1100], 103 | "08": [1100, 5170], 104 | "09": [0, 1590], 105 | "10": [0, 1200] 106 | } 107 | 108 | self.eval_seqs_path = { 109 | "00": "residential/2011_10_03_drive_0027", 110 | "01": "road/2011_10_03_drive_0042", 111 | "02": "residential/2011_10_03_drive_0034", 112 | "04": "road/2011_09_30_drive_0016", 113 | "05": "residential/2011_09_30_drive_0018", 114 | "06": "residential/2011_09_30_drive_0020", 115 | "07": "residential/2011_09_30_drive_0027", 116 | "08": "residential/2011_09_30_drive_0028", 117 | "09": "residential/2011_09_30_drive_0033", 118 | "10": "residential/2011_09_30_drive_0034" 119 | } 120 | 121 | 122 | def getImage(self, img_path): 123 | # ---------------------------------------------------------------------- 124 | # Get and preprocess image 125 | # ---------------------------------------------------------------------- 126 | img = cv2.imread(img_path) 127 | if img is None: # cv2.imread returns None on failure 128 | print "img_path: ", img_path 129 | assert img is not None, "Image reading error. Check whether your image path is correct or not." 130 | img = cv2.resize(img, (self.image_width, self.image_height)) 131 | img = img.transpose((2,0,1)) 132 | img = img.astype(np.float32) 133 | img[0] -= 104 134 | img[1] -= 117 135 | img[2] -= 123 136 | return img 137 | 138 | def getPredInvDepths(self): # NOTE: unused copy of the kittiEigenGenerateDepthNpy method; this class defines no depth_net/dataset_path 139 | inv_depths = [] 140 | for cnt in xrange(697): 141 | img_path = self.dataset_path + "/left_rgb/" + str(cnt) + ".png" 142 | img = self.getImage(img_path) 143 | self.depth_net.blobs['img'].data[0] = img #dimension (3,H,W) 144 | self.depth_net.forward(); 145 | inv_depths.append(self.depth_net.blobs["inv_depth"].data[0,0].copy()) 146 | inv_depths = np.asarray(inv_depths) 147 | return inv_depths 148 | 149 | def getPredPoses(self): 150 | pred_poses = {} 151 | for cnt,seq in enumerate(self.eval_seqs): 152 | print "Getting predictions...
Sequence: ", cnt, " / ",len(self.eval_seqs) 153 | pred_poses[seq] = [] 154 | seq_path = "./data/kitti_raw_data/" + self.eval_seqs_path[seq] 155 | start_idx = self.eval_seqs_start_end[seq][0] 156 | end_idx = self.eval_seqs_start_end[seq][1] 157 | for idx in xrange(start_idx, end_idx): 158 | img1_path = seq_path + "/image_02/data/{:010}.png".format(idx) 159 | img2_path = seq_path + "/image_02/data/{:010}.png".format(idx+1) 160 | img1 = self.getImage(img1_path) 161 | img2 = self.getImage(img2_path) 162 | self.odom_net.blobs['imgs'].data[0,:3] = img2 163 | self.odom_net.blobs['imgs'].data[0,3:] = img1 164 | self.odom_net.forward(); 165 | pred_poses[seq].append(self.odom_net.blobs['SE3'].data[0,0].copy()) 166 | return pred_poses 167 | 168 | def SE3_cam2world(self, pred_poses): 169 | self.pred_SE3_world = {} 170 | for seq in self.eval_seqs: 171 | cur_T = np.eye(4) 172 | tmp_SE3_world = [] 173 | tmp_SE3_world.append(cur_T) 174 | for pose in pred_poses[seq]: 175 | cur_T = np.dot(cur_T, pose) 176 | tmp_SE3_world.append(cur_T) 177 | self.pred_SE3_world[seq] = tmp_SE3_world 178 | 179 | def saveResultPoses(self): 180 | result_dir = args.odom_result_dir 181 | if not os.path.exists(self.result_path): 182 | os.makedirs(self.result_path) 183 | 184 | for seq in self.eval_seqs: 185 | f = open(self.result_path + "/" + seq + ".txt", 'w') 186 | for cnt, SE3 in enumerate(self.pred_SE3_world[seq]): 187 | tx = str(SE3[0,3]) 188 | ty = str(SE3[1,3]) 189 | tz = str(SE3[2,3]) 190 | R00 = str(SE3[0,0]) 191 | R01 = str(SE3[0,1]) 192 | R02 = str(SE3[0,2]) 193 | R10 = str(SE3[1,0]) 194 | R11 = str(SE3[1,1]) 195 | R12 = str(SE3[1,2]) 196 | R20 = str(SE3[2,0]) 197 | R21 = str(SE3[2,1]) 198 | R22 = str(SE3[2,2]) 199 | line_to_write = " ".join([R00, R01, R02, tx, R10, R11, R12, ty, R20, R21, R22, tz]) 200 | f.writelines(line_to_write+"\n") 201 | f.close() 202 | 203 | def rot2quat(self,R): 204 | rz, ry, rx = self.mat2euler(R) 205 | qw, qx, qy, qz = self.euler2quat(rz, ry, rx) 206 | return qw, qx, qy, qz 207 | 208 | def quat2mat(self,q): 209 | ''' Calculate rotation matrix corresponding to quaternion 210 | https://afni.nimh.nih.gov/pub/dist/src/pkundu/meica.libs/nibabel/quaternions.py 211 | Parameters 212 | ---------- 213 | q : 4 element array-like 214 | 215 | Returns 216 | ------- 217 | M : (3,3) array 218 | Rotation matrix corresponding to input quaternion *q* 219 | 220 | Notes 221 | ----- 222 | Rotation matrix applies to column vectors, and is applied to the 223 | left of coordinate vectors. The algorithm here allows non-unit 224 | quaternions. 
225 | 226 | References 227 | ---------- 228 | Algorithm from 229 | http://en.wikipedia.org/wiki/Rotation_matrix#Quaternion 230 | 231 | Examples 232 | -------- 233 | >>> import numpy as np 234 | >>> M = quat2mat([1, 0, 0, 0]) # Identity quaternion 235 | >>> np.allclose(M, np.eye(3)) 236 | True 237 | >>> M = quat2mat([0, 1, 0, 0]) # 180 degree rotn around axis 0 238 | >>> np.allclose(M, np.diag([1, -1, -1])) 239 | True 240 | ''' 241 | w, x, y, z = q 242 | Nq = w*w + x*x + y*y + z*z 243 | if Nq < 1e-8: 244 | return np.eye(3) 245 | s = 2.0/Nq 246 | X = x*s 247 | Y = y*s 248 | Z = z*s 249 | wX = w*X; wY = w*Y; wZ = w*Z 250 | xX = x*X; xY = x*Y; xZ = x*Z 251 | yY = y*Y; yZ = y*Z; zZ = z*Z 252 | return np.array( 253 | [[ 1.0-(yY+zZ), xY-wZ, xZ+wY ], 254 | [ xY+wZ, 1.0-(xX+zZ), yZ-wX ], 255 | [ xZ-wY, yZ+wX, 1.0-(xX+yY) ]]) 256 | 257 | def mat2euler(self,M, cy_thresh=None, seq='zyx'): 258 | ''' 259 | Taken From: http://afni.nimh.nih.gov/pub/dist/src/pkundu/meica.libs/nibabel/eulerangles.py 260 | Discover Euler angle vector from 3x3 matrix 261 | Uses the conventions above. 262 | Parameters 263 | ---------- 264 | M : array-like, shape (3,3) 265 | cy_thresh : None or scalar, optional 266 | threshold below which to give up on straightforward arctan for 267 | estimating x rotation. If None (default), estimate from 268 | precision of input. 269 | Returns 270 | ------- 271 | z : scalar 272 | y : scalar 273 | x : scalar 274 | Rotations in radians around z, y, x axes, respectively 275 | Notes 276 | ----- 277 | If there was no numerical error, the routine could be derived using 278 | Sympy expression for z then y then x rotation matrix, which is:: 279 | [ cos(y)*cos(z), -cos(y)*sin(z), sin(y)], 280 | [cos(x)*sin(z) + cos(z)*sin(x)*sin(y), cos(x)*cos(z) - sin(x)*sin(y)*sin(z), -cos(y)*sin(x)], 281 | [sin(x)*sin(z) - cos(x)*cos(z)*sin(y), cos(z)*sin(x) + cos(x)*sin(y)*sin(z), cos(x)*cos(y)] 282 | with the obvious derivations for z, y, and x 283 | z = atan2(-r12, r11) 284 | y = asin(r13) 285 | x = atan2(-r23, r33) 286 | for x,y,z order 287 | y = asin(-r31) 288 | x = atan2(r32, r33) 289 | z = atan2(r21, r11) 290 | Problems arise when cos(y) is close to zero, because both of:: 291 | z = atan2(cos(y)*sin(z), cos(y)*cos(z)) 292 | x = atan2(cos(y)*sin(x), cos(x)*cos(y)) 293 | will be close to atan2(0, 0), and highly unstable. 294 | The ``cy`` fix for numerical instability below is from: *Graphics 295 | Gems IV*, Paul Heckbert (editor), Academic Press, 1994, ISBN: 296 | 0123361559. Specifically it comes from EulerAngles.c by Ken 297 | Shoemake, and deals with the case where cos(y) is close to zero: 298 | See: http://www.graphicsgems.org/ 299 | The code appears to be licensed (from the website) as "can be used 300 | without restrictions". 
301 | ''' 302 | M = np.asarray(M) 303 | if cy_thresh is None: 304 | try: 305 | cy_thresh = np.finfo(M.dtype).eps * 4 306 | except ValueError: 307 | cy_thresh = np.finfo(np.float64).eps * 4 # fallback when M.dtype carries no eps info 308 | r11, r12, r13, r21, r22, r23, r31, r32, r33 = M.flat 309 | # cy: sqrt((cos(y)*cos(z))**2 + (cos(x)*cos(y))**2) 310 | cy = math.sqrt(r33*r33 + r23*r23) 311 | if seq=='zyx': 312 | if cy > cy_thresh: # cos(y) not close to zero, standard form 313 | z = math.atan2(-r12, r11) # atan2(cos(y)*sin(z), cos(y)*cos(z)) 314 | y = math.atan2(r13, cy) # atan2(sin(y), cy) 315 | x = math.atan2(-r23, r33) # atan2(cos(y)*sin(x), cos(x)*cos(y)) 316 | else: # cos(y) (close to) zero, so x -> 0.0 (see above) 317 | # so r21 -> sin(z), r22 -> cos(z) and 318 | z = math.atan2(r21, r22) 319 | y = math.atan2(r13, cy) # atan2(sin(y), cy) 320 | x = 0.0 321 | elif seq=='xyz': 322 | if cy > cy_thresh: 323 | y = math.atan2(-r31, cy) 324 | x = math.atan2(r32, r33) 325 | z = math.atan2(r21, r11) 326 | else: 327 | z = 0.0 328 | if r31 < 0: 329 | y = np.pi/2 330 | x = math.atan2(r12, r13) 331 | else: 332 | y = -np.pi/2; x = math.atan2(-r12, -r13) # gimbal lock: fold z into x so x is always defined 333 | else: 334 | raise Exception('Sequence not recognized') 335 | return z, y, x 336 | 337 | def euler2quat(self,z=0, y=0, x=0, isRadian=True): 338 | ''' Return quaternion corresponding to these Euler angles 339 | Uses the z, then y, then x convention above 340 | Parameters 341 | ---------- 342 | z : scalar 343 | Rotation angle in radians around z-axis (performed first) 344 | y : scalar 345 | Rotation angle in radians around y-axis 346 | x : scalar 347 | Rotation angle in radians around x-axis (performed last) 348 | Returns 349 | ------- 350 | quat : array shape (4,) 351 | Quaternion in w, x, y z (real, then vector) format 352 | Notes 353 | ----- 354 | We can derive this formula in Sympy using: 355 | 1. Formula giving quaternion corresponding to rotation of theta radians 356 | about arbitrary axis: 357 | http://mathworld.wolfram.com/EulerParameters.html 358 | 2. Generated formulae from 1.) for quaternions corresponding to 359 | theta radians rotations about ``x, y, z`` axes 360 | 3. Apply quaternion multiplication formula - 361 | http://en.wikipedia.org/wiki/Quaternions#Hamilton_product - to 362 | formulae from 2.) to give formula for combined rotations. 363 | ''' 364 | 365 | if not isRadian: 366 | z = ((np.pi)/180.) * z 367 | y = ((np.pi)/180.) * y 368 | x = ((np.pi)/180.)
* x 369 | z = z/2.0 370 | y = y/2.0 371 | x = x/2.0 372 | cz = math.cos(z) 373 | sz = math.sin(z) 374 | cy = math.cos(y) 375 | sy = math.sin(y) 376 | cx = math.cos(x) 377 | sx = math.sin(x) 378 | return np.array([ 379 | cx*cy*cz - sx*sy*sz, 380 | cx*sy*sz + cy*cz*sx, 381 | cx*cz*sy - sx*cy*sz, 382 | cx*cy*sz + sx*cz*sy]) 383 | 384 | class kittiEvalOdom(): 385 | # ---------------------------------------------------------------------- 386 | # poses: N,4,4 387 | # pose: 4,4 388 | # ---------------------------------------------------------------------- 389 | def __init__(self): 390 | self.lengths= [100,200,300,400,500,600,700,800] 391 | self.num_lengths = len(self.lengths) 392 | self.gt_dir = "./data/odometry_evaluation/poses" 393 | 394 | def loadPoses(self, file_name): 395 | # ---------------------------------------------------------------------- 396 | # Each line in the file should follow one of the following structures 397 | # (1) idx pose(3x4 matrix in terms of 12 numbers) 398 | # (2) pose(3x4 matrix in terms of 12 numbers) 399 | # ---------------------------------------------------------------------- 400 | f = open(file_name, 'r') 401 | s = f.readlines() 402 | f.close() 403 | file_len = len(s) 404 | poses = {} 405 | for cnt, line in enumerate(s): 406 | P = np.eye(4) 407 | line_split = [float(i) for i in line.split(" ")] 408 | withIdx = int(len(line_split)==13) 409 | for row in xrange(3): 410 | for col in xrange(4): 411 | P[row, col] = line_split[row*4+col+ withIdx] 412 | if withIdx: 413 | frame_idx = line_split[0] 414 | else: 415 | frame_idx = cnt 416 | poses[frame_idx] = P 417 | return poses 418 | 419 | def trajectoryDistances(self, poses): 420 | # ---------------------------------------------------------------------- 421 | # poses: dictionary: [frame_idx: pose] 422 | # ---------------------------------------------------------------------- 423 | dist = [0] 424 | sort_frame_idx = sorted(poses.keys()) 425 | for i in xrange(len(sort_frame_idx)-1): 426 | cur_frame_idx = sort_frame_idx[i] 427 | next_frame_idx = sort_frame_idx[i+1] 428 | P1 = poses[cur_frame_idx] 429 | P2 = poses[next_frame_idx] 430 | dx = P1[0,3] - P2[0,3] 431 | dy = P1[1,3] - P2[1,3] 432 | dz = P1[2,3] - P2[2,3] 433 | dist.append(dist[i]+np.sqrt(dx**2+dy**2+dz**2)) 434 | return dist 435 | 436 | def rotationError(self, pose_error): 437 | a = pose_error[0,0] 438 | b = pose_error[1,1] 439 | c = pose_error[2,2] 440 | d = 0.5*(a+b+c-1.0) 441 | return np.arccos(max(min(d,1.0),-1.0)) 442 | 443 | def translationError(self, pose_error): 444 | dx = pose_error[0,3] 445 | dy = pose_error[1,3] 446 | dz = pose_error[2,3] 447 | return np.sqrt(dx**2+dy**2+dz**2) 448 | 449 | def lastFrameFromSegmentLength(self, dist, first_frame, len_): 450 | for i in xrange(first_frame, len(dist), 1): 451 | if dist[i] > (dist[first_frame] + len_): 452 | return i 453 | return -1 454 | 455 | def calcSequenceErrors(self, poses_gt, poses_result): 456 | err = [] 457 | dist = self.trajectoryDistances(poses_gt) 458 | self.step_size = 10 459 | 460 | for first_frame in xrange(0, len(poses_gt), self.step_size): 461 | for i in xrange(self.num_lengths): 462 | len_ = self.lengths[i] 463 | last_frame = self.lastFrameFromSegmentLength(dist, first_frame, len_) 464 | 465 | # ---------------------------------------------------------------------- 466 | # Continue if sequence not long enough 467 | # ---------------------------------------------------------------------- 468 | if last_frame == -1 or not(last_frame in poses_result.keys()) or not(first_frame in 
poses_result.keys()): 469 | continue 470 | 471 | # ---------------------------------------------------------------------- 472 | # compute rotational and translational errors 473 | # ---------------------------------------------------------------------- 474 | pose_delta_gt = np.dot(np.linalg.inv(poses_gt[first_frame]), poses_gt[last_frame]) 475 | pose_delta_result = np.dot(np.linalg.inv(poses_result[first_frame]), poses_result[last_frame]) 476 | pose_error = np.dot(np.linalg.inv(pose_delta_result), pose_delta_gt) 477 | 478 | r_err = self.rotationError(pose_error) 479 | t_err = self.translationError(pose_error) 480 | 481 | # ---------------------------------------------------------------------- 482 | # compute speed 483 | # ---------------------------------------------------------------------- 484 | num_frames = last_frame - first_frame + 1.0 485 | speed = len_/(0.1*num_frames) # KITTI sequences are captured at ~10Hz 486 | 487 | err.append([first_frame, r_err/len_, t_err/len_, len_, speed]) 488 | return err 489 | 490 | def saveSequenceErrors(self, err, file_name): 491 | fp = open(file_name,'w') 492 | for i in err: 493 | line_to_write = " ".join([str(j) for j in i]) 494 | fp.writelines(line_to_write+"\n") 495 | fp.close() 496 | 497 | def computeOverallErr(self, seq_err): 498 | t_err = 0 499 | r_err = 0 500 | 501 | seq_len = len(seq_err) 502 | 503 | for item in seq_err: 504 | r_err += item[1] 505 | t_err += item[2] 506 | ave_t_err = t_err / seq_len 507 | ave_r_err = r_err / seq_len 508 | return ave_t_err, ave_r_err 509 | 510 | def plotPath(self, seq, poses_gt, poses_result): 511 | plot_keys = ["Ground Truth", "Ours"] 512 | fontsize_ = 20 513 | plot_num = -1 514 | 515 | poses_dict = {} 516 | poses_dict["Ground Truth"] = poses_gt 517 | poses_dict["Ours"] = poses_result 518 | 519 | fig = plt.figure() 520 | ax = plt.gca() 521 | ax.set_aspect('equal') 522 | 523 | for key in plot_keys: 524 | pos_xz = [] 525 | # for pose in poses_dict[key]: 526 | for frame_idx in sorted(poses_dict[key].keys()): 527 | pose = poses_dict[key][frame_idx] 528 | pos_xz.append([pose[0,3], pose[2,3]]) 529 | pos_xz = np.asarray(pos_xz) 530 | plt.plot(pos_xz[:,0], pos_xz[:,1], label = key) 531 | 532 | plt.legend(loc = "upper right", prop={'size': fontsize_}) 533 | plt.xticks(fontsize = fontsize_) 534 | plt.yticks(fontsize = fontsize_) 535 | plt.xlabel('x (m)',fontsize = fontsize_) 536 | plt.ylabel('z (m)',fontsize = fontsize_) 537 | fig.set_size_inches(10, 10) 538 | png_title = "sequence_{:02}".format(seq) 539 | plt.savefig(self.plot_path_dir + "/" + png_title + ".pdf",bbox_inches='tight', pad_inches=0) 540 | # plt.show() 541 | 542 | def plotError(self, avg_segment_errs): 543 | # ---------------------------------------------------------------------- 544 | # avg_segment_errs: dict [100: err, 200: err...] 545 | # ---------------------------------------------------------------------- 546 | plot_y = [] 547 | plot_x = [] 548 | for len_ in self.lengths: 549 | plot_x.append(len_) 550 | plot_y.append(avg_segment_errs[len_][0]) 551 | fig = plt.figure() 552 | plt.plot(plot_x, plot_y) 553 | plt.show() 554 | 555 | def computeSegmentErr(self, seq_errs): 556 | # ---------------------------------------------------------------------- 557 | # This function calculates average errors for the different segment lengths.
558 | # ---------------------------------------------------------------------- 559 | 560 | segment_errs = {} 561 | avg_segment_errs = {} 562 | for len_ in self.lengths: 563 | segment_errs[len_] = [] 564 | # ---------------------------------------------------------------------- 565 | # Get errors 566 | # ---------------------------------------------------------------------- 567 | for err in seq_errs: 568 | len_ = err[3] 569 | t_err = err[2] 570 | r_err = err[1] 571 | segment_errs[len_].append([t_err, r_err]) 572 | # ---------------------------------------------------------------------- 573 | # Compute average 574 | # ---------------------------------------------------------------------- 575 | for len_ in self.lengths: 576 | if segment_errs[len_] != []: 577 | avg_t_err = np.mean(np.asarray(segment_errs[len_])[:,0]) 578 | avg_r_err = np.mean(np.asarray(segment_errs[len_])[:,1]) 579 | avg_segment_errs[len_] = [avg_t_err, avg_r_err] 580 | else: 581 | avg_segment_errs[len_] = [] 582 | return avg_segment_errs 583 | 584 | def eval(self, result_dir): 585 | error_dir = result_dir + "/errors" 586 | self.plot_path_dir = result_dir + "/plot_path" 587 | plot_error_dir = result_dir + "/plot_error" 588 | 589 | if not os.path.exists(error_dir): 590 | os.makedirs(error_dir) 591 | if not os.path.exists(self.plot_path_dir): 592 | os.makedirs(self.plot_path_dir) 593 | if not os.path.exists(plot_error_dir): 594 | os.makedirs(plot_error_dir) 595 | 596 | total_err = [] 597 | 598 | ave_t_errs = [] 599 | ave_r_errs = [] 600 | 601 | for i in self.eval_seqs: 602 | self.cur_seq = '{:02}'.format(i) 603 | file_name = '{:02}.txt'.format(i) 604 | 605 | poses_result = self.loadPoses(result_dir+"/"+file_name) 606 | poses_gt = self.loadPoses(self.gt_dir + "/" + file_name) 607 | self.result_file_name = result_dir + "/" + file_name 608 | 609 | # ---------------------------------------------------------------------- 610 | # compute sequence errors 611 | # ---------------------------------------------------------------------- 612 | seq_err = self.calcSequenceErrors(poses_gt, poses_result) 613 | self.saveSequenceErrors(seq_err, error_dir + "/" + file_name) 614 | 615 | # ---------------------------------------------------------------------- 616 | # Compute segment errors 617 | # ---------------------------------------------------------------------- 618 | avg_segment_errs = self.computeSegmentErr(seq_err) 619 | 620 | # ---------------------------------------------------------------------- 621 | # compute overall error 622 | # ---------------------------------------------------------------------- 623 | ave_t_err, ave_r_err = self.computeOverallErr(seq_err) 624 | print "Sequence: " + str(i) 625 | print "Average translational RMSE (%): ", ave_t_err*100 626 | print "Average rotational error (deg/100m): ", ave_r_err/np.pi * 180 *100 627 | ave_t_errs.append(ave_t_err) 628 | ave_r_errs.append(ave_r_err) 629 | 630 | # ---------------------------------------------------------------------- 631 | # Plotting (To-do) 632 | # (1) plot trajectory 633 | # (2) plot per segment error 634 | # ---------------------------------------------------------------------- 635 | self.plotPath(i,poses_gt, poses_result) 636 | # self.plotError(avg_segment_errs) 637 | 638 | print "-------------------- For Copying ------------------------------" 639 | for i in xrange(len(ave_t_errs)): 640 | print "{0:.2f}".format(ave_t_errs[i]*100) 641 | print "{0:.2f}".format(ave_r_errs[i]/np.pi*180*100) 642 | print "-------------------- For Copying ------------------------------" 643 | 644
| if args.func == "generate_depth_npy": 645 | if args.dataset == "kitti_eigen": 646 | generator = kittiEigenGenerateDepthNpy() 647 | inv_depths = generator.getPredInvDepths() 648 | generator.saveNpy(inv_depths) 649 | 650 | elif args.func == "generate_odom_result": 651 | print "Getting predictions..." 652 | generator = kittiPredOdom() 653 | pred_poses = generator.getPredPoses() 654 | print "Converting to world coordinates..." 655 | generator.SE3_cam2world(pred_poses) 656 | print "Saving predictions..." 657 | generator.saveResultPoses() 658 | 659 | elif args.func == "eval_odom": 660 | odom_eval = kittiEvalOdom() 661 | odom_eval.eval_seqs = [0,1,2,4,5,6,7,8,9,10] # Seq 03 is excluded because its raw data is not available on the KITTI website. 662 | odom_eval.eval(args.odom_result_dir) 663 | 664 | 665 | 666 | 667 | -------------------------------------------------------------------------------- /tools/sfmlearner_odometry_tool/get_sfmlearner_result.py: -------------------------------------------------------------------------------- 1 | import copy 2 | from glob import glob 3 | import numpy as np 4 | import os 5 | 6 | from pose_evaluation_utils import * 7 | 8 | gt_dir = "./sfmLearner/ground_truth/" 9 | pred_dir = "./sfmLearner/ours_results/" 10 | solve_global_pose = False 11 | 12 | 13 | def load_pose_from_list(traj_list, idx): 14 | """Load pose from SfM-Learner snippet trajectory 15 | Args: 16 | traj_list (dict): snippet trajectory (stamp -> data) 17 | idx (int): index 18 | Returns: 19 | pose_mat (4x4 array): pose array 20 | """ 21 | pose = traj_list[list(traj_list.keys())[idx]] 22 | pose = [float(i) for i in pose] 23 | pose_mat = np.eye(4) 24 | pose_mat[:3, :3] = quat2mat([pose[6], pose[3], pose[4], pose[5]]) 25 | pose_mat[:3, 3] = pose[:3] 26 | return pose_mat 27 | 28 | 29 | def save_traj(txt, poses): 30 | """Save trajectory (absolute poses) in KITTI odometry file format 31 | Args: 32 | txt (str): pose text file path 33 | poses (array dict): poses, each pose is 4x4 array 34 | """ 35 | with open(txt, "w") as f: 36 | for i in poses: 37 | pose = poses[i] 38 | pose = pose.flatten()[:12] 39 | line_to_write = " ".join([str(i) for i in pose]) 40 | f.writelines(line_to_write+"\n") 41 | print("Trajectory saved.") 42 | 43 | 44 | for seq in ['09', '10']: 45 | gt_files = sorted(glob(os.path.join(gt_dir, seq, "*.txt"))) 46 | pred_files = sorted(glob(os.path.join(pred_dir, seq, "*.txt"))) 47 | 48 | for pose_type in ['pred', 'gt']: 49 | poses = {0: np.eye(4)} 50 | for cnt in range(len(gt_files)): 51 | scale = 1 52 | if not(solve_global_pose): 53 | # Solve pose scale 54 | ate, scale = compute_ate(gt_files[cnt], pred_files[cnt]) 55 | 56 | # Read pred pose 57 | if pose_type == "pred": 58 | traj_list = read_file_list(pred_files[cnt]) 59 | elif pose_type == "gt": 60 | traj_list = read_file_list(gt_files[cnt]) 61 | 62 | if cnt < len(gt_files) - 1: 63 | # Read second pose in the traj file 64 | poses[cnt+1] = load_pose_from_list(traj_list, 1) 65 | poses[cnt+1][:3, 3] *= scale 66 | # Transform the pose w.r.t. the first frame 67 | poses[cnt+1] = poses[cnt] @ poses[cnt+1] 68 | else: 69 | # Read second to last poses 70 | for k in range(1, len(traj_list)): 71 | poses[cnt+k] = load_pose_from_list(traj_list, k) 72 | poses[cnt+k][:3, 3] *= scale 73 | poses[cnt+k] = poses[cnt+k-1] @ poses[cnt+k] 74 | 75 | if pose_type == "pred": 76 | poses_pred = copy.deepcopy(poses) 77 | elif pose_type == "gt": 78 | poses_gt = copy.deepcopy(poses) 79 | 80 | # If solve global pose 81 | if solve_global_pose: 82 | # Read XYZ 83 | gtruth_xyz = [] 84 | pred_xyz = [] 85 | for cnt in
poses: 86 | gtruth_xyz.append(poses_gt[cnt][:3, 3]) 87 | pred_xyz.append(poses_pred[cnt][:3, 3]) 88 | gtruth_xyz = np.asarray(gtruth_xyz) 89 | pred_xyz = np.asarray(pred_xyz) 90 | 91 | # Solve for global scale 92 | scale = np.sum(gtruth_xyz * pred_xyz)/np.sum(pred_xyz ** 2) 93 | 94 | # Update pose 95 | for cnt in poses: 96 | poses_pred[cnt][:3, 3] *= scale 97 | 98 | save_traj("./{}.txt".format(seq), poses_pred) 99 | -------------------------------------------------------------------------------- /tools/sfmlearner_odometry_tool/pose_evaluation_utils.py: -------------------------------------------------------------------------------- 1 | # Some of the code is from the TUM evaluation toolkit: 2 | # https://vision.in.tum.de/data/datasets/rgbd-dataset/tools#absolute_trajectory_error_ate 3 | 4 | import math 5 | import numpy as np 6 | 7 | def compute_ate(gtruth_file, pred_file): 8 | gtruth_list = read_file_list(gtruth_file) 9 | pred_list = read_file_list(pred_file) 10 | matches = associate(gtruth_list, pred_list, 0, 0.01) 11 | if len(matches) < 2: 12 | return 0.0, 1.0 # too few matches to align; report zero error and unit scale so callers can still unpack 13 | 14 | gtruth_xyz = np.array([[float(value) for value in gtruth_list[a][0:3]] for a,b in matches]) 15 | pred_xyz = np.array([[float(value) for value in pred_list[b][0:3]] for a,b in matches]) 16 | 17 | # Make sure that the first matched frames align (no need for rotational alignment as 18 | # all the predicted/ground-truth snippets have been converted to use the same coordinate 19 | # system with the first frame of the snippet being the origin). 20 | offset = gtruth_xyz[0] - pred_xyz[0] 21 | pred_xyz += offset[None,:] 22 | 23 | # Optimize the scaling factor 24 | scale = np.sum(gtruth_xyz * pred_xyz)/np.sum(pred_xyz ** 2) 25 | alignment_error = pred_xyz * scale - gtruth_xyz 26 | rmse = np.sqrt(np.sum(alignment_error ** 2))/len(matches) 27 | return rmse, scale 28 | 29 | def read_file_list(filename): 30 | """ 31 | Reads a trajectory from a text file. 32 | 33 | File format: 34 | The file format is "stamp d1 d2 d3 ...", where stamp denotes the time stamp (to be matched) 35 | and "d1 d2 d3.." is arbitrary data (e.g., a 3D position and 3D orientation) associated to this timestamp. 36 | 37 | Input: 38 | filename -- File name 39 | 40 | Output: 41 | dict -- dictionary of (stamp,data) tuples 42 | 43 | """ 44 | file = open(filename) 45 | data = file.read() 46 | lines = data.replace(","," ").replace("\t"," ").split("\n") 47 | list = [[v.strip() for v in line.split(" ") if v.strip()!=""] for line in lines if len(line)>0 and line[0]!="#"] 48 | list = [(float(l[0]),l[1:]) for l in list if len(l)>1] 49 | return dict(list) 50 | 51 | def associate(first_list, second_list,offset,max_difference): 52 | """ 53 | Associate two dictionaries of (stamp,data). As the time stamps never match exactly, we aim 54 | to find the closest match for every input tuple.
55 | 56 | Input: 57 | first_list -- first dictionary of (stamp,data) tuples 58 | second_list -- second dictionary of (stamp,data) tuples 59 | offset -- time offset between both dictionaries (e.g., to model the delay between the sensors) 60 | max_difference -- search radius for candidate generation 61 | 62 | Output: 63 | matches -- list of matched tuples ((stamp1,data1),(stamp2,data2)) 64 | 65 | """ 66 | first_keys = list(first_list.keys()) 67 | second_keys = list(second_list.keys()) 68 | potential_matches = [(abs(a - (b + offset)), a, b) 69 | for a in first_keys 70 | for b in second_keys 71 | if abs(a - (b + offset)) < max_difference] 72 | potential_matches.sort() 73 | matches = [] 74 | for diff, a, b in potential_matches: 75 | if a in first_keys and b in second_keys: 76 | first_keys.remove(a) 77 | second_keys.remove(b) 78 | matches.append((a, b)) 79 | 80 | matches.sort() 81 | return matches 82 | 83 | def rot2quat(R): 84 | rz, ry, rx = mat2euler(R) 85 | qw, qx, qy, qz = euler2quat(rz, ry, rx) 86 | return qw, qx, qy, qz 87 | 88 | def quat2mat(q): 89 | ''' Calculate rotation matrix corresponding to quaternion 90 | https://afni.nimh.nih.gov/pub/dist/src/pkundu/meica.libs/nibabel/quaternions.py 91 | Parameters 92 | ---------- 93 | q : 4 element array-like 94 | 95 | Returns 96 | ------- 97 | M : (3,3) array 98 | Rotation matrix corresponding to input quaternion *q* 99 | 100 | Notes 101 | ----- 102 | Rotation matrix applies to column vectors, and is applied to the 103 | left of coordinate vectors. The algorithm here allows non-unit 104 | quaternions. 105 | 106 | References 107 | ---------- 108 | Algorithm from 109 | http://en.wikipedia.org/wiki/Rotation_matrix#Quaternion 110 | 111 | Examples 112 | -------- 113 | >>> import numpy as np 114 | >>> M = quat2mat([1, 0, 0, 0]) # Identity quaternion 115 | >>> np.allclose(M, np.eye(3)) 116 | True 117 | >>> M = quat2mat([0, 1, 0, 0]) # 180 degree rotn around axis 0 118 | >>> np.allclose(M, np.diag([1, -1, -1])) 119 | True 120 | ''' 121 | w, x, y, z = q 122 | Nq = w*w + x*x + y*y + z*z 123 | if Nq < 1e-8: 124 | return np.eye(3) 125 | s = 2.0/Nq 126 | X = x*s 127 | Y = y*s 128 | Z = z*s 129 | wX = w*X; wY = w*Y; wZ = w*Z 130 | xX = x*X; xY = x*Y; xZ = x*Z 131 | yY = y*Y; yZ = y*Z; zZ = z*Z 132 | return np.array( 133 | [[ 1.0-(yY+zZ), xY-wZ, xZ+wY ], 134 | [ xY+wZ, 1.0-(xX+zZ), yZ-wX ], 135 | [ xZ-wY, yZ+wX, 1.0-(xX+yY) ]]) 136 | 137 | def mat2euler(M, cy_thresh=None, seq='zyx'): 138 | ''' 139 | Taken From: http://afni.nimh.nih.gov/pub/dist/src/pkundu/meica.libs/nibabel/eulerangles.py 140 | Discover Euler angle vector from 3x3 matrix 141 | Uses the conventions above. 142 | Parameters 143 | ---------- 144 | M : array-like, shape (3,3) 145 | cy_thresh : None or scalar, optional 146 | threshold below which to give up on straightforward arctan for 147 | estimating x rotation. If None (default), estimate from 148 | precision of input. 
149 | Returns 150 | ------- 151 | z : scalar 152 | y : scalar 153 | x : scalar 154 | Rotations in radians around z, y, x axes, respectively 155 | Notes 156 | ----- 157 | If there was no numerical error, the routine could be derived using 158 | Sympy expression for z then y then x rotation matrix, which is:: 159 | [ cos(y)*cos(z), -cos(y)*sin(z), sin(y)], 160 | [cos(x)*sin(z) + cos(z)*sin(x)*sin(y), cos(x)*cos(z) - sin(x)*sin(y)*sin(z), -cos(y)*sin(x)], 161 | [sin(x)*sin(z) - cos(x)*cos(z)*sin(y), cos(z)*sin(x) + cos(x)*sin(y)*sin(z), cos(x)*cos(y)] 162 | with the obvious derivations for z, y, and x 163 | z = atan2(-r12, r11) 164 | y = asin(r13) 165 | x = atan2(-r23, r33) 166 | for x,y,z order 167 | y = asin(-r31) 168 | x = atan2(r32, r33) 169 | z = atan2(r21, r11) 170 | Problems arise when cos(y) is close to zero, because both of:: 171 | z = atan2(cos(y)*sin(z), cos(y)*cos(z)) 172 | x = atan2(cos(y)*sin(x), cos(x)*cos(y)) 173 | will be close to atan2(0, 0), and highly unstable. 174 | The ``cy`` fix for numerical instability below is from: *Graphics 175 | Gems IV*, Paul Heckbert (editor), Academic Press, 1994, ISBN: 176 | 0123361559. Specifically it comes from EulerAngles.c by Ken 177 | Shoemake, and deals with the case where cos(y) is close to zero: 178 | See: http://www.graphicsgems.org/ 179 | The code appears to be licensed (from the website) as "can be used 180 | without restrictions". 181 | ''' 182 | M = np.asarray(M) 183 | if cy_thresh is None: 184 | try: 185 | cy_thresh = np.finfo(M.dtype).eps * 4 186 | except ValueError: 187 | cy_thresh = np.finfo(np.float64).eps * 4 # fallback when M.dtype carries no eps info 188 | r11, r12, r13, r21, r22, r23, r31, r32, r33 = M.flat 189 | # cy: sqrt((cos(y)*cos(z))**2 + (cos(x)*cos(y))**2) 190 | cy = math.sqrt(r33*r33 + r23*r23) 191 | if seq=='zyx': 192 | if cy > cy_thresh: # cos(y) not close to zero, standard form 193 | z = math.atan2(-r12, r11) # atan2(cos(y)*sin(z), cos(y)*cos(z)) 194 | y = math.atan2(r13, cy) # atan2(sin(y), cy) 195 | x = math.atan2(-r23, r33) # atan2(cos(y)*sin(x), cos(x)*cos(y)) 196 | else: # cos(y) (close to) zero, so x -> 0.0 (see above) 197 | # so r21 -> sin(z), r22 -> cos(z) and 198 | z = math.atan2(r21, r22) 199 | y = math.atan2(r13, cy) # atan2(sin(y), cy) 200 | x = 0.0 201 | elif seq=='xyz': 202 | if cy > cy_thresh: 203 | y = math.atan2(-r31, cy) 204 | x = math.atan2(r32, r33) 205 | z = math.atan2(r21, r11) 206 | else: 207 | z = 0.0 208 | if r31 < 0: 209 | y = np.pi/2 210 | x = math.atan2(r12, r13) 211 | else: 212 | y = -np.pi/2; x = math.atan2(-r12, -r13) # gimbal lock: fold z into x so x is always defined 213 | else: 214 | raise Exception('Sequence not recognized') 215 | return z, y, x 216 | 217 | import functools # used by euler2mat below 218 | def euler2mat(z=0, y=0, x=0, isRadian=True): 219 | ''' Return matrix for rotations around z, y and x axes 220 | Uses the z, then y, then x convention above 221 | Parameters 222 | ---------- 223 | z : scalar 224 | Rotation angle in radians around z-axis (performed first) 225 | y : scalar 226 | Rotation angle in radians around y-axis 227 | x : scalar 228 | Rotation angle in radians around x-axis (performed last) 229 | Returns 230 | ------- 231 | M : array shape (3,3) 232 | Rotation matrix giving same rotation as for given angles 233 | Examples 234 | -------- 235 | >>> zrot = 1.3 # radians 236 | >>> yrot = -0.1 237 | >>> xrot = 0.2 238 | >>> M = euler2mat(zrot, yrot, xrot) 239 | >>> M.shape == (3, 3) 240 | True 241 | The output rotation matrix is equal to the composition of the 242 | individual rotations 243 | >>> M1 = euler2mat(zrot) 244 | >>> M2 = euler2mat(0, yrot) 245 | >>> M3 = euler2mat(0, 0, xrot) 246 | >>> composed_M = np.dot(M3,
np.dot(M2, M1)) 247 | >>> np.allclose(M, composed_M) 248 | True 249 | You can specify rotations by named arguments 250 | >>> np.all(M3 == euler2mat(x=xrot)) 251 | True 252 | When applying M to a vector, the vector should be a column vector to the 253 | right of M. If the right hand side is a 2D array rather than a 254 | vector, then each column of the 2D array represents a vector. 255 | >>> vec = np.array([1, 0, 0]).reshape((3,1)) 256 | >>> v2 = np.dot(M, vec) 257 | >>> vecs = np.array([[1, 0, 0],[0, 1, 0]]).T # giving 3x2 array 258 | >>> vecs2 = np.dot(M, vecs) 259 | Rotations are counter-clockwise. 260 | >>> zred = np.dot(euler2mat(z=np.pi/2), np.eye(3)) 261 | >>> np.allclose(zred, [[0, -1, 0],[1, 0, 0], [0, 0, 1]]) 262 | True 263 | >>> yred = np.dot(euler2mat(y=np.pi/2), np.eye(3)) 264 | >>> np.allclose(yred, [[0, 0, 1],[0, 1, 0], [-1, 0, 0]]) 265 | True 266 | >>> xred = np.dot(euler2mat(x=np.pi/2), np.eye(3)) 267 | >>> np.allclose(xred, [[1, 0, 0],[0, 0, -1], [0, 1, 0]]) 268 | True 269 | Notes 270 | ----- 271 | The direction of rotation is given by the right-hand rule (orient 272 | the thumb of the right hand along the axis around which the rotation 273 | occurs, with the end of the thumb at the positive end of the axis; 274 | curl your fingers; the direction your fingers curl is the direction 275 | of rotation). Therefore, the rotations are counterclockwise if 276 | looking along the axis of rotation from positive to negative. 277 | ''' 278 | 279 | if not isRadian: 280 | z = ((np.pi)/180.) * z 281 | y = ((np.pi)/180.) * y 282 | x = ((np.pi)/180.) * x 283 | assert z>=(-np.pi) and z < np.pi, 'Inappropriate z: %f' % z 284 | assert y>=(-np.pi) and y < np.pi, 'Inappropriate y: %f' % y 285 | assert x>=(-np.pi) and x < np.pi, 'Inappropriate x: %f' % x 286 | 287 | Ms = [] 288 | if z: 289 | cosz = math.cos(z) 290 | sinz = math.sin(z) 291 | Ms.append(np.array( 292 | [[cosz, -sinz, 0], 293 | [sinz, cosz, 0], 294 | [0, 0, 1]])) 295 | if y: 296 | cosy = math.cos(y) 297 | siny = math.sin(y) 298 | Ms.append(np.array( 299 | [[cosy, 0, siny], 300 | [0, 1, 0], 301 | [-siny, 0, cosy]])) 302 | if x: 303 | cosx = math.cos(x) 304 | sinx = math.sin(x) 305 | Ms.append(np.array( 306 | [[1, 0, 0], 307 | [0, cosx, -sinx], 308 | [0, sinx, cosx]])) 309 | if Ms: 310 | return functools.reduce(np.dot, Ms[::-1]) 311 | return np.eye(3) 312 | 313 | def euler2quat(z=0, y=0, x=0, isRadian=True): 314 | ''' Return quaternion corresponding to these Euler angles 315 | Uses the z, then y, then x convention above 316 | Parameters 317 | ---------- 318 | z : scalar 319 | Rotation angle in radians around z-axis (performed first) 320 | y : scalar 321 | Rotation angle in radians around y-axis 322 | x : scalar 323 | Rotation angle in radians around x-axis (performed last) 324 | Returns 325 | ------- 326 | quat : array shape (4,) 327 | Quaternion in w, x, y z (real, then vector) format 328 | Notes 329 | ----- 330 | We can derive this formula in Sympy using: 331 | 1. Formula giving quaternion corresponding to rotation of theta radians 332 | about arbitrary axis: 333 | http://mathworld.wolfram.com/EulerParameters.html 334 | 2. Generated formulae from 1.) for quaternions corresponding to 335 | theta radians rotations about ``x, y, z`` axes 336 | 3. Apply quaternion multiplication formula - 337 | http://en.wikipedia.org/wiki/Quaternions#Hamilton_product - to 338 | formulae from 2.) to give formula for combined rotations. 339 | ''' 340 | 341 | if not isRadian: 342 | z = ((np.pi)/180.) * z 343 | y = ((np.pi)/180.)
* y 344 | x = ((np.pi)/180.) * x 345 | z = z/2.0 346 | y = y/2.0 347 | x = x/2.0 348 | cz = math.cos(z) 349 | sz = math.sin(z) 350 | cy = math.cos(y) 351 | sy = math.sin(y) 352 | cx = math.cos(x) 353 | sx = math.sin(x) 354 | return np.array([ 355 | cx*cy*cz - sx*sy*sz, 356 | cx*sy*sz + cy*cz*sx, 357 | cx*cz*sy - sx*cy*sz, 358 | cx*cy*sz + sx*cz*sy]) 359 | 360 | def pose_vec_to_mat(vec): 361 | tx = vec[0] 362 | ty = vec[1] 363 | tz = vec[2] 364 | trans = np.array([tx, ty, tz]).reshape((3,1)) 365 | rot = euler2mat(vec[5], vec[4], vec[3]) 366 | Tmat = np.concatenate((rot, trans), axis=1) 367 | hfiller = np.array([0, 0, 0, 1]).reshape((1,4)) 368 | Tmat = np.concatenate((Tmat, hfiller), axis=0) 369 | return Tmat 370 | 371 | def dump_pose_seq_TUM(out_file, poses, times): 372 | # First frame as the origin 373 | first_pose = pose_vec_to_mat(poses[0]) 374 | with open(out_file, 'w') as f: 375 | for p in range(len(times)): 376 | this_pose = pose_vec_to_mat(poses[p]) 377 | this_pose = np.dot(first_pose, np.linalg.inv(this_pose)) 378 | tx = this_pose[0, 3] 379 | ty = this_pose[1, 3] 380 | tz = this_pose[2, 3] 381 | rot = this_pose[:3, :3] 382 | qw, qx, qy, qz = rot2quat(rot) 383 | f.write('%f %f %f %f %f %f %f %f\n' % (times[p], tx, ty, tz, qx, qy, qz, qw)) --------------------------------------------------------------------------------
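
Appendix (not part of the repository): the Euler/quaternion/matrix helpers in `tools/sfmlearner_odometry_tool/pose_evaluation_utils.py` all use the same z-then-y-then-x convention, so a quick round-trip is a useful sanity check. A minimal sketch, assuming it is run from inside `tools/sfmlearner_odometry_tool/` with NumPy installed; the angle values are arbitrary examples:

```python
# Hypothetical sanity check (not in the repo): round-trip the rotation helpers.
import numpy as np
from pose_evaluation_utils import euler2mat, euler2quat, quat2mat, mat2euler

z, y, x = 0.3, -0.2, 0.1                 # radians, z-then-y-then-x convention
q = euler2quat(z, y, x)                  # quaternion in (w, x, y, z) order
R = quat2mat(q)                          # corresponding 3x3 rotation matrix

assert np.allclose(R, euler2mat(z, y, x))    # both constructions agree
assert np.allclose(mat2euler(R), (z, y, x))  # original angles are recovered
print("rotation helpers round-trip OK")
```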