├── LICENSE ├── code_of_Dsec ├── .gitignore ├── LTC_Dsec │ ├── model_LTC │ │ ├── embedding.py │ │ ├── errors.py │ │ ├── estimator.py │ │ ├── loss.py │ │ ├── matching.py │ │ ├── network.py │ │ ├── network_blocks.py │ │ ├── network_pds.py │ │ ├── pds_trainer.py │ │ ├── provider.py │ │ ├── regularization.py │ │ ├── size_adapter.py │ │ ├── temporal_aggregation.py │ │ ├── trainer.py │ │ ├── trainer_pds.py │ │ ├── transforms.py │ │ ├── utils │ │ │ └── __init__.py │ │ └── visualization.py │ └── run_experiment.py └── LTC_Dsec_spade │ ├── model_LTC │ ├── embedding.py │ ├── errors.py │ ├── estimator.py │ ├── loss.py │ ├── matching.py │ ├── network.py │ ├── network_blocks.py │ ├── network_pds.py │ ├── pds_trainer.py │ ├── provider.py │ ├── regularization.py │ ├── size_adapter.py │ ├── spade_e2v.py │ ├── temporal_aggregation.py │ ├── trainer.py │ ├── trainer_pds.py │ ├── transforms.py │ ├── utils │ │ └── __init__.py │ └── visualization.py │ └── run_experiment.py ├── code_of_mvsec ├── DTC_SPADE_for_mvsec │ ├── model_LTC │ │ ├── __pycache__ │ │ │ ├── dataset.cpython-38.pyc │ │ │ ├── dataset_constants.cpython-38.pyc │ │ │ ├── deform.cpython-38.pyc │ │ │ ├── embedding.cpython-38.pyc │ │ │ ├── errors.cpython-38.pyc │ │ │ ├── estimator.cpython-38.pyc │ │ │ ├── loss.cpython-38.pyc │ │ │ ├── matching.cpython-38.pyc │ │ │ ├── network.cpython-38.pyc │ │ │ ├── network_blocks.cpython-38.pyc │ │ │ ├── network_pds.cpython-38.pyc │ │ │ ├── pds_trainer.cpython-38.pyc │ │ │ ├── regularization.cpython-38.pyc │ │ │ ├── size_adapter.cpython-38.pyc │ │ │ ├── spade_e2v.cpython-38.pyc │ │ │ ├── stay_embedding.cpython-38.pyc │ │ │ ├── temporal_aggregation.cpython-38.pyc │ │ │ ├── trainer.cpython-38.pyc │ │ │ ├── trainer_pds.cpython-38.pyc │ │ │ ├── transforms.cpython-38.pyc │ │ │ └── visualization.cpython-38.pyc │ │ ├── dataset.py │ │ ├── dataset_constants.py │ │ ├── e2v_utils.py │ │ ├── embedding.py │ │ ├── errors.py │ │ ├── estimator.py │ │ ├── loss.py │ │ ├── matching.py │ │ ├── network.py │ │ ├── network_blocks.py │ │ ├── network_pds.py │ │ ├── pds_trainer.py │ │ ├── regularization.py │ │ ├── size_adapter.py │ │ ├── spade_e2v.py │ │ ├── stay_embedding.py │ │ ├── temporal_aggregation.py │ │ ├── trainer.py │ │ ├── trainer_pds.py │ │ ├── transforms.py │ │ └── visualization.py │ ├── run_experiment.py │ ├── test.sh │ └── train.sh ├── DTC_pds_for_mvsec │ ├── dataloader │ │ ├── __init__.py │ │ ├── __pycache__ │ │ │ ├── __init__.cpython-38.pyc │ │ │ ├── dataloader.cpython-38.pyc │ │ │ ├── dataset.cpython-38.pyc │ │ │ ├── dataset_constants.cpython-38.pyc │ │ │ └── transforms.cpython-38.pyc │ │ ├── dataloader.py │ │ ├── dataset.py │ │ ├── dataset_constants.py │ │ └── transforms.py │ ├── ltc_fixed_12345 │ │ ├── 035_checkpoint.bin │ │ ├── log.txt │ │ └── plot.png │ ├── model_LTC │ │ ├── __pycache__ │ │ │ ├── dataset.cpython-38.pyc │ │ │ ├── dataset_constants.cpython-38.pyc │ │ │ ├── deform.cpython-38.pyc │ │ │ ├── embedding.cpython-38.pyc │ │ │ ├── errors.cpython-38.pyc │ │ │ ├── estimator.cpython-38.pyc │ │ │ ├── loss.cpython-38.pyc │ │ │ ├── matching.cpython-38.pyc │ │ │ ├── network.cpython-38.pyc │ │ │ ├── network_blocks.cpython-38.pyc │ │ │ ├── network_pds.cpython-38.pyc │ │ │ ├── pds_trainer.cpython-38.pyc │ │ │ ├── regularization.cpython-38.pyc │ │ │ ├── size_adapter.cpython-38.pyc │ │ │ ├── temporal_aggregation.cpython-38.pyc │ │ │ ├── trainer.cpython-38.pyc │ │ │ ├── trainer_pds.cpython-38.pyc │ │ │ ├── transforms.cpython-38.pyc │ │ │ └── visualization.cpython-38.pyc │ │ ├── dataset.py │ │ ├── dataset_constants.py │ │ ├── 
deform.py │ │ ├── embedding.py │ │ ├── errors.py │ │ ├── estimator.py │ │ ├── loss.py │ │ ├── matching.py │ │ ├── network.py │ │ ├── network_blocks.py │ │ ├── network_pds.py │ │ ├── pds_trainer.py │ │ ├── regularization.py │ │ ├── size_adapter.py │ │ ├── temporal_aggregation.py │ │ ├── trainer.py │ │ ├── trainer_pds.py │ │ ├── transforms.py │ │ └── visualization.py │ ├── run_experiment.py │ ├── test.sh │ └── train.sh └── data_preprocess.py ├── readme.md └── requirements.txt /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Huawei-BIC 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /code_of_Dsec/.gitignore: -------------------------------------------------------------------------------- 1 | experiment* 2 | train.sh 3 | __pycache__/ 4 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec/model_LTC/embedding.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | from torch import nn 7 | 8 | from model_LTC import network_blocks 9 | 10 | 11 | class Embedding(nn.Module): 12 | """Embedding module.""" 13 | 14 | def __init__(self, 15 | number_of_input_features=3, 16 | number_of_embedding_features=16, 17 | number_of_shortcut_features=8, 18 | number_of_residual_blocks=2): 19 | """Returns initialized embedding module. 20 | 21 | Args: 22 | number_of_input_features: number of channels in the input image; 23 | number_of_embedding_features: number of channels in image's 24 | descriptor; 25 | number_of_shortcut_features: number of channels in the redirect 26 | connection descriptor; 27 | number_of_residual_blocks: number of residual blocks in embedding 28 | network. 
29 | """ 30 | super(Embedding, self).__init__() 31 | embedding_modules = [ 32 | nn.InstanceNorm2d(number_of_input_features), 33 | network_blocks.convolutional_block_5x5_stride_2( 34 | number_of_input_features, number_of_embedding_features), 35 | network_blocks.convolutional_block_5x5_stride_2( 36 | number_of_embedding_features, number_of_embedding_features), 37 | ] 38 | embedding_modules += [ 39 | network_blocks.ResidualBlock(number_of_embedding_features) 40 | for _ in range(number_of_residual_blocks) 41 | ] 42 | self._embedding_modules = nn.ModuleList(embedding_modules) 43 | self._shortcut = network_blocks.convolutional_block_3x3( 44 | number_of_embedding_features, number_of_shortcut_features) 45 | 46 | def forward(self, image): 47 | """Returns image's descriptor and redirect connection descriptor. 48 | 49 | Args: 50 | image: color image of size 51 | batch_size x 3 x height x width; 52 | 53 | Returns: 54 | descriptor: image's descriptor of size 55 | batch_size x 64 x (height / 4) x (width / 4); 56 | shortcut_from_left_image: shortcut connection from left image 57 | descriptor (it is used in regularization network). It 58 | is tensor of size 59 | (batch_size, 8, height / 4, width / 4). 60 | """ 61 | # print("image.size()", image.size()) 62 | descriptor = image 63 | for embedding_module in self._embedding_modules: 64 | descriptor = embedding_module(descriptor) 65 | 66 | return descriptor, self._shortcut(descriptor) 67 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec/model_LTC/errors.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | import torch as th 7 | import math 8 | 9 | def compute_absolute_error(estimated_disparity, 10 | ground_truth_disparity, 11 | use_mean=True): 12 | """Returns pixel-wise and mean absolute error. 13 | 14 | Locations where ground truth is not avaliable do not contribute to mean 15 | absolute error. In such locations pixel-wise error is shown as zero. 16 | If ground truth is not avaliable in all locations, function returns 0. 17 | 18 | Args: 19 | ground_truth_disparity: ground truth disparity where locations with 20 | unknow disparity are set to inf's. 21 | estimated_disparity: estimated disparity. 22 | use_mean: if True than use mean to average pixelwise errors, 23 | otherwise use median. 
24 | """ 25 | absolute_difference = (estimated_disparity - ground_truth_disparity).abs() 26 | locations_without_ground_truth = th.isinf(ground_truth_disparity) 27 | pixelwise_absolute_error = absolute_difference.clone() 28 | pixelwise_absolute_error[locations_without_ground_truth] = 0 29 | absolute_differece_with_ground_truth = absolute_difference[ 30 | ~locations_without_ground_truth] 31 | if absolute_differece_with_ground_truth.numel() == 0: 32 | average_absolute_error = 0.0 33 | else: 34 | if use_mean: 35 | average_absolute_error = absolute_differece_with_ground_truth.mean( 36 | ).item() 37 | else: 38 | average_absolute_error = absolute_differece_with_ground_truth.median( 39 | ).item() 40 | return pixelwise_absolute_error, average_absolute_error 41 | 42 | 43 | def RMSE(estimated_disparity, 44 | ground_truth_disparity, 45 | use_mean=True): 46 | absolute_difference = (estimated_disparity - ground_truth_disparity) ** 2 47 | locations_without_ground_truth = th.isinf(ground_truth_disparity) 48 | pixelwise_absolute_error = absolute_difference.clone() 49 | pixelwise_absolute_error[locations_without_ground_truth] = 0 50 | absolute_differece_with_ground_truth = absolute_difference[ 51 | ~locations_without_ground_truth] 52 | if absolute_differece_with_ground_truth.numel() == 0: 53 | average_absolute_error = 0.0 54 | else: 55 | if use_mean: 56 | average_absolute_error = math.sqrt(absolute_differece_with_ground_truth.mean()) 57 | else: 58 | average_absolute_error = math.sqrt(absolute_differece_with_ground_truth.median()) 59 | return pixelwise_absolute_error, average_absolute_error 60 | 61 | 62 | def compute_n_pixels_error(estimated_disparity, ground_truth_disparity, n=3.0): 63 | """Return pixel-wise n-pixels error and % of pixels with n-pixels error. 64 | 65 | Locations where ground truth is not avaliable do not contribute to mean 66 | n-pixel error. In such locations pixel-wise error is shown as zero. 67 | 68 | Note that n-pixel error is equal to one if 69 | |estimated_disparity-ground_truth_disparity| > n and zero otherwise. 70 | 71 | If ground truth is not avaliable in all locations, function returns 0. 72 | 73 | Args: 74 | ground_truth_disparity: ground truth disparity where locations with 75 | unknow disparity are set to inf's. 76 | estimated_disparity: estimated disparity. 77 | n: maximum absolute disparity difference, that does not trigger 78 | n-pixel error. 79 | """ 80 | locations_without_ground_truth = th.isinf(ground_truth_disparity) 81 | more_than_n_pixels_absolute_difference = ( 82 | estimated_disparity - ground_truth_disparity).abs().gt(n).float() 83 | pixelwise_n_pixels_error = more_than_n_pixels_absolute_difference.clone() 84 | pixelwise_n_pixels_error[locations_without_ground_truth] = 0.0 85 | more_than_n_pixels_absolute_difference_with_ground_truth = \ 86 | more_than_n_pixels_absolute_difference[~locations_without_ground_truth] 87 | if more_than_n_pixels_absolute_difference_with_ground_truth.numel() == 0: 88 | percentage_of_pixels_with_error = 0.0 89 | else: 90 | percentage_of_pixels_with_error = \ 91 | more_than_n_pixels_absolute_difference_with_ground_truth.mean( 92 | ).item() * 100 93 | return pixelwise_n_pixels_error, percentage_of_pixels_with_error 94 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec/model_LTC/estimator.py: -------------------------------------------------------------------------------- 1 | # © All rights reserved. 
2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | from torch.nn import functional 7 | import torch as th 8 | import numpy as np 9 | import time 10 | 11 | class SubpixelMap(object): 12 | """Approximation of a sub-pixel MAP estimator. 13 | 14 | In every location (x, y), the function collects similarity scores 15 | for disparities in the vicinity of the disparity with maximum similarity 16 | score and converts them to a disparity distribution using softmax. 17 | Next, the disparity in every location (x, y) is computed as the mean 18 | of this distribution. 19 | 20 | It is used only for inference. 21 | """ 22 | 23 | def __init__(self, half_support_window=4, disparity_step=2): 24 | super(SubpixelMap, self).__init__() 25 | """Returns object of SubpixelMap class. 26 | 27 | Args: 28 | disparity_step: step in pixels between near-by disparities in 29 | input "similarities" tensor. 30 | half_support_window: defines size of disparity window in pixels 31 | around disparity with maximum similarity, 32 | which is used to convert similarities 33 | to probabilities and compute mean. 34 | """ 35 | if disparity_step < 1: 36 | raise ValueError('"disparity_step" should be a positive integer.') 37 | if half_support_window < 1: 38 | raise ValueError( 39 | '"half_support_window" should be a positive integer.') 40 | if half_support_window % disparity_step != 0: 41 | raise ValueError('"half_support_window" should be a multiple of the' 42 | '"disparity_step"') 43 | self._disparity_step = disparity_step 44 | self._half_support_window = half_support_window 45 | 46 | def __call__(self, similarities): 47 | """Returns sub-pixel disparity. 48 | 49 | Args: 50 | similarities: Tensor with similarities for every 51 | disparity and every location with indices 52 | [batch_index, disparity_index, y, x]. 53 | 54 | Returns: 55 | Tensor with disparities for every location with 56 | indices [batch_index, y, x]. 57 | """ 58 | # In every location (x, y) find disparity with maximum similarity 59 | # score. 60 | # start_time = time.time() 61 | maximum_similarity, disparity_index_with_maximum_similarity = \ 62 | th.max(similarities, dim=1, keepdim=True) 63 | support_disparities, support_similarities = [], [] 64 | maximum_disparity_index = similarities.size(1) 65 | 66 | # Collect similarity scores for the disparities around the disparity 67 | # with the maximum similarity score.
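        # Worked example (illustrative): with the defaults half_support_window=4
        # and disparity_step=2, the loop below visits index shifts -2..2 around
        # the argmax. If the maximum similarity sits at disparity index 5
        # (disparity 10), the similarities at disparities 6, 8, 10, 12 and 14
        # are soft-maxed and their weighted mean is returned as the sub-pixel
        # disparity.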
68 | for disparity_index_shift in range( 69 | -self._half_support_window // self._disparity_step, 70 | self._half_support_window // self._disparity_step + 1): 71 | disparity_index = (disparity_index_with_maximum_similarity + 72 | disparity_index_shift).float() 73 | invalid_disparity_index_mask = ( 74 | (disparity_index < 0) | 75 | (disparity_index >= maximum_disparity_index)) 76 | disparity_index[invalid_disparity_index_mask] = 0 77 | nearby_similarities = th.gather(similarities, 1, 78 | disparity_index.long()) 79 | nearby_similarities[invalid_disparity_index_mask] = -float('inf') 80 | support_similarities.append(nearby_similarities) 81 | nearby_disparities = th.gather( 82 | (self._disparity_step * 83 | disparity_index).expand_as(similarities), 1, 84 | disparity_index.long()) 85 | support_disparities.append(nearby_disparities) 86 | support_similarities = th.stack(support_similarities, dim=1) 87 | support_disparities = th.stack(support_disparities, dim=1) 88 | 89 | # Convert collected similarity scores to the disparity distribution 90 | # using softmax and compute disparity as a mean of this distribution. 91 | probabilities = functional.softmax(support_similarities, dim=1) 92 | disparities = th.sum(probabilities * support_disparities.float(), 1) 93 | # print(time.time()-start_time) 94 | # print(time.time()-start_time) 95 | # print(time.time()-start_time) 96 | return disparities.squeeze(1) 97 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec/model_LTC/loss.py: -------------------------------------------------------------------------------- 1 | # © All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | import torch as th 7 | 8 | from torch import nn 9 | from torch.nn import functional 10 | # import torch.distributed as dist 11 | 12 | def _unnormalized_laplace_probability(value, location, diversity): 13 | return th.exp(-th.abs(location - value) / diversity) / (2 * diversity) 14 | 15 | 16 | class SubpixelCrossEntropy(nn.Module): 17 | def __init__(self, diversity=1.0, disparity_step=2): 18 | """Returns SubpixelCrossEntropy object. 19 | 20 | Args: 21 | disparity_step: disparity difference between near-by 22 | disparity indices in "similarities" tensor. 23 | diversity: diversity of the target Laplace distribution, 24 | centered at the sub-pixel ground truth. 25 | """ 26 | super(SubpixelCrossEntropy, self).__init__() 27 | self._diversity = diversity 28 | self._disparity_step = disparity_step 29 | 30 | def forward(self, similarities, ground_truth_disparities, weights=None): 31 | """Returns sub-pixel cross-entropy loss. 32 | 33 | Cross-entropy is computed as 34 | 35 | - sum_d log( P_predicted(d) ) x P_target(d) 36 | ------------------------------------------------- 37 | sum_d P_target(d) 38 | 39 | We need to normalize the cross-entropy by sum_d P_target(d), 40 | since the target distribution is not normalized. 41 | 42 | Args: 43 | ground_truth_disparities: Tensor with ground truth disparities with 44 | indices [example_index, y, x]. The 45 | disparity values are floats. The locations with unknown 46 | disparities are filled with 'inf's. 47 | similarities: Tensor with similarities with indices 48 | [example_index, disparity_index, y, x]. 49 | weights: Tensor with weights of individual locations. 
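        Example (illustrative): with disparity_step=2 and diversity=1.0,
        the unnormalized target weight of disparity index d for ground
        truth g is exp(-|2 * d - g|) / 2, so for g = 5.0 the indices 2
        and 3 (disparities 4 and 6) both receive weight exp(-1) / 2.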
50 | """ 51 | maximum_disparity_index = similarities.size(1) 52 | known_ground_truth_disparity = ground_truth_disparities.data != float( 53 | 'inf') 54 | log_P_predicted = functional.log_softmax(similarities, dim=1) 55 | sum_P_target = th.zeros(ground_truth_disparities.size()) 56 | sum_P_target_x_log_P_predicted = th.zeros( 57 | ground_truth_disparities.size()) 58 | if similarities.is_cuda: 59 | sum_P_target = sum_P_target.cuda() 60 | sum_P_target_x_log_P_predicted = \ 61 | sum_P_target_x_log_P_predicted.cuda() 62 | for disparity_index in range(maximum_disparity_index): 63 | disparity = disparity_index * self._disparity_step 64 | P_target = _unnormalized_laplace_probability( 65 | value=disparity, 66 | location=ground_truth_disparities, 67 | diversity=self._diversity) 68 | sum_P_target += P_target 69 | sum_P_target_x_log_P_predicted += ( 70 | log_P_predicted[:, disparity_index] * P_target) 71 | entropy = -sum_P_target_x_log_P_predicted[ 72 | known_ground_truth_disparity] / sum_P_target[ 73 | known_ground_truth_disparity] 74 | if weights is not None: 75 | weights_with_ground_truth = weights[known_ground_truth_disparity] 76 | return (weights_with_ground_truth * entropy).sum() / ( 77 | weights_with_ground_truth.sum() + 1e-15) 78 | # print("entropy.mean",entropy.mean()) 79 | return entropy.mean() 80 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec/model_LTC/network.py: -------------------------------------------------------------------------------- 1 | # CopyriVght 2021 Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: Apache-2.0 3 | from torch import nn 4 | import torch 5 | from model_LTC import temporal_aggregation 6 | 7 | from model_LTC import embedding 8 | from model_LTC import estimator 9 | from model_LTC import matching 10 | from model_LTC import network_pds as network 11 | from model_LTC import network_blocks 12 | from model_LTC import regularization 13 | from model_LTC import size_adapter 14 | 15 | import time 16 | 17 | class DenseDeepEventStereo(network.PdsNetwork): 18 | def __init__(self, size_adapter_module, temporal_aggregation_module, 19 | spatial_aggregation_module, matching_module, 20 | regularization_module, estimator_module): 21 | super(DenseDeepEventStereo, 22 | self).__init__(size_adapter_module, spatial_aggregation_module, 23 | matching_module, regularization_module, 24 | estimator_module) 25 | self._temporal_aggregation = temporal_aggregation_module 26 | 27 | @staticmethod 28 | def default_with_continuous_fully_connected(hyper_params, maximum_disparity, embedding_features, embedding_shortcuts,matching_concat_features,matching_features,matching_shortcuts,matching_residual_blocks): 29 | """Returns default network with continuous fully connected.""" 30 | stereo_network = DenseDeepEventStereo( 31 | size_adapter_module=size_adapter.SizeAdapter(), 32 | temporal_aggregation_module=temporal_aggregation. 
33 | ContinuousFullyConnected(hyper_params), 34 | spatial_aggregation_module=embedding.Embedding( 35 | number_of_input_features=hyper_params['nltc'], number_of_embedding_features=embedding_features, 36 | number_of_shortcut_features=embedding_shortcuts), 37 | matching_module=matching.Matching( 38 | operation=matching.MatchingOperation(number_of_concatenated_descriptor_features=matching_concat_features, number_of_features=matching_features, number_of_compact_matching_signature_features=matching_shortcuts, number_of_residual_blocks=matching_residual_blocks), maximum_disparity=0), 39 | regularization_module=regularization.Regularization(), 40 | estimator_module=estimator.SubpixelMap()) 41 | stereo_network.set_maximum_disparity(maximum_disparity) 42 | return stereo_network 43 | 44 | def forward(self, left_event_queue, right_event_queue): 45 | # input:[2, 2, 1, 320, 384] 46 | LTC_start_time = time.time() 47 | left_projected_events = self._temporal_aggregation( 48 | self._size_adapter.pad(left_event_queue), self.training) 49 | # print("leftpro",left_projected_events.size()) 1 32 320 384 50 | 51 | right_projected_events = self._temporal_aggregation( 52 | self._size_adapter.pad(right_event_queue), self.training) 53 | network_output = self.pass_through_network(left_projected_events, 54 | right_projected_events)[0] 55 | if not self.training: 56 | start_time = time.time() 57 | network_output = self._estimator(network_output) 58 | return self._size_adapter.unpad(network_output) 59 | 60 | 61 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec/model_LTC/network_blocks.py: -------------------------------------------------------------------------------- 1 | # © All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 
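# Illustrative shape check for the helper blocks below (a sketch; assumes this
# module is importable as model_LTC.network_blocks):
#
#     import torch
#     from model_LTC import network_blocks
#     block = network_blocks.convolutional_block_5x5_stride_2(3, 16)
#     out = block(torch.rand(1, 3, 64, 64))  # -> torch.Size([1, 16, 32, 32])
#
# A 5x5 convolution with stride 2 and padding 2 halves each spatial dimension,
# so the two such blocks in embedding.Embedding give its 4x downsampling.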
5 | 6 | from torch import nn 7 | 8 | 9 | def convolution_3x3x3(number_of_input_features, number_of_output_features, 10 | stride): 11 | return nn.Conv3d( 12 | number_of_input_features, 13 | number_of_output_features, 14 | kernel_size=3, 15 | stride=stride, 16 | padding=1) 17 | 18 | 19 | def convolution_3x3(number_of_input_features, number_of_output_features): 20 | return nn.Conv2d( 21 | number_of_input_features, 22 | number_of_output_features, 23 | kernel_size=3, 24 | padding=1) 25 | 26 | 27 | def convolution_5x5_stride_2(number_of_input_features, 28 | number_of_output_features): 29 | return nn.Conv2d( 30 | number_of_input_features, 31 | number_of_output_features, 32 | kernel_size=5, 33 | stride=2, 34 | padding=2) 35 | 36 | 37 | def transposed_convolution_3x4x4_stride_122(number_of_input_features, 38 | number_of_output_features): 39 | return nn.ConvTranspose3d( 40 | number_of_input_features, 41 | number_of_output_features, 42 | kernel_size=(3, 4, 4), 43 | stride=(1, 2, 2), 44 | padding=(1, 1, 1)) 45 | 46 | 47 | def convolution_block_2D_with_relu_and_instance_norm(number_of_input_features, 48 | number_of_output_features, 49 | kernel_size, stride): 50 | return nn.Sequential( 51 | nn.Conv2d( 52 | number_of_input_features, 53 | number_of_output_features, 54 | kernel_size=kernel_size, 55 | stride=stride, 56 | padding=kernel_size // 2), 57 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 58 | nn.InstanceNorm2d(number_of_output_features, affine=True)) 59 | 60 | 61 | def convolution_block_3D_with_relu_and_instance_norm(number_of_input_features, 62 | number_of_output_features, 63 | kernel_size, stride): 64 | return nn.Sequential( 65 | nn.Conv3d( 66 | number_of_input_features, 67 | number_of_output_features, 68 | kernel_size=kernel_size, 69 | stride=stride, 70 | padding=kernel_size // 2), 71 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 72 | nn.InstanceNorm3d(number_of_output_features, affine=True)) 73 | 74 | 75 | def transposed_convololution_block_3D_with_relu_and_instance_norm( 76 | number_of_input_features, number_of_output_features, kernel_size, 77 | stride, padding): 78 | return nn.Sequential( 79 | nn.ConvTranspose3d( 80 | number_of_input_features, 81 | number_of_output_features, 82 | kernel_size=kernel_size, 83 | stride=stride, 84 | padding=padding), nn.LeakyReLU(negative_slope=0.1, inplace=True), 85 | nn.InstanceNorm3d(number_of_output_features, affine=True)) 86 | 87 | 88 | def convolutional_block_5x5_stride_2(number_of_input_features, 89 | number_of_output_features): 90 | return convolution_block_2D_with_relu_and_instance_norm( 91 | number_of_input_features, 92 | number_of_output_features, 93 | kernel_size=5, 94 | stride=2) 95 | 96 | 97 | def convolutional_block_3x3(number_of_input_features, 98 | number_of_output_features): 99 | return convolution_block_2D_with_relu_and_instance_norm( 100 | number_of_input_features, 101 | number_of_output_features, 102 | kernel_size=3, 103 | stride=1) 104 | 105 | 106 | def convolutional_block_3x3x3(number_of_input_features, 107 | number_of_output_features): 108 | return convolution_block_3D_with_relu_and_instance_norm( 109 | number_of_input_features, 110 | number_of_output_features, 111 | kernel_size=3, 112 | stride=1) 113 | 114 | 115 | def convolutional_block_3x3x3_stride_2(number_of_input_features, 116 | number_of_output_features): 117 | return convolution_block_3D_with_relu_and_instance_norm( 118 | number_of_input_features, 119 | number_of_output_features, 120 | kernel_size=3, 121 | stride=2) 122 | 123 | 124 | def 
transposed_convolutional_block_4x4x4_stride_2(number_of_input_features, 125 | number_of_output_features): 126 | return transposed_convololution_block_3D_with_relu_and_instance_norm( 127 | number_of_input_features, 128 | number_of_output_features, 129 | kernel_size=4, 130 | stride=2, 131 | padding=1) 132 | 133 | 134 | class ResidualBlock(nn.Module): 135 | """Residual block with nonlinearity before addition.""" 136 | 137 | def __init__(self, number_of_features): 138 | super(ResidualBlock, self).__init__() 139 | self.convolutions = nn.Sequential( 140 | convolutional_block_3x3(number_of_features, number_of_features), 141 | convolutional_block_3x3(number_of_features, number_of_features)) 142 | 143 | def forward(self, block_input): 144 | return self.convolutions(block_input) + block_input 145 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec/model_LTC/network_pds.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | from torch import nn 6 | from model_LTC import embedding 7 | from model_LTC import estimator 8 | from model_LTC import matching 9 | from model_LTC import regularization 10 | from model_LTC import size_adapter 11 | 12 | import time 13 | 14 | 15 | class PdsNetwork(nn.Module): 16 | """Practical Deep Stereo (PDS) network.""" 17 | 18 | def __init__(self, size_adapter_module, embedding_module, matching_module, 19 | regularization_module, estimator_module): 20 | super(PdsNetwork, self).__init__() 21 | self._size_adapter = size_adapter_module 22 | self._embedding = embedding_module 23 | self._matching = matching_module 24 | self._regularization = regularization_module 25 | self._estimator = estimator_module 26 | 27 | def set_maximum_disparity(self, maximum_disparity): 28 | """Reconfigure network for different disparity range.""" 29 | # if (maximum_disparity + 1) % 64 != 0: 30 | # raise ValueError( 31 | # '"maximum_disparity" + 1 should be multiple of 64, e.g.,' 32 | # '"maximum disparity" can be equal to 63, 191, 255, 319...') 33 | self._maximum_disparity = maximum_disparity 34 | # During the embedding spatial dimensions of an input are downsampled 35 | # 4x times. Therefore, "maximum_disparity" of matching module is 36 | # computed as (maximum_disparity + 1) / 4 - 1. 
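        # For example, with maximum_disparity=255 the matching module is
        # configured with (255 + 1) // 4 - 1 = 63, i.e. disparity indices
        # 0..63 at quarter resolution.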
37 | # self._matching.set_maximum_disparity((maximum_disparity + 1) // 4 - 1) 38 | self._matching.set_maximum_disparity((maximum_disparity + 1) // 4 - 1) 39 | 40 | def pass_through_network(self, left_image, right_image): 41 | start_time = time.time() 42 | left_descriptor, shortcut_from_left = self._embedding(left_image) 43 | right_descriptor = self._embedding(right_image)[0] 44 | # print("Embedding Duration:{:.4f}s".format(time.time()-start_time)) 45 | 46 | start_time = time.time() 47 | matching_signatures = self._matching(left_descriptor, right_descriptor) 48 | # print("Matching Duration:{:.4f}s".format(time.time()-start_time)) 49 | 50 | start_time = time.time() 51 | output = self._regularization(matching_signatures, shortcut_from_left) 52 | # print("Cost volume:{:.4f}s".format(time.time()-start_time)) 53 | 54 | return output, shortcut_from_left 55 | 56 | 57 | def forward(self, left_image, right_image): 58 | """Returns sub-pixel disparity (or matching cost in training mode).""" 59 | network_output = self.pass_through_network( 60 | self._size_adapter.pad(left_image), 61 | self._size_adapter.pad(right_image))[0] 62 | if not self.training: 63 | network_output = self._estimator(network_output) 64 | return self._size_adapter.unpad(network_output) 65 | 66 | @staticmethod 67 | def default(maximum_disparity=255): 68 | """Returns network with default parameters.""" 69 | network = PdsNetwork( 70 | size_adapter_module=size_adapter.SizeAdapter(), 71 | embedding_module=embedding.Embedding(), 72 | matching_module=matching.Matching( 73 | operation=matching.MatchingOperation(), maximum_disparity=0), 74 | regularization_module=regularization.Regularization(), 75 | estimator_module=estimator.SubpixelMap()) 76 | network.set_maximum_disparity(maximum_disparity) 77 | return network 78 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec/model_LTC/provider.py: -------------------------------------------------------------------------------- 1 | from pathlib import Path 2 | import random 3 | import torch 4 | import PIL.Image 5 | import cv2 6 | import numpy as np 7 | from torch.utils.data import Dataset 8 | 9 | class dataset(Dataset): 10 | def __init__(self,path, pre_frame=6): 11 | self.path = path 12 | self.pre_frame = pre_frame 13 | 14 | def __len__(self): 15 | return len(self.path) 16 | 17 | def get_example(self, index): 18 | 19 | left_path = self.path[index] 20 | right_path = left_path.parent.parent/'right'/left_path.parts[-1] 21 | 22 | gt_name = left_path.parts[-1][:6]+'.png' 23 | gt_path = Path('/media/HDD1/personal_files/zkx/datasets/Dsec/train/train_disparity/')/left_path.parent.parent.parts[-1]/'disparity'/'event'/gt_name 24 | 25 | frame_index = int(left_path.stem) 26 | left_eq, right_eq = [], [] 27 | first_index = int(max(1,(frame_index-self.pre_frame*2)))# this should be 1, because we don't have 000000.npy; the first .npy index is 000002.npy.
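        # Illustrative: with frame_index=16 and pre_frame=6, first_index =
        # max(1, 16 - 12) = 4 and the loop below loads frames 16, 14, 12, 10,
        # 8, 6 (newest first). Early in a sequence, e.g. frame_index=6, only
        # frames 6, 4 and 2 are loaded, and the padding further below
        # duplicates the oldest loaded frame until pre_frame frames are
        # collected.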
28 | for previous_frame_index in range(frame_index, first_index, -2): 29 | 30 | left_eq.append(torch.from_numpy(np.load(left_path.parent/'{:06d}.npy'.format(previous_frame_index)))) 31 | right_eq.append(torch.from_numpy(np.load(right_path.parent/'{:06d}.npy'.format(previous_frame_index)))) 32 | 33 | if len(left_eq) < self.pre_frame: 34 | total_need = self.pre_frame-len(left_eq) 35 | # print(total_need) 36 | for i in range(total_need): 37 | left_copy_last_frame = left_eq[-1].clone() 38 | right_copy_last_frame = right_eq[-1].clone() 39 | left_eq.append(left_copy_last_frame) 40 | right_eq.append(right_copy_last_frame) 41 | 42 | 43 | left_event_queue = torch.cat(left_eq,0).float() 44 | right_event_queue = torch.cat(right_eq,0).float() 45 | 46 | disp_16bit = cv2.imread(str(gt_path), cv2.IMREAD_ANYDEPTH) 47 | disp_16bit = disp_16bit.astype('float32')/256.0 48 | valid_disp = (disp_16bit > 0) 49 | disp_16bit[~valid_disp] = float('inf') 50 | gt_disparity = torch.from_numpy(np.array(disp_16bit)) 51 | 52 | return { 53 | 'left':{ 54 | 'event_queue':left_event_queue, 55 | 'disparity_image': gt_disparity 56 | }, 57 | 58 | 'right':{ 59 | 'event_queue': right_event_queue 60 | }, 61 | 62 | 'frame_index': int(left_path.parts[-1][:6]), 63 | } 64 | 65 | 66 | def __getitem__(self, index): 67 | 68 | assert index1: 55 | stereo_network = th.nn.DataParallel(stereo_network) 56 | 57 | print("Preparation ends. Duration: ", time.time()-start_time) 58 | print("Parameters:",np.sum([p.numel() for p in stereo_network.parameters()]).item()) 59 | 60 | return { 61 | 'network': stereo_network, 62 | 'optimizer': optimizer, 63 | 'criterion': criterion, 64 | 'learning_rate_scheduler': learning_rate_scheduler, 65 | 'training_set_loader': training_set_loader, 66 | 'test_set_loader': test_set_loader, 67 | 'end_epoch': end_epoch, 68 | 'experiment_folder': experiment_folder, 69 | 'spec_title': spec_title, 70 | 'DA': DA 71 | } 72 | 73 | 74 | parser = argparse.ArgumentParser() 75 | parser.add_argument('--batch_size', type=int, default=2) 76 | parser.add_argument('--experiment_folder', default= None, type=str) 77 | parser.add_argument('--checkpoint_file', default= None, type=str) 78 | parser.add_argument('--dataset_folder', type=str) 79 | parser.add_argument('--test_mode', default=False, action='store_true') 80 | parser.add_argument('--main_lr', type=float, default=None) 81 | parser.add_argument('--pre_frame', type=int, default=6) 82 | parser.add_argument('--temporal_aggregation_lr',type=float, default=None) 83 | parser.add_argument('--end_epoch',type=int, default=22) 84 | parser.add_argument('--milestone',type=int, nargs='+') 85 | parser.add_argument('--DA',default=False, action='store_true') 86 | 87 | parser.add_argument('--maximum_disparity', default=127, type=int) 88 | parser.add_argument('--embedding_features', default=32, type=int) 89 | parser.add_argument('--embedding_shortcuts', default=8, type=int) 90 | parser.add_argument('--matching_concat_features', default=64, type=int) 91 | parser.add_argument('--matching_features', default=64, type=int) 92 | parser.add_argument('--matching_shortcuts', default=8, type=int) 93 | parser.add_argument('--matching_residual_blocks', default=2, type=int) 94 | 95 | parser.add_argument('--all_data',action='store_true') 96 | parser.add_argument('--ltc_hparams', default={'use_erevin':False, 'taum_ini':[.5,.8], 'nltc': 32, 'usetaum':True, 'ltcv1':True}, type=dict) 97 | parser.add_argument('--num_plane',type=int, default=10) 98 | parser.add_argument('--data_hparams', default={'use10ms': True, 
'usenorm': False, 'pre_nframes':10}, type=dict) 99 | parser.add_argument('--share_hparams', default={'stream_opt':False, 'burn_in_time':5}, type=dict) 100 | parser.add_argument('--spec_title', default = 4000, type=int) # for normal training and general testing 101 | args = parser.parse_args() 102 | args.ltc_hparams['num_plane'] = args.num_plane 103 | # print(args.ltc_hparams) 104 | print(args) 105 | 106 | 107 | for i,v in args.share_hparams.items(): 108 | args.ltc_hparams[i] = v 109 | args.data_hparams[i] = v 110 | 111 | 112 | def set_seed(seed=0): 113 | random.seed(seed) 114 | np.random.seed(seed) 115 | th.manual_seed(seed) 116 | th.cuda.manual_seed(seed) 117 | th.cuda.manual_seed_all(seed) 118 | th.backends.cudnn.deterministic = True 119 | th.backends.cudnn.benchmark = False 120 | 121 | 122 | if __name__ == '__main__': 123 | 124 | 125 | set_seed(12345) 126 | main_lr = 1e-3 127 | temporal_aggregation_lr = 5e-3 128 | 129 | dataset_folder = args.dataset_folder 130 | experiment_folder = args.experiment_folder 131 | 132 | if not os.path.isdir(experiment_folder): 133 | os.mkdir(experiment_folder) 134 | 135 | parameters = _initialize_parameters( 136 | dataset_folder, experiment_folder, args.test_mode, main_lr,temporal_aggregation_lr, args.ltc_hparams, args.spec_title, args.data_hparams, args.DA, args.pre_frame,args.end_epoch, args.milestone, args.all_data, args.maximum_disparity,args.embedding_features, args.embedding_shortcuts,args.matching_concat_features,args.matching_features,args.matching_shortcuts,args.matching_residual_blocks ) 137 | 138 | stereo_trainer = trainer.Trainer(parameters) 139 | 140 | if args.checkpoint_file: 141 | stereo_trainer.load_checkpoint(args.checkpoint_file, load_only_network=True) 142 | 143 | if args.test_mode: 144 | print("Testing.") 145 | stereo_trainer.test() 146 | else: 147 | print("Training.") 148 | stereo_trainer.train() 149 | 150 | 151 | 152 | 153 | 154 | 155 | 156 | 157 | 158 | 159 | 160 | 161 | 162 | 163 | 164 | 165 | 166 | 167 | 168 | 169 | 170 | 171 | 172 | 173 | 174 | 175 | 176 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec_spade/model_LTC/embedding.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | from torch import nn 7 | 8 | from model_LTC import network_blocks 9 | # from model_LTC.deform import DeformConv2d,DeformBottleneck,DeformSimpleBottleneck 10 | 11 | 12 | # class SELayer(nn.Module): 13 | # def __init__(self, channel, reduction=4): 14 | # super(SELayer, self).__init__() 15 | # self.avg_pool = nn.AdaptiveAvgPool2d(1) 16 | # self.fc = nn.Sequential( 17 | # nn.Linear(channel, channel // reduction, bias=False), 18 | # nn.ReLU(inplace=True), 19 | # nn.Linear(channel // reduction, channel, bias=False), 20 | # nn.Sigmoid() 21 | # ) 22 | # 23 | # def forward(self, x): 24 | # b, c, _, _ = x.size() 25 | # y = self.avg_pool(x).view(b, c) 26 | # y = self.fc(y).view(b, c, 1, 1) 27 | # return x * y.expand_as(x) 28 | 29 | 30 | 31 | class Embedding(nn.Module): 32 | """Embedding module.""" 33 | 34 | def __init__(self, 35 | number_of_input_features=3, 36 | number_of_embedding_features=16, 37 | number_of_shortcut_features=8, 38 | number_of_residual_blocks=2): 39 | """Returns initialized embedding module. 
40 | 41 | Args: 42 | number_of_input_features: number of channels in the input image; 43 | number_of_embedding_features: number of channels in image's 44 | descriptor; 45 | number_of_shortcut_features: number of channels in the redirect 46 | connection descriptor; 47 | number_of_residual_blocks: number of residual blocks in embedding 48 | network. 49 | """ 50 | super(Embedding, self).__init__() 51 | embedding_modules = [ 52 | nn.InstanceNorm2d(number_of_input_features), 53 | network_blocks.convolutional_block_5x5_stride_2( 54 | number_of_input_features, number_of_embedding_features), 55 | network_blocks.convolutional_block_5x5_stride_1( 56 | number_of_embedding_features, number_of_embedding_features), 57 | # DeformConv2d(in_channels=number_of_input_features, out_channels=number_of_embedding_features, kernel_size=5, stride=2, dilation=1, padding=2), 58 | # DeformConv2d(in_channels=number_of_embedding_features, out_channels=number_of_embedding_features, kernel_size=5, stride=2, dilation=1, padding=2) 59 | ] 60 | embedding_modules += [ 61 | network_blocks.ResidualBlock(number_of_embedding_features) 62 | # DeformSimpleBottleneck(number_of_embedding_features, number_of_embedding_features,mdconv_dilation=1,padding=1) 63 | for _ in range(number_of_residual_blocks) 64 | ] 65 | 66 | ###################################channel-sise attention############################## 67 | # embedding_modules += [network_blocks.eca_block(number_of_embedding_features)] 68 | # embedding_modules += [network_blocks.SELayer(number_of_embedding_features)] 69 | 70 | ######################################################################################### 71 | 72 | 73 | 74 | self._embedding_modules = nn.ModuleList(embedding_modules) 75 | self._shortcut = network_blocks.convolutional_block_3x3( 76 | number_of_embedding_features, number_of_shortcut_features) 77 | # self.shortcut_eca_block = network_blocks.eca_block(number_of_shortcut_features) 78 | # self._shortcut = DeformConv2d(in_channels=number_of_embedding_features, out_channels=number_of_shortcut_features, kernel_size=3, stride=1, dilation=1, padding=1) 79 | 80 | 81 | def forward(self, image): 82 | """Returns image's descriptor and redirect connection descriptor. 83 | 84 | Args: 85 | image: color image of size 86 | batch_size x 3 x height x width; 87 | 88 | Returns: 89 | descriptor: image's descriptor of size 90 | batch_size x 64 x (height / 4) x (width / 4); 91 | shortcut_from_left_image: shortcut connection from left image 92 | descriptor (it is used in regularization network). It 93 | is tensor of size 94 | (batch_size, 8, height / 4, width / 4). 95 | """ 96 | descriptor = image 97 | for embedding_module in self._embedding_modules: 98 | descriptor = embedding_module(descriptor) 99 | # short_cut = self.shortcut_eca_block(self._shortcut(descriptor)) 100 | return descriptor, self._shortcut(descriptor) 101 | # return descriptor, short_cut 102 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec_spade/model_LTC/errors.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | import torch as th 7 | import math 8 | 9 | def compute_absolute_error(estimated_disparity, 10 | ground_truth_disparity, 11 | use_mean=True): 12 | """Returns pixel-wise and mean absolute error. 
13 | 14 | Locations where ground truth is not available do not contribute to mean 15 | absolute error. In such locations pixel-wise error is shown as zero. 16 | If ground truth is not available in all locations, the function returns 0. 17 | 18 | Args: 19 | ground_truth_disparity: ground truth disparity where locations with 20 | unknown disparity are set to inf's. 21 | estimated_disparity: estimated disparity. 22 | use_mean: if True then use mean to average pixelwise errors, 23 | otherwise use median. 24 | """ 25 | absolute_difference = (estimated_disparity - ground_truth_disparity).abs() 26 | locations_without_ground_truth = th.isinf(ground_truth_disparity) 27 | pixelwise_absolute_error = absolute_difference.clone() 28 | pixelwise_absolute_error[locations_without_ground_truth] = 0 29 | absolute_differece_with_ground_truth = absolute_difference[ 30 | ~locations_without_ground_truth] 31 | if absolute_differece_with_ground_truth.numel() == 0: 32 | average_absolute_error = 0.0 33 | else: 34 | if use_mean: 35 | average_absolute_error = absolute_differece_with_ground_truth.mean( 36 | ).item() 37 | else: 38 | average_absolute_error = absolute_differece_with_ground_truth.median( 39 | ).item() 40 | return pixelwise_absolute_error, average_absolute_error 41 | 42 | 43 | def RMSE(estimated_disparity, 44 | ground_truth_disparity, 45 | use_mean=True): 46 | absolute_difference = (estimated_disparity - ground_truth_disparity) ** 2 47 | locations_without_ground_truth = th.isinf(ground_truth_disparity) 48 | pixelwise_absolute_error = absolute_difference.clone() 49 | pixelwise_absolute_error[locations_without_ground_truth] = 0 50 | absolute_differece_with_ground_truth = absolute_difference[ 51 | ~locations_without_ground_truth] 52 | if absolute_differece_with_ground_truth.numel() == 0: 53 | average_absolute_error = 0.0 54 | else: 55 | if use_mean: 56 | average_absolute_error = math.sqrt(absolute_differece_with_ground_truth.mean()) 57 | else: 58 | average_absolute_error = math.sqrt(absolute_differece_with_ground_truth.median()) 59 | return pixelwise_absolute_error, average_absolute_error 60 | 61 | 62 | def compute_n_pixels_error(estimated_disparity, ground_truth_disparity, n=3.0): 63 | """Returns pixel-wise n-pixels error and % of pixels with n-pixels error. 64 | 65 | Locations where ground truth is not available do not contribute to mean 66 | n-pixel error. In such locations pixel-wise error is shown as zero. 67 | 68 | Note that n-pixel error is equal to one if 69 | |estimated_disparity-ground_truth_disparity| > n and zero otherwise. 70 | 71 | If ground truth is not available in all locations, the function returns 0. 72 | 73 | Args: 74 | ground_truth_disparity: ground truth disparity where locations with 75 | unknown disparity are set to inf's. 76 | estimated_disparity: estimated disparity. 77 | n: maximum absolute disparity difference that does not trigger 78 | n-pixel error.
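    Example (illustrative): with n=3, an estimate of 10.0 against ground
    truth 6.5 gives a pixel-wise error of 1 (|10.0 - 6.5| = 3.5 > 3),
    while an estimate of 8.0 against the same ground truth gives 0.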
79 | """ 80 | locations_without_ground_truth = th.isinf(ground_truth_disparity) 81 | more_than_n_pixels_absolute_difference = ( 82 | estimated_disparity - ground_truth_disparity).abs().gt(n).float() 83 | pixelwise_n_pixels_error = more_than_n_pixels_absolute_difference.clone() 84 | pixelwise_n_pixels_error[locations_without_ground_truth] = 0.0 85 | more_than_n_pixels_absolute_difference_with_ground_truth = \ 86 | more_than_n_pixels_absolute_difference[~locations_without_ground_truth] 87 | if more_than_n_pixels_absolute_difference_with_ground_truth.numel() == 0: 88 | percentage_of_pixels_with_error = 0.0 89 | else: 90 | percentage_of_pixels_with_error = \ 91 | more_than_n_pixels_absolute_difference_with_ground_truth.mean( 92 | ).item() * 100 93 | return pixelwise_n_pixels_error, percentage_of_pixels_with_error 94 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec_spade/model_LTC/estimator.py: -------------------------------------------------------------------------------- 1 | # © All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | from torch.nn import functional 7 | import torch as th 8 | import numpy as np 9 | import time 10 | 11 | class SubpixelMap(object): 12 | """Approximation of an sub-pixel MAP estimator. 13 | 14 | In every location (x, y), function collects similarity scores 15 | for disparities in a vicinty of a disparity with maximum similarity 16 | score and converts them to disparity distribution using softmax. 17 | Next, the disparity in every location (x, y) is computed as mean 18 | of this distribution. 19 | 20 | It is used only for inference. 21 | """ 22 | 23 | def __init__(self, half_support_window=4, disparity_step=2): 24 | super(SubpixelMap, self).__init__() 25 | """Returns object of SubpixelMap class. 26 | 27 | Args: 28 | disparity_step: step in pixels between near-by disparities in 29 | input "similarities" tensor. 30 | half_support_window: defines size of disparity window in pixels 31 | around disparity with maximum similarity, 32 | which is used to convert similarities 33 | to probabilities and compute mean. 34 | """ 35 | if disparity_step < 1: 36 | raise ValueError('"disparity_step" should be positive integer.') 37 | if half_support_window < 1: 38 | raise ValueError( 39 | '"half_support_window" should be positive integer.') 40 | if half_support_window % disparity_step != 0: 41 | raise ValueError('"half_support_window" should be multiple of the' 42 | '"disparity_step"') 43 | self._disparity_step = disparity_step 44 | self._half_support_window = half_support_window 45 | 46 | def __call__(self, similarities): 47 | """Returns sub-pixel disparity. 48 | 49 | Args: 50 | similarities: Tensor with similarities for every 51 | disparity and every location with indices 52 | [batch_index, disparity_index, y, x]. 53 | 54 | Returns: 55 | Tensor with disparities for every location with 56 | indices [batch_index, y, x]. 57 | """ 58 | # In every location (x, y) find disparity with maximum similarity 59 | # score. 60 | maximum_similarity, disparity_index_with_maximum_similarity = \ 61 | th.max(similarities, dim=1, keepdim=True) 62 | support_disparities, support_similarities = [], [] 63 | maximum_disparity_index = similarities.size(1) 64 | 65 | # Collect similarity scores for the disparities around the disparity 66 | # with the maximum similarity score. 
67 | for disparity_index_shift in range( 68 | -self._half_support_window // self._disparity_step, 69 | self._half_support_window // self._disparity_step + 1): 70 | disparity_index = (disparity_index_with_maximum_similarity + 71 | disparity_index_shift).float() 72 | invalid_disparity_index_mask = ( 73 | (disparity_index < 0) | 74 | (disparity_index >= maximum_disparity_index)) 75 | disparity_index[invalid_disparity_index_mask] = 0 76 | nearby_similarities = th.gather(similarities, 1, 77 | disparity_index.long()) 78 | nearby_similarities[invalid_disparity_index_mask] = -float('inf') 79 | support_similarities.append(nearby_similarities) 80 | nearby_disparities = th.gather( 81 | (self._disparity_step * 82 | disparity_index).expand_as(similarities), 1, 83 | disparity_index.long()) 84 | support_disparities.append(nearby_disparities) 85 | support_similarities = th.stack(support_similarities, dim=1) 86 | support_disparities = th.stack(support_disparities, dim=1) 87 | 88 | # Convert collected similarity scores to the disparity distribution 89 | # using softmax and compute disparity as a mean of this distribution. 90 | probabilities = functional.softmax(support_similarities, dim=1) 91 | disparities = th.sum(probabilities * support_disparities.float(), 1) 92 | return disparities.squeeze(1) 93 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec_spade/model_LTC/loss.py: -------------------------------------------------------------------------------- 1 | # © All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | import torch as th 7 | 8 | from torch import nn 9 | from torch.nn import functional 10 | # import torch.distributed as dist 11 | 12 | def _unnormalized_laplace_probability(value, location, diversity): 13 | return th.exp(-th.abs(location - value) / diversity) / (2 * diversity) 14 | 15 | 16 | class SubpixelCrossEntropy(nn.Module): 17 | def __init__(self, diversity=1.0, disparity_step=2): 18 | """Returns SubpixelCrossEntropy object. 19 | 20 | Args: 21 | disparity_step: disparity difference between near-by 22 | disparity indices in "similarities" tensor. 23 | diversity: diversity of the target Laplace distribution, 24 | centered at the sub-pixel ground truth. 25 | """ 26 | super(SubpixelCrossEntropy, self).__init__() 27 | self._diversity = diversity 28 | self._disparity_step = disparity_step 29 | 30 | def forward(self, similarities, ground_truth_disparities, weights=None): 31 | """Returns sub-pixel cross-entropy loss. 32 | 33 | Cross-entropy is computed as 34 | 35 | - sum_d log( P_predicted(d) ) x P_target(d) 36 | ------------------------------------------------- 37 | sum_d P_target(d) 38 | 39 | We need to normalize the cross-entropy by sum_d P_target(d), 40 | since the target distribution is not normalized. 41 | 42 | Args: 43 | ground_truth_disparities: Tensor with ground truth disparities with 44 | indices [example_index, y, x]. The 45 | disparity values are floats. The locations with unknown 46 | disparities are filled with 'inf's. 47 | similarities: Tensor with similarities with indices 48 | [example_index, disparity_index, y, x]. 49 | weights: Tensor with weights of individual locations. 
50 | """ 51 | maximum_disparity_index = similarities.size(1) 52 | known_ground_truth_disparity = ground_truth_disparities.data != float( 53 | 'inf') 54 | log_P_predicted = functional.log_softmax(similarities, dim=1) 55 | sum_P_target = th.zeros(ground_truth_disparities.size()) 56 | sum_P_target_x_log_P_predicted = th.zeros( 57 | ground_truth_disparities.size()) 58 | if similarities.is_cuda: 59 | sum_P_target = sum_P_target.cuda() 60 | sum_P_target_x_log_P_predicted = \ 61 | sum_P_target_x_log_P_predicted.cuda() 62 | for disparity_index in range(maximum_disparity_index): 63 | disparity = disparity_index * self._disparity_step 64 | P_target = _unnormalized_laplace_probability( 65 | value=disparity, 66 | location=ground_truth_disparities, 67 | diversity=self._diversity) 68 | sum_P_target += P_target 69 | sum_P_target_x_log_P_predicted += ( 70 | log_P_predicted[:, disparity_index] * P_target) 71 | entropy = -sum_P_target_x_log_P_predicted[ 72 | known_ground_truth_disparity] / sum_P_target[ 73 | known_ground_truth_disparity] 74 | if weights is not None: 75 | weights_with_ground_truth = weights[known_ground_truth_disparity] 76 | return (weights_with_ground_truth * entropy).sum() / ( 77 | weights_with_ground_truth.sum() + 1e-15) 78 | # print("entropy.mean",entropy.mean()) 79 | return entropy.mean() 80 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec_spade/model_LTC/matching.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | from torch import nn 7 | import torch as th 8 | 9 | from einops import rearrange 10 | from model_LTC import network_blocks 11 | # from model_LTC.deform import DeformConv2d,DeformBottleneck,DeformSimpleBottleneck 12 | 13 | 14 | def _pad_zero_columns_from_left(tensor, number_of_columns): 15 | return nn.ZeroPad2d((number_of_columns, 0, 0, 0))(tensor) 16 | 17 | 18 | class Matching(nn.Module): 19 | def __init__(self, maximum_disparity, operation): 20 | """Returns initialized matching module. 21 | 22 | Args: 23 | maximum_disparity: Upper limit of disparity range 24 | [0, maximum_disparity]. 25 | operation: Operation that is applied to concatenated 26 | left-right discriptors for all disparities. 27 | This can be network module of function. 28 | """ 29 | super(Matching, self).__init__() 30 | self._maximum_disparity = maximum_disparity 31 | self._operation = operation 32 | 33 | def set_maximum_disparity(self, maximum_disparity): 34 | self._maximum_disparity = maximum_disparity 35 | 36 | def forward(self, left_embedding, right_embedding): 37 | """Returns concatenated compact matching signatures for every disparity. 38 | 39 | Args: 40 | left_embedding, right_embedding: Tensors for left and right 41 | image embeddings with indices 42 | [batch_index, feature_index, y, x]. 43 | 44 | Returns: 45 | matching_signature: 4D tensor that contains concatenated 46 | matching signatures (or matching score, 47 | depending on "operation") for every disparity. 48 | Tensor has indices 49 | [batch_index, feature_index, disparity_index, 50 | y, x]. 
51 | """ 52 | padded_right_embedding = _pad_zero_columns_from_left( 53 | right_embedding, self._maximum_disparity) 54 | matching_signatures = [] 55 | concatenated_embedding = th.cat([left_embedding, right_embedding],dim=1) 56 | # matching_signatures.append(self._operation(concatenated_embedding)) 57 | matching_signatures.append(concatenated_embedding) 58 | for disparity in range(1, self._maximum_disparity + 1): 59 | shifted_right_embedding = \ 60 | padded_right_embedding[:, :, :, 61 | self._maximum_disparity - disparity:-disparity] 62 | concatenated_embedding = th.cat( 63 | [left_embedding, shifted_right_embedding], dim=1) 64 | matching_signatures.append(concatenated_embedding) 65 | 66 | matching_signatures_5d = th.stack(matching_signatures, dim=2) 67 | matching_signatures_5d = rearrange(matching_signatures_5d, 'b c d h w -> (b d) c h w') 68 | 69 | after_operation = rearrange(self._operation(matching_signatures_5d), '(b d) c h w -> b c d h w', d = self._maximum_disparity+1) 70 | return after_operation 71 | 72 | 73 | class MatchingOperation(nn.Module): 74 | """Operation applied to concatenated left / right descriptors.""" 75 | 76 | def __init__(self, 77 | number_of_concatenated_descriptor_features=128, 78 | number_of_features=64, 79 | number_of_compact_matching_signature_features=8, 80 | number_of_residual_blocks=2): 81 | """Returns initialized match operation network. 82 | 83 | For every disparity, left image descriptor is concatenated 84 | along the feature dimension with shifted by the disparity value 85 | right image descriptor and passed throught the network. 86 | """ 87 | super(MatchingOperation, self).__init__() 88 | matching_operation_modules = [ 89 | network_blocks.convolution_3x3( 90 | number_of_concatenated_descriptor_features, number_of_features) 91 | # DeformConv2d(in_channels=number_of_concatenated_descriptor_features, out_channels=number_of_features, kernel_size=3, stride=1, dilation=1, padding=1) 92 | ] 93 | matching_operation_modules += [ 94 | network_blocks.ResidualBlock(number_of_features) 95 | # DeformSimpleBottleneck(number_of_features, number_of_features,mdconv_dilation=1,padding=1) 96 | for _ in range(number_of_residual_blocks) 97 | ] 98 | matching_operation_modules += [ 99 | # DeformConv2d(in_channels=number_of_features, out_channels=number_of_compact_matching_signature_features, kernel_size=3, stride=1, dilation=1, padding=1) 100 | network_blocks.convolution_3x3( 101 | number_of_features, 102 | number_of_compact_matching_signature_features) 103 | ] 104 | # self.SEblock = network_blocks.SELayer() 105 | # matching_operation_modules += [network_blocks.SELayer(number_of_compact_matching_signature_features)] 106 | # matching_operation_modules += [network_blocks.eca_block(number_of_compact_matching_signature_features)] 107 | self._matching_operation_modules = nn.ModuleList( 108 | matching_operation_modules) 109 | 110 | def forward(self, concatenated_descriptors): 111 | """Returns compact matching signature. 112 | 113 | Args: 114 | concatenated_descriptors: concatenated left / right image 115 | descriptors of size 116 | batch_size x 128 x (height / 4) x (width / 4). 117 | 118 | Returns: 119 | compact_matching_signature: tensor of size 120 | batch_size x 8 x (height / 4) x (width / 4). 
121 | """ 122 | compact_matching_signature = concatenated_descriptors 123 | for _module in self._matching_operation_modules: 124 | compact_matching_signature = _module(compact_matching_signature) 125 | return compact_matching_signature 126 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec_spade/model_LTC/network.py: -------------------------------------------------------------------------------- 1 | # CopyriVght 2021 Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: Apache-2.0 3 | from torch import nn 4 | from model_LTC import temporal_aggregation 5 | from model_LTC.spade_e2v import SPADE 6 | import torch.nn.functional as F 7 | from model_LTC import embedding 8 | from model_LTC import estimator 9 | from model_LTC import matching 10 | from model_LTC import network_pds as network 11 | from model_LTC import network_blocks 12 | from model_LTC import regularization 13 | from model_LTC import size_adapter 14 | # from model_LTC import stay_embedding 15 | import time 16 | import numpy as np 17 | class Dummy(nn.Module): 18 | def __init__(self): 19 | super(Dummy, self).__init__() 20 | 21 | def forward(self, input): 22 | return input 23 | 24 | 25 | class DenseDeepEventStereo(network.PdsNetwork): 26 | """Dense deep stereo network. 27 | 28 | The network is based on "Practical Deeps Stereo: Toward 29 | applications-friendly deep stereo matching" by Stepan Tulyakov et al. 30 | Compare to the parent, this network has additional 31 | temporal aggregation module that embedds local events sequence in 32 | every location. 33 | """ 34 | def __init__(self, size_adapter_module, temporal_aggregation_module,spade_module, 35 | spatial_aggregation_module, matching_module, 36 | regularization_module, estimator_module): 37 | super(DenseDeepEventStereo, 38 | self).__init__(size_adapter_module, spatial_aggregation_module, 39 | matching_module, regularization_module, 40 | estimator_module) 41 | self._temporal_aggregation = temporal_aggregation_module 42 | self.sbt_size_adapter = size_adapter.SizeAdapter() 43 | # self.spade = SPADE(32,10) 44 | self.spade = spade_module 45 | 46 | @staticmethod 47 | def default_with_continuous_fully_connected(hyper_params, maximum_disparity, embedding_features, embedding_shortcuts,matching_concat_features,matching_features,matching_shortcuts,matching_residual_blocks): 48 | """Returns default network with continuous fully connected.""" 49 | stereo_network = DenseDeepEventStereo( 50 | size_adapter_module=size_adapter.SizeAdapter(), 51 | temporal_aggregation_module=temporal_aggregation. 
52 | ContinuousFullyConnected(hyper_params), 53 | # spade_module=SPADE(hyper_params['nltc'],hyper_params['num_plane']), 54 | spade_module=SPADE(embedding_features,hyper_params['num_plane']), 55 | spatial_aggregation_module=embedding.Embedding( 56 | number_of_input_features=hyper_params['nltc'], number_of_embedding_features=embedding_features, 57 | number_of_shortcut_features=embedding_shortcuts), 58 | # spatial_aggregation_module=embedding.Embedding( 59 | # number_of_input_features=10), 60 | matching_module=matching.Matching( 61 | operation=matching.MatchingOperation(number_of_concatenated_descriptor_features=matching_concat_features, number_of_features=matching_features, number_of_compact_matching_signature_features=matching_shortcuts, number_of_residual_blocks=matching_residual_blocks), maximum_disparity=0), 62 | regularization_module=regularization.Regularization(), 63 | estimator_module=estimator.SubpixelMap()) 64 | stereo_network.set_maximum_disparity(maximum_disparity) 65 | return stereo_network 66 | 67 | def forward(self,batch): 68 | # input:[2, 2, 1, 320, 384] 69 | LTC_start_time = time.time() 70 | left_event_queue = batch['left']['event_queue'] 71 | right_event_queue = batch['right']['event_queue'] 72 | 73 | left_projected_events = self._temporal_aggregation( 74 | self._size_adapter.pad(left_event_queue), self.training) 75 | 76 | right_projected_events = self._temporal_aggregation( 77 | self._size_adapter.pad(right_event_queue), self.training) 78 | 79 | left_descriptor, shortcut_from_left = self._embedding(left_projected_events) 80 | right_descriptor = self._embedding(right_projected_events)[0] 81 | reshape_left_sbt = left_event_queue[:,0] 82 | reshape_right_sbt = right_event_queue[:,0] 83 | left_fusion = self.spade(left_descriptor, reshape_left_sbt) 84 | right_fusion = self.spade(right_descriptor, reshape_right_sbt) 85 | 86 | batch['after_spade'] = left_fusion.clone() 87 | 88 | matching_signatures = self._matching(left_fusion,right_fusion) 89 | 90 | network_output = self._regularization(matching_signatures, shortcut_from_left) 91 | expand_h,expand_w = left_projected_events.size()[-2:] 92 | 93 | if not self.training: 94 | start_time = time.time() 95 | network_output = self._estimator(network_output) 96 | if network_output.size()[-2:] != (expand_h,expand_w): 97 | # print("network_output",network_output.size()) 98 | network_output = F.interpolate(network_output.unsqueeze(1),(expand_h,expand_w),mode='nearest').squeeze(1) 99 | 100 | if network_output.size()[-2:] != (expand_h,expand_w): 101 | network_output = F.interpolate(network_output,(expand_h,expand_w),mode='nearest') 102 | # network output 2 32 320 384 103 | return self._size_adapter.unpad(network_output) 104 | 105 | 106 | 107 | -------------------------------------------------------------------------------- /code_of_Dsec/LTC_Dsec_spade/model_LTC/network_pds.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 
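# (Illustrative summary, not part of the original file.) The forward pass of
# DenseDeepEventStereo in network.py above composes the modules in this order:
#
#   left/right event queues -> temporal aggregation (LTC) -> projected events
#   projected events -> spatial embedding -> descriptors (+ left shortcut)
#   descriptors + first event slice -> SPADE fusion -> fused descriptors
#   fused left/right descriptors -> matching -> concatenated cost volume
#   cost volume + left shortcut -> 3D regularization -> similarities
#   similarities -> SubpixelMap (inference only) -> sub-pixel disparity map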
5 | from torch import nn
6 | from model_LTC import embedding
7 | from model_LTC import estimator
8 | from model_LTC import matching
9 | from model_LTC import regularization
10 | from model_LTC import size_adapter
11 |
12 | import time
13 |
14 |
15 | class PdsNetwork(nn.Module):
16 | """Practical Deep Stereo (PDS) network."""
17 |
18 | def __init__(self, size_adapter_module, embedding_module, matching_module,
19 | regularization_module, estimator_module):
20 | super(PdsNetwork, self).__init__()
21 | self._size_adapter = size_adapter_module
22 | self._embedding = embedding_module
23 | self._matching = matching_module
24 | self._regularization = regularization_module
25 | self._estimator = estimator_module
26 |
27 | def set_maximum_disparity(self, maximum_disparity):
28 | """Reconfigure network for different disparity range."""
29 | # if (maximum_disparity + 1) % 64 != 0:
30 | # raise ValueError(
31 | # '"maximum_disparity" + 1 should be multiple of 64, e.g.,'
32 | # '"maximum disparity" can be equal to 63, 191, 255, 319...')
33 | self._maximum_disparity = maximum_disparity
34 | # During the embedding spatial dimensions of an input are downsampled
35 | # 4x times. Therefore, "maximum_disparity" of matching module is
36 | # computed as (maximum_disparity + 1) / 4 - 1.
37 | self._matching.set_maximum_disparity((maximum_disparity + 1) // 4 - 1)
38 |
39 | def pass_through_network(self, left_image, right_image):
40 | start_time = time.time()
41 | left_descriptor, shortcut_from_left = self._embedding(left_image)
42 | right_descriptor = self._embedding(right_image)[0]
43 | print("Embedding Duration:{:.4f}s".format(time.time()-start_time))
44 |
45 | start_time = time.time()
46 | matching_signatures = self._matching(left_descriptor, right_descriptor)
47 | print("Matching Duration:{:.4f}s".format(time.time()-start_time))
48 |
49 | start_time = time.time()
50 | output = self._regularization(matching_signatures, shortcut_from_left)
51 | print("Cost volume:{:.4f}s".format(time.time()-start_time))
52 |
53 | return output, shortcut_from_left
54 |
55 |
56 | def forward(self, left_image, right_image):
57 | """Returns sub-pixel disparity (or matching cost in training mode)."""
58 | network_output = self.pass_through_network(
59 | self._size_adapter.pad(left_image),
60 | self._size_adapter.pad(right_image))[0]
61 | if not self.training:
62 | network_output = self._estimator(network_output)
63 | return self._size_adapter.unpad(network_output)
64 |
65 | @staticmethod
66 | def default(maximum_disparity=255):
67 | """Returns network with default parameters."""
68 | network = PdsNetwork(
69 | size_adapter_module=size_adapter.SizeAdapter(),
70 | embedding_module=embedding.Embedding(),
71 | matching_module=matching.Matching(
72 | operation=matching.MatchingOperation(), maximum_disparity=0),
73 | regularization_module=regularization.Regularization(),
74 | estimator_module=estimator.SubpixelMap())
75 | network.set_maximum_disparity(maximum_disparity)
76 | return network
77 |
--------------------------------------------------------------------------------
/code_of_Dsec/LTC_Dsec_spade/model_LTC/provider.py:
--------------------------------------------------------------------------------
1 | from pathlib import Path
2 | import random
3 | import torch
4 | import PIL.Image
5 | import cv2
6 | import numpy as np
7 | # from .
import sequence 8 | from torch.utils.data import Dataset 9 | 10 | class dataset(Dataset): 11 | def __init__(self,path, pre_frame=6): 12 | self.path = path 13 | self.pre_frame = pre_frame 14 | 15 | def __len__(self): 16 | return len(self.path) 17 | 18 | def get_example(self, index): 19 | 20 | left_path = self.path[index] 21 | right_path = left_path.parent.parent/'right'/left_path.parts[-1] 22 | 23 | gt_name = left_path.parts[-1][:6]+'.png' 24 | gt_path = Path('/media/HDD1/personal_files/zkx/datasets/Dsec/train/train_disparity/')/left_path.parent.parent.parts[-1]/'disparity'/'event'/gt_name 25 | 26 | frame_index = int(left_path.stem) 27 | left_eq, right_eq = [], [] 28 | first_index = int(max(1,(frame_index-self.pre_frame*2)))# this should be 1, because we don`t have 000000.npy, and the first index of npy should be 000002.npy. 29 | for previous_frame_index in range(frame_index, first_index, -2): 30 | 31 | left_eq.append(torch.from_numpy(np.load(left_path.parent/'{:06d}.npy'.format(previous_frame_index)))) 32 | right_eq.append(torch.from_numpy(np.load(right_path.parent/'{:06d}.npy'.format(previous_frame_index)))) 33 | 34 | if len(left_eq) < self.pre_frame: 35 | total_need = self.pre_frame-len(left_eq) 36 | # print(total_need) 37 | for i in range(total_need): 38 | left_copy_last_frame = left_eq[-1].clone() 39 | right_copy_last_frame = right_eq[-1].clone() 40 | left_eq.append(left_copy_last_frame) 41 | right_eq.append(right_copy_last_frame) 42 | 43 | 44 | left_event_queue = torch.cat(left_eq,0).float() 45 | right_event_queue = torch.cat(right_eq,0).float() 46 | 47 | disp_16bit = cv2.imread(str(gt_path), cv2.IMREAD_ANYDEPTH) 48 | disp_16bit = disp_16bit.astype('float32')/256.0 49 | valid_disp = (disp_16bit > 0) 50 | disp_16bit[~valid_disp] = float('inf') 51 | gt_disparity = torch.from_numpy(np.array(disp_16bit)) 52 | 53 | return { 54 | 'left':{ 55 | 'event_queue':left_event_queue, 56 | 'disparity_image': gt_disparity 57 | }, 58 | 59 | 'right':{ 60 | 'event_queue': right_event_queue 61 | }, 62 | 63 | 'frame_index': int(left_path.parts[-1][:6]), 64 | } 65 | 66 | 67 | def __getitem__(self, index): 68 | 69 | assert index1: 49 | stereo_network = th.nn.DataParallel(stereo_network) 50 | 51 | print("Preparation ends. 
Duration: ", time.time()-start_time) 52 | print("Parameters:",np.sum([p.numel() for p in stereo_network.parameters()]).item()) 53 | return { 54 | 'network': stereo_network, 55 | 'optimizer': optimizer, 56 | 'criterion': criterion, 57 | 'learning_rate_scheduler': learning_rate_scheduler, 58 | 'training_set_loader': training_set_loader, 59 | 'test_set_loader': test_set_loader, 60 | 'end_epoch': end_epoch, 61 | 'experiment_folder': experiment_folder, 62 | 'spec_title': spec_title, 63 | 'DA': DA 64 | } 65 | 66 | 67 | parser = argparse.ArgumentParser() 68 | parser.add_argument('--batch_size', type=int, default=2) 69 | parser.add_argument('--experiment_folder', default= None, type=str) 70 | parser.add_argument('--checkpoint_file', default= None, type=str) 71 | parser.add_argument('--dataset_folder', type=str) 72 | parser.add_argument('--test_mode', default=False, action='store_true') 73 | parser.add_argument('--main_lr', type=float, default=None) 74 | parser.add_argument('--maximum_disparity', type=int, default=127) 75 | parser.add_argument('--pre_frame', type=int, default=6) 76 | parser.add_argument('--temporal_aggregation_lr',type=float, default=None) 77 | parser.add_argument('--end_epoch',type=int, default=22) 78 | 79 | parser.add_argument('--milestone',type=int, nargs='+') 80 | parser.add_argument('--embedding_features', default=32, type=int) 81 | parser.add_argument('--embedding_shortcuts', default=8, type=int) 82 | parser.add_argument('--matching_concat_features', default=64, type=int) 83 | parser.add_argument('--matching_features', default=64, type=int) 84 | parser.add_argument('--matching_shortcuts', default=8, type=int) 85 | parser.add_argument('--matching_residual_blocks', default=2, type=int) 86 | 87 | 88 | parser.add_argument('--all_data',default=False,action='store_true') 89 | parser.add_argument('--DA',default=False, action='store_true') 90 | parser.add_argument('--ltc_hparams', default={'use_erevin':False, 'taum_ini':[.5,.8], 'nltc': 32, 'usetaum':True, 'ltcv1':True}, type=dict) 91 | parser.add_argument('--num_plane',type=int, default=10) 92 | parser.add_argument('--data_hparams', default={'use10ms': True, 'usenorm': False, 'pre_nframes':10}, type=dict) 93 | parser.add_argument('--share_hparams', default={'stream_opt':False, 'burn_in_time':5}, type=dict) 94 | parser.add_argument('--spec_title', default = 4000, type=int) # for normal training and general testing 95 | args = parser.parse_args() 96 | args.ltc_hparams['num_plane'] = args.num_plane 97 | # print(args.ltc_hparams) 98 | print(args) 99 | 100 | 101 | for i,v in args.share_hparams.items(): 102 | args.ltc_hparams[i] = v 103 | args.data_hparams[i] = v 104 | 105 | 106 | def set_seed(seed=0): 107 | random.seed(0) 108 | np.random.seed(seed) 109 | th.manual_seed(seed) 110 | th.cuda.manual_seed(seed) 111 | th.cuda.manual_seed_all(seed) 112 | th.backends.cudnn.deterministic = True 113 | th.backends.cudnn.benchmark = False 114 | 115 | 116 | if __name__ == '__main__': 117 | 118 | 119 | set_seed(12345) 120 | 121 | main_lr = 1e-3 122 | temporal_aggregation_lr = 5e-3 123 | 124 | dataset_folder = args.dataset_folder 125 | experiment_folder = args.experiment_folder 126 | 127 | if not os.path.isdir(experiment_folder): 128 | os.mkdir(experiment_folder) 129 | 130 | 131 | parameters = _initialize_parameters( 132 | dataset_folder, experiment_folder, args.test_mode, main_lr,temporal_aggregation_lr, args.ltc_hparams, args.spec_title, args.data_hparams, args.DA, args.pre_frame,args.end_epoch, args.milestone, args.all_data, 
args.maximum_disparity,args.embedding_features, args.embedding_shortcuts,args.matching_concat_features,args.matching_features,args.matching_shortcuts,args.matching_residual_blocks )
133 |
134 | stereo_trainer = trainer.Trainer(parameters)
135 | if args.checkpoint_file:
136 | stereo_trainer.load_checkpoint(args.checkpoint_file, load_only_network=True)
137 |
138 | if args.test_mode:
139 | print("Testing.")
140 | stereo_trainer.test()
141 | print('LTC cm ', stereo_trainer._network._temporal_aggregation._LTC_Conv.cm)
142 | else:
143 | print("Training.")
144 | stereo_trainer.train()
145 |
146 |
147 |
148 |
149 |
150 |
151 |
152 |
153 |
154 |
155 |
156 |
157 |
158 |
159 |
160 |
161 |
162 |
163 |
164 |
165 |
166 |
167 |
168 |
169 |
170 |
171 |
172 |
173 |
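The shift-and-concatenate construction used by matching.Matching earlier in this dump can be reproduced in a few lines. Below is a minimal, self-contained sketch with made-up tensor sizes; it mirrors Matching.forward up to (but not including) the learned matching operation, and is not repository code.

import torch
import torch.nn as nn

left = torch.randn(1, 8, 4, 16)   # [batch, features, height, width]
right = torch.randn(1, 8, 4, 16)
maximum_disparity = 3

# Zero-pad the right embedding from the left, then take shifted slices.
padded_right = nn.ZeroPad2d((maximum_disparity, 0, 0, 0))(right)
signatures = [torch.cat([left, right], dim=1)]  # disparity 0
for d in range(1, maximum_disparity + 1):
    shifted = padded_right[:, :, :, maximum_disparity - d:-d]
    signatures.append(torch.cat([left, shifted], dim=1))

volume = torch.stack(signatures, dim=2)  # [batch, 2*features, disparity, h, w]
print(volume.shape)  # torch.Size([1, 16, 4, 4, 16])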
-------------------------------------------------------------------------------- /code_of_mvsec/DTC_SPADE_for_mvsec/model_LTC/dataset_constants.py: -------------------------------------------------------------------------------- 1 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: Apache-2.0 3 | import os 4 | 5 | DISPARITY_MULTIPLIER = 7.0 6 | TIME_BETWEEN_EXAMPLES = 0.05 # sec 7 | EXPERIMENTS = { 8 | 'indoor_flying': [1, 2, 3, 4], 9 | 'outdoor_day': [1, 2], 10 | 'outdoor_night': [1, 2, 3] 11 | } 12 | # Focal length multiplied by baseline [pix * meter]. 13 | FOCAL_LENGTH_X_BASELINE = { 14 | 'indoor_flying': 19.941772, 15 | 'outdoor_night': 19.651191, 16 | 'outdoor_day': 19.635287 17 | } 18 | INVALID_DISPARITY = 255 19 | DISPARITY_MAXIMUM = 37 20 | IMAGE_WIDTH = 346 21 | IMAGE_HEIGHT = 260 22 | 23 | 24 | def create_folders(paths): 25 | for name, folder in paths.items(): 26 | if isinstance(folder, dict): 27 | create_folders(folder) 28 | else: 29 | if 'folder' in name and not os.path.exists(folder): 30 | os.makedirs(folder) 31 | 32 | 33 | def experiment_paths(experiment_name, experiment_number, dataset_root): 34 | paths = {'cam0': {}, 'cam1': {}} 35 | 36 | paths['experiment_folder'] = os.path.join( 37 | dataset_root, '%s_%i' % (experiment_name, experiment_number)) 38 | 39 | for camera, value in {'cam0': 0, 'cam1': 1}.items(): 40 | paths[camera]['image_folder'] = os.path.join( 41 | paths['experiment_folder'], 'image%i' % value) 42 | paths[camera]['image_file'] = os.path.join( 43 | paths[camera]['image_folder'], '%0.6i.png') 44 | paths[camera]['event_folder'] = os.path.join( 45 | paths['experiment_folder'], 'event%i' % value) 46 | paths[camera]['event_file'] = os.path.join( 47 | paths[camera]['event_folder'], '%0.6i.npy') 48 | 49 | paths['timestamps_file'] = os.path.join(paths['experiment_folder'], 50 | 'timestamps.txt') 51 | paths['disparity_folder'] = os.path.join(paths['experiment_folder'], 52 | 'disparity_image') 53 | paths['disparity_file'] = os.path.join(paths['disparity_folder'], 54 | '%0.6i.png') 55 | paths['description'] = os.path.join(dataset_root, 'readme.txt') 56 | 57 | return paths 58 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_SPADE_for_mvsec/model_LTC/embedding.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 
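# Aside on dataset_constants.py above (illustrative, assuming the standard
# stereo relation depth = focal_length * baseline / disparity): since
# FOCAL_LENGTH_X_BASELINE is given in pixels * meters, a disparity of 10 px
# in 'indoor_flying' corresponds to roughly 19.941772 / 10 ~= 1.99 m of depth.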
5 | 6 | from torch import nn 7 | 8 | from model_LTC import network_blocks 9 | from model_LTC.deform import DeformConv2d,DeformBottleneck,DeformSimpleBottleneck 10 | 11 | 12 | # class SELayer(nn.Module): 13 | # def __init__(self, channel, reduction=4): 14 | # super(SELayer, self).__init__() 15 | # self.avg_pool = nn.AdaptiveAvgPool2d(1) 16 | # self.fc = nn.Sequential( 17 | # nn.Linear(channel, channel // reduction, bias=False), 18 | # nn.ReLU(inplace=True), 19 | # nn.Linear(channel // reduction, channel, bias=False), 20 | # nn.Sigmoid() 21 | # ) 22 | # 23 | # def forward(self, x): 24 | # b, c, _, _ = x.size() 25 | # y = self.avg_pool(x).view(b, c) 26 | # y = self.fc(y).view(b, c, 1, 1) 27 | # return x * y.expand_as(x) 28 | 29 | 30 | 31 | class Embedding(nn.Module): 32 | """Embedding module.""" 33 | 34 | def __init__(self, 35 | number_of_input_features=3, 36 | number_of_embedding_features=16, 37 | number_of_shortcut_features=8, 38 | number_of_residual_blocks=2): 39 | """Returns initialized embedding module. 40 | 41 | Args: 42 | number_of_input_features: number of channels in the input image; 43 | number_of_embedding_features: number of channels in image's 44 | descriptor; 45 | number_of_shortcut_features: number of channels in the redirect 46 | connection descriptor; 47 | number_of_residual_blocks: number of residual blocks in embedding 48 | network. 49 | """ 50 | super(Embedding, self).__init__() 51 | embedding_modules = [ 52 | nn.InstanceNorm2d(number_of_input_features), 53 | network_blocks.convolutional_block_5x5_stride_2( 54 | number_of_input_features, number_of_embedding_features), 55 | network_blocks.convolutional_block_5x5_stride_1( 56 | number_of_embedding_features, number_of_embedding_features), 57 | # DeformConv2d(in_channels=number_of_input_features, out_channels=number_of_embedding_features, kernel_size=5, stride=2, dilation=1, padding=2), 58 | # DeformConv2d(in_channels=number_of_embedding_features, out_channels=number_of_embedding_features, kernel_size=5, stride=2, dilation=1, padding=2) 59 | ] 60 | embedding_modules += [ 61 | network_blocks.ResidualBlock(number_of_embedding_features) 62 | # DeformSimpleBottleneck(number_of_embedding_features, number_of_embedding_features,mdconv_dilation=1,padding=1) 63 | for _ in range(number_of_residual_blocks) 64 | ] 65 | 66 | ###################################channel-sise attention############################## 67 | # embedding_modules += [network_blocks.eca_block(number_of_embedding_features)] 68 | # embedding_modules += [network_blocks.SELayer(number_of_embedding_features)] 69 | 70 | ######################################################################################### 71 | 72 | 73 | 74 | self._embedding_modules = nn.ModuleList(embedding_modules) 75 | self._shortcut = network_blocks.convolutional_block_3x3( 76 | number_of_embedding_features, number_of_shortcut_features) 77 | # self.shortcut_eca_block = network_blocks.eca_block(number_of_shortcut_features) 78 | # self._shortcut = DeformConv2d(in_channels=number_of_embedding_features, out_channels=number_of_shortcut_features, kernel_size=3, stride=1, dilation=1, padding=1) 79 | 80 | 81 | def forward(self, image): 82 | """Returns image's descriptor and redirect connection descriptor. 
83 |
84 | Args:
85 | image: color image of size
86 | batch_size x 3 x height x width;
87 |
88 | Returns:
89 | descriptor: image's descriptor of size
90 | batch_size x 64 x (height / 4) x (width / 4);
91 | shortcut_from_left_image: shortcut connection from left image
92 | descriptor (it is used in the regularization network). It
93 | is a tensor of size
94 | (batch_size, 8, height / 4, width / 4).
95 | """
96 | descriptor = image
97 | for embedding_module in self._embedding_modules:
98 | descriptor = embedding_module(descriptor)
99 | # short_cut = self.shortcut_eca_block(self._shortcut(descriptor))
100 | return descriptor, self._shortcut(descriptor)
101 | # return descriptor, short_cut
102 |
--------------------------------------------------------------------------------
/code_of_mvsec/DTC_SPADE_for_mvsec/model_LTC/errors.py:
--------------------------------------------------------------------------------
1 | # Copyrights. All rights reserved.
2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland,
3 | # Space Center (eSpace), 2018
4 | # See the LICENSE.TXT file for more details.
5 |
6 | import torch as th
7 |
8 |
9 | def compute_absolute_error(estimated_disparity,
10 | ground_truth_disparity,
11 | use_mean=True):
12 | """Returns pixel-wise and mean absolute error.
13 |
14 | Locations where ground truth is not available do not contribute to the
15 | mean absolute error. In such locations the pixel-wise error is shown as
16 | zero. If ground truth is not available at any location, the function
17 | returns 0.
18 |
19 | Args:
20 | ground_truth_disparity: ground truth disparity where locations with
21 | unknown disparity are set to inf's.
22 | estimated_disparity: estimated disparity.
23 | use_mean: if True then use mean to average pixelwise errors,
24 | otherwise use median.
25 | """
26 | absolute_difference = (estimated_disparity - ground_truth_disparity).abs()
27 | locations_without_ground_truth = th.isinf(ground_truth_disparity)
28 | pixelwise_absolute_error = absolute_difference.clone()
29 | pixelwise_absolute_error[locations_without_ground_truth] = 0
30 | absolute_difference_with_ground_truth = absolute_difference[
31 | ~locations_without_ground_truth]
32 | if absolute_difference_with_ground_truth.numel() == 0:
33 | average_absolute_error = 0.0
34 | else:
35 | if use_mean:
36 | average_absolute_error = absolute_difference_with_ground_truth.mean(
37 | ).item()
38 | else:
39 | average_absolute_error = absolute_difference_with_ground_truth.median(
40 | ).item()
41 | return pixelwise_absolute_error, average_absolute_error
42 |
43 |
44 | def compute_n_pixels_error(estimated_disparity, ground_truth_disparity, n=3.0):
45 | """Return pixel-wise n-pixels error and % of pixels with n-pixels error.
46 |
47 | Locations where ground truth is not available do not contribute to the
48 | mean n-pixel error. In such locations the pixel-wise error is shown as
49 | zero.
50 |
51 | Note that the n-pixel error is equal to one if
52 | |estimated_disparity-ground_truth_disparity| > n and zero otherwise.
53 |
54 | If ground truth is not available at any location, the function returns 0.
55 |
56 | Args:
57 | ground_truth_disparity: ground truth disparity where locations with
58 | unknown disparity are set to inf's.
59 | estimated_disparity: estimated disparity.
60 | n: maximum absolute disparity difference that does not trigger the
61 | n-pixel error.
60 | """ 61 | locations_without_ground_truth = th.isinf(ground_truth_disparity) 62 | more_than_n_pixels_absolute_difference = ( 63 | estimated_disparity - ground_truth_disparity).abs().gt(n).float() 64 | pixelwise_n_pixels_error = more_than_n_pixels_absolute_difference.clone() 65 | pixelwise_n_pixels_error[locations_without_ground_truth] = 0.0 66 | more_than_n_pixels_absolute_difference_with_ground_truth = \ 67 | more_than_n_pixels_absolute_difference[~locations_without_ground_truth] 68 | if more_than_n_pixels_absolute_difference_with_ground_truth.numel() == 0: 69 | percentage_of_pixels_with_error = 0.0 70 | else: 71 | percentage_of_pixels_with_error = \ 72 | more_than_n_pixels_absolute_difference_with_ground_truth.mean( 73 | ).item() * 100 74 | return pixelwise_n_pixels_error, percentage_of_pixels_with_error 75 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_SPADE_for_mvsec/model_LTC/estimator.py: -------------------------------------------------------------------------------- 1 | # © All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | from torch.nn import functional 7 | import torch as th 8 | 9 | 10 | class SubpixelMap(object): 11 | """Approximation of an sub-pixel MAP estimator. 12 | 13 | In every location (x, y), function collects similarity scores 14 | for disparities in a vicinty of a disparity with maximum similarity 15 | score and converts them to disparity distribution using softmax. 16 | Next, the disparity in every location (x, y) is computed as mean 17 | of this distribution. 18 | 19 | It is used only for inference. 20 | """ 21 | 22 | def __init__(self, half_support_window=4, disparity_step=2): 23 | super(SubpixelMap, self).__init__() 24 | """Returns object of SubpixelMap class. 25 | 26 | Args: 27 | disparity_step: step in pixels between near-by disparities in 28 | input "similarities" tensor. 29 | half_support_window: defines size of disparity window in pixels 30 | around disparity with maximum similarity, 31 | which is used to convert similarities 32 | to probabilities and compute mean. 33 | """ 34 | if disparity_step < 1: 35 | raise ValueError('"disparity_step" should be positive integer.') 36 | if half_support_window < 1: 37 | raise ValueError( 38 | '"half_support_window" should be positive integer.') 39 | if half_support_window % disparity_step != 0: 40 | raise ValueError('"half_support_window" should be multiple of the' 41 | '"disparity_step"') 42 | self._disparity_step = disparity_step 43 | self._half_support_window = half_support_window 44 | 45 | def __call__(self, similarities): 46 | """Returns sub-pixel disparity. 47 | 48 | Args: 49 | similarities: Tensor with similarities for every 50 | disparity and every location with indices 51 | [batch_index, disparity_index, y, x]. 52 | 53 | Returns: 54 | Tensor with disparities for every location with 55 | indices [batch_index, y, x]. 56 | """ 57 | # In every location (x, y) find disparity with maximum similarity 58 | # score. 59 | maximum_similarity, disparity_index_with_maximum_similarity = \ 60 | th.max(similarities, dim=1, keepdim=True) 61 | support_disparities, support_similarities = [], [] 62 | maximum_disparity_index = similarities.size(1) 63 | 64 | # Collect similarity scores for the disparities around the disparity 65 | # with the maximum similarity score. 
66 | for disparity_index_shift in range( 67 | -self._half_support_window // self._disparity_step, 68 | self._half_support_window // self._disparity_step + 1): 69 | disparity_index = (disparity_index_with_maximum_similarity + 70 | disparity_index_shift).float() 71 | invalid_disparity_index_mask = ( 72 | (disparity_index < 0) | 73 | (disparity_index >= maximum_disparity_index)) 74 | disparity_index[invalid_disparity_index_mask] = 0 75 | nearby_similarities = th.gather(similarities, 1, 76 | disparity_index.long()) 77 | nearby_similarities[invalid_disparity_index_mask] = -float('inf') 78 | support_similarities.append(nearby_similarities) 79 | nearby_disparities = th.gather( 80 | (self._disparity_step * 81 | disparity_index).expand_as(similarities), 1, 82 | disparity_index.long()) 83 | support_disparities.append(nearby_disparities) 84 | support_similarities = th.stack(support_similarities, dim=1) 85 | support_disparities = th.stack(support_disparities, dim=1) 86 | 87 | # Convert collected similarity scores to the disparity distribution 88 | # using softmax and compute disparity as a mean of this distribution. 89 | probabilities = functional.softmax(support_similarities, dim=1) 90 | disparities = th.sum(probabilities * support_disparities.float(), 1) 91 | return disparities.squeeze(1) 92 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_SPADE_for_mvsec/model_LTC/loss.py: -------------------------------------------------------------------------------- 1 | # © All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | import torch as th 7 | 8 | from torch import nn 9 | from torch.nn import functional 10 | # import torch.distributed as dist 11 | 12 | def _unnormalized_laplace_probability(value, location, diversity): 13 | return th.exp(-th.abs(location - value) / diversity) / (2 * diversity) 14 | 15 | 16 | class SubpixelCrossEntropy(nn.Module): 17 | def __init__(self, diversity=1.0, disparity_step=2): 18 | """Returns SubpixelCrossEntropy object. 19 | 20 | Args: 21 | disparity_step: disparity difference between near-by 22 | disparity indices in "similarities" tensor. 23 | diversity: diversity of the target Laplace distribution, 24 | centered at the sub-pixel ground truth. 25 | """ 26 | super(SubpixelCrossEntropy, self).__init__() 27 | self._diversity = diversity 28 | self._disparity_step = disparity_step 29 | 30 | def forward(self, similarities, ground_truth_disparities, weights=None): 31 | """Returns sub-pixel cross-entropy loss. 32 | 33 | Cross-entropy is computed as 34 | 35 | - sum_d log( P_predicted(d) ) x P_target(d) 36 | ------------------------------------------------- 37 | sum_d P_target(d) 38 | 39 | We need to normalize the cross-entropy by sum_d P_target(d), 40 | since the target distribution is not normalized. 41 | 42 | Args: 43 | ground_truth_disparities: Tensor with ground truth disparities with 44 | indices [example_index, y, x]. The 45 | disparity values are floats. The locations with unknown 46 | disparities are filled with 'inf's. 47 | similarities: Tensor with similarities with indices 48 | [example_index, disparity_index, y, x]. 49 | weights: Tensor with weights of individual locations. 
50 | """ 51 | maximum_disparity_index = similarities.size(1) 52 | known_ground_truth_disparity = ground_truth_disparities.data != float( 53 | 'inf') 54 | log_P_predicted = functional.log_softmax(similarities, dim=1) 55 | sum_P_target = th.zeros(ground_truth_disparities.size()) 56 | sum_P_target_x_log_P_predicted = th.zeros( 57 | ground_truth_disparities.size()) 58 | if similarities.is_cuda: 59 | sum_P_target = sum_P_target.cuda() 60 | sum_P_target_x_log_P_predicted = \ 61 | sum_P_target_x_log_P_predicted.cuda() 62 | for disparity_index in range(maximum_disparity_index): 63 | disparity = disparity_index * self._disparity_step 64 | P_target = _unnormalized_laplace_probability( 65 | value=disparity, 66 | location=ground_truth_disparities, 67 | diversity=self._diversity) 68 | sum_P_target += P_target 69 | sum_P_target_x_log_P_predicted += ( 70 | log_P_predicted[:, disparity_index] * P_target) 71 | entropy = -sum_P_target_x_log_P_predicted[ 72 | known_ground_truth_disparity] / sum_P_target[ 73 | known_ground_truth_disparity] 74 | if weights is not None: 75 | weights_with_ground_truth = weights[known_ground_truth_disparity] 76 | return (weights_with_ground_truth * entropy).sum() / ( 77 | weights_with_ground_truth.sum() + 1e-15) 78 | # print("entropy.mean",entropy.mean()) 79 | return entropy.mean() 80 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_SPADE_for_mvsec/model_LTC/network_pds.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | from torch import nn 6 | from model_LTC import embedding 7 | from model_LTC import estimator 8 | from model_LTC import matching 9 | from model_LTC import regularization 10 | from model_LTC import size_adapter 11 | 12 | import time 13 | 14 | 15 | class PdsNetwork(nn.Module): 16 | """Practical Deep Stereo (PDS) network.""" 17 | 18 | def __init__(self, size_adapter_module, embedding_module, matching_module, 19 | regularization_module, estimator_module): 20 | super(PdsNetwork, self).__init__() 21 | self._size_adapter = size_adapter_module 22 | self._embedding = embedding_module 23 | self._matching = matching_module 24 | self._regularization = regularization_module 25 | self._estimator = estimator_module 26 | 27 | def set_maximum_disparity(self, maximum_disparity): 28 | """Reconfigure network for different disparity range.""" 29 | if (maximum_disparity + 1) % 64 != 0: 30 | raise ValueError( 31 | '"maximum_disparity" + 1 should be multiple of 64, e.g.,' 32 | '"maximum disparity" can be equal to 63, 191, 255, 319...') 33 | self._maximum_disparity = maximum_disparity 34 | # During the embedding spatial dimensions of an input are downsampled 35 | # 4x times. Therefore, "maximum_disparity" of matching module is 36 | # computed as (maximum_disparity + 1) / 4 - 1. 
37 | self._matching.set_maximum_disparity((maximum_disparity + 1) // 4 - 1)
38 |
39 | def pass_through_network(self, left_image, right_image):
40 | start_time = time.time()
41 | left_descriptor, shortcut_from_left = self._embedding(left_image)
42 | right_descriptor = self._embedding(right_image)[0]
43 | print("Embedding Duration:{:.4f}s".format(time.time()-start_time))
44 | print("leftdisc",left_descriptor.size())# 1 32 80 96
45 | print("sfl",shortcut_from_left.size()) # 1 8 80 96
46 | start_time = time.time()
47 | matching_signatures = self._matching(left_descriptor, right_descriptor)
48 | print("Matching Duration:{:.4f}s".format(time.time()-start_time))
49 | # print("after matching", matching_signatures.size())
50 | # after matching : 1 8 16 320 384
51 |
52 | start_time = time.time()
53 | output = self._regularization(matching_signatures, shortcut_from_left)
54 | print("Cost volume:{:.4f}s".format(time.time()-start_time))
55 | return output, shortcut_from_left
56 |
57 |
58 | def forward(self, left_image, right_image):
59 | """Returns sub-pixel disparity (or matching cost in training mode)."""
60 | network_output = self.pass_through_network(
61 | self._size_adapter.pad(left_image),
62 | self._size_adapter.pad(right_image))[0]
63 | if not self.training:
64 | network_output = self._estimator(network_output)
65 | return self._size_adapter.unpad(network_output)
66 |
67 | @staticmethod
68 | def default(maximum_disparity=255):
69 | """Returns network with default parameters."""
70 | network = PdsNetwork(
71 | size_adapter_module=size_adapter.SizeAdapter(),
72 | embedding_module=embedding.Embedding(),
73 | matching_module=matching.Matching(
74 | operation=matching.MatchingOperation(), maximum_disparity=0),
75 | regularization_module=regularization.Regularization(),
76 | estimator_module=estimator.SubpixelMap())
77 | network.set_maximum_disparity(maximum_disparity)
78 | return network
79 |
--------------------------------------------------------------------------------
/code_of_mvsec/DTC_SPADE_for_mvsec/model_LTC/regularization.py:
--------------------------------------------------------------------------------
1 | # Copyrights. All rights reserved.
2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland,
3 | # Space Center (eSpace), 2018
4 | # See the LICENSE.TXT file for more details.
5 |
6 | from torch import nn
7 |
8 | from model_LTC import network_blocks
9 |
10 |
11 | class ContractionBlock3d(nn.Module):
12 | """Contraction block that downsamples the input.
13 |
14 | The contraction blocks constitute the contraction part of
15 | the regularization network. Each block consists of a 2x
16 | "downsampling" convolution followed by a conventional "smoothing"
17 | convolution.
18 | """
19 |
20 | def __init__(self, number_of_features):
21 | super(ContractionBlock3d, self).__init__()
22 | self._downsampling_2x = \
23 | network_blocks.convolutional_block_3x3x3_stride_2(
24 | number_of_features, 2 * number_of_features)
25 | self._smoothing = network_blocks.convolutional_block_3x3x3(
26 | 2 * number_of_features, 2 * number_of_features)
27 |
28 | def forward(self, block_input):
29 | output_of_downsampling_2x = self._downsampling_2x(block_input)
30 | return output_of_downsampling_2x, self._smoothing(
31 | output_of_downsampling_2x)
32 |
33 |
34 | class ExpansionBlock3d(nn.Module):
35 | """Expansion block that upsamples the input.
36 |
37 | The expansion blocks constitute the expansion part of
38 | the regularization network.
Each block consists of a 2x
39 | "upsampling" transposed convolution and a
40 | conventional "smoothing" convolution. The output of the
41 | "upsampling" convolution is summed with the
42 | "shortcut_from_contraction" and is fed to the "smoothing"
43 | convolution.
44 | """
45 |
46 | def __init__(self, number_of_features):
47 | super(ExpansionBlock3d, self).__init__()
48 | self._upsampling_2x = \
49 | network_blocks.transposed_convolutional_block_4x4x4_stride_2(
50 | number_of_features, number_of_features // 2)
51 | self._smoothing = network_blocks.convolutional_block_3x3x3(
52 | number_of_features // 2, number_of_features // 2)
53 |
54 | def forward(self, block_input, shortcut_from_contraction):
55 | output_of_upsampling = self._upsampling_2x(block_input)
56 | return self._smoothing(output_of_upsampling +
57 | shortcut_from_contraction)
58 |
59 |
60 | class Regularization(nn.Module):
61 | """Regularization module that enforces stereo matching constraints.
62 |
63 | It is an hourglass 3D convolutional network that consists
64 | of contraction and expansion parts, with shortcut connections
65 | between them.
66 |
67 | The network downsamples the input 16x along the spatial
68 | and disparity dimensions and then upsamples it 64x along
69 | the spatial dimensions and 32x along the disparity
70 | dimension, effectively computing the matching cost only for even
71 | disparities.
72 | """
73 |
74 | def __init__(self, number_of_features=8):
75 | """Returns initialized regularization module."""
76 | super(Regularization, self).__init__()
77 | self._smoothing = network_blocks.convolutional_block_3x3x3(
78 | number_of_features, number_of_features)
79 | self._contraction_blocks = nn.ModuleList([
80 | ContractionBlock3d(number_of_features * scale)
81 | for scale in [1, 2, 4, 8]
82 | ])
83 | self._expansion_blocks = nn.ModuleList([
84 | ExpansionBlock3d(number_of_features * scale)
85 | for scale in [16, 8, 4, 2]
86 | ])
87 | self._upsample_to_halfsize = \
88 | network_blocks.transposed_convolutional_block_4x4x4_stride_211(
89 | number_of_features, number_of_features // 2)
90 | self._upsample_to_fullsize = \
91 | network_blocks.transposed_convolution_3x4x4_stride_122(
92 | number_of_features // 2, 1)
93 |
94 | # self.SEblock = network_blocks.SELayer(32)
95 | # self.eca_block_contract = network_blocks.eca_block(32)
96 | # self.eca_block = network_blocks.eca_block(128)
97 |
98 | def forward(self, matching_signatures, shortcut_from_left_image):
99 | """Returns regularized matching cost tensor.
100 |
101 | Args:
102 | matching_signatures: concatenated compact matching signatures
103 | for every disparity. It is a tensor of size
104 | (batch_size, number_of_features,
105 | maximum_disparity / 4, height / 4,
106 | width / 4).
107 | shortcut_from_left_image: shortcut connection from the left
108 | image descriptor. It has size of
109 | (batch_size, number_of_features, height / 4,
110 | width / 4);
111 |
112 | Returns:
113 | regularized matching cost tensor of size (batch_size,
114 | maximum_disparity / 2, height, width). Every element of this
115 | tensor along the disparity dimension is a matching cost for
116 | disparity 0, 2, .. , maximum_disparity.
117 | """ 118 | shortcuts_from_contraction = [] 119 | shortcut = shortcut_from_left_image.unsqueeze(2) 120 | output = self._smoothing(matching_signatures) 121 | for contraction_block in self._contraction_blocks: 122 | shortcuts_from_contraction.append(output) 123 | shortcut, output = contraction_block(shortcut + output) 124 | 125 | # output = self.eca_block_contract(output.squeeze(2)).unsqueeze(2) 126 | 127 | del shortcut 128 | for expansion_block in self._expansion_blocks: 129 | output = expansion_block(output, shortcuts_from_contraction.pop()) 130 | 131 | full_size_output = self._upsample_to_fullsize(self._upsample_to_halfsize(output)).squeeze_(1) 132 | # full_size_output = self.eca_block(full_size_output) 133 | # full_size_output = self.SEblock(full_size_output) 134 | # return self._upsample_to_fullsize( 135 | # self._upsample_to_halfsize(output)).squeeze_(1) 136 | return full_size_output 137 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_SPADE_for_mvsec/model_LTC/size_adapter.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | import math 7 | 8 | from torch import nn 9 | 10 | 11 | class SizeAdapter(object): 12 | """Converts size of input to standard size. 13 | 14 | Practical deep network works only with input images 15 | which height and width are multiples of a minimum size. 16 | This class allows to pass to the network images of arbitrary 17 | size, by padding the input to the closest multiple 18 | and unpadding the network's output to the original size. 19 | """ 20 | 21 | def __init__(self, minimum_size=64): 22 | self._minimum_size = minimum_size 23 | self._pixels_pad_to_width = None 24 | self._pixels_pad_to_height = None 25 | 26 | def _closest_larger_multiple_of_minimum_size(self, size): 27 | return int(math.ceil(size / self._minimum_size) * self._minimum_size) 28 | 29 | def pad(self, network_input): 30 | """Returns "network_input" paded with zeros to the "standard" size. 31 | 32 | The "standard" size correspond to the height and width that 33 | are closest multiples of "minimum_size". The method pads 34 | height and width and and saves padded values. These 35 | values are then used by "unpad_output" method. 36 | """ 37 | height, width = network_input.size()[-2:] 38 | self._pixels_pad_to_height = ( 39 | self._closest_larger_multiple_of_minimum_size(height) - height) 40 | self._pixels_pad_to_width = ( 41 | self._closest_larger_multiple_of_minimum_size(width) - width) 42 | return nn.ZeroPad2d((self._pixels_pad_to_width, 0, 43 | self._pixels_pad_to_height, 0))(network_input) 44 | 45 | def unpad(self, network_output): 46 | """Returns "network_output" cropped to the original size. 47 | 48 | The cropping is performed using values save by the "pad_input" 49 | method. 50 | """ 51 | return network_output[..., self._pixels_pad_to_height:, self. 52 | _pixels_pad_to_width:] 53 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_SPADE_for_mvsec/model_LTC/stay_embedding.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 
5 | 6 | from torch import nn 7 | 8 | from model_LTC import network_blocks 9 | from model_LTC.deform import DeformConv2d,DeformBottleneck,DeformSimpleBottleneck 10 | 11 | 12 | # class SELayer(nn.Module): 13 | # def __init__(self, channel, reduction=4): 14 | # super(SELayer, self).__init__() 15 | # self.avg_pool = nn.AdaptiveAvgPool2d(1) 16 | # self.fc = nn.Sequential( 17 | # nn.Linear(channel, channel // reduction, bias=False), 18 | # nn.ReLU(inplace=True), 19 | # nn.Linear(channel // reduction, channel, bias=False), 20 | # nn.Sigmoid() 21 | # ) 22 | # 23 | # def forward(self, x): 24 | # b, c, _, _ = x.size() 25 | # y = self.avg_pool(x).view(b, c) 26 | # y = self.fc(y).view(b, c, 1, 1) 27 | # return x * y.expand_as(x) 28 | 29 | 30 | 31 | class Embedding(nn.Module): 32 | """Embedding module.""" 33 | 34 | def __init__(self, 35 | number_of_input_features=3, 36 | number_of_embedding_features=16, 37 | number_of_shortcut_features=8, 38 | number_of_residual_blocks=2): 39 | """Returns initialized embedding module. 40 | 41 | Args: 42 | number_of_input_features: number of channels in the input image; 43 | number_of_embedding_features: number of channels in image's 44 | descriptor; 45 | number_of_shortcut_features: number of channels in the redirect 46 | connection descriptor; 47 | number_of_residual_blocks: number of residual blocks in embedding 48 | network. 49 | """ 50 | super(Embedding, self).__init__() 51 | embedding_modules = [ 52 | nn.InstanceNorm2d(number_of_input_features), 53 | network_blocks.convolutional_block_3x3( 54 | number_of_input_features, number_of_embedding_features), 55 | network_blocks.convolutional_block_3x3( 56 | number_of_embedding_features, number_of_embedding_features), 57 | # DeformConv2d(in_channels=number_of_input_features, out_channels=number_of_embedding_features, kernel_size=5, stride=2, dilation=1, padding=2), 58 | # DeformConv2d(in_channels=number_of_embedding_features, out_channels=number_of_embedding_features, kernel_size=5, stride=2, dilation=1, padding=2) 59 | ] 60 | embedding_modules += [ 61 | network_blocks.ResidualBlock(number_of_embedding_features) 62 | # DeformSimpleBottleneck(number_of_embedding_features, number_of_embedding_features,mdconv_dilation=1,padding=1) 63 | for _ in range(number_of_residual_blocks) 64 | ] 65 | 66 | ###################################channel-sise attention############################## 67 | # embedding_modules += [network_blocks.eca_block(number_of_embedding_features)] 68 | # embedding_modules += [network_blocks.SELayer(number_of_embedding_features)] 69 | 70 | ######################################################################################### 71 | 72 | 73 | 74 | self._embedding_modules = nn.ModuleList(embedding_modules) 75 | self._shortcut = network_blocks.convolutional_block_3x3( 76 | number_of_embedding_features, number_of_shortcut_features) 77 | # self.shortcut_eca_block = network_blocks.eca_block(number_of_shortcut_features) 78 | # self._shortcut = DeformConv2d(in_channels=number_of_embedding_features, out_channels=number_of_shortcut_features, kernel_size=3, stride=1, dilation=1, padding=1) 79 | 80 | 81 | def forward(self, image): 82 | """Returns image's descriptor and redirect connection descriptor. 83 | 84 | Args: 85 | image: color image of size 86 | batch_size x 3 x height x width; 87 | 88 | Returns: 89 | descriptor: image's descriptor of size 90 | batch_size x 64 x (height / 4) x (width / 4); 91 | shortcut_from_left_image: shortcut connection from left image 92 | descriptor (it is used in regularization network). 
It 93 | is tensor of size 94 | (batch_size, 8, height / 4, width / 4). 95 | """ 96 | descriptor = image 97 | for embedding_module in self._embedding_modules: 98 | descriptor = embedding_module(descriptor) 99 | # short_cut = self.shortcut_eca_block(self._shortcut(descriptor)) 100 | return descriptor, self._shortcut(descriptor) 101 | # return descriptor, short_cut 102 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_SPADE_for_mvsec/test.sh: -------------------------------------------------------------------------------- 1 | CUDA_VISIBLE_DEVICES=4 python run_experiment_L.py \ 2 | --experiment_folder experiments_test/sbt_bigger_embedding_spade_after_spatial_embedding_auto_adjust_size_with_DA \ 3 | --checkpoint_file experiments_train/sbt_bigger_embedding_spade_after_spatial_embedding_auto_adjust_size_with_DA/039_checkpoint.bin \ 4 | --test 5 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_SPADE_for_mvsec/train.sh: -------------------------------------------------------------------------------- 1 | CUDA_VISIBLE_DEVICES=0 python run_experiment_L.py \ 2 | --experiment_folder experiments_train/sbt_bigger_embedding_spade1234fused_after_spatial_embedding_auto_adjust_size_with_DA_test2 \ 3 | # --experiment_folder experiments_train/4_6test_noda \ 4 | 5 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/dataloader/__init__.py: -------------------------------------------------------------------------------- 1 | from .dataloader import StereoDataset 2 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/dataloader/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/dataloader/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/dataloader/__pycache__/dataloader.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/dataloader/__pycache__/dataloader.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/dataloader/__pycache__/dataset.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/dataloader/__pycache__/dataset.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/dataloader/__pycache__/dataset_constants.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/dataloader/__pycache__/dataset_constants.cpython-38.pyc 
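A hedged usage sketch for the `Embedding` module in stay_embedding.py above (assuming PyTorch and that `model_LTC` is importable, e.g. running from the DTC_SPADE_for_mvsec directory). Note that this variant uses stride-1 3x3 blocks, so it preserves the spatial resolution; the `height / 4` shapes in the docstring match the strided 5x5 variant used in embedding.py:

```python
import torch

from model_LTC.stay_embedding import Embedding

embedding = Embedding(number_of_input_features=3,
                      number_of_embedding_features=16,
                      number_of_shortcut_features=8)
image = torch.rand(1, 3, 64, 64)
descriptor, shortcut = embedding(image)
print(descriptor.size())  # 1 x 16 x 64 x 64 with these stride-1 blocks
print(shortcut.size())    # 1 x 8 x 64 x 64
```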
-------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/dataloader/__pycache__/transforms.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/dataloader/__pycache__/transforms.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/dataloader/dataloader.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from __future__ import division 3 | from __future__ import print_function 4 | 5 | from torch.utils.data import Dataset 6 | import os 7 | 8 | from utils import utils 9 | from utils.file_io import read_img, read_disp 10 | 11 | 12 | class StereoDataset(Dataset): 13 | def __init__(self, data_dir, 14 | dataset_name='SceneFlow', 15 | mode='train', 16 | save_filename=False, 17 | load_pseudo_gt=False, 18 | transform=None): 19 | super(StereoDataset, self).__init__() 20 | 21 | self.data_dir = data_dir 22 | self.dataset_name = dataset_name 23 | self.mode = mode 24 | self.save_filename = save_filename 25 | self.transform = transform 26 | 27 | sceneflow_finalpass_dict = { 28 | 'train': 'filenames/SceneFlow_finalpass_train.txt', 29 | 'val': 'filenames/SceneFlow_finalpass_val.txt', 30 | 'test': 'filenames/SceneFlow_finalpass_test.txt' 31 | } 32 | 33 | kitti_2012_dict = { 34 | 'train': 'filenames/KITTI_2012_train.txt', 35 | 'train_all': 'filenames/KITTI_2012_train_all.txt', 36 | 'val': 'filenames/KITTI_2012_val.txt', 37 | 'test': 'filenames/KITTI_2012_test.txt' 38 | } 39 | 40 | kitti_2015_dict = { 41 | 'train': 'filenames/KITTI_2015_train.txt', 42 | 'train_all': 'filenames/KITTI_2015_train_all.txt', 43 | 'val': 'filenames/KITTI_2015_val.txt', 44 | 'test': 'filenames/KITTI_2015_test.txt' 45 | } 46 | 47 | kitti_mix_dict = { 48 | 'train': 'filenames/KITTI_mix.txt', 49 | 'test': 'filenames/KITTI_2015_test.txt' 50 | } 51 | 52 | dataset_name_dict = { 53 | 'SceneFlow': sceneflow_finalpass_dict, 54 | 'KITTI2012': kitti_2012_dict, 55 | 'KITTI2015': kitti_2015_dict, 56 | 'KITTI_mix': kitti_mix_dict, 57 | } 58 | 59 | assert dataset_name in dataset_name_dict.keys() 60 | self.dataset_name = dataset_name 61 | 62 | self.samples = [] 63 | 64 | data_filenames = dataset_name_dict[dataset_name][mode] 65 | 66 | lines = utils.read_text_lines(data_filenames) 67 | 68 | for line in lines: 69 | splits = line.split() 70 | 71 | left_img, right_img = splits[:2] 72 | gt_disp = None if len(splits) == 2 else splits[2] 73 | 74 | sample = dict() 75 | 76 | if self.save_filename: 77 | sample['left_name'] = left_img.split('/', 1)[1] 78 | 79 | sample['left'] = os.path.join(data_dir, left_img) 80 | sample['right'] = os.path.join(data_dir, right_img) 81 | sample['disp'] = os.path.join(data_dir, gt_disp) if gt_disp is not None else None 82 | 83 | if load_pseudo_gt and sample['disp'] is not None: 84 | # KITTI 2015 85 | if 'disp_occ_0' in sample['disp']: 86 | sample['pseudo_disp'] = (sample['disp']).replace('disp_occ_0', 87 | 'disp_occ_0_pseudo_gt') 88 | # KITTI 2012 89 | elif 'disp_occ' in sample['disp']: 90 | sample['pseudo_disp'] = (sample['disp']).replace('disp_occ', 91 | 'disp_occ_pseudo_gt') 92 | else: 93 | raise NotImplementedError 94 | else: 95 | sample['pseudo_disp'] = None 96 | 97 | self.samples.append(sample) 98 | 
99 | def __getitem__(self, index): 100 | sample = {} 101 | sample_path = self.samples[index] 102 | 103 | if self.save_filename: 104 | sample['left_name'] = sample_path['left_name'] 105 | 106 | sample['left'] = read_img(sample_path['left'])  # [H, W, 3] 107 | sample['right'] = read_img(sample_path['right']) 108 | 109 | # GT disparity of the 'subset' version is negative; finalpass and cleanpass are positive. 110 | subset = 'subset' in self.dataset_name 111 | if sample_path['disp'] is not None: 112 | sample['disp'] = read_disp(sample_path['disp'], subset=subset)  # [H, W] 113 | if sample_path['pseudo_disp'] is not None: 114 | sample['pseudo_disp'] = read_disp(sample_path['pseudo_disp'], subset=subset)  # [H, W] 115 | 116 | if self.transform is not None: 117 | sample = self.transform(sample) 118 | 119 | return sample 120 | 121 | def __len__(self): 122 | return len(self.samples) 123 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/dataloader/dataset_constants.py: -------------------------------------------------------------------------------- 1 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: Apache-2.0 3 | import os 4 | 5 | DISPARITY_MULTIPLIER = 7.0 6 | TIME_BETWEEN_EXAMPLES = 0.05  # sec 7 | EXPERIMENTS = { 8 | 'indoor_flying': [1, 2, 3, 4], 9 | 'outdoor_day': [1, 2], 10 | 'outdoor_night': [1, 2, 3] 11 | } 12 | # Focal length multiplied by baseline [pix * meter]. 13 | FOCAL_LENGTH_X_BASELINE = { 14 | 'indoor_flying': 19.941772, 15 | 'outdoor_night': 19.651191, 16 | 'outdoor_day': 19.635287 17 | } 18 | INVALID_DISPARITY = 255 19 | DISPARITY_MAXIMUM = 37 20 | IMAGE_WIDTH = 346 21 | IMAGE_HEIGHT = 260 22 | 23 | 24 | def create_folders(paths): 25 | for name, folder in paths.items(): 26 | if isinstance(folder, dict): 27 | create_folders(folder) 28 | else: 29 | if 'folder' in name and not os.path.exists(folder): 30 | os.makedirs(folder) 31 | 32 | 33 | def experiment_paths(experiment_name, experiment_number, dataset_root): 34 | paths = {'cam0': {}, 'cam1': {}} 35 | 36 | paths['experiment_folder'] = os.path.join( 37 | dataset_root, '%s_%i' % (experiment_name, experiment_number)) 38 | 39 | for camera, value in {'cam0': 0, 'cam1': 1}.items(): 40 | paths[camera]['image_folder'] = os.path.join( 41 | paths['experiment_folder'], 'image%i' % value) 42 | paths[camera]['image_file'] = os.path.join( 43 | paths[camera]['image_folder'], '%0.6i.png') 44 | paths[camera]['event_folder'] = os.path.join( 45 | paths['experiment_folder'], 'event%i' % value) 46 | paths[camera]['event_file'] = os.path.join( 47 | paths[camera]['event_folder'], '%0.6i.npy') 48 | 49 | paths['timestamps_file'] = os.path.join(paths['experiment_folder'], 50 | 'timestamps.txt') 51 | paths['disparity_folder'] = os.path.join(paths['experiment_folder'], 52 | 'disparity_image') 53 | paths['disparity_file'] = os.path.join(paths['disparity_folder'], 54 | '%0.6i.png') 55 | paths['description'] = os.path.join(dataset_root, 'readme.txt') 56 | 57 | return paths 58 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/ltc_fixed_12345/035_checkpoint.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/ltc_fixed_12345/035_checkpoint.bin
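To make the MVSEC folder layout encoded in dataset_constants.py concrete, a small sketch of `experiment_paths` (the dataset root is a hypothetical example; run from the DTC_pds_for_mvsec directory so `dataloader` is importable):

```python
from dataloader.dataset_constants import create_folders, experiment_paths

paths = experiment_paths('indoor_flying', 2, '/data/mvsec')  # hypothetical root
print(paths['experiment_folder'])        # /data/mvsec/indoor_flying_2
print(paths['cam0']['event_file'] % 12)  # .../indoor_flying_2/event0/000012.npy
print(paths['disparity_file'] % 12)      # .../indoor_flying_2/disparity_image/000012.png
create_folders(paths)  # creates every missing '*folder*' entry
```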
-------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/ltc_fixed_12345/plot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/ltc_fixed_12345/plot.png -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/dataset.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/dataset.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/dataset_constants.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/dataset_constants.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/deform.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/deform.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/embedding.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/embedding.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/errors.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/errors.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/estimator.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/estimator.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/loss.cpython-38.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/loss.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/matching.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/matching.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/network.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/network.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/network_blocks.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/network_blocks.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/network_pds.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/network_pds.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/pds_trainer.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/pds_trainer.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/regularization.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/regularization.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/size_adapter.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/size_adapter.cpython-38.pyc 
-------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/temporal_aggregation.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/temporal_aggregation.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/trainer.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/trainer.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/trainer_pds.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/trainer_pds.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/transforms.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/transforms.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/visualization.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Huawei-BIC/Discrete_Time_Convolution_for_Fast_Event_Based_Stereo/ac3834eeeb98b03ff7b9e92ea87a800c02278184/code_of_mvsec/DTC_pds_for_mvsec/model_LTC/__pycache__/visualization.cpython-38.pyc -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/dataset_constants.py: -------------------------------------------------------------------------------- 1 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. 2 | # SPDX-License-Identifier: Apache-2.0 3 | import os 4 | 5 | DISPARITY_MULTIPLIER = 7.0 6 | TIME_BETWEEN_EXAMPLES = 0.05 # sec 7 | EXPERIMENTS = { 8 | 'indoor_flying': [1, 2, 3, 4], 9 | 'outdoor_day': [1, 2], 10 | 'outdoor_night': [1, 2, 3] 11 | } 12 | # Focal length multiplied by baseline [pix * meter]. 
13 | FOCAL_LENGTH_X_BASELINE = { 14 | 'indoor_flying': 19.941772, 15 | 'outdoor_night': 19.651191, 16 | 'outdoor_day': 19.635287 17 | } 18 | INVALID_DISPARITY = 255 19 | DISPARITY_MAXIMUM = 37 20 | IMAGE_WIDTH = 346 21 | IMAGE_HEIGHT = 260 22 | 23 | 24 | def create_folders(paths): 25 | for name, folder in paths.items(): 26 | if isinstance(folder, dict): 27 | create_folders(folder) 28 | else: 29 | if 'folder' in name and not os.path.exists(folder): 30 | os.makedirs(folder) 31 | 32 | 33 | def experiment_paths(experiment_name, experiment_number, dataset_root): 34 | paths = {'cam0': {}, 'cam1': {}} 35 | 36 | paths['experiment_folder'] = os.path.join( 37 | dataset_root, '%s_%i' % (experiment_name, experiment_number)) 38 | 39 | for camera, value in {'cam0': 0, 'cam1': 1}.items(): 40 | paths[camera]['image_folder'] = os.path.join( 41 | paths['experiment_folder'], 'image%i' % value) 42 | paths[camera]['image_file'] = os.path.join( 43 | paths[camera]['image_folder'], '%0.6i.png') 44 | paths[camera]['event_folder'] = os.path.join( 45 | paths['experiment_folder'], 'event%i' % value) 46 | paths[camera]['event_file'] = os.path.join( 47 | paths[camera]['event_folder'], '%0.6i.npy') 48 | 49 | paths['timestamps_file'] = os.path.join(paths['experiment_folder'], 50 | 'timestamps.txt') 51 | paths['disparity_folder'] = os.path.join(paths['experiment_folder'], 52 | 'disparity_image') 53 | paths['disparity_file'] = os.path.join(paths['disparity_folder'], 54 | '%0.6i.png') 55 | paths['description'] = os.path.join(dataset_root, 'readme.txt') 56 | 57 | return paths 58 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/embedding.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | from torch import nn 7 | 8 | from model_LTC import network_blocks 9 | # from model_LTC.deform import DeformConv2d,DeformBottleneck,DeformSimpleBottleneck 10 | 11 | 12 | # class SELayer(nn.Module): 13 | # def __init__(self, channel, reduction=4): 14 | # super(SELayer, self).__init__() 15 | # self.avg_pool = nn.AdaptiveAvgPool2d(1) 16 | # self.fc = nn.Sequential( 17 | # nn.Linear(channel, channel // reduction, bias=False), 18 | # nn.ReLU(inplace=True), 19 | # nn.Linear(channel // reduction, channel, bias=False), 20 | # nn.Sigmoid() 21 | # ) 22 | # 23 | # def forward(self, x): 24 | # b, c, _, _ = x.size() 25 | # y = self.avg_pool(x).view(b, c) 26 | # y = self.fc(y).view(b, c, 1, 1) 27 | # return x * y.expand_as(x) 28 | 29 | 30 | 31 | class Embedding(nn.Module): 32 | """Embedding module.""" 33 | 34 | def __init__(self, 35 | number_of_input_features=3, 36 | number_of_embedding_features=16, 37 | number_of_shortcut_features=8, 38 | number_of_residual_blocks=2): 39 | """Returns initialized embedding module. 40 | 41 | Args: 42 | number_of_input_features: number of channels in the input image; 43 | number_of_embedding_features: number of channels in image's 44 | descriptor; 45 | number_of_shortcut_features: number of channels in the redirect 46 | connection descriptor; 47 | number_of_residual_blocks: number of residual blocks in embedding 48 | network. 
49 | """ 50 | super(Embedding, self).__init__() 51 | embedding_modules = [ 52 | nn.InstanceNorm2d(number_of_input_features), 53 | network_blocks.convolutional_block_5x5_stride_2( 54 | number_of_input_features, number_of_embedding_features), 55 | network_blocks.convolutional_block_5x5_stride_2( 56 | number_of_embedding_features, number_of_embedding_features), 57 | # DeformConv2d(in_channels=number_of_input_features, out_channels=number_of_embedding_features, kernel_size=5, stride=2, dilation=1, padding=2), 58 | # DeformConv2d(in_channels=number_of_embedding_features, out_channels=number_of_embedding_features, kernel_size=5, stride=2, dilation=1, padding=2) 59 | ] 60 | embedding_modules += [ 61 | network_blocks.ResidualBlock(number_of_embedding_features) 62 | # DeformSimpleBottleneck(number_of_embedding_features, number_of_embedding_features,mdconv_dilation=1,padding=1) 63 | for _ in range(number_of_residual_blocks) 64 | ] 65 | 66 | ###################################channel-sise attention############################## 67 | # embedding_modules += [network_blocks.eca_block(number_of_embedding_features)] 68 | # embedding_modules += [network_blocks.SELayer(number_of_embedding_features)] 69 | 70 | ######################################################################################### 71 | 72 | 73 | 74 | self._embedding_modules = nn.ModuleList(embedding_modules) 75 | self._shortcut = network_blocks.convolutional_block_3x3( 76 | number_of_embedding_features, number_of_shortcut_features) 77 | # self.shortcut_eca_block = network_blocks.eca_block(number_of_shortcut_features) 78 | # self._shortcut = DeformConv2d(in_channels=number_of_embedding_features, out_channels=number_of_shortcut_features, kernel_size=3, stride=1, dilation=1, padding=1) 79 | 80 | 81 | def forward(self, image): 82 | """Returns image's descriptor and redirect connection descriptor. 83 | 84 | Args: 85 | image: color image of size 86 | batch_size x 3 x height x width; 87 | 88 | Returns: 89 | descriptor: image's descriptor of size 90 | batch_size x 64 x (height / 4) x (width / 4); 91 | shortcut_from_left_image: shortcut connection from left image 92 | descriptor (it is used in regularization network). It 93 | is tensor of size 94 | (batch_size, 8, height / 4, width / 4). 95 | """ 96 | descriptor = image 97 | for embedding_module in self._embedding_modules: 98 | descriptor = embedding_module(descriptor) 99 | # short_cut = self.shortcut_eca_block(self._shortcut(descriptor)) 100 | return descriptor, self._shortcut(descriptor) 101 | # return descriptor, short_cut 102 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/errors.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | import torch as th 7 | 8 | 9 | def compute_absolute_error(estimated_disparity, 10 | ground_truth_disparity, 11 | use_mean=True): 12 | """Returns pixel-wise and mean absolute error. 13 | 14 | Locations where ground truth is not avaliable do not contribute to mean 15 | absolute error. In such locations pixel-wise error is shown as zero. 16 | If ground truth is not avaliable in all locations, function returns 0. 17 | 18 | Args: 19 | ground_truth_disparity: ground truth disparity where locations with 20 | unknow disparity are set to inf's. 
21 | estimated_disparity: estimated disparity. 22 | use_mean: if True, then the mean is used to average pixel-wise errors; 23 | otherwise the median is used. 24 | """ 25 | absolute_difference = (estimated_disparity - ground_truth_disparity).abs() 26 | locations_without_ground_truth = th.isinf(ground_truth_disparity) 27 | pixelwise_absolute_error = absolute_difference.clone() 28 | pixelwise_absolute_error[locations_without_ground_truth] = 0 29 | absolute_difference_with_ground_truth = absolute_difference[ 30 | ~locations_without_ground_truth] 31 | if absolute_difference_with_ground_truth.numel() == 0: 32 | average_absolute_error = 0.0 33 | else: 34 | if use_mean: 35 | average_absolute_error = absolute_difference_with_ground_truth.mean( 36 | ).item() 37 | else: 38 | average_absolute_error = absolute_difference_with_ground_truth.median( 39 | ).item() 40 | return pixelwise_absolute_error, average_absolute_error 41 | 42 | 43 | def compute_n_pixels_error(estimated_disparity, ground_truth_disparity, n=3.0): 44 | """Returns the pixel-wise n-pixel error and the percentage of pixels with n-pixel error. 45 | 46 | Locations where ground truth is not available do not contribute to the mean 47 | n-pixel error. In such locations the pixel-wise error is shown as zero. 48 | 49 | Note that the n-pixel error is equal to one if 50 | |estimated_disparity-ground_truth_disparity| > n and zero otherwise. 51 | 52 | If ground truth is not available in any location, the function returns 0. 53 | 54 | Args: 55 | ground_truth_disparity: ground truth disparity where locations with 56 | unknown disparity are set to inf's. 57 | estimated_disparity: estimated disparity. 58 | n: maximum absolute disparity difference that does not trigger the 59 | n-pixel error. 60 | """ 61 | locations_without_ground_truth = th.isinf(ground_truth_disparity) 62 | more_than_n_pixels_absolute_difference = ( 63 | estimated_disparity - ground_truth_disparity).abs().gt(n).float() 64 | pixelwise_n_pixels_error = more_than_n_pixels_absolute_difference.clone() 65 | pixelwise_n_pixels_error[locations_without_ground_truth] = 0.0 66 | more_than_n_pixels_absolute_difference_with_ground_truth = \ 67 | more_than_n_pixels_absolute_difference[~locations_without_ground_truth] 68 | if more_than_n_pixels_absolute_difference_with_ground_truth.numel() == 0: 69 | percentage_of_pixels_with_error = 0.0 70 | else: 71 | percentage_of_pixels_with_error = \ 72 | more_than_n_pixels_absolute_difference_with_ground_truth.mean( 73 | ).item() * 100 74 | return pixelwise_n_pixels_error, percentage_of_pixels_with_error 75 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/estimator.py: -------------------------------------------------------------------------------- 1 | # © All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | from torch.nn import functional 7 | import torch as th 8 | 9 | 10 | class SubpixelMap(object): 11 | """Approximation of a sub-pixel MAP estimator. 12 | 13 | In every location (x, y), the function collects similarity scores 14 | for disparities in a vicinity of the disparity with the maximum similarity 15 | score and converts them to a disparity distribution using softmax. 16 | Next, the disparity in every location (x, y) is computed as the mean 17 | of this distribution. 18 | 19 | It is used only for inference.
20 | """ 21 | 22 | def __init__(self, half_support_window=4, disparity_step=2): 23 | super(SubpixelMap, self).__init__() 24 | """Returns object of SubpixelMap class. 25 | 26 | Args: 27 | disparity_step: step in pixels between near-by disparities in 28 | input "similarities" tensor. 29 | half_support_window: defines size of disparity window in pixels 30 | around disparity with maximum similarity, 31 | which is used to convert similarities 32 | to probabilities and compute mean. 33 | """ 34 | if disparity_step < 1: 35 | raise ValueError('"disparity_step" should be positive integer.') 36 | if half_support_window < 1: 37 | raise ValueError( 38 | '"half_support_window" should be positive integer.') 39 | if half_support_window % disparity_step != 0: 40 | raise ValueError('"half_support_window" should be multiple of the' 41 | '"disparity_step"') 42 | self._disparity_step = disparity_step 43 | self._half_support_window = half_support_window 44 | 45 | def __call__(self, similarities): 46 | """Returns sub-pixel disparity. 47 | 48 | Args: 49 | similarities: Tensor with similarities for every 50 | disparity and every location with indices 51 | [batch_index, disparity_index, y, x]. 52 | 53 | Returns: 54 | Tensor with disparities for every location with 55 | indices [batch_index, y, x]. 56 | """ 57 | # In every location (x, y) find disparity with maximum similarity 58 | # score. 59 | maximum_similarity, disparity_index_with_maximum_similarity = \ 60 | th.max(similarities, dim=1, keepdim=True) 61 | support_disparities, support_similarities = [], [] 62 | maximum_disparity_index = similarities.size(1) 63 | 64 | # Collect similarity scores for the disparities around the disparity 65 | # with the maximum similarity score. 66 | for disparity_index_shift in range( 67 | -self._half_support_window // self._disparity_step, 68 | self._half_support_window // self._disparity_step + 1): 69 | disparity_index = (disparity_index_with_maximum_similarity + 70 | disparity_index_shift).float() 71 | invalid_disparity_index_mask = ( 72 | (disparity_index < 0) | 73 | (disparity_index >= maximum_disparity_index)) 74 | disparity_index[invalid_disparity_index_mask] = 0 75 | nearby_similarities = th.gather(similarities, 1, 76 | disparity_index.long()) 77 | nearby_similarities[invalid_disparity_index_mask] = -float('inf') 78 | support_similarities.append(nearby_similarities) 79 | nearby_disparities = th.gather( 80 | (self._disparity_step * 81 | disparity_index).expand_as(similarities), 1, 82 | disparity_index.long()) 83 | support_disparities.append(nearby_disparities) 84 | support_similarities = th.stack(support_similarities, dim=1) 85 | support_disparities = th.stack(support_disparities, dim=1) 86 | 87 | # Convert collected similarity scores to the disparity distribution 88 | # using softmax and compute disparity as a mean of this distribution. 89 | probabilities = functional.softmax(support_similarities, dim=1) 90 | disparities = th.sum(probabilities * support_disparities.float(), 1) 91 | return disparities.squeeze(1) 92 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/loss.py: -------------------------------------------------------------------------------- 1 | # © All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 
5 | 6 | import torch as th 7 | 8 | from torch import nn 9 | from torch.nn import functional 10 | # import torch.distributed as dist 11 | 12 | def _unnormalized_laplace_probability(value, location, diversity): 13 | return th.exp(-th.abs(location - value) / diversity) / (2 * diversity) 14 | 15 | 16 | class SubpixelCrossEntropy(nn.Module): 17 | def __init__(self, diversity=1.0, disparity_step=2): 18 | """Returns SubpixelCrossEntropy object. 19 | 20 | Args: 21 | disparity_step: disparity difference between near-by 22 | disparity indices in "similarities" tensor. 23 | diversity: diversity of the target Laplace distribution, 24 | centered at the sub-pixel ground truth. 25 | """ 26 | super(SubpixelCrossEntropy, self).__init__() 27 | self._diversity = diversity 28 | self._disparity_step = disparity_step 29 | 30 | def forward(self, similarities, ground_truth_disparities, weights=None): 31 | """Returns sub-pixel cross-entropy loss. 32 | 33 | Cross-entropy is computed as 34 | 35 | - sum_d log( P_predicted(d) ) x P_target(d) 36 | ------------------------------------------------- 37 | sum_d P_target(d) 38 | 39 | We need to normalize the cross-entropy by sum_d P_target(d), 40 | since the target distribution is not normalized. 41 | 42 | Args: 43 | ground_truth_disparities: Tensor with ground truth disparities with 44 | indices [example_index, y, x]. The 45 | disparity values are floats. The locations with unknown 46 | disparities are filled with 'inf's. 47 | similarities: Tensor with similarities with indices 48 | [example_index, disparity_index, y, x]. 49 | weights: Tensor with weights of individual locations. 50 | """ 51 | maximum_disparity_index = similarities.size(1) 52 | known_ground_truth_disparity = ground_truth_disparities.data != float( 53 | 'inf') 54 | log_P_predicted = functional.log_softmax(similarities, dim=1) 55 | sum_P_target = th.zeros(ground_truth_disparities.size()) 56 | sum_P_target_x_log_P_predicted = th.zeros( 57 | ground_truth_disparities.size()) 58 | if similarities.is_cuda: 59 | sum_P_target = sum_P_target.cuda() 60 | sum_P_target_x_log_P_predicted = \ 61 | sum_P_target_x_log_P_predicted.cuda() 62 | for disparity_index in range(maximum_disparity_index): 63 | disparity = disparity_index * self._disparity_step 64 | P_target = _unnormalized_laplace_probability( 65 | value=disparity, 66 | location=ground_truth_disparities, 67 | diversity=self._diversity) 68 | sum_P_target += P_target 69 | sum_P_target_x_log_P_predicted += ( 70 | log_P_predicted[:, disparity_index] * P_target) 71 | entropy = -sum_P_target_x_log_P_predicted[ 72 | known_ground_truth_disparity] / sum_P_target[ 73 | known_ground_truth_disparity] 74 | if weights is not None: 75 | weights_with_ground_truth = weights[known_ground_truth_disparity] 76 | return (weights_with_ground_truth * entropy).sum() / ( 77 | weights_with_ground_truth.sum() + 1e-15) 78 | # print("entropy.mean",entropy.mean()) 79 | return entropy.mean() 80 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/network_blocks.py: -------------------------------------------------------------------------------- 1 | # © All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 
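A minimal training-side sketch for the `SubpixelCrossEntropy` loss in loss.py above (assuming PyTorch; the tensors are random stand-ins with the usual [batch, disparity, y, x] layout):

```python
import torch as th

from model_LTC.loss import SubpixelCrossEntropy

criterion = SubpixelCrossEntropy(diversity=1.0, disparity_step=2)
similarities = th.rand(2, 19, 30, 40, requires_grad=True)
ground_truth = th.randint(0, 37, (2, 30, 40)).float()
ground_truth[:, :5, :] = float('inf')  # rows without ground truth are ignored
loss = criterion(similarities, ground_truth)
loss.backward()  # gradients flow into the similarities
print(loss.item())
```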
5 | 6 | from torch import nn 7 | 8 | 9 | def convolution_3x3x3(number_of_input_features, number_of_output_features, 10 | stride): 11 | return nn.Conv3d( 12 | number_of_input_features, 13 | number_of_output_features, 14 | kernel_size=3, 15 | stride=stride, 16 | padding=1) 17 | 18 | 19 | def convolution_3x3(number_of_input_features, number_of_output_features): 20 | return nn.Conv2d( 21 | number_of_input_features, 22 | number_of_output_features, 23 | kernel_size=3, 24 | padding=1) 25 | 26 | 27 | def convolution_5x5_stride_2(number_of_input_features, 28 | number_of_output_features): 29 | return nn.Conv2d( 30 | number_of_input_features, 31 | number_of_output_features, 32 | kernel_size=5, 33 | stride=2, 34 | padding=2) 35 | 36 | 37 | def transposed_convolution_3x4x4_stride_122(number_of_input_features, 38 | number_of_output_features): 39 | return nn.ConvTranspose3d( 40 | number_of_input_features, 41 | number_of_output_features, 42 | kernel_size=(3, 4, 4), 43 | stride=(1, 2, 2), 44 | padding=(1, 1, 1)) 45 | 46 | 47 | def convolution_block_2D_with_relu_and_instance_norm(number_of_input_features, 48 | number_of_output_features, 49 | kernel_size, stride): 50 | return nn.Sequential( 51 | nn.Conv2d( 52 | number_of_input_features, 53 | number_of_output_features, 54 | kernel_size=kernel_size, 55 | stride=stride, 56 | padding=kernel_size // 2), 57 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 58 | nn.InstanceNorm2d(number_of_output_features, affine=True)) 59 | 60 | 61 | def convolution_block_3D_with_relu_and_instance_norm(number_of_input_features, 62 | number_of_output_features, 63 | kernel_size, stride): 64 | return nn.Sequential( 65 | nn.Conv3d( 66 | number_of_input_features, 67 | number_of_output_features, 68 | kernel_size=kernel_size, 69 | stride=stride, 70 | padding=kernel_size // 2), 71 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 72 | nn.InstanceNorm3d(number_of_output_features, affine=True)) 73 | 74 | 75 | def transposed_convololution_block_3D_with_relu_and_instance_norm( 76 | number_of_input_features, number_of_output_features, kernel_size, 77 | stride, padding): 78 | return nn.Sequential( 79 | nn.ConvTranspose3d( 80 | number_of_input_features, 81 | number_of_output_features, 82 | kernel_size=kernel_size, 83 | stride=stride, 84 | padding=padding), nn.LeakyReLU(negative_slope=0.1, inplace=True), 85 | nn.InstanceNorm3d(number_of_output_features, affine=True)) 86 | 87 | 88 | def convolutional_block_5x5_stride_2(number_of_input_features, 89 | number_of_output_features): 90 | return convolution_block_2D_with_relu_and_instance_norm( 91 | number_of_input_features, 92 | number_of_output_features, 93 | kernel_size=5, 94 | stride=2) 95 | 96 | 97 | def convolutional_block_3x3(number_of_input_features, 98 | number_of_output_features): 99 | return convolution_block_2D_with_relu_and_instance_norm( 100 | number_of_input_features, 101 | number_of_output_features, 102 | kernel_size=3, 103 | stride=1) 104 | 105 | 106 | def convolutional_block_3x3x3(number_of_input_features, 107 | number_of_output_features): 108 | return convolution_block_3D_with_relu_and_instance_norm( 109 | number_of_input_features, 110 | number_of_output_features, 111 | kernel_size=3, 112 | stride=1) 113 | 114 | 115 | def convolutional_block_3x3x3_stride_2(number_of_input_features, 116 | number_of_output_features): 117 | return convolution_block_3D_with_relu_and_instance_norm( 118 | number_of_input_features, 119 | number_of_output_features, 120 | kernel_size=3, 121 | stride=2) 122 | 123 | 124 | def 
transposed_convolutional_block_4x4x4_stride_2(number_of_input_features, 125 | number_of_output_features): 126 | return transposed_convololution_block_3D_with_relu_and_instance_norm( 127 | number_of_input_features, 128 | number_of_output_features, 129 | kernel_size=4, 130 | stride=2, 131 | padding=1) 132 | 133 | 134 | class ResidualBlock(nn.Module): 135 | """Residual block with nonlinearity before addition.""" 136 | 137 | def __init__(self, number_of_features): 138 | super(ResidualBlock, self).__init__() 139 | self.convolutions = nn.Sequential( 140 | convolutional_block_3x3(number_of_features, number_of_features), 141 | convolutional_block_3x3(number_of_features, number_of_features)) 142 | 143 | def forward(self, block_input): 144 | return self.convolutions(block_input) + block_input 145 | 146 | 147 | class SELayer(nn.Module): 148 | def __init__(self, channel, reduction=4): 149 | super(SELayer, self).__init__() 150 | self.avg_pool = nn.AdaptiveAvgPool2d(1) 151 | self.fc = nn.Sequential( 152 | nn.Linear(channel, channel // reduction, bias=False), 153 | nn.ReLU(inplace=True), 154 | nn.Linear(channel // reduction, channel, bias=False), 155 | nn.Sigmoid() 156 | ) 157 | 158 | def forward(self, x): 159 | b, c, _, _ = x.size() 160 | y = self.avg_pool(x).view(b, c) 161 | y = self.fc(y).view(b, c, 1, 1) 162 | return x * y.expand_as(x) 163 | 164 | 165 | class eca_block(nn.Module): 166 | def __init__(self, channel, k_size=3): 167 | super(eca_block, self).__init__() 168 | self.avg_pool = nn.AdaptiveAvgPool2d(1) 169 | self.conv = nn.Conv1d(1,1,kernel_size=k_size, padding=(k_size-1)//2, bias=False) 170 | self.sigmoid = nn.Sigmoid() 171 | 172 | def forward(self, x): 173 | b, c, _, _ = x.size() 174 | y = self.avg_pool(x) 175 | y = self.conv(y.squeeze(-1).transpose(-1,-2)).transpose(-1,-2).unsqueeze(-1) 176 | y = self.sigmoid(y) 177 | 178 | return x * y.expand_as(x) 179 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/network_pds.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | from torch import nn 6 | from model_LTC import embedding 7 | from model_LTC import estimator 8 | from model_LTC import matching 9 | from model_LTC import regularization 10 | from model_LTC import size_adapter 11 | 12 | import time 13 | 14 | 15 | class PdsNetwork(nn.Module): 16 | """Practical Deep Stereo (PDS) network.""" 17 | 18 | def __init__(self, size_adapter_module, embedding_module, matching_module, 19 | regularization_module, estimator_module): 20 | super(PdsNetwork, self).__init__() 21 | self._size_adapter = size_adapter_module 22 | self._embedding = embedding_module 23 | self._matching = matching_module 24 | self._regularization = regularization_module 25 | self._estimator = estimator_module 26 | 27 | def set_maximum_disparity(self, maximum_disparity): 28 | """Reconfigure network for different disparity range.""" 29 | if (maximum_disparity + 1) % 64 != 0: 30 | raise ValueError( 31 | '"maximum_disparity" + 1 should be multiple of 64, e.g.,' 32 | '"maximum disparity" can be equal to 63, 191, 255, 319...') 33 | self._maximum_disparity = maximum_disparity 34 | # During the embedding spatial dimensions of an input are downsampled 35 | # 4x times. 
Therefore, "maximum_disparity" of matching module is 36 | # computed as (maximum_disparity + 1) / 4 - 1. 37 | self._matching.set_maximum_disparity((maximum_disparity + 1) // 4 - 1) 38 | 39 | def pass_through_network(self, left_image, right_image): 40 | start_time = time.time() 41 | left_descriptor, shortcut_from_left = self._embedding(left_image) 42 | right_descriptor = self._embedding(right_image)[0] 43 | print("Embedding Duration:{:.4f}s".format(time.time()-start_time)) 44 | print("leftdisc",left_descriptor.size())# 1 32 80 96 45 | print("sfl",shortcut_from_left.size()) # 1 8 80 96 46 | start_time = time.time() 47 | matching_signatures = self._matching(left_descriptor, right_descriptor) 48 | print("Matching Duration:{:.4f}s".format(time.time()-start_time)) 49 | # print("after matching", matching_signatures.size()) 50 | # after matching : 1 8 16 320 384 51 | 52 | start_time = time.time() 53 | output = self._regularization(matching_signatures, shortcut_from_left) 54 | print("Cost volumn:{:.4f}s".format(time.time()-start_time)) 55 | return output, shortcut_from_left 56 | 57 | 58 | def forward(self, left_image, right_image): 59 | """Returns sub-pixel disparity (or matching cost in training mode).""" 60 | network_output = self.pass_through_network( 61 | self._size_adapter.pad(left_image), 62 | self._size_adapter.pad(right_image))[0] 63 | if not self.training: 64 | network_output = self._estimator(network_output) 65 | return self._size_adapter.unpad(network_output) 66 | 67 | @staticmethod 68 | def default(maximum_disparity=255): 69 | """Returns network with default parameters.""" 70 | network = PdsNetwork( 71 | size_adapter_module=size_adapter.SizeAdapter(), 72 | embedding_module=embedding.Embedding(), 73 | matching_module=matching.Matching( 74 | operation=matching.MatchingOperation(), maximum_disparity=0), 75 | regularization_module=regularization.Regularization(), 76 | estimator_module=estimator.SubpixelMap()) 77 | network.set_maximum_disparity(maximum_disparity) 78 | return network 79 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/regularization.py: -------------------------------------------------------------------------------- 1 | # Copirights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | from torch import nn 7 | 8 | from model_LTC import network_blocks 9 | 10 | 11 | class ContractionBlock3d(nn.Module): 12 | """Contraction block, that downsamples the input. 13 | 14 | The contraction blocks constitute the contraction part of 15 | the regularization network. Each block consists of 2x 16 | "donwsampling" convolution followed by conventional "smoothing" 17 | convolution. 18 | """ 19 | 20 | def __init__(self, number_of_features): 21 | super(ContractionBlock3d, self).__init__() 22 | self._downsampling_2x = \ 23 | network_blocks.convolutional_block_3x3x3_stride_2( 24 | number_of_features, 2 * number_of_features) 25 | self._smoothing = network_blocks.convolutional_block_3x3x3( 26 | 2 * number_of_features, 2 * number_of_features) 27 | 28 | def forward(self, block_input): 29 | output_of_downsampling_2x = self._downsampling_2x(block_input) 30 | return output_of_downsampling_2x, self._smoothing( 31 | output_of_downsampling_2x) 32 | 33 | 34 | class ExpansionBlock3d(nn.Module): 35 | """Expansion block, that upsamples the input. 
36 | 37 | The expansion blocks constitute the expansion part of 38 | the regularization network. Each block consists of a 2x 39 | "upsampling" transposed convolution and 40 | a conventional "smoothing" convolution. The output of the 41 | "upsampling" convolution is summed with the 42 | "shortcut_from_contraction" and is fed to the "smoothing" 43 | convolution. 44 | """ 45 | 46 | def __init__(self, number_of_features): 47 | super(ExpansionBlock3d, self).__init__() 48 | self._upsampling_2x = \ 49 | network_blocks.transposed_convolutional_block_4x4x4_stride_2( 50 | number_of_features, number_of_features // 2) 51 | self._smoothing = network_blocks.convolutional_block_3x3x3( 52 | number_of_features // 2, number_of_features // 2) 53 | 54 | def forward(self, block_input, shortcut_from_contraction): 55 | output_of_upsampling = self._upsampling_2x(block_input) 56 | return self._smoothing(output_of_upsampling + 57 | shortcut_from_contraction) 58 | 59 | 60 | class Regularization(nn.Module): 61 | """Regularization module that enforces stereo matching constraints. 62 | 63 | It is an hourglass 3D convolutional network that consists 64 | of contraction and expansion parts, with shortcut connections 65 | between them. 66 | 67 | The network downsamples the input 16x along the spatial 68 | and disparity dimensions and then upsamples it 64x along 69 | the spatial dimensions and 32x along the disparity 70 | dimension, effectively computing the matching cost only for even 71 | disparities. 72 | """ 73 | 74 | def __init__(self, number_of_features=8): 75 | """Returns initialized regularization module.""" 76 | super(Regularization, self).__init__() 77 | self._smoothing = network_blocks.convolutional_block_3x3x3( 78 | number_of_features, number_of_features) 79 | self._contraction_blocks = nn.ModuleList([ 80 | ContractionBlock3d(number_of_features * scale) 81 | for scale in [1, 2, 4, 8] 82 | ]) 83 | self._expansion_blocks = nn.ModuleList([ 84 | ExpansionBlock3d(number_of_features * scale) 85 | for scale in [16, 8, 4, 2] 86 | ]) 87 | self._upsample_to_halfsize = \ 88 | network_blocks.transposed_convolutional_block_4x4x4_stride_2( 89 | number_of_features, number_of_features // 2) 90 | self._upsample_to_fullsize = \ 91 | network_blocks.transposed_convolution_3x4x4_stride_122( 92 | number_of_features // 2, 1) 93 | 94 | # self.SEblock = network_blocks.SELayer(32) 95 | # self.eca_block_contract = network_blocks.eca_block(32) 96 | # self.eca_block = network_blocks.eca_block(128) 97 | 98 | def forward(self, matching_signatures, shortcut_from_left_image): 99 | """Returns regularized matching cost tensor. 100 | 101 | Args: 102 | matching_signatures: concatenated compact matching signatures 103 | for every disparity. It is a tensor of size 104 | (batch_size, number_of_features, 105 | maximum_disparity / 4, height / 4, 106 | width / 4). 107 | shortcut_from_left_image: shortcut connection from the left 108 | image descriptor. It has size 109 | (batch_size, number_of_features, height / 4, 110 | width / 4); 111 | 112 | Returns: 113 | regularized matching cost tensor of size (batch_size, 114 | maximum_disparity / 2, height, width). Every element of this 115 | tensor along the disparity dimension is a matching cost for 116 | disparities 0, 2, ..., maximum_disparity.
117 | """ 118 | shortcuts_from_contraction = [] 119 | shortcut = shortcut_from_left_image.unsqueeze(2) 120 | output = self._smoothing(matching_signatures) 121 | for contraction_block in self._contraction_blocks: 122 | shortcuts_from_contraction.append(output) 123 | shortcut, output = contraction_block(shortcut + output) 124 | 125 | # output = self.eca_block_contract(output.squeeze(2)).unsqueeze(2) 126 | 127 | del shortcut 128 | for expansion_block in self._expansion_blocks: 129 | output = expansion_block(output, shortcuts_from_contraction.pop()) 130 | 131 | full_size_output = self._upsample_to_fullsize(self._upsample_to_halfsize(output)).squeeze_(1) 132 | # full_size_output = self.eca_block(full_size_output) 133 | # full_size_output = self.SEblock(full_size_output) 134 | # return self._upsample_to_fullsize( 135 | # self._upsample_to_halfsize(output)).squeeze_(1) 136 | return full_size_output 137 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/model_LTC/size_adapter.py: -------------------------------------------------------------------------------- 1 | # Copyrights. All rights reserved. 2 | # ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE, Switzerland, 3 | # Space Center (eSpace), 2018 4 | # See the LICENSE.TXT file for more details. 5 | 6 | import math 7 | 8 | from torch import nn 9 | 10 | 11 | class SizeAdapter(object): 12 | """Converts size of input to standard size. 13 | 14 | Practical deep network works only with input images 15 | which height and width are multiples of a minimum size. 16 | This class allows to pass to the network images of arbitrary 17 | size, by padding the input to the closest multiple 18 | and unpadding the network's output to the original size. 19 | """ 20 | 21 | def __init__(self, minimum_size=64): 22 | self._minimum_size = minimum_size 23 | self._pixels_pad_to_width = None 24 | self._pixels_pad_to_height = None 25 | 26 | def _closest_larger_multiple_of_minimum_size(self, size): 27 | return int(math.ceil(size / self._minimum_size) * self._minimum_size) 28 | 29 | def pad(self, network_input): 30 | """Returns "network_input" paded with zeros to the "standard" size. 31 | 32 | The "standard" size correspond to the height and width that 33 | are closest multiples of "minimum_size". The method pads 34 | height and width and and saves padded values. These 35 | values are then used by "unpad_output" method. 36 | """ 37 | height, width = network_input.size()[-2:] 38 | self._pixels_pad_to_height = ( 39 | self._closest_larger_multiple_of_minimum_size(height) - height) 40 | self._pixels_pad_to_width = ( 41 | self._closest_larger_multiple_of_minimum_size(width) - width) 42 | return nn.ZeroPad2d((self._pixels_pad_to_width, 0, 43 | self._pixels_pad_to_height, 0))(network_input) 44 | 45 | def unpad(self, network_output): 46 | """Returns "network_output" cropped to the original size. 47 | 48 | The cropping is performed using values save by the "pad_input" 49 | method. 50 | """ 51 | return network_output[..., self._pixels_pad_to_height:, self. 
52 | _pixels_pad_to_width:] 53 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/test.sh: -------------------------------------------------------------------------------- 1 | CUDA_VISIBLE_DEVICES=3 python run_experiment_L.py \ 2 | --experiment_folder experiments_test/ltc_fixed_12345 \ 3 | --checkpoint_file ltc_fixed_12345/035_checkpoint.bin \ 4 | --test_mode 5 | -------------------------------------------------------------------------------- /code_of_mvsec/DTC_pds_for_mvsec/train.sh: -------------------------------------------------------------------------------- 1 | CUDA_VISIBLE_DEVICES=0 python run_experiment_L.py \ 2 | --experiment_folder experiments_train/test \ 3 | 4 | -------------------------------------------------------------------------------- /code_of_mvsec/data_preprocess.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import os 3 | from tqdm.notebook import tqdm 4 | 5 | # The SBT method: bin the events preceding each timestamp into channel_per_frame slices of ms_per_channel each and keep the sign of the summed polarity. 6 | def sbt_frame(events, start_time, height=260, width=346, ms_per_channel=10, channel_per_frame=5): 7 | num_events = events.shape[0] 8 | final_frame = np.zeros((channel_per_frame, height, width)) 9 | for i in range(num_events): 10 | channel_index = int((start_time - events[i, 0]) * 1000 // ms_per_channel) 11 | x_position = int(events[i, 1]) 12 | y_position = int(events[i, 2]) 13 | polarity = int(events[i, 3]) 14 | assert 0 <= channel_index <= 4, "channel_index should be in [0, 4]" 15 | final_frame[channel_index, y_position, x_position] += polarity 16 | return np.sign(final_frame) 17 | 18 | def get_paths(experiment_number): 19 | paths = {} 20 | root = "indoor_flying_%d/" % experiment_number 21 | paths['data_folder0'] = root + "event0/" 22 | paths['data_folder1'] = root + "event1/" 23 | paths['experiment_folder0'] = root + "event0_10ms_frame/" 24 | paths['experiment_folder1'] = root + "event1_10ms_frame/" 25 | paths['timestamp'] = root + 'timestamps.txt' 26 | if not os.path.exists(paths['experiment_folder0']): 27 | os.makedirs(paths['experiment_folder0']) 28 | if not os.path.exists(paths['experiment_folder1']): 29 | os.makedirs(paths['experiment_folder1']) 30 | return paths 31 | 32 | experiment_numbers = [1, 2, 3, 4] 33 | 34 | pbr0 = tqdm(total=len(experiment_numbers)) 35 | for experiment_number in experiment_numbers: 36 | print("processing indoor_flying%d" % experiment_number) 37 | paths = get_paths(experiment_number) 38 | timestamp_file = np.loadtxt(paths['timestamp']) 39 | num_image = len(timestamp_file) 40 | pbr = tqdm(total=num_image) 41 | for i in range(num_image): 42 | file_name = "%06d.npy" % i 43 | events0 = np.load(paths["data_folder0"] + file_name) 44 | events1 = np.load(paths["data_folder1"] + file_name) 45 | 46 | start_time = timestamp_file[i] 47 | event_frame0 = sbt_frame(events0, start_time) 48 | event_frame1 = sbt_frame(events1, start_time) 49 | 50 | np.save(paths["experiment_folder0"] + file_name, event_frame0) 51 | np.save(paths["experiment_folder1"] + file_name, event_frame1) 52 | pbr.update(1) 53 | 54 | pbr.close() 55 | pbr0.update(1) 56 | 57 | 58 | pbr0.close() -------------------------------------------------------------------------------- /readme.md: -------------------------------------------------------------------------------- 1 | # Discrete time convolution for fast event-based stereo 2 | This code is a demo of our CVPR 2022 paper "Discrete time convolution for 3 | fast event-based stereo". 4 | 5 | 6 | # dataset 7 | Change the MVSEC/DSEC dataset address in dataset.py.
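For reference, the SBT frames built by code_of_mvsec/data_preprocess.py (shown above) stack five 10 ms binary slices per 50 ms example. A minimal sketch of the binning on synthetic events, assuming the script's column order (t, x, y, polarity); note that importing the script as-is would also trigger its top-level processing loop, so in practice you would copy `sbt_frame` out or guard the loop:

```python
import numpy as np

from data_preprocess import sbt_frame  # or copy the function into your script

# Three events inside the 50 ms window that ends at start_time = 1.00 s.
events = np.array([
    [0.995, 10, 20, 1],   # 5 ms before the end  -> channel 0
    [0.985, 11, 20, -1],  # 15 ms before the end -> channel 1
    [0.955, 12, 21, 1],   # 45 ms before the end -> channel 4
])
frame = sbt_frame(events, start_time=1.00)
print(frame.shape)       # (5, 260, 346)
print(frame[0, 20, 10])  # 1.0, the sign of the summed polarity
print(frame[1, 20, 11])  # -1.0
```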
8 | 9 | Data preprocessing for DSEC: \ 10 | Follow https://github.com/uzh-rpg/DSEC. 11 | 12 | Data preprocessing for MVSEC (SBT): \ 13 | Step 1: Convert the h5 files to npy files: follow https://github.com/tlkvstepan/event_stereo_ICCV2019. \ 14 | Step 2: Use the SBT method to process the npy files: `python code_of_mvsec/data_preprocess.py` 15 | 16 | 17 | # code for mvsec 18 | - code_of_mvsec/DTC_pds_for_mvsec \ 19 | usage: DTC-pds 20 | 21 | - code_of_mvsec/DTC_SPADE_for_mvsec \ 22 | usage: DTC-spade 23 | 24 | For the training/test procedure, just execute: \ 25 | bash train.sh or bash test.sh 26 | 27 | 28 | 29 | # code for Dsec 30 | - code_of_Dsec/LTC_Dsec \ 31 | usage: generate the DSEC website test result with DTC-PDS 32 | 33 | - code_of_Dsec/LTC_Dsec_spade \ 34 | usage: generate the DSEC website test result with DTC-SPADE 35 | 36 | - code_of_Dsec/LTC_for_Dsec/LTC_Dsec_clear_version \ 37 | usage: DTC-PDS 38 | 39 | - code_of_Dsec/LTC_for_Dsec/LTC_Dsec_spade_clear_version \ 40 | usage: DTC-SPADE 41 | 42 | For the training/test procedure, just execute: \ 43 | bash train.sh or bash test.sh 44 | 45 | 46 | # Paper Reference 47 | @inproceedings{zhang2022discrete,\ 48 | title={Discrete Time Convolution for Fast Event-Based Stereo},\ 49 | author={Zhang, Kaixuan and Che, Kaiwei and Zhang, Jianguo and Cheng, Jie and Zhang, Ziyang and Guo, Qinghai and Leng, Luziwei},\ 50 | booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},\ 51 | pages={8676--8686},\ 52 | year={2022}\ 53 | } 54 | 55 | Our code is developed based on the code of the ICCV 2019 paper "Learning an event sequence embedding 56 | for dense event-based deep stereo". 57 | 58 | paper: https://openaccess.thecvf.com/content_ICCV_2019/papers/Tulyakov_Learning_an_Event_Sequence_Embedding_for_Dense_Event-Based_Deep_Stereo_ICCV_2019_paper.pdf \ 59 | code: https://github.com/tlkvstepan/event_stereo_ICCV2019 60 | 61 | 62 | # License 63 | This open-source project is not an official Huawei product, and Huawei is not expected to provide support for this project. 64 | 65 | --------------------------------------------------------------------------------
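Finally, a hedged end-to-end inference sketch for the PDS-style network in code_of_mvsec/DTC_pds_for_mvsec/model_LTC/network_pds.py (assuming PyTorch and an importable `model_LTC`; the sizes are illustrative, and remember that `maximum_disparity + 1` must be a multiple of 64):

```python
import torch

from model_LTC.network_pds import PdsNetwork

network = PdsNetwork.default(maximum_disparity=63)
network.eval()  # in eval mode the SubpixelMap estimator returns disparities

left = torch.rand(1, 3, 260, 346)
right = torch.rand(1, 3, 260, 346)
with torch.no_grad():
    disparity = network(left, right)  # 1 x 260 x 346 sub-pixel disparity map
print(disparity.size())
```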