├── README.md
├── download.sh
├── log
│   ├── events.out.tfevents.1505315272.deeplearning
│   ├── events.out.tfevents.1505316794.deeplearning
│   ├── events.out.tfevents.1505317347.deeplearning
│   ├── events.out.tfevents.1505317478.deeplearning
│   ├── events.out.tfevents.1505317537.deeplearning
│   ├── events.out.tfevents.1505317667.deeplearning
│   ├── events.out.tfevents.1505317740.deeplearning
│   ├── events.out.tfevents.1505317846.deeplearning
│   ├── events.out.tfevents.1505318058.deeplearning
│   ├── events.out.tfevents.1505318207.deeplearning
│   ├── events.out.tfevents.1505318578.deeplearning
│   ├── events.out.tfevents.1505318869.deeplearning
│   ├── events.out.tfevents.1505319774.deeplearning
│   ├── events.out.tfevents.1505320164.deeplearning
│   ├── events.out.tfevents.1505329935.deeplearning
│   ├── events.out.tfevents.1505330193.deeplearning
│   ├── events.out.tfevents.1505358839.deeplearning
│   ├── events.out.tfevents.1505371238.deeplearning
│   ├── events.out.tfevents.1505392293.deeplearning
│   ├── events.out.tfevents.1505407743.deeplearning
│   ├── events.out.tfevents.1505410895.deeplearning
│   ├── events.out.tfevents.1505418187.deeplearning
│   ├── events.out.tfevents.1505421208.deeplearning
│   ├── events.out.tfevents.1505433857.deeplearning
│   ├── events.out.tfevents.1505505945.deeplearning
│   ├── events.out.tfevents.1505506012.deeplearning
│   ├── events.out.tfevents.1505506266.deeplearning
│   ├── events.out.tfevents.1505506302.deeplearning
│   ├── events.out.tfevents.1505579898.deeplearning
│   ├── events.out.tfevents.1505580836.deeplearning
│   ├── events.out.tfevents.1505667583.deeplearning
│   └── events.out.tfevents.1505667753.deeplearning
├── model
│   ├── model
│   │   └── shared_cost_weight.hdf5
│   └── shared_cost_weight.hdf5
├── src
│   ├── conv3dTranspose.py
│   ├── conv3dTranspose.pyc
│   ├── custom_callback.py
│   ├── custom_callback.pyc
│   ├── data_utils.py
│   ├── data_utils.pyc
│   ├── environment.json
│   ├── gcnetwork.py
│   ├── gcnetwork.pyc
│   ├── generator.py
│   ├── generator.pyc
│   ├── hyperparams.json
│   ├── load_pfm.py
│   ├── load_pfm.pyc
│   ├── losses.py
│   ├── losses.pyc
│   ├── parse_arguments.py
│   ├── parse_arguments.pyc
│   ├── test_params.json
│   ├── train_params.json
│   └── util_params.json
├── test.py
└── train.py
/README.md:
--------------------------------------------------------------------------------
1 | # Geometry and Context Network (On-Going Project)
2 | A Keras implementation of GC-Net by HungShi Lin (hl2997@columbia.edu). The paper can be found [here](https://arxiv.org/abs/1703.04309).
3 | I made two modifications: adding a linear output function and enabling training of the highway block at the second stage.
4 | 
5 | ### Issue
6 | 1. Model performance does not yet match that reported in the original paper.
7 | 
8 | ### Update (10/06/2017)
9 | The model can now be trained with images of size (256, 512).
10 | 
11 | ### Software Requirement
12 | tensorflow ([install from here](https://www.tensorflow.org/install/)), keras ([install from here](https://keras.io/#installation))
13 | 
14 | ### Data used for training model
15 | I trained the model for 2 epochs on the [driving (finalpass) dataset](https://lmb.informatik.uni-freiburg.de/resources/datasets/SceneFlowDatasets.en.html), which contains more than 4000 stereo image pairs.
16 | 
17 | ### Preprocessing
18 | We crop training patches from the training images and normalize each channel; the crop size is set in src/util_params.json (256x512 as shipped, which differs from the paper). A sketch of this step follows.
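For reference, a minimal sketch of this crop-and-normalize step as implemented in src/generator.py (crop sizes come from src/util_params.json; the helper names below are illustrative, not part of the repo):

    import random
    import numpy as np

    def center_image(img):
        # per-channel standardization, as in _centerImage_ in src/generator.py
        img = img.astype(np.float32)
        mean = np.mean(img, axis=(0, 1), keepdims=True)
        var = np.var(img, axis=(0, 1), keepdims=True)
        return (img - mean) / np.sqrt(var)

    def random_crop(left, right, disp, crop_height=256, crop_width=512):
        # the same offset is applied to both views and the disparity map
        h, w = left.shape[:2]
        y = random.randint(0, h - crop_height)
        x = random.randint(0, w - crop_width)
        sl = (slice(y, y + crop_height), slice(x, x + crop_width))
        return center_image(left[sl]), center_image(right[sl]), disp[sl]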
19 | 
20 | ### Download
21 | Run the following command:
22 | #### 
23 | git clone https://github.com/LinHungShi/GCNetwork.git
24 | 
25 | ### Two ways to download the driving dataset:
26 | 1. Create the subdirectories sceneflow/driving in data, then download and untar driving_finalpass and driving_disparity from [here](https://lmb.informatik.uni-freiburg.de/resources/datasets/SceneFlowDatasets.en.html).
27 | 2. You can also run
28 | #### 
29 | sh download.sh
30 | which will create the subdirectories and download the datasets.
31 | 
32 | ### Train the model
33 | Run the following command:
34 | #### 
35 | python train.py
36 | 
37 | ### Predict data with test.py
38 | 1. Create a directory that contains two subdirectories -- left and right.
39 | 2. Run the following command
40 | #### 
41 | python test.py -data <data directory>
42 | 3. By default the prediction is saved as an npy file named prediction.npy; pass -pspath to change the save path.
43 | 
44 | ### (Optional) Use pretrained weights
45 | 1. Set the weight paths in src/train_params.json.
46 | 2. Run python train.py (pretrained weights are loaded by default; pass -upw 0 to train from scratch).
47 | 
48 | ### Something you might want to do
49 | 1. To enable training with the Monkaa dataset (set train_monkaa to 1 in src/environment.json),
50 | a. Download the Monkaa dataset from the previous link.
51 | 
52 | b. Create a directory in data whose name matches monkaa_root in src/environment.json.
53 | 
54 | c. Create a subdirectory whose name matches monkaa_train in src/environment.json.
55 | 
56 | d. Create a subdirectory whose name matches monkaa_label in src/environment.json.
57 | 
58 | 2. All hyperparameters used for building the model can be found in src/hyperparams.json.
59 | 
60 | ### Reference:
61 | *Kendall, Alex, et al. "End-to-End Learning of Geometry and Context for Deep Stereo Regression." arXiv preprint arXiv:1703.04309 (2017).*
--------------------------------------------------------------------------------
/download.sh:
--------------------------------------------------------------------------------
1 | mkdir -p data/sceneflow/driving &&
2 | mkdir -p data/sceneflow/monkaa &&
3 | cd data/sceneflow/driving &&
4 | wget https://lmb.informatik.uni-freiburg.de/data/SceneFlowDatasets_CVPR16/Release_april16/data/Driving/raw_data/driving__frames_finalpass.tar && tar -xvf driving__frames_finalpass.tar &&
5 | wget https://lmb.informatik.uni-freiburg.de/data/SceneFlowDatasets_CVPR16/Release_april16/data/Driving/derived_data/driving__disparity.tar.bz2 && tar -xvf driving__disparity.tar.bz2 &&
6 | cd .. && cd monkaa &&
7 | wget https://lmb.informatik.uni-freiburg.de/data/SceneFlowDatasets_CVPR16/Release_april16/data/Monkaa/raw_data/monkaa__frames_cleanpass.tar && tar -xvf monkaa__frames_cleanpass.tar &&
8 | wget https://lmb.informatik.uni-freiburg.de/data/SceneFlowDatasets_CVPR16/Release_april16/data/Monkaa/derived_data/monkaa__disparity.tar.bz2 && tar -xvf monkaa__disparity.tar.bz2
9 | cd .. && cd .. && cd ..
10 | -------------------------------------------------------------------------------- /log/events.out.tfevents.1505315272.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505315272.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505316794.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505316794.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317347.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317347.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317478.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317478.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317537.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317537.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317667.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317667.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317740.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317740.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317846.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317846.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505318058.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505318058.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505318207.deeplearning: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505318207.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505318578.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505318578.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505318869.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505318869.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505319774.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505319774.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505320164.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505320164.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505329935.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505329935.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505330193.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505330193.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505358839.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505358839.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505371238.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505371238.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505392293.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505392293.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505407743.deeplearning: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505407743.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505410895.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505410895.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505418187.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505418187.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505421208.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505421208.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505433857.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505433857.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505505945.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505505945.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505506012.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505506012.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505506266.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505506266.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505506302.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505506302.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505579898.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505579898.deeplearning -------------------------------------------------------------------------------- 
/log/events.out.tfevents.1505580836.deeplearning:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505580836.deeplearning
--------------------------------------------------------------------------------
/log/events.out.tfevents.1505667583.deeplearning:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505667583.deeplearning
--------------------------------------------------------------------------------
/log/events.out.tfevents.1505667753.deeplearning:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505667753.deeplearning
--------------------------------------------------------------------------------
/model/model/shared_cost_weight.hdf5:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/model/model/shared_cost_weight.hdf5
--------------------------------------------------------------------------------
/model/shared_cost_weight.hdf5:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/model/shared_cost_weight.hdf5
--------------------------------------------------------------------------------
/src/conv3dTranspose.py:
--------------------------------------------------------------------------------
1 | from keras import layers
2 | from keras.layers import Conv3D
3 | from keras.engine import InputSpec
4 | from keras import backend as K
5 | from keras.utils import conv_utils
6 | import tensorflow as tf
7 | def _preprocess_deconv3d_output_shape(x, shape, data_format):
8 |     if data_format == 'channels_first':
9 |         shape = (shape[0], shape[2], shape[3], shape[4], shape[1])
10 | 
11 |     if shape[0] is None:
12 |         shape = (tf.shape(x)[0], ) + tuple(shape[1:])
13 |         shape = tf.stack(list(shape))
14 |     return shape
15 | 
16 | def _preprocess_padding(padding):
17 |     if padding == 'same':
18 |         padding = 'SAME'
19 |     elif padding == 'valid':
20 |         padding = 'VALID'
21 |     else:
22 |         raise ValueError('Invalid border mode:', padding)
23 |     return padding
24 | 
25 | def conv3d_transpose(x, kernel, output_shape, strides=(1, 1, 1),
26 |                      padding='valid', data_format=None):
27 |     """3D deconvolution (i.e. transposed convolution).
28 | 
29 |     # Arguments
30 |         x: Tensor or variable.
31 |         kernel: kernel tensor.
32 |         output_shape: 1D int tensor for the output shape.
33 |         strides: strides tuple.
34 |         padding: string, `"same"` or `"valid"`.
35 |         data_format: `"channels_last"` or `"channels_first"`.
36 |             Whether to use Theano or TensorFlow data format
37 |             for inputs/kernels/outputs.
38 | 
39 |     # Returns
40 |         A tensor, result of transposed 3D convolution.
41 | 
42 |     # Raises
43 |         ValueError: if `data_format` is neither `channels_last` nor `channels_first`.
44 |     """
45 |     if data_format is None:
46 |         data_format = K.image_data_format()
47 |     if data_format not in {'channels_first', 'channels_last'}:
48 |         raise ValueError('Unknown data_format ' + str(data_format))
49 |     if isinstance(output_shape, (tuple, list)):
50 |         output_shape = tf.stack(output_shape)
51 | 
52 |     x = _preprocess_conv3d_input(x, data_format)
53 |     output_shape = _preprocess_deconv3d_output_shape(x, output_shape, data_format)
54 |     padding = _preprocess_padding(padding)
55 |     strides = (1,) + strides + (1,)
56 | 
57 |     x = tf.nn.conv3d_transpose(x, kernel, output_shape, strides,
58 |                                padding=padding)
59 |     x = _postprocess_conv3d_output(x, data_format)
60 |     return x
61 | 
62 | def _preprocess_conv3d_input(x, data_format):
63 |     if K.dtype(x) == 'float64':
64 |         x = tf.cast(x, 'float32')
65 |     if data_format == 'channels_first':
66 |         x = tf.transpose(x, (0, 2, 3, 4, 1))
67 |     return x
68 | 
69 | def _postprocess_conv3d_output(x, data_format):
70 |     if data_format == 'channels_first':
71 |         x = tf.transpose(x, (0, 4, 1, 2, 3))
72 | 
73 |     if K.floatx() == 'float64':
74 |         x = tf.cast(x, 'float64')
75 |     return x
76 | class Conv3DTranspose(Conv3D):
77 |     """Transposed convolution layer (sometimes called Deconvolution).
78 |     The need for transposed convolutions generally arises
79 |     from the desire to use a transformation going in the opposite direction
80 |     of a normal convolution, i.e., from something that has the shape of the
81 |     output of some convolution to something that has the shape of its input
82 |     while maintaining a connectivity pattern that is compatible with
83 |     said convolution.
84 |     When using this layer as the first layer in a model,
85 |     provide the keyword argument `input_shape`
86 |     (tuple of integers, does not include the sample axis),
87 |     e.g. `input_shape=(128, 128, 128, 3)` for 128x128x128 volumes
88 |     with 3 channels in `data_format="channels_last"`.
89 |     # Arguments
90 |         filters: Integer, the dimensionality of the output space
91 |             (i.e. the number of output filters in the convolution).
92 |         kernel_size: An integer or tuple/list of 3 integers, specifying the
93 |             depth, height and width of the 3D convolution window.
94 |             Can be a single integer to specify the same value for
95 |             all spatial dimensions.
96 |         strides: An integer or tuple/list of 3 integers,
97 |             specifying the strides of the convolution along the depth, height and width.
98 |             Can be a single integer to specify the same value for
99 |             all spatial dimensions.
100 |             Specifying any stride value != 1 is incompatible with specifying
101 |             any `dilation_rate` value != 1.
102 |         padding: one of `"valid"` or `"same"` (case-insensitive).
103 |         data_format: A string,
104 |             one of `channels_last` (default) or `channels_first`.
105 |             The ordering of the dimensions in the inputs.
106 |             `channels_last` corresponds to inputs with shape
107 |             `(batch, depth, height, width, channels)` while `channels_first`
108 |             corresponds to inputs with shape
109 |             `(batch, channels, depth, height, width)`.
110 |             It defaults to the `image_data_format` value found in your
111 |             Keras config file at `~/.keras/keras.json`.
112 |             If you never set it, then it will be "channels_last".
113 |         dilation_rate: an integer or tuple/list of 3 integers, specifying
114 |             the dilation rate to use for dilated convolution.
115 |             Can be a single integer to specify the same value for
116 |             all spatial dimensions.
117 |             Currently, specifying any `dilation_rate` value != 1 is
118 |             incompatible with specifying any stride value != 1.
119 |         activation: Activation function to use
120 |             (see [activations](../activations.md)).
121 |             If you don't specify anything, no activation is applied
122 |             (ie. "linear" activation: `a(x) = x`).
123 |         use_bias: Boolean, whether the layer uses a bias vector.
124 |         kernel_initializer: Initializer for the `kernel` weights matrix
125 |             (see [initializers](../initializers.md)).
126 |         bias_initializer: Initializer for the bias vector
127 |             (see [initializers](../initializers.md)).
128 |         kernel_regularizer: Regularizer function applied to
129 |             the `kernel` weights matrix
130 |             (see [regularizer](../regularizers.md)).
131 |         bias_regularizer: Regularizer function applied to the bias vector
132 |             (see [regularizer](../regularizers.md)).
133 |         activity_regularizer: Regularizer function applied to
134 |             the output of the layer (its "activation").
135 |             (see [regularizer](../regularizers.md)).
136 |         kernel_constraint: Constraint function applied to the kernel matrix
137 |             (see [constraints](../constraints.md)).
138 |         bias_constraint: Constraint function applied to the bias vector
139 |             (see [constraints](../constraints.md)).
140 |     # Input shape
141 |         5D tensor with shape:
142 |         `(batch, channels, depth, rows, cols)` if data_format='channels_first'
143 |         or 5D tensor with shape:
144 |         `(batch, depth, rows, cols, channels)` if data_format='channels_last'.
145 |     # Output shape
146 |         5D tensor with shape:
147 |         `(batch, filters, new_depth, new_rows, new_cols)` if data_format='channels_first'
148 |         or 5D tensor with shape:
149 |         `(batch, new_depth, new_rows, new_cols, filters)` if data_format='channels_last'.
150 |         `depth`, `rows` and `cols` values might have changed due to padding.
151 |     # References
152 |         - [A guide to convolution arithmetic for deep learning](https://arxiv.org/abs/1603.07285v1)
153 |         - [Deconvolutional Networks](http://www.matthewzeiler.com/pubs/cvpr2010/cvpr2010.pdf)
154 |     """
155 | 
156 |     #@interfaces.legacy_deconv3d_support
157 |     def __init__(self, filters,
158 |                  kernel_size,
159 |                  strides=(1, 1, 1),
160 |                  padding='valid',
161 |                  data_format=None,
162 |                  activation=None,
163 |                  use_bias=True,
164 |                  kernel_initializer='glorot_uniform',
165 |                  bias_initializer='zeros',
166 |                  kernel_regularizer=None,
167 |                  bias_regularizer=None,
168 |                  activity_regularizer=None,
169 |                  kernel_constraint=None,
170 |                  bias_constraint=None,
171 |                  **kwargs):
172 |         super(Conv3DTranspose, self).__init__(
173 |             filters,
174 |             kernel_size,
175 |             strides=strides,
176 |             padding=padding,
177 |             data_format=data_format,
178 |             activation=activation,
179 |             use_bias=use_bias,
180 |             kernel_initializer=kernel_initializer,
181 |             bias_initializer=bias_initializer,
182 |             kernel_regularizer=kernel_regularizer,
183 |             bias_regularizer=bias_regularizer,
184 |             activity_regularizer=activity_regularizer,
185 |             kernel_constraint=kernel_constraint,
186 |             bias_constraint=bias_constraint,
187 |             **kwargs)
188 |         self.input_spec = InputSpec(ndim=5)
189 | 
190 |     def build(self, input_shape):
191 |         if len(input_shape) != 5:
192 |             raise ValueError('Inputs should have rank ' +
193 |                              str(5) +
194 |                              '; Received input shape:', str(input_shape))
195 |         if self.data_format == 'channels_first':
196 |             channel_axis = 1
197 |         else:
198 |             channel_axis = -1
199 |         if input_shape[channel_axis] is None:
200 |             raise ValueError('The channel dimension of the inputs '
201 |                              'should be defined. Found `None`.')
202 |         input_dim = input_shape[channel_axis]
203 |         kernel_shape = self.kernel_size + (self.filters, input_dim)
204 | 
205 |         self.kernel = self.add_weight(kernel_shape,
206 |                                       initializer=self.kernel_initializer,
207 |                                       name='kernel',
208 |                                       regularizer=self.kernel_regularizer,
209 |                                       constraint=self.kernel_constraint)
210 |         if self.use_bias:
211 |             self.bias = self.add_weight((self.filters,),
212 |                                         initializer=self.bias_initializer,
213 |                                         name='bias',
214 |                                         regularizer=self.bias_regularizer,
215 |                                         constraint=self.bias_constraint)
216 |         else:
217 |             self.bias = None
218 |         # Set input spec.
219 |         self.input_spec = InputSpec(ndim=5, axes={channel_axis: input_dim})
220 |         self.built = True
221 | 
222 |     def call(self, inputs):
223 |         input_shape = K.shape(inputs)
224 |         batch_size = input_shape[0]
225 |         if self.data_format == 'channels_first':
226 |             d_axis, h_axis, w_axis = 2, 3, 4
227 |         else:
228 |             d_axis, h_axis, w_axis = 1, 2, 3
229 | 
230 |         depth, height, width = input_shape[d_axis], input_shape[h_axis], input_shape[w_axis]
231 |         kernel_d, kernel_h, kernel_w = self.kernel_size
232 |         stride_d, stride_h, stride_w = self.strides
233 | 
234 |         # Infer the dynamic output shape:
235 |         out_depth = conv_utils.deconv_length(depth,
236 |                                              stride_d, kernel_d,
237 |                                              self.padding)
238 |         out_height = conv_utils.deconv_length(height,
239 |                                               stride_h, kernel_h,
240 |                                               self.padding)
241 |         out_width = conv_utils.deconv_length(width,
242 |                                              stride_w, kernel_w,
243 |                                              self.padding)
244 |         if self.data_format == 'channels_first':
245 |             output_shape = (batch_size, self.filters, out_depth, out_height, out_width)
246 |         else:
247 |             output_shape = (batch_size, out_depth, out_height, out_width, self.filters)
248 | 
249 |         outputs = conv3d_transpose(
250 |             inputs,
251 |             self.kernel,
252 |             output_shape,
253 |             self.strides,
254 |             padding=self.padding,
255 |             data_format=self.data_format)
256 | 
257 |         if self.use_bias:
258 |             outputs = K.bias_add(
259 |                 outputs,
260 |                 self.bias,
261 |                 data_format=self.data_format)
262 | 
263 |         if self.activation is not None:
264 |             return self.activation(outputs)
265 |         return outputs
266 | 
267 |     def compute_output_shape(self, input_shape):
268 |         output_shape = list(input_shape)
269 |         if self.data_format == 'channels_first':
270 |             c_axis, d_axis, h_axis, w_axis = 1, 2, 3, 4
271 |         else:
272 |             c_axis, d_axis, h_axis, w_axis = 4, 1, 2, 3
273 | 
274 |         kernel_d, kernel_h, kernel_w = self.kernel_size
275 |         stride_d, stride_h, stride_w = self.strides
276 | 
277 |         output_shape[c_axis] = self.filters
278 |         output_shape[d_axis] = conv_utils.deconv_length(
279 |             output_shape[d_axis], stride_d, kernel_d, self.padding)
280 |         output_shape[h_axis] = conv_utils.deconv_length(
281 |             output_shape[h_axis], stride_h, kernel_h, self.padding)
282 |         output_shape[w_axis] = conv_utils.deconv_length(
283 |             output_shape[w_axis], stride_w, kernel_w, self.padding)
284 |         return tuple(output_shape)
285 | 
286 |     def get_config(self):
287 |         config = super(Conv3DTranspose, self).get_config()
288 |         config.pop('dilation_rate')
289 |         return config
--------------------------------------------------------------------------------
/src/conv3dTranspose.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/conv3dTranspose.pyc
--------------------------------------------------------------------------------
/src/custom_callback.py:
--------------------------------------------------------------------------------
1 | import warnings
from keras.callbacks import Callback 2 | import numpy as np 3 | from keras.models import Model 4 | from keras import backend as K 5 | from keras.layers import Input 6 | class customModelCheckpoint(Callback): 7 | """Save the model after every epoch. 8 | 9 | `filepath` can contain named formatting options, 10 | which will be filled the value of `epoch` and 11 | keys in `logs` (passed in `on_epoch_end`). 12 | 13 | For example: if `filepath` is `weights.{epoch:02d}-{val_loss:.2f}.hdf5`, 14 | then the model checkpoints will be saved with the epoch number and 15 | the validation loss in the filename. 16 | 17 | # Arguments 18 | filepath: string, path to save the model file. 19 | monitor: quantity to monitor. 20 | verbose: verbosity mode, 0 or 1. 21 | save_best_only: if `save_best_only=True`, 22 | the latest best model according to 23 | the quantity monitored will not be overwritten. 24 | mode: one of {auto, min, max}. 25 | If `save_best_only=True`, the decision 26 | to overwrite the current save file is made 27 | based on either the maximization or the 28 | minimization of the monitored quantity. For `val_acc`, 29 | this should be `max`, for `val_loss` this should 30 | be `min`, etc. In `auto` mode, the direction is 31 | automatically inferred from the name of the monitored quantity. 32 | save_weights_only: if True, then only the model's weights will be 33 | saved (`model.save_weights(filepath)`), else the full model 34 | is saved (`model.save(filepath)`). 35 | period: Interval (number of epochs) between checkpoints. 36 | """ 37 | 38 | def __init__(self, cost_weight_filepath, linear_output_weight_filepath, monitor='val_loss', verbose=0, 39 | save_best_only=False, mode='auto', period=1): 40 | super(customModelCheckpoint, self).__init__() 41 | self.monitor = monitor 42 | self.verbose = verbose 43 | self.cost_weight_filepath = cost_weight_filepath 44 | self.linear_output_weight_filepath = linear_output_weight_filepath 45 | self.save_best_only = save_best_only 46 | self.period = period 47 | self.epochs_since_last_save = 0 48 | 49 | if mode not in ['auto', 'min', 'max']: 50 | warnings.warn('ModelCheckpoint mode %s is unknown, ' 51 | 'fallback to auto mode.' % (mode), 52 | RuntimeWarning) 53 | mode = 'auto' 54 | 55 | if mode == 'min': 56 | self.monitor_op = np.less 57 | self.best = np.Inf 58 | elif mode == 'max': 59 | self.monitor_op = np.greater 60 | self.best = -np.Inf 61 | else: 62 | if 'acc' in self.monitor or self.monitor.startswith('fmeasure'): 63 | self.monitor_op = np.greater 64 | self.best = -np.Inf 65 | else: 66 | self.monitor_op = np.less 67 | self.best = np.Inf 68 | def custom_save_weights(self, overwrite): 69 | cost = self.model.layers[-2].output 70 | cost_model = Model(self.model.input, cost) 71 | cost_model.save_weights(self.cost_weight_filepath, overwrite) 72 | if self.linear_output_weight_filepath: 73 | linear_output = self.model.layers[-1] 74 | b, m, h, w = K.int_shape(cost) 75 | linear_input = Input((m, h, w)) 76 | linear_model = Model(linear_input, linear_output(linear_input)) 77 | linear_model.save_weights(self.linear_output_weight_filepath, overwrite) 78 | def on_epoch_end(self, epoch, logs=None): 79 | logs = logs or {} 80 | self.epochs_since_last_save += 1 81 | if self.epochs_since_last_save >= self.period: 82 | self.epochs_since_last_save = 0 83 | #filepath = self.filepath.format(epoch=epoch, **logs) 84 | if self.save_best_only: 85 | current = logs.get(self.monitor) 86 | if current is None: 87 | warnings.warn('Can save best model only with %s available, ' 88 | 'skipping.' 
% (self.monitor), RuntimeWarning) 89 | else: 90 | if self.monitor_op(current, self.best): 91 | if self.verbose > 0: 92 | print('Epoch %05d: %s improved from %0.5f to %0.5f,' 93 | ' saving weight to %s and %s' 94 | % (epoch, self.monitor, self.best, 95 | current, self.cost_weight_filepath, self.linear_output_weight_filepath)) 96 | self.best = current 97 | self.custom_save_weights(True) 98 | else: 99 | if self.verbose > 0: 100 | print('Epoch %05d: %s did not improve' % 101 | (epoch, self.monitor)) 102 | else: 103 | if self.verbose > 0: 104 | print('Epoch %05d: saving model to %s and %s' % (epoch,self.cost_weight_filepath, self.linear_output_weight_filepath)) 105 | self.custom_save_weights(True) 106 | -------------------------------------------------------------------------------- /src/custom_callback.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/custom_callback.pyc -------------------------------------------------------------------------------- /src/data_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import os 4 | import glob 5 | import random 6 | import math 7 | def genDrivingPath(x, y): 8 | l_paths = [] 9 | r_paths = [] 10 | y_paths = [] 11 | focal_lengths = ["15mm_focallength", "35mm_focallength"] 12 | directions = ["scene_backwards", "scene_forwards"] 13 | types = ["fast", "slow"] 14 | sides = ["left", "right"] 15 | for focal_length in focal_lengths: 16 | for direction in directions: 17 | for type in types: 18 | l_paths.append(os.path.join(x, *[focal_length, direction, type, sides[0]])) 19 | r_paths.append(os.path.join(x, *[focal_length, direction, type, sides[1]])) 20 | y_paths.append(os.path.join(y, *[focal_length, direction, type, sides[0]])) 21 | return l_paths, r_paths, y_paths 22 | 23 | def genMonkaaPath(x, y): 24 | l_paths = [] 25 | r_paths = [] 26 | y_paths = [] 27 | scenes = sorted(os.listdir(x)) 28 | sides = ["left", "right"] 29 | for scene in scenes: 30 | l_paths.append(os.path.join(x, *[scene, sides[0]])) 31 | r_paths.append(os.path.join(x, *[scene, sides[1]])) 32 | y_paths.append(os.path.join(y, *[scene, sides[0]])) 33 | return l_paths, r_paths, y_paths 34 | 35 | def extractAllImage(lefts, rights, disps): 36 | left_images = [] 37 | right_images = [] 38 | disp_images = [] 39 | for left_path, right_path, disp_path in zip(lefts, rights, disps): 40 | left_data = sorted(glob.glob(left_path + "/*.png")) 41 | right_data = sorted(glob.glob(right_path + "/*.png")) 42 | disps_data = sorted(glob.glob(disp_path + "/*.pfm")) 43 | left_images = left_images + left_data 44 | right_images = right_images + right_data 45 | disp_images = disp_images + disps_data 46 | return left_images, right_images, disp_images 47 | 48 | 49 | def splitData(l, r, d, val_ratio, fraction = 1): 50 | tmp = zip(l, r, d) 51 | random.shuffle(tmp) 52 | num_samples = len(l) 53 | num_data = int(fraction * num_samples) 54 | tmp = tmp[0:num_data] 55 | val_samples = int(math.ceil(num_data * val_ratio)) 56 | val = tmp[0:val_samples] 57 | train = tmp[val_samples:] 58 | l_val, r_val, d_val = zip(*val) 59 | l_train, r_train, d_train = zip(*train) 60 | return [l_train, r_train, d_train], [l_val, r_val, d_val] 61 | -------------------------------------------------------------------------------- /src/data_utils.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/data_utils.pyc
--------------------------------------------------------------------------------
/src/environment.json:
--------------------------------------------------------------------------------
1 | {
2 |     "sceneflow_root": "data/sceneflow",
3 |     "driving_root": "driving",
4 |     "driving_train": "frames_finalpass",
5 |     "driving_label": "disparity",
6 |     "monkaa_root": "monkaa",
7 |     "monkaa_train": "frames_cleanpass",
8 |     "monkaa_label": "disparity",
9 |     "train_all": 0,
10 |     "train_driving": 1,
11 |     "train_monkaa": 0
12 | }
--------------------------------------------------------------------------------
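train.py composes the keys above into the dataset directories; a short sketch of the same resolution logic (the printed paths are the shipped defaults):

    import json
    import os

    with open('src/environment.json') as f:
        env = json.load(f)
    root = os.path.join(os.getcwd(), env['sceneflow_root'])
    driving = os.path.join(root, env['driving_root'])
    print(os.path.join(driving, env['driving_train']))  # .../data/sceneflow/driving/frames_finalpass
    print(os.path.join(driving, env['driving_label']))  # .../data/sceneflow/driving/disparity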
/src/gcnetwork.py:
--------------------------------------------------------------------------------
1 | from keras.models import Sequential, Model
2 | from keras.layers.convolutional import Conv2D, Conv3D, Conv2DTranspose
3 | from conv3dTranspose import Conv3DTranspose
4 | from keras.layers.normalization import BatchNormalization
5 | from keras.layers import Activation
6 | from keras import backend as K
7 | from keras.layers import Input, Add, add, multiply
8 | from keras.layers.core import Lambda, Permute, Reshape
9 | 
10 | import tensorflow as tf
11 | import numpy as np
12 | 
13 | def _resNetBlock_(filters, ksize, stride, padding, act_func):
14 |     conv1 = Conv2D(filters, ksize, strides = stride, padding = padding)
15 |     bn1 = BatchNormalization(axis = -1)
16 |     act1 = Activation(act_func)
17 |     conv2 = Conv2D(filters, ksize, strides = stride, padding = padding)
18 |     bn2 = BatchNormalization(axis = -1)
19 |     act2 = Activation(act_func)
20 |     add = Add()
21 |     return [conv1, bn1, act1, conv2, bn2, act2, add]
22 | 
23 | def _addConv3D_(input, filters, ksize, stride, padding, bn = True, act_func = 'relu'):
24 |     conv = Conv3D(filters, ksize, strides = stride, padding = padding)(input)
25 |     if bn:
26 |         conv = BatchNormalization(axis = -1)(conv)
27 |     if act_func:
28 |         conv = Activation(act_func)(conv)
29 |     return conv
30 | 
31 | def _convDownSampling_(input, filters, ksize, ds_stride, padding):
32 |     conv = _addConv3D_(input, filters, ksize, ds_stride, padding)
33 |     conv = _addConv3D_(conv, filters, ksize, 1, padding)
34 |     conv = _addConv3D_(conv, filters, ksize, 1, padding)
35 |     return conv
36 | 
37 | def _createDeconv3D_(input, filters, ksize, stride, padding, bn = True, act_func = 'relu'):
38 |     deconv = Conv3DTranspose(filters, ksize, stride, padding)(input)
39 |     if bn:
40 |         deconv = BatchNormalization(axis = -1)(deconv)
41 |     if act_func:
42 |         deconv = Activation(act_func)(deconv)
43 |     return deconv
44 | 
45 | def _highwayBlock_(tensor):
46 |     output, input, trans = tensor
47 |     return add([multiply([output, trans]), multiply([input, 1 - trans])])
48 | 
49 | def _getCostVolume_(inputs, max_d):
50 |     left_tensor, right_tensor = inputs
51 |     shape = K.shape(right_tensor)
52 |     right_tensor = K.spatial_2d_padding(right_tensor, padding=((0, 0), (max_d, 0)))
53 |     disparity_costs = []
54 |     for d in reversed(range(max_d)):
55 |         left_tensor_slice = left_tensor
56 |         right_tensor_slice = tf.slice(right_tensor, begin = [0, 0, d, 0], size = [-1, -1, shape[2], -1])
57 |         cost = K.concatenate([left_tensor_slice, right_tensor_slice], axis = 3)
58 |         disparity_costs.append(cost)
59 |     cost_volume = K.stack(disparity_costs, axis = 1)
60 |     return cost_volume
61 | 
62 | def _computeLinearScore_(cv, d):
63 |     cv = K.permute_dimensions(cv, (0,2,3,1))
64 |     disp_map = K.reshape(K.arange(0, d, dtype = K.floatx()), (1,1,d,1))
65 |     output = K.conv2d(cv, disp_map, strides = (1,1), padding = 'valid')
66 |     return K.squeeze(output, axis = -1)
67 | 
68 | def _computeSoftArgMin_(cv, d):
69 |     softmax = tf.nn.softmax(cv, dim = 1)
70 |     #softmax = K.permute_dimensions(softmax, (0,2,3,1))
71 |     disp_map = K.reshape(K.arange(0, d, dtype = 'float32'), (1,1,d,1))
72 |     output = K.conv2d(softmax, disp_map, strides = (1,1), data_format = 'channels_first', padding = 'valid')
73 |     return K.squeeze(output, axis = 1)
74 | 
75 | def getOutputFunction(output):
76 |     if output == 'linear':
77 |         return _computeLinearScore_
78 |     if output == 'softargmin':
79 |         return _computeSoftArgMin_
80 | 
81 | def _createUniFeature_(input_shape, num_res, filters, first_ksize, ksize, act_func, ds_stride, padding):
82 |     conv1 = Conv2D(filters, first_ksize, strides = ds_stride, padding = padding, input_shape = input_shape)
83 |     bn1 = BatchNormalization(axis = -1)
84 |     act1 = Activation(act_func)
85 |     layers = [conv1, bn1, act1]
86 |     for i in range(num_res):
87 |         layers += _resNetBlock_(filters, ksize, 1, padding, act_func)
88 |     output = Conv2D(filters, ksize, strides = 1, padding = padding)
89 |     layers.append(output)
90 |     return layers
91 | 
92 | def _LearnReg_(input, base_num_filters, ksize, ds_stride, resnet, padding, highway_func, num_down_conv):
93 |     down_convs = list()
94 |     conv = _addConv3D_(input, base_num_filters, ksize, 1, padding)
95 |     conv = _addConv3D_(conv, base_num_filters, ksize, 1, padding)
96 |     down_convs.insert(0, conv)
97 |     if not resnet:
98 |         trans_gates = list()
99 |         gate = _addConv3D_(conv, base_num_filters, ksize, 1, padding)
100 |         trans_gates.insert(0, gate)
101 |     for i in range(num_down_conv):
102 |         if i < num_down_conv - 1:
103 |             mult = 2
104 |         else:
105 |             mult = 4
106 |         conv = _convDownSampling_(conv, mult * base_num_filters, ksize, ds_stride, padding)
107 |         down_convs.insert(0, conv)
108 |         if not resnet:
109 |             gate = _addConv3D_(conv, mult * base_num_filters, ksize, 1, padding)
110 |             trans_gates.insert(0, gate)
111 |     up_convs = down_convs[0]
112 |     for i in range(num_down_conv):
113 |         filters = K.int_shape(down_convs[i+1])[-1]
114 |         deconv = _createDeconv3D_(up_convs, filters, ksize, ds_stride, padding)
115 |         if not resnet:
116 |             up_convs = Lambda(_highwayBlock_)([deconv, down_convs[i+1], trans_gates[i+1]])
117 |         else:
118 |             up_convs = add([deconv, down_convs[i+1]])
119 |     cost = _createDeconv3D_(up_convs, 1, ksize, ds_stride, padding, bn = False, act_func = None)
120 |     cost = Lambda(lambda x: -x)(cost)
121 |     cost = Lambda(K.squeeze, arguments = {'axis': -1})(cost)
122 |     return cost
123 | 
124 | def createFeature(input, layers):
125 |     res = layers[0](input)
126 |     tensor = res
127 |     for layer in layers[1:]:
128 |         if isinstance(layer, Add):
129 |             tensor = layer([tensor, res])
130 |             res = tensor
131 |         else:
132 |             tensor = layer(tensor)
133 |     return tensor
134 | 
135 | def createGCNetwork(hp, tp, pre_weight):
136 |     padding = 'same'
137 |     cost_weight = tp['cost_volume_weight_path']
138 |     linear_weight = tp['linear_output_weight_path']
139 |     d = hp['max_disp']
140 |     resnet = hp['resnet']
141 |     first_ksize = hp['first_kernel_size']
142 |     ksize = hp['kernel_size']
143 |     num_filters = hp['base_num_filters']
144 |     act_func = hp['act_func']
145 |     highway_func = hp['h_act_func']
146 |     num_down_conv = hp['num_down_conv']
147 |     output = hp['output']
148 |     num_res = hp['num_res']
149 |     ds_stride = hp['ds_stride']
150 |     padding = hp['padding']
151 |     shared_weight = tp['shared_weight']
152 |     K.set_image_data_format(hp['data_format'])
153 |     input_shape = (None, None, 3)
154 |     left_img = Input(input_shape, dtype = "float32")
155 |     right_img = Input(input_shape, dtype = "float32")
156 |     layers = _createUniFeature_(input_shape, num_res, num_filters, first_ksize, ksize, act_func, ds_stride, padding)
157 |     l_feature = createFeature(left_img, layers)
158 |     if shared_weight == 1:
159 |         print "Use shared weight for first stage"
160 |         r_feature = createFeature(right_img, layers)
161 |     else:
162 |         print "Use different weights for first stage"
163 |         layers2 = _createUniFeature_(input_shape, num_res, num_filters, first_ksize, ksize, act_func, ds_stride, padding)
164 |         r_feature = createFeature(right_img, layers2)
165 |     unifeatures = [l_feature, r_feature]
166 |     cv = Lambda(_getCostVolume_, arguments = {'max_d':d/2}, output_shape = (d/2, None, None, num_filters * 2))(unifeatures)
167 |     disp_map = _LearnReg_(cv, num_filters, ksize, ds_stride, resnet, padding, highway_func, num_down_conv)
168 |     cost_model = Model([left_img, right_img], disp_map)
169 |     if pre_weight == 1:
170 |         print "Loading pretrained cost weight..."
171 |         cost_model.load_weights(cost_weight)
172 |     out_func = getOutputFunction(output)
173 |     disp_map_input = Input((d, None, None))
174 |     output = Lambda(out_func, arguments = {'d':d})(disp_map_input)
175 |     linear_output_model = Model(disp_map_input, output)
176 |     if hp['output'] == "linear" and pre_weight == 1:
177 |         print "Loading pretrained linear output weight..."
178 |         linear_output_model.load_weights(linear_weight)
179 |     model = Model(cost_model.input, linear_output_model(cost_model.output))
180 |     return model
--------------------------------------------------------------------------------
/src/gcnetwork.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/gcnetwork.pyc
--------------------------------------------------------------------------------
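A minimal sketch of building and compiling the network from the shipped JSON configs, mirroring what train.py does (passing 0 skips loading pretrained weights):

    import sys
    sys.path.append('src')
    from parse_arguments import parseArguments
    from gcnetwork import createGCNetwork
    from losses import lessOneAccuracy, lessThreeAccuracy
    from keras import optimizers

    hp, tp, up, env = parseArguments()
    model = createGCNetwork(hp, tp, 0)  # 0 = do not load pretrained weights
    opt = optimizers.RMSprop(lr=tp['learning_rate'], rho=tp['rho'],
                             epsilon=tp['epsilon'], decay=tp['decay'])
    model.compile(optimizer=opt, loss=tp['loss_function'],
                  metrics=[lessOneAccuracy, lessThreeAccuracy])
    model.summary()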
height = ({},{})".format(ldata, h, w, start_h, start_w) 37 | if train == True: 38 | yield ([left_image, right_image], disp_image) 39 | else: 40 | yield ([left_image, right_image]) 41 | if not train: 42 | break 43 | def _centerImage_(img): 44 | img = img.astype(np.float32) 45 | var = np.var(img, axis = (0,1), keepdims = True) 46 | mean = np.mean(img, axis = (0,1), keepdims = True) 47 | return (img - mean) / np.sqrt(var) 48 | def _normImage_(img, new_max, new_min, old_max, old_min): 49 | img = img.astype(np.float32) 50 | return (img - old_min) * (new_max - new_min) / (old_max - old_min) + new_min 51 | -------------------------------------------------------------------------------- /src/generator.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/generator.pyc -------------------------------------------------------------------------------- /src/hyperparams.json: -------------------------------------------------------------------------------- 1 | { 2 | "max_disp": 192, 3 | "base_num_filters": 32, 4 | "first_kernel_size": 5, 5 | "kernel_size": 3, 6 | "num_res": 8, 7 | "num_down_conv": 4, 8 | "resnet": 1, 9 | "output": "softargmin", 10 | "act_func": "relu", 11 | "h_act_func": "sigmoid", 12 | "ds_stride": 2, 13 | "padding": "same", 14 | "data_format": "channels_last" 15 | } 16 | -------------------------------------------------------------------------------- /src/load_pfm.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import re 4 | def load_pfm(file): 5 | color = None 6 | width = None 7 | height = None 8 | scale = None 9 | endian = None 10 | header = file.readline().rstrip() 11 | if header == 'PF': 12 | color = True 13 | elif header == 'Pf': 14 | color = False 15 | else: 16 | raise Exception('Not a PFM file.') 17 | dim_match = re.match(r'^(\d+)\s(\d+)\s$', file.readline()) 18 | if dim_match: 19 | width, height = map(int, dim_match.groups()) 20 | else: 21 | raise Exception('Malformed PFM header.') 22 | scale = float(file.readline().rstrip()) 23 | if scale < 0: # little-endian 24 | endian = '<' 25 | scale = -scale 26 | else: 27 | endian = '>' # big-endian 28 | data = np.fromfile(file, endian + 'f') 29 | shape = (height, width, 3) if color else (height, width) 30 | data = np.reshape(data, shape) 31 | data = cv2.flip(data, 0) 32 | return data 33 | 34 | -------------------------------------------------------------------------------- /src/load_pfm.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/load_pfm.pyc -------------------------------------------------------------------------------- /src/losses.py: -------------------------------------------------------------------------------- 1 | from keras import backend as K 2 | 3 | def lessOneAccuracy(y_true, y_pred): 4 | shape = K.shape(y_true) 5 | h = K.reshape(shape[1], (1,1)) 6 | w = K.reshape(shape[2], (1,1)) 7 | denom = 1 / K.cast(K.reshape(K.dot(h, w), (1,1)), dtype = 'float32') 8 | return K.dot(K.reshape(K.sum(K.cast(K.less_equal(K.abs(y_true - y_pred), 1), dtype = 'float32')), (1,1)), denom) 9 | 10 | def lessThreeAccuracy(y_true, y_pred): 11 | shape = K.shape(y_true) 12 | h = K.reshape(shape[1], (1,1)) 13 | w = K.reshape(shape[2], (1,1)) 14 | denom = K.dot(h, w) 15 | denom = 1 / K.cast(K.reshape(K.dot(h, 
/src/losses.py:
--------------------------------------------------------------------------------
1 | from keras import backend as K
2 | 
3 | def lessOneAccuracy(y_true, y_pred):
4 |     shape = K.shape(y_true)
5 |     h = K.reshape(shape[1], (1,1))
6 |     w = K.reshape(shape[2], (1,1))
7 |     denom = 1 / K.cast(K.reshape(K.dot(h, w), (1,1)), dtype = 'float32')
8 |     return K.dot(K.reshape(K.sum(K.cast(K.less_equal(K.abs(y_true - y_pred), 1), dtype = 'float32')), (1,1)), denom)
9 | 
10 | def lessThreeAccuracy(y_true, y_pred):
11 |     shape = K.shape(y_true)
12 |     h = K.reshape(shape[1], (1,1))
13 |     w = K.reshape(shape[2], (1,1))
14 |     denom = 1 / K.cast(K.reshape(K.dot(h, w), (1,1)), dtype = 'float32')
15 |     return K.dot(K.reshape(K.sum(K.cast(K.less_equal(K.abs(y_true - y_pred), 3), dtype = 'float32')), (1,1)), denom)
--------------------------------------------------------------------------------
/src/losses.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/losses.pyc
--------------------------------------------------------------------------------
/src/parse_arguments.py:
--------------------------------------------------------------------------------
1 | import json
2 | import sys
3 | def parseArguments():
4 |     with open('src/hyperparams.json') as json_file:
5 |         hp = json.load(json_file)
6 |     with open('src/environment.json') as json_file:
7 |         env = json.load(json_file)
8 |     with open('src/train_params.json') as json_file:
9 |         tp = json.load(json_file)
10 |     #with open('src/test_params.json') as json_file:
11 |     #    pp = json.load(json_file)
12 |     with open('src/util_params.json') as json_file:
13 |         up = json.load(json_file)
14 |     return hp, tp, up, env
--------------------------------------------------------------------------------
/src/parse_arguments.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/parse_arguments.pyc
--------------------------------------------------------------------------------
/src/test_params.json:
--------------------------------------------------------------------------------
1 | {
2 |     "pspath": "prediction",
3 |     "batch_size": 1,
4 |     "w_path": "model_weight.hdf5",
5 |     "max_q_size": 3,
6 |     "verbose": 1
7 | }
--------------------------------------------------------------------------------
/src/train_params.json:
--------------------------------------------------------------------------------
1 | {
2 |     "period": 1,
3 |     "verbose": 1,
4 |     "log_save_path": "log",
5 |     "max_q_size": 1,
6 |     "save_best_only": 0,
7 |     "learning_rate": 0.001,
8 |     "batch_size": 1,
9 |     "epochs": 50,
10 |     "epsilon": 0.00000001,
11 |     "rho": 0.9,
12 |     "decay": 0.0,
13 |     "shared_weight": 1,
14 |     "loss_function": "mean_absolute_error",
15 |     "cost_volume_weight_save_path": "model/shared_cost_weight.hdf5",
16 |     "cost_volume_weight_path": "model/shared_cost_weight.hdf5",
17 |     "linear_output_weight_save_path": "model/shared_linear_output_weight.hdf5",
18 |     "linear_output_weight_path": "model/shared_linear_output_weight.hdf5",
19 |     "pspath": "prediction"
20 | }
--------------------------------------------------------------------------------
/src/util_params.json:
--------------------------------------------------------------------------------
1 | {
2 |     "crop_width": 512,
3 |     "crop_height": 256,
4 |     "val_ratio": 0.2,
5 |     "file_extension": "png",
6 |     "seed": 1234,
7 |     "fraction": 1
8 | }
--------------------------------------------------------------------------------
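The two metrics in src/losses.py above reduce to "fraction of pixels whose absolute disparity error is at most N pixels"; a numpy sketch of the same quantity for a single sample:

    import numpy as np

    def within_n_px(y_true, y_pred, n):
        # cf. lessOneAccuracy (n=1) and lessThreeAccuracy (n=3) in src/losses.py
        return np.mean(np.abs(y_true - y_pred) <= n)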
/test.py:
--------------------------------------------------------------------------------
1 | import sys
2 | sys.path.append('src')
3 | import numpy as np
4 | import argparse
5 | import parse_arguments
6 | import tensorflow as tf
7 | from keras import backend as K
8 | from gcnetwork import *
9 | import glob
10 | import os
11 | import psutil
12 | from generator import *
13 | 
14 | def get_mem_usage():
15 |     process = psutil.Process(os.getpid())
16 |     return process.memory_info()
17 | 
18 | def _predictFromArrays_(model, left, right, bs):
19 |     return model.predict([left, right], bs)
20 | 
21 | def _predictFromGenerator_(model, generator, steps, max_q_size):
22 |     return model.predict_generator(generator, steps, max_q_size)
23 | def Predict():
24 |     hp, tp, up, env = parse_arguments.parseArguments()
25 |     pspath = tp['pspath']
26 |     parser = argparse.ArgumentParser()
27 |     #parser.add_argument('-wpath', help = 'weight path of pretrained model', default = weight)
28 |     parser.add_argument('-pspath', help = 'path for saving prediction result', default = pspath)
29 |     parser.add_argument('-data', help = 'data used for prediction', required = True, default = None)
30 |     parser.add_argument('-bs', type = int, help = 'batch size or steps', default = tp['batch_size'])
31 |     args = parser.parse_args()
32 |     #weight_path = args.wpath
33 |     pred_path = args.pspath
34 |     ext = up['file_extension']
35 |     data_path = args.data
36 |     bs = args.bs
37 |     max_q_size = tp['max_q_size']
38 |     verbose = tp['verbose']
39 |     def get_session(gpu_fraction=0.95):
40 |         gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction,
41 |                                     allow_growth=True)
42 |         return tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
43 |     K.set_session(get_session())
44 |     model = createGCNetwork(hp, tp, True)
45 |     if data_path.endswith('npz'):
46 |         images = np.load(data_path)
47 |         print "Predict data using arrays"
48 |         # assumes the npz archive stores the stereo views under the keys 'left' and 'right'
49 |         pred = _predictFromArrays_(model, images['left'], images['right'], bs)
50 |         np.save(pred_path, pred)
51 |     else:
52 |         q_size = tp['max_q_size']
53 |         left_path = os.path.join(data_path, 'left')
54 |         right_path = os.path.join(data_path, 'right')
55 |         left_images = sorted(glob.glob(left_path + "/*.{}".format(ext)))
56 |         right_images = sorted(glob.glob(right_path + "/*.{}".format(ext)))
57 |         generator = generate_arrays_from_file(left_images, right_images, up)
58 |         print "Predict data using generator..."
59 |         pred = model.predict_generator(generator, max_queue_size = max_q_size, steps = bs, verbose = verbose)
60 |         np.save(pred_path, pred)
61 |     K.clear_session()
62 | if __name__ == "__main__":
63 |     Predict()
--------------------------------------------------------------------------------
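test.py stores raw disparities in a .npy file; a sketch for turning a saved prediction into a viewable 8-bit image (the scaling here is ad hoc):

    import numpy as np
    import cv2

    pred = np.load('prediction.npy')  # shape: (num_samples, height, width)
    disp = pred[0]
    img = (255.0 * disp / max(float(disp.max()), 1e-6)).astype(np.uint8)
    cv2.imwrite('disparity_0.png', img)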
/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import sys
3 | sys.path.append('src')
4 | from parse_arguments import *
5 | import os
6 | from data_utils import *
7 | from custom_callback import customModelCheckpoint
8 | import gcnetwork
9 | from generator import *
10 | from losses import *
11 | from keras.callbacks import ModelCheckpoint, EarlyStopping, TensorBoard
12 | from keras import optimizers
13 | from keras import backend as K
14 | import math
15 | import random
16 | import tensorflow as tf
17 | import numpy as np
18 | def trainSceneFlowData(hp, tp, up, env, callbacks, upw):
19 |     lr = tp['learning_rate']
20 |     epochs = tp['epochs']
21 |     batch_size = tp['batch_size']
22 |     q_size = tp['max_q_size']
23 |     epsilon = tp['epsilon']
24 |     rho = tp['rho']
25 |     decay = tp['decay']
26 |     loss = tp['loss_function']
27 |     sceneflow_root = env['sceneflow_root']
28 |     driving_root = env['driving_root']
29 |     driving_train = env['driving_train']
30 |     driving_label = env['driving_label']
31 |     train_all = env['train_all']
32 |     train_driving = env['train_driving']
33 |     train_monkaa = env['train_monkaa']
34 |     val_ratio = up['val_ratio']
35 |     fraction = up['fraction']
36 |     root = os.path.join(os.getcwd(), sceneflow_root)
37 |     driving = os.path.join(root, driving_root)
38 |     driving_data_path = os.path.join(driving, driving_train)
39 |     driving_label_path = os.path.join(driving, driving_label)
40 |     monkaa_root = env['monkaa_root']
41 |     monkaa = os.path.join(root, monkaa_root)
42 |     monkaa_train = env['monkaa_train']
43 |     monkaa_label = env['monkaa_label']
44 |     monkaa_data_path = os.path.join(monkaa, monkaa_train)
45 |     monkaa_label_path = os.path.join(monkaa, monkaa_label)
46 |     if train_all:
47 |         train_list = [[driving_data_path, driving_label_path, genDrivingPath], [monkaa_data_path, monkaa_label_path, genMonkaaPath]]
48 |     else:
49 |         train_list = []
50 |         if train_driving:
51 |             train_list.append([driving_data_path, driving_label_path, genDrivingPath])
52 |         if train_monkaa:
53 |             train_list.append([monkaa_data_path, monkaa_label_path, genMonkaaPath])
54 |     train_paths = map(lambda x: x[2](x[0], x[1]), train_list)
55 |     agg_train_path = zip(*train_paths)
56 |     left, right, disp = [reduce(lambda x, y: x + y, path) for path in agg_train_path]
57 |     l_imgs, r_imgs, d_imgs = extractAllImage(left, right, disp)
58 |     train, val = splitData(l_imgs, r_imgs, d_imgs, val_ratio, fraction)
59 |     val_generator = generate_arrays_from_file(val[0], val[1], up, val[2])
60 |     train_generator = generate_arrays_from_file(train[0], train[1], up, train[2])
61 |     num_steps = int(math.ceil(len(train[0]) / float(batch_size)))
62 |     val_steps = int(math.ceil(len(val[0]) / float(batch_size)))
63 |     model = gcnetwork.createGCNetwork(hp, tp, upw)
64 |     optimizer = optimizers.RMSprop(lr = lr, rho = rho, epsilon = epsilon, decay = decay)
65 |     model.compile(optimizer = optimizer, loss = loss, metrics = [lessOneAccuracy, lessThreeAccuracy])
66 |     model.fit_generator(train_generator, validation_data = val_generator, validation_steps = val_steps, steps_per_epoch = num_steps, max_q_size = q_size, epochs = epochs, callbacks = callbacks)
67 |     print "Training Complete"
68 |     result = model.predict_generator(train_generator, steps = 1)
np.save("prediction.npy", result) 69 | def genCallBacks(cost_filepath, outputfilepath, log_save_path, save_best_only, period, verbose): 70 | callback_tb = TensorBoard(log_dir = log_save_path, histogram_freq = 0, write_graph = True, write_images = True) 71 | callback_mc = customModelCheckpoint(cost_filepath, outputfilepath, verbose = verbose, save_best_only = save_best_only, period = period) 72 | return [callback_tb, callback_mc] 73 | 74 | if __name__ == '__main__': 75 | #config = tf.ConfigProto() 76 | #config.gpu_options.allow_growth=True 77 | #config.gpu_options.allocator_type ='BFC' 78 | #config.gpu_options.per_process_gpu_memory_fraction = 0.98 79 | #sess = tf.Session(config = config) 80 | #K.set_session(sess) 81 | hp, tp, up, env = parseArguments() 82 | parser = argparse.ArgumentParser() 83 | parser.add_argument('-upw', '--use_pretrained_weight', type = int, help = 'train the model use pretrained weight', default = 1) 84 | args = parser.parse_args() 85 | #weight_save_path = tp['weight_save_path'] 86 | log_save_path = tp['log_save_path'] 87 | save_best_only = tp['save_best_only'] 88 | period = tp['period'] 89 | verbose = tp['verbose'] 90 | cost_weight_path = tp['cost_volume_weight_save_path'] 91 | linear_output_weight_path = tp['linear_output_weight_path'] 92 | if hp['output'] == 'softargmin': 93 | linear_output_weight_path = None 94 | callbacks = genCallBacks(cost_weight_path, linear_output_weight_path, log_save_path, save_best_only, period, verbose) 95 | trainSceneFlowData(hp, tp, up, env, callbacks, args.use_pretrained_weight) 96 | --------------------------------------------------------------------------------