├── README.md
├── download.sh
├── log
│   ├── events.out.tfevents.1505315272.deeplearning
│   ├── events.out.tfevents.1505316794.deeplearning
│   ├── events.out.tfevents.1505317347.deeplearning
│   ├── events.out.tfevents.1505317478.deeplearning
│   ├── events.out.tfevents.1505317537.deeplearning
│   ├── events.out.tfevents.1505317667.deeplearning
│   ├── events.out.tfevents.1505317740.deeplearning
│   ├── events.out.tfevents.1505317846.deeplearning
│   ├── events.out.tfevents.1505318058.deeplearning
│   ├── events.out.tfevents.1505318207.deeplearning
│   ├── events.out.tfevents.1505318578.deeplearning
│   ├── events.out.tfevents.1505318869.deeplearning
│   ├── events.out.tfevents.1505319774.deeplearning
│   ├── events.out.tfevents.1505320164.deeplearning
│   ├── events.out.tfevents.1505329935.deeplearning
│   ├── events.out.tfevents.1505330193.deeplearning
│   ├── events.out.tfevents.1505358839.deeplearning
│   ├── events.out.tfevents.1505371238.deeplearning
│   ├── events.out.tfevents.1505392293.deeplearning
│   ├── events.out.tfevents.1505407743.deeplearning
│   ├── events.out.tfevents.1505410895.deeplearning
│   ├── events.out.tfevents.1505418187.deeplearning
│   ├── events.out.tfevents.1505421208.deeplearning
│   ├── events.out.tfevents.1505433857.deeplearning
│   ├── events.out.tfevents.1505505945.deeplearning
│   ├── events.out.tfevents.1505506012.deeplearning
│   ├── events.out.tfevents.1505506266.deeplearning
│   ├── events.out.tfevents.1505506302.deeplearning
│   ├── events.out.tfevents.1505579898.deeplearning
│   ├── events.out.tfevents.1505580836.deeplearning
│   ├── events.out.tfevents.1505667583.deeplearning
│   └── events.out.tfevents.1505667753.deeplearning
├── model
│   ├── model
│   │   └── shared_cost_weight.hdf5
│   └── shared_cost_weight.hdf5
├── src
│   ├── conv3dTranspose.py
│   ├── conv3dTranspose.pyc
│   ├── custom_callback.py
│   ├── custom_callback.pyc
│   ├── data_utils.py
│   ├── data_utils.pyc
│   ├── environment.json
│   ├── gcnetwork.py
│   ├── gcnetwork.pyc
│   ├── generator.py
│   ├── generator.pyc
│   ├── hyperparams.json
│   ├── load_pfm.py
│   ├── load_pfm.pyc
│   ├── losses.py
│   ├── losses.pyc
│   ├── parse_arguments.py
│   ├── parse_arguments.pyc
│   ├── test_params.json
│   ├── train_params.json
│   └── util_params.json
├── test.py
└── train.py
/README.md:
--------------------------------------------------------------------------------
1 | # Geometry and Context Network (On-Going Project)
2 | A Keras implementation of GC-Net by HungShi Lin (hl2997@columbia.edu). The paper can be found [here](https://arxiv.org/abs/1703.04309).
3 | I made two modifications: adding a linear output function and enabling training of the highway block at the second stage.
4 | 
5 | ### Issue
6 | 1. Model performance does not yet match that reported in the original paper.
7 | 
8 | ### Update (10/06/2017)
9 | The model can now be trained with images of size (256, 512).
10 | 
11 | ### Software Requirement
12 | tensorflow ([install from here](https://www.tensorflow.org/install/)), keras ([install from here](https://keras.io/#installation))
13 | 
14 | ### Data used for training model
15 | I trained the model for 2 epochs on the [driving (finalpass) dataset](https://lmb.informatik.uni-freiburg.de/resources/datasets/SceneFlowDatasets.en.html), which contains more than 4000 stereo image pairs.
16 | 
17 | ### Preprocessing
18 | We crop training patches from the training images and normalize each channel; the crop size is set in src/util_params.json (256x512 as shipped, which differs from the paper). A sketch of this step follows.
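For reference, a minimal sketch of this crop-and-normalize step as implemented in src/generator.py (crop sizes come from src/util_params.json; the helper names below are illustrative, not part of the repo):

    import random
    import numpy as np

    def center_image(img):
        # per-channel standardization, as in _centerImage_ in src/generator.py
        img = img.astype(np.float32)
        mean = np.mean(img, axis=(0, 1), keepdims=True)
        var = np.var(img, axis=(0, 1), keepdims=True)
        return (img - mean) / np.sqrt(var)

    def random_crop(left, right, disp, crop_height=256, crop_width=512):
        # the same offset is applied to both views and the disparity map
        h, w = left.shape[:2]
        y = random.randint(0, h - crop_height)
        x = random.randint(0, w - crop_width)
        sl = (slice(y, y + crop_height), slice(x, x + crop_width))
        return center_image(left[sl]), center_image(right[sl]), disp[sl]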
19 | 
20 | ### Download
21 | Run the following command:
22 | #### 
23 | git clone https://github.com/LinHungShi/GCNetwork.git
24 | 
25 | ### Two ways to download the driving dataset:
26 | 1. Create the subdirectories sceneflow/driving in data, then download and untar driving_finalpass and driving_disparity from [here](https://lmb.informatik.uni-freiburg.de/resources/datasets/SceneFlowDatasets.en.html).
27 | 2. You can also run
28 | #### 
29 | sh download.sh
30 | which will create the subdirectories and download the datasets.
31 | 
32 | ### Train the model
33 | Run the following command:
34 | #### 
35 | python train.py
36 | 
37 | ### Predict data with test.py
38 | 1. Create a directory that contains two subdirectories -- left and right.
39 | 2. Run the following command
40 | #### 
41 | python test.py -data <data directory>
42 | 3. By default the prediction is saved as an npy file named prediction.npy; pass -pspath to change the save path.
43 | 
44 | ### (Optional) Use pretrained weights
45 | 1. Set the weight paths in src/train_params.json.
46 | 2. Run python train.py (pretrained weights are loaded by default; pass -upw 0 to train from scratch).
47 | 
48 | ### Something you might want to do
49 | 1. To enable training with the Monkaa dataset (set train_monkaa to 1 in src/environment.json),
50 | a. Download the Monkaa dataset from the previous link.
51 | 
52 | b. Create a directory in data whose name matches monkaa_root in src/environment.json.
53 | 
54 | c. Create a subdirectory whose name matches monkaa_train in src/environment.json.
55 | 
56 | d. Create a subdirectory whose name matches monkaa_label in src/environment.json.
57 | 
58 | 2. All hyperparameters used for building the model can be found in src/hyperparams.json.
59 | 
60 | ### Reference:
61 | *Kendall, Alex, et al. "End-to-End Learning of Geometry and Context for Deep Stereo Regression." arXiv preprint arXiv:1703.04309 (2017).*
--------------------------------------------------------------------------------
/download.sh:
--------------------------------------------------------------------------------
1 | mkdir -p data/sceneflow/driving &&
2 | mkdir -p data/sceneflow/monkaa &&
3 | cd data/sceneflow/driving &&
4 | wget https://lmb.informatik.uni-freiburg.de/data/SceneFlowDatasets_CVPR16/Release_april16/data/Driving/raw_data/driving__frames_finalpass.tar && tar -xvf driving__frames_finalpass.tar &&
5 | wget https://lmb.informatik.uni-freiburg.de/data/SceneFlowDatasets_CVPR16/Release_april16/data/Driving/derived_data/driving__disparity.tar.bz2 && tar -xvf driving__disparity.tar.bz2 &&
6 | cd .. && cd monkaa &&
7 | wget https://lmb.informatik.uni-freiburg.de/data/SceneFlowDatasets_CVPR16/Release_april16/data/Monkaa/raw_data/monkaa__frames_cleanpass.tar && tar -xvf monkaa__frames_cleanpass.tar &&
8 | wget https://lmb.informatik.uni-freiburg.de/data/SceneFlowDatasets_CVPR16/Release_april16/data/Monkaa/derived_data/monkaa__disparity.tar.bz2 && tar -xvf monkaa__disparity.tar.bz2
9 | cd .. && cd .. && cd ..
10 | -------------------------------------------------------------------------------- /log/events.out.tfevents.1505315272.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505315272.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505316794.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505316794.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317347.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317347.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317478.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317478.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317537.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317537.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317667.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317667.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317740.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317740.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505317846.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505317846.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505318058.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505318058.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505318207.deeplearning: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505318207.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505318578.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505318578.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505318869.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505318869.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505319774.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505319774.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505320164.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505320164.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505329935.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505329935.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505330193.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505330193.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505358839.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505358839.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505371238.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505371238.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505392293.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505392293.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505407743.deeplearning: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505407743.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505410895.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505410895.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505418187.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505418187.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505421208.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505421208.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505433857.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505433857.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505505945.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505505945.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505506012.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505506012.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505506266.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505506266.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505506302.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505506302.deeplearning -------------------------------------------------------------------------------- /log/events.out.tfevents.1505579898.deeplearning: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505579898.deeplearning -------------------------------------------------------------------------------- 
/log/events.out.tfevents.1505580836.deeplearning:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505580836.deeplearning
--------------------------------------------------------------------------------
/log/events.out.tfevents.1505667583.deeplearning:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505667583.deeplearning
--------------------------------------------------------------------------------
/log/events.out.tfevents.1505667753.deeplearning:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/log/events.out.tfevents.1505667753.deeplearning
--------------------------------------------------------------------------------
/model/model/shared_cost_weight.hdf5:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/model/model/shared_cost_weight.hdf5
--------------------------------------------------------------------------------
/model/shared_cost_weight.hdf5:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/model/shared_cost_weight.hdf5
--------------------------------------------------------------------------------
/src/conv3dTranspose.py:
--------------------------------------------------------------------------------
1 | from keras import layers
2 | from keras.layers import Conv3D
3 | from keras.engine import InputSpec
4 | from keras import backend as K
5 | from keras.utils import conv_utils
6 | import tensorflow as tf
7 | def _preprocess_deconv3d_output_shape(x, shape, data_format):
8 |     if data_format == 'channels_first':
9 |         shape = (shape[0], shape[2], shape[3], shape[4], shape[1])
10 | 
11 |     if shape[0] is None:
12 |         shape = (tf.shape(x)[0], ) + tuple(shape[1:])
13 |         shape = tf.stack(list(shape))
14 |     return shape
15 | 
16 | def _preprocess_padding(padding):
17 |     if padding == 'same':
18 |         padding = 'SAME'
19 |     elif padding == 'valid':
20 |         padding = 'VALID'
21 |     else:
22 |         raise ValueError('Invalid border mode:', padding)
23 |     return padding
24 | 
25 | def conv3d_transpose(x, kernel, output_shape, strides=(1, 1, 1),
26 |                      padding='valid', data_format=None):
27 |     """3D deconvolution (i.e. transposed convolution).
28 | 
29 |     # Arguments
30 |         x: Tensor or variable.
31 |         kernel: kernel tensor.
32 |         output_shape: 1D int tensor for the output shape.
33 |         strides: strides tuple.
34 |         padding: string, `"same"` or `"valid"`.
35 |         data_format: `"channels_last"` or `"channels_first"`.
36 |             Whether to use Theano or TensorFlow data format
37 |             for inputs/kernels/outputs.
38 | 
39 |     # Returns
40 |         A tensor, result of transposed 3D convolution.
41 | 
42 |     # Raises
43 |         ValueError: if `data_format` is neither `channels_last` nor `channels_first`.
44 |     """
45 |     if data_format is None:
46 |         data_format = K.image_data_format()
47 |     if data_format not in {'channels_first', 'channels_last'}:
48 |         raise ValueError('Unknown data_format ' + str(data_format))
49 |     if isinstance(output_shape, (tuple, list)):
50 |         output_shape = tf.stack(output_shape)
51 | 
52 |     x = _preprocess_conv3d_input(x, data_format)
53 |     output_shape = _preprocess_deconv3d_output_shape(x, output_shape, data_format)
54 |     padding = _preprocess_padding(padding)
55 |     strides = (1,) + strides + (1,)
56 | 
57 |     x = tf.nn.conv3d_transpose(x, kernel, output_shape, strides,
58 |                                padding=padding)
59 |     x = _postprocess_conv3d_output(x, data_format)
60 |     return x
61 | 
62 | def _preprocess_conv3d_input(x, data_format):
63 |     if K.dtype(x) == 'float64':
64 |         x = tf.cast(x, 'float32')
65 |     if data_format == 'channels_first':
66 |         x = tf.transpose(x, (0, 2, 3, 4, 1))
67 |     return x
68 | 
69 | def _postprocess_conv3d_output(x, data_format):
70 |     if data_format == 'channels_first':
71 |         x = tf.transpose(x, (0, 4, 1, 2, 3))
72 | 
73 |     if K.floatx() == 'float64':
74 |         x = tf.cast(x, 'float64')
75 |     return x
76 | class Conv3DTranspose(Conv3D):
77 |     """Transposed convolution layer (sometimes called Deconvolution).
78 |     The need for transposed convolutions generally arises
79 |     from the desire to use a transformation going in the opposite direction
80 |     of a normal convolution, i.e., from something that has the shape of the
81 |     output of some convolution to something that has the shape of its input
82 |     while maintaining a connectivity pattern that is compatible with
83 |     said convolution.
84 |     When using this layer as the first layer in a model,
85 |     provide the keyword argument `input_shape`
86 |     (tuple of integers, does not include the sample axis),
87 |     e.g. `input_shape=(128, 128, 128, 3)` for 128x128x128 volumes
88 |     with 3 channels in `data_format="channels_last"`.
89 |     # Arguments
90 |         filters: Integer, the dimensionality of the output space
91 |             (i.e. the number of output filters in the convolution).
92 |         kernel_size: An integer or tuple/list of 3 integers, specifying the
93 |             depth, height and width of the 3D convolution window.
94 |             Can be a single integer to specify the same value for
95 |             all spatial dimensions.
96 |         strides: An integer or tuple/list of 3 integers,
97 |             specifying the strides of the convolution along the depth, height and width.
98 |             Can be a single integer to specify the same value for
99 |             all spatial dimensions.
100 |             Specifying any stride value != 1 is incompatible with specifying
101 |             any `dilation_rate` value != 1.
102 |         padding: one of `"valid"` or `"same"` (case-insensitive).
103 |         data_format: A string,
104 |             one of `channels_last` (default) or `channels_first`.
105 |             The ordering of the dimensions in the inputs.
106 |             `channels_last` corresponds to inputs with shape
107 |             `(batch, depth, height, width, channels)` while `channels_first`
108 |             corresponds to inputs with shape
109 |             `(batch, channels, depth, height, width)`.
110 |             It defaults to the `image_data_format` value found in your
111 |             Keras config file at `~/.keras/keras.json`.
112 |             If you never set it, then it will be "channels_last".
113 |         dilation_rate: an integer or tuple/list of 3 integers, specifying
114 |             the dilation rate to use for dilated convolution.
115 |             Can be a single integer to specify the same value for
116 |             all spatial dimensions.
117 |             Currently, specifying any `dilation_rate` value != 1 is
118 |             incompatible with specifying any stride value != 1.
119 |         activation: Activation function to use
120 |             (see [activations](../activations.md)).
121 |             If you don't specify anything, no activation is applied
122 |             (ie. "linear" activation: `a(x) = x`).
123 |         use_bias: Boolean, whether the layer uses a bias vector.
124 |         kernel_initializer: Initializer for the `kernel` weights matrix
125 |             (see [initializers](../initializers.md)).
126 |         bias_initializer: Initializer for the bias vector
127 |             (see [initializers](../initializers.md)).
128 |         kernel_regularizer: Regularizer function applied to
129 |             the `kernel` weights matrix
130 |             (see [regularizer](../regularizers.md)).
131 |         bias_regularizer: Regularizer function applied to the bias vector
132 |             (see [regularizer](../regularizers.md)).
133 |         activity_regularizer: Regularizer function applied to
134 |             the output of the layer (its "activation").
135 |             (see [regularizer](../regularizers.md)).
136 |         kernel_constraint: Constraint function applied to the kernel matrix
137 |             (see [constraints](../constraints.md)).
138 |         bias_constraint: Constraint function applied to the bias vector
139 |             (see [constraints](../constraints.md)).
140 |     # Input shape
141 |         5D tensor with shape:
142 |         `(batch, channels, depth, rows, cols)` if data_format='channels_first'
143 |         or 5D tensor with shape:
144 |         `(batch, depth, rows, cols, channels)` if data_format='channels_last'.
145 |     # Output shape
146 |         5D tensor with shape:
147 |         `(batch, filters, new_depth, new_rows, new_cols)` if data_format='channels_first'
148 |         or 5D tensor with shape:
149 |         `(batch, new_depth, new_rows, new_cols, filters)` if data_format='channels_last'.
150 |         `depth`, `rows` and `cols` values might have changed due to padding.
151 |     # References
152 |         - [A guide to convolution arithmetic for deep learning](https://arxiv.org/abs/1603.07285v1)
153 |         - [Deconvolutional Networks](http://www.matthewzeiler.com/pubs/cvpr2010/cvpr2010.pdf)
154 |     """
155 | 
156 |     #@interfaces.legacy_deconv3d_support
157 |     def __init__(self, filters,
158 |                  kernel_size,
159 |                  strides=(1, 1, 1),
160 |                  padding='valid',
161 |                  data_format=None,
162 |                  activation=None,
163 |                  use_bias=True,
164 |                  kernel_initializer='glorot_uniform',
165 |                  bias_initializer='zeros',
166 |                  kernel_regularizer=None,
167 |                  bias_regularizer=None,
168 |                  activity_regularizer=None,
169 |                  kernel_constraint=None,
170 |                  bias_constraint=None,
171 |                  **kwargs):
172 |         super(Conv3DTranspose, self).__init__(
173 |             filters,
174 |             kernel_size,
175 |             strides=strides,
176 |             padding=padding,
177 |             data_format=data_format,
178 |             activation=activation,
179 |             use_bias=use_bias,
180 |             kernel_initializer=kernel_initializer,
181 |             bias_initializer=bias_initializer,
182 |             kernel_regularizer=kernel_regularizer,
183 |             bias_regularizer=bias_regularizer,
184 |             activity_regularizer=activity_regularizer,
185 |             kernel_constraint=kernel_constraint,
186 |             bias_constraint=bias_constraint,
187 |             **kwargs)
188 |         self.input_spec = InputSpec(ndim=5)
189 | 
190 |     def build(self, input_shape):
191 |         if len(input_shape) != 5:
192 |             raise ValueError('Inputs should have rank ' +
193 |                              str(5) +
194 |                              '; Received input shape:', str(input_shape))
195 |         if self.data_format == 'channels_first':
196 |             channel_axis = 1
197 |         else:
198 |             channel_axis = -1
199 |         if input_shape[channel_axis] is None:
200 |             raise ValueError('The channel dimension of the inputs '
201 |                              'should be defined. Found `None`.')
202 |         input_dim = input_shape[channel_axis]
203 |         kernel_shape = self.kernel_size + (self.filters, input_dim)
204 | 
205 |         self.kernel = self.add_weight(kernel_shape,
206 |                                       initializer=self.kernel_initializer,
207 |                                       name='kernel',
208 |                                       regularizer=self.kernel_regularizer,
209 |                                       constraint=self.kernel_constraint)
210 |         if self.use_bias:
211 |             self.bias = self.add_weight((self.filters,),
212 |                                         initializer=self.bias_initializer,
213 |                                         name='bias',
214 |                                         regularizer=self.bias_regularizer,
215 |                                         constraint=self.bias_constraint)
216 |         else:
217 |             self.bias = None
218 |         # Set input spec.
219 |         self.input_spec = InputSpec(ndim=5, axes={channel_axis: input_dim})
220 |         self.built = True
221 | 
222 |     def call(self, inputs):
223 |         input_shape = K.shape(inputs)
224 |         batch_size = input_shape[0]
225 |         if self.data_format == 'channels_first':
226 |             d_axis, h_axis, w_axis = 2, 3, 4
227 |         else:
228 |             d_axis, h_axis, w_axis = 1, 2, 3
229 | 
230 |         depth, height, width = input_shape[d_axis], input_shape[h_axis], input_shape[w_axis]
231 |         kernel_d, kernel_h, kernel_w = self.kernel_size
232 |         stride_d, stride_h, stride_w = self.strides
233 | 
234 |         # Infer the dynamic output shape:
235 |         out_depth = conv_utils.deconv_length(depth,
236 |                                              stride_d, kernel_d,
237 |                                              self.padding)
238 |         out_height = conv_utils.deconv_length(height,
239 |                                               stride_h, kernel_h,
240 |                                               self.padding)
241 |         out_width = conv_utils.deconv_length(width,
242 |                                              stride_w, kernel_w,
243 |                                              self.padding)
244 |         if self.data_format == 'channels_first':
245 |             output_shape = (batch_size, self.filters, out_depth, out_height, out_width)
246 |         else:
247 |             output_shape = (batch_size, out_depth, out_height, out_width, self.filters)
248 | 
249 |         outputs = conv3d_transpose(
250 |             inputs,
251 |             self.kernel,
252 |             output_shape,
253 |             self.strides,
254 |             padding=self.padding,
255 |             data_format=self.data_format)
256 | 
257 |         if self.use_bias:
258 |             outputs = K.bias_add(
259 |                 outputs,
260 |                 self.bias,
261 |                 data_format=self.data_format)
262 | 
263 |         if self.activation is not None:
264 |             return self.activation(outputs)
265 |         return outputs
266 | 
267 |     def compute_output_shape(self, input_shape):
268 |         output_shape = list(input_shape)
269 |         if self.data_format == 'channels_first':
270 |             c_axis, d_axis, h_axis, w_axis = 1, 2, 3, 4
271 |         else:
272 |             c_axis, d_axis, h_axis, w_axis = 4, 1, 2, 3
273 | 
274 |         kernel_d, kernel_h, kernel_w = self.kernel_size
275 |         stride_d, stride_h, stride_w = self.strides
276 | 
277 |         output_shape[c_axis] = self.filters
278 |         output_shape[d_axis] = conv_utils.deconv_length(
279 |             output_shape[d_axis], stride_d, kernel_d, self.padding)
280 |         output_shape[h_axis] = conv_utils.deconv_length(
281 |             output_shape[h_axis], stride_h, kernel_h, self.padding)
282 |         output_shape[w_axis] = conv_utils.deconv_length(
283 |             output_shape[w_axis], stride_w, kernel_w, self.padding)
284 |         return tuple(output_shape)
285 | 
286 |     def get_config(self):
287 |         config = super(Conv3DTranspose, self).get_config()
288 |         config.pop('dilation_rate')
289 |         return config
--------------------------------------------------------------------------------
/src/conv3dTranspose.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/conv3dTranspose.pyc
--------------------------------------------------------------------------------
/src/custom_callback.py:
--------------------------------------------------------------------------------
1 | import warnings
from keras.callbacks import Callback 2 | import numpy as np 3 | from keras.models import Model 4 | from keras import backend as K 5 | from keras.layers import Input 6 | class customModelCheckpoint(Callback): 7 | """Save the model after every epoch. 8 | 9 | `filepath` can contain named formatting options, 10 | which will be filled the value of `epoch` and 11 | keys in `logs` (passed in `on_epoch_end`). 12 | 13 | For example: if `filepath` is `weights.{epoch:02d}-{val_loss:.2f}.hdf5`, 14 | then the model checkpoints will be saved with the epoch number and 15 | the validation loss in the filename. 16 | 17 | # Arguments 18 | filepath: string, path to save the model file. 19 | monitor: quantity to monitor. 20 | verbose: verbosity mode, 0 or 1. 21 | save_best_only: if `save_best_only=True`, 22 | the latest best model according to 23 | the quantity monitored will not be overwritten. 24 | mode: one of {auto, min, max}. 25 | If `save_best_only=True`, the decision 26 | to overwrite the current save file is made 27 | based on either the maximization or the 28 | minimization of the monitored quantity. For `val_acc`, 29 | this should be `max`, for `val_loss` this should 30 | be `min`, etc. In `auto` mode, the direction is 31 | automatically inferred from the name of the monitored quantity. 32 | save_weights_only: if True, then only the model's weights will be 33 | saved (`model.save_weights(filepath)`), else the full model 34 | is saved (`model.save(filepath)`). 35 | period: Interval (number of epochs) between checkpoints. 36 | """ 37 | 38 | def __init__(self, cost_weight_filepath, linear_output_weight_filepath, monitor='val_loss', verbose=0, 39 | save_best_only=False, mode='auto', period=1): 40 | super(customModelCheckpoint, self).__init__() 41 | self.monitor = monitor 42 | self.verbose = verbose 43 | self.cost_weight_filepath = cost_weight_filepath 44 | self.linear_output_weight_filepath = linear_output_weight_filepath 45 | self.save_best_only = save_best_only 46 | self.period = period 47 | self.epochs_since_last_save = 0 48 | 49 | if mode not in ['auto', 'min', 'max']: 50 | warnings.warn('ModelCheckpoint mode %s is unknown, ' 51 | 'fallback to auto mode.' % (mode), 52 | RuntimeWarning) 53 | mode = 'auto' 54 | 55 | if mode == 'min': 56 | self.monitor_op = np.less 57 | self.best = np.Inf 58 | elif mode == 'max': 59 | self.monitor_op = np.greater 60 | self.best = -np.Inf 61 | else: 62 | if 'acc' in self.monitor or self.monitor.startswith('fmeasure'): 63 | self.monitor_op = np.greater 64 | self.best = -np.Inf 65 | else: 66 | self.monitor_op = np.less 67 | self.best = np.Inf 68 | def custom_save_weights(self, overwrite): 69 | cost = self.model.layers[-2].output 70 | cost_model = Model(self.model.input, cost) 71 | cost_model.save_weights(self.cost_weight_filepath, overwrite) 72 | if self.linear_output_weight_filepath: 73 | linear_output = self.model.layers[-1] 74 | b, m, h, w = K.int_shape(cost) 75 | linear_input = Input((m, h, w)) 76 | linear_model = Model(linear_input, linear_output(linear_input)) 77 | linear_model.save_weights(self.linear_output_weight_filepath, overwrite) 78 | def on_epoch_end(self, epoch, logs=None): 79 | logs = logs or {} 80 | self.epochs_since_last_save += 1 81 | if self.epochs_since_last_save >= self.period: 82 | self.epochs_since_last_save = 0 83 | #filepath = self.filepath.format(epoch=epoch, **logs) 84 | if self.save_best_only: 85 | current = logs.get(self.monitor) 86 | if current is None: 87 | warnings.warn('Can save best model only with %s available, ' 88 | 'skipping.' 
% (self.monitor), RuntimeWarning) 89 | else: 90 | if self.monitor_op(current, self.best): 91 | if self.verbose > 0: 92 | print('Epoch %05d: %s improved from %0.5f to %0.5f,' 93 | ' saving weight to %s and %s' 94 | % (epoch, self.monitor, self.best, 95 | current, self.cost_weight_filepath, self.linear_output_weight_filepath)) 96 | self.best = current 97 | self.custom_save_weights(True) 98 | else: 99 | if self.verbose > 0: 100 | print('Epoch %05d: %s did not improve' % 101 | (epoch, self.monitor)) 102 | else: 103 | if self.verbose > 0: 104 | print('Epoch %05d: saving model to %s and %s' % (epoch,self.cost_weight_filepath, self.linear_output_weight_filepath)) 105 | self.custom_save_weights(True) 106 | -------------------------------------------------------------------------------- /src/custom_callback.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/custom_callback.pyc -------------------------------------------------------------------------------- /src/data_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import os 4 | import glob 5 | import random 6 | import math 7 | def genDrivingPath(x, y): 8 | l_paths = [] 9 | r_paths = [] 10 | y_paths = [] 11 | focal_lengths = ["15mm_focallength", "35mm_focallength"] 12 | directions = ["scene_backwards", "scene_forwards"] 13 | types = ["fast", "slow"] 14 | sides = ["left", "right"] 15 | for focal_length in focal_lengths: 16 | for direction in directions: 17 | for type in types: 18 | l_paths.append(os.path.join(x, *[focal_length, direction, type, sides[0]])) 19 | r_paths.append(os.path.join(x, *[focal_length, direction, type, sides[1]])) 20 | y_paths.append(os.path.join(y, *[focal_length, direction, type, sides[0]])) 21 | return l_paths, r_paths, y_paths 22 | 23 | def genMonkaaPath(x, y): 24 | l_paths = [] 25 | r_paths = [] 26 | y_paths = [] 27 | scenes = sorted(os.listdir(x)) 28 | sides = ["left", "right"] 29 | for scene in scenes: 30 | l_paths.append(os.path.join(x, *[scene, sides[0]])) 31 | r_paths.append(os.path.join(x, *[scene, sides[1]])) 32 | y_paths.append(os.path.join(y, *[scene, sides[0]])) 33 | return l_paths, r_paths, y_paths 34 | 35 | def extractAllImage(lefts, rights, disps): 36 | left_images = [] 37 | right_images = [] 38 | disp_images = [] 39 | for left_path, right_path, disp_path in zip(lefts, rights, disps): 40 | left_data = sorted(glob.glob(left_path + "/*.png")) 41 | right_data = sorted(glob.glob(right_path + "/*.png")) 42 | disps_data = sorted(glob.glob(disp_path + "/*.pfm")) 43 | left_images = left_images + left_data 44 | right_images = right_images + right_data 45 | disp_images = disp_images + disps_data 46 | return left_images, right_images, disp_images 47 | 48 | 49 | def splitData(l, r, d, val_ratio, fraction = 1): 50 | tmp = zip(l, r, d) 51 | random.shuffle(tmp) 52 | num_samples = len(l) 53 | num_data = int(fraction * num_samples) 54 | tmp = tmp[0:num_data] 55 | val_samples = int(math.ceil(num_data * val_ratio)) 56 | val = tmp[0:val_samples] 57 | train = tmp[val_samples:] 58 | l_val, r_val, d_val = zip(*val) 59 | l_train, r_train, d_train = zip(*train) 60 | return [l_train, r_train, d_train], [l_val, r_val, d_val] 61 | -------------------------------------------------------------------------------- /src/data_utils.pyc: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/data_utils.pyc
--------------------------------------------------------------------------------
/src/environment.json:
--------------------------------------------------------------------------------
1 | {
2 |     "sceneflow_root": "data/sceneflow",
3 |     "driving_root": "driving",
4 |     "driving_train": "frames_finalpass",
5 |     "driving_label": "disparity",
6 |     "monkaa_root": "monkaa",
7 |     "monkaa_train": "frames_cleanpass",
8 |     "monkaa_label": "disparity",
9 |     "train_all": 0,
10 |     "train_driving": 1,
11 |     "train_monkaa": 0
12 | }
--------------------------------------------------------------------------------
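train.py composes the keys above into the dataset directories; a short sketch of the same resolution logic (the printed paths are the shipped defaults):

    import json
    import os

    with open('src/environment.json') as f:
        env = json.load(f)
    root = os.path.join(os.getcwd(), env['sceneflow_root'])
    driving = os.path.join(root, env['driving_root'])
    print(os.path.join(driving, env['driving_train']))  # .../data/sceneflow/driving/frames_finalpass
    print(os.path.join(driving, env['driving_label']))  # .../data/sceneflow/driving/disparity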
/src/gcnetwork.py:
--------------------------------------------------------------------------------
1 | from keras.models import Sequential, Model
2 | from keras.layers.convolutional import Conv2D, Conv3D, Conv2DTranspose
3 | from conv3dTranspose import Conv3DTranspose
4 | from keras.layers.normalization import BatchNormalization
5 | from keras.layers import Activation
6 | from keras import backend as K
7 | from keras.layers import Input, Add, add, multiply
8 | from keras.layers.core import Lambda, Permute, Reshape
9 | 
10 | import tensorflow as tf
11 | import numpy as np
12 | 
13 | def _resNetBlock_(filters, ksize, stride, padding, act_func):
14 |     conv1 = Conv2D(filters, ksize, strides = stride, padding = padding)
15 |     bn1 = BatchNormalization(axis = -1)
16 |     act1 = Activation(act_func)
17 |     conv2 = Conv2D(filters, ksize, strides = stride, padding = padding)
18 |     bn2 = BatchNormalization(axis = -1)
19 |     act2 = Activation(act_func)
20 |     add = Add()
21 |     return [conv1, bn1, act1, conv2, bn2, act2, add]
22 | 
23 | def _addConv3D_(input, filters, ksize, stride, padding, bn = True, act_func = 'relu'):
24 |     conv = Conv3D(filters, ksize, strides = stride, padding = padding)(input)
25 |     if bn:
26 |         conv = BatchNormalization(axis = -1)(conv)
27 |     if act_func:
28 |         conv = Activation(act_func)(conv)
29 |     return conv
30 | 
31 | def _convDownSampling_(input, filters, ksize, ds_stride, padding):
32 |     conv = _addConv3D_(input, filters, ksize, ds_stride, padding)
33 |     conv = _addConv3D_(conv, filters, ksize, 1, padding)
34 |     conv = _addConv3D_(conv, filters, ksize, 1, padding)
35 |     return conv
36 | 
37 | def _createDeconv3D_(input, filters, ksize, stride, padding, bn = True, act_func = 'relu'):
38 |     deconv = Conv3DTranspose(filters, ksize, stride, padding)(input)
39 |     if bn:
40 |         deconv = BatchNormalization(axis = -1)(deconv)
41 |     if act_func:
42 |         deconv = Activation(act_func)(deconv)
43 |     return deconv
44 | 
45 | def _highwayBlock_(tensor):
46 |     output, input, trans = tensor
47 |     return add([multiply([output, trans]), multiply([input, 1 - trans])])
48 | 
49 | def _getCostVolume_(inputs, max_d):
50 |     left_tensor, right_tensor = inputs
51 |     shape = K.shape(right_tensor)
52 |     right_tensor = K.spatial_2d_padding(right_tensor, padding=((0, 0), (max_d, 0)))
53 |     disparity_costs = []
54 |     for d in reversed(range(max_d)):
55 |         left_tensor_slice = left_tensor
56 |         right_tensor_slice = tf.slice(right_tensor, begin = [0, 0, d, 0], size = [-1, -1, shape[2], -1])
57 |         cost = K.concatenate([left_tensor_slice, right_tensor_slice], axis = 3)
58 |         disparity_costs.append(cost)
59 |     cost_volume = K.stack(disparity_costs, axis = 1)
60 |     return cost_volume
61 | 
62 | def _computeLinearScore_(cv, d):
63 |     cv = K.permute_dimensions(cv, (0,2,3,1))
64 |     disp_map = K.reshape(K.arange(0, d, dtype = K.floatx()), (1,1,d,1))
65 |     output = K.conv2d(cv, disp_map, strides = (1,1), padding = 'valid')
66 |     return K.squeeze(output, axis = -1)
67 | 
68 | def _computeSoftArgMin_(cv, d):
69 |     softmax = tf.nn.softmax(cv, dim = 1)
70 |     #softmax = K.permute_dimensions(softmax, (0,2,3,1))
71 |     disp_map = K.reshape(K.arange(0, d, dtype = 'float32'), (1,1,d,1))
72 |     output = K.conv2d(softmax, disp_map, strides = (1,1), data_format = 'channels_first', padding = 'valid')
73 |     return K.squeeze(output, axis = 1)
74 | 
75 | def getOutputFunction(output):
76 |     if output == 'linear':
77 |         return _computeLinearScore_
78 |     if output == 'softargmin':
79 |         return _computeSoftArgMin_
80 | 
81 | def _createUniFeature_(input_shape, num_res, filters, first_ksize, ksize, act_func, ds_stride, padding):
82 |     conv1 = Conv2D(filters, first_ksize, strides = ds_stride, padding = padding, input_shape = input_shape)
83 |     bn1 = BatchNormalization(axis = -1)
84 |     act1 = Activation(act_func)
85 |     layers = [conv1, bn1, act1]
86 |     for i in range(num_res):
87 |         layers += _resNetBlock_(filters, ksize, 1, padding, act_func)
88 |     output = Conv2D(filters, ksize, strides = 1, padding = padding)
89 |     layers.append(output)
90 |     return layers
91 | 
92 | def _LearnReg_(input, base_num_filters, ksize, ds_stride, resnet, padding, highway_func, num_down_conv):
93 |     down_convs = list()
94 |     conv = _addConv3D_(input, base_num_filters, ksize, 1, padding)
95 |     conv = _addConv3D_(conv, base_num_filters, ksize, 1, padding)
96 |     down_convs.insert(0, conv)
97 |     if not resnet:
98 |         trans_gates = list()
99 |         gate = _addConv3D_(conv, base_num_filters, ksize, 1, padding)
100 |         trans_gates.insert(0, gate)
101 |     for i in range(num_down_conv):
102 |         if i < num_down_conv - 1:
103 |             mult = 2
104 |         else:
105 |             mult = 4
106 |         conv = _convDownSampling_(conv, mult * base_num_filters, ksize, ds_stride, padding)
107 |         down_convs.insert(0, conv)
108 |         if not resnet:
109 |             gate = _addConv3D_(conv, mult * base_num_filters, ksize, 1, padding)
110 |             trans_gates.insert(0, gate)
111 |     up_convs = down_convs[0]
112 |     for i in range(num_down_conv):
113 |         filters = K.int_shape(down_convs[i+1])[-1]
114 |         deconv = _createDeconv3D_(up_convs, filters, ksize, ds_stride, padding)
115 |         if not resnet:
116 |             up_convs = Lambda(_highwayBlock_)([deconv, down_convs[i+1], trans_gates[i+1]])
117 |         else:
118 |             up_convs = add([deconv, down_convs[i+1]])
119 |     cost = _createDeconv3D_(up_convs, 1, ksize, ds_stride, padding, bn = False, act_func = None)
120 |     cost = Lambda(lambda x: -x)(cost)
121 |     cost = Lambda(K.squeeze, arguments = {'axis': -1})(cost)
122 |     return cost
123 | 
124 | def createFeature(input, layers):
125 |     res = layers[0](input)
126 |     tensor = res
127 |     for layer in layers[1:]:
128 |         if isinstance(layer, Add):
129 |             tensor = layer([tensor, res])
130 |             res = tensor
131 |         else:
132 |             tensor = layer(tensor)
133 |     return tensor
134 | 
135 | def createGCNetwork(hp, tp, pre_weight):
136 |     padding = 'same'
137 |     cost_weight = tp['cost_volume_weight_path']
138 |     linear_weight = tp['linear_output_weight_path']
139 |     d = hp['max_disp']
140 |     resnet = hp['resnet']
141 |     first_ksize = hp['first_kernel_size']
142 |     ksize = hp['kernel_size']
143 |     num_filters = hp['base_num_filters']
144 |     act_func = hp['act_func']
145 |     highway_func = hp['h_act_func']
146 |     num_down_conv = hp['num_down_conv']
147 |     output = hp['output']
148 |     num_res = hp['num_res']
149 |     ds_stride = hp['ds_stride']
150 |     padding = hp['padding']
151 |     shared_weight = tp['shared_weight']
152 |     K.set_image_data_format(hp['data_format'])
153 |     input_shape = (None, None, 3)
154 |     left_img = Input(input_shape, dtype = "float32")
155 |     right_img = Input(input_shape, dtype = "float32")
156 |     layers = _createUniFeature_(input_shape, num_res, num_filters, first_ksize, ksize, act_func, ds_stride, padding)
157 |     l_feature = createFeature(left_img, layers)
158 |     if shared_weight == 1:
159 |         print "Use shared weight for first stage"
160 |         r_feature = createFeature(right_img, layers)
161 |     else:
162 |         print "Use different weights for first stage"
163 |         layers2 = _createUniFeature_(input_shape, num_res, num_filters, first_ksize, ksize, act_func, ds_stride, padding)
164 |         r_feature = createFeature(right_img, layers2)
165 |     unifeatures = [l_feature, r_feature]
166 |     cv = Lambda(_getCostVolume_, arguments = {'max_d':d/2}, output_shape = (d/2, None, None, num_filters * 2))(unifeatures)
167 |     disp_map = _LearnReg_(cv, num_filters, ksize, ds_stride, resnet, padding, highway_func, num_down_conv)
168 |     cost_model = Model([left_img, right_img], disp_map)
169 |     if pre_weight == 1:
170 |         print "Loading pretrained cost weight..."
171 |         cost_model.load_weights(cost_weight)
172 |     out_func = getOutputFunction(output)
173 |     disp_map_input = Input((d, None, None))
174 |     output = Lambda(out_func, arguments = {'d':d})(disp_map_input)
175 |     linear_output_model = Model(disp_map_input, output)
176 |     if hp['output'] == "linear" and pre_weight == 1:
177 |         print "Loading pretrained linear output weight..."
178 |         linear_output_model.load_weights(linear_weight)
179 |     model = Model(cost_model.input, linear_output_model(cost_model.output))
180 |     return model
--------------------------------------------------------------------------------
/src/gcnetwork.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/gcnetwork.pyc
--------------------------------------------------------------------------------
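A minimal sketch of building and compiling the network from the shipped JSON configs, mirroring what train.py does (passing 0 skips loading pretrained weights):

    import sys
    sys.path.append('src')
    from parse_arguments import parseArguments
    from gcnetwork import createGCNetwork
    from losses import lessOneAccuracy, lessThreeAccuracy
    from keras import optimizers

    hp, tp, up, env = parseArguments()
    model = createGCNetwork(hp, tp, 0)  # 0 = do not load pretrained weights
    opt = optimizers.RMSprop(lr=tp['learning_rate'], rho=tp['rho'],
                             epsilon=tp['epsilon'], decay=tp['decay'])
    model.compile(optimizer=opt, loss=tp['loss_function'],
                  metrics=[lessOneAccuracy, lessThreeAccuracy])
    model.summary()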
height = ({},{})".format(ldata, h, w, start_h, start_w) 37 | if train == True: 38 | yield ([left_image, right_image], disp_image) 39 | else: 40 | yield ([left_image, right_image]) 41 | if not train: 42 | break 43 | def _centerImage_(img): 44 | img = img.astype(np.float32) 45 | var = np.var(img, axis = (0,1), keepdims = True) 46 | mean = np.mean(img, axis = (0,1), keepdims = True) 47 | return (img - mean) / np.sqrt(var) 48 | def _normImage_(img, new_max, new_min, old_max, old_min): 49 | img = img.astype(np.float32) 50 | return (img - old_min) * (new_max - new_min) / (old_max - old_min) + new_min 51 | -------------------------------------------------------------------------------- /src/generator.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/generator.pyc -------------------------------------------------------------------------------- /src/hyperparams.json: -------------------------------------------------------------------------------- 1 | { 2 | "max_disp": 192, 3 | "base_num_filters": 32, 4 | "first_kernel_size": 5, 5 | "kernel_size": 3, 6 | "num_res": 8, 7 | "num_down_conv": 4, 8 | "resnet": 1, 9 | "output": "softargmin", 10 | "act_func": "relu", 11 | "h_act_func": "sigmoid", 12 | "ds_stride": 2, 13 | "padding": "same", 14 | "data_format": "channels_last" 15 | } 16 | -------------------------------------------------------------------------------- /src/load_pfm.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import re 4 | def load_pfm(file): 5 | color = None 6 | width = None 7 | height = None 8 | scale = None 9 | endian = None 10 | header = file.readline().rstrip() 11 | if header == 'PF': 12 | color = True 13 | elif header == 'Pf': 14 | color = False 15 | else: 16 | raise Exception('Not a PFM file.') 17 | dim_match = re.match(r'^(\d+)\s(\d+)\s$', file.readline()) 18 | if dim_match: 19 | width, height = map(int, dim_match.groups()) 20 | else: 21 | raise Exception('Malformed PFM header.') 22 | scale = float(file.readline().rstrip()) 23 | if scale < 0: # little-endian 24 | endian = '<' 25 | scale = -scale 26 | else: 27 | endian = '>' # big-endian 28 | data = np.fromfile(file, endian + 'f') 29 | shape = (height, width, 3) if color else (height, width) 30 | data = np.reshape(data, shape) 31 | data = cv2.flip(data, 0) 32 | return data 33 | 34 | -------------------------------------------------------------------------------- /src/load_pfm.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/load_pfm.pyc -------------------------------------------------------------------------------- /src/losses.py: -------------------------------------------------------------------------------- 1 | from keras import backend as K 2 | 3 | def lessOneAccuracy(y_true, y_pred): 4 | shape = K.shape(y_true) 5 | h = K.reshape(shape[1], (1,1)) 6 | w = K.reshape(shape[2], (1,1)) 7 | denom = 1 / K.cast(K.reshape(K.dot(h, w), (1,1)), dtype = 'float32') 8 | return K.dot(K.reshape(K.sum(K.cast(K.less_equal(K.abs(y_true - y_pred), 1), dtype = 'float32')), (1,1)), denom) 9 | 10 | def lessThreeAccuracy(y_true, y_pred): 11 | shape = K.shape(y_true) 12 | h = K.reshape(shape[1], (1,1)) 13 | w = K.reshape(shape[2], (1,1)) 14 | denom = K.dot(h, w) 15 | denom = 1 / K.cast(K.reshape(K.dot(h, 
/src/losses.py:
--------------------------------------------------------------------------------
1 | from keras import backend as K
2 | 
3 | def lessOneAccuracy(y_true, y_pred):
4 |     shape = K.shape(y_true)
5 |     h = K.reshape(shape[1], (1,1))
6 |     w = K.reshape(shape[2], (1,1))
7 |     denom = 1 / K.cast(K.reshape(K.dot(h, w), (1,1)), dtype = 'float32')
8 |     return K.dot(K.reshape(K.sum(K.cast(K.less_equal(K.abs(y_true - y_pred), 1), dtype = 'float32')), (1,1)), denom)
9 | 
10 | def lessThreeAccuracy(y_true, y_pred):
11 |     shape = K.shape(y_true)
12 |     h = K.reshape(shape[1], (1,1))
13 |     w = K.reshape(shape[2], (1,1))
14 |     denom = 1 / K.cast(K.reshape(K.dot(h, w), (1,1)), dtype = 'float32')
15 |     return K.dot(K.reshape(K.sum(K.cast(K.less_equal(K.abs(y_true - y_pred), 3), dtype = 'float32')), (1,1)), denom)
--------------------------------------------------------------------------------
/src/losses.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/losses.pyc
--------------------------------------------------------------------------------
/src/parse_arguments.py:
--------------------------------------------------------------------------------
1 | import json
2 | import sys
3 | def parseArguments():
4 |     with open('src/hyperparams.json') as json_file:
5 |         hp = json.load(json_file)
6 |     with open('src/environment.json') as json_file:
7 |         env = json.load(json_file)
8 |     with open('src/train_params.json') as json_file:
9 |         tp = json.load(json_file)
10 |     #with open('src/test_params.json') as json_file:
11 |     #    pp = json.load(json_file)
12 |     with open('src/util_params.json') as json_file:
13 |         up = json.load(json_file)
14 |     return hp, tp, up, env
--------------------------------------------------------------------------------
/src/parse_arguments.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LinHungShi/GCNetwork/5b3844fe0582fc41e5eca971661633c090bcadc9/src/parse_arguments.pyc
--------------------------------------------------------------------------------
/src/test_params.json:
--------------------------------------------------------------------------------
1 | {
2 |     "pspath": "prediction",
3 |     "batch_size": 1,
4 |     "w_path": "model_weight.hdf5",
5 |     "max_q_size": 3,
6 |     "verbose": 1
7 | }
--------------------------------------------------------------------------------
/src/train_params.json:
--------------------------------------------------------------------------------
1 | {
2 |     "period": 1,
3 |     "verbose": 1,
4 |     "log_save_path": "log",
5 |     "max_q_size": 1,
6 |     "save_best_only": 0,
7 |     "learning_rate": 0.001,
8 |     "batch_size": 1,
9 |     "epochs": 50,
10 |     "epsilon": 0.00000001,
11 |     "rho": 0.9,
12 |     "decay": 0.0,
13 |     "shared_weight": 1,
14 |     "loss_function": "mean_absolute_error",
15 |     "cost_volume_weight_save_path": "model/shared_cost_weight.hdf5",
16 |     "cost_volume_weight_path": "model/shared_cost_weight.hdf5",
17 |     "linear_output_weight_save_path": "model/shared_linear_output_weight.hdf5",
18 |     "linear_output_weight_path": "model/shared_linear_output_weight.hdf5",
19 |     "pspath": "prediction"
20 | }
--------------------------------------------------------------------------------
/src/util_params.json:
--------------------------------------------------------------------------------
1 | {
2 |     "crop_width": 512,
3 |     "crop_height": 256,
4 |     "val_ratio": 0.2,
5 |     "file_extension": "png",
6 |     "seed": 1234,
7 |     "fraction": 1
8 | }
--------------------------------------------------------------------------------
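The two metrics in src/losses.py above reduce to "fraction of pixels whose absolute disparity error is at most N pixels"; a numpy sketch of the same quantity for a single sample:

    import numpy as np

    def within_n_px(y_true, y_pred, n):
        # cf. lessOneAccuracy (n=1) and lessThreeAccuracy (n=3) in src/losses.py
        return np.mean(np.abs(y_true - y_pred) <= n)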
/test.py:
--------------------------------------------------------------------------------
1 | import sys
2 | sys.path.append('src')
3 | import numpy as np
4 | import argparse
5 | import parse_arguments
6 | import tensorflow as tf
7 | from keras import backend as K
8 | from gcnetwork import *
9 | import glob
10 | import os
11 | import psutil
12 | from generator import *
13 | 
14 | def get_mem_usage():
15 |     process = psutil.Process(os.getpid())
16 |     return process.memory_info()
17 | 
18 | def _predictFromArrays_(model, left, right, bs):
19 |     return model.predict([left, right], bs)
20 | 
21 | def _predictFromGenerator_(model, generator, steps, max_q_size):
22 |     return model.predict_generator(generator, steps, max_q_size)
23 | def Predict():
24 |     hp, tp, up, env = parse_arguments.parseArguments()
25 |     pspath = tp['pspath']
26 |     parser = argparse.ArgumentParser()
27 |     #parser.add_argument('-wpath', help = 'weight path of pretrained model', default = weight)
28 |     parser.add_argument('-pspath', help = 'path for saving prediction result', default = pspath)
29 |     parser.add_argument('-data', help = 'data used for prediction', required = True, default = None)
30 |     parser.add_argument('-bs', type = int, help = 'batch size or steps', default = tp['batch_size'])
31 |     args = parser.parse_args()
32 |     #weight_path = args.wpath
33 |     pred_path = args.pspath
34 |     ext = up['file_extension']
35 |     data_path = args.data
36 |     bs = args.bs
37 |     max_q_size = tp['max_q_size']
38 |     verbose = tp['verbose']
39 |     def get_session(gpu_fraction=0.95):
40 |         gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_fraction,
41 |                                     allow_growth=True)
42 |         return tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
43 |     K.set_session(get_session())
44 |     model = createGCNetwork(hp, tp, True)
45 |     if data_path.endswith('npz'):
46 |         images = np.load(data_path)
47 |         print "Predict data using arrays"
48 |         # assumes the npz archive stores the stereo views under the keys 'left' and 'right'
49 |         pred = _predictFromArrays_(model, images['left'], images['right'], bs)
50 |         np.save(pred_path, pred)
51 |     else:
52 |         q_size = tp['max_q_size']
53 |         left_path = os.path.join(data_path, 'left')
54 |         right_path = os.path.join(data_path, 'right')
55 |         left_images = sorted(glob.glob(left_path + "/*.{}".format(ext)))
56 |         right_images = sorted(glob.glob(right_path + "/*.{}".format(ext)))
57 |         generator = generate_arrays_from_file(left_images, right_images, up)
58 |         print "Predict data using generator..."
59 |         pred = model.predict_generator(generator, max_queue_size = max_q_size, steps = bs, verbose = verbose)
60 |         np.save(pred_path, pred)
61 |     K.clear_session()
62 | if __name__ == "__main__":
63 |     Predict()
--------------------------------------------------------------------------------
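test.py stores raw disparities in a .npy file; a sketch for turning a saved prediction into a viewable 8-bit image (the scaling here is ad hoc):

    import numpy as np
    import cv2

    pred = np.load('prediction.npy')  # shape: (num_samples, height, width)
    disp = pred[0]
    img = (255.0 * disp / max(float(disp.max()), 1e-6)).astype(np.uint8)
    cv2.imwrite('disparity_0.png', img)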
/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import sys
3 | sys.path.append('src')
4 | from parse_arguments import *
5 | import os
6 | from data_utils import *
7 | from custom_callback import customModelCheckpoint
8 | import gcnetwork
9 | from generator import *
10 | from losses import *
11 | from keras.callbacks import ModelCheckpoint, EarlyStopping, TensorBoard
12 | from keras import optimizers
13 | from keras import backend as K
14 | import math
15 | import random
16 | import tensorflow as tf
17 | import numpy as np
18 | def trainSceneFlowData(hp, tp, up, env, callbacks, upw):
19 |     lr = tp['learning_rate']
20 |     epochs = tp['epochs']
21 |     batch_size = tp['batch_size']
22 |     q_size = tp['max_q_size']
23 |     epsilon = tp['epsilon']
24 |     rho = tp['rho']
25 |     decay = tp['decay']
26 |     loss = tp['loss_function']
27 |     sceneflow_root = env['sceneflow_root']
28 |     driving_root = env['driving_root']
29 |     driving_train = env['driving_train']
30 |     driving_label = env['driving_label']
31 |     train_all = env['train_all']
32 |     train_driving = env['train_driving']
33 |     train_monkaa = env['train_monkaa']
34 |     val_ratio = up['val_ratio']
35 |     fraction = up['fraction']
36 |     root = os.path.join(os.getcwd(), sceneflow_root)
37 |     driving = os.path.join(root, driving_root)
38 |     driving_data_path = os.path.join(driving, driving_train)
39 |     driving_label_path = os.path.join(driving, driving_label)
40 |     monkaa_root = env['monkaa_root']
41 |     monkaa = os.path.join(root, monkaa_root)
42 |     monkaa_train = env['monkaa_train']
43 |     monkaa_label = env['monkaa_label']
44 |     monkaa_data_path = os.path.join(monkaa, monkaa_train)
45 |     monkaa_label_path = os.path.join(monkaa, monkaa_label)
46 |     if train_all:
47 |         train_list = [[driving_data_path, driving_label_path, genDrivingPath], [monkaa_data_path, monkaa_label_path, genMonkaaPath]]
48 |     else:
49 |         train_list = []
50 |         if train_driving:
51 |             train_list.append([driving_data_path, driving_label_path, genDrivingPath])
52 |         if train_monkaa:
53 |             train_list.append([monkaa_data_path, monkaa_label_path, genMonkaaPath])
54 |     train_paths = map(lambda x: x[2](x[0], x[1]), train_list)
55 |     agg_train_path = zip(*train_paths)
56 |     left, right, disp = [reduce(lambda x, y: x + y, path) for path in agg_train_path]
57 |     l_imgs, r_imgs, d_imgs = extractAllImage(left, right, disp)
58 |     train, val = splitData(l_imgs, r_imgs, d_imgs, val_ratio, fraction)
59 |     val_generator = generate_arrays_from_file(val[0], val[1], up, val[2])
60 |     train_generator = generate_arrays_from_file(train[0], train[1], up, train[2])
61 |     num_steps = int(math.ceil(len(train[0]) / float(batch_size)))
62 |     val_steps = int(math.ceil(len(val[0]) / float(batch_size)))
63 |     model = gcnetwork.createGCNetwork(hp, tp, upw)
64 |     optimizer = optimizers.RMSprop(lr = lr, rho = rho, epsilon = epsilon, decay = decay)
65 |     model.compile(optimizer = optimizer, loss = loss, metrics = [lessOneAccuracy, lessThreeAccuracy])
66 |     model.fit_generator(train_generator, validation_data = val_generator, validation_steps = val_steps, steps_per_epoch = num_steps, max_q_size = q_size, epochs = epochs, callbacks = callbacks)
67 |     print "Training Complete"
68 |     result = model.predict_generator(train_generator, steps = 1)
np.save("prediction.npy", result) 69 | def genCallBacks(cost_filepath, outputfilepath, log_save_path, save_best_only, period, verbose): 70 | callback_tb = TensorBoard(log_dir = log_save_path, histogram_freq = 0, write_graph = True, write_images = True) 71 | callback_mc = customModelCheckpoint(cost_filepath, outputfilepath, verbose = verbose, save_best_only = save_best_only, period = period) 72 | return [callback_tb, callback_mc] 73 | 74 | if __name__ == '__main__': 75 | #config = tf.ConfigProto() 76 | #config.gpu_options.allow_growth=True 77 | #config.gpu_options.allocator_type ='BFC' 78 | #config.gpu_options.per_process_gpu_memory_fraction = 0.98 79 | #sess = tf.Session(config = config) 80 | #K.set_session(sess) 81 | hp, tp, up, env = parseArguments() 82 | parser = argparse.ArgumentParser() 83 | parser.add_argument('-upw', '--use_pretrained_weight', type = int, help = 'train the model use pretrained weight', default = 1) 84 | args = parser.parse_args() 85 | #weight_save_path = tp['weight_save_path'] 86 | log_save_path = tp['log_save_path'] 87 | save_best_only = tp['save_best_only'] 88 | period = tp['period'] 89 | verbose = tp['verbose'] 90 | cost_weight_path = tp['cost_volume_weight_save_path'] 91 | linear_output_weight_path = tp['linear_output_weight_path'] 92 | if hp['output'] == 'softargmin': 93 | linear_output_weight_path = None 94 | callbacks = genCallBacks(cost_weight_path, linear_output_weight_path, log_save_path, save_best_only, period, verbose) 95 | trainSceneFlowData(hp, tp, up, env, callbacks, args.use_pretrained_weight) 96 | --------------------------------------------------------------------------------