├── .idea └── vcs.xml ├── LICENSE ├── README.md ├── dual_path_network.py └── images ├── dual path networks.png └── original-results-on-imagenet1k.png /.idea/vcs.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. 
For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "{}" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright 2017 Somshubra Majumdar 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Dual Path Networks in Keras 2 | [Dual Path Networks](https://arxiv.org/abs/1707.01629) are highly efficient networks which combine the strength of both ResNeXt [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/abs/1611.05431) and DenseNets [Densely Connected Convolutional Networks](https://arxiv.org/abs/1608.06993). 3 | 4 | Note: Weights have not been ported over yet. 
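Conceptually, every block in a DPN keeps two streams flowing through the network: a residual stream combined by element-wise addition (as in ResNeXt) and a dense stream combined by channel concatenation (as in DenseNet). The sketch below illustrates that wiring only; the names `dual_path_connection`, `transform` and `residual_channels` are illustrative and are not part of this repository's API — see `dual_path_network.py` for the actual implementation.

```python
from keras.layers import Lambda, add, concatenate

def dual_path_connection(residual_in, dense_in, transform, residual_channels):
    # Apply a shared transform to both paths joined along the channel axis
    # (channels_last data format assumed for the slicing below).
    x = transform(concatenate([residual_in, dense_in]))
    # Split the output: the first `residual_channels` channels feed the residual
    # path, the remaining channels feed the dense path.
    res_out = Lambda(lambda z: z[..., :residual_channels])(x)
    dense_out = Lambda(lambda z: z[..., residual_channels:])(x)
    residual = add([residual_in, res_out])        # ResNeXt-style addition
    dense = concatenate([dense_in, dense_out])    # DenseNet-style concatenation
    return residual, dense
```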
5 | 6 | ## Dual Path Connections 7 | See `images/dual path networks.png` for an illustration of the dual path connections. 8 | 9 | ## Usage 10 | Several of the standard Dual Path Network models have been included. They can be initialized as: 11 | ```python 12 | from dual_path_network import DPN92, DPN98, DPN107, DPN137 13 | 14 | model = DPN92(input_shape=(224, 224, 3)) # same for the others 15 | ``` 16 | 17 | To create a custom DualPathNetwork, use the `DualPathNetwork` builder method: 18 | ```python 19 | from dual_path_network import DualPathNetwork 20 | 21 | model = DualPathNetwork(input_shape=(224, 224, 3), 22 | initial_conv_filters=64, 23 | depth=[3, 4, 20, 3], 24 | filter_increment=[16, 32, 24, 128], 25 | cardinality=32, 26 | width=3, 27 | weight_decay=0, 28 | include_top=True, 29 | weights=None, 30 | input_tensor=None, 31 | pooling=None, 32 | classes=1000) 33 | ``` 34 | 35 | ## Performance 36 | The original ImageNet-1k results from the paper are reproduced in `images/original-results-on-imagenet1k.png`. 37 | 38 | ## Support 39 | - Keras does not have built-in support for grouped convolutions, so I had to use Lambda layers to match the ResNeXt paper implementation. Once grouped convolution support is added to Keras, I hope to use it here as well. 40 | - Mean-max global pooling is supported, with a Lambda layer used to scale the sum of the average- and max-pooled outputs by 0.5. 41 | - `depth` and `filter_increment` must be lists for now, and must be lists of the same length. I may add support for plain integers later, but list support is far more useful anyway, so I may not implement it. 42 | - Weight decay support is included, but disabled by default. The DPN paper does not mention it, but the ResNet, WRN and ResNeXt papers all appear to use small weight regularization. Use a small value such as `1e-4` or `5e-4` if you wish to enable it. 43 | -------------------------------------------------------------------------------- /dual_path_network.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Dual Path Networks 3 | Combines ResNeXt grouped convolutions and DenseNet dense 4 | connections to achieve state-of-the-art performance on ImageNet 5 | 6 | References: 7 | - [Dual Path Networks](https://arxiv.org/abs/1707.01629) 8 | ''' 9 | from __future__ import print_function 10 | from __future__ import absolute_import 11 | from __future__ import division 12 | 13 | from keras.models import Model 14 | from keras.layers import Input 15 | from keras.layers import Dense 16 | from keras.layers import Lambda 17 | from keras.layers import Activation 18 | from keras.layers import BatchNormalization 19 | from keras.layers import MaxPooling2D 20 | from keras.layers import GlobalAveragePooling2D 21 | from keras.layers import GlobalMaxPooling2D 22 | from keras.layers import Conv2D 23 | from keras.layers import concatenate 24 | from keras.layers import add 25 | from keras.regularizers import l2 26 | from keras.utils import conv_utils 27 | from keras.utils.data_utils import get_file 28 | from keras.engine.topology import get_source_inputs 29 | from keras_applications.imagenet_utils import _obtain_input_shape 30 | from keras.applications.imagenet_utils import decode_predictions 31 | from keras import backend as K 32 | 33 | __all__ = ['DualPathNetwork', 'DPN92', 'DPN98', 'DPN137', 'DPN107', 'preprocess_input', 'decode_predictions'] 34 | 35 | 36 | def preprocess_input(x, data_format=None): 37 | """Preprocesses a tensor encoding a batch of images. 38 | Obtained from https://github.com/cypw/DPNs 39 | 40 | # Arguments 41 | x: input Numpy tensor, 4D. 42 | data_format: data format of the image tensor. 43 | 44 | # Returns 45 | Preprocessed tensor.
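    # Example
        A minimal illustration of the expected call; the random array here is
        purely for demonstration (an assumption, not part of the original code):
        ```python
        import numpy as np
        x = np.random.uniform(0, 255, size=(1, 224, 224, 3)).astype('float32')
        x = preprocess_input(x)  # RGB->BGR, mean subtraction, scaling by 0.0167
        ```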
46 | """ 47 | if data_format is None: 48 | data_format = K.image_data_format() 49 | assert data_format in {'channels_last', 'channels_first'} 50 | 51 | if data_format == 'channels_first': 52 | # 'RGB'->'BGR' 53 | x = x[:, ::-1, :, :] 54 | # Zero-center by mean pixel 55 | x[:, 0, :, :] -= 104 56 | x[:, 1, :, :] -= 117 57 | x[:, 2, :, :] -= 128 58 | else: 59 | # 'RGB'->'BGR' 60 | x = x[:, :, :, ::-1] 61 | # Zero-center by mean pixel 62 | x[:, :, :, 0] -= 104 63 | x[:, :, :, 1] -= 117 64 | x[:, :, :, 2] -= 124 65 | 66 | x *= 0.0167 67 | return x 68 | 69 | 70 | def DualPathNetwork(input_shape=None, 71 | initial_conv_filters=64, 72 | depth=[3, 4, 20, 3], 73 | filter_increment=[16, 32, 24, 128], 74 | cardinality=32, 75 | width=3, 76 | weight_decay=0, 77 | include_top=True, 78 | weights=None, 79 | input_tensor=None, 80 | pooling=None, 81 | classes=1000): 82 | """ Instantiate the Dual Path Network architecture for the ImageNet dataset. Note that , 83 | when using TensorFlow for best performance you should set 84 | `image_data_format="channels_last"` in your Keras config 85 | at ~/.keras/keras.json. 86 | The model are compatible with both 87 | TensorFlow and Theano. The dimension ordering 88 | convention used by the model is the one 89 | specified in your Keras config file. 90 | # Arguments 91 | initial_conv_filters: number of features for the initial convolution 92 | depth: number or layers in the each block, defined as a list. 93 | DPN-92 = [3, 4, 20, 3] 94 | DPN-98 = [3, 6, 20, 3] 95 | DPN-131 = [4, 8, 28, 3] 96 | DPN-107 = [4, 8, 20, 3] 97 | filter_increment: number of filters incremented per block, defined as a list. 98 | DPN-92 = [16, 32, 24, 128] 99 | DON-98 = [16, 32, 32, 128] 100 | DPN-131 = [16, 32, 32, 128] 101 | DPN-107 = [20, 64, 64, 128] 102 | cardinality: the size of the set of transformations 103 | width: width multiplier for the network 104 | weight_decay: weight decay (l2 norm) 105 | include_top: whether to include the fully-connected 106 | layer at the top of the network. 107 | weights: `None` (random initialization) or `imagenet` (trained 108 | on ImageNet) 109 | input_tensor: optional Keras tensor (i.e. output of `layers.Input()`) 110 | to use as image input for the model. 111 | input_shape: optional shape tuple, only to be specified 112 | if `include_top` is False (otherwise the input shape 113 | has to be `(224, 224, 3)` (with `tf` dim ordering) 114 | or `(3, 224, 224)` (with `th` dim ordering). 115 | It should have exactly 3 inputs channels, 116 | and width and height should be no smaller than 8. 117 | E.g. `(200, 200, 3)` would be one valid value. 118 | pooling: Optional pooling mode for feature extraction 119 | when `include_top` is `False`. 120 | - `None` means that the output of the model will be 121 | the 4D tensor output of the 122 | last convolutional layer. 123 | - `avg` means that global average pooling 124 | will be applied to the output of the 125 | last convolutional layer, and thus 126 | the output of the model will be a 2D tensor. 127 | - `max` means that global max pooling will 128 | be applied. 129 | - `max-avg` means that both global average and global max 130 | pooling will be applied to the output of the last 131 | convolution layer 132 | classes: optional number of classes to classify images 133 | into, only to be specified if `include_top` is True, and 134 | if no `weights` argument is specified. 135 | # Returns 136 | A Keras model instance. 
137 | """ 138 | 139 | if weights not in {'imagenet', None}: 140 | raise ValueError('The `weights` argument should be either ' 141 | '`None` (random initialization) or `imagenet` ' 142 | '(pre-training on ImageNet).') 143 | 144 | if weights == 'imagenet' and include_top and classes != 1000: 145 | raise ValueError('If using `weights` as imagenet with `include_top`' 146 | ' as true, `classes` should be 1000') 147 | 148 | assert len(depth) == len(filter_increment), "The length of filter increment list must match the length " \ 149 | "of the depth list." 150 | 151 | # Determine proper input shape 152 | input_shape = _obtain_input_shape(input_shape, 153 | default_size=224, 154 | min_size=112, 155 | data_format=K.image_data_format(), 156 | require_flatten=include_top) 157 | 158 | if input_tensor is None: 159 | img_input = Input(shape=input_shape) 160 | else: 161 | if not K.is_keras_tensor(input_tensor): 162 | img_input = Input(tensor=input_tensor, shape=input_shape) 163 | else: 164 | img_input = input_tensor 165 | 166 | x = _create_dpn(classes, img_input, include_top, initial_conv_filters, 167 | filter_increment, depth, cardinality, width, weight_decay, pooling) 168 | 169 | # Ensure that the model takes into account 170 | # any potential predecessors of `input_tensor`. 171 | if input_tensor is not None: 172 | inputs = get_source_inputs(input_tensor) 173 | else: 174 | inputs = img_input 175 | # Create model. 176 | model = Model(inputs, x, name='resnext') 177 | 178 | # load weights 179 | 180 | return model 181 | 182 | 183 | def DPN92(input_shape=None, 184 | include_top=True, 185 | weights=None, 186 | input_tensor=None, 187 | pooling=None, 188 | classes=1000): 189 | return DualPathNetwork(input_shape, include_top=include_top, weights=weights, input_tensor=input_tensor, 190 | pooling=pooling, classes=classes) 191 | 192 | 193 | def DPN98(input_shape=None, 194 | include_top=True, 195 | weights=None, 196 | input_tensor=None, 197 | pooling=None, 198 | classes=1000): 199 | return DualPathNetwork(input_shape, initial_conv_filters=96, depth=[3, 6, 20, 3], filter_increment=[16, 32, 32, 128], 200 | cardinality=40, width=4, include_top=include_top, weights=weights, input_tensor=input_tensor, 201 | pooling=pooling, classes=classes) 202 | 203 | 204 | def DPN137(input_shape=None, 205 | include_top=True, 206 | weights=None, 207 | input_tensor=None, 208 | pooling=None, 209 | classes=1000): 210 | return DualPathNetwork(input_shape, initial_conv_filters=128, depth=[4, 8, 28, 3], filter_increment=[16, 32, 32, 128], 211 | cardinality=40, width=4, include_top=include_top, weights=weights, input_tensor=input_tensor, 212 | pooling=pooling, classes=classes) 213 | 214 | 215 | def DPN107(input_shape=None, 216 | include_top=True, 217 | weights=None, 218 | input_tensor=None, 219 | pooling=None, 220 | classes=1000): 221 | return DualPathNetwork(input_shape, initial_conv_filters=128, depth=[4, 8, 20, 3], filter_increment=[20, 64, 64, 128], 222 | cardinality=50, width=4, include_top=include_top, weights=weights, input_tensor=input_tensor, 223 | pooling=pooling, classes=classes) 224 | 225 | 226 | def _initial_conv_block_inception(input, initial_conv_filters, weight_decay=5e-4): 227 | ''' Adds an initial conv block, with batch norm and relu for the DPN 228 | Args: 229 | input: input tensor 230 | initial_conv_filters: number of filters for initial conv block 231 | weight_decay: weight decay factor 232 | Returns: a keras tensor 233 | ''' 234 | channel_axis = 1 if K.image_data_format() == 'channels_first' else -1 235 | 236 | x = 
Conv2D(initial_conv_filters, (7, 7), padding='same', use_bias=False, kernel_initializer='he_normal', 237 | kernel_regularizer=l2(weight_decay), strides=(2, 2))(input) 238 | x = BatchNormalization(axis=channel_axis)(x) 239 | x = Activation('relu')(x) 240 | 241 | x = MaxPooling2D((3, 3), strides=(2, 2), padding='same')(x) 242 | 243 | return x 244 | 245 | 246 | def _bn_relu_conv_block(input, filters, kernel=(3, 3), stride=(1, 1), weight_decay=5e-4): 247 | ''' Adds a Batchnorm-Relu-Conv block for DPN 248 | Args: 249 | input: input tensor 250 | filters: number of output filters 251 | kernel: convolution kernel size 252 | stride: stride of convolution 253 | Returns: a keras tensor 254 | ''' 255 | channel_axis = 1 if K.image_data_format() == 'channels_first' else -1 256 | 257 | x = Conv2D(filters, kernel, padding='same', use_bias=False, kernel_initializer='he_normal', 258 | kernel_regularizer=l2(weight_decay), strides=stride)(input) 259 | x = BatchNormalization(axis=channel_axis)(x) 260 | x = Activation('relu')(x) 261 | return x 262 | 263 | 264 | def _grouped_convolution_block(input, grouped_channels, cardinality, strides, weight_decay=5e-4): 265 | ''' Adds a grouped convolution block, equivalent to the grouped convolution block from the ResNeXt paper 266 | Args: 267 | input: input tensor 268 | grouped_channels: grouped number of filters 269 | cardinality: cardinality factor describing the number of groups 270 | strides: performs strided convolution for downscaling if > 1 271 | weight_decay: weight decay term 272 | Returns: a keras tensor 273 | ''' 274 | init = input 275 | channel_axis = 1 if K.image_data_format() == 'channels_first' else -1 276 | 277 | group_list = [] 278 | 279 | if cardinality == 1: 280 | # with cardinality 1, it is a standard convolution 281 | x = Conv2D(grouped_channels, (3, 3), padding='same', use_bias=False, strides=strides, 282 | kernel_initializer='he_normal', kernel_regularizer=l2(weight_decay))(init) 283 | x = BatchNormalization(axis=channel_axis)(x) 284 | x = Activation('relu')(x) 285 | return x 286 | 287 | for c in range(cardinality): 288 | x = Lambda(lambda z: z[:, :, :, c * grouped_channels:(c + 1) * grouped_channels] 289 | if K.image_data_format() == 'channels_last' else 290 | z[:, c * grouped_channels:(c + 1) * grouped_channels, :, :])(input)  # slice out the c-th group of channels 291 | 292 | x = Conv2D(grouped_channels, (3, 3), padding='same', use_bias=False, strides=strides, 293 | kernel_initializer='he_normal', kernel_regularizer=l2(weight_decay))(x) 294 | 295 | group_list.append(x) 296 | 297 | group_merge = concatenate(group_list, axis=channel_axis) 298 | group_merge = BatchNormalization(axis=channel_axis)(group_merge) 299 | group_merge = Activation('relu')(group_merge) 300 | return group_merge 301 | 302 | 303 | def _dual_path_block(input, pointwise_filters_a, grouped_conv_filters_b, pointwise_filters_c, 304 | filter_increment, cardinality, block_type='normal'): 305 | ''' 306 | Creates a Dual Path Block. The first path is a ResNeXt type 307 | grouped convolution block. The second is a DenseNet type dense 308 | convolution block.
309 | 310 | Args: 311 | input: input tensor 312 | pointwise_filters_a: number of filters for the bottleneck 313 | pointwise convolution 314 | grouped_conv_filters_b: number of filters for the grouped 315 | convolution block 316 | pointwise_filters_c: number of filters for the bottleneck 317 | convolution block 318 | filter_increment: number of filters that will be added 319 | cardinality: cardinality factor 320 | block_type: determines what action the block will perform 321 | - `projection`: adds a projection connection 322 | - `downsample`: downsamples the spatial resolution 323 | - `normal`: simply adds a dual path connection 324 | 325 | Returns: a list of two output tensors - one for the residual (ResNeXt) path 326 | and one for the dense (DenseNet) path 327 | 328 | ''' 329 | channel_axis = 1 if K.image_data_format() == 'channels_first' else -1 330 | grouped_channels = int(grouped_conv_filters_b / cardinality) 331 | 332 | init = concatenate(input, axis=channel_axis) if isinstance(input, list) else input 333 | 334 | if block_type == 'projection': 335 | stride = (1, 1) 336 | projection = True 337 | elif block_type == 'downsample': 338 | stride = (2, 2) 339 | projection = True 340 | elif block_type == 'normal': 341 | stride = (1, 1) 342 | projection = False 343 | else: 344 | raise ValueError('`block_type` must be one of ["projection", "downsample", "normal"]. Given %s' % block_type) 345 | 346 | if projection: 347 | projection_path = _bn_relu_conv_block(init, filters=pointwise_filters_c + 2 * filter_increment, 348 | kernel=(1, 1), stride=stride) 349 | input_residual_path = Lambda(lambda z: z[:, :, :, :pointwise_filters_c] 350 | if K.image_data_format() == 'channels_last' else 351 | z[:, :pointwise_filters_c, :, :])(projection_path) 352 | input_dense_path = Lambda(lambda z: z[:, :, :, pointwise_filters_c:] 353 | if K.image_data_format() == 'channels_last' else 354 | z[:, pointwise_filters_c:, :, :])(projection_path) 355 | else: 356 | input_residual_path = input[0] 357 | input_dense_path = input[1] 358 | 359 | x = _bn_relu_conv_block(init, filters=pointwise_filters_a, kernel=(1, 1)) 360 | x = _grouped_convolution_block(x, grouped_channels=grouped_channels, cardinality=cardinality, strides=stride) 361 | x = _bn_relu_conv_block(x, filters=pointwise_filters_c + filter_increment, kernel=(1, 1)) 362 | 363 | output_residual_path = Lambda(lambda z: z[:, :, :, :pointwise_filters_c] 364 | if K.image_data_format() == 'channels_last' else 365 | z[:, :pointwise_filters_c, :, :])(x) 366 | output_dense_path = Lambda(lambda z: z[:, :, :, pointwise_filters_c:] 367 | if K.image_data_format() == 'channels_last' else 368 | z[:, pointwise_filters_c:, :, :])(x) 369 | 370 | residual_path = add([input_residual_path, output_residual_path]) 371 | dense_path = concatenate([input_dense_path, output_dense_path], axis=channel_axis) 372 | 373 | return [residual_path, dense_path] 374 | 375 | 376 | def _create_dpn(nb_classes, img_input, include_top, initial_conv_filters, 377 | filter_increment, depth, cardinality=32, width=3, weight_decay=5e-4, pooling=None): 378 | ''' Creates a Dual Path Network model with the specified parameters 379 | Args: 380 | initial_conv_filters: number of features for the initial convolution 381 | include_top: Flag to include the last dense layer 382 | cardinality: the size of the set of transformations 383 | filter_increment: number of filters incremented per block, defined as a list.
384 | DPN-92 = [16, 32, 24, 128] 385 | DPN-98 = [16, 32, 32, 128] 386 | DPN-131 = [16, 32, 32, 128] 387 | DPN-107 = [20, 64, 64, 128] 388 | depth: number of layers in each block, defined as a list. 389 | DPN-92 = [3, 4, 20, 3] 390 | DPN-98 = [3, 6, 20, 3] 391 | DPN-131 = [4, 8, 28, 3] 392 | DPN-107 = [4, 8, 20, 3] 393 | width: width multiplier for network 394 | weight_decay: weight decay (l2 norm) 395 | pooling: Optional pooling mode for feature extraction 396 | when `include_top` is `False`. 397 | - `None` means that the output of the model will be 398 | the 4D tensor output of the 399 | last convolutional layer. 400 | - `avg` means that global average pooling 401 | will be applied to the output of the 402 | last convolutional layer, and thus 403 | the output of the model will be a 2D tensor. 404 | - `max` means that global max pooling will 405 | be applied. 406 | - `max-avg` means that both global average and global max 407 | pooling will be applied to the output of the last 408 | convolution layer. 409 | Returns: a Keras Model 410 | ''' 411 | channel_axis = 1 if K.image_data_format() == 'channels_first' else -1 412 | N = list(depth) 413 | base_filters = 256 414 | 415 | # block 1 (initial conv block) 416 | x = _initial_conv_block_inception(img_input, initial_conv_filters, weight_decay) 417 | 418 | # block 2 (projection block) 419 | filter_inc = filter_increment[0] 420 | filters = int(cardinality * width) 421 | 422 | x = _dual_path_block(x, pointwise_filters_a=filters, 423 | grouped_conv_filters_b=filters, 424 | pointwise_filters_c=base_filters, 425 | filter_increment=filter_inc, 426 | cardinality=cardinality, 427 | block_type='projection') 428 | 429 | for i in range(N[0] - 1): 430 | x = _dual_path_block(x, pointwise_filters_a=filters, 431 | grouped_conv_filters_b=filters, 432 | pointwise_filters_c=base_filters, 433 | filter_increment=filter_inc, 434 | cardinality=cardinality, 435 | block_type='normal') 436 | 437 | # remaining blocks 438 | for k in range(1, len(N)): 439 | print("BLOCK %d" % (k + 1)) 440 | filter_inc = filter_increment[k] 441 | filters *= 2 442 | base_filters *= 2 443 | 444 | x = _dual_path_block(x, pointwise_filters_a=filters, 445 | grouped_conv_filters_b=filters, 446 | pointwise_filters_c=base_filters, 447 | filter_increment=filter_inc, 448 | cardinality=cardinality, 449 | block_type='downsample') 450 | 451 | for i in range(N[k] - 1): 452 | x = _dual_path_block(x, pointwise_filters_a=filters, 453 | grouped_conv_filters_b=filters, 454 | pointwise_filters_c=base_filters, 455 | filter_increment=filter_inc, 456 | cardinality=cardinality, 457 | block_type='normal') 458 | 459 | x = concatenate(x, axis=channel_axis) 460 | 461 | if include_top: 462 | avg_pool = GlobalAveragePooling2D()(x) 463 | max_pool = GlobalMaxPooling2D()(x) 464 | x = add([avg_pool, max_pool]) 465 | x = Lambda(lambda z: 0.5 * z)(x)  # mean of the avg- and max-pooled tensors 466 | x = Dense(nb_classes, use_bias=False, kernel_regularizer=l2(weight_decay), 467 | kernel_initializer='he_normal', activation='softmax')(x) 468 | else: 469 | if pooling == 'avg': 470 | x = GlobalAveragePooling2D()(x) 471 | elif pooling == 'max': 472 | x = GlobalMaxPooling2D()(x) 473 | elif pooling == 'max-avg': 474 | a = GlobalMaxPooling2D()(x) 475 | b = GlobalAveragePooling2D()(x) 476 | x = add([a, b]) 477 | x = Lambda(lambda z: 0.5 * z)(x) 478 | 479 | return x 480 | 481 | if __name__ == '__main__': 482 | model = DPN92((224, 224, 3)) 483 | model.summary() 484 | -------------------------------------------------------------------------------- /images/dual path networks.png:
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/titu1994/Keras-DualPathNetworks/2d9bd917a938f61e55ee4e264b7bab846c45ab87/images/dual path networks.png -------------------------------------------------------------------------------- /images/original-results-on-imagenet1k.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/titu1994/Keras-DualPathNetworks/2d9bd917a938f61e55ee4e264b7bab846c45ab87/images/original-results-on-imagenet1k.png --------------------------------------------------------------------------------