├── .github
    └── PULL_REQUEST_TEMPLATE.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── NOTICE
├── README.md
├── enet_full_res_deploy.py
├── gen.py
├── mxnet_segmentation.ipynb
└── segmentation.py


/.github/PULL_REQUEST_TEMPLATE.md:
--------------------------------------------------------------------------------
1 | *Issue #, if available:*
2 | 
3 | *Description of changes:*
4 | 
5 | 
6 | By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
7 | 


--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ## Code of Conduct
2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 
3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 
4 | opensource-codeofconduct@amazon.com with any additional questions or comments.
5 | 


--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
 1 | # Contributing Guidelines
 2 | 
 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 
 4 | documentation, we greatly value feedback and contributions from our community.
 5 | 
 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 
 7 | information to effectively respond to your bug report or contribution.
 8 | 
 9 | 
10 | ## Reporting Bugs/Feature Requests
11 | 
12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features.
13 | 
14 | When filing an issue, please check [existing open](https://github.com/aws-samples/aws-sagemaker-segmentation-example/issues), or [recently closed](https://github.com/aws-samples/aws-sagemaker-segmentation-example/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aclosed%20), issues to make sure somebody else hasn't already 
15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
16 | 
17 | * A reproducible test case or series of steps
18 | * The version of our code being used
19 | * Any modifications you've made relevant to the bug
20 | * Anything unusual about your environment or deployment
21 | 
22 | 
23 | ## Contributing via Pull Requests
24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
25 | 
26 | 1. You are working against the latest source on the *master* branch.
27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
29 | 
30 | To send us a pull request, please:
31 | 
32 | 1. Fork the repository.
33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
34 | 3. Ensure local tests pass.
35 | 4. Commit to your fork using clear commit messages.
36 | 5. Send us a pull request, answering any default questions in the pull request interface.
37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
38 | 
39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 
40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
41 | 
42 | 
43 | ## Finding contributions to work on
44 | Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ['help wanted'](https://github.com/aws-samples/aws-sagemaker-segmentation-example/labels/help%20wanted) issues is a great place to start. 
45 | 
46 | 
47 | ## Code of Conduct
48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 
50 | opensource-codeofconduct@amazon.com with any additional questions or comments.
51 | 
52 | 
53 | ## Security issue notifications
54 | If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.
55 | 
56 | 
57 | ## Licensing
58 | 
59 | See the [LICENSE](https://github.com/aws-samples/aws-sagemaker-segmentation-example/blob/master/LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
60 | 
61 | We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.
62 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
  1 | 
  2 |                                  Apache License
  3 |                            Version 2.0, January 2004
  4 |                         http://www.apache.org/licenses/
  5 | 
  6 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
  7 | 
  8 |    1. Definitions.
  9 | 
 10 |       "License" shall mean the terms and conditions for use, reproduction,
 11 |       and distribution as defined by Sections 1 through 9 of this document.
 12 | 
 13 |       "Licensor" shall mean the copyright owner or entity authorized by
 14 |       the copyright owner that is granting the License.
 15 | 
 16 |       "Legal Entity" shall mean the union of the acting entity and all
 17 |       other entities that control, are controlled by, or are under common
 18 |       control with that entity. For the purposes of this definition,
 19 |       "control" means (i) the power, direct or indirect, to cause the
 20 |       direction or management of such entity, whether by contract or
 21 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
 22 |       outstanding shares, or (iii) beneficial ownership of such entity.
 23 | 
 24 |       "You" (or "Your") shall mean an individual or Legal Entity
 25 |       exercising permissions granted by this License.
 26 | 
 27 |       "Source" form shall mean the preferred form for making modifications,
 28 |       including but not limited to software source code, documentation
 29 |       source, and configuration files.
 30 | 
 31 |       "Object" form shall mean any form resulting from mechanical
 32 |       transformation or translation of a Source form, including but
 33 |       not limited to compiled object code, generated documentation,
 34 |       and conversions to other media types.
 35 | 
 36 |       "Work" shall mean the work of authorship, whether in Source or
 37 |       Object form, made available under the License, as indicated by a
 38 |       copyright notice that is included in or attached to the work
 39 |       (an example is provided in the Appendix below).
 40 | 
 41 |       "Derivative Works" shall mean any work, whether in Source or Object
 42 |       form, that is based on (or derived from) the Work and for which the
 43 |       editorial revisions, annotations, elaborations, or other modifications
 44 |       represent, as a whole, an original work of authorship. For the purposes
 45 |       of this License, Derivative Works shall not include works that remain
 46 |       separable from, or merely link (or bind by name) to the interfaces of,
 47 |       the Work and Derivative Works thereof.
 48 | 
 49 |       "Contribution" shall mean any work of authorship, including
 50 |       the original version of the Work and any modifications or additions
 51 |       to that Work or Derivative Works thereof, that is intentionally
 52 |       submitted to Licensor for inclusion in the Work by the copyright owner
 53 |       or by an individual or Legal Entity authorized to submit on behalf of
 54 |       the copyright owner. For the purposes of this definition, "submitted"
 55 |       means any form of electronic, verbal, or written communication sent
 56 |       to the Licensor or its representatives, including but not limited to
 57 |       communication on electronic mailing lists, source code control systems,
 58 |       and issue tracking systems that are managed by, or on behalf of, the
 59 |       Licensor for the purpose of discussing and improving the Work, but
 60 |       excluding communication that is conspicuously marked or otherwise
 61 |       designated in writing by the copyright owner as "Not a Contribution."
 62 | 
 63 |       "Contributor" shall mean Licensor and any individual or Legal Entity
 64 |       on behalf of whom a Contribution has been received by Licensor and
 65 |       subsequently incorporated within the Work.
 66 | 
 67 |    2. Grant of Copyright License. Subject to the terms and conditions of
 68 |       this License, each Contributor hereby grants to You a perpetual,
 69 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
 70 |       copyright license to reproduce, prepare Derivative Works of,
 71 |       publicly display, publicly perform, sublicense, and distribute the
 72 |       Work and such Derivative Works in Source or Object form.
 73 | 
 74 |    3. Grant of Patent License. Subject to the terms and conditions of
 75 |       this License, each Contributor hereby grants to You a perpetual,
 76 |       worldwide, non-exclusive, no-charge, royalty-free, irrevocable
 77 |       (except as stated in this section) patent license to make, have made,
 78 |       use, offer to sell, sell, import, and otherwise transfer the Work,
 79 |       where such license applies only to those patent claims licensable
 80 |       by such Contributor that are necessarily infringed by their
 81 |       Contribution(s) alone or by combination of their Contribution(s)
 82 |       with the Work to which such Contribution(s) was submitted. If You
 83 |       institute patent litigation against any entity (including a
 84 |       cross-claim or counterclaim in a lawsuit) alleging that the Work
 85 |       or a Contribution incorporated within the Work constitutes direct
 86 |       or contributory patent infringement, then any patent licenses
 87 |       granted to You under this License for that Work shall terminate
 88 |       as of the date such litigation is filed.
 89 | 
 90 |    4. Redistribution. You may reproduce and distribute copies of the
 91 |       Work or Derivative Works thereof in any medium, with or without
 92 |       modifications, and in Source or Object form, provided that You
 93 |       meet the following conditions:
 94 | 
 95 |       (a) You must give any other recipients of the Work or
 96 |           Derivative Works a copy of this License; and
 97 | 
 98 |       (b) You must cause any modified files to carry prominent notices
 99 |           stating that You changed the files; and
100 | 
101 |       (c) You must retain, in the Source form of any Derivative Works
102 |           that You distribute, all copyright, patent, trademark, and
103 |           attribution notices from the Source form of the Work,
104 |           excluding those notices that do not pertain to any part of
105 |           the Derivative Works; and
106 | 
107 |       (d) If the Work includes a "NOTICE" text file as part of its
108 |           distribution, then any Derivative Works that You distribute must
109 |           include a readable copy of the attribution notices contained
110 |           within such NOTICE file, excluding those notices that do not
111 |           pertain to any part of the Derivative Works, in at least one
112 |           of the following places: within a NOTICE text file distributed
113 |           as part of the Derivative Works; within the Source form or
114 |           documentation, if provided along with the Derivative Works; or,
115 |           within a display generated by the Derivative Works, if and
116 |           wherever such third-party notices normally appear. The contents
117 |           of the NOTICE file are for informational purposes only and
118 |           do not modify the License. You may add Your own attribution
119 |           notices within Derivative Works that You distribute, alongside
120 |           or as an addendum to the NOTICE text from the Work, provided
121 |           that such additional attribution notices cannot be construed
122 |           as modifying the License.
123 | 
124 |       You may add Your own copyright statement to Your modifications and
125 |       may provide additional or different license terms and conditions
126 |       for use, reproduction, or distribution of Your modifications, or
127 |       for any such Derivative Works as a whole, provided Your use,
128 |       reproduction, and distribution of the Work otherwise complies with
129 |       the conditions stated in this License.
130 | 
131 |    5. Submission of Contributions. Unless You explicitly state otherwise,
132 |       any Contribution intentionally submitted for inclusion in the Work
133 |       by You to the Licensor shall be under the terms and conditions of
134 |       this License, without any additional terms or conditions.
135 |       Notwithstanding the above, nothing herein shall supersede or modify
136 |       the terms of any separate license agreement you may have executed
137 |       with Licensor regarding such Contributions.
138 | 
139 |    6. Trademarks. This License does not grant permission to use the trade
140 |       names, trademarks, service marks, or product names of the Licensor,
141 |       except as required for reasonable and customary use in describing the
142 |       origin of the Work and reproducing the content of the NOTICE file.
143 | 
144 |    7. Disclaimer of Warranty. Unless required by applicable law or
145 |       agreed to in writing, Licensor provides the Work (and each
146 |       Contributor provides its Contributions) on an "AS IS" BASIS,
147 |       WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
148 |       implied, including, without limitation, any warranties or conditions
149 |       of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
150 |       PARTICULAR PURPOSE. You are solely responsible for determining the
151 |       appropriateness of using or redistributing the Work and assume any
152 |       risks associated with Your exercise of permissions under this License.
153 | 
154 |    8. Limitation of Liability. In no event and under no legal theory,
155 |       whether in tort (including negligence), contract, or otherwise,
156 |       unless required by applicable law (such as deliberate and grossly
157 |       negligent acts) or agreed to in writing, shall any Contributor be
158 |       liable to You for damages, including any direct, indirect, special,
159 |       incidental, or consequential damages of any character arising as a
160 |       result of this License or out of the use or inability to use the
161 |       Work (including but not limited to damages for loss of goodwill,
162 |       work stoppage, computer failure or malfunction, or any and all
163 |       other commercial damages or losses), even if such Contributor
164 |       has been advised of the possibility of such damages.
165 | 
166 |    9. Accepting Warranty or Additional Liability. While redistributing
167 |       the Work or Derivative Works thereof, You may choose to offer,
168 |       and charge a fee for, acceptance of support, warranty, indemnity,
169 |       or other liability obligations and/or rights consistent with this
170 |       License. However, in accepting such obligations, You may act only
171 |       on Your own behalf and on Your sole responsibility, not on behalf
172 |       of any other Contributor, and only if You agree to indemnify,
173 |       defend, and hold each Contributor harmless for any liability
174 |       incurred by, or claims asserted against, such Contributor by reason
175 |       of your accepting any such warranty or additional liability.
176 | 
177 |    END OF TERMS AND CONDITIONS
178 | 
179 |    APPENDIX: How to apply the Apache License to your work.
180 | 
181 |       To apply the Apache License to your work, attach the following
182 |       boilerplate notice, with the fields enclosed by brackets "[]"
183 |       replaced with your own identifying information. (Don't include
184 |       the brackets!)  The text should be enclosed in the appropriate
185 |       comment syntax for the file format. We also recommend that a
186 |       file or class name and description of purpose be included on the
187 |       same "printed page" as the copyright notice for easier
188 |       identification within third-party archives.
189 | 
190 |    Copyright [yyyy] [name of copyright owner]
191 | 
192 |    Licensed under the Apache License, Version 2.0 (the "License");
193 |    you may not use this file except in compliance with the License.
194 |    You may obtain a copy of the License at
195 | 
196 |        http://www.apache.org/licenses/LICENSE-2.0
197 | 
198 |    Unless required by applicable law or agreed to in writing, software
199 |    distributed under the License is distributed on an "AS IS" BASIS,
200 |    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
201 |    See the License for the specific language governing permissions and
202 |    limitations under the License.
203 | 
204 | MIT License
205 | 
206 | Copyright (c) 2017 Pavlos
207 | 
208 | Permission is hereby granted, free of charge, to any person obtaining a copy
209 | of this software and associated documentation files (the "Software"), to deal
210 | in the Software without restriction, including without limitation the rights
211 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
212 | copies of the Software, and to permit persons to whom the Software is
213 | furnished to do so, subject to the following conditions:
214 | 
215 | The above copyright notice and this permission notice shall be included in all
216 | copies or substantial portions of the Software.
217 | 
218 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
219 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
220 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
221 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
222 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
223 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
224 | SOFTWARE.
225 | 
226 | The MIT License (MIT)
227 | 
228 | Copyright (c) 2015 matthewearl
229 | 
230 | Permission is hereby granted, free of charge, to any person obtaining a copy
231 | of this software and associated documentation files (the "Software"), to deal
232 | in the Software without restriction, including without limitation the rights
233 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
234 | copies of the Software, and to permit persons to whom the Software is
235 | furnished to do so, subject to the following conditions:
236 | 
237 | The above copyright notice and this permission notice shall be included in all
238 | copies or substantial portions of the Software.
239 | 
240 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
241 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
242 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
243 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
244 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
245 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
246 | SOFTWARE.
247 | 


--------------------------------------------------------------------------------
/NOTICE:
--------------------------------------------------------------------------------
 1 | AWS Sagemaker Segmentation Example
 2 | Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
 3 | 
 4 | [gen.py]
 5 | Copyright [2018]-[2018] Amazon.com, Inc. or its affiliates. All Rights Reserved.
 6 | 
 7 | [segmentation.py]
 8 | Copyright [2018]-[2018] Amazon.com, Inc. or its affiliates. All Rights Reserved.
 9 | 
10 | [enet_full_res_deploy.py]
11 | Copyright [2018]-[2018] Amazon.com, Inc. or its affiliates. All Rights Reserved.
12 | 
13 | **********************
14 | THIRD PARTY COMPONENTS
15 | **********************
16 | This software includes third party software subject to the following copyrights:
17 | 
18 | - Functions for generating random affine transformations for data generation from https://github.com/matthewearl/deep-anpr - Copyright (c) 2015 matthewearl
19 | - Functions for constructing ENet architecture, adapted/ported from Keras implementation from https://github.com/PavlosMelissinos/enet-keras - Copyright (c) 2017 Pavlos
20 | 
21 | The licenses for these third party components are included in the LICENSE file.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## AWS Sagemaker Segmentation Example
2 | 
3 | A tutorial on how to build, train, and deploy the U-Net and ENet CNN architectures for per-pixel binary segmentation on SageMaker.
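
For binary masks, `enet_full_res_deploy.py` defines a Dice coefficient (`dice_coef`) with a `+ 1` smoothing term in both numerator and denominator. The sketch below restates that formula in plain Python so it can be checked without MXNet. The nested-list input layout, the `smooth` parameter name, and the choice to define the loss as the negated coefficient are assumptions made for illustration; the repository's `dice_coef_loss` reduces over axis 1 only.

```python
def _flatten(x):
    """Yield every scalar of a nested list as a float (one sample: C x H x W)."""
    if isinstance(x, (list, tuple)):
        for item in x:
            yield from _flatten(item)
    else:
        yield float(x)


def dice_coef(y_true, y_pred, smooth=1.0):
    """Return one Dice score per sample for batches laid out as (N, C, H, W)."""
    scores = []
    for true_sample, pred_sample in zip(y_true, y_pred):
        t = list(_flatten(true_sample))
        p = list(_flatten(pred_sample))
        # Overlap between the label mask and the predicted mask.
        intersection = sum(a * b for a, b in zip(t, p))
        # The smoothing term keeps the ratio defined when both masks are empty.
        scores.append((2.0 * intersection + smooth) / (sum(t) + sum(p) + smooth))
    return scores


def dice_coef_loss(y_true, y_pred, smooth=1.0):
    """Negated Dice score, so that minimizing the loss maximizes overlap."""
    return [-score for score in dice_coef(y_true, y_pred, smooth)]
```

A perfect prediction scores 1.0, and the score falls toward 0 as the overlap shrinks, for example `dice_coef([[[[1, 1], [0, 0]]]], [[[[1, 1], [0, 0]]]])` returns `[1.0]`.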
4 | 
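The ENet bottlenecks in `enet_full_res_deploy.py` are regularized with a custom MXNet operator, `SpatialDropout`, which zeroes whole feature maps rather than individual activations and rescales the survivors by `1 / (1 - p)`. The sketch below shows the same idea in plain Python for a single sample. The function name `spatial_dropout`, the `rng` and `training` parameters, and the nested-list `(C, H, W)` layout are hypothetical and not part of the repository, where one mask of shape `(1, num_filters, 1, 1)` is shared across the batch.

```python
import random


def spatial_dropout(x, p, rng=None, training=True):
    """Drop whole channels of one (C, H, W) sample with probability p.

    Returns (output, mask). Kept channels are scaled by 1 / (1 - p) so the
    expected activation matches the input (inverted dropout).
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    num_channels = len(x)
    if not training:
        # Inference: pass the input through unchanged.
        return [[row[:] for row in channel] for channel in x], [1.0] * num_channels
    if rng is None:
        rng = random.Random()
    # One uniform draw per channel; the channel is kept when the draw exceeds p.
    mask = [1.0 if rng.random() > p else 0.0 for _ in range(num_channels)]
    scale = 1.0 / (1.0 - p)
    out = [
        [[value * keep * scale for value in row] for row in channel]
        for channel, keep in zip(x, mask)
    ]
    return out, mask
```

Because the forward pass multiplies by `mask / (1 - p)`, the gradient with respect to the input is the upstream gradient multiplied by the same factor.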
5 | ## License
6 | 
7 | This library is licensed under the Apache 2.0 License. 
8 | 


--------------------------------------------------------------------------------
/enet_full_res_deploy.py:
--------------------------------------------------------------------------------
  1 | """
  2 | Copyright [2018]-[2018] Amazon.com, Inc. or its affiliates. All Rights Reserved.
  3 | 
  4 | Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at
  5 | 
  6 |     http://aws.amazon.com/apache2.0/
  7 | 
  8 | or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
  9 | 
 10 | Portions Copyright (c) 2017 Pavlos. Please see LICENSE for applicable license terms and NOTICE for applicable notices.
 11 | """
 12 | from __future__ import print_function
 13 | import mxnet as mx
 14 | from mxnet import ndarray as F
 15 | from mxnet.io import DataBatch, DataDesc
 16 | import os
 17 | import numpy as np
 18 | import logging
 19 | import urllib
 20 | import zipfile
 21 | import tarfile
 22 | import shutil
 23 | import gzip
 24 | from glob import glob
 25 | import random
 26 | import json
 27 | 
 28 | ###############################
 29 | ###     Loss Functions      ###
 30 | ###############################
 31 | 
 32 | def dice_coef(y_true, y_pred):
 33 |     intersection = mx.sym.sum(mx.sym.broadcast_mul(y_true, y_pred), axis=(1, 2, 3))
 34 |     return mx.sym.broadcast_div((2. * intersection + 1.),(mx.sym.sum(y_true, axis=(1, 2, 3)) + mx.sym.sum(y_pred, axis=(1, 2, 3)) + 1.))
 35 | 
 36 | def dice_coef_loss(y_true, y_pred):
 37 |     intersection = mx.sym.sum(mx.sym.broadcast_mul(y_true, y_pred), axis=1, )
 38 |     return -mx.sym.broadcast_div((2. * intersection + 1.),(mx.sym.broadcast_add(mx.sym.sum(y_true, axis=1), mx.sym.sum(y_pred, axis=1)) + 1.))
 39 | 
 40 | ###############################
 41 | ###     ENet Architecture   ###
 42 | ###############################
 43 | 
 44 | class SpatialDropout(mx.operator.CustomOp):
 45 |     def __init__(self, p, num_filters, ctx):
 46 |         self._p = float(p)
 47 |         self._num_filters = int(num_filters)
 48 |         self._ctx = ctx
 49 |         self._spatial_dropout_mask = F.ones(shape=(1, 1, 1, 1), ctx=self._ctx)
 50 |         
 51 |     def forward(self, is_train, req, in_data, out_data, aux):
 52 |         x = in_data[0]
 53 |         if is_train:
 54 |             self._spatial_dropout_mask = F.broadcast_greater(
 55 |                 F.random_uniform(low=0, high=1, shape=(1, self._num_filters, 1, 1), ctx=self._ctx), 
 56 |                 F.ones(shape=(1, self._num_filters, 1, 1), ctx=self._ctx) * self._p,
 57 |                 ctx=self._ctx
 58 |             )
 59 |             y = F.broadcast_mul(x, self._spatial_dropout_mask, ctx=self._ctx) / (1-self._p)
 60 |             self.assign(out_data[0], req[0], y)
 61 |         else:
 62 |             self.assign(out_data[0], req[0], x)
 63 |             
 64 |     def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
 65 |         dy = out_grad[0]
 66 |         dx = F.broadcast_mul(self._spatial_dropout_mask, dy) / (1 - self._p)  # same 1/(1-p) scaling as forward
 67 |         self.assign(in_grad[0], req[0], dx)
 68 |         
 69 | @mx.operator.register('spatial_dropout')
 70 | class SpatialDropoutProp(mx.operator.CustomOpProp):
 71 |     def __init__(self, p, num_filters):
 72 |         super(SpatialDropoutProp, self).__init__(True)
 73 |         self._p = p
 74 |         self._num_filters = num_filters
 75 |         
 76 |     def infer_shape(self, in_shapes):
 77 |         data_shape = in_shapes[0]
 78 |         output_shape = data_shape
 79 |         # Return three tuples: input shapes, output shapes, and auxiliary-state shapes.
 80 |         return (data_shape,), (output_shape,), ()
 81 |             
 82 |     def create_operator(self, ctx, in_shape, in_dtypes):
 83 |         return SpatialDropout(self._p, self._num_filters, ctx)
 84 | 
 85 | #begin third party code
 86 | #third party code modified in porting from Keras to MXNet
 87 | def same_padding(inp_dims, outp_dims, strides, kernel):
 88 |     inp_h, inp_w = inp_dims[1:]
 89 |     outp_h, outp_w = outp_dims
 90 |     kernel_h, kernel_w = kernel
 91 |     pad_along_height = max((outp_h - 1) * strides[0] + kernel_h - inp_h, 0)
 92 |     pad_along_width = max((outp_w - 1) * strides[1] + kernel_w - inp_w, 0)
 93 |     pad_top = pad_along_height // 2          
 94 |     pad_bottom = pad_along_height - pad_top  
 95 |     pad_left = pad_along_width // 2          
 96 |     pad_right = pad_along_width - pad_left   
 97 |     return (0,0,0,0,pad_top,pad_bottom,pad_left, pad_right)
 98 | 
 99 | def initial_block(inp, inp_dims, outp_dims, nb_filter=13, nb_row=3, nb_col=3, strides=(2, 2)):
100 |     
101 |     padded_inp = mx.sym.pad(inp, mode='constant',
102 |                             pad_width=same_padding(inp_dims, outp_dims, strides, kernel=(nb_row, nb_col)), name='init_pad')
103 |     conv = mx.sym.Convolution(padded_inp, num_filter=nb_filter, kernel=(nb_row, nb_col), stride=strides, name='init_conv')
104 |     max_pool = mx.sym.Pooling(inp, kernel=(2,2), stride=(2,2), pool_type='max', name='init_pool')
105 |     merged = mx.sym.concat(*[conv, max_pool], dim=1, name='init_concat')
106 |     return merged
107 | 
108 | def encoder_bottleneck(inp, inp_filter, output, name, internal_scale=4, asymmetric=0, dilated=0, downsample=False, dropout_rate=0.1):
109 |     # main branch
110 |     internal = output // internal_scale
111 |     encoder = inp
112 | 
113 |     # 1x1
114 |     input_stride = 2 if downsample else 1  # the 1st 1x1 projection is replaced with a 2x2 convolution when downsampling
115 |     encoder = mx.sym.Convolution(encoder, num_filter=internal, kernel=(input_stride, input_stride),
116 |                                 stride=(input_stride, input_stride), no_bias=True, name="conv1_%i"%name)
117 |     # Batch normalization + PReLU
118 |     encoder = mx.sym.BatchNorm(encoder, momentum=0.1, name="bn1_%i"%name)
119 |     encoder = mx.sym.LeakyReLU(encoder, act_type='prelu', name='prelu1_%i'%name)
120 |     # conv
121 |     if not asymmetric and not dilated:
122 |         encoder = mx.sym.Convolution(encoder, num_filter=internal, kernel=(3,3), pad=(1,1), name="conv2_%i"%name)
123 |     elif asymmetric:
124 |         encoder = mx.sym.Convolution(encoder, num_filter=internal, kernel=(1, asymmetric),
125 |                                      pad=(0, asymmetric// 2), no_bias=True, name="conv3_%i"%name)
126 |         encoder = mx.sym.Convolution(encoder, num_filter=internal, kernel=(asymmetric, 1),
127 |                                      pad=(asymmetric// 2, 0), name="conv4_%i"%name)
128 |     elif dilated:
129 |         encoder = mx.sym.Convolution(encoder, num_filter=internal, kernel=(3,3),
130 |                                      dilate=(dilated, dilated), pad=((3+(dilated-1)*2)// 2, (3+(dilated-1)*2)// 2),
131 |                                      name="conv2_%i"%name)
132 |     else:
133 |         raise Exception("You shouldn't be here")
134 |     encoder = mx.sym.BatchNorm(encoder, momentum=0.1, name='bn2_%i'%name)
135 |     encoder = mx.sym.LeakyReLU(encoder, act_type='prelu', name='prelu2_%i'%name)    
136 |     # 1x1
137 |     encoder = mx.sym.Convolution(encoder, num_filter=output, kernel=(1,1), no_bias=True, name="conv5_%i"%name)
138 |     encoder = mx.sym.BatchNorm(encoder, momentum=0.1, name='bn3_%i'%name)
139 |     encoder = mx.sym.Custom(encoder, op_type='spatial_dropout', name='spatial_dropout_%i' % name,
140 |                             p = dropout_rate, num_filters = output)
141 | #     encoder = SpatialDropout(encoder, output, dropout_rate, train, name)
142 |     other = inp
143 |     # other branch
144 |     if downsample:
145 |         other = mx.sym.Pooling(other, kernel=(2,2), stride=(2,2), pool_type='max', name='pool1_%i'%name)
146 |         other = mx.sym.transpose(other, axes=(0,2,1,3), name='trans1_%i'%name)
147 |         pad_feature_maps = output - inp_filter
148 |         other = mx.sym.pad(other, mode='constant', pad_width=(0,0,0,0,0,pad_feature_maps,0,0), name='pad1_%i'%name)
149 |         other = mx.sym.transpose(other, axes=(0,2,1,3), name='trans2_%i'%name)
150 |     encoder = mx.sym.broadcast_add(encoder, other, name='add1_%i'%name)
151 |     encoder = mx.sym.LeakyReLU(encoder, act_type='prelu', name='prelu3_%i'%name)
152 |     return encoder
153 | 
154 | def build_encoder(inp, inp_dims, dropout_rate=0.01):
155 |     enet = initial_block(inp, inp_dims=inp_dims, outp_dims=(inp_dims[1]//2, inp_dims[2]//2))
156 |     enet = mx.sym.BatchNorm(enet, momentum=0.1, name='bn_0')
157 |     enet = mx.sym.LeakyReLU(enet, act_type='prelu', name='prelu_0')
158 |     enet = encoder_bottleneck(enet, 13+inp_dims[0], 64, downsample=True, dropout_rate=dropout_rate, name=1)  # bottleneck 1.0
159 |     for n in range(4):
160 |         enet = encoder_bottleneck(enet, 64, 64, dropout_rate=dropout_rate, name=n+10)  # bottleneck 1.i
161 |     
162 |     enet = encoder_bottleneck(enet, 64, 128, downsample=True, name=19)  # bottleneck 2.0
163 |     # bottleneck 2.x and 3.x
164 |     for n in range(2):
165 |         enet = encoder_bottleneck(enet, 128, 128, name=n*10+20)  # bottleneck 2.1
166 |         enet = encoder_bottleneck(enet, 128, 128, dilated=2, name=n*10+21)  # bottleneck 2.2
167 |         enet = encoder_bottleneck(enet, 128, 128, asymmetric=5, name=n*10+22)  # bottleneck 2.3
168 |         enet = encoder_bottleneck(enet, 128, 128, dilated=4, name=n*10+23)  # bottleneck 2.4
169 |         enet = encoder_bottleneck(enet, 128, 128, name=n*10+24)  # bottleneck 2.5
170 |         enet = encoder_bottleneck(enet, 128, 128, dilated=8, name=n*10+25)  # bottleneck 2.6
171 |         enet = encoder_bottleneck(enet, 128, 128, asymmetric=5, name=n*10+26)  # bottleneck 2.7
172 |         enet = encoder_bottleneck(enet, 128, 128, dilated=16, name=n*10+27)  # bottleneck 2.8
173 |     return enet
174 | 
175 | def decoder_bottleneck(encoder, inp_filter, output, upsample=False, upsample_dims=None, reverse_module=False, name=0):
176 |     internal = output // 4
177 |     
178 |     x = mx.sym.Convolution(encoder, num_filter=internal, kernel=(1,1), no_bias=True, name="conv6_%i"%name)
179 |     x = mx.sym.BatchNorm(x, momentum=0.1, name='bn4_%i'%name)
180 |     x = mx.sym.Activation(x, act_type='relu', name='relu1_%i'%name)
181 |     if not upsample:
182 |         x = mx.sym.Convolution(x, num_filter=internal, kernel=(3,3), pad=(1,1), no_bias=False, name="conv7_%i"%name)
183 |     else:
184 |         x = mx.sym.Deconvolution(x, num_filter=internal, kernel=(3, 3), stride=(2, 2), target_shape=upsample_dims, name="dconv1_%i"%name)
185 |     x = mx.sym.BatchNorm(x, momentum=0.1, name='bn5_%i'%name)
186 |     x = mx.sym.Activation(x, act_type='relu', name='relu2_%i'%name)
187 |     x = mx.sym.Convolution(x, num_filter=output, kernel=(1,1), no_bias=True, name="conv8_%i"%name)
188 |     other = encoder
189 |     if inp_filter != output or upsample:
190 |         other = mx.sym.Convolution(other, num_filter=output, kernel=(1,1), no_bias=True, name="conv9_%i"%name)
191 |         other = mx.sym.BatchNorm(other, momentum=0.1, name='bn6_%i'%name)
192 |         if upsample and reverse_module is not False:
193 |             other = mx.sym.UpSampling(other, scale=2, sample_type='nearest', name="upsample1_%i"%name)        
194 |     if upsample and reverse_module is False:
195 |         decoder = x
196 |     else:
197 |         x = mx.sym.BatchNorm(x, momentum=0.1, name='bn7_%i'%name)
198 |         decoder = mx.sym.broadcast_add(x, other, name='add2_%i'%name)
199 |         decoder = mx.sym.Activation(decoder, act_type='relu', name='relu3_%i'%name)
200 |     return decoder
201 | 
202 | def build_decoder(encoder, nc, output_shape=(3, 512, 512)):
203 |     enet = decoder_bottleneck(encoder, 128, 64, upsample=True, upsample_dims=(output_shape[1]//4, output_shape[2]//4), reverse_module=True, name=20)  # bottleneck 4.0
204 |     enet = decoder_bottleneck(enet, 64, 64, name=21)  # bottleneck 4.1
205 |     enet = decoder_bottleneck(enet, 64, 64, name=22)  # bottleneck 4.2
206 |     enet = decoder_bottleneck(enet, 64, 16, upsample=True, upsample_dims=(output_shape[1]//2, output_shape[2]//2), reverse_module=True, name=23)  # bottleneck 5.0
207 |     enet = decoder_bottleneck(enet, 16, 16, name=24)  # bottleneck 5.1
208 | 
209 |     enet = mx.sym.Deconvolution(enet, num_filter=nc, kernel=(2, 2), stride=(2, 2), target_shape=(output_shape[1],output_shape[2]), name='dconv2')
210 |     return enet
211 | 
212 | def build_enet(inp_dims):
213 |     data = mx.sym.Variable(name='data')
214 |     label = mx.sym.Variable(name='label')
215 |     label = mx.sym.flatten(label, name='flat_label')
216 |     encoder = build_encoder(data, inp_dims=inp_dims)
217 |     decoded = build_decoder(encoder, 1, output_shape=inp_dims)
218 |     mask = mx.sym.sigmoid(decoded, name='mask_sigmoid')
219 |     sigmoid = mx.sym.Flatten(mask)
220 |     loss = mx.sym.MakeLoss(dice_coef_loss(label, sigmoid), normalization='batch', name="dice_loss")
221 |     mask_output = mx.sym.BlockGrad(mask, 'mask')
222 |     out = mx.sym.Group([loss, mask_output])
223 |     return out
224 | #end ported third party code
225 | 
226 | ###############################
227 | ###     Hosting Methods     ###
228 | ###############################
229 | 
230 | def model_fn(model_dir):
231 |     _, arg_params, aux_params = mx.model.load_checkpoint('%s/model' % model_dir, 0)
232 |     batch_size = 1
233 |     data_shape = (batch_size, 1, 720, 720)
234 |     sym = build_enet(data_shape[1:])
235 |     net = mx.mod.Module(sym, data_names=('data',), label_names=('label',))
236 |     net.bind(data_shapes=[['data', data_shape]], label_shapes=[['label', data_shape]], for_training=False)
237 |     net.set_params(arg_params, aux_params)
238 |     return net
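
The ENet initial block pads its input so that a stride-2 convolution with a 3x3 kernel produces exactly half the input size. The arithmetic can be sketched in one dimension as below. This is a minimal illustration in plain Python, assuming no dilation; `same_padding_1d` and `conv_output_size` are hypothetical helper names and are not part of this repository.

```python
def same_padding_1d(inp, outp, stride, kernel):
    """Return (pad_before, pad_after) so a strided convolution maps inp -> outp."""
    total = max((outp - 1) * stride + kernel - inp, 0)
    before = total // 2
    return before, total - before


def conv_output_size(inp, kernel, stride, pad_before, pad_after):
    """Output length of a convolution without dilation."""
    return (inp + pad_before + pad_after - kernel) // stride + 1


# 720-pixel input, as bound in model_fn, halved by the initial block
pads = same_padding_1d(inp=720, outp=360, stride=2, kernel=3)
out = conv_output_size(720, 3, 2, *pads)
print(pads, out)
```

For an even input size only one extra pixel is needed, and it lands on the trailing edge (bottom or right).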


--------------------------------------------------------------------------------
/gen.py:
--------------------------------------------------------------------------------
  1 | """
  2 | Copyright [2018]-[2018] Amazon.com, Inc. or its affiliates. All Rights Reserved.
  3 | 
  4 | Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at
  5 | 
  6 |     http://aws.amazon.com/apache2.0/
  7 | 
  8 | or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
  9 | 
 10 | Portions copyright Copyright (c) 2015 matthewearl. Please see LICENSE.txt for applicable license terms and NOTICE.txt for applicable notices.
 11 | """
 12 | 
 13 | import numpy as np
  14 | from sklearn.datasets import make_blobs
 15 | import cv2
 16 | import math
 17 | 
 18 | def generate_background(shape):
 19 |     num_clusters = np.random.randint(1, 3)
 20 |     cluster_values = np.random.choice(np.concatenate([np.arange(75,125),np.arange(200, 255)]),
 21 |                                       size=num_clusters, replace=True)
 22 |     X, y = make_blobs(300000, centers=num_clusters)
 23 |     X = np.clip(((X / 10) * shape[0]).astype(int), 0, shape[0]-1)
 24 |     bg = (np.random.rand(shape[0], shape[1])*255 + np.random.normal(size=shape)).astype(np.uint8)
 25 |     for i, x in enumerate(X):
 26 |         bg[x[0],x[1]] = cluster_values[(y[i]-1)]
 27 |     return bg
 28 | 
 29 | #Begin third party code
 30 | def euler_to_mat(yaw, pitch, roll):
 31 |     # Rotate clockwise about the Y-axis
 32 |     c, s = math.cos(yaw), math.sin(yaw)
 33 |     M = np.matrix([[  c, 0.,  s],
 34 |                       [ 0., 1., 0.],
 35 |                       [ -s, 0.,  c]])
 36 | 
 37 |     # Rotate clockwise about the X-axis
 38 |     c, s = math.cos(pitch), math.sin(pitch)
 39 |     M = np.matrix([[ 1., 0., 0.],
 40 |                       [ 0.,  c, -s],
 41 |                       [ 0.,  s,  c]]) * M
 42 | 
 43 |     # Rotate clockwise about the Z-axis
 44 |     c, s = math.cos(roll), math.sin(roll)
 45 |     M = np.matrix([[  c, -s, 0.],
 46 |                       [  s,  c, 0.],
 47 |                       [ 0., 0., 1.]]) * M
 48 |     return M
 49 | 
 50 | 
 51 | def make_affine_transform(from_shape, to_shape, 
 52 |                           min_scale, max_scale,
 53 |                           scale_variation=1.0,
 54 |                           rotation_variation=1.0,
 55 |                           translation_variation=1.0):
 56 |     out_of_bounds = False
 57 | 
 58 |     from_size = np.array([[from_shape[1], from_shape[0]]]).T
 59 |     to_size = np.array([[to_shape[1], to_shape[0]]]).T
 60 | 
 61 |     scale = np.random.uniform((min_scale + max_scale) * 0.5 -
 62 |                            (max_scale - min_scale) * 0.5 * scale_variation,
 63 |                            (min_scale + max_scale) * 0.5 +
 64 |                            (max_scale - min_scale) * 0.5 * scale_variation)
 65 |     if scale > max_scale or scale < min_scale:
 66 |         out_of_bounds = True
 67 |     roll = np.random.uniform(-0.3, 0.3) * rotation_variation
 68 |     pitch = np.random.uniform(-0.2, 0.2) * rotation_variation
 69 |     yaw = np.random.uniform(-1.2, 1.2) * rotation_variation
 70 | 
 71 |     # Compute a bounding box on the skewed input image (`from_shape`).
 72 |     M = euler_to_mat(yaw, pitch, roll)[:2, :2]
 73 |     h, w = from_shape
 74 |     corners = np.matrix([[-w, +w, -w, +w],
 75 |                             [-h, -h, +h, +h]]) * 0.5
 76 |     skewed_size = np.array(np.max(M * corners, axis=1) -
 77 |                               np.min(M * corners, axis=1))
 78 | 
 79 |     # Set the scale as large as possible such that the skewed and scaled shape
 80 |     # is less than or equal to the desired ratio in either dimension.
 81 |     scale *= np.min(to_size / skewed_size)
 82 | 
 83 |     # Set the translation such that the skewed and scaled image falls within
 84 |     # the output shape's bounds.
 85 |     trans = (np.random.random((2,1)) - 0.5) * translation_variation
 86 |     trans = ((2.0 * trans) ** 5.0) / 2.0
 87 |     if np.any(trans < -0.5) or np.any(trans > 0.5):
 88 |         out_of_bounds = True
 89 |     trans = (to_size - skewed_size * scale) * trans
 90 | 
 91 |     center_to = to_size / 2.
 92 |     center_from = from_size / 2.
 93 | 
 94 |     M = euler_to_mat(yaw, pitch, roll)[:2, :2]
 95 |     M *= scale
 96 |     M = np.hstack([M, trans + center_to - M * center_from])
 97 | 
 98 |     return M, out_of_bounds
 99 | 
100 | #begin modified third party code
101 | def generate_sample(logo, logo_mask, bg_shape=(256, 256)):
102 |     #generate num logos to use
103 |     num_logos = np.random.randint(4, 15)
104 |     logos = [logo.copy() for i in range(num_logos)]
105 |     logo_masks = [logo_mask.copy() for i in range(num_logos)]
106 |     #generate random structured background
107 |     background = generate_background(bg_shape)
108 |     #generate noisy logo
109 |     for i, lg in enumerate(logos):
110 |         lg[lg!=0] = np.round((lg[lg!=0] + np.random.normal(scale=5.0, size=lg.shape)[lg!=0])).astype(np.uint8)
 111 |         #generate matrix for random affine transformation
112 |         M, out_of_bounds = make_affine_transform(
113 |                             from_shape=lg.shape,
114 |                             to_shape=background.shape,
115 |                             min_scale=0.1,
116 |                             max_scale=0.5,
117 |                             rotation_variation=2.0,
118 |                             scale_variation=1.0,
119 |                             translation_variation=1.3)
 120 |         # apply transformation to both logo and mask
121 |         lg = cv2.warpAffine(lg, M, (background.shape[1], background.shape[0]))
122 |         logo_masks[i] = cv2.warpAffine(logo_masks[i], M, (background.shape[1], background.shape[0]))
 123 |         # insert into background image
124 |         background[logo_masks[i] != 0] = lg[logo_masks[i]!=0]
125 |     #merge masks into one
126 |     merged_masks = sum(logo_masks)
127 |     merged_masks = (merged_masks!=0).astype(np.uint8)
128 |     #shift colors in background
 129 |     shift = np.random.randint(0, 255)  # uint8 addition wraps around modulo 256
 130 |     return background + shift, merged_masks
131 | #end adapted/copied third party code
132 | 
133 | 
134 | def generate_dataset(logo, logo_mask, num_samples=1000, shape=(256, 256)):
135 |     X = []
136 |     Y = []
137 |     for i in range(num_samples):
138 |         x, y = generate_sample(logo, logo_mask, bg_shape=shape)
139 |         X.append(x)
140 |         Y.append(y)
141 |     return np.stack(X), np.stack(Y)
142 | 
143 | 
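
`euler_to_mat` composes three axis rotations with `np.matrix`, which NumPy now discourages for new code. The same composition can be sketched with plain arrays and the `@` operator. This is an illustrative variant only; `euler_to_array` is a hypothetical name, not a function in this repository.

```python
import math

import numpy as np


def euler_to_array(yaw, pitch, roll):
    """Compose Y (yaw), X (pitch), Z (roll) rotations as a plain 3x3 ndarray."""
    c, s = math.cos(yaw), math.sin(yaw)
    rot_y = np.array([[c, 0., s],
                      [0., 1., 0.],
                      [-s, 0., c]])
    c, s = math.cos(pitch), math.sin(pitch)
    rot_x = np.array([[1., 0., 0.],
                      [0., c, -s],
                      [0., s, c]])
    c, s = math.cos(roll), math.sin(roll)
    rot_z = np.array([[c, -s, 0.],
                      [s, c, 0.],
                      [0., 0., 1.]])
    # Same order as euler_to_mat: yaw is applied first, then pitch, then roll.
    return rot_z @ rot_x @ rot_y


R = euler_to_array(0.3, -0.1, 0.2)
```

The result is a proper rotation (orthonormal, determinant 1). `make_affine_transform` keeps only the top-left 2x2 block, which is why the projected logo looks skewed rather than purely rotated.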


--------------------------------------------------------------------------------
/segmentation.py:
--------------------------------------------------------------------------------
  1 | 
  2 | """
  3 | Copyright [2018]-[2018] Amazon.com, Inc. or its affiliates. All Rights Reserved.
  4 | 
  5 | Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. A copy of the License is located at
  6 | 
  7 |     http://aws.amazon.com/apache2.0/
  8 | 
  9 | or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
 10 | 
 11 | Portions copyright Copyright (c) 2017 Pavlo. Please see LICENSE.txt for applicable license terms and NOTICE.txt for applicable notices.
 12 | """
 13 | from __future__ import print_function
 14 | import mxnet as mx
 15 | from mxnet import ndarray as F
 16 | from mxnet.io import DataBatch, DataDesc
 17 | import os
 18 | import numpy as np
 19 | import logging
 20 | import urllib
 21 | import zipfile
 22 | import tarfile
 23 | import shutil
 24 | import gzip
 25 | from glob import glob
 26 | import random
 27 | import json
 28 | 
 29 | ###############################
 30 | ###     Loss Functions      ###
 31 | ###############################
 32 | 
 33 | def dice_coef(y_true, y_pred):
 34 |     intersection = mx.sym.sum(mx.sym.broadcast_mul(y_true, y_pred), axis=(1, 2, 3))
 35 |     return mx.sym.broadcast_div((2. * intersection + 1.),(mx.sym.sum(y_true, axis=(1, 2, 3)) + mx.sym.sum(y_pred, axis=(1, 2, 3)) + 1.))
 36 | 
 37 | def dice_coef_loss(y_true, y_pred):
 38 |     intersection = mx.sym.sum(mx.sym.broadcast_mul(y_true, y_pred), axis=1)
 39 |     return -mx.sym.broadcast_div((2. * intersection + 1.),(mx.sym.broadcast_add(mx.sym.sum(y_true, axis=1), mx.sym.sum(y_pred, axis=1)) + 1.))
 40 | 
 41 | ###############################
 42 | ###     UNet Architecture   ###
 43 | ###############################
 44 | 
 45 | def conv_block(inp, num_filter, kernel, pad, block, conv_block):
 46 |     conv = mx.sym.Convolution(inp, num_filter=num_filter, kernel=kernel, pad=pad, name='conv%i_%i' % (block, conv_block))
 47 |     conv = mx.sym.BatchNorm(conv, name='bn%i_%i' % (block, conv_block))
 48 |     conv = mx.sym.Activation(conv, act_type='relu', name='relu%i_%i' % (block, conv_block))
 49 |     return conv
 50 | 
 51 | def down_block(inp, num_filter, kernel, pad, block, pool=True):
 52 |     conv = conv_block(inp, num_filter, kernel, pad, block, 1)
 53 |     conv = conv_block(conv, num_filter, kernel, pad, block, 2)
 54 |     if pool:
 55 |         pool = mx.sym.Pooling(conv, kernel=(2,2), stride=(2,2), pool_type='max', name='pool_%i' % block)
 56 |         return pool, conv
 57 |     return conv
 58 | 
 59 | def down_branch(inp):
 60 |     pool1, conv1 = down_block(inp, num_filter=32, kernel=(3,3), pad=(1,1), block=1)
 61 |     pool2, conv2 = down_block(pool1, num_filter=64, kernel=(3,3), pad=(1,1), block=2)
 62 |     pool3, conv3 = down_block(pool2, num_filter=128, kernel=(3,3), pad=(1,1), block=3)
 63 |     pool4, conv4 = down_block(pool3, num_filter=256, kernel=(3,3), pad=(1,1), block=4)
 64 |     conv5 = down_block(pool4, num_filter=512, kernel=(3,3), pad=(1,1), block=5, pool=False)
 65 |     return [conv5, conv4, conv3, conv2, conv1]
 66 | 
 67 | def up_block(inp, down_feature, num_filter, kernel, pad, block):
 68 | 
 69 |     trans_conv = mx.sym.Deconvolution(inp, num_filter=num_filter, kernel=(2,2), stride=(2,2), no_bias=True,
 70 |                                       name='trans_conv_%i' % block)
 71 |     up = mx.sym.concat(*[trans_conv, down_feature], dim=1, name='concat_%i' % block)
 72 |     conv = conv_block(up, num_filter, kernel, pad, block, 1)
 73 |     conv = conv_block(conv, num_filter, kernel, pad, block, 2)
 74 |     return conv
 75 | 
 76 | def up_branch(down_features):
 77 |     conv6 = up_block(down_features[0], down_features[1], num_filter=256, kernel=(3,3), pad=(1,1), block=6)
 78 |     conv7 = up_block(conv6, down_features[2], num_filter=128, kernel=(3,3), pad=(1,1), block=7)
 79 |     conv8 = up_block(conv7, down_features[3], num_filter=64, kernel=(3,3), pad=(1,1), block=8)
 80 |     conv9 = up_block(conv8, down_features[4], num_filter=64, kernel=(3,3), pad=(1,1), block=9)
 81 |     conv10 = mx.sym.Convolution(conv9, num_filter=1, kernel=(1,1), name='conv10_1')
 82 |     return conv10
 83 | 
 84 | def build_unet():
 85 |     data = mx.sym.Variable(name='data')
 86 |     label = mx.sym.Variable(name='label')
 87 |     label = mx.sym.Flatten(label, name='label_flatten')
 88 | 
 89 |     down_features = down_branch(data)
 90 |     decoded = up_branch(down_features)
 91 |     decoded = mx.sym.sigmoid(decoded, name='softmax')
 92 |     net = mx.sym.Flatten(decoded)
 93 |     loss = mx.sym.MakeLoss(dice_coef_loss(label, net), normalization='batch')
 94 |     mask_output = mx.sym.BlockGrad(decoded, 'mask')
 95 |     out = mx.sym.Group([loss, mask_output])
 96 |     return out
 97 | 
 98 | ###############################
 99 | ###     ENet Architecture   ###
100 | ###############################
101 | 
102 | class SpatialDropout(mx.operator.CustomOp):
103 |     def __init__(self, p, num_filters, ctx):
104 |         self._p = float(p)
105 |         self._num_filters = int(num_filters)
106 |         self._ctx = ctx
107 |         self._spatial_dropout_mask = F.ones(shape=(1, 1, 1, 1), ctx=self._ctx)
108 |         
109 |     def forward(self, is_train, req, in_data, out_data, aux):
110 |         x = in_data[0]
111 |         if is_train:
112 |             self._spatial_dropout_mask = F.broadcast_greater(
113 |                 F.random_uniform(low=0, high=1, shape=(1, self._num_filters, 1, 1), ctx=self._ctx), 
114 |                 F.ones(shape=(1, self._num_filters, 1, 1), ctx=self._ctx) * self._p,
115 |                 ctx=self._ctx
116 |             )
117 |             y = F.broadcast_mul(x, self._spatial_dropout_mask, ctx=self._ctx) / (1-self._p)
118 |             self.assign(out_data[0], req[0], y)
119 |         else:
120 |             self.assign(out_data[0], req[0], x)
121 |             
122 |     def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
123 |         dy = out_grad[0]
124 |         dx = F.broadcast_mul(self._spatial_dropout_mask, dy) / (1 - self._p)  # match the 1/(1-p) scaling applied in forward
125 |         self.assign(in_grad[0], req[0], dx)
126 |         
127 | @mx.operator.register('spatial_dropout')
128 | class SpatialDropoutProp(mx.operator.CustomOpProp):
129 |     def __init__(self, p, num_filters):
130 |         super(SpatialDropoutProp, self).__init__(True)
131 |         self._p = p
132 |         self._num_filters = num_filters
133 |         
134 |     def infer_shape(self, in_shapes):
135 |         data_shape = in_shapes[0]
136 |         output_shape = data_shape
137 |         # return 3 tuples: input shapes, output shapes, and aux data shapes.
138 |         return (data_shape,), (output_shape,), ()
139 |             
140 |     def create_operator(self, ctx, in_shape, in_dtypes):
141 |         return SpatialDropout(self._p, self._num_filters, ctx)
142 | 
143 | #begin third party code
144 | #third party code has been modified by porting from Keras to MXNet
145 | def same_padding(inp_dims, outp_dims, strides, kernel):
146 |     inp_h, inp_w = inp_dims[1:]
147 |     outp_h, outp_w = outp_dims
148 |     kernel_h, kernel_w = kernel
149 |     pad_along_height = max((outp_h - 1) * strides[0] + kernel_h - inp_h, 0)
150 |     pad_along_width = max((outp_w - 1) * strides[1] + kernel_w - inp_w, 0)
151 |     pad_top = pad_along_height // 2          
152 |     pad_bottom = pad_along_height - pad_top  
153 |     pad_left = pad_along_width // 2          
154 |     pad_right = pad_along_width - pad_left   
155 |     return (0,0,0,0,pad_top,pad_bottom,pad_left, pad_right)
156 | 
157 | def initial_block(inp, inp_dims, outp_dims, nb_filter=13, nb_row=3, nb_col=3, strides=(2, 2)):
158 |     
159 |     padded_inp = mx.sym.pad(inp, mode='constant',
160 |                             pad_width=same_padding(inp_dims, outp_dims, strides, kernel=(nb_row, nb_col)), name='init_pad')
161 |     conv = mx.sym.Convolution(padded_inp, num_filter=nb_filter, kernel=(nb_row, nb_col), stride=strides, name='init_conv')
162 |     max_pool = mx.sym.Pooling(inp, kernel=(2,2), stride=(2,2), pool_type='max', name='init_pool')
163 |     merged = mx.sym.concat(*[conv, max_pool], dim=1, name='init_concat')
164 |     return merged
165 | 
166 | def encoder_bottleneck(inp, inp_filter, output, name, internal_scale=4, asymmetric=0, dilated=0, downsample=False, dropout_rate=0.1):
167 |     # main branch
168 |     internal = output // internal_scale
169 |     encoder = inp
170 | 
171 |     # 1x1
172 |     input_stride = 2 if downsample else 1  # the 1st 1x1 projection is replaced with a 2x2 convolution when downsampling
173 |     encoder = mx.sym.Convolution(encoder, num_filter=internal, kernel=(input_stride, input_stride),
174 |                                 stride=(input_stride, input_stride), no_bias=True, name="conv1_%i"%name)
175 |     # Batch normalization + PReLU
176 |     encoder = mx.sym.BatchNorm(encoder, momentum=0.1, name="bn1_%i"%name)
177 |     encoder = mx.sym.LeakyReLU(encoder, act_type='prelu', name='prelu1_%i'%name)
178 |     # conv
179 |     if not asymmetric and not dilated:
180 |         encoder = mx.sym.Convolution(encoder, num_filter=internal, kernel=(3,3), pad=(1,1), name="conv2_%i"%name)
181 |     elif asymmetric:
182 |         encoder = mx.sym.Convolution(encoder, num_filter=internal, kernel=(1, asymmetric),
183 |                                      pad=(0, asymmetric// 2), no_bias=True, name="conv3_%i"%name)
184 |         encoder = mx.sym.Convolution(encoder, num_filter=internal, kernel=(asymmetric, 1),
185 |                                      pad=(asymmetric// 2, 0), name="conv4_%i"%name)
186 |     elif dilated:
187 |         encoder = mx.sym.Convolution(encoder, num_filter=internal, kernel=(3,3),
188 |                                      dilate=(dilated, dilated), pad=((3+(dilated-1)*2)// 2, (3+(dilated-1)*2)// 2),
189 |                                      name="conv2_%i"%name)
190 |     else:
191 |         raise(Exception('You shouldn\'t be here'))
192 |     encoder = mx.sym.BatchNorm(encoder, momentum=0.1, name='bn2_%i'%name)
193 |     encoder = mx.sym.LeakyReLU(encoder, act_type='prelu', name='prelu2_%i'%name)    
194 |     # 1x1
195 |     encoder = mx.sym.Convolution(encoder, num_filter=output, kernel=(1,1), no_bias=True, name="conv5_%i"%name)
196 |     encoder = mx.sym.BatchNorm(encoder, momentum=0.1, name='bn3_%i'%name)
197 |     encoder = mx.sym.Custom(encoder, op_type='spatial_dropout', name='spatial_dropout_%i' % name,
198 |                             p = dropout_rate, num_filters = output)
199 |     other = inp
200 |     # other branch
201 |     if downsample:
202 |         other = mx.sym.Pooling(other, kernel=(2,2), stride=(2,2), pool_type='max', name='pool1_%i'%name)
203 |         other = mx.sym.transpose(other, axes=(0,2,1,3), name='trans1_%i'%name)
204 |         pad_feature_maps = output - inp_filter
205 |         other = mx.sym.pad(other, mode='constant', pad_width=(0,0,0,0,0,pad_feature_maps,0,0), name='pad1_%i'%name)
206 |         other = mx.sym.transpose(other, axes=(0,2,1,3), name='trans2_%i'%name)
207 |     encoder = mx.sym.broadcast_add(encoder, other, name='add1_%i'%name)
208 |     encoder = mx.sym.LeakyReLU(encoder, act_type='prelu', name='prelu3_%i'%name)
209 |     return encoder
210 | 
211 | def build_encoder(inp, inp_dims, dropout_rate=0.01):
212 |     enet = initial_block(inp, inp_dims=inp_dims, outp_dims=(inp_dims[1]//2, inp_dims[2]//2))
213 |     enet = mx.sym.BatchNorm(enet, momentum=0.1, name='bn_0')
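214 |     encet = mx.sym.LeakyReLU(enet, act_type='prelu', name='prelu_0')  # NOTE: assigned to `encet` and never used, so this PReLU is not part of the graph; the deploy script builds the same graph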
215 |     enet = encoder_bottleneck(enet, 13+inp_dims[0], 64, downsample=True, dropout_rate=dropout_rate, name=1)  # bottleneck 1.0
216 |     for n in range(4):
217 |         enet = encoder_bottleneck(enet, 64, 64, dropout_rate=dropout_rate, name=n+10)  # bottleneck 1.i
218 |     
219 |     enet = encoder_bottleneck(enet, 64, 128, downsample=True, name=19)  # bottleneck 2.0
220 |     # bottleneck 2.x and 3.x
221 |     for n in range(2):
222 |         enet = encoder_bottleneck(enet, 128, 128, name=n*10+20)  # bottleneck 2.1
223 |         enet = encoder_bottleneck(enet, 128, 128, dilated=2, name=n*10+21)  # bottleneck 2.2
224 |         enet = encoder_bottleneck(enet, 128, 128, asymmetric=5, name=n*10+22)  # bottleneck 2.3
225 |         enet = encoder_bottleneck(enet, 128, 128, dilated=4, name=n*10+23)  # bottleneck 2.4
226 |         enet = encoder_bottleneck(enet, 128, 128, name=n*10+24)  # bottleneck 2.5
227 |         enet = encoder_bottleneck(enet, 128, 128, dilated=8, name=n*10+25)  # bottleneck 2.6
228 |         enet = encoder_bottleneck(enet, 128, 128, asymmetric=5, name=n*10+26)  # bottleneck 2.7
229 |         enet = encoder_bottleneck(enet, 128, 128, dilated=16, name=n*10+27)  # bottleneck 2.8
230 |     return enet
231 | 
232 | def decoder_bottleneck(encoder, inp_filter, output, upsample=False, upsample_dims=None, reverse_module=False, name=0):
233 |     internal = output // 4
234 |     
235 |     x = mx.sym.Convolution(encoder, num_filter=internal, kernel=(1,1), no_bias=True, name="conv6_%i"%name)
236 |     x = mx.sym.BatchNorm(x, momentum=0.1, name='bn4_%i'%name)
237 |     x = mx.sym.Activation(x, act_type='relu', name='relu1_%i'%name)
238 |     if not upsample:
239 |         x = mx.sym.Convolution(x, num_filter=internal, kernel=(3,3), pad=(1,1), no_bias=False, name="conv7_%i"%name)
240 |     else:
241 |         x = mx.sym.Deconvolution(x, num_filter=internal, kernel=(3, 3), stride=(2, 2), target_shape=upsample_dims, name="dconv1_%i"%name)
242 |     x = mx.sym.BatchNorm(x, momentum=0.1, name='bn5_%i'%name)
243 |     x = mx.sym.Activation(x, act_type='relu', name='relu2_%i'%name)
244 |     x = mx.sym.Convolution(x, num_filter=output, kernel=(1,1), no_bias=True, name="conv8_%i"%name)
245 |     other = encoder
246 |     if inp_filter != output or upsample:
247 |         other = mx.sym.Convolution(other, num_filter=output, kernel=(1,1), no_bias=True, name="conv9_%i"%name)
248 |         other = mx.sym.BatchNorm(other, momentum=0.1, name='bn6_%i'%name)
249 |         if upsample and reverse_module is not False:
250 |             other = mx.sym.UpSampling(other, scale=2, sample_type='nearest', name="upsample1_%i"%name)        
251 |     if upsample and reverse_module is False:
252 |         decoder = x
253 |     else:
254 |         x = mx.sym.BatchNorm(x, momentum=0.1, name='bn7_%i'%name)
255 |         decoder = mx.sym.broadcast_add(x, other, name='add2_%i'%name)
256 |         decoder = mx.sym.Activation(decoder, act_type='relu', name='relu3_%i'%name)
257 |     return decoder
258 | 
259 | def build_decoder(encoder, nc, output_shape=(3, 512, 512)):
260 |     enet = decoder_bottleneck(encoder, 128, 64, upsample=True, upsample_dims=(output_shape[1]//4, output_shape[2]//4), reverse_module=True, name=20)  # bottleneck 4.0
261 |     enet = decoder_bottleneck(enet, 64, 64, name=21)  # bottleneck 4.1
262 |     enet = decoder_bottleneck(enet, 64, 64, name=22)  # bottleneck 4.2
263 |     enet = decoder_bottleneck(enet, 64, 16, upsample=True, upsample_dims=(output_shape[1]//2, output_shape[2]//2), reverse_module=True, name=23)  # bottleneck 5.0
264 |     enet = decoder_bottleneck(enet, 16, 16, name=24)  # bottleneck 5.1
265 | 
266 |     enet = mx.sym.Deconvolution(enet, num_filter=nc, kernel=(2, 2), stride=(2, 2), target_shape=(output_shape[1],output_shape[2]), name='dconv2')
267 |     return enet
268 | 
269 | def build_enet(inp_dims):
270 |     data = mx.sym.Variable(name='data')
271 |     label = mx.sym.Variable(name='label')
272 |     label = mx.sym.flatten(label, name='flat_label')
273 |     encoder = build_encoder(data, inp_dims=inp_dims)
274 |     decoded = build_decoder(encoder, 1, output_shape=inp_dims)
275 |     mask = mx.sym.sigmoid(decoded, name='mask_sigmoid')
276 |     sigmoid = mx.sym.Flatten(mask)
277 |     loss = mx.sym.MakeLoss(dice_coef_loss(label, sigmoid), normalization='batch', name="dice_loss")
278 |     mask_output = mx.sym.BlockGrad(mask, 'mask')
279 |     out = mx.sym.Group([loss, mask_output])
280 |     return out
281 | #end modified third party code
282 |                               
283 | def get_data(f_path):
284 |     train_X = np.load(os.path.join(f_path,'train/train_X.npy'))
285 |     train_Y = np.load(os.path.join(f_path,'train/train_Y.npy'))
286 |     validation_X = np.load(os.path.join(f_path,'validation/validation_X.npy'))
287 |     validation_Y = np.load(os.path.join(f_path,'validation/validation_Y.npy'))
288 |     return train_X, train_Y, validation_X, validation_Y
289 | 
290 | ###############################
291 | ###     Training Script     ###
292 | ###############################
293 | 
294 | def train(channel_input_dirs, hyperparameters, hosts, **kwargs):
295 |     # retrieve the hyperparameters we set in notebook (with some defaults)
296 |     model = hyperparameters.get('model', 'enet')
297 |     batch_size = hyperparameters.get('batch_size', 128)
298 |     epochs = hyperparameters.get('epochs', 100)
299 |     learning_rate = hyperparameters.get('learning_rate', 0.1)
300 |     beta1 = hyperparameters.get('beta1', 0.9)
301 |     beta2 = hyperparameters.get('beta2', 0.99)
302 |     mean = hyperparameters.get('mean', [0, 0, 0])
303 |     std = hyperparameters.get('std', [1, 1, 1])
304 |     num_gpus = hyperparameters.get('num_gpus', 0)
305 |     burn_in = hyperparameters.get('burn_in', 5)
306 |     # set logging
307 |     logging.getLogger().setLevel(logging.DEBUG)
308 | 
309 |     # setup for distributed training
310 |     if len(hosts) == 1:
311 |         kvstore = 'device' if num_gpus > 0 else 'local'
312 |     else:
313 |         kvstore = 'dist_device_sync' if num_gpus > 0 else 'dist_sync'
314 |                               
315 |     # set context for gpu if available, else cpu
316 |     ctx = [mx.gpu(i) for i in range(num_gpus)] if num_gpus > 0 else [mx.cpu()]
317 |     print (ctx)
318 |     f_path = channel_input_dirs['training']
319 |     # load data
320 |     train_X, train_Y, validation_X, validation_Y = get_data(f_path)
321 |     
322 |     print ('loaded data')
323 |     # define MXNet iterators
324 |     train_iter = mx.io.NDArrayIter(data = train_X, label=train_Y, batch_size=batch_size, shuffle=True)
325 |     validation_iter = mx.io.NDArrayIter(data = validation_X, label=validation_Y, batch_size=batch_size, shuffle=False)
326 |     data_shape = (batch_size,) + train_X.shape[1:]
327 | 
328 |     print ('created iters')
329 |     
330 |     print ('building %s' % model)
331 |     # build relevant model
332 |     if model == 'enet':
333 |         sym = build_enet(inp_dims=data_shape[1:])
334 |     else:
335 |         sym = build_unet()
336 |     # build network, bind to data shapes, and initialize parameters and optimizer
337 |     net = mx.mod.Module(sym, context=ctx, data_names=('data',), label_names=('label',))
338 |     net.bind(data_shapes=[['data', data_shape]], label_shapes=[['label', data_shape]])
339 |     net.init_params(mx.initializer.Xavier(magnitude=6))
340 |     net.init_optimizer(optimizer = 'adam', 
341 |                                optimizer_params=(
342 |                                    ('learning_rate', learning_rate),
343 |                                    ('beta1', beta1),
344 |                                    ('beta2', beta2)
345 |                               ))
346 |     # begin training loop, tracking loss values
347 |     print ('start training')
348 |     smoothing_constant = .01
349 |     curr_losses = []
350 |     moving_losses = []
351 |     i = 0
352 |     best_val_loss = np.inf
353 |     for e in range(epochs):
354 |         while True:
355 |             # if iterator is out, reset and move to next epoch
356 |             try:
357 |                 batch = next(train_iter)
358 |             except StopIteration:
359 |                 train_iter.reset()
360 |                 break
361 |             # forward and backward pass
362 |             net.forward_backward(batch)
363 |             loss = net.get_outputs()[0]
364 |             # optimizer step
365 |             net.update()
366 |             # loss calculations
367 |             curr_loss = F.mean(loss).asscalar()
368 |             curr_losses.append(curr_loss)
369 |             moving_loss = (curr_loss if ((i == 0) and (e == 0))
370 |                                    else (1 - smoothing_constant) * moving_loss + (smoothing_constant) * curr_loss)
371 |             moving_losses.append(moving_loss)
372 |             i += 1
373 |         # loss metrics
374 |         val_losses = []
375 |         for batch in validation_iter:
376 |             net.forward(batch)
377 |             loss = net.get_outputs()[0]
378 |             val_losses.append(F.mean(loss).asscalar())
379 |         validation_iter.reset()
380 |         # mean validation loss for this epoch
381 |         val_loss = np.mean(val_losses)
382 |         # early stopping by saving the model w/ best validation metric
383 |         if e > burn_in and val_loss < best_val_loss:
384 |             best_val_loss = val_loss
385 |             net.save_checkpoint('best_net', 0)
386 |             print("Best model at Epoch %i" %(e+1))
387 |         print("Epoch %i: Moving Training Loss %0.5f, Validation Loss %0.5f" % (e+1, moving_loss, val_loss))
388 |     # load the parameters with the best validation loss, then return the model (a checkpoint only exists if epochs > burn_in + 1)
389 |     net.load_params('best_net-0000.params')
390 |     return net
391 | 
392 | ###############################
393 | ###     Hosting Methods     ###
394 | ###############################
395 | 
396 | def model_fn(model_dir):
397 |     sym, arg_params, aux_params = mx.model.load_checkpoint('%s/model' % model_dir, 0)
398 |     with open('%s/model-shapes.json' % model_dir) as f:
399 |         shapes = json.load(f)
400 |     shape = shapes[0]['shape']
401 |     batch_size = 1
402 |     data_shape = (batch_size, 1, shape[2], shape[3])
403 |     net = mx.mod.Module(sym, data_names=('data',), label_names=('label',))
404 |     net.bind(data_shapes=[['data', data_shape]], label_shapes=[['label', data_shape]], for_training=False)
405 |     net.set_params(arg_params, aux_params)
406 |     return net
407 | 
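
Both networks are trained on a negated, smoothed Dice coefficient computed over flattened masks. A NumPy sketch of the same formula as `dice_coef_loss` makes the value range easy to check by hand. This is an illustration under stated assumptions, not the symbolic MXNet graph; `dice_loss` and its `smooth` argument are hypothetical names standing in for the hard-coded `1.` in the source.

```python
import numpy as np


def dice_loss(y_true, y_pred, smooth=1.0):
    """Negative smoothed Dice coefficient per sample, on flattened masks."""
    y_true = y_true.reshape(len(y_true), -1)
    y_pred = y_pred.reshape(len(y_pred), -1)
    intersection = (y_true * y_pred).sum(axis=1)
    return -(2.0 * intersection + smooth) / (
        y_true.sum(axis=1) + y_pred.sum(axis=1) + smooth)


mask = np.array([[1.0, 1.0, 0.0, 0.0]])
perfect = dice_loss(mask, mask)          # prediction equals the label
disjoint = dice_loss(mask, 1.0 - mask)   # prediction misses every labelled pixel
```

A perfect prediction gives exactly -1, and the loss rises toward 0 as the overlap shrinks, so minimising it maximises overlap. The smoothing term keeps the ratio defined when both masks are empty.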


--------------------------------------------------------------------------------