├── .gitignore
├── LICENSE
├── README.md
├── dataset
│   └── list
│       ├── train_aug.txt
│       └── val.txt
├── deeplab
│   ├── __init__.py
│   ├── datasets.py
│   ├── loss.py
│   ├── metric.py
│   └── model.py
├── evaluate.py
├── evaluate_msc.py
├── train.py
└── train_msc.py
/.gitignore:
--------------------------------------------------------------------------------
1 | # models
2 | dataset/MS_DeepLab_resnet_pretrained_COCO_init.pth
3 |
4 | # pyc
5 | *.pyc
6 | deeplab/*.pyc
7 |
8 | snapshots/
9 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 Zilong Huang
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DeepLab-ResNet-Pytorch
2 |
3 | **New!** We have released [Pytorch-Segmentation-Toolbox](https://github.com/speedinghzl/pytorch-segmentation-toolbox) which contains PyTorch Implementations for DeeplabV3 and PSPNet with **Better Reproduced Performance** on cityscapes.
4 |
5 | This is a (re-)implementation of [DeepLab-ResNet](http://liangchiehchen.com/projects/DeepLabv2_resnet.html) in Pytorch for semantic image segmentation on the [PASCAL VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/).
6 |
7 | ## Updates
8 |
9 | **9 July, 2017**:
10 | * The training script `train.py` has been re-written following the original optimisation setup: SGD with momentum, weight decay, learning rate with polynomial decay, different learning rates for different layers, and ignoring the 'void' label (255).
11 | * The training script with multi-scale inputs `train_msc.py` has been added: the input is resized to 0.5 and 0.75 of the original resolution, and 4 losses are aggregated: loss on the original resolution, on the 0.75 resolution, on the 0.5 resolution, and loss on all fused outputs.
12 | * Evaluation of a single-scale model on the PASCAL VOC validation dataset (using ['SegmentationClassAug'](https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0)) leads to 74.0% mIoU (['VOC12_scenes_20000.pth'](https://pan.baidu.com/s/1bP52R8)) without CRF as a post-processing step. Evaluation of the multi-scale model is in progress.
13 |
14 |
15 | ## Model Description
16 |
17 | The DeepLab-ResNet is built on a fully convolutional variant of [ResNet-101](https://github.com/KaimingHe/deep-residual-networks) with [atrous (dilated) convolutions](https://github.com/fyu/dilation), atrous spatial pyramid pooling, and multi-scale inputs (not implemented here).
18 |
19 | The model is trained on a mini-batch of images and corresponding ground truth masks with a softmax classifier at the top. During training, the masks are downsampled to match the size of the network output; during inference, bilinear upsampling is applied to produce an output of the same size as the input. The final segmentation mask is computed using argmax over the logits (see the sketch below).
20 | Optionally, a fully-connected probabilistic graphical model, namely, CRF, can be applied to refine the final predictions.
21 | On the PASCAL VOC test set, the model achieves a mean intersection-over-union of 79.7% with CRFs and 76.4% without CRFs.
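
A minimal sketch of the inference step above (shapes and sizes are illustrative, not taken from this repository's scripts):

```python
import torch
import torch.nn as nn

logits = torch.randn(1, 21, 41, 41)                       # network output: (n, classes, h/8, w/8)
upsample = nn.Upsample(size=(321, 321), mode='bilinear')  # back to the input resolution
mask = torch.argmax(upsample(logits), dim=1)              # per-pixel class indices: (1, 321, 321)
```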
22 |
23 | For more details on the underlying model please refer to the following paper:
24 |
25 |
26 | @article{CP2016Deeplab,
27 | title={DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs},
28 | author={Liang-Chieh Chen and George Papandreou and Iasonas Kokkinos and Kevin Murphy and Alan L Yuille},
29 | journal={arXiv:1606.00915},
30 | year={2016}
31 | }
32 |
33 | ## Dataset and Training
34 |
35 | To train the network, one can use the augmented PASCAL VOC 2012 dataset with 10582 images for training and 1449 images for validation. PyTorch >= 0.4.0 is required.
36 |
37 | You can download the converted `init.caffemodel` (saved with the `.pth` extension) [here](https://drive.google.com/open?id=0BxhUwxvLPO7TVFJQU1dwbXhHdEk). Besides that, one can also exploit random scaling and mirroring of the inputs during training as a means of data augmentation. For example, to train the model from scratch with random scaling and mirroring turned on, simply run:
38 | ```bash
39 | python train.py --random-mirror --random-scale --gpu 0
40 | ```
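
To fine-tune from the converted COCO-initialised weights, one would first load them into the model; below is a hedged sketch (the file path follows the `.gitignore` entry above, and whether the checkpoint's keys load directly depends on how the weights were converted):

```python
import torch
from deeplab.model import Res_Deeplab

model = Res_Deeplab(num_classes=21)
state_dict = torch.load('dataset/MS_DeepLab_resnet_pretrained_COCO_init.pth')
model.load_state_dict(state_dict)  # may need key remapping, e.g. for the 21-class head
```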
41 |
42 | ## Evaluation
43 |
44 | The single-scale model shows 74.0% mIoU on the PASCAL VOC 2012 validation dataset (['SegmentationClassAug'](https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0)). No post-processing step with CRF is applied.
45 |
46 | The following command prints a description of each of the evaluation settings:
47 | ```bash
48 | python evaluate.py --help
49 | ```
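
For example, to evaluate a downloaded snapshot on GPU 0 (assuming the checkpoint linked above is saved in the repository root):

```bash
python evaluate.py --restore-from ./VOC12_scenes_20000.pth --gpu 0
```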
50 |
51 | ## Acknowledgment
52 | This code is heavily borrowed from [pytorch-deeplab-resnet](https://github.com/isht7/pytorch-deeplab-resnet).
53 |
54 | ## Other implementations
55 | * [DeepLab-LargeFOV in TensorFlow](https://github.com/DrSleep/tensorflow-deeplab-lfov)
56 | * [DeepLab-LargeFOV in Pytorch](https://github.com/isht7/pytorch-deeplab-resnet)
57 |
58 |
59 |
--------------------------------------------------------------------------------
/dataset/list/val.txt:
--------------------------------------------------------------------------------
1 | 2007_000033
2 | 2007_000042
3 | 2007_000061
4 | 2007_000123
5 | 2007_000129
6 | 2007_000175
7 | 2007_000187
8 | 2007_000323
9 | 2007_000332
10 | 2007_000346
11 | 2007_000452
12 | 2007_000464
13 | 2007_000491
14 | 2007_000529
15 | 2007_000559
16 | 2007_000572
17 | 2007_000629
18 | 2007_000636
19 | 2007_000661
20 | 2007_000663
21 | 2007_000676
22 | 2007_000727
23 | 2007_000762
24 | 2007_000783
25 | 2007_000799
26 | 2007_000804
27 | 2007_000830
28 | 2007_000837
29 | 2007_000847
30 | 2007_000862
31 | 2007_000925
32 | 2007_000999
33 | 2007_001154
34 | 2007_001175
35 | 2007_001239
36 | 2007_001284
37 | 2007_001288
38 | 2007_001289
39 | 2007_001299
40 | 2007_001311
41 | 2007_001321
42 | 2007_001377
43 | 2007_001408
44 | 2007_001423
45 | 2007_001430
46 | 2007_001457
47 | 2007_001458
48 | 2007_001526
49 | 2007_001568
50 | 2007_001585
51 | 2007_001586
52 | 2007_001587
53 | 2007_001594
54 | 2007_001630
55 | 2007_001677
56 | 2007_001678
57 | 2007_001717
58 | 2007_001733
59 | 2007_001761
60 | 2007_001763
61 | 2007_001774
62 | 2007_001884
63 | 2007_001955
64 | 2007_002046
65 | 2007_002094
66 | 2007_002119
67 | 2007_002132
68 | 2007_002260
69 | 2007_002266
70 | 2007_002268
71 | 2007_002284
72 | 2007_002376
73 | 2007_002378
74 | 2007_002387
75 | 2007_002400
76 | 2007_002412
77 | 2007_002426
78 | 2007_002427
79 | 2007_002445
80 | 2007_002470
81 | 2007_002539
82 | 2007_002565
83 | 2007_002597
84 | 2007_002618
85 | 2007_002619
86 | 2007_002624
87 | 2007_002643
88 | 2007_002648
89 | 2007_002719
90 | 2007_002728
91 | 2007_002823
92 | 2007_002824
93 | 2007_002852
94 | 2007_002903
95 | 2007_003011
96 | 2007_003020
97 | 2007_003022
98 | 2007_003051
99 | 2007_003088
100 | 2007_003101
101 | 2007_003106
102 | 2007_003110
103 | 2007_003131
104 | 2007_003134
105 | 2007_003137
106 | 2007_003143
107 | 2007_003169
108 | 2007_003188
109 | 2007_003194
110 | 2007_003195
111 | 2007_003201
112 | 2007_003349
113 | 2007_003367
114 | 2007_003373
115 | 2007_003499
116 | 2007_003503
117 | 2007_003506
118 | 2007_003530
119 | 2007_003571
120 | 2007_003587
121 | 2007_003611
122 | 2007_003621
123 | 2007_003682
124 | 2007_003711
125 | 2007_003714
126 | 2007_003742
127 | 2007_003786
128 | 2007_003841
129 | 2007_003848
130 | 2007_003861
131 | 2007_003872
132 | 2007_003917
133 | 2007_003957
134 | 2007_003991
135 | 2007_004033
136 | 2007_004052
137 | 2007_004112
138 | 2007_004121
139 | 2007_004143
140 | 2007_004189
141 | 2007_004190
142 | 2007_004193
143 | 2007_004241
144 | 2007_004275
145 | 2007_004281
146 | 2007_004380
147 | 2007_004392
148 | 2007_004405
149 | 2007_004468
150 | 2007_004483
151 | 2007_004510
152 | 2007_004538
153 | 2007_004558
154 | 2007_004644
155 | 2007_004649
156 | 2007_004712
157 | 2007_004722
158 | 2007_004856
159 | 2007_004866
160 | 2007_004902
161 | 2007_004969
162 | 2007_005058
163 | 2007_005074
164 | 2007_005107
165 | 2007_005114
166 | 2007_005149
167 | 2007_005173
168 | 2007_005281
169 | 2007_005294
170 | 2007_005296
171 | 2007_005304
172 | 2007_005331
173 | 2007_005354
174 | 2007_005358
175 | 2007_005428
176 | 2007_005460
177 | 2007_005469
178 | 2007_005509
179 | 2007_005547
180 | 2007_005600
181 | 2007_005608
182 | 2007_005626
183 | 2007_005689
184 | 2007_005696
185 | 2007_005705
186 | 2007_005759
187 | 2007_005803
188 | 2007_005813
189 | 2007_005828
190 | 2007_005844
191 | 2007_005845
192 | 2007_005857
193 | 2007_005911
194 | 2007_005915
195 | 2007_005978
196 | 2007_006028
197 | 2007_006035
198 | 2007_006046
199 | 2007_006076
200 | 2007_006086
201 | 2007_006117
202 | 2007_006171
203 | 2007_006241
204 | 2007_006260
205 | 2007_006277
206 | 2007_006348
207 | 2007_006364
208 | 2007_006373
209 | 2007_006444
210 | 2007_006449
211 | 2007_006549
212 | 2007_006553
213 | 2007_006560
214 | 2007_006647
215 | 2007_006678
216 | 2007_006680
217 | 2007_006698
218 | 2007_006761
219 | 2007_006802
220 | 2007_006837
221 | 2007_006841
222 | 2007_006864
223 | 2007_006866
224 | 2007_006946
225 | 2007_007007
226 | 2007_007084
227 | 2007_007109
228 | 2007_007130
229 | 2007_007165
230 | 2007_007168
231 | 2007_007195
232 | 2007_007196
233 | 2007_007203
234 | 2007_007211
235 | 2007_007235
236 | 2007_007341
237 | 2007_007414
238 | 2007_007417
239 | 2007_007470
240 | 2007_007477
241 | 2007_007493
242 | 2007_007498
243 | 2007_007524
244 | 2007_007534
245 | 2007_007624
246 | 2007_007651
247 | 2007_007688
248 | 2007_007748
249 | 2007_007795
250 | 2007_007810
251 | 2007_007815
252 | 2007_007818
253 | 2007_007836
254 | 2007_007849
255 | 2007_007881
256 | 2007_007996
257 | 2007_008051
258 | 2007_008084
259 | 2007_008106
260 | 2007_008110
261 | 2007_008204
262 | 2007_008222
263 | 2007_008256
264 | 2007_008260
265 | 2007_008339
266 | 2007_008374
267 | 2007_008415
268 | 2007_008430
269 | 2007_008543
270 | 2007_008547
271 | 2007_008596
272 | 2007_008645
273 | 2007_008670
274 | 2007_008708
275 | 2007_008722
276 | 2007_008747
277 | 2007_008802
278 | 2007_008815
279 | 2007_008897
280 | 2007_008944
281 | 2007_008964
282 | 2007_008973
283 | 2007_008980
284 | 2007_009015
285 | 2007_009068
286 | 2007_009084
287 | 2007_009088
288 | 2007_009096
289 | 2007_009221
290 | 2007_009245
291 | 2007_009251
292 | 2007_009252
293 | 2007_009258
294 | 2007_009320
295 | 2007_009323
296 | 2007_009331
297 | 2007_009346
298 | 2007_009392
299 | 2007_009413
300 | 2007_009419
301 | 2007_009446
302 | 2007_009458
303 | 2007_009521
304 | 2007_009562
305 | 2007_009592
306 | 2007_009654
307 | 2007_009655
308 | 2007_009684
309 | 2007_009687
310 | 2007_009691
311 | 2007_009706
312 | 2007_009750
313 | 2007_009756
314 | 2007_009764
315 | 2007_009794
316 | 2007_009817
317 | 2007_009841
318 | 2007_009897
319 | 2007_009911
320 | 2007_009923
321 | 2007_009938
322 | 2008_000009
323 | 2008_000016
324 | 2008_000073
325 | 2008_000075
326 | 2008_000080
327 | 2008_000107
328 | 2008_000120
329 | 2008_000123
330 | 2008_000149
331 | 2008_000182
332 | 2008_000213
333 | 2008_000215
334 | 2008_000223
335 | 2008_000233
336 | 2008_000234
337 | 2008_000239
338 | 2008_000254
339 | 2008_000270
340 | 2008_000271
341 | 2008_000345
342 | 2008_000359
343 | 2008_000391
344 | 2008_000401
345 | 2008_000464
346 | 2008_000469
347 | 2008_000474
348 | 2008_000501
349 | 2008_000510
350 | 2008_000533
351 | 2008_000573
352 | 2008_000589
353 | 2008_000602
354 | 2008_000630
355 | 2008_000657
356 | 2008_000661
357 | 2008_000662
358 | 2008_000666
359 | 2008_000673
360 | 2008_000700
361 | 2008_000725
362 | 2008_000731
363 | 2008_000763
364 | 2008_000765
365 | 2008_000782
366 | 2008_000795
367 | 2008_000811
368 | 2008_000848
369 | 2008_000853
370 | 2008_000863
371 | 2008_000911
372 | 2008_000919
373 | 2008_000943
374 | 2008_000992
375 | 2008_001013
376 | 2008_001028
377 | 2008_001040
378 | 2008_001070
379 | 2008_001074
380 | 2008_001076
381 | 2008_001078
382 | 2008_001135
383 | 2008_001150
384 | 2008_001170
385 | 2008_001231
386 | 2008_001249
387 | 2008_001260
388 | 2008_001283
389 | 2008_001308
390 | 2008_001379
391 | 2008_001404
392 | 2008_001433
393 | 2008_001439
394 | 2008_001478
395 | 2008_001491
396 | 2008_001504
397 | 2008_001513
398 | 2008_001514
399 | 2008_001531
400 | 2008_001546
401 | 2008_001547
402 | 2008_001580
403 | 2008_001629
404 | 2008_001640
405 | 2008_001682
406 | 2008_001688
407 | 2008_001715
408 | 2008_001821
409 | 2008_001874
410 | 2008_001885
411 | 2008_001895
412 | 2008_001966
413 | 2008_001971
414 | 2008_001992
415 | 2008_002043
416 | 2008_002152
417 | 2008_002205
418 | 2008_002212
419 | 2008_002239
420 | 2008_002240
421 | 2008_002241
422 | 2008_002269
423 | 2008_002273
424 | 2008_002358
425 | 2008_002379
426 | 2008_002383
427 | 2008_002429
428 | 2008_002464
429 | 2008_002467
430 | 2008_002492
431 | 2008_002495
432 | 2008_002504
433 | 2008_002521
434 | 2008_002536
435 | 2008_002588
436 | 2008_002623
437 | 2008_002680
438 | 2008_002681
439 | 2008_002775
440 | 2008_002778
441 | 2008_002835
442 | 2008_002859
443 | 2008_002864
444 | 2008_002900
445 | 2008_002904
446 | 2008_002929
447 | 2008_002936
448 | 2008_002942
449 | 2008_002958
450 | 2008_003003
451 | 2008_003026
452 | 2008_003034
453 | 2008_003076
454 | 2008_003105
455 | 2008_003108
456 | 2008_003110
457 | 2008_003135
458 | 2008_003141
459 | 2008_003155
460 | 2008_003210
461 | 2008_003238
462 | 2008_003270
463 | 2008_003330
464 | 2008_003333
465 | 2008_003369
466 | 2008_003379
467 | 2008_003451
468 | 2008_003461
469 | 2008_003477
470 | 2008_003492
471 | 2008_003499
472 | 2008_003511
473 | 2008_003546
474 | 2008_003576
475 | 2008_003577
476 | 2008_003676
477 | 2008_003709
478 | 2008_003733
479 | 2008_003777
480 | 2008_003782
481 | 2008_003821
482 | 2008_003846
483 | 2008_003856
484 | 2008_003858
485 | 2008_003874
486 | 2008_003876
487 | 2008_003885
488 | 2008_003886
489 | 2008_003926
490 | 2008_003976
491 | 2008_004069
492 | 2008_004101
493 | 2008_004140
494 | 2008_004172
495 | 2008_004175
496 | 2008_004212
497 | 2008_004279
498 | 2008_004339
499 | 2008_004345
500 | 2008_004363
501 | 2008_004367
502 | 2008_004396
503 | 2008_004399
504 | 2008_004453
505 | 2008_004477
506 | 2008_004552
507 | 2008_004562
508 | 2008_004575
509 | 2008_004610
510 | 2008_004612
511 | 2008_004621
512 | 2008_004624
513 | 2008_004654
514 | 2008_004659
515 | 2008_004687
516 | 2008_004701
517 | 2008_004704
518 | 2008_004705
519 | 2008_004754
520 | 2008_004758
521 | 2008_004854
522 | 2008_004910
523 | 2008_004995
524 | 2008_005049
525 | 2008_005089
526 | 2008_005097
527 | 2008_005105
528 | 2008_005145
529 | 2008_005197
530 | 2008_005217
531 | 2008_005242
532 | 2008_005245
533 | 2008_005254
534 | 2008_005262
535 | 2008_005338
536 | 2008_005398
537 | 2008_005399
538 | 2008_005422
539 | 2008_005439
540 | 2008_005445
541 | 2008_005525
542 | 2008_005544
543 | 2008_005628
544 | 2008_005633
545 | 2008_005637
546 | 2008_005642
547 | 2008_005676
548 | 2008_005680
549 | 2008_005691
550 | 2008_005727
551 | 2008_005738
552 | 2008_005812
553 | 2008_005904
554 | 2008_005915
555 | 2008_006008
556 | 2008_006036
557 | 2008_006055
558 | 2008_006063
559 | 2008_006108
560 | 2008_006130
561 | 2008_006143
562 | 2008_006159
563 | 2008_006216
564 | 2008_006219
565 | 2008_006229
566 | 2008_006254
567 | 2008_006275
568 | 2008_006325
569 | 2008_006327
570 | 2008_006341
571 | 2008_006408
572 | 2008_006480
573 | 2008_006523
574 | 2008_006526
575 | 2008_006528
576 | 2008_006553
577 | 2008_006554
578 | 2008_006703
579 | 2008_006722
580 | 2008_006752
581 | 2008_006784
582 | 2008_006835
583 | 2008_006874
584 | 2008_006981
585 | 2008_006986
586 | 2008_007025
587 | 2008_007031
588 | 2008_007048
589 | 2008_007120
590 | 2008_007123
591 | 2008_007143
592 | 2008_007194
593 | 2008_007219
594 | 2008_007273
595 | 2008_007350
596 | 2008_007378
597 | 2008_007392
598 | 2008_007402
599 | 2008_007497
600 | 2008_007498
601 | 2008_007507
602 | 2008_007513
603 | 2008_007527
604 | 2008_007548
605 | 2008_007596
606 | 2008_007677
607 | 2008_007737
608 | 2008_007797
609 | 2008_007804
610 | 2008_007811
611 | 2008_007814
612 | 2008_007828
613 | 2008_007836
614 | 2008_007945
615 | 2008_007994
616 | 2008_008051
617 | 2008_008103
618 | 2008_008127
619 | 2008_008221
620 | 2008_008252
621 | 2008_008268
622 | 2008_008296
623 | 2008_008301
624 | 2008_008335
625 | 2008_008362
626 | 2008_008392
627 | 2008_008393
628 | 2008_008421
629 | 2008_008434
630 | 2008_008469
631 | 2008_008629
632 | 2008_008682
633 | 2008_008711
634 | 2008_008746
635 | 2009_000012
636 | 2009_000013
637 | 2009_000022
638 | 2009_000032
639 | 2009_000037
640 | 2009_000039
641 | 2009_000074
642 | 2009_000080
643 | 2009_000087
644 | 2009_000096
645 | 2009_000121
646 | 2009_000136
647 | 2009_000149
648 | 2009_000156
649 | 2009_000201
650 | 2009_000205
651 | 2009_000219
652 | 2009_000242
653 | 2009_000309
654 | 2009_000318
655 | 2009_000335
656 | 2009_000351
657 | 2009_000354
658 | 2009_000387
659 | 2009_000391
660 | 2009_000412
661 | 2009_000418
662 | 2009_000421
663 | 2009_000426
664 | 2009_000440
665 | 2009_000446
666 | 2009_000455
667 | 2009_000457
668 | 2009_000469
669 | 2009_000487
670 | 2009_000488
671 | 2009_000523
672 | 2009_000573
673 | 2009_000619
674 | 2009_000628
675 | 2009_000641
676 | 2009_000664
677 | 2009_000675
678 | 2009_000704
679 | 2009_000705
680 | 2009_000712
681 | 2009_000716
682 | 2009_000723
683 | 2009_000727
684 | 2009_000730
685 | 2009_000731
686 | 2009_000732
687 | 2009_000771
688 | 2009_000825
689 | 2009_000828
690 | 2009_000839
691 | 2009_000840
692 | 2009_000845
693 | 2009_000879
694 | 2009_000892
695 | 2009_000919
696 | 2009_000924
697 | 2009_000931
698 | 2009_000935
699 | 2009_000964
700 | 2009_000989
701 | 2009_000991
702 | 2009_000998
703 | 2009_001008
704 | 2009_001082
705 | 2009_001108
706 | 2009_001160
707 | 2009_001215
708 | 2009_001240
709 | 2009_001255
710 | 2009_001278
711 | 2009_001299
712 | 2009_001300
713 | 2009_001314
714 | 2009_001332
715 | 2009_001333
716 | 2009_001363
717 | 2009_001391
718 | 2009_001411
719 | 2009_001433
720 | 2009_001505
721 | 2009_001535
722 | 2009_001536
723 | 2009_001565
724 | 2009_001607
725 | 2009_001644
726 | 2009_001663
727 | 2009_001683
728 | 2009_001684
729 | 2009_001687
730 | 2009_001718
731 | 2009_001731
732 | 2009_001765
733 | 2009_001768
734 | 2009_001775
735 | 2009_001804
736 | 2009_001816
737 | 2009_001818
738 | 2009_001850
739 | 2009_001851
740 | 2009_001854
741 | 2009_001941
742 | 2009_001991
743 | 2009_002012
744 | 2009_002035
745 | 2009_002042
746 | 2009_002082
747 | 2009_002094
748 | 2009_002097
749 | 2009_002122
750 | 2009_002150
751 | 2009_002155
752 | 2009_002164
753 | 2009_002165
754 | 2009_002171
755 | 2009_002185
756 | 2009_002202
757 | 2009_002221
758 | 2009_002238
759 | 2009_002239
760 | 2009_002265
761 | 2009_002268
762 | 2009_002291
763 | 2009_002295
764 | 2009_002317
765 | 2009_002320
766 | 2009_002346
767 | 2009_002366
768 | 2009_002372
769 | 2009_002382
770 | 2009_002390
771 | 2009_002415
772 | 2009_002445
773 | 2009_002487
774 | 2009_002521
775 | 2009_002527
776 | 2009_002535
777 | 2009_002539
778 | 2009_002549
779 | 2009_002562
780 | 2009_002568
781 | 2009_002571
782 | 2009_002573
783 | 2009_002584
784 | 2009_002591
785 | 2009_002594
786 | 2009_002604
787 | 2009_002618
788 | 2009_002635
789 | 2009_002638
790 | 2009_002649
791 | 2009_002651
792 | 2009_002727
793 | 2009_002732
794 | 2009_002749
795 | 2009_002753
796 | 2009_002771
797 | 2009_002808
798 | 2009_002856
799 | 2009_002887
800 | 2009_002888
801 | 2009_002928
802 | 2009_002936
803 | 2009_002975
804 | 2009_002982
805 | 2009_002990
806 | 2009_003003
807 | 2009_003005
808 | 2009_003043
809 | 2009_003059
810 | 2009_003063
811 | 2009_003065
812 | 2009_003071
813 | 2009_003080
814 | 2009_003105
815 | 2009_003123
816 | 2009_003193
817 | 2009_003196
818 | 2009_003217
819 | 2009_003224
820 | 2009_003241
821 | 2009_003269
822 | 2009_003273
823 | 2009_003299
824 | 2009_003304
825 | 2009_003311
826 | 2009_003323
827 | 2009_003343
828 | 2009_003378
829 | 2009_003387
830 | 2009_003406
831 | 2009_003433
832 | 2009_003450
833 | 2009_003466
834 | 2009_003481
835 | 2009_003494
836 | 2009_003498
837 | 2009_003504
838 | 2009_003507
839 | 2009_003517
840 | 2009_003523
841 | 2009_003542
842 | 2009_003549
843 | 2009_003551
844 | 2009_003564
845 | 2009_003569
846 | 2009_003576
847 | 2009_003589
848 | 2009_003607
849 | 2009_003640
850 | 2009_003666
851 | 2009_003696
852 | 2009_003703
853 | 2009_003707
854 | 2009_003756
855 | 2009_003771
856 | 2009_003773
857 | 2009_003804
858 | 2009_003806
859 | 2009_003810
860 | 2009_003849
861 | 2009_003857
862 | 2009_003858
863 | 2009_003895
864 | 2009_003903
865 | 2009_003904
866 | 2009_003928
867 | 2009_003938
868 | 2009_003971
869 | 2009_003991
870 | 2009_004021
871 | 2009_004033
872 | 2009_004043
873 | 2009_004070
874 | 2009_004072
875 | 2009_004084
876 | 2009_004099
877 | 2009_004125
878 | 2009_004140
879 | 2009_004217
880 | 2009_004221
881 | 2009_004247
882 | 2009_004248
883 | 2009_004255
884 | 2009_004298
885 | 2009_004324
886 | 2009_004455
887 | 2009_004494
888 | 2009_004497
889 | 2009_004504
890 | 2009_004507
891 | 2009_004509
892 | 2009_004540
893 | 2009_004568
894 | 2009_004579
895 | 2009_004581
896 | 2009_004590
897 | 2009_004592
898 | 2009_004594
899 | 2009_004635
900 | 2009_004653
901 | 2009_004687
902 | 2009_004721
903 | 2009_004730
904 | 2009_004732
905 | 2009_004738
906 | 2009_004748
907 | 2009_004789
908 | 2009_004799
909 | 2009_004801
910 | 2009_004848
911 | 2009_004859
912 | 2009_004867
913 | 2009_004882
914 | 2009_004886
915 | 2009_004895
916 | 2009_004942
917 | 2009_004969
918 | 2009_004987
919 | 2009_004993
920 | 2009_004994
921 | 2009_005038
922 | 2009_005078
923 | 2009_005087
924 | 2009_005089
925 | 2009_005137
926 | 2009_005148
927 | 2009_005156
928 | 2009_005158
929 | 2009_005189
930 | 2009_005190
931 | 2009_005217
932 | 2009_005219
933 | 2009_005220
934 | 2009_005231
935 | 2009_005260
936 | 2009_005262
937 | 2009_005302
938 | 2010_000003
939 | 2010_000038
940 | 2010_000065
941 | 2010_000083
942 | 2010_000084
943 | 2010_000087
944 | 2010_000110
945 | 2010_000159
946 | 2010_000160
947 | 2010_000163
948 | 2010_000174
949 | 2010_000216
950 | 2010_000238
951 | 2010_000241
952 | 2010_000256
953 | 2010_000272
954 | 2010_000284
955 | 2010_000309
956 | 2010_000318
957 | 2010_000330
958 | 2010_000335
959 | 2010_000342
960 | 2010_000372
961 | 2010_000422
962 | 2010_000426
963 | 2010_000427
964 | 2010_000502
965 | 2010_000530
966 | 2010_000552
967 | 2010_000559
968 | 2010_000572
969 | 2010_000573
970 | 2010_000622
971 | 2010_000628
972 | 2010_000639
973 | 2010_000666
974 | 2010_000679
975 | 2010_000682
976 | 2010_000683
977 | 2010_000724
978 | 2010_000738
979 | 2010_000764
980 | 2010_000788
981 | 2010_000814
982 | 2010_000836
983 | 2010_000874
984 | 2010_000904
985 | 2010_000906
986 | 2010_000907
987 | 2010_000918
988 | 2010_000929
989 | 2010_000941
990 | 2010_000952
991 | 2010_000961
992 | 2010_001000
993 | 2010_001010
994 | 2010_001011
995 | 2010_001016
996 | 2010_001017
997 | 2010_001024
998 | 2010_001036
999 | 2010_001061
1000 | 2010_001069
1001 | 2010_001070
1002 | 2010_001079
1003 | 2010_001104
1004 | 2010_001124
1005 | 2010_001149
1006 | 2010_001151
1007 | 2010_001174
1008 | 2010_001206
1009 | 2010_001246
1010 | 2010_001251
1011 | 2010_001256
1012 | 2010_001264
1013 | 2010_001292
1014 | 2010_001313
1015 | 2010_001327
1016 | 2010_001331
1017 | 2010_001351
1018 | 2010_001367
1019 | 2010_001376
1020 | 2010_001403
1021 | 2010_001448
1022 | 2010_001451
1023 | 2010_001522
1024 | 2010_001534
1025 | 2010_001553
1026 | 2010_001557
1027 | 2010_001563
1028 | 2010_001577
1029 | 2010_001579
1030 | 2010_001646
1031 | 2010_001656
1032 | 2010_001692
1033 | 2010_001699
1034 | 2010_001734
1035 | 2010_001752
1036 | 2010_001767
1037 | 2010_001768
1038 | 2010_001773
1039 | 2010_001820
1040 | 2010_001830
1041 | 2010_001851
1042 | 2010_001908
1043 | 2010_001913
1044 | 2010_001951
1045 | 2010_001956
1046 | 2010_001962
1047 | 2010_001966
1048 | 2010_001995
1049 | 2010_002017
1050 | 2010_002025
1051 | 2010_002030
1052 | 2010_002106
1053 | 2010_002137
1054 | 2010_002142
1055 | 2010_002146
1056 | 2010_002147
1057 | 2010_002150
1058 | 2010_002161
1059 | 2010_002200
1060 | 2010_002228
1061 | 2010_002232
1062 | 2010_002251
1063 | 2010_002271
1064 | 2010_002305
1065 | 2010_002310
1066 | 2010_002336
1067 | 2010_002348
1068 | 2010_002361
1069 | 2010_002390
1070 | 2010_002396
1071 | 2010_002422
1072 | 2010_002450
1073 | 2010_002480
1074 | 2010_002512
1075 | 2010_002531
1076 | 2010_002536
1077 | 2010_002538
1078 | 2010_002546
1079 | 2010_002623
1080 | 2010_002682
1081 | 2010_002691
1082 | 2010_002693
1083 | 2010_002701
1084 | 2010_002763
1085 | 2010_002792
1086 | 2010_002868
1087 | 2010_002900
1088 | 2010_002902
1089 | 2010_002921
1090 | 2010_002929
1091 | 2010_002939
1092 | 2010_002988
1093 | 2010_003014
1094 | 2010_003060
1095 | 2010_003123
1096 | 2010_003127
1097 | 2010_003132
1098 | 2010_003168
1099 | 2010_003183
1100 | 2010_003187
1101 | 2010_003207
1102 | 2010_003231
1103 | 2010_003239
1104 | 2010_003275
1105 | 2010_003276
1106 | 2010_003293
1107 | 2010_003302
1108 | 2010_003325
1109 | 2010_003362
1110 | 2010_003365
1111 | 2010_003381
1112 | 2010_003402
1113 | 2010_003409
1114 | 2010_003418
1115 | 2010_003446
1116 | 2010_003453
1117 | 2010_003468
1118 | 2010_003473
1119 | 2010_003495
1120 | 2010_003506
1121 | 2010_003514
1122 | 2010_003531
1123 | 2010_003532
1124 | 2010_003541
1125 | 2010_003547
1126 | 2010_003597
1127 | 2010_003675
1128 | 2010_003708
1129 | 2010_003716
1130 | 2010_003746
1131 | 2010_003758
1132 | 2010_003764
1133 | 2010_003768
1134 | 2010_003771
1135 | 2010_003772
1136 | 2010_003781
1137 | 2010_003813
1138 | 2010_003820
1139 | 2010_003854
1140 | 2010_003912
1141 | 2010_003915
1142 | 2010_003947
1143 | 2010_003956
1144 | 2010_003971
1145 | 2010_004041
1146 | 2010_004042
1147 | 2010_004056
1148 | 2010_004063
1149 | 2010_004104
1150 | 2010_004120
1151 | 2010_004149
1152 | 2010_004165
1153 | 2010_004208
1154 | 2010_004219
1155 | 2010_004226
1156 | 2010_004314
1157 | 2010_004320
1158 | 2010_004322
1159 | 2010_004337
1160 | 2010_004348
1161 | 2010_004355
1162 | 2010_004369
1163 | 2010_004382
1164 | 2010_004419
1165 | 2010_004432
1166 | 2010_004472
1167 | 2010_004479
1168 | 2010_004519
1169 | 2010_004520
1170 | 2010_004529
1171 | 2010_004543
1172 | 2010_004550
1173 | 2010_004551
1174 | 2010_004556
1175 | 2010_004559
1176 | 2010_004628
1177 | 2010_004635
1178 | 2010_004662
1179 | 2010_004697
1180 | 2010_004757
1181 | 2010_004763
1182 | 2010_004772
1183 | 2010_004783
1184 | 2010_004789
1185 | 2010_004795
1186 | 2010_004815
1187 | 2010_004825
1188 | 2010_004828
1189 | 2010_004856
1190 | 2010_004857
1191 | 2010_004861
1192 | 2010_004941
1193 | 2010_004946
1194 | 2010_004951
1195 | 2010_004980
1196 | 2010_004994
1197 | 2010_005013
1198 | 2010_005021
1199 | 2010_005046
1200 | 2010_005063
1201 | 2010_005108
1202 | 2010_005118
1203 | 2010_005159
1204 | 2010_005160
1205 | 2010_005166
1206 | 2010_005174
1207 | 2010_005180
1208 | 2010_005187
1209 | 2010_005206
1210 | 2010_005245
1211 | 2010_005252
1212 | 2010_005284
1213 | 2010_005305
1214 | 2010_005344
1215 | 2010_005353
1216 | 2010_005366
1217 | 2010_005401
1218 | 2010_005421
1219 | 2010_005428
1220 | 2010_005432
1221 | 2010_005433
1222 | 2010_005496
1223 | 2010_005501
1224 | 2010_005508
1225 | 2010_005531
1226 | 2010_005534
1227 | 2010_005575
1228 | 2010_005582
1229 | 2010_005606
1230 | 2010_005626
1231 | 2010_005644
1232 | 2010_005664
1233 | 2010_005705
1234 | 2010_005706
1235 | 2010_005709
1236 | 2010_005718
1237 | 2010_005719
1238 | 2010_005727
1239 | 2010_005762
1240 | 2010_005788
1241 | 2010_005860
1242 | 2010_005871
1243 | 2010_005877
1244 | 2010_005888
1245 | 2010_005899
1246 | 2010_005922
1247 | 2010_005991
1248 | 2010_005992
1249 | 2010_006026
1250 | 2010_006034
1251 | 2010_006054
1252 | 2010_006070
1253 | 2011_000045
1254 | 2011_000051
1255 | 2011_000054
1256 | 2011_000066
1257 | 2011_000070
1258 | 2011_000112
1259 | 2011_000173
1260 | 2011_000178
1261 | 2011_000185
1262 | 2011_000226
1263 | 2011_000234
1264 | 2011_000238
1265 | 2011_000239
1266 | 2011_000248
1267 | 2011_000283
1268 | 2011_000291
1269 | 2011_000310
1270 | 2011_000312
1271 | 2011_000338
1272 | 2011_000396
1273 | 2011_000412
1274 | 2011_000419
1275 | 2011_000435
1276 | 2011_000436
1277 | 2011_000438
1278 | 2011_000455
1279 | 2011_000456
1280 | 2011_000479
1281 | 2011_000481
1282 | 2011_000482
1283 | 2011_000503
1284 | 2011_000512
1285 | 2011_000521
1286 | 2011_000526
1287 | 2011_000536
1288 | 2011_000548
1289 | 2011_000566
1290 | 2011_000585
1291 | 2011_000598
1292 | 2011_000607
1293 | 2011_000618
1294 | 2011_000638
1295 | 2011_000658
1296 | 2011_000661
1297 | 2011_000669
1298 | 2011_000747
1299 | 2011_000780
1300 | 2011_000789
1301 | 2011_000807
1302 | 2011_000809
1303 | 2011_000813
1304 | 2011_000830
1305 | 2011_000843
1306 | 2011_000874
1307 | 2011_000888
1308 | 2011_000900
1309 | 2011_000912
1310 | 2011_000953
1311 | 2011_000969
1312 | 2011_001005
1313 | 2011_001014
1314 | 2011_001020
1315 | 2011_001047
1316 | 2011_001060
1317 | 2011_001064
1318 | 2011_001069
1319 | 2011_001071
1320 | 2011_001082
1321 | 2011_001110
1322 | 2011_001114
1323 | 2011_001159
1324 | 2011_001161
1325 | 2011_001190
1326 | 2011_001232
1327 | 2011_001263
1328 | 2011_001276
1329 | 2011_001281
1330 | 2011_001287
1331 | 2011_001292
1332 | 2011_001313
1333 | 2011_001341
1334 | 2011_001346
1335 | 2011_001350
1336 | 2011_001407
1337 | 2011_001416
1338 | 2011_001421
1339 | 2011_001434
1340 | 2011_001447
1341 | 2011_001489
1342 | 2011_001529
1343 | 2011_001530
1344 | 2011_001534
1345 | 2011_001546
1346 | 2011_001567
1347 | 2011_001589
1348 | 2011_001597
1349 | 2011_001601
1350 | 2011_001607
1351 | 2011_001613
1352 | 2011_001614
1353 | 2011_001619
1354 | 2011_001624
1355 | 2011_001642
1356 | 2011_001665
1357 | 2011_001669
1358 | 2011_001674
1359 | 2011_001708
1360 | 2011_001713
1361 | 2011_001714
1362 | 2011_001722
1363 | 2011_001726
1364 | 2011_001745
1365 | 2011_001748
1366 | 2011_001775
1367 | 2011_001782
1368 | 2011_001793
1369 | 2011_001794
1370 | 2011_001812
1371 | 2011_001862
1372 | 2011_001863
1373 | 2011_001868
1374 | 2011_001880
1375 | 2011_001910
1376 | 2011_001984
1377 | 2011_001988
1378 | 2011_002002
1379 | 2011_002040
1380 | 2011_002041
1381 | 2011_002064
1382 | 2011_002075
1383 | 2011_002098
1384 | 2011_002110
1385 | 2011_002121
1386 | 2011_002124
1387 | 2011_002150
1388 | 2011_002156
1389 | 2011_002178
1390 | 2011_002200
1391 | 2011_002223
1392 | 2011_002244
1393 | 2011_002247
1394 | 2011_002279
1395 | 2011_002295
1396 | 2011_002298
1397 | 2011_002308
1398 | 2011_002317
1399 | 2011_002322
1400 | 2011_002327
1401 | 2011_002343
1402 | 2011_002358
1403 | 2011_002371
1404 | 2011_002379
1405 | 2011_002391
1406 | 2011_002498
1407 | 2011_002509
1408 | 2011_002515
1409 | 2011_002532
1410 | 2011_002535
1411 | 2011_002548
1412 | 2011_002575
1413 | 2011_002578
1414 | 2011_002589
1415 | 2011_002592
1416 | 2011_002623
1417 | 2011_002641
1418 | 2011_002644
1419 | 2011_002662
1420 | 2011_002675
1421 | 2011_002685
1422 | 2011_002713
1423 | 2011_002730
1424 | 2011_002754
1425 | 2011_002812
1426 | 2011_002863
1427 | 2011_002879
1428 | 2011_002885
1429 | 2011_002929
1430 | 2011_002951
1431 | 2011_002975
1432 | 2011_002993
1433 | 2011_002997
1434 | 2011_003003
1435 | 2011_003011
1436 | 2011_003019
1437 | 2011_003030
1438 | 2011_003055
1439 | 2011_003085
1440 | 2011_003103
1441 | 2011_003114
1442 | 2011_003145
1443 | 2011_003146
1444 | 2011_003182
1445 | 2011_003197
1446 | 2011_003205
1447 | 2011_003240
1448 | 2011_003256
1449 | 2011_003271
1450 |
--------------------------------------------------------------------------------
/deeplab/__init__.py:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/deeplab/datasets.py:
--------------------------------------------------------------------------------
1 | import os
2 | import os.path as osp
3 | import numpy as np
4 | import random
5 | import matplotlib.pyplot as plt
6 | import collections
7 | import torch
8 | import torchvision
9 | import cv2
10 | from torch.utils import data
11 |
12 |
13 | class VOCDataSet(data.Dataset):
14 | def __init__(self, root, list_path, max_iters=None, crop_size=(321, 321), mean=(128, 128, 128), scale=True, mirror=True, ignore_label=255):
15 | self.root = root
16 | self.list_path = list_path
17 | self.crop_h, self.crop_w = crop_size
18 | self.scale = scale
19 | self.ignore_label = ignore_label
20 | self.mean = mean
21 | self.is_mirror = mirror
22 | # self.mean_bgr = np.array([104.00698793, 116.66876762, 122.67891434])
23 | self.img_ids = [i_id.strip() for i_id in open(list_path)]
24 |         if max_iters is not None:
25 | self.img_ids = self.img_ids * int(np.ceil(float(max_iters) / len(self.img_ids)))
26 | self.files = []
27 | # for split in ["train", "trainval", "val"]:
28 | for name in self.img_ids:
29 | img_file = osp.join(self.root, "JPEGImages/%s.jpg" % name)
30 | label_file = osp.join(self.root, "SegmentationClassAug/%s.png" % name)
31 | self.files.append({
32 | "img": img_file,
33 | "label": label_file,
34 | "name": name
35 | })
36 |
37 | def __len__(self):
38 | return len(self.files)
39 |
40 | def generate_scale_label(self, image, label):
41 | f_scale = 0.5 + random.randint(0, 11) / 10.0
42 | image = cv2.resize(image, None, fx=f_scale, fy=f_scale, interpolation = cv2.INTER_LINEAR)
43 | label = cv2.resize(label, None, fx=f_scale, fy=f_scale, interpolation = cv2.INTER_NEAREST)
44 | return image, label
45 |
46 | def __getitem__(self, index):
47 | datafiles = self.files[index]
48 | image = cv2.imread(datafiles["img"], cv2.IMREAD_COLOR)
49 | label = cv2.imread(datafiles["label"], cv2.IMREAD_GRAYSCALE)
50 | size = image.shape
51 | name = datafiles["name"]
52 | if self.scale:
53 | image, label = self.generate_scale_label(image, label)
54 | image = np.asarray(image, np.float32)
55 | image -= self.mean
56 | img_h, img_w = label.shape
57 | pad_h = max(self.crop_h - img_h, 0)
58 | pad_w = max(self.crop_w - img_w, 0)
59 | if pad_h > 0 or pad_w > 0:
60 | img_pad = cv2.copyMakeBorder(image, 0, pad_h, 0,
61 | pad_w, cv2.BORDER_CONSTANT,
62 | value=(0.0, 0.0, 0.0))
63 | label_pad = cv2.copyMakeBorder(label, 0, pad_h, 0,
64 | pad_w, cv2.BORDER_CONSTANT,
65 | value=(self.ignore_label,))
66 | else:
67 | img_pad, label_pad = image, label
68 |
69 | img_h, img_w = label_pad.shape
70 | h_off = random.randint(0, img_h - self.crop_h)
71 | w_off = random.randint(0, img_w - self.crop_w)
72 | # roi = cv2.Rect(w_off, h_off, self.crop_w, self.crop_h);
73 | image = np.asarray(img_pad[h_off : h_off+self.crop_h, w_off : w_off+self.crop_w], np.float32)
74 | label = np.asarray(label_pad[h_off : h_off+self.crop_h, w_off : w_off+self.crop_w], np.float32)
75 | #image = image[:, :, ::-1] # change to BGR
76 | image = image.transpose((2, 0, 1))
77 | if self.is_mirror:
78 | flip = np.random.choice(2) * 2 - 1
79 | image = image[:, :, ::flip]
80 | label = label[:, ::flip]
81 |
82 | return image.copy(), label.copy(), np.array(size), name
83 |
84 |
85 | class VOCDataTestSet(data.Dataset):
86 | def __init__(self, root, list_path, crop_size=(505, 505), mean=(128, 128, 128)):
87 | self.root = root
88 | self.list_path = list_path
89 | self.crop_h, self.crop_w = crop_size
90 | self.mean = mean
91 | # self.mean_bgr = np.array([104.00698793, 116.66876762, 122.67891434])
92 | self.img_ids = [i_id.strip() for i_id in open(list_path)]
93 | self.files = []
94 | # for split in ["train", "trainval", "val"]:
95 | for name in self.img_ids:
96 | img_file = osp.join(self.root, "JPEGImages/%s.jpg" % name)
97 | self.files.append({
98 | "img": img_file
99 | })
100 |
101 | def __len__(self):
102 | return len(self.files)
103 |
104 | def __getitem__(self, index):
105 | datafiles = self.files[index]
106 | image = cv2.imread(datafiles["img"], cv2.IMREAD_COLOR)
107 | size = image.shape
108 | name = osp.splitext(osp.basename(datafiles["img"]))[0]
109 | image = np.asarray(image, np.float32)
110 | image -= self.mean
111 |
112 | img_h, img_w, _ = image.shape
113 | pad_h = max(self.crop_h - img_h, 0)
114 | pad_w = max(self.crop_w - img_w, 0)
115 | if pad_h > 0 or pad_w > 0:
116 | image = cv2.copyMakeBorder(image, 0, pad_h, 0,
117 | pad_w, cv2.BORDER_CONSTANT,
118 | value=(0.0, 0.0, 0.0))
119 | image = image.transpose((2, 0, 1))
120 | return image, name, size
121 |
122 |
123 | if __name__ == '__main__':
124 |     dst = VOCDataSet("./data", "./dataset/list/train_aug.txt")  # was `is_transform=True`, which is not an accepted argument
125 |     trainloader = data.DataLoader(dst, batch_size=4)
126 |     for i, batch in enumerate(trainloader):  # avoid shadowing the `data` module
127 |         imgs, labels, _, _ = batch  # __getitem__ also returns the image size and name
128 |         if i == 0:
129 |             img = torchvision.utils.make_grid(imgs).numpy()
130 |             img = np.transpose(img, (1, 2, 0))
131 |             img = (img[:, :, ::-1] + 128.0) / 255.0  # BGR -> RGB, add the mean back for display
132 |             plt.imshow(img)
133 |             plt.show()
134 |
--------------------------------------------------------------------------------
/deeplab/loss.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn.functional as F
3 | import torch.nn as nn
4 | from torch.autograd import Variable
5 |
6 | class CrossEntropy2d(nn.Module):
7 |
8 | def __init__(self, size_average=True, ignore_label=255):
9 | super(CrossEntropy2d, self).__init__()
10 | self.size_average = size_average
11 | self.ignore_label = ignore_label
12 |
13 | def forward(self, predict, target, weight=None):
14 | """
15 | Args:
16 | predict:(n, c, h, w)
17 | target:(n, h, w)
18 | weight (Tensor, optional): a manual rescaling weight given to each class.
19 | If given, has to be a Tensor of size "nclasses"
20 | """
21 | assert not target.requires_grad
22 | assert predict.dim() == 4
23 | assert target.dim() == 3
24 | assert predict.size(0) == target.size(0), "{0} vs {1} ".format(predict.size(0), target.size(0))
25 | assert predict.size(2) == target.size(1), "{0} vs {1} ".format(predict.size(2), target.size(1))
26 |         assert predict.size(3) == target.size(2), "{0} vs {1} ".format(predict.size(3), target.size(2))
27 | n, c, h, w = predict.size()
28 | target_mask = (target >= 0) * (target != self.ignore_label)
29 | target = target[target_mask]
30 | if not target.data.dim():
31 | return Variable(torch.zeros(1))
32 | predict = predict.transpose(1, 2).transpose(2, 3).contiguous()
33 | predict = predict[target_mask.view(n, h, w, 1).repeat(1, 1, 1, c)].view(-1, c)
34 | loss = F.cross_entropy(predict, target, weight=weight, size_average=self.size_average)
35 | return loss
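36 | 
37 | # Example usage (a minimal sketch; shapes are illustrative):
38 | #   criterion = CrossEntropy2d(ignore_label=255)
39 | #   logits = torch.randn(2, 21, 41, 41)                # (n, c, h, w) network output
40 | #   target = torch.randint(0, 21, (2, 41, 41)).long()  # (n, h, w) ground-truth labels
41 | #   print(criterion(logits, target))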
--------------------------------------------------------------------------------
/deeplab/metric.py:
--------------------------------------------------------------------------------
1 | import os, sys
2 | import numpy as np
3 | import cv2  # needed by the __main__ block below
4 | from multiprocessing import Pool
5 | import copy_reg
6 | import types
7 | def _pickle_method(m):
8 | if m.im_self is None:
9 | return getattr, (m.im_class, m.im_func.func_name)
10 | else:
11 | return getattr, (m.im_self, m.im_func.func_name)
12 |
13 | copy_reg.pickle(types.MethodType, _pickle_method)
14 |
15 | class ConfusionMatrix(object):
16 |
17 | def __init__(self, nclass, classes=None):
18 | self.nclass = nclass
19 | self.classes = classes
20 | self.M = np.zeros((nclass, nclass))
21 |
22 | def add(self, gt, pred):
23 |         assert(np.max(pred) < self.nclass)  # class indices run from 0 to nclass-1
24 | assert(len(gt) == len(pred))
25 | for i in range(len(gt)):
26 | if not gt[i] == 255:
27 | self.M[gt[i], pred[i]] += 1.0
28 |
29 | def addM(self, matrix):
30 | assert(matrix.shape == self.M.shape)
31 | self.M += matrix
32 |
33 | def __str__(self):
34 | pass
35 |
36 | def recall(self):
37 | recall = 0.0
38 | for i in xrange(self.nclass):
39 | recall += self.M[i, i] / np.sum(self.M[:, i])
40 |
41 | return recall/self.nclass
42 |
43 | def accuracy(self):
44 | accuracy = 0.0
45 | for i in xrange(self.nclass):
46 | accuracy += self.M[i, i] / np.sum(self.M[i, :])
47 |
48 | return accuracy/self.nclass
49 |
50 | def jaccard(self):
51 | jaccard = 0.0
52 | jaccard_perclass = []
53 | for i in xrange(self.nclass):
54 | jaccard_perclass.append(self.M[i, i] / (np.sum(self.M[i, :]) + np.sum(self.M[:, i]) - self.M[i, i]))
55 |
56 | return np.sum(jaccard_perclass)/len(jaccard_perclass), jaccard_perclass, self.M
57 |
58 | def generateM(self, item):
59 | gt, pred = item
60 | m = np.zeros((self.nclass, self.nclass))
61 | assert(len(gt) == len(pred))
62 | for i in range(len(gt)):
63 | if gt[i] < self.nclass: #and pred[i] < self.nclass:
64 | m[gt[i], pred[i]] += 1.0
65 | return m
66 | 
def parse_args():
    # Minimal CLI reconstructed from how `args` is used below
    # (the original script called parse_args() without defining it).
    import argparse
    parser = argparse.ArgumentParser(description='Compute mean IoU from predicted and ground-truth label images.')
    parser.add_argument('--pred_dir', type=str, required=True, help='directory with predicted label images')
    parser.add_argument('--gt_dir', type=str, required=True, help='directory with ground-truth label images')
    parser.add_argument('--test_ids', type=str, required=True, help='file listing the image ids to evaluate')
    parser.add_argument('--class_num', type=int, default=21, help='number of classes')
    parser.add_argument('--save_path', type=str, default='result.txt', help='where to write the results')
    return parser.parse_args()
67 | 
68 | if __name__ == '__main__':
69 |     args = parse_args()
70 |
71 | m_list = []
72 | data_list = []
73 | test_ids = [i.strip() for i in open(args.test_ids) if not i.strip() == '']
74 | for index, img_id in enumerate(test_ids):
75 | if index % 100 == 0:
76 |             print('%d processed' % (index))
77 | pred_img_path = os.path.join(args.pred_dir, img_id+'.png')
78 | gt_img_path = os.path.join(args.gt_dir, img_id+'.png')
79 | pred = cv2.imread(pred_img_path, cv2.IMREAD_GRAYSCALE)
80 | gt = cv2.imread(gt_img_path, cv2.IMREAD_GRAYSCALE)
81 | # show_all(gt, pred)
82 | data_list.append([gt.flatten(), pred.flatten()])
83 |
84 | ConfM = ConfusionMatrix(args.class_num)
85 | f = ConfM.generateM
86 | pool = Pool()
87 | m_list = pool.map(f, data_list)
88 | pool.close()
89 | pool.join()
90 |
91 | for m in m_list:
92 | ConfM.addM(m)
93 |
94 | aveJ, j_list, M = ConfM.jaccard()
95 | with open(args.save_path, 'w') as f:
96 | f.write('meanIOU: ' + str(aveJ) + '\n')
97 | f.write(str(j_list)+'\n')
98 | f.write(str(M)+'\n')
99 |
--------------------------------------------------------------------------------
/deeplab/model.py:
--------------------------------------------------------------------------------
1 | import torch.nn as nn
2 | import math
3 | import torch.utils.model_zoo as model_zoo
4 | import torch
5 | import numpy as np
6 | affine_par = True
7 |
8 |
9 | def outS(i):
10 |     i = int(i)
11 |     i = (i + 1) // 2  # integer division keeps this correct on Python 3 as well
12 |     i = int(np.ceil((i + 1) / 2.0))
13 |     i = (i + 1) // 2
14 |     return i
15 |
16 | def conv3x3(in_planes, out_planes, stride=1):
17 | "3x3 convolution with padding"
18 | return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
19 | padding=1, bias=False)
20 |
21 |
22 | class BasicBlock(nn.Module):
23 | expansion = 1
24 |
25 | def __init__(self, inplanes, planes, stride=1, downsample=None):
26 | super(BasicBlock, self).__init__()
27 | self.conv1 = conv3x3(inplanes, planes, stride)
28 | self.bn1 = nn.BatchNorm2d(planes, affine = affine_par)
29 | self.relu = nn.ReLU(inplace=True)
30 | self.conv2 = conv3x3(planes, planes)
31 | self.bn2 = nn.BatchNorm2d(planes, affine = affine_par)
32 | self.downsample = downsample
33 | self.stride = stride
34 |
35 | def forward(self, x):
36 | residual = x
37 |
38 | out = self.conv1(x)
39 | out = self.bn1(out)
40 | out = self.relu(out)
41 |
42 | out = self.conv2(out)
43 | out = self.bn2(out)
44 |
45 | if self.downsample is not None:
46 | residual = self.downsample(x)
47 |
48 | out += residual
49 | out = self.relu(out)
50 |
51 | return out
52 |
53 |
54 | class Bottleneck(nn.Module):
55 | expansion = 4
56 |
57 | def __init__(self, inplanes, planes, stride=1, dilation=1, downsample=None):
58 | super(Bottleneck, self).__init__()
59 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, stride=stride, bias=False) # change
60 | self.bn1 = nn.BatchNorm2d(planes,affine = affine_par)
61 | for i in self.bn1.parameters():
62 | i.requires_grad = False
63 |
64 | padding = dilation
65 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, # change
66 | padding=padding, bias=False, dilation = dilation)
67 | self.bn2 = nn.BatchNorm2d(planes,affine = affine_par)
68 | for i in self.bn2.parameters():
69 | i.requires_grad = False
70 | self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
71 | self.bn3 = nn.BatchNorm2d(planes * 4, affine = affine_par)
72 | for i in self.bn3.parameters():
73 | i.requires_grad = False
74 | self.relu = nn.ReLU(inplace=True)
75 | self.downsample = downsample
76 | self.stride = stride
77 |
78 |
79 | def forward(self, x):
80 | residual = x
81 |
82 | out = self.conv1(x)
83 | out = self.bn1(out)
84 | out = self.relu(out)
85 |
86 | out = self.conv2(out)
87 | out = self.bn2(out)
88 | out = self.relu(out)
89 |
90 | out = self.conv3(out)
91 | out = self.bn3(out)
92 |
93 | if self.downsample is not None:
94 | residual = self.downsample(x)
95 |
96 | out += residual
97 | out = self.relu(out)
98 |
99 | return out
100 |
101 | class Classifier_Module(nn.Module):
102 |
103 | def __init__(self, dilation_series, padding_series, num_classes):
104 | super(Classifier_Module, self).__init__()
105 | self.conv2d_list = nn.ModuleList()
106 | for dilation, padding in zip(dilation_series, padding_series):
107 | self.conv2d_list.append(nn.Conv2d(2048, num_classes, kernel_size=3, stride=1, padding=padding, dilation=dilation, bias = True))
108 |
109 | for m in self.conv2d_list:
110 | m.weight.data.normal_(0, 0.01)
111 |
112 | def forward(self, x):
113 | out = self.conv2d_list[0](x)
114 | for i in range(len(self.conv2d_list)-1):
115 | out += self.conv2d_list[i+1](x)
116 | return out
117 |
118 | class Residual_Covolution(nn.Module):
119 | def __init__(self, icol, ocol, num_classes):
120 | super(Residual_Covolution, self).__init__()
121 | self.conv1 = nn.Conv2d(icol, ocol, kernel_size=3, stride=1, padding=12, dilation=12, bias=True)
122 | self.conv2 = nn.Conv2d(ocol, num_classes, kernel_size=3, stride=1, padding=12, dilation=12, bias=True)
123 | self.conv3 = nn.Conv2d(num_classes, ocol, kernel_size=1, stride=1, padding=0, dilation=1, bias=True)
124 | self.conv4 = nn.Conv2d(ocol, icol, kernel_size=1, stride=1, padding=0, dilation=1, bias=True)
125 | self.relu = nn.ReLU(inplace=True)
126 |
127 | def forward(self, x):
128 | dow1 = self.conv1(x)
129 | dow1 = self.relu(dow1)
130 | seg = self.conv2(dow1)
131 | inc1 = self.conv3(seg)
132 | add1 = dow1 + self.relu(inc1)
133 | inc2 = self.conv4(add1)
134 | out = x + self.relu(inc2)
135 | return out, seg
136 |
137 | class Residual_Refinement_Module(nn.Module):
138 |
139 | def __init__(self, num_classes):
140 | super(Residual_Refinement_Module, self).__init__()
141 | self.RC1 = Residual_Covolution(2048, 512, num_classes)
142 | self.RC2 = Residual_Covolution(2048, 512, num_classes)
143 |
144 | def forward(self, x):
145 | x, seg1 = self.RC1(x)
146 | _, seg2 = self.RC2(x)
147 | return [seg1, seg1+seg2]
148 |
149 | class ResNet_Refine(nn.Module):
150 | def __init__(self, block, layers, num_classes):
151 | self.inplanes = 64
152 | super(ResNet_Refine, self).__init__()
153 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
154 | bias=False)
155 | self.bn1 = nn.BatchNorm2d(64, affine = affine_par)
156 | for i in self.bn1.parameters():
157 | i.requires_grad = False
158 | self.relu = nn.ReLU(inplace=True)
159 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, ceil_mode=True) # change
160 | self.layer1 = self._make_layer(block, 64, layers[0])
161 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
162 | self.layer3 = self._make_layer(block, 256, layers[2], stride=1, dilation=2)
163 | self.layer4 = self._make_layer(block, 512, layers[3], stride=1, dilation=4)
164 | self.layer5 = Residual_Refinement_Module(num_classes)
165 |
166 | for m in self.modules():
167 | if isinstance(m, nn.Conv2d):
168 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
169 | m.weight.data.normal_(0, 0.01)
170 | elif isinstance(m, nn.BatchNorm2d):
171 | m.weight.data.fill_(1)
172 | m.bias.data.zero_()
173 | # for i in m.parameters():
174 | # i.requires_grad = False
175 |
176 | def _make_layer(self, block, planes, blocks, stride=1, dilation=1):
177 | downsample = None
178 | if stride != 1 or self.inplanes != planes * block.expansion or dilation == 2 or dilation == 4:
179 | downsample = nn.Sequential(
180 | nn.Conv2d(self.inplanes, planes * block.expansion,
181 | kernel_size=1, stride=stride, bias=False),
182 | nn.BatchNorm2d(planes * block.expansion,affine = affine_par))
183 | for i in downsample._modules['1'].parameters():
184 | i.requires_grad = False
185 | layers = []
186 | layers.append(block(self.inplanes, planes, stride,dilation=dilation, downsample=downsample))
187 | self.inplanes = planes * block.expansion
188 | for i in range(1, blocks):
189 | layers.append(block(self.inplanes, planes, dilation=dilation))
190 |
191 | return nn.Sequential(*layers)
192 |
193 | def forward(self, x):
194 | x = self.conv1(x)
195 | x = self.bn1(x)
196 | x = self.relu(x)
197 | x = self.maxpool(x)
198 | x = self.layer1(x)
199 | x = self.layer2(x)
200 | x = self.layer3(x)
201 | x = self.layer4(x)
202 | x = self.layer5(x)
203 |
204 | return x
205 |
206 | class ResNet(nn.Module):
207 | def __init__(self, block, layers, num_classes):
208 | self.inplanes = 64
209 | super(ResNet, self).__init__()
210 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3,
211 | bias=False)
212 | self.bn1 = nn.BatchNorm2d(64, affine = affine_par)
213 | for i in self.bn1.parameters():
214 | i.requires_grad = False
215 | self.relu = nn.ReLU(inplace=True)
216 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, ceil_mode=True) # change
217 | self.layer1 = self._make_layer(block, 64, layers[0])
218 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
219 | self.layer3 = self._make_layer(block, 256, layers[2], stride=1, dilation=2)
220 | self.layer4 = self._make_layer(block, 512, layers[3], stride=1, dilation=4)
221 | self.layer5 = self._make_pred_layer(Classifier_Module, [6,12,18,24],[6,12,18,24],num_classes)
222 |
223 | for m in self.modules():
224 | if isinstance(m, nn.Conv2d):
225 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
226 | m.weight.data.normal_(0, 0.01)
227 | elif isinstance(m, nn.BatchNorm2d):
228 | m.weight.data.fill_(1)
229 | m.bias.data.zero_()
230 | # for i in m.parameters():
231 | # i.requires_grad = False
232 |
233 | def _make_layer(self, block, planes, blocks, stride=1, dilation=1):
234 | downsample = None
235 | if stride != 1 or self.inplanes != planes * block.expansion or dilation == 2 or dilation == 4:
236 | downsample = nn.Sequential(
237 | nn.Conv2d(self.inplanes, planes * block.expansion,
238 | kernel_size=1, stride=stride, bias=False),
239 | nn.BatchNorm2d(planes * block.expansion,affine = affine_par))
240 | for i in downsample._modules['1'].parameters():
241 | i.requires_grad = False
242 | layers = []
243 | layers.append(block(self.inplanes, planes, stride,dilation=dilation, downsample=downsample))
244 | self.inplanes = planes * block.expansion
245 | for i in range(1, blocks):
246 | layers.append(block(self.inplanes, planes, dilation=dilation))
247 |
248 | return nn.Sequential(*layers)
249 | def _make_pred_layer(self,block, dilation_series, padding_series,num_classes):
250 | return block(dilation_series,padding_series,num_classes)
251 |
252 | def forward(self, x):
253 | x = self.conv1(x)
254 | x = self.bn1(x)
255 | x = self.relu(x)
256 | x = self.maxpool(x)
257 | x = self.layer1(x)
258 | x = self.layer2(x)
259 | x = self.layer3(x)
260 | x = self.layer4(x)
261 | x = self.layer5(x)
262 |
263 | return x
264 |
265 | class MS_Deeplab(nn.Module):
266 | def __init__(self,block,num_classes):
267 | super(MS_Deeplab,self).__init__()
268 | self.Scale = ResNet(block,[3, 4, 23, 3],num_classes) #changed to fix #4
269 |
270 | def forward(self,x):
271 | output = self.Scale(x) # for original scale
272 | output_size = output.size()[2]
273 | input_size = x.size()[2]
274 |
275 | self.interp1 = nn.Upsample(size=(int(input_size*0.75)+1, int(input_size*0.75)+1), mode='bilinear')
276 | self.interp2 = nn.Upsample(size=(int(input_size*0.5)+1, int(input_size*0.5)+1), mode='bilinear')
277 | self.interp3 = nn.Upsample(size=(output_size, output_size), mode='bilinear')
278 |
279 | x75 = self.interp1(x)
280 | output75 = self.interp3(self.Scale(x75)) # for 0.75x scale
281 |
282 | x5 = self.interp2(x)
283 | output5 = self.interp3(self.Scale(x5)) # for 0.5x scale
284 |
285 | out_max = torch.max(torch.max(output, output75), output5)
286 | return [output, output75, output5, out_max]
287 |
288 | def Res_Ms_Deeplab(num_classes=21):
289 | model = MS_Deeplab(Bottleneck, num_classes)
290 | return model
291 |
292 | def Res_Deeplab(num_classes=21, is_refine=False):
293 | if is_refine:
294 | model = ResNet_Refine(Bottleneck,[3, 4, 23, 3], num_classes)
295 | else:
296 | model = ResNet(Bottleneck,[3, 4, 23, 3], num_classes)
297 | return model
298 |
299 |
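300 | # Example usage (a minimal sketch):
301 | #   model = Res_Deeplab(num_classes=21)
302 | #   out = model(torch.randn(1, 3, 321, 321))  # -> (1, 21, 41, 41), since outS(321) == 41
303 | #   ms_model = Res_Ms_Deeplab()                # forward returns [out, out_0.75x, out_0.5x, fused max]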
--------------------------------------------------------------------------------
/evaluate.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import scipy
3 | from scipy import ndimage
4 | import cv2
5 | import numpy as np
6 | import sys
7 |
8 | import torch
9 | from torch.autograd import Variable
10 | import torchvision.models as models
11 | import torch.nn.functional as F
12 | from torch.utils import data
13 | from deeplab.model import Res_Deeplab
14 | from deeplab.datasets import VOCDataSet
15 | from collections import OrderedDict
16 | import os
17 |
18 | import matplotlib.pyplot as plt
19 | import torch.nn as nn
20 | IMG_MEAN = np.array((104.00698793,116.66876762,122.67891434), dtype=np.float32)
21 |
22 | DATA_DIRECTORY = '../../data/VOCdevkit/voc12'
23 | DATA_LIST_PATH = './dataset/list/val.txt'
24 | IGNORE_LABEL = 255
25 | NUM_CLASSES = 21
26 | NUM_STEPS = 1449 # Number of images in the validation set.
27 | RESTORE_FROM = './deeplab_resnet.ckpt'
28 |
29 | def get_arguments():
30 | """Parse all the arguments provided from the CLI.
31 |
32 | Returns:
33 |       The parsed arguments.
34 | """
35 | parser = argparse.ArgumentParser(description="DeepLabLFOV Network")
36 | parser.add_argument("--data-dir", type=str, default=DATA_DIRECTORY,
37 | help="Path to the directory containing the PASCAL VOC dataset.")
38 | parser.add_argument("--data-list", type=str, default=DATA_LIST_PATH,
39 | help="Path to the file listing the images in the dataset.")
40 | parser.add_argument("--ignore-label", type=int, default=IGNORE_LABEL,
41 | help="The index of the label to ignore during the training.")
42 | parser.add_argument("--num-classes", type=int, default=NUM_CLASSES,
43 | help="Number of classes to predict (including background).")
44 | parser.add_argument("--restore-from", type=str, default=RESTORE_FROM,
45 | help="Where restore model parameters from.")
46 | parser.add_argument("--gpu", type=int, default=0,
47 | help="choose gpu device.")
48 | return parser.parse_args()
49 |
50 |
51 | def get_iou(data_list, class_num, save_path=None):
52 | from multiprocessing import Pool
53 | from deeplab.metric import ConfusionMatrix
54 |
55 | ConfM = ConfusionMatrix(class_num)
56 | f = ConfM.generateM
57 | pool = Pool()
58 | m_list = pool.map(f, data_list)
59 | pool.close()
60 | pool.join()
61 |
62 | for m in m_list:
63 | ConfM.addM(m)
64 |
65 | aveJ, j_list, M = ConfM.jaccard()
66 | print('meanIOU: ' + str(aveJ) + '\n')
67 | if save_path:
68 | with open(save_path, 'w') as f:
69 | f.write('meanIOU: ' + str(aveJ) + '\n')
70 | f.write(str(j_list)+'\n')
71 | f.write(str(M)+'\n')
72 |
73 | def show_all(gt, pred):
74 | import matplotlib.pyplot as plt
75 | from matplotlib import colors
76 | from mpl_toolkits.axes_grid1 import make_axes_locatable
77 |
78 | fig, axes = plt.subplots(1, 2)
79 | ax1, ax2 = axes
80 |
81 | classes = np.array(('background', # always index 0
82 | 'aeroplane', 'bicycle', 'bird', 'boat',
83 | 'bottle', 'bus', 'car', 'cat', 'chair',
84 | 'cow', 'diningtable', 'dog', 'horse',
85 | 'motorbike', 'person', 'pottedplant',
86 | 'sheep', 'sofa', 'train', 'tvmonitor'))
87 | colormap = [(0,0,0),(0.5,0,0),(0,0.5,0),(0.5,0.5,0),(0,0,0.5),(0.5,0,0.5),(0,0.5,0.5),
88 | (0.5,0.5,0.5),(0.25,0,0),(0.75,0,0),(0.25,0.5,0),(0.75,0.5,0),(0.25,0,0.5),
89 | (0.75,0,0.5),(0.25,0.5,0.5),(0.75,0.5,0.5),(0,0.25,0),(0.5,0.25,0),(0,0.75,0),
90 | (0.5,0.75,0),(0,0.25,0.5)]
91 | cmap = colors.ListedColormap(colormap)
92 | bounds=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
93 | norm = colors.BoundaryNorm(bounds, cmap.N)
94 |
95 | ax1.set_title('gt')
96 | ax1.imshow(gt, cmap=cmap, norm=norm)
97 |
98 | ax2.set_title('pred')
99 | ax2.imshow(pred, cmap=cmap, norm=norm)
100 |
101 | plt.show()
102 |
103 | def main():
104 | """Create the model and start the evaluation process."""
105 | args = get_arguments()
106 |
107 | gpu0 = args.gpu
108 |
109 | model = Res_Deeplab(num_classes=args.num_classes)
110 |
111 | saved_state_dict = torch.load(args.restore_from)
112 | model.load_state_dict(saved_state_dict)
113 |
114 | model.eval()
115 | model.cuda(gpu0)
116 |
117 | testloader = data.DataLoader(VOCDataSet(args.data_dir, args.data_list, crop_size=(505, 505), mean=IMG_MEAN, scale=False, mirror=False),
118 | batch_size=1, shuffle=False, pin_memory=True)
119 |
120 | interp = nn.Upsample(size=(505, 505), mode='bilinear', align_corners=True)
121 | data_list = []
122 |
123 | for index, batch in enumerate(testloader):
124 | if index % 100 == 0:
125 |             print('%d processed' % index)
126 | image, label, size, name = batch
127 | size = size[0].numpy()
128 | output = model(Variable(image, volatile=True).cuda(gpu0))
129 | output = interp(output).cpu().data[0].numpy()
130 |
131 | output = output[:,:size[0],:size[1]]
132 |         gt = np.asarray(label[0].numpy()[:size[0],:size[1]], dtype=int)
133 |
134 | output = output.transpose(1,2,0)
135 |         output = np.asarray(np.argmax(output, axis=2), dtype=int)
136 |
137 | # show_all(gt, output)
138 | data_list.append([gt.flatten(), output.flatten()])
139 |
140 | get_iou(data_list, args.num_classes)
141 |
142 |
143 | if __name__ == '__main__':
144 | main()
145 |
--------------------------------------------------------------------------------
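Editor's note: `get_iou` above delegates the bookkeeping to `deeplab.metric.ConfusionMatrix`; the number it reports is the per-class Jaccard index averaged over classes. A self-contained sketch of that computation (illustrative names, not the repo's API):

```python
# Per-class IoU (Jaccard) from a C x C confusion matrix M, where M[i, j]
# counts pixels with ground-truth class i predicted as class j.
import numpy as np

def mean_iou(M):
    tp = np.diag(M).astype(np.float64)          # true positives per class
    union = M.sum(axis=1) + M.sum(axis=0) - tp  # gt pixels + pred pixels - overlap
    iou = tp / np.maximum(union, 1)             # guard against empty classes
    return iou.mean(), iou

M = np.array([[50, 2],
              [3, 45]])                         # toy 2-class example
mean, per_class = mean_iou(M)
print(mean, per_class)                          # ~0.905, [0.909, 0.9]
```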
/evaluate_msc.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import scipy
3 | from scipy import ndimage
4 | import cv2
5 | import numpy as np
6 | import sys
7 |
8 | import torch
9 | from torch.autograd import Variable
10 | import torchvision.models as models
11 | import torch.nn.functional as F
12 | from torch.utils import data
13 | from deeplab.model import Res_Deeplab
14 | from deeplab.datasets import VOCDataSet
15 | from collections import OrderedDict
16 | import os
17 |
18 | import matplotlib.pyplot as plt
19 | import torch.nn as nn
20 | IMG_MEAN = np.array((104.00698793,116.66876762,122.67891434), dtype=np.float32)
21 |
22 | DATA_DIRECTORY = '../data/VOCdevkit/voc12'
23 | DATA_LIST_PATH = './dataset/list/val.txt'
24 | IGNORE_LABEL = 255
25 | NUM_CLASSES = 21
26 | NUM_STEPS = 1449 # Number of images in the validation set.
27 | RESTORE_FROM = './deeplab_resnet.ckpt'
28 |
29 | def get_arguments():
30 | """Parse all the arguments provided from the CLI.
31 |
32 | Returns:
33 |       The parsed arguments (an argparse.Namespace).
34 | """
35 |     parser = argparse.ArgumentParser(description="DeepLab-ResNet Network")
36 | parser.add_argument("--data-dir", type=str, default=DATA_DIRECTORY,
37 | help="Path to the directory containing the PASCAL VOC dataset.")
38 | parser.add_argument("--data-list", type=str, default=DATA_LIST_PATH,
39 | help="Path to the file listing the images in the dataset.")
40 | parser.add_argument("--ignore-label", type=int, default=IGNORE_LABEL,
41 |                         help="The index of the label to ignore during training.")
42 | parser.add_argument("--num-classes", type=int, default=NUM_CLASSES,
43 | help="Number of classes to predict (including background).")
44 | parser.add_argument("--restore-from", type=str, default=RESTORE_FROM,
45 |                         help="Where to restore model parameters from.")
46 | parser.add_argument("--gpu", type=int, default=0,
47 |                         help="Choose the GPU device.")
48 | return parser.parse_args()
49 |
50 |
51 | def get_iou(data_list, class_num, save_path=None):
52 | from multiprocessing import Pool
53 | from deeplab.metric import ConfusionMatrix
54 |
55 | ConfM = ConfusionMatrix(class_num)
56 | f = ConfM.generateM
57 | pool = Pool()
58 | m_list = pool.map(f, data_list)
59 | pool.close()
60 | pool.join()
61 |
62 | for m in m_list:
63 | ConfM.addM(m)
64 |
65 | aveJ, j_list, M = ConfM.jaccard()
66 | print('meanIOU: ' + str(aveJ) + '\n')
67 | if save_path:
68 | with open(save_path, 'w') as f:
69 | f.write('meanIOU: ' + str(aveJ) + '\n')
70 | f.write(str(j_list)+'\n')
71 | f.write(str(M)+'\n')
72 |
73 | def show_all(gt, pred):
74 | import matplotlib.pyplot as plt
75 | from matplotlib import colors
76 | from mpl_toolkits.axes_grid1 import make_axes_locatable
77 |
78 | fig, axes = plt.subplots(1, 2)
79 | ax1, ax2 = axes
80 |
81 | classes = np.array(('background', # always index 0
82 | 'aeroplane', 'bicycle', 'bird', 'boat',
83 | 'bottle', 'bus', 'car', 'cat', 'chair',
84 | 'cow', 'diningtable', 'dog', 'horse',
85 | 'motorbike', 'person', 'pottedplant',
86 | 'sheep', 'sofa', 'train', 'tvmonitor'))
87 | colormap = [(0,0,0),(0.5,0,0),(0,0.5,0),(0.5,0.5,0),(0,0,0.5),(0.5,0,0.5),(0,0.5,0.5),
88 | (0.5,0.5,0.5),(0.25,0,0),(0.75,0,0),(0.25,0.5,0),(0.75,0.5,0),(0.25,0,0.5),
89 | (0.75,0,0.5),(0.25,0.5,0.5),(0.75,0.5,0.5),(0,0.25,0),(0.5,0.25,0),(0,0.75,0),
90 | (0.5,0.75,0),(0,0.25,0.5)]
91 | cmap = colors.ListedColormap(colormap)
92 | bounds=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
93 | norm = colors.BoundaryNorm(bounds, cmap.N)
94 |
95 | ax1.set_title('gt')
96 | ax1.imshow(gt, cmap=cmap, norm=norm)
97 |
98 | ax2.set_title('pred')
99 | ax2.imshow(pred, cmap=cmap, norm=norm)
100 |
101 | plt.show()
102 |
103 | def main():
104 | """Create the model and start the evaluation process."""
105 | args = get_arguments()
106 |
107 | gpu0 = args.gpu
108 |
109 | model = Res_Deeplab(num_classes=args.num_classes)
110 |
111 | saved_state_dict = torch.load(args.restore_from)
112 | model.load_state_dict(saved_state_dict)
113 |
114 | model.eval()
115 | model.cuda(gpu0)
116 |
117 | testloader = data.DataLoader(VOCDataSet(args.data_dir, args.data_list, crop_size=(505, 505), mean=IMG_MEAN, scale=False, mirror=False),
118 | batch_size=1, shuffle=False, pin_memory=True)
119 |
120 |     interp = nn.Upsample(size=(505, 505), mode='bilinear', align_corners=True)
121 | data_list = []
122 |
123 | for index, batch in enumerate(testloader):
124 | if index % 100 == 0:
125 |             print('%d processed' % index)
126 | images, label, size, name = batch
127 | images = Variable(images, volatile=True)
128 | h, w, c = size[0].numpy()
129 |         images075 = nn.Upsample(size=(int(h*0.75), int(w*0.75)), mode='bilinear', align_corners=True)(images)
130 |         images05 = nn.Upsample(size=(int(h*0.5), int(w*0.5)), mode='bilinear', align_corners=True)(images)
131 |
132 | out100 = model(images.cuda(args.gpu))
133 | out075 = model(images075.cuda(args.gpu))
134 | out05 = model(images05.cuda(args.gpu))
135 | o_h, o_w = out100.size()[2:]
136 |         interpo1 = nn.Upsample(size=(o_h, o_w), mode='bilinear', align_corners=True)
137 | out_max = torch.max(torch.stack([out100, interpo1(out075), interpo1(out05)]), dim=0)[0]
138 |
139 | output = interp(out_max).cpu().data[0].numpy()
140 |
141 | output = output[:,:h,:w]
142 | output = output.transpose(1,2,0)
143 |         output = np.asarray(np.argmax(output, axis=2), dtype=int)
144 |
145 |         gt = np.asarray(label[0].numpy()[:h,:w], dtype=int)
146 |
147 | # show_all(gt, output)
148 | data_list.append([gt.flatten(), output.flatten()])
149 |
150 | get_iou(data_list, args.num_classes)
151 |
152 |
153 | if __name__ == '__main__':
154 | main()
--------------------------------------------------------------------------------
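Editor's note: the fusion step in the loop above (the `torch.max(torch.stack(...))` line) merges the three scale predictions by taking an element-wise maximum over the class scores, after resizing them to a common grid. A standalone sketch with random tensors, assuming a PyTorch version that provides `torch.nn.functional.interpolate`:

```python
import torch
import torch.nn.functional as F

# Score maps predicted at input scales 1.0, 0.75 and 0.5
# (shapes are illustrative; 21 channels = PASCAL VOC classes).
out100 = torch.randn(1, 21, 41, 41)
out075 = torch.randn(1, 21, 31, 31)
out05 = torch.randn(1, 21, 21, 21)

def up(t):
    # resize to the full-scale prediction grid before fusing
    return F.interpolate(t, size=(41, 41), mode='bilinear', align_corners=True)

# stack -> (3, 1, 21, 41, 41); element-wise max over the scale dimension
fused = torch.max(torch.stack([out100, up(out075), up(out05)]), dim=0)[0]
print(fused.shape)  # torch.Size([1, 21, 41, 41])
```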
/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 |
3 | import torch
4 | import torch.nn as nn
5 | from torch.utils import data
6 | import numpy as np
7 | import pickle
8 | import cv2
9 | from torch.autograd import Variable
10 | import torch.optim as optim
11 | import scipy.misc
12 | import torch.backends.cudnn as cudnn
13 | import sys
14 | import os
15 | import os.path as osp
16 | from deeplab.model import Res_Deeplab
17 | from deeplab.loss import CrossEntropy2d
18 | from deeplab.datasets import VOCDataSet
19 | import matplotlib.pyplot as plt
20 | import random
21 | import timeit
22 | start = timeit.default_timer()
23 |
24 | IMG_MEAN = np.array((104.00698793,116.66876762,122.67891434), dtype=np.float32)
25 |
26 | BATCH_SIZE = 10
27 | DATA_DIRECTORY = '../../data/VOCdevkit/voc12'
28 | DATA_LIST_PATH = './dataset/list/train_aug.txt'
29 | IGNORE_LABEL = 255
30 | INPUT_SIZE = '321,321'
31 | LEARNING_RATE = 2.5e-4
32 | MOMENTUM = 0.9
33 | NUM_CLASSES = 21
34 | NUM_STEPS = 20000
35 | POWER = 0.9
36 | RANDOM_SEED = 1234
37 | RESTORE_FROM = './dataset/MS_DeepLab_resnet_pretrained_COCO_init.pth'
38 | SAVE_NUM_IMAGES = 2
39 | SAVE_PRED_EVERY = 1000
40 | SNAPSHOT_DIR = './snapshots/'
41 | WEIGHT_DECAY = 0.0005
42 |
43 | def get_arguments():
44 | """Parse all the arguments provided from the CLI.
45 |
46 | Returns:
47 |       The parsed arguments (an argparse.Namespace).
48 | """
49 | parser = argparse.ArgumentParser(description="DeepLab-ResNet Network")
50 | parser.add_argument("--batch-size", type=int, default=BATCH_SIZE,
51 | help="Number of images sent to the network in one step.")
52 | parser.add_argument("--data-dir", type=str, default=DATA_DIRECTORY,
53 | help="Path to the directory containing the PASCAL VOC dataset.")
54 | parser.add_argument("--data-list", type=str, default=DATA_LIST_PATH,
55 | help="Path to the file listing the images in the dataset.")
56 | parser.add_argument("--ignore-label", type=int, default=IGNORE_LABEL,
57 |                         help="The index of the label to ignore during training.")
58 | parser.add_argument("--input-size", type=str, default=INPUT_SIZE,
59 | help="Comma-separated string with height and width of images.")
60 | parser.add_argument("--is-training", action="store_true",
61 |                         help="Whether to update the running means and variances during training.")
62 | parser.add_argument("--learning-rate", type=float, default=LEARNING_RATE,
63 | help="Base learning rate for training with polynomial decay.")
64 | parser.add_argument("--momentum", type=float, default=MOMENTUM,
65 | help="Momentum component of the optimiser.")
66 | parser.add_argument("--not-restore-last", action="store_true",
67 | help="Whether to not restore last (FC) layers.")
68 | parser.add_argument("--num-classes", type=int, default=NUM_CLASSES,
69 | help="Number of classes to predict (including background).")
70 | parser.add_argument("--num-steps", type=int, default=NUM_STEPS,
71 | help="Number of training steps.")
72 | parser.add_argument("--power", type=float, default=POWER,
73 | help="Decay parameter to compute the learning rate.")
74 | parser.add_argument("--random-mirror", action="store_true",
75 | help="Whether to randomly mirror the inputs during the training.")
76 | parser.add_argument("--random-scale", action="store_true",
77 | help="Whether to randomly scale the inputs during the training.")
78 | parser.add_argument("--random-seed", type=int, default=RANDOM_SEED,
79 | help="Random seed to have reproducible results.")
80 | parser.add_argument("--restore-from", type=str, default=RESTORE_FROM,
81 |                         help="Where to restore model parameters from.")
82 | parser.add_argument("--save-num-images", type=int, default=SAVE_NUM_IMAGES,
83 | help="How many images to save.")
84 | parser.add_argument("--save-pred-every", type=int, default=SAVE_PRED_EVERY,
85 |                         help="How often (in steps) to save summaries and checkpoints.")
86 | parser.add_argument("--snapshot-dir", type=str, default=SNAPSHOT_DIR,
87 | help="Where to save snapshots of the model.")
88 | parser.add_argument("--weight-decay", type=float, default=WEIGHT_DECAY,
89 | help="Regularisation parameter for L2-loss.")
90 | parser.add_argument("--gpu", type=int, default=0,
91 |                         help="Choose the GPU device.")
92 | return parser.parse_args()
93 |
94 | args = get_arguments()
95 |
96 | def loss_calc(pred, label):
97 | """
98 | This function returns cross entropy loss for semantic segmentation
99 | """
100 |     # pred: batch_size x channels x h x w (raw, unnormalised scores)
101 |     # label: batch_size x h x w (class indices; 255 is ignored)
102 | label = Variable(label.long()).cuda()
103 | criterion = torch.nn.CrossEntropyLoss(ignore_index=IGNORE_LABEL).cuda()
104 |
105 | return criterion(pred, label)
106 |
107 |
108 | def lr_poly(base_lr, iter, max_iter, power):
109 | return base_lr*((1-float(iter)/max_iter)**(power))
110 |
111 |
112 | def get_1x_lr_params_NOscale(model):
113 | """
114 | This generator returns all the parameters of the net except for
115 | the last classification layer. Note that for each batchnorm layer,
116 |     requires_grad is set to False in deeplab/model.py, therefore this function does not return
117 | any batchnorm parameter
118 | """
119 | b = []
120 |
121 | b.append(model.conv1)
122 | b.append(model.bn1)
123 | b.append(model.layer1)
124 | b.append(model.layer2)
125 | b.append(model.layer3)
126 | b.append(model.layer4)
127 |
128 |
129 |     for i in range(len(b)):
130 |         for j in b[i].modules():
131 |             # yield only the trainable parameters; the frozen
132 |             # batchnorm parameters have requires_grad == False
133 |             for k in j.parameters():
134 |                 if k.requires_grad:
135 |                     yield k
136 |
137 | def get_10x_lr_params(model):
138 | """
139 | This generator returns all the parameters for the last layer of the net,
140 |     which does the classification of pixels into classes
141 | """
142 | b = []
143 | b.append(model.layer5.parameters())
144 |
145 | for j in range(len(b)):
146 | for i in b[j]:
147 | yield i
148 |
149 |
150 | def adjust_learning_rate(optimizer, i_iter):
151 |     """Polynomially decay the learning rate; the classifier parameter group gets 10x the base rate."""
152 | lr = lr_poly(args.learning_rate, i_iter, args.num_steps, args.power)
153 | optimizer.param_groups[0]['lr'] = lr
154 | optimizer.param_groups[1]['lr'] = lr * 10
155 |
156 |
157 | def main():
158 | """Create the model and start the training."""
159 |
160 | os.environ["CUDA_VISIBLE_DEVICES"]=str(args.gpu)
161 | h, w = map(int, args.input_size.split(','))
162 | input_size = (h, w)
163 |
164 | cudnn.enabled = True
165 |
166 | # Create network.
167 | model = Res_Deeplab(num_classes=args.num_classes)
168 | # For a small batch size, it is better to keep
169 | # the statistics of the BN layers (running means and variances)
170 | # frozen, and to not update the values provided by the pre-trained model.
171 | # If is_training=True, the statistics will be updated during the training.
172 | # Note that is_training=False still updates BN parameters gamma (scale) and beta (offset)
173 |     # if they are present in the optimiser's parameter groups.
174 |
175 | saved_state_dict = torch.load(args.restore_from)
176 | new_params = model.state_dict().copy()
177 | for i in saved_state_dict:
178 | #Scale.layer5.conv2d_list.3.weight
179 | i_parts = i.split('.')
180 | # print i_parts
181 |         if args.num_classes != 21 or i_parts[1] != 'layer5':
182 | new_params['.'.join(i_parts[1:])] = saved_state_dict[i]
183 | model.load_state_dict(new_params)
184 | #model.float()
185 | #model.eval() # use_global_stats = True
186 | model.train()
187 | model.cuda()
188 |
189 | cudnn.benchmark = True
190 |
191 | if not os.path.exists(args.snapshot_dir):
192 | os.makedirs(args.snapshot_dir)
193 |
194 |
195 | trainloader = data.DataLoader(VOCDataSet(args.data_dir, args.data_list, max_iters=args.num_steps*args.batch_size, crop_size=input_size,
196 | scale=args.random_scale, mirror=args.random_mirror, mean=IMG_MEAN),
197 | batch_size=args.batch_size, shuffle=True, num_workers=5, pin_memory=True)
198 |
199 | optimizer = optim.SGD([{'params': get_1x_lr_params_NOscale(model), 'lr': args.learning_rate },
200 | {'params': get_10x_lr_params(model), 'lr': 10*args.learning_rate}],
201 | lr=args.learning_rate, momentum=args.momentum,weight_decay=args.weight_decay)
202 | optimizer.zero_grad()
203 |
204 | interp = nn.Upsample(size=input_size, mode='bilinear', align_corners=True)
205 |
206 |
207 | for i_iter, batch in enumerate(trainloader):
208 | images, labels, _, _ = batch
209 | images = Variable(images).cuda()
210 |
211 | optimizer.zero_grad()
212 | adjust_learning_rate(optimizer, i_iter)
213 | pred = interp(model(images))
214 | loss = loss_calc(pred, labels)
215 | loss.backward()
216 | optimizer.step()
217 |
218 |
219 |         print('iter = {} of {} completed, loss = {}'.format(i_iter, args.num_steps, loss.data.cpu().numpy()))
220 |
221 | if i_iter >= args.num_steps-1:
222 |             print('save model ...')
223 | torch.save(model.state_dict(),osp.join(args.snapshot_dir, 'VOC12_scenes_'+str(args.num_steps)+'.pth'))
224 | break
225 |
226 | if i_iter % args.save_pred_every == 0 and i_iter!=0:
227 |             print('taking snapshot ...')
228 | torch.save(model.state_dict(),osp.join(args.snapshot_dir, 'VOC12_scenes_'+str(i_iter)+'.pth'))
229 |
230 | end = timeit.default_timer()
231 |     print(end - start, 'seconds')
232 |
233 | if __name__ == '__main__':
234 | main()
235 |
--------------------------------------------------------------------------------
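Editor's note: for reference, the `lr_poly` schedule in `train.py` decays the base rate as `base_lr * (1 - i/max_iter)**power`. With the defaults above (LEARNING_RATE=2.5e-4, NUM_STEPS=20000, POWER=0.9), the curve looks like this:

```python
# Worked example of the polynomial decay in lr_poly (editorial sketch).
base_lr, max_iter, power = 2.5e-4, 20000, 0.9

def lr_poly(base_lr, i, max_iter, power):
    return base_lr * ((1 - float(i) / max_iter) ** power)

for i in (0, 5000, 10000, 19999):
    print(i, lr_poly(base_lr, i, max_iter, power))
# 0     -> 2.50e-4
# 5000  -> ~1.93e-4   (2.5e-4 * 0.75**0.9)
# 10000 -> ~1.34e-4   (2.5e-4 * 0.5**0.9)
# 19999 -> ~3.4e-8    (almost zero at the last step)
```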
/train_msc.py:
--------------------------------------------------------------------------------
1 | import argparse
2 |
3 | import torch
4 | import torch.nn as nn
5 | from torch.utils import data
6 | import numpy as np
7 | import pickle
8 | import cv2
9 | from torch.autograd import Variable
10 | import torch.optim as optim
11 | import scipy.misc
12 | import torch.backends.cudnn as cudnn
13 | import sys
14 | import os
15 | import os.path as osp
16 | import scipy.ndimage as nd
17 | from deeplab.model import Res_Deeplab
18 | from deeplab.loss import CrossEntropy2d
19 | from deeplab.datasets import VOCDataSet
20 | import matplotlib.pyplot as plt
21 | import random
22 | import timeit
23 | start = timeit.default_timer()
24 |
25 | IMG_MEAN = np.array((104.00698793,116.66876762,122.67891434), dtype=np.float32)
26 |
27 | BATCH_SIZE = 1
28 | DATA_DIRECTORY = '../data/VOCdevkit/voc12'
29 | DATA_LIST_PATH = './dataset/list/train_aug.txt'
30 | ITER_SIZE = 10
31 | IGNORE_LABEL = 255
32 | INPUT_SIZE = '321,321'
33 | LEARNING_RATE = 2.5e-4
34 | MOMENTUM = 0.9
35 | NUM_CLASSES = 21
36 | NUM_STEPS = 20000
37 | POWER = 0.9
38 | RANDOM_SEED = 1234
39 | RESTORE_FROM = './dataset/MS_DeepLab_resnet_pretrained_COCO_init.pth'
40 | SAVE_NUM_IMAGES = 2
41 | SAVE_PRED_EVERY = 1000
42 | SNAPSHOT_DIR = './snapshots_msc/'
43 | WEIGHT_DECAY = 0.0005
44 |
45 | def get_arguments():
46 | """Parse all the arguments provided from the CLI.
47 |
48 | Returns:
49 | A list of parsed arguments.
50 | """
51 | parser = argparse.ArgumentParser(description="DeepLab-ResNet Network")
52 | parser.add_argument("--batch-size", type=int, default=BATCH_SIZE,
53 | help="Number of images sent to the network in one step.")
54 | parser.add_argument("--data-dir", type=str, default=DATA_DIRECTORY,
55 | help="Path to the directory containing the PASCAL VOC dataset.")
56 | parser.add_argument("--data-list", type=str, default=DATA_LIST_PATH,
57 | help="Path to the file listing the images in the dataset.")
58 | parser.add_argument("--ignore-label", type=int, default=IGNORE_LABEL,
59 |                         help="The index of the label to ignore during training.")
60 | parser.add_argument("--input-size", type=str, default=INPUT_SIZE,
61 | help="Comma-separated string with height and width of images.")
62 | parser.add_argument("--iter-size", type=int, default=ITER_SIZE,
63 | help="Number of steps after which gradient update is applied.")
64 | parser.add_argument("--is-training", action="store_true",
65 |                         help="Whether to update the running means and variances during training.")
66 | parser.add_argument("--learning-rate", type=float, default=LEARNING_RATE,
67 | help="Base learning rate for training with polynomial decay.")
68 | parser.add_argument("--momentum", type=float, default=MOMENTUM,
69 | help="Momentum component of the optimiser.")
70 | parser.add_argument("--not-restore-last", action="store_true",
71 | help="Whether to not restore last (FC) layers.")
72 | parser.add_argument("--num-classes", type=int, default=NUM_CLASSES,
73 | help="Number of classes to predict (including background).")
74 | parser.add_argument("--num-steps", type=int, default=NUM_STEPS,
75 | help="Number of training steps.")
76 | parser.add_argument("--power", type=float, default=POWER,
77 | help="Decay parameter to compute the learning rate.")
78 | parser.add_argument("--random-mirror", action="store_true",
79 | help="Whether to randomly mirror the inputs during the training.")
80 | parser.add_argument("--random-scale", action="store_true",
81 | help="Whether to randomly scale the inputs during the training.")
82 | parser.add_argument("--random-seed", type=int, default=RANDOM_SEED,
83 | help="Random seed to have reproducible results.")
84 | parser.add_argument("--restore-from", type=str, default=RESTORE_FROM,
85 |                         help="Where to restore model parameters from.")
86 | parser.add_argument("--save-num-images", type=int, default=SAVE_NUM_IMAGES,
87 | help="How many images to save.")
88 | parser.add_argument("--save-pred-every", type=int, default=SAVE_PRED_EVERY,
89 |                         help="How often (in steps) to save summaries and checkpoints.")
90 | parser.add_argument("--snapshot-dir", type=str, default=SNAPSHOT_DIR,
91 | help="Where to save snapshots of the model.")
92 | parser.add_argument("--weight-decay", type=float, default=WEIGHT_DECAY,
93 | help="Regularisation parameter for L2-loss.")
94 | parser.add_argument("--gpu", type=int, default=0,
95 |                         help="Choose the GPU device.")
96 | return parser.parse_args()
97 |
98 | args = get_arguments()
99 |
100 | def loss_calc(pred, label, gpu):
101 | """
102 | This function returns cross entropy loss for semantic segmentation
103 | """
104 |     # pred: batch_size x channels x h x w (raw, unnormalised scores)
105 |     # label: batch_size x h x w numpy array of class indices (255 = ignore)
106 | label = torch.from_numpy(label).long()
107 | label = Variable(label).cuda(gpu)
108 |     m = nn.LogSoftmax(dim=1)  # explicit dim over the class channel, as newer PyTorch requires
109 | criterion = CrossEntropy2d().cuda(gpu)
110 | pred = m(pred)
111 |
112 | return criterion(pred, label)
113 |
114 |
115 | def lr_poly(base_lr, iter, max_iter, power):
116 | return base_lr*((1-float(iter)/max_iter)**(power))
117 |
118 |
119 | def get_1x_lr_params_NOscale(model):
120 | """
121 | This generator returns all the parameters of the net except for
122 | the last classification layer. Note that for each batchnorm layer,
123 |     requires_grad is set to False in deeplab/model.py, therefore this function does not return
124 | any batchnorm parameter
125 | """
126 | b = []
127 |
128 | b.append(model.conv1)
129 | b.append(model.bn1)
130 | b.append(model.layer1)
131 | b.append(model.layer2)
132 | b.append(model.layer3)
133 | b.append(model.layer4)
134 |
135 |
136 |     for i in range(len(b)):
137 |         for j in b[i].modules():
138 |             # yield only the trainable parameters; the frozen
139 |             # batchnorm parameters have requires_grad == False
140 |             for k in j.parameters():
141 |                 if k.requires_grad:
142 |                     yield k
143 |
144 | def get_10x_lr_params(model):
145 | """
146 | This generator returns all the parameters for the last layer of the net,
147 |     which does the classification of pixels into classes
148 | """
149 | b = []
150 | b.append(model.layer5.parameters())
151 |
152 | for j in range(len(b)):
153 | for i in b[j]:
154 | yield i
155 |
156 |
157 | def adjust_learning_rate(optimizer, i_iter):
158 |     """Polynomially decay the learning rate; the classifier parameter group gets 10x the base rate."""
159 | lr = lr_poly(args.learning_rate, i_iter, args.num_steps, args.power)
160 | optimizer.param_groups[0]['lr'] = lr
161 | optimizer.param_groups[1]['lr'] = lr * 10
162 |
163 |
164 | def main():
165 | """Create the model and start the training."""
166 |
167 | h, w = map(int, args.input_size.split(','))
168 | input_size = (h, w)
169 |
170 | cudnn.enabled = True
171 | gpu = args.gpu
172 |
173 | # Create network.
174 | model = Res_Deeplab(num_classes=args.num_classes)
175 | # For a small batch size, it is better to keep
176 | # the statistics of the BN layers (running means and variances)
177 | # frozen, and to not update the values provided by the pre-trained model.
178 | # If is_training=True, the statistics will be updated during the training.
179 | # Note that is_training=False still updates BN parameters gamma (scale) and beta (offset)
180 |     # if they are present in the optimiser's parameter groups.
181 |
182 | saved_state_dict = torch.load(args.restore_from)
183 | new_params = model.state_dict().copy()
184 | for i in saved_state_dict:
185 | #Scale.layer5.conv2d_list.3.weight
186 | i_parts = i.split('.')
187 | # print i_parts
188 |         if args.num_classes != 21 or i_parts[1] != 'layer5':
189 | new_params['.'.join(i_parts[1:])] = saved_state_dict[i]
190 | model.load_state_dict(new_params)
191 | #model.float()
192 | #model.eval() # use_global_stats = True
193 | model.train()
194 | model.cuda(args.gpu)
195 |
196 | cudnn.benchmark = True
197 |
198 | if not os.path.exists(args.snapshot_dir):
199 | os.makedirs(args.snapshot_dir)
200 |
201 |
202 | trainloader = data.DataLoader(VOCDataSet(args.data_dir, args.data_list, max_iters=args.num_steps*args.iter_size,
203 | crop_size=input_size, scale=args.random_scale, mirror=args.random_mirror, mean=IMG_MEAN),
204 | batch_size=args.batch_size, shuffle=True, num_workers=1, pin_memory=True)
205 |
206 | optimizer = optim.SGD([{'params': get_1x_lr_params_NOscale(model), 'lr': args.learning_rate },
207 | {'params': get_10x_lr_params(model), 'lr': 10*args.learning_rate}],
208 | lr=args.learning_rate, momentum=args.momentum,weight_decay=args.weight_decay)
209 | optimizer.zero_grad()
210 |
211 | b_loss = 0
212 | for i_iter, batch in enumerate(trainloader):
213 |
214 | images, labels, _, _ = batch
215 | images, labels = Variable(images), labels.numpy()
216 | h, w = images.size()[2:]
217 |         images075 = nn.Upsample(size=(int(h*0.75), int(w*0.75)), mode='bilinear', align_corners=True)(images)
218 |         images05 = nn.Upsample(size=(int(h*0.5), int(w*0.5)), mode='bilinear', align_corners=True)(images)
219 |
220 | out = model(images.cuda(args.gpu))
221 | out075 = model(images075.cuda(args.gpu))
222 | out05 = model(images05.cuda(args.gpu))
223 | o_h, o_w = out.size()[2:]
224 |         interpo1 = nn.Upsample(size=(o_h, o_w), mode='bilinear', align_corners=True)
225 |         interpo2 = nn.Upsample(size=(h, w), mode='bilinear', align_corners=True)
226 | out_max = interpo2(torch.max(torch.stack([out, interpo1(out075), interpo1(out05)]), dim=0)[0])
227 |
228 | loss = loss_calc(out_max, labels, args.gpu)
229 | d1, d2 = float(labels.shape[1]), float(labels.shape[2])
230 | loss100 = loss_calc(out, nd.zoom(labels, (1.0, out.size()[2]/d1, out.size()[3]/d2), order=0), args.gpu)
231 | loss075 = loss_calc(out075, nd.zoom(labels, (1.0, out075.size()[2]/d1, out075.size()[3]/d2), order=0), args.gpu)
232 | loss05 = loss_calc(out05, nd.zoom(labels, (1.0, out05.size()[2]/d1, out05.size()[3]/d2), order=0), args.gpu)
233 | loss_all = (loss + loss100 + loss075 + loss05) / args.iter_size
234 | loss_all.backward()
235 | b_loss += loss_all.data.cpu().numpy()
236 |
237 |         b_iter = i_iter // args.iter_size
238 |
239 | if b_iter >= args.num_steps-1:
240 |             print('save model ...')
241 | optimizer.step()
242 | torch.save(model.state_dict(),osp.join(args.snapshot_dir, 'VOC12_scenes_'+str(args.num_steps)+'.pth'))
243 | break
244 |
245 | if i_iter % args.iter_size == 0 and i_iter != 0:
246 |             print('iter = {} of {} completed, loss = {}'.format(b_iter, args.num_steps, b_loss))
247 | optimizer.step()
248 | adjust_learning_rate(optimizer, b_iter)
249 | optimizer.zero_grad()
250 | b_loss = 0
251 |
252 | if i_iter % (args.save_pred_every*args.iter_size) == 0 and b_iter!=0:
253 |             print('taking snapshot ...')
254 | torch.save(model.state_dict(),osp.join(args.snapshot_dir, 'VOC12_scenes_'+str(b_iter)+'.pth'))
255 |
256 |     end = timeit.default_timer()
257 |     print(end - start, 'seconds')
258 |
259 | if __name__ == '__main__':
260 | main()
261 |
--------------------------------------------------------------------------------
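Editor's note: unlike `train.py`, `train_msc.py` trains with `batch_size=1` and emulates a batch of `ITER_SIZE` images by accumulating gradients: each per-image loss is divided by `iter_size`, `backward()` runs every iteration, and `optimizer.step()` / `zero_grad()` fire once per `iter_size` iterations. A minimal, self-contained sketch of that pattern (toy model and data, not the repo's):

```python
import torch
import torch.nn as nn
import torch.optim as optim

iter_size = 10
model = nn.Linear(4, 2)                        # stand-in for the segmentation net
optimizer = optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

optimizer.zero_grad()
for i in range(100):                           # stand-in for the data loader
    x = torch.randn(1, 4)
    y = torch.randint(0, 2, (1,))
    loss = criterion(model(x), y) / iter_size  # scale so the accumulated gradient
    loss.backward()                            # averages over the virtual batch
    if (i + 1) % iter_size == 0:
        optimizer.step()                       # one update per iter_size samples
        optimizer.zero_grad()
```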