├── .gitignore
├── LICENSE
├── README.md
├── dataset
│   └── list
│       ├── train_aug.txt
│       └── val.txt
├── deeplab
│   ├── __init__.py
│   ├── datasets.py
│   ├── loss.py
│   ├── metric.py
│   └── model.py
├── evaluate.py
├── evaluate_msc.py
├── train.py
└── train_msc.py

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # models
2 | dataset/MS_DeepLab_resnet_pretrained_COCO_init.pth
3 | 
4 | # pyc
5 | *.pyc
6 | deeplab/*.pyc
7 | 
8 | snapshots/

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 | 
3 | Copyright (c) 2018 Zilong Huang
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DeepLab-ResNet-Pytorch
2 | 
3 | **New!** We have released [Pytorch-Segmentation-Toolbox](https://github.com/speedinghzl/pytorch-segmentation-toolbox), which contains PyTorch implementations of DeeplabV3 and PSPNet with **better reproduced performance** on Cityscapes.
4 | 
5 | This is a (re-)implementation of [DeepLab-ResNet](http://liangchiehchen.com/projects/DeepLabv2_resnet.html) in PyTorch for semantic image segmentation on the [PASCAL VOC dataset](http://host.robots.ox.ac.uk/pascal/VOC/).
6 | 
7 | ## Updates
8 | 
9 | **9 July, 2017**:
10 | * The training script `train.py` has been re-written following the original optimisation setup: SGD with momentum, weight decay, a polynomially decaying learning rate, different learning rates for different layers, and ignoring the 'void' label (255). (A short sketch of the schedule follows this list.)
11 | * A training script with multi-scale inputs, `train_msc.py`, has been added: the input is resized to 0.5 and 0.75 of the original resolution, and four losses are aggregated: the losses on the original, 0.75, and 0.5 resolutions, plus the loss on the fused output.
12 | * Evaluating the single-scale model ['VOC12_scenes_20000.pth'](https://pan.baidu.com/s/1bP52R8) on the PASCAL VOC validation dataset (using ['SegmentationClassAug'](https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0)) yields 74.0% mIoU without CRF post-processing. Evaluation of the multi-scale model is in progress.
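For reference, here is a minimal sketch of the polynomial learning-rate schedule mentioned above. The base rate `2.5e-4`, the power `0.9`, and the two-parameter-group layout with a 10x classifier multiplier are assumptions here; check `train.py` for the actual flags and grouping:

```python
def lr_poly(base_lr, step, max_steps, power=0.9):
    # Decay polynomially: base_lr * (1 - step / max_steps) ** power.
    return base_lr * ((1 - float(step) / max_steps) ** power)

def adjust_learning_rate(optimizer, step, base_lr=2.5e-4, max_steps=20000):
    # "Different learning rates for different layers": the randomly initialised
    # classifier is assumed to sit in the second parameter group and is trained
    # 10x faster than the pretrained backbone.
    lr = lr_poly(base_lr, step, max_steps)
    optimizer.param_groups[0]['lr'] = lr
    optimizer.param_groups[1]['lr'] = lr * 10
```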
13 | 
14 | 
15 | ## Model Description
16 | 
17 | DeepLab-ResNet is built on a fully convolutional variant of [ResNet-101](https://github.com/KaimingHe/deep-residual-networks) with [atrous (dilated) convolutions](https://github.com/fyu/dilation), atrous spatial pyramid pooling, and multi-scale inputs (not implemented here).
18 | 
19 | The model is trained on mini-batches of images and corresponding ground-truth masks with a softmax classifier at the top. During training, the masks are downsampled to match the size of the network output; during inference, bilinear upsampling is applied to recover an output of the same size as the input. The final segmentation mask is computed as the argmax over the logits.
20 | Optionally, a fully-connected probabilistic graphical model, namely a CRF, can be applied to refine the final predictions.
21 | On the PASCAL VOC test set, the model achieves 79.7% mean intersection-over-union with CRF post-processing and 76.4% without it.
22 | 
23 | For more details on the underlying model, please refer to the following paper:
24 | 
25 | 
26 |     @article{CP2016Deeplab,
27 |       title={DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs},
28 |       author={Liang-Chieh Chen and George Papandreou and Iasonas Kokkinos and Kevin Murphy and Alan L Yuille},
29 |       journal={arXiv:1606.00915},
30 |       year={2016}
31 |     }
32 | 
33 | ## Dataset and Training
34 | 
35 | To train the network, one can use the augmented PASCAL VOC 2012 dataset, with 10582 images for training and 1449 images for validation. PyTorch >= 0.4.0 is required.
36 | 
37 | You can download the converted `init.caffemodel` (saved with the `.pth` extension) [here](https://drive.google.com/open?id=0BxhUwxvLPO7TVFJQU1dwbXhHdEk). Besides that, one can also exploit random scaling and mirroring of the inputs during training as a means of data augmentation. For example, to train the model with random scaling and mirroring turned on, simply run:
38 | ```bash
39 | python train.py --random-mirror --random-scale --gpu 0
40 | ```
41 | 
42 | ## Evaluation
43 | 
44 | The single-scale model reaches 74.0% mIoU on the PASCAL VOC 2012 validation set (using ['SegmentationClassAug'](https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip?dl=0)). No CRF post-processing is applied. (A minimal single-image inference sketch is given below, after the Acknowledgment section.)
45 | 
46 | The following command prints a description of each evaluation setting:
47 | ```bash
48 | python evaluate.py --help
49 | ```
50 | 
51 | ## Acknowledgment
52 | This code is heavily borrowed from [pytorch-deeplab-resnet](https://github.com/isht7/pytorch-deeplab-resnet).
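The following is a minimal single-image inference sketch of the procedure described in "Model Description": forward pass, bilinear upsampling of the logits to the input size, then argmax. The checkpoint name and the 321x321 input are illustrative, and on PyTorch 0.4.0 `F.upsample` should be used in place of `F.interpolate`:

```python
import torch
import torch.nn.functional as F
from deeplab.model import Res_Deeplab

model = Res_Deeplab(num_classes=21)
model.load_state_dict(torch.load('VOC12_scenes_20000.pth', map_location='cpu'))
model.eval()

image = torch.randn(1, 3, 321, 321)  # stand-in for a mean-subtracted BGR image

with torch.no_grad():
    logits = model(image)                   # (1, 21, 41, 41): stride-8 logits
    logits = F.interpolate(logits, size=image.shape[2:],
                           mode='bilinear', align_corners=True)
    pred = logits.argmax(dim=1)[0].numpy()  # (321, 321) label map
```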
53 | 54 | ## Other implementations 55 | * [DeepLab-LargeFOV in TensorFlow](https://github.com/DrSleep/tensorflow-deeplab-lfov) 56 | * [DeepLab-LargeFOV in Pytorch](https://github.com/isht7/pytorch-deeplab-resnet) 57 | 58 | 59 | -------------------------------------------------------------------------------- /dataset/list/val.txt: -------------------------------------------------------------------------------- 1 | 2007_000033 2 | 2007_000042 3 | 2007_000061 4 | 2007_000123 5 | 2007_000129 6 | 2007_000175 7 | 2007_000187 8 | 2007_000323 9 | 2007_000332 10 | 2007_000346 11 | 2007_000452 12 | 2007_000464 13 | 2007_000491 14 | 2007_000529 15 | 2007_000559 16 | 2007_000572 17 | 2007_000629 18 | 2007_000636 19 | 2007_000661 20 | 2007_000663 21 | 2007_000676 22 | 2007_000727 23 | 2007_000762 24 | 2007_000783 25 | 2007_000799 26 | 2007_000804 27 | 2007_000830 28 | 2007_000837 29 | 2007_000847 30 | 2007_000862 31 | 2007_000925 32 | 2007_000999 33 | 2007_001154 34 | 2007_001175 35 | 2007_001239 36 | 2007_001284 37 | 2007_001288 38 | 2007_001289 39 | 2007_001299 40 | 2007_001311 41 | 2007_001321 42 | 2007_001377 43 | 2007_001408 44 | 2007_001423 45 | 2007_001430 46 | 2007_001457 47 | 2007_001458 48 | 2007_001526 49 | 2007_001568 50 | 2007_001585 51 | 2007_001586 52 | 2007_001587 53 | 2007_001594 54 | 2007_001630 55 | 2007_001677 56 | 2007_001678 57 | 2007_001717 58 | 2007_001733 59 | 2007_001761 60 | 2007_001763 61 | 2007_001774 62 | 2007_001884 63 | 2007_001955 64 | 2007_002046 65 | 2007_002094 66 | 2007_002119 67 | 2007_002132 68 | 2007_002260 69 | 2007_002266 70 | 2007_002268 71 | 2007_002284 72 | 2007_002376 73 | 2007_002378 74 | 2007_002387 75 | 2007_002400 76 | 2007_002412 77 | 2007_002426 78 | 2007_002427 79 | 2007_002445 80 | 2007_002470 81 | 2007_002539 82 | 2007_002565 83 | 2007_002597 84 | 2007_002618 85 | 2007_002619 86 | 2007_002624 87 | 2007_002643 88 | 2007_002648 89 | 2007_002719 90 | 2007_002728 91 | 2007_002823 92 | 2007_002824 93 | 2007_002852 94 | 2007_002903 95 | 2007_003011 96 | 2007_003020 97 | 2007_003022 98 | 2007_003051 99 | 2007_003088 100 | 2007_003101 101 | 2007_003106 102 | 2007_003110 103 | 2007_003131 104 | 2007_003134 105 | 2007_003137 106 | 2007_003143 107 | 2007_003169 108 | 2007_003188 109 | 2007_003194 110 | 2007_003195 111 | 2007_003201 112 | 2007_003349 113 | 2007_003367 114 | 2007_003373 115 | 2007_003499 116 | 2007_003503 117 | 2007_003506 118 | 2007_003530 119 | 2007_003571 120 | 2007_003587 121 | 2007_003611 122 | 2007_003621 123 | 2007_003682 124 | 2007_003711 125 | 2007_003714 126 | 2007_003742 127 | 2007_003786 128 | 2007_003841 129 | 2007_003848 130 | 2007_003861 131 | 2007_003872 132 | 2007_003917 133 | 2007_003957 134 | 2007_003991 135 | 2007_004033 136 | 2007_004052 137 | 2007_004112 138 | 2007_004121 139 | 2007_004143 140 | 2007_004189 141 | 2007_004190 142 | 2007_004193 143 | 2007_004241 144 | 2007_004275 145 | 2007_004281 146 | 2007_004380 147 | 2007_004392 148 | 2007_004405 149 | 2007_004468 150 | 2007_004483 151 | 2007_004510 152 | 2007_004538 153 | 2007_004558 154 | 2007_004644 155 | 2007_004649 156 | 2007_004712 157 | 2007_004722 158 | 2007_004856 159 | 2007_004866 160 | 2007_004902 161 | 2007_004969 162 | 2007_005058 163 | 2007_005074 164 | 2007_005107 165 | 2007_005114 166 | 2007_005149 167 | 2007_005173 168 | 2007_005281 169 | 2007_005294 170 | 2007_005296 171 | 2007_005304 172 | 2007_005331 173 | 2007_005354 174 | 2007_005358 175 | 2007_005428 176 | 2007_005460 177 | 2007_005469 178 | 2007_005509 179 | 2007_005547 180 | 2007_005600 181 | 
2007_005608 182 | 2007_005626 183 | 2007_005689 184 | 2007_005696 185 | 2007_005705 186 | 2007_005759 187 | 2007_005803 188 | 2007_005813 189 | 2007_005828 190 | 2007_005844 191 | 2007_005845 192 | 2007_005857 193 | 2007_005911 194 | 2007_005915 195 | 2007_005978 196 | 2007_006028 197 | 2007_006035 198 | 2007_006046 199 | 2007_006076 200 | 2007_006086 201 | 2007_006117 202 | 2007_006171 203 | 2007_006241 204 | 2007_006260 205 | 2007_006277 206 | 2007_006348 207 | 2007_006364 208 | 2007_006373 209 | 2007_006444 210 | 2007_006449 211 | 2007_006549 212 | 2007_006553 213 | 2007_006560 214 | 2007_006647 215 | 2007_006678 216 | 2007_006680 217 | 2007_006698 218 | 2007_006761 219 | 2007_006802 220 | 2007_006837 221 | 2007_006841 222 | 2007_006864 223 | 2007_006866 224 | 2007_006946 225 | 2007_007007 226 | 2007_007084 227 | 2007_007109 228 | 2007_007130 229 | 2007_007165 230 | 2007_007168 231 | 2007_007195 232 | 2007_007196 233 | 2007_007203 234 | 2007_007211 235 | 2007_007235 236 | 2007_007341 237 | 2007_007414 238 | 2007_007417 239 | 2007_007470 240 | 2007_007477 241 | 2007_007493 242 | 2007_007498 243 | 2007_007524 244 | 2007_007534 245 | 2007_007624 246 | 2007_007651 247 | 2007_007688 248 | 2007_007748 249 | 2007_007795 250 | 2007_007810 251 | 2007_007815 252 | 2007_007818 253 | 2007_007836 254 | 2007_007849 255 | 2007_007881 256 | 2007_007996 257 | 2007_008051 258 | 2007_008084 259 | 2007_008106 260 | 2007_008110 261 | 2007_008204 262 | 2007_008222 263 | 2007_008256 264 | 2007_008260 265 | 2007_008339 266 | 2007_008374 267 | 2007_008415 268 | 2007_008430 269 | 2007_008543 270 | 2007_008547 271 | 2007_008596 272 | 2007_008645 273 | 2007_008670 274 | 2007_008708 275 | 2007_008722 276 | 2007_008747 277 | 2007_008802 278 | 2007_008815 279 | 2007_008897 280 | 2007_008944 281 | 2007_008964 282 | 2007_008973 283 | 2007_008980 284 | 2007_009015 285 | 2007_009068 286 | 2007_009084 287 | 2007_009088 288 | 2007_009096 289 | 2007_009221 290 | 2007_009245 291 | 2007_009251 292 | 2007_009252 293 | 2007_009258 294 | 2007_009320 295 | 2007_009323 296 | 2007_009331 297 | 2007_009346 298 | 2007_009392 299 | 2007_009413 300 | 2007_009419 301 | 2007_009446 302 | 2007_009458 303 | 2007_009521 304 | 2007_009562 305 | 2007_009592 306 | 2007_009654 307 | 2007_009655 308 | 2007_009684 309 | 2007_009687 310 | 2007_009691 311 | 2007_009706 312 | 2007_009750 313 | 2007_009756 314 | 2007_009764 315 | 2007_009794 316 | 2007_009817 317 | 2007_009841 318 | 2007_009897 319 | 2007_009911 320 | 2007_009923 321 | 2007_009938 322 | 2008_000009 323 | 2008_000016 324 | 2008_000073 325 | 2008_000075 326 | 2008_000080 327 | 2008_000107 328 | 2008_000120 329 | 2008_000123 330 | 2008_000149 331 | 2008_000182 332 | 2008_000213 333 | 2008_000215 334 | 2008_000223 335 | 2008_000233 336 | 2008_000234 337 | 2008_000239 338 | 2008_000254 339 | 2008_000270 340 | 2008_000271 341 | 2008_000345 342 | 2008_000359 343 | 2008_000391 344 | 2008_000401 345 | 2008_000464 346 | 2008_000469 347 | 2008_000474 348 | 2008_000501 349 | 2008_000510 350 | 2008_000533 351 | 2008_000573 352 | 2008_000589 353 | 2008_000602 354 | 2008_000630 355 | 2008_000657 356 | 2008_000661 357 | 2008_000662 358 | 2008_000666 359 | 2008_000673 360 | 2008_000700 361 | 2008_000725 362 | 2008_000731 363 | 2008_000763 364 | 2008_000765 365 | 2008_000782 366 | 2008_000795 367 | 2008_000811 368 | 2008_000848 369 | 2008_000853 370 | 2008_000863 371 | 2008_000911 372 | 2008_000919 373 | 2008_000943 374 | 2008_000992 375 | 2008_001013 376 | 2008_001028 377 | 2008_001040 378 | 
2008_001070 379 | 2008_001074 380 | 2008_001076 381 | 2008_001078 382 | 2008_001135 383 | 2008_001150 384 | 2008_001170 385 | 2008_001231 386 | 2008_001249 387 | 2008_001260 388 | 2008_001283 389 | 2008_001308 390 | 2008_001379 391 | 2008_001404 392 | 2008_001433 393 | 2008_001439 394 | 2008_001478 395 | 2008_001491 396 | 2008_001504 397 | 2008_001513 398 | 2008_001514 399 | 2008_001531 400 | 2008_001546 401 | 2008_001547 402 | 2008_001580 403 | 2008_001629 404 | 2008_001640 405 | 2008_001682 406 | 2008_001688 407 | 2008_001715 408 | 2008_001821 409 | 2008_001874 410 | 2008_001885 411 | 2008_001895 412 | 2008_001966 413 | 2008_001971 414 | 2008_001992 415 | 2008_002043 416 | 2008_002152 417 | 2008_002205 418 | 2008_002212 419 | 2008_002239 420 | 2008_002240 421 | 2008_002241 422 | 2008_002269 423 | 2008_002273 424 | 2008_002358 425 | 2008_002379 426 | 2008_002383 427 | 2008_002429 428 | 2008_002464 429 | 2008_002467 430 | 2008_002492 431 | 2008_002495 432 | 2008_002504 433 | 2008_002521 434 | 2008_002536 435 | 2008_002588 436 | 2008_002623 437 | 2008_002680 438 | 2008_002681 439 | 2008_002775 440 | 2008_002778 441 | 2008_002835 442 | 2008_002859 443 | 2008_002864 444 | 2008_002900 445 | 2008_002904 446 | 2008_002929 447 | 2008_002936 448 | 2008_002942 449 | 2008_002958 450 | 2008_003003 451 | 2008_003026 452 | 2008_003034 453 | 2008_003076 454 | 2008_003105 455 | 2008_003108 456 | 2008_003110 457 | 2008_003135 458 | 2008_003141 459 | 2008_003155 460 | 2008_003210 461 | 2008_003238 462 | 2008_003270 463 | 2008_003330 464 | 2008_003333 465 | 2008_003369 466 | 2008_003379 467 | 2008_003451 468 | 2008_003461 469 | 2008_003477 470 | 2008_003492 471 | 2008_003499 472 | 2008_003511 473 | 2008_003546 474 | 2008_003576 475 | 2008_003577 476 | 2008_003676 477 | 2008_003709 478 | 2008_003733 479 | 2008_003777 480 | 2008_003782 481 | 2008_003821 482 | 2008_003846 483 | 2008_003856 484 | 2008_003858 485 | 2008_003874 486 | 2008_003876 487 | 2008_003885 488 | 2008_003886 489 | 2008_003926 490 | 2008_003976 491 | 2008_004069 492 | 2008_004101 493 | 2008_004140 494 | 2008_004172 495 | 2008_004175 496 | 2008_004212 497 | 2008_004279 498 | 2008_004339 499 | 2008_004345 500 | 2008_004363 501 | 2008_004367 502 | 2008_004396 503 | 2008_004399 504 | 2008_004453 505 | 2008_004477 506 | 2008_004552 507 | 2008_004562 508 | 2008_004575 509 | 2008_004610 510 | 2008_004612 511 | 2008_004621 512 | 2008_004624 513 | 2008_004654 514 | 2008_004659 515 | 2008_004687 516 | 2008_004701 517 | 2008_004704 518 | 2008_004705 519 | 2008_004754 520 | 2008_004758 521 | 2008_004854 522 | 2008_004910 523 | 2008_004995 524 | 2008_005049 525 | 2008_005089 526 | 2008_005097 527 | 2008_005105 528 | 2008_005145 529 | 2008_005197 530 | 2008_005217 531 | 2008_005242 532 | 2008_005245 533 | 2008_005254 534 | 2008_005262 535 | 2008_005338 536 | 2008_005398 537 | 2008_005399 538 | 2008_005422 539 | 2008_005439 540 | 2008_005445 541 | 2008_005525 542 | 2008_005544 543 | 2008_005628 544 | 2008_005633 545 | 2008_005637 546 | 2008_005642 547 | 2008_005676 548 | 2008_005680 549 | 2008_005691 550 | 2008_005727 551 | 2008_005738 552 | 2008_005812 553 | 2008_005904 554 | 2008_005915 555 | 2008_006008 556 | 2008_006036 557 | 2008_006055 558 | 2008_006063 559 | 2008_006108 560 | 2008_006130 561 | 2008_006143 562 | 2008_006159 563 | 2008_006216 564 | 2008_006219 565 | 2008_006229 566 | 2008_006254 567 | 2008_006275 568 | 2008_006325 569 | 2008_006327 570 | 2008_006341 571 | 2008_006408 572 | 2008_006480 573 | 2008_006523 574 | 2008_006526 575 | 
2008_006528 576 | 2008_006553 577 | 2008_006554 578 | 2008_006703 579 | 2008_006722 580 | 2008_006752 581 | 2008_006784 582 | 2008_006835 583 | 2008_006874 584 | 2008_006981 585 | 2008_006986 586 | 2008_007025 587 | 2008_007031 588 | 2008_007048 589 | 2008_007120 590 | 2008_007123 591 | 2008_007143 592 | 2008_007194 593 | 2008_007219 594 | 2008_007273 595 | 2008_007350 596 | 2008_007378 597 | 2008_007392 598 | 2008_007402 599 | 2008_007497 600 | 2008_007498 601 | 2008_007507 602 | 2008_007513 603 | 2008_007527 604 | 2008_007548 605 | 2008_007596 606 | 2008_007677 607 | 2008_007737 608 | 2008_007797 609 | 2008_007804 610 | 2008_007811 611 | 2008_007814 612 | 2008_007828 613 | 2008_007836 614 | 2008_007945 615 | 2008_007994 616 | 2008_008051 617 | 2008_008103 618 | 2008_008127 619 | 2008_008221 620 | 2008_008252 621 | 2008_008268 622 | 2008_008296 623 | 2008_008301 624 | 2008_008335 625 | 2008_008362 626 | 2008_008392 627 | 2008_008393 628 | 2008_008421 629 | 2008_008434 630 | 2008_008469 631 | 2008_008629 632 | 2008_008682 633 | 2008_008711 634 | 2008_008746 635 | 2009_000012 636 | 2009_000013 637 | 2009_000022 638 | 2009_000032 639 | 2009_000037 640 | 2009_000039 641 | 2009_000074 642 | 2009_000080 643 | 2009_000087 644 | 2009_000096 645 | 2009_000121 646 | 2009_000136 647 | 2009_000149 648 | 2009_000156 649 | 2009_000201 650 | 2009_000205 651 | 2009_000219 652 | 2009_000242 653 | 2009_000309 654 | 2009_000318 655 | 2009_000335 656 | 2009_000351 657 | 2009_000354 658 | 2009_000387 659 | 2009_000391 660 | 2009_000412 661 | 2009_000418 662 | 2009_000421 663 | 2009_000426 664 | 2009_000440 665 | 2009_000446 666 | 2009_000455 667 | 2009_000457 668 | 2009_000469 669 | 2009_000487 670 | 2009_000488 671 | 2009_000523 672 | 2009_000573 673 | 2009_000619 674 | 2009_000628 675 | 2009_000641 676 | 2009_000664 677 | 2009_000675 678 | 2009_000704 679 | 2009_000705 680 | 2009_000712 681 | 2009_000716 682 | 2009_000723 683 | 2009_000727 684 | 2009_000730 685 | 2009_000731 686 | 2009_000732 687 | 2009_000771 688 | 2009_000825 689 | 2009_000828 690 | 2009_000839 691 | 2009_000840 692 | 2009_000845 693 | 2009_000879 694 | 2009_000892 695 | 2009_000919 696 | 2009_000924 697 | 2009_000931 698 | 2009_000935 699 | 2009_000964 700 | 2009_000989 701 | 2009_000991 702 | 2009_000998 703 | 2009_001008 704 | 2009_001082 705 | 2009_001108 706 | 2009_001160 707 | 2009_001215 708 | 2009_001240 709 | 2009_001255 710 | 2009_001278 711 | 2009_001299 712 | 2009_001300 713 | 2009_001314 714 | 2009_001332 715 | 2009_001333 716 | 2009_001363 717 | 2009_001391 718 | 2009_001411 719 | 2009_001433 720 | 2009_001505 721 | 2009_001535 722 | 2009_001536 723 | 2009_001565 724 | 2009_001607 725 | 2009_001644 726 | 2009_001663 727 | 2009_001683 728 | 2009_001684 729 | 2009_001687 730 | 2009_001718 731 | 2009_001731 732 | 2009_001765 733 | 2009_001768 734 | 2009_001775 735 | 2009_001804 736 | 2009_001816 737 | 2009_001818 738 | 2009_001850 739 | 2009_001851 740 | 2009_001854 741 | 2009_001941 742 | 2009_001991 743 | 2009_002012 744 | 2009_002035 745 | 2009_002042 746 | 2009_002082 747 | 2009_002094 748 | 2009_002097 749 | 2009_002122 750 | 2009_002150 751 | 2009_002155 752 | 2009_002164 753 | 2009_002165 754 | 2009_002171 755 | 2009_002185 756 | 2009_002202 757 | 2009_002221 758 | 2009_002238 759 | 2009_002239 760 | 2009_002265 761 | 2009_002268 762 | 2009_002291 763 | 2009_002295 764 | 2009_002317 765 | 2009_002320 766 | 2009_002346 767 | 2009_002366 768 | 2009_002372 769 | 2009_002382 770 | 2009_002390 771 | 2009_002415 772 | 
2009_002445 773 | 2009_002487 774 | 2009_002521 775 | 2009_002527 776 | 2009_002535 777 | 2009_002539 778 | 2009_002549 779 | 2009_002562 780 | 2009_002568 781 | 2009_002571 782 | 2009_002573 783 | 2009_002584 784 | 2009_002591 785 | 2009_002594 786 | 2009_002604 787 | 2009_002618 788 | 2009_002635 789 | 2009_002638 790 | 2009_002649 791 | 2009_002651 792 | 2009_002727 793 | 2009_002732 794 | 2009_002749 795 | 2009_002753 796 | 2009_002771 797 | 2009_002808 798 | 2009_002856 799 | 2009_002887 800 | 2009_002888 801 | 2009_002928 802 | 2009_002936 803 | 2009_002975 804 | 2009_002982 805 | 2009_002990 806 | 2009_003003 807 | 2009_003005 808 | 2009_003043 809 | 2009_003059 810 | 2009_003063 811 | 2009_003065 812 | 2009_003071 813 | 2009_003080 814 | 2009_003105 815 | 2009_003123 816 | 2009_003193 817 | 2009_003196 818 | 2009_003217 819 | 2009_003224 820 | 2009_003241 821 | 2009_003269 822 | 2009_003273 823 | 2009_003299 824 | 2009_003304 825 | 2009_003311 826 | 2009_003323 827 | 2009_003343 828 | 2009_003378 829 | 2009_003387 830 | 2009_003406 831 | 2009_003433 832 | 2009_003450 833 | 2009_003466 834 | 2009_003481 835 | 2009_003494 836 | 2009_003498 837 | 2009_003504 838 | 2009_003507 839 | 2009_003517 840 | 2009_003523 841 | 2009_003542 842 | 2009_003549 843 | 2009_003551 844 | 2009_003564 845 | 2009_003569 846 | 2009_003576 847 | 2009_003589 848 | 2009_003607 849 | 2009_003640 850 | 2009_003666 851 | 2009_003696 852 | 2009_003703 853 | 2009_003707 854 | 2009_003756 855 | 2009_003771 856 | 2009_003773 857 | 2009_003804 858 | 2009_003806 859 | 2009_003810 860 | 2009_003849 861 | 2009_003857 862 | 2009_003858 863 | 2009_003895 864 | 2009_003903 865 | 2009_003904 866 | 2009_003928 867 | 2009_003938 868 | 2009_003971 869 | 2009_003991 870 | 2009_004021 871 | 2009_004033 872 | 2009_004043 873 | 2009_004070 874 | 2009_004072 875 | 2009_004084 876 | 2009_004099 877 | 2009_004125 878 | 2009_004140 879 | 2009_004217 880 | 2009_004221 881 | 2009_004247 882 | 2009_004248 883 | 2009_004255 884 | 2009_004298 885 | 2009_004324 886 | 2009_004455 887 | 2009_004494 888 | 2009_004497 889 | 2009_004504 890 | 2009_004507 891 | 2009_004509 892 | 2009_004540 893 | 2009_004568 894 | 2009_004579 895 | 2009_004581 896 | 2009_004590 897 | 2009_004592 898 | 2009_004594 899 | 2009_004635 900 | 2009_004653 901 | 2009_004687 902 | 2009_004721 903 | 2009_004730 904 | 2009_004732 905 | 2009_004738 906 | 2009_004748 907 | 2009_004789 908 | 2009_004799 909 | 2009_004801 910 | 2009_004848 911 | 2009_004859 912 | 2009_004867 913 | 2009_004882 914 | 2009_004886 915 | 2009_004895 916 | 2009_004942 917 | 2009_004969 918 | 2009_004987 919 | 2009_004993 920 | 2009_004994 921 | 2009_005038 922 | 2009_005078 923 | 2009_005087 924 | 2009_005089 925 | 2009_005137 926 | 2009_005148 927 | 2009_005156 928 | 2009_005158 929 | 2009_005189 930 | 2009_005190 931 | 2009_005217 932 | 2009_005219 933 | 2009_005220 934 | 2009_005231 935 | 2009_005260 936 | 2009_005262 937 | 2009_005302 938 | 2010_000003 939 | 2010_000038 940 | 2010_000065 941 | 2010_000083 942 | 2010_000084 943 | 2010_000087 944 | 2010_000110 945 | 2010_000159 946 | 2010_000160 947 | 2010_000163 948 | 2010_000174 949 | 2010_000216 950 | 2010_000238 951 | 2010_000241 952 | 2010_000256 953 | 2010_000272 954 | 2010_000284 955 | 2010_000309 956 | 2010_000318 957 | 2010_000330 958 | 2010_000335 959 | 2010_000342 960 | 2010_000372 961 | 2010_000422 962 | 2010_000426 963 | 2010_000427 964 | 2010_000502 965 | 2010_000530 966 | 2010_000552 967 | 2010_000559 968 | 2010_000572 969 | 
2010_000573 970 | 2010_000622 971 | 2010_000628 972 | 2010_000639 973 | 2010_000666 974 | 2010_000679 975 | 2010_000682 976 | 2010_000683 977 | 2010_000724 978 | 2010_000738 979 | 2010_000764 980 | 2010_000788 981 | 2010_000814 982 | 2010_000836 983 | 2010_000874 984 | 2010_000904 985 | 2010_000906 986 | 2010_000907 987 | 2010_000918 988 | 2010_000929 989 | 2010_000941 990 | 2010_000952 991 | 2010_000961 992 | 2010_001000 993 | 2010_001010 994 | 2010_001011 995 | 2010_001016 996 | 2010_001017 997 | 2010_001024 998 | 2010_001036 999 | 2010_001061 1000 | 2010_001069 1001 | 2010_001070 1002 | 2010_001079 1003 | 2010_001104 1004 | 2010_001124 1005 | 2010_001149 1006 | 2010_001151 1007 | 2010_001174 1008 | 2010_001206 1009 | 2010_001246 1010 | 2010_001251 1011 | 2010_001256 1012 | 2010_001264 1013 | 2010_001292 1014 | 2010_001313 1015 | 2010_001327 1016 | 2010_001331 1017 | 2010_001351 1018 | 2010_001367 1019 | 2010_001376 1020 | 2010_001403 1021 | 2010_001448 1022 | 2010_001451 1023 | 2010_001522 1024 | 2010_001534 1025 | 2010_001553 1026 | 2010_001557 1027 | 2010_001563 1028 | 2010_001577 1029 | 2010_001579 1030 | 2010_001646 1031 | 2010_001656 1032 | 2010_001692 1033 | 2010_001699 1034 | 2010_001734 1035 | 2010_001752 1036 | 2010_001767 1037 | 2010_001768 1038 | 2010_001773 1039 | 2010_001820 1040 | 2010_001830 1041 | 2010_001851 1042 | 2010_001908 1043 | 2010_001913 1044 | 2010_001951 1045 | 2010_001956 1046 | 2010_001962 1047 | 2010_001966 1048 | 2010_001995 1049 | 2010_002017 1050 | 2010_002025 1051 | 2010_002030 1052 | 2010_002106 1053 | 2010_002137 1054 | 2010_002142 1055 | 2010_002146 1056 | 2010_002147 1057 | 2010_002150 1058 | 2010_002161 1059 | 2010_002200 1060 | 2010_002228 1061 | 2010_002232 1062 | 2010_002251 1063 | 2010_002271 1064 | 2010_002305 1065 | 2010_002310 1066 | 2010_002336 1067 | 2010_002348 1068 | 2010_002361 1069 | 2010_002390 1070 | 2010_002396 1071 | 2010_002422 1072 | 2010_002450 1073 | 2010_002480 1074 | 2010_002512 1075 | 2010_002531 1076 | 2010_002536 1077 | 2010_002538 1078 | 2010_002546 1079 | 2010_002623 1080 | 2010_002682 1081 | 2010_002691 1082 | 2010_002693 1083 | 2010_002701 1084 | 2010_002763 1085 | 2010_002792 1086 | 2010_002868 1087 | 2010_002900 1088 | 2010_002902 1089 | 2010_002921 1090 | 2010_002929 1091 | 2010_002939 1092 | 2010_002988 1093 | 2010_003014 1094 | 2010_003060 1095 | 2010_003123 1096 | 2010_003127 1097 | 2010_003132 1098 | 2010_003168 1099 | 2010_003183 1100 | 2010_003187 1101 | 2010_003207 1102 | 2010_003231 1103 | 2010_003239 1104 | 2010_003275 1105 | 2010_003276 1106 | 2010_003293 1107 | 2010_003302 1108 | 2010_003325 1109 | 2010_003362 1110 | 2010_003365 1111 | 2010_003381 1112 | 2010_003402 1113 | 2010_003409 1114 | 2010_003418 1115 | 2010_003446 1116 | 2010_003453 1117 | 2010_003468 1118 | 2010_003473 1119 | 2010_003495 1120 | 2010_003506 1121 | 2010_003514 1122 | 2010_003531 1123 | 2010_003532 1124 | 2010_003541 1125 | 2010_003547 1126 | 2010_003597 1127 | 2010_003675 1128 | 2010_003708 1129 | 2010_003716 1130 | 2010_003746 1131 | 2010_003758 1132 | 2010_003764 1133 | 2010_003768 1134 | 2010_003771 1135 | 2010_003772 1136 | 2010_003781 1137 | 2010_003813 1138 | 2010_003820 1139 | 2010_003854 1140 | 2010_003912 1141 | 2010_003915 1142 | 2010_003947 1143 | 2010_003956 1144 | 2010_003971 1145 | 2010_004041 1146 | 2010_004042 1147 | 2010_004056 1148 | 2010_004063 1149 | 2010_004104 1150 | 2010_004120 1151 | 2010_004149 1152 | 2010_004165 1153 | 2010_004208 1154 | 2010_004219 1155 | 2010_004226 1156 | 2010_004314 1157 | 2010_004320 
1158 | 2010_004322 1159 | 2010_004337 1160 | 2010_004348 1161 | 2010_004355 1162 | 2010_004369 1163 | 2010_004382 1164 | 2010_004419 1165 | 2010_004432 1166 | 2010_004472 1167 | 2010_004479 1168 | 2010_004519 1169 | 2010_004520 1170 | 2010_004529 1171 | 2010_004543 1172 | 2010_004550 1173 | 2010_004551 1174 | 2010_004556 1175 | 2010_004559 1176 | 2010_004628 1177 | 2010_004635 1178 | 2010_004662 1179 | 2010_004697 1180 | 2010_004757 1181 | 2010_004763 1182 | 2010_004772 1183 | 2010_004783 1184 | 2010_004789 1185 | 2010_004795 1186 | 2010_004815 1187 | 2010_004825 1188 | 2010_004828 1189 | 2010_004856 1190 | 2010_004857 1191 | 2010_004861 1192 | 2010_004941 1193 | 2010_004946 1194 | 2010_004951 1195 | 2010_004980 1196 | 2010_004994 1197 | 2010_005013 1198 | 2010_005021 1199 | 2010_005046 1200 | 2010_005063 1201 | 2010_005108 1202 | 2010_005118 1203 | 2010_005159 1204 | 2010_005160 1205 | 2010_005166 1206 | 2010_005174 1207 | 2010_005180 1208 | 2010_005187 1209 | 2010_005206 1210 | 2010_005245 1211 | 2010_005252 1212 | 2010_005284 1213 | 2010_005305 1214 | 2010_005344 1215 | 2010_005353 1216 | 2010_005366 1217 | 2010_005401 1218 | 2010_005421 1219 | 2010_005428 1220 | 2010_005432 1221 | 2010_005433 1222 | 2010_005496 1223 | 2010_005501 1224 | 2010_005508 1225 | 2010_005531 1226 | 2010_005534 1227 | 2010_005575 1228 | 2010_005582 1229 | 2010_005606 1230 | 2010_005626 1231 | 2010_005644 1232 | 2010_005664 1233 | 2010_005705 1234 | 2010_005706 1235 | 2010_005709 1236 | 2010_005718 1237 | 2010_005719 1238 | 2010_005727 1239 | 2010_005762 1240 | 2010_005788 1241 | 2010_005860 1242 | 2010_005871 1243 | 2010_005877 1244 | 2010_005888 1245 | 2010_005899 1246 | 2010_005922 1247 | 2010_005991 1248 | 2010_005992 1249 | 2010_006026 1250 | 2010_006034 1251 | 2010_006054 1252 | 2010_006070 1253 | 2011_000045 1254 | 2011_000051 1255 | 2011_000054 1256 | 2011_000066 1257 | 2011_000070 1258 | 2011_000112 1259 | 2011_000173 1260 | 2011_000178 1261 | 2011_000185 1262 | 2011_000226 1263 | 2011_000234 1264 | 2011_000238 1265 | 2011_000239 1266 | 2011_000248 1267 | 2011_000283 1268 | 2011_000291 1269 | 2011_000310 1270 | 2011_000312 1271 | 2011_000338 1272 | 2011_000396 1273 | 2011_000412 1274 | 2011_000419 1275 | 2011_000435 1276 | 2011_000436 1277 | 2011_000438 1278 | 2011_000455 1279 | 2011_000456 1280 | 2011_000479 1281 | 2011_000481 1282 | 2011_000482 1283 | 2011_000503 1284 | 2011_000512 1285 | 2011_000521 1286 | 2011_000526 1287 | 2011_000536 1288 | 2011_000548 1289 | 2011_000566 1290 | 2011_000585 1291 | 2011_000598 1292 | 2011_000607 1293 | 2011_000618 1294 | 2011_000638 1295 | 2011_000658 1296 | 2011_000661 1297 | 2011_000669 1298 | 2011_000747 1299 | 2011_000780 1300 | 2011_000789 1301 | 2011_000807 1302 | 2011_000809 1303 | 2011_000813 1304 | 2011_000830 1305 | 2011_000843 1306 | 2011_000874 1307 | 2011_000888 1308 | 2011_000900 1309 | 2011_000912 1310 | 2011_000953 1311 | 2011_000969 1312 | 2011_001005 1313 | 2011_001014 1314 | 2011_001020 1315 | 2011_001047 1316 | 2011_001060 1317 | 2011_001064 1318 | 2011_001069 1319 | 2011_001071 1320 | 2011_001082 1321 | 2011_001110 1322 | 2011_001114 1323 | 2011_001159 1324 | 2011_001161 1325 | 2011_001190 1326 | 2011_001232 1327 | 2011_001263 1328 | 2011_001276 1329 | 2011_001281 1330 | 2011_001287 1331 | 2011_001292 1332 | 2011_001313 1333 | 2011_001341 1334 | 2011_001346 1335 | 2011_001350 1336 | 2011_001407 1337 | 2011_001416 1338 | 2011_001421 1339 | 2011_001434 1340 | 2011_001447 1341 | 2011_001489 1342 | 2011_001529 1343 | 2011_001530 1344 | 2011_001534 
1345 | 2011_001546 1346 | 2011_001567 1347 | 2011_001589 1348 | 2011_001597 1349 | 2011_001601 1350 | 2011_001607 1351 | 2011_001613 1352 | 2011_001614 1353 | 2011_001619 1354 | 2011_001624 1355 | 2011_001642 1356 | 2011_001665 1357 | 2011_001669 1358 | 2011_001674 1359 | 2011_001708 1360 | 2011_001713 1361 | 2011_001714 1362 | 2011_001722 1363 | 2011_001726 1364 | 2011_001745 1365 | 2011_001748 1366 | 2011_001775 1367 | 2011_001782 1368 | 2011_001793 1369 | 2011_001794 1370 | 2011_001812 1371 | 2011_001862 1372 | 2011_001863 1373 | 2011_001868 1374 | 2011_001880 1375 | 2011_001910 1376 | 2011_001984 1377 | 2011_001988 1378 | 2011_002002 1379 | 2011_002040 1380 | 2011_002041 1381 | 2011_002064 1382 | 2011_002075 1383 | 2011_002098 1384 | 2011_002110 1385 | 2011_002121 1386 | 2011_002124 1387 | 2011_002150 1388 | 2011_002156 1389 | 2011_002178 1390 | 2011_002200 1391 | 2011_002223 1392 | 2011_002244 1393 | 2011_002247 1394 | 2011_002279 1395 | 2011_002295 1396 | 2011_002298 1397 | 2011_002308 1398 | 2011_002317 1399 | 2011_002322 1400 | 2011_002327 1401 | 2011_002343 1402 | 2011_002358 1403 | 2011_002371 1404 | 2011_002379 1405 | 2011_002391 1406 | 2011_002498 1407 | 2011_002509 1408 | 2011_002515 1409 | 2011_002532 1410 | 2011_002535 1411 | 2011_002548 1412 | 2011_002575 1413 | 2011_002578 1414 | 2011_002589 1415 | 2011_002592 1416 | 2011_002623 1417 | 2011_002641 1418 | 2011_002644 1419 | 2011_002662 1420 | 2011_002675 1421 | 2011_002685 1422 | 2011_002713 1423 | 2011_002730 1424 | 2011_002754 1425 | 2011_002812 1426 | 2011_002863 1427 | 2011_002879 1428 | 2011_002885 1429 | 2011_002929 1430 | 2011_002951 1431 | 2011_002975 1432 | 2011_002993 1433 | 2011_002997 1434 | 2011_003003 1435 | 2011_003011 1436 | 2011_003019 1437 | 2011_003030 1438 | 2011_003055 1439 | 2011_003085 1440 | 2011_003103 1441 | 2011_003114 1442 | 2011_003145 1443 | 2011_003146 1444 | 2011_003182 1445 | 2011_003197 1446 | 2011_003205 1447 | 2011_003240 1448 | 2011_003256 1449 | 2011_003271 1450 | -------------------------------------------------------------------------------- /deeplab/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /deeplab/datasets.py: -------------------------------------------------------------------------------- 1 | import os 2 | import os.path as osp 3 | import numpy as np 4 | import random 5 | import matplotlib.pyplot as plt 6 | import collections 7 | import torch 8 | import torchvision 9 | import cv2 10 | from torch.utils import data 11 | 12 | 13 | class VOCDataSet(data.Dataset): 14 | def __init__(self, root, list_path, max_iters=None, crop_size=(321, 321), mean=(128, 128, 128), scale=True, mirror=True, ignore_label=255): 15 | self.root = root 16 | self.list_path = list_path 17 | self.crop_h, self.crop_w = crop_size 18 | self.scale = scale 19 | self.ignore_label = ignore_label 20 | self.mean = mean 21 | self.is_mirror = mirror 22 | # self.mean_bgr = np.array([104.00698793, 116.66876762, 122.67891434]) 23 | self.img_ids = [i_id.strip() for i_id in open(list_path)] 24 | if not max_iters==None: 25 | self.img_ids = self.img_ids * int(np.ceil(float(max_iters) / len(self.img_ids))) 26 | self.files = [] 27 | # for split in ["train", "trainval", "val"]: 28 | for name in self.img_ids: 29 | img_file = osp.join(self.root, "JPEGImages/%s.jpg" % name) 30 | label_file = osp.join(self.root, "SegmentationClassAug/%s.png" % name) 31 | self.files.append({ 32 | "img": img_file, 33 
| "label": label_file, 34 | "name": name 35 | }) 36 | 37 | def __len__(self): 38 | return len(self.files) 39 | 40 | def generate_scale_label(self, image, label): 41 | f_scale = 0.5 + random.randint(0, 11) / 10.0 42 | image = cv2.resize(image, None, fx=f_scale, fy=f_scale, interpolation = cv2.INTER_LINEAR) 43 | label = cv2.resize(label, None, fx=f_scale, fy=f_scale, interpolation = cv2.INTER_NEAREST) 44 | return image, label 45 | 46 | def __getitem__(self, index): 47 | datafiles = self.files[index] 48 | image = cv2.imread(datafiles["img"], cv2.IMREAD_COLOR) 49 | label = cv2.imread(datafiles["label"], cv2.IMREAD_GRAYSCALE) 50 | size = image.shape 51 | name = datafiles["name"] 52 | if self.scale: 53 | image, label = self.generate_scale_label(image, label) 54 | image = np.asarray(image, np.float32) 55 | image -= self.mean 56 | img_h, img_w = label.shape 57 | pad_h = max(self.crop_h - img_h, 0) 58 | pad_w = max(self.crop_w - img_w, 0) 59 | if pad_h > 0 or pad_w > 0: 60 | img_pad = cv2.copyMakeBorder(image, 0, pad_h, 0, 61 | pad_w, cv2.BORDER_CONSTANT, 62 | value=(0.0, 0.0, 0.0)) 63 | label_pad = cv2.copyMakeBorder(label, 0, pad_h, 0, 64 | pad_w, cv2.BORDER_CONSTANT, 65 | value=(self.ignore_label,)) 66 | else: 67 | img_pad, label_pad = image, label 68 | 69 | img_h, img_w = label_pad.shape 70 | h_off = random.randint(0, img_h - self.crop_h) 71 | w_off = random.randint(0, img_w - self.crop_w) 72 | # roi = cv2.Rect(w_off, h_off, self.crop_w, self.crop_h); 73 | image = np.asarray(img_pad[h_off : h_off+self.crop_h, w_off : w_off+self.crop_w], np.float32) 74 | label = np.asarray(label_pad[h_off : h_off+self.crop_h, w_off : w_off+self.crop_w], np.float32) 75 | #image = image[:, :, ::-1] # change to BGR 76 | image = image.transpose((2, 0, 1)) 77 | if self.is_mirror: 78 | flip = np.random.choice(2) * 2 - 1 79 | image = image[:, :, ::flip] 80 | label = label[:, ::flip] 81 | 82 | return image.copy(), label.copy(), np.array(size), name 83 | 84 | 85 | class VOCDataTestSet(data.Dataset): 86 | def __init__(self, root, list_path, crop_size=(505, 505), mean=(128, 128, 128)): 87 | self.root = root 88 | self.list_path = list_path 89 | self.crop_h, self.crop_w = crop_size 90 | self.mean = mean 91 | # self.mean_bgr = np.array([104.00698793, 116.66876762, 122.67891434]) 92 | self.img_ids = [i_id.strip() for i_id in open(list_path)] 93 | self.files = [] 94 | # for split in ["train", "trainval", "val"]: 95 | for name in self.img_ids: 96 | img_file = osp.join(self.root, "JPEGImages/%s.jpg" % name) 97 | self.files.append({ 98 | "img": img_file 99 | }) 100 | 101 | def __len__(self): 102 | return len(self.files) 103 | 104 | def __getitem__(self, index): 105 | datafiles = self.files[index] 106 | image = cv2.imread(datafiles["img"], cv2.IMREAD_COLOR) 107 | size = image.shape 108 | name = osp.splitext(osp.basename(datafiles["img"]))[0] 109 | image = np.asarray(image, np.float32) 110 | image -= self.mean 111 | 112 | img_h, img_w, _ = image.shape 113 | pad_h = max(self.crop_h - img_h, 0) 114 | pad_w = max(self.crop_w - img_w, 0) 115 | if pad_h > 0 or pad_w > 0: 116 | image = cv2.copyMakeBorder(image, 0, pad_h, 0, 117 | pad_w, cv2.BORDER_CONSTANT, 118 | value=(0.0, 0.0, 0.0)) 119 | image = image.transpose((2, 0, 1)) 120 | return image, name, size 121 | 122 | 123 | if __name__ == '__main__': 124 | dst = VOCDataSet("./data", is_transform=True) 125 | trainloader = data.DataLoader(dst, batch_size=4) 126 | for i, data in enumerate(trainloader): 127 | imgs, labels = data 128 | if i == 0: 129 | img = 
torchvision.utils.make_grid(imgs).numpy() 130 | img = np.transpose(img, (1, 2, 0)) 131 | img = img[:, :, ::-1] 132 | plt.imshow(img) 133 | plt.show() 134 | -------------------------------------------------------------------------------- /deeplab/loss.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | import torch.nn as nn 4 | from torch.autograd import Variable 5 | 6 | class CrossEntropy2d(nn.Module): 7 | 8 | def __init__(self, size_average=True, ignore_label=255): 9 | super(CrossEntropy2d, self).__init__() 10 | self.size_average = size_average 11 | self.ignore_label = ignore_label 12 | 13 | def forward(self, predict, target, weight=None): 14 | """ 15 | Args: 16 | predict:(n, c, h, w) 17 | target:(n, h, w) 18 | weight (Tensor, optional): a manual rescaling weight given to each class. 19 | If given, has to be a Tensor of size "nclasses" 20 | """ 21 | assert not target.requires_grad 22 | assert predict.dim() == 4 23 | assert target.dim() == 3 24 | assert predict.size(0) == target.size(0), "{0} vs {1} ".format(predict.size(0), target.size(0)) 25 | assert predict.size(2) == target.size(1), "{0} vs {1} ".format(predict.size(2), target.size(1)) 26 | assert predict.size(3) == target.size(2), "{0} vs {1} ".format(predict.size(3), target.size(3)) 27 | n, c, h, w = predict.size() 28 | target_mask = (target >= 0) * (target != self.ignore_label) 29 | target = target[target_mask] 30 | if not target.data.dim(): 31 | return Variable(torch.zeros(1)) 32 | predict = predict.transpose(1, 2).transpose(2, 3).contiguous() 33 | predict = predict[target_mask.view(n, h, w, 1).repeat(1, 1, 1, c)].view(-1, c) 34 | loss = F.cross_entropy(predict, target, weight=weight, size_average=self.size_average) 35 | return loss -------------------------------------------------------------------------------- /deeplab/metric.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import numpy as np 3 | 4 | from multiprocessing import Pool 5 | import copy_reg 6 | import types 7 | def _pickle_method(m): 8 | if m.im_self is None: 9 | return getattr, (m.im_class, m.im_func.func_name) 10 | else: 11 | return getattr, (m.im_self, m.im_func.func_name) 12 | 13 | copy_reg.pickle(types.MethodType, _pickle_method) 14 | 15 | class ConfusionMatrix(object): 16 | 17 | def __init__(self, nclass, classes=None): 18 | self.nclass = nclass 19 | self.classes = classes 20 | self.M = np.zeros((nclass, nclass)) 21 | 22 | def add(self, gt, pred): 23 | assert(np.max(pred) <= self.nclass) 24 | assert(len(gt) == len(pred)) 25 | for i in range(len(gt)): 26 | if not gt[i] == 255: 27 | self.M[gt[i], pred[i]] += 1.0 28 | 29 | def addM(self, matrix): 30 | assert(matrix.shape == self.M.shape) 31 | self.M += matrix 32 | 33 | def __str__(self): 34 | pass 35 | 36 | def recall(self): 37 | recall = 0.0 38 | for i in xrange(self.nclass): 39 | recall += self.M[i, i] / np.sum(self.M[:, i]) 40 | 41 | return recall/self.nclass 42 | 43 | def accuracy(self): 44 | accuracy = 0.0 45 | for i in xrange(self.nclass): 46 | accuracy += self.M[i, i] / np.sum(self.M[i, :]) 47 | 48 | return accuracy/self.nclass 49 | 50 | def jaccard(self): 51 | jaccard = 0.0 52 | jaccard_perclass = [] 53 | for i in xrange(self.nclass): 54 | jaccard_perclass.append(self.M[i, i] / (np.sum(self.M[i, :]) + np.sum(self.M[:, i]) - self.M[i, i])) 55 | 56 | return np.sum(jaccard_perclass)/len(jaccard_perclass), jaccard_perclass, self.M 57 | 58 | def generateM(self, 
63 |         gt, pred = item
64 |         m = np.zeros((self.nclass, self.nclass))
65 |         assert(len(gt) == len(pred))
66 |         for i in range(len(gt)):
67 |             if gt[i] < self.nclass:  # and pred[i] < self.nclass:
68 |                 m[gt[i], pred[i]] += 1.0
69 |         return m
70 | 
71 | 
72 | if __name__ == '__main__':
73 |     # Standalone usage: compare saved prediction PNGs against ground-truth
74 |     # PNGs. The original script called an undefined parse_args() and never
75 |     # imported cv2; a minimal argument parser is reconstructed here.
76 |     import argparse
77 |     import cv2
78 | 
79 |     parser = argparse.ArgumentParser(description="Compute mean IoU from saved predictions.")
80 |     parser.add_argument("--pred-dir", type=str, help="directory with predicted label PNGs")
81 |     parser.add_argument("--gt-dir", type=str, help="directory with ground-truth label PNGs")
82 |     parser.add_argument("--test-ids", type=str, help="file listing the image ids to evaluate")
83 |     parser.add_argument("--class-num", type=int, default=21, help="number of classes")
84 |     parser.add_argument("--save-path", type=str, default="result.txt", help="where to write the scores")
85 |     args = parser.parse_args()
86 | 
87 |     m_list = []
88 |     data_list = []
89 |     test_ids = [i.strip() for i in open(args.test_ids) if not i.strip() == '']
90 |     for index, img_id in enumerate(test_ids):
91 |         if index % 100 == 0:
92 |             print('%d processed' % (index))
93 |         pred_img_path = os.path.join(args.pred_dir, img_id + '.png')
94 |         gt_img_path = os.path.join(args.gt_dir, img_id + '.png')
95 |         pred = cv2.imread(pred_img_path, cv2.IMREAD_GRAYSCALE)
96 |         gt = cv2.imread(gt_img_path, cv2.IMREAD_GRAYSCALE)
97 |         # show_all(gt, pred)
98 |         data_list.append([gt.flatten(), pred.flatten()])
99 | 
100 |     ConfM = ConfusionMatrix(args.class_num)
101 |     f = ConfM.generateM
102 |     pool = Pool()
103 |     m_list = pool.map(f, data_list)
104 |     pool.close()
105 |     pool.join()
106 | 
107 |     for m in m_list:
108 |         ConfM.addM(m)
109 | 
110 |     aveJ, j_list, M = ConfM.jaccard()
111 |     with open(args.save_path, 'w') as f:
112 |         f.write('meanIOU: ' + str(aveJ) + '\n')
113 |         f.write(str(j_list) + '\n')
114 |         f.write(str(M) + '\n')

--------------------------------------------------------------------------------
/deeplab/model.py:
--------------------------------------------------------------------------------
1 | import torch.nn as nn
2 | import math
3 | import torch.utils.model_zoo as model_zoo
4 | import torch
5 | import numpy as np
6 | affine_par = True
7 | 
8 | 
9 | def outS(i):  # map an input spatial size to the output size at stride 8
10 |     i = int(i)
11 |     i = (i + 1) // 2  # floor division keeps this an int on Python 3
12 |     i = int(np.ceil((i + 1) / 2.0))
13 |     i = (i + 1) // 2
14 |     return i
15 | 
16 | def conv3x3(in_planes, out_planes, stride=1):
17 |     "3x3 convolution with padding"
18 |     return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
19 |                      padding=1, bias=False)
20 | 
21 | 
22 | class BasicBlock(nn.Module):
23 |     expansion = 1
24 | 
25 |     def __init__(self, inplanes, planes, stride=1, downsample=None):
26 |         super(BasicBlock, self).__init__()
27 |         self.conv1 = conv3x3(inplanes, planes, stride)
28 |         self.bn1 = nn.BatchNorm2d(planes, affine=affine_par)
29 |         self.relu = nn.ReLU(inplace=True)
30 |         self.conv2 = conv3x3(planes, planes)
31 |         self.bn2 = nn.BatchNorm2d(planes, affine=affine_par)
32 |         self.downsample = downsample
33 |         self.stride = stride
34 | 
35 |     def forward(self, x):
36 |         residual = x
37 | 
38 |         out = self.conv1(x)
39 |         out = self.bn1(out)
40 |         out = self.relu(out)
41 | 
42 |         out = self.conv2(out)
43 |         out = self.bn2(out)
44 | 
45 |         if self.downsample is not None:
46 |             residual = self.downsample(x)
47 | 
48 |         out += residual
49 |         out = self.relu(out)
50 | 
51 |         return out
52 | 
53 | 
54 | class Bottleneck(nn.Module):
55 |     expansion = 4
56 | 
57 |     def __init__(self, inplanes, planes, stride=1, dilation=1, downsample=None):
58 |         super(Bottleneck, self).__init__()
59 |         self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, stride=stride, bias=False)  # change
60 |         self.bn1 = nn.BatchNorm2d(planes, affine=affine_par)
61 |         for i in self.bn1.parameters():
62 |             i.requires_grad = False
63 | 
64 |         padding = dilation
65 |         self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1,  # change
66 |                                padding=padding, bias=False, dilation=dilation)
67 |         self.bn2 = nn.BatchNorm2d(planes, affine=affine_par)
68 |         for i in self.bn2.parameters():
69 |             i.requires_grad = False
70 |         self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
71 |         self.bn3 = nn.BatchNorm2d(planes * 4,
affine = affine_par) 72 | for i in self.bn3.parameters(): 73 | i.requires_grad = False 74 | self.relu = nn.ReLU(inplace=True) 75 | self.downsample = downsample 76 | self.stride = stride 77 | 78 | 79 | def forward(self, x): 80 | residual = x 81 | 82 | out = self.conv1(x) 83 | out = self.bn1(out) 84 | out = self.relu(out) 85 | 86 | out = self.conv2(out) 87 | out = self.bn2(out) 88 | out = self.relu(out) 89 | 90 | out = self.conv3(out) 91 | out = self.bn3(out) 92 | 93 | if self.downsample is not None: 94 | residual = self.downsample(x) 95 | 96 | out += residual 97 | out = self.relu(out) 98 | 99 | return out 100 | 101 | class Classifier_Module(nn.Module): 102 | 103 | def __init__(self, dilation_series, padding_series, num_classes): 104 | super(Classifier_Module, self).__init__() 105 | self.conv2d_list = nn.ModuleList() 106 | for dilation, padding in zip(dilation_series, padding_series): 107 | self.conv2d_list.append(nn.Conv2d(2048, num_classes, kernel_size=3, stride=1, padding=padding, dilation=dilation, bias = True)) 108 | 109 | for m in self.conv2d_list: 110 | m.weight.data.normal_(0, 0.01) 111 | 112 | def forward(self, x): 113 | out = self.conv2d_list[0](x) 114 | for i in range(len(self.conv2d_list)-1): 115 | out += self.conv2d_list[i+1](x) 116 | return out 117 | 118 | class Residual_Covolution(nn.Module): 119 | def __init__(self, icol, ocol, num_classes): 120 | super(Residual_Covolution, self).__init__() 121 | self.conv1 = nn.Conv2d(icol, ocol, kernel_size=3, stride=1, padding=12, dilation=12, bias=True) 122 | self.conv2 = nn.Conv2d(ocol, num_classes, kernel_size=3, stride=1, padding=12, dilation=12, bias=True) 123 | self.conv3 = nn.Conv2d(num_classes, ocol, kernel_size=1, stride=1, padding=0, dilation=1, bias=True) 124 | self.conv4 = nn.Conv2d(ocol, icol, kernel_size=1, stride=1, padding=0, dilation=1, bias=True) 125 | self.relu = nn.ReLU(inplace=True) 126 | 127 | def forward(self, x): 128 | dow1 = self.conv1(x) 129 | dow1 = self.relu(dow1) 130 | seg = self.conv2(dow1) 131 | inc1 = self.conv3(seg) 132 | add1 = dow1 + self.relu(inc1) 133 | inc2 = self.conv4(add1) 134 | out = x + self.relu(inc2) 135 | return out, seg 136 | 137 | class Residual_Refinement_Module(nn.Module): 138 | 139 | def __init__(self, num_classes): 140 | super(Residual_Refinement_Module, self).__init__() 141 | self.RC1 = Residual_Covolution(2048, 512, num_classes) 142 | self.RC2 = Residual_Covolution(2048, 512, num_classes) 143 | 144 | def forward(self, x): 145 | x, seg1 = self.RC1(x) 146 | _, seg2 = self.RC2(x) 147 | return [seg1, seg1+seg2] 148 | 149 | class ResNet_Refine(nn.Module): 150 | def __init__(self, block, layers, num_classes): 151 | self.inplanes = 64 152 | super(ResNet_Refine, self).__init__() 153 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, 154 | bias=False) 155 | self.bn1 = nn.BatchNorm2d(64, affine = affine_par) 156 | for i in self.bn1.parameters(): 157 | i.requires_grad = False 158 | self.relu = nn.ReLU(inplace=True) 159 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, ceil_mode=True) # change 160 | self.layer1 = self._make_layer(block, 64, layers[0]) 161 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2) 162 | self.layer3 = self._make_layer(block, 256, layers[2], stride=1, dilation=2) 163 | self.layer4 = self._make_layer(block, 512, layers[3], stride=1, dilation=4) 164 | self.layer5 = Residual_Refinement_Module(num_classes) 165 | 166 | for m in self.modules(): 167 | if isinstance(m, nn.Conv2d): 168 | n = m.kernel_size[0] * m.kernel_size[1] * 
m.out_channels 169 | m.weight.data.normal_(0, 0.01) 170 | elif isinstance(m, nn.BatchNorm2d): 171 | m.weight.data.fill_(1) 172 | m.bias.data.zero_() 173 | # for i in m.parameters(): 174 | # i.requires_grad = False 175 | 176 | def _make_layer(self, block, planes, blocks, stride=1, dilation=1): 177 | downsample = None 178 | if stride != 1 or self.inplanes != planes * block.expansion or dilation == 2 or dilation == 4: 179 | downsample = nn.Sequential( 180 | nn.Conv2d(self.inplanes, planes * block.expansion, 181 | kernel_size=1, stride=stride, bias=False), 182 | nn.BatchNorm2d(planes * block.expansion,affine = affine_par)) 183 | for i in downsample._modules['1'].parameters(): 184 | i.requires_grad = False 185 | layers = [] 186 | layers.append(block(self.inplanes, planes, stride,dilation=dilation, downsample=downsample)) 187 | self.inplanes = planes * block.expansion 188 | for i in range(1, blocks): 189 | layers.append(block(self.inplanes, planes, dilation=dilation)) 190 | 191 | return nn.Sequential(*layers) 192 | 193 | def forward(self, x): 194 | x = self.conv1(x) 195 | x = self.bn1(x) 196 | x = self.relu(x) 197 | x = self.maxpool(x) 198 | x = self.layer1(x) 199 | x = self.layer2(x) 200 | x = self.layer3(x) 201 | x = self.layer4(x) 202 | x = self.layer5(x) 203 | 204 | return x 205 | 206 | class ResNet(nn.Module): 207 | def __init__(self, block, layers, num_classes): 208 | self.inplanes = 64 209 | super(ResNet, self).__init__() 210 | self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, 211 | bias=False) 212 | self.bn1 = nn.BatchNorm2d(64, affine = affine_par) 213 | for i in self.bn1.parameters(): 214 | i.requires_grad = False 215 | self.relu = nn.ReLU(inplace=True) 216 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1, ceil_mode=True) # change 217 | self.layer1 = self._make_layer(block, 64, layers[0]) 218 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2) 219 | self.layer3 = self._make_layer(block, 256, layers[2], stride=1, dilation=2) 220 | self.layer4 = self._make_layer(block, 512, layers[3], stride=1, dilation=4) 221 | self.layer5 = self._make_pred_layer(Classifier_Module, [6,12,18,24],[6,12,18,24],num_classes) 222 | 223 | for m in self.modules(): 224 | if isinstance(m, nn.Conv2d): 225 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels 226 | m.weight.data.normal_(0, 0.01) 227 | elif isinstance(m, nn.BatchNorm2d): 228 | m.weight.data.fill_(1) 229 | m.bias.data.zero_() 230 | # for i in m.parameters(): 231 | # i.requires_grad = False 232 | 233 | def _make_layer(self, block, planes, blocks, stride=1, dilation=1): 234 | downsample = None 235 | if stride != 1 or self.inplanes != planes * block.expansion or dilation == 2 or dilation == 4: 236 | downsample = nn.Sequential( 237 | nn.Conv2d(self.inplanes, planes * block.expansion, 238 | kernel_size=1, stride=stride, bias=False), 239 | nn.BatchNorm2d(planes * block.expansion,affine = affine_par)) 240 | for i in downsample._modules['1'].parameters(): 241 | i.requires_grad = False 242 | layers = [] 243 | layers.append(block(self.inplanes, planes, stride,dilation=dilation, downsample=downsample)) 244 | self.inplanes = planes * block.expansion 245 | for i in range(1, blocks): 246 | layers.append(block(self.inplanes, planes, dilation=dilation)) 247 | 248 | return nn.Sequential(*layers) 249 | def _make_pred_layer(self,block, dilation_series, padding_series,num_classes): 250 | return block(dilation_series,padding_series,num_classes) 251 | 252 | def forward(self, x): 253 | x = self.conv1(x) 254 | x = 
self.bn1(x)
255 |         x = self.relu(x)
256 |         x = self.maxpool(x)
257 |         x = self.layer1(x)
258 |         x = self.layer2(x)
259 |         x = self.layer3(x)
260 |         x = self.layer4(x)
261 |         x = self.layer5(x)
262 | 
263 |         return x
264 | 
265 | class MS_Deeplab(nn.Module):
266 |     def __init__(self, block, num_classes):
267 |         super(MS_Deeplab, self).__init__()
268 |         self.Scale = ResNet(block, [3, 4, 23, 3], num_classes)  # changed to fix #4
269 | 
270 |     def forward(self, x):
271 |         output = self.Scale(x)  # for original scale
272 |         output_size = output.size()[2]
273 |         input_size = x.size()[2]
274 | 
275 |         # Build the resize ops locally instead of assigning them to self in
276 |         # forward(); align_corners=True matches the interp used in evaluate.py.
277 |         interp1 = nn.Upsample(size=(int(input_size*0.75)+1, int(input_size*0.75)+1), mode='bilinear', align_corners=True)
278 |         interp2 = nn.Upsample(size=(int(input_size*0.5)+1, int(input_size*0.5)+1), mode='bilinear', align_corners=True)
279 |         interp3 = nn.Upsample(size=(output_size, output_size), mode='bilinear', align_corners=True)
280 | 
281 |         x75 = interp1(x)
282 |         output75 = interp3(self.Scale(x75))  # for 0.75x scale
283 | 
284 |         x5 = interp2(x)
285 |         output5 = interp3(self.Scale(x5))  # for 0.5x scale
286 | 
287 |         out_max = torch.max(torch.max(output, output75), output5)
288 |         return [output, output75, output5, out_max]
289 | 
290 | def Res_Ms_Deeplab(num_classes=21):
291 |     model = MS_Deeplab(Bottleneck, num_classes)
292 |     return model
293 | 
294 | def Res_Deeplab(num_classes=21, is_refine=False):
295 |     if is_refine:
296 |         model = ResNet_Refine(Bottleneck, [3, 4, 23, 3], num_classes)
297 |     else:
298 |         model = ResNet(Bottleneck, [3, 4, 23, 3], num_classes)
299 |     return model
300 | 

--------------------------------------------------------------------------------
/evaluate.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import scipy
3 | from scipy import ndimage
4 | import cv2
5 | import numpy as np
6 | import sys
7 | 
8 | import torch
9 | from torch.autograd import Variable
10 | import torchvision.models as models
11 | import torch.nn.functional as F
12 | from torch.utils import data
13 | from deeplab.model import Res_Deeplab
14 | from deeplab.datasets import VOCDataSet
15 | from collections import OrderedDict
16 | import os
17 | 
18 | import matplotlib.pyplot as plt
19 | import torch.nn as nn
20 | IMG_MEAN = np.array((104.00698793, 116.66876762, 122.67891434), dtype=np.float32)
21 | 
22 | DATA_DIRECTORY = '../../data/VOCdevkit/voc12'
23 | DATA_LIST_PATH = './dataset/list/val.txt'
24 | IGNORE_LABEL = 255
25 | NUM_CLASSES = 21
26 | NUM_STEPS = 1449  # Number of images in the validation set.
27 | RESTORE_FROM = './deeplab_resnet.ckpt'
28 | 
29 | def get_arguments():
30 |     """Parse all the arguments provided from the CLI.
31 | 
32 |     Returns:
33 |       The parsed arguments (an argparse.Namespace).
34 | """ 35 | parser = argparse.ArgumentParser(description="DeepLabLFOV Network") 36 | parser.add_argument("--data-dir", type=str, default=DATA_DIRECTORY, 37 | help="Path to the directory containing the PASCAL VOC dataset.") 38 | parser.add_argument("--data-list", type=str, default=DATA_LIST_PATH, 39 | help="Path to the file listing the images in the dataset.") 40 | parser.add_argument("--ignore-label", type=int, default=IGNORE_LABEL, 41 | help="The index of the label to ignore during the training.") 42 | parser.add_argument("--num-classes", type=int, default=NUM_CLASSES, 43 | help="Number of classes to predict (including background).") 44 | parser.add_argument("--restore-from", type=str, default=RESTORE_FROM, 45 | help="Where restore model parameters from.") 46 | parser.add_argument("--gpu", type=int, default=0, 47 | help="choose gpu device.") 48 | return parser.parse_args() 49 | 50 | 51 | def get_iou(data_list, class_num, save_path=None): 52 | from multiprocessing import Pool 53 | from deeplab.metric import ConfusionMatrix 54 | 55 | ConfM = ConfusionMatrix(class_num) 56 | f = ConfM.generateM 57 | pool = Pool() 58 | m_list = pool.map(f, data_list) 59 | pool.close() 60 | pool.join() 61 | 62 | for m in m_list: 63 | ConfM.addM(m) 64 | 65 | aveJ, j_list, M = ConfM.jaccard() 66 | print('meanIOU: ' + str(aveJ) + '\n') 67 | if save_path: 68 | with open(save_path, 'w') as f: 69 | f.write('meanIOU: ' + str(aveJ) + '\n') 70 | f.write(str(j_list)+'\n') 71 | f.write(str(M)+'\n') 72 | 73 | def show_all(gt, pred): 74 | import matplotlib.pyplot as plt 75 | from matplotlib import colors 76 | from mpl_toolkits.axes_grid1 import make_axes_locatable 77 | 78 | fig, axes = plt.subplots(1, 2) 79 | ax1, ax2 = axes 80 | 81 | classes = np.array(('background', # always index 0 82 | 'aeroplane', 'bicycle', 'bird', 'boat', 83 | 'bottle', 'bus', 'car', 'cat', 'chair', 84 | 'cow', 'diningtable', 'dog', 'horse', 85 | 'motorbike', 'person', 'pottedplant', 86 | 'sheep', 'sofa', 'train', 'tvmonitor')) 87 | colormap = [(0,0,0),(0.5,0,0),(0,0.5,0),(0.5,0.5,0),(0,0,0.5),(0.5,0,0.5),(0,0.5,0.5), 88 | (0.5,0.5,0.5),(0.25,0,0),(0.75,0,0),(0.25,0.5,0),(0.75,0.5,0),(0.25,0,0.5), 89 | (0.75,0,0.5),(0.25,0.5,0.5),(0.75,0.5,0.5),(0,0.25,0),(0.5,0.25,0),(0,0.75,0), 90 | (0.5,0.75,0),(0,0.25,0.5)] 91 | cmap = colors.ListedColormap(colormap) 92 | bounds=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21] 93 | norm = colors.BoundaryNorm(bounds, cmap.N) 94 | 95 | ax1.set_title('gt') 96 | ax1.imshow(gt, cmap=cmap, norm=norm) 97 | 98 | ax2.set_title('pred') 99 | ax2.imshow(pred, cmap=cmap, norm=norm) 100 | 101 | plt.show() 102 | 103 | def main(): 104 | """Create the model and start the evaluation process.""" 105 | args = get_arguments() 106 | 107 | gpu0 = args.gpu 108 | 109 | model = Res_Deeplab(num_classes=args.num_classes) 110 | 111 | saved_state_dict = torch.load(args.restore_from) 112 | model.load_state_dict(saved_state_dict) 113 | 114 | model.eval() 115 | model.cuda(gpu0) 116 | 117 | testloader = data.DataLoader(VOCDataSet(args.data_dir, args.data_list, crop_size=(505, 505), mean=IMG_MEAN, scale=False, mirror=False), 118 | batch_size=1, shuffle=False, pin_memory=True) 119 | 120 | interp = nn.Upsample(size=(505, 505), mode='bilinear', align_corners=True) 121 | data_list = [] 122 | 123 | for index, batch in enumerate(testloader): 124 | if index % 100 == 0: 125 | print('%d processd'%(index)) 126 | image, label, size, name = batch 127 | size = size[0].numpy() 128 | output = model(Variable(image, volatile=True).cuda(gpu0)) 129 | 
130 |         output = interp(output).cpu().data[0].numpy()
131 | 
132 |         output = output[:, :size[0], :size[1]]
133 |         gt = np.asarray(label[0].numpy()[:size[0], :size[1]], dtype=np.int)
134 | 
135 |         output = output.transpose(1, 2, 0)
136 |         output = np.asarray(np.argmax(output, axis=2), dtype=np.int)
137 | 
138 |         # show_all(gt, output)
139 |         data_list.append([gt.flatten(), output.flatten()])
140 | 
141 |     get_iou(data_list, args.num_classes)
142 | 
143 | 
144 | if __name__ == '__main__':
145 |     main()
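146 | 
147 | # Invocation sketch (assumed checkpoint path, following train.py's naming):
148 | #   python evaluate.py --restore-from ./snapshots/VOC12_scenes_20000.pth --gpu 0
149 | # Each image contributes a (gt, prediction) pair of flattened label maps to
150 | # data_list; get_iou() accumulates per-pair confusion matrices in a worker
151 | # pool and reports the mean Jaccard index (mIoU) over the 21 classes.
152 | 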
--------------------------------------------------------------------------------
/evaluate_msc.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import scipy
3 | from scipy import ndimage
4 | import cv2
5 | import numpy as np
6 | import sys
7 | 
8 | import torch
9 | from torch.autograd import Variable
10 | import torchvision.models as models
11 | import torch.nn.functional as F
12 | from torch.utils import data
13 | from deeplab.model import Res_Deeplab
14 | from deeplab.datasets import VOCDataSet
15 | from collections import OrderedDict
16 | import os
17 | 
18 | import matplotlib.pyplot as plt
19 | import torch.nn as nn
20 | IMG_MEAN = np.array((104.00698793, 116.66876762, 122.67891434), dtype=np.float32)
21 | 
22 | DATA_DIRECTORY = '../data/VOCdevkit/voc12'
23 | DATA_LIST_PATH = './dataset/list/val.txt'
24 | IGNORE_LABEL = 255
25 | NUM_CLASSES = 21
26 | NUM_STEPS = 1449  # Number of images in the validation set.
27 | RESTORE_FROM = './deeplab_resnet.ckpt'
28 | 
29 | def get_arguments():
30 |     """Parse all the arguments provided from the CLI.
31 | 
32 |     Returns:
33 |       The parsed arguments.
34 |     """
35 |     parser = argparse.ArgumentParser(description="DeepLab-ResNet Network")
36 |     parser.add_argument("--data-dir", type=str, default=DATA_DIRECTORY,
37 |                         help="Path to the directory containing the PASCAL VOC dataset.")
38 |     parser.add_argument("--data-list", type=str, default=DATA_LIST_PATH,
39 |                         help="Path to the file listing the images in the dataset.")
40 |     parser.add_argument("--ignore-label", type=int, default=IGNORE_LABEL,
41 |                         help="The index of the label to ignore during training.")
42 |     parser.add_argument("--num-classes", type=int, default=NUM_CLASSES,
43 |                         help="Number of classes to predict (including background).")
44 |     parser.add_argument("--restore-from", type=str, default=RESTORE_FROM,
45 |                         help="Where to restore model parameters from.")
46 |     parser.add_argument("--gpu", type=int, default=0,
47 |                         help="Choose the GPU device.")
48 |     return parser.parse_args()
49 | 
50 | 
51 | def get_iou(data_list, class_num, save_path=None):
52 |     from multiprocessing import Pool
53 |     from deeplab.metric import ConfusionMatrix
54 | 
55 |     ConfM = ConfusionMatrix(class_num)
56 |     f = ConfM.generateM
57 |     pool = Pool()
58 |     m_list = pool.map(f, data_list)
59 |     pool.close()
60 |     pool.join()
61 | 
62 |     for m in m_list:
63 |         ConfM.addM(m)
64 | 
65 |     aveJ, j_list, M = ConfM.jaccard()
66 |     print('meanIOU: ' + str(aveJ) + '\n')
67 |     if save_path:
68 |         with open(save_path, 'w') as f:
69 |             f.write('meanIOU: ' + str(aveJ) + '\n')
70 |             f.write(str(j_list) + '\n')
71 |             f.write(str(M) + '\n')
72 | 
73 | def show_all(gt, pred):
74 |     import matplotlib.pyplot as plt
75 |     from matplotlib import colors
76 |     from mpl_toolkits.axes_grid1 import make_axes_locatable
77 | 
78 |     fig, axes = plt.subplots(1, 2)
79 |     ax1, ax2 = axes
80 | 
81 |     classes = np.array(('background',  # always index 0
82 |                         'aeroplane', 'bicycle', 'bird', 'boat',
83 |                         'bottle', 'bus', 'car', 'cat', 'chair',
84 |                         'cow', 'diningtable', 'dog', 'horse',
85 |                         'motorbike', 'person', 'pottedplant',
86 |                         'sheep', 'sofa', 'train', 'tvmonitor'))
87 |     colormap = [(0,0,0),(0.5,0,0),(0,0.5,0),(0.5,0.5,0),(0,0,0.5),(0.5,0,0.5),(0,0.5,0.5),
88 |                 (0.5,0.5,0.5),(0.25,0,0),(0.75,0,0),(0.25,0.5,0),(0.75,0.5,0),(0.25,0,0.5),
89 |                 (0.75,0,0.5),(0.25,0.5,0.5),(0.75,0.5,0.5),(0,0.25,0),(0.5,0.25,0),(0,0.75,0),
90 |                 (0.5,0.75,0),(0,0.25,0.5)]
91 |     cmap = colors.ListedColormap(colormap)
92 |     bounds = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]
93 |     norm = colors.BoundaryNorm(bounds, cmap.N)
94 | 
95 |     ax1.set_title('gt')
96 |     ax1.imshow(gt, cmap=cmap, norm=norm)
97 | 
98 |     ax2.set_title('pred')
99 |     ax2.imshow(pred, cmap=cmap, norm=norm)
100 | 
101 |     plt.show()
102 | 
103 | def main():
104 |     """Create the model and start the evaluation process."""
105 |     args = get_arguments()
106 | 
107 |     gpu0 = args.gpu
108 | 
109 |     model = Res_Deeplab(num_classes=args.num_classes)
110 | 
111 |     saved_state_dict = torch.load(args.restore_from)
112 |     model.load_state_dict(saved_state_dict)
113 | 
114 |     model.eval()
115 |     model.cuda(gpu0)
116 | 
117 |     testloader = data.DataLoader(VOCDataSet(args.data_dir, args.data_list, crop_size=(505, 505), mean=IMG_MEAN, scale=False, mirror=False),
118 |                                  batch_size=1, shuffle=False, pin_memory=True)
119 | 
120 |     interp = nn.Upsample(size=(505, 505), mode='bilinear')
121 |     data_list = []
122 | 
123 |     for index, batch in enumerate(testloader):
124 |         if index % 100 == 0:
125 |             print('%d processed' % (index))
126 |         images, label, size, name = batch
127 |         h, w, c = size[0].numpy()
128 |         images075 = nn.Upsample(size=(int(h*0.75), int(w*0.75)), mode='bilinear')(images)
129 |         images05 = nn.Upsample(size=(int(h*0.5), int(w*0.5)), mode='bilinear')(images)
130 | 
131 |         # 'volatile=True' became a no-op in PyTorch >= 0.4; no_grad() replaces it.
132 |         with torch.no_grad():
133 |             out100 = model(images.cuda(args.gpu))
134 |             out075 = model(images075.cuda(args.gpu))
135 |             out05 = model(images05.cuda(args.gpu))
136 |             o_h, o_w = out100.size()[2:]
137 |             interpo1 = nn.Upsample(size=(o_h, o_w), mode='bilinear')
138 |             out_max = torch.max(torch.stack([out100, interpo1(out075), interpo1(out05)]), dim=0)[0]
139 | 
140 |         output = interp(out_max).cpu().data[0].numpy()
141 |         output = output[:, :h, :w]
142 |         output = output.transpose(1, 2, 0)
143 |         output = np.asarray(np.argmax(output, axis=2), dtype=np.int)
144 | 
145 |         gt = np.asarray(label[0].numpy()[:h, :w], dtype=np.int)
146 | 
147 |         # show_all(gt, output)
148 |         data_list.append([gt.flatten(), output.flatten()])
149 | 
150 |     get_iou(data_list, args.num_classes)
151 | 
152 | 
153 | if __name__ == '__main__':
154 |     main()
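155 | 
156 | # Fusion note (sketch): the three logit maps are max-fused at the 1.0x output
157 | # resolution before the final (505, 505) upsampling and per-pixel argmax, i.e.
158 | #   fused = torch.max(torch.stack([out100, up(out075), up(out05)]), dim=0)[0]
159 | # following the max-over-scales test-time scheme of the DeepLab paper.
160 | 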
--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | 
3 | import torch
4 | import torch.nn as nn
5 | from torch.utils import data
6 | import numpy as np
7 | import pickle
8 | import cv2
9 | from torch.autograd import Variable
10 | import torch.optim as optim
11 | import scipy.misc
12 | import torch.backends.cudnn as cudnn
13 | import sys
14 | import os
15 | import os.path as osp
16 | from deeplab.model import Res_Deeplab
17 | from deeplab.loss import CrossEntropy2d
18 | from deeplab.datasets import VOCDataSet
19 | import matplotlib.pyplot as plt
20 | import random
21 | import timeit
22 | start = timeit.default_timer()
23 | 
24 | IMG_MEAN = np.array((104.00698793, 116.66876762, 122.67891434), dtype=np.float32)
25 | 
26 | BATCH_SIZE = 10
27 | DATA_DIRECTORY = '../../data/VOCdevkit/voc12'
28 | DATA_LIST_PATH = './dataset/list/train_aug.txt'
29 | IGNORE_LABEL = 255
30 | INPUT_SIZE = '321,321'
31 | LEARNING_RATE = 2.5e-4
32 | MOMENTUM = 0.9
33 | NUM_CLASSES = 21
34 | NUM_STEPS = 20000
35 | POWER = 0.9
36 | RANDOM_SEED = 1234
37 | RESTORE_FROM = './dataset/MS_DeepLab_resnet_pretrained_COCO_init.pth'
38 | SAVE_NUM_IMAGES = 2
39 | SAVE_PRED_EVERY = 1000
40 | SNAPSHOT_DIR = './snapshots/'
41 | WEIGHT_DECAY = 0.0005
42 | 
43 | def get_arguments():
44 |     """Parse all the arguments provided from the CLI.
45 | 
46 |     Returns:
47 |       The parsed arguments.
48 |     """
49 |     parser = argparse.ArgumentParser(description="DeepLab-ResNet Network")
50 |     parser.add_argument("--batch-size", type=int, default=BATCH_SIZE,
51 |                         help="Number of images sent to the network in one step.")
52 |     parser.add_argument("--data-dir", type=str, default=DATA_DIRECTORY,
53 |                         help="Path to the directory containing the PASCAL VOC dataset.")
54 |     parser.add_argument("--data-list", type=str, default=DATA_LIST_PATH,
55 |                         help="Path to the file listing the images in the dataset.")
56 |     parser.add_argument("--ignore-label", type=int, default=IGNORE_LABEL,
57 |                         help="The index of the label to ignore during training.")
58 |     parser.add_argument("--input-size", type=str, default=INPUT_SIZE,
59 |                         help="Comma-separated string with height and width of images.")
60 |     parser.add_argument("--is-training", action="store_true",
61 |                         help="Whether to update the running means and variances during training.")
62 |     parser.add_argument("--learning-rate", type=float, default=LEARNING_RATE,
63 |                         help="Base learning rate for training with polynomial decay.")
64 |     parser.add_argument("--momentum", type=float, default=MOMENTUM,
65 |                         help="Momentum component of the optimiser.")
66 |     parser.add_argument("--not-restore-last", action="store_true",
67 |                         help="Whether to not restore the last (FC) layers.")
68 |     parser.add_argument("--num-classes", type=int, default=NUM_CLASSES,
69 |                         help="Number of classes to predict (including background).")
70 |     parser.add_argument("--num-steps", type=int, default=NUM_STEPS,
71 |                         help="Number of training steps.")
72 |     parser.add_argument("--power", type=float, default=POWER,
73 |                         help="Decay parameter to compute the learning rate.")
74 |     parser.add_argument("--random-mirror", action="store_true",
75 |                         help="Whether to randomly mirror the inputs during training.")
76 |     parser.add_argument("--random-scale", action="store_true",
77 |                         help="Whether to randomly scale the inputs during training.")
78 |     parser.add_argument("--random-seed", type=int, default=RANDOM_SEED,
79 |                         help="Random seed to have reproducible results.")
80 |     parser.add_argument("--restore-from", type=str, default=RESTORE_FROM,
81 |                         help="Where to restore model parameters from.")
82 |     parser.add_argument("--save-num-images", type=int, default=SAVE_NUM_IMAGES,
83 |                         help="How many images to save.")
84 |     parser.add_argument("--save-pred-every", type=int, default=SAVE_PRED_EVERY,
85 |                         help="How often to save summaries and checkpoints.")
86 |     parser.add_argument("--snapshot-dir", type=str, default=SNAPSHOT_DIR,
87 |                         help="Where to save snapshots of the model.")
88 |     parser.add_argument("--weight-decay", type=float, default=WEIGHT_DECAY,
89 |                         help="Regularisation parameter for L2-loss.")
90 |     parser.add_argument("--gpu", type=int, default=0,
91 |                         help="Choose the GPU device.")
92 |     return parser.parse_args()
93 | 
94 | args = get_arguments()
95 | 
96 | def loss_calc(pred, label):
97 |     """
98 |     This function returns the cross-entropy loss for semantic segmentation.
99 |     """
100 |     # pred shape: batch_size x channels x h x w
101 |     # label shape: batch_size x h x w (void pixels marked with IGNORE_LABEL)
102 |     label = Variable(label.long()).cuda()
103 |     criterion = torch.nn.CrossEntropyLoss(ignore_index=IGNORE_LABEL).cuda()
104 | 
105 |     return criterion(pred, label)
106 | 
107 | 
108 | def lr_poly(base_lr, iter, max_iter, power):
109 |     return base_lr * ((1 - float(iter) / max_iter) ** (power))
110 | 
111 | 
112 | def get_1x_lr_params_NOscale(model):
113 |     """
114 |     This generator returns all the parameters of the net except for
115 |     the last classification layer. Note that for each batchnorm layer,
116 |     requires_grad is set to False in deeplab/model.py, so this function
117 |     does not return any batchnorm parameter.
118 |     """
119 |     b = []
120 | 
121 |     b.append(model.conv1)
122 |     b.append(model.bn1)
123 |     b.append(model.layer1)
124 |     b.append(model.layer2)
125 |     b.append(model.layer3)
126 |     b.append(model.layer4)
127 | 
128 | 
129 |     for i in range(len(b)):
130 |         for j in b[i].modules():
131 |             for k in j.parameters():
132 |                 if k.requires_grad:
133 |                     yield k
134 | 
135 | def get_10x_lr_params(model):
136 |     """
137 |     This generator returns all the parameters of the last layer of the net,
138 |     which does the classification of pixels into classes.
139 |     """
140 |     b = []
141 |     b.append(model.layer5.parameters())
142 | 
143 |     for j in range(len(b)):
144 |         for i in b[j]:
145 |             yield i
146 | 
147 | 
148 | def adjust_learning_rate(optimizer, i_iter):
149 |     """Polynomially decays the learning rate; the classifier group stays at 10x the base rate."""
150 |     lr = lr_poly(args.learning_rate, i_iter, args.num_steps, args.power)
151 |     optimizer.param_groups[0]['lr'] = lr
152 |     optimizer.param_groups[1]['lr'] = lr * 10
153 | 
154 | 
155 | def main():
156 |     """Create the model and start the training."""
157 | 
158 |     os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
159 |     h, w = map(int, args.input_size.split(','))
160 |     input_size = (h, w)
161 | 
162 |     cudnn.enabled = True
163 | 
164 |     # Create network.
165 |     model = Res_Deeplab(num_classes=args.num_classes)
166 |     # For a small batch size, it is better to keep
167 |     # the statistics of the BN layers (running means and variances)
168 |     # frozen, and to not update the values provided by the pre-trained model.
169 |     # If is_training=True, the statistics will be updated during training.
170 |     # Note that is_training=False still updates the BN parameters gamma (scale)
171 |     # and beta (offset) if they are present in the var_list of the optimiser definition.
172 | 
173 |     saved_state_dict = torch.load(args.restore_from)
174 |     new_params = model.state_dict().copy()
175 |     for i in saved_state_dict:
176 |         # e.g. Scale.layer5.conv2d_list.3.weight
177 |         i_parts = i.split('.')
178 |         # print(i_parts)
179 |         # Copy everything except the classifier ('layer5') unless the number
180 |         # of classes differs from the pre-trained checkpoint.
181 |         if not args.num_classes == 21 or not i_parts[1] == 'layer5':
182 |             new_params['.'.join(i_parts[1:])] = saved_state_dict[i]
183 |     model.load_state_dict(new_params)
184 |     # model.float()
185 |     # model.eval()  # use_global_stats = True
186 |     model.train()
187 |     model.cuda()
188 | 
189 |     cudnn.benchmark = True
190 | 
191 |     if not os.path.exists(args.snapshot_dir):
192 |         os.makedirs(args.snapshot_dir)
193 | 
194 | 
195 |     trainloader = data.DataLoader(VOCDataSet(args.data_dir, args.data_list, max_iters=args.num_steps*args.batch_size, crop_size=input_size,
196 |                                              scale=args.random_scale, mirror=args.random_mirror, mean=IMG_MEAN),
197 |                                   batch_size=args.batch_size, shuffle=True, num_workers=5, pin_memory=True)
198 | 
199 |     optimizer = optim.SGD([{'params': get_1x_lr_params_NOscale(model), 'lr': args.learning_rate},
200 |                            {'params': get_10x_lr_params(model), 'lr': 10*args.learning_rate}],
201 |                           lr=args.learning_rate, momentum=args.momentum, weight_decay=args.weight_decay)
202 |     optimizer.zero_grad()
203 | 
204 |     interp = nn.Upsample(size=input_size, mode='bilinear', align_corners=True)
205 | 
206 | 
207 |     for i_iter, batch in enumerate(trainloader):
208 |         images, labels, _, _ = batch
209 |         images = Variable(images).cuda()
210 | 
211 |         optimizer.zero_grad()
212 |         adjust_learning_rate(optimizer, i_iter)
213 |         pred = interp(model(images))
214 |         loss = loss_calc(pred, labels)
215 |         loss.backward()
216 |         optimizer.step()
217 | 
218 | 
219 |         print('iter = {} of {} completed, loss = {}'.format(i_iter, args.num_steps, loss.data.cpu().numpy()))
220 | 
221 |         if i_iter >= args.num_steps - 1:
222 |             print('save model ...')
223 |             torch.save(model.state_dict(), osp.join(args.snapshot_dir, 'VOC12_scenes_' + str(args.num_steps) + '.pth'))
224 |             break
225 | 
226 |         if i_iter % args.save_pred_every == 0 and i_iter != 0:
227 |             print('taking snapshot ...')
228 |             torch.save(model.state_dict(), osp.join(args.snapshot_dir, 'VOC12_scenes_' + str(i_iter) + '.pth'))
229 | 
230 |     end = timeit.default_timer()
231 |     print(end - start, 'seconds')
232 | 
233 | if __name__ == '__main__':
234 |     main()
235 | 
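236 | # Poly schedule sketch: with the defaults LEARNING_RATE=2.5e-4, POWER=0.9 and
237 | # NUM_STEPS=20000, the base rate follows
238 | #   lr(i) = 2.5e-4 * (1 - i/20000) ** 0.9
239 | # e.g. lr(10000) ~= 2.5e-4 * 0.5**0.9 ~= 1.34e-4, while the classifier
240 | # parameter group (get_10x_lr_params) always runs at 10x this value.
241 | 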
--------------------------------------------------------------------------------
/train_msc.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | 
3 | import torch
4 | import torch.nn as nn
5 | from torch.utils import data
6 | import numpy as np
7 | import pickle
8 | import cv2
9 | from torch.autograd import Variable
10 | import torch.optim as optim
11 | import scipy.misc
12 | import torch.backends.cudnn as cudnn
13 | import sys
14 | import os
15 | import os.path as osp
16 | import scipy.ndimage as nd
17 | from deeplab.model import Res_Deeplab
18 | from deeplab.loss import CrossEntropy2d
19 | from deeplab.datasets import VOCDataSet
20 | import matplotlib.pyplot as plt
21 | import random
22 | import timeit
23 | start = timeit.default_timer()  # timeit.timeit() would time an empty statement
24 | 
25 | IMG_MEAN = np.array((104.00698793, 116.66876762, 122.67891434), dtype=np.float32)
26 | 
27 | BATCH_SIZE = 1
28 | DATA_DIRECTORY = '../data/VOCdevkit/voc12'
29 | DATA_LIST_PATH = './dataset/list/train_aug.txt'
30 | ITER_SIZE = 10
31 | IGNORE_LABEL = 255
32 | INPUT_SIZE = '321,321'
33 | LEARNING_RATE = 2.5e-4
34 | MOMENTUM = 0.9
35 | NUM_CLASSES = 21
36 | NUM_STEPS = 20000
37 | POWER = 0.9
38 | RANDOM_SEED = 1234
39 | RESTORE_FROM = './dataset/MS_DeepLab_resnet_pretrained_COCO_init.pth'
40 | SAVE_NUM_IMAGES = 2
41 | SAVE_PRED_EVERY = 1000
42 | SNAPSHOT_DIR = './snapshots_msc/'
43 | WEIGHT_DECAY = 0.0005
44 | 
45 | def get_arguments():
46 |     """Parse all the arguments provided from the CLI.
47 | 
48 |     Returns:
49 |       The parsed arguments.
50 |     """
51 |     parser = argparse.ArgumentParser(description="DeepLab-ResNet Network")
52 |     parser.add_argument("--batch-size", type=int, default=BATCH_SIZE,
53 |                         help="Number of images sent to the network in one step.")
54 |     parser.add_argument("--data-dir", type=str, default=DATA_DIRECTORY,
55 |                         help="Path to the directory containing the PASCAL VOC dataset.")
56 |     parser.add_argument("--data-list", type=str, default=DATA_LIST_PATH,
57 |                         help="Path to the file listing the images in the dataset.")
58 |     parser.add_argument("--ignore-label", type=int, default=IGNORE_LABEL,
59 |                         help="The index of the label to ignore during training.")
60 |     parser.add_argument("--input-size", type=str, default=INPUT_SIZE,
61 |                         help="Comma-separated string with height and width of images.")
62 |     parser.add_argument("--iter-size", type=int, default=ITER_SIZE,
63 |                         help="Number of steps after which the gradient update is applied.")
64 |     parser.add_argument("--is-training", action="store_true",
65 |                         help="Whether to update the running means and variances during training.")
66 |     parser.add_argument("--learning-rate", type=float, default=LEARNING_RATE,
67 |                         help="Base learning rate for training with polynomial decay.")
68 |     parser.add_argument("--momentum", type=float, default=MOMENTUM,
69 |                         help="Momentum component of the optimiser.")
70 |     parser.add_argument("--not-restore-last", action="store_true",
71 |                         help="Whether to not restore the last (FC) layers.")
72 |     parser.add_argument("--num-classes", type=int, default=NUM_CLASSES,
73 |                         help="Number of classes to predict (including background).")
74 |     parser.add_argument("--num-steps", type=int, default=NUM_STEPS,
75 |                         help="Number of training steps.")
76 |     parser.add_argument("--power", type=float, default=POWER,
77 |                         help="Decay parameter to compute the learning rate.")
78 |     parser.add_argument("--random-mirror", action="store_true",
79 |                         help="Whether to randomly mirror the inputs during training.")
80 |     parser.add_argument("--random-scale", action="store_true",
81 |                         help="Whether to randomly scale the inputs during training.")
82 |     parser.add_argument("--random-seed", type=int, default=RANDOM_SEED,
83 |                         help="Random seed to have reproducible results.")
84 |     parser.add_argument("--restore-from", type=str, default=RESTORE_FROM,
85 |                         help="Where to restore model parameters from.")
86 |     parser.add_argument("--save-num-images", type=int, default=SAVE_NUM_IMAGES,
87 |                         help="How many images to save.")
88 |     parser.add_argument("--save-pred-every", type=int, default=SAVE_PRED_EVERY,
89 |                         help="How often to save summaries and checkpoints.")
90 |     parser.add_argument("--snapshot-dir", type=str, default=SNAPSHOT_DIR,
91 |                         help="Where to save snapshots of the model.")
92 |     parser.add_argument("--weight-decay", type=float, default=WEIGHT_DECAY,
93 |                         help="Regularisation parameter for L2-loss.")
94 |     parser.add_argument("--gpu", type=int, default=0,
95 |                         help="Choose the GPU device.")
96 |     return parser.parse_args()
97 | 
98 | args = get_arguments()
99 | 
100 | def loss_calc(pred, label, gpu):
101 |     """
102 |     This function returns the cross-entropy loss for semantic segmentation.
103 |     """
104 |     # pred shape: batch_size x channels x h x w
105 |     # label shape: batch_size x h x w (numpy array, void pixels marked with 255)
106 |     label = torch.from_numpy(label).long()
107 |     label = Variable(label).cuda(gpu)
108 |     m = nn.LogSoftmax(dim=1)  # explicit dim avoids the implicit-dim deprecation
109 |     criterion = CrossEntropy2d().cuda(gpu)
110 |     pred = m(pred)
111 | 
112 |     return criterion(pred, label)
113 | 
114 | 
115 | def lr_poly(base_lr, iter, max_iter, power):
116 |     return base_lr * ((1 - float(iter) / max_iter) ** (power))
117 | 
118 | 
119 | def get_1x_lr_params_NOscale(model):
120 |     """
121 |     This generator returns all the parameters of the net except for
122 |     the last classification layer. Note that for each batchnorm layer,
123 |     requires_grad is set to False in deeplab/model.py, so this function
124 |     does not return any batchnorm parameter.
125 |     """
126 |     b = []
127 | 
128 |     b.append(model.conv1)
129 |     b.append(model.bn1)
130 |     b.append(model.layer1)
131 |     b.append(model.layer2)
132 |     b.append(model.layer3)
133 |     b.append(model.layer4)
134 | 
135 | 
136 |     for i in range(len(b)):
137 |         for j in b[i].modules():
138 |             for k in j.parameters():
139 |                 if k.requires_grad:
140 |                     yield k
141 | 
142 | def get_10x_lr_params(model):
143 |     """
144 |     This generator returns all the parameters of the last layer of the net,
145 |     which does the classification of pixels into classes.
146 |     """
147 |     b = []
148 |     b.append(model.layer5.parameters())
149 | 
150 |     for j in range(len(b)):
151 |         for i in b[j]:
152 |             yield i
153 | 
154 | 
155 | def adjust_learning_rate(optimizer, i_iter):
156 |     """Polynomially decays the learning rate; the classifier group stays at 10x the base rate."""
157 |     lr = lr_poly(args.learning_rate, i_iter, args.num_steps, args.power)
158 |     optimizer.param_groups[0]['lr'] = lr
159 |     optimizer.param_groups[1]['lr'] = lr * 10
160 | 
161 | 
162 | def main():
163 |     """Create the model and start the training."""
164 | 
165 |     h, w = map(int, args.input_size.split(','))
166 |     input_size = (h, w)
167 | 
168 |     cudnn.enabled = True
169 |     gpu = args.gpu
170 | 
171 |     # Create network.
172 |     model = Res_Deeplab(num_classes=args.num_classes)
173 |     # For a small batch size, it is better to keep
174 |     # the statistics of the BN layers (running means and variances)
175 |     # frozen, and to not update the values provided by the pre-trained model.
176 |     # If is_training=True, the statistics will be updated during training.
177 |     # Note that is_training=False still updates the BN parameters gamma (scale)
178 |     # and beta (offset) if they are present in the var_list of the optimiser definition.
179 | 
180 |     saved_state_dict = torch.load(args.restore_from)
181 |     new_params = model.state_dict().copy()
182 |     for i in saved_state_dict:
183 |         # e.g. Scale.layer5.conv2d_list.3.weight
184 |         i_parts = i.split('.')
185 |         # print(i_parts)
186 |         # Copy everything except the classifier ('layer5') unless the number
187 |         # of classes differs from the pre-trained checkpoint.
188 |         if not args.num_classes == 21 or not i_parts[1] == 'layer5':
189 |             new_params['.'.join(i_parts[1:])] = saved_state_dict[i]
190 |     model.load_state_dict(new_params)
191 |     # model.float()
192 |     # model.eval()  # use_global_stats = True
193 |     model.train()
194 |     model.cuda(args.gpu)
195 | 
196 |     cudnn.benchmark = True
197 | 
198 |     if not os.path.exists(args.snapshot_dir):
199 |         os.makedirs(args.snapshot_dir)
200 | 
201 | 
202 |     trainloader = data.DataLoader(VOCDataSet(args.data_dir, args.data_list, max_iters=args.num_steps*args.iter_size,
203 |                                              crop_size=input_size, scale=args.random_scale, mirror=args.random_mirror, mean=IMG_MEAN),
204 |                                   batch_size=args.batch_size, shuffle=True, num_workers=1, pin_memory=True)
205 | 
206 |     optimizer = optim.SGD([{'params': get_1x_lr_params_NOscale(model), 'lr': args.learning_rate},
207 |                            {'params': get_10x_lr_params(model), 'lr': 10*args.learning_rate}],
208 |                           lr=args.learning_rate, momentum=args.momentum, weight_decay=args.weight_decay)
209 |     optimizer.zero_grad()
210 | 
211 |     b_loss = 0
212 |     for i_iter, batch in enumerate(trainloader):
213 | 
214 |         images, labels, _, _ = batch
215 |         images, labels = Variable(images), labels.numpy()
216 |         h, w = images.size()[2:]
217 |         images075 = nn.Upsample(size=(int(h*0.75), int(w*0.75)), mode='bilinear')(images)
218 |         images05 = nn.Upsample(size=(int(h*0.5), int(w*0.5)), mode='bilinear')(images)
219 | 
220 |         out = model(images.cuda(args.gpu))
221 |         out075 = model(images075.cuda(args.gpu))
222 |         out05 = model(images05.cuda(args.gpu))
223 |         o_h, o_w = out.size()[2:]
224 |         interpo1 = nn.Upsample(size=(o_h, o_w), mode='bilinear')
225 |         interpo2 = nn.Upsample(size=(h, w), mode='bilinear')
226 |         out_max = interpo2(torch.max(torch.stack([out, interpo1(out075), interpo1(out05)]), dim=0)[0])
227 | 
228 |         loss = loss_calc(out_max, labels, args.gpu)  # fused output vs. full-size labels
229 |         d1, d2 = float(labels.shape[1]), float(labels.shape[2])
230 |         loss100 = loss_calc(out, nd.zoom(labels, (1.0, out.size()[2]/d1, out.size()[3]/d2), order=0), args.gpu)
231 |         loss075 = loss_calc(out075, nd.zoom(labels, (1.0, out075.size()[2]/d1, out075.size()[3]/d2), order=0), args.gpu)
232 |         loss05 = loss_calc(out05, nd.zoom(labels, (1.0, out05.size()[2]/d1, out05.size()[3]/d2), order=0), args.gpu)
233 |         loss_all = (loss + loss100 + loss075 + loss05) / args.iter_size
234 |         loss_all.backward()
235 |         b_loss += loss_all.data.cpu().numpy()
236 | 
237 |         b_iter = i_iter // args.iter_size  # integer division; '/' breaks under Python 3
238 | 
239 |         if b_iter >= args.num_steps - 1:
240 |             print('save model ...')
241 |             optimizer.step()
242 |             torch.save(model.state_dict(), osp.join(args.snapshot_dir, 'VOC12_scenes_' + str(args.num_steps) + '.pth'))
243 |             break
244 | 
245 |         if i_iter % args.iter_size == 0 and i_iter != 0:
246 |             print('iter = {} of {} completed, loss = {}'.format(b_iter, args.num_steps, b_loss))
247 |             optimizer.step()
248 |             adjust_learning_rate(optimizer, b_iter)
249 |             optimizer.zero_grad()
250 |             b_loss = 0
251 | 
252 |         if i_iter % (args.save_pred_every * args.iter_size) == 0 and b_iter != 0:
253 |             print('taking snapshot ...')
254 |             torch.save(model.state_dict(), osp.join(args.snapshot_dir, 'VOC12_scenes_' + str(b_iter) + '.pth'))
255 | 
256 |     end = timeit.default_timer()  # was timeit.timeit(), which does not return wall-clock time
257 |     print(end - start, 'seconds')
258 | 
259 | if __name__ == '__main__':
260 |     main()
261 | 
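262 | # Gradient-accumulation sketch: with the defaults BATCH_SIZE=1 and ITER_SIZE=10,
263 | # each optimiser step aggregates gradients from 10 consecutive single-image
264 | # forward/backward passes (losses are pre-divided by iter_size above), which
265 | # approximates an effective batch of 10 on one GPU. Assumed invocation:
266 | #   python train_msc.py --random-mirror --random-scale --gpu 0
267 | 
--------------------------------------------------------------------------------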