├── .gitignore ├── LICENSE ├── README.md ├── data ├── 0 Basic Label Analysis.ipynb ├── 1 Preprocessing.ipynb ├── 2 Data Augmentation.ipynb ├── 3- Test Torch DataLoader Transforms.ipynb ├── log_preprocessing ├── sample │ ├── 10_left.jpeg │ ├── 10_right.jpeg │ ├── 13_left.jpeg │ ├── 13_right.jpeg │ ├── 15_left.jpeg │ ├── 15_right.jpeg │ ├── 16_left.jpeg │ ├── 16_right.jpeg │ ├── 17_left.jpeg │ └── 17_right.jpeg ├── sampleSubmission.csv └── trainLabels.csv ├── logs └── .gitkeep ├── requirements.txt ├── setup.sh ├── sparse_setup.sh └── src ├── DRNet.py ├── Sparse_ResNet.py ├── configuration.py.example ├── data_loading.py ├── kaggle.py ├── logVisualizer.py ├── lr_scheduler_configuration.py ├── output_writing.py ├── preprocess.py ├── quadratic_weighted_kappa.py ├── s_train_model.py ├── sparse_config.py ├── train_model.py ├── train_sparse_model.py ├── trainer.py ├── transfer_learning_configuration.py ├── transforms_configuration.py └── visualization ├── __init__.py ├── cnn_layer_visualization.py ├── gradcam.py ├── guided_backprop.py └── misc_functions.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | env/ 12 | build/ 13 | develop-eggs/ 14 | dist/ 15 | downloads/ 16 | eggs/ 17 | .eggs/ 18 | lib/ 19 | lib64/ 20 | parts/ 21 | sdist/ 22 | var/ 23 | wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *.cover 47 | .hypothesis/ 48 | 49 | # Translations 50 | *.mo 51 | *.pot 52 | 53 | # Django stuff: 54 | *.log 55 | local_settings.py 56 | 57 | # Flask stuff: 58 | instance/ 59 | .webassets-cache 60 | 61 | # Scrapy stuff: 62 | .scrapy 63 | 64 | # Sphinx documentation 65 | docs/_build/ 66 | 67 | # PyBuilder 68 | target/ 69 | 70 | # Jupyter Notebook 71 | .ipynb_checkpoints 72 | 73 | # pyenv 74 | .python-version 75 | 76 | # celery beat schedule file 77 | celerybeat-schedule 78 | 79 | # SageMath parsed files 80 | *.sage.py 81 | 82 | # dotenv 83 | .env 84 | 85 | # virtualenv 86 | .venv 87 | venv/ 88 | ENV/ 89 | 90 | # Spyder project settings 91 | .spyderproject 92 | .spyproject 93 | 94 | # Rope project settings 95 | .ropeproject 96 | 97 | # mkdocs documentation 98 | /site 99 | 100 | # mypy 101 | .mypy_cache/ 102 | data/train 103 | data/test 104 | configuration.py 105 | submission.csv 106 | data/test_small 107 | data/train_small 108 | models 109 | data/test_small_300 110 | data/train_small_300 111 | generated 112 | results 113 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 
7 | 8 | Preamble 9 | 10 | The GNU General Public License is a free, copyleft license for 11 | software and other kinds of works. 12 | 13 | The licenses for most software and other practical works are designed 14 | to take away your freedom to share and change the works. By contrast, 15 | the GNU General Public License is intended to guarantee your freedom to 16 | share and change all versions of a program--to make sure it remains free 17 | software for all its users. We, the Free Software Foundation, use the 18 | GNU General Public License for most of our software; it applies also to 19 | any other work released this way by its authors. You can apply it to 20 | your programs, too. 21 | 22 | When we speak of free software, we are referring to freedom, not 23 | price. Our General Public Licenses are designed to make sure that you 24 | have the freedom to distribute copies of free software (and charge for 25 | them if you wish), that you receive source code or can get it if you 26 | want it, that you can change the software or use pieces of it in new 27 | free programs, and that you know you can do these things. 28 | 29 | To protect your rights, we need to prevent others from denying you 30 | these rights or asking you to surrender the rights. Therefore, you have 31 | certain responsibilities if you distribute copies of the software, or if 32 | you modify it: responsibilities to respect the freedom of others. 33 | 34 | For example, if you distribute copies of such a program, whether 35 | gratis or for a fee, you must pass on to the recipients the same 36 | freedoms that you received. You must make sure that they, too, receive 37 | or can get the source code. And you must show them these terms so they 38 | know their rights. 39 | 40 | Developers that use the GNU GPL protect your rights with two steps: 41 | (1) assert copyright on the software, and (2) offer you this License 42 | giving you legal permission to copy, distribute and/or modify it. 43 | 44 | For the developers' and authors' protection, the GPL clearly explains 45 | that there is no warranty for this free software. For both users' and 46 | authors' sake, the GPL requires that modified versions be marked as 47 | changed, so that their problems will not be attributed erroneously to 48 | authors of previous versions. 49 | 50 | Some devices are designed to deny users access to install or run 51 | modified versions of the software inside them, although the manufacturer 52 | can do so. This is fundamentally incompatible with the aim of 53 | protecting users' freedom to change the software. The systematic 54 | pattern of such abuse occurs in the area of products for individuals to 55 | use, which is precisely where it is most unacceptable. Therefore, we 56 | have designed this version of the GPL to prohibit the practice for those 57 | products. If such problems arise substantially in other domains, we 58 | stand ready to extend this provision to those domains in future versions 59 | of the GPL, as needed to protect the freedom of users. 60 | 61 | Finally, every program is threatened constantly by software patents. 62 | States should not allow patents to restrict development and use of 63 | software on general-purpose computers, but in those that do, we wish to 64 | avoid the special danger that patents applied to a free program could 65 | make it effectively proprietary. To prevent this, the GPL assures that 66 | patents cannot be used to render the program non-free. 
67 | 68 | The precise terms and conditions for copying, distribution and 69 | modification follow. 70 | 71 | TERMS AND CONDITIONS 72 | 73 | 0. Definitions. 74 | 75 | "This License" refers to version 3 of the GNU General Public License. 76 | 77 | "Copyright" also means copyright-like laws that apply to other kinds of 78 | works, such as semiconductor masks. 79 | 80 | "The Program" refers to any copyrightable work licensed under this 81 | License. Each licensee is addressed as "you". "Licensees" and 82 | "recipients" may be individuals or organizations. 83 | 84 | To "modify" a work means to copy from or adapt all or part of the work 85 | in a fashion requiring copyright permission, other than the making of an 86 | exact copy. The resulting work is called a "modified version" of the 87 | earlier work or a work "based on" the earlier work. 88 | 89 | A "covered work" means either the unmodified Program or a work based 90 | on the Program. 91 | 92 | To "propagate" a work means to do anything with it that, without 93 | permission, would make you directly or secondarily liable for 94 | infringement under applicable copyright law, except executing it on a 95 | computer or modifying a private copy. Propagation includes copying, 96 | distribution (with or without modification), making available to the 97 | public, and in some countries other activities as well. 98 | 99 | To "convey" a work means any kind of propagation that enables other 100 | parties to make or receive copies. Mere interaction with a user through 101 | a computer network, with no transfer of a copy, is not conveying. 102 | 103 | An interactive user interface displays "Appropriate Legal Notices" 104 | to the extent that it includes a convenient and prominently visible 105 | feature that (1) displays an appropriate copyright notice, and (2) 106 | tells the user that there is no warranty for the work (except to the 107 | extent that warranties are provided), that licensees may convey the 108 | work under this License, and how to view a copy of this License. If 109 | the interface presents a list of user commands or options, such as a 110 | menu, a prominent item in the list meets this criterion. 111 | 112 | 1. Source Code. 113 | 114 | The "source code" for a work means the preferred form of the work 115 | for making modifications to it. "Object code" means any non-source 116 | form of a work. 117 | 118 | A "Standard Interface" means an interface that either is an official 119 | standard defined by a recognized standards body, or, in the case of 120 | interfaces specified for a particular programming language, one that 121 | is widely used among developers working in that language. 122 | 123 | The "System Libraries" of an executable work include anything, other 124 | than the work as a whole, that (a) is included in the normal form of 125 | packaging a Major Component, but which is not part of that Major 126 | Component, and (b) serves only to enable use of the work with that 127 | Major Component, or to implement a Standard Interface for which an 128 | implementation is available to the public in source code form. A 129 | "Major Component", in this context, means a major essential component 130 | (kernel, window system, and so on) of the specific operating system 131 | (if any) on which the executable work runs, or a compiler used to 132 | produce the work, or an object code interpreter used to run it. 
133 | 134 | The "Corresponding Source" for a work in object code form means all 135 | the source code needed to generate, install, and (for an executable 136 | work) run the object code and to modify the work, including scripts to 137 | control those activities. However, it does not include the work's 138 | System Libraries, or general-purpose tools or generally available free 139 | programs which are used unmodified in performing those activities but 140 | which are not part of the work. For example, Corresponding Source 141 | includes interface definition files associated with source files for 142 | the work, and the source code for shared libraries and dynamically 143 | linked subprograms that the work is specifically designed to require, 144 | such as by intimate data communication or control flow between those 145 | subprograms and other parts of the work. 146 | 147 | The Corresponding Source need not include anything that users 148 | can regenerate automatically from other parts of the Corresponding 149 | Source. 150 | 151 | The Corresponding Source for a work in source code form is that 152 | same work. 153 | 154 | 2. Basic Permissions. 155 | 156 | All rights granted under this License are granted for the term of 157 | copyright on the Program, and are irrevocable provided the stated 158 | conditions are met. This License explicitly affirms your unlimited 159 | permission to run the unmodified Program. The output from running a 160 | covered work is covered by this License only if the output, given its 161 | content, constitutes a covered work. This License acknowledges your 162 | rights of fair use or other equivalent, as provided by copyright law. 163 | 164 | You may make, run and propagate covered works that you do not 165 | convey, without conditions so long as your license otherwise remains 166 | in force. You may convey covered works to others for the sole purpose 167 | of having them make modifications exclusively for you, or provide you 168 | with facilities for running those works, provided that you comply with 169 | the terms of this License in conveying all material for which you do 170 | not control copyright. Those thus making or running the covered works 171 | for you must do so exclusively on your behalf, under your direction 172 | and control, on terms that prohibit them from making any copies of 173 | your copyrighted material outside their relationship with you. 174 | 175 | Conveying under any other circumstances is permitted solely under 176 | the conditions stated below. Sublicensing is not allowed; section 10 177 | makes it unnecessary. 178 | 179 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law. 180 | 181 | No covered work shall be deemed part of an effective technological 182 | measure under any applicable law fulfilling obligations under article 183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or 184 | similar laws prohibiting or restricting circumvention of such 185 | measures. 186 | 187 | When you convey a covered work, you waive any legal power to forbid 188 | circumvention of technological measures to the extent such circumvention 189 | is effected by exercising rights under this License with respect to 190 | the covered work, and you disclaim any intention to limit operation or 191 | modification of the work as a means of enforcing, against the work's 192 | users, your or third parties' legal rights to forbid circumvention of 193 | technological measures. 194 | 195 | 4. Conveying Verbatim Copies. 
196 | 197 | You may convey verbatim copies of the Program's source code as you 198 | receive it, in any medium, provided that you conspicuously and 199 | appropriately publish on each copy an appropriate copyright notice; 200 | keep intact all notices stating that this License and any 201 | non-permissive terms added in accord with section 7 apply to the code; 202 | keep intact all notices of the absence of any warranty; and give all 203 | recipients a copy of this License along with the Program. 204 | 205 | You may charge any price or no price for each copy that you convey, 206 | and you may offer support or warranty protection for a fee. 207 | 208 | 5. Conveying Modified Source Versions. 209 | 210 | You may convey a work based on the Program, or the modifications to 211 | produce it from the Program, in the form of source code under the 212 | terms of section 4, provided that you also meet all of these conditions: 213 | 214 | a) The work must carry prominent notices stating that you modified 215 | it, and giving a relevant date. 216 | 217 | b) The work must carry prominent notices stating that it is 218 | released under this License and any conditions added under section 219 | 7. This requirement modifies the requirement in section 4 to 220 | "keep intact all notices". 221 | 222 | c) You must license the entire work, as a whole, under this 223 | License to anyone who comes into possession of a copy. This 224 | License will therefore apply, along with any applicable section 7 225 | additional terms, to the whole of the work, and all its parts, 226 | regardless of how they are packaged. This License gives no 227 | permission to license the work in any other way, but it does not 228 | invalidate such permission if you have separately received it. 229 | 230 | d) If the work has interactive user interfaces, each must display 231 | Appropriate Legal Notices; however, if the Program has interactive 232 | interfaces that do not display Appropriate Legal Notices, your 233 | work need not make them do so. 234 | 235 | A compilation of a covered work with other separate and independent 236 | works, which are not by their nature extensions of the covered work, 237 | and which are not combined with it such as to form a larger program, 238 | in or on a volume of a storage or distribution medium, is called an 239 | "aggregate" if the compilation and its resulting copyright are not 240 | used to limit the access or legal rights of the compilation's users 241 | beyond what the individual works permit. Inclusion of a covered work 242 | in an aggregate does not cause this License to apply to the other 243 | parts of the aggregate. 244 | 245 | 6. Conveying Non-Source Forms. 246 | 247 | You may convey a covered work in object code form under the terms 248 | of sections 4 and 5, provided that you also convey the 249 | machine-readable Corresponding Source under the terms of this License, 250 | in one of these ways: 251 | 252 | a) Convey the object code in, or embodied in, a physical product 253 | (including a physical distribution medium), accompanied by the 254 | Corresponding Source fixed on a durable physical medium 255 | customarily used for software interchange. 
256 | 257 | b) Convey the object code in, or embodied in, a physical product 258 | (including a physical distribution medium), accompanied by a 259 | written offer, valid for at least three years and valid for as 260 | long as you offer spare parts or customer support for that product 261 | model, to give anyone who possesses the object code either (1) a 262 | copy of the Corresponding Source for all the software in the 263 | product that is covered by this License, on a durable physical 264 | medium customarily used for software interchange, for a price no 265 | more than your reasonable cost of physically performing this 266 | conveying of source, or (2) access to copy the 267 | Corresponding Source from a network server at no charge. 268 | 269 | c) Convey individual copies of the object code with a copy of the 270 | written offer to provide the Corresponding Source. This 271 | alternative is allowed only occasionally and noncommercially, and 272 | only if you received the object code with such an offer, in accord 273 | with subsection 6b. 274 | 275 | d) Convey the object code by offering access from a designated 276 | place (gratis or for a charge), and offer equivalent access to the 277 | Corresponding Source in the same way through the same place at no 278 | further charge. You need not require recipients to copy the 279 | Corresponding Source along with the object code. If the place to 280 | copy the object code is a network server, the Corresponding Source 281 | may be on a different server (operated by you or a third party) 282 | that supports equivalent copying facilities, provided you maintain 283 | clear directions next to the object code saying where to find the 284 | Corresponding Source. Regardless of what server hosts the 285 | Corresponding Source, you remain obligated to ensure that it is 286 | available for as long as needed to satisfy these requirements. 287 | 288 | e) Convey the object code using peer-to-peer transmission, provided 289 | you inform other peers where the object code and Corresponding 290 | Source of the work are being offered to the general public at no 291 | charge under subsection 6d. 292 | 293 | A separable portion of the object code, whose source code is excluded 294 | from the Corresponding Source as a System Library, need not be 295 | included in conveying the object code work. 296 | 297 | A "User Product" is either (1) a "consumer product", which means any 298 | tangible personal property which is normally used for personal, family, 299 | or household purposes, or (2) anything designed or sold for incorporation 300 | into a dwelling. In determining whether a product is a consumer product, 301 | doubtful cases shall be resolved in favor of coverage. For a particular 302 | product received by a particular user, "normally used" refers to a 303 | typical or common use of that class of product, regardless of the status 304 | of the particular user or of the way in which the particular user 305 | actually uses, or expects or is expected to use, the product. A product 306 | is a consumer product regardless of whether the product has substantial 307 | commercial, industrial or non-consumer uses, unless such uses represent 308 | the only significant mode of use of the product. 
309 | 310 | "Installation Information" for a User Product means any methods, 311 | procedures, authorization keys, or other information required to install 312 | and execute modified versions of a covered work in that User Product from 313 | a modified version of its Corresponding Source. The information must 314 | suffice to ensure that the continued functioning of the modified object 315 | code is in no case prevented or interfered with solely because 316 | modification has been made. 317 | 318 | If you convey an object code work under this section in, or with, or 319 | specifically for use in, a User Product, and the conveying occurs as 320 | part of a transaction in which the right of possession and use of the 321 | User Product is transferred to the recipient in perpetuity or for a 322 | fixed term (regardless of how the transaction is characterized), the 323 | Corresponding Source conveyed under this section must be accompanied 324 | by the Installation Information. But this requirement does not apply 325 | if neither you nor any third party retains the ability to install 326 | modified object code on the User Product (for example, the work has 327 | been installed in ROM). 328 | 329 | The requirement to provide Installation Information does not include a 330 | requirement to continue to provide support service, warranty, or updates 331 | for a work that has been modified or installed by the recipient, or for 332 | the User Product in which it has been modified or installed. Access to a 333 | network may be denied when the modification itself materially and 334 | adversely affects the operation of the network or violates the rules and 335 | protocols for communication across the network. 336 | 337 | Corresponding Source conveyed, and Installation Information provided, 338 | in accord with this section must be in a format that is publicly 339 | documented (and with an implementation available to the public in 340 | source code form), and must require no special password or key for 341 | unpacking, reading or copying. 342 | 343 | 7. Additional Terms. 344 | 345 | "Additional permissions" are terms that supplement the terms of this 346 | License by making exceptions from one or more of its conditions. 347 | Additional permissions that are applicable to the entire Program shall 348 | be treated as though they were included in this License, to the extent 349 | that they are valid under applicable law. If additional permissions 350 | apply only to part of the Program, that part may be used separately 351 | under those permissions, but the entire Program remains governed by 352 | this License without regard to the additional permissions. 353 | 354 | When you convey a copy of a covered work, you may at your option 355 | remove any additional permissions from that copy, or from any part of 356 | it. (Additional permissions may be written to require their own 357 | removal in certain cases when you modify the work.) You may place 358 | additional permissions on material, added by you to a covered work, 359 | for which you have or can give appropriate copyright permission. 
360 | 361 | Notwithstanding any other provision of this License, for material you 362 | add to a covered work, you may (if authorized by the copyright holders of 363 | that material) supplement the terms of this License with terms: 364 | 365 | a) Disclaiming warranty or limiting liability differently from the 366 | terms of sections 15 and 16 of this License; or 367 | 368 | b) Requiring preservation of specified reasonable legal notices or 369 | author attributions in that material or in the Appropriate Legal 370 | Notices displayed by works containing it; or 371 | 372 | c) Prohibiting misrepresentation of the origin of that material, or 373 | requiring that modified versions of such material be marked in 374 | reasonable ways as different from the original version; or 375 | 376 | d) Limiting the use for publicity purposes of names of licensors or 377 | authors of the material; or 378 | 379 | e) Declining to grant rights under trademark law for use of some 380 | trade names, trademarks, or service marks; or 381 | 382 | f) Requiring indemnification of licensors and authors of that 383 | material by anyone who conveys the material (or modified versions of 384 | it) with contractual assumptions of liability to the recipient, for 385 | any liability that these contractual assumptions directly impose on 386 | those licensors and authors. 387 | 388 | All other non-permissive additional terms are considered "further 389 | restrictions" within the meaning of section 10. If the Program as you 390 | received it, or any part of it, contains a notice stating that it is 391 | governed by this License along with a term that is a further 392 | restriction, you may remove that term. If a license document contains 393 | a further restriction but permits relicensing or conveying under this 394 | License, you may add to a covered work material governed by the terms 395 | of that license document, provided that the further restriction does 396 | not survive such relicensing or conveying. 397 | 398 | If you add terms to a covered work in accord with this section, you 399 | must place, in the relevant source files, a statement of the 400 | additional terms that apply to those files, or a notice indicating 401 | where to find the applicable terms. 402 | 403 | Additional terms, permissive or non-permissive, may be stated in the 404 | form of a separately written license, or stated as exceptions; 405 | the above requirements apply either way. 406 | 407 | 8. Termination. 408 | 409 | You may not propagate or modify a covered work except as expressly 410 | provided under this License. Any attempt otherwise to propagate or 411 | modify it is void, and will automatically terminate your rights under 412 | this License (including any patent licenses granted under the third 413 | paragraph of section 11). 414 | 415 | However, if you cease all violation of this License, then your 416 | license from a particular copyright holder is reinstated (a) 417 | provisionally, unless and until the copyright holder explicitly and 418 | finally terminates your license, and (b) permanently, if the copyright 419 | holder fails to notify you of the violation by some reasonable means 420 | prior to 60 days after the cessation. 
421 | 422 | Moreover, your license from a particular copyright holder is 423 | reinstated permanently if the copyright holder notifies you of the 424 | violation by some reasonable means, this is the first time you have 425 | received notice of violation of this License (for any work) from that 426 | copyright holder, and you cure the violation prior to 30 days after 427 | your receipt of the notice. 428 | 429 | Termination of your rights under this section does not terminate the 430 | licenses of parties who have received copies or rights from you under 431 | this License. If your rights have been terminated and not permanently 432 | reinstated, you do not qualify to receive new licenses for the same 433 | material under section 10. 434 | 435 | 9. Acceptance Not Required for Having Copies. 436 | 437 | You are not required to accept this License in order to receive or 438 | run a copy of the Program. Ancillary propagation of a covered work 439 | occurring solely as a consequence of using peer-to-peer transmission 440 | to receive a copy likewise does not require acceptance. However, 441 | nothing other than this License grants you permission to propagate or 442 | modify any covered work. These actions infringe copyright if you do 443 | not accept this License. Therefore, by modifying or propagating a 444 | covered work, you indicate your acceptance of this License to do so. 445 | 446 | 10. Automatic Licensing of Downstream Recipients. 447 | 448 | Each time you convey a covered work, the recipient automatically 449 | receives a license from the original licensors, to run, modify and 450 | propagate that work, subject to this License. You are not responsible 451 | for enforcing compliance by third parties with this License. 452 | 453 | An "entity transaction" is a transaction transferring control of an 454 | organization, or substantially all assets of one, or subdividing an 455 | organization, or merging organizations. If propagation of a covered 456 | work results from an entity transaction, each party to that 457 | transaction who receives a copy of the work also receives whatever 458 | licenses to the work the party's predecessor in interest had or could 459 | give under the previous paragraph, plus a right to possession of the 460 | Corresponding Source of the work from the predecessor in interest, if 461 | the predecessor has it or can get it with reasonable efforts. 462 | 463 | You may not impose any further restrictions on the exercise of the 464 | rights granted or affirmed under this License. For example, you may 465 | not impose a license fee, royalty, or other charge for exercise of 466 | rights granted under this License, and you may not initiate litigation 467 | (including a cross-claim or counterclaim in a lawsuit) alleging that 468 | any patent claim is infringed by making, using, selling, offering for 469 | sale, or importing the Program or any portion of it. 470 | 471 | 11. Patents. 472 | 473 | A "contributor" is a copyright holder who authorizes use under this 474 | License of the Program or a work on which the Program is based. The 475 | work thus licensed is called the contributor's "contributor version". 
476 | 477 | A contributor's "essential patent claims" are all patent claims 478 | owned or controlled by the contributor, whether already acquired or 479 | hereafter acquired, that would be infringed by some manner, permitted 480 | by this License, of making, using, or selling its contributor version, 481 | but do not include claims that would be infringed only as a 482 | consequence of further modification of the contributor version. For 483 | purposes of this definition, "control" includes the right to grant 484 | patent sublicenses in a manner consistent with the requirements of 485 | this License. 486 | 487 | Each contributor grants you a non-exclusive, worldwide, royalty-free 488 | patent license under the contributor's essential patent claims, to 489 | make, use, sell, offer for sale, import and otherwise run, modify and 490 | propagate the contents of its contributor version. 491 | 492 | In the following three paragraphs, a "patent license" is any express 493 | agreement or commitment, however denominated, not to enforce a patent 494 | (such as an express permission to practice a patent or covenant not to 495 | sue for patent infringement). To "grant" such a patent license to a 496 | party means to make such an agreement or commitment not to enforce a 497 | patent against the party. 498 | 499 | If you convey a covered work, knowingly relying on a patent license, 500 | and the Corresponding Source of the work is not available for anyone 501 | to copy, free of charge and under the terms of this License, through a 502 | publicly available network server or other readily accessible means, 503 | then you must either (1) cause the Corresponding Source to be so 504 | available, or (2) arrange to deprive yourself of the benefit of the 505 | patent license for this particular work, or (3) arrange, in a manner 506 | consistent with the requirements of this License, to extend the patent 507 | license to downstream recipients. "Knowingly relying" means you have 508 | actual knowledge that, but for the patent license, your conveying the 509 | covered work in a country, or your recipient's use of the covered work 510 | in a country, would infringe one or more identifiable patents in that 511 | country that you have reason to believe are valid. 512 | 513 | If, pursuant to or in connection with a single transaction or 514 | arrangement, you convey, or propagate by procuring conveyance of, a 515 | covered work, and grant a patent license to some of the parties 516 | receiving the covered work authorizing them to use, propagate, modify 517 | or convey a specific copy of the covered work, then the patent license 518 | you grant is automatically extended to all recipients of the covered 519 | work and works based on it. 520 | 521 | A patent license is "discriminatory" if it does not include within 522 | the scope of its coverage, prohibits the exercise of, or is 523 | conditioned on the non-exercise of one or more of the rights that are 524 | specifically granted under this License. 
You may not convey a covered 525 | work if you are a party to an arrangement with a third party that is 526 | in the business of distributing software, under which you make payment 527 | to the third party based on the extent of your activity of conveying 528 | the work, and under which the third party grants, to any of the 529 | parties who would receive the covered work from you, a discriminatory 530 | patent license (a) in connection with copies of the covered work 531 | conveyed by you (or copies made from those copies), or (b) primarily 532 | for and in connection with specific products or compilations that 533 | contain the covered work, unless you entered into that arrangement, 534 | or that patent license was granted, prior to 28 March 2007. 535 | 536 | Nothing in this License shall be construed as excluding or limiting 537 | any implied license or other defenses to infringement that may 538 | otherwise be available to you under applicable patent law. 539 | 540 | 12. No Surrender of Others' Freedom. 541 | 542 | If conditions are imposed on you (whether by court order, agreement or 543 | otherwise) that contradict the conditions of this License, they do not 544 | excuse you from the conditions of this License. If you cannot convey a 545 | covered work so as to satisfy simultaneously your obligations under this 546 | License and any other pertinent obligations, then as a consequence you may 547 | not convey it at all. For example, if you agree to terms that obligate you 548 | to collect a royalty for further conveying from those to whom you convey 549 | the Program, the only way you could satisfy both those terms and this 550 | License would be to refrain entirely from conveying the Program. 551 | 552 | 13. Use with the GNU Affero General Public License. 553 | 554 | Notwithstanding any other provision of this License, you have 555 | permission to link or combine any covered work with a work licensed 556 | under version 3 of the GNU Affero General Public License into a single 557 | combined work, and to convey the resulting work. The terms of this 558 | License will continue to apply to the part which is the covered work, 559 | but the special requirements of the GNU Affero General Public License, 560 | section 13, concerning interaction through a network will apply to the 561 | combination as such. 562 | 563 | 14. Revised Versions of this License. 564 | 565 | The Free Software Foundation may publish revised and/or new versions of 566 | the GNU General Public License from time to time. Such new versions will 567 | be similar in spirit to the present version, but may differ in detail to 568 | address new problems or concerns. 569 | 570 | Each version is given a distinguishing version number. If the 571 | Program specifies that a certain numbered version of the GNU General 572 | Public License "or any later version" applies to it, you have the 573 | option of following the terms and conditions either of that numbered 574 | version or of any later version published by the Free Software 575 | Foundation. If the Program does not specify a version number of the 576 | GNU General Public License, you may choose any version ever published 577 | by the Free Software Foundation. 578 | 579 | If the Program specifies that a proxy can decide which future 580 | versions of the GNU General Public License can be used, that proxy's 581 | public statement of acceptance of a version permanently authorizes you 582 | to choose that version for the Program. 
583 | 584 | Later license versions may give you additional or different 585 | permissions. However, no additional obligations are imposed on any 586 | author or copyright holder as a result of your choosing to follow a 587 | later version. 588 | 589 | 15. Disclaimer of Warranty. 590 | 591 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 592 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT 593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY 594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, 595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 596 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM 597 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF 598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 599 | 600 | 16. Limitation of Liability. 601 | 602 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS 604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY 605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE 606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF 607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD 608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), 609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF 610 | SUCH DAMAGES. 611 | 612 | 17. Interpretation of Sections 15 and 16. 613 | 614 | If the disclaimer of warranty and limitation of liability provided 615 | above cannot be given local legal effect according to their terms, 616 | reviewing courts shall apply local law that most closely approximates 617 | an absolute waiver of all civil liability in connection with the 618 | Program, unless a warranty or assumption of liability accompanies a 619 | copy of the Program in return for a fee. 620 | 621 | END OF TERMS AND CONDITIONS 622 | 623 | How to Apply These Terms to Your New Programs 624 | 625 | If you develop a new program, and you want it to be of the greatest 626 | possible use to the public, the best way to achieve this is to make it 627 | free software which everyone can redistribute and change under these terms. 628 | 629 | To do so, attach the following notices to the program. It is safest 630 | to attach them to the start of each source file to most effectively 631 | state the exclusion of warranty; and each file should have at least 632 | the "copyright" line and a pointer to where the full notice is found. 633 | 634 | 635 | Copyright (C) 636 | 637 | This program is free software: you can redistribute it and/or modify 638 | it under the terms of the GNU General Public License as published by 639 | the Free Software Foundation, either version 3 of the License, or 640 | (at your option) any later version. 641 | 642 | This program is distributed in the hope that it will be useful, 643 | but WITHOUT ANY WARRANTY; without even the implied warranty of 644 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 645 | GNU General Public License for more details. 646 | 647 | You should have received a copy of the GNU General Public License 648 | along with this program. If not, see . 649 | 650 | Also add information on how to contact you by electronic and paper mail. 
651 | 
652 | If the program does terminal interaction, make it output a short
653 | notice like this when it starts in an interactive mode:
654 | 
655 | Copyright (C)
656 | This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
657 | This is free software, and you are welcome to redistribute it
658 | under certain conditions; type `show c' for details.
659 | 
660 | The hypothetical commands `show w' and `show c' should show the appropriate
661 | parts of the General Public License. Of course, your program's commands
662 | might be different; for a GUI interface, you would use an "about box".
663 | 
664 | You should also get your employer (if you work as a programmer) or school,
665 | if any, to sign a "copyright disclaimer" for the program, if necessary.
666 | For more information on this, and how to apply and follow the GNU GPL, see
667 | .
668 | 
669 | The GNU General Public License does not permit incorporating your program
670 | into proprietary programs. If your program is a subroutine library, you
671 | may consider it more useful to permit linking proprietary applications with
672 | the library. If this is what you want to do, use the GNU Lesser General
673 | Public License instead of this License. But first, please read
674 | .
675 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DL4CV-TUM
2 | A common repository for TUM Team ID 47, comparing transfer learning with learning from scratch.
3 | 
4 | # Kaggle Contest
5 | https://www.kaggle.com/c/diabetic-retinopathy-detection
6 | 
7 | # Various Training Data
8 | https://docs.google.com/spreadsheets/d/1b3mdgSsHYE9qkPnDfKy6G-DpFOgDlyPRgZsmIW34bFE
9 | 
10 | # Setup and Training the Model
11 | 
12 | 1. Follow setup.sh to install everything needed to run this package. Run the commands one at a time rather than executing the whole script in one go.
13 | 2. Download the full dataset from the Kaggle contest linked above.
14 | 3. Preprocess the dataset with src/preprocess.py for faster training.
15 | 4. Copy src/configuration.py.example to src/configuration.py and adjust the parameters (the label-distribution sketch below is handy when picking a rebalance strategy).
16 | 5. Run src/train_model.py to train your model.
17 | 
18 | Have fun.
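# Checking the Label Distribution

Before training, it is worth looking at how skewed the class distribution in `data/trainLabels.csv` is; this is what `0 Basic Label Analysis.ipynb` explores, and it informs the `rebalance_strategy` option in `configuration.py`. A minimal sketch, assuming the standard Kaggle column names `image` and `level` (adjust if your CSV differs):

```python
# Quick look at the label distribution - a rough sketch, not code from this repository.
# Assumes the Kaggle trainLabels.csv layout: one row per eye image, with columns
# `image` (file name without extension) and `level` (severity grade 0-4).
import pandas as pd

labels = pd.read_csv("data/trainLabels.csv")
counts = labels["level"].value_counts().sort_index()
print(counts)                # absolute number of images per severity grade
print(counts / len(labels))  # relative frequencies; grade 0 dominates by a wide margin
```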
19 | -------------------------------------------------------------------------------- /data/sample/10_left.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/data/sample/10_left.jpeg -------------------------------------------------------------------------------- /data/sample/10_right.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/data/sample/10_right.jpeg -------------------------------------------------------------------------------- /data/sample/13_left.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/data/sample/13_left.jpeg -------------------------------------------------------------------------------- /data/sample/13_right.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/data/sample/13_right.jpeg -------------------------------------------------------------------------------- /data/sample/15_left.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/data/sample/15_left.jpeg -------------------------------------------------------------------------------- /data/sample/15_right.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/data/sample/15_right.jpeg -------------------------------------------------------------------------------- /data/sample/16_left.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/data/sample/16_left.jpeg -------------------------------------------------------------------------------- /data/sample/16_right.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/data/sample/16_right.jpeg -------------------------------------------------------------------------------- /data/sample/17_left.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/data/sample/17_left.jpeg -------------------------------------------------------------------------------- /data/sample/17_right.jpeg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/data/sample/17_right.jpeg -------------------------------------------------------------------------------- /logs/.gitkeep: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/saurabheights/DiabeticRetinopathyDetection/678129495dd771f608ea6f5d14fe519addefa434/logs/.gitkeep
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | torch
2 | torchvision
3 | pandas
4 | opencv-python
5 | numpy
6 | 
--------------------------------------------------------------------------------
/setup.sh:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env bash
2 | 
3 | if [ "$#" -ne 2 ]; then
4 | echo "Illegal number of parameters. Usage: bash setup.sh <kaggle_username> <kaggle_password>"; exit 1
5 | fi
6 | 
7 | # Step 1a - Install NVIDIA drivers and CUDA.
8 | sudo apt install nvidia-cuda-toolkit
9 | echo "After reboot, run nvidia-smi. There should be no driver/library version mismatch."
10 | sudo reboot
11 | 
12 | # Step 2 - Install Miniconda - Python 3.6.3
13 | curl -O https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
14 | echo "Installing miniconda in default location ~/miniconda3"
15 | bash ./Miniconda3-latest-Linux-x86_64.sh -b
16 | 
17 | echo "Adding conda to PATH in bashrc"
18 | echo 'export PATH=~/miniconda3/bin:$PATH' >> ~/.bashrc
19 | source ~/.bashrc
20 | 
21 | conda create -y --name py36 python=3.6.3
22 | source activate py36
23 | echo "Created conda environment py36"
24 | 
25 | echo "Installing dependencies for the project"
26 | conda install -y numpy=1.14.0 opencv=3.3.1
27 | conda install -y -c anaconda pillow=5.0.0 pandas=0.22.0 jupyter=1.0.0
28 | conda install -y -c pytorch pytorch=0.3.0 torchvision=0.2.0
29 | conda install -y -c conda-forge matplotlib=2.1.1 cycler=0.10.0 progressbar2=3.34.3
30 | 
31 | # Step 3 - Download dataset - https://www.kaggle.com/c/diabetic-retinopathy-detection
32 | echo "Downloading dataset. Remember to accept the competition rules at Kaggle by attempting to download a file first."
33 | pip install kaggle-cli
34 | sudo apt install p7zip-full # Faster unzip without concatenating multipart zip files
35 | mkdir -p data/full
36 | cd data/full
37 | KAGGLE_USERNAME=$1
38 | KAGGLE_PASSWORD=$2
39 | kg download -u ${KAGGLE_USERNAME} -p ${KAGGLE_PASSWORD} -c diabetic-retinopathy-detection
40 | 7z x test.zip.001
41 | rm test.zip.00*
42 | 7z x train.zip.001
43 | rm train.zip.00*
44 | unzip trainLabels.csv.zip
45 | 
46 | echo "Dataset downloaded and decompressed."
47 | 
48 | # Step 4 - Run preprocessing on both the train and test datasets.
49 | echo "Run src/preprocess.py over the train and test folder paths with scale 300, and don't forget to thank Omar :D."
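setup.sh ends by pointing at src/preprocess.py with scale 300. The actual logic lives in that file; purely as an illustration of what scale-based fundus resizing of this kind typically involves (not the repository's code), a sketch might look like this. The scale value 300 and the data paths come from the script above; the radius-estimation heuristic is an assumption.

```python
# Illustrative sketch only - the real implementation is src/preprocess.py.
# Rescales a fundus photograph so the retina has a fixed radius ("scale"),
# then crops a square around the image centre.
import cv2

def rescale_fundus(path, scale=300):
    img = cv2.imread(path)
    # Estimate the retina radius from the middle row of the image:
    # background pixels are nearly black, retina pixels are much brighter.
    middle_row = img[img.shape[0] // 2, :, :].sum(axis=1)
    radius = (middle_row > middle_row.mean() / 10).sum() / 2
    if radius < 1:  # fully dark image - avoid division by zero
        radius = img.shape[1] / 2
    s = scale / radius
    img = cv2.resize(img, (0, 0), fx=s, fy=s)
    # Centre-crop a square of side 2 * scale around the image centre.
    h, w = img.shape[:2]
    y0, x0 = max(h // 2 - scale, 0), max(w // 2 - scale, 0)
    return img[y0:y0 + 2 * scale, x0:x0 + 2 * scale]
```

Normalising by retina radius rather than by raw image size keeps lesions at a comparable scale across images taken with different cameras.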
--------------------------------------------------------------------------------
/sparse_setup.sh:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | # https://gist.github.com/bzamecnik/b0c342d22a2a21f6af9d10eba3d4597b
4 | 
5 | # A newer CUDA version is needed.
6 | 
7 | 
8 | # Step 1 - Install the NVIDIA driver:
9 | 
10 | wget http://us.download.nvidia.com/tesla/390.12/nvidia-diag-driver-local-repo-ubuntu1604-390.12_1.0-1_amd64.deb
11 | sudo apt-key add /var/nvidia-diag-driver-local-repo-390.12/7fa2af80.pub
12 | sudo dpkg -i nvidia-diag-driver-local-repo-ubuntu1604-390.12_1.0-1_amd64.deb
13 | sudo apt-get update
14 | sudo apt-get install cuda-drivers
15 | sudo reboot
16 | 
17 | 
18 | # Step 2 - Install CUDA 9.1.
19 | nvidia-smi
20 | echo "Check output of nvidia-smi"
21 | 
22 | wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
23 | sudo dpkg -i cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
24 | sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
25 | sudo apt-get update
26 | sudo apt-get install cuda
27 | sudo reboot
28 | 
29 | 
30 | # Afterwards the CUDA paths must be exported:
31 | 
32 | 
33 | 
34 | 
35 | curl -O https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
36 | echo "Installing miniconda in default location ~/miniconda3"
37 | bash ./Miniconda3-latest-Linux-x86_64.sh -b
38 | 
39 | echo "Adding conda and cuda to PATH in bashrc"
40 | 
41 | echo 'export PATH=/usr/local/cuda-9.1/bin${PATH:+:${PATH}}' >> ~/.bashrc
42 | 
43 | echo 'export LD_LIBRARY_PATH=/usr/local/cuda-9.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
44 | 
45 | echo 'export PATH=~/miniconda3/bin:$PATH' >> ~/.bashrc
46 | source ~/.bashrc
47 | 
48 | conda install -y numpy=1.14.0 opencv=3.3.1
49 | conda install -y -c anaconda pillow=5.0.0 pandas=0.22.0 jupyter=1.0.0
50 | conda install -y -c conda-forge matplotlib=2.1.1 cycler=0.10.0 progressbar2=3.34.3
51 | conda install pytorch
52 | 
53 | 
54 | git clone https://github.com/saurabheights/DiabeticRetinopathyDetection/
55 | 
56 | git clone https://github.com/facebookresearch/SparseConvNet
57 | 
58 | cd SparseConvNet/PyTorch
59 | sudo apt-get install libsparsehash-dev
60 | sudo apt-get install unrar
61 | pip install git+https://github.com/pytorch/tnt.git@master
62 | python setup.py develop
63 | 
64 | 
--------------------------------------------------------------------------------
/src/DRNet.py:
--------------------------------------------------------------------------------
1 | from torchvision.models import resnet18, resnet34, resnet50
2 | import torch.nn as nn
3 | from itertools import chain
4 | 
5 | class DRNet(nn.Module):
6 | 
7 | '''
8 | Class to create a ResNet model for transfer learning.
9 | 
10 | Arguments:
11 | num_classes: number of output target classes, default is 5
12 | pretrained: boolean that determines whether to load pretrained ImageNet weights, defaults to False
13 | net_size: choose which ResNet architecture, can be 18, 34, 50, otherwise defaults to 50
14 | freeze_features: boolean that determines whether to freeze some layers or not, defaults to False
15 | freeze_until_layer: integer giving the number of layers to freeze, counted from the beginning of the network.
16 | 
17 | freeze_until_layer can be {1,2,3,4,5}, other values are not allowed.
18 | 
19 | Each number corresponds to a layer in the ResNet architecture, see the table in the paper: https://arxiv.org/pdf/1512.03385.pdf
20 | 
21 | rates: an array of learning rates for each layer, has to be 6 elements (5 ResNet layers + fc layer)
22 | 
23 | '''
24 | def __init__(self, num_classes=5, pretrained=False, net_size=50,
25 | freeze_features=False, freeze_until_layer=5, rates=[], default_lr=1e-9):
26 | super(DRNet, self).__init__()
27 | 
28 | self.layer_learning_rates = rates
29 | self.default_lr = default_lr
30 | 
31 | if(net_size == 18):
32 | self.resnet = resnet18(pretrained=pretrained)
33 | elif (net_size == 34):
34 | self.resnet = resnet34(pretrained=pretrained)
35 | elif (net_size == 50):
36 | self.resnet = resnet50(pretrained=pretrained)
37 | else:
38 | print("Error in DRNet: Invalid model size for ResNet. Initializing a ResNet 50 instead.")
39 | net_size = 50
40 | self.resnet = resnet50(pretrained=pretrained)
41 | 
42 | layers = [self.resnet.conv1, self.resnet.bn1, self.resnet.relu, self.resnet.maxpool, #1
43 | self.resnet.layer1, #2
44 | self.resnet.layer2, #3
45 | self.resnet.layer3, #4
46 | self.resnet.layer4, #5
47 | self.resnet.avgpool,
48 | self.resnet.fc]
49 | 
50 | if (freeze_features & pretrained):
51 | if (freeze_until_layer < 1) | (freeze_until_layer > 5):
52 | print("Error in DRNet: Freezing layers not possible. 
Cannot freeze parameters until the given layer.") 53 | else: 54 | for layer in layers[0:freeze_until_layer+3]: 55 | for param in layer.parameters(): 56 | param.requires_grad = False 57 | 58 | 59 | num_features = self.resnet.fc.in_features 60 | self.resnet.fc = nn.Linear(num_features, num_classes) 61 | 62 | self.set_name("DRNet(ResNet "+str(net_size)+")") 63 | 64 | def get_lr_layer(self, layer_index): 65 | if (len(self.layer_learning_rates) == 0): 66 | return self.default_lr 67 | 68 | if (layer_index < 1) | (layer_index > 6): 69 | return self.default_lr 70 | 71 | if (len(self.layer_learning_rates) < layer_index): 72 | return self.default_lr 73 | 74 | return self.layer_learning_rates[layer_index-1] 75 | 76 | def get_params_layer(self, layer_index): 77 | if (layer_index < 1) | (layer_index > 6): 78 | return None 79 | if (layer_index == 1): 80 | return chain(self.resnet.conv1.parameters(), self.resnet.bn1.parameters()) 81 | if (layer_index == 2): 82 | return self.resnet.layer1.parameters() 83 | if (layer_index == 3): 84 | return self.resnet.layer2.parameters() 85 | if (layer_index == 4): 86 | return self.resnet.layer3.parameters() 87 | if (layer_index == 5): 88 | return self.resnet.layer4.parameters() 89 | if (layer_index == 6): 90 | return self.resnet.fc.parameters() 91 | 92 | 93 | def set_name(self, name): 94 | self.__name__ = name 95 | 96 | def forward(self, x): 97 | 98 | x = self.resnet(x) 99 | 100 | return x 101 | 102 | @property 103 | def is_cuda(self): 104 | return next(self.parameters()).is_cuda -------------------------------------------------------------------------------- /src/Sparse_ResNet.py: -------------------------------------------------------------------------------- 1 | # Copyright 2017-present, Facebook, Inc. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the license found in the 5 | # LICENSE file in the root directory of this source tree. 
6 | 7 | import torch 8 | import torch.nn as nn 9 | import sparseconvnet as scn 10 | #from data import getIterators 11 | 12 | # two-dimensional SparseConvNet 13 | class Sparse_Res_Net(nn.Module): 14 | def __init__(self,num_classes = 5): 15 | nn.Module.__init__(self) 16 | self.sparseModel=scn.Sequential( 17 | ).add(scn.DenseToSparse(2)).add(scn.ValidConvolution(2, 3, 8, 2, False) 18 | ).add(scn.MaxPooling(2, 4, 2) 19 | ).add(scn.SparseResNet(2, 8, [ 20 | ['b', 8, 3, 1], 21 | ['b', 16, 2, 2], 22 | ['b', 24, 2, 2], 23 | ['b', 32, 2, 2]]) 24 | ).add(scn.Convolution(2, 32, 64, 4, 1, False) 25 | ).add(scn.BatchNormReLU(64) 26 | ).add(scn.SparseToDense(2,64)) 27 | self.linear = nn.Linear(6400, num_classes) 28 | def forward(self, x): 29 | x = self.sparseModel(x) 30 | #print("before linear") 31 | #print(type(x)) 32 | x = x.view(-1,6400) 33 | x = self.linear(x) 34 | return x 35 | ''' 36 | model=Model() 37 | spatial_size = model.sparseModel.input_spatial_size(torch.LongTensor([1, 1])) 38 | print('Input spatial size:', spatial_size) 39 | dataset = getIterators(spatial_size, 63, 3) 40 | scn.ClassificationTrainValidate( 41 | model, dataset, 42 | {'n_epochs': 100, 43 | 'initial_lr': 0.1, 44 | 'lr_decay': 0.05, 45 | 'weight_decay': 1e-4, 46 | 'use_gpu': torch.cuda.is_available(), 47 | 'check_point': True,}) 48 | ''' 49 | -------------------------------------------------------------------------------- /src/configuration.py.example: -------------------------------------------------------------------------------- 1 | # create a configuration.py like this 2 | 3 | from math import tan 4 | from random import uniform 5 | 6 | from PIL import Image 7 | from torchvision import transforms 8 | from torchvision.models import AlexNet 9 | from torch.optim import Adam, SGD 10 | 11 | data_params = { 12 | 'train_path': '../data/train', 13 | 'test_path': '../data/test', 14 | 'label_path': '../data/trainLabels.csv', 15 | 'batch_size': 100, 16 | 'submission_file': '../data/submission.csv', 17 | # 'almost_even', 'even', 'posneg', None. 18 | # 'even': Same number of samples for each class 19 | # 'posneg': Same number of samples for class 0 and all other classes 20 | 'rebalance_strategy': 'even', 21 | 'num_loading_workers': 4 22 | } 23 | 24 | kaggle_params = { 25 | # Change auto submit to True, to submit from code. 26 | 'auto_submit':False, 27 | # Change to your Kaggle username and password. 28 | 'kaggle_username':'abc', 29 | 'kaggle_password':'xyz', 30 | # Keep message enclosed in single qoutes, which are further enclosed in double qoutes. 
31 | 'kaggle_submission_message':"'Luke, I am you father'" 32 | } 33 | 34 | training_params = { 35 | 'num_epochs': 1, 36 | 'log_nth': 1 37 | } 38 | 39 | train_control = { 40 | 'optimizer' : Adam, # Adam, SGD (we can add more) 41 | 'lr_scheduler_type': 'none', # 'exp', 'step', 'plateau', 'none' 42 | 43 | 'step_scheduler_args' : { 44 | 'gamma' : 0.1, # factor to decay learing rate (new_lr = gamma * lr) 45 | 'step_size': 3 # number of epochs to take a step of decay 46 | }, 47 | 48 | 'exp_scheduler_args' : { 49 | 'gamma' : 0.1 # factor to decay learing rate (new_lr = gamma * lr) 50 | }, 51 | 52 | 'plateau_scheduler_args' : { 53 | 'factor' : 0.2, # factor to decay learing rate (new_lr = factor * lr) 54 | 'patience' : 3, # number of epochs to wait as monitored value does not change before decreasing LR 55 | 'verbose' : True, # print a message when LR is changed 56 | 'threshold' : 1e-3, # when to consider the monitored varaible not changing (focus on significant changes) 57 | 'min_lr' : 1e-7, # lower bound on learning rate, not decreased further 58 | 'cooldown' : 0 # number of epochs to wait before resuming operation after LR was reduced 59 | } 60 | } 61 | 62 | optimizer_params = { 63 | 'lr': 1e-3 64 | } 65 | 66 | model_params = { 67 | # if False, just load the model from the disk and evaluate 68 | 'train': True, 69 | # if False, previously (partially) trained model is further trained. 70 | 'train_from_scratch':True, 71 | 'model_path': '../models/alexnet.model', 72 | 'model': AlexNet, 73 | 'model_kwargs': {'num_classes': 5}, 74 | # the device to put the variables on (cpu/gpu) 75 | 'pytorch_device': 'cpu', 76 | # cuda device if gpu 77 | 'cuda_device': 0, 78 | } 79 | 80 | # training data transforms (random rotation, random skew, scale and crop 224) 81 | train_data_transforms = transforms.Compose([ 82 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 83 | transforms.Lambda(lambda x: skew_image(x, uniform(-0.2, 0.2), inc_width=True)), # random skew +- 0.2 84 | transforms.RandomResizedCrop(300, scale=(0.9, 1.1), ratio=(1,1)), # scale +- 10%, resize to 300 85 | transforms.CenterCrop((224)), 86 | transforms.ToTensor() 87 | ]) 88 | 89 | # validation data transforms (random rotation, random skew, scale and crop 224) 90 | val_data_transforms = transforms.Compose([ 91 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 92 | transforms.Lambda(lambda x: skew_image(x, uniform(-0.2, 0.2), inc_width=True)), # random skew +- 0.2 93 | transforms.RandomResizedCrop(300, scale=(0.9, 1.1), ratio=(1,1)), # scale +- 10%, resize to 300 94 | transforms.CenterCrop((224)), 95 | transforms.ToTensor() 96 | ]) 97 | 98 | # test data transforms (random rotation) 99 | test_data_transforms = transforms.Compose([ 100 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 101 | transforms.RandomResizedCrop(300, scale=(1,1), ratio=(1,1)), # resize to 300 102 | transforms.CenterCrop((224)), 103 | transforms.ToTensor() 104 | ]) 105 | 106 | 107 | # Function to implement skew based on PIL transform 108 | 109 | def skew_image(img, angle, inc_width=False): 110 | """ 111 | Skew image using some math 112 | :param img: PIL image object 113 | :param angle: Angle in radians (function doesn't do well outside the range -1 -> 1, but still works) 114 | :return: PIL image object 115 | """ 116 | width, height = img.size 117 | # Get the width that is to be added to the image based on the 
angle of skew 118 | xshift = tan(abs(angle)) * height 119 | new_width = width + int(xshift) 120 | 121 | if new_width < 0: 122 | return img 123 | 124 | # Apply transform 125 | img = img.transform( 126 | (new_width, height), 127 | Image.AFFINE, 128 | (1, angle, -xshift if angle > 0 else 0, 0, 1, 0), 129 | Image.BICUBIC 130 | ) 131 | 132 | if (inc_width): 133 | return img 134 | else: 135 | return img.crop((0, 0, width, height)) 136 | 137 | 138 | -------------------------------------------------------------------------------- /src/data_loading.py: -------------------------------------------------------------------------------- 1 | import logging 2 | from pathlib import Path 3 | 4 | import numpy as np 5 | import pandas as pd 6 | import torch 7 | from torch.utils.data import Dataset 8 | from torch.utils.data.sampler import SubsetRandomSampler 9 | from torchvision.datasets.folder import default_loader 10 | 11 | 12 | class RetinopathyDataset(Dataset): 13 | 14 | def __init__(self, root_dir, csv_file, transform=None, loader=default_loader): 15 | self.root_dir = Path(root_dir) if type(root_dir) is str else root_dir 16 | self.image_names = list(self.root_dir.glob('*.jpeg')) 17 | self.transform = transform 18 | if csv_file: 19 | image_name_set = {i.stem for i in self.image_names} 20 | labels = pd.read_csv(csv_file) 21 | self.labels = {image: label for image, label in 22 | zip(list(labels.image), list(labels.level)) 23 | if image in image_name_set} 24 | else: 25 | self.labels = None 26 | self.loader = loader 27 | 28 | def __len__(self): 29 | return len(self.image_names) 30 | 31 | def __getitem__(self, index): 32 | image_path = self.image_names[index] 33 | image = self.loader(image_path) 34 | if self.transform: 35 | image = self.transform(image) 36 | 37 | if self.labels: 38 | label = self.labels[image_path.stem] 39 | return image, label 40 | else: 41 | return image, image_path.stem 42 | 43 | 44 | class LabelBalancer: 45 | def __init__(self, y): 46 | self.y = np.asarray(list(y)) 47 | 48 | def _try_frac(self, m, n, pn): 49 | # Return (a, b) s.t. a <= m, b <= n 50 | # and b / a is as close to pn as possible 51 | r = int(round(float(pn * m) / (1.0 - pn))) 52 | s = int(round(float((1.0 - pn) * n) / pn)) 53 | return (m, r) if r <= n else ((s, n) if s <= m else (m, n)) 54 | 55 | def _get_counts(self, nneg, npos, frac_pos): 56 | if frac_pos > 0.5: 57 | return self._try_frac(nneg, npos, frac_pos) 58 | else: 59 | return self._try_frac(npos, nneg, 1.0 - frac_pos)[::-1] 60 | 61 | def _get_row_counts(self, num_classes): 62 | row_pos = [] 63 | row_n = [] 64 | for i in range(num_classes): 65 | curr_column_pos = np.where(self.y == i)[0] 66 | if len(curr_column_pos) == 0: 67 | raise ValueError(f"No positive labels for row {i}.") 68 | row_pos.append(curr_column_pos) 69 | row_n.append(len(curr_column_pos)) 70 | logging.info(f'Found {len(curr_column_pos)} samples for category {i}') 71 | return row_pos, row_n 72 | 73 | def rebalance_categorical_train_idxs_pos_neg(self, rebalance=0.5, num_classes=5, rand_state=None): 74 | """Get training indices based on @y 75 | @rebalance: bool or fraction of positive examples desired 76 | If True, default fraction is 0.5. If False no balancing. 
77 | """ 78 | rs = np.random if rand_state is None else rand_state 79 | row_pos, row_n = self._get_row_counts(num_classes) 80 | n_neg = row_n[0] 81 | n_pos = sum(row_n[1:]) 82 | p = 0.5 if rebalance is True else rebalance 83 | n_neg, n_pos = self._get_counts(n_neg, n_pos, p) 84 | row_pos[0] = rs.choice(row_pos[0], size=n_neg, replace=False) 85 | for i in range(1, len(row_pos)): 86 | row_pos[i] = rs.choice(row_pos[i], 87 | size=min(int(n_pos / (num_classes - 1)), row_n[i]), 88 | replace=False) 89 | idxs = np.concatenate(row_pos) 90 | rs.shuffle(idxs) 91 | return list(idxs) 92 | 93 | def rebalance_categorical_train_idxs_evenly(self, num_classes=5, rand_state=None): 94 | """Get training indices based on @y 95 | @rebalance: bool or fraction of positive examples desired 96 | If True, default fraction is 0.5. If False no balancing. 97 | """ 98 | rs = np.random if rand_state is None else rand_state 99 | row_pos, row_n = self._get_row_counts(num_classes) 100 | min_n = min(row_n) 101 | for i in range(num_classes): 102 | row_pos[i] = rs.choice(row_pos[i], size=min_n, replace=False) 103 | idxs = np.concatenate(row_pos) 104 | rs.shuffle(idxs) 105 | return idxs 106 | 107 | def rebalance_categorical_train_idxs_almost_evenly(self, num_classes=5, rand_state=None): 108 | """Get training indices based on @y 109 | @rebalance: bool or fraction of positive examples desired 110 | If True, default fraction is 0.5. If False no balancing. 111 | """ 112 | rs = np.random if rand_state is None else rand_state 113 | row_pos, row_n = self._get_row_counts(num_classes) 114 | min_n = min(row_n) 115 | for i in range(num_classes): 116 | row_pos[i] = rs.choice(row_pos[i], size=min(row_n[i], int(min_n * 1.5)), replace=False) 117 | idxs = np.concatenate(row_pos) 118 | rs.shuffle(idxs) 119 | return idxs 120 | 121 | 122 | # customization of https://gist.github.com/kevinzakka/d33bf8d6c7f06a9d8c76d97a7879f5cb 123 | def get_train_valid_loader(data_dir, 124 | label_path, 125 | batch_size, 126 | train_transforms, 127 | valid_transforms, 128 | random_seed, 129 | rebalance_strategy, 130 | valid_size=0.1, 131 | shuffle=True, 132 | num_workers=4, 133 | pin_memory=False): 134 | """ 135 | Utility function for loading and returning train and valid 136 | multi-process iterators over the CIFAR-10 dataset. A sample 137 | 9x9 grid of the images can be optionally displayed. 138 | If using CUDA, num_workers should be set to 1 and pin_memory to True. 139 | Params 140 | ------ 141 | - data_dir: path directory to the dataset. 142 | - batch_size: how many samples per batch to load. 143 | - augment: whether to apply the data augmentation scheme 144 | mentioned in the paper. Only applied on the train split. 145 | - random_seed: fix seed for reproducibility. 146 | - valid_size: percentage split of the training set used for 147 | the validation set. Should be a float in the range [0, 1]. 148 | - shuffle: whether to shuffle the train/validation indices. 149 | - show_sample: plot 9x9 sample grid of the dataset. 150 | - num_workers: number of subprocesses to use when loading the dataset. 151 | - pin_memory: whether to copy tensors into CUDA pinned memory. Set it to 152 | True if using GPU. 153 | Returns 154 | ------- 155 | - train_loader: training set iterator. 156 | - valid_loader: validation set iterator. 157 | """ 158 | error_msg = "[!] valid_size should be in the range [0, 1]." 
159 | assert ((valid_size >= 0) and (valid_size <= 1)), error_msg 160 | 161 | train_dataset = RetinopathyDataset(data_dir, label_path, 162 | train_transforms) 163 | 164 | valid_dataset = RetinopathyDataset(data_dir, label_path, 165 | valid_transforms) 166 | 167 | num_train = len(train_dataset) 168 | indices = list(range(num_train)) 169 | split = int(np.floor(valid_size * num_train)) 170 | 171 | if shuffle == True: 172 | np.random.seed(random_seed) 173 | np.random.shuffle(indices) 174 | 175 | train_idx, valid_idx = indices[split:], indices[:split] 176 | 177 | if rebalance_strategy in {'almost_even', 'even', 'posneg'}: 178 | label_balancer = LabelBalancer(train_dataset.labels.values()) 179 | logging.info(f'Train samples before rebalancing: {len(train_idx)}') 180 | if rebalance_strategy == 'even': 181 | train_idx = label_balancer.rebalance_categorical_train_idxs_evenly() 182 | elif rebalance_strategy == 'almost_even': 183 | train_idx = label_balancer.rebalance_categorical_train_idxs_almost_evenly() 184 | else: 185 | train_idx = label_balancer.rebalance_categorical_train_idxs_pos_neg() 186 | logging.info(f'Train samples after rebalancing: {len(train_idx)}') 187 | elif rebalance_strategy is not None: 188 | logging.info('Could not recognize rebalance_strategy. Not rebalancing') 189 | 190 | train_sampler = SubsetRandomSampler(train_idx) 191 | valid_sampler = SubsetRandomSampler(valid_idx) 192 | 193 | train_loader = torch.utils.data.DataLoader(train_dataset, 194 | batch_size=batch_size, sampler=train_sampler, 195 | num_workers=num_workers, pin_memory=pin_memory) 196 | 197 | valid_loader = torch.utils.data.DataLoader(valid_dataset, 198 | batch_size=batch_size, sampler=valid_sampler, 199 | num_workers=num_workers, pin_memory=pin_memory) 200 | 201 | return (train_loader, valid_loader) 202 | 203 | 204 | def get_test_loader(data_dir, 205 | batch_size, 206 | transforms, 207 | shuffle=False, 208 | num_workers=4, 209 | pin_memory=False): 210 | """ 211 | Utility function for loading and returning a multi-process 212 | test iterator over the CIFAR-10 dataset. 213 | If using CUDA, num_workers should be set to 1 and pin_memory to True. 214 | Params 215 | ------ 216 | - data_dir: path directory to the dataset. 217 | - batch_size: how many samples per batch to load. 218 | - shuffle: whether to shuffle the dataset after every epoch. 219 | - num_workers: number of subprocesses to use when loading the dataset. 220 | - pin_memory: whether to copy tensors into CUDA pinned memory. Set it to 221 | True if using GPU. 222 | Returns 223 | ------- 224 | - data_loader: test set iterator. 
225 | """ 226 | 227 | dataset = RetinopathyDataset(data_dir, None, transforms) 228 | 229 | data_loader = torch.utils.data.DataLoader(dataset, 230 | batch_size=batch_size, 231 | shuffle=shuffle, 232 | num_workers=num_workers, 233 | pin_memory=pin_memory) 234 | 235 | return data_loader 236 | -------------------------------------------------------------------------------- /src/kaggle.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | from configuration import kaggle_params 4 | import subprocess 5 | 6 | def submit_solution(filepath): 7 | u = kaggle_params['kaggle_username'] 8 | p = kaggle_params['kaggle_password'] 9 | msg = kaggle_params['kaggle_submission_message'] 10 | bashCommand = f"kg submit {filepath} -u {u} -p {p} -c diabetic-retinopathy-detection -m {msg}" 11 | out = os.popen(bashCommand).read() 12 | return out 13 | -------------------------------------------------------------------------------- /src/logVisualizer.py: -------------------------------------------------------------------------------- 1 | import re 2 | import matplotlib.patches as mpatches 3 | import matplotlib.pyplot as plt 4 | 5 | # Add as many as filenames possible to logfile_paths. Each graph will plot these file curves 6 | file1 = "../logs/2018-01-21_22-58-39-resnet18-train-0.22387.log" 7 | file2 = "../logs/2018-01-22_10-20-27-resnet18-train-0.53539.log" 8 | logfile_paths = [file1, file2] 9 | logfile_graph_labels = ["A", "B"] 10 | 11 | # Dictionary to store all log messages for each file 12 | logs = {} 13 | 14 | # Parse each logfile and read list of messages into dictionary logs 15 | for i, filename in enumerate(logfile_paths): 16 | f = open(filename, "r") 17 | lines = f.read().split("\n") 18 | messages = [] 19 | for line in lines: 20 | if re.match("\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} - ", line): 21 | messages.append(line) 22 | else: 23 | messages[-1] += line 24 | # Map messages to file basename only from the whole path. 
25 | logs[logfile_graph_labels[i]] = messages 26 | 27 | # Extract training loss for each log file 28 | ALL_TRAIN_QWK={} 29 | ALL_VAL_QWK={} 30 | ALL_TRAIN_LOSS={} 31 | ALL_VAL_LOSS={} 32 | ALL_CUMULATIVE_TRAINING_TIME = {} 33 | ALL_TRAINING_TIME_PER_EPOCH = {} 34 | 35 | numberExtractPattern = re.compile(r'[+-]?\d+(?:\.\d+)?') 36 | 37 | def getTrainingLossForEachEpoch(messagesList): 38 | TRAIN_QWK = [] 39 | TRAIN_LOSS = [] 40 | for message in messagesList: 41 | if "TRAIN QWK" in message: 42 | TRAIN_QWK.append(float(numberExtractPattern.findall(message)[9])) 43 | TRAIN_LOSS.append(float(numberExtractPattern.findall(message)[10])) 44 | return TRAIN_QWK, TRAIN_LOSS 45 | 46 | def getValidationLossForEachEpoch(messagesList): 47 | VALIDATION_QWK = [] 48 | VALIDATION_LOSS = [] 49 | for message in messagesList: 50 | if "VAL QWK" in message: 51 | VALIDATION_QWK.append(float(numberExtractPattern.findall(message)[9])) 52 | VALIDATION_LOSS.append(float(numberExtractPattern.findall(message)[10])) 53 | return VALIDATION_QWK, VALIDATION_LOSS 54 | 55 | def getTrainingTimeForEachEpoch(messagesList): 56 | CUMULATIVE_TRAINING_TIME = [] 57 | TRAINING_TIME_PER_EPOCH = [] 58 | prevTrainingTime = 0 59 | for message in messagesList: 60 | if " - Training Time - " in message: 61 | CUMULATIVE_TRAINING_TIME.append(float(numberExtractPattern.findall(message)[8])) 62 | TRAINING_TIME_PER_EPOCH.append(float(numberExtractPattern.findall(message)[8]) - prevTrainingTime) 63 | prevTrainingTime = CUMULATIVE_TRAINING_TIME[-1] 64 | return CUMULATIVE_TRAINING_TIME, TRAINING_TIME_PER_EPOCH 65 | 66 | for label in logfile_graph_labels: 67 | ALL_TRAIN_QWK[label], ALL_TRAIN_LOSS[label] = getTrainingLossForEachEpoch(logs[label]) 68 | ALL_VAL_QWK[label], ALL_VAL_LOSS[label] = getValidationLossForEachEpoch(logs[label]) 69 | ALL_CUMULATIVE_TRAINING_TIME[label], ALL_TRAINING_TIME_PER_EPOCH[label] = getTrainingTimeForEachEpoch(logs[label]) 70 | 71 | figureCount=1 72 | 73 | def plotGraphOfMultipleLogFiles(logFileLabels, yValues, xLabel="Epoch", yLabel="Training Loss"): 74 | global figureCount 75 | plt.figure(figureCount) 76 | figureCount += 1 77 | for label in logfile_graph_labels: 78 | numEpochs = len(yValues[label]) 79 | epochList = list(range(1,numEpochs+1)) 80 | plt.plot(epochList, yValues[label], label=label) 81 | plt.xlabel(xLabel) 82 | plt.ylabel(yLabel) 83 | plt.legend() 84 | 85 | plotGraphOfMultipleLogFiles(logfile_graph_labels, ALL_TRAIN_LOSS, xLabel="Epoch", yLabel="Training Loss") 86 | plotGraphOfMultipleLogFiles(logfile_graph_labels, ALL_VAL_LOSS, xLabel="Epoch", yLabel="Validation Loss") 87 | plotGraphOfMultipleLogFiles(logfile_graph_labels, ALL_TRAIN_QWK, xLabel="Epoch", yLabel="Training QWK") 88 | plotGraphOfMultipleLogFiles(logfile_graph_labels, ALL_VAL_QWK, xLabel="Epoch", yLabel="Validation QWK") 89 | plotGraphOfMultipleLogFiles(logfile_graph_labels, ALL_CUMULATIVE_TRAINING_TIME, xLabel="Epoch", yLabel="Total Time") 90 | plotGraphOfMultipleLogFiles(logfile_graph_labels, ALL_TRAINING_TIME_PER_EPOCH, xLabel="Epoch", yLabel="Time per Epoch") 91 | plt.show() 92 | -------------------------------------------------------------------------------- /src/lr_scheduler_configuration.py: -------------------------------------------------------------------------------- 1 | # create a configuration.py like this 2 | # transfer learning configuration file Example 3 | 4 | from math import tan 5 | from random import uniform 6 | 7 | from PIL import Image 8 | from torchvision import transforms 9 | import torch.nn as nn 10 | from DRNet import 
DRNet 11 | from torch.optim import Adam, SGD 12 | 13 | 14 | data_params = { 15 | 'train_path': '../data/train_300', 16 | 'test_path': '../data/test', 17 | 'label_path': '../data/trainLabels.csv', 18 | 'batch_size': 32, 19 | 'submission_file': '../data/submission.csv', 20 | # 'even', 'posneg', None. 21 | # 'even': Same number of samples for each class 22 | # 'posneg': Same number of samples for class 0 and all other classes 23 | 'rebalance_strategy': 'even', 24 | 'num_loading_workers': 8 25 | } 26 | 27 | kaggle_params = { 28 | # Change auto submit to True, to submit from code. 29 | 'auto_submit':False, 30 | # Change to your Kaggle username and password. 31 | 'kaggle_username':'abc', 32 | 'kaggle_password':'xyz', 33 | # Keep message enclosed in single qoutes, which are further enclosed in double qoutes. 34 | 'kaggle_submission_message':"'Luke, I am you father'" 35 | } 36 | 37 | training_params = { 38 | 'num_epochs': 20, 39 | 'log_nth': 25, 40 | } 41 | 42 | train_control = { 43 | 'optimizer' : SGD, # Adam, SGD (we can add more) 44 | 'lr_scheduler_type': 'plateau', # 'exp', 'step', 'plateau', 'none' 45 | 46 | 'step_scheduler_args' : { 47 | 'gamma' : 0.1, # factor to decay learing rate (new_lr = gamma * lr) 48 | 'step_size': 3 # number of epochs to take a step of decay 49 | }, 50 | 51 | 'exp_scheduler_args' : { 52 | 'gamma' : 0.1 # factor to decay learing rate (new_lr = gamma * lr) 53 | }, 54 | 55 | 'plateau_scheduler_args' : { 56 | 'factor' : 0.2, # factor to decay learing rate (new_lr = factor * lr) 57 | 'patience' : 3, # number of epochs to wait as monitored value does not change before decreasing LR 58 | 'verbose' : True, # print a message when LR is changed 59 | 'threshold' : 1e-3, # when to consider the monitored varaible not changing (focus on significant changes) 60 | 'min_lr' : 1e-7, # lower bound on learning rate, not decreased further 61 | 'cooldown' : 0 # number of epochs to wait before resuming operation after LR was reduced 62 | } 63 | 64 | } 65 | 66 | optimizer_params = { 67 | 'lr': 5e-4 68 | } 69 | 70 | 71 | 72 | model_params = { 73 | # if False, just load the model from the disk and evaluate 74 | 'train': True, 75 | # if False, previously (partially) trained model is further trained. 
76 | 'train_from_scratch': True, 77 | 'model_path': '../models/TLNet_test_sch.model', 78 | 'model': DRNet, 79 | 'model_kwargs' : { 80 | 'num_classes' : 5, 81 | 'pretrained' : True, # load pre-trained weights on image-net 82 | 'net_size' : 18, # 18, 34, or 50 83 | 'freeze_features' : False, # fixed feature extractor OR fine-tune 84 | 'freeze_until_layer' : 1 # 1, 2, 3, 4, or 5 (check ResNet Paper) 85 | }, 86 | # the device to put the variables on (cpu/gpu) 87 | 'pytorch_device': 'gpu', 88 | # cuda device if gpu 89 | 'cuda_device': 0, 90 | } 91 | 92 | 93 | 94 | 95 | # normalization recommended by PyTorch documentation 96 | # only in case of transfer learning 97 | 98 | transfer_learn = model_params['model_kwargs']['pretrained'] 99 | 100 | normalize_transfer_learning = transforms.Normalize( 101 | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) if transfer_learn else transforms.Normalize( 102 | mean=[0, 0, 0], std=[1, 1, 1]) 103 | 104 | 105 | # training data transforms (random rotation, random skew, scale and crop 224) 106 | train_data_transforms = transforms.Compose([ 107 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 108 | transforms.Lambda(lambda x: skew_image(x, uniform(-0.2, 0.2), inc_width=True)), # random skew +- 0.2 109 | transforms.RandomResizedCrop(300, scale=(0.9, 1.1), ratio=(1,1)), # scale +- 10%, resize to 300 110 | transforms.CenterCrop((224)), 111 | transforms.ToTensor(), 112 | normalize_transfer_learning 113 | ]) 114 | 115 | # validation data transforms (random rotation, random skew, scale and crop 224) 116 | val_data_transforms = transforms.Compose([ 117 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 118 | transforms.Lambda(lambda x: skew_image(x, uniform(-0.2, 0.2), inc_width=True)), # random skew +- 0.2 119 | transforms.RandomResizedCrop(300, scale=(0.9, 1.1), ratio=(1,1)), # scale +- 10%, resize to 300 120 | transforms.CenterCrop((224)), 121 | transforms.ToTensor(), 122 | normalize_transfer_learning 123 | ]) 124 | 125 | # test data transforms (random rotation) 126 | test_data_transforms = transforms.Compose([ 127 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 128 | transforms.RandomResizedCrop(300, scale=(1,1), ratio=(1,1)), # resize to 300 129 | transforms.CenterCrop((224)), 130 | transforms.ToTensor(), 131 | normalize_transfer_learning 132 | ]) 133 | 134 | 135 | # Function to implement skew based on PIL transform 136 | # adopted from: https://www.programcreek.com/python/example/69877/PIL.Image.AFFINE 137 | 138 | def skew_image(img, angle, inc_width=False): 139 | """ 140 | Skew image using some math 141 | :param img: PIL image object 142 | :param angle: Angle in radians (function doesn't do well outside the range -1 -> 1, but still works) 143 | :return: PIL image object 144 | """ 145 | width, height = img.size 146 | # Get the width that is to be added to the image based on the angle of skew 147 | xshift = tan(abs(angle)) * height 148 | new_width = width + int(xshift) 149 | 150 | if new_width < 0: 151 | return img 152 | 153 | # Apply transform 154 | img = img.transform( 155 | (new_width, height), 156 | Image.AFFINE, 157 | (1, angle, -xshift if angle > 0 else 0, 0, 1, 0), 158 | Image.BICUBIC 159 | ) 160 | 161 | if (inc_width): 162 | return img 163 | else: 164 | return img.crop((0, 0, width, height)) 165 | -------------------------------------------------------------------------------- 
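A minimal usage sketch of the skew_image helper defined above (illustrative only, not a file in the repository; it assumes skew_image is importable from this configuration module): for a 300x300 input and angle = 0.2 rad, xshift = tan(0.2) * 300 ≈ 60.8 px, so the sheared canvas grows to roughly 360 px wide before the optional crop back to the original size.

from math import tan
from PIL import Image

sample = Image.new('RGB', (300, 300))              # blank stand-in for a fundus image
xshift = tan(abs(0.2)) * sample.size[1]            # ≈ 60.8 px of horizontal shear
wider = skew_image(sample, 0.2, inc_width=True)    # widened result, about 360 x 300
cropped = skew_image(sample, 0.2, inc_width=False) # sheared, then cropped back to 300 x 300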
/src/output_writing.py: -------------------------------------------------------------------------------- 1 | from pathlib import Path 2 | 3 | 4 | def write_submission_csv(predictions, names, filepath): 5 | filepath = Path(filepath) 6 | filepath.parent.mkdir(exist_ok=True) 7 | enumerated_names = [(i, name) for i, name in enumerate(names)] 8 | enumerated_names.sort(key=lambda name: name[1]) 9 | 10 | with open(filepath, 'w') as out_file: 11 | out_file.write('image,level\n') 12 | for i, name in enumerated_names: 13 | prediction = predictions[i] 14 | out_file.write(f'{name},{prediction}\n') 15 | # Add labels for corrupt images using prediction for their left image. 16 | if "25313_left" in names: 17 | out_file.write(f"25313_right,{predictions[names.index('25313_left')]}\n") 18 | if "27096_left" in names: 19 | out_file.write(f"27096_right,{predictions[names.index('27096_left')]}\n") 20 | -------------------------------------------------------------------------------- /src/preprocess.py: -------------------------------------------------------------------------------- 1 | # Example Usage: 2 | # >> python preprocess.py SOURCE_PATH 3 | # target path is SOURCE_PATH+'_300' 4 | 5 | 6 | # Partially adopted from https://github.com/btgraham/SparseConvNet/blob/kaggle_Diabetic_Retinopathy_competition/Data/kaggleDiabeticRetinopathy/preprocessImages.py 7 | 8 | import cv2 9 | import numpy as np 10 | from os.path import join, exists, basename 11 | from os import listdir, makedirs 12 | import sys 13 | import logging 14 | from datetime import datetime 15 | 16 | 17 | # function to scale the given image based on a scale value for the radius 18 | 19 | def scaleRadius(img,scale): 20 | x=img[img.shape[0]//2,:,:].sum(1) 21 | r=(x>x.mean()/10).sum()/2 22 | s=scale*1.0/r 23 | return cv2.resize(img,(0,0),fx=s,fy=s) 24 | 25 | 26 | # function to preprocess images from a give folder and save them 27 | # (save_path_same) determines whether to use the same path for the given folder or save in the current relative path 28 | # (target_size) determines whether to resize the images after preprocessing to exact dimensions or not 29 | 30 | def preprocess(folder='sample', scales=[300], save_path_same=True, target_size=(0,0)): 31 | 32 | file = basename(folder) 33 | handlers = [logging.FileHandler(datetime.now().strftime(f"%Y-%m-%d_%H-%M-%S-{file}.log")), 34 | logging.StreamHandler()] 35 | 36 | logging.basicConfig(format='%(message)s', 37 | level=logging.INFO, handlers=handlers) 38 | 39 | for scale in scales: 40 | if (save_path_same): 41 | write_folder = folder+'_'+str(scale) 42 | else: 43 | write_folder = 'processed_'+str(scale) 44 | 45 | if not exists(write_folder): 46 | makedirs(write_folder) 47 | 48 | for f in listdir(folder): 49 | try: 50 | read_path = join(folder, f) 51 | a = cv2.imread(read_path) 52 | a = scaleRadius(a,scale) 53 | b = np.zeros(a.shape) 54 | cv2.circle(b,(a.shape[1]//2,a.shape[0]//2),int(scale*0.9),(1,1,1),-1,8,0) 55 | aa = cv2.addWeighted(a,4,cv2.GaussianBlur(a,(0,0),scale/30),-4,128)*b+128*(1-b) 56 | 57 | if (target_size != (0,0)): 58 | aa = cv2.resize(aa, target_size) 59 | boo = cv2.imwrite(join(write_folder, f), aa) 60 | 61 | logging.info("Processed Image: "+ str(f)) 62 | logging.info("Save Location: "+ str(join(write_folder, f))) 63 | logging.info("Success: "+str(boo)) 64 | logging.info("New Dimensions: "+str(aa.shape[0])+" X "+str(aa.shape[1])) 65 | logging.info("______________________________________________\n") 66 | 67 | except: 68 | logging.info("Could not process file: "+str(f)) 69 | 70 | 71 | if 
__name__ == "__main__": 72 | source_path = sys.argv[1] 73 | same_path = eval(sys.argv[2]) 74 | print("Reading Images from Directory: "+str(source_path)) 75 | preprocess(source_path, save_path_same = same_path) 76 | print("DONE.") 77 | 78 | 79 | 80 | -------------------------------------------------------------------------------- /src/quadratic_weighted_kappa.py: -------------------------------------------------------------------------------- 1 | # Taken from https://github.com/benhamner/Metrics/blob/master/Python/ml_metrics/quadratic_weighted_kappa.py 2 | # and adapted for PyTorch 3 | 4 | import numpy as np 5 | 6 | 7 | def confusion_matrix(rater_a, rater_b, min_rating=None, max_rating=None): 8 | """ 9 | Returns the confusion matrix between rater's ratings 10 | """ 11 | assert (len(rater_a) == len(rater_b)) 12 | if min_rating is None: 13 | min_rating = min(rater_a + rater_b) 14 | if max_rating is None: 15 | max_rating = max(rater_a + rater_b) 16 | num_ratings = int(max_rating - min_rating + 1) 17 | conf_mat = [[0 for i in range(num_ratings)] 18 | for j in range(num_ratings)] 19 | for a, b in zip(rater_a, rater_b): 20 | conf_mat[a - min_rating][b - min_rating] += 1 21 | return conf_mat 22 | 23 | 24 | def histogram(ratings, min_rating=None, max_rating=None): 25 | """ 26 | Returns the counts of each type of rating that a rater made 27 | """ 28 | if min_rating is None: 29 | min_rating = min(ratings) 30 | if max_rating is None: 31 | max_rating = max(ratings) 32 | num_ratings = int(max_rating - min_rating + 1) 33 | hist_ratings = [0 for x in range(num_ratings)] 34 | for r in ratings: 35 | hist_ratings[r - min_rating] += 1 36 | return hist_ratings 37 | 38 | 39 | def quadratic_weighted_kappa(rater_a, rater_b, min_rating=None, max_rating=None): 40 | """ 41 | Calculates the quadratic weighted kappa 42 | quadratic_weighted_kappa calculates the quadratic weighted kappa 43 | value, which is a measure of inter-rater agreement between two raters 44 | that provide discrete numeric ratings. Potential values range from -1 45 | (representing complete disagreement) to 1 (representing complete 46 | agreement). A kappa value of 0 is expected if all agreement is due to 47 | chance. 48 | 49 | quadratic_weighted_kappa(rater_a, rater_b), where rater_a and rater_b 50 | each correspond to a list of integer ratings. These lists must have the 51 | same length. 52 | 53 | The ratings should be integers, and it is assumed that they contain 54 | the complete range of possible ratings. 
55 | 56 | quadratic_weighted_kappa(X, min_rating, max_rating), where min_rating 57 | is the minimum possible rating, and max_rating is the maximum possible 58 | rating 59 | """ 60 | assert (rater_a.shape == rater_b.shape) 61 | if min_rating is None: 62 | min_rating = min(min(rater_a), min(rater_b)) 63 | if max_rating is None: 64 | max_rating = max(max(rater_a), max(rater_b)) 65 | conf_mat = confusion_matrix(rater_a, rater_b, 66 | min_rating, max_rating) 67 | num_ratings = len(conf_mat) 68 | num_scored_items = float(len(rater_a)) 69 | 70 | hist_rater_a = histogram(rater_a, min_rating, max_rating) 71 | hist_rater_b = histogram(rater_b, min_rating, max_rating) 72 | 73 | numerator = 0.0 74 | denominator = 0.0 75 | 76 | for i in range(num_ratings): 77 | for j in range(num_ratings): 78 | expected_count = (hist_rater_a[i] * hist_rater_b[j] 79 | / num_scored_items) 80 | d = _safe_div(pow(i - j, 2.0), pow(num_ratings - 1, 2.0)) 81 | numerator += d * conf_mat[i][j] / num_scored_items 82 | denominator += d * expected_count / num_scored_items 83 | 84 | return 1.0 - _safe_div(numerator, denominator) 85 | 86 | 87 | def _safe_div(num, denum): 88 | return 0. if denum == 0 else num / denum 89 | 90 | 91 | def linear_weighted_kappa(rater_a, rater_b, min_rating=None, max_rating=None): 92 | """ 93 | Calculates the linear weighted kappa 94 | linear_weighted_kappa calculates the linear weighted kappa 95 | value, which is a measure of inter-rater agreement between two raters 96 | that provide discrete numeric ratings. Potential values range from -1 97 | (representing complete disagreement) to 1 (representing complete 98 | agreement). A kappa value of 0 is expected if all agreement is due to 99 | chance. 100 | 101 | linear_weighted_kappa(rater_a, rater_b), where rater_a and rater_b 102 | each correspond to a list of integer ratings. These lists must have the 103 | same length. 104 | 105 | The ratings should be integers, and it is assumed that they contain 106 | the complete range of possible ratings. 107 | 108 | linear_weighted_kappa(X, min_rating, max_rating), where min_rating 109 | is the minimum possible rating, and max_rating is the maximum possible 110 | rating 111 | """ 112 | assert (len(rater_a) == len(rater_b)) 113 | if min_rating is None: 114 | min_rating = min(rater_a + rater_b) 115 | if max_rating is None: 116 | max_rating = max(rater_a + rater_b) 117 | conf_mat = confusion_matrix(rater_a, rater_b, 118 | min_rating, max_rating) 119 | num_ratings = len(conf_mat) 120 | num_scored_items = float(len(rater_a)) 121 | 122 | hist_rater_a = histogram(rater_a, min_rating, max_rating) 123 | hist_rater_b = histogram(rater_b, min_rating, max_rating) 124 | 125 | numerator = 0.0 126 | denominator = 0.0 127 | 128 | for i in range(num_ratings): 129 | for j in range(num_ratings): 130 | expected_count = (hist_rater_a[i] * hist_rater_b[j] 131 | / num_scored_items) 132 | d = abs(i - j) / float(num_ratings - 1) 133 | numerator += d * conf_mat[i][j] / num_scored_items 134 | denominator += d * expected_count / num_scored_items 135 | 136 | return 1.0 - numerator / denominator 137 | 138 | 139 | def kappa(rater_a, rater_b, min_rating=None, max_rating=None): 140 | """ 141 | Calculates the kappa 142 | kappa calculates the kappa 143 | value, which is a measure of inter-rater agreement between two raters 144 | that provide discrete numeric ratings. Potential values range from -1 145 | (representing complete disagreement) to 1 (representing complete 146 | agreement). 
A kappa value of 0 is expected if all agreement is due to 147 | chance. 148 | 149 | kappa(rater_a, rater_b), where rater_a and rater_b 150 | each correspond to a list of integer ratings. These lists must have the 151 | same length. 152 | 153 | The ratings should be integers, and it is assumed that they contain 154 | the complete range of possible ratings. 155 | 156 | kappa(X, min_rating, max_rating), where min_rating 157 | is the minimum possible rating, and max_rating is the maximum possible 158 | rating 159 | """ 160 | assert (len(rater_a) == len(rater_b)) 161 | if min_rating is None: 162 | min_rating = min(rater_a + rater_b) 163 | if max_rating is None: 164 | max_rating = max(rater_a + rater_b) 165 | conf_mat = confusion_matrix(rater_a, rater_b, 166 | min_rating, max_rating) 167 | num_ratings = len(conf_mat) 168 | num_scored_items = float(len(rater_a)) 169 | 170 | hist_rater_a = histogram(rater_a, min_rating, max_rating) 171 | hist_rater_b = histogram(rater_b, min_rating, max_rating) 172 | 173 | numerator = 0.0 174 | denominator = 0.0 175 | 176 | for i in range(num_ratings): 177 | for j in range(num_ratings): 178 | expected_count = (hist_rater_a[i] * hist_rater_b[j] 179 | / num_scored_items) 180 | if i == j: 181 | d = 0.0 182 | else: 183 | d = 1.0 184 | numerator += d * conf_mat[i][j] / num_scored_items 185 | denominator += d * expected_count / num_scored_items 186 | 187 | return 1.0 - numerator / denominator 188 | 189 | 190 | def mean_quadratic_weighted_kappa(kappas, weights=None): 191 | """ 192 | Calculates the mean of the quadratic 193 | weighted kappas after applying Fisher's r-to-z transform, which is 194 | approximately a variance-stabilizing transformation. This 195 | transformation is undefined if one of the kappas is 1.0, so all kappa 196 | values are capped in the range (-0.999, 0.999). The reverse 197 | transformation is then applied before returning the result. 198 | 199 | mean_quadratic_weighted_kappa(kappas), where kappas is a vector of 200 | kappa values 201 | 202 | mean_quadratic_weighted_kappa(kappas, weights), where weights is a vector 203 | of weights that is the same size as kappas. 
Weights are applied in the 204 | z-space 205 | """ 206 | kappas = np.array(kappas, dtype=float) 207 | if weights is None: 208 | weights = np.ones(np.shape(kappas)) 209 | else: 210 | weights = weights / np.mean(weights) 211 | 212 | # ensure that kappas are in the range [-.999, .999] 213 | kappas = np.array([min(x, .999) for x in kappas]) 214 | kappas = np.array([max(x, -.999) for x in kappas]) 215 | 216 | z = 0.5 * np.log((1 + kappas) / (1 - kappas)) * weights 217 | z = np.mean(z) 218 | return (np.exp(2 * z) - 1) / (np.exp(2 * z) + 1) 219 | 220 | 221 | def weighted_mean_quadratic_weighted_kappa(solution, submission): 222 | predicted_score = submission[submission.columns[-1]].copy() 223 | predicted_score.name = "predicted_score" 224 | if predicted_score.index[0] == 0: 225 | predicted_score = predicted_score[:len(solution)] 226 | predicted_score.index = solution.index 227 | combined = solution.join(predicted_score, how="left") 228 | groups = combined.groupby(by="essay_set") 229 | kappas = [quadratic_weighted_kappa(group[1]["essay_score"], group[1]["predicted_score"]) for group in groups] 230 | weights = [group[1]["essay_weight"].irow(0) for group in groups] 231 | return mean_quadratic_weighted_kappa(kappas, weights=weights) 232 | -------------------------------------------------------------------------------- /src/s_train_model.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | 3 | import torch 4 | from torch.optim.lr_scheduler import StepLR, ExponentialLR, ReduceLROnPlateau 5 | 6 | import logging 7 | import pprint 8 | from sparse_config import data_params, train_data_transforms, val_data_transforms, test_data_transforms, \ 9 | training_params, train_control, model_params, optimizer_params, kaggle_params 10 | from data_loading import get_train_valid_loader, get_test_loader 11 | from output_writing import write_submission_csv 12 | from trainer import ModelTrainer 13 | from kaggle import submit_solution 14 | 15 | if __name__ == '__main__': 16 | # Check mode and model for logging file name 17 | mode = 'train' if model_params['train'] else 'test' 18 | model_name = model_params['model'].__name__ 19 | 20 | # Handler - Basically output logging statements to both - a file and console. 21 | handlers = [logging.FileHandler(datetime.now().strftime(f"../logs/%Y-%m-%d_%H-%M-%S-{model_name}-{mode}.log")), 22 | logging.StreamHandler()] 23 | logging.basicConfig(format='%(asctime)s - %(message)s', 24 | level=logging.INFO, handlers=handlers) 25 | 26 | logging.info('Started') 27 | 28 | # Log training, data, model and optimizer parameters. 
29 | logging.info(f"Training params:\n{pprint.pformat(training_params)}") 30 | logging.info(f"Training Control params:\n{pprint.pformat(train_control)}") 31 | logging.info(f"Data params:\n{pprint.pformat(data_params)}") 32 | logging.info(f"Model params:\n{pprint.pformat(model_params)}") 33 | logging.info(f"Optimizer params:\n{pprint.pformat(optimizer_params)}") 34 | 35 | train_dataset_loader, valid_dataset_loader = get_train_valid_loader(data_params['train_path'], 36 | data_params['label_path'], 37 | random_seed=54321, 38 | batch_size=data_params['batch_size'], 39 | rebalance_strategy=data_params['rebalance_strategy'], 40 | train_transforms=train_data_transforms, 41 | valid_transforms=val_data_transforms, 42 | num_workers=data_params['num_loading_workers'], 43 | pin_memory=False) 44 | test_dataset_loader = get_test_loader(data_params['test_path'], 45 | batch_size=data_params['batch_size'], 46 | transforms=test_data_transforms, 47 | num_workers=data_params['num_loading_workers'], 48 | pin_memory=False) 49 | 50 | 51 | if model_params['train'] and model_params['train_from_scratch']: 52 | model = model_params['model'](**model_params['model_kwargs']) 53 | else: 54 | if model_params['pytorch_device'] == 'gpu': 55 | model = torch.load(model_params['model_path']) 56 | else: 57 | model = torch.load(model_params['model_path'], lambda storage, loc: storage) 58 | 59 | 60 | # Pass only the trainable parameters to the optimizer, otherwise pyTorch throws an error 61 | # relevant to Transfer learning with fixed features 62 | 63 | optimizer = train_control['optimizer'](filter(lambda p: p.requires_grad, model.parameters()), 64 | **optimizer_params) 65 | 66 | 67 | # Initiate Scheduler 68 | 69 | if (train_control['lr_scheduler_type'] == 'step'): 70 | scheduler = StepLR(optimizer, **train_control['step_scheduler_args']) 71 | elif (train_control['lr_scheduler_type'] == 'exp'): 72 | scheduler = ExponentialLR(optimizer, **train_control['exp_scheduler_args']) 73 | elif (train_control['lr_scheduler_type'] == 'plateau'): 74 | scheduler = ReduceLROnPlateau(optimizer, **train_control['plateau_scheduler_args']) 75 | else: 76 | scheduler = StepLR(optimizer, step_size=100, gamma = 1) 77 | 78 | if model_params['pytorch_device'] == 'gpu': 79 | with torch.cuda.device(model_params['cuda_device']): 80 | model_trainer = ModelTrainer(model, train_dataset_loader, valid_dataset_loader, test_dataset_loader, 81 | model_params['model_path'], 82 | optimizer = optimizer, 83 | optimizer_args=optimizer_params, 84 | scheduler = scheduler, 85 | host_device='gpu') 86 | if model_params['train']: 87 | model_trainer.train_model(**training_params) 88 | predictions, image_names = model_trainer.predict_on_test() 89 | 90 | else: 91 | model_trainer = ModelTrainer(model, train_dataset_loader, valid_dataset_loader, test_dataset_loader, 92 | model_params['model_path'], 93 | optimizer = optimizer, 94 | optimizer_args=optimizer_params, 95 | scheduler = scheduler, 96 | host_device='cpu') 97 | if model_params['train']: 98 | model_trainer.train_model(**training_params) 99 | predictions, image_names = model_trainer.predict_on_test() 100 | 101 | write_submission_csv(predictions, image_names, data_params['submission_file']) 102 | if kaggle_params['auto_submit'] : 103 | output = submit_solution(data_params['submission_file']) 104 | logging.info(f"Kaggle submission output = {output}") 105 | logging.info('Finished.') 106 | -------------------------------------------------------------------------------- /src/sparse_config.py: 
-------------------------------------------------------------------------------- 1 | # create a configuration.py like this 2 | # transfer learning configuration file Example 3 | 4 | from math import tan 5 | from random import uniform 6 | 7 | from PIL import Image 8 | from torchvision import transforms 9 | import torch.nn as nn 10 | from DRNet import DRNet 11 | from torch.optim import Adam, SGD 12 | from Sparse_ResNet import Sparse_Res_Net 13 | import sparseconvnet as scn 14 | data_params = { 15 | 'train_path': '../../data/full_train/train', 16 | 'test_path': '../../data/full_test/test', 17 | 'label_path': '../data/trainLabels.csv', 18 | 'batch_size': 32, 19 | 'submission_file': '../data/submission.csv', 20 | # 'even', 'posneg', None. 21 | # 'even': Same number of samples for each class 22 | # 'posneg': Same number of samples for class 0 and all other classes 23 | 'rebalance_strategy': 'even', 24 | 'num_loading_workers': 8 25 | } 26 | 27 | kaggle_params = { 28 | # Change auto submit to True, to submit from code. 29 | 'auto_submit':False, 30 | # Change to your Kaggle username and password. 31 | 'kaggle_username':'darkhunt3r', 32 | 'kaggle_password':'majuleKE899497', 33 | # Keep message enclosed in single qoutes, which are further enclosed in double qoutes. 34 | 'kaggle_submission_message':"'Luke, I am you father'" 35 | } 36 | 37 | training_params = { 38 | 'num_epochs': 200, 39 | 'log_nth': 25, 40 | } 41 | 42 | train_control = { 43 | 'optimizer' : Adam, # Adam, SGD (we can add more) 44 | 'lr_scheduler_type': 'none', # 'exp', 'step', 'plateau', 'none' 45 | 46 | 'step_scheduler_args' : { 47 | 'gamma' : 0.1, # factor to decay learing rate (new_lr = gamma * lr) 48 | 'step_size': 3 # number of epochs to take a step of decay 49 | }, 50 | 51 | 'exp_scheduler_args' : { 52 | 'gamma' : 0.1 # factor to decay learing rate (new_lr = gamma * lr) 53 | }, 54 | 55 | 'plateau_scheduler_args' : { 56 | 'factor' : 0.2, # factor to decay learing rate (new_lr = factor * lr) 57 | 'patience' : 3, # number of epochs to wait as monitored value does not change before decreasing LR 58 | 'verbose' : True, # print a message when LR is changed 59 | 'threshold' : 1e-3, # when to consider the monitored varaible not changing (focus on significant changes) 60 | 'min_lr' : 1e-7, # lower bound on learning rate, not decreased further 61 | 'cooldown' : 0 # number of epochs to wait before resuming operation after LR was reduced 62 | } 63 | 64 | } 65 | 66 | optimizer_params = { 67 | 'lr': 5e-4 68 | } 69 | 70 | 71 | 72 | model_params = { 73 | # if False, just load the model from the disk and evaluate 74 | 'train': True, 75 | # if False, previously (partially) trained model is further trained. 
76 | 'train_from_scratch': True, 77 | 'model_path': '../models/Sparse_ResNet.model', 78 | 'model': Sparse_Res_Net, 79 | 'model_kwargs' : { 80 | 'num_classes' : 5, 81 | 82 | }, 83 | # the device to put the variables on (cpu/gpu) 84 | 'pytorch_device': 'gpu', 85 | # cuda device if gpu 86 | 'cuda_device': 0, 87 | } 88 | 89 | 90 | 91 | 92 | # normalization recommended by PyTorch documentation 93 | # only in case of transfer learning 94 | 95 | transfer_learn = model_params['model_kwargs'] 96 | 97 | normalize_transfer_learning = transforms.Normalize( 98 | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) if transfer_learn else transforms.Normalize( 99 | mean=[0, 0, 0], std=[1, 1, 1]) 100 | 101 | 102 | # training data transforms (random rotation, random skew, scale and crop 224) 103 | train_data_transforms = transforms.Compose([ 104 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 105 | transforms.Lambda(lambda x: skew_image(x, uniform(-0.2, 0.2), inc_width=True)), # random skew +- 0.2 106 | transforms.RandomResizedCrop(300, scale=(0.9, 1.1), ratio=(1,1)), # scale +- 10%, resize to 300 107 | transforms.CenterCrop((224)), 108 | transforms.ToTensor(), 109 | 110 | #normalize_transfer_learning 111 | ]) 112 | 113 | # validation data transforms (random rotation, random skew, scale and crop 224) 114 | val_data_transforms = transforms.Compose([ 115 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 116 | transforms.Lambda(lambda x: skew_image(x, uniform(-0.2, 0.2), inc_width=True)), # random skew +- 0.2 117 | transforms.RandomResizedCrop(300, scale=(0.9, 1.1), ratio=(1,1)), # scale +- 10%, resize to 300 118 | transforms.CenterCrop((224)), 119 | transforms.ToTensor(), 120 | #normalize_transfer_learning 121 | ]) 122 | 123 | # test data transforms (random rotation) 124 | test_data_transforms = transforms.Compose([ 125 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 126 | transforms.RandomResizedCrop(300, scale=(1,1), ratio=(1,1)), # resize to 300 127 | transforms.CenterCrop((224)), 128 | transforms.ToTensor(), 129 | #transforms.Lambda(lambda x: input.addSample() ) 130 | 131 | ]) 132 | 133 | 134 | # Function to implement skew based on PIL transform 135 | # adopted from: https://www.programcreek.com/python/example/69877/PIL.Image.AFFINE 136 | 137 | def skew_image(img, angle, inc_width=False): 138 | """ 139 | Skew image using some math 140 | :param img: PIL image object 141 | :param angle: Angle in radians (function doesn't do well outside the range -1 -> 1, but still works) 142 | :return: PIL image object 143 | """ 144 | width, height = img.size 145 | # Get the width that is to be added to the image based on the angle of skew 146 | xshift = tan(abs(angle)) * height 147 | new_width = width + int(xshift) 148 | 149 | if new_width < 0: 150 | return img 151 | 152 | # Apply transform 153 | img = img.transform( 154 | (new_width, height), 155 | Image.AFFINE, 156 | (1, angle, -xshift if angle > 0 else 0, 0, 1, 0), 157 | Image.BICUBIC 158 | ) 159 | 160 | if (inc_width): 161 | return img 162 | else: 163 | return img.crop((0, 0, width, height)) 164 | -------------------------------------------------------------------------------- /src/train_model.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | 3 | import torch 4 | from torch.optim.lr_scheduler import StepLR, 
ExponentialLR, ReduceLROnPlateau 5 | 6 | import logging 7 | import pprint 8 | from configuration import data_params, train_data_transforms, val_data_transforms, test_data_transforms, \ 9 | training_params, train_control, model_params, optimizer_params, kaggle_params 10 | from data_loading import get_train_valid_loader, get_test_loader 11 | from output_writing import write_submission_csv 12 | from trainer import ModelTrainer 13 | from kaggle import submit_solution 14 | 15 | if __name__ == '__main__': 16 | # Check mode and model for logging file name 17 | mode = 'train' if model_params['train'] else 'test' 18 | model_name = model_params['model'].__name__ 19 | 20 | # Handler - Basically output logging statements to both - a file and console. 21 | handlers = [logging.FileHandler(datetime.now().strftime(f"../logs/%Y-%m-%d_%H-%M-%S-{model_name}-{mode}.log")), 22 | logging.StreamHandler()] 23 | logging.basicConfig(format='%(asctime)s - %(message)s', 24 | level=logging.INFO, handlers=handlers) 25 | 26 | logging.info('Started') 27 | 28 | # Log training, data, model and optimizer parameters. 29 | logging.info(f"Training params:\n{pprint.pformat(training_params)}") 30 | logging.info(f"Training Control params:\n{pprint.pformat(train_control)}") 31 | logging.info(f"Data params:\n{pprint.pformat(data_params)}") 32 | logging.info(f"Model params:\n{pprint.pformat(model_params)}") 33 | logging.info(f"Optimizer params:\n{pprint.pformat(optimizer_params)}") 34 | 35 | train_dataset_loader, valid_dataset_loader = get_train_valid_loader(data_params['train_path'], 36 | data_params['label_path'], 37 | random_seed=54321, 38 | batch_size=data_params['batch_size'], 39 | rebalance_strategy=data_params['rebalance_strategy'], 40 | train_transforms=train_data_transforms, 41 | valid_transforms=val_data_transforms, 42 | num_workers=data_params['num_loading_workers'], 43 | pin_memory=False) 44 | test_dataset_loader = get_test_loader(data_params['test_path'], 45 | batch_size=data_params['batch_size'], 46 | transforms=test_data_transforms, 47 | num_workers=data_params['num_loading_workers'], 48 | pin_memory=False) 49 | 50 | 51 | if model_params['train'] and model_params['train_from_scratch']: 52 | model = model_params['model'](**model_params['model_kwargs'], default_lr=optimizer_params['lr']) 53 | else: 54 | if model_params['pytorch_device'] == 'gpu': 55 | model = torch.load(model_params['model_path']) 56 | else: 57 | model = torch.load(model_params['model_path'], lambda storage, loc: storage) 58 | 59 | 60 | # Pass only the trainable parameters to the optimizer, otherwise pyTorch throws an error 61 | # relevant to Transfer learning with fixed features 62 | 63 | if (model_params['per_layer_rates']): 64 | optimizer = train_control['optimizer']([ 65 | {'params': model.get_params_layer(i), 66 | 'lr': model.get_lr_layer(i)} for i in range(1,7) 67 | ], 68 | **optimizer_params) 69 | else: 70 | optimizer = train_control['optimizer'](filter(lambda p: p.requires_grad, model.parameters()), 71 | **optimizer_params) 72 | 73 | 74 | # Initiate Scheduler 75 | 76 | if (train_control['lr_scheduler_type'] == 'step'): 77 | scheduler = StepLR(optimizer, **train_control['step_scheduler_args']) 78 | elif (train_control['lr_scheduler_type'] == 'exp'): 79 | scheduler = ExponentialLR(optimizer, **train_control['exp_scheduler_args']) 80 | elif (train_control['lr_scheduler_type'] == 'plateau'): 81 | scheduler = ReduceLROnPlateau(optimizer, **train_control['plateau_scheduler_args']) 82 | else: 83 | scheduler = StepLR(optimizer, step_size=100, gamma = 1) 
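# Note: this fallback (lr_scheduler_type 'none' or any unrecognized value) keeps the learning rate constant, since StepLR with gamma=1 multiplies the rate by 1 and stepping it has no effect.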
84 | 85 | if model_params['pytorch_device'] == 'gpu': 86 | with torch.cuda.device(model_params['cuda_device']): 87 | model_trainer = ModelTrainer(model, train_dataset_loader, valid_dataset_loader, test_dataset_loader, 88 | model_params['model_path'], 89 | optimizer = optimizer, 90 | optimizer_args=optimizer_params, 91 | scheduler = scheduler, 92 | host_device='gpu') 93 | if model_params['train']: 94 | model_trainer.train_model(**training_params) 95 | predictions, image_names = model_trainer.predict_on_test() 96 | 97 | else: 98 | model_trainer = ModelTrainer(model, train_dataset_loader, valid_dataset_loader, test_dataset_loader, 99 | model_params['model_path'], 100 | optimizer = optimizer, 101 | optimizer_args=optimizer_params, 102 | scheduler = scheduler, 103 | host_device='cpu') 104 | if model_params['train']: 105 | model_trainer.train_model(**training_params) 106 | predictions, image_names = model_trainer.predict_on_test() 107 | 108 | write_submission_csv(predictions, image_names, data_params['submission_file']) 109 | if kaggle_params['auto_submit'] : 110 | output = submit_solution(data_params['submission_file']) 111 | logging.info(f"Kaggle submission output = {output}") 112 | logging.info('Finished.') 113 | -------------------------------------------------------------------------------- /src/train_sparse_model.py: -------------------------------------------------------------------------------- 1 | from datetime import datetime 2 | 3 | import torch 4 | from torch.optim.lr_scheduler import StepLR, ExponentialLR, ReduceLROnPlateau 5 | 6 | import logging 7 | import pprint 8 | from sparse_config import data_params, train_data_transforms, val_data_transforms, test_data_transforms, \ 9 | training_params, model_params, optimizer_params, kaggle_params,train_control 10 | from data_loading import get_train_valid_loader, get_test_loader 11 | from output_writing import write_submission_csv 12 | from trainer import ModelTrainer 13 | from kaggle import submit_solution 14 | 15 | if __name__ == '__main__': 16 | # Check mode and model for logging file name 17 | mode = 'train' if model_params['train'] else 'test' 18 | model_name = model_params['model'].__name__ 19 | 20 | # Handler - Basically output logging statements to both - a file and console. 21 | handlers = [logging.FileHandler(datetime.now().strftime(f"../logs/%Y-%m-%d_%H-%M-%S-{model_name}-{mode}.log")), 22 | logging.StreamHandler()] 23 | logging.basicConfig(format='%(asctime)s - %(message)s', 24 | level=logging.INFO, handlers=handlers) 25 | 26 | logging.info('Started') 27 | 28 | # Log training, data, model and optimizer parameters. 
29 | logging.info(f"Training params:\n{pprint.pformat(training_params)}") 30 | logging.info(f"Training Control params:\n{pprint.pformat(train_control)}") 31 | logging.info(f"Data params:\n{pprint.pformat(data_params)}") 32 | logging.info(f"Model params:\n{pprint.pformat(model_params)}") 33 | logging.info(f"Optimizer params:\n{pprint.pformat(optimizer_params)}") 34 | 35 | train_dataset_loader, valid_dataset_loader = get_train_valid_loader(data_params['train_path'], 36 | data_params['label_path'], 37 | random_seed=54321, 38 | batch_size=data_params['batch_size'], 39 | rebalance_strategy=data_params['rebalance_strategy'], 40 | train_transforms=train_data_transforms, 41 | valid_transforms=val_data_transforms, 42 | num_workers=data_params['num_loading_workers'], 43 | pin_memory=False) 44 | test_dataset_loader = get_test_loader(data_params['test_path'], 45 | batch_size=data_params['batch_size'], 46 | transforms=test_data_transforms, 47 | num_workers=data_params['num_loading_workers'], 48 | pin_memory=False) 49 | 50 | 51 | if model_params['train'] and model_params['train_from_scratch']: 52 | model = model_params['model'](**model_params['model_kwargs']) 53 | else: 54 | if model_params['pytorch_device'] == 'gpu': 55 | model = torch.load(model_params['model_path']) 56 | else: 57 | model = torch.load(model_params['model_path'], lambda storage, loc: storage) 58 | 59 | 60 | # Pass only the trainable parameters to the optimizer, otherwise pyTorch throws an error 61 | # relevant to Transfer learning with fixed features 62 | 63 | optimizer = train_control['optimizer'](filter(lambda p: p.requires_grad, model.parameters()), 64 | **optimizer_params) 65 | 66 | 67 | # Initiate Scheduler 68 | 69 | if (train_control['lr_scheduler_type'] == 'step'): 70 | scheduler = StepLR(optimizer, **train_control['step_scheduler_args']) 71 | elif (train_control['lr_scheduler_type'] == 'exp'): 72 | scheduler = ExponentialLR(optimizer, **train_control['exp_scheduler_args']) 73 | elif (train_control['lr_scheduler_type'] == 'plateau'): 74 | scheduler = ReduceLROnPlateau(optimizer, **train_control['plateau_scheduler_args']) 75 | else: 76 | scheduler = StepLR(optimizer, step_size=100, gamma = 1) 77 | 78 | if model_params['pytorch_device'] == 'gpu': 79 | with torch.cuda.device(model_params['cuda_device']): 80 | model_trainer = ModelTrainer(model, train_dataset_loader, valid_dataset_loader, test_dataset_loader, 81 | model_params['model_path'], 82 | optimizer = optimizer, 83 | optimizer_args=optimizer_params, 84 | scheduler = scheduler, 85 | host_device='gpu') 86 | if model_params['train']: 87 | model_trainer.train_model(**training_params) 88 | predictions, image_names = model_trainer.predict_on_test() 89 | 90 | else: 91 | model_trainer = ModelTrainer(model, train_dataset_loader, valid_dataset_loader, test_dataset_loader, 92 | model_params['model_path'], 93 | optimizer = optimizer, 94 | optimizer_args=optimizer_params, 95 | scheduler = scheduler, 96 | host_device='cpu') 97 | if model_params['train']: 98 | model_trainer.train_model(**training_params) 99 | predictions, image_names = model_trainer.predict_on_test() 100 | 101 | write_submission_csv(predictions, image_names, data_params['submission_file']) 102 | if kaggle_params['auto_submit'] : 103 | output = submit_solution(data_params['submission_file']) 104 | logging.info(f"Kaggle submission output = {output}") 105 | logging.info('Finished.') 106 | -------------------------------------------------------------------------------- /src/trainer.py: 
-------------------------------------------------------------------------------- 1 | from copy import deepcopy 2 | from itertools import chain 3 | from pathlib import Path 4 | 5 | import progressbar 6 | import torch 7 | from torch.autograd import Variable 8 | import time 9 | import logging 10 | 11 | from quadratic_weighted_kappa import quadratic_weighted_kappa 12 | 13 | 14 | class ModelTrainer: 15 | default_adam_args = {"lr": 1e-4, 16 | "betas": (0.9, 0.999), 17 | "eps": 1e-8, 18 | "weight_decay": 0.0} 19 | 20 | def __init__(self, model, train_dataset_loader, valid_dataset_loader, test_dataset_loader, 21 | model_path, 22 | scheduler, 23 | host_device='cpu', 24 | optimizer=torch.optim.Adam, 25 | optimizer_args={}, 26 | loss_func=torch.nn.CrossEntropyLoss(size_average=False), 27 | patience=float('Inf')): 28 | self.model = model 29 | self.train_dataset_loader = train_dataset_loader 30 | self.valid_dataset_loader = valid_dataset_loader 31 | self.test_dataset_loader = test_dataset_loader 32 | self.model_path = Path(model_path) 33 | self.model_path.parent.mkdir(exist_ok=True) 34 | 35 | self.host_device = host_device 36 | self.optimizer_args = optimizer_args 37 | self.optimizer = optimizer 38 | self.scheduler = scheduler 39 | self.loss_func = loss_func 40 | 41 | self._reset_histories() 42 | self.patience = patience 43 | self.wait = 0 44 | self.best_qwk = -1 45 | self.best_model = None 46 | 47 | def _reset_histories(self): 48 | """ 49 | Resets the train and val histories for the QWK score and the loss. 50 | """ 51 | self.train_loss_history = [] 52 | self.train_qwk_history = [] 53 | self.val_loss_history = [] 54 | self.val_qwk_history = [] 55 | 56 | def train_model(self, num_epochs, log_nth): 57 | training_start_time = time.time() 58 | 59 | optimizer = self.optimizer 60 | 61 | self._reset_histories() 62 | if self.host_device == 'gpu': 63 | self.model.cuda() 64 | iter_per_epoch = len(self.train_dataset_loader) 65 | logging.info("Start training") 66 | logging.info(f"Size of training data: " 67 | f"{len(self.train_dataset_loader.sampler) * self.train_dataset_loader.batch_size}") 68 | 69 | 70 | for i_epoch in range(num_epochs): 71 | logging.info("Starting new epoch...") 72 | running_loss = 0.
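# Per-epoch flow (descriptive note): the step/exponential LR schedulers are stepped at the top of the
# epoch (ReduceLROnPlateau is instead stepped on the validation loss at the end of the epoch), the
# training batches are run while loss and predictions are accumulated, the training QWK is computed,
# the validation loader is evaluated the same way, and the validation QWK is then compared against
# the best score so far for checkpointing and early stopping.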
73 | 74 | 75 | all_y = [] 76 | all_y_pred = [] 77 | 78 | # scheduler step for exp and step schedulers 79 | 80 | if (not isinstance(self.scheduler, torch.optim.lr_scheduler.ReduceLROnPlateau)): 81 | self.scheduler.step() 82 | logging.info(f"Learning rate is {self.scheduler.get_lr()}") 83 | 84 | for i_batch, batch in enumerate(self.train_dataset_loader): 85 | x, y = batch 86 | x, y = Variable(x), Variable(y) 87 | if self.host_device == 'gpu': 88 | x, y = x.cuda(), y.cuda() 89 | 90 | optimizer.zero_grad() 91 | outputs = self.model(x) 92 | if self.host_device == 'gpu': 93 | train_loss = self.loss_func(outputs.cuda(), y) 94 | else: 95 | train_loss = self.loss_func(outputs, y) 96 | 97 | train_loss.backward() 98 | optimizer.step() 99 | 100 | running_loss += train_loss.data[0] 101 | _, y_pred = torch.max(outputs.data, 1) 102 | all_y.append(y) 103 | all_y_pred.append(y_pred) 104 | 105 | if not log_nth == 0 and (i_batch % log_nth) == 0: 106 | logging.info(f'[Iteration {i_batch}/{iter_per_epoch}] ' 107 | f'TRAIN loss: {running_loss / sum(curr_y.shape[0] for curr_y in all_y):.3f}') 108 | self.train_loss_history.append(running_loss) 109 | y = torch.cat(all_y) 110 | y_pred = torch.cat(all_y_pred) 111 | train_qwk = quadratic_weighted_kappa(y_pred, y.data) 112 | 113 | logging.info(f'[Epoch {i_epoch+1}/{num_epochs}] ' 114 | f'TRAIN QWK: {train_qwk:.3f}; loss: {running_loss / y.shape[0]:.3f}') 115 | self.train_qwk_history.append(train_qwk) 116 | 117 | running_loss = 0. 118 | all_y = [] 119 | all_y_pred = [] 120 | for x, y in self.valid_dataset_loader: 121 | x, y = Variable(x), Variable(y) 122 | if self.host_device == 'gpu': 123 | x, y = x.cuda(), y.cuda() 124 | 125 | outputs = self.model(x) 126 | if self.host_device == 'gpu': 127 | val_loss = self.loss_func(outputs.cuda(), y) 128 | else: 129 | val_loss = self.loss_func(outputs, y) 130 | 131 | running_loss += val_loss.data[0] 132 | _, y_pred = torch.max(outputs.data, 1) 133 | all_y.append(y) 134 | all_y_pred.append(y_pred) 135 | 136 | y = torch.cat(all_y) 137 | y_pred = torch.cat(all_y_pred) 138 | val_qwk = quadratic_weighted_kappa(y_pred, y.data) 139 | 140 | logging.info(f'[Epoch {i_epoch+1}/{num_epochs}] ' 141 | f'VAL QWK: {val_qwk:.3f}; loss: {running_loss / y.shape[0]:.3f}') 142 | 143 | self.val_qwk_history.append(val_qwk) 144 | self.val_loss_history.append(running_loss) 145 | training_time = time.time() - training_start_time 146 | logging.info(f"Epoch {i_epoch+1} - Training Time - {training_time} seconds") 147 | 148 | 149 | # scheduler step for plateau scheduler 150 | val_loss_scheduler = running_loss 151 | if (isinstance(self.scheduler, torch.optim.lr_scheduler.ReduceLROnPlateau)): 152 | self.scheduler.step(val_loss_scheduler) 153 | 154 | if val_qwk > self.best_qwk: 155 | logging.info(f'New best validation QWK score: {val_qwk}') 156 | self.best_qwk = val_qwk 157 | self.best_model = deepcopy(self.model) 158 | self.wait = 0 159 | logging.info('Storing best model...') 160 | torch.save(self.best_model, self.model_path) 161 | logging.info('Done storing') 162 | else: 163 | self.wait += 1 164 | if self.wait >= self.patience: 165 | logging.info('Stopped after epoch %d' % (i_epoch)) 166 | break 167 | 168 | training_time = time.time() - training_start_time 169 | logging.info(f"Full Training Time - {training_time} seconds") 170 | 171 | def predict_on_test(self): 172 | testing_start_time = time.time() 173 | all_y_pred = [] 174 | all_image_names = [] 175 | logging.info(f'num_test_images_batch={len(self.test_dataset_loader)}') 176 | image_index = 0 177 | bar = 
progressbar.ProgressBar(max_value=len(self.test_dataset_loader)) 178 | bar.start(init=True) 179 | for x, image_names in self.test_dataset_loader: 180 | x = Variable(x) 181 | if self.host_device == 'gpu': 182 | x = x.cuda() 183 | outputs = self.model(x) 184 | _, y_pred = torch.max(outputs.data, 1) 185 | all_y_pred.append(y_pred) 186 | all_image_names.append(image_names) 187 | image_index += 1 188 | bar.update(image_index) 189 | bar.finish() 190 | 191 | all_image_names = list(chain.from_iterable(all_image_names)) 192 | testing_time = time.time() - testing_start_time 193 | logging.info(f"Full Testing Time - {testing_time} seconds for " 194 | f"{len(self.test_dataset_loader) * self.test_dataset_loader.batch_size} Images") 195 | if self.host_device == 'gpu': 196 | return torch.cat(all_y_pred).cpu().numpy(), all_image_names 197 | else: 198 | return torch.cat(all_y_pred).numpy(), all_image_names 199 | -------------------------------------------------------------------------------- /src/transfer_learning_configuration.py: -------------------------------------------------------------------------------- 1 | # Create a configuration.py like this one 2 | # Example transfer learning configuration file 3 | 4 | from math import tan 5 | from random import uniform 6 | 7 | from PIL import Image 8 | from torchvision import transforms 9 | import torch.nn as nn 10 | from DRNet import DRNet 11 | from torch.optim import Adam, SGD 12 | 13 | 14 | data_params = { 15 | 'train_path': '../data/train_300', 16 | 'test_path': '../data/test', 17 | 'label_path': '../data/trainLabels.csv', 18 | 'batch_size': 64, 19 | 'submission_file': '../data/submission.csv', 20 | # 'even', 'posneg', None. 21 | # 'even': Same number of samples for each class 22 | # 'posneg': Same number of samples for class 0 and all other classes 23 | 'rebalance_strategy': 'even', 24 | 'num_loading_workers': 8 25 | } 26 | 27 | kaggle_params = { 28 | # Change auto_submit to True to submit from code. 29 | 'auto_submit':False, 30 | # Change to your Kaggle username and password. 31 | 'kaggle_username':'abc', 32 | 'kaggle_password':'xyz', 33 | # Keep the message enclosed in single quotes, which are further enclosed in double quotes.
34 | 'kaggle_submission_message':"'Luke, I am your father'" 35 | } 36 | 37 | training_params = { 38 | 'num_epochs': 50, 39 | 'log_nth': 10, 40 | } 41 | 42 | train_control = { 43 | 'optimizer' : Adam, # Adam, SGD (we can add more) 44 | 'lr_scheduler_type': 'plateau', # 'exp', 'step', 'plateau', 'none' 45 | 46 | 'step_scheduler_args' : { 47 | 'gamma' : 0.1, # factor to decay learning rate (new_lr = gamma * lr) 48 | 'step_size': 3 # number of epochs to take a step of decay 49 | }, 50 | 51 | 'exp_scheduler_args' : { 52 | 'gamma' : 0.1 # factor to decay learning rate (new_lr = gamma * lr) 53 | }, 54 | 55 | 'plateau_scheduler_args' : { 56 | 'factor' : 0.1, # factor to decay learning rate (new_lr = factor * lr) 57 | 'patience' : 5, # number of epochs with no improvement in the monitored value before decreasing the LR 58 | 'verbose' : True, # print a message when LR is changed 59 | 'threshold' : 1e-3, # when to consider the monitored variable as not changing (focus on significant changes) 60 | 'min_lr' : 1e-9, # lower bound on learning rate, not decreased further 61 | 'cooldown' : 0 # number of epochs to wait before resuming operation after LR was reduced 62 | } 63 | 64 | } 65 | 66 | optimizer_params = { 67 | 'lr': 1e-3 68 | } 69 | 70 | 71 | 72 | model_params = { 73 | # if False, just load the model from the disk and evaluate 74 | 'train': True, 75 | # if False, a previously (partially) trained model is trained further. 76 | 'train_from_scratch': True, 77 | 'model_path': '../models/DRNet_TL_varLR_adam_plateau.model', 78 | 'model': DRNet, 79 | 'per_layer_rates' : True, # if True, the 'rates' array in model_kwargs will be used 80 | 'model_kwargs' : { 81 | 'num_classes' : 5, 82 | 'pretrained' : True, # load weights pre-trained on ImageNet 83 | 'net_size' : 18, # 18, 34, or 50 84 | 'freeze_features' : False, # fixed feature extractor OR fine-tune 85 | 'freeze_until_layer' : 2, # 1, 2, 3, 4, or 5 (check ResNet paper) 86 | 'rates' : [1e-5, 1e-4, 1e-3, 1e-3, 1e-3, 1e-3] 87 | # learning rates for DRNet: entries 1 to 5 are the ResNet layer groups, entry 6 is the fc classifier 88 | }, 89 | # the device to put the variables on (cpu/gpu) 90 | 'pytorch_device': 'gpu', 91 | # cuda device if gpu 92 | 'cuda_device': 0, 93 | } 94 | 95 | 96 | 97 | 98 | # normalization recommended by the PyTorch documentation 99 | # only in case of transfer learning 100 | 101 | transfer_learn = model_params['model_kwargs']['pretrained'] 102 | 103 | normalize_transfer_learning = transforms.Normalize( 104 | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) if transfer_learn else transforms.Normalize( 105 | mean=[0, 0, 0], std=[1, 1, 1]) 106 | 107 | 108 | # training data transforms (random rotation, random skew, scale and crop 224) 109 | train_data_transforms = transforms.Compose([ 110 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 111 | transforms.Lambda(lambda x: skew_image(x, uniform(-0.2, 0.2), inc_width=True)), # random skew +- 0.2 112 | transforms.RandomResizedCrop(300, scale=(0.9, 1.1), ratio=(1,1)), # scale +- 10%, resize to 300 113 | transforms.CenterCrop((224)), 114 | transforms.ToTensor(), 115 | normalize_transfer_learning 116 | ]) 117 | 118 | # validation data transforms (random rotation, random skew, scale and crop 224) 119 | val_data_transforms = transforms.Compose([ 120 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 121 | transforms.Lambda(lambda x: skew_image(x, uniform(-0.2, 0.2), inc_width=True)), # random skew +- 0.2 122 |
transforms.RandomResizedCrop(300, scale=(0.9, 1.1), ratio=(1,1)), # scale +- 10%, resize to 300 123 | transforms.CenterCrop((224)), 124 | transforms.ToTensor(), 125 | normalize_transfer_learning 126 | ]) 127 | 128 | # test data transforms (random rotation) 129 | test_data_transforms = transforms.Compose([ 130 | transforms.Lambda(lambda x: x.rotate(uniform(0,360), resample=Image.BICUBIC)), # random rotation 0 to 360 131 | transforms.RandomResizedCrop(300, scale=(1,1), ratio=(1,1)), # resize to 300 132 | transforms.CenterCrop((224)), 133 | transforms.ToTensor(), 134 | normalize_transfer_learning 135 | ]) 136 | 137 | 138 | # Function to implement skew based on PIL transform 139 | # adopted from: https://www.programcreek.com/python/example/69877/PIL.Image.AFFINE 140 | 141 | def skew_image(img, angle, inc_width=False): 142 | """ 143 | Skew image using some math 144 | :param img: PIL image object 145 | :param angle: Angle in radians (function doesn't do well outside the range -1 -> 1, but still works) 146 | :return: PIL image object 147 | """ 148 | width, height = img.size 149 | # Get the width that is to be added to the image based on the angle of skew 150 | xshift = tan(abs(angle)) * height 151 | new_width = width + int(xshift) 152 | 153 | if new_width < 0: 154 | return img 155 | 156 | # Apply transform 157 | img = img.transform( 158 | (new_width, height), 159 | Image.AFFINE, 160 | (1, angle, -xshift if angle > 0 else 0, 0, 1, 0), 161 | Image.BICUBIC 162 | ) 163 | 164 | if (inc_width): 165 | return img 166 | else: 167 | return img.crop((0, 0, width, height)) 168 | 169 | -------------------------------------------------------------------------------- /src/transforms_configuration.py: -------------------------------------------------------------------------------- 1 | # Configuration file with custom transforms for train/test/valid 2 | 3 | from math import tan 4 | from random import uniform 5 | 6 | from PIL import Image 7 | from torchvision import transforms 8 | from torchvision.models import AlexNet 9 | 10 | data_params = { 11 | 'train_path': '../data/small_train_300', 12 | 'test_path': '../data/test', 13 | 'label_path': '../data/trainLabels.csv', 14 | 'batch_size': 5, 15 | 'submission_file': '../data/submission.csv', 16 | # 'even', 'posneg', None. 
17 | # 'even': Same number of samples for each class 18 | # 'posneg': Same number of samples for class 0 and all other classes 19 | 'rebalance_strategy': 'even', 20 | 'num_loading_workers': 4 21 | } 22 | 23 | training_params = { 24 | 'num_epochs': 50, 25 | 'log_nth': 5 26 | } 27 | 28 | model_params = { 29 | # if False, just load the model from the disk and evaluate 30 | 'train': True, 31 | 'model_path': '../models/alexnet.model', 32 | 'model': AlexNet, 33 | 'model_kwargs': {'num_classes': 5}, 34 | # the device to put the variables on (cpu/gpu) 35 | 'pytorch_device': 'cpu', 36 | # cuda device if gpu 37 | 'cuda_device': 0, 38 | } 39 | 40 | optimizer_params = { 41 | 'lr': 1e-4 42 | } 43 | 44 | # training data transforms (random rotation, random skew, scale and crop 224) 45 | train_data_transforms = transforms.Compose([ 46 | transforms.Lambda(lambda x: x.rotate(uniform(0, 360), resample=Image.BICUBIC)), # random rotation 0 to 360 47 | transforms.Lambda(lambda x: skew_image(x, uniform(-0.2, 0.2), inc_width=True)), # random skew +- 0.2 48 | transforms.RandomResizedCrop(300, scale=(0.9, 1.1), ratio=(1, 1)), # scale +- 10%, resize to 300 49 | transforms.CenterCrop((224)), 50 | transforms.ToTensor() 51 | ]) 52 | 53 | # validation data transforms (random rotation, random skew, scale and crop 224) 54 | val_data_transforms = transforms.Compose([ 55 | transforms.Lambda(lambda x: x.rotate(uniform(0, 360), resample=Image.BICUBIC)), # random rotation 0 to 360 56 | transforms.Lambda(lambda x: skew_image(x, uniform(-0.2, 0.2), inc_width=True)), # random skew +- 0.2 57 | transforms.RandomResizedCrop(300, scale=(0.9, 1.1), ratio=(1, 1)), # scale +- 10%, resize to 300 58 | transforms.CenterCrop((224)), 59 | transforms.ToTensor() 60 | ]) 61 | 62 | # test data transforms (random rotation) 63 | test_data_transforms = transforms.Compose([ 64 | transforms.Lambda(lambda x: x.rotate(uniform(0, 360), resample=Image.BICUBIC)), # random rotation 0 to 360 65 | transforms.RandomResizedCrop(300, scale=(1, 1), ratio=(1, 1)), # resize to 300 66 | transforms.CenterCrop((224)), 67 | transforms.ToTensor() 68 | ]) 69 | 70 | 71 | # Function to implement skew based on PIL transform 72 | 73 | def skew_image(img, angle, inc_width=False): 74 | """ 75 | Skew image using some math 76 | :param img: PIL image object 77 | :param angle: Angle in radians (function doesn't do well outside the range -1 -> 1, but still works) 78 | :return: PIL image object 79 | """ 80 | width, height = img.size 81 | # Get the width that is to be added to the image based on the angle of skew 82 | xshift = tan(abs(angle)) * height 83 | new_width = width + int(xshift) 84 | 85 | if new_width < 0: 86 | return img 87 | 88 | # Apply transform 89 | img = img.transform( 90 | (new_width, height), 91 | Image.AFFINE, 92 | (1, angle, -xshift if angle > 0 else 0, 0, 1, 0), 93 | Image.BICUBIC 94 | ) 95 | 96 | if (inc_width): 97 | return img 98 | else: 99 | return img.crop((0, 0, width, height)) 100 | -------------------------------------------------------------------------------- /src/visualization/__init__.py: -------------------------------------------------------------------------------- 1 | # Adjusted code from https://github.com/utkuozbulak/pytorch-cnn-visualizations to work with ResNet -------------------------------------------------------------------------------- /src/visualization/cnn_layer_visualization.py: -------------------------------------------------------------------------------- 1 | """ 2 | Created on Sat Nov 18 23:12:08 2017 3 | 4 | @author: Utku Ozbulak - 
github.com/utkuozbulak 5 | 6 | Adjusted for ResNet architecture from 7 | https://github.com/utkuozbulak/pytorch-cnn-visualizations/blob/master/src/cnn_layer_visualization.py 8 | 9 | """ 10 | import os 11 | 12 | import cv2 13 | import numpy as np 14 | import torch 15 | from torch.optim import SGD 16 | 17 | from DRNet import DRNet 18 | from visualization.misc_functions import preprocess_image, recreate_image 19 | 20 | from torchvision.models.resnet import Bottleneck 21 | 22 | 23 | class CNNLayerVisualization(): 24 | """ 25 | Produces an image that minimizes the loss of a convolution 26 | operation for a specific layer and filter 27 | """ 28 | 29 | def __init__(self, model, selected_layer, selected_filter): 30 | self.model = model 31 | self.model.eval() 32 | self.selected_layer = selected_layer 33 | self.selected_filter = selected_filter 34 | self.conv_output = 0 35 | # Generate a random image 36 | self.created_image = np.uint8(np.random.uniform(150, 180, (224, 224, 3))) 37 | # Create the folder to export images if not exists 38 | if not os.path.exists('../../generated'): 39 | os.makedirs('../../generated') 40 | 41 | def hook_layer(self): 42 | def hook_function(module, grad_in, grad_out): 43 | # Gets the conv output of the selected filter (from selected layer) 44 | self.conv_output = grad_out[0, self.selected_filter] 45 | 46 | # Hook the selected layer 47 | for module_name, module in self.model._modules.items(): 48 | if module_name.startswith('layer'): 49 | for layer_module_name, layer_module in module._modules.items(): 50 | if module_name == self.selected_layer[0] and layer_module_name == self.selected_layer[1] \ 51 | and self.selected_layer[2] == 'conv1': 52 | layer_module.conv1.register_forward_hook(hook_function) 53 | break 54 | if module_name == self.selected_layer[0] and layer_module_name == self.selected_layer[1] \ 55 | and self.selected_layer[2] == 'conv2': 56 | layer_module.conv2.register_forward_hook(hook_function) 57 | break 58 | if module_name == self.selected_layer[0] and layer_module_name == self.selected_layer[1] \ 59 | and self.selected_layer[2] == 'conv3': 60 | layer_module.conv3.register_forward_hook(hook_function) 61 | break 62 | elif module_name != 'fc': 63 | if self.selected_layer == 'conv1' == module_name: 64 | module.register_forward_hook(hook_function) 65 | break 66 | 67 | def visualise_layer_with_hooks(self): 68 | # Hook the selected layer 69 | self.hook_layer() 70 | # Process image and return variable 71 | self.processed_image = preprocess_image(self.created_image) 72 | # Define optimizer for the image 73 | # Earlier layers need higher learning rates to visualize whereas later layers need less 74 | optimizer = SGD([self.processed_image], lr=5, weight_decay=1e-6) 75 | for i in range(1, 51): 76 | optimizer.zero_grad() 77 | # Assign create image to a variable to move forward in the model 78 | x = self.processed_image 79 | 80 | found_x = False 81 | self.model(x) 82 | for module_name, module in self.model._modules.items(): 83 | if found_x: 84 | break 85 | if module_name.startswith('layer'): 86 | for layer_module_name, layer_module in module._modules.items(): 87 | residual = x 88 | out = layer_module.conv1(x) 89 | if module_name == self.selected_layer[0] and layer_module_name == self.selected_layer[1] \ 90 | and self.selected_layer[2] == 'conv1': 91 | x = out 92 | found_x = True 93 | break 94 | out = layer_module.bn1(out) 95 | out = layer_module.relu(out) 96 | 97 | out = layer_module.conv2(out) 98 | if module_name == self.selected_layer[0] and layer_module_name == 
self.selected_layer[1] \ 99 | and self.selected_layer[2] == 'conv2': 100 | x = out 101 | found_x = True 102 | break 103 | out = layer_module.bn2(out) 104 | 105 | if type(layer_module) == Bottleneck: 106 | out = layer_module.relu(out) 107 | 108 | out = layer_module.conv3(out) 109 | if module_name == self.selected_layer[0] and layer_module_name == self.selected_layer[1] \ 110 | and self.selected_layer[2] == 'conv3': 111 | x = out 112 | found_x = True 113 | break 114 | out = layer_module.bn3(out) 115 | 116 | if layer_module.downsample is not None: 117 | residual = layer_module.downsample(x) 118 | 119 | out += residual 120 | out = layer_module.relu(out) 121 | x = out 122 | elif module_name != 'fc': 123 | x = module(x) 124 | if self.selected_layer == 'conv1' == module_name: 125 | break 126 | # Loss function is the mean of the output of the selected layer/filter 127 | # We try to minimize the mean of the output of that specific filter 128 | loss = torch.mean(self.conv_output) 129 | print('Iteration:', str(i), 'Loss:', "{0:.2f}".format(loss.data.numpy()[0])) 130 | # Backward 131 | loss.backward() 132 | # Update image 133 | optimizer.step() 134 | # Recreate image 135 | self.created_image = recreate_image(self.processed_image) 136 | # Save image 137 | if i % 50 == 0: 138 | cv2.imwrite('../../generated/resnet18_ft/layer_vis_l' + str(self.selected_layer) + 139 | '_f' + str(self.selected_filter) + '_iter' + str(i) + '.jpg', 140 | self.created_image) 141 | 142 | def visualise_layer_without_hooks(self): 143 | # Process image and return variable 144 | self.processed_image = preprocess_image(self.created_image) 145 | # Define optimizer for the image 146 | # Earlier layers need higher learning rates to visualize whereas later layers need less 147 | optimizer = SGD([self.processed_image], lr=5, weight_decay=1e-6) 148 | for i in range(1, 51): 149 | optimizer.zero_grad() 150 | # Assign create image to a variable to move forward in the model 151 | x = self.processed_image 152 | 153 | found_x = False 154 | for module_name, module in self.model._modules.items(): 155 | if found_x: 156 | break 157 | if module_name.startswith('layer'): 158 | for layer_module_name, layer_module in module._modules.items(): 159 | residual = x 160 | out = layer_module.conv1(x) 161 | if module_name == self.selected_layer[0] and layer_module_name == self.selected_layer[1] \ 162 | and self.selected_layer[2] == 'conv1': 163 | x = out 164 | found_x = True 165 | break 166 | out = layer_module.bn1(out) 167 | out = layer_module.relu(out) 168 | 169 | out = layer_module.conv2(out) 170 | if module_name == self.selected_layer[0] and layer_module_name == self.selected_layer[1] \ 171 | and self.selected_layer[2] == 'conv2': 172 | x = out 173 | found_x = True 174 | break 175 | out = layer_module.bn2(out) 176 | 177 | if type(layer_module) == Bottleneck: 178 | out = layer_module.relu(out) 179 | 180 | out = layer_module.conv3(out) 181 | if module_name == self.selected_layer[0] and layer_module_name == self.selected_layer[1] \ 182 | and self.selected_layer[2] == 'conv3': 183 | x = out 184 | found_x = True 185 | break 186 | out = layer_module.bn3(out) 187 | 188 | if layer_module.downsample is not None: 189 | residual = layer_module.downsample(x) 190 | 191 | out += residual 192 | out = layer_module.relu(out) 193 | x = out 194 | elif module_name != 'fc': 195 | x = module(x) 196 | if self.selected_layer == 'conv1' == module_name: 197 | break 198 | 199 | # Here, we get the specific filter from the output of the convolution operation 200 | # x is a tensor of 
shape 1x512x28x28 (for layer 17) 201 | # So there are 512 unique filter outputs 202 | # Following line selects a filter from 512 filters so self.conv_output will become 203 | # a tensor of shape 28x28 204 | self.conv_output = x[0, self.selected_filter] 205 | # Loss function is the mean of the output of the selected layer/filter 206 | # We try to minimize the mean of the output of that specific filter 207 | loss = torch.mean(self.conv_output) 208 | print('Iteration:', str(i), 'Loss:', "{0:.2f}".format(loss.data.numpy()[0])) 209 | # Backward 210 | loss.backward() 211 | # Update image 212 | optimizer.step() 213 | # Recreate image 214 | self.created_image = recreate_image(self.processed_image) 215 | # Save image 216 | if i % 50 == 0: 217 | cv2.imwrite('../../generated/layer_vis_l' + str(self.selected_layer) + 218 | '_f' + str(self.selected_filter) + '_iter' + str(i) + '.jpg', 219 | self.created_image) 220 | 221 | 222 | if __name__ == '__main__': 223 | # Set to 'conv1', ('layer1', '0', 'conv1'), ('layer1', '0', 'conv2'),... 224 | # ResNet18 and 34 have up to conv2, ResNet50 has conv3 225 | for cnn_layer in [ 226 | 'conv1', 227 | ('layer1', '0', 'conv1'), ('layer2', '1', 'conv2'), ('layer3', '0', 'conv1'), 228 | # ('layer4', '2', 'conv3') 229 | ('layer3', '1', 'conv2') 230 | ]: 231 | # Set to the number of the filter in the selected conv layer 232 | for filter_pos in [0, 1, 2, 30]: 233 | pretrained_model = torch.load('../../models/DRNet_TL_18_Finetune_adam_plateau.model', lambda storage, loc: storage) 234 | if type(pretrained_model) == DRNet: 235 | pretrained_model = pretrained_model.resnet 236 | 237 | layer_vis = CNNLayerVisualization(pretrained_model, cnn_layer, filter_pos) 238 | 239 | # Either visualise_layer_with_hooks or visualise_layer_without_hooks can be used; both work fine.
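# (visualise_layer_with_hooks registers a forward hook on the selected conv module and reads the
# selected filter's activation from the hook, while visualise_layer_without_hooks replays the forward
# pass manually and slices the activation out of x directly.)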
240 | # Layer visualization with pytorch hooks 241 | layer_vis.visualise_layer_with_hooks() 242 | 243 | # Layer visualization without pytorch hooks 244 | # layer_vis.visualise_layer_without_hooks() 245 | print(f'Visualization done completed for {cnn_layer}, {filter_pos}') 246 | -------------------------------------------------------------------------------- /src/visualization/gradcam.py: -------------------------------------------------------------------------------- 1 | """ 2 | Created on Thu Oct 26 11:06:51 2017 3 | 4 | @author: Utku Ozbulak - github.com/utkuozbulak 5 | 6 | Adjusted for ResNet architecture from 7 | https://github.com/utkuozbulak/pytorch-cnn-visualizations/blob/master/src/gradcam.py 8 | """ 9 | import cv2 10 | import numpy as np 11 | import torch 12 | from torchvision.models.resnet import Bottleneck 13 | 14 | from visualization.misc_functions import get_params, save_class_activation_on_image 15 | 16 | 17 | class CamExtractor(): 18 | """ 19 | Extracts cam features from the model 20 | """ 21 | def __init__(self, model, target_layer): 22 | self.model = model 23 | self.target_layer = target_layer 24 | self.gradients = None 25 | 26 | def save_gradient(self, grad): 27 | self.gradients = grad 28 | 29 | def forward_pass_on_convolutions(self, x): 30 | """ 31 | Does a forward pass on convolutions, hooks the function at given layer 32 | """ 33 | conv_output = None 34 | for module_name, module in self.model._modules.items(): 35 | if module_name.startswith('layer'): 36 | for layer_module_name, layer_module in module._modules.items(): 37 | residual = x 38 | out = layer_module.conv1(x) 39 | if module_name == self.target_layer[0] and layer_module_name == self.target_layer[1] \ 40 | and self.target_layer[2] == 'conv1': 41 | out.register_hook(self.save_gradient) 42 | conv_output = out 43 | out = layer_module.bn1(out) 44 | out = layer_module.relu(out) 45 | 46 | out = layer_module.conv2(out) 47 | if module_name == self.target_layer[0] and layer_module_name == self.target_layer[1] \ 48 | and self.target_layer[2] == 'conv2': 49 | out.register_hook(self.save_gradient) 50 | conv_output = out 51 | out = layer_module.bn2(out) 52 | 53 | if type(layer_module) == Bottleneck: 54 | out = layer_module.relu(out) 55 | 56 | out = layer_module.conv3(out) 57 | if module_name == self.target_layer[0] and layer_module_name == self.target_layer[1] \ 58 | and self.target_layer[2] == 'conv3': 59 | out.register_hook(self.save_gradient) 60 | conv_output = out 61 | out = layer_module.bn3(out) 62 | 63 | if layer_module.downsample is not None: 64 | residual = layer_module.downsample(x) 65 | 66 | out += residual 67 | out = layer_module.relu(out) 68 | x = out 69 | elif module_name != 'fc': 70 | x = module(x) 71 | if self.target_layer == 'conv1' == module_name: 72 | x.register_hook(self.save_gradient) 73 | conv_output = x 74 | return conv_output, x 75 | 76 | def forward_pass(self, x): 77 | """ 78 | Does a full forward pass on the model 79 | """ 80 | # Forward pass on the convolutions 81 | conv_output, x = self.forward_pass_on_convolutions(x) 82 | x = x.view(x.size(0), -1) # Flatten 83 | # Forward pass on the classifier 84 | x = self.model.fc(x) 85 | return conv_output, x 86 | 87 | 88 | class GradCam(): 89 | """ 90 | Produces class activation map 91 | """ 92 | def __init__(self, model, target_layer): 93 | self.model = model 94 | self.model.eval() 95 | # Define extractor 96 | self.extractor = CamExtractor(self.model, target_layer) 97 | 98 | def generate_cam(self, input_image, target_index=None): 99 | # Full forward pass 100 
| # conv_output is the output of convolutions at specified layer 101 | # model_output is the final output of the model (1, 1000) 102 | conv_output, model_output = self.extractor.forward_pass(input_image) 103 | if target_index is None: 104 | target_index = np.argmax(model_output.data.numpy()) 105 | # Target for backprop 106 | one_hot_output = torch.FloatTensor(1, model_output.size()[-1]).zero_() 107 | one_hot_output[0][target_index] = 1 108 | # Zero grads 109 | self.model.zero_grad() 110 | # self.model.features.zero_grad() 111 | # self.model.classifier.zero_grad() 112 | # Backward pass with specified target 113 | model_output.backward(gradient=one_hot_output, retain_graph=True) 114 | # Get hooked gradients 115 | guided_gradients = self.extractor.gradients.data.numpy()[0] 116 | # Get convolution outputs 117 | target = conv_output.data.numpy()[0] 118 | # Get weights from gradients 119 | weights = np.mean(guided_gradients, axis=(1, 2)) # Take averages for each gradient 120 | # Create empty numpy array for cam 121 | cam = np.ones(target.shape[1:], dtype=np.float32) 122 | # Multiply each weight with its conv output and then, sum 123 | for i, w in enumerate(weights): 124 | cam += w * target[i, :, :] 125 | cam = cv2.resize(cam, (224, 224)) 126 | cam = np.maximum(cam, 0) 127 | cam = (cam - np.min(cam)) / (np.max(cam) - np.min(cam)) # Normalize between 0-1 128 | cam = np.uint8(cam * 255) # Scale between 0-255 to visualize 129 | return cam 130 | 131 | 132 | if __name__ == '__main__': 133 | # one for each class, you might adjust the paths in misc_functions#get_params 134 | for target_example in range(5): 135 | (original_image, prep_img, target_class, file_name_to_export, pretrained_model) =\ 136 | get_params(target_example) 137 | # Set to 'conv1', ('layer1', '0', 'conv1'), ('layer1', '0', 'conv2'),... 
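# Each tuple is (layer group name, block index within that group, conv module inside the block);
# the plain string 'conv1' targets the stem convolution before the residual layer groups.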
138 | for target_layer in [ 139 | 'conv1', ('layer1', '0', 'conv1'), ('layer2', '1', 'conv2'), ('layer3', '0', 'conv1'), 140 | # ('layer4', '1', 'conv2') 141 | ('layer4', '2', 'conv3') 142 | ]: 143 | # Grad cam 144 | grad_cam = GradCam(pretrained_model, target_layer=target_layer) 145 | # Generate cam mask 146 | cam = grad_cam.generate_cam(prep_img, target_class) 147 | # Save mask 148 | current_file_name_to_export = file_name_to_export + f'_{str(target_layer)}' 149 | save_class_activation_on_image(original_image, cam, current_file_name_to_export) 150 | print(f'Grad cam completed for {target_example}, {target_layer}') 151 | -------------------------------------------------------------------------------- /src/visualization/guided_backprop.py: -------------------------------------------------------------------------------- 1 | """ 2 | Created on Thu Oct 26 11:23:47 2017 3 | 4 | @author: Utku Ozbulak - github.com/utkuozbulak 5 | 6 | Adjusted for ResNet architecture from 7 | https://github.com/utkuozbulak/pytorch-cnn-visualizations/blob/master/src/guided_backprop.py 8 | """ 9 | import torch 10 | from torch.nn import ReLU 11 | 12 | from visualization.misc_functions import (get_params, 13 | convert_to_grayscale, 14 | save_gradient_images, 15 | get_positive_negative_saliency) 16 | 17 | 18 | class GuidedBackprop(): 19 | """ 20 | Produces gradients generated with guided back propagation from the given image 21 | """ 22 | 23 | def __init__(self, model, processed_im, target_class): 24 | self.model = model 25 | self.input_image = processed_im 26 | self.target_class = target_class 27 | self.gradients = None 28 | # Put model in evaluation mode 29 | self.model.eval() 30 | self.update_relus() 31 | self.hook_layers() 32 | 33 | def hook_layers(self): 34 | def hook_function(module, grad_in, grad_out): 35 | self.gradients = grad_in[0] 36 | 37 | # Register hook to the first layer 38 | first_layer = list(self.model._modules.items())[0][1] 39 | first_layer.register_backward_hook(hook_function) 40 | 41 | def update_relus(self): 42 | """ 43 | Updates relu activation functions so that it only returns positive gradients 44 | """ 45 | 46 | def relu_hook_function(module, grad_in, grad_out): 47 | """ 48 | If there is a negative gradient, changes it to zero 49 | """ 50 | if isinstance(module, ReLU): 51 | return (torch.clamp(grad_in[0], min=0.0),) 52 | 53 | # Loop through layers, hook up ReLUs with relu_hook_function 54 | for module_name, module in self.model._modules.items(): 55 | if module_name.startswith('layer'): 56 | for layer_module_name, layer_module in module._modules.items(): 57 | for _, block_module in layer_module._modules.items(): 58 | if isinstance(block_module, ReLU): 59 | block_module.register_backward_hook(relu_hook_function) 60 | elif module_name != 'fc': 61 | if isinstance(module, ReLU): 62 | module.register_backward_hook(relu_hook_function) 63 | 64 | def generate_gradients(self): 65 | # Forward pass 66 | model_output = self.model(self.input_image) 67 | # Zero gradients 68 | self.model.zero_grad() 69 | # Target for backprop 70 | one_hot_output = torch.FloatTensor(1, model_output.size()[-1]).zero_() 71 | one_hot_output[0][self.target_class] = 1 72 | # Backward pass 73 | model_output.backward(gradient=one_hot_output) 74 | # Convert Pytorch variable to numpy array 75 | # [0] to get rid of the first channel (1,3,224,224) 76 | gradients_as_arr = self.gradients.data.numpy()[0] 77 | return gradients_as_arr 78 | 79 | 80 | if __name__ == '__main__': 81 | # one for each class, you might adjust the paths in 
misc_functions#get_params 82 | 83 | for target_example in range(5): 84 | (original_image, prep_img, target_class, file_name_to_export, pretrained_model) = \ 85 | get_params(target_example) 86 | 87 | # Guided backprop 88 | GBP = GuidedBackprop(pretrained_model, prep_img, target_class) 89 | # Get gradients 90 | guided_grads = GBP.generate_gradients() 91 | # Save colored gradients 92 | save_gradient_images(guided_grads, file_name_to_export + '_Guided_BP_color') 93 | # Convert to grayscale 94 | grayscale_guided_grads = convert_to_grayscale(guided_grads) 95 | # Save grayscale gradients 96 | save_gradient_images(grayscale_guided_grads, file_name_to_export + '_Guided_BP_gray') 97 | # Positive and negative saliency maps 98 | pos_sal, neg_sal = get_positive_negative_saliency(guided_grads) 99 | save_gradient_images(pos_sal, file_name_to_export + '_pos_sal') 100 | save_gradient_images(neg_sal, file_name_to_export + '_neg_sal') 101 | print(f'Guided backprop completed for {target_example}') 102 | -------------------------------------------------------------------------------- /src/visualization/misc_functions.py: -------------------------------------------------------------------------------- 1 | """ 2 | Created on Thu Oct 21 11:09:09 2017 3 | 4 | @author: Utku Ozbulak - github.com/utkuozbulak 5 | Adjusted for ResNet architecture from 6 | https://github.com/utkuozbulak/pytorch-cnn-visualizations/blob/master/src/misc_functions.py 7 | """ 8 | import copy 9 | import os 10 | 11 | import cv2 12 | import numpy as np 13 | import torch 14 | from torch.autograd import Variable 15 | 16 | from DRNet import DRNet 17 | 18 | 19 | def convert_to_grayscale(cv2im): 20 | """ 21 | Converts 3d image to grayscale 22 | 23 | Args: 24 | cv2im (numpy arr): RGB image with shape (D,W,H) 25 | 26 | returns: 27 | grayscale_im (numpy_arr): Grayscale image with shape (1,W,D) 28 | """ 29 | grayscale_im = np.sum(np.abs(cv2im), axis=0) 30 | im_max = np.percentile(grayscale_im, 99) 31 | im_min = np.min(grayscale_im) 32 | grayscale_im = (np.clip((grayscale_im - im_min) / (im_max - im_min), 0, 1)) 33 | grayscale_im = np.expand_dims(grayscale_im, axis=0) 34 | return grayscale_im 35 | 36 | 37 | def save_gradient_images(gradient, file_name): 38 | """ 39 | Exports the original gradient image 40 | 41 | Args: 42 | gradient (np arr): Numpy array of the gradient with shape (3, 224, 224) 43 | file_name (str): File name to be exported 44 | """ 45 | if not os.path.exists('../../results'): 46 | os.makedirs('../../results') 47 | gradient = gradient - gradient.min() 48 | gradient /= gradient.max() 49 | gradient = np.uint8(gradient * 255).transpose(1, 2, 0) 50 | path_to_file = os.path.join('../../results', file_name + '.jpg') 51 | # Convert RBG to GBR 52 | gradient = gradient[..., ::-1] 53 | cv2.imwrite(path_to_file, gradient) 54 | 55 | 56 | def save_class_activation_on_image(org_img, activation_map, file_name): 57 | """ 58 | Saves cam activation map and activation map on the original image 59 | 60 | Args: 61 | org_img (PIL img): Original image 62 | activation_map (numpy arr): activation map (grayscale) 0-255 63 | file_name (str): File name of the exported image 64 | """ 65 | if not os.path.exists('../../results'): 66 | os.makedirs('../../results') 67 | # Grayscale activation map 68 | path_to_file = os.path.join('../../results', file_name + '_Cam_Grayscale.jpg') 69 | cv2.imwrite(path_to_file, activation_map) 70 | # Heatmap of activation map 71 | activation_heatmap = cv2.applyColorMap(activation_map, cv2.COLORMAP_HSV) 72 | path_to_file = 
os.path.join('../../results', file_name + '_Cam_Heatmap.jpg') 73 | cv2.imwrite(path_to_file, activation_heatmap) 74 | # Heatmap on picture 75 | org_img = cv2.resize(org_img, (224, 224)) 76 | img_with_heatmap = np.float32(activation_heatmap) + np.float32(org_img) 77 | img_with_heatmap = img_with_heatmap / np.max(img_with_heatmap) 78 | path_to_file = os.path.join('../../results', file_name + '_Cam_On_Image.jpg') 79 | cv2.imwrite(path_to_file, np.uint8(255 * img_with_heatmap)) 80 | 81 | 82 | def preprocess_image(cv2im, resize_im=True): 83 | """ 84 | Processes image for CNNs 85 | 86 | Args: 87 | PIL_img (PIL_img): Image to process 88 | resize_im (bool): Resize to 224 or not 89 | returns: 90 | im_as_var (Pytorch variable): Variable that contains processed float tensor 91 | """ 92 | # mean and std list for channels (Imagenet) 93 | mean = [0.485, 0.456, 0.406] 94 | std = [0.229, 0.224, 0.225] 95 | # Resize image 96 | if resize_im: 97 | cv2im = cv2.resize(cv2im, (224, 224)) 98 | im_as_arr = np.float32(cv2im) 99 | im_as_arr = np.ascontiguousarray(im_as_arr[..., ::-1]) 100 | im_as_arr = im_as_arr.transpose(2, 0, 1) # Convert array to D,W,H 101 | # Normalize the channels 102 | for channel, _ in enumerate(im_as_arr): 103 | im_as_arr[channel] /= 255 104 | im_as_arr[channel] -= mean[channel] 105 | im_as_arr[channel] /= std[channel] 106 | # Convert to float tensor 107 | im_as_ten = torch.from_numpy(im_as_arr).float() 108 | # Add one more channel to the beginning. Tensor shape = 1,3,224,224 109 | im_as_ten.unsqueeze_(0) 110 | # Convert to Pytorch variable 111 | im_as_var = Variable(im_as_ten, requires_grad=True) 112 | return im_as_var 113 | 114 | 115 | def recreate_image(im_as_var): 116 | """ 117 | Recreates images from a torch variable, sort of reverse preprocessing 118 | 119 | Args: 120 | im_as_var (torch variable): Image to recreate 121 | 122 | returns: 123 | recreated_im (numpy arr): Recreated image in array 124 | """ 125 | reverse_mean = [-0.485, -0.456, -0.406] 126 | reverse_std = [1 / 0.229, 1 / 0.224, 1 / 0.225] 127 | recreated_im = copy.copy(im_as_var.data.numpy()[0]) 128 | for c in range(3): 129 | recreated_im[c] /= reverse_std[c] 130 | recreated_im[c] -= reverse_mean[c] 131 | recreated_im[recreated_im > 1] = 1 132 | recreated_im[recreated_im < 0] = 0 133 | recreated_im = np.round(recreated_im * 255) 134 | 135 | recreated_im = np.uint8(recreated_im).transpose(1, 2, 0) 136 | # Convert RBG to GBR 137 | recreated_im = recreated_im[..., ::-1] 138 | return recreated_im 139 | 140 | 141 | def get_positive_negative_saliency(gradient): 142 | """ 143 | Generates positive and negative saliency maps based on the gradient 144 | Args: 145 | gradient (numpy arr): Gradient of the operation to visualize 146 | 147 | returns: 148 | pos_saliency ( ) 149 | """ 150 | pos_saliency = (np.maximum(0, gradient) / gradient.max()) 151 | neg_saliency = (np.maximum(0, -gradient) / -gradient.min()) 152 | return pos_saliency, neg_saliency 153 | 154 | 155 | def get_params(example_index): 156 | """ 157 | Gets used variables for almost all visualizations, like the image, model etc. 
158 | 159 | Args: 160 | example_index (int): Image id to use from examples 161 | 162 | returns: 163 | original_image (numpy arr): Original image read from the file 164 | prep_img (numpy_arr): Processed image 165 | target_class (int): Target class for the image 166 | file_name_to_export (string): File name to export the visualizations 167 | pretrained_model(Pytorch model): Model to use for the operations 168 | """ 169 | # Pick one of the examples 170 | example_list = [['../../data/train_small_300/186_left.jpeg', 0], 171 | ['../../data/train_small_300/178_left.jpeg', 1], 172 | ['../../data/train_small_300/15_right.jpeg', 2], 173 | ['../../data/train_small_300/1002_left.jpeg', 3], 174 | ['../../data/train_small_300/1084_left.jpeg', 4]] 175 | selected_example = example_index 176 | img_path = example_list[selected_example][0] 177 | target_class = example_list[selected_example][1] 178 | file_name_to_export = img_path[img_path.rfind('/') + 1:img_path.rfind('.')] 179 | # Read image 180 | original_image = cv2.imread(img_path, 1) 181 | # Process image 182 | prep_img = preprocess_image(original_image) 183 | # Define model 184 | # pretrained_model = models.alexnet(pretrained=False) 185 | pretrained_model = torch.load('../../models/DRNet_TL_50_ft_adam_plateau.model', lambda storage, loc: storage) 186 | if type(pretrained_model) == DRNet: 187 | pretrained_model = pretrained_model.resnet 188 | return (original_image, 189 | prep_img, 190 | target_class, 191 | file_name_to_export, 192 | pretrained_model) 193 | --------------------------------------------------------------------------------