├── .gitattributes ├── .gitignore ├── LICENSE ├── README.md ├── cnn_lcd.py ├── cnn_models.py ├── dataset.py └── paper ├── DEEPSLAM.eps ├── cvpr.sty ├── cvpr_eso.sty ├── egbib.bib ├── eso-pic.sty ├── ieee.bst ├── simplot_city_overfeat_1.png ├── simplot_w_gt_city_overfeat_1.png ├── simplot_w_gt_college_overfeat_1.png └── zjc_egpaper_for_review.tex /.gitattributes: -------------------------------------------------------------------------------- 1 | paper/* linguist-vendored 2 | 3 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.pdf 2 | *.pptx 3 | *.zip 4 | *.out 5 | *.ods 6 | *.txt 7 | OverFeat/ 8 | img/ 9 | models/ 10 | data/ 11 | npy/ 12 | tf_ckpts/ 13 | .idea/ 14 | 15 | # Byte-compiled / optimized / DLL files 16 | __pycache__/ 17 | *.py[cod] 18 | *$py.class 19 | 20 | # C extensions 21 | *.so 22 | 23 | # Distribution / packaging 24 | .Python 25 | env/ 26 | build/ 27 | develop-eggs/ 28 | dist/ 29 | downloads/ 30 | eggs/ 31 | .eggs/ 32 | lib/ 33 | lib64/ 34 | parts/ 35 | sdist/ 36 | var/ 37 | wheels/ 38 | *.egg-info/ 39 | .installed.cfg 40 | *.egg 41 | 42 | # PyInstaller 43 | # Usually these files are written by a python script from a template 44 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 45 | *.manifest 46 | *.spec 47 | 48 | # Installer logs 49 | pip-log.txt 50 | pip-delete-this-directory.txt 51 | 52 | # Unit test / coverage reports 53 | htmlcov/ 54 | .tox/ 55 | .coverage 56 | .coverage.* 57 | .cache 58 | nosetests.xml 59 | coverage.xml 60 | *.cover 61 | .hypothesis/ 62 | 63 | # Translations 64 | *.mo 65 | *.pot 66 | 67 | # Django stuff: 68 | *.log 69 | local_settings.py 70 | 71 | # Flask stuff: 72 | instance/ 73 | .webassets-cache 74 | 75 | # Scrapy stuff: 76 | .scrapy 77 | 78 | # Sphinx documentation 79 | docs/_build/ 80 | 81 | # PyBuilder 82 | target/ 83 | 84 | # Jupyter Notebook 85 | .ipynb_checkpoints 86 | 87 | # pyenv 88 | .python-version 89 | 90 | # celery beat schedule file 91 | celerybeat-schedule 92 | 93 | # SageMath parsed files 94 | *.sage.py 95 | 96 | # dotenv 97 | .env 98 | 99 | # virtualenv 100 | .venv 101 | venv/ 102 | ENV/ 103 | 104 | # Spyder project settings 105 | .spyderproject 106 | .spyproject 107 | 108 | # Rope project settings 109 | .ropeproject 110 | 111 | # mkdocs documentation 112 | /site 113 | 114 | # mypy 115 | .mypy_cache/ 116 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 7 | 8 | Preamble 9 | 10 | The GNU General Public License is a free, copyleft license for 11 | software and other kinds of works. 12 | 13 | The licenses for most software and other practical works are designed 14 | to take away your freedom to share and change the works. By contrast, 15 | the GNU General Public License is intended to guarantee your freedom to 16 | share and change all versions of a program--to make sure it remains free 17 | software for all its users. We, the Free Software Foundation, use the 18 | GNU General Public License for most of our software; it applies also to 19 | any other work released this way by its authors. 
You can apply it to 20 | your programs, too. 21 | 22 | When we speak of free software, we are referring to freedom, not 23 | price. Our General Public Licenses are designed to make sure that you 24 | have the freedom to distribute copies of free software (and charge for 25 | them if you wish), that you receive source code or can get it if you 26 | want it, that you can change the software or use pieces of it in new 27 | free programs, and that you know you can do these things. 28 | 29 | To protect your rights, we need to prevent others from denying you 30 | these rights or asking you to surrender the rights. Therefore, you have 31 | certain responsibilities if you distribute copies of the software, or if 32 | you modify it: responsibilities to respect the freedom of others. 33 | 34 | For example, if you distribute copies of such a program, whether 35 | gratis or for a fee, you must pass on to the recipients the same 36 | freedoms that you received. You must make sure that they, too, receive 37 | or can get the source code. And you must show them these terms so they 38 | know their rights. 39 | 40 | Developers that use the GNU GPL protect your rights with two steps: 41 | (1) assert copyright on the software, and (2) offer you this License 42 | giving you legal permission to copy, distribute and/or modify it. 43 | 44 | For the developers' and authors' protection, the GPL clearly explains 45 | that there is no warranty for this free software. For both users' and 46 | authors' sake, the GPL requires that modified versions be marked as 47 | changed, so that their problems will not be attributed erroneously to 48 | authors of previous versions. 49 | 50 | Some devices are designed to deny users access to install or run 51 | modified versions of the software inside them, although the manufacturer 52 | can do so. This is fundamentally incompatible with the aim of 53 | protecting users' freedom to change the software. The systematic 54 | pattern of such abuse occurs in the area of products for individuals to 55 | use, which is precisely where it is most unacceptable. Therefore, we 56 | have designed this version of the GPL to prohibit the practice for those 57 | products. If such problems arise substantially in other domains, we 58 | stand ready to extend this provision to those domains in future versions 59 | of the GPL, as needed to protect the freedom of users. 60 | 61 | Finally, every program is threatened constantly by software patents. 62 | States should not allow patents to restrict development and use of 63 | software on general-purpose computers, but in those that do, we wish to 64 | avoid the special danger that patents applied to a free program could 65 | make it effectively proprietary. To prevent this, the GPL assures that 66 | patents cannot be used to render the program non-free. 67 | 68 | The precise terms and conditions for copying, distribution and 69 | modification follow. 70 | 71 | TERMS AND CONDITIONS 72 | 73 | 0. Definitions. 74 | 75 | "This License" refers to version 3 of the GNU General Public License. 76 | 77 | "Copyright" also means copyright-like laws that apply to other kinds of 78 | works, such as semiconductor masks. 79 | 80 | "The Program" refers to any copyrightable work licensed under this 81 | License. Each licensee is addressed as "you". "Licensees" and 82 | "recipients" may be individuals or organizations. 
83 | 84 | To "modify" a work means to copy from or adapt all or part of the work 85 | in a fashion requiring copyright permission, other than the making of an 86 | exact copy. The resulting work is called a "modified version" of the 87 | earlier work or a work "based on" the earlier work. 88 | 89 | A "covered work" means either the unmodified Program or a work based 90 | on the Program. 91 | 92 | To "propagate" a work means to do anything with it that, without 93 | permission, would make you directly or secondarily liable for 94 | infringement under applicable copyright law, except executing it on a 95 | computer or modifying a private copy. Propagation includes copying, 96 | distribution (with or without modification), making available to the 97 | public, and in some countries other activities as well. 98 | 99 | To "convey" a work means any kind of propagation that enables other 100 | parties to make or receive copies. Mere interaction with a user through 101 | a computer network, with no transfer of a copy, is not conveying. 102 | 103 | An interactive user interface displays "Appropriate Legal Notices" 104 | to the extent that it includes a convenient and prominently visible 105 | feature that (1) displays an appropriate copyright notice, and (2) 106 | tells the user that there is no warranty for the work (except to the 107 | extent that warranties are provided), that licensees may convey the 108 | work under this License, and how to view a copy of this License. If 109 | the interface presents a list of user commands or options, such as a 110 | menu, a prominent item in the list meets this criterion. 111 | 112 | 1. Source Code. 113 | 114 | The "source code" for a work means the preferred form of the work 115 | for making modifications to it. "Object code" means any non-source 116 | form of a work. 117 | 118 | A "Standard Interface" means an interface that either is an official 119 | standard defined by a recognized standards body, or, in the case of 120 | interfaces specified for a particular programming language, one that 121 | is widely used among developers working in that language. 122 | 123 | The "System Libraries" of an executable work include anything, other 124 | than the work as a whole, that (a) is included in the normal form of 125 | packaging a Major Component, but which is not part of that Major 126 | Component, and (b) serves only to enable use of the work with that 127 | Major Component, or to implement a Standard Interface for which an 128 | implementation is available to the public in source code form. A 129 | "Major Component", in this context, means a major essential component 130 | (kernel, window system, and so on) of the specific operating system 131 | (if any) on which the executable work runs, or a compiler used to 132 | produce the work, or an object code interpreter used to run it. 133 | 134 | The "Corresponding Source" for a work in object code form means all 135 | the source code needed to generate, install, and (for an executable 136 | work) run the object code and to modify the work, including scripts to 137 | control those activities. However, it does not include the work's 138 | System Libraries, or general-purpose tools or generally available free 139 | programs which are used unmodified in performing those activities but 140 | which are not part of the work. 
For example, Corresponding Source 141 | includes interface definition files associated with source files for 142 | the work, and the source code for shared libraries and dynamically 143 | linked subprograms that the work is specifically designed to require, 144 | such as by intimate data communication or control flow between those 145 | subprograms and other parts of the work. 146 | 147 | The Corresponding Source need not include anything that users 148 | can regenerate automatically from other parts of the Corresponding 149 | Source. 150 | 151 | The Corresponding Source for a work in source code form is that 152 | same work. 153 | 154 | 2. Basic Permissions. 155 | 156 | All rights granted under this License are granted for the term of 157 | copyright on the Program, and are irrevocable provided the stated 158 | conditions are met. This License explicitly affirms your unlimited 159 | permission to run the unmodified Program. The output from running a 160 | covered work is covered by this License only if the output, given its 161 | content, constitutes a covered work. This License acknowledges your 162 | rights of fair use or other equivalent, as provided by copyright law. 163 | 164 | You may make, run and propagate covered works that you do not 165 | convey, without conditions so long as your license otherwise remains 166 | in force. You may convey covered works to others for the sole purpose 167 | of having them make modifications exclusively for you, or provide you 168 | with facilities for running those works, provided that you comply with 169 | the terms of this License in conveying all material for which you do 170 | not control copyright. Those thus making or running the covered works 171 | for you must do so exclusively on your behalf, under your direction 172 | and control, on terms that prohibit them from making any copies of 173 | your copyrighted material outside their relationship with you. 174 | 175 | Conveying under any other circumstances is permitted solely under 176 | the conditions stated below. Sublicensing is not allowed; section 10 177 | makes it unnecessary. 178 | 179 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law. 180 | 181 | No covered work shall be deemed part of an effective technological 182 | measure under any applicable law fulfilling obligations under article 183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or 184 | similar laws prohibiting or restricting circumvention of such 185 | measures. 186 | 187 | When you convey a covered work, you waive any legal power to forbid 188 | circumvention of technological measures to the extent such circumvention 189 | is effected by exercising rights under this License with respect to 190 | the covered work, and you disclaim any intention to limit operation or 191 | modification of the work as a means of enforcing, against the work's 192 | users, your or third parties' legal rights to forbid circumvention of 193 | technological measures. 194 | 195 | 4. Conveying Verbatim Copies. 196 | 197 | You may convey verbatim copies of the Program's source code as you 198 | receive it, in any medium, provided that you conspicuously and 199 | appropriately publish on each copy an appropriate copyright notice; 200 | keep intact all notices stating that this License and any 201 | non-permissive terms added in accord with section 7 apply to the code; 202 | keep intact all notices of the absence of any warranty; and give all 203 | recipients a copy of this License along with the Program. 
204 | 205 | You may charge any price or no price for each copy that you convey, 206 | and you may offer support or warranty protection for a fee. 207 | 208 | 5. Conveying Modified Source Versions. 209 | 210 | You may convey a work based on the Program, or the modifications to 211 | produce it from the Program, in the form of source code under the 212 | terms of section 4, provided that you also meet all of these conditions: 213 | 214 | a) The work must carry prominent notices stating that you modified 215 | it, and giving a relevant date. 216 | 217 | b) The work must carry prominent notices stating that it is 218 | released under this License and any conditions added under section 219 | 7. This requirement modifies the requirement in section 4 to 220 | "keep intact all notices". 221 | 222 | c) You must license the entire work, as a whole, under this 223 | License to anyone who comes into possession of a copy. This 224 | License will therefore apply, along with any applicable section 7 225 | additional terms, to the whole of the work, and all its parts, 226 | regardless of how they are packaged. This License gives no 227 | permission to license the work in any other way, but it does not 228 | invalidate such permission if you have separately received it. 229 | 230 | d) If the work has interactive user interfaces, each must display 231 | Appropriate Legal Notices; however, if the Program has interactive 232 | interfaces that do not display Appropriate Legal Notices, your 233 | work need not make them do so. 234 | 235 | A compilation of a covered work with other separate and independent 236 | works, which are not by their nature extensions of the covered work, 237 | and which are not combined with it such as to form a larger program, 238 | in or on a volume of a storage or distribution medium, is called an 239 | "aggregate" if the compilation and its resulting copyright are not 240 | used to limit the access or legal rights of the compilation's users 241 | beyond what the individual works permit. Inclusion of a covered work 242 | in an aggregate does not cause this License to apply to the other 243 | parts of the aggregate. 244 | 245 | 6. Conveying Non-Source Forms. 246 | 247 | You may convey a covered work in object code form under the terms 248 | of sections 4 and 5, provided that you also convey the 249 | machine-readable Corresponding Source under the terms of this License, 250 | in one of these ways: 251 | 252 | a) Convey the object code in, or embodied in, a physical product 253 | (including a physical distribution medium), accompanied by the 254 | Corresponding Source fixed on a durable physical medium 255 | customarily used for software interchange. 256 | 257 | b) Convey the object code in, or embodied in, a physical product 258 | (including a physical distribution medium), accompanied by a 259 | written offer, valid for at least three years and valid for as 260 | long as you offer spare parts or customer support for that product 261 | model, to give anyone who possesses the object code either (1) a 262 | copy of the Corresponding Source for all the software in the 263 | product that is covered by this License, on a durable physical 264 | medium customarily used for software interchange, for a price no 265 | more than your reasonable cost of physically performing this 266 | conveying of source, or (2) access to copy the 267 | Corresponding Source from a network server at no charge. 
268 | 269 | c) Convey individual copies of the object code with a copy of the 270 | written offer to provide the Corresponding Source. This 271 | alternative is allowed only occasionally and noncommercially, and 272 | only if you received the object code with such an offer, in accord 273 | with subsection 6b. 274 | 275 | d) Convey the object code by offering access from a designated 276 | place (gratis or for a charge), and offer equivalent access to the 277 | Corresponding Source in the same way through the same place at no 278 | further charge. You need not require recipients to copy the 279 | Corresponding Source along with the object code. If the place to 280 | copy the object code is a network server, the Corresponding Source 281 | may be on a different server (operated by you or a third party) 282 | that supports equivalent copying facilities, provided you maintain 283 | clear directions next to the object code saying where to find the 284 | Corresponding Source. Regardless of what server hosts the 285 | Corresponding Source, you remain obligated to ensure that it is 286 | available for as long as needed to satisfy these requirements. 287 | 288 | e) Convey the object code using peer-to-peer transmission, provided 289 | you inform other peers where the object code and Corresponding 290 | Source of the work are being offered to the general public at no 291 | charge under subsection 6d. 292 | 293 | A separable portion of the object code, whose source code is excluded 294 | from the Corresponding Source as a System Library, need not be 295 | included in conveying the object code work. 296 | 297 | A "User Product" is either (1) a "consumer product", which means any 298 | tangible personal property which is normally used for personal, family, 299 | or household purposes, or (2) anything designed or sold for incorporation 300 | into a dwelling. In determining whether a product is a consumer product, 301 | doubtful cases shall be resolved in favor of coverage. For a particular 302 | product received by a particular user, "normally used" refers to a 303 | typical or common use of that class of product, regardless of the status 304 | of the particular user or of the way in which the particular user 305 | actually uses, or expects or is expected to use, the product. A product 306 | is a consumer product regardless of whether the product has substantial 307 | commercial, industrial or non-consumer uses, unless such uses represent 308 | the only significant mode of use of the product. 309 | 310 | "Installation Information" for a User Product means any methods, 311 | procedures, authorization keys, or other information required to install 312 | and execute modified versions of a covered work in that User Product from 313 | a modified version of its Corresponding Source. The information must 314 | suffice to ensure that the continued functioning of the modified object 315 | code is in no case prevented or interfered with solely because 316 | modification has been made. 317 | 318 | If you convey an object code work under this section in, or with, or 319 | specifically for use in, a User Product, and the conveying occurs as 320 | part of a transaction in which the right of possession and use of the 321 | User Product is transferred to the recipient in perpetuity or for a 322 | fixed term (regardless of how the transaction is characterized), the 323 | Corresponding Source conveyed under this section must be accompanied 324 | by the Installation Information. 
But this requirement does not apply 325 | if neither you nor any third party retains the ability to install 326 | modified object code on the User Product (for example, the work has 327 | been installed in ROM). 328 | 329 | The requirement to provide Installation Information does not include a 330 | requirement to continue to provide support service, warranty, or updates 331 | for a work that has been modified or installed by the recipient, or for 332 | the User Product in which it has been modified or installed. Access to a 333 | network may be denied when the modification itself materially and 334 | adversely affects the operation of the network or violates the rules and 335 | protocols for communication across the network. 336 | 337 | Corresponding Source conveyed, and Installation Information provided, 338 | in accord with this section must be in a format that is publicly 339 | documented (and with an implementation available to the public in 340 | source code form), and must require no special password or key for 341 | unpacking, reading or copying. 342 | 343 | 7. Additional Terms. 344 | 345 | "Additional permissions" are terms that supplement the terms of this 346 | License by making exceptions from one or more of its conditions. 347 | Additional permissions that are applicable to the entire Program shall 348 | be treated as though they were included in this License, to the extent 349 | that they are valid under applicable law. If additional permissions 350 | apply only to part of the Program, that part may be used separately 351 | under those permissions, but the entire Program remains governed by 352 | this License without regard to the additional permissions. 353 | 354 | When you convey a copy of a covered work, you may at your option 355 | remove any additional permissions from that copy, or from any part of 356 | it. (Additional permissions may be written to require their own 357 | removal in certain cases when you modify the work.) You may place 358 | additional permissions on material, added by you to a covered work, 359 | for which you have or can give appropriate copyright permission. 360 | 361 | Notwithstanding any other provision of this License, for material you 362 | add to a covered work, you may (if authorized by the copyright holders of 363 | that material) supplement the terms of this License with terms: 364 | 365 | a) Disclaiming warranty or limiting liability differently from the 366 | terms of sections 15 and 16 of this License; or 367 | 368 | b) Requiring preservation of specified reasonable legal notices or 369 | author attributions in that material or in the Appropriate Legal 370 | Notices displayed by works containing it; or 371 | 372 | c) Prohibiting misrepresentation of the origin of that material, or 373 | requiring that modified versions of such material be marked in 374 | reasonable ways as different from the original version; or 375 | 376 | d) Limiting the use for publicity purposes of names of licensors or 377 | authors of the material; or 378 | 379 | e) Declining to grant rights under trademark law for use of some 380 | trade names, trademarks, or service marks; or 381 | 382 | f) Requiring indemnification of licensors and authors of that 383 | material by anyone who conveys the material (or modified versions of 384 | it) with contractual assumptions of liability to the recipient, for 385 | any liability that these contractual assumptions directly impose on 386 | those licensors and authors. 
387 | 388 | All other non-permissive additional terms are considered "further 389 | restrictions" within the meaning of section 10. If the Program as you 390 | received it, or any part of it, contains a notice stating that it is 391 | governed by this License along with a term that is a further 392 | restriction, you may remove that term. If a license document contains 393 | a further restriction but permits relicensing or conveying under this 394 | License, you may add to a covered work material governed by the terms 395 | of that license document, provided that the further restriction does 396 | not survive such relicensing or conveying. 397 | 398 | If you add terms to a covered work in accord with this section, you 399 | must place, in the relevant source files, a statement of the 400 | additional terms that apply to those files, or a notice indicating 401 | where to find the applicable terms. 402 | 403 | Additional terms, permissive or non-permissive, may be stated in the 404 | form of a separately written license, or stated as exceptions; 405 | the above requirements apply either way. 406 | 407 | 8. Termination. 408 | 409 | You may not propagate or modify a covered work except as expressly 410 | provided under this License. Any attempt otherwise to propagate or 411 | modify it is void, and will automatically terminate your rights under 412 | this License (including any patent licenses granted under the third 413 | paragraph of section 11). 414 | 415 | However, if you cease all violation of this License, then your 416 | license from a particular copyright holder is reinstated (a) 417 | provisionally, unless and until the copyright holder explicitly and 418 | finally terminates your license, and (b) permanently, if the copyright 419 | holder fails to notify you of the violation by some reasonable means 420 | prior to 60 days after the cessation. 421 | 422 | Moreover, your license from a particular copyright holder is 423 | reinstated permanently if the copyright holder notifies you of the 424 | violation by some reasonable means, this is the first time you have 425 | received notice of violation of this License (for any work) from that 426 | copyright holder, and you cure the violation prior to 30 days after 427 | your receipt of the notice. 428 | 429 | Termination of your rights under this section does not terminate the 430 | licenses of parties who have received copies or rights from you under 431 | this License. If your rights have been terminated and not permanently 432 | reinstated, you do not qualify to receive new licenses for the same 433 | material under section 10. 434 | 435 | 9. Acceptance Not Required for Having Copies. 436 | 437 | You are not required to accept this License in order to receive or 438 | run a copy of the Program. Ancillary propagation of a covered work 439 | occurring solely as a consequence of using peer-to-peer transmission 440 | to receive a copy likewise does not require acceptance. However, 441 | nothing other than this License grants you permission to propagate or 442 | modify any covered work. These actions infringe copyright if you do 443 | not accept this License. Therefore, by modifying or propagating a 444 | covered work, you indicate your acceptance of this License to do so. 445 | 446 | 10. Automatic Licensing of Downstream Recipients. 447 | 448 | Each time you convey a covered work, the recipient automatically 449 | receives a license from the original licensors, to run, modify and 450 | propagate that work, subject to this License. 
You are not responsible 451 | for enforcing compliance by third parties with this License. 452 | 453 | An "entity transaction" is a transaction transferring control of an 454 | organization, or substantially all assets of one, or subdividing an 455 | organization, or merging organizations. If propagation of a covered 456 | work results from an entity transaction, each party to that 457 | transaction who receives a copy of the work also receives whatever 458 | licenses to the work the party's predecessor in interest had or could 459 | give under the previous paragraph, plus a right to possession of the 460 | Corresponding Source of the work from the predecessor in interest, if 461 | the predecessor has it or can get it with reasonable efforts. 462 | 463 | You may not impose any further restrictions on the exercise of the 464 | rights granted or affirmed under this License. For example, you may 465 | not impose a license fee, royalty, or other charge for exercise of 466 | rights granted under this License, and you may not initiate litigation 467 | (including a cross-claim or counterclaim in a lawsuit) alleging that 468 | any patent claim is infringed by making, using, selling, offering for 469 | sale, or importing the Program or any portion of it. 470 | 471 | 11. Patents. 472 | 473 | A "contributor" is a copyright holder who authorizes use under this 474 | License of the Program or a work on which the Program is based. The 475 | work thus licensed is called the contributor's "contributor version". 476 | 477 | A contributor's "essential patent claims" are all patent claims 478 | owned or controlled by the contributor, whether already acquired or 479 | hereafter acquired, that would be infringed by some manner, permitted 480 | by this License, of making, using, or selling its contributor version, 481 | but do not include claims that would be infringed only as a 482 | consequence of further modification of the contributor version. For 483 | purposes of this definition, "control" includes the right to grant 484 | patent sublicenses in a manner consistent with the requirements of 485 | this License. 486 | 487 | Each contributor grants you a non-exclusive, worldwide, royalty-free 488 | patent license under the contributor's essential patent claims, to 489 | make, use, sell, offer for sale, import and otherwise run, modify and 490 | propagate the contents of its contributor version. 491 | 492 | In the following three paragraphs, a "patent license" is any express 493 | agreement or commitment, however denominated, not to enforce a patent 494 | (such as an express permission to practice a patent or covenant not to 495 | sue for patent infringement). To "grant" such a patent license to a 496 | party means to make such an agreement or commitment not to enforce a 497 | patent against the party. 498 | 499 | If you convey a covered work, knowingly relying on a patent license, 500 | and the Corresponding Source of the work is not available for anyone 501 | to copy, free of charge and under the terms of this License, through a 502 | publicly available network server or other readily accessible means, 503 | then you must either (1) cause the Corresponding Source to be so 504 | available, or (2) arrange to deprive yourself of the benefit of the 505 | patent license for this particular work, or (3) arrange, in a manner 506 | consistent with the requirements of this License, to extend the patent 507 | license to downstream recipients. 
"Knowingly relying" means you have 508 | actual knowledge that, but for the patent license, your conveying the 509 | covered work in a country, or your recipient's use of the covered work 510 | in a country, would infringe one or more identifiable patents in that 511 | country that you have reason to believe are valid. 512 | 513 | If, pursuant to or in connection with a single transaction or 514 | arrangement, you convey, or propagate by procuring conveyance of, a 515 | covered work, and grant a patent license to some of the parties 516 | receiving the covered work authorizing them to use, propagate, modify 517 | or convey a specific copy of the covered work, then the patent license 518 | you grant is automatically extended to all recipients of the covered 519 | work and works based on it. 520 | 521 | A patent license is "discriminatory" if it does not include within 522 | the scope of its coverage, prohibits the exercise of, or is 523 | conditioned on the non-exercise of one or more of the rights that are 524 | specifically granted under this License. You may not convey a covered 525 | work if you are a party to an arrangement with a third party that is 526 | in the business of distributing software, under which you make payment 527 | to the third party based on the extent of your activity of conveying 528 | the work, and under which the third party grants, to any of the 529 | parties who would receive the covered work from you, a discriminatory 530 | patent license (a) in connection with copies of the covered work 531 | conveyed by you (or copies made from those copies), or (b) primarily 532 | for and in connection with specific products or compilations that 533 | contain the covered work, unless you entered into that arrangement, 534 | or that patent license was granted, prior to 28 March 2007. 535 | 536 | Nothing in this License shall be construed as excluding or limiting 537 | any implied license or other defenses to infringement that may 538 | otherwise be available to you under applicable patent law. 539 | 540 | 12. No Surrender of Others' Freedom. 541 | 542 | If conditions are imposed on you (whether by court order, agreement or 543 | otherwise) that contradict the conditions of this License, they do not 544 | excuse you from the conditions of this License. If you cannot convey a 545 | covered work so as to satisfy simultaneously your obligations under this 546 | License and any other pertinent obligations, then as a consequence you may 547 | not convey it at all. For example, if you agree to terms that obligate you 548 | to collect a royalty for further conveying from those to whom you convey 549 | the Program, the only way you could satisfy both those terms and this 550 | License would be to refrain entirely from conveying the Program. 551 | 552 | 13. Use with the GNU Affero General Public License. 553 | 554 | Notwithstanding any other provision of this License, you have 555 | permission to link or combine any covered work with a work licensed 556 | under version 3 of the GNU Affero General Public License into a single 557 | combined work, and to convey the resulting work. The terms of this 558 | License will continue to apply to the part which is the covered work, 559 | but the special requirements of the GNU Affero General Public License, 560 | section 13, concerning interaction through a network will apply to the 561 | combination as such. 562 | 563 | 14. Revised Versions of this License. 
564 | 565 | The Free Software Foundation may publish revised and/or new versions of 566 | the GNU General Public License from time to time. Such new versions will 567 | be similar in spirit to the present version, but may differ in detail to 568 | address new problems or concerns. 569 | 570 | Each version is given a distinguishing version number. If the 571 | Program specifies that a certain numbered version of the GNU General 572 | Public License "or any later version" applies to it, you have the 573 | option of following the terms and conditions either of that numbered 574 | version or of any later version published by the Free Software 575 | Foundation. If the Program does not specify a version number of the 576 | GNU General Public License, you may choose any version ever published 577 | by the Free Software Foundation. 578 | 579 | If the Program specifies that a proxy can decide which future 580 | versions of the GNU General Public License can be used, that proxy's 581 | public statement of acceptance of a version permanently authorizes you 582 | to choose that version for the Program. 583 | 584 | Later license versions may give you additional or different 585 | permissions. However, no additional obligations are imposed on any 586 | author or copyright holder as a result of your choosing to follow a 587 | later version. 588 | 589 | 15. Disclaimer of Warranty. 590 | 591 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 592 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT 593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY 594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, 595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 596 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM 597 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF 598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 599 | 600 | 16. Limitation of Liability. 601 | 602 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS 604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY 605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE 606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF 607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD 608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), 609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF 610 | SUCH DAMAGES. 611 | 612 | 17. Interpretation of Sections 15 and 16. 613 | 614 | If the disclaimer of warranty and limitation of liability provided 615 | above cannot be given local legal effect according to their terms, 616 | reviewing courts shall apply local law that most closely approximates 617 | an absolute waiver of all civil liability in connection with the 618 | Program, unless a warranty or assumption of liability accompanies a 619 | copy of the Program in return for a fee. 620 | 621 | END OF TERMS AND CONDITIONS 622 | 623 | How to Apply These Terms to Your New Programs 624 | 625 | If you develop a new program, and you want it to be of the greatest 626 | possible use to the public, the best way to achieve this is to make it 627 | free software which everyone can redistribute and change under these terms. 
628 | 629 | To do so, attach the following notices to the program. It is safest 630 | to attach them to the start of each source file to most effectively 631 | state the exclusion of warranty; and each file should have at least 632 | the "copyright" line and a pointer to where the full notice is found. 633 | 634 | 635 | Copyright (C) 636 | 637 | This program is free software: you can redistribute it and/or modify 638 | it under the terms of the GNU General Public License as published by 639 | the Free Software Foundation, either version 3 of the License, or 640 | (at your option) any later version. 641 | 642 | This program is distributed in the hope that it will be useful, 643 | but WITHOUT ANY WARRANTY; without even the implied warranty of 644 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 645 | GNU General Public License for more details. 646 | 647 | You should have received a copy of the GNU General Public License 648 | along with this program. If not, see . 649 | 650 | Also add information on how to contact you by electronic and paper mail. 651 | 652 | If the program does terminal interaction, make it output a short 653 | notice like this when it starts in an interactive mode: 654 | 655 | Copyright (C) 656 | This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 657 | This is free software, and you are welcome to redistribute it 658 | under certain conditions; type `show c' for details. 659 | 660 | The hypothetical commands `show w' and `show c' should show the appropriate 661 | parts of the General Public License. Of course, your program's commands 662 | might be different; for a GUI interface, you would use an "about box". 663 | 664 | You should also get your employer (if you work as a programmer) or school, 665 | if any, to sign a "copyright disclaimer" for the program, if necessary. 666 | For more information on this, and how to apply and follow the GNU GPL, see 667 | . 668 | 669 | The GNU General Public License does not permit incorporating your program 670 | into proprietary programs. If your program is a subroutine library, you 671 | may consider it more useful to permit linking proprietary applications with 672 | the library. If this is what you want to do, use the GNU Lesser General 673 | Public License instead of this License. But first, please read 674 | . 675 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CNNs for Loop-Closure Detection 2 | The following is based on the methodology proposed in "*Loop closure detection for 3 | visual SLAM systems using convolutional neural network*" (see citation below). Various 4 | CNN architectures are available for method evaluation on the Oxford *New College* and 5 | *City Centre* datasets. The code can easily be extended for additional datasets and 6 | CNNs. 7 | 8 |

9 | ![simplot_city_overfeat_1](paper/simplot_city_overfeat_1.png) 10 |

11 | 12 | 13 | X. Zhang, Y. Su and X. Zhu, "Loop closure detection for visual SLAM systems using convolutional neural network," 2017 23rd International Conference on Automation and Computing (ICAC), Huddersfield, 2017, pp. 1-6. 14 | doi: 10.23919/IConAC.2017.8082072 15 | 16 | # Usage 17 | *NOTE: The Overfeat model is known not to work as it can give the same output for all inputs (see [this issue][3]). I do not intend 18 | to fix this, but I am open to integrating a PR. The TF slim models still work.* 19 | 20 | The main script is `cnn_lcd.py` and offers the following options. 21 | ``` 22 | python cnn_lcd.py --help 23 | usage: cnn_lcd.py [-h] [--dataset DATASET] [--overfeat OVERFEAT] 24 | [--weights_dir WEIGHTS_DIR] [--weights_base WEIGHTS_BASE] 25 | [--layer LAYER] [--plot_gt] [--cluster] [--sweep_median] 26 | [--debug] 27 | model 28 | 29 | CNNs for loop-closure detection. 30 | 31 | positional arguments: 32 | model Model name: [overfeat, inception_v{1,2,3,4}, nasnet, 33 | resnet_v2_152] 34 | 35 | optional arguments: 36 | -h, --help show this help message and exit 37 | --dataset DATASET Either "city" or "college". 38 | --overfeat OVERFEAT 0 for small network, 1 for large 39 | --weights_dir WEIGHTS_DIR 40 | Weights directory. 41 | --weights_base WEIGHTS_BASE 42 | Basename of weights file. 43 | --layer LAYER Layer number to extract features from. 44 | --plot_gt Plots heat-map of ground truth and exits 45 | --cluster Additionally performs clustering on sim matrix. 46 | --sweep_median Sweep median filter size values. 47 | --debug Use small number of images to debug code 48 | ``` 49 | Note that you need to run the script with python2 to use the *OverFeat* model, and 50 | python3 to use the TensorFlow models. 51 | 52 | 53 | # Installation/Requirements 54 | The following Python packages are needed (installed with pip, conda, etc.): 55 | * tensorflow 56 | * numpy 57 | * scipy 58 | * skimage 59 | * matplotlib 60 | * sklearn 61 | * requests 62 | 63 | Additionally, in order to use the *Overfeat* model, you'll need to install 64 | the Python API provided [here][1]. The GPU version should ideally be installed, but 65 | the authors only provide the source for the CPU version. *OverFeat* also has a 66 | TensorFlow implementation, but does not offer pre-trained checkpoint files. The 67 | repository provides package installation instructions. 68 | 69 | In order to use the TensorFlow models (everything but *OverFeat*), you will need 70 | to clone the [TensorFlow Slim model repository][2]. The easiest way to do so is 71 | to clone both this repository and the TensorFlow models repository to the same 72 | directory: 73 | 74 | ``` 75 | git clone https://github.com/tensorflow/models/ 76 | git clone 77 | ``` 78 | Directory structure should look like this: 79 | ``` 80 | + models/ 81 | |--- ... 82 | |--- slim 83 | |--- ... 84 | | cnn_lcd.py 85 | | ... 86 | ``` 87 | 88 | 89 | [1]: https://github.com/sermanet/OverFeat 90 | [2]: https://github.com/tensorflow/models/tree/master/research/slim 91 | [3]: https://github.com/sermanet/OverFeat/issues/39 92 | -------------------------------------------------------------------------------- /cnn_lcd.py: -------------------------------------------------------------------------------- 1 | # ===================================================================== 2 | # cnn_lcd.py - CNNs for loop-closure detection in vSLAM systems. 
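#
# Example invocation (illustrative; assumes the TF-Slim model repository and the
# chosen dataset are set up as described in the README):
#   python3 cnn_lcd.py inception_v3 --dataset city --sweep_median
#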
3 | # Copyright (C) 2018 Zach Carmichael 4 | # 5 | # This program is free software: you can redistribute it and/or modify 6 | # it under the terms of the GNU General Public License as published by 7 | # the Free Software Foundation, either version 3 of the License, or 8 | # (at your option) any later version. 9 | # 10 | # This program is distributed in the hope that it will be useful, 11 | # but WITHOUT ANY WARRANTY; without even the implied warranty of 12 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 13 | # GNU General Public License for more details. 14 | # 15 | # You should have received a copy of the GNU General Public License 16 | # along with this program. If not, see . 17 | # ===================================================================== 18 | from __future__ import (print_function, division, unicode_literals, 19 | absolute_import) 20 | 21 | import argparse 22 | import os 23 | import sys 24 | 25 | import numpy as np 26 | from scipy.signal import medfilt 27 | from sklearn.metrics import (precision_recall_curve, average_precision_score, 28 | precision_score, recall_score) 29 | from sklearn.cluster import KMeans 30 | 31 | # local imports 32 | from cnn_models import get_model_features, is_valid_model 33 | from dataset import get_dataset 34 | 35 | import matplotlib.pyplot as plt 36 | if 'DISPLAY' not in os.environ.keys(): 37 | import matplotlib as mpl 38 | 39 | mpl.use('Agg') 40 | del mpl 41 | 42 | plt.rcParams.update({'font.size': 12, 43 | 'font.family': 'Times New Roman'}) 44 | 45 | 46 | def get_descriptors(imgs, feat_func, pca=True, pca_dim=500, eps=1e-5, 47 | cache=None, name=''): 48 | """ Returns feature descriptor vector for given image(s). Method follows 49 | procedure adapted from Zhang et al.: 50 | 'Loop Closure Detection for Visual SLAM Systems Using Convolutional 51 | Neural Network' 52 | 53 | Args: 54 | imgs: Iterable of images each of shape (h,w,c) 55 | feat_func: Function that takes imgs and cache as arguments, and 56 | returns CNN features and cache 57 | pca: Whether to perform PCA (and whitening) 58 | pca_dim: Dimension to reduce vectors to 59 | eps: Small value to prevent division by 0 60 | cache: Dict containing descs/other cached values 61 | name: Name used for caching 62 | """ 63 | if cache is None: 64 | cache = {} 65 | if name + 'pdescs' not in cache: 66 | # Get features from network 67 | descs, cache = feat_func(imgs, cache) 68 | # Ensure features as vectors 69 | descs = descs.reshape(len(imgs), -1) 70 | print(descs.shape) 71 | # L2 Norm 72 | descs = descs / np.linalg.norm(descs, axis=1)[:, None] 73 | cache.update({name + 'pdescs': descs}) 74 | else: 75 | descs = cache[name + 'pdescs'] 76 | if pca: 77 | print('Performing PCA with pca_dim={}'.format(pca_dim)) 78 | descs, cache = pca_red(descs, pca_dim, eps=eps, whiten=True, 79 | cache=cache, name=name) 80 | print('PCA done.') 81 | return descs, cache 82 | 83 | 84 | def pca_red(descs, dim, eps=1e-5, whiten=True, cache=None, name=''): 85 | """ Performs PCA + whitening on image descriptors 86 | 87 | Args: 88 | descs: input matrix of image descriptors 89 | dim: the number of principal components to reduce descs to 90 | eps: small epsilon to avoid 0-division 91 | whiten: whether to whiten the principal components 92 | cache: PCA cache (see name parameter) 93 | name: used to differentiate different cached value between models 94 | 95 | Returns: 96 | descs: the descs post-reduction 97 | cache: the (updated) cache 98 | """ 99 | if cache is None: 100 | cache = {} 101 | # Zero-center data 102 | dmean = 
descs.mean(axis=0) 103 | descs = descs - dmean 104 | if name + 'S' not in cache or name + 'U' not in cache: 105 | # Compute covariance matrix 106 | cov = descs.T.dot(descs) / (descs.shape[0] - 1) 107 | # Apply SVD 108 | U, S, W = np.linalg.svd(cov) 109 | cache.update({name + 'U': U, name + 'S': S}) 110 | else: 111 | U, S = cache[name + 'U'], cache[name + 'S'] 112 | # Project onto principal axes 113 | descs = descs.dot(U[:, :dim]) 114 | # Whiten 115 | if whiten: 116 | descs = descs / np.sqrt(S[:dim] + eps) 117 | return descs, cache 118 | 119 | 120 | def cluster_kmeans(sim): 121 | """Run k-means on similarity matrix and segment""" 122 | sim_dim = sim.shape[0] 123 | sim = sim.reshape(-1, 1) 124 | 125 | # Augment with spatial coordinates 126 | sim_aug = np.concatenate( 127 | [sim, 128 | np.mgrid[:sim_dim, :sim_dim].reshape(-1, sim_dim ** 2).T], 129 | axis=1 130 | ) 131 | 132 | # Empirical metric for number of loop-closures given number of images 133 | # in sequence (assumption: equally-spaced samples): 134 | n_clusters = int(np.sqrt(sim_dim)) 135 | print('Performing clustering via KMeans(n={}).'.format(n_clusters)) 136 | 137 | km = KMeans(n_clusters=n_clusters, n_jobs=2, 138 | max_iter=300) 139 | labels = km.fit_predict(sim_aug) 140 | print('Got cluster labels') 141 | 142 | for i in range(n_clusters): 143 | lab_idx = (labels == i) 144 | if lab_idx.size: 145 | cc = sim[lab_idx].mean() 146 | # cc = sim[lab_idx].max() 147 | sim[lab_idx] = cc 148 | 149 | # Re-normalize and reshape 150 | sim = sim.reshape(sim_dim, sim_dim) / sim.max() 151 | return sim 152 | 153 | 154 | def median_filter(sim, gt, k_size=None): 155 | """ Apply median filtering and tune kernel size if applicable. 156 | 157 | Args: 158 | sim: The similarity matrix 159 | gt: The ground truth matrix 160 | k_size: The square kernel size 161 | 162 | Returns: 163 | sim: filtered similarity matrix 164 | """ 165 | # NOTE: only lower triangular part of matrix actually requires filtering 166 | tri_idx = np.tril_indices(gt.shape[0], -1) 167 | 168 | if k_size is None: 169 | print('Sweeping median kernel sizes.') 170 | best_ks = None 171 | # Compute baseline AP (no median filtering) 172 | best_ap = average_precision_score(gt[tri_idx], sim[tri_idx]) 173 | best_sim = sim 174 | k_sizes = list(range(1, 61, 2)) 175 | # Compute similarity matrix 176 | for ks in k_sizes: 177 | sim_filtered = medfilt(sim, kernel_size=ks) 178 | # Re-normalize 179 | sim_filtered = sim_filtered / sim_filtered.max() 180 | ks_ap = average_precision_score(gt[tri_idx], sim_filtered[tri_idx]) 181 | if ks_ap > best_ap: 182 | best_ks = ks 183 | best_ap = ks_ap 184 | best_sim = sim_filtered 185 | print('Finished with ks={} (AP={}). Best so far={}'.format(ks, ks_ap, best_ks)) 186 | print('Best ks={} yielded an AP of {}%.'.format(best_ks, best_ap * 100)) 187 | sim = best_sim 188 | else: 189 | print('Filtering with median kernel with kernel size {}.'.format(k_size)) 190 | sim = medfilt(sim, kernel_size=k_size) 191 | 192 | # Re-normalize 193 | sim = sim / sim.max() 194 | ap = average_precision_score(gt[tri_idx], sim[tri_idx]) 195 | print('Median filter with ks={} on sim yielded an AP of {}%.'.format(k_size, ap * 100)) 196 | 197 | return sim 198 | 199 | 200 | def similarity_matrix(descs, gt, median=True, cluster=False, plot=False, 201 | k_size=None, name=''): 202 | """ Compute pairwise similarity between descriptors. Using provided gt to find best 203 | parameters given function args. 
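    Each pair of (L2-normalized) descriptors d_i, d_j is scored as

        sim[i, j] = 1 - ||d_i - d_j||_2 / max_{k,l} ||d_k - d_l||_2

    so sim lies in [0, 1], with 1 meaning identical descriptors.

    Minimal usage sketch (illustrative shapes, random data; filtering,
    clustering and plotting disabled):

        descs = np.random.rand(50, 500)   # 50 images, 500-D descriptors
        gt = np.zeros((50, 50))           # placeholder ground-truth matrix
        sim = similarity_matrix(descs, gt, median=False, plot=False)
        # sim has shape (50, 50) with values in [0, 1]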
204 | 205 | Args: 206 | descs: feature descriptors of shape (n, d) 207 | gt: the ground truth 208 | median: whether to use median filtering (chooses median value that obtains 209 | highest avg precision... 210 | cluster: whether to cluster 211 | plot: whether to plot matrix 212 | k_size: specify None to sweep, otherwise the value to use 213 | name: name for plot file+cache 214 | """ 215 | print('Computing similarity matrix...') 216 | n = descs.shape[0] 217 | diffs = np.zeros((n, n)) 218 | 219 | # Compute L2 norm of each vector 220 | norms = np.linalg.norm(descs, axis=1) 221 | descs_norm = descs / norms[:, None] 222 | 223 | # Compute similarity of every vector with every vector 224 | for i, desc in enumerate(descs): 225 | # Compute difference 226 | diff = np.linalg.norm(descs_norm - descs_norm[i], axis=1) 227 | diffs[i] = diff 228 | 229 | # Compute max difference 230 | dmax = diffs.max() 231 | 232 | # Normalize difference and create sim matrix 233 | sim = 1. - (diffs / dmax) 234 | assert gt.shape[0] == sim.shape[0] 235 | 236 | if cluster: 237 | sim = cluster_kmeans(sim) 238 | 239 | if median: 240 | sim = median_filter(sim, gt, k_size=k_size) 241 | 242 | if plot: 243 | f, ax = plt.subplots() 244 | cax = ax.imshow(sim, cmap='coolwarm', interpolation='nearest', 245 | vmin=0., vmax=1.) 246 | cbar = f.colorbar(cax, ticks=[0, 0.5, 1]) 247 | cbar.ax.set_yticklabels(['0', '0.5', '1']) 248 | plt.savefig('simplot_{}.png'.format(name), format='png', dpi=150) 249 | plt.show() 250 | 251 | # Preprocess gt... 252 | gt = gt.copy() 253 | gt += gt.T # add transpose 254 | gt += np.eye(gt.shape[0], dtype=gt.dtype) 255 | 256 | # Plot 257 | f, ax = plt.subplots(1, 2) 258 | ax[0].imshow(sim, cmap='coolwarm', interpolation='nearest', 259 | vmin=0., vmax=1.) 260 | ax[0].set_axis_off() 261 | ax[0].set_title('Similarity Matrix') 262 | ax[1].imshow(gt, cmap='gray', interpolation='nearest', 263 | vmin=0., vmax=1.) 264 | ax[1].set_axis_off() 265 | ax[1].set_title('Ground Truth') 266 | plt.savefig('simplot_w_gt_{}.png'.format(name), format='png', dpi=150) 267 | plt.show() 268 | 269 | return sim 270 | 271 | 272 | def mean_per_class_accuracy(y_true, y_pred, n_classes=None, labels=None): 273 | """ Computes mean per-class accuracy 274 | 275 | Args: 276 | y_true: the true labels 277 | y_pred: the predicted labels 278 | n_classes: the number of classes, optional. If not provided, the number of 279 | unique classes or length of `labels` if provided. 280 | labels: the unique labels, optional. If not provided, unique labels are used 281 | if `n_classes` not provided, otherwise range(n_classes). 282 | 283 | Returns: 284 | mean per-class accuracy 285 | """ 286 | if n_classes is None: 287 | if labels is None: 288 | labels = np.unique(y_true) 289 | n_classes = len(labels) 290 | elif labels is None: 291 | labels = np.arange(n_classes) 292 | elif len(labels) != n_classes: 293 | raise ValueError('Number of classes specified ({}) differs from ' 294 | 'number of labels ({}).'.format(n_classes, len(labels))) 295 | acc = 0. 296 | for c in labels: 297 | c_mask = (y_true == c) 298 | c_count = c_mask.sum() 299 | if c_count: # Avoid division by 0 300 | # Add accuracy for class c 301 | acc += np.logical_and(c_mask, (y_pred == c)).sum() / c_count 302 | # Mean accuracy per class 303 | return acc / n_classes 304 | 305 | 306 | def compute_and_plot_scores(sim, gt, model_name): 307 | """ Computes relevant metrics and plots results. 
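    Specifically: the lower-triangular entries of `sim` and `gt` are flattened,
    a precision-recall curve and average precision (AP) are computed, a sweep
    over thresholds in [0, 1] picks the one maximizing mean per-class accuracy
    (MPC-ACC), and the resulting PR curve is saved as a PNG.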
308 | 309 | Args: 310 | sim: Similarity matrix 311 | gt: Ground truth matrix 312 | model_name: Name of the model for logging 313 | """ 314 | # Modify sim matrix to get "real" vector of loop-closures 315 | # symmetric matrix, take either diagonal matrix, rid diagonal 316 | sim = sim[np.tril_indices(sim.shape[0], -1)] 317 | 318 | # Ground truth only present in lower diagonal for Oxford datasets 319 | gt = gt[np.tril_indices(gt.shape[0], -1)] 320 | 321 | # Compute PR-curve 322 | precision, recall, thresholds = precision_recall_curve(gt, sim) 323 | average_precision = average_precision_score(gt, sim) 324 | print('Average Precision: {}'.format(average_precision)) 325 | 326 | best_macc = 0. 327 | best_mthresh = None 328 | # Compute the best MPC-accuracy at hard-coded thresholds 329 | thresholds = np.arange(0, 1.02, 0.02) 330 | for thresh in thresholds: 331 | sim_thresh = np.zeros_like(sim) 332 | sim_thresh[sim >= thresh] = 1 333 | macc = mean_per_class_accuracy(gt, sim_thresh, n_classes=2) 334 | if macc > best_macc: 335 | best_macc = macc 336 | best_mthresh = thresh 337 | 338 | sim_mthresh = np.zeros_like(sim) 339 | sim_mthresh[sim >= best_mthresh] = 1 340 | precision_at_mthresh = precision_score(gt, sim_mthresh) 341 | recall_at_mthresh = recall_score(gt, sim_mthresh) 342 | print('Best MPC-ACC (thresh={}): {}'.format(best_mthresh, best_macc)) 343 | print('Precision (thresh={}): {}'.format(best_mthresh, precision_at_mthresh)) 344 | print('Recall (thresh={}): {}'.format(best_mthresh, recall_at_mthresh)) 345 | 346 | plt.step(recall, precision, color='b', alpha=0.2, 347 | where='post') 348 | plt.fill_between(recall, precision, step='post', alpha=0.2, 349 | color='b') 350 | plt.xlabel('Recall') 351 | plt.ylabel('Precision') 352 | plt.ylim([0.0, 1.05]) 353 | plt.xlim([0.0, 1.0]) 354 | plt.title('2-class Precision-Recall curve: AP={0:0.3f}'.format( 355 | average_precision)) 356 | plt.savefig('precision-recall_curve_{}.png'.format(model_name), 357 | format='png', dpi=150) 358 | plt.show() 359 | 360 | 361 | def main(args): 362 | model_name = args.model 363 | 364 | # Check specified model 365 | if is_valid_model(model_name): 366 | # Create weights path 367 | weights_path = os.path.join(args.weights_dir, args.weights_base) 368 | weights_path = weights_path.format(args.overfeat) 369 | # Create feature function 370 | feat_func = lambda _imgs, _cache: get_model_features(_imgs, model_name, 371 | overfeat_weights_path=weights_path, 372 | overfeat_typ=args.overfeat, layer=args.layer, 373 | cache=_cache) 374 | else: 375 | print('Unknown model type: {}'.format(model_name)) 376 | sys.exit(1) 377 | 378 | # Load dataset 379 | imgs, gt = get_dataset(args.dataset, args.debug) 380 | if args.plot_gt: 381 | plt.figure() 382 | plt.imshow(gt, cmap='gray', interpolation='nearest') 383 | plt.savefig('{}_gt_plot.png'.format(args.dataset), format='png', dpi=150) 384 | plt.show() 385 | sys.exit(0) 386 | 387 | # Compute feature descriptors 388 | descs, cache = get_descriptors(imgs, feat_func, pca=True, pca_dim=500, 389 | eps=1e-5, cache=None, name=model_name) 390 | # Kernel sizes for median filter 391 | if args.sweep_median: 392 | k_size = None 393 | elif args.dataset.lower() == 'city': 394 | k_size = 17 # BEST HARD-CODED PARAMETER FROM SWEEP: ```range(1,61,2)``` 395 | elif args.dataset.lower() == 'college': 396 | k_size = 11 # BEST HARD-CODED PARAMETER FROM SWEEP: ```range(1,61,2)``` 397 | else: 398 | k_size = None # SWEEP 399 | 400 | # Compute similarity matrix 401 | sim = similarity_matrix(descs, gt, plot=True, 
cluster=args.cluster, median=True, 402 | k_size=k_size, name='_'.join([args.dataset, model_name])) 403 | 404 | assert sim.shape == gt.shape, 'sim and gt not the same shape: {} != {}'.format(sim.shape, gt.shape) 405 | 406 | compute_and_plot_scores(sim, gt, model_name) 407 | 408 | 409 | if __name__ == '__main__': 410 | # Parse CLI args 411 | parser = argparse.ArgumentParser(description='CNNs for loop-closure ' 412 | 'detection.') 413 | parser.add_argument('model', type=str, 414 | help='Model name: [overfeat, inception_v{1,2,3,4}, nasnet, resnet_v2_152]') 415 | parser.add_argument('--dataset', type=str, 416 | help='Either "city" or "college".', default='city') 417 | parser.add_argument('--overfeat', type=int, 418 | help='0 for small network, 1 for large', default=1) 419 | parser.add_argument('--weights_dir', type=str, default='OverFeat/data/default', 420 | help='Weights directory.') 421 | parser.add_argument('--weights_base', type=str, default='net_weight_{}', 422 | help='Basename of weights file.') 423 | parser.add_argument('--layer', type=int, default=None, 424 | help='Layer number to extract features from.') 425 | parser.add_argument('--plot_gt', action='store_true', 426 | help='Plots heat-map of ground truth and exits') 427 | parser.add_argument('--cluster', action='store_true', 428 | help='Additionally performs clustering on sim matrix.') 429 | parser.add_argument('--sweep_median', action='store_true', 430 | help='Sweep median filter size values.') 431 | parser.add_argument('--debug', action='store_true', 432 | help='Use small number of images to debug code') 433 | args = parser.parse_args() 434 | 435 | # Start program 436 | main(args) 437 | -------------------------------------------------------------------------------- /cnn_models.py: -------------------------------------------------------------------------------- 1 | # ===================================================================== 2 | # cnn_models.py - CNNs for loop-closure detection in vSLAM systems. 3 | # Copyright (C) 2018 Zach Carmichael 4 | # 5 | # This program is free software: you can redistribute it and/or modify 6 | # it under the terms of the GNU General Public License as published by 7 | # the Free Software Foundation, either version 3 of the License, or 8 | # (at your option) any later version. 9 | # 10 | # This program is distributed in the hope that it will be useful, 11 | # but WITHOUT ANY WARRANTY; without even the implied warranty of 12 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 13 | # GNU General Public License for more details. 14 | # 15 | # You should have received a copy of the GNU General Public License 16 | # along with this program. If not, see . 
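The argparse block at the end of cnn_lcd.py above maps one-to-one onto the `args` namespace consumed by `main`. Below is a hedged sketch of the programmatic equivalent of `python3 cnn_lcd.py inception_v3 --dataset college --debug`, with every field taken from the parser's defaults; importing cnn_lcd as a module (and having its TensorFlow/scikit-learn dependencies installed) is an assumption:

```python
import argparse
from cnn_lcd import main  # assumes cnn_lcd.py imports cleanly from the repo root

args = argparse.Namespace(
    model='inception_v3', dataset='college', overfeat=1,
    weights_dir='OverFeat/data/default', weights_base='net_weight_{}',
    layer=None, plot_gt=False, cluster=False, sweep_median=False, debug=True)
main(args)  # downloads data/checkpoints on first use, then plots the sim/GT matrices and PR curve
```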
17 | # ===================================================================== 18 | from skimage.transform import resize as skresize 19 | import numpy as np 20 | 21 | import sys 22 | import os 23 | import tarfile 24 | 25 | # local imports 26 | from dataset import download_file 27 | 28 | if sys.version_info.major == 2: # .major requires 2.7+ 29 | print('Python 2 detected: OverFeat-only mode.') 30 | # OverFeat Architecture 31 | import overfeat 32 | elif sys.version_info.major == 3: 33 | print('Python 3 detected: OverFeat unavailable.') 34 | # Add local library to system path 35 | sys.path.append(os.path.join('models', 'research', 'slim')) 36 | 37 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1' 38 | 39 | # TF Slim Models 40 | from models.research.slim.preprocessing.preprocessing_factory import get_preprocessing 41 | from models.research.slim.nets.nets_factory import get_network_fn 42 | import tensorflow as tf # For TF models 43 | 44 | slim = tf.contrib.slim 45 | else: 46 | raise Exception('how: {}'.format(sys.version_info.major)) 47 | 48 | # === MODEL VARS === 49 | # Model checkpoint directory 50 | CKPT_DIR = 'tf_ckpts' 51 | # Inception V1 52 | INCEPTION_V1_URL = 'http://download.tensorflow.org/models/inception_v1_2016_08_28.tar.gz' 53 | INCEPTION_V1_PATH = os.path.join(CKPT_DIR, 'inception_v1_2016_08_28.tar.gz') 54 | INCEPTION_V1_CKPT = os.path.join(CKPT_DIR, 'inception_v1.ckpt') 55 | # Inception V2 56 | INCEPTION_V2_URL = 'http://download.tensorflow.org/models/inception_v2_2016_08_28.tar.gz' 57 | INCEPTION_V2_PATH = os.path.join(CKPT_DIR, 'inception_v2_2016_08_28.tar.gz') 58 | INCEPTION_V2_CKPT = os.path.join(CKPT_DIR, 'inception_v2.ckpt') 59 | # Inception V3 60 | INCEPTION_V3_URL = 'http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz' 61 | INCEPTION_V3_PATH = os.path.join(CKPT_DIR, 'inception_v3_2016_08_28.tar.gz') 62 | INCEPTION_V3_CKPT = os.path.join(CKPT_DIR, 'inception_v3.ckpt') 63 | # Inception V4 64 | INCEPTION_V4_URL = 'http://download.tensorflow.org/models/inception_v4_2016_09_09.tar.gz' 65 | INCEPTION_V4_PATH = os.path.join(CKPT_DIR, 'inception_v4_2016_09_09.tar.gz') 66 | INCEPTION_V4_CKPT = os.path.join(CKPT_DIR, 'inception_v4.ckpt') 67 | # NASNet 68 | NASNET_URL = 'https://storage.googleapis.com/download.tensorflow.org/models/nasnet-a_large_04_10_2017.tar.gz' 69 | NASNET_PATH = os.path.join(CKPT_DIR, 'nasnet-a_large_04_10_2017.tar.gz') 70 | NASNET_CKPT = os.path.join(CKPT_DIR, 'model.ckpt') 71 | NASNET_CKPT_FULL = os.path.join(CKPT_DIR, 'model.ckpt.data-00000-of-00001') 72 | # ResNet V2 152 73 | RESNET_V2_152_URL = 'http://download.tensorflow.org/models/resnet_v2_152_2017_04_14.tar.gz' 74 | RESNET_V2_152_PATH = os.path.join(CKPT_DIR, 'resnet_v2_152_2017_04_14.tar.gz') 75 | RESNET_V2_152_CKPT = os.path.join(CKPT_DIR, 'resnet_v2_152.ckpt') 76 | # === MODEL INFO === 77 | DEFAULT_FEATURE_LAYER = { 78 | 'inception_v1': 'InceptionV1/Logits/AvgPool_0a_7x7/AvgPool:0', 79 | 'inception_v2': 'InceptionV2/Logits/AvgPool_1a_7x7/AvgPool:0', 80 | 'inception_v3': 'InceptionV3/Logits/AvgPool_1a_8x8/AvgPool:0', 81 | 'inception_v4': 'InceptionV4/Logits/AvgPool_1a/AvgPool:0', 82 | 'nasnet_large': 'final_layer/Mean:0', 83 | 'resnet_v2_152': 'resnet_v2_152/pool5:0', 84 | 'overfeat_0': 19, 85 | 'overfeat_1': 20 86 | } 87 | MODEL_PARAMS_NAME = { 88 | 'inception_v1': 'InceptionV1', 89 | 'inception_v2': 'InceptionV2', 90 | 'inception_v3': 'InceptionV3', 91 | 'inception_v4': 'InceptionV4', 92 | 'nasnet_large': None, 93 | 'resnet_v2_152': 'resnet_v2_152', 94 | 'overfeat_0': None, 95 | 
'overfeat_1': None 96 | } 97 | MODEL_CKPT_PATHS = { # Slim-only 98 | 'inception_v1': [INCEPTION_V1_CKPT, INCEPTION_V1_URL, INCEPTION_V1_PATH], 99 | 'inception_v2': [INCEPTION_V2_CKPT, INCEPTION_V2_URL, INCEPTION_V2_PATH], 100 | 'inception_v3': [INCEPTION_V3_CKPT, INCEPTION_V3_URL, INCEPTION_V3_PATH], 101 | 'inception_v4': [INCEPTION_V4_CKPT, INCEPTION_V4_URL, INCEPTION_V4_PATH], 102 | 'nasnet_large': [(NASNET_CKPT_FULL, NASNET_CKPT), NASNET_URL, NASNET_PATH], 103 | 'resnet_v2_152': [RESNET_V2_152_CKPT, RESNET_V2_152_URL, RESNET_V2_152_PATH] 104 | } 105 | MODEL_ALIASES = { 106 | 'nasnet': 'nasnet_large', 107 | 'overfeat': 'overfeat_1' 108 | } 109 | 110 | 111 | def get_ckpt(url, dl_dest): 112 | """Downloads and extracts model checkpoint file.""" 113 | download_file(url, dl_dest) 114 | with tarfile.open(dl_dest) as tar: 115 | def is_within_directory(directory, target): 116 | 117 | abs_directory = os.path.abspath(directory) 118 | abs_target = os.path.abspath(target) 119 | 120 | prefix = os.path.commonprefix([abs_directory, abs_target]) 121 | 122 | return prefix == abs_directory 123 | 124 | def safe_extract(tar, path=".", members=None, *, numeric_owner=False): 125 | 126 | for member in tar.getmembers(): 127 | member_path = os.path.join(path, member.name) 128 | if not is_within_directory(path, member_path): 129 | raise Exception("Attempted Path Traversal in Tar File") 130 | 131 | tar.extractall(path, members, numeric_owner=numeric_owner) 132 | 133 | 134 | safe_extract(tar, path=CKPT_DIR) 135 | 136 | 137 | def _capitalize(s): 138 | s = s.split('_') 139 | for i, ss in enumerate(s): 140 | if len(ss): 141 | ss = ss[0].upper() + ss[1:] 142 | s[i] = ss 143 | return ' '.join(s) 144 | 145 | 146 | def _resolve_alias(name): 147 | name = name.lower() 148 | return MODEL_ALIASES.get(name, name) 149 | 150 | 151 | def is_valid_model(model_name): 152 | if sys.version_info.major == 2 and model_name != 'overfeat': 153 | print('Python 3 needed for TF Slim model (execute using python3 not python2...)') 154 | sys.exit(1) 155 | 156 | if model_name[:len('overfeat')] == 'overfeat': 157 | if sys.version_info.major == 3: 158 | print('Python 2 needed for overfeat model (execute using python2 not python3...)') 159 | sys.exit(1) 160 | 161 | return _resolve_alias(model_name) in DEFAULT_FEATURE_LAYER 162 | 163 | 164 | def valid_model_names(): 165 | return DEFAULT_FEATURE_LAYER.keys() 166 | 167 | 168 | def is_tf_model(model_name): 169 | return _resolve_alias(model_name) in MODEL_CKPT_PATHS 170 | 171 | 172 | def overfeat_preprocess(img, resize): 173 | # Ensure single-precision 174 | img = img 175 | # Crop and resize image 176 | h0, w0 = img.shape[:2] 177 | # Compute crop indices 178 | d0 = min(h0, w0) 179 | hc = round((h0 - d0) / 2.) 180 | wc = round((w0 - d0) / 2.) 181 | # Center crop image (ensure 3 channels...) 182 | img = img[int(hc):int(hc + d0), int(wc):int(wc + d0), :] 183 | # Resize image 184 | img = skresize(img, (resize, resize), mode='constant', 185 | preserve_range=True, order=1).astype(np.float32) 186 | # Change channel order: h,w,c -> c,h,w 187 | img = np.rollaxis(img, 2, 0) 188 | return img 189 | 190 | 191 | def get_overfeat_features(imgs, weights_path, typ, layer=None, cache=None): 192 | """Returns features at layer for given image(s) from OverFeat model. 
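The small helpers defined above are easiest to read through their expected outputs; a short sketch follows, with the values worked out from the alias/feature-layer dictionaries and the crop-and-resize arithmetic rather than from running any network (it assumes cnn_models imports cleanly, which depends on the OverFeat or TF-Slim environment described at the top of the file):

```python
import numpy as np
from cnn_models import _resolve_alias, is_tf_model, overfeat_preprocess

print(_resolve_alias('NASNet'))    # 'nasnet_large'
print(_resolve_alias('overfeat'))  # 'overfeat_1'
print(is_tf_model('overfeat'))     # False -> routed to the OverFeat branch, not TF-Slim

img = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in h, w, c frame
out = overfeat_preprocess(img, resize=231)
print(out.shape, out.dtype)                    # (3, 231, 231) float32: center crop, resize, channels-first
```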
193 | 194 | Small (fast) network: 22 layers 195 | Large (accurate) network: 25 layers 196 | 197 | Args: 198 | imgs: Iterable of images each of shape (h,w,c) 199 | weights_path: Path to the OverFeat weights 200 | typ: 0 for small, 1 for large version of OverFeat 201 | layer: The layer to extract features from 202 | cache: Dict containing descs/other cached values 203 | """ 204 | if cache is None: 205 | cache = {} 206 | if 'overfeat_descs' not in cache: 207 | # Initialize network 208 | print('Loading OverFeat ({}) model...'.format(typ)) 209 | overfeat.init(weights_path, typ) 210 | # Determine feature layer if none specified 211 | if layer is None: 212 | if overfeat.get_n_layers() == 22: # small 213 | layer = 19 # 16 also recommended 214 | else: # large 215 | # Layer used by Zhang et al. 216 | layer = 22 217 | # Determine resize dim 218 | if typ == 0: 219 | resize = 231 # small network 220 | else: 221 | resize = 221 # large network 222 | # Allocate for feature descriptors 223 | descs = [] 224 | # Run images through network 225 | print('Running images through OverFeat, extracting features ' 226 | 'at layer {}.'.format(layer)) 227 | 228 | for idx, img in enumerate(imgs): 229 | if (idx + 1) % 100 == 0: 230 | print('Processing image {}...'.format(idx + 1)) 231 | # Preprocess image 232 | img = overfeat_preprocess(img, resize) 233 | # Run through model 234 | _ = overfeat.fprop(img) 235 | # Retrieve feature output 236 | desc = overfeat.get_output(layer) 237 | descs.append(desc) 238 | # Free network 239 | overfeat.free() 240 | # NumPy-ify 241 | descs = np.asarray(descs) 242 | cache.update(overfeat_descs=descs) 243 | else: 244 | descs = cache['overfeat_descs'] 245 | return descs, cache 246 | 247 | 248 | def get_slim_model_features(imgs, model_name, layer=None, cache=None): 249 | """Returns features at layer for given image(s) from a TF-Slim model. 
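The `cache` dict above is what makes repeated experiments cheap: the first call stores the descriptors under 'overfeat_descs', and any later call handed the same dict returns them without re-running the network. A sketch of that pattern (Python 2 with the OverFeat bindings and weights installed is assumed; the weights path mirrors cnn_lcd.py's defaults and the frames are stand-ins):

```python
import numpy as np
from cnn_models import get_overfeat_features

imgs = np.zeros((4, 480, 640, 3), dtype=np.uint8)  # stand-in frames
weights = 'OverFeat/data/default/net_weight_1'     # default large-network weights path

descs, cache = get_overfeat_features(imgs, weights, typ=1)                # forward passes, fills the cache
descs2, cache = get_overfeat_features(imgs, weights, typ=1, cache=cache)  # served from cache['overfeat_descs']
```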
250 | 251 | Args: 252 | imgs: Iterable of images each of shape (h,w,c) 253 | model_name: The model name 254 | layer: The layer to extract features from 255 | cache: Dict containing descs/other cached values 256 | """ 257 | model_name = _resolve_alias(model_name) 258 | 259 | if cache is None: 260 | cache = {} 261 | if model_name + '_descs' not in cache: 262 | # Get model ckpt info 263 | model_ckpt, model_url, model_path = MODEL_CKPT_PATHS[model_name] 264 | 265 | if isinstance(model_ckpt, tuple): 266 | # For ckpts that should be specified using file basename only (no extension) 267 | model_ckpt, model_ckpt_full = model_ckpt 268 | else: 269 | model_ckpt = model_ckpt_full = model_ckpt 270 | 271 | # Grab ckpt if not available locally 272 | if not os.path.isfile(model_ckpt_full): 273 | get_ckpt(model_url, model_path) 274 | 275 | # Determine feature layer if none specified 276 | if layer is None: 277 | layer = DEFAULT_FEATURE_LAYER[model_name] 278 | 279 | # Allocate for feature descriptors 280 | descs = [] 281 | # Run images through network 282 | print('Running images through {}, extracting features ' 283 | 'at layer {}.'.format(_capitalize(model_name), layer)) 284 | 285 | # Set up image placeholder 286 | net_fn = get_network_fn(model_name, num_classes=None) 287 | im_size = net_fn.default_image_size 288 | image = tf.placeholder(tf.float32, shape=(None, None, 3)) 289 | 290 | # Preprocess image 291 | preprocess_func = get_preprocessing(model_name, is_training=False) 292 | pp_image = preprocess_func(image, im_size, im_size) 293 | pp_images = tf.expand_dims(pp_image, 0) 294 | 295 | # Compute network output 296 | logits_input, _ = net_fn(pp_images) # because num_classes=None 297 | 298 | # Restore parameters from checkpoint as init_fn 299 | model_params_name = MODEL_PARAMS_NAME[model_name] 300 | if model_params_name: 301 | model_vars = slim.get_model_variables(model_params_name) 302 | else: 303 | model_vars = slim.get_model_variables() 304 | init_fn = slim.assign_from_checkpoint_fn( 305 | model_ckpt, model_vars) 306 | 307 | with tf.Session() as sess: 308 | # Init model variables 309 | init_fn(sess) 310 | # Get target feature tensor 311 | feat_tensor = sess.graph.get_tensor_by_name(layer) 312 | 313 | for idx, img in enumerate(imgs): 314 | if (idx + 1) % 100 == 0: 315 | print('Processing image {}...'.format(idx + 1)) 316 | # Run image through model 317 | desc = sess.run(feat_tensor, 318 | feed_dict={image: img}) 319 | descs.append(desc) 320 | 321 | descs = np.asarray(descs) 322 | # Update cache 323 | cache.update({model_name + '_descs': descs}) 324 | else: 325 | descs = cache[model_name + '_descs'] 326 | return descs, cache 327 | 328 | 329 | def get_model_features(imgs, model_name, overfeat_weights_path=None, 330 | overfeat_typ=None, layer=None, cache=None): 331 | """ Get model features from an available model. 
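get_slim_model_features above resolves its feature tensor by name, so any tensor in the model's graph can be requested through `layer`; with `layer=None` it falls back to DEFAULT_FEATURE_LAYER. A hedged usage sketch (Python 3 with TensorFlow and the slim models repository on the path; the checkpoint is fetched automatically on first use and the frames are stand-ins):

```python
import numpy as np
from cnn_models import get_slim_model_features, DEFAULT_FEATURE_LAYER

imgs = np.zeros((2, 480, 640, 3), dtype=np.uint8)  # stand-in frames

# Passing the default tensor explicitly; layer=None would resolve to the same name
descs, cache = get_slim_model_features(
    imgs, 'resnet_v2_152', layer=DEFAULT_FEATURE_LAYER['resnet_v2_152'])
print(descs.shape[0])  # 2 -- one feature tensor per image; reuse `cache` to skip later forward passes
```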
332 | 333 | Args: 334 | imgs: The images to extract features for 335 | model_name: Name of the CNN (see is_valid_model) 336 | overfeat_weights_path: See get_overfeat_features 337 | overfeat_typ: See get_overfeat_features 338 | layer: See get_overfeat_features or get_slim_model_features 339 | cache: See get_overfeat_features or get_slim_model_features 340 | 341 | Returns: 342 | descs, cache: See get_overfeat_features or get_slim_model_features 343 | """ 344 | if is_valid_model(model_name): 345 | if is_tf_model(model_name): 346 | return get_slim_model_features(imgs, model_name, layer=layer, cache=cache) 347 | else: 348 | return get_overfeat_features(imgs, overfeat_weights_path, overfeat_typ, 349 | layer=layer, cache=cache) 350 | else: 351 | raise ValueError('`{}` is not a valid model name. Valid:\n{}.'.format(model_name, 352 | valid_model_names())) 353 | -------------------------------------------------------------------------------- /dataset.py: -------------------------------------------------------------------------------- 1 | # ===================================================================== 2 | # dataset.py - CNNs for loop-closure detection in vSLAM systems. 3 | # Copyright (C) 2018 Zach Carmichael 4 | # 5 | # This program is free software: you can redistribute it and/or modify 6 | # it under the terms of the GNU General Public License as published by 7 | # the Free Software Foundation, either version 3 of the License, or 8 | # (at your option) any later version. 9 | # 10 | # This program is distributed in the hope that it will be useful, 11 | # but WITHOUT ANY WARRANTY; without even the implied warranty of 12 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 13 | # GNU General Public License for more details. 14 | # 15 | # You should have received a copy of the GNU General Public License 16 | # along with this program. If not, see . 
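Returning briefly to get_model_features at the end of cnn_models.py above: it is the single entry point cnn_lcd.py calls, validating the model name and dispatching to the TF-Slim or OverFeat extractor. A sketch of the two call shapes (each needs its own backend and Python version, per the checks in cnn_models.py, so they would live in separate interpreters; the weights path mirrors cnn_lcd.py's defaults):

```python
import numpy as np
from cnn_models import get_model_features

imgs = np.zeros((3, 480, 640, 3), dtype=np.uint8)  # stand-in frames

# Python 3 + TF-Slim backend: the OverFeat-specific keywords are not needed
descs, cache = get_model_features(imgs, 'inception_v4')

# Python 2 + OverFeat backend
descs, cache = get_model_features(imgs, 'overfeat',
                                  overfeat_weights_path='OverFeat/data/default/net_weight_1',
                                  overfeat_typ=1)
```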
17 | # ===================================================================== 18 | from scipy.io import loadmat 19 | from scipy.ndimage import imread 20 | import numpy as np 21 | 22 | import os 23 | import requests 24 | import zipfile 25 | import sys 26 | from glob import glob 27 | 28 | # === DATASET VARS === 29 | # Data directory 30 | DATA_DIR = 'data' 31 | # City Centre Dataset 32 | CITY_DATA_DIR = os.path.join(DATA_DIR, 'city') 33 | CITY_IMGZIP_PATH = os.path.join(CITY_DATA_DIR, 'Images.zip') 34 | CITY_IMG_PATH = os.path.join(CITY_DATA_DIR, 'Images') 35 | CITY_GT_PATH = os.path.join(CITY_DATA_DIR, 'CityCentreGroundTruth.mat') 36 | CITY_IMG_URL = 'http://www.robots.ox.ac.uk/~mobile/IJRR_2008_Dataset/Data/CityCentre/Images.zip' 37 | CITY_GT_URL = 'http://www.robots.ox.ac.uk/~mobile/IJRR_2008_Dataset/Data/CityCentre/masks/CityCentreGroundTruth.mat' 38 | # New College Dataset 39 | COLLEGE_DATA_DIR = os.path.join(DATA_DIR, 'college') 40 | COLLEGE_IMGZIP_PATH = os.path.join(COLLEGE_DATA_DIR, 'Images.zip') 41 | COLLEGE_IMG_PATH = os.path.join(COLLEGE_DATA_DIR, 'Images') 42 | COLLEGE_GT_PATH = os.path.join(COLLEGE_DATA_DIR, 'NewCollegeGroundTruth.mat') 43 | COLLEGE_IMG_URL = 'http://www.robots.ox.ac.uk/~mobile/IJRR_2008_Dataset/Data/NewCollege/Images.zip' 44 | COLLEGE_GT_URL = 'http://www.robots.ox.ac.uk/~mobile/IJRR_2008_Dataset/Data/NewCollege/masks/NewCollegeGroundTruth.mat' 45 | 46 | 47 | def download_file(url, file_name): 48 | """Downloads a file to destination 49 | 50 | Code adapted from: 51 | https://stackoverflow.com/questions/15644964/python-progress-bar-and-downloads 52 | 53 | Args: 54 | url: URL of file to download 55 | file_name: Where to write downloaded file 56 | """ 57 | # Ensure destination exists 58 | dest_dir = os.path.dirname(file_name) 59 | if not os.path.isdir(dest_dir): 60 | os.makedirs(dest_dir) 61 | with open(file_name, 'wb') as f: 62 | print('Downloading {} from {}'.format(file_name, url)) 63 | response = requests.get(url, stream=True) 64 | total_length = response.headers.get('content-length') 65 | if total_length is None: # no content length header 66 | f.write(response.content) 67 | else: 68 | dl = 0 69 | total_length = int(total_length) 70 | for data in response.iter_content(chunk_size=4096): 71 | dl += len(data) 72 | f.write(data) 73 | # Output progress 74 | complete = dl / total_length 75 | done = int(50 * complete) 76 | sys.stdout.write('\r[{}{}] {:6.2f}%'.format('=' * done, ' ' * (50 - done), 77 | complete * 100)) 78 | sys.stdout.flush() 79 | sys.stdout.write('\n') 80 | sys.stdout.flush() 81 | 82 | 83 | def get_dataset(name, debug=False): 84 | debug_amt = 25 85 | if name.lower() == 'city': # city centre dataset 86 | print('Loading the City Centre dataset...') 87 | # Load images 88 | print('Loading images') 89 | if not os.path.isfile(CITY_IMGZIP_PATH): 90 | download_file(CITY_IMG_URL, CITY_IMGZIP_PATH) 91 | if not os.path.isdir(CITY_IMG_PATH): 92 | # Unzip archive 93 | print('Unzipping {} to {}'.format(CITY_IMGZIP_PATH, CITY_DATA_DIR)) 94 | with zipfile.ZipFile(CITY_IMGZIP_PATH, 'r') as zip_handle: 95 | zip_handle.extractall(CITY_DATA_DIR) 96 | # Sort by image number 97 | img_names = sorted(glob(os.path.join(CITY_IMG_PATH, '*.jpg'))) 98 | assert len(img_names) == 2474 99 | if debug: 100 | print('Using fewer images ({}) per debug flag...'.format( 101 | debug_amt)) 102 | img_names = img_names[:debug_amt] 103 | imgs = np.asarray([imread(img) for img in img_names]) 104 | # Load GT 105 | if not os.path.isfile(CITY_GT_PATH): 106 | download_file(CITY_GT_URL, CITY_GT_PATH) 
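Both datasets flow through the helpers above: download_file streams each archive (printing a progress bar whenever the server reports a Content-Length), and get_dataset unzips, sorts, and loads the frames together with the ground-truth matrix. A minimal end-user sketch, assuming the 2018-era dependencies used here (requests, scipy.ndimage.imread) are installed; downloads happen automatically on first use:

```python
from dataset import get_dataset

# debug=True keeps only the first 25 frames and the matching 25x25 ground-truth slice
imgs, gt = get_dataset('city', debug=True)  # 'college' selects the New College dataset
print(imgs.shape[0], gt.shape)              # 25 (25, 25)
```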
107 | print('Loading ground truth') 108 | gt = loadmat(CITY_GT_PATH)['truth'] 109 | if debug: 110 | gt = gt[:debug_amt, :debug_amt] 111 | elif name.lower() == 'college': # new college dataset 112 | print('Loading the New College dataset...') 113 | # Load images 114 | print('Loading images') 115 | if not os.path.isfile(COLLEGE_IMGZIP_PATH): 116 | download_file(COLLEGE_IMG_URL, COLLEGE_IMGZIP_PATH) 117 | if not os.path.isdir(COLLEGE_IMG_PATH): 118 | # Unzip archive 119 | print('Unzipping {} to {}'.format(COLLEGE_IMGZIP_PATH, 120 | COLLEGE_DATA_DIR)) 121 | with zipfile.ZipFile(COLLEGE_IMGZIP_PATH, 'r') as zip_handle: 122 | zip_handle.extractall(COLLEGE_DATA_DIR) 123 | # Sort by image number 124 | img_names = sorted(glob(os.path.join(COLLEGE_IMG_PATH, '*.jpg'))) 125 | assert len(img_names) == 2146 126 | if debug: 127 | print('Using fewer images ({}) per debug flag...'.format( 128 | debug_amt)) 129 | img_names = img_names[:debug_amt] 130 | imgs = np.asarray([imread(img) for img in img_names]) 131 | # Load GT 132 | if not os.path.isfile(COLLEGE_GT_PATH): 133 | download_file(COLLEGE_GT_URL, COLLEGE_GT_PATH) 134 | print('Loading ground truth') 135 | gt = loadmat(COLLEGE_GT_PATH)['truth'] 136 | if debug: 137 | gt = gt[:debug_amt, :debug_amt] 138 | elif name.lower() == 'tsukuba': # new tsukuba dataset 139 | raise NotImplementedError 140 | else: 141 | raise ValueError('Invalid dataset name: {}.'.format(name)) 142 | return imgs, gt 143 | -------------------------------------------------------------------------------- /paper/DEEPSLAM.eps: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/craymichael/CNN_LCD/a0087f7ce809220b65b5b0bdac77f5f0de5bfed6/paper/DEEPSLAM.eps -------------------------------------------------------------------------------- /paper/cvpr.sty: -------------------------------------------------------------------------------- 1 | % --------------------------------------------------------------- 2 | % 3 | % $Id: cvpr.sty,v 1.3 2005/10/24 19:56:15 awf Exp $ 4 | % 5 | % by Paolo.Ienne@di.epfl.ch 6 | % some mods by awf@acm.org 7 | % 8 | % --------------------------------------------------------------- 9 | % 10 | % no guarantee is given that the format corresponds perfectly to 11 | % IEEE 8.5" x 11" Proceedings, but most features should be ok. 
12 | % 13 | % --------------------------------------------------------------- 14 | % with LaTeX2e: 15 | % ============= 16 | % 17 | % use as 18 | % \documentclass[times,10pt,twocolumn]{article} 19 | % \usepackage{latex8} 20 | % \usepackage{times} 21 | % 22 | % --------------------------------------------------------------- 23 | 24 | % with LaTeX 2.09: 25 | % ================ 26 | % 27 | % use as 28 | % \documentstyle[times,art10,twocolumn,latex8]{article} 29 | % 30 | % --------------------------------------------------------------- 31 | % with both versions: 32 | % =================== 33 | % 34 | % specify \cvprfinalcopy to emit the final camera-ready copy 35 | % 36 | % specify references as 37 | % \bibliographystyle{ieee} 38 | % \bibliography{...your files...} 39 | % 40 | % --------------------------------------------------------------- 41 | 42 | \usepackage{eso-pic} 43 | \usepackage{xspace} 44 | 45 | \typeout{CVPR 8.5 x 11-Inch Proceedings Style `cvpr.sty'.} 46 | 47 | % ten point helvetica bold required for captions 48 | % eleven point times bold required for second-order headings 49 | % in some sites the name of the fonts may differ, 50 | % change the name here: 51 | \font\cvprtenhv = phvb at 8pt % *** IF THIS FAILS, SEE cvpr.sty *** 52 | \font\elvbf = ptmb scaled 1100 53 | 54 | % If the above lines give an error message, try to comment them and 55 | % uncomment these: 56 | %\font\cvprtenhv = phvb7t at 8pt 57 | %\font\elvbf = ptmb7t scaled 1100 58 | 59 | % set dimensions of columns, gap between columns, and paragraph indent 60 | \setlength{\textheight}{8.875in} 61 | \setlength{\textwidth}{6.875in} 62 | \setlength{\columnsep}{0.3125in} 63 | \setlength{\topmargin}{0in} 64 | \setlength{\headheight}{0in} 65 | \setlength{\headsep}{0in} 66 | \setlength{\parindent}{1pc} 67 | \setlength{\oddsidemargin}{-.304in} 68 | \setlength{\evensidemargin}{-.304in} 69 | 70 | \newif\ifcvprfinal 71 | \cvprfinalfalse 72 | \def\cvprfinalcopy{\global\cvprfinaltrue} 73 | 74 | % memento from size10.clo 75 | % \normalsize{\@setfontsize\normalsize\@xpt\@xiipt} 76 | % \small{\@setfontsize\small\@ixpt{11}} 77 | % \footnotesize{\@setfontsize\footnotesize\@viiipt{9.5}} 78 | % \scriptsize{\@setfontsize\scriptsize\@viipt\@viiipt} 79 | % \tiny{\@setfontsize\tiny\@vpt\@vipt} 80 | % \large{\@setfontsize\large\@xiipt{14}} 81 | % \Large{\@setfontsize\Large\@xivpt{18}} 82 | % \LARGE{\@setfontsize\LARGE\@xviipt{22}} 83 | % \huge{\@setfontsize\huge\@xxpt{25}} 84 | % \Huge{\@setfontsize\Huge\@xxvpt{30}} 85 | 86 | \def\@maketitle 87 | { 88 | \newpage 89 | \null 90 | \vskip .375in 91 | \begin{center} 92 | {\Large \bf \@title \par} 93 | % additional two empty lines at the end of the title 94 | \vspace*{24pt} 95 | { 96 | \large 97 | \lineskip .5em 98 | \begin{tabular}[t]{c} 99 | \@author 100 | %\ifcvprfinal\@author\else Anonymous CVPR submission\\ 101 | \vspace*{1pt}\\%This space will need to be here in the final copy, so don't squeeze it out for the 102 | review copy. 
103 | %Paper ID \cvprPaperID \fi 104 | \end{tabular} 105 | \par 106 | } 107 | % additional small space at the end of the author name 108 | \vskip .5em 109 | % additional empty line at the end of the title block 110 | \vspace*{12pt} 111 | \end{center} 112 | } 113 | 114 | \def\abstract 115 | {% 116 | \centerline{\large\bf Abstract}% 117 | \vspace*{12pt}% 118 | \it% 119 | } 120 | 121 | \def\endabstract 122 | { 123 | % additional empty line at the end of the abstract 124 | \vspace*{12pt} 125 | } 126 | 127 | \def\affiliation#1{\gdef\@affiliation{#1}} \gdef\@affiliation{} 128 | 129 | \newlength{\@ctmp} 130 | \newlength{\@figindent} 131 | \setlength{\@figindent}{1pc} 132 | 133 | \long\def\@makecaption#1#2{ 134 | \setbox\@tempboxa\hbox{\small \noindent #1.~#2} 135 | \setlength{\@ctmp}{\hsize} 136 | \addtolength{\@ctmp}{-\@figindent}\addtolength{\@ctmp}{-\@figindent} 137 | % IF longer than one indented paragraph line 138 | \ifdim \wd\@tempboxa >\@ctmp 139 | % THEN DON'T set as an indented paragraph 140 | {\small #1.~#2\par} 141 | \else 142 | % ELSE center 143 | \hbox to\hsize{\hfil\box\@tempboxa\hfil} 144 | \fi} 145 | 146 | % correct heading spacing and type 147 | \def\cvprsection{\@startsection {section}{1}{\z@} 148 | {10pt plus 2pt minus 2pt}{7pt} {\large\bf}} 149 | \def\cvprssect#1{\cvprsection*{#1}} 150 | \def\cvprsect#1{\cvprsection{\hskip -1em.~#1}} 151 | \def\section{\@ifstar\cvprssect\cvprsect} 152 | 153 | \def\cvprsubsection{\@startsection {subsection}{2}{\z@} 154 | {8pt plus 2pt minus 2pt}{6pt} {\elvbf}} 155 | \def\cvprssubsect#1{\cvprsubsection*{#1}} 156 | \def\cvprsubsect#1{\cvprsubsection{\hskip -1em.~#1}} 157 | \def\subsection{\@ifstar\cvprssubsect\cvprsubsect} 158 | 159 | %% --------- Page background marks: Ruler and confidentiality 160 | 161 | % ----- define vruler 162 | \makeatletter 163 | \newbox\cvprrulerbox 164 | \newcount\cvprrulercount 165 | \newdimen\cvprruleroffset 166 | \newdimen\cv@lineheight 167 | \newdimen\cv@boxheight 168 | \newbox\cv@tmpbox 169 | \newcount\cv@refno 170 | \newcount\cv@tot 171 | % NUMBER with left flushed zeros \fillzeros[] 172 | \newcount\cv@tmpc@ \newcount\cv@tmpc 173 | \def\fillzeros[#1]#2{\cv@tmpc@=#2\relax\ifnum\cv@tmpc@<0\cv@tmpc@=-\cv@tmpc@\fi 174 | \cv@tmpc=1 % 175 | \loop\ifnum\cv@tmpc@<10 \else \divide\cv@tmpc@ by 10 \advance\cv@tmpc by 1 \fi 176 | \ifnum\cv@tmpc@=10\relax\cv@tmpc@=11\relax\fi \ifnum\cv@tmpc@>10 \repeat 177 | \ifnum#2<0\advance\cv@tmpc1\relax-\fi 178 | \loop\ifnum\cv@tmpc<#1\relax0\advance\cv@tmpc1\relax\fi \ifnum\cv@tmpc<#1 \repeat 179 | \cv@tmpc@=#2\relax\ifnum\cv@tmpc@<0\cv@tmpc@=-\cv@tmpc@\fi \relax\the\cv@tmpc@}% 180 | % \makevruler[][][][][] 181 | \def\makevruler[#1][#2][#3][#4][#5]{\begingroup\offinterlineskip 182 | \textheight=#5\vbadness=10000\vfuzz=120ex\overfullrule=0pt% 183 | \global\setbox\cvprrulerbox=\vbox to \textheight{% 184 | {\parskip=0pt\hfuzz=150em\cv@boxheight=\textheight 185 | \cv@lineheight=#1\global\cvprrulercount=#2% 186 | \cv@tot\cv@boxheight\divide\cv@tot\cv@lineheight\advance\cv@tot2% 187 | \cv@refno1\vskip-\cv@lineheight\vskip1ex% 188 | \loop\setbox\cv@tmpbox=\hbox to0cm{{\cvprtenhv\hfil\fillzeros[#4]\cvprrulercount}}% 189 | \ht\cv@tmpbox\cv@lineheight\dp\cv@tmpbox0pt\box\cv@tmpbox\break 190 | \advance\cv@refno1\global\advance\cvprrulercount#3\relax 191 | \ifnum\cv@refno<\cv@tot\repeat}}\endgroup}% 192 | \makeatother 193 | % ----- end of vruler 194 | 195 | % \makevruler[][][][][] 196 | \def\cvprruler#1{\makevruler[12pt][#1][1][3][0.993\textheight]\usebox{\cvprrulerbox}} 197 | \AddToShipoutPicture{% 
198 | \ifcvprfinal\else 199 | %\AtTextLowerLeft{% 200 | % \color[gray]{.15}\framebox(\LenToUnit{\textwidth},\LenToUnit{\textheight}){} 201 | %} 202 | \cvprruleroffset=\textheight 203 | \advance\cvprruleroffset by -3.7pt 204 | \color[rgb]{.5,.5,1} 205 | \AtTextUpperLeft{% 206 | \put(\LenToUnit{-35pt},\LenToUnit{-\cvprruleroffset}){%left ruler 207 | \cvprruler{\cvprrulercount}} 208 | \put(\LenToUnit{\textwidth\kern 30pt},\LenToUnit{-\cvprruleroffset}){%right ruler 209 | \cvprruler{\cvprrulercount}} 210 | } 211 | % \def\pid{\parbox{1in}{\begin{center}\bf\sf{\small CVPR}\\\#\cvprPaperID\end{center}}} 212 | % \AtTextUpperLeft{%paperID in corners 213 | % \put(\LenToUnit{-65pt},\LenToUnit{45pt}){\pid} 214 | % \put(\LenToUnit{\textwidth\kern-8pt},\LenToUnit{45pt}){\pid} 215 | % } 216 | % \AtTextUpperLeft{%confidential 217 | % \put(0,\LenToUnit{1cm}){\parbox{\textwidth}{\centering\cvprtenhv 218 | % CVPR 2018 Submission \#\cvprPaperID. CONFIDENTIAL REVIEW COPY. DO NOT DISTRIBUTE.}} 219 | % } 220 | \fi 221 | } 222 | 223 | %%% Make figure placement a little more predictable. 224 | % We trust the user to move figures if this results 225 | % in ugliness. 226 | % Minimize bad page breaks at figures 227 | \renewcommand{\textfraction}{0.01} 228 | \renewcommand{\floatpagefraction}{0.99} 229 | \renewcommand{\topfraction}{0.99} 230 | \renewcommand{\bottomfraction}{0.99} 231 | \renewcommand{\dblfloatpagefraction}{0.99} 232 | \renewcommand{\dbltopfraction}{0.99} 233 | \setcounter{totalnumber}{99} 234 | \setcounter{topnumber}{99} 235 | \setcounter{bottomnumber}{99} 236 | 237 | % Add a period to the end of an abbreviation unless there's one 238 | % already, then \xspace. 239 | \makeatletter 240 | \DeclareRobustCommand\onedot{\futurelet\@let@token\@onedot} 241 | \def\@onedot{\ifx\@let@token.\else.\null\fi\xspace} 242 | 243 | \def\eg{\emph{e.g}\onedot} \def\Eg{\emph{E.g}\onedot} 244 | \def\ie{\emph{i.e}\onedot} \def\Ie{\emph{I.e}\onedot} 245 | \def\cf{\emph{c.f}\onedot} \def\Cf{\emph{C.f}\onedot} 246 | \def\etc{\emph{etc}\onedot} \def\vs{\emph{vs}\onedot} 247 | \def\wrt{w.r.t\onedot} \def\dof{d.o.f\onedot} 248 | \def\etal{\emph{et al}\onedot} 249 | \makeatother 250 | 251 | % --------------------------------------------------------------- 252 | 253 | -------------------------------------------------------------------------------- /paper/cvpr_eso.sty: -------------------------------------------------------------------------------- 1 | %% 2 | %% This is file `everyshi.sty', 3 | %% generated with the docstrip utility. 4 | %% 5 | %% The original source files were: 6 | %% 7 | %% everyshi.dtx (with options: `package') 8 | %% 9 | %% Copyright (C) [1994..1999] by Martin Schroeder. All rights reserved. 10 | %% 11 | %% This file is part of the EveryShi package 12 | %% 13 | %% This program may be redistributed and/or modified under the terms 14 | %% of the LaTeX Project Public License, either version 1.0 of this 15 | %% license, or (at your option) any later version. 16 | %% The latest version of this license is in 17 | %% CTAN:macros/latex/base/lppl.txt. 18 | %% 19 | %% Happy users are requested to send me a postcard. 
:-) 20 | %% 21 | %% The EveryShi package contains these files: 22 | %% 23 | %% everyshi.asc 24 | %% everyshi.dtx 25 | %% everyshi.dvi 26 | %% everyshi.ins 27 | %% everyshi.bug 28 | %% 29 | %% Error Reports in case of UNCHANGED versions to 30 | %% 31 | %% Martin Schr"oder 32 | %% Cr"usemannallee 3 33 | %% D-28213 Bremen 34 | %% Martin.Schroeder@ACM.org 35 | %% 36 | %% File: everyshi.dtx Copyright (C) 2001 Martin Schr\"oder 37 | \NeedsTeXFormat{LaTeX2e} 38 | \ProvidesPackage{everyshi} 39 | [2001/05/15 v3.00 EveryShipout Package (MS)] 40 | %% \CharacterTable 41 | %% {Upper-case \A\B\C\D\E\F\G\H\I\J\K\L\M\N\O\P\Q\R\S\T\U\V\W\X\Y\Z 42 | %% Lower-case \a\b\c\d\e\f\g\h\i\j\k\l\m\n\o\p\q\r\s\t\u\v\w\x\y\z 43 | %% Digits \0\1\2\3\4\5\6\7\8\9 44 | %% Exclamation \! Double quote \" Hash (number) \# 45 | %% Dollar \$ Percent \% Ampersand \& 46 | %% Acute accent \' Left paren \( Right paren \) 47 | %% Asterisk \* Plus \+ Comma \, 48 | %% Minus \- Point \. Solidus \/ 49 | %% Colon \: Semicolon \; Less than \< 50 | %% Equals \= Greater than \> Question mark \? 51 | %% Commercial at \@ Left bracket \[ Backslash \\ 52 | %% Right bracket \] Circumflex \^ Underscore \_ 53 | %% Grave accent \` Left brace \{ Vertical bar \| 54 | %% Right brace \} Tilde \~} 55 | %% 56 | %% \iffalse meta-comment 57 | %% =================================================================== 58 | %% @LaTeX-package-file{ 59 | %% author = {Martin Schr\"oder}, 60 | %% version = "3.00", 61 | %% date = "15 May 2001", 62 | %% filename = "everyshi.sty", 63 | %% address = {Martin Schr\"oder 64 | %% Cr\"usemannallee 3 65 | %% 28213 Bremen 66 | %% Germany}, 67 | %% telephone = "+49-421-2239425", 68 | %% email = "martin@oneiros.de", 69 | %% pgp-Key = "2048 bit / KeyID 292814E5", 70 | %% pgp-fingerprint = "7E86 6EC8 97FA 2995 82C3 FEA5 2719 090E", 71 | %% docstring = "LaTeX package which provides hooks into 72 | %% \cs{shipout}. 73 | %% } 74 | %% =================================================================== 75 | %% \fi 76 | 77 | \newcommand{\@EveryShipout@Hook}{} 78 | \newcommand{\@EveryShipout@AtNextHook}{} 79 | \newcommand*{\EveryShipout}[1] 80 | {\g@addto@macro\@EveryShipout@Hook{#1}} 81 | \newcommand*{\AtNextShipout}[1] 82 | {\g@addto@macro\@EveryShipout@AtNextHook{#1}} 83 | \newcommand{\@EveryShipout@Shipout}{% 84 | \afterassignment\@EveryShipout@Test 85 | \global\setbox\@cclv= % 86 | } 87 | \newcommand{\@EveryShipout@Test}{% 88 | \ifvoid\@cclv\relax 89 | \aftergroup\@EveryShipout@Output 90 | \else 91 | \@EveryShipout@Output 92 | \fi% 93 | } 94 | \newcommand{\@EveryShipout@Output}{% 95 | \@EveryShipout@Hook% 96 | \@EveryShipout@AtNextHook% 97 | \gdef\@EveryShipout@AtNextHook{}% 98 | \@EveryShipout@Org@Shipout\box\@cclv% 99 | } 100 | \newcommand{\@EveryShipout@Org@Shipout}{} 101 | \newcommand*{\@EveryShipout@Init}{% 102 | \message{ABD: EveryShipout initializing macros}% 103 | \let\@EveryShipout@Org@Shipout\shipout 104 | \let\shipout\@EveryShipout@Shipout 105 | } 106 | \AtBeginDocument{\@EveryShipout@Init} 107 | \endinput 108 | %% 109 | %% End of file `everyshi.sty'. 
110 | 111 | -------------------------------------------------------------------------------- /paper/egbib.bib: -------------------------------------------------------------------------------- 1 | @article{taketomi_visual_2017, 2 | title = {Visual {SLAM} algorithms: a survey from 2010 to 2016}, 3 | volume = {9}, 4 | issn = {1882-6695}, 5 | shorttitle = {Visual {SLAM} algorithms}, 6 | url = {http://ipsjcva.springeropen.com/articles/10.1186/s41074-017-0027-2}, 7 | doi = {10.1186/s41074-017-0027-2}, 8 | language = {en}, 9 | number = {1}, 10 | urldate = {2018-02-22}, 11 | journal = {IPSJ Transactions on Computer Vision and Applications}, 12 | author = {Taketomi, Takafumi and Uchiyama, Hideaki and Ikeda, Sei}, 13 | month = dec, 14 | year = {2017}, 15 | file = 16 | {s41074-017-0027-2.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/3MJ335NE/s41074-017-0027-2.pdf:application/pdf} 17 | } 18 | 19 | @inproceedings{wang_survey_2017, 20 | title = {A survey of simultaneous localization and mapping on unstructured lunar complex environment}, 21 | url = {http://aip.scitation.org/doi/abs/10.1063/1.5005198}, 22 | doi = {10.1063/1.5005198}, 23 | urldate = {2018-02-22}, 24 | author = {Wang, Yiqiao and Zhang, Wei and An, Pei}, 25 | year = {2017}, 26 | pages = {030010}, 27 | file = 28 | {1.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/EN54IT4J/1.pdf:application/pdf} 29 | } 30 | 31 | 32 | @inproceedings{zhang_loop_2017, 33 | title = {Loop closure detection for visual {SLAM} systems using convolutional neural network}, 34 | doi = {10.23919/IConAC.2017.8082072}, 35 | abstract = {This paper is concerned of the loop closure detection problem, which is one of the most 36 | critical parts for visual Simultaneous Localization and Mapping (SLAM) systems. Most of state-of-the-art 37 | methods use hand-crafted features and bag-of-visual-words (BoVW) to tackle this problem. Recent development in 38 | deep learning indicates that CNN features significantly outperform hand-crafted features for image 39 | representation. This advanced technology has not been fully exploited in robotics, especially in visual SLAM 40 | systems. We propose a loop closure detection method based on convolutional neural networks (CNNs). Images are 41 | fed into a pre-trained CNN model to extract features. We pre-process CNN features instead of using them 42 | directly as most of the presented approaches did before they are used to detect loops. The workflow of 43 | extracting CNN features, processing data, computing similarity score and detecting loops is presented. Finally 44 | the performance of proposed method is evaluated on several open datasets by comparing it with Fab-Map using 45 | precision-recall metric.}, 46 | booktitle = {2017 23rd {International} {Conference} on {Automation} and {Computing} ({ICAC})}, 47 | author = {Zhang, X. and Su, Y. 
and Zhu, X.}, 48 | month = sep, 49 | year = {2017}, 50 | keywords = {learning (artificial intelligence), feature extraction, Principal component analysis, 51 | Visualization, Feature extraction, image representation, Deep Learning, Neural networks, bag-of-visual-words, 52 | CNN model, Computational modeling, convolution, convolutional neural network, Convolutional Neural Network, 53 | data processing, feedforward neural nets, hand-crafted features, Image representation, Loop Closure Detection, 54 | loop closure detection method, loop closure detection problem, mobile robots, pre-process CNN features, 55 | precision-recall metric, robot vision, similarity score, Simultaneous localization and mapping, SLAM, SLAM 56 | (robots), visual simultaneous localization and mapping systems, visual SLAM systems}, 57 | pages = {1--6}, 58 | file = {IEEE Xplore Abstract 59 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/59XH5I3S/8082072.html:text/html} 60 | } 61 | 62 | @article{taketomi_visual_2017, 63 | title = {Visual {SLAM} algorithms: a survey from 2010 to 2016}, 64 | volume = {9}, 65 | issn = {1882-6695}, 66 | shorttitle = {Visual {SLAM} algorithms}, 67 | url = {http://ipsjcva.springeropen.com/articles/10.1186/s41074-017-0027-2}, 68 | doi = {10.1186/s41074-017-0027-2}, 69 | language = {en}, 70 | number = {1}, 71 | urldate = {2018-02-22}, 72 | journal = {IPSJ Transactions on Computer Vision and Applications}, 73 | author = {Taketomi, Takafumi and Uchiyama, Hideaki and Ikeda, Sei}, 74 | month = dec, 75 | year = {2017}, 76 | file = 77 | {s41074-017-0027-2.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/3MJ335NE/s41074-017-0027-2.pdf:application/pdf} 78 | } 79 | 80 | @article{gupta_cognitive_2017, 81 | title = {Cognitive mapping and planning for visual navigation}, 82 | volume = {3}, 83 | journal = {arXiv preprint arXiv:1702.03920}, 84 | author = {Gupta, Saurabh and Davidson, James and Levine, Sergey and Sukthankar, Rahul and Malik, 85 | Jitendra}, 86 | year = {2017}, 87 | file = 88 | {Gupta_Cognitive_Mapping_and_CVPR_2017_paper.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/HTW6Q3BM/Gupta_Cognitive_Mapping_and_CVPR_2017_paper.pdf:application/pdf} 89 | } 90 | 91 | @article{gao_unsupervised_2017, 92 | title = {Unsupervised learning to detect loops using deep neural networks for visual {SLAM} system}, 93 | volume = {41}, 94 | issn = {0929-5593, 1573-7527}, 95 | url = {http://link.springer.com/10.1007/s10514-015-9516-2}, 96 | doi = {10.1007/s10514-015-9516-2}, 97 | language = {en}, 98 | number = {1}, 99 | urldate = {2018-02-22}, 100 | journal = {Autonomous Robots}, 101 | author = {Gao, Xiang and Zhang, Tao}, 102 | month = jan, 103 | year = {2017}, 104 | pages = {1--18}, 105 | file = 106 | {10.1007s10514-015-9516-2.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/JTMGU2H3/10.1007s10514-015-9516-2.pdf:application/pdf} 107 | } 108 | 109 | @inproceedings{wang_survey_2017, 110 | title = {A survey of simultaneous localization and mapping on unstructured lunar complex environment}, 111 | url = {http://aip.scitation.org/doi/abs/10.1063/1.5005198}, 112 | doi = {10.1063/1.5005198}, 113 | urldate = {2018-02-22}, 114 | author = {Wang, Yiqiao and Zhang, Wei and An, Pei}, 115 | year = {2017}, 116 | pages = {030010}, 117 | file = 118 | {1.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/EN54IT4J/1.pdf:application/pdf} 119 | } 120 | 121 | @article{naseer_robust_2018, 122 | title = {Robust {Visual} {Localization} {Across} {Seasons}}, 123 | volume = 
{PP}, 124 | issn = {1552-3098}, 125 | doi = {10.1109/TRO.2017.2788045}, 126 | abstract = {Localization is an integral part of reliable robot navigation, and long-term autonomy 127 | requires robustness against perceptional changes in the environment during localization. In the context of 128 | vision-based localization, such changes can be caused by illumination variations, occlusion, structural 129 | development, different weather conditions, and seasons. In this paper, we present a novel approach for 130 | localizing a robot over longer periods of time using only monocular image data. We propose a novel data 131 | association approach for matching streams of incoming images to an image sequence stored in a database. Our 132 | method exploits network flows to leverage sequential information to improve the localization performance and 133 | to maintain several possible trajectories hypotheses in parallel. To compare images, we consider a semidense 134 | image description based on histogram of oriented gradients features as well as global descriptors from deep 135 | convolutional neural networks trained on ImageNet for robust localization. We perform extensive evaluations on 136 | a variety of datasets and show that our approach outperforms existing state-of-the-art approaches.}, 137 | number = {99}, 138 | journal = {IEEE Transactions on Robotics}, 139 | author = {Naseer, T. and Burgard, W. and Stachniss, C.}, 140 | year = {2018}, 141 | keywords = {Robustness, Visualization, Image matching, Image sequences, Lighting, Meteorology, 142 | Robots}, 143 | pages = {1--14}, 144 | file = {IEEE Xplore Abstract 145 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/BB9BXT4I/8269404.html:text/html} 146 | } 147 | 148 | @inproceedings{pascoe_nid-slam:_2017, 149 | title = {{NID}-{SLAM}: {Robust} {Monocular} {SLAM} using {Normalised} {Information} {Distance}}, 150 | shorttitle = {{NID}-{SLAM}}, 151 | booktitle = {Conference on {Computer} {Vision} and {Pattern} {Recognition}}, 152 | author = {Pascoe, Geoffrey and Maddern, Will and Tanner, Michael and Piniés, Pedro and Newman, Paul}, 153 | year = {2017}, 154 | file = 155 | {Pascoe_NID-SLAM_Robust_Monocular_CVPR_2017_paper.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/V8ZSZTIS/Pascoe_NID-SLAM_Robust_Monocular_CVPR_2017_paper.pdf:application/pdf} 156 | } 157 | 158 | @inproceedings{naseer_robust_2014, 159 | title = {Robust {Visual} {Robot} {Localization} {Across} {Seasons} {Using} {Network} {Flows}.}, 160 | booktitle = {{AAAI}}, 161 | author = {Naseer, Tayyab and Spinello, Luciano and Burgard, Wolfram and Stachniss, Cyrill}, 162 | year = {2014}, 163 | pages = {2564--2570}, 164 | file = 165 | {naseerAAAI14.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/BTMNUGKU/naseerAAAI14.pdf:application/pdf} 166 | } 167 | 168 | @article{cummins_appearance-only_2011, 169 | title = {Appearance-only {SLAM} at large scale with {FAB}-{MAP} 2.0}, 170 | volume = {30}, 171 | issn = {0278-3649}, 172 | url = {https://doi.org/10.1177/0278364910385483}, 173 | doi = {10.1177/0278364910385483}, 174 | abstract = {We describe a new formulation of appearance-only SLAM suitable for very large 175 | scale place recognition. The system navigates in the space of appearance, assigning each 176 | new observation to either a new or a previously visited location, without reference to 177 | metric position. 
The system is demonstrated performing reliable online appearance mapping 178 | and loop-closure detection over a 1000 km trajectory, with mean filter update times of 14  179 | ms. The scalability of the system is achieved by defining a sparse approximation to the 180 | FAB-MAP model suitable for implementation using an inverted index. Our formulation of the 181 | problem is fully probabilistic and naturally incorporates robustness against perceptual 182 | aliasing. We also demonstrate that the approach substantially outperforms the standard 183 | term-frequency inverse-document-frequency (tf-idf) ranking measure. The 1000 km data set 184 | comprising almost a terabyte of omni-directional and stereo imagery is available for use, 185 | and we hope that it will serve as a benchmark for future systems.}, 186 | language = {en}, 187 | number = {9}, 188 | urldate = {2018-03-18}, 189 | journal = {The International Journal of Robotics Research}, 190 | author = {Cummins, Mark and Newman, Paul}, 191 | month = aug, 192 | year = {2011}, 193 | pages = {1100--1123} 194 | } 195 | 196 | @inproceedings{hou_convolutional_2015, 197 | title = {Convolutional neural network-based image representation for visual loop closure detection}, 198 | doi = {10.1109/ICInfA.2015.7279659}, 199 | abstract = {Deep convolutional neural networks (CNN) have recently been shown in many computer vision 200 | and pattern recognition applications to outperform by a significant margin state-of-the-art solutions that use 201 | traditional hand-crafted features. However, this impressive performance is yet to be fully exploited in 202 | robotics. In this paper, we focus one specific problem that can benefit from the recent development of the CNN 203 | technology, i.e., we focus on using a pre-trained CNN model as a method of generating an image representation 204 | appropriate for visual loop closure detection in SLAM (simultaneous localization and mapping). We perform a 205 | comprehensive evaluation of the outputs at the intermediate layers of a CNN as image descriptors, in 206 | comparison with state-of-the-art image descriptors, in terms of their ability to match images for detecting 207 | loop closures. The main conclusions of our study include: (a) CNN-based image representations perform 208 | comparably to state-of-the-art hand-crafted competitors in environments without significant lighting change, 209 | (b) they outperform state-of-the-art competitors when lighting changes significantly, and (c) they are also 210 | significantly faster to extract than the state-of-the-art hand-crafted features even on a conventional CPU and 211 | are two orders of magnitude faster on an entry-level GPU.}, 212 | booktitle = {2015 {IEEE} {International} {Conference} on {Information} and {Automation}}, 213 | author = {Hou, Y. and Zhang, H. 
and Zhou, S.}, 214 | month = aug, 215 | year = {2015}, 216 | keywords = {neural nets, object detection, Standards, Visualization, Feature extraction, image 217 | representation, convolutional neural network, robot vision, Simultaneous localization and mapping, SLAM, SLAM 218 | (robots), Lighting, CNN technology, computer vision application, conventional CPU, Convolutional neural 219 | networks(CNN), entry-level GPU, graphics processing unit, hand-crafted feature, image descriptors, image 220 | matching, Image retrieval, loop closure detection, pattern recognition application, robotics, simultaneous 221 | localization and mapping, visual loop closure detection}, 222 | pages = {2238--2245} 223 | } 224 | 225 | @inproceedings{naseer_robust_2015, 226 | title = {Robust visual {SLAM} across seasons}, 227 | doi = {10.1109/IROS.2015.7353721}, 228 | abstract = {In this paper, we present an appearance-based visual SLAM approach that focuses on 229 | detecting loop closures across seasons. Given two image sequences, our method first extracts one descriptor 230 | per image for both sequences using a deep convolutional neural network. Then, we compute a similarity matrix 231 | by comparing each image of a query sequence with a database. Finally, based on the similarity matrix, we 232 | formulate a flow network problem and compute matching hypotheses between sequences. In this way, our approach 233 | can handle partially matching routes, loops in the trajectory and different speeds of the robot. With a 234 | matching hypothesis as loop closure information and the odometry information of the robot, we formulate a 235 | graph based SLAM problem and compute a joint maximum likelihood trajectory.}, 236 | booktitle = {2015 {IEEE}/{RSJ} {International} {Conference} on {Intelligent} {Robots} and {Systems} 237 | ({IROS})}, 238 | author = {Naseer, T. and Ruhnke, M. and Stachniss, C. and Spinello, L. and Burgard, W.}, 239 | month = sep, 240 | year = {2015}, 241 | keywords = {appearance-based visual SLAM approach, Databases, deep convolutional neural network, 242 | descriptor extraction, feature extraction, Feature extraction, flow network problem, graph based SLAM problem, 243 | image matching, image retrieval, image sequences, joint maximum likelihood trajectory, loop-closure detection, 244 | matching hypothesis, matrix algebra, neural nets, odometry information, partially matching routes, query 245 | sequence, robot vision, robust visual SLAM, Robustness, similarity matrix, Simultaneous localization and 246 | mapping, SLAM (robots), Trajectory, Visualization}, 247 | pages = {2529--2535}, 248 | file = {IEEE Xplore Abstract 249 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/IGPB9WXX/7353721.html:text/html;IEEE Xplore 250 | Full Text PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/QNLHRMMX/Naseer et al. - 2015 - Robust 251 | visual SLAM across seasons.pdf:application/pdf} 252 | } 253 | 254 | 255 | @inproceedings{zhang_loop_2017, 256 | title = {Loop closure detection for visual {SLAM} systems using convolutional neural network}, 257 | doi = {10.23919/IConAC.2017.8082072}, 258 | abstract = {This paper is concerned of the loop closure detection problem, which is one of the most 259 | critical parts for visual Simultaneous Localization and Mapping (SLAM) systems. Most of state-of-the-art 260 | methods use hand-crafted features and bag-of-visual-words (BoVW) to tackle this problem. 
Recent development in 261 | deep learning indicates that CNN features significantly outperform hand-crafted features for image 262 | representation. This advanced technology has not been fully exploited in robotics, especially in visual SLAM 263 | systems. We propose a loop closure detection method based on convolutional neural networks (CNNs). Images are 264 | fed into a pre-trained CNN model to extract features. We pre-process CNN features instead of using them 265 | directly as most of the presented approaches did before they are used to detect loops. The workflow of 266 | extracting CNN features, processing data, computing similarity score and detecting loops is presented. Finally 267 | the performance of proposed method is evaluated on several open datasets by comparing it with Fab-Map using 268 | precision-recall metric.}, 269 | booktitle = {2017 23rd {International} {Conference} on {Automation} and {Computing} ({ICAC})}, 270 | author = {Zhang, X. and Su, Y. and Zhu, X.}, 271 | month = sep, 272 | year = {2017}, 273 | keywords = {learning (artificial intelligence), feature extraction, Principal component analysis, 274 | Visualization, Feature extraction, image representation, Deep Learning, Neural networks, bag-of-visual-words, 275 | CNN model, Computational modeling, convolution, convolutional neural network, Convolutional Neural Network, 276 | data processing, feedforward neural nets, hand-crafted features, Image representation, Loop Closure Detection, 277 | loop closure detection method, loop closure detection problem, mobile robots, pre-process CNN features, 278 | precision-recall metric, robot vision, similarity score, Simultaneous localization and mapping, SLAM, SLAM 279 | (robots), visual simultaneous localization and mapping systems, visual SLAM systems}, 280 | pages = {1--6}, 281 | file = {IEEE Xplore Abstract 282 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/59XH5I3S/8082072.html:text/html} 283 | } 284 | 285 | @article{taketomi_visual_2017, 286 | title = {Visual {SLAM} algorithms: a survey from 2010 to 2016}, 287 | volume = {9}, 288 | issn = {1882-6695}, 289 | shorttitle = {Visual {SLAM} algorithms}, 290 | url = {http://ipsjcva.springeropen.com/articles/10.1186/s41074-017-0027-2}, 291 | doi = {10.1186/s41074-017-0027-2}, 292 | language = {en}, 293 | number = {1}, 294 | urldate = {2018-02-22}, 295 | journal = {IPSJ Transactions on Computer Vision and Applications}, 296 | author = {Taketomi, Takafumi and Uchiyama, Hideaki and Ikeda, Sei}, 297 | month = dec, 298 | year = {2017}, 299 | file = 300 | {s41074-017-0027-2.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/3MJ335NE/s41074-017-0027-2.pdf:application/pdf} 301 | } 302 | 303 | @article{gupta_cognitive_2017, 304 | title = {Cognitive mapping and planning for visual navigation}, 305 | volume = {3}, 306 | journal = {arXiv preprint arXiv:1702.03920}, 307 | author = {Gupta, Saurabh and Davidson, James and Levine, Sergey and Sukthankar, Rahul and Malik, 308 | Jitendra}, 309 | year = {2017}, 310 | file = 311 | {Gupta_Cognitive_Mapping_and_CVPR_2017_paper.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/HTW6Q3BM/Gupta_Cognitive_Mapping_and_CVPR_2017_paper.pdf:application/pdf} 312 | } 313 | 314 | @article{gao_unsupervised_2017, 315 | title = {Unsupervised learning to detect loops using deep neural networks for visual {SLAM} system}, 316 | volume = {41}, 317 | issn = {0929-5593, 1573-7527}, 318 | url = {http://link.springer.com/10.1007/s10514-015-9516-2}, 319 | doi = 
{10.1007/s10514-015-9516-2}, 320 | language = {en}, 321 | number = {1}, 322 | urldate = {2018-02-22}, 323 | journal = {Autonomous Robots}, 324 | author = {Gao, Xiang and Zhang, Tao}, 325 | month = jan, 326 | year = {2017}, 327 | pages = {1--18}, 328 | file = 329 | {10.1007s10514-015-9516-2.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/JTMGU2H3/10.1007s10514-015-9516-2.pdf:application/pdf} 330 | } 331 | 332 | @inproceedings{wang_survey_2017, 333 | title = {A survey of simultaneous localization and mapping on unstructured lunar complex environment}, 334 | url = {http://aip.scitation.org/doi/abs/10.1063/1.5005198}, 335 | doi = {10.1063/1.5005198}, 336 | urldate = {2018-02-22}, 337 | author = {Wang, Yiqiao and Zhang, Wei and An, Pei}, 338 | year = {2017}, 339 | pages = {030010}, 340 | file = 341 | {1.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/EN54IT4J/1.pdf:application/pdf} 342 | } 343 | 344 | @article{naseer_robust_2018, 345 | title = {Robust {Visual} {Localization} {Across} {Seasons}}, 346 | volume = {PP}, 347 | issn = {1552-3098}, 348 | doi = {10.1109/TRO.2017.2788045}, 349 | abstract = {Localization is an integral part of reliable robot navigation, and long-term autonomy 350 | requires robustness against perceptional changes in the environment during localization. In the context of 351 | vision-based localization, such changes can be caused by illumination variations, occlusion, structural 352 | development, different weather conditions, and seasons. In this paper, we present a novel approach for 353 | localizing a robot over longer periods of time using only monocular image data. We propose a novel data 354 | association approach for matching streams of incoming images to an image sequence stored in a database. Our 355 | method exploits network flows to leverage sequential information to improve the localization performance and 356 | to maintain several possible trajectories hypotheses in parallel. To compare images, we consider a semidense 357 | image description based on histogram of oriented gradients features as well as global descriptors from deep 358 | convolutional neural networks trained on ImageNet for robust localization. We perform extensive evaluations on 359 | a variety of datasets and show that our approach outperforms existing state-of-the-art approaches.}, 360 | number = {99}, 361 | journal = {IEEE Transactions on Robotics}, 362 | author = {Naseer, T. and Burgard, W. 
and Stachniss, C.}, 363 | year = {2018}, 364 | keywords = {Robustness, Visualization, Image matching, Image sequences, Lighting, Meteorology, 365 | Robots}, 366 | pages = {1--14}, 367 | file = {IEEE Xplore Abstract 368 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/BB9BXT4I/8269404.html:text/html} 369 | } 370 | 371 | @inproceedings{pascoe_nid-slam:_2017, 372 | title = {{NID}-{SLAM}: {Robust} {Monocular} {SLAM} using {Normalised} {Information} {Distance}}, 373 | shorttitle = {{NID}-{SLAM}}, 374 | booktitle = {Conference on {Computer} {Vision} and {Pattern} {Recognition}}, 375 | author = {Pascoe, Geoffrey and Maddern, Will and Tanner, Michael and Piniés, Pedro and Newman, Paul}, 376 | year = {2017}, 377 | file = 378 | {Pascoe_NID-SLAM_Robust_Monocular_CVPR_2017_paper.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/V8ZSZTIS/Pascoe_NID-SLAM_Robust_Monocular_CVPR_2017_paper.pdf:application/pdf} 379 | } 380 | 381 | @inproceedings{naseer_robust_2014, 382 | title = {Robust {Visual} {Robot} {Localization} {Across} {Seasons} {Using} {Network} {Flows}.}, 383 | booktitle = {{AAAI}}, 384 | author = {Naseer, Tayyab and Spinello, Luciano and Burgard, Wolfram and Stachniss, Cyrill}, 385 | year = {2014}, 386 | pages = {2564--2570}, 387 | file = 388 | {naseerAAAI14.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/BTMNUGKU/naseerAAAI14.pdf:application/pdf} 389 | } 390 | 391 | @article{cummins_appearance-only_2011, 392 | title = {Appearance-only {SLAM} at large scale with {FAB}-{MAP} 2.0}, 393 | volume = {30}, 394 | issn = {0278-3649}, 395 | url = {https://doi.org/10.1177/0278364910385483}, 396 | doi = {10.1177/0278364910385483}, 397 | abstract = {We describe a new formulation of appearance-only SLAM suitable for very large 398 | scale place recognition. The system navigates in the space of appearance, assigning each 399 | new observation to either a new or a previously visited location, without reference to 400 | metric position. The system is demonstrated performing reliable online appearance mapping 401 | and loop-closure detection over a 1000 km trajectory, with mean filter update times of 14  402 | ms. The scalability of the system is achieved by defining a sparse approximation to the 403 | FAB-MAP model suitable for implementation using an inverted index. Our formulation of the 404 | problem is fully probabilistic and naturally incorporates robustness against perceptual 405 | aliasing. We also demonstrate that the approach substantially outperforms the standard 406 | term-frequency inverse-document-frequency (tf-idf) ranking measure. The 1000 km data set 407 | comprising almost a terabyte of omni-directional and stereo imagery is available for use, 408 | and we hope that it will serve as a benchmark for future systems.}, 409 | language = {en}, 410 | number = {9}, 411 | urldate = {2018-03-18}, 412 | journal = {The International Journal of Robotics Research}, 413 | author = {Cummins, Mark and Newman, Paul}, 414 | month = aug, 415 | year = {2011}, 416 | pages = {1100--1123} 417 | } 418 | 419 | @inproceedings{hou_convolutional_2015, 420 | title = {Convolutional neural network-based image representation for visual loop closure detection}, 421 | doi = {10.1109/ICInfA.2015.7279659}, 422 | abstract = {Deep convolutional neural networks (CNN) have recently been shown in many computer vision 423 | and pattern recognition applications to outperform by a significant margin state-of-the-art solutions that use 424 | traditional hand-crafted features. 
However, this impressive performance is yet to be fully exploited in 425 | robotics. In this paper, we focus one specific problem that can benefit from the recent development of the CNN 426 | technology, i.e., we focus on using a pre-trained CNN model as a method of generating an image representation 427 | appropriate for visual loop closure detection in SLAM (simultaneous localization and mapping). We perform a 428 | comprehensive evaluation of the outputs at the intermediate layers of a CNN as image descriptors, in 429 | comparison with state-of-the-art image descriptors, in terms of their ability to match images for detecting 430 | loop closures. The main conclusions of our study include: (a) CNN-based image representations perform 431 | comparably to state-of-the-art hand-crafted competitors in environments without significant lighting change, 432 | (b) they outperform state-of-the-art competitors when lighting changes significantly, and (c) they are also 433 | significantly faster to extract than the state-of-the-art hand-crafted features even on a conventional CPU and 434 | are two orders of magnitude faster on an entry-level GPU.}, 435 | booktitle = {2015 {IEEE} {International} {Conference} on {Information} and {Automation}}, 436 | author = {Hou, Y. and Zhang, H. and Zhou, S.}, 437 | month = aug, 438 | year = {2015}, 439 | keywords = {neural nets, object detection, Standards, Visualization, Feature extraction, image 440 | representation, convolutional neural network, robot vision, Simultaneous localization and mapping, SLAM, SLAM 441 | (robots), Lighting, CNN technology, computer vision application, conventional CPU, Convolutional neural 442 | networks(CNN), entry-level GPU, graphics processing unit, hand-crafted feature, image descriptors, image 443 | matching, Image retrieval, loop closure detection, pattern recognition application, robotics, simultaneous 444 | localization and mapping, visual loop closure detection}, 445 | pages = {2238--2245}, 446 | file = {IEEE Xplore Abstract 447 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/V9UMRSXL/7279659.html:text/html} 448 | } 449 | 450 | @misc{noauthor_download_nodate, 451 | title = {Download {New} {Tsukuba} {Stereo} {Dataset}}, 452 | url = {http://www.cvlab.cs.tsukuba.ac.jp/dataset/tsukubastereo.php}, 453 | urldate = {2018-03-18}, 454 | file = {Download New Tsukuba Stereo 455 | Dataset:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/8JHXHY4T/tsukubastereo.html:text/html} 456 | } 457 | 458 | @inproceedings{naseer_robust_2015, 459 | title = {Robust visual {SLAM} across seasons}, 460 | doi = {10.1109/IROS.2015.7353721}, 461 | abstract = {In this paper, we present an appearance-based visual SLAM approach that focuses on 462 | detecting loop closures across seasons. Given two image sequences, our method first extracts one descriptor 463 | per image for both sequences using a deep convolutional neural network. Then, we compute a similarity matrix 464 | by comparing each image of a query sequence with a database. Finally, based on the similarity matrix, we 465 | formulate a flow network problem and compute matching hypotheses between sequences. In this way, our approach 466 | can handle partially matching routes, loops in the trajectory and different speeds of the robot. 
With a 467 | matching hypothesis as loop closure information and the odometry information of the robot, we formulate a 468 | graph based SLAM problem and compute a joint maximum likelihood trajectory.}, 469 | booktitle = {2015 {IEEE}/{RSJ} {International} {Conference} on {Intelligent} {Robots} and {Systems} 470 | ({IROS})}, 471 | author = {Naseer, T. and Ruhnke, M. and Stachniss, C. and Spinello, L. and Burgard, W.}, 472 | month = sep, 473 | year = {2015}, 474 | keywords = {neural nets, feature extraction, Robustness, Visualization, Feature extraction, robot 475 | vision, Simultaneous localization and mapping, SLAM (robots), image matching, appearance-based visual SLAM 476 | approach, Databases, deep convolutional neural network, descriptor extraction, flow network problem, graph 477 | based SLAM problem, image retrieval, image sequences, joint maximum likelihood trajectory, loop-closure 478 | detection, matching hypothesis, matrix algebra, odometry information, partially matching routes, query 479 | sequence, robust visual SLAM, similarity matrix, Trajectory}, 480 | pages = {2529--2535}, 481 | file = {IEEE Xplore Abstract 482 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/IGPB9WXX/7353721.html:text/html;IEEE Xplore 483 | Full Text PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/QNLHRMMX/Naseer et al. - 2015 - Robust 484 | visual SLAM across seasons.pdf:application/pdf} 485 | } 486 | 487 | @inproceedings{klein_parallel_2007, 488 | title = {Parallel tracking and mapping for small {AR} workspaces}, 489 | booktitle = {Mixed and {Augmented} {Reality}, 2007. {ISMAR} 2007. 6th {IEEE} and {ACM} {International} 490 | {Symposium} on}, 491 | publisher = {IEEE}, 492 | author = {Klein, Georg and Murray, David}, 493 | year = {2007}, 494 | pages = {225--234}, 495 | file = 496 | {KleinMurray2007ISMAR.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/Q4WL9U57/KleinMurray2007ISMAR.pdf:application/pdf} 497 | } 498 | 499 | @article{mur-artal_orb-slam:_2015, 500 | title = {{ORB}-{SLAM}: {A} {Versatile} and {Accurate} {Monocular} {SLAM} {System}}, 501 | volume = {31}, 502 | issn = {1552-3098}, 503 | shorttitle = {{ORB}-{SLAM}}, 504 | doi = {10.1109/TRO.2015.2463671}, 505 | abstract = {This paper presents ORB-SLAM, a feature-based monocular simultaneous localization and 506 | mapping (SLAM) system that operates in real time, in small and large indoor and outdoor environments. The 507 | system is robust to severe motion clutter, allows wide baseline loop closing and relocalization, and includes 508 | full automatic initialization. Building on excellent algorithms of recent years, we designed from scratch a 509 | novel system that uses the same features for all SLAM tasks: tracking, mapping, relocalization, and loop 510 | closing. A survival of the fittest strategy that selects the points and keyframes of the reconstruction leads 511 | to excellent robustness and generates a compact and trackable map that only grows if the scene content 512 | changes, allowing lifelong operation. We present an exhaustive evaluation in 27 sequences from the most 513 | popular datasets. ORB-SLAM achieves unprecedented performance with respect to other state-of-the-art monocular 514 | SLAM approaches. For the benefit of the community, we make the source code public.}, 515 | number = {5}, 516 | journal = {IEEE Transactions on Robotics}, 517 | author = {Mur-Artal, R. and Montiel, J. M. M. and Tardós, J. 
D.}, 518 | month = oct, 519 | year = {2015}, 520 | keywords = {Visualization, Feature extraction, Computational modeling, Simultaneous localization and 521 | mapping, SLAM (robots), Cameras, feature-based monocular simultaneous localization and mapping system, 522 | Lifelong mapping, localization, monocular vision, Optimization, ORB-SLAM system, Real-time systems, 523 | recognition, simultaneous localization and mapping (SLAM), survival of the fittest strategy}, 524 | pages = {1147--1163}, 525 | file = {IEEE Xplore Abstract 526 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/659EIHPJ/7219438.html:text/html;IEEE Xplore 527 | Full Text PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/NSUYHISU/Mur-Artal et al. - 2015 - 528 | ORB-SLAM A Versatile and Accurate Monocular SLAM .pdf:application/pdf} 529 | } 530 | 531 | @inproceedings{martinez-carranza_towards_2015, 532 | title = {Towards autonomous flight of micro aerial vehicles using {ORB}-{SLAM}}, 533 | doi = {10.1109/RED-UAS.2015.7441013}, 534 | abstract = {In the last couple of years a novel visual simultaneous localisation and mapping (SLAM) 535 | system, based on visual features, has emerged as one of the best, if not the best, systems for estimating the 536 | 6D camera pose whilst building a 3D map of the observed scene. This method is called ORB-SLAM and one of its 537 | key ideas is to use the same visual descriptor, a binary descriptor called ORB, for all the visual tasks, this 538 | is, for feature matching, relocalisation and loop closure. On the top of this, ORB-SLAM combines local and 539 | graph-based global bundle adjustment, which enables a scalable map generation whilst keeping real-time 540 | performance. Therefore, motivated by its performance in terms of processing speed, robustness against erratic 541 | motion and scalability, in this paper we present an implementation of autonomous flight for a low-cost micro 542 | aerial vehicle (MAV), where ORB-SLAM is used as a visual positioning system that feeds a PD controller that 543 | controls pitch, roll and yaw. Our results indicate that our implementation has potential and could soon be 544 | implemented on a bigger aerial platform with more complex trajectories to be flown autonomously.}, 545 | booktitle = {2015 {Workshop} on {Research}, {Education} and {Development} of {Unmanned} {Aerial} 546 | {Systems} ({RED}-{UAS})}, 547 | author = {Martínez-Carranza, J. and Loewen, N. and Márquez, F. and García, E. O. 
and Mayol-Cuevas, 548 | W.}, 549 | month = nov, 550 | year = {2015}, 551 | keywords = {Visualization, Simultaneous localization and mapping, SLAM (robots), image matching, 552 | Trajectory, Cameras, 3D map, 6D camera pose estimation, autonomous aerial vehicles, autonomous flight, binary 553 | descriptor, cameras, erratic motion, feature matching, feature relocalisation, graph theory, graph-based 554 | global bundle adjustment, local bundle adjustment, loop closure, microaerial vehicles, microrobots, 555 | Navigation, ORB-SLAM, PD control, PD controller, pitch control, pose estimation, position control, processing 556 | speed, robust control, robustness, roll control, scalable map generation, simultaneous localisation and 557 | mapping system, Vehicles, visual descriptor, visual features, visual positioning system, yaw control}, 558 | pages = {241--248}, 559 | file = {IEEE Xplore Abstract 560 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/9L3ET2PX/7441013.html:text/html;IEEE Xplore 561 | Full Text PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/23GCDADZ/Martínez-Carranza et al. - 562 | 2015 - Towards autonomous flight of micro aerial vehicles.pdf:application/pdf} 563 | } 564 | 565 | @inproceedings{engel_lsd-slam:_2014, 566 | series = {Lecture {Notes} in {Computer} {Science}}, 567 | title = {{LSD}-{SLAM}: {Large}-{Scale} {Direct} {Monocular} {SLAM}}, 568 | isbn = {978-3-319-10604-5 978-3-319-10605-2}, 569 | shorttitle = {{LSD}-{SLAM}}, 570 | url = {https://link-springer-com.ezproxy.rit.edu/chapter/10.1007/978-3-319-10605-2_54}, 571 | doi = {10.1007/978-3-319-10605-2_54}, 572 | abstract = {We propose a direct (feature-less) monocular SLAM algorithm which, in contrast to current 573 | state-of-the-art regarding direct methods, allows to build large-scale, consistent maps of the environment. 574 | Along with highly accurate pose estimation based on direct image alignment, the 3D environment is 575 | reconstructed in real-time as pose-graph of keyframes with associated semi-dense depth maps. These are 576 | obtained by filtering over a large number of pixelwise small-baseline stereo comparisons. The explicitly 577 | scale-drift aware formulation allows the approach to operate on challenging sequences including large 578 | variations in scene scale. Major enablers are two key novelties: (1) a novel direct tracking method which 579 | operates on sim(3){\textbackslash}mathfrak\{sim\}(3), thereby explicitly detecting scale-drift, and (2) an 580 | elegant probabilistic solution to include the effect of noisy depth values into tracking. The resulting direct 581 | monocular SLAM system runs in real-time on a CPU.}, 582 | language = {en}, 583 | urldate = {2018-03-27}, 584 | booktitle = {Computer {Vision} – {ECCV} 2014}, 585 | publisher = {Springer, Cham}, 586 | author = {Engel, Jakob and Schöps, Thomas and Cremers, Daniel}, 587 | month = sep, 588 | year = {2014}, 589 | pages = {834--849}, 590 | file = {Full Text PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/VSG3E5Z7/Engel et al. 
591 | - 2014 - LSD-SLAM Large-Scale Direct Monocular 592 | SLAM.pdf:application/pdf;Snapshot:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/7HUHDQI4/978-3-319-10605-2_54.html:text/html} 593 | } 594 | 595 | @article{mur-artal_orb-slam2:_2017, 596 | title = {{ORB}-{SLAM}2: {An} {Open}-{Source} {SLAM} {System} for {Monocular}, {Stereo}, and {RGB}-{D} 597 | {Cameras}}, 598 | volume = {33}, 599 | issn = {1552-3098}, 600 | shorttitle = {{ORB}-{SLAM}2}, 601 | doi = {10.1109/TRO.2017.2705103}, 602 | abstract = {We present ORB-SLAM2, a complete simultaneous localization and mapping (SLAM) system for 603 | monocular, stereo and RGB-D cameras, including map reuse, loop closing, and relocalization capabilities. The 604 | system works in real time on standard central processing units in a wide variety of environments from small 605 | hand-held indoors sequences, to drones flying in industrial environments and cars driving around a city. Our 606 | back-end, based on bundle adjustment with monocular and stereo observations, allows for accurate trajectory 607 | estimation with metric scale. Our system includes a lightweight localization mode that leverages visual 608 | odometry tracks for unmapped regions and matches with map points that allow for zero-drift localization. The 609 | evaluation on 29 popular public sequences shows that our method achieves state-of-the-art accuracy, being in 610 | most cases the most accurate SLAM solution. We publish the source code, not only for the benefit of the SLAM 611 | community, but with the aim of being an out-of-the-box SLAM solution for researchers in other fields.}, 612 | number = {5}, 613 | journal = {IEEE Transactions on Robotics}, 614 | author = {Mur-Artal, R. and Tardós, J. D.}, 615 | month = oct, 616 | year = {2017}, 617 | keywords = {Feature extraction, mobile robots, robot vision, Simultaneous localization and mapping, 618 | SLAM (robots), Trajectory, Cameras, Optimization, simultaneous localization and mapping (SLAM), cameras, 619 | ORB-SLAM, distance measurement, Kalman filters, lightweight localization mode, Localization, map points, 620 | mapping, monocular cameras, motion estimation, open-source SLAM system, path planning, RGB-D, RGB-D cameras, 621 | simultaneous localization and mapping system, SLAM community, stereo, stereo cameras, Tracking loops, 622 | zero-drift localization}, 623 | pages = {1255--1262}, 624 | file = {IEEE Xplore Abstract 625 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/VPKUCTDG/7946260.html:text/html;IEEE Xplore 626 | Full Text PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/59L3WGMD/Mur-Artal and Tardós - 2017 - 627 | ORB-SLAM2 An Open-Source SLAM System for Monocula.pdf:application/pdf} 628 | } 629 | 630 | @inproceedings{caselitz_monocular_2016, 631 | title = {Monocular camera localization in 3D {LiDAR} maps}, 632 | isbn = {978-1-5090-3762-9}, 633 | url = {http://ieeexplore.ieee.org/document/7759304/}, 634 | doi = {10.1109/IROS.2016.7759304}, 635 | abstract = {Localizing a camera in a given map is essential for vision-based navigation. In contrast 636 | to common methods for visual localization that use maps acquired with cameras, we propose a novel approach, 637 | which tracks the pose of monocular camera with respect to a given 3D LiDAR map. We employ a visual odometry 638 | system based on local bundle adjustment to reconstruct a sparse set of 3D points from image features. 
These 639 | points are continuously matched against the map to track the camera pose in an online fashion. Our approach to 640 | visual localization has several advantages. Since it only relies on matching geometry, it is robust to changes 641 | in the photometric appearance of the environment. Utilizing panoramic LiDAR maps additionally provides 642 | viewpoint invariance. Yet lowcost and lightweight camera sensors are used for tracking. We present real-world 643 | experiments demonstrating that our method accurately estimates the 6-DoF camera pose over long trajectories 644 | and under varying conditions.}, 645 | language = {en}, 646 | urldate = {2018-03-27}, 647 | publisher = {IEEE}, 648 | author = {Caselitz, Tim and Steder, Bastian and Ruhnke, Michael and Burgard, Wolfram}, 649 | month = oct, 650 | year = {2016}, 651 | pages = {1926--1931}, 652 | file = {Caselitz et al. - 2016 - Monocular camera localization in 3D LiDAR 653 | maps.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/CEL7IHIH/Caselitz et al. - 2016 - Monocular 654 | camera localization in 3D LiDAR maps.pdf:application/pdf} 655 | } 656 | 657 | @inproceedings{wolcott_visual_2014, 658 | title = {Visual localization within {LIDAR} maps for automated urban driving}, 659 | isbn = {978-1-4799-6934-0 978-1-4799-6931-9}, 660 | url = {http://ieeexplore.ieee.org/document/6942558/}, 661 | doi = {10.1109/IROS.2014.6942558}, 662 | abstract = {This paper reports on the problem of map-based visual localization in urban environments 663 | for autonomous vehicles. Self-driving cars have become a reality on roadways and are going to be a consumer 664 | product in the near future. One of the most significant road-blocks to autonomous vehicles is the prohibitive 665 | cost of the sensor suites necessary for localization. The most common sensor on these platforms, a 666 | three-dimensional (3D) light detection and ranging (LIDAR) scanner, generates dense point clouds with measures 667 | of surface reflectivity—which other state-of-the-art localization methods have shown are capable of 668 | centimeter-level accuracy. Alternatively, we seek to obtain comparable localization accuracy with significantly 669 | cheaper, commodity cameras. We propose to localize a single monocular camera within a 3D prior groundmap, 670 | generated by a survey vehicle equipped with 3D LIDAR scanners. To do so, we exploit a graphics processing unit 671 | to generate several synthetic views of our belief environment. We then seek to maximize the normalized mutual 672 | information between our real camera measurements and these synthetic views. Results are shown for two 673 | different datasets, a 3.0 km and a 1.5 km trajectory, where we also compare against the state-of-the-art in 674 | LIDAR map-based localization.}, 675 | language = {en}, 676 | urldate = {2018-03-27}, 677 | publisher = {IEEE}, 678 | author = {Wolcott, Ryan W. 
and Eustice, Ryan M.}, 679 | month = sep, 680 | year = {2014}, 681 | pages = {176--183}, 682 | file = {Wolcott and Eustice - 2014 - Visual localization within LIDAR maps for 683 | automate.pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/QF64SI42/Wolcott and Eustice - 2014 - 684 | Visual localization within LIDAR maps for automate.pdf:application/pdf} 685 | } 686 | 687 | @inproceedings{chen_robust_2017, 688 | title = {Robust {SLAM} system based on monocular vision and {LiDAR} for robotic urban search and 689 | rescue}, 690 | doi = {10.1109/SSRR.2017.8088138}, 691 | abstract = {In this paper, we propose a monocular SLAM system for robotic urban search and rescue 692 | (USAR), based on which most USAR tasks (e.g. localization, mapping, exploration and object recognition) can be 693 | fulfilled by rescue robots with only a single camera. The proposed system can be a promising basis to 694 | implement fully autonomous rescue robots. However, the feature-based map built by the monocular SLAM is 695 | difficult for the operator to understand and use. We therefore combine the monocular SLAM with a 2D LIDAR SLAM 696 | to realize a 2D mapping and 6D localization SLAM system which can not only obtain a real scale of the 697 | environment and make the map more friendly to users, but also solve the problem that the robot pose cannot be 698 | tracked by the 2D LIDAR SLAM when the robot climbing stairs and ramps. We test our system using a real rescue 699 | robot in simulated disaster environments. The experimental results show that good performance can be achieved 700 | using the proposed system in the USAR. The system has also been successfully applied and tested in the RoboCup 701 | Rescue Robot League (RRL) competitions, where our rescue robot team entered the top 5 and won the Best in 702 | Class small robot mobility in 2016 RoboCup RRL Leipzig Germany, and the champions of 2016 and 2017 RoboCup 703 | China Open RRL competitions.}, 704 | booktitle = {2017 {IEEE} {International} {Symposium} on {Safety}, {Security} and {Rescue} {Robotics} 705 | ({SSRR})}, 706 | author = {Chen, X. and Zhang, H. and Lu, H. and Xiao, J. and Qiu, Q. and Li, Y.}, 707 | month = oct, 708 | year = {2017}, 709 | keywords = {Feature extraction, mobile robots, robot vision, Simultaneous localization and mapping, 710 | SLAM (robots), monocular vision, 2D LIDAR SLAM, 6D localization SLAM system, disasters, fully autonomous 711 | rescue robots, Laser radar, LiDAR SLAM, monocular SLAM, monocular SLAM system, multi-robot systems, object 712 | recognition, Object recognition, relocalization, rescue robot team, rescue robots, Rescue robots, RoboCup 713 | Rescue Robot League competitions, robot climbing stairs, robot pose, robotic urban search, robust SLAM system, 714 | service robots, small robot mobility, Two dimensional displays, Urban search and rescue, USAR tasks}, 715 | pages = {41--47} 716 | } 717 | 718 | @inproceedings{levinson_map-based_2007, 719 | title = {Map-{Based} {Precision} {Vehicle} {Localization} in {Urban} {Environments}}, 720 | isbn = {978-0-262-52484-1}, 721 | url = {http://www.roboticsproceedings.org/rss03/p16.pdf}, 722 | doi = {10.15607/RSS.2007.III.016}, 723 | abstract = {Many urban navigation applications (e.g., autonomous navigation, driver assistance 724 | systems) can benefit greatly from localization with centimeter accuracy. 
Yet such accuracy cannot be achieved 725 | reliably with GPS-based inertial guidance systems, specifically in urban settings.}, 726 | language = {en}, 727 | urldate = {2018-03-27}, 728 | publisher = {Robotics: Science and Systems Foundation}, 729 | author = {Levinson, J. and Montemerlo, M. and Thrun, S.}, 730 | month = jun, 731 | year = {2007}, 732 | file = {Levinson et al. - 2007 - Map-Based Precision Vehicle Localization in Urban 733 | .pdf:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/A9AURFHW/Levinson et al. - 2007 - Map-Based 734 | Precision Vehicle Localization in Urban .pdf:application/pdf} 735 | } 736 | 737 | @article{sermanet_overfeat:_2013, 738 | title = {{OverFeat}: {Integrated} {Recognition}, {Localization} and {Detection} using {Convolutional} 739 | {Networks}}, 740 | shorttitle = {{OverFeat}}, 741 | url = {http://arxiv.org/abs/1312.6229}, 742 | abstract = {We present an integrated framework for using Convolutional Networks for classification, 743 | localization and detection. We show how a multiscale and sliding window approach can be efficiently 744 | implemented within a ConvNet. We also introduce a novel deep learning approach to localization by learning to 745 | predict object boundaries. Bounding boxes are then accumulated rather than suppressed in order to increase 746 | detection confidence. We show that different tasks can be learned simultaneously using a single shared 747 | network. This integrated framework is the winner of the localization task of the ImageNet Large Scale Visual 748 | Recognition Challenge 2013 (ILSVRC2013) and obtained very competitive results for the detection and 749 | classifications tasks. In post-competition work, we establish a new state of the art for the detection task. 750 | Finally, we release a feature extractor from our best model called OverFeat.}, 751 | urldate = {2018-03-27}, 752 | journal = {arXiv:1312.6229 [cs]}, 753 | author = {Sermanet, Pierre and Eigen, David and Zhang, Xiang and Mathieu, Michael and Fergus, Rob and 754 | LeCun, Yann}, 755 | month = dec, 756 | year = {2013}, 757 | note = {arXiv: 1312.6229}, 758 | keywords = {Computer Science - Computer Vision and Pattern Recognition}, 759 | file = {arXiv\:1312.6229 760 | PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/G2RIBP3F/Sermanet et al. - 2013 - OverFeat 761 | Integrated Recognition, Localization and.pdf:application/pdf} 762 | } 763 | 764 | @inproceedings{bay_surf:_2006, 765 | series = {Lecture {Notes} in {Computer} {Science}}, 766 | title = {{SURF}: {Speeded} {Up} {Robust} {Features}}, 767 | isbn = {978-3-540-33832-1 978-3-540-33833-8}, 768 | shorttitle = {{SURF}}, 769 | url = {https://link.springer.com/chapter/10.1007/11744023_32}, 770 | doi = {10.1007/11744023_32}, 771 | abstract = {In this paper, we present a novel scale- and rotation-invariant interest point detector 772 | and descriptor, coined SURF (Speeded Up Robust Features). It approximates or even outperforms previously 773 | proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and 774 | compared much faster.This is achieved by relying on integral images for image convolutions; by building on the 775 | strengths of the leading existing detectors and descriptors (in casu, using a Hessian matrix-based measure for 776 | the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This 777 | leads to a combination of novel detection, description, and matching steps. 
The paper presents experimental 778 | results on a standard evaluation set, as well as on imagery obtained in the context of a real-life object 779 | recognition application. Both show SURF’s strong performance.}, 780 | language = {en}, 781 | urldate = {2018-03-27}, 782 | booktitle = {Computer {Vision} – {ECCV} 2006}, 783 | publisher = {Springer, Berlin, Heidelberg}, 784 | author = {Bay, Herbert and Tuytelaars, Tinne and Gool, Luc Van}, 785 | month = may, 786 | year = {2006}, 787 | pages = {404--417}, 788 | file = {Full Text PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/XWPJJMZ9/Bay et al. - 789 | 2006 - SURF Speeded Up Robust 790 | Features.pdf:application/pdf;Snapshot:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/FAJ8LMU6/10.html:text/html} 791 | } 792 | 793 | @inproceedings{lowe_object_1999, 794 | title = {Object recognition from local scale-invariant features}, 795 | volume = {2}, 796 | doi = {10.1109/ICCV.1999.790410}, 797 | abstract = {An object recognition system has been developed that uses a new class of local image 798 | features. The features are invariant to image scaling, translation, and rotation, and partially invariant to 799 | illumination changes and affine or 3D projection. These features share similar properties with neurons in 800 | inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently 801 | detected through a staged filtering approach that identifies stable points in scale space. Image keys are 802 | created that allow for local geometric deformations by representing blurred image gradients in multiple 803 | orientation planes and at multiple scales. The keys are used as input to a nearest neighbor indexing method 804 | that identifies candidate object matches. Final verification of each match is achieved by finding a low 805 | residual least squares solution for the unknown model parameters. Experimental results show that robust object 806 | recognition can be achieved in cluttered partially occluded images with a computation time of under 2 807 | seconds}, 808 | booktitle = {Proceedings of the {Seventh} {IEEE} {International} {Conference} on {Computer} {Vision}}, 809 | author = {Lowe, D. G.}, 810 | year = {1999}, 811 | keywords = {feature extraction, Lighting, image matching, Neurons, object recognition, Object 812 | recognition, 3D projection, blurred image gradients, candidate object matches, cluttered partially occluded 813 | images, computation time, computational geometry, Computer science, Electrical capacitance tomography, 814 | Filters, Image recognition, inferior temporal cortex, Layout, least squares approximations, local geometric 815 | deformations, local image features, local scale-invariant features, low residual least squares solution, 816 | multiple orientation planes, nearest neighbor indexing method, primate vision, Programmable logic arrays, 817 | Reactive power, robust object recognition, staged filtering approach, unknown model parameters}, 818 | pages = {1150--1157 vol.2}, 819 | file = {IEEE Xplore Abstract 820 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/AGRJWLKV/790410.html:text/html} 821 | } 822 | 823 | @inproceedings{dalal_histograms_2005, 824 | title = {Histograms of oriented gradients for human detection}, 825 | volume = {1}, 826 | doi = {10.1109/CVPR.2005.177}, 827 | abstract = {We study the question of feature sets for robust visual object recognition; adopting 828 | linear SVM based human detection as a test case. 
After reviewing existing edge and gradient based descriptors, 829 | we show experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly 830 | outperform existing feature sets for human detection. We study the influence of each stage of the computation 831 | on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial 832 | binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for 833 | good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we 834 | introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose 835 | variations and backgrounds.}, 836 | booktitle = {2005 {IEEE} {Computer} {Society} {Conference} on {Computer} {Vision} and {Pattern} 837 | {Recognition} ({CVPR}'05)}, 838 | author = {Dalal, N. and Triggs, B.}, 839 | month = jun, 840 | year = {2005}, 841 | keywords = {Testing, feature extraction, Humans, object detection, Robustness, object recognition, 842 | Object recognition, coarse spatial binning, contrast normalization, edge based descriptors, fine orientation 843 | binning, fine-scale gradients, gradient based descriptors, gradient methods, High performance computing, 844 | Histograms, histograms of oriented gradients, human detection, Image databases, Image edge detection, linear 845 | SVM, Object detection, overlapping descriptor, pedestrian database, robust visual object recognition, support 846 | vector machines, Support vector machines}, 847 | pages = {886--893 vol. 1}, 848 | file = {IEEE Xplore Abstract 849 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/AGLJKWFG/1467360.html:text/html;IEEE Xplore 850 | Full Text PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/7GRP49FP/Dalal and Triggs - 2005 - 851 | Histograms of oriented gradients for human detecti.pdf:application/pdf} 852 | } 853 | 854 | @inproceedings{rublee_orb:_2011, 855 | title = {{ORB}: {An} efficient alternative to {SIFT} or {SURF}}, 856 | shorttitle = {{ORB}}, 857 | doi = {10.1109/ICCV.2011.6126544}, 858 | abstract = {Feature matching is at the base of many computer vision problems, such as object 859 | recognition or structure from motion. Current methods rely on costly descriptors for detection and matching. 860 | In this paper, we propose a very fast binary descriptor based on BRIEF, called ORB, which is rotation 861 | invariant and resistant to noise. We demonstrate through experiments how ORB is at two orders of magnitude 862 | faster than SIFT, while performing as well in many situations. The efficiency is tested on several real-world 863 | applications, including object detection and patch-tracking on a smart phone.}, 864 | booktitle = {2011 {International} {Conference} on {Computer} {Vision}}, 865 | author = {Rublee, E. and Rabaud, V. and Konolige, K. and Bradski, G.}, 866 | month = nov, 867 | year = {2011}, 868 | keywords = {object detection, image matching, binary descriptor, feature matching, object recognition, 869 | Boats, BRIEF, computer vision, noise resistance, ORB, patch-tracking, SIFT, smart phone, SURF, tracking, 870 | transforms}, 871 | pages = {2564--2571}, 872 | file = {IEEE Xplore Abstract 873 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/URPH74YZ/6126544.html:text/html;IEEE Xplore 874 | Full Text PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/MII6WPYC/Rublee et al. 
- 2011 - ORB An 875 | efficient alternative to SIFT or SURF.pdf:application/pdf} 876 | } 877 | 878 | @inproceedings{levinson_robust_2010, 879 | title = {Robust vehicle localization in urban environments using probabilistic maps}, 880 | doi = {10.1109/ROBOT.2010.5509700}, 881 | abstract = {Autonomous vehicle navigation in dynamic urban environments requires localization accuracy 882 | exceeding that available from GPS-based inertial guidance systems. We have shown previously that GPS, IMU, and 883 | LIDAR data can be used to generate a high-resolution infrared remittance ground map that can be subsequently 884 | used for localization. We now propose an extension to this approach that yields substantial improvements over 885 | previous work in vehicle localization, including higher precision, the ability to learn and improve maps over 886 | time, and increased robustness to environment changes and dynamic obstacles. Specifically, we model the 887 | environment, instead of as a spatial grid of fixed infrared remittance values, as a probabilistic grid whereby 888 | every cell is represented as its own gaussian distribution over remittance values. Subsequently, Bayesian 889 | inference is able to preferentially weight parts of the map most likely to be stationary and of consistent 890 | angular reflectivity, thereby reducing uncertainty and catastrophic errors. Furthermore, by using offline SLAM 891 | to align multiple passes of the same environment, possibly separated in time by days or even months, it is 892 | possible to build an increasingly robust understanding of the world that can be then exploited for 893 | localization. We validate the effectiveness of our approach by using these algorithms to localize our vehicle 894 | against probabilistic maps in various dynamic environments, achieving RMS accuracy in the 10cm-range and thus 895 | outperforming previous work. Importantly, this approach has enabled us to autonomously drive our vehicle for 896 | hundreds of miles in dense traffic on narrow urban roads which were formerly unnavigable with previous 897 | localization methods.}, 898 | booktitle = {2010 {IEEE} {International} {Conference} on {Robotics} and {Automation}}, 899 | author = {Levinson, J. 
and Thrun, S.}, 900 | month = may, 901 | year = {2010}, 902 | keywords = {Robustness, probability, SLAM, Navigation, Laser radar, autonomous vehicle navigation, 903 | Bayesian inference, Bayesian methods, dynamic urban environment, Gaussian distribution, Global Positioning 904 | System, GPS-based inertial guidance systems, high-resolution infrared remittance ground map, IMU, LIDAR, 905 | Mobile robots, navigation, probabilistic grid, probabilistic maps, Reflectivity, Remotely operated vehicles, 906 | road vehicles, robust vehicle localization, spatial grid, Vehicle dynamics}, 907 | pages = {4372--4378}, 908 | file = {IEEE Xplore Abstract 909 | Record:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/96JIMV4W/5509700.html:text/html;IEEE Xplore 910 | Full Text PDF:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/39PM43BQ/Levinson and Thrun - 2010 - 911 | Robust vehicle localization in urban environments .pdf:application/pdf} 912 | } 913 | 914 | @misc{satoshi_kagami_autonomous_2013, 915 | title = {Autonomous {Vehicle} {Navigation} by {Building} 3D {Map} and by {Detecting} {Human} 916 | {Trajectory} using {LIDAR}}, 917 | url = 918 | {/paper/Autonomous-Vehicle-Navigation-by-Building-3D-Map-by-Kagami-Thompson/81b14341e3e063d819d032b6ce0bc0be0917c867}, 919 | abstract = {This paper describes an autonomous vehicle navigation system based on Velodyne LIDAR. The 920 | system mainly focusing on an autonomy at the car park, and it includes following functions, 1) odometory 921 | correction, 2) 3D map building, 3) localization, 4) detecting human trajectories as well as static obstacles, 922 | 5) path planning, and 6) vehicle control to a given trajectory. All those functions are developed on ROS. Car 923 | park of 70x50[m] area is used for experiment and results are shown.}, 924 | urldate = {2018-03-28}, 925 | author = {Kagami, Satoshi and Thompson, Simon and Samejima, Ippei and Hamada, Tsuyoshi and Kato, Shinpei and 926 | Hatao, Naotaka and Nihei, Yuma and Egawa, Takuro and Takeda, Kazuya and Takemura, Hiroshi and Mizoguchi, Hiroshi}, 927 | year = {2013}, 928 | file = 929 | {Snapshot:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/JTKYRN7D/81b14341e3e063d819d032b6ce0bc0be0917c867.html:text/html} 930 | } 931 | 932 | @article{sujiwo_monocular_2016, 933 | title = {Monocular {Vision}-{Based} {Localization} {Using} {ORB}-{SLAM} with {LIDAR}-{Aided} {Mapping} 934 | in {Real}-{World} {Robot} {Challenge}}, 935 | volume = {28}, 936 | url = {https://www.fujipress.jp/jrm/rb/robot002800040479/}, 937 | doi = {10.20965/jrm.2016.p0479}, 938 | abstract = {Title: Monocular Vision-Based Localization Using ORB-SLAM with LIDAR-Aided Mapping in 939 | Real-World Robot Challenge {\textbar} Keywords: visual localization, autonomous vehicle, field robotics, 940 | Tsukuba Challenge {\textbar} Author: Adi Sujiwo, Tomohito Ando, Eijiro Takeuchi, Yoshiki Ninomiya, and Masato 941 | Edahiro}, 942 | number = {4}, 943 | urldate = {2018-03-28}, 944 | journal = {Journal of Robotics and Mechatronics}, 945 | author = {Sujiwo, Adi and Ando, Tomohito and Takeuchi, Eijiro and Ninomiya, Yoshiki and Edahiro, 946 | Masato}, 947 | month = aug, 948 | year = {2016}, 949 | pages = {479--490}, 950 | file = 951 | {Snapshot:/home/zach/.zotero/zotero/n6tqo7vy.default/zotero/storage/NRRMNLPE/robot002800040479.html:text/html} 952 | } 953 | -------------------------------------------------------------------------------- /paper/eso-pic.sty: -------------------------------------------------------------------------------- 1 | %% 2 | %% This
is file `eso-pic.sty', 3 | %% generated with the docstrip utility. 4 | %% 5 | %% The original source files were: 6 | %% 7 | %% eso-pic.dtx (with options: `package') 8 | %% 9 | %% This is a generated file. 10 | %% 11 | %% Copyright (C) 1998-2002 by Rolf Niepraschk 12 | %% 13 | %% This file may be distributed and/or modified under the conditions of 14 | %% the LaTeX Project Public License, either version 1.2 of this license 15 | %% or (at your option) any later version. The latest version of this 16 | %% license is in: 17 | %% 18 | %% http://www.latex-project.org/lppl.txt 19 | %% 20 | %% and version 1.2 or later is part of all distributions of LaTeX version 21 | %% 1999/12/01 or later. 22 | %% 23 | \NeedsTeXFormat{LaTeX2e}[1999/12/01] 24 | \ProvidesPackage{eso-pic} 25 | [2002/11/16 v1.1b eso-pic (RN)] 26 | \input{cvpr_eso.sty} 27 | \newcommand\LenToUnit[1]{#1\@gobble} 28 | 29 | \newcommand\AtPageUpperLeft[1]{% 30 | \begingroup 31 | \@tempdima=0pt\relax\@tempdimb=\ESO@yoffsetI\relax 32 | \put(\LenToUnit{\@tempdima},\LenToUnit{\@tempdimb}){#1}% 33 | \endgroup 34 | } 35 | \newcommand\AtPageLowerLeft[1]{\AtPageUpperLeft{% 36 | \put(0,\LenToUnit{-\paperheight}){#1}}} 37 | \newcommand\AtPageCenter[1]{\AtPageUpperLeft{% 38 | \put(\LenToUnit{.5\paperwidth},\LenToUnit{-.5\paperheight}){#1}}% 39 | } 40 | \newcommand\AtTextUpperLeft[1]{% 41 | \begingroup 42 | \setlength\@tempdima{1in}% 43 | \ifodd\c@page% 44 | \advance\@tempdima\oddsidemargin% 45 | \else% 46 | \advance\@tempdima\evensidemargin% 47 | \fi% 48 | \@tempdimb=\ESO@yoffsetI\relax\advance\@tempdimb-1in\relax% 49 | \advance\@tempdimb-\topmargin% 50 | \advance\@tempdimb-\headheight\advance\@tempdimb-\headsep% 51 | \put(\LenToUnit{\@tempdima},\LenToUnit{\@tempdimb}){#1}% 52 | \endgroup 53 | } 54 | \newcommand\AtTextLowerLeft[1]{\AtTextUpperLeft{% 55 | \put(0,\LenToUnit{-\textheight}){#1}}} 56 | \newcommand\AtTextCenter[1]{\AtTextUpperLeft{% 57 | \put(\LenToUnit{.5\textwidth},\LenToUnit{-.5\textheight}){#1}}} 58 | \newcommand{\ESO@HookI}{} \newcommand{\ESO@HookII}{} 59 | \newcommand{\ESO@HookIII}{} 60 | \newcommand{\AddToShipoutPicture}{% 61 | \@ifstar{\g@addto@macro\ESO@HookII}{\g@addto@macro\ESO@HookI}} 62 | \newcommand{\ClearShipoutPicture}{\global\let\ESO@HookI\@empty} 63 | \newcommand\ESO@isMEMOIR[1]{} 64 | \@ifclassloaded{memoir}{\renewcommand\ESO@isMEMOIR[1]{#1}}{} 65 | \newcommand{\@ShipoutPicture}{% 66 | \bgroup 67 | \@tempswafalse% 68 | \ifx\ESO@HookI\@empty\else\@tempswatrue\fi% 69 | \ifx\ESO@HookII\@empty\else\@tempswatrue\fi% 70 | \ifx\ESO@HookIII\@empty\else\@tempswatrue\fi% 71 | \if@tempswa% 72 | \@tempdima=1in\@tempdimb=-\@tempdima% 73 | \advance\@tempdimb\ESO@yoffsetI% 74 | \ESO@isMEMOIR{% 75 | \advance\@tempdima\trimedge% 76 | \advance\@tempdima\paperwidth% 77 | \advance\@tempdima-\stockwidth% 78 | \if@twoside\ifodd\c@page\else% 79 | \advance\@tempdima-2\trimedge% 80 | \advance\@tempdima-\paperwidth% 81 | \advance\@tempdima\stockwidth% 82 | \fi\fi% 83 | \advance\@tempdimb\trimtop}% 84 | \unitlength=1pt% 85 | \global\setbox\@cclv\vbox{% 86 | \vbox{\let\protect\relax 87 | \pictur@(0,0)(\strip@pt\@tempdima,\strip@pt\@tempdimb)% 88 | \ESO@HookIII\ESO@HookI\ESO@HookII% 89 | \global\let\ESO@HookII\@empty% 90 | \endpicture}% 91 | \nointerlineskip% 92 | \box\@cclv}% 93 | \fi 94 | \egroup 95 | } 96 | \EveryShipout{\@ShipoutPicture} 97 | \RequirePackage{keyval} 98 | \newif\ifESO@dvips\ESO@dvipsfalse \newif\ifESO@grid\ESO@gridfalse 99 | \newif\ifESO@texcoord\ESO@texcoordfalse 100 | \newcommand*\ESO@gridunitname{} 101 | 
\newcommand*\ESO@gridunit{} 102 | \newcommand*\ESO@labelfactor{} 103 | \newcommand*\ESO@griddelta{}\newcommand*\ESO@griddeltaY{} 104 | \newcommand*\ESO@gridDelta{}\newcommand*\ESO@gridDeltaY{} 105 | \newcommand*\ESO@gridcolor{} 106 | \newcommand*\ESO@subgridcolor{} 107 | \newcommand*\ESO@subgridstyle{dotted}% ??? 108 | \newcommand*\ESO@gap{} 109 | \newcommand*\ESO@yoffsetI{}\newcommand*\ESO@yoffsetII{} 110 | \newcommand*\ESO@gridlines{\thinlines} 111 | \newcommand*\ESO@subgridlines{\thinlines} 112 | \newcommand*\ESO@hline[1]{\ESO@subgridlines\line(1,0){#1}} 113 | \newcommand*\ESO@vline[1]{\ESO@subgridlines\line(0,1){#1}} 114 | \newcommand*\ESO@Hline[1]{\ESO@gridlines\line(1,0){#1}} 115 | \newcommand*\ESO@Vline[1]{\ESO@gridlines\line(0,1){#1}} 116 | \newcommand\ESO@fcolorbox[4][]{\fbox{#4}} 117 | \newcommand\ESO@color[1]{} 118 | \newcommand\ESO@colorbox[3][]{% 119 | \begingroup 120 | \fboxrule=0pt\fbox{#3}% 121 | \endgroup 122 | } 123 | \newcommand\gridSetup[6][]{% 124 | \edef\ESO@gridunitname{#1}\edef\ESO@gridunit{#2} 125 | \edef\ESO@labelfactor{#3}\edef\ESO@griddelta{#4} 126 | \edef\ESO@gridDelta{#5}\edef\ESO@gap{#6}} 127 | \define@key{ESO}{texcoord}[true]{\csname ESO@texcoord#1\endcsname} 128 | \define@key{ESO}{pscoord}[true]{\csname @tempswa#1\endcsname 129 | \if@tempswa\ESO@texcoordfalse\else\ESO@texcoordtrue\fi} 130 | \define@key{ESO}{dvips}[true]{\csname ESO@dvips#1\endcsname} 131 | \define@key{ESO}{grid}[true]{\csname ESO@grid#1\endcsname 132 | \setkeys{ESO}{gridcolor=black,subgridcolor=black}} 133 | \define@key{ESO}{colorgrid}[true]{\csname ESO@grid#1\endcsname 134 | \setkeys{ESO}{gridcolor=red,subgridcolor=green}} 135 | \define@key{ESO}{gridcolor}{\def\ESO@gridcolor{#1}} 136 | \define@key{ESO}{subgridcolor}{\def\ESO@subgridcolor{#1}} 137 | \define@key{ESO}{subgridstyle}{\def\ESO@subgridstyle{#1}}% 138 | \define@key{ESO}{gridunit}{% 139 | \def\@tempa{#1} 140 | \def\@tempb{bp} 141 | \ifx\@tempa\@tempb 142 | \gridSetup[\@tempa]{1bp}{1}{10}{50}{2} 143 | \else 144 | \def\@tempb{pt} 145 | \ifx\@tempa\@tempb 146 | \gridSetup[\@tempa]{1pt}{1}{10}{50}{2} 147 | \else 148 | \def\@tempb{in} 149 | \ifx\@tempa\@tempb 150 | \gridSetup[\@tempa]{.1in}{.1}{2}{10}{.5} 151 | \else 152 | \gridSetup[mm]{1mm}{1}{5}{20}{1} 153 | \fi 154 | \fi 155 | \fi 156 | } 157 | \setkeys{ESO}{subgridstyle=solid,pscoord=true,gridunit=mm} 158 | \def\ProcessOptionsWithKV#1{% 159 | \let\@tempc\@empty 160 | \@for\CurrentOption:=\@classoptionslist\do{% 161 | \@ifundefined{KV@#1@\CurrentOption}% 162 | {}{\edef\@tempc{\@tempc,\CurrentOption,}}}% 163 | \edef\@tempc{% 164 | \noexpand\setkeys{#1}{\@tempc\@ptionlist{\@currname.\@currext}}}% 165 | \@tempc 166 | \AtEndOfPackage{\let\@unprocessedoptions\relax}}% 167 | \ProcessOptionsWithKV{ESO}% 168 | \newcommand\ESO@div[2]{% 169 | \@tempdima=#1\relax\@tempdimb=\ESO@gridunit\relax 170 | \@tempdimb=#2\@tempdimb\divide\@tempdima by \@tempdimb% 171 | \@tempcnta\@tempdima\advance\@tempcnta\@ne} 172 | \AtBeginDocument{% 173 | \IfFileExists{color.sty} 174 | {% 175 | \RequirePackage{color} 176 | \let\ESO@color=\color\let\ESO@colorbox=\colorbox 177 | \let\ESO@fcolorbox=\fcolorbox 178 | }{} 179 | \@ifundefined{Gin@driver}{}% 180 | {% 181 | \ifx\Gin@driver\@empty\else% 182 | \filename@parse{\Gin@driver}\def\reserved@a{dvips}% 183 | \ifx\filename@base\reserved@a\ESO@dvipstrue\fi% 184 | \fi 185 | }% 186 | \ifx\pdfoutput\undefined\else 187 | \ifx\pdfoutput\relax\else 188 | \ifcase\pdfoutput\else 189 | \ESO@dvipsfalse% 190 | \fi 191 | \fi 192 | \fi 193 | 
\ifESO@dvips\def\@tempb{eepic}\else\def\@tempb{epic}\fi 194 | \def\@tempa{dotted}%\def\ESO@gap{\LenToUnit{6\@wholewidth}}% 195 | \ifx\@tempa\ESO@subgridstyle 196 | \IfFileExists{\@tempb.sty}% 197 | {% 198 | \RequirePackage{\@tempb} 199 | \renewcommand*\ESO@hline[1]{\ESO@subgridlines\dottedline{\ESO@gap}% 200 | (0,0)(##1,0)} 201 | \renewcommand*\ESO@vline[1]{\ESO@subgridlines\dottedline{\ESO@gap}% 202 | (0,0)(0,##1)} 203 | }{} 204 | \else 205 | \ifx\ESO@gridcolor\ESO@subgridcolor% 206 | \renewcommand*\ESO@gridlines{\thicklines} 207 | \fi 208 | \fi 209 | } 210 | \ifESO@texcoord 211 | \def\ESO@yoffsetI{0pt}\def\ESO@yoffsetII{-\paperheight} 212 | \edef\ESO@griddeltaY{-\ESO@griddelta}\edef\ESO@gridDeltaY{-\ESO@gridDelta} 213 | \else 214 | \def\ESO@yoffsetI{\paperheight}\def\ESO@yoffsetII{0pt} 215 | \edef\ESO@griddeltaY{\ESO@griddelta}\edef\ESO@gridDeltaY{\ESO@gridDelta} 216 | \fi 217 | \newcommand\ESO@gridpicture{% 218 | \begingroup 219 | \setlength\unitlength{\ESO@gridunit}% 220 | \ESO@color{\ESO@subgridcolor}% 221 | \ESO@div{\paperheight}{\ESO@griddelta}% 222 | \multiput(0,0)(0,\ESO@griddeltaY){\@tempcnta}% 223 | {\ESO@hline{\LenToUnit{\paperwidth}}}% 224 | \ESO@div{\paperwidth}{\ESO@griddelta}% 225 | \multiput(0,\LenToUnit{\ESO@yoffsetII})(\ESO@griddelta,0){\@tempcnta}% 226 | {\ESO@vline{\LenToUnit{\paperheight}}}% 227 | \ESO@color{\ESO@gridcolor}% 228 | \ESO@div{\paperheight}{\ESO@gridDelta}% 229 | \multiput(0,0)(0,\ESO@gridDeltaY){\@tempcnta}% 230 | {\ESO@Hline{\LenToUnit{\paperwidth}}}% 231 | \ESO@div{\paperwidth}{\ESO@gridDelta}% 232 | \multiput(0,\LenToUnit{\ESO@yoffsetII})(\ESO@gridDelta,0){\@tempcnta}% 233 | {\ESO@Vline{\LenToUnit{\paperheight}}}% 234 | \fontsize{10}{12}\normalfont% 235 | \ESO@div{\paperwidth}{\ESO@gridDelta}% 236 | \multiput(0,\ESO@gridDeltaY)(\ESO@gridDelta,0){\@tempcnta}{% 237 | \@tempcntb=\@tempcnta\advance\@tempcntb-\@multicnt% 238 | \ifnum\@tempcntb>1\relax 239 | \multiply\@tempcntb by \ESO@gridDelta\relax% 240 | \@tempdima=\@tempcntb sp\@tempdima=\ESO@labelfactor\@tempdima% 241 | \@tempcntb=\@tempdima% 242 | \makebox(0,0)[c]{\ESO@colorbox{white}{\the\@tempcntb}}% 243 | \fi}% 244 | \ifx\ESO@gridunitname\@empty\def\@tempa{0}\else\def\@tempa{1}\fi% 245 | \ESO@div{\paperheight}{\ESO@gridDelta}% 246 | \multiput(\ESO@gridDelta,0)(0,\ESO@gridDeltaY){\@tempcnta}{% 247 | \@tempcntb=\@tempcnta\advance\@tempcntb-\@multicnt% 248 | \ifnum\@tempcntb>\@tempa\relax 249 | \multiply\@tempcntb by \ESO@gridDelta\relax% 250 | \@tempdima=\@tempcntb sp\@tempdima=\ESO@labelfactor\@tempdima% 251 | \@tempcntb=\@tempdima% 252 | \makebox(0,0)[c]{\ESO@colorbox{white}{\the\@tempcntb}}% 253 | \fi 254 | }% 255 | \ifx\ESO@gridunitname\@empty\else% 256 | \thicklines\fboxrule=\@wholewidth% 257 | \put(\ESO@gridDelta,\ESO@gridDeltaY){\makebox(0,0)[c]{% 258 | \ESO@fcolorbox{\ESO@gridcolor}{white}{% 259 | \textbf{\ESO@gridunitname}}}}% 260 | \fi 261 | \normalcolor% 262 | \endgroup 263 | } 264 | \ifESO@grid\g@addto@macro\ESO@HookIII{\ESO@gridpicture}\fi 265 | \endinput 266 | %% 267 | %% End of file `eso-pic.sty'. 
268 | 269 | -------------------------------------------------------------------------------- /paper/ieee.bst: -------------------------------------------------------------------------------- 1 | % --------------------------------------------------------------- 2 | % 3 | % ieee.bst,v 1.0 2002/04/16 4 | % 5 | % by Glenn Paulley (paulley@acm.org) 6 | % 7 | % Modified from latex8.bst 1995/09/15 15:13:49 ienne Exp $ 8 | % 9 | % by Paolo.Ienne@di.epfl.ch 10 | % 11 | % 12 | % --------------------------------------------------------------- 13 | % 14 | % no guarantee is given that the format corresponds perfectly to 15 | % IEEE 8.5" x 11" Proceedings, but most features should be ok. 16 | % 17 | % --------------------------------------------------------------- 18 | % 19 | % `ieee' from BibTeX standard bibliography style `abbrv' 20 | % version 0.99a for BibTeX versions 0.99a or later, LaTeX version 2.09. 21 | % Copyright (C) 1985, all rights reserved. 22 | % Copying of this file is authorized only if either 23 | % (1) you make absolutely no changes to your copy, including name, or 24 | % (2) if you do make changes, you name it something other than 25 | % btxbst.doc, plain.bst, unsrt.bst, alpha.bst, and abbrv.bst. 26 | % This restriction helps ensure that all standard styles are identical. 27 | % The file btxbst.doc has the documentation for this style. 28 | 29 | ENTRY 30 | { address 31 | author 32 | booktitle 33 | chapter 34 | edition 35 | editor 36 | howpublished 37 | institution 38 | journal 39 | key 40 | month 41 | note 42 | number 43 | organization 44 | pages 45 | publisher 46 | school 47 | series 48 | title 49 | type 50 | volume 51 | year 52 | } 53 | {} 54 | { label } 55 | 56 | INTEGERS { output.state before.all mid.sentence after.sentence after.block } 57 | 58 | FUNCTION {init.state.consts} 59 | { #0 'before.all := 60 | #1 'mid.sentence := 61 | #2 'after.sentence := 62 | #3 'after.block := 63 | } 64 | 65 | STRINGS { s t } 66 | 67 | FUNCTION {output.nonnull} 68 | { 's := 69 | output.state mid.sentence = 70 | { ", " * write$ } 71 | { output.state after.block = 72 | { add.period$ write$ 73 | newline$ 74 | "\newblock " write$ 75 | } 76 | { output.state before.all = 77 | 'write$ 78 | { add.period$ " " * write$ } 79 | if$ 80 | } 81 | if$ 82 | mid.sentence 'output.state := 83 | } 84 | if$ 85 | s 86 | } 87 | 88 | FUNCTION {output} 89 | { duplicate$ empty$ 90 | 'pop$ 91 | 'output.nonnull 92 | if$ 93 | } 94 | 95 | FUNCTION {output.check} 96 | { 't := 97 | duplicate$ empty$ 98 | { pop$ "empty " t * " in " * cite$ * warning$ } 99 | 'output.nonnull 100 | if$ 101 | } 102 | 103 | FUNCTION {output.bibitem} 104 | { newline$ 105 | "\bibitem{" write$ 106 | cite$ write$ 107 | "}" write$ 108 | newline$ 109 | "" 110 | before.all 'output.state := 111 | } 112 | 113 | FUNCTION {fin.entry} 114 | { add.period$ 115 | write$ 116 | newline$ 117 | } 118 | 119 | FUNCTION {new.block} 120 | { output.state before.all = 121 | 'skip$ 122 | { after.block 'output.state := } 123 | if$ 124 | } 125 | 126 | FUNCTION {new.sentence} 127 | { output.state after.block = 128 | 'skip$ 129 | { output.state before.all = 130 | 'skip$ 131 | { after.sentence 'output.state := } 132 | if$ 133 | } 134 | if$ 135 | } 136 | 137 | FUNCTION {not} 138 | { { #0 } 139 | { #1 } 140 | if$ 141 | } 142 | 143 | FUNCTION {and} 144 | { 'skip$ 145 | { pop$ #0 } 146 | if$ 147 | } 148 | 149 | FUNCTION {or} 150 | { { pop$ #1 } 151 | 'skip$ 152 | if$ 153 | } 154 | 155 | FUNCTION {new.block.checka} 156 | { empty$ 157 | 'skip$ 158 | 'new.block 159 | if$ 160 | } 161 | 162 | 
FUNCTION {new.block.checkb} 163 | { empty$ 164 | swap$ empty$ 165 | and 166 | 'skip$ 167 | 'new.block 168 | if$ 169 | } 170 | 171 | FUNCTION {new.sentence.checka} 172 | { empty$ 173 | 'skip$ 174 | 'new.sentence 175 | if$ 176 | } 177 | 178 | FUNCTION {new.sentence.checkb} 179 | { empty$ 180 | swap$ empty$ 181 | and 182 | 'skip$ 183 | 'new.sentence 184 | if$ 185 | } 186 | 187 | FUNCTION {field.or.null} 188 | { duplicate$ empty$ 189 | { pop$ "" } 190 | 'skip$ 191 | if$ 192 | } 193 | 194 | FUNCTION {emphasize} 195 | { duplicate$ empty$ 196 | { pop$ "" } 197 | { "{\em " swap$ * "}" * } 198 | if$ 199 | } 200 | 201 | INTEGERS { nameptr namesleft numnames } 202 | 203 | FUNCTION {format.names} 204 | { 's := 205 | #1 'nameptr := 206 | s num.names$ 'numnames := 207 | numnames 'namesleft := 208 | { namesleft #0 > } 209 | { s nameptr "{f.~}{vv~}{ll}{, jj}" format.name$ 't := 210 | nameptr #1 > 211 | { namesleft #1 > 212 | { ", " * t * } 213 | { numnames #2 > 214 | { "," * } 215 | 'skip$ 216 | if$ 217 | t "others" = 218 | { " et~al." * } 219 | { " and " * t * } 220 | if$ 221 | } 222 | if$ 223 | } 224 | 't 225 | if$ 226 | nameptr #1 + 'nameptr := 227 | 228 | namesleft #1 - 'namesleft := 229 | } 230 | while$ 231 | } 232 | 233 | FUNCTION {format.authors} 234 | { author empty$ 235 | { "" } 236 | { author format.names } 237 | if$ 238 | } 239 | 240 | FUNCTION {format.editors} 241 | { editor empty$ 242 | { "" } 243 | { editor format.names 244 | editor num.names$ #1 > 245 | { ", editors" * } 246 | { ", editor" * } 247 | if$ 248 | } 249 | if$ 250 | } 251 | 252 | FUNCTION {format.title} 253 | { title empty$ 254 | { "" } 255 | { title "t" change.case$ } 256 | if$ 257 | } 258 | 259 | FUNCTION {n.dashify} 260 | { 't := 261 | "" 262 | { t empty$ not } 263 | { t #1 #1 substring$ "-" = 264 | { t #1 #2 substring$ "--" = not 265 | { "--" * 266 | t #2 global.max$ substring$ 't := 267 | } 268 | { { t #1 #1 substring$ "-" = } 269 | { "-" * 270 | t #2 global.max$ substring$ 't := 271 | } 272 | while$ 273 | } 274 | if$ 275 | } 276 | { t #1 #1 substring$ * 277 | t #2 global.max$ substring$ 't := 278 | } 279 | if$ 280 | } 281 | while$ 282 | } 283 | 284 | FUNCTION {format.date} 285 | { year empty$ 286 | { month empty$ 287 | { "" } 288 | { "there's a month but no year in " cite$ * warning$ 289 | month 290 | } 291 | if$ 292 | } 293 | { month empty$ 294 | 'year 295 | { month " " * year * } 296 | if$ 297 | } 298 | if$ 299 | } 300 | 301 | FUNCTION {format.btitle} 302 | { title emphasize 303 | } 304 | 305 | FUNCTION {tie.or.space.connect} 306 | { duplicate$ text.length$ #3 < 307 | { "~" } 308 | { " " } 309 | if$ 310 | swap$ * * 311 | } 312 | 313 | FUNCTION {either.or.check} 314 | { empty$ 315 | 'pop$ 316 | { "can't use both " swap$ * " fields in " * cite$ * warning$ } 317 | if$ 318 | } 319 | 320 | FUNCTION {format.bvolume} 321 | { volume empty$ 322 | { "" } 323 | { "volume" volume tie.or.space.connect 324 | series empty$ 325 | 'skip$ 326 | { " of " * series emphasize * } 327 | if$ 328 | "volume and number" number either.or.check 329 | } 330 | if$ 331 | } 332 | 333 | FUNCTION {format.number.series} 334 | { volume empty$ 335 | { number empty$ 336 | { series field.or.null } 337 | { output.state mid.sentence = 338 | { "number" } 339 | { "Number" } 340 | if$ 341 | number tie.or.space.connect 342 | series empty$ 343 | { "there's a number but no series in " cite$ * warning$ } 344 | { " in " * series * } 345 | if$ 346 | } 347 | if$ 348 | } 349 | { "" } 350 | if$ 351 | } 352 | 353 | FUNCTION {format.edition} 354 | { edition empty$ 355 | { "" } 
356 | { output.state mid.sentence = 357 | { edition "l" change.case$ " edition" * } 358 | { edition "t" change.case$ " edition" * } 359 | if$ 360 | } 361 | if$ 362 | } 363 | 364 | INTEGERS { multiresult } 365 | 366 | FUNCTION {multi.page.check} 367 | { 't := 368 | #0 'multiresult := 369 | { multiresult not 370 | t empty$ not 371 | and 372 | } 373 | { t #1 #1 substring$ 374 | duplicate$ "-" = 375 | swap$ duplicate$ "," = 376 | swap$ "+" = 377 | or or 378 | { #1 'multiresult := } 379 | { t #2 global.max$ substring$ 't := } 380 | if$ 381 | } 382 | while$ 383 | multiresult 384 | } 385 | 386 | FUNCTION {format.pages} 387 | { pages empty$ 388 | { "" } 389 | { pages multi.page.check 390 | { "pages" pages n.dashify tie.or.space.connect } 391 | { "page" pages tie.or.space.connect } 392 | if$ 393 | } 394 | if$ 395 | } 396 | 397 | FUNCTION {format.vol.num.pages} 398 | { volume field.or.null 399 | number empty$ 400 | 'skip$ 401 | { "(" number * ")" * * 402 | volume empty$ 403 | { "there's a number but no volume in " cite$ * warning$ } 404 | 'skip$ 405 | if$ 406 | } 407 | if$ 408 | pages empty$ 409 | 'skip$ 410 | { duplicate$ empty$ 411 | { pop$ format.pages } 412 | { ":" * pages n.dashify * } 413 | if$ 414 | } 415 | if$ 416 | } 417 | 418 | FUNCTION {format.chapter.pages} 419 | { chapter empty$ 420 | 'format.pages 421 | { type empty$ 422 | { "chapter" } 423 | { type "l" change.case$ } 424 | if$ 425 | chapter tie.or.space.connect 426 | pages empty$ 427 | 'skip$ 428 | { ", " * format.pages * } 429 | if$ 430 | } 431 | if$ 432 | } 433 | 434 | FUNCTION {format.in.ed.booktitle} 435 | { booktitle empty$ 436 | { "" } 437 | { editor empty$ 438 | { "In " booktitle emphasize * } 439 | { "In " format.editors * ", " * booktitle emphasize * } 440 | if$ 441 | } 442 | if$ 443 | } 444 | 445 | FUNCTION {empty.misc.check} 446 | 447 | { author empty$ title empty$ howpublished empty$ 448 | month empty$ year empty$ note empty$ 449 | and and and and and 450 | key empty$ not and 451 | { "all relevant fields are empty in " cite$ * warning$ } 452 | 'skip$ 453 | if$ 454 | } 455 | 456 | FUNCTION {format.thesis.type} 457 | { type empty$ 458 | 'skip$ 459 | { pop$ 460 | type "t" change.case$ 461 | } 462 | if$ 463 | } 464 | 465 | FUNCTION {format.tr.number} 466 | { type empty$ 467 | { "Technical Report" } 468 | 'type 469 | if$ 470 | number empty$ 471 | { "t" change.case$ } 472 | { number tie.or.space.connect } 473 | if$ 474 | } 475 | 476 | FUNCTION {format.article.crossref} 477 | { key empty$ 478 | { journal empty$ 479 | { "need key or journal for " cite$ * " to crossref " * crossref * 480 | warning$ 481 | "" 482 | } 483 | { "In {\em " journal * "\/}" * } 484 | if$ 485 | } 486 | { "In " key * } 487 | if$ 488 | " \cite{" * crossref * "}" * 489 | } 490 | 491 | FUNCTION {format.crossref.editor} 492 | { editor #1 "{vv~}{ll}" format.name$ 493 | editor num.names$ duplicate$ 494 | #2 > 495 | { pop$ " et~al." * } 496 | { #2 < 497 | 'skip$ 498 | { editor #2 "{ff }{vv }{ll}{ jj}" format.name$ "others" = 499 | { " et~al." 
* } 500 | { " and " * editor #2 "{vv~}{ll}" format.name$ * } 501 | if$ 502 | } 503 | if$ 504 | } 505 | if$ 506 | } 507 | 508 | FUNCTION {format.book.crossref} 509 | { volume empty$ 510 | { "empty volume in " cite$ * "'s crossref of " * crossref * warning$ 511 | "In " 512 | } 513 | { "Volume" volume tie.or.space.connect 514 | " of " * 515 | } 516 | if$ 517 | editor empty$ 518 | editor field.or.null author field.or.null = 519 | or 520 | { key empty$ 521 | { series empty$ 522 | { "need editor, key, or series for " cite$ * " to crossref " * 523 | crossref * warning$ 524 | "" * 525 | } 526 | { "{\em " * series * "\/}" * } 527 | if$ 528 | } 529 | { key * } 530 | if$ 531 | } 532 | { format.crossref.editor * } 533 | if$ 534 | " \cite{" * crossref * "}" * 535 | } 536 | 537 | FUNCTION {format.incoll.inproc.crossref} 538 | { editor empty$ 539 | editor field.or.null author field.or.null = 540 | or 541 | { key empty$ 542 | { booktitle empty$ 543 | { "need editor, key, or booktitle for " cite$ * " to crossref " * 544 | crossref * warning$ 545 | "" 546 | } 547 | { "In {\em " booktitle * "\/}" * } 548 | if$ 549 | } 550 | { "In " key * } 551 | if$ 552 | } 553 | { "In " format.crossref.editor * } 554 | if$ 555 | " \cite{" * crossref * "}" * 556 | } 557 | 558 | FUNCTION {article} 559 | { output.bibitem 560 | format.authors "author" output.check 561 | new.block 562 | format.title "title" output.check 563 | new.block 564 | crossref missing$ 565 | { journal emphasize "journal" output.check 566 | format.vol.num.pages output 567 | format.date "year" output.check 568 | } 569 | { format.article.crossref output.nonnull 570 | format.pages output 571 | } 572 | if$ 573 | new.block 574 | note output 575 | fin.entry 576 | } 577 | 578 | FUNCTION {book} 579 | { output.bibitem 580 | author empty$ 581 | { format.editors "author and editor" output.check } 582 | { format.authors output.nonnull 583 | crossref missing$ 584 | { "author and editor" editor either.or.check } 585 | 'skip$ 586 | if$ 587 | } 588 | if$ 589 | new.block 590 | format.btitle "title" output.check 591 | crossref missing$ 592 | { format.bvolume output 593 | new.block 594 | format.number.series output 595 | new.sentence 596 | publisher "publisher" output.check 597 | address output 598 | } 599 | { new.block 600 | format.book.crossref output.nonnull 601 | } 602 | if$ 603 | format.edition output 604 | format.date "year" output.check 605 | new.block 606 | note output 607 | fin.entry 608 | } 609 | 610 | FUNCTION {booklet} 611 | { output.bibitem 612 | format.authors output 613 | new.block 614 | format.title "title" output.check 615 | howpublished address new.block.checkb 616 | howpublished output 617 | address output 618 | format.date output 619 | new.block 620 | note output 621 | fin.entry 622 | } 623 | 624 | FUNCTION {inbook} 625 | { output.bibitem 626 | author empty$ 627 | { format.editors "author and editor" output.check } 628 | { format.authors output.nonnull 629 | 630 | crossref missing$ 631 | { "author and editor" editor either.or.check } 632 | 'skip$ 633 | if$ 634 | } 635 | if$ 636 | new.block 637 | format.btitle "title" output.check 638 | crossref missing$ 639 | { format.bvolume output 640 | format.chapter.pages "chapter and pages" output.check 641 | new.block 642 | format.number.series output 643 | new.sentence 644 | publisher "publisher" output.check 645 | address output 646 | } 647 | { format.chapter.pages "chapter and pages" output.check 648 | new.block 649 | format.book.crossref output.nonnull 650 | } 651 | if$ 652 | format.edition output 653 | 
format.date "year" output.check 654 | new.block 655 | note output 656 | fin.entry 657 | } 658 | 659 | FUNCTION {incollection} 660 | { output.bibitem 661 | format.authors "author" output.check 662 | new.block 663 | format.title "title" output.check 664 | new.block 665 | crossref missing$ 666 | { format.in.ed.booktitle "booktitle" output.check 667 | format.bvolume output 668 | format.number.series output 669 | format.chapter.pages output 670 | new.sentence 671 | publisher "publisher" output.check 672 | address output 673 | format.edition output 674 | format.date "year" output.check 675 | } 676 | { format.incoll.inproc.crossref output.nonnull 677 | format.chapter.pages output 678 | } 679 | if$ 680 | new.block 681 | note output 682 | fin.entry 683 | } 684 | 685 | FUNCTION {inproceedings} 686 | { output.bibitem 687 | format.authors "author" output.check 688 | new.block 689 | format.title "title" output.check 690 | new.block 691 | crossref missing$ 692 | { format.in.ed.booktitle "booktitle" output.check 693 | format.bvolume output 694 | format.number.series output 695 | format.pages output 696 | address empty$ 697 | { organization publisher new.sentence.checkb 698 | organization output 699 | publisher output 700 | format.date "year" output.check 701 | } 702 | { address output.nonnull 703 | format.date "year" output.check 704 | new.sentence 705 | organization output 706 | publisher output 707 | } 708 | if$ 709 | } 710 | { format.incoll.inproc.crossref output.nonnull 711 | format.pages output 712 | } 713 | if$ 714 | new.block 715 | note output 716 | fin.entry 717 | } 718 | 719 | FUNCTION {conference} { inproceedings } 720 | 721 | FUNCTION {manual} 722 | { output.bibitem 723 | author empty$ 724 | { organization empty$ 725 | 'skip$ 726 | { organization output.nonnull 727 | address output 728 | } 729 | if$ 730 | } 731 | { format.authors output.nonnull } 732 | if$ 733 | new.block 734 | format.btitle "title" output.check 735 | author empty$ 736 | { organization empty$ 737 | { address new.block.checka 738 | address output 739 | } 740 | 'skip$ 741 | if$ 742 | } 743 | { organization address new.block.checkb 744 | organization output 745 | address output 746 | } 747 | if$ 748 | format.edition output 749 | format.date output 750 | new.block 751 | note output 752 | fin.entry 753 | } 754 | 755 | FUNCTION {mastersthesis} 756 | { output.bibitem 757 | format.authors "author" output.check 758 | new.block 759 | format.title "title" output.check 760 | new.block 761 | "Master's thesis" format.thesis.type output.nonnull 762 | school "school" output.check 763 | address output 764 | format.date "year" output.check 765 | new.block 766 | note output 767 | fin.entry 768 | } 769 | 770 | FUNCTION {misc} 771 | { output.bibitem 772 | format.authors output 773 | title howpublished new.block.checkb 774 | format.title output 775 | howpublished new.block.checka 776 | howpublished output 777 | format.date output 778 | new.block 779 | note output 780 | fin.entry 781 | empty.misc.check 782 | } 783 | 784 | FUNCTION {phdthesis} 785 | { output.bibitem 786 | format.authors "author" output.check 787 | new.block 788 | format.btitle "title" output.check 789 | new.block 790 | "PhD thesis" format.thesis.type output.nonnull 791 | school "school" output.check 792 | address output 793 | format.date "year" output.check 794 | new.block 795 | note output 796 | fin.entry 797 | } 798 | 799 | FUNCTION {proceedings} 800 | { output.bibitem 801 | editor empty$ 802 | { organization output } 803 | { format.editors output.nonnull } 804 | 805 | if$ 806 | 
new.block 807 | format.btitle "title" output.check 808 | format.bvolume output 809 | format.number.series output 810 | address empty$ 811 | { editor empty$ 812 | { publisher new.sentence.checka } 813 | { organization publisher new.sentence.checkb 814 | organization output 815 | } 816 | if$ 817 | publisher output 818 | format.date "year" output.check 819 | } 820 | { address output.nonnull 821 | format.date "year" output.check 822 | new.sentence 823 | editor empty$ 824 | 'skip$ 825 | { organization output } 826 | if$ 827 | publisher output 828 | } 829 | if$ 830 | new.block 831 | note output 832 | fin.entry 833 | } 834 | 835 | FUNCTION {techreport} 836 | { output.bibitem 837 | format.authors "author" output.check 838 | new.block 839 | format.title "title" output.check 840 | new.block 841 | format.tr.number output.nonnull 842 | institution "institution" output.check 843 | address output 844 | format.date "year" output.check 845 | new.block 846 | note output 847 | fin.entry 848 | } 849 | 850 | FUNCTION {unpublished} 851 | { output.bibitem 852 | format.authors "author" output.check 853 | new.block 854 | format.title "title" output.check 855 | new.block 856 | note "note" output.check 857 | format.date output 858 | fin.entry 859 | } 860 | 861 | FUNCTION {default.type} { misc } 862 | 863 | MACRO {jan} {"Jan."} 864 | 865 | MACRO {feb} {"Feb."} 866 | 867 | MACRO {mar} {"Mar."} 868 | 869 | MACRO {apr} {"Apr."} 870 | 871 | MACRO {may} {"May"} 872 | 873 | MACRO {jun} {"June"} 874 | 875 | MACRO {jul} {"July"} 876 | 877 | MACRO {aug} {"Aug."} 878 | 879 | MACRO {sep} {"Sept."} 880 | 881 | MACRO {oct} {"Oct."} 882 | 883 | MACRO {nov} {"Nov."} 884 | 885 | MACRO {dec} {"Dec."} 886 | 887 | MACRO {acmcs} {"ACM Comput. Surv."} 888 | 889 | MACRO {acta} {"Acta Inf."} 890 | 891 | MACRO {cacm} {"Commun. ACM"} 892 | 893 | MACRO {ibmjrd} {"IBM J. Res. Dev."} 894 | 895 | MACRO {ibmsj} {"IBM Syst.~J."} 896 | 897 | MACRO {ieeese} {"IEEE Trans. Softw. Eng."} 898 | 899 | MACRO {ieeetc} {"IEEE Trans. Comput."} 900 | 901 | MACRO {ieeetcad} 902 | {"IEEE Trans. Comput.-Aided Design Integrated Circuits"} 903 | 904 | MACRO {ipl} {"Inf. Process. Lett."} 905 | 906 | MACRO {jacm} {"J.~ACM"} 907 | 908 | MACRO {jcss} {"J.~Comput. Syst. Sci."} 909 | 910 | MACRO {scp} {"Sci. Comput. Programming"} 911 | 912 | MACRO {sicomp} {"SIAM J. Comput."} 913 | 914 | MACRO {tocs} {"ACM Trans. Comput. Syst."} 915 | 916 | MACRO {tods} {"ACM Trans. Database Syst."} 917 | 918 | MACRO {tog} {"ACM Trans. Gr."} 919 | 920 | MACRO {toms} {"ACM Trans. Math. Softw."} 921 | 922 | MACRO {toois} {"ACM Trans. Office Inf. Syst."} 923 | 924 | MACRO {toplas} {"ACM Trans. Prog. Lang. Syst."} 925 | 926 | MACRO {tcs} {"Theoretical Comput. 
Sci."} 927 | 928 | READ 929 | 930 | FUNCTION {sortify} 931 | { purify$ 932 | "l" change.case$ 933 | } 934 | 935 | INTEGERS { len } 936 | 937 | FUNCTION {chop.word} 938 | { 's := 939 | 'len := 940 | s #1 len substring$ = 941 | { s len #1 + global.max$ substring$ } 942 | 's 943 | if$ 944 | } 945 | 946 | FUNCTION {sort.format.names} 947 | { 's := 948 | #1 'nameptr := 949 | "" 950 | s num.names$ 'numnames := 951 | numnames 'namesleft := 952 | { namesleft #0 > } 953 | { nameptr #1 > 954 | { " " * } 955 | 'skip$ 956 | if$ 957 | s nameptr "{vv{ } }{ll{ }}{ f{ }}{ jj{ }}" format.name$ 't := 958 | nameptr numnames = t "others" = and 959 | { "et al" * } 960 | { t sortify * } 961 | if$ 962 | nameptr #1 + 'nameptr := 963 | namesleft #1 - 'namesleft := 964 | } 965 | while$ 966 | } 967 | 968 | FUNCTION {sort.format.title} 969 | { 't := 970 | "A " #2 971 | "An " #3 972 | "The " #4 t chop.word 973 | chop.word 974 | chop.word 975 | sortify 976 | #1 global.max$ substring$ 977 | } 978 | 979 | FUNCTION {author.sort} 980 | { author empty$ 981 | { key empty$ 982 | { "to sort, need author or key in " cite$ * warning$ 983 | "" 984 | } 985 | { key sortify } 986 | if$ 987 | } 988 | { author sort.format.names } 989 | if$ 990 | } 991 | 992 | FUNCTION {author.editor.sort} 993 | { author empty$ 994 | { editor empty$ 995 | { key empty$ 996 | { "to sort, need author, editor, or key in " cite$ * warning$ 997 | "" 998 | } 999 | { key sortify } 1000 | if$ 1001 | } 1002 | { editor sort.format.names } 1003 | if$ 1004 | } 1005 | { author sort.format.names } 1006 | if$ 1007 | } 1008 | 1009 | FUNCTION {author.organization.sort} 1010 | { author empty$ 1011 | 1012 | { organization empty$ 1013 | { key empty$ 1014 | { "to sort, need author, organization, or key in " cite$ * warning$ 1015 | "" 1016 | } 1017 | { key sortify } 1018 | if$ 1019 | } 1020 | { "The " #4 organization chop.word sortify } 1021 | if$ 1022 | } 1023 | { author sort.format.names } 1024 | if$ 1025 | } 1026 | 1027 | FUNCTION {editor.organization.sort} 1028 | { editor empty$ 1029 | { organization empty$ 1030 | { key empty$ 1031 | { "to sort, need editor, organization, or key in " cite$ * warning$ 1032 | "" 1033 | } 1034 | { key sortify } 1035 | if$ 1036 | } 1037 | { "The " #4 organization chop.word sortify } 1038 | if$ 1039 | } 1040 | { editor sort.format.names } 1041 | if$ 1042 | } 1043 | 1044 | FUNCTION {presort} 1045 | { type$ "book" = 1046 | type$ "inbook" = 1047 | or 1048 | 'author.editor.sort 1049 | { type$ "proceedings" = 1050 | 'editor.organization.sort 1051 | { type$ "manual" = 1052 | 'author.organization.sort 1053 | 'author.sort 1054 | if$ 1055 | } 1056 | if$ 1057 | } 1058 | if$ 1059 | " " 1060 | * 1061 | year field.or.null sortify 1062 | * 1063 | " " 1064 | * 1065 | title field.or.null 1066 | sort.format.title 1067 | * 1068 | #1 entry.max$ substring$ 1069 | 'sort.key$ := 1070 | } 1071 | 1072 | ITERATE {presort} 1073 | 1074 | SORT 1075 | 1076 | STRINGS { longest.label } 1077 | 1078 | INTEGERS { number.label longest.label.width } 1079 | 1080 | FUNCTION {initialize.longest.label} 1081 | { "" 'longest.label := 1082 | #1 'number.label := 1083 | #0 'longest.label.width := 1084 | } 1085 | 1086 | FUNCTION {longest.label.pass} 1087 | { number.label int.to.str$ 'label := 1088 | number.label #1 + 'number.label := 1089 | label width$ longest.label.width > 1090 | { label 'longest.label := 1091 | label width$ 'longest.label.width := 1092 | } 1093 | 'skip$ 1094 | if$ 1095 | } 1096 | 1097 | EXECUTE {initialize.longest.label} 1098 | 1099 | ITERATE {longest.label.pass} 1100 
| 1101 | FUNCTION {begin.bib} 1102 | { preamble$ empty$ 1103 | 'skip$ 1104 | { preamble$ write$ newline$ } 1105 | if$ 1106 | "\begin{thebibliography}{" longest.label * "}" * 1107 | "\itemsep=-1pt" * % Compact the entries a little. 1108 | write$ newline$ 1109 | } 1110 | 1111 | EXECUTE {begin.bib} 1112 | 1113 | EXECUTE {init.state.consts} 1114 | 1115 | ITERATE {call.type$} 1116 | 1117 | FUNCTION {end.bib} 1118 | { newline$ 1119 | "\end{thebibliography}" write$ newline$ 1120 | } 1121 | 1122 | EXECUTE {end.bib} 1123 | 1124 | % end of file ieee.bst 1125 | % --------------------------------------------------------------- 1126 | 1127 | 1128 | 1129 | 1130 | -------------------------------------------------------------------------------- /paper/simplot_city_overfeat_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/craymichael/CNN_LCD/a0087f7ce809220b65b5b0bdac77f5f0de5bfed6/paper/simplot_city_overfeat_1.png -------------------------------------------------------------------------------- /paper/simplot_w_gt_city_overfeat_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/craymichael/CNN_LCD/a0087f7ce809220b65b5b0bdac77f5f0de5bfed6/paper/simplot_w_gt_city_overfeat_1.png -------------------------------------------------------------------------------- /paper/simplot_w_gt_college_overfeat_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/craymichael/CNN_LCD/a0087f7ce809220b65b5b0bdac77f5f0de5bfed6/paper/simplot_w_gt_college_overfeat_1.png -------------------------------------------------------------------------------- /paper/zjc_egpaper_for_review.tex: -------------------------------------------------------------------------------- 1 | \documentclass[10pt,twocolumn,letterpaper]{article} 2 | 3 | \usepackage{cvpr} 4 | \usepackage{times} 5 | \usepackage{epsfig} 6 | \usepackage{graphicx} 7 | \usepackage{amsmath} 8 | \usepackage{amssymb} 9 | 10 | % Include other packages here, before hyperref. 11 | \usepackage{float} 12 | \usepackage{multirow} 13 | \usepackage[export]{adjustbox} % https://tex.stackexchange.com/questions/57418/crop-an-inserted-image 14 | 15 | % If you comment hyperref and then uncomment it, you should delete 16 | % egpaper.aux before re-running latex. (Or just hit 'q' on the first latex 17 | % run, let it finish, and you should be clear). 18 | \usepackage[pagebackref=true,breaklinks=true,letterpaper=true,colorlinks,bookmarks=false]{hyperref} 19 | 20 | %\cvprfinalcopy % *** Uncomment this line for the final submission 21 | 22 | \def\cvprPaperID{1337} % *** Enter the CVPR Paper ID here 23 | \def\httilde{\mbox{\tt\raisebox{-.5ex}{\symbol{126}}}} 24 | 25 | % Pages are numbered in submission mode, and unnumbered in camera-ready 26 | \ifcvprfinal\pagestyle{numbered}\fi 27 | 28 | % My stuff 29 | \hyphenation{vSLAM} 30 | 31 | \begin{document} 32 | 33 | %%%%%%%%% TITLE 34 | \title{Convolutional Neural Networks for Loop-Closure Detection in vSLAM Systems} 35 | 36 | \author{Zachariah Carmichael\\ 37 | Rochester Institute of Technology\\ 38 | Rochester, NY, USA\\ 39 | {\tt\small zjc2920@rit.edu} 40 | % For a paper whose authors are all at the same institution, 41 | % omit the following lines up until the closing ``}''. 42 | % Additional authors and addresses can be added with ``\and'', 43 | % just like the second author. 
44 | % To save space, use either the email address or home page, not both 45 | } 46 | 47 | \maketitle 48 | %\thispagestyle{empty} 49 | 50 | %%%%%%%%% ABSTRACT 51 | \begin{abstract} 52 | Loop-closure detection is one of the most critical problems that visual simultaneous localization and mapping (vSLAM) systems must address. In conventional systems, hand-crafted features are used 53 | with a similarity metric to determine whether a loop-closure has occurred. Convolutional neural networks (CNNs) have been shown to extract more robust features, and various convolutional neural network 54 | architectures have been used successfully in place of hand-crafted features. In this work, features extracted from the \textup{OverFeat} CNN architecture were utilized to predict loop closures in several 55 | \textup{Oxford Robotics} vSLAM datasets. The results show that the obtained similarity matrices are indicative of similar image pairs and could be viable for use in a vSLAM system. 56 | \end{abstract} 57 | 58 | %%%%%%%%% BODY TEXT 59 | \section{Introduction} 60 | vSLAM systems comprise two primary components: visual odometry and global map optimization \cite{taketomi_visual_2017}. Loop-closure detection algorithms are used to determine when to perform 61 | the latter component. The objective of such methods is to detect a loop-closure, \textit{i.e.}, a spatial position that has already been visited by an agent. Using this location as a reference, the stored 62 | data structure, such as a graph, can be aligned with previously incorporated frames to more reliably and consistently determine agent location, as well as lessen map complexity. Many variants of SLAM 63 | systems have been introduced with a range of sensors and algorithms. LIDAR-based systems dominated the literature for years and have been shown to be effective 64 | \cite{levinson_robust_2010,levinson_map-based_2007}, but are inherently expensive. With motivation to reduce costs and implementation complexity, vSLAM emerged and focused on the practical use of 65 | monocular, stereo, and RGB-D cameras. 66 | 67 | \begin{figure} 68 | \centering 69 | % \includegraphics[width=0.9\linewidth]{new_college_comparison.eps} 70 | \adjincludegraphics[trim={2cm 2.2cm 1.5cm 2.4cm},clip,width=0.86\linewidth]{simplot_w_gt_college_inception_v1.png} 71 | \caption{Loop-closure similarity matrix compared to ground truth for the \textit{New College} dataset.} 72 | \label{fig:new_college_sim} 73 | \end{figure} 74 | 75 | In vSLAM, there are three primary classes of methods: feature-based, direct, and RGB-D \cite{taketomi_visual_2017}. Feature-based methods use explicitly extracted features from visual input data 76 | as input to the vSLAM system. ORB-SLAM \cite{mur-artal_orb-slam:_2015} is described as one of the most complete monocular vSLAM systems \cite{taketomi_visual_2017} which uses these hand-crafted features. 77 | In direct methods, the entire input space is used without explicit feature extraction. LSD-SLAM \cite{engel_lsd-slam:_2014} directly operates on images by recreating a synthetic scene with estimated 78 | depth information based on high-intensity gradients. This work focuses on classifying loop-closures for RGB data only and approaches the problem differently than any of the aforementioned methods. 79 | 80 | Rather than relying on traditional hand-crafted features, this work extracts feature vectors from a pre-trained CNN.
Feature vectors are pre-processed before computing pairwise image similarity 81 | that is thresholded to indicate whether a loop-closure has occurred. The rest of this paper is organized as follows. After the introduction, Section \ref{sec:background} provides background information. 82 | Section \ref{sec:method} presents the proposed model. Section \ref{sec:results} introduces the dataset, mode of evaluation, and results obtained on the \textit{New College} and \textit{City Centre} 83 | datasets. Section \ref{sec:conclusion} contains concluding remarks. 84 | 85 | %------------------------------------------------------------------------- 86 | \section{Background} 87 | \label{sec:background} 88 | 89 | \begin{figure*} 90 | \centering 91 | \includegraphics[width=0.72\linewidth]{DEEPSLAM.eps} 92 | \caption{The \textit{OverFeat accurate} CNN architecture \cite{zhang_loop_2017}. Feature vectors are extracted from the first fully-connected (FC) layer.} 93 | \label{fig:DEEPSLAM} 94 | \end{figure*} 95 | 96 | Typically, bag-of-visual-words (BOVW) techniques are used to determine the features of an image. BOVW methods combine many feature descriptors into a dictionary by clustering. Various approaches exist and 97 | are widely used for computer vision applications, such as those that use SIFT \cite{lowe_object_1999}, SURF \cite{bay_surf:_2006}, ORB \cite{rublee_orb:_2011}, HOG \cite{dalal_histograms_2005}, 98 | \textit{etc.} These hand-crafted features comprise descriptors extracted using empirical metrics determined by domain experts. 99 | 100 | Deep learning is considered a direct method of solving loop-closure detection, as features are learned directly from images by optimizing a cost function and updating model parameters by gradient descent. The 101 | resulting anatomy of such networks generally comprises a set of generalized feature-extracting layers with feedforward connections to a less generalized classification layer. In recent works 102 | \cite{zhang_loop_2017,gupta_cognitive_2017,gao_unsupervised_2017,hou_convolutional_2015}, CNNs have been utilized successfully for loop-closure detection. It has been shown that CNN features are more 103 | robust to viewpoint, illumination, and scale than most hand-crafted feature methods \cite{zhang_loop_2017}. This work attempts to recreate and extend the work performed by Zhang \etal \cite{zhang_loop_2017}. 104 | %using the \textit{OverFeat accurate} architecture \cite{sermanet_overfeat:_2013}, shown in Figure \ref{fig:DEEPSLAM} as well as additional feature processing. 105 | 106 | %------------------------------------------------------------------------- 107 | \section{Proposed Method} 108 | \label{sec:method} 109 | 110 | The \textit{OverFeat} CNN architecture \cite{sermanet_overfeat:_2013} was introduced in two flavors: a \textit{fast} and an \textit{accurate} model. The latter was considered in this work as it contains 111 | higher connectivity and more layers, thereby learning richer features. It was proposed by Sermanet \etal as an entry to ILSVRC2013 as well as a robust feature extractor. The \textit{accurate} 112 | model contains 25 layers total, including all convolutions, activations, padding, max pooling, and fully-connected layers, and an open-source implementation is provided by the authors\footnote{See 113 | \href{https://github.com/sermanet/OverFeat}{https://github.com/sermanet/OverFeat} for pre-trained \textit{OverFeat} weights, Torch implementation, and Python2 API.}.
The feature-extracting capabilities 114 | of the network are used in this work, specifically the vector $\mathbf{v}_{22}\in \mathbb{R}^{4096}$ from the first fully-connected layer, layer 22, as shown in Figure \ref{fig:DEEPSLAM}. 115 | 116 | The extracted vectors are collected into a set $\mathbf{V} = 117 | \{\mathbf{v}_{22}^{(1)},\mathbf{v}_{22}^{(2)},...,\mathbf{v}_{22}^{(n)}\}$ 118 | for the $n$ collected images, and each vector is normalized by dividing it by its $L_2$ norm. Principal components analysis (PCA) is then applied to $\mathbf{V}$, where $\mathbf{V}$ is viewed as a matrix 119 | of shape $n \times 4096$. Whitening, or normalization of each principal component to unit variance, is also applied in this phase. 120 | 121 | With the pre-processed CNN features, a similarity matrix is constructed between all pairwise combinations of images. An example is shown in Figure \ref{fig:new_college_sim}. The Euclidean distance 122 | between images $i$ and $j$ is computed using (\ref{eq:distance}) 123 | 124 | \begin{equation}\label{eq:distance} 125 | \mathbf{D}_{i,j} = \left\Vert \frac{\mathbf{V}^{(i)}}{\Vert\mathbf{V}^{(i)}\Vert_2} - \frac{\mathbf{V}^{(j)}}{\Vert\mathbf{V}^{(j)}\Vert_2} \right\Vert_2 126 | \end{equation} 127 | 128 | \noindent where $\Vert\cdot\Vert_2$ is the $L_2$ norm of a vector. Similarity between $i$ and $j$ is then defined in (\ref{eq:similarity}) as 129 | 130 | \begin{equation}\label{eq:similarity} 131 | \mathbf{S}_{i,j} = 1 - \mathbf{D}_{i,j}/\max(\mathbf{D}) 132 | \end{equation} 133 | 134 | \noindent where $\mathbf{D}$ is the matrix of all pairwise distances between images and $\max(\mathbf{D})$ is its largest entry. The resulting similarity scores are therefore in the range $[0,1]$ and indicate the likelihood that a pair of images constitutes a 135 | loop-closure. 136 | 137 | %------------------------------------------------------------------------- 138 | \section{Results} 139 | \label{sec:results} 140 | 141 | The proposed method was evaluated on the \textit{New College} and \textit{City Centre} datasets\footnote{Data available at 142 | \href{http://www.robots.ox.ac.uk/~mobile/IJRR_2008_Dataset/}{www.robots.ox.ac.uk/~mobile/IJRR\_2008\_Dataset/}.}, which provide ground truth for loop-closures. Table \ref{tab:datasets} contains data about 143 | both datasets. 144 | 145 | \begin{table}[H] 146 | \caption{Dataset characteristics.} 147 | \label{tab:datasets} 148 | \centering 149 | \begin{tabular}{c||c|c|c} 150 | Dataset & \# Images & Image Size & \# Loop-Closures\\ 151 | \hline 152 | \textit{City Centre} & 2,474 & 640$\times$480 & 26,976\\ 153 | \textit{New College} & 2,146 & 640$\times$480 & 14,832\\ 154 | \end{tabular} 155 | \end{table} 156 | 157 | \noindent Loop-closures specified by each dataset are included as a binary mask where a `1' indicates a loop-closure. The images form a sequence captured around the Oxford campus by a robot, with a frame 158 | added every 1.5\,m. 159 | 160 | To evaluate performance, precision-recall curves were created by sliding a threshold from 0 to 1 to binarize the similarity matrix. At each threshold, precision and recall were computed and plotted. 161 | During experiments, CNN feature vectors were reduced to 500 dimensions in the PCA phase. The similarity matrix compared with ground truth is shown in Figure \ref{fig:new_college_sim} for the \textit{New 162 | College} dataset and Figure \ref{fig:city_centre_sim} in the Appendix for the \textit{City Centre} dataset.
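
A minimal NumPy/scikit-learn sketch of the preprocessing and similarity pipeline described above may help make it concrete. It assumes the FC-layer activations have already been extracted into an n-by-4096 array named features (one row per image); the variable names, the PCA settings, and the exact ordering of the normalization steps are illustrative assumptions, not the repository's own implementation.

import numpy as np
from sklearn.decomposition import PCA


def similarity_matrix(features, n_components=500):
    """Pre-process CNN feature vectors and build the pairwise similarity matrix.

    features: (n, 4096) array of FC-layer activations, one row per image.
    """
    # L2-normalize each feature vector.
    v = features / np.linalg.norm(features, axis=1, keepdims=True)
    # PCA with whitening: project onto the leading principal components and
    # scale each component to unit variance.
    v = PCA(n_components=n_components, whiten=True).fit_transform(v)
    # Re-normalize, then compute pairwise Euclidean distances between the
    # normalized vectors (the distance equation above).
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    sq = (v ** 2).sum(axis=1)
    d = np.sqrt(np.clip(sq[:, None] + sq[None, :] - 2.0 * v @ v.T, 0.0, None))
    # Map distances to similarities in [0, 1] (the similarity equation above).
    return 1.0 - d / d.max()

The vectorized distance computation is only a convenience; an equivalent result could be obtained with scipy.spatial.distance.cdist on the normalized, whitened vectors.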
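
The precision-recall evaluation described in the Results section can likewise be sketched as follows, where sim is the similarity matrix and gt is the binary ground-truth loop-closure mask of the same shape. This is one plausible reading of the procedure, with illustrative names and an assumed number of threshold steps, rather than the exact evaluation code.

import numpy as np


def precision_recall_sweep(sim, gt, num_thresholds=101):
    """Slide a threshold over [0, 1], binarize the similarity matrix, and
    record precision and recall against the ground-truth mask."""
    scores = sim.ravel()
    labels = gt.ravel().astype(bool)
    precisions, recalls = [], []
    for t in np.linspace(0.0, 1.0, num_thresholds):
        pred = scores >= t
        tp = np.count_nonzero(pred & labels)  # true positives at this threshold
        precisions.append(tp / max(np.count_nonzero(pred), 1))
        recalls.append(tp / max(np.count_nonzero(labels), 1))
    return np.array(precisions), np.array(recalls)

Average precision, as reported in the results tables, could then be computed directly from the same scores and labels, for example with sklearn.metrics.average_precision_score(labels, scores).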
163 | 164 | \begin{table}[H] 165 | \caption{Dataset evaluation results.} 166 | \label{tab:results} 167 | \centering 168 | \begin{tabular}{c||c} 169 | Dataset & Average Precision\\ 170 | \hline 171 | \textit{City Centre} & 18.0\%\\ 172 | \textit{New College} & 17.1\%\\ 173 | \end{tabular} 174 | \end{table} 175 | 176 | % TESTING IDEAS: 177 | % -multiple pre-trained CNN architectures 178 | % -results at different layers of network 179 | % -overfeat 0 vs. 1 (small/large) networks 180 | % -try with/without PCA, whitening 181 | % -try different PCA dimensionality reductions 182 | % -record best thresholds 183 | 184 | %------------------------------------------------------------------------- 185 | \section{Conclusion} 186 | \label{sec:conclusion} 187 | 188 | This work evaluates a method of detecting loop-closures in vSLAM systems with purely RGB data. The pre-processed \textit{OverFeat} features yielded similarity matrices that were indicative of the 189 | ground truth of the \textit{New College} and \textit{City Centre} datasets. The proposed system is capable of detecting loop-closures without the use of hand-crafted features. 190 | 191 | %------------------------------------------------------------------------- 192 | % \section{Expected Deliverables} 193 | % In this work, a convolutional neural network (CNN) will be used to predict loop-closures in visual Simultaneous Localization and Mapping (vSLAM) datasets. In particular, the publicly-available 194 | % feature-extracting architecture, \textit{OverFeat}, proposed for such application by Zhang \etal will be reconstructed for use in loop-closure detection \cite{zhang_loop_2017}. On a high level, the 195 | % algorithm operates by extracting features of down-sampled images, performing PCA and whitening, and then computing pairwise similarity between images to determine a loop closure event. The model, shown 196 | % in Figure \ref{fig:DEEPSLAM}, will be evaluated on the commonly used \textit{New Tsukuba}\footnote{Data available at 197 | % \hyperlink{http://www.cvlab.cs.tsukuba.ac.jp/dataset/tsukubastereo.php}{cvlab.cs.tsukuba.ac.jp}.} and \textit{Oxford Robotcar}\footnote{Data available at 198 | % \hyperlink{http://www.robots.ox.ac.uk/~mobile/IJRR_2008_Dataset/}{robots.ox.ac.uk}.} datasets. Results of evaluation will be quantified using the precision-recall metric and compared with feature-based 199 | % methods as a candidate robust replacement. If time permits, the architecture will be modified to attempt to improve performance based on additional approaches in the literature. All code used in 200 | % experiments will be made available, as well as experimental results with the applicable setup. Lastly, a single-slide poster presentation and finalized paper will be created to showcase the completed work. 201 | 202 | % %------------------------------------------------------------------------- 203 | % \section{Reference Descriptions} 204 | % The following list describes each of the references intended for use in the final paper, as well as guidance for the CNN architecture. 205 | 206 | % \begin{itemize} 207 | % \item In \cite{taketomi_visual_2017}, a summary of the vSLAM framework is given and a survey of feature-based, direct, and RGB-D vSLAM methods is presented. 208 | % \item \cite{gao_unsupervised_2017} proposes a stacked denoising autoencoder (SDA) for loop-closure detection in vSLAM.
209 | % \item In \cite{naseer_robust_2018}, a CNN-based method for loop-closure detection is extended from previous work \cite{naseer_robust_2014,naseer_robust_2015} and obtains state-of-the-art results on 210 | % various datasets. 211 | % \item Gupta \etal use end-to-end CNNs trained jointly for both mapping and planning, a fully deep learning framework \cite{gupta_cognitive_2017}. 212 | % \item In \cite{zhang_loop_2017}, a CNN is used for loop-closure detection and is evaluated to show it is a feasible alternative to FAB-MAP \cite{cummins_appearance-only_2011}. 213 | % \item Pascoe \etal propose a robust vSLAM system that competes with state-of-the-art methods on common benchmarks, and outperforms them in robustness \cite{pascoe_nid-slam:_2017}. 214 | % \item In \cite{hou_convolutional_2015}, another CNN architecture is utilized as a loop-closure detection algorithm and is shown to be a feasible alternative to hand-crafted features. 215 | % \end{itemize} 216 | 217 | % Some papers will be used to describe related work, while others directly relate to the proposed work of this project, \ie \cite{zhang_loop_2017}. 218 | 219 | %------------------------------------------------------------------------- 220 | 221 | {\small 222 | \bibliographystyle{ieee} 223 | \bibliography{egbib} 224 | } 225 | 226 | %------------------------------------------------------------------------- 227 | \section{Appendix} 228 | 229 | The following illustrates additional visualizations of the results obtained on the \textit{City Centre} and \textit{New College} datasets. 230 | 231 | \begin{figure}[H] 232 | \centering 233 | % \includegraphics[width=0.9\linewidth]{city_centre_comparison.eps} 234 | \adjincludegraphics[trim={2cm 2.2cm 1.5cm 2.4cm},clip,width=0.86\linewidth]{simplot_w_gt_city_inception_v1.png} 235 | \caption{Loop-closure similarity matrix compared to ground truth for the \textit{City Centre} dataset.} 236 | \label{fig:city_centre_sim} 237 | \end{figure} 238 | 239 | \noindent In a similarity matrix, ``cooler'' colors represent image pairs that are less similar and ``warmer'' colors represent image pairs that are more similar. Figure \ref{fig:city_centre_sim_only} 240 | displays the same similarity matrix shown in Figure \ref{fig:city_centre_sim} but with a color bar indicating similarity. 241 | 242 | \begin{figure}[H] 243 | \centering 244 | \adjincludegraphics[trim={1.5cm 0 1.5cm 1.25cm},clip,width=.95\linewidth]{simplot_city_overfeat_1.png} 245 | \caption{Loop-closure similarity matrix for the \textit{City Centre} dataset.} 246 | \label{fig:city_centre_sim_only} 247 | \end{figure} 248 | 249 | Additional models were utilized in place of the \textit{OverFeat} architecture. Models pre-trained on ImageNet were taken from the ``slim'' API in the TensorFlow Models repository. For each model, the 250 | final fully-connected layer was used as the feature vector. Additionally, median filtering was applied to the results obtained by each model in a post-processing step (a brief sketch of this step is given below). 251 | This step was included as salt-and-pepper noise was generally present in the similarity matrices. Kernel sizes of $17\times17$ and $11\times11$ were found to yield the best results on the \textit{City Centre} and \textit{New College} datasets, 252 | respectively. Square filter sizes from 1 to 59 pixels, as well as no filtering, were swept for each dataset. Tables \ref{tab:aux_results_city} and \ref{tab:aux_results_college} contain the results obtained 253 | by each model.
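
The median-filtering post-processing step can be sketched with SciPy as follows. Here sim is a similarity matrix, the kernel sizes echo the values reported above, and the sweep range plus the treatment of "no filtering" as a size-1 kernel are assumptions made purely for illustration.

from scipy.ndimage import median_filter


def filtered_similarity(sim, size=17):
    """Suppress salt-and-pepper noise in a similarity matrix with a square
    median filter (e.g. size=17 for City Centre, size=11 for New College)."""
    return median_filter(sim, size=size)


def sweep_filter_sizes(sim, sizes=range(1, 60)):
    """Return filtered copies of the matrix for each candidate kernel size;
    size 1 leaves the matrix effectively unchanged (i.e., no filtering)."""
    return {k: median_filter(sim, size=k) for k in sizes}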
Results are quantified by average precision (AP), mean-per-class accuracy (MACC), precision (prec.), and recall. The latter three metrics were evaluated using the threshold that produced 254 | the highest MACC. 255 | 256 | \begin{table}[H] 257 | \caption{Evaluation results for additional models on the \textit{City Centre} dataset.} 258 | \label{tab:aux_results_city} 259 | \centering 260 | \begin{tabular}{*{5}{|c}|} 261 | \hline 262 | \multirow{2}{*}{Model} & \multicolumn{4}{c|}{\textit{City Centre}} \\ 263 | \cline{2-5} 264 | & AP & MACC & Prec. & Recall \\ 265 | \hline\hline 266 | Inception V1 & \textbf{18.0\%} & 72.4\% & 1.66\% & 95.7\% \\ 267 | Inception V2 & 17.9\% & \textbf{82.7\%} & 17.0\% & 68.3\% \\ 268 | Inception V3 & 17.1\% & 54.2\% & 20.4\% & 8.58\% \\ 269 | Inception V4 & 8.97\% & 51.1\% & 9.01\% & \textbf{99.4\%} \\ 270 | NASNet & 14.3\% & 59.2\% & \textbf{21.8\%} & 18.9\% \\ 271 | ResNet V2 152 & 10.9\% & 81.1\% & 5.08\% & 74.5\% \\ 272 | \hline 273 | \end{tabular} 274 | \end{table} 275 | 276 | \begin{table}[H] 277 | \caption{Evaluation results for additional models on the \textit{New College} dataset.} 278 | \label{tab:aux_results_college} 279 | \centering 280 | \begin{tabular}{*{5}{|c}|} 281 | \hline 282 | \multirow{2}{*}{Model} & \multicolumn{4}{c|}{\textit{New College}} \\ 283 | \cline{2-5} 284 | & AP & MACC & Prec. & Recall \\ 285 | \hline\hline 286 | Inception V1 & 16.4\% & \textbf{88.4\%} & 6.33\% & 85.0\% \\ 287 | Inception V2 & 16.4\% & 85.1\% & \textbf{11.9\%} & 73.8\% \\ 288 | Inception V3 & \textbf{17.1\%} & 81.8\% & 2.26\% & 88.5\% \\ 289 | Inception V4 & 9.99\% & 79.6\% & 6.91\% & 64.8\% \\ 290 | NASNet & 9.90\% & 80.5\% & 5.24\% & 69.0\% \\ 291 | ResNet V2 152 & 8.33\% & 85.9\% & 3.12\% & \textbf{89.8\%} \\ 292 | \hline 293 | \end{tabular} 294 | \end{table} 295 | 296 | 297 | 298 | \end{document} 299 | 300 | --------------------------------------------------------------------------------