├── .gitignore
├── .gitmodules
├── .travis.yml
├── LICENSE
├── Pipfile
├── Pipfile.lock
├── README.md
├── bin
│   ├── run_backend.py
│   ├── run_workers.py
│   ├── start.py
│   ├── start_website.py
│   └── stats.py
├── cache
│   ├── cache.conf
│   ├── run_redis.sh
│   └── shutdown_redis.sh
├── client
│   ├── LICENSE
│   ├── MANIFEST.in
│   ├── README.md
│   ├── bin
│   │   └── urlabuse
│   ├── pyurlabuse
│   │   ├── __init__.py
│   │   └── api.py
│   ├── setup.py
│   └── tests
│       └── tests.py
├── doc
│   └── logo
│       └── logo-circl.png
├── requirements.txt
├── setup.py
├── urlabuse
│   ├── __init__.py
│   ├── exceptions.py
│   ├── helpers.py
│   └── urlabuse.py
└── website
    ├── 3drparty.sh
    ├── __init__.py
    ├── config
    │   └── config.ini.sample
    └── web
        ├── __init__.py
        ├── proxied.py
        ├── static
        │   ├── ajax-loader.gif
        │   └── main.js
        └── templates
            ├── 404.html
            ├── index.html
            └── url-report.html
/.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | *.log* 3 | 4 | # Configs 5 | redis.conf 6 | config.ini 7 | 8 | # Key files 9 | *.key 10 | 11 | # Py libs 12 | sphinxapi.py 13 | 14 | # JS libs 15 | angular.min.js 16 | ui-bootstrap-tpls.min.js 17 | 18 | # Packages stuff 19 | build 20 | dist 21 | *egg-info 22 | 23 | *.rdb 24 | 25 | .env 26 | website/secret_key 27 | -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CIRCL/url-abuse/3d2ae503ec6ecbee92f7b8010abd46afa5b52230/.gitmodules -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | 3 | python: 4 | - "3.6" 5 | - "3.6-dev" 6 | - "3.7-dev" 7 | 8 | sudo: required 9 | dist: xenial 10 | 11 | install: 12 | - pip install pipenv 13 | - pushd .. 14 | # Faup 15 | - git clone https://github.com/stricaud/faup.git 16 | - pushd faup/build 17 | - cmake .. && make 18 | - sudo make install 19 | - sudo ldconfig 20 | - popd 21 | # redis 22 | - git clone https://github.com/antirez/redis.git 23 | - pushd redis 24 | - git checkout 5.0 25 | - make 26 | - popd 27 | # Run uwhoisd 28 | - git clone https://github.com/Rafiot/uwhoisd.git 29 | - pushd uwhoisd 30 | - pipenv install 31 | - echo UWHOISD_HOME="'`pwd`'" > .env 32 | - pipenv run start.py 33 | - popd 34 | # Get back in the project directory 35 | - popd 36 | # Other Python deps 37 | - pipenv install 38 | - echo URLABUSE_HOME="'`pwd`'" > .env 39 | 40 | before_script: 41 | - cp website/config/config.ini.sample website/config/config.ini 42 | 43 | script: 44 | - pipenv run start.py 45 | - sleep 2 46 | - pipenv run start_website.py & 47 | - sleep 5 48 | - curl http://0.0.0.0:5200/ 49 | - pushd client 50 | - python tests/tests.py 51 | - popd 52 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU AFFERO GENERAL PUBLIC LICENSE 2 | Version 3, 19 November 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/> 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 7 | 8 | Preamble 9 | 10 | The GNU Affero General Public License is a free, copyleft license for 11 | software and other kinds of works, specifically designed to ensure 12 | cooperation with the community in the case of network server software.
13 | 14 | The licenses for most software and other practical works are designed 15 | to take away your freedom to share and change the works. By contrast, 16 | our General Public Licenses are intended to guarantee your freedom to 17 | share and change all versions of a program--to make sure it remains free 18 | software for all its users. 19 | 20 | When we speak of free software, we are referring to freedom, not 21 | price. Our General Public Licenses are designed to make sure that you 22 | have the freedom to distribute copies of free software (and charge for 23 | them if you wish), that you receive source code or can get it if you 24 | want it, that you can change the software or use pieces of it in new 25 | free programs, and that you know you can do these things. 26 | 27 | Developers that use our General Public Licenses protect your rights 28 | with two steps: (1) assert copyright on the software, and (2) offer 29 | you this License which gives you legal permission to copy, distribute 30 | and/or modify the software. 31 | 32 | A secondary benefit of defending all users' freedom is that 33 | improvements made in alternate versions of the program, if they 34 | receive widespread use, become available for other developers to 35 | incorporate. Many developers of free software are heartened and 36 | encouraged by the resulting cooperation. However, in the case of 37 | software used on network servers, this result may fail to come about. 38 | The GNU General Public License permits making a modified version and 39 | letting the public access it on a server without ever releasing its 40 | source code to the public. 41 | 42 | The GNU Affero General Public License is designed specifically to 43 | ensure that, in such cases, the modified source code becomes available 44 | to the community. It requires the operator of a network server to 45 | provide the source code of the modified version running there to the 46 | users of that server. Therefore, public use of a modified version, on 47 | a publicly accessible server, gives the public access to the source 48 | code of the modified version. 49 | 50 | An older license, called the Affero General Public License and 51 | published by Affero, was designed to accomplish similar goals. This is 52 | a different license, not a version of the Affero GPL, but Affero has 53 | released a new version of the Affero GPL which permits relicensing under 54 | this license. 55 | 56 | The precise terms and conditions for copying, distribution and 57 | modification follow. 58 | 59 | TERMS AND CONDITIONS 60 | 61 | 0. Definitions. 62 | 63 | "This License" refers to version 3 of the GNU Affero General Public License. 64 | 65 | "Copyright" also means copyright-like laws that apply to other kinds of 66 | works, such as semiconductor masks. 67 | 68 | "The Program" refers to any copyrightable work licensed under this 69 | License. Each licensee is addressed as "you". "Licensees" and 70 | "recipients" may be individuals or organizations. 71 | 72 | To "modify" a work means to copy from or adapt all or part of the work 73 | in a fashion requiring copyright permission, other than the making of an 74 | exact copy. The resulting work is called a "modified version" of the 75 | earlier work or a work "based on" the earlier work. 76 | 77 | A "covered work" means either the unmodified Program or a work based 78 | on the Program. 
79 | 80 | To "propagate" a work means to do anything with it that, without 81 | permission, would make you directly or secondarily liable for 82 | infringement under applicable copyright law, except executing it on a 83 | computer or modifying a private copy. Propagation includes copying, 84 | distribution (with or without modification), making available to the 85 | public, and in some countries other activities as well. 86 | 87 | To "convey" a work means any kind of propagation that enables other 88 | parties to make or receive copies. Mere interaction with a user through 89 | a computer network, with no transfer of a copy, is not conveying. 90 | 91 | An interactive user interface displays "Appropriate Legal Notices" 92 | to the extent that it includes a convenient and prominently visible 93 | feature that (1) displays an appropriate copyright notice, and (2) 94 | tells the user that there is no warranty for the work (except to the 95 | extent that warranties are provided), that licensees may convey the 96 | work under this License, and how to view a copy of this License. If 97 | the interface presents a list of user commands or options, such as a 98 | menu, a prominent item in the list meets this criterion. 99 | 100 | 1. Source Code. 101 | 102 | The "source code" for a work means the preferred form of the work 103 | for making modifications to it. "Object code" means any non-source 104 | form of a work. 105 | 106 | A "Standard Interface" means an interface that either is an official 107 | standard defined by a recognized standards body, or, in the case of 108 | interfaces specified for a particular programming language, one that 109 | is widely used among developers working in that language. 110 | 111 | The "System Libraries" of an executable work include anything, other 112 | than the work as a whole, that (a) is included in the normal form of 113 | packaging a Major Component, but which is not part of that Major 114 | Component, and (b) serves only to enable use of the work with that 115 | Major Component, or to implement a Standard Interface for which an 116 | implementation is available to the public in source code form. A 117 | "Major Component", in this context, means a major essential component 118 | (kernel, window system, and so on) of the specific operating system 119 | (if any) on which the executable work runs, or a compiler used to 120 | produce the work, or an object code interpreter used to run it. 121 | 122 | The "Corresponding Source" for a work in object code form means all 123 | the source code needed to generate, install, and (for an executable 124 | work) run the object code and to modify the work, including scripts to 125 | control those activities. However, it does not include the work's 126 | System Libraries, or general-purpose tools or generally available free 127 | programs which are used unmodified in performing those activities but 128 | which are not part of the work. For example, Corresponding Source 129 | includes interface definition files associated with source files for 130 | the work, and the source code for shared libraries and dynamically 131 | linked subprograms that the work is specifically designed to require, 132 | such as by intimate data communication or control flow between those 133 | subprograms and other parts of the work. 134 | 135 | The Corresponding Source need not include anything that users 136 | can regenerate automatically from other parts of the Corresponding 137 | Source. 
138 | 139 | The Corresponding Source for a work in source code form is that 140 | same work. 141 | 142 | 2. Basic Permissions. 143 | 144 | All rights granted under this License are granted for the term of 145 | copyright on the Program, and are irrevocable provided the stated 146 | conditions are met. This License explicitly affirms your unlimited 147 | permission to run the unmodified Program. The output from running a 148 | covered work is covered by this License only if the output, given its 149 | content, constitutes a covered work. This License acknowledges your 150 | rights of fair use or other equivalent, as provided by copyright law. 151 | 152 | You may make, run and propagate covered works that you do not 153 | convey, without conditions so long as your license otherwise remains 154 | in force. You may convey covered works to others for the sole purpose 155 | of having them make modifications exclusively for you, or provide you 156 | with facilities for running those works, provided that you comply with 157 | the terms of this License in conveying all material for which you do 158 | not control copyright. Those thus making or running the covered works 159 | for you must do so exclusively on your behalf, under your direction 160 | and control, on terms that prohibit them from making any copies of 161 | your copyrighted material outside their relationship with you. 162 | 163 | Conveying under any other circumstances is permitted solely under 164 | the conditions stated below. Sublicensing is not allowed; section 10 165 | makes it unnecessary. 166 | 167 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law. 168 | 169 | No covered work shall be deemed part of an effective technological 170 | measure under any applicable law fulfilling obligations under article 171 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or 172 | similar laws prohibiting or restricting circumvention of such 173 | measures. 174 | 175 | When you convey a covered work, you waive any legal power to forbid 176 | circumvention of technological measures to the extent such circumvention 177 | is effected by exercising rights under this License with respect to 178 | the covered work, and you disclaim any intention to limit operation or 179 | modification of the work as a means of enforcing, against the work's 180 | users, your or third parties' legal rights to forbid circumvention of 181 | technological measures. 182 | 183 | 4. Conveying Verbatim Copies. 184 | 185 | You may convey verbatim copies of the Program's source code as you 186 | receive it, in any medium, provided that you conspicuously and 187 | appropriately publish on each copy an appropriate copyright notice; 188 | keep intact all notices stating that this License and any 189 | non-permissive terms added in accord with section 7 apply to the code; 190 | keep intact all notices of the absence of any warranty; and give all 191 | recipients a copy of this License along with the Program. 192 | 193 | You may charge any price or no price for each copy that you convey, 194 | and you may offer support or warranty protection for a fee. 195 | 196 | 5. Conveying Modified Source Versions. 197 | 198 | You may convey a work based on the Program, or the modifications to 199 | produce it from the Program, in the form of source code under the 200 | terms of section 4, provided that you also meet all of these conditions: 201 | 202 | a) The work must carry prominent notices stating that you modified 203 | it, and giving a relevant date. 
204 | 205 | b) The work must carry prominent notices stating that it is 206 | released under this License and any conditions added under section 207 | 7. This requirement modifies the requirement in section 4 to 208 | "keep intact all notices". 209 | 210 | c) You must license the entire work, as a whole, under this 211 | License to anyone who comes into possession of a copy. This 212 | License will therefore apply, along with any applicable section 7 213 | additional terms, to the whole of the work, and all its parts, 214 | regardless of how they are packaged. This License gives no 215 | permission to license the work in any other way, but it does not 216 | invalidate such permission if you have separately received it. 217 | 218 | d) If the work has interactive user interfaces, each must display 219 | Appropriate Legal Notices; however, if the Program has interactive 220 | interfaces that do not display Appropriate Legal Notices, your 221 | work need not make them do so. 222 | 223 | A compilation of a covered work with other separate and independent 224 | works, which are not by their nature extensions of the covered work, 225 | and which are not combined with it such as to form a larger program, 226 | in or on a volume of a storage or distribution medium, is called an 227 | "aggregate" if the compilation and its resulting copyright are not 228 | used to limit the access or legal rights of the compilation's users 229 | beyond what the individual works permit. Inclusion of a covered work 230 | in an aggregate does not cause this License to apply to the other 231 | parts of the aggregate. 232 | 233 | 6. Conveying Non-Source Forms. 234 | 235 | You may convey a covered work in object code form under the terms 236 | of sections 4 and 5, provided that you also convey the 237 | machine-readable Corresponding Source under the terms of this License, 238 | in one of these ways: 239 | 240 | a) Convey the object code in, or embodied in, a physical product 241 | (including a physical distribution medium), accompanied by the 242 | Corresponding Source fixed on a durable physical medium 243 | customarily used for software interchange. 244 | 245 | b) Convey the object code in, or embodied in, a physical product 246 | (including a physical distribution medium), accompanied by a 247 | written offer, valid for at least three years and valid for as 248 | long as you offer spare parts or customer support for that product 249 | model, to give anyone who possesses the object code either (1) a 250 | copy of the Corresponding Source for all the software in the 251 | product that is covered by this License, on a durable physical 252 | medium customarily used for software interchange, for a price no 253 | more than your reasonable cost of physically performing this 254 | conveying of source, or (2) access to copy the 255 | Corresponding Source from a network server at no charge. 256 | 257 | c) Convey individual copies of the object code with a copy of the 258 | written offer to provide the Corresponding Source. This 259 | alternative is allowed only occasionally and noncommercially, and 260 | only if you received the object code with such an offer, in accord 261 | with subsection 6b. 262 | 263 | d) Convey the object code by offering access from a designated 264 | place (gratis or for a charge), and offer equivalent access to the 265 | Corresponding Source in the same way through the same place at no 266 | further charge. 
You need not require recipients to copy the 267 | Corresponding Source along with the object code. If the place to 268 | copy the object code is a network server, the Corresponding Source 269 | may be on a different server (operated by you or a third party) 270 | that supports equivalent copying facilities, provided you maintain 271 | clear directions next to the object code saying where to find the 272 | Corresponding Source. Regardless of what server hosts the 273 | Corresponding Source, you remain obligated to ensure that it is 274 | available for as long as needed to satisfy these requirements. 275 | 276 | e) Convey the object code using peer-to-peer transmission, provided 277 | you inform other peers where the object code and Corresponding 278 | Source of the work are being offered to the general public at no 279 | charge under subsection 6d. 280 | 281 | A separable portion of the object code, whose source code is excluded 282 | from the Corresponding Source as a System Library, need not be 283 | included in conveying the object code work. 284 | 285 | A "User Product" is either (1) a "consumer product", which means any 286 | tangible personal property which is normally used for personal, family, 287 | or household purposes, or (2) anything designed or sold for incorporation 288 | into a dwelling. In determining whether a product is a consumer product, 289 | doubtful cases shall be resolved in favor of coverage. For a particular 290 | product received by a particular user, "normally used" refers to a 291 | typical or common use of that class of product, regardless of the status 292 | of the particular user or of the way in which the particular user 293 | actually uses, or expects or is expected to use, the product. A product 294 | is a consumer product regardless of whether the product has substantial 295 | commercial, industrial or non-consumer uses, unless such uses represent 296 | the only significant mode of use of the product. 297 | 298 | "Installation Information" for a User Product means any methods, 299 | procedures, authorization keys, or other information required to install 300 | and execute modified versions of a covered work in that User Product from 301 | a modified version of its Corresponding Source. The information must 302 | suffice to ensure that the continued functioning of the modified object 303 | code is in no case prevented or interfered with solely because 304 | modification has been made. 305 | 306 | If you convey an object code work under this section in, or with, or 307 | specifically for use in, a User Product, and the conveying occurs as 308 | part of a transaction in which the right of possession and use of the 309 | User Product is transferred to the recipient in perpetuity or for a 310 | fixed term (regardless of how the transaction is characterized), the 311 | Corresponding Source conveyed under this section must be accompanied 312 | by the Installation Information. But this requirement does not apply 313 | if neither you nor any third party retains the ability to install 314 | modified object code on the User Product (for example, the work has 315 | been installed in ROM). 316 | 317 | The requirement to provide Installation Information does not include a 318 | requirement to continue to provide support service, warranty, or updates 319 | for a work that has been modified or installed by the recipient, or for 320 | the User Product in which it has been modified or installed. 
Access to a 321 | network may be denied when the modification itself materially and 322 | adversely affects the operation of the network or violates the rules and 323 | protocols for communication across the network. 324 | 325 | Corresponding Source conveyed, and Installation Information provided, 326 | in accord with this section must be in a format that is publicly 327 | documented (and with an implementation available to the public in 328 | source code form), and must require no special password or key for 329 | unpacking, reading or copying. 330 | 331 | 7. Additional Terms. 332 | 333 | "Additional permissions" are terms that supplement the terms of this 334 | License by making exceptions from one or more of its conditions. 335 | Additional permissions that are applicable to the entire Program shall 336 | be treated as though they were included in this License, to the extent 337 | that they are valid under applicable law. If additional permissions 338 | apply only to part of the Program, that part may be used separately 339 | under those permissions, but the entire Program remains governed by 340 | this License without regard to the additional permissions. 341 | 342 | When you convey a copy of a covered work, you may at your option 343 | remove any additional permissions from that copy, or from any part of 344 | it. (Additional permissions may be written to require their own 345 | removal in certain cases when you modify the work.) You may place 346 | additional permissions on material, added by you to a covered work, 347 | for which you have or can give appropriate copyright permission. 348 | 349 | Notwithstanding any other provision of this License, for material you 350 | add to a covered work, you may (if authorized by the copyright holders of 351 | that material) supplement the terms of this License with terms: 352 | 353 | a) Disclaiming warranty or limiting liability differently from the 354 | terms of sections 15 and 16 of this License; or 355 | 356 | b) Requiring preservation of specified reasonable legal notices or 357 | author attributions in that material or in the Appropriate Legal 358 | Notices displayed by works containing it; or 359 | 360 | c) Prohibiting misrepresentation of the origin of that material, or 361 | requiring that modified versions of such material be marked in 362 | reasonable ways as different from the original version; or 363 | 364 | d) Limiting the use for publicity purposes of names of licensors or 365 | authors of the material; or 366 | 367 | e) Declining to grant rights under trademark law for use of some 368 | trade names, trademarks, or service marks; or 369 | 370 | f) Requiring indemnification of licensors and authors of that 371 | material by anyone who conveys the material (or modified versions of 372 | it) with contractual assumptions of liability to the recipient, for 373 | any liability that these contractual assumptions directly impose on 374 | those licensors and authors. 375 | 376 | All other non-permissive additional terms are considered "further 377 | restrictions" within the meaning of section 10. If the Program as you 378 | received it, or any part of it, contains a notice stating that it is 379 | governed by this License along with a term that is a further 380 | restriction, you may remove that term. 
If a license document contains 381 | a further restriction but permits relicensing or conveying under this 382 | License, you may add to a covered work material governed by the terms 383 | of that license document, provided that the further restriction does 384 | not survive such relicensing or conveying. 385 | 386 | If you add terms to a covered work in accord with this section, you 387 | must place, in the relevant source files, a statement of the 388 | additional terms that apply to those files, or a notice indicating 389 | where to find the applicable terms. 390 | 391 | Additional terms, permissive or non-permissive, may be stated in the 392 | form of a separately written license, or stated as exceptions; 393 | the above requirements apply either way. 394 | 395 | 8. Termination. 396 | 397 | You may not propagate or modify a covered work except as expressly 398 | provided under this License. Any attempt otherwise to propagate or 399 | modify it is void, and will automatically terminate your rights under 400 | this License (including any patent licenses granted under the third 401 | paragraph of section 11). 402 | 403 | However, if you cease all violation of this License, then your 404 | license from a particular copyright holder is reinstated (a) 405 | provisionally, unless and until the copyright holder explicitly and 406 | finally terminates your license, and (b) permanently, if the copyright 407 | holder fails to notify you of the violation by some reasonable means 408 | prior to 60 days after the cessation. 409 | 410 | Moreover, your license from a particular copyright holder is 411 | reinstated permanently if the copyright holder notifies you of the 412 | violation by some reasonable means, this is the first time you have 413 | received notice of violation of this License (for any work) from that 414 | copyright holder, and you cure the violation prior to 30 days after 415 | your receipt of the notice. 416 | 417 | Termination of your rights under this section does not terminate the 418 | licenses of parties who have received copies or rights from you under 419 | this License. If your rights have been terminated and not permanently 420 | reinstated, you do not qualify to receive new licenses for the same 421 | material under section 10. 422 | 423 | 9. Acceptance Not Required for Having Copies. 424 | 425 | You are not required to accept this License in order to receive or 426 | run a copy of the Program. Ancillary propagation of a covered work 427 | occurring solely as a consequence of using peer-to-peer transmission 428 | to receive a copy likewise does not require acceptance. However, 429 | nothing other than this License grants you permission to propagate or 430 | modify any covered work. These actions infringe copyright if you do 431 | not accept this License. Therefore, by modifying or propagating a 432 | covered work, you indicate your acceptance of this License to do so. 433 | 434 | 10. Automatic Licensing of Downstream Recipients. 435 | 436 | Each time you convey a covered work, the recipient automatically 437 | receives a license from the original licensors, to run, modify and 438 | propagate that work, subject to this License. You are not responsible 439 | for enforcing compliance by third parties with this License. 440 | 441 | An "entity transaction" is a transaction transferring control of an 442 | organization, or substantially all assets of one, or subdividing an 443 | organization, or merging organizations. 
If propagation of a covered 444 | work results from an entity transaction, each party to that 445 | transaction who receives a copy of the work also receives whatever 446 | licenses to the work the party's predecessor in interest had or could 447 | give under the previous paragraph, plus a right to possession of the 448 | Corresponding Source of the work from the predecessor in interest, if 449 | the predecessor has it or can get it with reasonable efforts. 450 | 451 | You may not impose any further restrictions on the exercise of the 452 | rights granted or affirmed under this License. For example, you may 453 | not impose a license fee, royalty, or other charge for exercise of 454 | rights granted under this License, and you may not initiate litigation 455 | (including a cross-claim or counterclaim in a lawsuit) alleging that 456 | any patent claim is infringed by making, using, selling, offering for 457 | sale, or importing the Program or any portion of it. 458 | 459 | 11. Patents. 460 | 461 | A "contributor" is a copyright holder who authorizes use under this 462 | License of the Program or a work on which the Program is based. The 463 | work thus licensed is called the contributor's "contributor version". 464 | 465 | A contributor's "essential patent claims" are all patent claims 466 | owned or controlled by the contributor, whether already acquired or 467 | hereafter acquired, that would be infringed by some manner, permitted 468 | by this License, of making, using, or selling its contributor version, 469 | but do not include claims that would be infringed only as a 470 | consequence of further modification of the contributor version. For 471 | purposes of this definition, "control" includes the right to grant 472 | patent sublicenses in a manner consistent with the requirements of 473 | this License. 474 | 475 | Each contributor grants you a non-exclusive, worldwide, royalty-free 476 | patent license under the contributor's essential patent claims, to 477 | make, use, sell, offer for sale, import and otherwise run, modify and 478 | propagate the contents of its contributor version. 479 | 480 | In the following three paragraphs, a "patent license" is any express 481 | agreement or commitment, however denominated, not to enforce a patent 482 | (such as an express permission to practice a patent or covenant not to 483 | sue for patent infringement). To "grant" such a patent license to a 484 | party means to make such an agreement or commitment not to enforce a 485 | patent against the party. 486 | 487 | If you convey a covered work, knowingly relying on a patent license, 488 | and the Corresponding Source of the work is not available for anyone 489 | to copy, free of charge and under the terms of this License, through a 490 | publicly available network server or other readily accessible means, 491 | then you must either (1) cause the Corresponding Source to be so 492 | available, or (2) arrange to deprive yourself of the benefit of the 493 | patent license for this particular work, or (3) arrange, in a manner 494 | consistent with the requirements of this License, to extend the patent 495 | license to downstream recipients. "Knowingly relying" means you have 496 | actual knowledge that, but for the patent license, your conveying the 497 | covered work in a country, or your recipient's use of the covered work 498 | in a country, would infringe one or more identifiable patents in that 499 | country that you have reason to believe are valid. 
500 | 501 | If, pursuant to or in connection with a single transaction or 502 | arrangement, you convey, or propagate by procuring conveyance of, a 503 | covered work, and grant a patent license to some of the parties 504 | receiving the covered work authorizing them to use, propagate, modify 505 | or convey a specific copy of the covered work, then the patent license 506 | you grant is automatically extended to all recipients of the covered 507 | work and works based on it. 508 | 509 | A patent license is "discriminatory" if it does not include within 510 | the scope of its coverage, prohibits the exercise of, or is 511 | conditioned on the non-exercise of one or more of the rights that are 512 | specifically granted under this License. You may not convey a covered 513 | work if you are a party to an arrangement with a third party that is 514 | in the business of distributing software, under which you make payment 515 | to the third party based on the extent of your activity of conveying 516 | the work, and under which the third party grants, to any of the 517 | parties who would receive the covered work from you, a discriminatory 518 | patent license (a) in connection with copies of the covered work 519 | conveyed by you (or copies made from those copies), or (b) primarily 520 | for and in connection with specific products or compilations that 521 | contain the covered work, unless you entered into that arrangement, 522 | or that patent license was granted, prior to 28 March 2007. 523 | 524 | Nothing in this License shall be construed as excluding or limiting 525 | any implied license or other defenses to infringement that may 526 | otherwise be available to you under applicable patent law. 527 | 528 | 12. No Surrender of Others' Freedom. 529 | 530 | If conditions are imposed on you (whether by court order, agreement or 531 | otherwise) that contradict the conditions of this License, they do not 532 | excuse you from the conditions of this License. If you cannot convey a 533 | covered work so as to satisfy simultaneously your obligations under this 534 | License and any other pertinent obligations, then as a consequence you may 535 | not convey it at all. For example, if you agree to terms that obligate you 536 | to collect a royalty for further conveying from those to whom you convey 537 | the Program, the only way you could satisfy both those terms and this 538 | License would be to refrain entirely from conveying the Program. 539 | 540 | 13. Remote Network Interaction; Use with the GNU General Public License. 541 | 542 | Notwithstanding any other provision of this License, if you modify the 543 | Program, your modified version must prominently offer all users 544 | interacting with it remotely through a computer network (if your version 545 | supports such interaction) an opportunity to receive the Corresponding 546 | Source of your version by providing access to the Corresponding Source 547 | from a network server at no charge, through some standard or customary 548 | means of facilitating copying of software. This Corresponding Source 549 | shall include the Corresponding Source for any work covered by version 3 550 | of the GNU General Public License that is incorporated pursuant to the 551 | following paragraph. 552 | 553 | Notwithstanding any other provision of this License, you have 554 | permission to link or combine any covered work with a work licensed 555 | under version 3 of the GNU General Public License into a single 556 | combined work, and to convey the resulting work. 
The terms of this 557 | License will continue to apply to the part which is the covered work, 558 | but the work with which it is combined will remain governed by version 559 | 3 of the GNU General Public License. 560 | 561 | 14. Revised Versions of this License. 562 | 563 | The Free Software Foundation may publish revised and/or new versions of 564 | the GNU Affero General Public License from time to time. Such new versions 565 | will be similar in spirit to the present version, but may differ in detail to 566 | address new problems or concerns. 567 | 568 | Each version is given a distinguishing version number. If the 569 | Program specifies that a certain numbered version of the GNU Affero General 570 | Public License "or any later version" applies to it, you have the 571 | option of following the terms and conditions either of that numbered 572 | version or of any later version published by the Free Software 573 | Foundation. If the Program does not specify a version number of the 574 | GNU Affero General Public License, you may choose any version ever published 575 | by the Free Software Foundation. 576 | 577 | If the Program specifies that a proxy can decide which future 578 | versions of the GNU Affero General Public License can be used, that proxy's 579 | public statement of acceptance of a version permanently authorizes you 580 | to choose that version for the Program. 581 | 582 | Later license versions may give you additional or different 583 | permissions. However, no additional obligations are imposed on any 584 | author or copyright holder as a result of your choosing to follow a 585 | later version. 586 | 587 | 15. Disclaimer of Warranty. 588 | 589 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 590 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT 591 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY 592 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, 593 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 594 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM 595 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF 596 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 597 | 598 | 16. Limitation of Liability. 599 | 600 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 601 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS 602 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY 603 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE 604 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF 605 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD 606 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), 607 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF 608 | SUCH DAMAGES. 609 | 610 | 17. Interpretation of Sections 15 and 16. 611 | 612 | If the disclaimer of warranty and limitation of liability provided 613 | above cannot be given local legal effect according to their terms, 614 | reviewing courts shall apply local law that most closely approximates 615 | an absolute waiver of all civil liability in connection with the 616 | Program, unless a warranty or assumption of liability accompanies a 617 | copy of the Program in return for a fee. 
618 | 619 | END OF TERMS AND CONDITIONS 620 | 621 | How to Apply These Terms to Your New Programs 622 | 623 | If you develop a new program, and you want it to be of the greatest 624 | possible use to the public, the best way to achieve this is to make it 625 | free software which everyone can redistribute and change under these terms. 626 | 627 | To do so, attach the following notices to the program. It is safest 628 | to attach them to the start of each source file to most effectively 629 | state the exclusion of warranty; and each file should have at least 630 | the "copyright" line and a pointer to where the full notice is found. 631 | 632 | <one line to give the program's name and a brief idea of what it does.> 633 | Copyright (C) <year> <name of author> 634 | 635 | This program is free software: you can redistribute it and/or modify 636 | it under the terms of the GNU Affero General Public License as published by 637 | the Free Software Foundation, either version 3 of the License, or 638 | (at your option) any later version. 639 | 640 | This program is distributed in the hope that it will be useful, 641 | but WITHOUT ANY WARRANTY; without even the implied warranty of 642 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 643 | GNU Affero General Public License for more details. 644 | 645 | You should have received a copy of the GNU Affero General Public License 646 | along with this program. If not, see <https://www.gnu.org/licenses/>. 647 | 648 | Also add information on how to contact you by electronic and paper mail. 649 | 650 | If your software can interact with users remotely through a computer 651 | network, you should also make sure that it provides a way for users to 652 | get its source. For example, if your program is a web application, its 653 | interface could display a "Source" link that leads users to an archive 654 | of the code. There are many ways you could offer source, and different 655 | solutions will be better for different programs; see section 13 for the 656 | specific requirements. 657 | 658 | You should also get your employer (if you work as a programmer) or school, 659 | if any, to sign a "copyright disclaimer" for the program, if necessary. 660 | For more information on this, and how to apply and follow the GNU AGPL, see 661 | <https://www.gnu.org/licenses/>.
662 | -------------------------------------------------------------------------------- /Pipfile: -------------------------------------------------------------------------------- 1 | [[source]] 2 | name = "pypi" 3 | url = "https://pypi.org/simple" 4 | verify_ssl = true 5 | 6 | [dev-packages] 7 | 8 | [packages] 9 | redis = ">=3" 10 | pypssl = "*" 11 | pypdns = "*" 12 | pyeupi = "*" 13 | dnspython = "*" 14 | beautifulsoup4 = "*" 15 | pyipasnhistory = {editable = true,git = "https://github.com/D4-project/IPASN-History.git/",subdirectory = "client"} 16 | pybgpranking = {editable = true,git = "https://github.com/D4-project/BGP-Ranking.git/",subdirectory = "client"} 17 | flask = "*" 18 | flask-bootstrap = "*" 19 | flask-mail = "*" 20 | flask-wtf = "*" 21 | gunicorn = {extras = ["gevent"],version = "*"} 22 | url-abuse = {editable = true,path = "."} 23 | pyurlabuse = {editable = true,path = "./client"} 24 | pyfaup = {editable = true,git = "https://github.com/stricaud/faup.git/",subdirectory = "src/lib/bindings/python/"} 25 | pylookyloo = {editable = true,git = "https://github.com/CIRCL/lookyloo.git/",subdirectory = "client"} 26 | Jinja2 = ">=2.10.1" # CVE-2019-10906 27 | werkzeug = ">=0.15.3" # CVE-2019-14806 28 | 29 | [requires] 30 | python_version = "3" 31 | -------------------------------------------------------------------------------- /Pipfile.lock: -------------------------------------------------------------------------------- 1 | { 2 | "_meta": { 3 | "hash": { 4 | "sha256": "a90d2e1e5b904d6df6258ff95c683a30ba3d58e4fcd34666585cb1b39ef87ae4" 5 | }, 6 | "pipfile-spec": 6, 7 | "requires": { 8 | "python_version": "3" 9 | }, 10 | "sources": [ 11 | { 12 | "name": "pypi", 13 | "url": "https://pypi.org/simple", 14 | "verify_ssl": true 15 | } 16 | ] 17 | }, 18 | "default": { 19 | "beautifulsoup4": { 20 | "hashes": [ 21 | "sha256:05668158c7b85b791c5abde53e50265e16f98ad601c402ba44d70f96c4159612", 22 | "sha256:25288c9e176f354bf277c0a10aa96c782a6a18a17122dba2e8cec4a97e03343b", 23 | "sha256:f040590be10520f2ea4c2ae8c3dae441c7cfff5308ec9d58a0ec0c1b8f81d469" 24 | ], 25 | "index": "pypi", 26 | "version": "==4.8.0" 27 | }, 28 | "blinker": { 29 | "hashes": [ 30 | "sha256:471aee25f3992bd325afa3772f1063dbdbbca947a041b8b89466dc00d606f8b6" 31 | ], 32 | "version": "==1.4" 33 | }, 34 | "certifi": { 35 | "hashes": [ 36 | "sha256:2bbf76fd432960138b3ef6dda3dde0544f27cbf8546c458e60baf371917ba9ee", 37 | "sha256:50b1e4f8446b06f41be7dd6338db18e0990601dce795c2b1686458aa7e8fa7d8" 38 | ], 39 | "version": "==2021.5.30" 40 | }, 41 | "chardet": { 42 | "hashes": [ 43 | "sha256:0d6f53a15db4120f2b08c94f11e7d93d2c911ee118b6b30a04ec3ee8310179fa", 44 | "sha256:f864054d66fd9118f2e67044ac8981a54775ec5b67aed0441892edb553d21da5" 45 | ], 46 | "version": "==4.0.0" 47 | }, 48 | "click": { 49 | "hashes": [ 50 | "sha256:8c04c11192119b1ef78ea049e0a6f0463e4c48ef00a30160c704337586f3ad7a", 51 | "sha256:fba402a4a47334742d782209a7c79bc448911afe1149d07bdabdf480b3e2f4b6" 52 | ], 53 | "version": "==8.0.1" 54 | }, 55 | "dnspython": { 56 | "hashes": [ 57 | "sha256:36c5e8e38d4369a08b6780b7f27d790a292b2b08eea01607865bf0936c558e01", 58 | "sha256:f69c21288a962f4da86e56c4905b49d11aba7938d3d740e80d9e366ee4f1632d" 59 | ], 60 | "index": "pypi", 61 | "version": "==1.16.0" 62 | }, 63 | "dominate": { 64 | "hashes": [ 65 | "sha256:76ec2cde23700a6fc4fee098168b9dee43b99c2f1dd0ca6a711f683e8eb7e1e4", 66 | "sha256:84b5f71ed30021193cb0faa45d7776e1083f392cfe67a49f44e98cb2ed76c036" 67 | ], 68 | "version": "==2.6.0" 69 | }, 70 | "flask": { 71 | "hashes": [ 72 | 
"sha256:13f9f196f330c7c2c5d7a5cf91af894110ca0215ac051b5844701f2bfd934d52", 73 | "sha256:45eb5a6fd193d6cf7e0cf5d8a5b31f83d5faae0293695626f539a823e93b13f6" 74 | ], 75 | "index": "pypi", 76 | "version": "==1.1.1" 77 | }, 78 | "flask-bootstrap": { 79 | "hashes": [ 80 | "sha256:cb08ed940183f6343a64e465e83b3a3f13c53e1baabb8d72b5da4545ef123ac8" 81 | ], 82 | "index": "pypi", 83 | "version": "==3.3.7.1" 84 | }, 85 | "flask-mail": { 86 | "hashes": [ 87 | "sha256:22e5eb9a940bf407bcf30410ecc3708f3c56cc44b29c34e1726fe85006935f41" 88 | ], 89 | "index": "pypi", 90 | "version": "==0.9.1" 91 | }, 92 | "flask-wtf": { 93 | "hashes": [ 94 | "sha256:5d14d55cfd35f613d99ee7cba0fc3fbbe63ba02f544d349158c14ca15561cc36", 95 | "sha256:d9a9e366b32dcbb98ef17228e76be15702cd2600675668bca23f63a7947fd5ac" 96 | ], 97 | "index": "pypi", 98 | "version": "==0.14.2" 99 | }, 100 | "gevent": { 101 | "hashes": [ 102 | "sha256:16574e4aa902ebc7bad564e25aa9740a82620fdeb61e0bbf5cbc32e84c13cb6a", 103 | "sha256:188c3c6da67e17ffa28f960fc80f8b7e4ba0f4efdc7519822c9d3a1784ca78ea", 104 | "sha256:1e5af63e452cc1758924528a2ba6d3e472f5338e1534b7233cd01d3429fc1082", 105 | "sha256:242e32cc011ad7127525ca9181aef3379ce4ad9c733aefe311ecf90248ad9a6f", 106 | "sha256:2a9ae0a0fd956cbbc9c326b8f290dcad2b58acfb2e2732855fe1155fb110a04d", 107 | "sha256:33741e3cd51b90483b14f73b6a3b32b779acf965aeb91d22770c0c8e0c937b73", 108 | "sha256:3694f393ab08372bd337b9bc8eebef3ccab3c1623ef94536762a1eee68821449", 109 | "sha256:464ec84001ba5108a9022aded4c5e69ea4d13ef11a2386d3ec37c1d08f3074c9", 110 | "sha256:520cc2a029a9eef436e4e56b007af7859315cafa21937d43c1d5269f12f2c981", 111 | "sha256:77b65a68c83e1c680f52dc39d5e5406763dd10a18ce08420665504b6f047962e", 112 | "sha256:7bdfee07be5eee4f687bf90c54c2a65c909bcf2b6c4878faee51218ffa5d5d3e", 113 | "sha256:969743debf89d6409423aaeae978437cc042247f91f5801e946a07a0a3b59148", 114 | "sha256:96f704561a9dd9a817c67f2e279e23bfad6166cf95d63d35c501317e17f68bcf", 115 | "sha256:9f99c3ec61daed54dc074fbcf1a86bcf795b9dfac2f6d4cdae6dfdb8a9125692", 116 | "sha256:a130a1885603eabd8cea11b3e1c3c7333d4341b537eca7f0c4794cb5c7120db1", 117 | "sha256:a54b9c7516c211045d7897a73a4ccdc116b3720c9ad3c591ef9592b735202a3b", 118 | "sha256:ac98570649d9c276e39501a1d1cbf6c652b78f57a0eb1445c5ff25ff80336b63", 119 | "sha256:afaeda9a7e8e93d0d86bf1d65affe912366294913fe43f0d107145dc32cd9545", 120 | "sha256:b6ffc1131e017aafa70d7ec19cc24010b19daa2f11d5dc2dc191a79c3c9ea147", 121 | "sha256:ba0c6ad94614e9af4240affbe1b4839c54da5a0a7e60806c6f7f69c1a7f5426e", 122 | "sha256:bdb3677e77ab4ebf20c4752ac49f3b1e47445678dd69f82f9905362c68196456", 123 | "sha256:c2c4326bb507754ef354635c05f560a217c171d80f26ca65bea81aa59b1ac179", 124 | "sha256:cfb2878c2ecf27baea436bb9c4d8ab8c2fa7763c3916386d5602992b6a056ff3", 125 | "sha256:e370e0a861db6f63c75e74b6ee56a40f5cdac90212ec404621445afa12bfc94b", 126 | "sha256:e8a5d9fcf5d031f2e4c499f5f4b53262face416e22e8769078354f641255a663", 127 | "sha256:ecff28416c99e0f73137f35849c3027cc3edde9dc13b7707825ebbf728623928", 128 | "sha256:f0498df97a303da77e180a9368c9228b0fc94d10dd2ce79fc5ebb63fec0d2fc9", 129 | "sha256:f91fd07b9cf642f24e58ed381e19ec33e28b8eee8726c19b026ea24fcc9ff897" 130 | ], 131 | "version": "==21.1.2" 132 | }, 133 | "greenlet": { 134 | "hashes": [ 135 | "sha256:03f28a5ea20201e70ab70518d151116ce939b412961c33827519ce620957d44c", 136 | "sha256:06d7ac89e6094a0a8f8dc46aa61898e9e1aec79b0f8b47b2400dd51a44dbc832", 137 | "sha256:06ecb43b04480e6bafc45cb1b4b67c785e183ce12c079473359e04a709333b08", 138 | 
"sha256:096cb0217d1505826ba3d723e8981096f2622cde1eb91af9ed89a17c10aa1f3e", 139 | "sha256:0c557c809eeee215b87e8a7cbfb2d783fb5598a78342c29ade561440abae7d22", 140 | "sha256:0de64d419b1cb1bfd4ea544bedea4b535ef3ae1e150b0f2609da14bbf48a4a5f", 141 | "sha256:14927b15c953f8f2d2a8dffa224aa78d7759ef95284d4c39e1745cf36e8cdd2c", 142 | "sha256:16183fa53bc1a037c38d75fdc59d6208181fa28024a12a7f64bb0884434c91ea", 143 | "sha256:206295d270f702bc27dbdbd7651e8ebe42d319139e0d90217b2074309a200da8", 144 | "sha256:22002259e5b7828b05600a762579fa2f8b33373ad95a0ee57b4d6109d0e589ad", 145 | "sha256:2325123ff3a8ecc10ca76f062445efef13b6cf5a23389e2df3c02a4a527b89bc", 146 | "sha256:258f9612aba0d06785143ee1cbf2d7361801c95489c0bd10c69d163ec5254a16", 147 | "sha256:3096286a6072553b5dbd5efbefc22297e9d06a05ac14ba017233fedaed7584a8", 148 | "sha256:3d13da093d44dee7535b91049e44dd2b5540c2a0e15df168404d3dd2626e0ec5", 149 | "sha256:408071b64e52192869129a205e5b463abda36eff0cebb19d6e63369440e4dc99", 150 | "sha256:598bcfd841e0b1d88e32e6a5ea48348a2c726461b05ff057c1b8692be9443c6e", 151 | "sha256:5d928e2e3c3906e0a29b43dc26d9b3d6e36921eee276786c4e7ad9ff5665c78a", 152 | "sha256:5f75e7f237428755d00e7460239a2482fa7e3970db56c8935bd60da3f0733e56", 153 | "sha256:60848099b76467ef09b62b0f4512e7e6f0a2c977357a036de602b653667f5f4c", 154 | "sha256:6b1d08f2e7f2048d77343279c4d4faa7aef168b3e36039cba1917fffb781a8ed", 155 | "sha256:70bd1bb271e9429e2793902dfd194b653221904a07cbf207c3139e2672d17959", 156 | "sha256:76ed710b4e953fc31c663b079d317c18f40235ba2e3d55f70ff80794f7b57922", 157 | "sha256:7920e3eccd26b7f4c661b746002f5ec5f0928076bd738d38d894bb359ce51927", 158 | "sha256:7db68f15486d412b8e2cfcd584bf3b3a000911d25779d081cbbae76d71bd1a7e", 159 | "sha256:8833e27949ea32d27f7e96930fa29404dd4f2feb13cce483daf52e8842ec246a", 160 | "sha256:944fbdd540712d5377a8795c840a97ff71e7f3221d3fddc98769a15a87b36131", 161 | "sha256:9a6b035aa2c5fcf3dbbf0e3a8a5bc75286fc2d4e6f9cfa738788b433ec894919", 162 | "sha256:9bdcff4b9051fb1aa4bba4fceff6a5f770c6be436408efd99b76fc827f2a9319", 163 | "sha256:a9017ff5fc2522e45562882ff481128631bf35da444775bc2776ac5c61d8bcae", 164 | "sha256:aa4230234d02e6f32f189fd40b59d5a968fe77e80f59c9c933384fe8ba535535", 165 | "sha256:ad80bb338cf9f8129c049837a42a43451fc7c8b57ad56f8e6d32e7697b115505", 166 | "sha256:adb94a28225005890d4cf73648b5131e885c7b4b17bc762779f061844aabcc11", 167 | "sha256:b3090631fecdf7e983d183d0fad7ea72cfb12fa9212461a9b708ff7907ffff47", 168 | "sha256:b33b51ab057f8a20b497ffafdb1e79256db0c03ef4f5e3d52e7497200e11f821", 169 | "sha256:b97c9a144bbeec7039cca44df117efcbeed7209543f5695201cacf05ba3b5857", 170 | "sha256:be13a18cec649ebaab835dff269e914679ef329204704869f2f167b2c163a9da", 171 | "sha256:be9768e56f92d1d7cd94185bab5856f3c5589a50d221c166cc2ad5eb134bd1dc", 172 | "sha256:c1580087ab493c6b43e66f2bdd165d9e3c1e86ef83f6c2c44a29f2869d2c5bd5", 173 | "sha256:c35872b2916ab5a240d52a94314c963476c989814ba9b519bc842e5b61b464bb", 174 | "sha256:c70c7dd733a4c56838d1f1781e769081a25fade879510c5b5f0df76956abfa05", 175 | "sha256:c767458511a59f6f597bfb0032a1c82a52c29ae228c2c0a6865cfeaeaac4c5f5", 176 | "sha256:c87df8ae3f01ffb4483c796fe1b15232ce2b219f0b18126948616224d3f658ee", 177 | "sha256:ca1c4a569232c063615f9e70ff9a1e2fee8c66a6fb5caf0f5e8b21a396deec3e", 178 | "sha256:cc407b68e0a874e7ece60f6639df46309376882152345508be94da608cc0b831", 179 | "sha256:da862b8f7de577bc421323714f63276acb2f759ab8c5e33335509f0b89e06b8f", 180 | "sha256:dfe7eac0d253915116ed0cd160a15a88981a1d194c1ef151e862a5c7d2f853d3", 181 | 
"sha256:ed1377feed808c9c1139bdb6a61bcbf030c236dd288d6fca71ac26906ab03ba6", 182 | "sha256:f42ad188466d946f1b3afc0a9e1a266ac8926461ee0786c06baac6bd71f8a6f3", 183 | "sha256:f92731609d6625e1cc26ff5757db4d32b6b810d2a3363b0ff94ff573e5901f6f" 184 | ], 185 | "markers": "platform_python_implementation == 'CPython'", 186 | "version": "==1.1.0" 187 | }, 188 | "gunicorn": { 189 | "extras": [ 190 | "gevent" 191 | ], 192 | "hashes": [ 193 | "sha256:aa8e0b40b4157b36a5df5e599f45c9c76d6af43845ba3b3b0efe2c70473c2471", 194 | "sha256:fa2662097c66f920f53f70621c6c58ca4a3c4d3434205e608e121b5b3b71f4f3" 195 | ], 196 | "index": "pypi", 197 | "version": "==19.9.0" 198 | }, 199 | "idna": { 200 | "hashes": [ 201 | "sha256:b307872f855b18632ce0c21c5e45be78c0ea7ae4c15c828c20788b26921eb3f6", 202 | "sha256:b97d804b1e9b523befed77c48dacec60e6dcb0b5391d57af6a65a312a90648c0" 203 | ], 204 | "version": "==2.10" 205 | }, 206 | "itsdangerous": { 207 | "hashes": [ 208 | "sha256:5174094b9637652bdb841a3029700391451bd092ba3db90600dea710ba28e97c", 209 | "sha256:9e724d68fc22902a1435351f84c3fb8623f303fffcc566a4cb952df8c572cff0" 210 | ], 211 | "version": "==2.0.1" 212 | }, 213 | "jinja2": { 214 | "hashes": [ 215 | "sha256:03e47ad063331dd6a3f04a43eddca8a966a26ba0c5b7207a9a9e4e08f1b29419", 216 | "sha256:a6d58433de0ae800347cab1fa3043cebbabe8baa9d29e668f1c768cb87a333c6" 217 | ], 218 | "index": "pypi", 219 | "version": "==2.11.3" 220 | }, 221 | "markupsafe": { 222 | "hashes": [ 223 | "sha256:01a9b8ea66f1658938f65b93a85ebe8bc016e6769611be228d797c9d998dd298", 224 | "sha256:023cb26ec21ece8dc3907c0e8320058b2e0cb3c55cf9564da612bc325bed5e64", 225 | "sha256:0446679737af14f45767963a1a9ef7620189912317d095f2d9ffa183a4d25d2b", 226 | "sha256:0717a7390a68be14b8c793ba258e075c6f4ca819f15edfc2a3a027c823718567", 227 | "sha256:0955295dd5eec6cb6cc2fe1698f4c6d84af2e92de33fbcac4111913cd100a6ff", 228 | "sha256:10f82115e21dc0dfec9ab5c0223652f7197feb168c940f3ef61563fc2d6beb74", 229 | "sha256:1d609f577dc6e1aa17d746f8bd3c31aa4d258f4070d61b2aa5c4166c1539de35", 230 | "sha256:2ef54abee730b502252bcdf31b10dacb0a416229b72c18b19e24a4509f273d26", 231 | "sha256:3c112550557578c26af18a1ccc9e090bfe03832ae994343cfdacd287db6a6ae7", 232 | "sha256:47ab1e7b91c098ab893b828deafa1203de86d0bc6ab587b160f78fe6c4011f75", 233 | "sha256:49e3ceeabbfb9d66c3aef5af3a60cc43b85c33df25ce03d0031a608b0a8b2e3f", 234 | "sha256:4efca8f86c54b22348a5467704e3fec767b2db12fc39c6d963168ab1d3fc9135", 235 | "sha256:53edb4da6925ad13c07b6d26c2a852bd81e364f95301c66e930ab2aef5b5ddd8", 236 | "sha256:594c67807fb16238b30c44bdf74f36c02cdf22d1c8cda91ef8a0ed8dabf5620a", 237 | "sha256:611d1ad9a4288cf3e3c16014564df047fe08410e628f89805e475368bd304914", 238 | "sha256:6557b31b5e2c9ddf0de32a691f2312a32f77cd7681d8af66c2692efdbef84c18", 239 | "sha256:693ce3f9e70a6cf7d2fb9e6c9d8b204b6b39897a2c4a1aa65728d5ac97dcc1d8", 240 | "sha256:6a7fae0dd14cf60ad5ff42baa2e95727c3d81ded453457771d02b7d2b3f9c0c2", 241 | "sha256:6c4ca60fa24e85fe25b912b01e62cb969d69a23a5d5867682dd3e80b5b02581d", 242 | "sha256:7d91275b0245b1da4d4cfa07e0faedd5b0812efc15b702576d103293e252af1b", 243 | "sha256:905fec760bd2fa1388bb5b489ee8ee5f7291d692638ea5f67982d968366bef9f", 244 | "sha256:97383d78eb34da7e1fa37dd273c20ad4320929af65d156e35a5e2d89566d9dfb", 245 | "sha256:984d76483eb32f1bcb536dc27e4ad56bba4baa70be32fa87152832cdd9db0833", 246 | "sha256:a30e67a65b53ea0a5e62fe23682cfe22712e01f453b95233b25502f7c61cb415", 247 | "sha256:ab3ef638ace319fa26553db0624c4699e31a28bb2a835c5faca8f8acf6a5a902", 248 | "sha256:b2f4bf27480f5e5e8ce285a8c8fd176c0b03e93dcc6646477d4630e83440c6a9", 
249 | "sha256:b7f2d075102dc8c794cbde1947378051c4e5180d52d276987b8d28a3bd58c17d", 250 | "sha256:be98f628055368795d818ebf93da628541e10b75b41c559fdf36d104c5787066", 251 | "sha256:d7f9850398e85aba693bb640262d3611788b1f29a79f0c93c565694658f4071f", 252 | "sha256:f5653a225f31e113b152e56f154ccbe59eeb1c7487b39b9d9f9cdb58e6c79dc5", 253 | "sha256:f826e31d18b516f653fe296d967d700fddad5901ae07c622bb3705955e1faa94", 254 | "sha256:f8ba0e8349a38d3001fae7eadded3f6606f0da5d748ee53cc1dab1d6527b9509", 255 | "sha256:f9081981fe268bd86831e5c75f7de206ef275defcb82bc70740ae6dc507aee51", 256 | "sha256:fa130dd50c57d53368c9d59395cb5526eda596d3ffe36666cd81a44d56e48872" 257 | ], 258 | "version": "==2.0.1" 259 | }, 260 | "pybgpranking": { 261 | "editable": true, 262 | "git": "https://github.com/D4-project/BGP-Ranking.git/", 263 | "ref": "b367e1852cafabcb35a4159f520649bd35c4686b", 264 | "subdirectory": "client" 265 | }, 266 | "pyeupi": { 267 | "hashes": [ 268 | "sha256:35b0e6b430f23ecd303f7cc7a8fe5147cf2509a5b2254eaf9695392c0af02901" 269 | ], 270 | "index": "pypi", 271 | "version": "==1.0" 272 | }, 273 | "pyfaup": { 274 | "editable": true, 275 | "git": "https://github.com/stricaud/faup.git/", 276 | "ref": "b65a4d816b008d715f4394cf2ccac474c1710350", 277 | "subdirectory": "src/lib/bindings/python/" 278 | }, 279 | "pyipasnhistory": { 280 | "editable": true, 281 | "git": "https://github.com/D4-project/IPASN-History.git/", 282 | "ref": "283539cfbbde4bb54497726634407025f7d685c2", 283 | "subdirectory": "client" 284 | }, 285 | "pylookyloo": { 286 | "editable": true, 287 | "git": "https://github.com/CIRCL/lookyloo.git/", 288 | "ref": "934324ed09fede42e0fed43c3c0eab80d6436bb2", 289 | "subdirectory": "client" 290 | }, 291 | "pypdns": { 292 | "hashes": [ 293 | "sha256:349ab1033e34a60fa0c4626b3432f5202c174656955fdf330986380c9a97cf3e", 294 | "sha256:c609678d47255a240c1e3f29a757355f610a8394ec22f21a07853360ebee6f20" 295 | ], 296 | "index": "pypi", 297 | "version": "==1.4.1" 298 | }, 299 | "pypssl": { 300 | "hashes": [ 301 | "sha256:4dbe772aefdf4ab18934d83cde79e2fc5d5ba9d2b4153dc419a63faab3432643" 302 | ], 303 | "index": "pypi", 304 | "version": "==2.1" 305 | }, 306 | "python-dateutil": { 307 | "hashes": [ 308 | "sha256:73ebfe9dbf22e832286dafa60473e4cd239f8592f699aa5adaf10050e6e1823c", 309 | "sha256:75bb3f31ea686f1197762692a9ee6a7550b59fc6ca3a1f4b5d7e32fb98e2da2a" 310 | ], 311 | "version": "==2.8.1" 312 | }, 313 | "pyurlabuse": { 314 | "editable": true, 315 | "path": "./client" 316 | }, 317 | "redis": { 318 | "hashes": [ 319 | "sha256:98a22fb750c9b9bb46e75e945dc3f61d0ab30d06117cbb21ff9cd1d315fedd3b", 320 | "sha256:c504251769031b0dd7dd5cf786050a6050197c6de0d37778c80c08cb04ae8275" 321 | ], 322 | "index": "pypi", 323 | "version": "==3.3.8" 324 | }, 325 | "requests": { 326 | "hashes": [ 327 | "sha256:27973dd4a904a4f13b263a19c866c13b92a39ed1c964655f025f3f8d3d75b804", 328 | "sha256:c210084e36a42ae6b9219e00e48287def368a26d03a048ddad7bfee44f75871e" 329 | ], 330 | "version": "==2.25.1" 331 | }, 332 | "requests-cache": { 333 | "hashes": [ 334 | "sha256:0b9b5555b3b2ecda74a9aa5abd98174bc7332de2e1d32f9f8f056583b01d6e99", 335 | "sha256:6e28e461873415036ea383c2414691cf1164cb01391ad4c45b84b3ebf0fb9287" 336 | ], 337 | "version": "==0.6.3" 338 | }, 339 | "six": { 340 | "hashes": [ 341 | "sha256:1e61c37477a1626458e36f7b1d82aa5c9b094fa4802892072e49de9c60c4c926", 342 | "sha256:8abb2f1d86890a2dfb989f9a77cfcfd3e47c2a354b01111771326f8aa26e0254" 343 | ], 344 | "version": "==1.16.0" 345 | }, 346 | "soupsieve": { 347 | "hashes": [ 348 | 
"sha256:052774848f448cf19c7e959adf5566904d525f33a3f8b6ba6f6f8f26ec7de0cc", 349 | "sha256:c2c1c2d44f158cdbddab7824a9af8c4f83c76b1e23e049479aa432feb6c4c23b" 350 | ], 351 | "version": "==2.2.1" 352 | }, 353 | "url-abuse": { 354 | "editable": true, 355 | "path": "." 356 | }, 357 | "url-normalize": { 358 | "hashes": [ 359 | "sha256:d23d3a070ac52a67b83a1c59a0e68f8608d1cd538783b401bc9de2c0fac999b2", 360 | "sha256:ec3c301f04e5bb676d333a7fa162fa977ad2ca04b7e652bfc9fac4e405728eed" 361 | ], 362 | "version": "==1.4.3" 363 | }, 364 | "urllib3": { 365 | "hashes": [ 366 | "sha256:753a0374df26658f99d826cfe40394a686d05985786d946fbe4165b5148f5a7c", 367 | "sha256:a7acd0977125325f516bda9735fa7142b909a8d01e8b2e4c8108d0984e6e0098" 368 | ], 369 | "index": "pypi", 370 | "version": "==1.26.5" 371 | }, 372 | "visitor": { 373 | "hashes": [ 374 | "sha256:2c737903b2b6864ebc6167eef7cf3b997126f1aa94bdf590f90f1436d23e480a" 375 | ], 376 | "version": "==0.1.3" 377 | }, 378 | "werkzeug": { 379 | "hashes": [ 380 | "sha256:87ae4e5b5366da2347eb3116c0e6c681a0e939a33b2805e2c0cbd282664932c4", 381 | "sha256:a13b74dd3c45f758d4ebdb224be8f1ab8ef58b3c0ffc1783a8c7d9f4f50227e6" 382 | ], 383 | "index": "pypi", 384 | "version": "==0.15.5" 385 | }, 386 | "wtforms": { 387 | "hashes": [ 388 | "sha256:7b504fc724d0d1d4d5d5c114e778ec88c37ea53144683e084215eed5155ada4c", 389 | "sha256:81195de0ac94fbc8368abbaf9197b88c4f3ffd6c2719b5bf5fc9da744f3d829c" 390 | ], 391 | "version": "==2.3.3" 392 | }, 393 | "zope.event": { 394 | "hashes": [ 395 | "sha256:2666401939cdaa5f4e0c08cf7f20c9b21423b95e88f4675b1443973bdb080c42", 396 | "sha256:5e76517f5b9b119acf37ca8819781db6c16ea433f7e2062c4afc2b6fbedb1330" 397 | ], 398 | "version": "==4.5.0" 399 | }, 400 | "zope.interface": { 401 | "hashes": [ 402 | "sha256:08f9636e99a9d5410181ba0729e0408d3d8748026ea938f3b970a0249daa8192", 403 | "sha256:0b465ae0962d49c68aa9733ba92a001b2a0933c317780435f00be7ecb959c702", 404 | "sha256:0cba8477e300d64a11a9789ed40ee8932b59f9ee05f85276dbb4b59acee5dd09", 405 | "sha256:0cee5187b60ed26d56eb2960136288ce91bcf61e2a9405660d271d1f122a69a4", 406 | "sha256:0ea1d73b7c9dcbc5080bb8aaffb776f1c68e807767069b9ccdd06f27a161914a", 407 | "sha256:0f91b5b948686659a8e28b728ff5e74b1be6bf40cb04704453617e5f1e945ef3", 408 | "sha256:15e7d1f7a6ee16572e21e3576d2012b2778cbacf75eb4b7400be37455f5ca8bf", 409 | "sha256:17776ecd3a1fdd2b2cd5373e5ef8b307162f581c693575ec62e7c5399d80794c", 410 | "sha256:194d0bcb1374ac3e1e023961610dc8f2c78a0f5f634d0c737691e215569e640d", 411 | "sha256:1c0e316c9add0db48a5b703833881351444398b04111188069a26a61cfb4df78", 412 | "sha256:205e40ccde0f37496904572035deea747390a8b7dc65146d30b96e2dd1359a83", 413 | "sha256:273f158fabc5ea33cbc936da0ab3d4ba80ede5351babc4f577d768e057651531", 414 | "sha256:2876246527c91e101184f63ccd1d716ec9c46519cc5f3d5375a3351c46467c46", 415 | "sha256:2c98384b254b37ce50eddd55db8d381a5c53b4c10ee66e1e7fe749824f894021", 416 | "sha256:2e5a26f16503be6c826abca904e45f1a44ff275fdb7e9d1b75c10671c26f8b94", 417 | "sha256:334701327f37c47fa628fc8b8d28c7d7730ce7daaf4bda1efb741679c2b087fc", 418 | "sha256:3748fac0d0f6a304e674955ab1365d515993b3a0a865e16a11ec9d86fb307f63", 419 | "sha256:3c02411a3b62668200910090a0dff17c0b25aaa36145082a5a6adf08fa281e54", 420 | "sha256:3dd4952748521205697bc2802e4afac5ed4b02909bb799ba1fe239f77fd4e117", 421 | "sha256:3f24df7124c323fceb53ff6168da70dbfbae1442b4f3da439cd441681f54fe25", 422 | "sha256:469e2407e0fe9880ac690a3666f03eb4c3c444411a5a5fddfdabc5d184a79f05", 423 | "sha256:4de4bc9b6d35c5af65b454d3e9bc98c50eb3960d5a3762c9438df57427134b8e", 424 | 
"sha256:5208ebd5152e040640518a77827bdfcc73773a15a33d6644015b763b9c9febc1", 425 | "sha256:52de7fc6c21b419078008f697fd4103dbc763288b1406b4562554bd47514c004", 426 | "sha256:5bb3489b4558e49ad2c5118137cfeaf59434f9737fa9c5deefc72d22c23822e2", 427 | "sha256:5dba5f530fec3f0988d83b78cc591b58c0b6eb8431a85edd1569a0539a8a5a0e", 428 | "sha256:5dd9ca406499444f4c8299f803d4a14edf7890ecc595c8b1c7115c2342cadc5f", 429 | "sha256:5f931a1c21dfa7a9c573ec1f50a31135ccce84e32507c54e1ea404894c5eb96f", 430 | "sha256:63b82bb63de7c821428d513607e84c6d97d58afd1fe2eb645030bdc185440120", 431 | "sha256:66c0061c91b3b9cf542131148ef7ecbecb2690d48d1612ec386de9d36766058f", 432 | "sha256:6f0c02cbb9691b7c91d5009108f975f8ffeab5dff8f26d62e21c493060eff2a1", 433 | "sha256:71aace0c42d53abe6fc7f726c5d3b60d90f3c5c055a447950ad6ea9cec2e37d9", 434 | "sha256:7d97a4306898b05404a0dcdc32d9709b7d8832c0c542b861d9a826301719794e", 435 | "sha256:7df1e1c05304f26faa49fa752a8c690126cf98b40b91d54e6e9cc3b7d6ffe8b7", 436 | "sha256:8270252effc60b9642b423189a2fe90eb6b59e87cbee54549db3f5562ff8d1b8", 437 | "sha256:867a5ad16892bf20e6c4ea2aab1971f45645ff3102ad29bd84c86027fa99997b", 438 | "sha256:877473e675fdcc113c138813a5dd440da0769a2d81f4d86614e5d62b69497155", 439 | "sha256:8892f89999ffd992208754851e5a052f6b5db70a1e3f7d54b17c5211e37a98c7", 440 | "sha256:9a9845c4c6bb56e508651f005c4aeb0404e518c6f000d5a1123ab077ab769f5c", 441 | "sha256:a1e6e96217a0f72e2b8629e271e1b280c6fa3fe6e59fa8f6701bec14e3354325", 442 | "sha256:a8156e6a7f5e2a0ff0c5b21d6bcb45145efece1909efcbbbf48c56f8da68221d", 443 | "sha256:a9506a7e80bcf6eacfff7f804c0ad5350c8c95b9010e4356a4b36f5322f09abb", 444 | "sha256:af310ec8335016b5e52cae60cda4a4f2a60a788cbb949a4fbea13d441aa5a09e", 445 | "sha256:b0297b1e05fd128d26cc2460c810d42e205d16d76799526dfa8c8ccd50e74959", 446 | "sha256:bf68f4b2b6683e52bec69273562df15af352e5ed25d1b6641e7efddc5951d1a7", 447 | "sha256:d0c1bc2fa9a7285719e5678584f6b92572a5b639d0e471bb8d4b650a1a910920", 448 | "sha256:d4d9d6c1a455d4babd320203b918ccc7fcbefe308615c521062bc2ba1aa4d26e", 449 | "sha256:db1fa631737dab9fa0b37f3979d8d2631e348c3b4e8325d6873c2541d0ae5a48", 450 | "sha256:dd93ea5c0c7f3e25335ab7d22a507b1dc43976e1345508f845efc573d3d779d8", 451 | "sha256:f44e517131a98f7a76696a7b21b164bcb85291cee106a23beccce454e1f433a4", 452 | "sha256:f7ee479e96f7ee350db1cf24afa5685a5899e2b34992fb99e1f7c1b0b758d263" 453 | ], 454 | "version": "==5.4.0" 455 | } 456 | }, 457 | "develop": {} 458 | } 459 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![Build Status](https://travis-ci.org/CIRCL/url-abuse.svg?branch=master)](https://travis-ci.org/CIRCL/url-abuse) 2 | 3 | # URL Abuse 4 | 5 | ![URL Abuse logo](./doc/logo/logo-circl.png?raw=true "URL Abuse") 6 | 7 | URL Abuse is a versatile free software for URL review, analysis and black-list reporting. URL Abuse is composed of a web interface where requests are submitted asynchronously and a back-end system to process the URLs into features modules. 
8 | 
9 | ## Features
10 | 
11 | - HTTP redirect analysis and following
12 | - [Google Safe-Browsing](https://developers.google.com/safe-browsing/) lookup
13 | - [Phishtank](http://www.phishtank.com/api_info.php) lookup
14 | - [VirusTotal](https://www.virustotal.com/en/documentation/public-api/) lookup and submission
15 | - [URL query](https://github.com/CIRCL/urlquery_python_api/) lookup
16 | - [CIRCL Passive DNS](http://www.circl.lu/services/passive-dns/) lookup
17 | - [CIRCL Passive SSL](http://www.circl.lu/services/passive-ssl/) lookup
18 | - [Universal WHOIS](https://github.com/Rafiot/uwhoisd) lookup for abuse contacts
19 | - Sphinx search interface to RT/RTIR ticketing systems. This functionality is disabled by default, but it can be used to display information about existing reports of malicious URLs.
20 | 
21 | Please note that some of the API services require an API key. The API keys should be located in the root of the URL Abuse directory.
22 | 
23 | ## Online version
24 | 
25 | - [CIRCL URL Abuse](https://www.circl.lu/urlabuse/) is online.
26 | 
27 | If you prefer not to use the online version, or want to run your own instance of URL Abuse, you can follow the install process below.
28 | 
29 | ## Install
30 | 
31 | **IMPORTANT**: Use [pipenv](https://pipenv.readthedocs.io/en/latest/)
32 | 
33 | **NOTE**: Yes, it requires python3.6+. No, it will never support anything older.
34 | 
35 | ## Install redis
36 | 
37 | ```bash
38 | git clone https://github.com/antirez/redis.git
39 | cd redis
40 | git checkout 5.0
41 | make
42 | make test
43 | cd ..
44 | ```
45 | 
46 | ## Install Faup
47 | 
48 | ```bash
49 | git clone https://github.com/stricaud/faup.git
50 | cd faup
51 | mkdir build
52 | cd build
53 | cmake .. && make
54 | sudo make install
55 | ```
56 | 
57 | ## Install & run URL Abuse
58 | 
59 | ```bash
60 | git clone https://github.com/CIRCL/url-abuse.git
61 | cd url-abuse
62 | pipenv install
63 | echo URLABUSE_HOME="'`pwd`'" > .env
64 | pipenv shell
65 | # Copy and review the configuration:
66 | cp website/config/config.ini.sample website/config/config.ini
67 | # Start all the backend services
68 | start.py
69 | # Start the web interface
70 | start_website.py
71 | ```
72 | 
73 | ## Contributing
74 | 
75 | We welcome pull requests for new extensions and bug fixes.
76 | 
77 | ### Add a new module
78 | 
79 | Look at the existing functions/modules; a hypothetical skeleton is also sketched at the end of this README. The changes have to be made in the following files:
80 | 
81 | * Add the function you want to execute in url\_abuse\_async.py
82 | * Add a route in web/\_\_init\_\_.py. This route will do an async call to the function defined in url\_abuse\_async.py. The parameter of the function is sent in a POST object
83 | * Add a statement in web/templates/url-report.html. The data option is the parameter to pass to the javascript directive
84 | * Add a directive in web/static/main.js; it will take care of passing the parameter to the backend and regularly polling for the response of the async call
85 | 
86 | ## Partner and Funding
87 | 
88 | URL Abuse was developed as part of the [“European Union anti-Phishing Initiative”](http://phishing-initiative.eu/) (EU PI) project. This project was coordinated by Cert-Lexsi and co-funded by the Prevention of and Fight against Crime programme of the European Union.
89 | 
90 | URL Abuse is currently supported and funded by [CIRCL](https://www.circl.lu/) (Computer Incident Response Center Luxembourg).
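
### Module skeleton (example)

For orientation, here is a hypothetical skeleton of a module function as described under "Add a new module" above. The name `dns_resolve_example` and its body are illustrative only; the one firm constraint, visible in `bin/run_workers.py`, is that the return value must be JSON-serialisable, because the worker stores it with `json.dumps()`.

```python
import socket
from urllib.parse import urlparse


def dns_resolve_example(url: str) -> list:
    """Illustrative module: resolve the IPs of a URL's hostname."""
    hostname = urlparse(url).hostname
    if not hostname:
        return []
    try:
        # getaddrinfo returns 5-tuples; the sockaddr's first element is the IP
        return sorted({info[4][0] for info in socket.getaddrinfo(hostname, None)})
    except socket.gaierror:
        return []
```

The corresponding route, template statement and JavaScript directive then only need to pass the `url` parameter through and poll for this list.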
91 | -------------------------------------------------------------------------------- /bin/run_backend.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | 4 | from urlabuse.helpers import get_homedir, check_running 5 | from subprocess import Popen 6 | import time 7 | from pathlib import Path 8 | 9 | import argparse 10 | 11 | 12 | def launch_cache(storage_directory: Path=None): 13 | if not storage_directory: 14 | storage_directory = get_homedir() 15 | if not check_running('cache'): 16 | Popen(["./run_redis.sh"], cwd=(storage_directory / 'cache')) 17 | 18 | 19 | def shutdown_cache(storage_directory: Path=None): 20 | if not storage_directory: 21 | storage_directory = get_homedir() 22 | Popen(["./shutdown_redis.sh"], cwd=(storage_directory / 'cache')) 23 | 24 | 25 | def launch_all(): 26 | launch_cache() 27 | 28 | 29 | def check_all(stop=False): 30 | backends = [['cache', False]] 31 | while True: 32 | for b in backends: 33 | try: 34 | b[1] = check_running(b[0]) 35 | except Exception: 36 | b[1] = False 37 | if stop: 38 | if not any(b[1] for b in backends): 39 | break 40 | else: 41 | if all(b[1] for b in backends): 42 | break 43 | for b in backends: 44 | if not stop and not b[1]: 45 | print(f"Waiting on {b[0]}") 46 | if stop and b[1]: 47 | print(f"Waiting on {b[0]}") 48 | time.sleep(1) 49 | 50 | 51 | def stop_all(): 52 | shutdown_cache() 53 | 54 | 55 | if __name__ == '__main__': 56 | parser = argparse.ArgumentParser(description='Manage backend DBs.') 57 | parser.add_argument("--start", action='store_true', default=False, help="Start all") 58 | parser.add_argument("--stop", action='store_true', default=False, help="Stop all") 59 | parser.add_argument("--status", action='store_true', default=True, help="Show status") 60 | args = parser.parse_args() 61 | 62 | if args.start: 63 | launch_all() 64 | if args.stop: 65 | stop_all() 66 | if not args.stop and args.status: 67 | check_all() 68 | -------------------------------------------------------------------------------- /bin/run_workers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | import argparse 4 | from multiprocessing import Pool 5 | from redis import Redis 6 | from urlabuse.helpers import get_socket_path 7 | from urlabuse.urlabuse import Query 8 | import json 9 | import time 10 | 11 | 12 | def worker(process_id: int): 13 | urlabuse_query = Query() 14 | queue = Redis(unix_socket_path=get_socket_path('cache'), db=0, 15 | decode_responses=True) 16 | print(f'Start Worker {process_id}') 17 | while True: 18 | jobid = queue.spop('to_process') 19 | if not jobid: 20 | time.sleep(.1) 21 | continue 22 | to_process = queue.hgetall(jobid) 23 | parameters = json.loads(to_process['data']) 24 | try: 25 | result = getattr(urlabuse_query, to_process['method'])(**parameters) 26 | queue.hset(jobid, 'result', json.dumps(result)) 27 | except Exception as e: 28 | print(e, to_process) 29 | 30 | 31 | if __name__ == '__main__': 32 | parser = argparse.ArgumentParser(description='Launch a certain amount of workers.') 33 | parser.add_argument('-n', '--number', default=10, type=int, help='Amount of workers to launch.') 34 | args = parser.parse_args() 35 | 36 | with Pool(args.number) as p: 37 | p.map(worker, list(range(args.number))) 38 | -------------------------------------------------------------------------------- /bin/start.py: 
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | 
4 | from subprocess import Popen
5 | from urlabuse.helpers import get_homedir
6 | 
7 | import redis
8 | import sys
9 | 
10 | if redis.VERSION < (3, ):
11 |     print('redis-py >= 3 is required.')
12 |     sys.exit(1)
13 | 
14 | if __name__ == '__main__':
15 |     # Just fail if the env isn't set.
16 |     get_homedir()
17 |     p = Popen(['run_backend.py', '--start'])
18 |     p.wait()
19 |     Popen(['run_workers.py'])
20 | 
-------------------------------------------------------------------------------- /bin/start_website.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | 
4 | from subprocess import Popen
5 | from urlabuse.helpers import get_homedir
6 | 
7 | if __name__ == '__main__':
8 |     website_dir = get_homedir() / 'website'
9 |     Popen([f'{website_dir}/3drparty.sh'], cwd=website_dir)
10 |     try:
11 |         Popen(['gunicorn', '--worker-class', 'gevent', '-w', '10', '-b', '0.0.0.0:5200', 'web:app'],
12 |               cwd=website_dir).communicate()
13 |     except KeyboardInterrupt:
14 |         print('Stopping gunicorn.')
15 | 
-------------------------------------------------------------------------------- /bin/stats.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | 
3 | from datetime import date, timedelta
4 | import redis
5 | from urlabuse.helpers import get_socket_path
6 | import argparse
7 | 
8 | 
9 | def perdelta(start, end, delta):
10 |     curr = start
11 |     while curr < end:
12 |         yield curr
13 |         curr += delta
14 | 
15 | 
16 | if __name__ == '__main__':
17 |     parser = argparse.ArgumentParser(description='Show submission stats for the last 30 days.')
18 |     args = parser.parse_args()
19 | 
20 |     r = redis.Redis(unix_socket_path=get_socket_path('cache'))
21 | 
22 |     for result in perdelta(date.today() - timedelta(days=30), date.today(), timedelta(days=1)):
23 |         val = r.zcard('{}_submissions'.format(result))
24 |         print('{},{}'.format(result, val))
25 | 
-------------------------------------------------------------------------------- /cache/cache.conf:
--------------------------------------------------------------------------------
1 | # Redis configuration file example.
2 | #
3 | # Note that in order to read the configuration file, Redis must be
4 | # started with the file path as first argument:
5 | #
6 | # ./redis-server /path/to/redis.conf
7 | 
8 | # Note on units: when memory size is needed, it is possible to specify
9 | # it in the usual form of 1k 5GB 4M and so forth:
10 | #
11 | # 1k => 1000 bytes
12 | # 1kb => 1024 bytes
13 | # 1m => 1000000 bytes
14 | # 1mb => 1024*1024 bytes
15 | # 1g => 1000000000 bytes
16 | # 1gb => 1024*1024*1024 bytes
17 | #
18 | # units are case insensitive so 1GB 1Gb 1gB are all the same.
19 | 
20 | ################################## INCLUDES ###################################
21 | 
22 | # Include one or more other config files here. This is useful if you
23 | # have a standard template that goes to all Redis servers but also need
24 | # to customize a few per-server settings. Include files can include
25 | # other files, so use this wisely.
26 | #
27 | # Notice option "include" won't be rewritten by command "CONFIG REWRITE"
28 | # from admin or Redis Sentinel. Since Redis always uses the last processed
29 | # line as value of a configuration directive, you'd better put includes
30 | # at the beginning of this file to avoid overwriting config change at runtime.
31 | # 32 | # If instead you are interested in using includes to override configuration 33 | # options, it is better to use include as the last line. 34 | # 35 | # include /path/to/local.conf 36 | # include /path/to/other.conf 37 | 38 | ################################## MODULES ##################################### 39 | 40 | # Load modules at startup. If the server is not able to load modules 41 | # it will abort. It is possible to use multiple loadmodule directives. 42 | # 43 | # loadmodule /path/to/my_module.so 44 | # loadmodule /path/to/other_module.so 45 | 46 | ################################## NETWORK ##################################### 47 | 48 | # By default, if no "bind" configuration directive is specified, Redis listens 49 | # for connections from all the network interfaces available on the server. 50 | # It is possible to listen to just one or multiple selected interfaces using 51 | # the "bind" configuration directive, followed by one or more IP addresses. 52 | # 53 | # Examples: 54 | # 55 | # bind 192.168.1.100 10.0.0.1 56 | # bind 127.0.0.1 ::1 57 | # 58 | # ~~~ WARNING ~~~ If the computer running Redis is directly exposed to the 59 | # internet, binding to all the interfaces is dangerous and will expose the 60 | # instance to everybody on the internet. So by default we uncomment the 61 | # following bind directive, that will force Redis to listen only into 62 | # the IPv4 loopback interface address (this means Redis will be able to 63 | # accept connections only from clients running into the same computer it 64 | # is running). 65 | # 66 | # IF YOU ARE SURE YOU WANT YOUR INSTANCE TO LISTEN TO ALL THE INTERFACES 67 | # JUST COMMENT THE FOLLOWING LINE. 68 | # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 69 | bind 127.0.0.1 70 | 71 | # Protected mode is a layer of security protection, in order to avoid that 72 | # Redis instances left open on the internet are accessed and exploited. 73 | # 74 | # When protected mode is on and if: 75 | # 76 | # 1) The server is not binding explicitly to a set of addresses using the 77 | # "bind" directive. 78 | # 2) No password is configured. 79 | # 80 | # The server only accepts connections from clients connecting from the 81 | # IPv4 and IPv6 loopback addresses 127.0.0.1 and ::1, and from Unix domain 82 | # sockets. 83 | # 84 | # By default protected mode is enabled. You should disable it only if 85 | # you are sure you want clients from other hosts to connect to Redis 86 | # even if no authentication is configured, nor a specific set of interfaces 87 | # are explicitly listed using the "bind" directive. 88 | protected-mode yes 89 | 90 | # Accept connections on the specified port, default is 6379 (IANA #815344). 91 | # If port 0 is specified Redis will not listen on a TCP socket. 92 | port 0 93 | 94 | # TCP listen() backlog. 95 | # 96 | # In high requests-per-second environments you need an high backlog in order 97 | # to avoid slow clients connections issues. Note that the Linux kernel 98 | # will silently truncate it to the value of /proc/sys/net/core/somaxconn so 99 | # make sure to raise both the value of somaxconn and tcp_max_syn_backlog 100 | # in order to get the desired effect. 101 | tcp-backlog 511 102 | 103 | # Unix socket. 104 | # 105 | # Specify the path for the Unix socket that will be used to listen for 106 | # incoming connections. There is no default, so Redis will not listen 107 | # on a unix socket when not specified. 
108 | # 109 | unixsocket cache.sock 110 | unixsocketperm 700 111 | 112 | # Close the connection after a client is idle for N seconds (0 to disable) 113 | timeout 0 114 | 115 | # TCP keepalive. 116 | # 117 | # If non-zero, use SO_KEEPALIVE to send TCP ACKs to clients in absence 118 | # of communication. This is useful for two reasons: 119 | # 120 | # 1) Detect dead peers. 121 | # 2) Take the connection alive from the point of view of network 122 | # equipment in the middle. 123 | # 124 | # On Linux, the specified value (in seconds) is the period used to send ACKs. 125 | # Note that to close the connection the double of the time is needed. 126 | # On other kernels the period depends on the kernel configuration. 127 | # 128 | # A reasonable value for this option is 300 seconds, which is the new 129 | # Redis default starting with Redis 3.2.1. 130 | tcp-keepalive 300 131 | 132 | ################################# GENERAL ##################################### 133 | 134 | # By default Redis does not run as a daemon. Use 'yes' if you need it. 135 | # Note that Redis will write a pid file in /var/run/redis.pid when daemonized. 136 | daemonize yes 137 | 138 | # If you run Redis from upstart or systemd, Redis can interact with your 139 | # supervision tree. Options: 140 | # supervised no - no supervision interaction 141 | # supervised upstart - signal upstart by putting Redis into SIGSTOP mode 142 | # supervised systemd - signal systemd by writing READY=1 to $NOTIFY_SOCKET 143 | # supervised auto - detect upstart or systemd method based on 144 | # UPSTART_JOB or NOTIFY_SOCKET environment variables 145 | # Note: these supervision methods only signal "process is ready." 146 | # They do not enable continuous liveness pings back to your supervisor. 147 | supervised no 148 | 149 | # If a pid file is specified, Redis writes it where specified at startup 150 | # and removes it at exit. 151 | # 152 | # When the server runs non daemonized, no pid file is created if none is 153 | # specified in the configuration. When the server is daemonized, the pid file 154 | # is used even if not specified, defaulting to "/var/run/redis.pid". 155 | # 156 | # Creating a pid file is best effort: if Redis is not able to create it 157 | # nothing bad happens, the server will start and run normally. 158 | #pidfile /var/run/redis_6379.pid 159 | 160 | # Specify the server verbosity level. 161 | # This can be one of: 162 | # debug (a lot of information, useful for development/testing) 163 | # verbose (many rarely useful info, but not a mess like the debug level) 164 | # notice (moderately verbose, what you want in production probably) 165 | # warning (only very important / critical messages are logged) 166 | loglevel notice 167 | 168 | # Specify the log file name. Also the empty string can be used to force 169 | # Redis to log on the standard output. Note that if you use standard 170 | # output for logging but daemonize, logs will be sent to /dev/null 171 | logfile "cache.log" 172 | 173 | # To enable logging to the system logger, just set 'syslog-enabled' to yes, 174 | # and optionally update the other syslog parameters to suit your needs. 175 | # syslog-enabled no 176 | 177 | # Specify the syslog identity. 178 | # syslog-ident redis 179 | 180 | # Specify the syslog facility. Must be USER or between LOCAL0-LOCAL7. 181 | # syslog-facility local0 182 | 183 | # Set the number of databases. 
The default database is DB 0, you can select
184 | # a different one on a per-connection basis using SELECT <dbid> where
185 | # dbid is a number between 0 and 'databases'-1
186 | databases 16
187 | 
188 | # By default Redis shows an ASCII art logo only when started to log to the
189 | # standard output and if the standard output is a TTY. Basically this means
190 | # that normally a logo is displayed only in interactive sessions.
191 | #
192 | # However it is possible to force the pre-4.0 behavior and always show an
193 | # ASCII art logo in startup logs by setting the following option to yes.
194 | always-show-logo yes
195 | 
196 | ################################ SNAPSHOTTING  ################################
197 | #
198 | # Save the DB on disk:
199 | #
200 | #   save <seconds> <changes>
201 | #
202 | # Will save the DB if both the given number of seconds and the given
203 | # number of write operations against the DB occurred.
204 | #
205 | # In the example below the behaviour will be to save:
206 | # after 9000 sec (2.5 hours) if at least 1 key changed
207 | # after 3000 sec (50 min) if at least 10 keys changed
208 | # after 600 sec (10 min) if at least 10000 keys changed
209 | #
210 | # Note: you can disable saving completely by commenting out all "save" lines.
211 | #
212 | # It is also possible to remove all the previously configured save
213 | # points by adding a save directive with a single empty string argument
214 | # like in the following example:
215 | #
216 | # save ""
217 | 
218 | save 9000 1
219 | save 3000 10
220 | save 600 10000
221 | 
222 | # By default Redis will stop accepting writes if RDB snapshots are enabled
223 | # (at least one save point) and the latest background save failed.
224 | # This will make the user aware (in a hard way) that data is not persisting
225 | # on disk properly, otherwise chances are that no one will notice and some
226 | # disaster will happen.
227 | #
228 | # If the background saving process will start working again Redis will
229 | # automatically allow writes again.
230 | #
231 | # However if you have setup your proper monitoring of the Redis server
232 | # and persistence, you may want to disable this feature so that Redis will
233 | # continue to work as usual even if there are problems with disk,
234 | # permissions, and so forth.
235 | stop-writes-on-bgsave-error yes
236 | 
237 | # Compress string objects using LZF when dumping .rdb databases?
238 | # By default that's set to 'yes' as it's almost always a win.
239 | # If you want to save some CPU in the saving child set it to 'no' but
240 | # the dataset will likely be bigger if you have compressible values or keys.
241 | rdbcompression yes
242 | 
243 | # Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
244 | # This makes the format more resistant to corruption but there is a performance
245 | # hit to pay (around 10%) when saving and loading RDB files, so you can disable it
246 | # for maximum performances.
247 | #
248 | # RDB files created with checksum disabled have a checksum of zero that will
249 | # tell the loading code to skip the check.
250 | rdbchecksum yes
251 | 
252 | # The filename where to dump the DB
253 | dbfilename dump.rdb
254 | 
255 | # The working directory.
256 | #
257 | # The DB will be written inside this directory, with the filename specified
258 | # above using the 'dbfilename' configuration directive.
259 | #
260 | # The Append Only File will also be created inside this directory.
261 | #
262 | # Note that you must specify a directory here, not a file name.
263 | dir ./
264 | 
265 | ################################# REPLICATION #################################
266 | 
267 | # Master-Replica replication. Use replicaof to make a Redis instance a copy of
268 | # another Redis server. A few things to understand ASAP about Redis replication.
269 | #
270 | #   +------------------+      +---------------+
271 | #   |      Master      | ---> |    Replica    |
272 | #   | (receive writes) |      |  (exact copy) |
273 | #   +------------------+      +---------------+
274 | #
275 | # 1) Redis replication is asynchronous, but you can configure a master to
276 | #    stop accepting writes if it appears to be not connected with at least
277 | #    a given number of replicas.
278 | # 2) Redis replicas are able to perform a partial resynchronization with the
279 | #    master if the replication link is lost for a relatively small amount of
280 | #    time. You may want to configure the replication backlog size (see the next
281 | #    sections of this file) with a sensible value depending on your needs.
282 | # 3) Replication is automatic and does not need user intervention. After a
283 | #    network partition replicas automatically try to reconnect to masters
284 | #    and resynchronize with them.
285 | #
286 | # replicaof <masterip> <masterport>
287 | 
288 | # If the master is password protected (using the "requirepass" configuration
289 | # directive below) it is possible to tell the replica to authenticate before
290 | # starting the replication synchronization process, otherwise the master will
291 | # refuse the replica request.
292 | #
293 | # masterauth <master-password>
294 | 
295 | # When a replica loses its connection with the master, or when the replication
296 | # is still in progress, the replica can act in two different ways:
297 | #
298 | # 1) if replica-serve-stale-data is set to 'yes' (the default) the replica will
299 | #    still reply to client requests, possibly with out of date data, or the
300 | #    data set may just be empty if this is the first synchronization.
301 | #
302 | # 2) if replica-serve-stale-data is set to 'no' the replica will reply with
303 | #    an error "SYNC with master in progress" to all kinds of commands
304 | #    except INFO, replicaOF, AUTH, PING, SHUTDOWN, REPLCONF, ROLE, CONFIG,
305 | #    SUBSCRIBE, UNSUBSCRIBE, PSUBSCRIBE, PUNSUBSCRIBE, PUBLISH, PUBSUB,
306 | #    COMMAND, POST, HOST: and LATENCY.
307 | #
308 | replica-serve-stale-data yes
309 | 
310 | # You can configure a replica instance to accept writes or not. Writing against
311 | # a replica instance may be useful to store some ephemeral data (because data
312 | # written on a replica will be easily deleted after resync with the master) but
313 | # may also cause problems if clients are writing to it because of a
314 | # misconfiguration.
315 | #
316 | # Since Redis 2.6 by default replicas are read-only.
317 | #
318 | # Note: read only replicas are not designed to be exposed to untrusted clients
319 | # on the internet. It's just a protection layer against misuse of the instance.
320 | # Still a read only replica exports by default all the administrative commands
321 | # such as CONFIG, DEBUG, and so forth. To a limited extent you can improve
322 | # security of read only replicas using 'rename-command' to shadow all the
323 | # administrative / dangerous commands.
324 | replica-read-only yes
325 | 
326 | # Replication SYNC strategy: disk or socket.
327 | # 328 | # ------------------------------------------------------- 329 | # WARNING: DISKLESS REPLICATION IS EXPERIMENTAL CURRENTLY 330 | # ------------------------------------------------------- 331 | # 332 | # New replicas and reconnecting replicas that are not able to continue the replication 333 | # process just receiving differences, need to do what is called a "full 334 | # synchronization". An RDB file is transmitted from the master to the replicas. 335 | # The transmission can happen in two different ways: 336 | # 337 | # 1) Disk-backed: The Redis master creates a new process that writes the RDB 338 | # file on disk. Later the file is transferred by the parent 339 | # process to the replicas incrementally. 340 | # 2) Diskless: The Redis master creates a new process that directly writes the 341 | # RDB file to replica sockets, without touching the disk at all. 342 | # 343 | # With disk-backed replication, while the RDB file is generated, more replicas 344 | # can be queued and served with the RDB file as soon as the current child producing 345 | # the RDB file finishes its work. With diskless replication instead once 346 | # the transfer starts, new replicas arriving will be queued and a new transfer 347 | # will start when the current one terminates. 348 | # 349 | # When diskless replication is used, the master waits a configurable amount of 350 | # time (in seconds) before starting the transfer in the hope that multiple replicas 351 | # will arrive and the transfer can be parallelized. 352 | # 353 | # With slow disks and fast (large bandwidth) networks, diskless replication 354 | # works better. 355 | repl-diskless-sync no 356 | 357 | # When diskless replication is enabled, it is possible to configure the delay 358 | # the server waits in order to spawn the child that transfers the RDB via socket 359 | # to the replicas. 360 | # 361 | # This is important since once the transfer starts, it is not possible to serve 362 | # new replicas arriving, that will be queued for the next RDB transfer, so the server 363 | # waits a delay in order to let more replicas arrive. 364 | # 365 | # The delay is specified in seconds, and by default is 5 seconds. To disable 366 | # it entirely just set it to 0 seconds and the transfer will start ASAP. 367 | repl-diskless-sync-delay 5 368 | 369 | # Replicas send PINGs to server in a predefined interval. It's possible to change 370 | # this interval with the repl_ping_replica_period option. The default value is 10 371 | # seconds. 372 | # 373 | # repl-ping-replica-period 10 374 | 375 | # The following option sets the replication timeout for: 376 | # 377 | # 1) Bulk transfer I/O during SYNC, from the point of view of replica. 378 | # 2) Master timeout from the point of view of replicas (data, pings). 379 | # 3) Replica timeout from the point of view of masters (REPLCONF ACK pings). 380 | # 381 | # It is important to make sure that this value is greater than the value 382 | # specified for repl-ping-replica-period otherwise a timeout will be detected 383 | # every time there is low traffic between the master and the replica. 384 | # 385 | # repl-timeout 60 386 | 387 | # Disable TCP_NODELAY on the replica socket after SYNC? 388 | # 389 | # If you select "yes" Redis will use a smaller number of TCP packets and 390 | # less bandwidth to send data to replicas. But this can add a delay for 391 | # the data to appear on the replica side, up to 40 milliseconds with 392 | # Linux kernels using a default configuration. 
393 | # 394 | # If you select "no" the delay for data to appear on the replica side will 395 | # be reduced but more bandwidth will be used for replication. 396 | # 397 | # By default we optimize for low latency, but in very high traffic conditions 398 | # or when the master and replicas are many hops away, turning this to "yes" may 399 | # be a good idea. 400 | repl-disable-tcp-nodelay no 401 | 402 | # Set the replication backlog size. The backlog is a buffer that accumulates 403 | # replica data when replicas are disconnected for some time, so that when a replica 404 | # wants to reconnect again, often a full resync is not needed, but a partial 405 | # resync is enough, just passing the portion of data the replica missed while 406 | # disconnected. 407 | # 408 | # The bigger the replication backlog, the longer the time the replica can be 409 | # disconnected and later be able to perform a partial resynchronization. 410 | # 411 | # The backlog is only allocated once there is at least a replica connected. 412 | # 413 | # repl-backlog-size 1mb 414 | 415 | # After a master has no longer connected replicas for some time, the backlog 416 | # will be freed. The following option configures the amount of seconds that 417 | # need to elapse, starting from the time the last replica disconnected, for 418 | # the backlog buffer to be freed. 419 | # 420 | # Note that replicas never free the backlog for timeout, since they may be 421 | # promoted to masters later, and should be able to correctly "partially 422 | # resynchronize" with the replicas: hence they should always accumulate backlog. 423 | # 424 | # A value of 0 means to never release the backlog. 425 | # 426 | # repl-backlog-ttl 3600 427 | 428 | # The replica priority is an integer number published by Redis in the INFO output. 429 | # It is used by Redis Sentinel in order to select a replica to promote into a 430 | # master if the master is no longer working correctly. 431 | # 432 | # A replica with a low priority number is considered better for promotion, so 433 | # for instance if there are three replicas with priority 10, 100, 25 Sentinel will 434 | # pick the one with priority 10, that is the lowest. 435 | # 436 | # However a special priority of 0 marks the replica as not able to perform the 437 | # role of master, so a replica with priority of 0 will never be selected by 438 | # Redis Sentinel for promotion. 439 | # 440 | # By default the priority is 100. 441 | replica-priority 100 442 | 443 | # It is possible for a master to stop accepting writes if there are less than 444 | # N replicas connected, having a lag less or equal than M seconds. 445 | # 446 | # The N replicas need to be in "online" state. 447 | # 448 | # The lag in seconds, that must be <= the specified value, is calculated from 449 | # the last ping received from the replica, that is usually sent every second. 450 | # 451 | # This option does not GUARANTEE that N replicas will accept the write, but 452 | # will limit the window of exposure for lost writes in case not enough replicas 453 | # are available, to the specified number of seconds. 454 | # 455 | # For example to require at least 3 replicas with a lag <= 10 seconds use: 456 | # 457 | # min-replicas-to-write 3 458 | # min-replicas-max-lag 10 459 | # 460 | # Setting one or the other to 0 disables the feature. 461 | # 462 | # By default min-replicas-to-write is set to 0 (feature disabled) and 463 | # min-replicas-max-lag is set to 10. 
464 | 465 | # A Redis master is able to list the address and port of the attached 466 | # replicas in different ways. For example the "INFO replication" section 467 | # offers this information, which is used, among other tools, by 468 | # Redis Sentinel in order to discover replica instances. 469 | # Another place where this info is available is in the output of the 470 | # "ROLE" command of a master. 471 | # 472 | # The listed IP and address normally reported by a replica is obtained 473 | # in the following way: 474 | # 475 | # IP: The address is auto detected by checking the peer address 476 | # of the socket used by the replica to connect with the master. 477 | # 478 | # Port: The port is communicated by the replica during the replication 479 | # handshake, and is normally the port that the replica is using to 480 | # listen for connections. 481 | # 482 | # However when port forwarding or Network Address Translation (NAT) is 483 | # used, the replica may be actually reachable via different IP and port 484 | # pairs. The following two options can be used by a replica in order to 485 | # report to its master a specific set of IP and port, so that both INFO 486 | # and ROLE will report those values. 487 | # 488 | # There is no need to use both the options if you need to override just 489 | # the port or the IP address. 490 | # 491 | # replica-announce-ip 5.5.5.5 492 | # replica-announce-port 1234 493 | 494 | ################################## SECURITY ################################### 495 | 496 | # Require clients to issue AUTH before processing any other 497 | # commands. This might be useful in environments in which you do not trust 498 | # others with access to the host running redis-server. 499 | # 500 | # This should stay commented out for backward compatibility and because most 501 | # people do not need auth (e.g. they run their own servers). 502 | # 503 | # Warning: since Redis is pretty fast an outside user can try up to 504 | # 150k passwords per second against a good box. This means that you should 505 | # use a very strong password otherwise it will be very easy to break. 506 | # 507 | # requirepass foobared 508 | 509 | # Command renaming. 510 | # 511 | # It is possible to change the name of dangerous commands in a shared 512 | # environment. For instance the CONFIG command may be renamed into something 513 | # hard to guess so that it will still be available for internal-use tools 514 | # but not available for general clients. 515 | # 516 | # Example: 517 | # 518 | # rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52 519 | # 520 | # It is also possible to completely kill a command by renaming it into 521 | # an empty string: 522 | # 523 | # rename-command CONFIG "" 524 | # 525 | # Please note that changing the name of commands that are logged into the 526 | # AOF file or transmitted to replicas may cause problems. 527 | 528 | ################################### CLIENTS #################################### 529 | 530 | # Set the max number of connected clients at the same time. By default 531 | # this limit is set to 10000 clients, however if the Redis server is not 532 | # able to configure the process file limit to allow for the specified limit 533 | # the max number of allowed clients is set to the current file limit 534 | # minus 32 (as Redis reserves a few file descriptors for internal uses). 535 | # 536 | # Once the limit is reached Redis will close all the new connections sending 537 | # an error 'max number of clients reached'. 
538 | #
539 | # maxclients 10000
540 | 
541 | ############################## MEMORY MANAGEMENT ################################
542 | 
543 | # Set a memory usage limit to the specified amount of bytes.
544 | # When the memory limit is reached Redis will try to remove keys
545 | # according to the eviction policy selected (see maxmemory-policy).
546 | #
547 | # If Redis can't remove keys according to the policy, or if the policy is
548 | # set to 'noeviction', Redis will start to reply with errors to commands
549 | # that would use more memory, like SET, LPUSH, and so on, and will continue
550 | # to reply to read-only commands like GET.
551 | #
552 | # This option is usually useful when using Redis as an LRU or LFU cache, or to
553 | # set a hard memory limit for an instance (using the 'noeviction' policy).
554 | #
555 | # WARNING: If you have replicas attached to an instance with maxmemory on,
556 | # the size of the output buffers needed to feed the replicas are subtracted
557 | # from the used memory count, so that network problems / resyncs will
558 | # not trigger a loop where keys are evicted, and in turn the output
559 | # buffer of replicas is full with DELs of keys evicted triggering the deletion
560 | # of more keys, and so forth until the database is completely emptied.
561 | #
562 | # In short... if you have replicas attached it is suggested that you set a lower
563 | # limit for maxmemory so that there is some free RAM on the system for replica
564 | # output buffers (but this is not needed if the policy is 'noeviction').
565 | #
566 | # maxmemory <bytes>
567 | 
568 | # MAXMEMORY POLICY: how Redis will select what to remove when maxmemory
569 | # is reached. You can select among the following behaviors:
570 | #
571 | # volatile-lru -> Evict using approximated LRU among the keys with an expire set.
572 | # allkeys-lru -> Evict any key using approximated LRU.
573 | # volatile-lfu -> Evict using approximated LFU among the keys with an expire set.
574 | # allkeys-lfu -> Evict any key using approximated LFU.
575 | # volatile-random -> Remove a random key among the ones with an expire set.
576 | # allkeys-random -> Remove a random key, any key.
577 | # volatile-ttl -> Remove the key with the nearest expire time (minor TTL)
578 | # noeviction -> Don't evict anything, just return an error on write operations.
579 | #
580 | # LRU means Least Recently Used
581 | # LFU means Least Frequently Used
582 | #
583 | # Both LRU, LFU and volatile-ttl are implemented using approximated
584 | # randomized algorithms.
585 | #
586 | # Note: with any of the above policies, Redis will return an error on write
587 | # operations, when there are no suitable keys for eviction.
588 | #
589 | # At the date of writing these commands are: set setnx setex append
590 | # incr decr rpush lpush rpushx lpushx linsert lset rpoplpush sadd
591 | # sinter sinterstore sunion sunionstore sdiff sdiffstore zadd zincrby
592 | # zunionstore zinterstore hset hsetnx hmset hincrby incrby decrby
593 | # getset mset msetnx exec sort
594 | #
595 | # The default is:
596 | #
597 | # maxmemory-policy noeviction
598 | 
599 | # LRU, LFU and minimal TTL algorithms are not precise algorithms but approximated
600 | # algorithms (in order to save memory), so you can tune it for speed or
601 | # accuracy. By default Redis will check five keys and pick the one that was
602 | # used less recently, you can change the sample size using the following
603 | # configuration directive.
604 | #
605 | # The default of 5 produces good enough results.
10 Approximates very closely 606 | # true LRU but costs more CPU. 3 is faster but not very accurate. 607 | # 608 | # maxmemory-samples 5 609 | 610 | # Starting from Redis 5, by default a replica will ignore its maxmemory setting 611 | # (unless it is promoted to master after a failover or manually). It means 612 | # that the eviction of keys will be just handled by the master, sending the 613 | # DEL commands to the replica as keys evict in the master side. 614 | # 615 | # This behavior ensures that masters and replicas stay consistent, and is usually 616 | # what you want, however if your replica is writable, or you want the replica to have 617 | # a different memory setting, and you are sure all the writes performed to the 618 | # replica are idempotent, then you may change this default (but be sure to understand 619 | # what you are doing). 620 | # 621 | # Note that since the replica by default does not evict, it may end using more 622 | # memory than the one set via maxmemory (there are certain buffers that may 623 | # be larger on the replica, or data structures may sometimes take more memory and so 624 | # forth). So make sure you monitor your replicas and make sure they have enough 625 | # memory to never hit a real out-of-memory condition before the master hits 626 | # the configured maxmemory setting. 627 | # 628 | # replica-ignore-maxmemory yes 629 | 630 | ############################# LAZY FREEING #################################### 631 | 632 | # Redis has two primitives to delete keys. One is called DEL and is a blocking 633 | # deletion of the object. It means that the server stops processing new commands 634 | # in order to reclaim all the memory associated with an object in a synchronous 635 | # way. If the key deleted is associated with a small object, the time needed 636 | # in order to execute the DEL command is very small and comparable to most other 637 | # O(1) or O(log_N) commands in Redis. However if the key is associated with an 638 | # aggregated value containing millions of elements, the server can block for 639 | # a long time (even seconds) in order to complete the operation. 640 | # 641 | # For the above reasons Redis also offers non blocking deletion primitives 642 | # such as UNLINK (non blocking DEL) and the ASYNC option of FLUSHALL and 643 | # FLUSHDB commands, in order to reclaim memory in background. Those commands 644 | # are executed in constant time. Another thread will incrementally free the 645 | # object in the background as fast as possible. 646 | # 647 | # DEL, UNLINK and ASYNC option of FLUSHALL and FLUSHDB are user-controlled. 648 | # It's up to the design of the application to understand when it is a good 649 | # idea to use one or the other. However the Redis server sometimes has to 650 | # delete keys or flush the whole database as a side effect of other operations. 651 | # Specifically Redis deletes objects independently of a user call in the 652 | # following scenarios: 653 | # 654 | # 1) On eviction, because of the maxmemory and maxmemory policy configurations, 655 | # in order to make room for new data, without going over the specified 656 | # memory limit. 657 | # 2) Because of expire: when a key with an associated time to live (see the 658 | # EXPIRE command) must be deleted from memory. 659 | # 3) Because of a side effect of a command that stores data on a key that may 660 | # already exist. For example the RENAME command may delete the old key 661 | # content when it is replaced with another one. 
Similarly SUNIONSTORE 662 | # or SORT with STORE option may delete existing keys. The SET command 663 | # itself removes any old content of the specified key in order to replace 664 | # it with the specified string. 665 | # 4) During replication, when a replica performs a full resynchronization with 666 | # its master, the content of the whole database is removed in order to 667 | # load the RDB file just transferred. 668 | # 669 | # In all the above cases the default is to delete objects in a blocking way, 670 | # like if DEL was called. However you can configure each case specifically 671 | # in order to instead release memory in a non-blocking way like if UNLINK 672 | # was called, using the following configuration directives: 673 | 674 | lazyfree-lazy-eviction no 675 | lazyfree-lazy-expire no 676 | lazyfree-lazy-server-del no 677 | replica-lazy-flush no 678 | 679 | ############################## APPEND ONLY MODE ############################### 680 | 681 | # By default Redis asynchronously dumps the dataset on disk. This mode is 682 | # good enough in many applications, but an issue with the Redis process or 683 | # a power outage may result into a few minutes of writes lost (depending on 684 | # the configured save points). 685 | # 686 | # The Append Only File is an alternative persistence mode that provides 687 | # much better durability. For instance using the default data fsync policy 688 | # (see later in the config file) Redis can lose just one second of writes in a 689 | # dramatic event like a server power outage, or a single write if something 690 | # wrong with the Redis process itself happens, but the operating system is 691 | # still running correctly. 692 | # 693 | # AOF and RDB persistence can be enabled at the same time without problems. 694 | # If the AOF is enabled on startup Redis will load the AOF, that is the file 695 | # with the better durability guarantees. 696 | # 697 | # Please check http://redis.io/topics/persistence for more information. 698 | 699 | appendonly no 700 | 701 | # The name of the append only file (default: "appendonly.aof") 702 | 703 | appendfilename "appendonly.aof" 704 | 705 | # The fsync() call tells the Operating System to actually write data on disk 706 | # instead of waiting for more data in the output buffer. Some OS will really flush 707 | # data on disk, some other OS will just try to do it ASAP. 708 | # 709 | # Redis supports three different modes: 710 | # 711 | # no: don't fsync, just let the OS flush the data when it wants. Faster. 712 | # always: fsync after every write to the append only log. Slow, Safest. 713 | # everysec: fsync only one time every second. Compromise. 714 | # 715 | # The default is "everysec", as that's usually the right compromise between 716 | # speed and data safety. It's up to you to understand if you can relax this to 717 | # "no" that will let the operating system flush the output buffer when 718 | # it wants, for better performances (but if you can live with the idea of 719 | # some data loss consider the default persistence mode that's snapshotting), 720 | # or on the contrary, use "always" that's very slow but a bit safer than 721 | # everysec. 722 | # 723 | # More details please check the following article: 724 | # http://antirez.com/post/redis-persistence-demystified.html 725 | # 726 | # If unsure, use "everysec". 
727 | 728 | # appendfsync always 729 | appendfsync everysec 730 | # appendfsync no 731 | 732 | # When the AOF fsync policy is set to always or everysec, and a background 733 | # saving process (a background save or AOF log background rewriting) is 734 | # performing a lot of I/O against the disk, in some Linux configurations 735 | # Redis may block too long on the fsync() call. Note that there is no fix for 736 | # this currently, as even performing fsync in a different thread will block 737 | # our synchronous write(2) call. 738 | # 739 | # In order to mitigate this problem it's possible to use the following option 740 | # that will prevent fsync() from being called in the main process while a 741 | # BGSAVE or BGREWRITEAOF is in progress. 742 | # 743 | # This means that while another child is saving, the durability of Redis is 744 | # the same as "appendfsync none". In practical terms, this means that it is 745 | # possible to lose up to 30 seconds of log in the worst scenario (with the 746 | # default Linux settings). 747 | # 748 | # If you have latency problems turn this to "yes". Otherwise leave it as 749 | # "no" that is the safest pick from the point of view of durability. 750 | 751 | no-appendfsync-on-rewrite no 752 | 753 | # Automatic rewrite of the append only file. 754 | # Redis is able to automatically rewrite the log file implicitly calling 755 | # BGREWRITEAOF when the AOF log size grows by the specified percentage. 756 | # 757 | # This is how it works: Redis remembers the size of the AOF file after the 758 | # latest rewrite (if no rewrite has happened since the restart, the size of 759 | # the AOF at startup is used). 760 | # 761 | # This base size is compared to the current size. If the current size is 762 | # bigger than the specified percentage, the rewrite is triggered. Also 763 | # you need to specify a minimal size for the AOF file to be rewritten, this 764 | # is useful to avoid rewriting the AOF file even if the percentage increase 765 | # is reached but it is still pretty small. 766 | # 767 | # Specify a percentage of zero in order to disable the automatic AOF 768 | # rewrite feature. 769 | 770 | auto-aof-rewrite-percentage 100 771 | auto-aof-rewrite-min-size 64mb 772 | 773 | # An AOF file may be found to be truncated at the end during the Redis 774 | # startup process, when the AOF data gets loaded back into memory. 775 | # This may happen when the system where Redis is running 776 | # crashes, especially when an ext4 filesystem is mounted without the 777 | # data=ordered option (however this can't happen when Redis itself 778 | # crashes or aborts but the operating system still works correctly). 779 | # 780 | # Redis can either exit with an error when this happens, or load as much 781 | # data as possible (the default now) and start if the AOF file is found 782 | # to be truncated at the end. The following option controls this behavior. 783 | # 784 | # If aof-load-truncated is set to yes, a truncated AOF file is loaded and 785 | # the Redis server starts emitting a log to inform the user of the event. 786 | # Otherwise if the option is set to no, the server aborts with an error 787 | # and refuses to start. When the option is set to no, the user requires 788 | # to fix the AOF file using the "redis-check-aof" utility before to restart 789 | # the server. 790 | # 791 | # Note that if the AOF file will be found to be corrupted in the middle 792 | # the server will still exit with an error. 
This option only applies when 793 | # Redis will try to read more data from the AOF file but not enough bytes 794 | # will be found. 795 | aof-load-truncated yes 796 | 797 | # When rewriting the AOF file, Redis is able to use an RDB preamble in the 798 | # AOF file for faster rewrites and recoveries. When this option is turned 799 | # on the rewritten AOF file is composed of two different stanzas: 800 | # 801 | # [RDB file][AOF tail] 802 | # 803 | # When loading Redis recognizes that the AOF file starts with the "REDIS" 804 | # string and loads the prefixed RDB file, and continues loading the AOF 805 | # tail. 806 | aof-use-rdb-preamble yes 807 | 808 | ################################ LUA SCRIPTING ############################### 809 | 810 | # Max execution time of a Lua script in milliseconds. 811 | # 812 | # If the maximum execution time is reached Redis will log that a script is 813 | # still in execution after the maximum allowed time and will start to 814 | # reply to queries with an error. 815 | # 816 | # When a long running script exceeds the maximum execution time only the 817 | # SCRIPT KILL and SHUTDOWN NOSAVE commands are available. The first can be 818 | # used to stop a script that did not yet called write commands. The second 819 | # is the only way to shut down the server in the case a write command was 820 | # already issued by the script but the user doesn't want to wait for the natural 821 | # termination of the script. 822 | # 823 | # Set it to 0 or a negative value for unlimited execution without warnings. 824 | lua-time-limit 5000 825 | 826 | ################################ REDIS CLUSTER ############################### 827 | # 828 | # ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 829 | # WARNING EXPERIMENTAL: Redis Cluster is considered to be stable code, however 830 | # in order to mark it as "mature" we need to wait for a non trivial percentage 831 | # of users to deploy it in production. 832 | # ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 833 | # 834 | # Normal Redis instances can't be part of a Redis Cluster; only nodes that are 835 | # started as cluster nodes can. In order to start a Redis instance as a 836 | # cluster node enable the cluster support uncommenting the following: 837 | # 838 | # cluster-enabled yes 839 | 840 | # Every cluster node has a cluster configuration file. This file is not 841 | # intended to be edited by hand. It is created and updated by Redis nodes. 842 | # Every Redis Cluster node requires a different cluster configuration file. 843 | # Make sure that instances running in the same system do not have 844 | # overlapping cluster configuration file names. 845 | # 846 | # cluster-config-file nodes-6379.conf 847 | 848 | # Cluster node timeout is the amount of milliseconds a node must be unreachable 849 | # for it to be considered in failure state. 850 | # Most other internal time limits are multiple of the node timeout. 851 | # 852 | # cluster-node-timeout 15000 853 | 854 | # A replica of a failing master will avoid to start a failover if its data 855 | # looks too old. 856 | # 857 | # There is no simple way for a replica to actually have an exact measure of 858 | # its "data age", so the following two checks are performed: 859 | # 860 | # 1) If there are multiple replicas able to failover, they exchange messages 861 | # in order to try to give an advantage to the replica with the best 862 | # replication offset (more data from the master processed). 
863 | # Replicas will try to get their rank by offset, and apply a delay 864 | # proportional to their rank at the start of the failover. 865 | # 866 | # 2) Every single replica computes the time of the last interaction with 867 | # its master. This can be the last ping or command received (if the master 868 | # is still in the "connected" state), or the time that elapsed since the 869 | # disconnection with the master (if the replication link is currently down). 870 | # If the last interaction is too old, the replica will not try to failover 871 | # at all. 872 | # 873 | # Point "2" can be tuned by the user. Specifically, a replica will not perform 874 | # the failover if, since the last interaction with the master, the time 875 | # elapsed is greater than: 876 | # 877 | # (node-timeout * replica-validity-factor) + repl-ping-replica-period 878 | # 879 | # So for example if node-timeout is 30 seconds, and the replica-validity-factor 880 | # is 10, and assuming a default repl-ping-replica-period of 10 seconds, the 881 | # replica will not try to failover if it was not able to talk with the master 882 | # for longer than 310 seconds. 883 | # 884 | # A large replica-validity-factor may allow replicas with too old data to fail 885 | # over a master, while too small a value may prevent the cluster from being able 886 | # to elect a replica at all. 887 | # 888 | # For maximum availability, it is possible to set the replica-validity-factor 889 | # to a value of 0, which means that replicas will always try to failover the 890 | # master regardless of the last time they interacted with the master. 891 | # (However they'll always try to apply a delay proportional to their 892 | # offset rank). 893 | # 894 | # Zero is the only value able to guarantee that when all the partitions heal, 895 | # the cluster will always be able to continue. 896 | # 897 | # cluster-replica-validity-factor 10 898 | 899 | # Cluster replicas are able to migrate to orphaned masters, that is, masters 900 | # that are left without working replicas. This improves the cluster's ability 901 | # to resist failures, as otherwise an orphaned master can't be failed over 902 | # if it has no working replicas. 903 | # 904 | # Replicas migrate to orphaned masters only if there are still at least a 905 | # given number of other working replicas for their old master. This number 906 | # is the "migration barrier". A migration barrier of 1 means that a replica 907 | # will migrate only if there is at least 1 other working replica for its master, 908 | # and so forth. It usually reflects the number of replicas you want for every 909 | # master in your cluster. 910 | # 911 | # The default is 1 (replicas migrate only if their masters remain with at least 912 | # one replica). To disable migration, just set it to a very large value. 913 | # A value of 0 can be set but is useful only for debugging and dangerous 914 | # in production. 915 | # 916 | # cluster-migration-barrier 1 917 | 918 | # By default Redis Cluster nodes stop accepting queries if they detect there 919 | # is at least one hash slot uncovered (no available node is serving it). 920 | # This way, if the cluster is partially down (for example a range of hash slots 921 | # is no longer covered), the whole cluster eventually becomes unavailable. 922 | # It automatically becomes available as soon as all the slots are covered again. 
923 | # 924 | # However sometimes you want the subset of the cluster which is working 925 | # to continue to accept queries for the part of the key space that is still 926 | # covered. In order to do so, just set the cluster-require-full-coverage 927 | # option to no. 928 | # 929 | # cluster-require-full-coverage yes 930 | 931 | # This option, when set to yes, prevents replicas from trying to fail over 932 | # their master during master failures. However the master can still perform a 933 | # manual failover, if forced to do so. 934 | # 935 | # This is useful in different scenarios, especially in the case of multiple 936 | # data center operations, where we want one side to never be promoted except 937 | # in the case of a total DC failure. 938 | # 939 | # cluster-replica-no-failover no 940 | 941 | # In order to set up your cluster, make sure to read the documentation 942 | # available at the http://redis.io web site. 943 | 944 | ########################## CLUSTER DOCKER/NAT support ######################## 945 | 946 | # In certain deployments, Redis Cluster node address discovery fails, because 947 | # addresses are NAT-ted or because ports are forwarded (the typical case is 948 | # Docker and other containers). 949 | # 950 | # In order to make Redis Cluster work in such environments, a static 951 | # configuration where each node knows its public address is needed. The 952 | # following options are used for this purpose, and are: 953 | # 954 | # * cluster-announce-ip 955 | # * cluster-announce-port 956 | # * cluster-announce-bus-port 957 | # 958 | # Each instructs the node about its address, client port, and cluster message 959 | # bus port. The information is then published in the header of the bus packets 960 | # so that other nodes will be able to correctly map the address of the node 961 | # publishing the information. 962 | # 963 | # If the above options are not used, the normal Redis Cluster auto-detection 964 | # will be used instead. 965 | # 966 | # Note that when remapped, the bus port may not be at the fixed offset of 967 | # client port + 10000, so you can specify any port and bus-port depending 968 | # on how they get remapped. If the bus-port is not set, a fixed offset of 969 | # 10000 will be used as usual. 970 | # 971 | # Example: 972 | # 973 | # cluster-announce-ip 10.1.1.5 974 | # cluster-announce-port 6379 975 | # cluster-announce-bus-port 6380 976 | 977 | ################################## SLOW LOG ################################### 978 | 979 | # The Redis Slow Log is a system to log queries that exceeded a specified 980 | # execution time. The execution time does not include I/O operations 981 | # like talking with the client, sending the reply and so forth, 982 | # but just the time needed to actually execute the command (this is the only 983 | # stage of command execution where the thread is blocked and can not serve 984 | # other requests in the meantime). 985 | # 986 | # You can configure the slow log with two parameters: one tells Redis 987 | # what execution time, in microseconds, a command must exceed in order 988 | # to get logged, and the other parameter is the length of the 989 | # slow log. When a new command is logged, the oldest one is removed from the 990 | # queue of logged commands. 991 | 992 | # The following time is expressed in microseconds, so 1000000 is equivalent 993 | # to one second. Note that a negative number disables the slow log, while 994 | # a value of zero forces the logging of every command. 
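#
# For instance, with the threshold configured below, a command that takes
# 25 milliseconds (25000 microseconds) to execute would be recorded. The log
# itself can then be inspected from redis-cli, for example with:
#
#   SLOWLOG GET 10    (fetch the ten most recent entries)
#   SLOWLOG RESET     (empty the slow log and reclaim its memory)
#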
995 | slowlog-log-slower-than 10000 996 | 997 | # There is no limit to this length. Just be aware that it will consume memory. 998 | # You can reclaim memory used by the slow log with SLOWLOG RESET. 999 | slowlog-max-len 128 1000 | 1001 | ################################ LATENCY MONITOR ############################## 1002 | 1003 | # The Redis latency monitoring subsystem samples different operations 1004 | # at runtime in order to collect data related to possible sources of 1005 | # latency of a Redis instance. 1006 | # 1007 | # Via the LATENCY command this information is available to the user, who can 1008 | # print graphs and obtain reports. 1009 | # 1010 | # The system only logs operations that were performed in a time equal to or 1011 | # greater than the number of milliseconds specified via the 1012 | # latency-monitor-threshold configuration directive. When its value is set 1013 | # to zero, the latency monitor is turned off. 1014 | # 1015 | # By default latency monitoring is disabled since it is mostly not needed 1016 | # if you don't have latency issues, and collecting data has a performance 1017 | # impact that, while very small, can be measured under big load. Latency 1018 | # monitoring can easily be enabled at runtime using the command 1019 | # "CONFIG SET latency-monitor-threshold <milliseconds>" if needed. 1020 | latency-monitor-threshold 0 1021 | 1022 | ############################# EVENT NOTIFICATION ############################## 1023 | 1024 | # Redis can notify Pub/Sub clients about events happening in the key space. 1025 | # This feature is documented at http://redis.io/topics/notifications 1026 | # 1027 | # For instance if keyspace events notification is enabled, and a client 1028 | # performs a DEL operation on key "foo" stored in Database 0, two 1029 | # messages will be published via Pub/Sub: 1030 | # 1031 | # PUBLISH __keyspace@0__:foo del 1032 | # PUBLISH __keyevent@0__:del foo 1033 | # 1034 | # It is possible to select the events that Redis will notify among a set 1035 | # of classes. Every class is identified by a single character: 1036 | # 1037 | # K Keyspace events, published with __keyspace@<db>__ prefix. 1038 | # E Keyevent events, published with __keyevent@<db>__ prefix. 1039 | # g Generic commands (non-type specific) like DEL, EXPIRE, RENAME, ... 1040 | # $ String commands 1041 | # l List commands 1042 | # s Set commands 1043 | # h Hash commands 1044 | # z Sorted set commands 1045 | # x Expired events (events generated every time a key expires) 1046 | # e Evicted events (events generated when a key is evicted for maxmemory) 1047 | # A Alias for g$lshzxe, so that the "AKE" string means all the events. 1048 | # 1049 | # The "notify-keyspace-events" directive takes as its argument a string that is 1050 | # composed of zero or more characters. The empty string means that notifications 1051 | # are disabled. 1052 | # 1053 | # Example: to enable list and generic events, from the point of view of the 1054 | # event name, use: 1055 | # 1056 | # notify-keyspace-events Elg 1057 | # 1058 | # Example 2: to get the stream of the expired keys by subscribing to the channel 1059 | # name __keyevent@0__:expired, use: 1060 | # 1061 | # notify-keyspace-events Ex 1062 | # 1063 | # By default all notifications are disabled because most users don't need 1064 | # this feature and the feature has some overhead. Note that if you don't 1065 | # specify at least one of K or E, no events will be delivered. 
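#
# As a practical illustration, the expired-keys stream from Example 2 can be
# observed with two redis-cli sessions:
#
#   redis-cli config set notify-keyspace-events Ex
#   redis-cli psubscribe '__keyevent@0__:expired'
#
# Setting a key with a short TTL (for example SET foo bar PX 100) will then
# publish a "foo" message on that channel as soon as the key expires.
#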
1066 | notify-keyspace-events "" 1067 | 1068 | ############################### ADVANCED CONFIG ############################### 1069 | 1070 | # Hashes are encoded using a memory efficient data structure when they have a 1071 | # small number of entries, and the biggest entry does not exceed a given 1072 | # threshold. These thresholds can be configured using the following directives. 1073 | hash-max-ziplist-entries 512 1074 | hash-max-ziplist-value 64 1075 | 1076 | # Lists are also encoded in a special way to save a lot of space. 1077 | # The number of entries allowed per internal list node can be specified 1078 | # as a fixed maximum size or a maximum number of elements. 1079 | # For a fixed maximum size, use -5 through -1, meaning: 1080 | # -5: max size: 64 Kb <-- not recommended for normal workloads 1081 | # -4: max size: 32 Kb <-- not recommended 1082 | # -3: max size: 16 Kb <-- probably not recommended 1083 | # -2: max size: 8 Kb <-- good 1084 | # -1: max size: 4 Kb <-- good 1085 | # Positive numbers mean store up to _exactly_ that number of elements 1086 | # per list node. 1087 | # The highest performing option is usually -2 (8 Kb size) or -1 (4 Kb size), 1088 | # but if your use case is unique, adjust the settings as necessary. 1089 | list-max-ziplist-size -2 1090 | 1091 | # Lists may also be compressed. 1092 | # Compress depth is the number of quicklist ziplist nodes from *each* side of 1093 | # the list to *exclude* from compression. The head and tail of the list 1094 | # are always uncompressed for fast push/pop operations. Settings are: 1095 | # 0: disable all list compression 1096 | # 1: depth 1 means "don't start compressing until after 1 node into the list, 1097 | # going from either the head or tail" 1098 | # So: [head]->node->node->...->node->[tail] 1099 | # [head], [tail] will always be uncompressed; inner nodes will compress. 1100 | # 2: [head]->[next]->node->node->...->node->[prev]->[tail] 1101 | # 2 here means: don't compress head or head->next or tail->prev or tail, 1102 | # but compress all nodes between them. 1103 | # 3: [head]->[next]->[next]->node->node->...->node->[prev]->[prev]->[tail] 1104 | # etc. 1105 | list-compress-depth 0 1106 | 1107 | # Sets have a special encoding in just one case: when a set is composed 1108 | # of just strings that happen to be integers in radix 10 in the range 1109 | # of 64 bit signed integers. 1110 | # The following configuration setting sets the limit in the size of the 1111 | # set in order to use this special memory saving encoding. 1112 | set-max-intset-entries 512 1113 | 1114 | # Similarly to hashes and lists, sorted sets are also specially encoded in 1115 | # order to save a lot of space. This encoding is only used when the length and 1116 | # elements of a sorted set are below the following limits: 1117 | zset-max-ziplist-entries 128 1118 | zset-max-ziplist-value 64 1119 | 1120 | # HyperLogLog sparse representation bytes limit. The limit includes the 1121 | # 16 bytes header. When an HyperLogLog using the sparse representation crosses 1122 | # this limit, it is converted into the dense representation. 1123 | # 1124 | # A value greater than 16000 is totally useless, since at that point the 1125 | # dense representation is more memory efficient. 1126 | # 1127 | # The suggested value is ~ 3000 in order to have the benefits of 1128 | # the space efficient encoding without slowing down too much PFADD, 1129 | # which is O(N) with the sparse encoding. 
The value can be raised to 1130 | # ~ 10000 when CPU is not a concern, but space is, and the data set is 1131 | # composed of many HyperLogLogs with cardinality in the 0 - 15000 range. 1132 | hll-sparse-max-bytes 3000 1133 | 1134 | # Streams macro node max size / items. The stream data structure is a radix 1135 | # tree of big nodes that encode multiple items inside. Using this configuration 1136 | # it is possible to configure how big a single node can be in bytes, and the 1137 | # maximum number of items it may contain before switching to a new node when 1138 | # appending new stream entries. If any of the following settings are set to 1139 | # zero, the limit is ignored, so for instance it is possible to set just a 1140 | # max entries limit by setting max-bytes to 0 and max-entries to the desired 1141 | # value. 1142 | stream-node-max-bytes 4096 1143 | stream-node-max-entries 100 1144 | 1145 | # Active rehashing uses 1 millisecond every 100 milliseconds of CPU time in 1146 | # order to help rehashing the main Redis hash table (the one mapping top-level 1147 | # keys to values). The hash table implementation Redis uses (see dict.c) 1148 | # performs a lazy rehashing: the more operations you run against a hash table 1149 | # that is rehashing, the more rehashing "steps" are performed, so if the 1150 | # server is idle the rehashing is never complete and some more memory is used 1151 | # by the hash table. 1152 | # 1153 | # The default is to use this millisecond 10 times every second in order to 1154 | # actively rehash the main dictionaries, freeing memory when possible. 1155 | # 1156 | # If unsure: 1157 | # use "activerehashing no" if you have hard latency requirements and it is 1158 | # not a good thing in your environment that Redis can reply from time to time 1159 | # to queries with a 2 millisecond delay. 1160 | # 1161 | # use "activerehashing yes" if you don't have such hard requirements but 1162 | # want to free memory asap when possible. 1163 | activerehashing yes 1164 | 1165 | # The client output buffer limits can be used to force disconnection of clients 1166 | # that are not reading data from the server fast enough for some reason (a 1167 | # common reason is that a Pub/Sub client can't consume messages as fast as the 1168 | # publisher can produce them). 1169 | # 1170 | # The limit can be set differently for the three different classes of clients: 1171 | # 1172 | # normal -> normal clients including MONITOR clients 1173 | # replica -> replica clients 1174 | # pubsub -> clients subscribed to at least one pubsub channel or pattern 1175 | # 1176 | # The syntax of every client-output-buffer-limit directive is the following: 1177 | # 1178 | # client-output-buffer-limit <class> <hard limit> <soft limit> <soft seconds> 1179 | # 1180 | # A client is immediately disconnected once the hard limit is reached, or if 1181 | # the soft limit is reached and remains reached for the specified number of 1182 | # seconds (continuously). 1183 | # So for instance if the hard limit is 32 megabytes and the soft limit is 1184 | # 16 megabytes / 10 seconds, the client will get disconnected immediately 1185 | # if the size of the output buffers reaches 32 megabytes, but will also get 1186 | # disconnected if the client reaches 16 megabytes and continuously overcomes 1187 | # the limit for 10 seconds. 
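#
# Written as a directive, the 32 megabytes / 16 megabytes / 10 seconds example
# above would read:
#
#   client-output-buffer-limit normal 32mb 16mb 10
#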
1188 | # 1189 | # By default normal clients are not limited because they don't receive data 1190 | # without asking (in a push way), but just after a request, so only 1191 | # asynchronous clients may create a scenario where data is requested faster 1192 | # than it can be read. 1193 | # 1194 | # Instead there is a default limit for pubsub and replica clients, since 1195 | # subscribers and replicas receive data in a push fashion. 1196 | # 1197 | # Both the hard and the soft limit can be disabled by setting them to zero. 1198 | client-output-buffer-limit normal 0 0 0 1199 | client-output-buffer-limit replica 256mb 64mb 60 1200 | client-output-buffer-limit pubsub 32mb 8mb 60 1201 | 1202 | # Client query buffers accumulate new commands. They are limited to a fixed 1203 | # amount by default in order to avoid a protocol desynchronization (for 1204 | # instance due to a bug in the client) leading to unbounded memory usage in 1205 | # the query buffer. However you can configure it here if you have very special 1206 | # needs, such as huge multi/exec requests or the like. 1207 | # 1208 | # client-query-buffer-limit 1gb 1209 | 1210 | # In the Redis protocol, bulk requests, that is, elements representing single 1211 | # strings, are normally limited to 512 mb. However you can change this limit 1212 | # here. 1213 | # 1214 | # proto-max-bulk-len 512mb 1215 | 1216 | # Redis calls an internal function to perform many background tasks, like 1217 | # closing connections of clients that timed out, purging expired keys that are 1218 | # never requested, and so forth. 1219 | # 1220 | # Not all tasks are performed with the same frequency, but Redis checks for 1221 | # tasks to perform according to the specified "hz" value. 1222 | # 1223 | # By default "hz" is set to 10. Raising the value will use more CPU when 1224 | # Redis is idle, but at the same time will make Redis more responsive when 1225 | # there are many keys expiring at the same time, and timeouts may be 1226 | # handled with more precision. 1227 | # 1228 | # The range is between 1 and 500, however a value over 100 is usually not 1229 | # a good idea. Most users should use the default of 10 and raise this up to 1230 | # 100 only in environments where very low latency is required. 1231 | hz 10 1232 | 1233 | # Normally it is useful to have an HZ value which is proportional to the 1234 | # number of clients connected. This is useful in order, for instance, to 1235 | # avoid processing too many clients for each background task invocation, 1236 | # which would cause latency spikes. 1237 | # 1238 | # Since the default HZ value is conservatively set to 10, Redis 1239 | # offers, and enables by default, the ability to use an adaptive HZ value 1240 | # which will temporarily rise when there are many connected clients. 1241 | # 1242 | # When dynamic HZ is enabled, the actual configured HZ will be used 1243 | # as a baseline, but multiples of the configured HZ value will be actually 1244 | # used as needed once more clients are connected. In this way an idle 1245 | # instance will use very little CPU time while a busy instance will be 1246 | # more responsive. 1247 | dynamic-hz yes 1248 | 1249 | # When a child rewrites the AOF file, if the following option is enabled 1250 | # the file will be fsync-ed every 32 MB of data generated. This is useful 1251 | # in order to commit the file to the disk more incrementally and avoid 1252 | # big latency spikes. 
1253 | aof-rewrite-incremental-fsync yes 1254 | 1255 | # When Redis saves an RDB file, if the following option is enabled 1256 | # the file will be fsync-ed every 32 MB of data generated. This is useful 1257 | # in order to commit the file to the disk more incrementally and avoid 1258 | # big latency spikes. 1259 | rdb-save-incremental-fsync yes 1260 | 1261 | # Redis LFU eviction (see maxmemory setting) can be tuned. However it is a good 1262 | # idea to start with the default settings and only change them after investigating 1263 | # how to improve performance and how the keys' LFU changes over time, which 1264 | # can be inspected via the OBJECT FREQ command. 1265 | # 1266 | # There are two tunable parameters in the Redis LFU implementation: the 1267 | # counter logarithm factor and the counter decay time. It is important to 1268 | # understand what the two parameters mean before changing them. 1269 | # 1270 | # The LFU counter is just 8 bits per key; its maximum value is 255, so Redis 1271 | # uses a probabilistic increment with logarithmic behavior. Given the value 1272 | # of the old counter, when a key is accessed, the counter is incremented in 1273 | # this way: 1274 | # 1275 | # 1. A random number R between 0 and 1 is extracted. 1276 | # 2. A probability P is calculated as 1/(old_value*lfu_log_factor+1). 1277 | # 3. The counter is incremented only if R < P. 1278 | # 1279 | # The default lfu-log-factor is 10. This is a table of how the frequency 1280 | # counter changes with different numbers of accesses and different 1281 | # logarithmic factors: 1282 | # 1283 | # +--------+------------+------------+------------+------------+------------+ 1284 | # | factor | 100 hits | 1000 hits | 100K hits | 1M hits | 10M hits | 1285 | # +--------+------------+------------+------------+------------+------------+ 1286 | # | 0 | 104 | 255 | 255 | 255 | 255 | 1287 | # +--------+------------+------------+------------+------------+------------+ 1288 | # | 1 | 18 | 49 | 255 | 255 | 255 | 1289 | # +--------+------------+------------+------------+------------+------------+ 1290 | # | 10 | 10 | 18 | 142 | 255 | 255 | 1291 | # +--------+------------+------------+------------+------------+------------+ 1292 | # | 100 | 8 | 11 | 49 | 143 | 255 | 1293 | # +--------+------------+------------+------------+------------+------------+ 1294 | # 1295 | # NOTE: The above table was obtained by running the following commands: 1296 | # 1297 | # redis-benchmark -n 1000000 incr foo 1298 | # redis-cli object freq foo 1299 | # 1300 | # NOTE 2: The counter's initial value is 5 in order to give new objects a chance 1301 | # to accumulate hits. 1302 | # 1303 | # The counter decay time is the time, in minutes, that must elapse in order 1304 | # for the key counter to be divided by two (or decremented, if it has a value 1305 | # <= 10). 1306 | # 1307 | # The default value for the lfu-decay-time is 1. A special value of 0 means to 1308 | # decay the counter every time it happens to be scanned. 1309 | # 1310 | # lfu-log-factor 10 1311 | # lfu-decay-time 1 1312 | 1313 | ########################### ACTIVE DEFRAGMENTATION ####################### 1314 | # 1315 | # WARNING: THIS FEATURE IS EXPERIMENTAL. However it was stress tested 1316 | # even in production and manually tested by multiple engineers for some 1317 | # time. 1318 | # 1319 | # What is active defragmentation? 
1320 | # ------------------------------- 1321 | # 1322 | # Active (online) defragmentation allows a Redis server to compact the 1323 | # spaces left between small allocations and deallocations of data in memory, 1324 | # thus allowing it to reclaim memory. 1325 | # 1326 | # Fragmentation is a natural process that happens with every allocator (but 1327 | # less so with Jemalloc, fortunately) and certain workloads. Normally a server 1328 | # restart is needed in order to lower the fragmentation, or at least to flush 1329 | # away all the data and create it again. However, thanks to this feature 1330 | # implemented by Oran Agra for Redis 4.0, this process can happen at runtime 1331 | # in a "hot" way, while the server is running. 1332 | # 1333 | # Basically, when the fragmentation is over a certain level (see the 1334 | # configuration options below), Redis will start to create new copies of the 1335 | # values in contiguous memory regions by exploiting certain specific Jemalloc 1336 | # features (in order to understand if an allocation is causing fragmentation 1337 | # and to allocate it in a better place), and at the same time, will release the 1338 | # old copies of the data. This process, repeated incrementally for all the keys, 1339 | # will cause the fragmentation to drop back to normal values. 1340 | # 1341 | # Important things to understand: 1342 | # 1343 | # 1. This feature is disabled by default, and only works if you compiled Redis 1344 | # to use the copy of Jemalloc we ship with the source code of Redis. 1345 | # This is the default with Linux builds. 1346 | # 1347 | # 2. You never need to enable this feature if you don't have fragmentation 1348 | # issues. 1349 | # 1350 | # 3. Once you experience fragmentation, you can enable this feature when 1351 | # needed with the command "CONFIG SET activedefrag yes". 1352 | # 1353 | # The configuration parameters are able to fine-tune the behavior of the 1354 | # defragmentation process. If you are not sure about what they mean, it is 1355 | # a good idea to leave the defaults untouched. 
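#
# As a rough illustrative workflow, the current fragmentation can be checked
# and the feature toggled at runtime from redis-cli:
#
#   redis-cli info memory      (check the mem_fragmentation_ratio field)
#   redis-cli config set activedefrag yes
#
# The commented directives below then control when the defragmenter kicks in
# and how much CPU it may use.
#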
1356 | 1357 | # Enabled active defragmentation 1358 | # activedefrag yes 1359 | 1360 | # Minimum amount of fragmentation waste to start active defrag 1361 | # active-defrag-ignore-bytes 100mb 1362 | 1363 | # Minimum percentage of fragmentation to start active defrag 1364 | # active-defrag-threshold-lower 10 1365 | 1366 | # Maximum percentage of fragmentation at which we use maximum effort 1367 | # active-defrag-threshold-upper 100 1368 | 1369 | # Minimal effort for defrag in CPU percentage 1370 | # active-defrag-cycle-min 5 1371 | 1372 | # Maximal effort for defrag in CPU percentage 1373 | # active-defrag-cycle-max 75 1374 | 1375 | # Maximum number of set/hash/zset/list fields that will be processed from 1376 | # the main dictionary scan 1377 | # active-defrag-max-scan-fields 1000 1378 | 1379 | -------------------------------------------------------------------------------- /cache/run_redis.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | set -e 4 | set -x 5 | 6 | ../../redis/src/redis-server ./cache.conf 7 | -------------------------------------------------------------------------------- /cache/shutdown_redis.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # set -e 4 | set -x 5 | 6 | ../../redis/src/redis-cli -s ./cache.sock shutdown 7 | -------------------------------------------------------------------------------- /client/LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2013, 2014 Raphaël Vinot 2 | Copyright (c) 2013, 2014 CIRCL - Computer Incident Response Center Luxembourg 3 | (c/o smile, security made in Lëtzebuerg, Groupement 4 | d'Intérêt Economique) 5 | 6 | All rights reserved. 7 | 8 | Redistribution and use in source and binary forms, with or without modification, 9 | are permitted provided that the following conditions are met: 10 | 11 | 1. Redistributions of source code must retain the above copyright notice, 12 | this list of conditions and the following disclaimer. 13 | 2. Redistributions in binary form must reproduce the above copyright notice, 14 | this list of conditions and the following disclaimer in the documentation 15 | and/or other materials provided with the distribution. 16 | 17 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 18 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 19 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 20 | IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, 21 | INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 22 | BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, 23 | DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF 24 | LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE 25 | OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED 26 | OF THE POSSIBILITY OF SUCH DAMAGE. 
27 | -------------------------------------------------------------------------------- /client/MANIFEST.in: -------------------------------------------------------------------------------- 1 | include README.md 2 | -------------------------------------------------------------------------------- /client/README.md: -------------------------------------------------------------------------------- 1 | Client API for URL Abuse 2 | ======================== 3 | 4 | Client API to query the CIRCL URL Abuse system. 5 | 6 | 7 | 8 | 9 | -------------------------------------------------------------------------------- /client/bin/urlabuse: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | 4 | import argparse 5 | 6 | from pyurlabuse import PyURLAbuse 7 | import json 8 | 9 | 10 | if __name__ == '__main__': 11 | parser = argparse.ArgumentParser(description='Run a query against URL Abuse') 12 | parser.add_argument('--url', type=str, help='URL of the instance.') 13 | 14 | parser.add_argument('--query', help='URL to lookup') 15 | parser.add_argument('--digest', action='store_true', help='Return the digest') 16 | 17 | args = parser.parse_args() 18 | 19 | if args.url: 20 | urlabuse = PyURLAbuse(args.url) 21 | else: 22 | urlabuse = PyURLAbuse() 23 | 24 | response = urlabuse.run_query(args.query, args.digest) 25 | if args.digest: 26 | print(response['digest'][0]) 27 | else: 28 | print(json.dumps(response, indent=2)) 29 | -------------------------------------------------------------------------------- /client/pyurlabuse/__init__.py: -------------------------------------------------------------------------------- 1 | from .api import PyURLAbuse 2 | -------------------------------------------------------------------------------- /client/pyurlabuse/api.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | 4 | import json 5 | import requests 6 | import time 7 | from urllib.parse import urljoin 8 | 9 | 10 | class PyURLAbuse(object): 11 | 12 | def __init__(self, url='https://www.circl.lu/urlabuse/'): 13 | self.url = url 14 | 15 | self.session = requests.Session() 16 | self.session.headers.update({'content-type': 'application/json'}) 17 | 18 | @property 19 | def is_up(self): 20 | r = self.session.head(self.url)  # HEAD request against the instance root configured in __init__ 21 | return r.status_code == 200 22 | 23 | def get_result(self, job_id): 24 | response = self.session.get(urljoin(self.url, '_result/{}'.format(job_id))) 25 | if response.status_code == 202: 26 | return None 27 | else: 28 | return response.json() 29 | 30 | def _async(self, path, query): 31 | response = self.session.post(urljoin(self.url, path), data=json.dumps(query)) 32 | return response.text 33 | 34 | def start(self, q): 35 | query = {'url': q} 36 | return self._async('start', query) 37 | 38 | def urls(self, q): 39 | query = {'url': q} 40 | return self._async('urls', query) 41 | 42 | def resolve(self, q): 43 | query = {'url': q} 44 | return self._async('resolve', query) 45 | 46 | def phishtank(self, q): 47 | query = {'query': q} 48 | return self._async('phishtank', query) 49 | 50 | def virustotal(self, q): 51 | query = {'query': q} 52 | return self._async('virustotal_report', query) 53 | 54 | def googlesafebrowsing(self, q): 55 | query = {'query': q} 56 | return self._async('googlesafebrowsing', query) 57 | 58 | def urlquery(self, q): 59 | query = {'query': q} 60 | return self._async('urlquery', query) 61 | 62 | def ticket(self, q): 63 | query = 
{'query': q} 64 | return self._async('ticket', query) 65 | 66 | def whoismail(self, q): 67 | query = {'query': q} 68 | return self._async('whois', query) 69 | 70 | def pdnscircl(self, q): 71 | query = {'query': q} 72 | return self._async('pdnscircl', query) 73 | 74 | def bgpr(self, q): 75 | query = {'query': q} 76 | return self._async('bgpranking', query) 77 | 78 | def sslcircl(self, q): 79 | query = {'query': q} 80 | return self._async('psslcircl', query) 81 | 82 | def lookyloo(self, q): 83 | query = {'url': q} 84 | return self._async('lookyloo', query) 85 | 86 | def _update_cache(self, cached): 87 | for result in cached['result']: 88 | for url, items in result.items(): 89 | self.resolve(url) 90 | self.phishtank(url) 91 | self.virustotal(url) 92 | self.googlesafebrowsing(url) 93 | self.urlquery(url) 94 | self.ticket(url) 95 | self.whoismail(url) 96 | if 'dns' not in items: 97 | continue 98 | for entry in items['dns']: 99 | if entry is None: 100 | continue 101 | for ip in entry: 102 | self.phishtank(ip) 103 | self.bgpr(ip) 104 | self.urlquery(ip) 105 | self.pdnscircl(ip) 106 | self.sslcircl(ip) 107 | self.whoismail(ip) 108 | 109 | def run_query(self, q, with_digest=False): 110 | cached = self.get_cache(q, with_digest) 111 | if len(cached['result']) > 0: 112 | has_cached_content = True 113 | self._update_cache(cached) 114 | for r in cached['result']: 115 | for url, content in r.items(): 116 | if not content: 117 | has_cached_content = False 118 | if has_cached_content: 119 | cached['info'] = 'Used cached content' 120 | return cached 121 | self.lookyloo(q) 122 | job_id = self.urls(q) 123 | all_urls = None 124 | while True: 125 | all_urls = self.get_result(job_id) 126 | if all_urls is None: 127 | time.sleep(.5) 128 | else: 129 | break 130 | 131 | res = {} 132 | for u in all_urls: 133 | res[u] = self.resolve(u) 134 | self.phishtank(u) 135 | self.virustotal(u) 136 | self.googlesafebrowsing(u) 137 | self.urlquery(u) 138 | self.ticket(u) 139 | self.whoismail(u) 140 | 141 | waiting = True 142 | done = [] 143 | while waiting: 144 | waiting = False 145 | for u, job_id in res.items(): 146 | if job_id in done: 147 | continue 148 | ips = self.get_result(job_id) 149 | if ips is not None: 150 | done.append(job_id) 151 | v4, v6 = ips 152 | if v4 is not None: 153 | for ip in v4: 154 | self.phishtank(ip) 155 | self.bgpr(ip) 156 | self.urlquery(ip) 157 | self.pdnscircl(ip) 158 | self.sslcircl(ip) 159 | self.whoismail(ip) 160 | if v6 is not None: 161 | for ip in v6: 162 | self.phishtank(ip) 163 | self.bgpr(ip) 164 | self.urlquery(ip) 165 | self.pdnscircl(ip) 166 | self.whoismail(ip) 167 | waiting = True 168 | time.sleep(.5) 169 | time.sleep(1) 170 | cached = self.get_cache(q, with_digest) 171 | cached['info'] = 'New query, all the details may not be available.' 
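# The backend workers may still be filling the cache asynchronously at this
# point, hence the warning above that not all details may be available yet.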
172 | return cached 173 | 174 | def get_cache(self, q, digest=False): 175 | query = {'query': q, 'digest': digest} 176 | response = self.session.post(urljoin(self.url, 'get_cache'), data=json.dumps(query)) 177 | return response.json() 178 | -------------------------------------------------------------------------------- /client/setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | # -*- coding: utf-8 -*- 3 | from setuptools import setup 4 | 5 | setup( 6 | name='pyurlabuse', 7 | version='1.0', 8 | author='Raphaël Vinot', 9 | author_email='raphael.vinot@circl.lu', 10 | maintainer='Raphaël Vinot', 11 | url='https://github.com/CIRCL/url-abuse', 12 | description='Python API for URL Abuse.', 13 | long_description=open('README.md').read(), 14 | packages=['pyurlabuse'], 15 | scripts=['bin/urlabuse'], 16 | classifiers=[ 17 | 'License :: OSI Approved :: GNU General Public License v3 (GPLv3)', 18 | 'Development Status :: 3 - Alpha', 19 | 'Environment :: Console', 20 | 'Intended Audience :: Science/Research', 21 | 'Intended Audience :: Telecommunications Industry', 22 | 'Programming Language :: Python', 23 | 'Topic :: Security', 24 | 'Topic :: Internet', 25 | ], 26 | install_requires=['requests'], 27 | ) 28 | -------------------------------------------------------------------------------- /client/tests/tests.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | 4 | import unittest 5 | 6 | from pyurlabuse import PyURLAbuse 7 | 8 | import json 9 | 10 | 11 | class TestPyUrlAbuse(unittest.TestCase): 12 | 13 | def test_digest(self): 14 | urlabuse = PyURLAbuse('http://0.0.0.0:5200') 15 | response = urlabuse.run_query('https://circl.lu/url-abuse') 16 | print(json.dumps(response, indent=2)) 17 | self.assertTrue(response['result']) 18 | 19 | if __name__ == '__main__': 20 | unittest.main() 21 | -------------------------------------------------------------------------------- /doc/logo/logo-circl.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CIRCL/url-abuse/3d2ae503ec6ecbee92f7b8010abd46afa5b52230/doc/logo/logo-circl.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | -i https://pypi.org/simple 2 | -e . 
3 | -e ./client 4 | -e git+https://github.com/CIRCL/lookyloo.git/@934324ed09fede42e0fed43c3c0eab80d6436bb2#egg=pylookyloo&subdirectory=client 5 | -e git+https://github.com/D4-project/BGP-Ranking.git/@b367e1852cafabcb35a4159f520649bd35c4686b#egg=pybgpranking&subdirectory=client 6 | -e git+https://github.com/D4-project/IPASN-History.git/@283539cfbbde4bb54497726634407025f7d685c2#egg=pyipasnhistory&subdirectory=client 7 | -e git+https://github.com/stricaud/faup.git/@b65a4d816b008d715f4394cf2ccac474c1710350#egg=pyfaup&subdirectory=src/lib/bindings/python/ 8 | beautifulsoup4==4.8.0 9 | blinker==1.4 10 | certifi==2019.6.16 11 | chardet==3.0.4 12 | click==7.0 13 | dnspython==1.16.0 14 | dominate==2.4.0 15 | flask-bootstrap==3.3.7.1 16 | flask-mail==0.9.1 17 | flask-wtf==0.14.2 18 | flask==1.1.1 19 | gevent==1.4.0 20 | greenlet==0.4.15 ; platform_python_implementation == 'CPython' 21 | gunicorn[gevent]==19.9.0 22 | idna==2.8 23 | itsdangerous==1.1.0 24 | jinja2==2.10.1 25 | markupsafe==1.1.1 26 | pyeupi==1.0 27 | pypdns==1.4.1 28 | pypssl==2.1 29 | python-dateutil==2.8.0 30 | redis==3.3.8 31 | requests-cache==0.5.2 32 | requests==2.22.0 33 | six==1.12.0 34 | soupsieve==1.9.3 35 | urllib3==1.25.3 36 | visitor==0.1.3 37 | werkzeug==0.15.5 38 | wtforms==2.2.1 39 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | from setuptools import setup 4 | 5 | 6 | setup( 7 | name='urlabuse', 8 | version='0.1', 9 | author='Raphaël Vinot', 10 | author_email='raphael.vinot@circl.lu', 11 | maintainer='Raphaël Vinot', 12 | url='https://github.com/CIRCL/url-abuse/', 13 | description='URL Abuse interface', 14 | packages=['urlabuse'], 15 | scripts=['bin/run_backend.py', 'bin/run_workers.py', 'bin/start.py', 'bin/start_website.py'], 16 | classifiers=[ 17 | 'License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)', 18 | 'Development Status :: 3 - Alpha', 19 | 'Environment :: Console', 20 | 'Operating System :: POSIX :: Linux', 21 | 'Intended Audience :: Science/Research', 22 | 'Intended Audience :: Telecommunications Industry', 23 | 'Intended Audience :: Information Technology', 24 | 'Programming Language :: Python :: 3', 25 | 'Topic :: Security', 26 | 'Topic :: Internet', 27 | ] 28 | ) 29 | -------------------------------------------------------------------------------- /urlabuse/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CIRCL/url-abuse/3d2ae503ec6ecbee92f7b8010abd46afa5b52230/urlabuse/__init__.py -------------------------------------------------------------------------------- /urlabuse/exceptions.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | 4 | 5 | class URLAbuseException(Exception): 6 | pass 7 | 8 | 9 | class CreateDirectoryException(URLAbuseException): 10 | pass 11 | 12 | 13 | class MissingEnv(URLAbuseException): 14 | pass 15 | -------------------------------------------------------------------------------- /urlabuse/helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | 4 | import os 5 | from pathlib import Path 6 | from .exceptions import CreateDirectoryException, MissingEnv 7 | from redis import Redis 8 | from redis.exceptions import 
ConnectionError 9 | from datetime import datetime, timedelta 10 | import time 11 | import asyncio 12 | 13 | 14 | def get_storage_path() -> Path: 15 | if not os.environ.get('VIRTUAL_ENV'): 16 | raise MissingEnv("VIRTUAL_ENV is missing. This project really wants to run from a virtual environment.") 17 | return Path(os.environ['VIRTUAL_ENV']) 18 | 19 | 20 | def get_homedir() -> Path: 21 | if not os.environ.get('URLABUSE_HOME'): 22 | guessed_home = Path(__file__).resolve().parent.parent 23 | raise MissingEnv(f"URLABUSE_HOME is missing. \ 24 | Run the following command (assuming you run the code from the cloned repository):\ 25 | export URLABUSE_HOME='{guessed_home}'") 26 | return Path(os.environ['URLABUSE_HOME']) 27 | 28 | 29 | def safe_create_dir(to_create: Path) -> None: 30 | if to_create.exists() and not to_create.is_dir(): 31 | raise CreateDirectoryException(f'The path {to_create} already exists and is not a directory') 32 | os.makedirs(to_create, exist_ok=True) 33 | 34 | 35 | def set_running(name: str) -> None: 36 | r = Redis(unix_socket_path=get_socket_path('cache'), db=1, decode_responses=True) 37 | r.hset('running', name, 1) 38 | 39 | 40 | def unset_running(name: str) -> None: 41 | r = Redis(unix_socket_path=get_socket_path('cache'), db=1, decode_responses=True) 42 | r.hdel('running', name) 43 | 44 | 45 | def is_running() -> dict: 46 | r = Redis(unix_socket_path=get_socket_path('cache'), db=1, decode_responses=True) 47 | return r.hgetall('running') 48 | 49 | 50 | def get_socket_path(name: str) -> str: 51 | mapping = { 52 | 'cache': Path('cache', 'cache.sock') 53 | } 54 | return str(get_homedir() / mapping[name]) 55 | 56 | 57 | def check_running(name: str) -> bool: 58 | socket_path = get_socket_path(name) 59 | print(socket_path) 60 | try: 61 | r = Redis(unix_socket_path=socket_path) 62 | if r.ping(): 63 | return True 64 | except ConnectionError: 65 | return False 66 | 67 | 68 | def shutdown_requested() -> bool: 69 | try: 70 | r = Redis(unix_socket_path=get_socket_path('cache'), db=1, decode_responses=True) 71 | return r.exists('shutdown') 72 | except ConnectionRefusedError: 73 | return True 74 | except ConnectionError: 75 | return True 76 | 77 | 78 | async def long_sleep_async(sleep_in_sec: int, shutdown_check: int=10) -> bool: 79 | if shutdown_check > sleep_in_sec: 80 | shutdown_check = sleep_in_sec 81 | sleep_until = datetime.now() + timedelta(seconds=sleep_in_sec) 82 | while sleep_until > datetime.now(): 83 | await asyncio.sleep(shutdown_check) 84 | if shutdown_requested(): 85 | return False 86 | return True 87 | 88 | 89 | def long_sleep(sleep_in_sec: int, shutdown_check: int=10) -> bool: 90 | if shutdown_check > sleep_in_sec: 91 | shutdown_check = sleep_in_sec 92 | sleep_until = datetime.now() + timedelta(seconds=sleep_in_sec) 93 | while sleep_until > datetime.now(): 94 | time.sleep(shutdown_check) 95 | if shutdown_requested(): 96 | return False 97 | return True 98 | -------------------------------------------------------------------------------- /urlabuse/urlabuse.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # 3 | # 4 | # Copyright (C) 2014 Sascha Rommelfangen, Raphael Vinot 5 | # Copyright (C) 2014 CIRCL Computer Incident Response Center Luxembourg (SMILE gie) 6 | # 7 | 8 | from datetime import date, timedelta 9 | import json 10 | from redis import Redis 11 | from urllib.parse import quote 12 | from .helpers import get_socket_path 13 | import ipaddress 14 | 15 | 16 | from pyfaup.faup import Faup 17 | import 
socket 18 | import dns.resolver 19 | import re 20 | import logging 21 | from pypdns import PyPDNS 22 | from pyipasnhistory import IPASNHistory 23 | from pybgpranking import BGPRanking 24 | from pylookyloo import Lookyloo 25 | 26 | from pypssl import PyPSSL 27 | from pyeupi import PyEUPI 28 | import requests 29 | from bs4 import BeautifulSoup 30 | 31 | try: 32 | # import sphinxapi 33 | sphinx = True 34 | except Exception: 35 | sphinx = False 36 | 37 | 38 | class Query(): 39 | 40 | def __init__(self, loglevel: int=logging.DEBUG): 41 | self.__init_logger(loglevel) 42 | self.fex = Faup() 43 | self.cache = Redis(unix_socket_path=get_socket_path('cache'), db=1, 44 | decode_responses=True) 45 | 46 | def __init_logger(self, loglevel) -> None: 47 | self.logger = logging.getLogger(f'{self.__class__.__name__}') 48 | self.logger.setLevel(loglevel) 49 | 50 | def _cache_set(self, key, value, field=None): 51 | if field is None: 52 | self.cache.setex(key, 3600, json.dumps(value))  # redis-py 3.x signature: setex(name, time, value) 53 | else: 54 | self.cache.hset(key, field, json.dumps(value)) 55 | self.cache.expire(key, 3600) 56 | 57 | def _cache_get(self, key, field=None): 58 | if field is None: 59 | value_json = self.cache.get(key) 60 | else: 61 | value_json = self.cache.hget(key, field) 62 | if value_json is not None: 63 | return json.loads(value_json) 64 | return None 65 | 66 | def to_bool(self, s): 67 | """ 68 | Converts the given string to a boolean. 69 | """ 70 | return s.lower() in ('1', 'true', 'yes', 'on') 71 | 72 | def get_submissions(self, url, day=None): 73 | if day is None: 74 | day = date.today().isoformat() 75 | else: 76 | day = day.isoformat() 77 | return self.cache.zscore(f'{day}_submissions', url) 78 | 79 | def get_mail_sent(self, url, day=None): 80 | if day is None: 81 | day = date.today().isoformat() 82 | else: 83 | day = day.isoformat() 84 | self.fex.decode(url) 85 | host = self.fex.get_host() 86 | return self.cache.sismember(f'{day}_mails', host) 87 | 88 | def set_mail_sent(self, url, day=None): 89 | if day is None: 90 | day = date.today().isoformat() 91 | else: 92 | day = day.isoformat() 93 | self.fex.decode(url) 94 | host = self.fex.get_host() 95 | return self.cache.sadd(f'{day}_mails', host) 96 | 97 | def is_valid_url(self, url): 98 | cached = self._cache_get(url, 'valid') 99 | key = f'{date.today().isoformat()}_submissions' 100 | self.cache.zincrby(key, 1, url) 101 | if cached is not None: 102 | return cached 103 | if url.startswith('hxxp'): 104 | url = 'http' + url[4:] 105 | elif not url.startswith('http'): 106 | url = 'http://' + url 107 | logging.debug("Checking validity of URL: " + url) 108 | self.fex.decode(url) 109 | scheme = self.fex.get_scheme() 110 | host = self.fex.get_host() 111 | if scheme is None or host is None: 112 | reason = "Not a valid http/https URL/URI" 113 | return False, url, reason 114 | self._cache_set(url, (True, url, None), 'valid') 115 | return True, url, None 116 | 117 | def is_ip(self, host): 118 | try: 119 | ipaddress.ip_address(host) 120 | return True 121 | except ValueError: 122 | return False 123 | 124 | def try_resolve(self, url): 125 | self.fex.decode(url) 126 | host = self.fex.get_host().lower() 127 | if self.is_ip(host): 128 | return True, None 129 | try: 130 | ipaddr = dns.resolver.query(host, 'A') 131 | except Exception: 132 | reason = "DNS server problem. Check resolver settings." 133 | return False, reason 134 | if not ipaddr: 135 | reason = "Host " + host + " does not exist." 
136 | return False, reason 137 | return True, None 138 | 139 | def get_urls(self, url, depth=1): 140 | if depth > 5: 141 | print('Too many redirects.') 142 | return 143 | 144 | def meta_redirect(content): 145 | c = content.lower() 146 | soup = BeautifulSoup(c, "html.parser") 147 | for result in soup.find_all(attrs={'http-equiv': 'refresh'}): 148 | if result: 149 | out = result["content"].split(";") 150 | if len(out) == 2: 151 | wait, text = out 152 | try: 153 | a, url = text.split('=', 1) 154 | return url.strip() 155 | except Exception: 156 | print(text) 157 | return None 158 | 159 | resolve, reason = self.try_resolve(url) 160 | if not resolve: 161 | # FIXME: inform that the domain does not resolve 162 | yield url 163 | return 164 | 165 | logging.debug(f"Making HTTP connection to {url}") 166 | 167 | headers = {'User-agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0'} 168 | try: 169 | response = requests.get(url, allow_redirects=True, headers=headers, 170 | timeout=15, verify=False) 171 | except Exception: 172 | # That one can fail (DNS for example) 173 | # FIXME: inform that the get failed 174 | yield url 175 | return 176 | if response.history is not None: 177 | for h in response.history: 178 | # Yeld the urls in the order we find them 179 | yield h.url 180 | 181 | yield response.url 182 | 183 | meta_redir_url = meta_redirect(response.content) 184 | if meta_redir_url is not None: 185 | depth += 1 186 | if not meta_redir_url.startswith('http'): 187 | self.fex.decode(url) 188 | base = '{}://{}'.format(self.fex.get_scheme(), self.fex.get_host()) 189 | port = self.fex.get_port() 190 | if port is not None: 191 | base += f':{port}' 192 | if not meta_redir_url.startswith('/'): 193 | # relative redirect. resource_path has the initial '/' 194 | if self.fex.get_resource_path() is not None: 195 | base += self.fex.get_resource_path() 196 | if not base.endswith('/'): 197 | base += '/' 198 | meta_redir_url = base + meta_redir_url 199 | for url in self.get_urls(meta_redir_url, depth): 200 | yield url 201 | 202 | def url_list(self, url): 203 | cached = self._cache_get(url, 'list') 204 | if cached is not None: 205 | return cached 206 | list_urls = [] 207 | for u in self.get_urls(url): 208 | if u is None or u in list_urls: 209 | continue 210 | list_urls.append(u) 211 | self._cache_set(url, list_urls, 'list') 212 | return list_urls 213 | 214 | def dns_resolve(self, url): 215 | cached = self._cache_get(url, 'dns') 216 | if cached is not None: 217 | return cached 218 | self.fex.decode(url) 219 | host = self.fex.get_host().lower() 220 | ipv4 = None 221 | ipv6 = None 222 | if self.is_ip(host): 223 | if ':' in host: 224 | try: 225 | socket.inet_pton(socket.AF_INET6, host) 226 | ipv6 = [host] 227 | except Exception: 228 | pass 229 | else: 230 | try: 231 | socket.inet_aton(host) 232 | ipv4 = [host] 233 | except Exception: 234 | pass 235 | else: 236 | try: 237 | ipv4 = [str(ip) for ip in dns.resolver.query(host, 'A')] 238 | except Exception: 239 | logging.debug("No IPv4 address assigned to: " + host) 240 | try: 241 | ipv6 = [str(ip) for ip in dns.resolver.query(host, 'AAAA')] 242 | except Exception: 243 | logging.debug("No IPv6 address assigned to: " + host) 244 | self._cache_set(url, (ipv4, ipv6), 'dns') 245 | return ipv4, ipv6 246 | 247 | def phish_query(self, url, key, query): 248 | cached = self._cache_get(query, 'phishtank') 249 | if cached is not None: 250 | return cached 251 | postfields = {'url': quote(query), 'format': 'json', 'app_key': key} 252 | response = requests.post(url, 
data=postfields) 253 | res = response.json() 254 | if res["meta"]["status"] == "success": 255 | if res["results"]["in_database"]: 256 | self._cache_set(query, res["results"]["phish_detail_page"], 'phishtank') 257 | return res["results"]["phish_detail_page"] 258 | else: 259 | # no information 260 | pass 261 | elif res["meta"]["status"] == 'error': 262 | # Inform the user? 263 | # errormsg = res["errortext"] 264 | pass 265 | return None 266 | 267 | def sphinxsearch(server, port, url, query): 268 | # WARNING: too dangerous to have on the public interface 269 | return '' 270 | """ 271 | if not sphinx: 272 | return None 273 | cached = _cache_get(query, 'sphinx') 274 | if cached is not None: 275 | return cached 276 | client = sphinxapi.SphinxClient() 277 | client.SetServer(server, port) 278 | client.SetMatchMode(2) 279 | client.SetConnectTimeout(5.0) 280 | result = [] 281 | res = client.Query(query) 282 | if res.get("matches") is not None: 283 | for ticket in res["matches"]: 284 | ticket_id = ticket["id"] 285 | ticket_link = url + str(ticket_id) 286 | result.append(ticket_link) 287 | _cache_set(query, result, 'sphinx') 288 | return result 289 | 290 | """ 291 | 292 | def vt_query_url(self, url, url_up, key, query, upload=True): 293 | cached = self._cache_get(query, 'vt') 294 | if cached is not None and cached[2] is not None: 295 | return cached 296 | parameters = {"resource": query, "apikey": key} 297 | if upload: 298 | parameters['scan'] = 1 299 | response = requests.post(url, data=parameters) 300 | if response.text is None or len(response.text) == 0: 301 | return None 302 | res = response.json() 303 | msg = res["verbose_msg"] 304 | link = res.get("permalink") 305 | positives = res.get("positives") 306 | total = res.get("total") 307 | self._cache_set(query, (msg, link, positives, total), 'vt') 308 | return msg, link, positives, total 309 | 310 | def gsb_query(self, url, query): 311 | cached = self._cache_get(query, 'gsb') 312 | if cached is not None: 313 | return cached 314 | param = '1\n' + query 315 | response = requests.post(url, data=param) 316 | status = response.status_code 317 | if status == 200: 318 | self._cache_set(query, response.text, 'gsb') 319 | return response.text 320 | 321 | ''' 322 | def urlquery_query(url, key, query): 323 | return None 324 | cached = _cache_get(query, 'urlquery') 325 | if cached is not None: 326 | return cached 327 | try: 328 | urlquery.url = url 329 | urlquery.key = key 330 | response = urlquery.search(query) 331 | except Exception: 332 | return None 333 | if response['_response_']['status'] == 'ok': 334 | if response.get('reports') is not None: 335 | total_alert_count = 0 336 | for r in response['reports']: 337 | total_alert_count += r['urlquery_alert_count'] 338 | total_alert_count += r['ids_alert_count'] 339 | total_alert_count += r['blacklist_alert_count'] 340 | _cache_set(query, total_alert_count, 'urlquery') 341 | return total_alert_count 342 | else: 343 | return None 344 | ''' 345 | 346 | def process_emails(self, emails, ignorelist, replacelist): 347 | to_return = list(set(emails)) 348 | for mail in reversed(to_return): 349 | for ignorelist_entry in ignorelist: 350 | if re.search(ignorelist_entry, mail, re.I): 351 | if mail in to_return: 352 | to_return.remove(mail) 353 | for k, v in list(replacelist.items()): 354 | if re.search(k, mail, re.I): 355 | if k in to_return: 356 | to_return.remove(k) 357 | to_return += v 358 | return to_return 359 | 360 | def whois(self, server, port, domain, ignorelist, replacelist): 361 | cached = self._cache_get(domain, 
'whois') 362 | if cached is not None: 363 | return cached 364 | s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) 365 | s.settimeout(15) 366 | try: 367 | s.connect((server, port)) 368 | except Exception: 369 | print("Connection problems - check WHOIS server") 370 | print(f"WHOIS request while problem occurred: {domain}") 371 | print(f"WHOIS server: {server}:{port}") 372 | return None 373 | if domain.startswith('http'): 374 | self.fex.decode(domain) 375 | d = self.fex.get_domain().lower() 376 | else: 377 | d = domain 378 | s.send(("{}\r\n".format(d)).encode()) 379 | response = b'' 380 | while True: 381 | d = s.recv(4096) 382 | response += d 383 | if d == b'': 384 | break 385 | s.close() 386 | match = re.findall(r'[\w\.-]+@[\w\.-]+', response.decode(errors='ignore'))  # some WHOIS servers reply with non-UTF-8 bytes 387 | emails = self.process_emails(match, ignorelist, replacelist) 388 | if len(emails) == 0: 389 | return None 390 | list_mail = list(set(emails)) 391 | self._cache_set(domain, list_mail, 'whois') 392 | return list_mail 393 | 394 | def pdnscircl(self, url, user, passwd, q): 395 | cached = self._cache_get(q, 'pdns') 396 | if cached is not None: 397 | return cached 398 | pdns = PyPDNS(url, basic_auth=(user, passwd)) 399 | response = pdns.query(q) 400 | all_uniq = [] 401 | for e in reversed(response): 402 | host = e['rrname'].lower() 403 | if host in all_uniq: 404 | continue 405 | else: 406 | all_uniq.append(host) 407 | response = (len(all_uniq), all_uniq[:5]) 408 | self._cache_set(q, response, 'pdns') 409 | return response 410 | 411 | def psslcircl(self, url, user, passwd, q): 412 | cached = self._cache_get(q, 'pssl') 413 | if cached is not None: 414 | return cached 415 | pssl = PyPSSL(url, basic_auth=(user, passwd)) 416 | response = pssl.query(q) 417 | if response.get(q) is not None: 418 | certinfo = response.get(q) 419 | entries = {} 420 | for sha1 in certinfo['certificates']: 421 | entries[sha1] = [] 422 | if certinfo['subjects'].get(sha1): 423 | for value in certinfo['subjects'][sha1]['values']: 424 | entries[sha1].append(value) 425 | self._cache_set(q, entries, 'pssl') 426 | return entries 427 | return None 428 | 429 | def eupi(self, url, key, q): 430 | cached = self._cache_get(q, 'eupi') 431 | if cached is not None: 432 | return cached 433 | eu = PyEUPI(key, url) 434 | response = eu.search_url(url=q) 435 | if response.get('results'): 436 | r = response.get('results')[0]['tag_label'] 437 | self._cache_set(q, r, 'eupi') 438 | return r 439 | eu.post_submission(q) 440 | return None 441 | 442 | def bgpranking(self, ip): 443 | cached = self._cache_get(ip, 'ipasn') 444 | if cached is not None: 445 | asn = cached['asn'] 446 | prefix = cached['prefix'] 447 | else: 448 | ipasn = IPASNHistory() 449 | response = ipasn.query(ip) 450 | if 'response' not in response or not response['response']: 451 | # No IPASN data available for that IP. 452 | asn = prefix = None 453 | else: 454 | entry = response['response'][list(response['response'].keys())[0]] 455 | if entry: 456 | self._cache_set(ip, entry, 'ipasn') 457 | asn = entry['asn'] 458 | prefix = entry['prefix'] 459 | else: 460 | asn = prefix = None 461 | 462 | if not asn or not prefix: 463 | # asn, prefix, asn_descr, rank, position, known_asns 464 | return None, None, None, None, None, None 465 | 466 | cached = self._cache_get(ip, 'bgpranking') 467 | if cached is not None: 468 | return cached 469 | bgpranking = BGPRanking() 470 | response = bgpranking.query(asn, date=(date.today() - timedelta(1)).isoformat()) 471 | if 'response' not in response or not response['response']: 472 | return None, None, None, None, None, None 473 | to_return = (asn, 
prefix, response['response']['asn_description'], response['response']['ranking']['rank'], 474 | response['response']['ranking']['position'], response['response']['ranking']['total_known_asns']) 475 | self._cache_set(ip, to_return, 'bgpranking') 476 | return to_return 477 | 478 | def lookyloo(self, url): 479 | cached = self._cache_get(url, 'lookyloo') 480 | if cached is not None: 481 | return cached 482 | lookyloo = Lookyloo() 483 | lookyloo_perma_url = lookyloo.enqueue(url) 484 | if lookyloo_perma_url: 485 | self._cache_set(url, lookyloo_perma_url, 'lookyloo') 486 | return lookyloo_perma_url 487 | return None 488 | 489 | def _deserialize_cached(self, entry): 490 | to_return = {} 491 | redirects = [] 492 | h = self.cache.hgetall(entry) 493 | for key, value in h.items(): 494 | v = json.loads(value) 495 | if key == 'list': 496 | redirects = v 497 | continue 498 | to_return[key] = v 499 | return to_return, redirects 500 | 501 | def get_url_data(self, url): 502 | data, redirects = self._deserialize_cached(url) 503 | if data.get('dns') is not None: 504 | ipv4, ipv6 = data['dns'] 505 | ip_data = {} 506 | if ipv4 is not None: 507 | for ip in ipv4: 508 | info, _ = self._deserialize_cached(ip) 509 | ip_data[ip] = info 510 | if ipv6 is not None: 511 | for ip in ipv6: 512 | info, _ = self._deserialize_cached(ip) 513 | ip_data[ip] = info 514 | if len(ip_data) > 0: 515 | data.update(ip_data) 516 | return {url: data}, redirects 517 | 518 | def cached(self, url, digest=False): 519 | url_data, redirects = self.get_url_data(url) 520 | to_return = [url_data] 521 | for u in redirects: 522 | if u == url: 523 | continue 524 | data, redir = self.get_url_data(u) 525 | to_return.append(data) 526 | if digest: 527 | return {'result': to_return, 'digest': self.digest(to_return)} 528 | return {'result': to_return} 529 | 530 | def ip_details_digest(self, ips, all_info, all_asns, all_mails): 531 | to_return = '' 532 | for ip in ips: 533 | to_return += '\t' + ip + '\n' 534 | data = all_info[ip] 535 | if data.get('bgpranking'): 536 | to_return += '\t\tis announced by {} ({}). Position {}/{}.\n'.format( 537 | data['bgpranking'][2], data['bgpranking'][0], 538 | data['bgpranking'][4], data['bgpranking'][5]) 539 | all_asns.add('{} ({})'.format(data['bgpranking'][2], data['bgpranking'][0])) 540 | if data.get('whois'): 541 | all_mails.update(data.get('whois')) 542 | return to_return 543 | 544 | def digest(self, data): 545 | to_return = '' 546 | all_mails = set() 547 | all_asns = set() 548 | for entry in data: 549 | # Each URL we're redirected to 550 | for url, info in entry.items(): 551 | # info contains the information we got for the URL. 
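                # A minimal illustration (not in the upstream file), assuming every
                # module returned data: `info` then roughly looks like the dict below.
                # The keys match the checks that follow; the values are invented.
                #     {'whois': ['abuse@example.net'],
                #      'lookyloo': 'https://lookyloo.example/tree/1234',
                #      'vt': ['Scan finished, information embedded',
                #             'https://www.virustotal.com/url/.../analysis/', 2, 70],
                #      'gsb': 'malware',
                #      'phishtank': 'http://www.phishtank.com/phish_detail.php?phish_id=123',
                #      'dns': [['203.0.113.1'], None]}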
552 |                 to_return += '\n{}\n'.format(url)
553 |                 if 'whois' in info:
554 |                     all_mails.update(info['whois'])
555 |                 if 'lookyloo' in info:
556 |                     to_return += '\tLookyloo permanent URL: {}\n'.format(info['lookyloo'])
557 |                 if 'vt' in info and len(info['vt']) == 4:
558 |                     if info['vt'][2] is not None:
559 |                         to_return += '\t{} out of {} positive detections in VT - {}\n'.format(
560 |                             info['vt'][2], info['vt'][3], info['vt'][1])
561 |                     else:
562 |                         to_return += '\t{} - {}\n'.format(info['vt'][0], info['vt'][1])
563 |                 if 'gsb' in info:
564 |                     to_return += '\tKnown as malicious on Google Safe Browsing: {}\n'.format(info['gsb'])
565 |                 if 'phishtank' in info:
566 |                     to_return += '\tKnown on PhishTank: {}\n'.format(info['phishtank'])
567 | 
568 |                 if 'dns' in info:
569 |                     ipv4, ipv6 = info['dns']
570 |                     if ipv4 is not None:
571 |                         to_return += self.ip_details_digest(ipv4, info, all_asns, all_mails)
572 |                     if ipv6 is not None:
573 |                         to_return += self.ip_details_digest(ipv6, info, all_asns, all_mails)
574 |         return to_return, list(all_mails), list(all_asns)
575 | 
--------------------------------------------------------------------------------
/website/3drparty.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | 
3 | set -e
4 | set -x
5 | 
6 | DEST_DIR="web/static/"
7 | 
8 | ANGULAR='1.7.8'
9 | ANGULAR_BOOTSTRAP='2.5.0'
10 | 
11 | wget https://ajax.googleapis.com/ajax/libs/angularjs/${ANGULAR}/angular.min.js -O ${DEST_DIR}/angular.min.js
12 | wget https://angular-ui.github.io/bootstrap/ui-bootstrap-tpls-${ANGULAR_BOOTSTRAP}.min.js -O ${DEST_DIR}/ui-bootstrap-tpls.min.js
13 | 
14 | wget https://raw.githubusercontent.com/sphinxsearch/sphinx/master/api/sphinxapi.py -O sphinxapi.py
15 | 
16 | 
17 | 
18 | 
--------------------------------------------------------------------------------
/website/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CIRCL/url-abuse/3d2ae503ec6ecbee92f7b8010abd46afa5b52230/website/__init__.py
--------------------------------------------------------------------------------
/website/config/config.ini.sample:
--------------------------------------------------------------------------------
1 | [GLOBAL]
2 | debug=False
3 | 
4 | [WHOIS]
5 | server=127.0.0.1
6 | port=4243
7 | 
8 | [SPHINX]
9 | server=127.0.0.1
10 | port=9312
11 | 
12 | [ITS]
13 | url=https://rt:8443/rt/RTIR/Display.html?id=
14 | 
15 | [abuse]
16 | ignore=
17 |     ripe.net$
18 |     arin.net$
19 |     apnic.net$
20 |     idnic.net$
21 |     peering@
22 |     dns.lu$
23 |     domreg@
24 |     registrar-email
25 | 
26 | fallback=set.this@invalid.tld
27 | 
28 | [replacelist]
29 | abuse@ispsystem.com=abuse@ispserver.com
30 | abuse@ispsystem.net=abuse@ispserver.com
31 | hostmaster@root.lu=abuse@as5577.net
32 | noc@as5577.net=abuse@as5577.net
33 | abuse@godaddy.com=abuse@godaddy.com,phishing@godaddy.com,malware@godaddy.com
34 | ipadmin@websitewelcome.com=security@hostgator.com,ipadmin@websitewelcome.com
35 | 
36 | [PHISHTANK]
37 | url=http://checkurl.phishtank.com/checkurl/
38 | 
39 | [GOOGLESAFEBROWSING]
40 | url=https://sb-ssl.google.com/safebrowsing/api/lookup?client=urlabuse&key={}&appver=1&pver=3.1
41 | 
42 | [VIRUSTOTAL]
43 | url_upload=https://www.virustotal.com/vtapi/v2/url/scan
44 | url_report=https://www.virustotal.com/vtapi/v2/url/report
45 | 
46 | [PDNS_CIRCL]
47 | url=https://www.circl.lu/pdns/query
48 | 
49 | [PSSL_CIRCL]
50 | url=https://www.circl.lu/
51 | 
52 | [URLQUERY]
53 | url=https://uqapi.net/v3/json
54 | 
55 | [EUPI]
56 | url=https://phishing-initiative.fr
57 | 
58 | [domain]
59 | ignore=
60 |     post.lu
61 |     pt.lu
62 |     netline.lu
63 |     apple.com
64 |     paypal.com
--------------------------------------------------------------------------------
/website/web/__init__.py:
--------------------------------------------------------------------------------
1 | import json
2 | import os
3 | from pathlib import Path
4 | import uuid
5 | 
6 | from flask import Flask, render_template, request, Response, redirect, url_for
7 | from flask_mail import Mail, Message
8 | from flask_bootstrap import Bootstrap
9 | from flask_wtf import FlaskForm
10 | from wtforms import StringField, SubmitField
11 | from wtforms.widgets import TextInput
12 | from wtforms.validators import DataRequired
13 | 
14 | import logging
15 | from logging.handlers import RotatingFileHandler
16 | from logging import Formatter
17 | 
18 | from redis import Redis
19 | 
20 | from urlabuse.helpers import get_socket_path, get_homedir
21 | from urlabuse.urlabuse import Query
22 | 
23 | import configparser
24 | from .proxied import ReverseProxied
25 | 
26 | 
27 | config_dir = Path('config')
28 | 
29 | 
30 | class AngularTextInput(TextInput):
31 | 
32 |     def __call__(self, field, **kwargs):
33 |         kwargs['ng-model'] = 'input_url'
34 |         return super(AngularTextInput, self).__call__(field, **kwargs)
35 | 
36 | 
37 | class URLForm(FlaskForm):
38 |     url = StringField('URL Field',
39 |                       description='Enter the URL you want to lookup here.',
40 |                       validators=[DataRequired()], widget=AngularTextInput())
41 | 
42 |     submit_button = SubmitField('Run lookup')
43 | 
44 | 
45 | def make_dict(parser, section):
46 |     to_return = {}
47 |     entries = parser.items(section)
48 |     for k, v in entries:
49 |         to_return[k] = v.split(',')
50 |     return to_return
51 | 
52 | 
53 | def prepare_auth():
54 |     if not os.path.exists('users.key'):
55 |         return None
56 |     to_return = {}
57 |     with open('users.key', 'r') as f:
58 |         for line in f:
59 |             line = line.strip()
60 |             user, password = line.split('=', 1)  # passwords may themselves contain '='
61 |             to_return[user] = password
62 |     return to_return
63 | 
64 | 
65 | app = Flask(__name__)
66 | handler = RotatingFileHandler('urlabuse.log', maxBytes=10000, backupCount=5)
67 | handler.setFormatter(Formatter('%(asctime)s %(message)s'))
68 | app.wsgi_app = ReverseProxied(app.wsgi_app)
69 | app.logger.addHandler(handler)
70 | app.logger.setLevel(logging.INFO)
71 | Bootstrap(app)
72 | queue = Redis(unix_socket_path=get_socket_path('cache'), db=0,
73 |               decode_responses=True)
74 | urlabuse_query = Query()
75 | 
76 | # Mail Config
77 | app.config['MAIL_SERVER'] = 'localhost'
78 | app.config['MAIL_PORT'] = 25
79 | mail = Mail(app)
80 | 
81 | secret_file_path = get_homedir() / 'website' / 'secret_key'
82 | 
83 | if not secret_file_path.exists() or secret_file_path.stat().st_size < 64:
84 |     with open(secret_file_path, 'wb') as f:
85 |         f.write(os.urandom(64))
86 | 
87 | with open(secret_file_path, 'rb') as f:
88 |     app.config['SECRET_KEY'] = f.read()
89 | 
90 | app.config['BOOTSTRAP_SERVE_LOCAL'] = True
91 | app.config['configfile'] = config_dir / 'config.ini'
92 | 
93 | parser = configparser.ConfigParser()
94 | parser.read(app.config['configfile'])
95 | 
96 | replacelist = make_dict(parser, 'replacelist')
97 | auth_users = prepare_auth()
98 | ignorelist = [i.strip()
99 |               for i in parser.get('abuse', 'ignore').split('\n')
100 |               if len(i.strip()) > 0]
101 | autosend_threshold = 5
102 | 
103 | 
104 | def _get_user_ip(request):
105 |     ip = request.headers.get('X-Forwarded-For')
106 |     if ip is None:
107 |         ip = request.remote_addr
108 |     return ip
109 | 
110 | 
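# How a submission flows through the routes below: each POST handler calls
# enqueue(), which stores {'method': ..., 'data': ...} in a Redis hash and adds
# the job id to the 'to_process' set; the browser then polls /_result/<job_key>
# until a 'result' field appears. The worker that fills in that field lives in
# bin/run_workers.py, which is not reproduced in this dump. A minimal sketch,
# assuming the worker simply mirrors the contract implied by enqueue() and
# check_valid() below (names and structure here are illustrative only):
#
#     def process_one_job():
#         job_id = queue.spop('to_process')   # pick any pending job id
#         if not job_id:
#             return
#         job = queue.hgetall(job_id)         # {'method': ..., 'data': ...}
#         method = getattr(urlabuse_query, job['method'])
#         result = method(**json.loads(job['data']))
#         queue.hset(job_id, 'result', json.dumps(result))  # unblocks the poller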
111 | @app.route('/', methods=['GET', 'POST'])
112 | def index():
113 |     if request.method == 'HEAD':
114 |         # Just returns ack if the webserver is running
115 |         return 'Ack'
116 |     form = URLForm()
117 |     return render_template('index.html', form=form)
118 | 
119 | 
120 | @app.route('/urlreport', methods=['GET'])
121 | def url_report():
122 |     return render_template('url-report.html')
123 | 
124 | 
125 | @app.errorhandler(404)
126 | def page_not_found(e):
127 |     ip = request.headers.get('X-Forwarded-For')
128 |     if ip is None:
129 |         ip = request.remote_addr
130 |     if request.path != '/_result/':
131 |         app.logger.info('404 of {} on {}'.format(ip, request.path))
132 |     return render_template('404.html'), 404
133 | 
134 | 
135 | def authenticate():
136 |     """Sends a 401 response that enables basic auth"""
137 |     return Response('Could not verify your access level for that URL.\n'
138 |                     'You have to login with proper credentials', 401,
139 |                     {'WWW-Authenticate': 'Basic realm="Login Required"'})
140 | 
141 | 
142 | def check_auth(username, password):
143 |     """This function is called to check if a username /
144 |     password combination is valid.
145 |     """
146 |     if auth_users is None:
147 |         return False
148 |     else:
149 |         db_pass = auth_users.get(username)
150 |         return db_pass == password
151 | 
152 | 
153 | @app.route('/login', methods=['GET', 'POST'])
154 | def login():
155 |     auth = request.authorization
156 |     if not auth or not check_auth(auth.username, auth.password):
157 |         return authenticate()
158 |     return redirect(url_for('index'))
159 | 
160 | 
161 | @app.route("/_result/<job_key>", methods=['GET'])
162 | def check_valid(job_key):
163 |     if not job_key or not queue.exists(job_key):
164 |         return Response(json.dumps(None), mimetype='application/json'), 200
165 |     if not queue.hexists(job_key, 'result'):
166 |         return Response(json.dumps('Nay!'), mimetype='application/json'), 202
167 |     result = queue.hget(job_key, 'result')
168 |     queue.delete(job_key)
169 |     return Response(result, mimetype='application/json'), 200
170 | 
171 | 
172 | def enqueue(method, data):
173 |     job_id = str(uuid.uuid4())
174 |     p = queue.pipeline()
175 |     p.hmset(job_id, {'method': method, 'data': json.dumps(data)})
176 |     p.sadd('to_process', job_id)
177 |     p.execute()
178 |     return job_id
179 | 
180 | 
181 | @app.route('/start', methods=['POST'])
182 | def run_query():
183 |     data = request.get_json(force=True)
184 |     url = data["url"]
185 |     ip = _get_user_ip(request)
186 |     app.logger.info(f'{ip} {url}')
187 |     if (urlabuse_query.get_submissions(url) or 0) >= autosend_threshold:
188 |         send(url, '', True)
189 |     return enqueue('is_valid_url', {'url': url})
190 | 
191 | 
192 | @app.route('/lookyloo', methods=['POST'])
193 | def lookyloo():
194 |     data = request.get_json(force=True)
195 |     return enqueue('lookyloo', {'url': data["url"]})
196 | 
197 | 
198 | @app.route('/urls', methods=['POST'])
199 | def urls():
200 |     data = request.get_json(force=True)
201 |     return enqueue('url_list', {'url': data["url"]})
202 | 
203 | 
204 | @app.route('/resolve', methods=['POST'])
205 | def resolve():
206 |     data = request.get_json(force=True)
207 |     return enqueue('dns_resolve', {'url': data["url"]})
208 | 
209 | 
210 | def read_auth(name):
211 |     key = config_dir / f'{name}.key'
212 |     if not key.exists():
213 |         return ''
214 |     with open(key) as f:
215 |         to_return = []
216 |         for line in f.readlines():
217 |             to_return.append(line.strip())
218 |     return to_return
219 | 
220 | 
221 | @app.route('/phishtank', methods=['POST'])
222 | def phishtank():
223 |     auth =
read_auth('phishtank') 224 | if not auth: 225 | return '' 226 | data = request.get_json(force=True) 227 | return enqueue('phish_query', {'url': parser.get("PHISHTANK", "url"), 228 | 'key': auth[0], 'query': data["query"]}) 229 | 230 | 231 | @app.route('/virustotal_report', methods=['POST']) 232 | def vt(): 233 | auth = read_auth('virustotal') 234 | if not auth: 235 | return '' 236 | data = request.get_json(force=True) 237 | return enqueue('vt_query_url', {'url': parser.get("VIRUSTOTAL", "url_report"), 238 | 'url_up': parser.get("VIRUSTOTAL", "url_upload"), 239 | 'key': auth[0], 'query': data["query"]}) 240 | 241 | 242 | @app.route('/googlesafebrowsing', methods=['POST']) 243 | def gsb(): 244 | auth = read_auth('googlesafebrowsing') 245 | if not auth: 246 | return '' 247 | key = auth[0] 248 | data = request.get_json(force=True) 249 | url = parser.get("GOOGLESAFEBROWSING", "url").format(key) 250 | return enqueue('gsb_query', {'url': url, 251 | 'query': data["query"]}) 252 | 253 | 254 | ''' 255 | @app.route('/urlquery', methods=['POST']) 256 | def urlquery(): 257 | auth = read_auth('urlquery') 258 | if not auth: 259 | return '' 260 | key = auth[0] 261 | data = json.loads(request.data.decode()) 262 | url = parser.get("URLQUERY", "url") 263 | query = data["query"] 264 | u = q.enqueue_call(func=urlquery_query, args=(url, key, query,), result_ttl=500) 265 | return u.get_id() 266 | 267 | @app.route('/ticket', methods=['POST']) 268 | def ticket(): 269 | if not request.authorization: 270 | return '' 271 | data = json.loads(request.data.decode()) 272 | server = parser.get("SPHINX", "server") 273 | port = int(parser.get("SPHINX", "port")) 274 | url = parser.get("ITS", "url") 275 | query = data["query"] 276 | u = q.enqueue_call(func=sphinxsearch, args=(server, port, url, query,), 277 | result_ttl=500) 278 | return u.get_id() 279 | ''' 280 | 281 | 282 | @app.route('/whois', methods=['POST']) 283 | def whoismail(): 284 | data = request.get_json(force=True) 285 | return enqueue('whois', {'server': parser.get("WHOIS", "server"), 286 | 'port': parser.getint("WHOIS", "port"), 287 | 'domain': data["query"], 288 | 'ignorelist': ignorelist, 'replacelist': replacelist}) 289 | 290 | 291 | @app.route('/eupi', methods=['POST']) 292 | def eu(): 293 | auth = read_auth('eupi') 294 | if not auth: 295 | return '' 296 | data = request.get_json(force=True) 297 | return enqueue('eupi', {'url': parser.get("EUPI", "url"), 298 | 'key': auth[0], 'q': data["query"]}) 299 | 300 | 301 | @app.route('/pdnscircl', methods=['POST']) 302 | def dnscircl(): 303 | auth = read_auth('pdnscircl') 304 | if not auth: 305 | return '' 306 | user, password = auth 307 | url = parser.get("PDNS_CIRCL", "url") 308 | data = request.get_json(force=True) 309 | return enqueue('pdnscircl', {'url': url, 'user': user.strip(), 310 | 'passwd': password.strip(), 'q': data["query"]}) 311 | 312 | 313 | @app.route('/bgpranking', methods=['POST']) 314 | def bgpr(): 315 | data = request.get_json(force=True) 316 | return enqueue('bgpranking', {'ip': data["query"]}) 317 | 318 | 319 | @app.route('/psslcircl', methods=['POST']) 320 | def sslcircl(): 321 | auth = read_auth('psslcircl') 322 | if not auth: 323 | return '' 324 | user, password = auth 325 | url = parser.get("PSSL_CIRCL", "url") 326 | data = request.get_json(force=True) 327 | return enqueue('psslcircl', {'url': url, 'user': user.strip(), 328 | 'passwd': password.strip(), 'q': data["query"]}) 329 | 330 | 331 | @app.route('/get_cache', methods=['POST']) 332 | def get_cache(): 333 | data = 
request.get_json(force=True) 334 | url = data["query"] 335 | if 'digest' in data: 336 | digest = data["digest"] 337 | else: 338 | digest = False 339 | data = urlabuse_query.cached(url, digest) 340 | return Response(json.dumps(data), mimetype='application/json') 341 | 342 | 343 | def send(url, ip='', autosend=False): 344 | if not urlabuse_query.get_mail_sent(url): 345 | data = urlabuse_query.cached(url, digest=True) 346 | if not autosend: 347 | subject = 'URL Abuse report from ' + ip 348 | else: 349 | subject = 'URL Abuse report sent automatically' 350 | msg = Message(subject, sender='urlabuse@circl.lu', recipients=["info@circl.lu"]) 351 | msg.body = data['digest'][0] 352 | msg.body += '\n\n' 353 | msg.body += json.dumps(data['result'], sort_keys=True, indent=2) 354 | mail.send(msg) 355 | urlabuse_query.set_mail_sent(url) 356 | 357 | 358 | @app.route('/submit', methods=['POST']) 359 | def send_mail(): 360 | data = request.get_json(force=True) 361 | url = data["url"] 362 | if not urlabuse_query.get_mail_sent(url): 363 | ip = _get_user_ip(request) 364 | send(url, ip) 365 | form = URLForm() 366 | return render_template('index.html', form=form) 367 | -------------------------------------------------------------------------------- /website/web/proxied.py: -------------------------------------------------------------------------------- 1 | class ReverseProxied(object): 2 | '''Wrap the application in this middleware and configure the 3 | front-end server to add these headers, to let you quietly bind 4 | this to a URL other than / and to an HTTP scheme that is 5 | different than what is used locally. 6 | 7 | In nginx: 8 | location /myprefix { 9 | proxy_pass http://192.168.0.1:5001; 10 | proxy_set_header Host $host; 11 | proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; 12 | proxy_set_header X-Scheme $scheme; 13 | proxy_set_header X-Script-Name /myprefix; 14 | } 15 | 16 | :param app: the WSGI application 17 | ''' 18 | def __init__(self, app): 19 | self.app = app 20 | 21 | def __call__(self, environ, start_response): 22 | script_name = environ.get('HTTP_X_SCRIPT_NAME', '') 23 | if script_name: 24 | environ['SCRIPT_NAME'] = script_name 25 | path_info = environ['PATH_INFO'] 26 | if path_info.startswith(script_name): 27 | environ['PATH_INFO'] = path_info[len(script_name):] 28 | 29 | scheme = environ.get('HTTP_X_SCHEME', '') 30 | if scheme: 31 | environ['wsgi.url_scheme'] = scheme 32 | return self.app(environ, start_response) 33 | -------------------------------------------------------------------------------- /website/web/static/ajax-loader.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CIRCL/url-abuse/3d2ae503ec6ecbee92f7b8010abd46afa5b52230/website/web/static/ajax-loader.gif -------------------------------------------------------------------------------- /website/web/static/main.js: -------------------------------------------------------------------------------- 1 | (function () { 2 | 'use strict'; 3 | 4 | var app = angular.module('URLabuseApp', ['ui.bootstrap']); 5 | 6 | app.factory("flash", function($rootScope) { 7 | var queue = []; 8 | var currentMessage = ""; 9 | 10 | $rootScope.$on("newFlashMessage", function() { 11 | currentMessage = queue.shift() || ""; 12 | }); 13 | 14 | return { 15 | setMessage: function(message) { 16 | queue.push(message); 17 | }, 18 | getMessage: function() { 19 | return currentMessage; 20 | } 21 | }; 22 | }); 23 | 24 | app.factory('globFct', [ '$log', '$http', '$timeout', function($log, 
$http, $timeout){
25 |     return {
26 |         poller: function myself(jobID, callback) {
27 |             var timeout = null;
28 |             // fire another request
29 |             $http.get('_result/' + jobID.data).
30 |                 then(function(response) {
31 |                     if(response.status === 202) {
32 |                         $log.log(response.status);  // job queued, result not ready yet
33 |                     } else if (response.status === 200){
34 |                         $log.log(response.data);
35 |                         $timeout.cancel(timeout);
36 |                         if (response.data === "null"){
37 |                             $log.log('Got null data');
38 |                             return;
39 |                         } else {
40 |                             callback(response.data);
41 |                             return;
42 |                         }
43 |                     }
44 |                     // continue to call the poller() function every 2 seconds
45 |                     // until the timeout is cancelled
46 |                     timeout = $timeout(function() {myself(jobID, callback);}, 2000);
47 |                 });
48 |         },
49 |         query: function(path, data, callback) {
50 |             $http.post(path, data).
51 |                 then(callback, function(error) {
52 |                     $log.log(error);
53 |                 });
54 |         }
55 |     };
56 | }]);
57 | 
58 | app.controller('URLabuseController', function($scope, $log, globFct, flash) {
59 | 
60 |     $scope.poller = globFct.poller;
61 |     $scope.query = globFct.query;
62 |     $scope.flash = flash;
63 | 
64 |     var get_redirects = function(jobID) {
65 |         $scope.poller(jobID, function(data){
66 |             $log.log(data);
67 |             $scope.urls = data;
68 |         });
69 |     };
70 | 
71 | 
72 |     $scope.getResults = function() {
73 |         // get the URL from the input
74 |         $scope.query_url = '';
75 |         $scope.urls = '';
76 |         // Reset the message
77 |         $scope.$emit('newFlashMessage', '');
78 | 
79 |         var userInput = $scope.input_url;
80 | 
81 |         var lookyloo = function(jobID) {
82 |             $scope.poller(jobID, function(data){
83 |                 $scope.lookyloo_url = data;
84 |             });
85 |         };
86 | 
87 |         var check_validity = function(jobID) {
88 |             $scope.poller(jobID, function(data){
89 |                 $scope.query_url = data[1];
90 |                 if(data[0] === false){
91 |                     $scope.error = data[2];
92 |                 } else {
93 |                     $scope.query('urls', {"url": data[1]}, get_redirects);
94 |                 }
95 |             });
96 |         };
97 | 
98 |         $scope.query('start', {"url": userInput}, check_validity);
99 |         $scope.query('lookyloo', {"url": userInput}, lookyloo);
100 |     };
101 | 
102 |     $scope.submit_email = function() {
103 |         $scope.query('submit', {"url": $scope.query_url}, function(){
104 |             $scope.query_url = '';
105 |             $scope.urls = '';
106 |             $scope.input_url = '';
107 |             flash.setMessage("Mail sent to CIRCL");
108 |             $scope.$emit('newFlashMessage', '');
109 |         });
110 |     };
111 | 
112 | });
113 | 
114 | app.directive('uqUrlreport', function(globFct) {
115 | 
116 |     return {
117 |         scope: {
118 |             url: '=uqUrlreport',
119 |             // status: {isFirstOpen: true, isFirstDisabled: false}
120 |         },
121 |         link: function(scope, element, attrs) {
122 |             var get_ips = function(jobID) {
123 |                 globFct.poller(jobID, function(data){
124 |                     scope.ipv4 = data[0];
125 |                     scope.ipv6 = data[1];
126 |                     if (!scope.ipv4){
127 |                         scope.ipv4 = ['Unable to resolve in IPv4'];
128 |                     }
129 |                     if (!scope.ipv6){
130 |                         scope.ipv6 = ['Unable to resolve in IPv6'];
131 |                     }
132 |                 });
133 |             };
134 |             globFct.query('resolve', {"url": scope.url}, get_ips);
135 |         },
136 |         templateUrl: 'urlreport',
137 |     };
138 | 
139 | });
140 | 
141 | app.directive('uqPhishtank', function(globFct) {
142 |     return {
143 |         scope: {
144 |             query: '=data',
145 |         },
146 |         link: function(scope, element, attrs) {
147 |             var get_response = function(jobID) {
148 |                 globFct.poller(jobID, function(data){
149 |                     scope.response = data;
150 |                 });
151 |             };
152 |             globFct.query('phishtank', {"query": scope.query}, get_response);
153 |         },
154 |         template: function(elem, attr){
155 |             return '
Known phishing website on Phishtank. More details.
';}
156 |     };
157 | });
158 | 
159 | app.directive('uqVirustotal', function(globFct) {
160 |     return {
161 |         scope: {
162 |             query: '=data',
163 |         },
164 |         link: function(scope, element, attrs) {
165 |             var get_response = function(jobID) {
166 |                 globFct.poller(jobID, function(data){
167 |                     scope.message = data[0];
168 |                     scope.link = data[1];
169 |                     scope.positives = data[2];
170 |                     scope.total = data[3];
171 |                     if(scope.link && scope.positives === null){
172 |                         scope.alert_val = "info";
173 |                         scope.message = "Scan request successfully queued, report available soon.";
174 |                     } else if (scope.link && scope.positives === 0){
175 |                         scope.message = "None of the " + data[3] + " scanners flag this URL as malicious.";
176 |                         scope.alert_val = "success";
177 |                     } else if (scope.link && scope.positives < scope.total/3){
178 |                         scope.message = data[2] + " of the " + data[3] + " scanners flag this URL as malicious.";
179 |                         scope.alert_val = "warning";
180 |                     } else if (scope.link && scope.positives >= scope.total/3){
181 |                         scope.message = data[2] + " of the " + data[3] + " scanners flag this URL as malicious.";
182 |                         scope.alert_val = "danger";
183 |                     }
184 |                 });
185 |             };
186 |             globFct.query('virustotal_report', {"query": scope.query}, get_response);
187 |         },
188 |         template: function(elem, attr){
189 |             return '
{{message}} More details.
';} 190 | }; 191 | }); 192 | 193 | app.directive('uqGooglesafebrowsing', function(globFct) { 194 | return { 195 | scope: { 196 | query: '=data', 197 | }, 198 | link: function(scope, element, attrs) { 199 | var get_response = function(jobID) { 200 | globFct.poller(jobID, function(data){ 201 | scope.response = data; 202 | }); 203 | }; 204 | globFct.query('googlesafebrowsing', {"query": scope.query}, get_response); 205 | }, 206 | template: function(elem, attr){ 207 | return '
Known {{response}} website on Google Safe Browsing. More details.
';}
208 |     };
209 | });
210 | 
211 | app.directive('uqEupi', function(globFct) {
212 |     return {
213 |         scope: {
214 |             query: '=data',
215 |         },
216 |         link: function(scope, element, attrs) {
217 |             var get_response = function(jobID) {
218 |                 globFct.poller(jobID, function(data){
219 |                     if (data === "inconnu"){  // French for "unknown": no entry for this URL yet
220 |                         return;
221 |                     }
222 |                     scope.response = data;
223 |                     if(data === "clean"){
224 |                         scope.alert_val = "success";
225 |                     }
226 |                     else{
227 |                         scope.alert_val = "danger";
228 |                     }
229 |                 });
230 |             };
231 |             globFct.query('eupi', {"query": scope.query}, get_response);
232 |         },
233 |         template: function(elem, attr){
234 |             return '
Known as {{response}} by the European Union antiphishing initiative.
';} 235 | }; 236 | }); 237 | 238 | app.directive('uqUrlquery', function(globFct) { 239 | return { 240 | scope: { 241 | query: '=data', 242 | }, 243 | link: function(scope, element, attrs) { 244 | var get_response = function(jobID) { 245 | globFct.poller(jobID, function(data){ 246 | scope.response = data; 247 | }); 248 | }; 249 | globFct.query('urlquery', {"query": scope.query}, get_response); 250 | }, 251 | template: function(elem, attr){ 252 | return '
The total alert count on URLquery is {{response}}.
';} 253 | }; 254 | }); 255 | 256 | app.directive('uqTicket', function(globFct) { 257 | return { 258 | scope: { 259 | query: '=data', 260 | }, 261 | link: function(scope, element, attrs) { 262 | var get_response = function(jobID) { 263 | globFct.poller(jobID, function(data){ 264 | scope.response = data; 265 | }); 266 | }; 267 | globFct.query('ticket', {"query": scope.query}, get_response); 268 | }, 269 | template: '
Tickets:
' 270 | }; 271 | }); 272 | 273 | app.directive('uqWhois', function(globFct) { 274 | return { 275 | scope: { 276 | query: '=data', 277 | }, 278 | link: function(scope, element, attrs) { 279 | var get_response = function(jobID) { 280 | globFct.poller(jobID, function(data){ 281 | scope.response = data.join(); 282 | }); 283 | }; 284 | globFct.query('whois', {"query": scope.query}, get_response); 285 | }, 286 | template: '
Contact points from Whois: {{ response }}
' 287 | }; 288 | }); 289 | app.directive('uqPdnscircl', function(globFct) { 290 | return { 291 | scope: { 292 | query: '=data', 293 | }, 294 | link: function(scope, element, attrs) { 295 | var get_response = function(jobID) { 296 | globFct.poller(jobID, function(data){ 297 | scope.nbentries = data[0]; 298 | scope.lastentries = data[1]; 299 | }); 300 | }; 301 | globFct.query('pdnscircl', {"query": scope.query}, get_response); 302 | }, 303 | template: '
Has {{nbentries}} unique entries in CIRCL Passive DNS. {{lastentries.length}} most recent one(s):
  • {{domain}}
' 304 | }; 305 | }); 306 | app.directive('uqPsslcircl', function(globFct) { 307 | return { 308 | scope: { 309 | query: '=data', 310 | }, 311 | link: function(scope, element, attrs) { 312 | var get_response = function(jobID) { 313 | globFct.poller(jobID, function(data){ 314 | scope.entries = data; 315 | }); 316 | }; 317 | globFct.query('psslcircl', {"query": scope.query}, get_response); 318 | }, 319 | template: '
SSL certificates related to this IP:
  • {{sha1}}: {{subject[0]}}
' 320 | }; 321 | }); 322 | app.directive('uqBgpranking', function(globFct) { 323 | return { 324 | scope: { 325 | query: '=data', 326 | }, 327 | link: function(scope, element, attrs) { 328 | var get_response = function(jobID) { 329 | globFct.poller(jobID, function(data){ 330 | scope.asndesc = data[2]; 331 | scope.asn = data[0]; 332 | scope.prefix = data[1]; 333 | scope.position = data[4]; 334 | scope.total = data[5]; 335 | scope.value = data[3]; 336 | if (scope.position < 100){ 337 | scope.alert_val = "danger"; 338 | } else if (scope.position < 1000){ 339 | scope.alert_val = "warning"; 340 | } else { 341 | scope.alert_val = "info"; 342 | } 343 | }); 344 | }; 345 | globFct.query('bgpranking', {"query": scope.query}, get_response); 346 | }, 347 | template: '
Information from BGP Ranking:
  • Announced by: {{asndesc}} ({{asn}})
  • This ASN is at position {{position}} in the list of {{total}} known ASNs ({{value}}).
' 348 | }; 349 | }); 350 | }()); 351 | -------------------------------------------------------------------------------- /website/web/templates/404.html: -------------------------------------------------------------------------------- 1 | {% extends "index.html" %} 2 | {% block title %}Page Not Found{% endblock %} 3 | {% block body %} 4 |

Page Not Found
5 |
What you were looking for is just not there. 6 |
Back to index. 7 | {% endblock %} 8 | 9 | -------------------------------------------------------------------------------- /website/web/templates/index.html: -------------------------------------------------------------------------------- 1 | {% extends "bootstrap/base.html" %} 2 | {% import "bootstrap/wtf.html" as wtf %} 3 | {% import "bootstrap/fixes.html" as fixes %} 4 | 5 | {% block title %}CIRCL URL Abuse{% endblock %} 6 | 7 | {% block navbar %} 8 |

11 | {% endblock %} 12 | 13 | {% block html_attribs %} ng-app="URLabuseApp" {% endblock html_attribs %} 14 | 15 | {% block body_attribs %} ng-controller="URLabuseController" {% endblock body_attribs %} 16 | 17 | {% block content %} 18 |
19 |
URL Abuse testing form
20 |
URL Abuse is a public CIRCL service to review URLs.
For more information about the service
21 |
22 | {% raw %} 23 |
24 |
{{ flash.getMessage() }}
25 |
26 | {% endraw %} 27 |
28 |
29 | {{ form.hidden_tag() }} 30 | {{ wtf.form_errors(form, hiddens="only") }} 31 | {%- for field in form %} 32 | {% if not bootstrap_is_hidden_field(field) -%} 33 | {{ wtf.form_field(field, form_type=form_type, 34 | horizontal_columns=horizontal_columns, 35 | button_map={"submit_button": "primary"}) }} 36 | {%- endif %} 37 | {%- endfor %} 38 |
39 | 40 | {% raw %} 41 |
42 |
43 |
Report
44 |
{{ query_url }}
45 |
See on Lookyloo
46 | 49 |
50 |
51 | {% endraw %} 52 |
53 | {% raw %} 54 |
55 |
56 |
57 |
Send report to CIRCL
58 |
59 |
60 |
61 |
62 |
63 |
64 | {% endraw %} 65 | 66 | 67 |
68 | {% endblock %} 69 | 70 | {% block head %} 71 | {{super()}} 72 | {{fixes.ie8()}} 73 | 74 | 75 | 76 | {% endblock %} 77 | 78 | -------------------------------------------------------------------------------- /website/web/templates/url-report.html: -------------------------------------------------------------------------------- 1 | {% raw %} 2 | 3 | 4 |
5 | {{url}} 6 | 7 | 8 | 9 | 10 | 11 | 12 |
13 | 14 |
15 | 16 |
17 | {{ip}} 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 |
27 |
28 |
29 | 30 |
31 | 32 |
33 | {{ip}} 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 |
42 |
43 |
44 |
45 |
46 | 47 | {% endraw %} 48 | --------------------------------------------------------------------------------