├── .gitignore
├── BappDescription.html
├── BappManifest.bmf
├── LICENSE
├── README.md
├── Screenshots
    ├── SME-Screenshot1.JPG
    └── SME-Screenshot2.JPG
├── site_map_extractor.py
└── unittest links.html

/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

--------------------------------------------------------------------------------
/BappDescription.html:
--------------------------------------------------------------------------------

This extension extracts information from the Site Map. You can use the full site map or just in-scope items. Three types of information can be extracted:


Anchor Links - Searches responses for links of the form <a href=. Note that this includes links inside JavaScript and in commented-out areas. The log shows each link found and the page it was found on. You can select absolute links, relative links, or both. Log data can optionally be saved to a .csv file.
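The search described above can be sketched with a regular expression. This is only an illustration, not the code in site_map_extractor.py: the names ANCHOR_HREF, extract_links, is_absolute, and split_links are hypothetical, and treating a link as absolute only when it carries a scheme followed by :// is an assumption.

```python
import re

# Hypothetical pattern: an <a ...> tag whose href value is wrapped in
# single or double quotes. Matching is case-insensitive.
ANCHOR_HREF = re.compile(
    r'<a\s[^>]*?href\s*=\s*(["\'])(.*?)\1',
    re.IGNORECASE | re.DOTALL,
)

# Assumption: a link counts as absolute when it starts with "scheme://".
ABSOLUTE = re.compile(r'^[a-z][a-z0-9+.-]*://', re.IGNORECASE)


def extract_links(html):
    """Return every quoted href value found in <a> tags, in document order.

    Because this is a plain text search, links inside JavaScript strings
    and HTML comments are returned as well.
    """
    return [m.group(2) for m in ANCHOR_HREF.finditer(html)]


def is_absolute(link):
    """True if the link starts with a scheme such as http:// or https://."""
    return ABSOLUTE.match(link) is not None


def split_links(html):
    """Return (absolute_links, relative_links) for one response body."""
    links = extract_links(html)
    absolute = [link for link in links if is_absolute(link)]
    relative = [link for link in links if not is_absolute(link)]
    return absolute, relative
```

A caller would run split_links on each response body and keep whichever of the two lists the user selected. Unquoted href values are not matched by this pattern.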


Response Codes - Finds all requests that returned a response code in one of the selected ranges (1xx/2xx/3xx/4xx/5xx). The log shows the page requested, the referer if one was specified, the specific response code, and, if the response was a redirect, the location it redirected to. The log can optionally be saved to a .csv file.
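The range filter described above might look like the following minimal sketch. The entry shape (dicts with url, referer, status, and location keys) and the function names are assumptions made for illustration; the extension itself reads these values from Burp's site map.

```python
def code_range(status):
    """Map a numeric status code to its range label, e.g. 404 -> '4xx'."""
    return "%dxx" % (status // 100)


def filter_by_ranges(entries, selected):
    """Return log rows for entries whose status is in a selected range.

    entries:  iterable of dicts (hypothetical shape) with a 'url' and
              'status' key and optional 'referer' and 'location' keys.
    selected: set of range labels such as {'3xx', '4xx'}.
    Each row is (url, referer, status, redirect_target); the redirect
    target is filled in only for 3xx responses.
    """
    rows = []
    for entry in entries:
        status = entry["status"]
        if code_range(status) not in selected:
            continue
        is_redirect = 300 <= status < 400
        target = entry.get("location", "") if is_redirect else ""
        rows.append((entry["url"], entry.get("referer", ""), status, target))
    return rows
```

Keeping the rows as plain tuples makes it straightforward to show them in a log table or to hand them to a CSV writer.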


Export Site Map - Saves the site map requests and responses to a .txt file. You can specify that all requests should be exported or only those with a corresponding response. The full content of the requests and responses is saved. This enables you to write independent code to further process the site map as you wish.
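As a rough sketch of such an export, the function below writes each request and, when present, its response to a text file. The SEPARATOR line, the REQUEST/RESPONSE labels, and the (request, response) pair shape are all hypothetical; the layout the extension actually writes may differ.

```python
import io

# Hypothetical delimiter between exported items; not taken from the extension.
SEPARATOR = u"=" * 40


def export_site_map(items, path, require_response=False):
    """Write requests and responses in full to a UTF-8 text file.

    items: iterable of (request_text, response_text_or_None) pairs.
    require_response: when True, skip requests that have no response.
    Returns the number of items written.
    """
    written = 0
    with io.open(path, "w", encoding="utf-8") as out:
        for request, response in items:
            if require_response and response is None:
                continue  # only export requests that got a response
            out.write(u"%s\nREQUEST\n%s\n" % (SEPARATOR, request))
            if response is not None:
                out.write(u"RESPONSE\n%s\n" % response)
            written += 1
    return written
```

A fixed separator line keeps the file easy to split again in independent post-processing code, provided the separator cannot occur inside a request or response body.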

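--------------------------------------------------------------------------------
/unittest links.html:
--------------------------------------------------------------------------------
Testcases for Site Map Extractor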

Last tested with:
- Burp Suite Pro 2.1.07
- Jython 2.7.1

How to test?
Put this file on a web server and request it (make sure you get a 200 response, not a cached copy) using a browser with Burp as proxy.
Open the Site Map Extractor extension, set it to full site map, and hit 'run'.
To re-test, first delete the request item from Proxy > HTTP History.

== http cases with double quotes ===

Case 1A: " with http
1A

Case 1B: " with http and target blank
1B

Case 1C: " with http and target _blank with rel="nofollow"
1C

Case 1D: " with http and target _blank with rel="nofollow noreferrer"
1D

Case 1E: " with http and target _blank with rel="nofollow noreferrer"
1E

== https cases with double quotes ===

Case 2A: " with https
2A

Case 2B: " with https and target blank
2B

Case 2C: " with https and target _blank with rel="nofollow"
2C

Case 2D: " with https and target _blank with rel="nofollow noreferrer"
2D

Case 2E: " with https and target _blank with rel="nofollow noreferrer noopener"
2E

== https cases with double quotes to other domain===

Case 3A: " with https to other domain
3A

Case 3B: " with https and target blank to other domain
3B

Case 3C: " with https and target _blank with rel="nofollow" to other domain
3C

Case 3D: " with https and target _blank with rel="nofollow noreferrer" to other domain
3D

== http cases with single quotes ===

Case 4A: ' with http
4A

Case 4B: ' with http and target blank
4B

Case 4C: ' with http and target _blank with rel='nofollow'
4C

Case 4D: ' with http and target _blank with rel='nofollow noreferrer'
4D

== https cases with single quotes ===

Case 5A: ' with https
5A

Case 5B: ' with https and target blank
5B

Case 5C: ' with https and target _blank with rel='nofollow'
5C

Case 5D: ' with https and target _blank with rel='nofollow noreferrer'
5D

== https cases with single quotes to other domain===

Case 6A: ' with https to other domain
6A

Case 6B: ' with https and target blank to other domain
6B

Case 6C: ' with https and target _blank with rel='nofollow' to other domain
6C

Case 6D: ' with https and target _blank with rel='nofollow noreferrer' to other domain
6D

== relative cases with single quotes ===

Case 7A: ' with relative
7A

Case 7B: ' with http and target blank
7B

Case 7C: ' with http and target _blank with rel='nofollow'
7C

Case 7D: ' with http and target _blank with rel='nofollow noreferrer'
7D

== relative cases with double quotes ===

Case 8A: ' with relative
8A

Case 8B: ' with https and target blank
8B

Case 8C: ' with https and target _blank with rel='nofollow'
8C

Case 8D: ' with https and target _blank with rel='nofollow noreferrer'
8D

== Some troublemakers ;)

10A

10B

10C

10D

10E

10F

10G

10H

10I

10J: Exact case yet unknown, but happens sometimes with unicode, add when known

--------------------------------------------------------------------------------