├── app
│   ├── requirements.txt
│   ├── Templates
│   │   ├── base.html
│   │   ├── index.html
│   │   └── result.html
│   ├── Dockerfile
│   └── app.py
├── test_doc.docx
├── notebooks
│   ├── bilstm_model.png
│   └── password_model_bilstm.ipynb
├── models
│   ├── bilstm_model
│   │   └── 1
│   │       ├── saved_model.pb
│   │       ├── keras_metadata.pb
│   │       └── variables
│   │           ├── variables.index
│   │           └── variables.data-00000-of-00001
│   └── models.config
├── docker-compose.yml
├── LICENSE
├── .gitignore
└── README.md
/app/requirements.txt:
--------------------------------------------------------------------------------
1 | flask
2 | flask-bootstrap
3 | numpy
4 | requests
5 |
--------------------------------------------------------------------------------
/test_doc.docx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GhostPack/DeepPass/HEAD/test_doc.docx
--------------------------------------------------------------------------------
/notebooks/bilstm_model.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GhostPack/DeepPass/HEAD/notebooks/bilstm_model.png
--------------------------------------------------------------------------------
/models/bilstm_model/1/saved_model.pb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GhostPack/DeepPass/HEAD/models/bilstm_model/1/saved_model.pb
--------------------------------------------------------------------------------
/models/bilstm_model/1/keras_metadata.pb:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GhostPack/DeepPass/HEAD/models/bilstm_model/1/keras_metadata.pb
--------------------------------------------------------------------------------
/models/bilstm_model/1/variables/variables.index:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GhostPack/DeepPass/HEAD/models/bilstm_model/1/variables/variables.index
--------------------------------------------------------------------------------
/app/Templates/base.html:
--------------------------------------------------------------------------------
1 | {% extends "bootstrap/base.html" %}
2 |
3 | {% block title %}DeepPass{% endblock %}
4 |
5 | {% block content %}{% endblock %}
--------------------------------------------------------------------------------
/models/bilstm_model/1/variables/variables.data-00000-of-00001:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/GhostPack/DeepPass/HEAD/models/bilstm_model/1/variables/variables.data-00000-of-00001
--------------------------------------------------------------------------------
/models/models.config:
--------------------------------------------------------------------------------
1 | model_config_list: {
2 | config: {
3 | name: "password",
4 | base_path: "/models/bilstm_model",
5 | model_platform: "tensorflow"
6 | }
7 | }
--------------------------------------------------------------------------------
/app/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM python:3.8-slim-buster
2 | WORKDIR /code
3 | ENV FLASK_APP=app.py
4 | ENV FLASK_RUN_HOST=0.0.0.0
5 | COPY requirements.txt requirements.txt
6 | RUN pip install -r requirements.txt
7 | EXPOSE 5000
8 | COPY . .
9 | #run python -c "import nltk; nltk.download('punkt')"
10 | CMD ["flask", "run"]
--------------------------------------------------------------------------------
/docker-compose.yml:
--------------------------------------------------------------------------------
1 | version: "3.8"
2 |
3 | services:
4 | web:
5 | build:
6 | context: ./app/
7 | dockerfile: Dockerfile
8 | ports:
9 | - "127.0.0.1:5000:5000"
10 | depends_on:
11 | - tika
12 | - tensorflow
13 | tika:
14 | image: "apache/tika"
15 | ports:
16 | - "127.0.0.1:9998:9998"
17 | tensorflow:
18 | image: tensorflow/serving:latest
19 | restart: unless-stopped
20 | volumes:
21 | - './models:/models'
22 | command:
23 | - '--model_config_file=/models/models.config'
24 | ports:
25 | - '127.0.0.1:8501:8501'
26 |
--------------------------------------------------------------------------------
/app/Templates/index.html:
--------------------------------------------------------------------------------
1 | {% extends "base.html" %}
2 |
3 | {% block content %}
4 |
5 |
Document Results
6 |
7 | {% if results is not none %}
8 | {% for result in results %}
9 |
{{ result.file_name }}
10 |
11 | {% if result.model_password_candidates is not none and result.model_password_candidates|length > 0%}
12 |
Password Model Results
13 | {% for result in result.model_password_candidates %}
14 |
{{ " ".join(result["left_context"]) }} {{ result["password"] }} {{ " ".join(result["right_context"]) }}
15 | {% endfor %}
16 | {% endif %}
17 |
18 | {% if result.regex_password_candidates is not none and result.regex_password_candidates|length > 0%}
19 |
Password Regex results
20 | {% for result in result.regex_password_candidates %}
21 |
{{ " ".join(result["left_context"]) }} {{ result["password"] }} {{ " ".join(result["right_context"]) }}
22 | {% endfor %}
23 | {% endif %}
24 |
25 | {% if result.custom_regex_matches is not none and result.custom_regex_matches|length > 0%}
26 |
Custom Regex results
27 | {% for result in result.custom_regex_matches %}
28 |
{{ " ".join(result["left_context"]) }} {{ result["password"] }} {{ " ".join(result["right_context"]) }}
29 | {% endfor %}
30 | {% endif %}
31 |
32 | {% endfor %}
33 | {% else %}
34 |
No documents processed.
35 | {% endif %}
36 |
37 |
38 |
Back
39 |
40 | {% endblock %}
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # C extensions
7 | *.so
8 |
9 | # Distribution / packaging
10 | .Python
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | .eggs/
17 | lib/
18 | lib64/
19 | parts/
20 | sdist/
21 | var/
22 | wheels/
23 | pip-wheel-metadata/
24 | share/python-wheels/
25 | *.egg-info/
26 | .installed.cfg
27 | *.egg
28 | MANIFEST
29 |
30 | # PyInstaller
31 | # Usually these files are written by a python script from a template
32 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
33 | *.manifest
34 | *.spec
35 |
36 | # Installer logs
37 | pip-log.txt
38 | pip-delete-this-directory.txt
39 |
40 | # Unit test / coverage reports
41 | htmlcov/
42 | .tox/
43 | .nox/
44 | .coverage
45 | .coverage.*
46 | .cache
47 | nosetests.xml
48 | coverage.xml
49 | *.cover
50 | *.py,cover
51 | .hypothesis/
52 | .pytest_cache/
53 |
54 | # Translations
55 | *.mo
56 | *.pot
57 |
58 | # Django stuff:
59 | *.log
60 | local_settings.py
61 | db.sqlite3
62 | db.sqlite3-journal
63 |
64 | # Flask stuff:
65 | instance/
66 | .webassets-cache
67 |
68 | # Scrapy stuff:
69 | .scrapy
70 |
71 | # Sphinx documentation
72 | docs/_build/
73 |
74 | # PyBuilder
75 | target/
76 |
77 | # Jupyter Notebook
78 | .ipynb_checkpoints
79 |
80 | # IPython
81 | profile_default/
82 | ipython_config.py
83 |
84 | # pyenv
85 | .python-version
86 |
87 | # pipenv
88 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
89 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
90 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
91 | # install all needed dependencies.
92 | #Pipfile.lock
93 |
94 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
95 | __pypackages__/
96 |
97 | # Celery stuff
98 | celerybeat-schedule
99 | celerybeat.pid
100 |
101 | # SageMath parsed files
102 | *.sage.py
103 |
104 | # Environments
105 | .env
106 | .venv
107 | env/
108 | venv/
109 | ENV/
110 | env.bak/
111 | venv.bak/
112 |
113 | # Spyder project settings
114 | .spyderproject
115 | .spyproject
116 |
117 | # Rope project settings
118 | .ropeproject
119 |
120 | # mkdocs documentation
121 | /site
122 |
123 | # mypy
124 | .mypy_cache/
125 | .dmypy.json
126 | dmypy.json
127 |
128 | # Pyre type checker
129 | .pyre/
130 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DeepPass
2 |
3 | Dockerized application that analyzes documents for password candidates.
4 |
5 | The blogpost ["DeepPass — Finding Passwords With Deep Learning"](https://posts.specterops.io/deeppass-finding-passwords-with-deep-learning-4d31c534cd00) gives more detail on the approach and development of the model.
6 |
7 | To run: `docker-compose up`
8 |
9 | This will expose http://localhost:5000 where documents can be uploaded.
10 |
11 | The API can also be called directly at `http://localhost:5000/api/passwords`:
12 |
13 | ```
14 | C:\Users\harmj0y\Documents\GitHub\DeepPass>curl -F "file=@test_doc.docx" http://localhost:5000/api/passwords
15 | [{"file_name": "test_doc.docx", "model_password_candidates": [{"left_context": ["for", "the", "production", "server", "is:"], "password": "P@ssword123!", "right_context": ["Please", "dont", "tell", "anyone", "on"]}, {"left_context": ["that", "the", "other", "password", "is"], "password": "LiverPool1", "right_context": [".", "This", "is", "our", "backup."]}], "regex_password_candidates": [{"left_context": ["for", "the", "production", "server", "is:"], "password": "P@ssword123!", "right_context": ["Please", "dont", "tell", "anyone", "on"]}], "custom_regex_matches": null}]
16 | ```
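The JSON returned by `/api/passwords` is a list with one entry per uploaded file. As a minimal sketch of consuming it from Python, the snippet below flattens the candidates from an abbreviated copy of the sample response above. The helper name `collect_candidates` is hypothetical and not part of DeepPass, and the context lists are trimmed for brevity.

```python
import json

# Abbreviated copy of the sample response from the curl call (contexts trimmed).
SAMPLE_RESPONSE = """
[{"file_name": "test_doc.docx",
  "model_password_candidates": [
    {"left_context": ["server", "is:"], "password": "P@ssword123!", "right_context": ["Please", "dont"]},
    {"left_context": ["password", "is"], "password": "LiverPool1", "right_context": [".", "This"]}],
  "regex_password_candidates": [
    {"left_context": ["server", "is:"], "password": "P@ssword123!", "right_context": ["Please", "dont"]}],
  "custom_regex_matches": null}]
"""

# The three result keys that appear in the sample response.
SOURCES = ("model_password_candidates", "regex_password_candidates", "custom_regex_matches")


def collect_candidates(results):
    """Return (file_name, source, password) tuples; hypothetical helper."""
    rows = []
    for doc in results:
        for source in SOURCES:
            # A source with no matches is reported as null (None in Python).
            for hit in doc.get(source) or []:
                rows.append((doc["file_name"], source, hit["password"]))
    return rows


rows = collect_candidates(json.loads(SAMPLE_RESPONSE))
for row in rows:
    print(row)
```

Treating a `null` source as an empty list keeps the loop uniform, since `custom_regex_matches` is `null` in the sample.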
17 |
18 | [Apache Tika](https://hub.docker.com/r/apache/tika) is used to extract text from [various document formats](https://tika.apache.org/0.9/formats.html). [TensorFlow Serving](https://hub.docker.com/r/tensorflow/serving) serves the model.
19 |
20 | The neural network is a bidirectional LSTM:
21 |
22 | ```
23 | embedding_dimension = 20
24 | dropout = 0.5
25 | cells = 200
26 |
27 | model = Sequential()
28 | model.add(Embedding(total_chars, embedding_dimension, input_length=32, mask_zero=True))
29 | model.add(Bidirectional(LSTM(cells)))
30 | model.add(Dropout(dropout))
31 | model.add(Dense(1, activation='sigmoid'))
32 | ```
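Because the embedding layer uses `input_length=32` and `mask_zero=True`, each token presumably reaches the model as 32 integer character indices, with 0 reserved for padding. The following is a rough sketch of building a TensorFlow Serving REST `predict` payload under those constraints. The character map `TOY_CHAR_DICT`, the out-of-vocabulary index, the left-padding side, and the truncation rule are all assumptions for illustration; the real mapping comes from the fitted Keras tokenizer.

```python
import json

MAX_LEN = 32  # matches input_length=32 in the model definition

# Toy character map for illustration only (assumption, not the real tokenizer output).
# Index 0 is left unused because mask_zero=True reserves it for padding.
TOY_CHAR_DICT = {ch: i + 2 for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz0123456789@!")}
OOV_INDEX = 1  # assumed index for characters missing from the map


def encode_token(token, max_len=MAX_LEN):
    """Map a token to exactly max_len integer indices, left-padded with zeros."""
    ids = [TOY_CHAR_DICT.get(ch, OOV_INDEX) for ch in token[:max_len]]
    return [0] * (max_len - len(ids)) + ids


def build_predict_payload(tokens):
    """Serialize tokens in the row format ({"instances": [...]}) of the REST predict API."""
    return json.dumps({"instances": [encode_token(t) for t in tokens]})


payload = build_predict_payload(["password", "P@ssword123!"])
print(payload)
```

With the compose file in this repository, a payload like this would be POSTed from the host to `http://localhost:8501/v1/models/password:predict`.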
33 |
34 | It was trained on 2,000,000 passwords randomly selected from [this leaked password list](https://crackstation.net/files/crackstation-human-only.txt.gz) and 2,000,000 terms extracted from various Google-dorked documents. The metrics on the held-out 10% test set are:
35 |
36 | ```
37 | ------------------
38 | loss : 0.04804224148392677
39 | tn : 199446.0
40 | fp : 731.0
41 | fn : 3281.0
42 | tp : 196542.0
43 | ------------------
44 | accuracy : 0.9899700284004211
45 | precision : 0.9962944984436035
46 | recall : 0.983580470085144
47 | ------------------
48 | F1 score : 0.9898966618590025
49 | ------------------
50 | ```
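As a quick consistency check, the summary metrics can be recomputed from the four confusion-matrix counts above. The sketch below uses only the standard definitions; small differences in the trailing digits are expected, presumably because the reported values were computed in lower-precision floating point.

```python
# Confusion-matrix counts copied from the test-set results.
tn, fp, fn, tp = 199446, 731, 3281, 196542

total = tn + fp + fn + tp    # 400,000 test samples
accuracy = (tp + tn) / total
precision = tp / (tp + fp)   # share of flagged terms that are passwords
recall = tp / (tp + fn)      # share of passwords that were flagged
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

The total of 400,000 samples matches 10% of the 4,000,000 training terms.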
51 |
52 | The training notebook for the model is in `./notebooks/password_model_bilstm.ipynb`
--------------------------------------------------------------------------------
/app/app.py:
--------------------------------------------------------------------------------
1 | from flask import Flask, render_template, url_for, request, redirect, jsonify
2 | from multiprocessing import Process
3 | from flask_bootstrap import Bootstrap
4 | import os, pickle, json, re, requests, zipfile, shutil, uuid
5 | import numpy as np
6 |
7 | app = Flask(__name__, template_folder='Templates')
8 | Bootstrap(app)
9 |
10 | # the deep learning password model served by tensorflow/serving
11 | MODEL_URI = 'http://tensorflow:8501/v1/models/password:predict'
12 |
13 | # the Tika docker serve point
14 | TIKA_URI = 'http://tika:9998/tika'
15 |
16 | # extracted from the fit Keras tokenizer so we don't need Keras/Tensorflow as a requirement
17 | CHAR_DICT = {'